Reverse Dependencies of datasets
The following projects have a declared dependency on datasets:
- stf-test1 — stf
- STFD — STFD: Series of deep learning-based foundation models for spatial transcriptomic data analysis
- stitchnet — no summary
- stonkgs — Sophisticated Transformers for Biomedical Text and Knowledge Graph Data
- stormtrooper — Transformer/LLM-based zero and few-shot classification in scikit-learn pipelines
- stos — Converting the American sign language into speech or text, and vice versa.
- strategais — A Python library for deploying large language models (LLMs) in local environments.
- streamlit-huggingface — Streamlit components to build interactive Huggingface-powered apps.
- string2string — String-to-String Algorithms for Natural Language Processing
- ststransformers — An easy-to-use wrapper library for using Transformers in Semantic Textual Similarity Tasks.
- sumformer2 — Summarisation Transformer 2
- summertime — Text summarization toolkit for non-experts
- SuperDuperDB — 🔮 Super-power your database with AI 🔮
- surprise-similarity — Context-aware similarity score for embedding vectors
- sweagent — The official SWE-agent package - an open source Agent Computer Interface for running language models as software engineers
- swebench — The official SWE-bench package - a benchmark for evaluating LMs on software engineering
- SwissArmyTransformer — A transformer-based framework with finetuning as the first class citizen.
- syne-tune — Distributed Hyperparameter Optimization on SageMaker
- synthegrator — Framework for code synthesis and AI4SE research
- t2t-tuner — Convenient Text-to-Text Training for Transformers
- t2v-metrics — Evaluating Text-to-Visual Generation with Image-to-Text Generation.
- t5s — T5 Summarisation Using Pytorch Lightning
- tabgenie — TabGenie: A toolkit for table-to-text generation.
- tailor-nlp — no summary
- taker — Tools for Transformer Activations Knowledge ExtRaction
- talk-summarizer — Python library to summarize talks
- tasknet — Seamless integration of tasks with huggingface models
- tasksource — Preprocessings to prepare datasets for a task
- tcrf — A deep learning based sequence tagging library with CRF layer on the top of transformer models.
- temporal-taggers — Neural temporal taggers with Transformer architectures
- tensorflow-datasets — tensorflow/datasets is a library of datasets ready to use with TensorFlow.
- test-data-modori — LMOps Tool for Korean
- test-openvalidators — Openvalidators is a collection of open source validators for the Bittensor Network.
- Tetra-Model-Zoo — Models optimized for export to run on device.
- tevatron — Tevatron: A toolkit for learning and running deep dense retrieval models.
- text-dedup — no summary
- text-det-metric — Tool of computing the metric of text detection
- text-machina — Text Machina: Seamless Generation of Machine-Generated Text Datasets
- text-rec-metric — Tool of computing the metric of text recognition
- text2tac — text2tac converts text to actions
- textattack — A library for generating text adversarial examples
- textdescriptives — A library for calculating a variety of features from text using spaCy
- textflint — Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing
- textkit-learn — Helps computers to understand human languages.
- textnoisr — Add noise to text at the character level
- tf-shb-gabriel-0302 — State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch
- tfds-nightly — tensorflow/datasets is a library of datasets ready to use with TensorFlow.
- tglcourse — work-in-progress course
- that-nlp-library — Aim to be a convenient NLP library with the help from HuggingFace
- thebestllmever — andromeda - Pytorch
- thermostat-datasets — Collection of NLP model explanations and accompanying analysis tools
- thirdai — A faster cpu machine learning library
- thsquant — A short description of your awesome package
- timething — Aligning text transcripts with their audio recordings.
- tnkeeh — no summary
- tok-det-metric — no summary
- tokenizers — no summary
- tokenizers-gt — no summary
- TokenProbs — Extract token-level probabilities from LLMs for classification-type outputs.
- toolformer — Implementation of Toolformer
- torchat — Package for finetuning LLMs using native PyTorch
- torchchat — Package for finetuning LLMs using native PyTorch
- torchtitan — A native-PyTorch library for large scale LLM training
- torchtune — A native-PyTorch library for LLM fine-tuning
- trailmet — Transmute AI Model Efficiency Toolkit
- trainllm — Fine-tuning LLMs for instruction using QLoRA.
- transformer-heads — Attach custom heads to transformer models.
- transformer-lens — An implementation of transformers tailored for mechanistic interpretability.
- transformer-vae — Interpolate between discrete sequences.
- transformers — State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
- transformers-collection — A collection of transformer models built using huggingface for various tasks.
- transformers-crf — Transformers CRF: CRF Token Classification for Transformers
- transformers-domain-adaptation — Adapt Transformer-based language models to new text domains
- transformers-rlfh — RLFH with transformers
- transfusion — Transformers 🤝 diffusion
- trapper — State-of-the-art NLP through transformer models in a modular design and consistent APIs.
- treeprompt — Tree prompting
- TrendFlow — A tool for literature research and analysis
- Trial2Vec — Pretrained BERT models for encoding clinical trial documents to compact embeddings.
- triple-encoders — Distributed Sentence Transformer Representations with Triple Encoders
- trl — Train transformer language models with reinforcement learning.
- trustllm — TrustLLM
- ttqakit — TTQAKit: A toolkit for Text-Table Hybrid Question Answering.
- tuned-lens — Tools for understanding how transformer predictions are built layer-by-layer
- turkish-lm-tuner — Implementation of the Turkish LM Tuner
- turque — Turkish question answering and generation tool.
- turques — Turkish question answering and generation tool.
- twinbooster — Python package for TwinBooster: Synergising Chemical Structures and Bioassay Descriptions for Enhanced Molecular Property Prediction in Drug Discovery
- twitter-demographer — Twitter Demographer
- txtchat — Retrieval augmented generation (RAG) and language model powered search applications
- txtinstruct — Datasets and models for instruction-tuning
- tydataprep — prepare your dataset for finetuning LLMs
- ultimate-utils — Brando's ultimate utils for science, machine learning and AI
- umbrela — A Package for generating query-passage relevance assessment labels.
- UnBIAS — A package based on LLMs for detecting bias, performing named entity, and debiasing text.
- uniem — unified embedding model
- unitorch — unitorch provides efficient implementation of popular unified NLU / NLG / CV / CTR / MM / RL models with PyTorch.
- unitxt — Load any mixture of text to text data in one line of code
- upit — Unpaired Image-to-Image Translation with PyTorch+fastai
- url-text-module — Text Module of REACT