Reverse Dependencies of tokenizers
The following projects have a declared dependency on tokenizers:
- streamlit-chromadb-connection — A simple adapter connection for any Streamlit LLM-powered app to use ChromaDB vector database.
- stripedhyena — Model and inference code for beyond Transformer architectures
- ststransformers — An easy-to-use wrapper library for using Transformers in Semantic Textual Similarity Tasks.
- stuned — Utility code from STAI (https://scalabletrustworthyai.github.io/)
- styletts2-fork — Fork of the StyleTTS 2 Python package. StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models. Original authors: Yinghao Aaron Li, Cong Han, Vinay S. Raghavan, Gavin Mischler, Nima Mesgarani, Sidharth Rajaram.
- stylometry-utils — Collection of functions and utilities to run stylometry experiments
- SudachiPy — Python version of Sudachi, the Japanese Morphological Analyzer
- sumformer2 — Summarisation Transformer 2
- suno-bark — Bark text to audio model
- swiftrank — Compact, ultra-fast SoTA reranker enhancing retrieval pipelines and terminal applications.
- syntaxi — Make your tokenizer more syntax-friendly.
- t2v-metrics — Evaluating Text-to-Visual Generation with Image-to-Text Generation.
- taker — Tools for Transformer Activations Knowledge ExtRaction
- test-petals — Easy way to efficiently run 100B+ language models without high-end GPUs
- testgailbot002 — GailBot API
- testgailbotapi — GailBot Test API
- testgailbotapi001 — GailBot Test API
- testpydebiaseddta — Python library to improve generalizability of the drug-target prediction models via DebiasedDTA
- text-embeddings — zero-vocab or low-vocab embeddings
- text-sim — Chinese text similarity calculation package for TensorFlow/PyTorch
- text2tac — text2tac converts text to actions
- textflint — Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing
- textwiz — An even simpler way to use open-source LLMs.
- tf-shb-gabriel-0302 — State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch
- thaixtransformers — ThaiXtransformers: Use pretrained RoBERTa-based Thai language models from the VISTEC-depa AI Research Institute of Thailand.
- the-grid — Easy way to efficiently run 100B+ language models without high-end GPUs
- thirdai — A faster CPU machine learning library
- timething — Aligning text transcripts with their audio recordings.
- tinytensor — tinytensor
- TLAF — TLA is built with PyTorch, Transformers, and other state-of-the-art machine learning techniques. It aims to expedite and structure the cumbersome process of collecting, labeling, and analyzing Twitter data across a corpus of languages, while providing detailed labeled datasets for all of them.
- tokenizer-adapter — A simple tool to adapt a pretrained language model to a new vocabulary
- tokenizers — no summary
- tokenizers-gt — no summary
- topicgpt — A package for integrating LLMs like GPT-3.5 and GPT-4 into topic modelling
- topicmodels — A package for topic modelling in python.
- torchblocks — A PyTorch-based toolkit for natural language processing
- torchblocks-chen — A PyTorch-based toolkit for natural language processing
- totokenizers — Text tokenizers.
- trankit — Trankit: A Light-Weight Transformer-based Toolkit for Multilingual Natural Language Processing
- transformers — State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
- transformers-cfg — Extension of Transformers library for Context-Free Grammar Constrained Decoding with EBNF grammars
- transformers-domain-adaptation — Adapt Transformer-based language models to new text domains
- transformers-keras — Transformer-based models implemented in TensorFlow 2.x (Keras)
- transquest — Transformer-based translation quality estimation
- trustplutusmr — Entity Market Research
- turkish-lm-tuner — Implementation of the Turkish LM Tuner
- tweet-se-competition — a machine learning project for kaggle tweet sentiment extraction competition
- uform — Pocket-Sized Multimodal AI for Content Understanding and Generation
- UnicodeTokenizer — UnicodeTokenizer: tokenize all Unicode text
- url-text-module — Text Module of REACT
- vall-e-x — An open source implementation of Microsoft's VALL-E X zero-shot TTS
- vec2text — convert embedding vectors back to text
- vina2vi — no summary
- vlite — A simple and blazing fast vector database
- vllm-haystack — A simple adapter to use vLLM in your Haystack pipelines.
- vltk — The Vision-Language Toolkit (VLTK)
- vtorch — NLP research library, built on PyTorch.
- weak-annotators — Weak annotators for information extraction (NER)
- webull-options — no summary
- whisper-s2t — An Optimized Speech-to-Text Pipeline for the Whisper Model.
- xmnlp — A Lightweight Chinese Natural Language Processing Toolkit
- yolo-world-open — YOLO-World: Real-time Open Vocabulary Object Detection
- ytchat — An open platform for training, serving, and evaluating large language model based chatbots.
- yuezhlib — Library for preprocessing Cantonese and Written Chinese
- zarth-utils — Package used for my personal development on ML projects.
- zeldarose — Train transformer-based models
- zh-rasa — Chinese NLP tool for RASA
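For reference, a project appears in this list by declaring tokenizers in its packaging metadata. A minimal sketch of such a declaration in `pyproject.toml` (the project name and version pin below are hypothetical, not taken from any listed package):

```toml
[project]
name = "my-nlp-app"          # hypothetical example project
version = "0.1.0"
dependencies = [
    "tokenizers>=0.15",      # illustrative pin; each project chooses its own constraint
]
```

Older projects may declare the same dependency via `install_requires` in `setup.py` or `setup.cfg` instead.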