Reverse Dependencies of tokenizers
The following projects have a declared dependency on tokenizers:
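A "declared dependency" here means the project lists tokenizers in its package metadata (e.g. in pyproject.toml or setup.py). A similar listing can be produced for your own environment with nothing but the standard library; the sketch below is illustrative only, assumes Python 3.9+, and makes no claim about how this page itself was generated.

```python
# Minimal sketch: list locally installed packages that declare a dependency
# on tokenizers, using only the standard library (Python 3.9+ assumed).
from importlib.metadata import distributions

def reverse_dependencies(target: str = "tokenizers") -> list[str]:
    dependents = set()
    for dist in distributions():
        for req in dist.requires or []:  # requires is None when no deps are declared
            # Requirement strings look like "tokenizers>=0.13,<0.15; extra == 'x'"
            # or "tokenizers (>=0.13)"; keep only the project name.
            name = req.split(";")[0].split(" ")[0]
            for sep in ("<", ">", "=", "!", "~", "(", "["):
                name = name.split(sep)[0]
            if name.lower() == target:
                dependents.add(dist.metadata["Name"])
                break
    return sorted(dependents, key=str.lower)

if __name__ == "__main__":
    for project in reverse_dependencies():
        print(project)
```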
- adapter-transformers — A friendly fork of HuggingFace's Transformers, adding Adapters to PyTorch language models
- ai-dataproc — no summary
- AI-ML-Formulas-Recognizer-Extraction — A package for AI recognition tasks developed by Minh Nguyen and Liam.
- ai-python — Microsoft AI Python Package
- ai2-olmo — Open Language Model (OLMo)
- ai21-tokenizer — no summary
- aider-chat — Aider is GPT-powered coding in your terminal
- aidevkit — Assorted utility modules used in AI development
- aihandler — AI Handler: An engine which wraps certain huggingface models
- aihandlerwindows — AI Handler: An engine which wraps certain huggingface models
- akasha-terminal — Document QA package using langchain and chromadb
- aleph-alpha-client — Python client to interact with Aleph Alpha API endpoints
- algorin-cli — Access to GPT-3 and document processing from the command line.
- annolid — An annotation and instance segmentation-based multiple animal tracking and behavior analysis package.
- anthropic — The official Python library for the Anthropic API
- anthropic-bedrock — The official Python library for the anthropic-bedrock API
- api2openai — Create a Python package.
- archai — Platform for Neural Architecture Search
- arcusapi — Arcus Data Platform Client SDK.
- ares-ai — ARES is an advanced evaluation framework for Retrieval-Augmented Generation (RAG) systems.
- arize — A helper library to interact with Arize AI APIs
- attention-sinks — Extend LLMs to infinite length without sacrificing efficiency and performance, without retraining
- audiossl — no summary
- AudioSummariser — Summarises the text generated from audio files for quicker resolution. The audio files are currently customer-support recordings, but the use case can be extended further. Sentiment is analysed and depicted visually.
- auto-learn-gpt — AutoML for training and inference of deep learning models
- autoawq — AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
- autogluon-contrib-nlp — MXNet GluonNLP Toolkit (DeepNumpy Version)
- autopeptideml — AutoML system for building trustworthy peptide bioactivity predictors
- awesome-align — An awesome word alignment tool
- awessome — awessome
- bark — Bark text-to-audio model
- bedrock-anthropic — Client library for the anthropic API with the AWS Bedrock endpoint.
- bert-deid — Remove identifiers from data using BERT
- bert-embeddings — Create positional embeddings based on TinyBERT or similar BERT models
- bertnlp — BERT toolkit is a Python package that performs various NLP tasks using Bidirectional Encoder Representations from Transformers (BERT) related models.
- bigdl-llm — Large Language Model Development Toolkit
- bisheng-pybackend-libs — libraries for bisheng rt pybackend
- blade2blade — Adversarial Training and SFT for Bot Safety Models
- botiverse — botiverse is a chatbot library that offers a high-level API to access a diverse set of chatbot models
- bpeasy — Fast bare-bones BPE for modern tokenizer training
- caikit-nlp — Caikit NLP
- canopy-sdk — Retrieval Augmented Generation (RAG) framework and context engine powered by Pinecone
- cclm — NLP framework for composing models together modularly
- ChatLLM — Create a Python package.
- ChatSQL — Create a Python package.
- chromadb — Chroma.
- chromadb-pysqlite3 — Chroma.
- citoplasm — CITOplasm is a Python library for writing LLM code in a declarative way.
- clarinpl-embeddings — no summary
- cliqs — Provides an implementation of a multilingual crisis social media summarization model.
- closeai — Create a Python package.
- codegeex — CodeGeeX: An Open Multilingual Code Generation Model.
- cody-adapter-transformers — A friendly fork of HuggingFace's Transformers, adding Adapters to PyTorch language models
- cohere — no summary
- complwetion — Small helper library to build chat applications
- compromise-marian — Marian model but with two decoders
- ConsistencyBench — Tools and Techniques for Consistency Benchmarking
- constituent-treelib — A lightweight Python library for constructing, processing, and visualizing constituent trees.
- cpkil — CPR Python Package
- CPM-Bee — Create a Python package.
- cpm-live — Create a Python package.
- crfm-helm — Benchmark for language models
- curated-transformers — A PyTorch library of transformer models and components
- cursivepy — no summary
- DadmaTools — DadmaTools is a Persian NLP toolkit
- dalle-pytorch — DALL-E - Pytorch
- datatrove — HuggingFace library to process and filter large amounts of webdata
- datumaro — Dataset Management Framework (Datumaro)
- dbgpt — DB-GPT is an experimental open-source project that uses local large GPT models to interact with your data and environment. With this solution, there is no risk of data leakage, and your data stays 100% private and secure.
- ddochi — no summary
- deepfloyd-if — DeepFloyd-IF (Imagen Free)
- deepoffense — Multilingual Offensive Language Identification with Transformers
- deepse — **DeepSE**: **Sentence Embeddings** based on Deep Neural Networks, designed for **PRODUCTION** environments!
- dgenerate — Batch image generation and manipulation tool supporting Stable Diffusion and related techniques / algorithms, with support for video and animated image processing.
- dillagent — Agentic LLM library
- dimweb-persona-bot — A dialogue bot with a personality
- disformers — Hugging Face transformers for Discord.
- distill-trainer — Knowledge distillation toolkit
- dlk — dlk: Deep Learning Kit
- Documents-Classifier — A tool to classify images
- dolma — Data filters
- dooly — A library that handles everything with 🤗 and supports batching to models in PORORO
- easy-transformers — Utils for dealing with transformers
- easyeditor — easyeditor - Editing Large Language Models
- edu-segmentation — Improves EDU segmentation performance using Segbot. Since Segbot has an encoder-decoder architecture, its bidirectional GRU encoder can be replaced with generative pretrained models such as BART and T5. The new model is evaluated on the RST dataset in few-shot settings (e.g. 100 training examples) rather than with the full dataset.
- eir-dl — no summary
- emb3d — emb3d.co command-line interface to work with embeddings.
- EMO-AI — Library for the AI competition; currently private
- engawa — no summary
- Expanda — Integrated Corpus-Building Environment
- explabox-demo-drugreview — Explabox demo for the UCI drug reviews dataset
- fast-bert-no-plot — AI Library using BERT
- fast-bert-xrendan — AI Library using BERT
- fastembed — Fast, light, accurate library built for retrieval embedding generation
- faster-translate — A simple translation utility using Hugging Face models.
- faster-whisper — Faster Whisper transcription with CTranslate2
- Few-Shot-Learning-NLP — This library provides tools and utilities for Few Shot Learning in Natural Language Processing (NLP).
- finer — no summary
- FlashRank — Ultra-lite and super-fast SoTA cross-encoder-based re-ranking for your search & retrieval pipelines.
- flex-model — FlexModel - A Framework for Interpretability of Distributed Large Language Models
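For context on what all of the projects above share, here is a minimal sketch of typical tokenizers usage, loosely following the library's quicktour: training a small BPE tokenizer and encoding a string. The training file corpus.txt is a hypothetical placeholder, not anything referenced in this list.

```python
# Minimal sketch of the shared dependency itself: train a small BPE tokenizer
# with Hugging Face tokenizers, then encode a string.
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import BpeTrainer

# Build a BPE tokenizer with a simple whitespace pre-tokenizer.
tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

# Train on a local text file ("corpus.txt" is a hypothetical placeholder).
trainer = BpeTrainer(special_tokens=["[UNK]", "[CLS]", "[SEP]", "[PAD]", "[MASK]"])
tokenizer.train(files=["corpus.txt"], trainer=trainer)

encoding = tokenizer.encode("Hello, tokenizers!")
print(encoding.tokens)  # subword pieces
print(encoding.ids)     # vocabulary indices
```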