Wheelodex — tokenizers — Reverse Dependencies

Reverse Dependencies of tokenizers

The following projects have a declared dependency on tokenizers:

flex-model — FlexModel - A Framework for Interpretability of Distributed Large Language Models
fluidml — FluidML is a lightweight framework for developing machine learning pipelines. Focus only on your tasks and not the boilerplate!
fms-hf-tuning — FMS HF Tuning
friday-agent — A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks.
fudstop — no summary
gcgc — GCGC is a preprocessing library for biological sequence model development.
general-text-classifier — General Text Classification Library
geniusrise-huggingface — Huggingface bolts for geniusrise
geniusrise-openai — Openai bolts for geniusrise
gft — GFT (general fine-tuning) A Little Language for Deepnets: 1-line programs for fine-tuning, inference and more
gft-cpu — GFT (general fine-tuning) A Little Language for Deepnets: 1-line programs for fine-tuning, inference and more
gpt-command-line — Command-line interface for ChatGPT, Claude and Bard
gpt-readme-reader — A utility to extract setup commands from a GitHub repository
gpt3discord — A Chat GPT Discord bot
gptfast — Accelerate transformer inference by 6-8.5x. Native to Huggingface and PyTorch.
h2ogpt — no summary
halludetector — Hallucination detection package
hammadml-gpu — Hammad Python ~ Machine Learning
happytransformer — Happy Transformer makes it easy to fine-tune NLP Transformer models and use them for inference.
hebspacy — SpaCy pipeline and models for Hebrew text
hezar — Hezar: The all-in-one AI library for Persian, supporting a wide variety of tasks and modalities!
hf-doc-builder — Doc building utility
hf-trim — A tool to reduce the size of Hugging Face models via vocabulary trimming.
homegrid — A minimal home gridworld environment to test how agents use language hints.
IBITTokenizer — Tokenizer for Persian texts based on hazm
if-dsl-gui-ai — For generating and playing IF games
igfold — no summary
imat — Interactive Music Analysis Tool (I-MaT)
indic-punct — Punctuation and inverse text normalization for Indic languages and English
inflecteur — Python inflector for the French language: control gender, tense and number
instruction-ner — Unofficial implementation of InstructionNER
instructlab — CLI for interacting with InstructLab
insyt — Innovative Network Security Technologies
internet-ml — Internet-ML: Allowing ML to connect to the internet
internet-nlp — Allowing NLPs to connect to the internet
ipex-llm — Large Language Model Develop Toolkit
iqradre — no summary
irisml-tasks-llava — Irisml adapter tasks for LLAVA models
japre — Custom pretokenizers for Japanese language models
jarvis-akul2010 — A library built to make it extremely easy to build a simple voice assistant.
jiant — State-of-the-art Natural Language Processing toolkit for multi-task and transfer learning built on PyTorch.
jshbtf0302 — State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch
kanao — Kanao is a project designed to train a GPT (Generative Pre-trained Transformer) model on custom datasets. It provides the capability to train the model using various data sources, including PDFs, Word documents, plain text files, and URLs.
kbve — ATLAS
KD-Lib — A PyTorch model compression library containing easy-to-use methods for knowledge distillation, pruning, and quantization
keywords-en — keywords extract
kgc — Cold Start Construction of Knowledge Graph.
kogpt2-transformers — Transformers library for KoGPT2
lairgpt — A PyTorch-based package by LightOn AI Research for performing inference with PAGnol models.
langchain-mistralai — An integration package connecting Mistral and LangChain
langcheck — Simple, Pythonic building blocks to evaluate LLM-based applications
langml — A Keras-based and TensorFlow-backend language model toolkit.
langport — A large language model serving platform.
langs-vall — vall-e-x package for a language translation project
languagemodels — Simple inference for large language models
langumo — The unified corpus building environment for Language Models.
latentscope — Quickly embed, project, cluster and explore a dataset.
leya — A coding assistant to help with repository management and code queries.
litellm — Library to easily interface with LLM API providers
litGPT — Hackable implementation of state-of-the-art open-source LLMs
livestt — Simple and easy to use realtime speech to text
llama-llm — Build on large language models faster
llamada — Build on large language models faster
llava-torch — Towards GPT-4 like large language and visual assistant.
llm-docstring-generator — Code to generate docstrings for Python code using GPT-4 etc.
llm-falcon-model — Microlib for the Falcon LLM
llm2openai — Create a Python package.
llmlite — A library that helps you chat with all kinds of LLMs consistently.
llmopenai — Create a Python package.
LLMSmith — Lightweight Python library designed for developing functionalities powered by Large Language Models (LLMs)
llmware — An enterprise-grade LLM-based development framework, tools, and fine-tuned models
lm-detect — Zero-Shot Machine-Generated Text Detection
logai — LogAI is a unified framework for AI-based log analytics
longchat — LongChat and LongEval
lp-Aicloud — this a aicloud
manifest-ml — Manifest for Prompting Foundation Models.
methylbert — A Transformer-based model for read-level DNA methylation pattern identification and tumour deconvolution
miditok — MIDI / symbolic music tokenizers for Deep Learning models.
miditok-for-musiclang — A convenient MIDI tokenizer for Deep Learning networks, with multiple encoding strategies
mindformers — mindformers platform: linux, cpu: x86_64
mindnlp — An open source natural language processing research tool box. Git version: [sha1]:18acd45, [branch]: (HEAD -> master, ms/master)
minimagen — Minimal Imagen text-to-image model implementation.
MiSeCom — Detect whether an English sentence is missing components such as subject, verb, or object
mlm-task-for-contextual-embedding — a machine learning project for mlm task for contextual embedding
mlx-transformers — MLX transformers is a machine learning framework with similar Interface to Huggingface transformers.
mmda — MMDA - multimodal document analysis
modelscope — ModelScope: bring the notion of Model-as-a-Service to life.
molfeat — molfeat - the hub for all your molecular featurizers
MovieChat — Long video understanding
mudes — Toxic Spans Prediction
musiclang-predict — A Python package for music notation and generation
mw-adapter-transformers — A friendly fork of HuggingFace's Transformers, adding Adapters to PyTorch language models
mysfire — Fast (and opinionated) data loading for PyTorch
naivenlp-datasets — Data pipelines for TensorFlow and PyTorch.
name-entity-extraction-for-contextual-embedding — a machine learning project for mlm task for contextual embedding
needlehaystack — Doing simple retrieval from LLM models at various context lengths to measure accuracy.
nepalitokenizers — Pre-trained Tokenizers for the Nepali language with an interface to HuggingFace's tokenizers library for customizability.
neumai — Package containing connectors for Neum AI.
neureca — A framework for building conversational recommender systems
neurox — Toolkit for Neuron Analysis in Deep NLP Models