Wheelodex — tokenizers — Reverse Dependencies

Reverse Dependencies of tokenizers

The following projects have a declared dependency on tokenizers:

nhelper — 🧪 Behavioral tests for NLP models 🧪
nlpipes — Text Classification with Transformers
nnsight — Package for interpreting and manipulating the internals of deep learning models.
novelai-api — Python API for the NovelAI REST API
novelai-python — NovelAI Python Binding With Pydantic
NovelAILLMWrapper — no summary
npc-engine — Deep learning inference and NLP toolkit for game development.
nynoflow — NynoFlow
omdenalore — AI for Good library
one-api-tool — Use only one line of code to call multiple model APIs similar to ChatGPT. Currently supported: Azure OpenAI Resource endpoint API, OpenAI Official API, and Anthropic Claude series model API.
open-retrievals — Text Embeddings for Retrieval and RAG based on transformers
OpenBMB — Create a Python package.
opencompass — A comprehensive toolkit for large model evaluation
OpenELM — Evolution Through Large Models
openicl — An open source framework for in-context learning.
OpenNIR-XPM — OpenNIR: A Complete Neural Ad-Hoc Ranking Pipeline (Experimaestro version)
openparse — Streamlines the process of preparing documents for LLMs.
optimum-graphcore — Optimum Library is an extension of the Hugging Face Transformers library, providing a framework to integrate third-party libraries from Hardware Partners and interface with their specific functionality.
optimum-transformers — Accelerated nlp pipelines using Transformers, Optimum and ONNX Runtime
os-copilot — A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks.
osc-llm — Tools for training, inference, and deployment of large models
own-knowledge-gpt — Custom Knowledge GPT
pai-easynlp — PAI EasyNLP Toolkit
PaoDing — An NLP-oriented PyTorch wrapper that makes your life easier.
papermage — Papermage. Casting magic over scientific PDFs.
parlai — Unified platform for dialogue research.
peelml — Peel away the pain of ml deployment
perceiver-io — Perceiver IO
petals — Easy way to efficiently run 100B+ language models without high-end GPUs
pix2tex — pix2tex: Using a ViT to convert images of equations into LaTeX code.
platform-gen-ai — This is pipeline code for accelerating solution accelerators
promptbench — PromptBench is a powerful tool designed to scrutinize and analyze the interaction of large language models with various prompts. It provides a convenient infrastructure to simulate black-box adversarial prompt attacks on the models and evaluate their performances.
PromptEHR — Sequence patient electronic healthcare record generation with large language models (LLMs) as the neural database.
promptflow-gui — Create flowcharts to control LLMs
promptz — A Python package for interactive prompts
pubmad — Useful tools to work with biology
punctfix — Punctuation restoration library
pureml-llm — no summary
py-vcon-server — server for vCon conversational data container manipulation package
pydata-wrangler — Wrangle messy data into pandas DataFrames, with a special focus on text data and natural language processing
pydebiaseddta — Python library to improve generalizability of the drug-target prediction models via DebiasedDTA
pydta — A Python package for drug-target affinity prediction using biomolecular language processing
pygaggle — A gaggle of rerankers for CovidQA and CORD-19
pyllmsearch — LLM Powered Advanced RAG Application
pyrit — The Python Risk Identification Tool for LLMs (PyRIT) is a library used to assess the robustness of LLMs
pytorch-clip-interrogator — Prompt engineering tool using BLIP 1/2 + CLIP Interrogate approach.
qbchemchef — LLM-based tools for information retrieval
rag4p — A project I use a lot for workshops; it contains some utils for splitters, tokenizers, and a Weaviate client that I reuse a lot
rannet — Recurrent Attention Networks
rapid-latex-ocr — Tool of converting images of equations into LaTeX code.
rapidnlp-datasets — Data pipelines for TensorFlow and PyTorch.
referral-augment — Official implementation of "Referral Augmentation for Zero-Shot Information Retrieval"
reinforcer — Reinforcement learning
retri-evals — Open-source tool for building and evaluating retrieval pipelines.
retvec — Resilient and Efficient Text Vectorizer
rewardbench — Tools for evaluating reward models
ReWord — Reorder word in English sentence to follow correct grammar
robocat — Robo CAT- Pytorch
rt2 — rt-2 - PyTorch
ruth-text-to-speech — A Python CLI for Ruth NLP
ruth-tts-converter — A Python CLI for Ruth NLP
ruth-tts-converter-python — A Python CLI for Ruth NLP
rwkv — The RWKV Language Model
rwkv-beta — The RWKV Language Model
rwkv-paddle — The RWKV Language Model on PaddlePaddle
rwkvstic — A package for loading rwkv on a larger range of devices
safe-mol — Implementation of the 'Gotta be SAFE: a new framework for molecular design' paper
sagemode — Deploy, scale, and monitor your ML models all with one click. Native to AWS.
samosila-core — no summary
scikit-embeddings — Tools for training word and document embeddings in scikit-learn.
sconce — Model Compression Made Easy
seaqube — Semantic Quality Benchmark for Word Embeddings, i.e. Natural Language Models in Python. The shortname is `SeaQuBe` or `seaqube`. Simply call it '| ˈsi: kjuːb |'
searchdatamodels — no summary
semantic-search-faiss — Semantic search to query covid related papers
semantic-text-splitter — Split text into semantic chunks, up to a desired chunk size. Supports calculating length by characters and tokens, and is callable from Rust and Python.
semiolog — Tools for the semiological analysis of corpora
sentivi — A simple tool for Vietnamese Sentiment Analysis
separability — LLM Tools for looking at separability of LLM Capabilities
sgnlp — Machine learning models from Singapore's NLP research community
shbtf0302 — State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch
short-poetry — no summary
shtec-rlhf — shtec-rlhf: Safe Reinforcement Learning from Human Feedback
simple-generation — A python package to run inference with HuggingFace checkpoints wrapping many convenient features.
simple-latex-ocr — A simple LaTeX OCR package
simpletransformers — An easy-to-use wrapper library for the Transformers library.
simpletransformers-fork-trialandsuccess — An easy-to-use wrapper library for the Transformers library. FORK: This fork adds T5TokenizerFast and umT5 support.
simpletransformers-le — An easy-to-use wrapper library for the Transformers library.
skorch — scikit-learn compatible neural network library for pytorch
smart-chromadb — Chroma.
smile-datasets — La**S**t **mile** datasets: Use `tf.data` to solve the last mile data loading problem for tensorflow.
snowflake-ml-python — The machine learning client library that is used for interacting with Snowflake to build machine learning solutions.
soco-tokenizer — Fast tokenizer
sparse_autoencoder — Sparse Autoencoder for Mechanistic Interpretability
speechless — LLM based agents with proactive interactions, long-term memory, external tool integration, and local deployment capabilities.
sphinx-summaries — no summary
SPLADERunner — Ultralight and Fast wrapper for the independent implementation of SPLADE++ models for your search & retrieval pipelines. Models and Library created by Prithivi Da. For PRs and collaboration, check out the README.
spokestack — Spokestack Library for Python
stf-test1 — stf
stonkgs — Sophisticated Transformers for Biomedical Text and Knowledge Graph Data
stos — Converting the American sign language into speech or text, and vice versa.
