Reverse Dependencies of librosa
The following projects have a declared dependency on librosa:
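Whether a project "declares" a dependency on librosa can be checked locally from its package metadata. Below is a minimal sketch using the standard library's `importlib.metadata`; the package name passed in is just an illustrative choice from this list, and any installed distribution name works.

```python
# Check whether an installed distribution declares librosa as a dependency.
from importlib.metadata import requires, PackageNotFoundError


def depends_on_librosa(dist_name: str) -> bool:
    """Return True if dist_name's declared requirements mention librosa."""
    try:
        reqs = requires(dist_name) or []  # list of PEP 508 requirement strings
    except PackageNotFoundError:
        return False
    # Drop any environment marker (after ';') and compare the project name.
    return any(
        req.split(";")[0].strip().lower().startswith("librosa")
        for req in reqs
    )
```

Note that this only inspects metadata of packages installed in the current environment; a full reverse-dependency index like the one below is built by scanning every published distribution's metadata.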
- hmc-mir — Collection of tools developed by HMC's MIR Lab
- horoscopy — Python module for speech signal processing
- howl — A wake word detection toolkit
- huggingsound — HuggingSound: A toolkit for speech-related tasks based on HuggingFace's tools.
- humanlikehearing — Psychometric testing on Automatic Speech Recognition systems
- hyperion-ml — Toolkit for speaker recognition
- iatorch — PyTorch Wrapper for Inspection AI
- iautils — Image & Audio Common Utils
- ichigo-asr — Ichigo Whisper is a compact (22M parameters), open-source speech tokenizer for the whisper-medium model, designed to improve multilingual performance with minimal impact on its original English capabilities. Unlike models that output continuous embeddings, Ichigo Whisper compresses speech into discrete tokens, making it more compatible with large language models (LLMs) for immediate speech understanding.
- ichigo-whisper — Ichigo Whisper is a compact (22M parameters), open-source speech tokenizer for the whisper-medium model, designed to improve multilingual performance with minimal impact on its original English capabilities. Unlike models that output continuous embeddings, Ichigo Whisper compresses speech into discrete tokens, making it more compatible with large language models (LLMs) for immediate speech understanding.
- iluvatar — With each note, Eru crafts worlds, blending music and magic into stunning realities.
- inf-rvc-py — Python wrapper for fast inference with rvc
- infer-rvc-python — Python wrapper for fast inference with rvc
- inferrvc — High-performance RVC inference, intended for running multiple instances in memory at once. Includes the latest pitch estimator, RMVPE; Python 3.8-3.11 compatible and pip installable, with memory and performance improvements in the pipeline and model usage.
- infervcpy — Python wrapper for fast inference with rvc
- inspiremusic — InspireMusic: A Fundamental Music, Song and Audio Generation Framework and Toolkits
- instruwav — Generate sounds using a base note
- insynth — Domain-specific generation of test inputs for robustness testing of ML models
- intelli — Build your chatbot or AI agent with Intellinode – we make every model smarter.
- iracema — Audio Content Analysis for Research on Musical Expressiveness and Individuality
- Jabberjay — 🦜 Synthetic Voice Detection
- jac-speech — no summary
- jack-audio — A Python package for stationary audio noise reduction.
- jackAudio — A Python package for stationary audio noise reduction.
- jadia — JaNet diarization package
- jarvis-akul2010 — A library that makes it extremely easy to build a simple voice assistant.
- jaseci-ai-kit — no summary
- Jems-Video — Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts.
- jina — Jina (v%s) is a cloud-native semantic search engine powered by deep neural networks. It provides a universal solution for large-scale indexing and querying of media content.
- joinvoice — Join a few voice messages into one with this script
- jotts — JoTTS is a German text-to-speech engine.
- jrvc — Libraries for RVC inference
- jump-reward-inference — A package for fast, real-time joint tracking of rhythmic parameters in music (beats, downbeats, tempo, and meter) using BeatNet AI, a super-compact 1D state space, and the jump-back reward technique
- kadtk — A toolkit library for Kernel Audio Distance.
- kapre — Kapre: Keras Audio Preprocessors. Keras layers for audio pre-processing in deep learning
- katara — custom dev/research toolkit by tensorkelechi
- ketos — MERIDIAN Python package for deep-learning based acoustic detector and classifiers
- kfe — File Explorer and Search Engine for locally stored multimedia
- klay-beam — Toolkit for massively parallel audio processing via Apache Beam
- klio-audio — Library for audio-related Klio transforms and helpers
- konnyaku-gpt — AI-powered multimodal subtitle generator.
- kudio — Audio Toolbox™ KUDIO
- laion-clap — Contrastive Language-Audio Pretraining Model from LAION
- langchain_1111_Dev_cerebrum — Building applications with LLMs through composability
- langchain-by-johnsnowlabs — Building applications with LLMs through composability
- langchain-xfyun — Use the iFLYTEK Spark (Xinghuo) large language model seamlessly in LangChain
- langchaincoexpert — Building applications with LLMs through composability
- langchainn — Building applications with LLMs through composability
- lcp-video — LCP video analysis
- lhvqt — Frontend filterbank learning module with HVQT initialization capabilities
- libf0 — A Python Library for Fundamental Frequency Estimation in Music Recordings
- libfmp — Python module for fundamentals of music processing
- libmv — A library to create music videos
- libquantum — Library for implementing standardized time-frequency representations.
- Librosax — Librosa in JAX
- libsoni — A Python toolbox for sonifying music annotations and feature representations
- libtsm — Python Package for Time-Scale Modification and Pitch-Shifting
- lightning-flash — Your PyTorch AI Factory - Flash enables you to easily configure and run complex AI recipes.
- lightwood — Lightwood is Legos for Machine Learning.
- lipsync — lipsync is a simple and updated Python library for lip synchronization, based on Wav2Lip. It synchronizes lips in videos and images based on provided audio, supports CPU/CUDA, and uses caching for faster processing.
- liveaudio — Real-time audio processing library based on librosa
- llamafactory — Unified Efficient Fine-Tuning of 100+ LLMs
- llm-clap — Generate embeddings for audio files using CLAP with llm
- llm-optimized-inference — no summary
- llmchatbot — LLM-based Chatbot
- lm-audioslicer — Tool for slicing long audio files for datasets
- lmms-eval — A framework for evaluating large multi-modality language models
- Lossless-BS-RoFormer — Lossless BS-RoFormer - Band-Split Rotary Transformer for SOTA Music Source Separation
- LPCTorch — LPC Utility for Pytorch Library.
- lungdata — no summary
- lvc — Unofficial pip package for zero-shot voice conversion
- MAAP — no summary
- macls — Audio Classification toolkit on Pytorch
- mad-metric — A metric for acoustic music evaluation based on MAUVE and MERT
- maestro-music — A simple command line tool to play songs (or any audio files, really).
- mafe — Music Audio Feature Extractor
- magenta — Use machine learning to create art and music
- magenta-gpu — Use machine learning to create art and music
- malaya-speech — Speech-Toolkit for bahasa Malaysia, powered by Tensorflow and PyTorch.
- malayalam-asr-benchmarking — A study to benchmark whisper based ASRs in Malayalam
- mamkit — A Comprehensive Multimodal Argument Mining Toolkit.
- masked_prosody_model — no summary
- matcha-tts — 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
- mayavoz — Deep learning toolkit for speech enhancement
- mcp-music-analysis — An MCP server with music tools
- mdxnet — Ultimate Vocal Remover using MDX Net
- measure-spkr — An app for measuring the impulse and frequency response of a speaker
- meddle — Agentic Medical Deep Learning Engineer
- megatts — MegaTTS 3 - A lightweight and efficient TTS system with ultra high-quality voice cloning
- melottsIcanwang — no summary
- mexca — Emotion expression capture from multiple modalities.
- microfaune-ai — Module package used for the Microfaune project
- mimikit — Python package for generating audio with neural networks
- mindtorch — MindTorch is a toolkit that supports running PyTorch models on Ascend.
- minimalml — A python package for out-of-the-box ML solutions
- mir-bootleg-score — Built for MIR Lab. Tools for converting png images into bootleg score features.
- mirdata — Common loaders for MIR datasets.
- Mix50 — Fast and simple DJ and audio effects in Python with Librosa
- mixedvoices — Analytics and Evaluation Tool for Voice Agents
- mixsim — An open-source dataset for multiple purposes, such as speaker localization/tracking, dereverberation, enhancement, separation, and recognition.