Reverse Dependencies of librosa
The following projects have a declared dependency on librosa:
- python-dataset — no summary
- pyvad — py-webrtcvad wrapper for trimming speech clips
- pyw2v2 — Simple wav2vec2 wrapper
- qai-hub-models — Models optimized for export to run on device.
- quantum-inferno — Quantized Information Entropy, Nth Octave (INFERNO)
- quantumaudio — quantumaudio: A Python class implementation for Quantum Representations of Audio in Qiskit. Developed by the quantum computer music team at the Interdisciplinary Centre for Computer Music Research, University of Plymouth, UK
- radtts — RADTTS library
- rapid-paraformer — Tool of speech recognition.
- realbook — Realbook, a library to make using audio on TensorFlow easier.
- resemble-enhance — Speech denoising and enhancement with deep learning
- Resemblyzer — Analyze and compare voices with deep learning
- rest-api-supporter — Rest api supporter
- reviutils — A common library frequently used on python
- rstojnic-tfds-nightly — tensorflow/datasets is a library of datasets ready to use with TensorFlow.
- rt-pie — Real-Time Pitch Estimator
- rtst — no summary
- rtvamp — Vamp plugin host for real-time audio feature analysis
- rtvc — Real-Time Voice Conversion GUI
- runes-client — Runes client enables remote execution of python code triggered from a Crucible Plugin on the Signals & Sorcery platform.
- ruptures — Change point detection for signals in Python.
- ruth-text-to-speech — A Python CLI for Ruth NLP
- ruth-tts-converter — A Python CLI for Ruth NLP
- ruth-tts-converter-python — A Python CLI for Ruth NLP
- rvc — An easy-to-use Voice Conversion framework based on VITS.
- rwave — no summary
- s3a-decorrelation-toolbox — Decorrelation algorithm and toolbox for diffuse sound objects and general upmix
- s3prl-vc — Voice conversion toolkit based on S3PRL: Self-Supervised Speech/Sound Pre-training and Representation Learning Toolkit
- sadtalker-z — sadtalker
- sagemaker-huggingface-inference-toolkit — Open source library for running inference workload with Hugging Face Deep Learning Containers on Amazon SageMaker.
- saigen-dep-test — A test of using dependencies
- saigen-dep-test-with-poetry — no summary
- samosila-core — no summary
- sample-id — Acoustic fingerprinting for Sample Identification
- scisonify — Turn Scientific Data into Sounds & Music
- seacrowd — no summary
- senselab — Senselab is a Python package that simplifies building pipelines for speech and voice analysis.
- series-intro-recognizer — Find the intro of episodes of a series
- SETools — Speech Enhancement Tools Packages
- shazbot — Sound Hierarchy Attribute Zeitgeist Before Oligarchy Take
- shrinemaiden — An auxiliary library to help process data for ML/DL purposes
- shttst — Shmart TTS tools.
- sideseeing-tools — A set of tools to load, preprocess and analyze data collected through the MultiSensor Data Collection App
- signal-transformation — The package allows performing a transformation of a signal using TensorFlow, Pytorch or LibROSA
- simo — Smart Home on Steroids!
- skelly-synchronize — Basic template of a python repository
- smfiles — Read and Edit .sm files and Measure BPM of songs.
- so-vits-svc-fork — A fork of so-vits-svc.
- so-vits-svc-fork-mandarin — A mandarin translation version of a fork of so-vits-svc.
- socaity — Interface for hosted AI models. Generative AI: text2voice, voice2voice, face2face, etc. Supports local host and remote endpoints.
- socialysis — Tool for analyzing and extracting insights from Facebook Messenger conversations
- somnus — Somnus is keyword detection made easy.
- sonorus — Named after a spell in the Harry Potter universe that amplifies a speaker's voice. In muggles' terminology, this is a repository of modules for audio and speech processing, for and on top of machine-learning tasks such as speech-to-text.
- sonosco — Framework for training deep automatic speech recognition models.
- sonusai — Framework for building deep neural network models for sound, speech, and voice AI
- souJpg-diffusers — State-of-the-art diffusion in PyTorch and JAX.
- sound-analyzer-encoder — Sound analyzer and encoder
- Sound-cls — no summary
- soundata — Python library for loading and working with sound datasets.
- soundpy — A research-based framework for exploring sound as well as machine learning in the context of sound.
- soundviewer — Python package for sound visualization
- speaker-verification-toolkit — A package designed to compose speaker verification systems
- speakerbox — Speaker Annotation for Transcripts using Audio Classification
- SpecAugment — An implementation of "SpecAugment"
- spectro-utils — Add a short description here
- speech-collator — A collator for speech datasets with different batching strategies and attribute extraction.
- speech-interface — An interface for neural speech synthesis with Pytorch
- speechaugs — Waveform augmentations
- speechline — An end-to-end, offline, batch audio categorization, transcription, and segmentation pipeline.
- speechmix — Explore different way to mix speech model(wav2vec2, hubert) and nlp model(BART,T5,GPT) together
- speechtoolkit — ML for Speech presents SpeechToolkit, a unified, all-in-one toolkit for TTS, ASR, VC, & other models.
- speechwidgets — A library with Jupyter widgets for speech processing
- SPEEM — Calculate indicators saved as Excel.
- spela — spectrogram layers
- spiegelib — Synthesizer Programming with Intelligent Exploration, Generation, and Evaluation Library
- spychhiker — Various Python classes for speech analysis and speech synthesis
- ssr-eval — This package is written for the evaluation of speech super-resolution algorithms.
- stable-diffusion-videos — Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts.
- streamer-torch — Official implementation of STREAMER, a self-supervised hierarchical event segmentation and representation learning
- stt-sample-inspector — Inspect, modify, and add metadata to DeepSpeech (speech-to-text) datasets in CSV format.
- style-bert-vits2 — Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles.
- styletts2 — StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models. Original authors: Yinghao Aaron Li, Cong Han, Vinay S. Raghavan, Gavin Mischler, Nima Mesgarani.
- styletts2-fork — Fork of the StyleTTS 2 Python package. StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models. Original authors: Yinghao Aaron Li, Cong Han, Vinay S. Raghavan, Gavin Mischler, Nima Mesgarani, Sidharth Rajaram.
- subaligner — Automatically synchronize and translate subtitles, or create new ones by transcribing, using pre-trained DNNs, Forced Alignments and Transformers.
- subtoaudio — Subtitle to Audio, generate audio or speech from any subtitle file
- SuperDuperDB — 🔮 Super-power your database with AI 🔮
- supriya — A Python API for SuperCollider
- svc-toolkit — A self-contained singing voice conversion application using the so-vits-svc architecture, with Deep U-Net model for vocal separation feature and easy to use GUI.
- sxmp-mule — no summary
- synctoolbox — Python Package for Efficient, Robust, and Accurate Music Synchronization (SyncToolbox)
- synesthesia-uf — A Python audio image creation tool
- syntheon — Inference parameters of music synthesizers with deep learning
- tacotron — A PyTorch implementation of Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis.
- tacotron-cli — Command-line interface (CLI) to train Tacotron 2 using .wav <=> .TextGrid pairs.
- tacotron2 — Tacotron2 library
- talk-summarizer — Python library to summarize talks
- tarzan — high-level IO for tar based dataset
- teamscritique — The funniest joke in the world
- tensionflow — A Tensorflow framework for working with audio data.
- tensorflow-datasets — tensorflow/datasets is a library of datasets ready to use with TensorFlow.
- TensorFlowASR — Almost State-of-the-art Automatic Speech Recognition using Tensorflow 2
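A "declared dependency" like those above means the project's packaging metadata names librosa as a requirement. As a minimal sketch (standard library only, and limited to whatever is installed in the current environment rather than the full PyPI index), you can list local packages that declare a dependency on librosa with `importlib.metadata`:

```python
from importlib import metadata


def reverse_dependencies(target: str) -> list[str]:
    """Return names of installed distributions that require `target`."""
    dependents = []
    for dist in metadata.distributions():
        for req in dist.requires or []:
            # A requirement string begins with the package name, e.g.
            # "librosa>=0.10" or "librosa ; extra == 'audio'"; strip the
            # environment marker and any version specifier to compare names.
            name = req.split(";")[0].strip()
            for sep in ("==", ">=", "<=", "~=", "!=", ">", "<", "[", " "):
                name = name.split(sep)[0]
            if name.lower() == target.lower():
                dependents.append(dist.metadata["Name"])
                break
    return sorted(set(dependents))


print(reverse_dependencies("librosa"))
```

The output depends on the environment; a listing page like this one is instead built from PyPI's full metadata index, which covers packages whether or not they are installed locally.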