Reverse Dependencies of librosa
The following projects have a declared dependency on librosa:
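For reference, a minimal sketch (not taken from any project below; the project name is hypothetical) of how a Python package declares librosa as a dependency via setuptools, which is what makes it show up in a reverse-dependency listing like this one:

```python
# Hypothetical project metadata; only the install_requires entry matters here.
from setuptools import setup

setup(
    name="example-audio-tool",   # hypothetical project name
    version="0.1.0",
    install_requires=[
        "librosa>=0.10",         # the declared dependency on librosa
    ],
)
```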
- sample-id — Acoustic fingerprinting for Sample Identification
- sappl — Simple Audio Pre-Processing Library for deep learning audio applications
- scisonify — Turn Scientific Data into Sounds & Music
- seacrowd — no summary
- serato-tools — Serato crate, library database, and track metadata (cues, beatgrid, etc.) modification
- series-intro-recognizer — Find the intro of episodes of a series
- SETools — Speech Enhancement Tools Packages
- sgmse — Speech enhancement model using SGMSE
- shazbot — Sound Hierarchy Attribute Zeitgeist Before Oligarchy Take
- shira_audio — audio search/retrieval library
- shrinemaiden — An auxiliary library to help process data for ML/DL purposes
- shttst — Shmart TTS tools.
- sideseeing-tools — A set of tools to load, preprocess and analyze data collected through the MultiSensor Data Collection App
- sidlingvo — Lingvo utils for Google SVL team
- signal-transformation — Performs signal transformations using TensorFlow, PyTorch, or LibROSA
- silentcipher — Artefact-free, robust, high-quality audio watermarking
- silero-vad-nuaazs — Silero VAD
- simo — Smart Home on Steroids!
- simple-talking-anime-avatar — A simple Python package to generate an image or video of a talking avatar from text or audio.
- sinapsis-langchain-readers — Package that provides support for Langchain community data loaders.
- skelly-synchronize — Basic template of a python repository
- sl-sources — Code for Society Library Sources
- slg-nimrod — minimal deep learning framework
- smfiles — Read and edit .sm files and measure the BPM of songs.
- sndid — AI Sound Identification.
- snr-calc — A package to calculate SNR from audio files
- so-vits-svc-fork — A fork of so-vits-svc.
- so-vits-svc-fork-mandarin — A Mandarin translation of a fork of so-vits-svc.
- socialysis — Tool for analyzing and extracting insights from Facebook Messenger conversations
- sociaML — sociaML - the Swiss Army knife for audiovisual and textual video feature extraction.
- somnus — Somnus is keyword detection made easy.
- sonata-asr — SONATA: SOund and Narrative Advanced Transcription Assistant
- sonorus — Named after a spell in the Harry Potter Universe, where it amplifies the sound of a speaker. In Muggles' terminology, this is a repository of modules for audio and speech processing, used for and built on top of machine-learning-based tasks such as speech-to-text.
- sonosco — Framework for training deep automatic speech recognition models.
- sonosthesia-audio-pipeline — Sonosthesia tools for baking audio analysis data
- sonusai — Framework for building deep neural network models for sound, speech, and voice AI
- sota-asr — Speech recognition service based on funasr
- souJpg-diffusers — State-of-the-art diffusion in PyTorch and JAX.
- sound-analyzer-encoder — Sound analyzer and encoder
- Sound-cls — no summary
- soundata — Python library for loading and working with sound datasets.
- soundpy — A research-based framework for exploring sound as well as machine learning in the context of sound.
- soundviewer — Python package for sound visualization
- speaker-diarization — no summary
- speaker-verification-toolkit — A package designed to compose speaker verification systems
- speakerbox — Speaker Annotation for Transcripts using Audio Classification
- SpecAugment — An implementation of "SpecAugment"
- spectral-sound-analysis — A Python package for performing spectral analysis, audio signal processing, and related computations.
- spectro-utils — Add a short description here
- SpectrogramUtils — Makes working with spectrograms easy
- speech-articulatory-coding — Python code for analyzing and synthesizing articulatory features of speech
- speech-collator — A collator for speech datasets with different batching strategies and attribute extraction.
- speech-interface — An interface for neural speech synthesis with PyTorch
- speech-recognition-inference — A Speech-to-Text API.
- speech-text-pipeline — A Python package for speech transcription and speaker diarization with speaker matching functionality.
- speechaugs — Waveform augmentations
- speechline — End-to-end, offline, batch audio categorization, transcription, and segmentation.
- speechmix — Explore different ways to mix speech models (wav2vec2, HuBERT) and NLP models (BART, T5, GPT) together
- speechtoolkit — ML for Speech presents SpeechToolkit, a unified, all-in-one toolkit for TTS, ASR, VC, & other models.
- speechwidgets — A library with Jupyter widgets for speech processing
- SPEEM — Calculate indicators and save them as Excel files.
- spela — spectrogram layers
- spiegelib — Synthesizer Programming with Intelligent Exploration, Generation, and Evaluation Library
- spychhiker — Various Python classes for speech analysis and speech synthesis
- sq_codec — SQCodec: A neural audio codec with one quantizer
- ssak — Multi-lingual Automatic Speech Recognition (ASR) based on Whisper models, with accurate word timestamps, access to language detection confidence, several options for Voice Activity Detection (VAD), and more.
- ssr-eval — This package is written for the evaluation of speech super-resolution algorithms.
- stable-diffusion-videos — Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts.
- stem-mixer — Data augmentation for source separation models
- stopes — Large-Scale Translation Data Mining.
- str2speech — A powerful, Transformer-based text-to-speech (TTS) tool.
- streamer-torch — Official implementation of STREAMER, a self-supervised approach to hierarchical event segmentation and representation learning
- StreamSender — An SDK for sending RTP and RTMP streams
- stt-sample-inspector — Inspect, modify, and add metadata to DeepSpeech (speech-to-text) datasets in CSV format.
- styletts2 — StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models. Original authors: Yinghao Aaron Li, Cong Han, Vinay S. Raghavan, Gavin Mischler, Nima Mesgarani.
- styletts2-fork — Fork of the StyleTTS 2 Python package. StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models. Original authors: Yinghao Aaron Li, Cong Han, Vinay S. Raghavan, Gavin Mischler, Nima Mesgarani, Sidharth Rajaram.
- subaligner — Automatically synchronize and translate subtitles, or create new ones by transcribing, using pre-trained DNNs, Forced Alignments and Transformers.
- subtoaudio — Subtitle to Audio: generate audio or speech from any subtitle file
- supriya — A Python API for SuperCollider
- svc-toolkit — A self-contained singing voice conversion application using the so-vits-svc architecture, with a Deep U-Net model for vocal separation and an easy-to-use GUI.
- sxmp-mule — no summary
- sylber — Python code for "Sylber: Syllabic Embedding Representation of Speech from Raw Audio"
- synapse-ai-tools — A Python package for artificial intelligence development, providing utilities for machine learning, deep learning, data processing, and model deployment.
- synctoolbox — Python Package for Efficient, Robust, and Accurate Music Synchronization (Sync Toolbox)
- synesthesia-uf — A Python audio image creation tool
- synth-mapping-helper — Toolbox for manipulating the JSON format used by the Synth Riders Beatmap Editor via the clipboard
- syntheon — Infer parameters of music synthesizers with deep learning
- synthoor — no summary
- tacotron — A PyTorch implementation of Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis.
- tacotron-cli — Command-line interface (CLI) to train Tacotron 2 using .wav <=> .TextGrid pairs.
- tacotron2 — Tacotron2 library
- talk-summarizer — Python library to summarize talks
- taoverse — A utilities library for model training subnets.
- tarzan — High-level IO for tar-based datasets
- teamscritique — The funniest joke in the world
- tensionflow — A Tensorflow framework for working with audio data.
- tensorflow-datasets — tensorflow/datasets is a library of datasets ready to use with TensorFlow.
- TensorFlowASR — Almost State-of-the-art Automatic Speech Recognition using Tensorflow 2
- TensorflowTTS — TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2
- terra-ai-datasets-framework — Framework to create a dataset to train a neural network.