SV2TTS is a three-stage deep learning framework that builds a numerical representation of a voice from a few seconds of audio and uses it to condition a text-to-speech model trained to generalize to new voices.
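To make the conditioning step concrete, here is a minimal sketch in plain Python. It assumes the voice representation is a fixed-length vector that is L2-normalized and appended to every text frame before decoding, which is one common way to condition a synthesizer on a speaker. The names `l2_normalize` and `condition_on_speaker` and the toy dimensions are hypothetical; they are not taken from any SV2TTS codebase.

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit length so only its direction encodes the voice."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def condition_on_speaker(text_frames, speaker_embedding):
    """Append the same normalized speaker embedding to every text frame.

    Assumption: conditioning is done by concatenation at each time step.
    """
    embedding = l2_normalize(speaker_embedding)
    return [frame + embedding for frame in text_frames]

# Toy data: 3 text frames of size 2, and a speaker embedding of size 2.
text_frames = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
speaker_embedding = [3.0, 4.0]

conditioned = condition_on_speaker(text_frames, speaker_embedding)
print(len(conditioned), len(conditioned[0]))  # 3 4
```

In a real system the frames would come from a text encoder and the embedding from a speaker encoder network; the point here is only that the same voice vector is attached to every step of the text sequence.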
With Speakabo, convert your text to audio and download the files as MP3. Speakabo has one of the largest collections of realistic, AI-powered voices (100+ voices in 20+ languages). Get dazzling audio in minutes, with advanced customization through SSML tags.
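As an illustration of what SSML customization can look like, the sketch below builds a small SSML document with Python's standard `xml.etree.ElementTree`. The `speak`, `prosody`, `s`, and `break` elements come from the W3C SSML specification; which tags Speakabo actually accepts is not stated here, so treat the tag choice as an assumption. The helper `build_ssml` is hypothetical, and some engines may also require `version` and `xmlns` attributes on `speak`.

```python
import xml.etree.ElementTree as ET

def build_ssml(sentences, pause_ms=500, rate="medium"):
    """Wrap sentences in SSML with a speaking rate and pauses between them."""
    speak = ET.Element("speak")
    prosody = ET.SubElement(speak, "prosody", {"rate": rate})
    for index, sentence in enumerate(sentences):
        s = ET.SubElement(prosody, "s")
        s.text = sentence
        # Insert a pause between sentences, but not after the last one.
        if index < len(sentences) - 1:
            ET.SubElement(prosody, "break", {"time": f"{pause_ms}ms"})
    return ET.tostring(speak, encoding="unicode")

ssml = build_ssml(["Hello there.", "Welcome back."], pause_ms=300, rate="slow")
print(ssml)
```

Building the markup with an XML library rather than string concatenation keeps special characters such as `&` and `<` in the input text properly escaped.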
Clone voices and convert text to speech. I implement yet another text-to-speech model, dc-tts, introduced in "Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention".
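The "guided attention" in the paper's title refers to a loss term that nudges the text-to-audio attention matrix toward the diagonal, since speech follows the text roughly in order. The sketch below is my reading of that idea in plain Python, not code from the dc-tts repository: the penalty for text position n of N and audio frame t of T is 1 - exp(-(n/N - t/T)^2 / (2 g^2)), and the loss is the mean of attention times penalty. The function names are hypothetical, and the default g = 0.2 is the value I recall from the paper, so verify it before relying on it.

```python
import math

def guided_attention_weights(n_text, n_frames, g=0.2):
    """Penalty matrix: near 0 on the diagonal, approaching 1 far from it."""
    return [
        [1.0 - math.exp(-((n / n_text - t / n_frames) ** 2) / (2 * g * g))
         for t in range(n_frames)]
        for n in range(n_text)
    ]

def guided_attention_loss(attention, g=0.2):
    """Mean of attention * penalty over all (text, frame) positions."""
    n_text, n_frames = len(attention), len(attention[0])
    weights = guided_attention_weights(n_text, n_frames, g)
    total = sum(
        attention[n][t] * weights[n][t]
        for n in range(n_text)
        for t in range(n_frames)
    )
    return total / (n_text * n_frames)

# A perfectly diagonal alignment versus a reversed (anti-diagonal) one.
size = 4
diagonal = [[1.0 if n == t else 0.0 for t in range(size)] for n in range(size)]
reversed_alignment = [[1.0 if n + t == size - 1 else 0.0 for t in range(size)]
                      for n in range(size)]

diagonal_loss = guided_attention_loss(diagonal)
reversed_loss = guided_attention_loss(reversed_alignment)
print(diagonal_loss, reversed_loss)
```

The diagonal alignment incurs no penalty, while the reversed one is penalized, which is the behavior the loss is meant to encourage during training.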