Speechelo is a text-to-speech service described as "Instantly transform any text into a 100% human-sounding voiceover with only 3 clicks! Paste your text. Choose a voice. Generate & download". There are more than 25 alternatives to Speechelo for a variety of platforms, including web-based, SaaS, Windows, iPhone and Mac apps. The best Speechelo alternative is VoiceCraft, which is both free and open source. Other great apps like Speechelo are X to Voice, Kokoro, Voice Engine and NaturalReader.
VoiceCraft is a token-infilling neural codec language model that achieves state-of-the-art performance on both speech editing and zero-shot text-to-speech (TTS) on in-the-wild data, including audiobooks, internet videos, and podcasts.




X to Voice is an open-source tool that analyzes your X/Twitter profile data to generate a custom voice with the ElevenLabs Voice Design API, and integrates with Hedra's video API for a combined audio-visual experience.


Kokoro is an open-weight TTS model with 82 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient.

Voice Engine is a text-to-voice generation platform from OpenAI, which uses text input and a single 15-second audio sample to generate natural-sounding speech that closely resembles the original speaker.


NaturalReader is a professional text-to-speech program that converts any written text into spoken words. The paid versions of NaturalReader have many more features.



Create AI podcasts by uploading websites, PDFs, or documents, selecting customizable hosts and scripts, generating episodes with outline planning, editing outputs, and publishing audio content, which streamlines production for creators without manual recording.




Wondercraft AI is a tool that allows users to easily create studio-quality podcasts using generative AI technology. It eliminates the need for extensive recording and scripting by allowing users to record just a 60-second sample of their voice, which the AI uses to clone their...




Choose from 60+ human-like, emotional voices in various accents, languages, and characters to turn any text into commercial-grade audio, or clone your own voice.


TTSMaker is a free text-to-speech tool that provides speech synthesis services and supports multiple languages (English, French, German, Spanish, Arabic, Chinese, Japanese, Korean, Vietnamese...) and a variety of voice styles. You can use it to read text and e-books aloud, and can...

Amazon Polly uses deep learning technologies to synthesize natural-sounding human speech, so you can convert articles to speech. With dozens of lifelike voices across a broad set of languages, use Amazon Polly to build speech-activated applications.



Convert text into realistic speech or short-form video with synthetic AI voices, over 700 options in 65+ languages, automatic subtitles, and fast web-based tools ideal for social, educational, e-learning, and marketing content while improving accessibility.




Datareel.ai is the next-generation AI video & analytics platform. Trusted by enterprises in Healthcare, Banking, Finance, and Insurance, we deliver hyper-personalized video experiences that boost engagement, optimize communication, and unlock data-driven decisions.



