HeyGen is described as 'Transforms text into HD browser-based videos with AI voices in 40+ languages, customizable assets, scene merging, unlimited downloads, and easy sharing' and is an AI video generator in the AI tools & services category. There are more than 50 alternatives to HeyGen for a variety of platforms, including web-based, SaaS, iPhone, Mac and Windows apps. The best HeyGen alternative is VoiceCraft, which is both free and open source. Other great apps like HeyGen are Voice Engine, X to Voice, NaturalReader and Mirage Studio.
VoiceCraft is a token-infilling neural codec language model that achieves state-of-the-art performance on both speech editing and zero-shot text-to-speech (TTS) on in-the-wild data, including audiobooks, internet videos, and podcasts.




Voice Engine is a text-to-voice generation platform from OpenAI that uses text input and a single 15-second audio sample to generate natural-sounding speech closely resembling the original speaker.


X to Voice is an open-source tool that analyzes your X/Twitter profile data to generate a custom voice with the ElevenLabs Voice Design API, and integrates with Hedra's video API for an innovative audio-visual experience.


Natural Reader is a professional text-to-speech program that converts any written text into spoken words. The paid versions of Natural Reader have many more features.



Meet the world's first AI model designed to generate UGC. Mirage by Captions generates original actors with natural expressions and body language, completely free from licensing restrictions.




AI clones lip-sync to your voice in real-time calls. Replace your camera on Zoom, Twitch, TikTok and more.


Wavel is a cutting-edge video solution for businesses and teams. Our AI creates subtitles, captions, and dubbing in multiple languages/accents and even generates voice-over and emotions, increasing video reach.




Audiomatic is a web app that seamlessly translates videos into other languages. Our state-of-the-art pipeline delivers contextually accurate dubbed translations that preserve the tone, style, and emotion of the original speakers.



Synthesia.io is software that lets users quickly convert text into video content. It uses artificial intelligence to create studio-quality videos featuring AI avatars and voiceovers in over 140...



Vidnoz AI enables fast text-to-video creation with over 70 lifelike avatars and 100+ realistic voices. It offers pre-designed templates, subtitles, and effects, with no editing skills needed. The user-friendly interface and customizable options support learning, social media, and more.




Amazon Polly uses deep learning technologies to synthesize natural-sounding human speech, so you can convert articles to speech. With dozens of lifelike voices across a broad set of languages, use Amazon Polly to build speech-activated applications.



Free open source AI voice cloning and text to speech synthesis. Clone a voice in 5 seconds to generate arbitrary speech in real-time.