User-friendly GUI for OpenAI's Whisper offering unlimited transcription across multiple languages with various export options, available for Windows, macOS, and Linux.

Whisper is described as 'End-to-end speech recognition model trained on 680,000 hours of multitask, multilingual audio data, offering robust transcription, translation, and language identification' and is an audio transcription tool in the audio & music category. There are more than 100 alternatives to Whisper for a variety of platforms, including Mac, Web-based, Windows, iPhone and iPad apps. The best Whisper alternative is Handy STT, which is both free and open source. Other great apps like Whisper are Vibe Transcribe, Voxtral, FUTO Voice Input and TypeWhisper.

Glimpse is a local-first voice dictation app for Mac. No subscription, no cloud required. It's just fast, accurate transcription powered by models running entirely on-device.




Powered by deep AI, Deciphr timestamps and summarizes your entire podcast transcript for you.

Turn your voice memos into organized text! Just talk and let the AI create lists, blog posts, and more for you!

AI-powered speech-to-text platform that converts audio and video into accurate transcripts, captions, and translations in 100+ languages.




A meeting-centered workspace that aggregates audio, video, documents, and research into projects. Shadow works inside your content to search, edit, and generate multi-format deliverables with verifiable citations—tripling collaboration efficiency, cutting manual organization...




CMU Sphinx is a speaker-independent, large-vocabulary continuous speech recognizer released under a BSD-style license. It is also a collection of open source tools and resources that allows researchers and developers to build speech recognition systems.

TranscribeMe offers a suite of transcription products that deliver the highest-quality human-readable text quickly and at the lowest prices.




Transgate is a powerful speech-to-text web application designed to transform audio and video recordings into accurate text with ease. It offers exceptional transcription quality and an intuitive interface, making it ideal for professionals across various fields, from researchers...



Bring structure to your meetings and save time using speech-to-text notes. Turn the team's conversation into meaningful notes. The easy-to-use workspace provides everything you need for productive meetings: AI speech-to-text transcription, real-time editor, agenda timeboxing and...

VibeVoice is a novel framework designed for generating expressive, long-form, multi-speaker conversational audio, such as podcasts, from text. It addresses significant challenges in traditional Text-to-Speech (TTS) systems, particularly in scalability, speaker consistency, and...


Amphion is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.