Speech to Note is a cutting-edge AI-driven tool that seamlessly converts your spoken words into a concise and informative summary.
Cost / License
- Freemium (Pay once)
- Proprietary
Platforms
- Online



Amphion is described as 'Toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development' and is an audio transcription tool in the AI Tools & Services category. There are more than 50 alternatives to Amphion for a variety of platforms, including Mac, Web-based, Windows, iPhone and SaaS apps. The best Amphion alternative is Vibe Transcribe, which is both free and Open Source. Other great apps like Amphion are Handy STT, FUTO Voice Input, Voxtral and Whisper.



A straightforward macOS application that lets you use different Whisper services (OpenAI API, Runpod Faster Whisper) from your macOS desktop. You can use your own API key, so you incur charges only for the services you actively use.




Batch transcribe audio or video files into text with OpenAI's Whisper AI model. An embedded subtitle editor lets you preview the transcription result segment by segment. All transcription is processed on the local machine, keeping your data private.




Letterly is a mobile app that converts any speech to clear and well-structured text. It's more than just transcription. With the help of AI, you can transform your voice into structured notes, catchy social media posts, readable meeting summaries, formal emails and much more.




Txtplay.ai delivers AI-powered real-time captioning, transcription, and translation for TV and online streaming. It integrates with encoders like PixelPower and Evertz, plus OVPs such as Kaltura and Brightcove. It can be deployed in the cloud, in a hybrid setup, or on-prem, and is accessible and multilingual.

Buzz Captions is an offline audio transcription and translation tool powered by OpenAI's Whisper model. It allows users to import audio and video files to generate transcripts in CSV, SRT, TXT and VTT formats.

An AI executive assistant that records Google Meet, Zoom, Teams, and Webex meetings, transcribes them, and produces instant summaries: decisions, action items, minutes, and smart titles. You can search and tag recordings, jump to quotes, and export or share the results.









CMU Sphinx is a speaker-independent, large-vocabulary continuous speech recognizer released under a BSD-style license. It is also a collection of open source tools and resources that allow researchers and developers to build speech recognition systems.
Windows Speech Recognition makes using a keyboard and mouse optional. You can control your PC with your voice and dictate text instead.