Private, on-device audio transcription for macOS. Your audio never leaves your Mac — no cloud uploads, no subscriptions, no data collection. Real-time ASR with Qwen3-ASR, MLX Whisper & Whisper, plus system-wide dictation, all 100% local.




OmniDictate is described as 'Free, open-source, real-time dictation for Windows. Runs locally (no cloud!), uses AI, and types directly into any application via a user-friendly GUI' and is an audio transcription tool in the audio & music category. There are more than 10 alternatives to OmniDictate for a variety of platforms, including Mac, Web-based, Windows, Linux and Self-Hosted apps. The best OmniDictate alternative is Vibe Transcribe, which is both free and Open Source. Other great apps like OmniDictate are Voxtral, Whisper, TranscribeX and Transcription Pro.

Speech to Note is a cutting-edge AI-driven tool that seamlessly converts your spoken words into a concise and informative summary.



This is Scriberr, a self-hostable AI audio transcription app. Scriberr uses the Whisper models from OpenAI to transcribe audio files offline, on your hardware.




NotchLive is a macOS menu bar app that displays real-time AI-powered captions and translations directly in your MacBook's notch. It uses on-device Whisper AI (via CoreML) for speech recognition and Apple Translation for real-time translation — nothing ever leaves your Mac.


VibeVoice is a novel framework designed for generating expressive, long-form, multi-speaker conversational audio, such as podcasts, from text. It addresses significant challenges in traditional Text-to-Speech (TTS) systems, particularly in scalability, speaker consistency, and...


Gazelle is a joint speech-language model by Tincans — for more details and prompt ideas, see our v0.2 announcement. This is an early research preview -- please temper expectations! Gazelle can take in text and audio as input (interchangeably) and generates text as output.
