Mistral unveils Voxtral Transcribe 2, a cheap open source speech model that runs on-device

French company Mistral AI has released Voxtral Transcribe 2, introducing two next-generation speech-to-text models. Both models offer advanced transcription quality, speaker diarization, and ultra-low latency. The product family comprises Voxtral Mini Transcribe V2, designed for batch processing, and Voxtral Realtime, built for live transcription workflows.

For batch tasks, Voxtral Mini Transcribe V2 delivers low word error rates at a competitive price. It provides speaker diarization, context biasing, and word-level timestamps, and supports 13 languages: English, Chinese, Hindi, Spanish, Arabic, French, Portuguese, Russian, German, Japanese, Korean, Italian, and Dutch.
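For readers unfamiliar with the metric, word error rate (WER) is conventionally the number of word substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The sketch below is a generic illustration of that definition, not Mistral's evaluation code; the function name and the simple lowercase-and-split normalization are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count."""
    # Assumption: lowercase + whitespace split is the only normalization.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

For instance, scoring "the cat sat on a mat" against the reference "the cat sat on the mat" counts one substitution over six reference words. Real benchmarks usually apply heavier text normalization (punctuation, numerals) before scoring, so numbers from this sketch are not directly comparable to published figures.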

Voxtral Realtime, by contrast, is purpose-built for live speech transcription. Its latency can be configured as low as 200 milliseconds, which suits voice agents and other time-sensitive systems. It is also released as open weights under the Apache 2.0 license, and at 4 billion parameters it is small enough to run efficiently and privately on edge devices.
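To give a rough sense of why a 4-billion-parameter model is plausible on edge hardware, the sketch below estimates the memory needed for the weights alone at a few common numeric precisions. The article does not say which precision Voxtral Realtime ships in, so the precision table and the helper name are assumptions; activations, caches, and runtime overhead are not counted.

```python
# Assumed bytes per parameter for common precisions (not from the article).
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    params = 4e9  # parameter count stated in the article
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: ~{weight_memory_gb(params, precision):.0f} GB")
```

Under these assumptions the weights take about 8 GB at 16-bit precision and about 2 GB at 4-bit, which is within reach of many laptops and some phones.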

Beyond the models themselves, Mistral AI is launching an audio playground in Mistral Studio. This tool enables users to test out Voxtral Transcribe 2 with real-time diarization and timestamps, supporting rapid evaluation and experimentation.

by Paul

Voxtral offers advanced speech understanding models designed for audio transcription and speech recognition tasks. Available in two sizes—a 24B variant for large-scale production and a 3B variant for local and edge environments—Voxtral leverages AI-powered technology to deliver precise transcription capabilities. Both versions are distributed under the Apache 2.0 license, providing flexibility for various deployment needs.
