
QwenVoice

QwenVoice is a native SwiftUI macOS application that brings state-of-the-art text-to-speech to Apple Silicon Macs with no Python install, no terminal, and no dependencies required of the user — just download and run.


Cost / License

  • Free
  • Open Source

Platforms

  • Mac

Features

Properties

  1.  Privacy focused

Features

  1.  No registration required
  2.  AI Voice Cloning
  3.  No Tracking
  4.  Command line interface
  5.  Works Offline
  6.  Ad-free
  7.  Text to Speech
  8.  Waveform
  9.  Voice synthesis
  10.  Support for Keyboard Shortcuts
  11.  Apple Silicon support

QwenVoice information

  • Developed by

    PowerBeef
  • Licensing

    Open Source and Free product.
  • Alternatives

    48 alternatives listed
  • Supported Languages

    • English

AlternativeTo Categories

  • AI Tools & Services
  • OS & Utilities
  • System & Hardware

GitHub repository

  •  29 Stars
  •  0 Forks
  •  1 Open Issues
QwenVoice was added to AlternativeTo by Paul.

What is QwenVoice?

QwenVoice runs the Qwen3-TTS model family entirely offline via Apple's MLX framework, delivering fast, low-latency, low-heat inference on M-series chips. The SwiftUI front end talks to a Python backend over JSON-RPC 2.0 on stdin/stdout; the backend runs as a background process that the app starts and manages for the user.
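
To make the stdin/stdout transport more concrete, here is a minimal Python sketch of how JSON-RPC 2.0 messages could be framed on a pipe. It assumes one JSON object per line (newline-delimited framing), which the listing does not confirm, and the method name `generate` and its parameters are hypothetical placeholders rather than QwenVoice's actual API.

```python
import json
from itertools import count

# Monotonically increasing ids let the caller match responses to requests.
_request_ids = count(1)


def encode_request(method, params):
    """Serialise one JSON-RPC 2.0 request as a single newline-terminated line.

    Newline-delimited framing is an assumption made for this sketch.
    """
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
        "params": params,
    }
    return json.dumps(message) + "\n"


def decode_response(line):
    """Parse one response line and return (id, result).

    Raises RuntimeError if the peer replied with a JSON-RPC error object.
    """
    message = json.loads(line)
    if message.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 message")
    if "error" in message:
        error = message["error"]
        raise RuntimeError(f"JSON-RPC error {error['code']}: {error['message']}")
    return message["id"], message["result"]


if __name__ == "__main__":
    # "generate" and its params are hypothetical, for illustration only.
    print(encode_request("generate", {"text": "Hello from the pipe"}), end="")
```

A front end would write each encoded line to the child process's stdin and read one line back from its stdout per request; the `id` field is what ties a reply to the request that caused it.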

Features:

Custom Voice & Voice Design

Generate speech using four built-in English speakers (Ryan, Aiden, Serena, Vivian) or create entirely new voice identities from a text description (e.g. "deep narrator", "excited child"). Both modes are controlled entirely through natural language instructions; there are no sliders or SSML tags for voice style. The underlying discrete multi-codebook language model natively interprets prompts to modulate breath, pitch, resonance, and emotional delivery.

Voice Cloning

Clone any voice from a short 5–10 second audio sample (WAV, MP3, AIFF, M4A, FLAC, or OGG). Optionally provide a transcript of the reference audio to improve accuracy.
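
As an illustration of the constraints above, the following Python sketch pre-checks a reference clip before it is used for cloning. It is not taken from QwenVoice: the function names are hypothetical, treating 5–10 seconds as a hard window is an assumption, and the duration check covers only WAV because that is the one listed format Python's standard library can read.

```python
import wave
from pathlib import Path

# Extensions for the formats named in the listing.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".aiff", ".m4a", ".flac", ".ogg"}


def is_supported_format(path):
    """Return True if the file extension is one of the listed formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def wav_duration_seconds(path):
    """Duration of a WAV file, computed as frame count / frame rate."""
    with wave.open(str(path), "rb") as clip:
        return clip.getnframes() / clip.getframerate()


def check_reference_clip(path, min_seconds=5.0, max_seconds=10.0):
    """Return a list of problems found; an empty list means no problem was detected.

    The 5-10 s window mirrors the listing's wording; enforcing it strictly
    is an assumption of this sketch.
    """
    problems = []
    suffix = Path(path).suffix.lower()
    if not is_supported_format(path):
        problems.append(f"unsupported extension: {suffix or '(none)'}")
    elif suffix == ".wav":
        # Other formats are accepted without a duration check here,
        # because decoding them needs a third-party library.
        duration = wav_duration_seconds(path)
        if not min_seconds <= duration <= max_seconds:
            problems.append(
                f"duration {duration:.1f}s is outside {min_seconds:g}-{max_seconds:g}s"
            )
    return problems
```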

Model Manager

Download and manage MLX models directly from Hugging Face inside the app; no browser or command line is needed. Downloads go through a native URLSession-based downloader with real-time progress tracking.

Generation History

Every generation is persisted to a local SQLite database (via GRDB). The History view lists generations sorted by date (newest first) and supports text search filtering. Each entry can be played back instantly, revealed in Finder, or deleted.
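
The history behaviour described above (newest first, filtered by text search) maps onto a simple SQLite query. The sketch below uses Python's built-in `sqlite3` module instead of GRDB, and the table and column names are hypothetical; the app's real schema is not given in the listing.

```python
import sqlite3


def open_history(path=":memory:"):
    """Open (or create) a history database with a hypothetical schema."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS generation (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               text TEXT NOT NULL,
               audio_path TEXT NOT NULL,
               created_at TEXT NOT NULL  -- ISO 8601, so string order is time order
           )"""
    )
    return db


def add_generation(db, text, audio_path, created_at):
    """Persist one generation record."""
    db.execute(
        "INSERT INTO generation (text, audio_path, created_at) VALUES (?, ?, ?)",
        (text, audio_path, created_at),
    )
    db.commit()


def search_history(db, query=""):
    """Return (text, audio_path, created_at) rows, newest first.

    An empty query matches everything. '%' and '_' in the query act as
    LIKE wildcards in this simplified version.
    """
    rows = db.execute(
        "SELECT text, audio_path, created_at FROM generation "
        "WHERE text LIKE ? ORDER BY created_at DESC, id DESC",
        (f"%{query}%",),
    )
    return rows.fetchall()
```

Storing timestamps as ISO 8601 strings keeps the `ORDER BY` correct without date parsing, and the bound parameter keeps user-typed search text out of the SQL string itself.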

Batch Generation

Submit multiple text entries for sequential generation in a single session.

Additional features:

  • Temperature & max-token controls — Fine-tune the model's sampling behaviour from the UI
  • Waveform visualisation — Live waveform rendered for generated audio clips (via AVFoundation + vDSP)
  • Reveal in Finder — Jump directly to any generated file (Cmd+Shift+R)
  • Keyboard shortcuts — Cmd+Return to generate, Space to play/pause, Cmd+. to stop, Cmd+Shift+O to open the output folder
  • CLI companion — A standalone Python CLI in cli/ for headless or scripted use
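
For readers unfamiliar with the temperature control in the list above, this Python sketch shows the standard temperature-scaled softmax used in language-model sampling. It illustrates the general technique only; how QwenVoice or Qwen3-TTS applies the setting internally is not stated in the listing, and the function name is hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores into probabilities, sharpened or flattened by temperature.

    Lower temperatures concentrate probability on the highest-scoring option;
    higher temperatures spread it more evenly.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(score - peak) for score in scaled]
    total = sum(exps)
    return [value / total for value in exps]


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.0]
    for t in (0.5, 1.0, 2.0):
        probs = softmax_with_temperature(scores, t)
        print(t, [round(p, 3) for p in probs])
```

In practice a low temperature tends to give more consistent, predictable output, while a high one gives more varied output at the cost of stability; a max-token limit simply caps how long a single generation may run.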

Official Links