
xAI launches Custom Voices for instant voice cloning in TTS and agents
xAI has introduced Custom Voices, which lets users create a voice clone by recording about one minute of natural speech in the xAI console and then use it across the Grok Text-to-Speech and Voice Agent APIs. The process takes under two minutes and covers verification, processing, and delivery of a production-ready model. Once generated, a custom voice can be used anywhere xAI’s built-in voices are supported.
To address voice security concerns, xAI uses a two-stage verification process. Users first read a passphrase aloud, which is transcribed in real time to confirm consent and presence. The system then compares speaker data from the passphrase and the full recording to confirm that both belong to the same person, a check meant to prevent cloning from pre-existing recordings or unauthorized samples.
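xAI has not published how these two checks are implemented. As a rough illustration only, the two stages could be sketched as a transcript match followed by a speaker-similarity comparison. Everything in the sketch below is an assumption made for illustration: the function names, the use of cosine similarity over speaker embeddings, and the 0.75 threshold are hypothetical and are not part of any xAI API.

```python
import math
import re
from typing import Sequence


def normalize(text: str) -> list[str]:
    """Lowercase and drop punctuation so transcripts compare word by word."""
    return re.findall(r"[a-z0-9']+", text.lower())


def passphrase_matches(expected: str, transcript: str) -> bool:
    """Stage 1 (hypothetical): the live transcript must match the passphrase."""
    return normalize(expected) == normalize(transcript)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify_voice_sample(
    expected_passphrase: str,
    transcript: str,
    passphrase_embedding: Sequence[float],
    recording_embedding: Sequence[float],
    threshold: float = 0.75,  # assumed value, chosen only for illustration
) -> bool:
    """Run both stages; reject as soon as either one fails."""
    if not passphrase_matches(expected_passphrase, transcript):
        return False
    # Stage 2 (hypothetical): both clips must appear to come from one speaker.
    similarity = cosine_similarity(passphrase_embedding, recording_embedding)
    return similarity >= threshold
```

In this sketch the speaker embeddings are plain lists of floats supplied by the caller. A real system would derive them from audio with a speaker-recognition model, and would likely add liveness checks beyond a transcript match.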
Custom voices support speech tags, multilingual output, REST API access, and WebSocket streaming. Use cases include creator narration, brand voice agents, accessibility, gaming, and audiobook production. xAI also introduced Voice Library, a console section for managing and previewing built-in and custom voices. The library offers more than 80 built-in voices across 28 languages, and there is no extra charge for using custom voices with xAI’s APIs.
