Synthesize text to speech. Supports WAV and MP3 formats. The format is automatically selected based on the model: mistv2 and mist output MP3 while arcana and arcanav2 default to WAV but support MP3 as well. Use --format to override.
rime tts TEXT --speaker VOICE --model-id MODEL
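The model-to-format rule described above can be sketched as a small shell helper. This is only an illustration: `default_format` and `output_name` are hypothetical names, not part of the rime CLI, and the mapping is copied from the paragraph above rather than queried from the tool.

```shell
# Hypothetical helper (not part of the rime CLI): echo the default audio
# format that the description above gives for each model.
default_format() {
  case "$1" in
    mist|mistv2)     echo mp3 ;;
    arcana|arcanav2) echo wav ;;
    *) echo "unknown model: $1" >&2; return 1 ;;
  esac
}

# Build an output filename whose extension matches the model default.
output_name() {
  local base="$1" model="$2" fmt
  fmt="$(default_format "$model")" || return 1
  echo "${base}.${fmt}"
}
```

A possible use is `rime tts "Hello world" -s astra -m arcana -o "$(output_name hello arcana)"`, which keeps the file extension in line with the format the model produces by default.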

Required flags

| Flag | Short | Description |
|---|---|---|
| --speaker | -s | Voice speaker to use (e.g., astra, celeste, orion) |
| --model-id | -m | Model ID: arcana, arcanav2, mistv2, or mist |

Optional flags

| Flag | Short | Default | Description |
|---|---|---|---|
| --output | -o | | Output file path. Use - for stdout. If omitted, audio is played directly |
| --play | -p | false | Play audio after synthesis (default behavior when no output is specified) |
| --lang | -l | eng | Language code (e.g., eng, es, fra). Valid codes depend on the model |
| --format | -f | | Audio format: wav or mp3 (overrides the model default) |
| --speed-alpha | | 1 | Speed multiplier; must be greater than 0 |
| --sampling-rate | | | Output sampling rate in Hz. Arcana: 8000, 16000, 22050, 24000, 44100, 48000, 96000. Mist: 4000 to 44100 |
| --api-url | | $RIME_API_URL or https://users.rime.ai/v1/rime-tts | API URL |
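Since the accepted sampling rates differ by model, a pre-flight check can catch a bad value before the request is sent. The sketch below is an assumption-laden illustration: `valid_sampling_rate` is a hypothetical helper, it reads the Mist entry as the inclusive range 4000 to 44100, and it assumes the Arcana list also applies to arcanav2 and the Mist range to mistv2.

```shell
# Hypothetical pre-flight check (not part of the rime CLI).
# Usage: valid_sampling_rate MODEL RATE
valid_sampling_rate() {
  local model="$1" rate="$2"
  # Reject empty or non-integer values.
  case "$rate" in ''|*[!0-9]*) return 1 ;; esac
  case "$model" in
    arcana|arcanav2)
      # Assumed: the Arcana list covers both Arcana models.
      case "$rate" in
        8000|16000|22050|24000|44100|48000|96000) return 0 ;;
        *) return 1 ;;
      esac ;;
    mist|mistv2)
      # Assumed: the Mist range is inclusive and covers both Mist models.
      [ "$rate" -ge 4000 ] && [ "$rate" -le 44100 ] ;;
    *) return 1 ;;
  esac
}
```

For example, `valid_sampling_rate arcana 24000 && rime tts "Hello world" -s astra -m arcana --sampling-rate 24000` only runs the synthesis when the rate is in the documented list.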

Arcana/arcanav2 flags

| Flag | Default | Description |
|---|---|---|
| --temperature | 0.5 | Sampling temperature (0–1) |
| --top-p | 1 | Top-p nucleus sampling (0–1) |
| --max-tokens | 1200 | Max output tokens (200–5000) |
| --repetition-penalty | 1.5 | Repetition penalty (1–2) |
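The ranges in this table lend themselves to a simple range check before invoking the CLI. The following is a minimal sketch, assuming the bounds are inclusive (the table does not say); `in_range` and `check_arcana_params` are hypothetical helpers, and the comparison uses `awk` because the shell's built-in arithmetic handles integers only.

```shell
# in_range VALUE MIN MAX: succeed when VALUE is a plain decimal number
# within [MIN, MAX] (bounds assumed inclusive).
in_range() {
  awk -v v="$1" -v lo="$2" -v hi="$3" 'BEGIN {
    if (v !~ /^[0-9]+(\.[0-9]+)?$/) exit 1
    exit !(v + 0 >= lo + 0 && v + 0 <= hi + 0)
  }'
}

# check_arcana_params TEMPERATURE TOP_P MAX_TOKENS REPETITION_PENALTY
check_arcana_params() {
  in_range "$1" 0 1      || { echo "temperature must be within 0-1" >&2; return 1; }
  in_range "$2" 0 1      || { echo "top-p must be within 0-1" >&2; return 1; }
  in_range "$3" 200 5000 || { echo "max-tokens must be within 200-5000" >&2; return 1; }
  in_range "$4" 1 2      || { echo "repetition-penalty must be within 1-2" >&2; return 1; }
}
```

Calling `check_arcana_params 0.5 1 1200 1.5` with the table's defaults succeeds silently; an out-of-range value prints a message to stderr and returns a non-zero status.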

Mist/mistv2 flags

| Flag | Description |
|---|---|
| --no-text-normalization | Disable text normalization |
| --pause-between-brackets | Insert pause at bracketed markers |
| --phonemize-between-brackets | Phonemize text in brackets |
| --inline-speed-alpha | Comma-separated per-segment speed values |
| --save-oovs | Save out-of-vocabulary words |

Examples

# Play audio directly through speakers
rime tts "Hello world" -s astra -m arcana

# Save to a WAV file
rime tts "Hello world" -s astra -m arcana -o output.wav

# Pipe audio to stdout
rime tts "Hello world" -s astra -m arcana -o - > audio.wav

# Use mistv2 (requires MP3 format)
rime tts "Hello world" -s peak -m mistv2 -f mp3

# Synthesize in Spanish with Arcana
rime tts "Hola mundo" -s astra -m arcana -l es

# JSON output with timing metadata
rime tts "Hello world" -s astra -m arcana -o output.wav --json
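The single-utterance examples above can be extended to a batch run. The sketch below is one possible wrapper, not an official workflow: `synth_lines` and the `RIME_BIN` variable are hypothetical, the `.wav` extension assumes an Arcana model, and the only flags used are `--speaker`, `--model-id`, and `--output` from the tables above.

```shell
# Hypothetical batch wrapper: synthesize each non-empty line of stdin
# into its own numbered file. RIME_BIN may be set to a stub such as
# "echo" for a dry run that only prints the commands' arguments.
RIME_BIN="${RIME_BIN:-rime}"

# Usage: synth_lines SPEAKER MODEL PREFIX < lines.txt
synth_lines() {
  local speaker="$1" model="$2" prefix="$3" n=0 line
  while IFS= read -r line || [ -n "$line" ]; do
    [ -n "$line" ] || continue          # skip blank lines
    n=$((n + 1))
    # stdin is redirected so the command cannot consume the remaining lines.
    "$RIME_BIN" tts "$line" --speaker "$speaker" --model-id "$model" \
      --output "${prefix}_${n}.wav" </dev/null || return 1
  done
}
```

With a file `lines.txt` holding one utterance per line, `synth_lines astra arcana clip < lines.txt` would write `clip_1.wav`, `clip_2.wav`, and so on. For the Mist models the extension would need to change to `.mp3`.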

Supported languages by model

| Model | Languages |
|---|---|
| arcana | eng, ara, fra, ger, heb, hin, jpn, por, sin, spa, tam (and ISO 639-1 equivalents) |
| arcanav2 | eng, spa, ger, fra, hin (and ISO 639-1 equivalents) |
| mistv2 / mist | eng, fra, ger, spa (and ISO 639-1 equivalents) |
The mist and mistv2 models require --format mp3. The CLI returns an error if you omit it.
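The language table above can be turned into a quick lookup before choosing a `--lang` value. This is a sketch only: `supports_lang` is a hypothetical helper, and it knows just the three-letter codes listed in the table. The ISO 639-1 equivalents (such as `es`) are not enumerated on this page, so the helper rejects them even though the CLI accepts them.

```shell
# Hypothetical lookup (not part of the rime CLI), limited to the
# three-letter codes listed in the table above.
# Usage: supports_lang MODEL LANG
supports_lang() {
  local model="$1" lang="$2" list
  case "$model" in
    arcana)      list="eng ara fra ger heb hin jpn por sin spa tam" ;;
    arcanav2)    list="eng spa ger fra hin" ;;
    mist|mistv2) list="eng fra ger spa" ;;
    *) return 1 ;;
  esac
  # Pad with spaces so only whole codes match.
  case " $list " in
    *" $lang "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For instance, `supports_lang arcanav2 jpn || echo "jpn is not listed for arcanav2"` prints the warning, because Japanese appears only in the arcana row.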