# Models

> Pick the right Rime model: Coda (flagship), Arcana, Mist v3, Mist v2.

Models are constantly being trained and fine-tuned based on user and customer feedback. Please check back often, as we push changes frequently.

<Tip>Rime's API is available in multiple regions. Use the [regional endpoint](/docs/regional-endpoints) closest to your deployment for the lowest latency.</Tip>

Rime currently has five models in production: `coda`, `arcanav3`, `arcanav2`, `mistv3`, and `mistv2`. All are available via the cloud API and on-premises.

## Which model should I use?

| Pick this | When                                                                                                                                                                            |
| :-------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `coda`    | **Default for most new apps.** Rime's flagship — top-rated voice quality in human evaluations, sub-100ms model latency, and the recommended replacement for all Arcana traffic. |
| `arcana`  | You need multilingual code-switching across languages that Coda doesn't yet support. Otherwise, prefer Coda.                                                                    |
| `mistv3`  | You need the fastest TTFA (\~37 ms P50).                                                                                                                                        |
| `mistv2`  | You need [custom pronunciation control](/docs/custom-pronunciation) for brand names and uncommon words (not yet on Mist v3). For other use cases, prefer Mist v3.               |

For benchmarked latency and throughput numbers, see [Latency](/docs/latency#real-time-performance-benchmarks).
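
The table above can be read as a small decision procedure. The sketch below is illustrative only: `pick_model` and its flag names are hypothetical, and the order in which the checks run is an assumption, since the table does not rank the requirements against each other.

```python
def pick_model(
    needs_custom_pronunciation: bool = False,
    needs_fastest_ttfa: bool = False,
    needs_language_outside_coda: bool = False,
) -> str:
    """Return a modelId string following the selection table.

    Hypothetical helper: the priority order of the checks is an assumption.
    """
    if needs_custom_pronunciation:
        # Custom pronunciation control is not yet on Mist v3.
        return "mistv2"
    if needs_fastest_ttfa:
        return "mistv3"
    if needs_language_outside_coda:
        # Code-switching across languages Coda doesn't yet support.
        return "arcana"
    # Default for most new apps.
    return "coda"


print(pick_model())                         # no special requirements
print(pick_model(needs_fastest_ttfa=True))  # latency-sensitive app
```

If several requirements apply at once, reorder the checks to match what matters most for your application.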

## Feature matrix

| Attribute                                               | Coda | Arcana | Mist |
| :------------------------------------------------------ | :--: | :----: | :--: |
| Number of voices                                        |  184 |   94   |  94  |
| Multilingual                                            |   ✅  |    ✅   |   ❌  |
| [Text normalization](/docs/text-normalization)          |   ✅  |    ✅   |   ✅  |
| [`spell()` function](/docs/spell)                       |   ✅  |    ✅   |   ✅  |
| [Speed adjustment](/docs/speed)                         |   ✅  |    ✅   |   ✅  |
| [Custom pronunciation (Speech QA)](/platform/speech-qa) |   ❌  |    ❌   |   ✅  |
| [Custom pauses](/docs/custom-pauses)                    |   ❌  |    ❌   |   ✅  |

## Coda

**Coda**, released May 2026, is Rime's new flagship TTS model and the successor to Arcana. It pairs a sophisticated LLM backbone with a dedicated speech inference engine, trained on the conversational full-duplex data preferred by production voice AI deployments.

* In human-led voice-quality evaluations, Coda surpasses both prior Rime models and competitor TTS offerings on naturalness, prosody, and artifact-free output.
* **Sub-100ms model latency on the GPU engine** when self-hosted or on-prem.
  * Cloud API users add roughly 25–50ms network round-trip from most of the continental US when routed to the closest [regional endpoint](/docs/regional-endpoints).
* Multilingual support for English, Spanish, French, Portuguese, German, and Japanese using a shared voice lineup
* Word-level timestamps for text-audio alignment and interruption handling
* Supports the [`spell()` function](/docs/spell) for spelling out sequences letter by letter or number by number
* Available via `modelId: coda` through Rime's API endpoints
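
As a minimal sketch of selecting Coda, the snippet below builds a JSON request body with `modelId` set to `coda`. Only `modelId` comes from this page. The `text` and `speaker` field names, the `example_voice` placeholder, and the `build_tts_payload` helper are assumptions for illustration, so check the API reference for the actual request schema.

```python
import json


def build_tts_payload(text: str, speaker: str, model_id: str = "coda") -> dict:
    """Build a request body for a TTS call (hypothetical helper).

    `text` and `speaker` are assumed field names; `modelId` is from this page.
    """
    return {"text": text, "speaker": speaker, "modelId": model_id}


# "example_voice" is a placeholder, not a real voice name.
payload = build_tts_payload("Hello from Coda.", "example_voice")
print(json.dumps(payload, indent=2))
```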

## Arcana

<Tip>Coda is meaningfully better than Arcana across naturalness, prosody, and artifact-free output. We recommend migrating all existing Arcana traffic to Coda: replace `modelId: arcana` with `modelId: coda` in your requests.</Tip>

**Arcana**, released April 2025, is Rime's previous flagship TTS model — known for naturalness and emotional depth in synthesized speech.

* Highly expressive, natural-sounding speech with emotional nuance
* Fine-grained control over prosody, pacing, and tone
* Supports a wide range of vocal demographics, including different ages, accents, and cultural backgrounds
* Enhanced realism for dynamic, conversational, and character-driven use cases
* Supports the [`spell()` function](/docs/spell) for spelling out sequences letter by letter or number by number
* Available via `modelId: arcana` through Rime's API endpoints
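
A migration from Arcana to Coda might look like the sketch below. It changes `modelId` and also drops `temperature`, `top_p`, and `repetition_penalty`, which this page describes as Arcana-only. The `migrate_arcana_to_coda` helper is hypothetical, the request is assumed to be a flat dictionary, and dropping the extra controls is a cautious choice rather than documented behavior.

```python
# Controls this page lists as Arcana-only.
ARCANA_ONLY_CONTROLS = ("temperature", "top_p", "repetition_penalty")


def migrate_arcana_to_coda(payload: dict) -> dict:
    """Return a copy of `payload` that targets Coda instead of Arcana.

    Hypothetical helper: payloads for other models are copied unchanged.
    """
    if payload.get("modelId") != "arcana":
        return dict(payload)
    migrated = {k: v for k, v in payload.items() if k not in ARCANA_ONLY_CONTROLS}
    migrated["modelId"] = "coda"
    return migrated


old = {"text": "Hi there.", "modelId": "arcana", "temperature": 0.5}
print(migrate_arcana_to_coda(old))
```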

## Mist v3

**Mist v3**, released March 2026, is a major update to the engine powering our classic Mist model.

* Typical TTFB is now well below 100ms — a significant performance improvement over previous versions, achieved without sacrificing the quality and predictability of Mist
* `modelId: mistv3`
* Our most popular Mist speakers are all available — see the [full voice list](/api-reference/data/voices)
* [`speedAlpha`](/docs/speed) behavior is reversed compared to Mist and Mist v2, bringing it to parity with Arcana: **higher values produce faster speech**
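
Because the direction of `speedAlpha` differs between models, code that targets more than one model may want to record the direction explicitly. The sketch below encodes only what this page states: higher values are faster on Mist v3 and Arcana, and the behavior is reversed on Mist and Mist v2. The `faster_of` helper and the lookup table are hypothetical. Coda is left out because its direction is not described here.

```python
# True means a higher speedAlpha produces faster speech for that model.
HIGHER_IS_FASTER = {
    "mist": False,
    "mistv2": False,
    "mistv3": True,
    "arcana": True,
}


def faster_of(model_id: str, a: float, b: float) -> float:
    """Return whichever of two speedAlpha values yields faster speech."""
    if model_id not in HIGHER_IS_FASTER:
        raise ValueError(f"speedAlpha direction not recorded for {model_id!r}")
    return max(a, b) if HIGHER_IS_FASTER[model_id] else min(a, b)


print(faster_of("mistv3", 0.8, 1.2))
print(faster_of("mistv2", 0.8, 1.2))
```

A check like this is mainly useful when moving traffic from Mist v2 to Mist v3, where a value that previously sped speech up will now slow it down.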

## Mist v2

**Mist v2**, released February 2025, has the following features:

* Multilingual support for English and Spanish, with more languages coming soon
* More realistic speech with natural and contextual nuances
* Advanced pronunciation control
* Ultra-fast on-prem latency of \~70ms, perfect for real-time applications
* More accents, demographics, and speaking styles

## Mist (legacy)

**Mist**, released April 2023, is a Rime TTS engine capable of synthesizing conversational speech. To use this family of models, set the `modelId` parameter on Rime's TTS endpoints to `mistv2` or `mist`. As of February 2025, `modelId` defaults to `mist` when unspecified.

**Model v1 was released in April 2022 and has been deprecated.**

## Additional controls (Arcana only)

<Note>The controls in this section apply to Arcana only. Coda, Mist v3, and Mist v2 do not expose `temperature`, `top_p`, or `repetition_penalty`.</Note>

Arcana also supports several additional controls due to its LLM backbone. We recommend leaving these on the default values.

* `temperature`: Controls the randomness of the generated speech.
  * **Low** (0): Produces more predictable and focused speech.
  * **High** (1+): Introduces variability in prosody and expression, potentially leading to more dynamic speech patterns.
* `repetition_penalty`: Discourages the model from repeating the same sounds.
  * **Low** (`<1`): May result in repetitive speech patterns.
  * **High** (`>1`): Encourages variation, leading to more natural-sounding speech and realistic laughter.
* `top_p`: Determines the diversity of choices by limiting the selection to a subset of probable sounds.
  * **Low** (0): Restricts the model to the most probable sounds, resulting in more monotone speech.
  * **High** (1): Allows for a broader range of sound choices, enhancing the naturalness and variability of speech.

See the [Arcana API reference pages](/api-reference/arcana/streaming-mp3) for more details.
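
One way to keep these controls from leaking into requests for other models is a small guard, sketched below. The `with_arcana_controls` helper is hypothetical, and passing the controls as top-level request fields is an assumption. Arguments left as `None` are omitted from the request, which matches the recommendation to keep the defaults.

```python
from typing import Optional


def with_arcana_controls(
    payload: dict,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    repetition_penalty: Optional[float] = None,
) -> dict:
    """Return a copy of `payload` with any explicitly set Arcana controls added.

    Raises ValueError if a control is set on a non-Arcana request.
    """
    controls = {
        "temperature": temperature,
        "top_p": top_p,
        "repetition_penalty": repetition_penalty,
    }
    # Keep only the controls the caller actually set.
    chosen = {name: value for name, value in controls.items() if value is not None}
    if chosen and payload.get("modelId") != "arcana":
        raise ValueError(f"{sorted(chosen)} apply to Arcana only")
    return {**payload, **chosen}


request = with_arcana_controls({"text": "Hello.", "modelId": "arcana"}, temperature=0.7)
print(request)
```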
