# Introduction

> Rime's text-to-speech API for real-time voice agents and IVR: low-latency streaming TTS with Coda, Arcana, and Mist models across English, Spanish, French, Portuguese, German, and Japanese.

Rime provides industry-leading text-to-speech (TTS) AI models built for **real-time conversational experiences at scale**.

Our latest flagship model, **Coda**, pairs a sophisticated LLM backbone with a dedicated speech inference engine.

Coda is trained on Rime's proprietary dataset of full-duplex conversational speech between real people, not voice actors, audiobook narrators, or YouTube influencers. That makes it well suited to production voice AI agents, whether you're building intelligent IVRs, multilingual voice agents, or anything in between.

## Hello from Coda in 30 seconds

```bash
curl --request POST \
  --url https://users.rime.ai/v1/rime-tts \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --header 'Accept: audio/mpeg' \
  --output hello.mp3 \
  --data '{
    "speaker": "astra",
    "text": "Hello from Coda.",
    "modelId": "coda",
    "language": "en"
  }'
```

Replace `YOUR_API_KEY` with a token from the [API Tokens page](https://app.rime.ai/tokens). The audio is saved to `hello.mp3`.

The [Rime CLI](/docs/quickstart-cli) wraps this same endpoint behind a `rime tts` command if you'd rather not write the request yourself.
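If you do write the request yourself, the following is a minimal sketch of wrapping the curl call above in reusable shell functions. The `RIME_API_KEY` environment variable and the function names `json_escape`, `rime_payload`, and `rime_tts` are hypothetical names chosen for this example and are not part of Rime's tooling. The endpoint, headers, and body fields are copied from the request above.

```bash
#!/usr/bin/env bash
# Sketch only: RIME_API_KEY and these function names are assumptions for illustration.

# Escape backslashes and double quotes so the text is safe inside a JSON string.
# Minimal on purpose: control characters such as newlines are not handled.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}
  s=${s//\"/\\\"}
  printf '%s' "$s"
}

# Build the same JSON body as the quickstart request.
# Arguments: text [speaker] [modelId] [language]
rime_payload() {
  local text=$1 speaker=${2:-astra} model=${3:-coda} language=${4:-en}
  printf '{"speaker":"%s","text":"%s","modelId":"%s","language":"%s"}' \
    "$speaker" "$(json_escape "$text")" "$model" "$language"
}

# POST the body and save the audio to a file.
# --fail makes curl exit non-zero on an HTTP error status.
rime_tts() {
  local text=$1 outfile=${2:-hello.mp3}
  # Abort with a message if the key is missing.
  : "${RIME_API_KEY:?set RIME_API_KEY to a token from the API Tokens page}"
  curl --fail --silent --show-error --request POST \
    --url https://users.rime.ai/v1/rime-tts \
    --header "Authorization: Bearer ${RIME_API_KEY}" \
    --header 'Content-Type: application/json' \
    --header 'Accept: audio/mpeg' \
    --output "$outfile" \
    --data "$(rime_payload "$text")"
}

# Example call (requires network access and a valid key):
#   rime_tts "Hello from Coda." hello.mp3
```

Reading the key from the environment keeps it out of shell history and scripts, and `--fail` makes an authentication or validation error show up as a non-zero exit code that the calling script can check.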

## What Makes Coda Different

<CardGroup cols={2}>
  <Card title="Real-time conversational performance" icon="bolt">
    Sub-100ms model latency on the GPU engine and sub-200ms end-to-end via the cloud API — fast enough for mid-utterance control and barge-in without awkward silences. See [Latency](/docs/latency) for benchmarks.
  </Card>

  <Card title="Top-rated voice quality" icon="trophy">
    In human-led evaluations, Coda surpasses both prior Rime models and competitor TTS offerings on naturalness, prosody, and artifact-free output.
  </Card>

  <Card title="Multilingual support" icon="globe">
    One model speaks English, Spanish, French, Portuguese, German, and Japanese using a shared expressive voice lineup.
  </Card>

  <Card title="Word-level timestamps" icon="clock">
    Structural metadata enables text-audio alignment, real-time highlighting, better interruption handling, and smarter orchestration.
  </Card>
</CardGroup>

## Rime's TTS Models

Rime now offers a suite of models tailored for different production needs:

* **Coda**
  * `modelId: coda`
  * Our flagship TTS model. LLM backbone with a dedicated speech inference engine, trained on conversational full-duplex data.
  * Surpasses prior Rime models and competitor offerings in human-led voice-quality evaluations. [See the announcement](https://rime.ai/resources/coda-tts).
  * **Sub-100ms model latency on the GPU engine** when self-hosted or running on-prem.
    * Via the cloud API, expect roughly 25–50ms additional network round-trip from most of the continental US when you pick the closest [regional endpoint](/docs/regional-endpoints).
  * Native multilingual support across English, Spanish, French, Portuguese, German, and Japanese.
  * Word-level timestamps for fine-grained text-audio alignment and interruption handling.
* **Arcana v3**
  * `modelId: arcana`
  * The previous-generation flagship: ultra-realistic, expressive voices with **low latency (\~120 ms time to first byte, or TTFB, out of the engine)** and **native multilingual code-switching** across more than 10 languages.
  * **Coda is the recommended successor for all Arcana traffic.**
* **Arcana v2**
  * `modelId: arcanav2`
  * **Ultra-realistic and expressive voices** (including laughter and whispering) with low latency (\~250 ms TTFB out of the engine).
  * Built for high-volume conversational applications.
* **Mist v3**
  * `modelId: mistv3`
  * Major update to the Mist engine: **time to first audio (TTFA) around 37 ms (P50)** on the GPU engine, significantly faster than Coda or Arcana while preserving Mist's pronunciation control and predictability.
* **Mist v2**
  * `modelId: mistv2`
  * Previous-generation Mist model. For new projects, prefer Mist v3.

For full details on each model — including the latency benchmarks, voice counts, and feature matrix — see [Models](/docs/models).

## Migrating from Arcana

<Tip>In human-led evaluations, Coda outperforms Arcana on naturalness, prosody, and artifact-free output. We recommend migrating all existing Arcana traffic to Coda. The API contract is the same: just swap `modelId: arcana` for `modelId: coda`.</Tip>
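One way to make that swap a configuration change rather than a code change is to read the model ID from the environment. The sketch below assumes a hypothetical `RIME_MODEL_ID` variable and hypothetical function names. It also assumes the voice you use (`astra` here, taken from the quickstart request) is available on both models, which is worth confirming against the voice catalog.

```bash
#!/usr/bin/env bash
# Sketch only: RIME_MODEL_ID, model_id, and request_body are illustrative names.

# Resolve the model ID, defaulting to Coda when RIME_MODEL_ID is unset or empty.
model_id() {
  printf '%s' "${RIME_MODEL_ID:-coda}"
}

# Build a request body in which only the modelId field varies.
# The text is inserted verbatim, so it must not contain double quotes or backslashes.
request_body() {
  printf '{"speaker":"astra","text":"%s","modelId":"%s","language":"en"}' \
    "$1" "$(model_id)"
}

# Existing Arcana traffic:  RIME_MODEL_ID=arcana
# After migrating:          RIME_MODEL_ID=coda (or leave it unset)
```

Because every other field stays the same, traffic can be rolled back to Arcana by changing one variable if a regression appears during the migration.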

## Language & Voice Support

Coda supports global voice experiences across English, Spanish, French, Portuguese, German, and Japanese, with a shared voice identity across languages.

Rime exposes a rich set of **demographically diverse voices** you can select via API to match your brand, audience, and use case.

## Flexible Deployment

Rime supports flexible infrastructure options — from the [cloud API](/docs/api-reference) and virtual private cloud to [on-premises deployments](/docs/on-prem/quickstart) — without artificial concurrency limits. Whether your application must run close to users for real-time responsiveness or within secure enterprise environments, Rime fits your architecture.

## Ready to Get Started?

Follow the [quickstart guide](/docs/quickstart-five-minute) to begin generating text-to-speech with Rime's models — including Coda — in under five minutes.

<CardGroup cols={2}>
  <Card title="Streaming TTS" icon="wave-sine" href="/docs/streaming">
    Stream audio over HTTP, WebSockets, or SSE, and learn how to choose among them for your voice agent.
  </Card>

  <Card title="WebSocket API" icon="plug" href="/docs/websockets">
    Persistent connections, incremental text input, word-level timestamps, and interruption handling.
  </Card>

  <Card title="Latency" icon="gauge-simple-max" href="/docs/latency">
    Measured benchmarks per model and how to minimize time-to-first-audio.
  </Card>

  <Card title="Voices" icon="microphone-lines" href="/docs/voices">
    Browse Rime's voice catalog across models and languages.
  </Card>
</CardGroup>