Websockets

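import asyncio
import websockets

class RimeClient:
    def __init__(self, speaker, api_key):
        # All synthesis arguments are passed as query parameters on the connection URL.
        self.url = f"wss://users-ws.rime.ai/ws?speaker={speaker}&modelId=mistv3&audioFormat=mp3"
        self.auth_headers = {
            "Authorization": f"Bearer {api_key}"
        }
        self.audio_data = b''

    async def send_tokens(self, websocket, message):
        # Each token is sent as its own bare-text message.
        for token in message:
            await websocket.send(token)

    async def handle_audio(self, websocket):
        # Accumulate audio bytes until the server closes the connection after <EOS>.
        while True:
            try:
                audio = await websocket.recv()
            except websockets.exceptions.ConnectionClosedOK:
                break
            self.audio_data += audio

    async def run(self, message):
        # `additional_headers` is the keyword in recent `websockets` releases;
        # older releases name it `extra_headers`.
        async with websockets.connect(self.url, additional_headers=self.auth_headers) as websocket:
            # Send and receive concurrently so audio can arrive while tokens are still going out.
            await asyncio.gather(
                self.send_tokens(websocket, message),
                self.handle_audio(websocket),
            )

    def save_audio(self, file_path):
        with open(file_path, 'wb') as f:
            f.write(self.audio_data)

# Tokens are streamed one at a time; the final "<EOS>" flushes the buffer
# and asks the server to close the connection.
message = [
    "This ",
    "is ",
    "a ",
    "sentence, ",
    "that ",
    "will ",
    "produce ",
    "audio.",
    "<EOS>",
]

client = RimeClient("cove", api_key="YOUR_API_KEY")
asyncio.run(client.run(message))

client.save_audio("output.mp3")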

Overview

Rime’s websocket implementation accepts bare text and responds with audio bytes in the selected format. All synthesis arguments are provided as query parameters when establishing the connection.

The websocket API buffers input until it receives one of the following punctuation characters: `.`, `,`, `?`, `!`. This matters most for the first messages sent to the API, because synthesis won’t begin until there are enough tokens to generate audio with natural prosody. After the first synthesis of a given utterance, enough time has typically elapsed that subsequent audio contains multiple clauses, and the buffering becomes largely invisible.
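To make the buffering rule concrete, here is a small local model of it. This is an illustration only, not Rime's server code: `simulate_buffering` and `PUNCTUATION` are hypothetical names, and the sketch assumes the buffer is released as soon as the accumulated text ends in one of the four delimiters.

```python
# Illustrative model only: this is NOT Rime's server implementation.
PUNCTUATION = (".", ",", "?", "!")

def simulate_buffering(tokens):
    """Group tokens into the chunks a punctuation-delimited buffer would release."""
    chunks, buffer = [], ""
    for token in tokens:
        buffer += token
        # Release the buffer once the text ends in a delimiter (ignoring trailing spaces).
        if buffer.rstrip().endswith(PUNCTUATION):
            chunks.append(buffer)
            buffer = ""
    return chunks, buffer  # released chunks, plus any text still waiting

tokens = ["This ", "is ", "a ", "sentence, ", "that ", "will ", "produce ", "audio."]
chunks, pending = simulate_buffering(tokens)
print(chunks)   # ['This is a sentence, ', 'that will produce audio.']
print(pending)  # prints an empty line: nothing is left in the buffer
```

Under this model, the first audio cannot start before the token `"sentence, "` arrives, which is why short opening clauses tend to reduce perceived latency.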

Messages

Send

Your client sends bare (non-serialized) text to the websocket API. For example:

This will be converted to audio via websockets

Receive

The messages your client will receive will be raw audio bytes in the audio format specified at connection time.
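A receive loop can be sketched as follows. `collect_audio` is a hypothetical helper, and `FakeSocket` is a stand-in so the snippet runs without a network connection. With the `websockets` library, a live connection can be iterated with `async for` in the same way, and iteration ends when the connection closes normally.

```python
import asyncio

async def collect_audio(websocket):
    """Concatenate every binary frame received until the connection closes."""
    audio = bytearray()
    # `websockets` connections support `async for`, which stops on a normal close.
    async for frame in websocket:
        if isinstance(frame, bytes):  # defensively skip any non-binary frames
            audio.extend(frame)
    return bytes(audio)

class FakeSocket:
    """Hypothetical stand-in for a live connection, used only for this demo."""
    def __init__(self, frames):
        self._frames = list(frames)

    def __aiter__(self):
        return self

    async def __anext__(self):
        if not self._frames:
            raise StopAsyncIteration
        return self._frames.pop(0)

demo_audio = asyncio.run(collect_audio(FakeSocket([b"\x01\x02", b"\x03"])))
print(len(demo_audio))  # 3
```

Because the bytes arrive in the format chosen at connection time, the collected result can be written to a file as-is for `mp3`, while `pcm` and `mulaw` are headerless and usually need a container or a player that is told the sampling rate.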

Commands

`<CLEAR>`

This clears the current buffer. Use it when the speaker is interrupted and the buffered text should no longer be synthesized.

`<FLUSH>`

This forces whatever buffer exists, if any, to be synthesized, and the generated audio to be sent over.

`<EOS>`

This forces whatever buffer exists, if any, to be synthesized, and tells the server to close the connection after sending the generated audio.
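The three commands can be combined as in the sketch below. It assumes each command is sent as its own text message, the way `"<EOS>"` is sent in the sample client. `speak`, `interrupt`, and `RecordingSocket` are hypothetical names; the recording socket only captures outgoing frames so the example runs offline.

```python
import asyncio

CLEAR, FLUSH, EOS = "<CLEAR>", "<FLUSH>", "<EOS>"

async def speak(websocket, tokens, *, final=False):
    """Send text tokens, then force synthesis of whatever is still buffered."""
    for token in tokens:
        await websocket.send(token)
    # <EOS> also asks the server to close the connection; <FLUSH> keeps it open.
    await websocket.send(EOS if final else FLUSH)

async def interrupt(websocket, replacement_tokens):
    """Discard buffered text (for example on barge-in) and speak something else."""
    await websocket.send(CLEAR)
    await speak(websocket, replacement_tokens)

class RecordingSocket:
    """Hypothetical stand-in that records frames instead of using the network."""
    def __init__(self):
        self.sent = []

    async def send(self, frame):
        self.sent.append(frame)

async def demo():
    ws = RecordingSocket()
    await ws.send("Let me explain ")  # partial text, still buffered server-side
    await interrupt(ws, ["Sure, ", "go ahead."])
    await speak(ws, ["Goodbye."], final=True)
    return ws.sent

sent_frames = asyncio.run(demo())
print(sent_frames)
# ['Let me explain ', '<CLEAR>', 'Sure, ', 'go ahead.', '<FLUSH>', 'Goodbye.', '<EOS>']
```

Against a real connection, the same helpers would take the object returned by `websockets.connect(...)` in place of `RecordingSocket`.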

Variable Parameters

speaker (string, required)

Must be one of the voices listed in our documentation.

modelId (string)

Set to mistv3.

audioFormat (string)

One of pcm, mulaw, or mp3.

lang (string, default: "eng")

If provided, the language must match the language spoken by the provided speaker. This can be checked in our voices documentation.

pauseBetweenBrackets (bool, default: "false")

When set to true, a number enclosed in angle brackets inserts a pause between the surrounding words. The number specifies the pause duration in milliseconds. Example: `Hi. <200> I'd love to have a conversation with you.` adds a 200ms pause. Learn more about custom pauses.
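As a small illustration of the pause syntax, the helper below assembles text with `<N>` markers from a mixed list of strings and integers. `with_pauses` is a hypothetical convenience function, not part of any Rime SDK, and the markers only take effect when the connection was opened with `pauseBetweenBrackets=true`.

```python
def with_pauses(parts):
    """Join text parts, turning integers into <N> pause markers (milliseconds)."""
    rendered = []
    for part in parts:
        if isinstance(part, int):
            if part < 0:
                raise ValueError("pause duration must be non-negative")
            rendered.append(f"<{part}>")
        else:
            rendered.append(part.strip())
    return " ".join(rendered)

text = with_pauses(["Hi.", 200, "I'd love to have a conversation with you."])
print(text)  # Hi. <200> I'd love to have a conversation with you.
```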

phonemizeBetweenBrackets (bool, default: "false")

When set to true, you can specify the phonemes for a word enclosed in curly brackets. Example: `{h'El.o} World` will pronounce “Hello” as expected. Learn more about custom pronunciation.

samplingRate (int, default: 22050)

The value, if provided, must be between 4000 and 44100.

inlineSpeedAlpha (string)

Comma-separated list of speed values applied, in order, to the words in square brackets. Values above 1.0 speed up speech; values below 1.0 slow it down. Example: for the text “This is [slow] and [fast]”, use “0.5, 3” to make “slow” slower and “fast” faster.
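The pairing between bracketed spans and speed values can be kept in sync by generating both from one structure. The sketch below assumes the n-th value applies to the n-th bracketed span, as the example above suggests; `with_inline_speeds` is a hypothetical helper.

```python
def with_inline_speeds(parts):
    """Build (text, inlineSpeedAlpha) from plain strings and (words, speed) tuples."""
    text_parts, speeds = [], []
    for part in parts:
        if isinstance(part, tuple):
            words, speed = part
            text_parts.append(f"[{words}]")
            speeds.append(f"{speed:g}")  # 0.5 -> "0.5", 3 -> "3"
        else:
            text_parts.append(part)
    return " ".join(text_parts), ", ".join(speeds)

text, inline_speed_alpha = with_inline_speeds(
    ["This is", ("slow", 0.5), "and", ("fast", 3)]
)
print(text)                # This is [slow] and [fast]
print(inline_speed_alpha)  # 0.5, 3
```

The second return value is what would be passed as the `inlineSpeedAlpha` query parameter, URL-encoded, when opening the connection.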

speedAlpha (float, default: "1.0")

Adjusts the speed of speech. Values above 1.0 are faster and values below 1.0 are slower.

segment (string, default: "bySentence")

Controls how text is segmented for synthesis. Available options:

“immediate” - Synthesizes text immediately without waiting for complete sentences
“never” - Never segments the text; waits for an explicit `<FLUSH>` or `<EOS>`
“bySentence” (default) - Waits for complete sentences before synthesis
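Since every parameter above travels in the query string, a connection URL can be assembled and checked before connecting. The following is a minimal sketch: `build_url` is a hypothetical helper, it validates only the constraints stated on this page, and it assumes booleans are serialized as lowercase `true`/`false`.

```python
from urllib.parse import urlencode

BASE_URL = "wss://users-ws.rime.ai/ws"
AUDIO_FORMATS = {"pcm", "mulaw", "mp3"}
SEGMENT_MODES = {"immediate", "never", "bySentence"}

def build_url(speaker, *, model_id="mistv3", audio_format="mp3", **options):
    """Return a connection URL, validating the values this page constrains."""
    if audio_format not in AUDIO_FORMATS:
        raise ValueError(f"audioFormat must be one of {sorted(AUDIO_FORMATS)}")
    rate = options.get("samplingRate")
    if rate is not None and not 4000 <= rate <= 44100:
        raise ValueError("samplingRate must be between 4000 and 44100")
    segment = options.get("segment")
    if segment is not None and segment not in SEGMENT_MODES:
        raise ValueError(f"segment must be one of {sorted(SEGMENT_MODES)}")
    params = {"speaker": speaker, "modelId": model_id, "audioFormat": audio_format}
    for name, value in options.items():
        # Assumption: booleans are written as lowercase "true"/"false".
        params[name] = str(value).lower() if isinstance(value, bool) else value
    return f"{BASE_URL}?{urlencode(params)}"

url = build_url("cove", samplingRate=24000, segment="immediate", pauseBetweenBrackets=True)
print(url)
# wss://users-ws.rime.ai/ws?speaker=cove&modelId=mistv3&audioFormat=mp3&samplingRate=24000&segment=immediate&pauseBetweenBrackets=true
```

`urlencode` also takes care of escaping values such as an `inlineSpeedAlpha` list that contains commas and spaces.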
