Rime On-Premises Deployment Quickstart
On-prem deployment is in public beta. For access to Docker images and pricing information, reach out to help@rime.ai.
Introduction
Why On-Premises?
Deploying on-premises offers several advantages over calling cloud APIs across a public network. One of the main benefits is speed: hosting the services locally significantly reduces network latency, resulting in faster system responses and data processing.
Security
With an on-premises deployment, all sensitive data remains within your corporate network and is never transmitted over the Internet, enhancing security and helping you comply with strict data privacy and protection regulations.
Performance
Latency
- Mist: In our tests, median latency is around 80 ms for randomly generated sentences of 40 to 50 characters.
- Arcana: Expect a time-to-first-frame latency of around 400 ms on an H100, with a real-time factor (RTF) below 1.
Components

The deployment has two components: an API service that handles HTTP and WebSocket traffic, and a TTS service that performs model inference. Both are described in detail under Deployment below.
Prerequisites
Hardware Requirements
- GPU
  - For Mist: NVIDIA T4, L4, A10G, or higher
  - For Arcana: NVIDIA H100 (MIG 3g.40gb), A100, or higher
- Storage: 50 GB
- CPU: 8 vCPUs
- Memory: 32 GiB
Software Requirements
- Supported Linux distributions
  - Debian 12 (bookworm), x86_64
  - Ubuntu Server 24.04 (noble), x86_64
- NVIDIA drivers 570.133.20 or higher
- Docker
- NVIDIA Container Toolkit
Installations
NVIDIA Drivers
Follow https://www.nvidia.com/en-us/drivers to install the latest NVIDIA drivers, or use the following instructions on Debian-based systems.
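The Debian-specific snippet referenced above is not reproduced here. As a sketch, one route that yields a sufficiently new driver branch (Debian 12's own non-free repository ships an older one) is NVIDIA's CUDA apt repository; package names are an assumption to check against NVIDIA's driver documentation for your GPU:

```shell
# Sketch: install a recent NVIDIA driver on Debian 12 from NVIDIA's CUDA repo.
# The nvidia-open package (open GPU kernel modules + driver) is one option;
# consult NVIDIA's documentation for the variant matching your GPU.
wget https://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get install -y nvidia-open
sudo reboot
```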
Docker
Follow https://docs.docker.com/engine/install to install Docker on your system.
NVIDIA Container Toolkit
Follow https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html to install the NVIDIA Container Toolkit.
Verification
To verify that you have all the prerequisites installed, run the following command.
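The original verification command is not reproduced here. A common check that exercises the driver, Docker, and the NVIDIA Container Toolkit together is to run nvidia-smi inside a CUDA base container (the image tag is an example, not Rime's official check):

```shell
# If drivers, Docker, and the NVIDIA Container Toolkit are installed correctly,
# this prints the nvidia-smi GPU table from inside a container.
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```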
Firewall Requirements
The Rime API instance will listen on port 8000 for HTTP traffic, and port 8001 for WebSocket traffic.
You will also need to allow the following outbound traffic in your firewall rules:
- http://optimize.rime.ai/usage: registers on-prem usage with our servers.
- http://optimize.rime.ai/license: verifies that your on-prem license is active.
- us-docker.pkg.dev on port 443: container image registry.
Self-Service Licensing & Credentials
API Key Generation
Refer to our user interface dashboard to generate the keys and credentials needed to authenticate and authorize the deployment and use of our services.
Deployment
The deployment consists of two services, each powered by a container image:
- API service: handles the HTTP and WebSocket requests, and verifies the license. It serves as a proxy to the TTS service.
- TTS service: responsible for model inference.
Artifact Registry Login
A key file to authenticate with the registry will be provided by Rime.
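Since us-docker.pkg.dev is a Google Artifact Registry host, the standard JSON-key Docker login applies; the key file name below is a placeholder for the file Rime provides:

```shell
# Authenticate Docker to Google Artifact Registry using the key file from Rime.
# Replace rime-key.json with the actual name of your key file.
cat rime-key.json | docker login -u _json_key --password-stdin https://us-docker.pkg.dev
```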
Container Images
TTS Service
Currently the latest image versions are:
- us-docker.pkg.dev/rime-labs/arcana/v2/de:20250913
- us-docker.pkg.dev/rime-labs/arcana/v2/en:20250913
- us-docker.pkg.dev/rime-labs/arcana/v2/es:20250913
- us-docker.pkg.dev/rime-labs/arcana/v2/fr:20250913
- us-docker.pkg.dev/rime-labs/mist/v2/en:20250814
API Service
The latest image version is:
- us-docker.pkg.dev/rime-labs/api/service:20250909
Docker Compose Configuration
A simple way to deploy on a single machine is Docker Compose. Create a docker-compose.yml file with your editor of choice to define the services and their configurations.
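The original compose file is not reproduced here. The sketch below assumes an English Arcana deployment; service names, the GPU reservation, and any environment variables other than MODEL_URL are illustrative and should be confirmed with Rime:

```yaml
# Sketch of a docker-compose.yml for the API and TTS services.
# Image tags are the current versions listed above; everything else is an
# assumption to verify against the materials Rime provides.
services:
  api:
    image: us-docker.pkg.dev/rime-labs/api/service:20250909
    ports:
      - "8000:8000"   # HTTP
      - "8001:8001"   # WebSockets
    environment:
      # Points the API proxy at the TTS service below. On Kubernetes, use
      # http://0.0.0.0:8080/invocations instead of the Compose service name.
      MODEL_URL: http://tts:8080/invocations
    depends_on:
      - tts
  tts:
    image: us-docker.pkg.dev/rime-labs/arcana/v2/en:20250913
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```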
When running on Kubernetes, ensure that MODEL_URL points to http://0.0.0.0:8080/invocations instead of the Docker Compose service name.
Start Docker Compose
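With the compose file in place, the services can be started in the background, for example:

```shell
# Start the API and TTS services defined in docker-compose.yml.
docker compose up -d

# Follow the logs while the models warm up (expect roughly five minutes
# before the first TTS request succeeds).
docker compose logs -f
```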
Deployment Steps
- Environment Setup: Prepare your AWS environment according to the specifications required for optimal deployment.
- Service Deployment: Using Docker, deploy the images on your server.
- Networking Setup: Configure the network settings, including the Internet Gateway and port settings, to ensure connectivity and security.
- Licensing and Authentication: Generate and apply the necessary API key via our dashboard to start using the services.
Note: once the containers are started, allow about 5 minutes of warm-up before sending the first TTS requests.
Additional Information
- Troubleshooting Guide: A troubleshooting guide will be provided to help resolve common issues during deployment.
- Available voices/models: all voices are currently available.
Request and Response Formats
HTTP Requests
Health Check
Request:
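The original example block is not shown here. As a sketch, assuming the API service exposes a health endpoint on its HTTP port (the exact path is an assumption and may differ in your deployment):

```shell
# Hypothetical health check against the API service on port 8000.
# The /health path is a placeholder; use the path from your Rime materials.
curl -s http://localhost:8000/health
```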
Response Format
Receiving a response in mp3 format
Request:
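The request block itself is not shown. The sketch below is modeled on the shape of Rime's cloud TTS API; the endpoint path, speaker name, and JSON field names are assumptions to confirm against your on-prem materials:

```shell
# Hypothetical MP3 synthesis request against the on-prem API service.
# Path, headers, speaker, and fields mirror the cloud API and may differ on-prem.
curl -s http://localhost:8000/v1/rime-tts \
  -H "Authorization: Bearer $RIME_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Accept: audio/mp3" \
  -d '{"speaker": "luna", "text": "Hello from on-prem Rime.", "modelId": "arcana"}' \
  -o output.mp3
```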
Receiving a response in pcm (raw) format
Request:
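The same sketch as for MP3, requesting raw PCM via the Accept header (again an assumption modeled on the cloud API, to confirm against your on-prem materials):

```shell
# Hypothetical raw PCM request; only the Accept header and output file change.
curl -s http://localhost:8000/v1/rime-tts \
  -H "Authorization: Bearer $RIME_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Accept: audio/pcm" \
  -d '{"speaker": "luna", "text": "Hello from on-prem Rime.", "modelId": "arcana"}' \
  -o output.pcm
```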
Websockets Endpoints
The JSON WebSocket endpoint will be served on port 8001, for example ws://localhost:8001, and is equivalent to our cloud websockets-json API.