Deploying on-premises offers several advantages over calling cloud APIs across the public Internet. One of the main benefits is speed: hosting the services locally significantly reduces network latency, resulting in faster system responses and data processing. Our tests have shown a median latency of around 80 ms with randomly generated sentences between 40 and 50 characters.
With an on-premises deployment, all data remains within your corporate network and is never transmitted over the public Internet, which enhances security and helps you comply with strict data privacy and protection regulations.
NVIDIA Drivers: Ensure that NVIDIA drivers are installed and properly configured on your systems to support GPU computation.
Docker: Install Docker on your system to manage the containerized application.
NVIDIA Container Toolkit: Install the NVIDIA Container Toolkit to enable GPU support within Docker containers (a quick verification sketch follows this list).
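As a quick sanity check (a minimal sketch assuming an Ubuntu host with the prerequisites above), the following confirms the driver, Docker, and the Container Toolkit wiring:

```bash
# Verify the NVIDIA driver is loaded and a GPU is visible
nvidia-smi

# Verify Docker is installed and the daemon is reachable
docker info

# Verify GPU passthrough into containers; the CUDA image tag is
# illustrative — any CUDA-enabled base image will do
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```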
Other Prerequisites:
A g5g.xlarge or larger EC2 instance for the on-prem rime-tts GPU service.
A t2.micro or larger EC2 instance with 10 GB of storage for the on-prem API instance.
Ubuntu Server 22.04. If a constraint on your side prevents using this distribution, please let us know. For reference, below are other Linux distributions supported by NVIDIA:
This documentation will cover specific instructions and considerations for deploying the services within an AWS environment, ensuring optimal configuration and performance.
API Key Generation: Refer to our user interface dashboard to generate the necessary keys and credentials for authenticating and authorizing the deployment and use of our services.
Text-to-Speech (TTS) Image: Pull the latest TTS service image from quay.io using the provided Docker command.
API Image: Similarly, pull the latest API service image from quay.io to be used in conjunction with the TTS service.
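As an illustration, pulling both images typically looks like the following; the exact image paths and tags are placeholders, so use the ones supplied with your licensing agreement:

```bash
# Authenticate against quay.io with the credentials Rime provides
docker login quay.io

# Pull the TTS and API images (paths and tags are illustrative placeholders)
docker pull quay.io/rime/rime-tts:latest
docker pull quay.io/rime/rime-api:latest
```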
Docker Compose File: Create a docker-compose.yml file with your editor of choice to define the services and their configurations.
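Below is a minimal sketch of such a file. The image paths are placeholders, and the environment and resource settings are assumptions to adapt to the values supplied with your license:

```yaml
services:
  rime-tts:
    image: quay.io/rime/rime-tts:latest  # placeholder image path
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

  rime-api:
    image: quay.io/rime/rime-api:latest  # placeholder image path
    environment:
      # Under Docker Compose the TTS model is addressed by service name;
      # on Kubernetes, point MODEL_URL to http://0.0.0.0:8080/invocations instead.
      MODEL_URL: http://rime-tts:8080/invocations
    ports:
      - "8000:8000"  # the Rime API instance listens on port 8000
    depends_on:
      - rime-tts
```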
When running on Kubernetes, ensure that MODEL_URL points to http://0.0.0.0:8080/invocations instead of the Docker Compose service name.
Start docker compose:
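Assuming the compose file above sits in the current directory, a typical invocation is:

```bash
docker compose up -d     # start both services in the background
docker compose logs -f   # optionally follow the logs during warm-up
```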
The Rime API instance will listen on port 8000.
You’ll need to permit outbound network traffic to http://optimize.rime.ai/usage and http://optimize.rime.ai/license so that our servers can verify your active on-prem licensing agreement and register usage. Additionally, you’ll need access to quay.io, a container image registry, so allow outbound traffic to its servers on port 443.
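One way to check egress before starting the containers (a sketch that only verifies reachability, not licensing status):

```bash
# Each command prints the HTTP status code returned by the endpoint
curl -sS -o /dev/null -w "%{http_code}\n" http://optimize.rime.ai/license
curl -sS -o /dev/null -w "%{http_code}\n" http://optimize.rime.ai/usage
curl -sS -o /dev/null -w "%{http_code}\n" https://quay.io
```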
Note: once the containers are started, allow about five minutes of warm-up before sending the first TTS requests.
Request:
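As an illustration only, a request along these lines returns a JSON body that can be saved as text; the endpoint path and JSON fields are placeholders, not the confirmed on-prem API shape:

```bash
curl -sS -X POST http://localhost:8000/v1/rime-tts \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{"text": "Hello from an on-prem deployment.", "speaker": "placeholder_speaker"}' \
  -o result.txt
```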
Response:
Sample response file: result.txt
Request:
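The same hypothetical request with an audio Accept header writes the MP3 stream straight to a file:

```bash
curl -sS -X POST http://localhost:8000/v1/rime-tts \
  -H "Content-Type: application/json" \
  -H "Accept: audio/mp3" \
  -d '{"text": "Hello from an on-prem deployment.", "speaker": "placeholder_speaker"}' \
  -o result.mp3
```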
Response:
Sample response file: result.mp3
Request:
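Likewise for raw PCM (again with placeholder fields; note that raw PCM carries no header, so the sample rate and encoding must be known downstream):

```bash
curl -sS -X POST http://localhost:8000/v1/rime-tts \
  -H "Content-Type: application/json" \
  -H "Accept: audio/pcm" \
  -d '{"text": "Hello from an on-prem deployment.", "speaker": "placeholder_speaker"}' \
  -o result.pcm
```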
Response:
Sample response file: result.pcm