This document describes the Helm chart for deploying Rime Labs services on Kubernetes.

Chart Overview

The Helm chart deploys a two-tier application consisting of an API service and a model service. The API service communicates with the model service for inference operations.
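
Inside the cluster, the API reaches the model service through Kubernetes service DNS. With the release name used in the installation example below, the inference endpoint resolves to a URL like:

http://rime-labs-model:8080/invocations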

Prerequisites

  - Kubernetes 1.19+
  - Helm 3.0+
  - NVIDIA GPU Operator installed (for GPU support)
  - PV provisioner support in the underlying infrastructure (if using persistent storage)
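
You can sanity-check these prerequisites before installing. The commands below assume the GPU Operator was installed into its default gpu-operator namespace; adjust as needed:

# Confirm client and server versions
kubectl version
helm version

# Confirm the NVIDIA GPU Operator pods are running (namespace may differ)
kubectl get pods -n gpu-operator

# Confirm nodes advertise GPU capacity
kubectl describe nodes | grep nvidia.com/gpu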

Chart Structure

rime-labs/
├── Chart.yaml
├── values.yaml
├── templates/
│   ├── _helpers.tpl
│   ├── deployment-api.yaml
│   ├── deployment-model.yaml
│   ├── service-api.yaml
│   ├── service-model.yaml
│   ├── configmap.yaml
│   ├── serviceaccount.yaml
│   └── NOTES.txt
└── charts/
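
The deployment templates below reference the rime-labs.name and rime-labs.fullname helpers defined in _helpers.tpl. Their contents are not reproduced in this document, but a typical definition (assuming the standard helm create scaffold) looks like:

{{- define "rime-labs.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}

{{- define "rime-labs.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}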

Installation

helm install rime-labs ./rime-labs
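
Useful variations (the namespace and values file names here are only examples):

# Install into a dedicated namespace with custom values
helm install rime-labs ./rime-labs --namespace rime-labs --create-namespace -f my-values.yaml

# Install, or upgrade an existing release in place
helm upgrade --install rime-labs ./rime-labs

# Verify the release and its pods
helm status rime-labs
kubectl get pods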

Example values.yaml

api:
  image:
    repository: rime/api
    tag: 0a111d625e17
    pullPolicy: IfNotPresent
  service:
    type: ClusterIP
    port: 8000
  resources:
    limits:
      cpu: 1000m
      memory: 2Gi
    requests:
      cpu: 1000m
      memory: 2Gi
  env:
    - name: MODEL_URL
      value: "http://{{ .Release.Name }}-model:8080/invocations"

model:
  image:
    repository: rime/model
    tag: 7bd3a89c3b05
    pullPolicy: IfNotPresent
  service:
    type: ClusterIP
    port: 8080
  gpu:
    enabled: true
    count: 1
  resources:
    limits:
      nvidia.com/gpu: 1
      cpu: 2000m
      memory: 10Gi
    requests:
      cpu: 2000m
      memory: 10Gi
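
Individual values can also be overridden at install time with --set. For example, to run without GPU scheduling and scale out the API (api.replicaCount is read by the deployment template below and defaults to 1):

helm install rime-labs ./rime-labs --set model.gpu.enabled=false --set api.replicaCount=2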

Example Deployment Templates

Here are simplified examples of what the deployment templates might look like:

API Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "rime-labs.fullname" . }}-api
spec:
  replicas: {{ .Values.api.replicaCount | default 1 }}
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ include "rime-labs.name" . }}-api
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ include "rime-labs.name" . }}-api
    spec:
      containers:
        - name: api
          image: "{{ .Values.api.image.repository }}:{{ .Values.api.image.tag }}"
          imagePullPolicy: {{ .Values.api.image.pullPolicy }}
          ports:
            - containerPort: 8000
          env:
            {{- range .Values.api.env }}
            - name: {{ .name }}
              {{- /* tpl renders templated strings from values.yaml, such as the release-name reference in MODEL_URL */}}
              value: {{ tpl .value $ | quote }}
            {{- end }}
          resources:
            {{- toYaml .Values.api.resources | nindent 12 }}

Model Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "rime-labs.fullname" . }}-model
spec:
  replicas: {{ .Values.model.replicaCount | default 1 }}
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ include "rime-labs.name" . }}-model
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ include "rime-labs.name" . }}-model
    spec:
      containers:
        - name: model
          image: "{{ .Values.model.image.repository }}:{{ .Values.model.image.tag }}"
          imagePullPolicy: {{ .Values.model.image.pullPolicy }}
          ports:
            - containerPort: 8080
          resources:
            {{- toYaml .Values.model.resources | nindent 12 }}
      {{- if .Values.model.gpu.enabled }}
      nodeSelector:
        accelerator: nvidia-gpu
      {{- end }}
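
Model Service

The service templates are not reproduced in full here, but a similarly simplified sketch of what service-model.yaml might contain is shown below. The selector must match the pod labels set by the model deployment; note that with the standard fullname helper and the release name rime-labs, the rendered service name matches the MODEL_URL host in the example values:

apiVersion: v1
kind: Service
metadata:
  name: {{ include "rime-labs.fullname" . }}-model
spec:
  type: {{ .Values.model.service.type }}
  ports:
    - port: {{ .Values.model.service.port }}
      targetPort: 8080
      protocol: TCP
  selector:
    app.kubernetes.io/name: {{ include "rime-labs.name" . }}-model

To preview how the templates render with a given set of values without installing anything:

helm template rime-labs ./rime-labs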

Troubleshooting

Common Issues

  1. GPU not recognized: Ensure the NVIDIA GPU Operator is installed correctly in your cluster.
  2. Services cannot communicate: Verify that the service names referenced in environment variables (such as MODEL_URL) match the rendered service names.
  3. Resource constraints: If pods are stuck in a Pending state, check whether the cluster has sufficient CPU, memory, and GPU capacity (see the diagnostic commands below).
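
The following commands are a starting point for diagnosing all three issues:

# Inspect pod status, events, and logs
kubectl get pods
kubectl describe pod <pod-name>
kubectl logs <pod-name>

# Check schedulable GPU capacity on each node
kubectl describe nodes | grep -A 5 nvidia.com/gpu

# Confirm the services exist and expose the expected ports
kubectl get svc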
