Status: Needs Review
This page has not been reviewed for accuracy and completeness. Content may be outdated or contain errors.
# gRPC API Deployment Guide
This guide covers running the cuvis.ai gRPC service in local, containerized, and clustered environments.
## Prerequisites

- Python 3.10+
- PyTorch and PyTorch Lightning dependencies installed via project tooling
- gRPC runtime (`grpcio`)
- Access to `.cu3s` data and optional annotation JSON files
## Installation
### Using UV (recommended)
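A minimal sketch, mirroring the pip steps below and the `uv sync` workflow used in the Dockerfile; exact project commands may differ:

```bash
git clone git@gitlab.cubert.local:cubert/cuvis.ai.git
cd cuvis.ai
uv sync
cd proto && buf generate
```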
### Using pip
```bash
git clone git@gitlab.cubert.local:cubert/cuvis.ai.git
cd cuvis.ai
pip install -e .
pip install grpcio grpcio-tools
cd proto && buf generate
```
## Running the Server
The reference server lives in `examples/grpc/server.py`.
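A minimal way to start it (the Dockerfile below uses the same command):

```bash
uv run python examples/grpc/server.py
```

All examples in this guide use port 50051.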
Configure message sizes or thread pool depth as needed:
```python
from concurrent import futures

import grpc

server = grpc.server(
    futures.ThreadPoolExecutor(max_workers=10),
    options=[
        ("grpc.max_send_message_length", 100 * 1024 * 1024),
        ("grpc.max_receive_message_length", 100 * 1024 * 1024),
    ],
)
```
## TLS
Generate a key/cert pair and enable TLS on the server:
```bash
openssl genrsa -out server.key 2048
openssl req -new -x509 -key server.key -out server.crt -days 365
```

```python
credentials = grpc.ssl_server_credentials(
    [(open("server.key", "rb").read(), open("server.crt", "rb").read())]
)
server.add_secure_port("[::]:50051", credentials)
```
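On the client side, load the same certificate (or the CA that signed it) to open a secure channel; a minimal sketch assuming the server is reachable at `localhost:50051`:

```python
import grpc

# Trust the self-signed server certificate (use a CA bundle in production).
with open("server.crt", "rb") as f:
    channel_credentials = grpc.ssl_channel_credentials(root_certificates=f.read())

channel = grpc.secure_channel("localhost:50051", channel_credentials)
```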
## Docker
Dockerfile example:
```dockerfile
FROM python:3.10-slim
WORKDIR /app
COPY pyproject.toml uv.lock ./
RUN pip install uv && uv sync
COPY . .
# buf must be available in the image for stub generation (install it in an earlier step or copy the binary in).
RUN cd proto && buf generate
EXPOSE 50051
CMD ["uv", "run", "python", "examples/grpc/server.py"]
```
Build and run:
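A minimal sketch; the `cuvis-ai-grpc` tag matches the Kubernetes manifest below, and the host data path is an assumption:

```bash
docker build -t cuvis-ai-grpc .
# /path/to/data is illustrative; point it at your .cu3s files.
docker run --rm -p 50051:50051 -v /path/to/data:/data cuvis-ai-grpc
```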
## Kubernetes
Minimal deployment + service:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cuvis-ai-grpc
spec:
  replicas: 3
  selector:
    matchLabels:
      app: cuvis-ai-grpc
  template:
    metadata:
      labels:
        app: cuvis-ai-grpc
    spec:
      containers:
        - name: grpc-server
          image: cuvis-ai-grpc:latest
          ports:
            - containerPort: 50051
          resources:
            requests:
              cpu: "2"
              memory: "4Gi"
            limits:
              cpu: "4"
              memory: "8Gi"
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: cuvis-data-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: cuvis-ai-grpc
spec:
  selector:
    app: cuvis-ai-grpc
  ports:
    - port: 50051
      targetPort: 50051
  type: LoadBalancer
```
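To roll out and smoke-test the deployment (the manifest filename below is illustrative):

```bash
kubectl apply -f cuvis-ai-grpc.yaml
kubectl rollout status deployment/cuvis-ai-grpc

# Forward the service locally to run a client against it.
kubectl port-forward svc/cuvis-ai-grpc 50051:50051
```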
## Monitoring and Health
- Enable structured logging with `logging.basicConfig`.
- Expose Prometheus metrics (e.g., via `prometheus-client` counters/histograms).
- Add gRPC health checks using `grpc_health.v1.health_pb2_grpc` (see the sketch after this list).
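A minimal health-check sketch, assuming the `grpcio-health-checking` package is installed and `server` is the instance created in "Running the Server":

```python
from grpc_health.v1 import health, health_pb2, health_pb2_grpc

# Register the standard gRPC health service on the existing server.
health_servicer = health.HealthServicer()
health_pb2_grpc.add_HealthServicer_to_server(health_servicer, server)

# Report the overall server (empty service name) as serving.
health_servicer.set("", health_pb2.HealthCheckResponse.SERVING)
```

Clients and Kubernetes probes (e.g., `grpc_health_probe`) can then query the standard `grpc.health.v1.Health/Check` method.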
## Security
- Add an authentication interceptor (token or mTLS); a token-based sketch follows this list.
- Enforce rate limiting per client to avoid resource exhaustion.
- Restrict message sizes and concurrency to fit hardware budgets.
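A minimal token-checking interceptor sketch; the `authorization` metadata key and the `is_valid_token` helper are illustrative assumptions, not part of the cuvis.ai API:

```python
from concurrent import futures

import grpc


def is_valid_token(token: str) -> bool:
    # Hypothetical check; replace with real token validation or an mTLS identity check.
    return token == "expected-token"


class TokenAuthInterceptor(grpc.ServerInterceptor):
    """Rejects calls without a valid `authorization` metadata entry."""

    def __init__(self):
        def deny(request, context):
            context.abort(grpc.StatusCode.UNAUTHENTICATED, "Invalid or missing token")

        self._deny_handler = grpc.unary_unary_rpc_method_handler(deny)

    def intercept_service(self, continuation, handler_call_details):
        metadata = dict(handler_call_details.invocation_metadata)
        if is_valid_token(metadata.get("authorization", "")):
            return continuation(handler_call_details)
        return self._deny_handler


# Register the interceptor when constructing the server.
server = grpc.server(
    futures.ThreadPoolExecutor(max_workers=10),
    interceptors=[TokenAuthInterceptor()],
)
```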
## Troubleshooting
- `NOT_FOUND` errors: verify `session_id` exists and was not closed.
- Message size exceeded: raise `grpc.max_send_message_length` / `grpc.max_receive_message_length` on both client and server (client-side sketch after this list).
- Slow throughput: increase the thread pool size or reduce batch sizes.
- Port conflicts: choose an available port or free the existing listener.
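The message-size options are the same on the client; a minimal sketch assuming an insecure local connection on port 50051:

```python
import grpc

# Match the server's limits so large hyperspectral payloads are not rejected.
channel = grpc.insecure_channel(
    "localhost:50051",
    options=[
        ("grpc.max_send_message_length", 100 * 1024 * 1024),
        ("grpc.max_receive_message_length", 100 * 1024 * 1024),
    ],
)
```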