Orion Documentation

Production Deployment

Checklist and recommendations for deploying Orion to production.

Pre-Deployment Checklist

Security Hardening

JWT Secret

# Generate a secure random secret (48 random bytes, base64-encoded to 64 characters)
openssl rand -base64 48

Set it in your production environment:

ORION_JWT_SECRET=your-64-character-random-secret-here
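
As a minimal sketch, the secret can be generated and sanity-checked before it is placed in the environment. The helper names `generate_jwt_secret` and `check_jwt_secret` are hypothetical and not part of Orion, and the 64-character minimum mirrors the placeholder above rather than a documented requirement.

```shell
# Hypothetical helpers for illustration; not part of Orion.

# Print a 64-character base64 secret (48 random bytes encode to 64 characters).
generate_jwt_secret() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -base64 48 | tr -d '\n'
  else
    # Fallback when openssl is not installed.
    head -c 48 /dev/urandom | base64 | tr -d '\n'
  fi
}

# Succeed only if the given secret has at least 64 characters.
check_jwt_secret() {
  [ "${#1}" -ge 64 ]
}

# Generate a secret and print the environment line only if the check passes.
secret="$(generate_jwt_secret)"
if check_jwt_secret "$secret"; then
  echo "ORION_JWT_SECRET=$secret"
else
  echo "generated secret is too short" >&2
fi
```

In practice the printed line would go into a secrets manager or a protected environment file rather than the terminal.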

TLS Configuration

Terminate TLS at a reverse proxy (e.g., Nginx, Caddy, or Traefik) in front of the gateway, and have the proxy forward requests to the gateway on port 8000:

Client --> TLS Termination (443) --> Gateway (8000)

Network Isolation

In production, only expose the gateway and dashboard to external traffic:

services:
  gateway:
    ports:
      - "8000:8000" # Exposed externally
  dashboard:
    ports:
      - "3001:3000" # Exposed externally (host port 3001)
  # All other services: no port mapping
  scout:
    expose:
      - "8001" # Internal only

Database

PostgreSQL

Enable SSL connections
Use dedicated credentials per service (not the shared orion user)
Configure connection pooling (PgBouncer recommended)
Set up automated backups with point-in-time recovery
Monitor connection count and query performance
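
To illustrate the first two items, the sketch below builds a per-service connection URI that refuses unencrypted connections. The helper name `pg_service_url`, the `orion_<service>` role naming scheme, the host `db.internal`, and the database name `orion` are all assumptions for illustration; only the `sslmode` URI parameter is standard libpq behavior.

```shell
# Hypothetical helper: build a libpq connection URI for one service.
# Usage: pg_service_url SERVICE HOST [DATABASE]
pg_service_url() {
  service="$1"
  host="$2"
  db="${3:-orion}"   # assumed database name
  # sslmode=require rejects unencrypted connections;
  # sslmode=verify-full additionally validates the server certificate.
  printf 'postgresql://orion_%s@%s:5432/%s?sslmode=require\n' "$service" "$host" "$db"
}

pg_service_url gateway db.internal
# prints: postgresql://orion_gateway@db.internal:5432/orion?sslmode=require
```

The password is deliberately left out of the URI; libpq can read it from the `PGPASSWORD` environment variable or a `~/.pgpass` file.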

Redis

Set requirepass in the Redis configuration
Disable the FLUSHALL and FLUSHDB commands (for example, with rename-command)
Set maxmemory and a maxmemory-policy (allkeys-lru recommended)
Enable AOF persistence
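
The settings above map to standard `redis.conf` directives. The sketch below prints such a fragment; the helper name `render_redis_conf`, the `REDIS_PASSWORD` variable, and the 256mb memory cap are assumptions for illustration. Note that `maxmemory-policy` only takes effect once a `maxmemory` limit is set.

```shell
# Hypothetical helper: print a redis.conf fragment for the settings listed above.
# Usage: render_redis_conf PASSWORD [MAXMEMORY]
render_redis_conf() {
  password="$1"
  maxmemory="${2:-256mb}"   # assumed cap; size it for your workload
  cat <<EOF
requirepass ${password}
rename-command FLUSHALL ""
rename-command FLUSHDB ""
maxmemory ${maxmemory}
maxmemory-policy allkeys-lru
appendonly yes
EOF
}

# Print the fragment; redirect it into a config file as needed.
render_redis_conf "${REDIS_PASSWORD:-change-me}"
```

Renaming a command to the empty string disables it entirely. A password containing spaces would need quoting in the generated file.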

Milvus

Configure authentication
Set up data backups
Monitor collection sizes and query latency

Resource Limits

Set container resource limits in production:

services:
  gateway:
    deploy:
      resources:
        limits:
          cpus: "1.0"
          memory: 512M
  director:
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 2G
  media:
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 4G

Scaling

Gateway: Stateless, can be horizontally scaled behind a load balancer
Scout: Single instance sufficient (polling-based)
Director: Scale carefully -- LangGraph checkpoints are per-thread
Media: Scale horizontally for parallel image generation
Editor: CPU/GPU-intensive -- scale based on rendering demand
Pulse: Single instance sufficient (event aggregation)
Publisher: Single instance sufficient (rate-limited by platforms)
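
With Docker Compose, the horizontally scalable services above can be scaled using the `--scale` flag of `docker compose up`. The sketch below only builds and prints the command; the helper name `compose_scale_cmd` and the replica counts are assumptions for illustration.

```shell
# Hypothetical helper: build a `docker compose up` command from SERVICE=COUNT pairs.
compose_scale_cmd() {
  cmd="docker compose up -d"
  for pair in "$@"; do
    cmd="$cmd --scale $pair"
  done
  printf '%s\n' "$cmd"
}

# Print (do not run) a command that scales the gateway and media services.
compose_scale_cmd gateway=3 media=2
# prints: docker compose up -d --scale gateway=3 --scale media=2
```

Replicas of one service cannot share a fixed host port mapping such as "8000:8000", so a scaled gateway would need that mapping removed and traffic routed to it through the load balancer instead.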