Production Deployment
Checklist and recommendations for deploying Orion to production.
Pre-Deployment Checklist
- Change ORION_JWT_SECRET to a strong random value (32+ characters)
- Create admin user via registration or database seeding (env-var admin is deprecated)
- Configure OAuth providers (GitHub/Google) with production callback URLs
- Set APP_ENV=production
- Restrict CORS origins (default allows all)
- Enable TLS termination (reverse proxy or load balancer)
- Configure PostgreSQL with SSL and strong credentials
- Set Redis authentication password
- Configure backup schedules for PostgreSQL and Milvus
- Set up monitoring (Prometheus + Grafana)
- Configure log aggregation
- Set resource limits on all containers
- Verify that Docker health checks are active (they are already configured)
Security Hardening
JWT Secret
Set ORION_JWT_SECRET to a strong random value in your production environment.
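One way to wire this up is sketched below. It assumes Orion is deployed with Docker Compose and that the gateway service reads ORION_JWT_SECRET from its environment; neither is confirmed on this page. The value itself can be generated outside the repository, for example with openssl rand -hex 32, which prints 64 hexadecimal characters.

```yaml
# Sketch only: assumes the gateway reads ORION_JWT_SECRET from its environment.
services:
  gateway:
    environment:
      # ${VAR:?message} makes Compose abort when VAR is unset or empty,
      # so a missing secret fails at startup instead of silently using a default.
      ORION_JWT_SECRET: "${ORION_JWT_SECRET:?set ORION_JWT_SECRET to a random value of 32+ characters}"
```

Keeping the value in the host environment or a secrets manager, rather than in the Compose file, keeps the secret out of version control.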
TLS Configuration
Deploy a reverse proxy (e.g., Nginx, Caddy, or Traefik) in front of the gateway to terminate TLS.
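As a minimal sketch of that setup, the Compose fragment below uses Caddy, one of the proxies named above. The caddy service, the caddy_data volume, and the hostname orion.example.com are placeholders introduced for illustration; port 8000 is the gateway port used elsewhere on this page.

```yaml
# Sketch only: orion.example.com and the caddy service are placeholders, not part of Orion.
services:
  caddy:
    image: caddy:2
    # With a domain name in --from, Caddy obtains and renews certificates automatically.
    command: caddy reverse-proxy --from orion.example.com --to gateway:8000
    ports:
      - "80:80"    # ACME HTTP challenge and HTTP-to-HTTPS redirect
      - "443:443"  # TLS termination
    volumes:
      - caddy_data:/data  # Persists issued certificates across restarts
  gateway:
    expose:
      - "8000"  # Reached only through the proxy, no host port mapping

volumes:
  caddy_data:
```

With a proxy in front, the gateway no longer needs its own host port mapping; only ports 80 and 443 on the proxy face the internet.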
Network Isolation
In production, expose only the gateway and dashboard to external traffic:
services:
  gateway:
    ports:
      - "8000:8000"  # Exposed externally
  dashboard:
    ports:
      - "3001:3000"  # Exposed externally
  # All other services: no port mapping
  scout:
    expose:
      - "8001"  # Internal only
Database
PostgreSQL
- Enable SSL connections
- Use dedicated credentials per service (not the shared orion user)
- Configure connection pooling (PgBouncer recommended)
- Set up automated backups with point-in-time recovery
- Monitor connection count and query performance
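The SSL item above could look roughly like this in Compose. The service name postgres, the image tag, and the certificate paths are assumptions, not taken from Orion's own files; the ssl, ssl_cert_file, and ssl_key_file settings are standard PostgreSQL server parameters.

```yaml
# Sketch only: service name, image tag, and file paths are assumptions.
services:
  postgres:
    image: postgres:16
    command:
      - postgres
      - -c
      - ssl=on
      - -c
      - ssl_cert_file=/etc/postgresql/certs/server.crt
      - -c
      - ssl_key_file=/etc/postgresql/certs/server.key
    environment:
      # Compose aborts if the password is not provided by the environment.
      POSTGRES_PASSWORD: "${POSTGRES_PASSWORD:?set a strong PostgreSQL password}"
    volumes:
      # PostgreSQL refuses to start if the key file is readable by other users,
      # so check its ownership and mode (for example 0600) inside the container.
      - ./certs:/etc/postgresql/certs:ro
```

Clients then need sslmode=require (or stricter, such as verify-full) in their connection strings, otherwise they may still connect without encryption.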
Redis
- Set requirepass in Redis configuration
- Disable FLUSHALL and FLUSHDB commands
- Configure maxmemory policy (allkeys-lru recommended)
- Enable AOF persistence
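The four Redis settings above can be passed as server arguments, as in the sketch below. The service name redis, the image tag, and the 256mb memory limit are illustrative assumptions; size the limit for the actual workload.

```yaml
# Sketch only: service name, image tag, and the 256mb limit are assumptions.
services:
  redis:
    image: redis:7
    command:
      - redis-server
      - --requirepass
      - "${REDIS_PASSWORD:?set a Redis password}"
      - --rename-command
      - FLUSHALL
      - ""             # Renaming to an empty string disables the command
      - --rename-command
      - FLUSHDB
      - ""
      - --maxmemory
      - 256mb          # The eviction policy only takes effect once maxmemory is set
      - --maxmemory-policy
      - allkeys-lru
      - --appendonly
      - "yes"          # Enables AOF persistence
```

Arguments passed this way are visible in the process list, so mounting a redis.conf file with the same directives may be preferable where that matters.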
Milvus
- Configure authentication
- Set up data backups
- Monitor collection sizes and query latency
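For the authentication item, Milvus exposes a switch in its milvus.yaml configuration file. The fragment below is a sketch based on that setting; verify the key against the documentation for the Milvus version actually deployed.

```yaml
# milvus.yaml fragment (sketch): turns on username/password authentication.
common:
  security:
    authorizationEnabled: true
```

After enabling it, change the password of the built-in root account and create separate users for the services that query Milvus.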
Resource Limits
Set container resource limits in production:
services:
  gateway:
    deploy:
      resources:
        limits:
          cpus: "1.0"
          memory: 512M
  director:
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 2G
  media:
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 4G
Scaling
- Gateway: Stateless, can be horizontally scaled behind a load balancer
- Scout: Single instance sufficient (polling-based)
- Director: Scale carefully -- LangGraph checkpoints are per-thread
- Media: Scale horizontally for parallel image generation
- Editor: CPU/GPU-intensive -- scale based on rendering demand
- Pulse: Single instance sufficient (event aggregation)
- Publisher: Single instance sufficient (rate-limited by platforms)
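For the stateless gateway, horizontal scaling under Docker Compose might be sketched as follows. The replica count is illustrative, and the fragment assumes a load balancer or reverse proxy already sits in front of the gateway.

```yaml
# Sketch only: the replica count is illustrative.
services:
  gateway:
    deploy:
      replicas: 3
    expose:
      - "8000"  # No fixed host port, so the replicas do not compete for it
```

A fixed mapping such as "8000:8000" cannot be shared by several replicas on one host, which is why this sketch exposes the port only on the internal network.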