Scaling & High Availability
Capacity Planning
| Connected Agents | Gateway Pods | Auth Pods | Pulse Pods | Database Connections |
|---|---|---|---|---|
| 100 | 2 | 2 | 2 | 20 |
| 1,000 | 3 | 3 | 3 | 50 |
| 10,000 | 5 | 5 | 5 | 100 |
| 100,000 | 15 | 10 | 10 | 300 |
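
As a sketch of how one row of the table might translate into chart values, the fragment below sets replica floors for the 10,000-agent tier. It assumes the chart takes each component's baseline pod count from `autoscaling.minReplicas`, as in the `gateway` and `auth` settings documented under Horizontal Pod Autoscaling. The `pulse` key is a guess by analogy and is not confirmed by this page, and its `maxReplicas` value is only illustrative headroom.

```yaml
# Hypothetical values for roughly 10,000 connected agents (5/5/5 pods per the table).
gateway:
  autoscaling:
    enabled: true
    minReplicas: 5      # Gateway Pods column
    maxReplicas: 20
auth:
  autoscaling:
    enabled: true
    minReplicas: 5      # Auth Pods column
    maxReplicas: 10
pulse:                  # assumed key, by analogy with gateway and auth
  autoscaling:
    enabled: true
    minReplicas: 5      # Pulse Pods column
    maxReplicas: 10     # illustrative, not from the source
```

The Database Connections column (100 for this tier) would be budgeted on the PostgreSQL side. This page does not say which setting controls it, so it is left out of the sketch.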
Horizontal Pod Autoscaling
gateway:
  autoscaling:
    enabled: true
    minReplicas: 3
    maxReplicas: 20
    targetCPUUtilizationPercentage: 60
auth:
  autoscaling:
    enabled: true
    minReplicas: 3
    maxReplicas: 10
    targetCPUUtilizationPercentage: 70

Or manually:

kubectl scale deployment avon-gateway -n avon --replicas=5

Pod Disruption Budgets
Pod disruption budgets are enabled by default to preserve availability during upgrades and node maintenance:
podDisruptionBudget:
  enabled: true
  minAvailable: 1

For stricter HA requirements:

podDisruptionBudget:
  enabled: true
  minAvailable: 2

Zone-Aware Spreading
Production deployments should spread pods across availability zones:
gateway:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app.kubernetes.io/component: gateway

Database High Availability
- PostgreSQL: Use Multi-AZ managed services (RDS, Cloud SQL). Configure read replicas for read-heavy workloads. Enable automated backups.
- Redis: Use managed services with replication (ElastiCache, Memorystore). Enable Redis Cluster mode for deployments with more than 10,000 agents.