Scaling & High Availability

Capacity Planning

Connected Agents	Gateway Pods	Auth Pods	Pulse Pods	Database Connections
100	2	2	2	20
1,000	3	3	3	50
10,000	5	5	5	100
100,000	15	10	10	300
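
As a rough illustration of reading the table, the sketch below maps an agent count to the smallest tier that covers it. The function name avon_tier is hypothetical, and rounding up to the next tier (rather than interpolating between rows) is an assumption, not documented behavior.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the capacity-table tier covering a given agent count.
# Output columns: gateway_pods auth_pods pulse_pods database_connections
avon_tier() {
  local agents=$1
  if   [ "$agents" -le 100 ];    then echo "2 2 2 20"
  elif [ "$agents" -le 1000 ];   then echo "3 3 3 50"
  elif [ "$agents" -le 10000 ];  then echo "5 5 5 100"
  elif [ "$agents" -le 100000 ]; then echo "15 10 10 300"
  else
    # The table gives no guidance beyond its last row.
    echo "above the largest documented tier (100,000 agents)" >&2
    return 1
  fi
}
```

For example, `avon_tier 2500` prints `5 5 5 100`.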

Horizontal Pod Autoscaling

gateway:
  autoscaling:
    enabled: true
    minReplicas: 3
    maxReplicas: 20
    targetCPUUtilizationPercentage: 60

auth:
  autoscaling:
    enabled: true
    minReplicas: 3
    maxReplicas: 10
    targetCPUUtilizationPercentage: 70

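To set a replica count manually instead (while autoscaling is enabled for a deployment, the autoscaler will override this value):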

kubectl scale deployment avon-gateway -n avon --replicas=5
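
To reason about how the CPU targets above translate into replica counts, the following sketch applies the standard Kubernetes HPA formula, desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization), clamped to minReplicas and maxReplicas. The function name is hypothetical, and the sketch ignores the autoscaler's tolerance band and stabilization windows, so treat the result as an estimate only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate the replica count an HPA would request.
# Args: current_replicas current_cpu_percent target_cpu_percent min_replicas max_replicas
hpa_desired_replicas() {
  local current=$1 cpu=$2 target=$3 min=$4 max=$5
  # Integer ceiling of (current * cpu / target).
  local desired=$(( (current * cpu + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}
```

With the gateway settings (target 60, min 3, max 20), `hpa_desired_replicas 3 90 60 3 20` prints `5`.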

Pod Disruption Budgets

Pod Disruption Budgets are enabled by default to preserve availability during voluntary disruptions such as upgrades and node maintenance:

podDisruptionBudget:
  enabled: true
  minAvailable: 1

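For stricter HA requirements, raise minAvailable, keeping it below the component's replica count so that node drains can still evict pods: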

podDisruptionBudget:
  enabled: true
  minAvailable: 2
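
A quick way to sanity-check a minAvailable value is to compute how many pods may be evicted at once: healthy replicas minus minAvailable. The helper below is a hypothetical sketch of that arithmetic and is not part of the chart.

```shell
#!/usr/bin/env bash
# Hypothetical helper: how many pods a voluntary disruption may evict at once.
# Args: healthy_replicas min_available
pdb_allowed_disruptions() {
  local healthy=$1 min_available=$2
  local allowed=$(( healthy - min_available ))
  # A budget can never allow a negative number of evictions.
  if [ "$allowed" -lt 0 ]; then allowed=0; fi
  echo "$allowed"
}
```

`pdb_allowed_disruptions 3 1` prints `2`, while `pdb_allowed_disruptions 2 2` prints `0`, meaning a two-replica component with minAvailable: 2 would block node drains.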

Zone-Aware Spreading

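Production deployments should spread pods across availability zones. With whenUnsatisfiable: DoNotSchedule, a pod that would exceed maxSkew stays Pending rather than being placed in an already crowded zone: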

gateway:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app.kubernetes.io/component: gateway
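
To check whether a planned replica layout satisfies maxSkew: 1, compare the busiest and emptiest zones. The sketch below computes that difference from per-zone pod counts; the function name is hypothetical, and it assumes every listed zone is eligible for scheduling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: skew of a pod layout, given one pod count per zone.
# Prints (largest zone count) - (smallest zone count).
zone_skew() {
  local max=$1 min=$1 n
  for n in "$@"; do
    if [ "$n" -gt "$max" ]; then max=$n; fi
    if [ "$n" -lt "$min" ]; then min=$n; fi
  done
  echo $(( max - min ))
}
```

`zone_skew 2 2 1` prints `1`, so five gateway pods across three zones fit the constraint; `zone_skew 3 1 1` prints `2`, a layout the scheduler would not produce under maxSkew: 1.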

Database High Availability

PostgreSQL: Use a managed service with a standby in a second availability zone (for example, Amazon RDS Multi-AZ or Cloud SQL high availability). Configure read replicas for read-heavy workloads, and enable automated backups.
Redis: Use a managed service with replication (for example, ElastiCache or Memorystore). Enable Redis Cluster mode for deployments of more than 10,000 agents.