Scaling & High Availability

Capacity Planning

Connected Agents   Gateway Pods   Auth Pods   Pulse Pods   Database Connections
100                2              2           2            20
1,000              3              3           3            50
10,000             5              5           5            100
100,000            15             10          10           300
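
As a rough illustration of how the table might be used during sizing, the sketch below picks the smallest tier that covers a given agent count. The values are as read from the capacity table above. The `Tier` class, the `recommended_tier` helper, and the round-up-to-the-next-tier policy are assumptions made for this example; they are not part of the product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    max_agents: int
    gateway_pods: int
    auth_pods: int
    pulse_pods: int
    db_connections: int


# Values as read from the capacity table; adjust if your table differs.
TIERS = [
    Tier(100, 2, 2, 2, 20),
    Tier(1_000, 3, 3, 3, 50),
    Tier(10_000, 5, 5, 5, 100),
    Tier(100_000, 15, 10, 10, 300),
]


def recommended_tier(agents: int) -> Optional[Tier]:
    """Return the smallest tier covering `agents` (assumed policy: round up).

    Returns None when the agent count exceeds the largest documented tier.
    """
    if agents < 0:
        raise ValueError("agents must be non-negative")
    for tier in TIERS:
        if agents <= tier.max_agents:
            return tier
    return None
```

For example, `recommended_tier(2500)` returns the 10,000-agent row, because 2,500 agents exceed the 1,000-agent tier. Rounding up is a conservative choice; the table itself does not say how to size between tiers.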

Horizontal Pod Autoscaling

gateway:
  autoscaling:
    enabled: true
    minReplicas: 3
    maxReplicas: 20
    targetCPUUtilizationPercentage: 60

auth:
  autoscaling:
    enabled: true
    minReplicas: 3
    maxReplicas: 10
    targetCPUUtilizationPercentage: 70

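Alternatively, scale a deployment manually. If autoscaling is enabled for the same deployment, the autoscaler will override a manually set replica count: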

kubectl scale deployment avon-gateway -n avon --replicas=5
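
To give a feel for how `targetCPUUtilizationPercentage` translates into replica counts, the sketch below reproduces the core formula from the Kubernetes Horizontal Pod Autoscaler documentation: desired = ceil(current replicas × current metric / target metric), clamped to the configured bounds. The function name and parameters are hypothetical. The 10% tolerance is the Kubernetes default, assumed unchanged here. The real controller also handles unready pods, missing metrics, and stabilization windows, none of which this sketch models.

```python
import math


def desired_replicas(current_replicas: int,
                     current_cpu_pct: float,
                     target_cpu_pct: float,
                     min_replicas: int,
                     max_replicas: int,
                     tolerance: float = 0.1) -> int:
    """Approximate the HPA replica calculation for a CPU utilization target."""
    if target_cpu_pct <= 0:
        raise ValueError("target_cpu_pct must be positive")
    ratio = current_cpu_pct / target_cpu_pct
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current count (within bounds).
        desired = current_replicas
    else:
        # Multiply before dividing to limit floating-point rounding error.
        desired = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    return max(min_replicas, min(max_replicas, desired))
```

With the gateway settings above (target 60, bounds 3 to 20), three replicas averaging 90% CPU would scale to five, while five replicas averaging 20% CPU would scale down only as far as the minimum of three.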

Pod Disruption Budgets

Pod disruption budgets are enabled by default to maintain availability during upgrades and node maintenance:

podDisruptionBudget:
  enabled: true
  minAvailable: 1

For stricter HA requirements, raise minAvailable, keeping it below the replica count so that node drains can still evict pods:

podDisruptionBudget:
  enabled: true
  minAvailable: 2
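
The effect of `minAvailable` on maintenance can be worked out with simple arithmetic: Kubernetes permits a voluntary eviction only while the number of healthy pods stays at or above the budget. The helper below is a minimal sketch of that calculation with a hypothetical function name. It assumes the budget covers a single component's pods, which the configuration above does not spell out.

```python
def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    """Number of pods that can be voluntarily evicted right now."""
    if healthy_pods < 0 or min_available < 0:
        raise ValueError("counts must be non-negative")
    # Evictions are blocked once healthy pods would drop below min_available.
    return max(0, healthy_pods - min_available)
```

With three healthy replicas, `minAvailable: 1` permits two simultaneous evictions and `minAvailable: 2` permits one. If the replica count equals `minAvailable`, the result is zero and a node drain will wait until more pods become healthy.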

Zone-Aware Spreading

Production deployments should spread pods across availability zones:

gateway:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app.kubernetes.io/component: gateway
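
To illustrate what `maxSkew: 1` with `DoNotSchedule` means in practice, the sketch below applies the skew rule from the Kubernetes topology spread documentation: a new pod may land in a zone only if that zone's matching pod count, including the new pod, exceeds the lowest zone's count by no more than `maxSkew`. The function name and zone names are hypothetical, and the sketch ignores node capacity, taints, and other scheduling constraints.

```python
from typing import Dict, List


def schedulable_zones(pods_per_zone: Dict[str, int], max_skew: int = 1) -> List[str]:
    """Return the zones where one more matching pod would not exceed max_skew."""
    if not pods_per_zone:
        return []
    global_min = min(pods_per_zone.values())
    return sorted(
        zone
        for zone, count in pods_per_zone.items()
        # Skew after placement = (count + 1) - minimum count across zones.
        if count + 1 - global_min <= max_skew
    )
```

For instance, with two gateway pods in `zone-a` and one each in `zone-b` and `zone-c`, the next pod can go only to `zone-b` or `zone-c`. If no zone qualifies or the qualifying zones have no room, `DoNotSchedule` leaves the pod pending rather than placing it unevenly.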

Database High Availability

  • PostgreSQL: Use Multi-AZ managed services (RDS, Cloud SQL). Configure read replicas for read-heavy workloads. Enable automated backups.
  • Redis: Use managed services with replication (ElastiCache, Memorystore). Enable Redis Cluster mode for deployments over 10,000 agents.