Kubernetes is the most powerful infrastructure platform available — and one of the easiest to overspend on. Teams routinely run K8s clusters at 15–20% utilisation while paying for 100%. After optimising Kubernetes infrastructure for a dozen clients, I've distilled the eight strategies that consistently deliver the biggest savings.
Strategy 1: Right-Size Your Resources
Most teams set resource requests and limits conservatively — and never revisit them. A pod requesting 2 CPU and 4GB RAM that actually uses 0.3 CPU and 600MB RAM is wasting 85% of its allocated capacity. Use Kubernetes VPA (Vertical Pod Autoscaler) in recommendation mode for 2 weeks, then apply the suggested values.
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  updatePolicy:
    updateMode: 'Off' # recommendation mode — won't change pods automatically
  resourcePolicy:
    containerPolicies:
      - containerName: api
        minAllowed: { cpu: 50m, memory: 64Mi }
        maxAllowed: { cpu: 2, memory: 2Gi }
```
The 8 Strategies
- <strong>Right-size requests and limits.</strong> Use VPA recommendations. Typically saves 20–40% immediately.
- <strong>Use Spot/Preemptible instances for non-critical workloads.</strong> Spot instances are 60–90% cheaper. Use them for batch jobs, CI runners, and stateless services with proper disruption budgets.
- <strong>Implement cluster autoscaling.</strong> Scale down during off-peak hours. A dev/staging cluster running overnight is wasted spend.
- <strong>Use Reserved Instances for baseline capacity.</strong> Commit to 1-year Reserved Instances for your predictable baseline. 30–40% discount vs on-demand.
- <strong>Consolidate small clusters.</strong> Every cluster carries fixed overhead: control-plane fees, system daemonsets, monitoring agents. Running 10 small clusters is more expensive than 2 medium ones, and namespace isolation is sufficient for most use cases.
- <strong>Implement KEDA for event-driven scaling.</strong> Scale to zero when queues are empty. Huge savings for batch and async workloads.
- <strong>Optimise container image sizes.</strong> Smaller images mean faster pulls, less storage cost, and faster scaling events.
- <strong>Use Kubecost or OpenCost for visibility.</strong> You cannot optimise what you cannot see. Namespace-level cost visibility changes team behaviour immediately.
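To make strategy 2 concrete, here is a minimal sketch of a stateless workload pinned to Spot capacity with a disruption budget. It assumes EKS managed node groups, which label Spot nodes with `eks.amazonaws.com/capacityType: SPOT`; other providers use different node labels, and the `ci-runner` Deployment and image are hypothetical placeholders.

```yaml
# Schedule a stateless service onto Spot nodes (EKS label shown; other
# providers label Spot/preemptible capacity differently).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ci-runner
spec:
  replicas: 4
  selector:
    matchLabels: { app: ci-runner }
  template:
    metadata:
      labels: { app: ci-runner }
    spec:
      nodeSelector:
        eks.amazonaws.com/capacityType: SPOT
      containers:
        - name: runner
          image: ci-runner:latest # placeholder image
---
# Keep at least 3 of 4 replicas up during Spot reclaims and node drains.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ci-runner-pdb
spec:
  minAvailable: 3
  selector:
    matchLabels: { app: ci-runner }
```

The PodDisruptionBudget is what makes Spot safe here: voluntary evictions (drains during scale-down or node replacement) are throttled so the service never drops below its floor.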
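And for strategy 6, a hedged sketch of a KEDA ScaledObject that scales a queue worker to zero when its queue is empty. The `worker` Deployment, queue URL, and `keda-aws-auth` TriggerAuthentication are placeholder names you would replace with your own; the trigger shown is KEDA's AWS SQS scaler, but the same pattern applies to RabbitMQ, Kafka, and other event sources.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: worker-scaler
spec:
  scaleTargetRef:
    name: worker            # hypothetical Deployment to scale
  minReplicaCount: 0        # scale to zero when the queue is empty
  maxReplicaCount: 20
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.eu-west-1.amazonaws.com/123456789012/jobs # placeholder
        queueLength: "5"    # target messages per replica
        awsRegion: eu-west-1
      authenticationRef:
        name: keda-aws-auth # placeholder TriggerAuthentication
```

With `minReplicaCount: 0`, idle workers cost nothing between batches, which is where the savings for async workloads come from.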
For a SaaS client running 3 EKS clusters with a £28K/month AWS bill, applying strategies 1, 2, 3, and 4 reduced the bill to £17K/month in 6 weeks — a 39% saving with zero impact on reliability or performance.
Got a project in mind?
I work directly with founders and CTOs to build reliable, scalable software. Let's have a conversation about your goals.
Get a quote