Cloud cost management

In this series (10 parts)
  1. Cloud fundamentals and the shared responsibility model
  2. Compute: VMs, containers, serverless
  3. Networking in the cloud
  4. Cloud storage services
  5. Managed databases in the cloud
  6. Cloud IAM and access control
  7. Serverless architecture patterns
  8. Cloud cost management
  9. Multi-cloud and cloud-agnostic design
  10. Cloud Well-Architected Framework

A startup scaled from $800/month to $47,000/month in cloud bills within six months. Traffic grew 3x. Their bill grew nearly 59x. Nobody looked at cost dashboards until the CFO noticed. By then, hundreds of orphaned load balancers, oversized instances, and forgotten development environments had been burning money for months.

Cloud cost management is not about being cheap. It is about spending intentionally.

The three pricing models

Cloud providers offer three fundamental pricing models for compute. Understanding when to use each one determines whether you overpay by 10% or 70%.

On-demand

You pay a fixed hourly rate with no commitment. Start an instance, pay by the hour or second, stop it whenever you want. On-demand is the default and the most expensive option per unit of compute.

Use on-demand for unpredictable workloads, short experiments, and traffic spikes that exceed your baseline capacity. Avoid running steady-state production workloads at full on-demand pricing whenever a commitment or spot capacity is a realistic alternative.

Reserved instances and Savings Plans

You commit to a certain amount of compute usage for one or three years. In return, you get a discount of 30% to 72% depending on the commitment term and payment option.

Reserved Instances (RIs) lock you to a specific instance type in a specific region. Savings Plans are more flexible: you commit to a dollar amount of hourly compute spend, and with the broadest plan type the discount applies across instance families, regions, and even services.

Commitment type          Discount   Flexibility
-----------------------------------------------------------
1-year RI, no upfront    ~30%       Locked to instance type
1-year RI, all upfront   ~40%       Locked to instance type
3-year RI, all upfront   ~60%       Locked to instance type
1-year Savings Plan      ~30%       Any instance family
3-year Savings Plan      ~50%       Any instance family

The break-even point for a 1-year reserved commitment versus on-demand is roughly 7-8 months. If a workload will run longer than that, reservations save money.
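The break-even figure follows from simple arithmetic. The sketch below assumes the commitment is paid in full whether or not the instance runs, while on-demand is billed only for the months it actually runs; the function name is illustrative and the discounts are the approximate values from the table above.

```python
def break_even_months(discount: float, term_months: int = 12) -> float:
    """Months of continuous use at which a commitment costs the same as on-demand.

    Assumption: the commitment costs term_months * (1 - discount) months of
    on-demand spend regardless of usage, and on-demand costs one month of
    spend per month of actual usage.
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return term_months * (1 - discount)


# Approximate discounts taken from the table above.
for label, discount in [("no upfront, ~30%", 0.30), ("all upfront, ~40%", 0.40)]:
    months = break_even_months(discount)
    print(f"1-year RI, {label}: break-even after {months:.1f} months")
```

With these inputs the break-even lands between roughly 7 and 8.5 months, in line with the rule of thumb above. A deeper discount moves the break-even earlier.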

Spot instances

Cloud providers have unused capacity. Spot instances let you run on that capacity at discounts of 60-90% off on-demand prices. The catch: the provider can reclaim your instance on short notice, typically a 2-minute warning on AWS and as little as 30 seconds on some other providers.

Spot works for fault-tolerant workloads. Batch processing, CI/CD build agents, distributed training jobs, and stateless web servers behind a load balancer all handle interruptions gracefully. Databases and single-instance applications do not.

On-demand stays flat. Reserved offers a steady discount. Spot pricing fluctuates but stays dramatically cheaper.
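To see how mixing the three models changes a bill, here is a minimal sketch. The hourly rate, the workload split, and the discount values are all hypothetical numbers chosen for illustration (the discounts sit inside the ranges quoted above); real prices vary by provider, region, and instance type.

```python
ON_DEMAND_RATE = 0.10  # hypothetical price in dollars per instance-hour


def monthly_cost(instance_hours: float, discount: float = 0.0,
                 rate: float = ON_DEMAND_RATE) -> float:
    """Cost of a block of instance-hours at a given discount off on-demand."""
    return instance_hours * rate * (1 - discount)


# Hypothetical month: 10,000 instance-hours in total.
all_on_demand = monthly_cost(10_000)

mixed = (
    monthly_cost(6_000, discount=0.40)    # steady baseline on a commitment
    + monthly_cost(3_000, discount=0.70)  # fault-tolerant batch work on spot
    + monthly_cost(1_000)                 # unpredictable spikes on on-demand
)

print(f"All on-demand: ${all_on_demand:,.0f}")
print(f"Mixed models:  ${mixed:,.0f}")
print(f"Savings:       {1 - mixed / all_on_demand:.0%}")
```

The point of the sketch is the shape of the calculation, not the specific numbers: the more of the workload that fits the committed or interruptible categories, the larger the gap to the all-on-demand bill.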

Cost allocation tags

You cannot manage what you cannot see. Tags are key-value pairs attached to cloud resources that let you slice your bill by team, environment, project, or any other dimension.

Define a mandatory tagging policy before your cloud footprint grows:

Tag key          Example values        Purpose
-------------------------------------------------
team             platform, data, ml    Charge back to team
environment      prod, staging, dev    Find wasted dev spend
project          search-v2, migration  Track project costs
cost-center      eng-1234              Map to accounting

Enforce tags through IAM policies that deny resource creation when required tags are missing. Run weekly reports to catch untagged resources. Most organizations find that 15-30% of their cloud spend is untagged and therefore unattributable.
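A weekly untagged-resource report can be as simple as the sketch below. It assumes you have already exported an inventory as a list of records with `tags` and `monthly_cost` fields; that record shape, the resource IDs, and the function names are hypothetical, and the required keys mirror the table above.

```python
REQUIRED_TAGS = {"team", "environment", "project", "cost-center"}


def missing_tags(resource_tags: dict, required: set = REQUIRED_TAGS) -> set:
    """Return the required tag keys that are absent or have a blank value."""
    return {key for key in required if not resource_tags.get(key, "").strip()}


def untagged_spend_share(resources: list) -> float:
    """Fraction of monthly cost on resources missing at least one required tag."""
    total = sum(r["monthly_cost"] for r in resources)
    if total == 0:
        return 0.0
    untagged = sum(r["monthly_cost"] for r in resources if missing_tags(r["tags"]))
    return untagged / total


# Hypothetical inventory export.
inventory = [
    {"id": "vm-1", "monthly_cost": 700,
     "tags": {"team": "platform", "environment": "prod",
              "project": "search-v2", "cost-center": "eng-1234"}},
    {"id": "vm-2", "monthly_cost": 200,
     "tags": {"team": "data", "environment": "dev", "cost-center": "eng-1234"}},
    {"id": "lb-9", "monthly_cost": 100, "tags": {}},
]

for resource in inventory:
    gaps = missing_tags(resource["tags"])
    if gaps:
        print(f"{resource['id']}: missing {sorted(gaps)}")

print(f"Unattributable spend: {untagged_spend_share(inventory):.0%}")
```

Weighting the report by cost rather than by resource count keeps attention on the untagged resources that matter most to the bill.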

Rightsizing

Rightsizing means matching instance sizes to actual workload needs. It is the single most impactful cost optimization for most organizations.

Cloud providers offer rightsizing recommendations based on CPU and memory utilization metrics. A common finding: 40% of instances are at least one size too large.

The process is straightforward:

  1. Collect 14 days of CPU, memory, network, and disk metrics
  2. Identify instances averaging below 30% utilization
  3. Recommend one size smaller
  4. Test the smaller size in staging
  5. Apply to production during a maintenance window
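Steps 2 and 3 above can be sketched as follows. The size ladder and the function name are hypothetical, and the check uses averages only because that is what the process above describes; a production version would likely also look at peak utilization before recommending a downsize.

```python
from statistics import mean

# Hypothetical size ladder, smallest to largest.
SIZES = ["small", "medium", "large", "xlarge", "2xlarge"]


def recommend_size(current: str, cpu_pct: list, mem_pct: list,
                   threshold: float = 30.0) -> str:
    """Recommend one size smaller when average CPU and memory are both low.

    cpu_pct and mem_pct are utilization samples (0-100) from the
    observation window, for example 14 days of hourly averages.
    """
    if mean(cpu_pct) >= threshold or mean(mem_pct) >= threshold:
        return current  # at least one resource is well used; keep the size
    index = SIZES.index(current)
    return SIZES[max(index - 1, 0)]  # never go below the smallest size


print(recommend_size("large", cpu_pct=[10, 20, 15], mem_pct=[25, 20, 22]))
print(recommend_size("large", cpu_pct=[10, 20, 15], mem_pct=[60, 70, 65]))
```

Requiring both CPU and memory to be under the threshold avoids shrinking a memory-bound instance just because its CPU sits idle.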

Repeat quarterly. Workload patterns change. An instance that was correctly sized six months ago may be oversized today after a code optimization.

Compute dominates most cloud bills. Rightsizing compute has the largest impact on total spend.

S3 and object storage cost traps

Object storage seems cheap at $0.023 per GB per month for standard tiers. But costs compound through mechanisms that are easy to overlook.

Request charges. Every GET, PUT, and LIST call costs money. A service that lists millions of objects daily or writes tiny objects in a tight loop can generate surprising bills. Batch operations and larger object sizes reduce per-request overhead.

Lifecycle rules. Data rarely needs to stay in the most expensive storage tier forever. Move objects to infrequent access after 30 days, to cold storage after 90 days, and to archive after a year. Without lifecycle rules, every byte stays in standard storage indefinitely.

Incomplete multipart uploads. Failed large uploads leave orphaned parts in your bucket. These parts consume storage and cost money. Set a bucket lifecycle rule to abort incomplete multipart uploads after a few days.

Cross-region replication. Replicating data to another region doubles storage costs and adds data transfer charges. Only replicate what disaster recovery actually requires.

Versioning without expiration. Object versioning keeps every previous version of every object. Without expiration rules, storage grows without bound. Set a policy to delete old versions after a retention period.
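The lifecycle, multipart, and versioning advice above can be combined into one rule. The sketch below builds a configuration in the dictionary shape that boto3's `put_bucket_lifecycle_configuration` accepts as its `LifecycleConfiguration` argument; it only constructs the dictionary and does not call AWS. The rule ID, the default day counts for multipart cleanup and old versions, and the mapping of "cold storage" to `GLACIER` and "archive" to `DEEP_ARCHIVE` are assumptions made for illustration.

```python
def build_lifecycle_rules(ia_days: int = 30, cold_days: int = 90,
                          archive_days: int = 365,
                          abort_multipart_days: int = 7,
                          noncurrent_days: int = 30) -> dict:
    """Build an S3 lifecycle configuration covering tiering and cleanup."""
    if not ia_days < cold_days < archive_days:
        raise ValueError("transition days must increase from tier to tier")
    return {
        "Rules": [
            {
                "ID": "tier-and-clean-up",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix applies to the whole bucket
                "Transitions": [
                    {"Days": ia_days, "StorageClass": "STANDARD_IA"},
                    {"Days": cold_days, "StorageClass": "GLACIER"},
                    {"Days": archive_days, "StorageClass": "DEEP_ARCHIVE"},
                ],
                # Remove orphaned parts left behind by failed large uploads.
                "AbortIncompleteMultipartUpload": {
                    "DaysAfterInitiation": abort_multipart_days
                },
                # Delete old object versions after the retention period.
                "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
            }
        ]
    }


config = build_lifecycle_rules()
for transition in config["Rules"][0]["Transitions"]:
    print(f"day {transition['Days']:>3}: move to {transition['StorageClass']}")
```

Retention periods are a business decision, so treat the defaults here as placeholders to review with whoever owns the data.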

FinOps practices

FinOps is the practice of bringing financial accountability to cloud spending. It is not a tool. It is an operating model.

The core loop has three phases:

Inform. Make costs visible to every team. Share dashboards. Break down spending by team, service, and environment. Anomaly detection alerts catch unexpected spikes before they accumulate.

Optimize. Act on what the data reveals. Rightsize instances. Buy reservations for stable workloads. Delete unused resources. Schedule development environments to shut down outside business hours.

Operate. Embed cost awareness into engineering culture. Include cost impact in architecture reviews. Set budgets with alerts at 80% and 100% thresholds. Review spending in sprint retrospectives.

Week 1: Tag audit, identify untagged resources
Week 2: Rightsizing analysis, generate recommendations
Week 3: Reservation planning, calculate break-even
Week 4: Clean up unused resources, review anomalies

This four-week cadence keeps cost optimization continuous rather than a one-time project that drifts.
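One item from the Optimize phase, shutting down development environments outside business hours, is easy to quantify. The sketch below assumes the resources are billed only while running, which holds for compute but not for attached storage; the 12-hour, 5-day schedule is a hypothetical example.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def scheduled_savings(hours_on_per_day: float, days_on_per_week: int) -> float:
    """Fraction of always-on compute cost saved by running on a schedule."""
    running_hours = hours_on_per_day * days_on_per_week
    if not 0 <= running_hours <= HOURS_PER_WEEK:
        raise ValueError("schedule must fit within one week")
    return 1 - running_hours / HOURS_PER_WEEK


# Hypothetical schedule: 12 hours a day, Monday to Friday.
print(f"Saved versus always-on: {scheduled_savings(12, 5):.0%}")
```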

Budgets and alerts

Every cloud account should have budget alerts. Set them at the account level, the project level, and the team level.

Configure alerts at multiple thresholds. An alert at 50% of budget gives early warning. One at 80% signals urgency. One at 100% triggers an immediate review. Automated actions can shut down non-production environments when budgets are exceeded.

Forecasting tools in cloud consoles project your end-of-month spend based on the current run rate. Check these weekly. A sudden divergence between forecast and budget means something changed: traffic grew, a new service launched, or resources leaked.
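A run-rate forecast and threshold check can be sketched in a few lines. This is a simple linear projection, not what any particular console implements; the function names, spend figures, and budget are hypothetical, and the thresholds are the 50%, 80%, and 100% levels described above.

```python
import calendar
from datetime import date

THRESHOLDS = (0.5, 0.8, 1.0)


def forecast_month_end(spend_to_date: float, today: date) -> float:
    """Project end-of-month spend by extending the average daily run rate."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return spend_to_date / today.day * days_in_month


def crossed_thresholds(spend: float, budget: float,
                       thresholds: tuple = THRESHOLDS) -> list:
    """Return the budget thresholds that the given spend has reached."""
    return [t for t in thresholds if spend >= t * budget]


# Hypothetical account: $6,000 spent by June 10 against a $15,000 budget.
budget = 15_000
today = date(2024, 6, 10)
forecast = forecast_month_end(6_000, today)

print(f"Forecast: ${forecast:,.0f}")
print(f"Actual spend has crossed:   {crossed_thresholds(6_000, budget)}")
print(f"Forecast spend would cross: {crossed_thresholds(forecast, budget)}")
```

In this example actual spend has not yet tripped any alert, but the forecast already exceeds the budget, which is exactly the divergence worth investigating early.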

Practical checklist

Use this as a starting point for any cloud cost review:

  • All resources tagged with team, environment, project
  • Reserved instances or Savings Plans cover 60%+ of steady compute
  • Spot instances used for all fault-tolerant batch workloads
  • Rightsizing recommendations reviewed quarterly
  • S3 lifecycle rules on every bucket
  • Development environments auto-stop outside business hours
  • Budget alerts at 50%, 80%, 100% on every account
  • Monthly cost review with engineering and finance stakeholders

What comes next

Cost optimization often surfaces a harder question: are we locked into this provider? The next article on multi-cloud and cloud-agnostic design examines why companies use multiple clouds, what abstraction layers help, and when vendor lock-in is acceptable.
