Compute: VMs, containers, serverless
In this series (10 parts)
- Cloud fundamentals and the shared responsibility model
- Compute: VMs, containers, serverless
- Networking in the cloud
- Cloud storage services
- Managed databases in the cloud
- Cloud IAM and access control
- Serverless architecture patterns
- Cloud cost management
- Multi-cloud and cloud-agnostic design
- Cloud Well-Architected Framework
Every application needs compute. The cloud gives you three broad options: virtual machines for full control, containers for portable workloads, and serverless for event-driven execution. Picking the right one depends on your workload characteristics, team skills, and cost tolerance.
Virtual machines
A virtual machine (VM) is a software-defined server running on shared physical hardware. You pick the CPU, memory, and disk. You install an OS. You SSH in and configure everything. It feels like a real server because, from your application’s perspective, it is one.
Provider implementations
| Feature | AWS EC2 | GCP Compute Engine | Azure Virtual Machines |
|---|---|---|---|
| Instance families | 600+ types | 50+ machine types | 700+ sizes |
| Custom sizing | Limited | Custom machine types | Constrained configs |
| Billing granularity | Per second | Per second | Per second |
| Spot pricing | Spot Instances | Preemptible/Spot VMs | Spot VMs |
When VMs make sense
VMs work best for legacy applications that expect a traditional OS, workloads requiring specific kernel modules, and software with licensing tied to hardware characteristics. They also serve as the escape hatch when a managed service does not support your exact configuration.
Auto-scaling groups
Static VM fleets waste money. Auto-scaling adjusts capacity based on demand.
graph LR
LB["Load Balancer"] --> ASG["Auto-Scaling Group"]
ASG --> VM1["VM 1"]
ASG --> VM2["VM 2"]
ASG --> VM3["VM 3 (new)"]
M["Metrics: CPU, requests"] --> ASG
style VM3 fill:#2ecc71,color:#fff
style M fill:#f39c12,color:#fff
The auto-scaling group launches or terminates VMs based on metric thresholds.
Key concepts:
- Minimum/maximum/desired capacity: Set guardrails so scaling does not go wild.
- Scaling policies: Target tracking (maintain 60% CPU) or step scaling (add 2 instances when CPU exceeds 80%).
- Cooldown periods: Prevent rapid oscillation by waiting between scaling actions.
- Health checks: Replace unhealthy instances automatically.
AWS calls this Auto Scaling Groups. GCP uses Managed Instance Groups. Azure uses Virtual Machine Scale Sets. The concepts are identical.
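The two scaling policy types above can be sketched as simple decision functions. This is an illustrative model, not any provider's API; the thresholds and step sizes are made up for the example:

```python
def target_tracking(desired: int, current_cpu: float, target_cpu: float = 60.0,
                    min_size: int = 2, max_size: int = 10) -> int:
    """Scale capacity proportionally so average CPU trends toward the target."""
    if current_cpu <= 0:
        return desired
    proposed = round(desired * current_cpu / target_cpu)
    return max(min_size, min(max_size, proposed))


def step_scaling(desired: int, current_cpu: float,
                 min_size: int = 2, max_size: int = 10) -> int:
    """Add or remove instances in fixed steps when CPU crosses thresholds."""
    if current_cpu > 80:
        desired += 2
    elif current_cpu < 30:
        desired -= 1
    return max(min_size, min(max_size, desired))


# At 90% CPU with a 60% target, target tracking grows 4 instances to 6.
print(target_tracking(desired=4, current_cpu=90.0))  # 6
print(step_scaling(desired=4, current_cpu=85.0))     # 6
```

Note how both functions clamp to the min/max guardrails, which is exactly what prevents scaling from "going wild" in a real auto-scaling group.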
Containers
Containers package an application with its dependencies into an isolated, portable unit. Unlike VMs, containers share the host OS kernel. This makes them lighter, faster to start, and more efficient with resources.
Container orchestration
Running one container is simple. Running hundreds across multiple hosts requires orchestration. Kubernetes dominates this space, but each provider also offers simpler alternatives.
| Service type | AWS | GCP | Azure |
|---|---|---|---|
| Managed Kubernetes | EKS | GKE | AKS |
| Simpler container service | ECS | Cloud Run | Azure Container Apps |
| Single container hosting | App Runner | Cloud Run | Container Instances (ACI) |
ECS, Cloud Run, and ACI
Not every team needs Kubernetes. These services let you deploy containers without managing cluster infrastructure.
AWS ECS uses task definitions to describe containers, services to maintain desired count, and integrates with Fargate for serverless container execution. No nodes to manage.
GCP Cloud Run takes a container image and runs it. It scales to zero when idle, meaning you pay nothing during quiet periods. Cold starts add latency on the first request after idle.
Azure Container Instances runs containers on demand without servers. Good for burst workloads and batch processing. Not designed for long-running services.
Containers vs VMs
Containers start in seconds. VMs take minutes. Containers use less memory overhead because they share the kernel. But containers provide weaker isolation than VMs. If your security model requires hard boundaries between tenants, VMs or dedicated hosts may be necessary.
Serverless functions
Serverless takes abstraction further. You write a function, upload it, and the provider runs it in response to events. No servers, no containers, no capacity planning.
Provider implementations
| Feature | AWS Lambda | GCP Cloud Functions | Azure Functions |
|---|---|---|---|
| Max execution time | 15 min | 60 min (2nd gen) | 10 min (Consumption) |
| Memory range | 128 MB - 10 GB | 128 MB - 32 GB | 128 MB - 14 GB |
| Languages | Node, Python, Java, Go, .NET, Ruby | Node, Python, Java, Go, .NET, Ruby, PHP | Node, Python, Java, C#, PowerShell, Go |
| Concurrency model | Per-invocation | Per-invocation | Per-invocation |
Event sources
Serverless functions are triggered by events: an HTTP request, a message on a queue, a file upload to object storage, a database change, or a scheduled timer.
S3 upload --> Lambda --> Process image --> Store in DynamoDB
API Gateway --> Lambda --> Query database --> Return JSON
CloudWatch Event --> Lambda --> Run nightly cleanup
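A minimal handler for the image-processing flow above might look like the sketch below. The event payload is a trimmed-down version of the S3 notification shape, and the processing step is a placeholder; real code would call the storage and database SDKs:

```python
def handler(event, context=None):
    """Triggered by an object-upload event; extracts bucket and key
    for each record. A real function would process the image and
    write a result to a database here."""
    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # Placeholder for actual work, e.g. resize the image at (bucket, key)
        results.append({"bucket": bucket, "key": key, "status": "processed"})
    return {"processed": len(results), "items": results}


# Local invocation with a simplified S3 event payload
fake_event = {"Records": [{"s3": {"bucket": {"name": "uploads"},
                                  "object": {"key": "cat.jpg"}}}]}
print(handler(fake_event))
```

Invoking the handler locally with a fake event like this is also a practical way to unit-test serverless functions without deploying them.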
Cold starts
When a function has not been invoked recently, the provider must initialize a new execution environment. This cold start adds latency, typically 100ms to several seconds depending on runtime and package size.
Strategies to mitigate cold starts:
- Keep deployment packages small.
- Use provisioned concurrency (Lambda) or minimum instances (Cloud Functions).
- Choose lighter runtimes. Python and Node start faster than Java.
- Avoid heavy initialization in the global scope.
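The last point deserves nuance: the goal is not to avoid module scope entirely, but to create expensive resources lazily and then cache them there so warm invocations reuse them. A sketch, where `make_connection` stands in for a real client constructor:

```python
_client = None  # module scope survives across warm invocations


def make_connection():
    """Stand-in for an expensive client constructor (DB driver, SDK client)."""
    return {"connected": True}


def get_client():
    """Create the client on first use; reuse it on every warm start."""
    global _client
    if _client is None:
        _client = make_connection()
    return _client


def handler(event, context=None):
    client = get_client()  # cold start pays the cost once; warm starts skip it
    return {"reused": client is get_client()}
```

This way the cold start is no slower than necessary, and warm invocations avoid reconnecting entirely.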
When serverless fits
Serverless excels at event-driven, bursty workloads. Processing uploaded images, handling webhooks, running scheduled tasks, and serving low-traffic APIs are good candidates. It struggles with long-running processes, workloads requiring persistent connections, and applications needing consistent sub-10ms latency.
Choosing the right compute model
The decision is not about which option is “best.” It is about which option fits your workload.
graph TD
Start["New workload"] --> Q1{"Need full OS control?"}
Q1 -->|Yes| VM["Use VMs"]
Q1 -->|No| Q2{"Event-driven and short-lived?"}
Q2 -->|Yes| Q3{"Execution under 15 min?"}
Q3 -->|Yes| Serverless["Use Serverless"]
Q3 -->|No| Container["Use Containers"]
Q2 -->|No| Q4{"Need fine-grained scaling?"}
Q4 -->|Yes| Container
Q4 -->|No| Q5{"Team knows Kubernetes?"}
Q5 -->|Yes| K8s["Use Managed K8s"]
Q5 -->|No| Simple["Use Simpler Container Service"]
style VM fill:#e74c3c,color:#fff
style Serverless fill:#2ecc71,color:#fff
style Container fill:#3498db,color:#fff
style K8s fill:#9b59b6,color:#fff
style Simple fill:#f39c12,color:#fff
Walk through the decision tree from top to bottom. Most teams land on containers or serverless.
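The decision tree can also be written as a small function, with boolean inputs mirroring the questions in the diagram; the argument names are my own shorthand for those questions:

```python
def choose_compute(needs_os_control: bool,
                   event_driven_short: bool,
                   under_15_min: bool = True,
                   fine_grained_scaling: bool = False,
                   team_knows_k8s: bool = False) -> str:
    """Walk the decision tree top to bottom and return a compute model."""
    if needs_os_control:
        return "VMs"
    if event_driven_short:
        return "Serverless" if under_15_min else "Containers"
    if fine_grained_scaling:
        return "Containers"
    return "Managed Kubernetes" if team_knows_k8s else "Simpler container service"


print(choose_compute(needs_os_control=False, event_driven_short=True))  # Serverless
```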
Cost comparison
Cost varies dramatically by usage pattern. Consider a workload handling 1 million requests per month with an average 200ms execution time: serverless wins at that low, bursty traffic level. At high, sustained traffic, reserved VMs or containers become cheaper.
The crossover point depends on your specific workload. Run the numbers. AWS, GCP, and Azure all provide pricing calculators.
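A rough back-of-the-envelope version of that calculation, using illustrative rates rather than current list prices (assumed here: $0.20 per million requests plus $0.0000167 per GB-second for serverless, and $30/month for a small always-on VM):

```python
def serverless_monthly_cost(requests: int,
                            ms_per_request: float = 200,
                            memory_gb: float = 0.5,
                            per_million_requests: float = 0.20,
                            per_gb_second: float = 0.0000167) -> float:
    """Estimate monthly serverless cost: request charge + compute charge."""
    gb_seconds = requests * (ms_per_request / 1000) * memory_gb
    return requests / 1e6 * per_million_requests + gb_seconds * per_gb_second


VM_MONTHLY = 30.0  # assumed price of a small always-on instance

for requests in (1_000_000, 10_000_000, 50_000_000):
    cost = serverless_monthly_cost(requests)
    cheaper = "serverless" if cost < VM_MONTHLY else "VM"
    print(f"{requests:>11,} req/mo: serverless ~ ${cost:7.2f} -> {cheaper} cheaper")
```

With these assumed rates, 1 million requests per month costs under $2 on serverless, while at 50 million requests the always-on VM wins. Swap in real prices from your provider's calculator before deciding.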
Combining compute models
Most production systems use multiple compute types. A common pattern:
- VMs for stateful workloads like databases (when not using managed services).
- Containers for the core application services that handle steady traffic.
- Serverless for event processing, scheduled jobs, and API endpoints with variable traffic.
This hybrid approach matches each workload to the most efficient compute model. It adds operational complexity, so adopt it incrementally rather than all at once.
Practical tips
- Start simple: Use the simplest compute model that meets your requirements. Migrate to more complex options only when you hit real limitations.
- Automate everything: Never manually configure a VM. Use launch templates, user data scripts, or better yet, containers with immutable images.
- Monitor utilization: Underutilized VMs are the most common source of cloud waste.
- Test scaling: Do not assume auto-scaling works correctly. Simulate load and verify that instances launch, pass health checks, and receive traffic.
- Use spot instances for batch work: Process data, run CI/CD builds, and train models at a fraction of on-demand cost.
What comes next
Compute needs a network. The next article covers cloud networking: VPC design, security groups, peering, and DNS. Understanding the network layer is essential for building secure, well-connected architectures.