# Environments and promotion strategies

## In this series (10 parts)
- What DevOps actually is
- The software delivery lifecycle
- Agile, Scrum, and Kanban for DevOps teams
- Trunk-based development and branching strategies
- Environments and promotion strategies
- Configuration management
- Secrets management
- Deployment strategies
- On-call culture and incident management
- DevOps metrics and measuring maturity
Every software team has at least two environments: the place where developers work and the place where users interact with the product. What happens between those two endpoints determines whether deployments are routine or terrifying. Environment strategy is the bridge between “it works on my machine” and “it works in production.”
## The classic three: dev, staging, production
Most organizations start with three environments:
- Dev (or development). Where developers run and test code locally or on shared infrastructure. Configuration is loose. Data is synthetic.
- Staging. A pre-production environment that mimics production as closely as possible. Used for final validation before release.
- Production. Where real users interact with the system. Uptime matters. Data is real.
This model is a starting point, not a destination. It breaks down as teams grow and deployment frequency increases.
```mermaid
graph LR
    Dev["Dev Environment"] -->|"build artifact"| CI["CI Pipeline"]
    CI -->|"deploy"| Staging["Staging"]
    Staging -->|"manual approval"| Prod["Production"]
```
The classic promotion path. An artifact builds in CI, deploys to staging for validation, then promotes to production after approval.
## Why three environments is often not enough
Several forces push teams beyond the basic three:
Shared staging becomes a bottleneck. When five teams share one staging environment, deploying team A’s changes blocks team B’s testing. Queues form. Teams start scheduling staging slots. Deployment frequency drops.
Integration testing needs isolation. Some tests require a full system stack but should not pollute staging data or depend on staging availability.
Compliance requires separation. Regulated industries often mandate distinct environments for security testing, performance testing, and user acceptance testing (UAT).
Demo environments. Sales teams need stable environments to demonstrate features to customers. These cannot break because a developer is testing a migration.
A more realistic environment topology:
```mermaid
graph TD
    Dev["Developer Local"] --> CI["CI Pipeline"]
    CI --> IntTest["Integration Test"]
    CI --> PerfTest["Performance Test"]
    CI --> SecurityScan["Security Scan"]
    IntTest --> Staging["Staging"]
    PerfTest --> Staging
    SecurityScan --> Staging
    Staging --> Canary["Canary (5% traffic)"]
    Canary --> Prod["Production (100%)"]
```
A production-grade promotion path. Artifacts pass through multiple validation stages before reaching full production traffic. The canary stage catches issues that only appear under real user behavior.
## Environment parity
Environment parity means every environment runs the same operating system, the same runtime versions, the same database engine, and the same network topology. Differences between environments are the most common cause of “works in staging, fails in production” bugs.
Common parity violations:
| Difference | What breaks |
|---|---|
| Different database engine (SQLite in dev, PostgreSQL in prod) | Query behavior, type handling, migration compatibility |
| Different OS (macOS in dev, Linux in prod) | File path handling, system call behavior, package availability |
| Missing services (no Redis in staging) | Caching behavior, session management, rate limiting |
| Different resource limits (unlimited CPU in dev, constrained in prod) | Performance characteristics, timeout behavior |
| Stale data (staging data from six months ago) | Queries that work on old schemas fail on new ones |
The fix is infrastructure-as-code. When every environment is defined by the same configuration templates, parity becomes the default instead of the aspiration. Terraform and similar tools make this practical by expressing infrastructure as declarative code that can be versioned, reviewed, and applied consistently.
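As a toy illustration of parity checking (not a real tool), the idea can be reduced to diffing a declared stack description per environment. The dictionaries and the `diff_environments` helper below are hypothetical; a real setup would derive these descriptions from IaC state rather than hand-written literals:

```python
# Hypothetical sketch: detect parity violations by diffing per-environment
# "stack descriptions". All keys and values here are illustrative.

def diff_environments(reference: dict, candidate: dict) -> list[str]:
    """Return human-readable parity violations between two environments."""
    violations = []
    for key in sorted(set(reference) | set(candidate)):
        ref, cand = reference.get(key), candidate.get(key)
        if ref != cand:
            violations.append(f"{key}: {cand!r} (expected {ref!r})")
    return violations

production = {"os": "linux", "db": "postgresql-16", "cache": "redis-7"}
staging = {"os": "linux", "db": "postgresql-16", "cache": None}  # missing Redis
dev = {"os": "macos", "db": "sqlite", "cache": "redis-7"}

print(diff_environments(production, staging))  # cache mismatch
print(diff_environments(production, dev))      # os and db mismatch
```

The same comparison run in CI against real IaC output is a cheap early warning that staging has quietly drifted away from production.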
## Ephemeral environments
Ephemeral environments are created on demand, used for a specific purpose, and destroyed when that purpose is complete. A pull request opens, a fresh environment spins up, tests run against it, reviewers interact with it, and when the PR merges, the environment is deleted.
Benefits:
- No contention. Every PR gets its own environment. No scheduling. No blocking.
- Perfect parity. Each ephemeral environment is created from the same infrastructure-as-code templates as production.
- Clean state. No leftover data from previous tests. No accumulated configuration drift.
- Cost efficiency. Environments exist only while needed. Cloud billing reflects actual usage.
The trade-off is complexity. Ephemeral environments require:
- Infrastructure-as-code that can provision a full stack in minutes
- A container orchestration platform (Kubernetes, ECS, or similar)
- Seed data management to populate databases for testing
- DNS or routing configuration to make the environment accessible
- Cleanup automation to destroy environments when they are no longer needed
Tools like Vercel preview deployments, Argo CD pull request generators, and Terraform workspaces make this increasingly accessible. What once required a dedicated platform team is now a CI pipeline configuration.
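One small, concrete piece of the routing requirement above can be sketched in code: deriving a predictable, DNS-safe name and preview URL for each pull request's environment. The naming scheme and the `preview.example.com` domain are assumptions for illustration, not any particular platform's convention:

```python
import re

# Sketch (assumed conventions): each PR gets its own environment, reachable
# at a predictable subdomain. DNS labels allow only lowercase alphanumerics
# and hyphens, with a 63-character limit per label.

def ephemeral_env_name(pr_number: int, branch: str, max_len: int = 63) -> str:
    # Slugify the branch name into a valid DNS label fragment.
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    name = f"pr-{pr_number}-{slug}"
    return name[:max_len].rstrip("-")

def preview_url(pr_number: int, branch: str) -> str:
    # example.com is a placeholder domain.
    return f"https://{ephemeral_env_name(pr_number, branch)}.preview.example.com"

print(preview_url(123, "feature/Add_Billing"))
# https://pr-123-feature-add-billing.preview.example.com
```

The cleanup half of the lifecycle is usually just the inverse: a CI job triggered on PR close that destroys whatever this name points at.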
## Artifact promotion
The core principle: build once, deploy everywhere. An artifact (container image, binary, bundle) is built exactly once in CI. That same artifact is deployed to every environment. Only configuration changes between environments.
This means:
- No rebuilding per environment. The staging artifact is the production artifact.
- Configuration is injected. Database URLs, API keys, and feature flags are environment variables or secrets, never baked into the artifact.
- Versioning is immutable. Artifact version `v2.3.47` always contains the same code, regardless of where it runs.
```mermaid
graph LR
    Build["Build Artifact v2.3.47"] --> Dev["Dev Config"]
    Build --> Staging["Staging Config"]
    Build --> Prod["Prod Config"]
    Dev --> DevEnv["Dev Environment"]
    Staging --> StagingEnv["Staging Environment"]
    Prod --> ProdEnv["Production"]
```
Build once, configure per environment. The artifact is immutable. Only the configuration layer changes between environments.
Anti-pattern: rebuilding the application for each environment. This introduces the possibility that dev builds differ from production builds due to different dependency resolution, different build flags, or different environment variables at build time.
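In application code, "configuration is injected" typically means reading values from the process environment at startup and failing fast if they are missing. A minimal sketch, with illustrative variable names (`DATABASE_URL`, `FEATURE_FLAGS`):

```python
import os
from dataclasses import dataclass

# Sketch: the artifact never embeds environment-specific values. At startup
# it reads them from the process environment. Variable names are illustrative.

@dataclass(frozen=True)
class Config:
    database_url: str
    feature_flags: frozenset[str]

def load_config(env=os.environ) -> Config:
    try:
        database_url = env["DATABASE_URL"]
    except KeyError:
        # Fail fast: a missing value is a deployment error, not a code error.
        raise RuntimeError("DATABASE_URL must be injected at deploy time") from None
    flags = frozenset(f for f in env.get("FEATURE_FLAGS", "").split(",") if f)
    return Config(database_url=database_url, feature_flags=flags)
```

Because the artifact contains only this loading logic and no actual values, the exact same image can start up as dev, staging, or production depending on what the platform injects.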
## Infrastructure-as-code as the mechanism
Environment parity is impossible to maintain manually at scale. Infrastructure-as-code (IaC) solves this by defining environments in version-controlled configuration files.
The workflow:
- Define environment infrastructure in code (Terraform, Pulumi, CloudFormation).
- Store the code in the same repository as the application, or in a dedicated infrastructure repository.
- Use CI/CD to apply infrastructure changes through the same promotion pipeline as application code.
- Drift detection alerts when an environment diverges from its declared state.
This approach gives you:
- Reproducibility. Delete an environment and recreate it identically from code.
- Auditability. Every infrastructure change is a commit with an author, a timestamp, and a review trail.
- Consistency. The same Terraform module provisions dev, staging, and production. Parameters control the differences (instance size, replica count).
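The "same module, different parameters" idea can be shown in miniature. In Terraform this would be one module invoked with different variable values; here plain Python dicts stand in for it, and every value is illustrative:

```python
# Sketch of "one template, per-environment parameters". The shared base
# fixes everything that must be identical; overrides control only sizing.

BASE = {
    "runtime": "python-3.12",
    "db_engine": "postgresql-16",
    "replicas": 1,
    "instance_size": "small",
}

OVERRIDES = {
    "dev": {},
    "staging": {"replicas": 2},
    "production": {"replicas": 6, "instance_size": "large"},
}

def render_environment(name: str) -> dict:
    """Merge the shared base with per-environment parameters."""
    return {**BASE, **OVERRIDES[name]}

# Engines and runtimes stay identical everywhere; only sizing differs.
assert render_environment("dev")["db_engine"] == render_environment("production")["db_engine"]
```

The key property is that a parity-breaking change (say, a different database engine in staging) is impossible to make by accident: it would require editing the shared base, which every environment sees.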
## Secrets and configuration management
Environments differ in their secrets: database passwords, API keys, TLS certificates. These must never be stored in source code or baked into artifacts.
Common approaches:
- Environment variables. Simple and universal. Set them in the deployment platform.
- Secrets managers. AWS Secrets Manager, HashiCorp Vault, or similar. Applications fetch secrets at runtime. Rotation happens without redeployment.
- Sealed secrets. Encrypted secrets stored in Git. Only the target cluster can decrypt them. Useful for GitOps workflows.
The rule: secrets are injected at deployment time, never at build time. If your CI pipeline needs a secret to build the application, your build process has a design problem.
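In application terms, the rule looks like the sketch below: secrets are read at runtime and a missing secret is a hard failure, never a silent default. This version reads only environment variables; a real deployment might instead call a secrets manager client (Vault, AWS Secrets Manager) at startup. The names are illustrative:

```python
import os

# Sketch: secrets reach the process at deploy/run time, never at build time.
# Swapping the environment lookup for a secrets-manager call keeps the
# interface identical to the rest of the application.

class MissingSecret(RuntimeError):
    pass

def get_secret(name: str) -> str:
    value = os.environ.get(name)
    if value is None:
        # Fail fast and loudly: a missing secret is a deployment error,
        # not something to paper over with a default value.
        raise MissingSecret(f"secret {name} was not injected at deploy time")
    return value
```

Keeping the lookup behind one function also makes rotation easier: when you move from environment variables to a secrets manager, only `get_secret` changes.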
## Promotion strategies
How an artifact moves through environments:
Linear promotion. Dev to staging to production. Simple. Each environment acts as a gate.
Parallel validation. The artifact deploys to integration, performance, and security environments simultaneously. All must pass before staging.
Canary promotion. The artifact deploys to production but only receives a small percentage of traffic. Metrics are compared against the baseline. If error rates or latency increase, the deployment is rolled back automatically.
Blue-green promotion. Two identical production environments exist. One serves traffic (blue). The new version deploys to the idle environment (green). Traffic switches atomically. If something fails, switch back.
| Strategy | Risk level | Rollback speed | Infrastructure cost |
|---|---|---|---|
| Linear | Medium | Minutes (redeploy previous version) | Low |
| Canary | Low | Seconds (shift traffic back) | Medium |
| Blue-green | Low | Seconds (DNS/load balancer switch) | High (double infrastructure) |
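The automated rollback decision in a canary promotion can be sketched as a simple metric comparison. The 1% error-rate allowance and 20% latency allowance below are arbitrary illustrative thresholds, not recommended values:

```python
# Sketch of an automated canary gate: compare canary metrics against the
# baseline and roll back on regression. Thresholds are illustrative only.

def should_rollback(baseline: dict, canary: dict,
                    max_error_delta: float = 0.01,
                    max_latency_ratio: float = 1.2) -> bool:
    error_regressed = canary["error_rate"] - baseline["error_rate"] > max_error_delta
    latency_regressed = canary["p99_ms"] > baseline["p99_ms"] * max_latency_ratio
    return error_regressed or latency_regressed

baseline = {"error_rate": 0.002, "p99_ms": 180.0}
healthy_canary = {"error_rate": 0.003, "p99_ms": 190.0}
bad_canary = {"error_rate": 0.025, "p99_ms": 185.0}

print(should_rollback(baseline, healthy_canary))  # False
print(should_rollback(baseline, bad_canary))      # True
```

Production canary analysis tools run this comparison continuously over a sliding window and across many metrics, but the shape of the decision is the same.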
## What comes next
This article concludes the introductory portion of the DevOps Fundamentals series. The next set of articles dives into CI/CD pipelines, starting with continuous integration: how to set up build automation, test strategies, and pipeline design patterns that make the promotion path described here fully automated.