Configuration management
In this series (10 parts)
- What DevOps actually is
- The software delivery lifecycle
- Agile, Scrum, and Kanban for DevOps teams
- Trunk-based development and branching strategies
- Environments and promotion strategies
- Configuration management
- Secrets management
- Deployment strategies
- On-call culture and incident management
- DevOps metrics and measuring maturity
Your application does not live in a vacuum. It connects to databases, calls external APIs, toggles features on and off, and adjusts its behavior depending on which environment it runs in. All of that state lives in configuration. The code stays the same across environments. Configuration is what changes.
Getting this wrong leads to leaked credentials, broken deployments, and hours of debugging differences between staging and production. Getting it right means your application is portable, secure, and easy to reason about.
The twelve-factor config principle
The Twelve-Factor App methodology defines config as “everything that varies between deploys.” Database URLs, API keys, feature flags, log levels, third-party service credentials. If it changes between staging and production, it is config.
The core rule: strict separation of config from code. You should be able to open-source your codebase right now without exposing a single credential or environment-specific value.
This is more than a good idea. It is a litmus test. If publishing your repo would leak secrets, config is leaking into code.
# Bad: hardcoded in source
DATABASE_URL = "postgres://admin:s3cret@prod-db:5432/app"
# Good: pulled from environment
import os
DATABASE_URL = os.environ["DATABASE_URL"]
Twelve-factor says to store config in environment variables. That advice is sound as a baseline but incomplete for production systems at scale.
Environment variables: strengths and limits
Environment variables are the simplest config delivery mechanism. Every language reads them. Every container runtime supports them. Every CI/CD system injects them.
They work well for small sets of values. A database URL here, a log level there. No files to manage, no libraries to import.
But they have real limits:
- No structure. Every value is a flat string. Nested config requires naming conventions like DB_PRIMARY_HOST and DB_PRIMARY_PORT.
- No versioning. Who changed MAX_RETRIES from 3 to 10? When? Nobody knows.
- Process visibility. On Linux, any process running as the same user can read /proc/<pid>/environ. Child processes inherit the full environment by default.
- No validation. A typo in a variable name produces a silent failure, not a crash.
For a team of five running three services, environment variables are fine. For fifty services across four environments, you need something more.
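The missing-validation limit is the easiest one to fix. A minimal fail-fast loader in Python (the variable names are illustrative, not from any particular service):

```python
import os

def load_required_env(names):
    """Read required environment variables, crashing loudly if any are missing."""
    missing = [name for name in names if name not in os.environ]
    if missing:
        raise RuntimeError("missing required env vars: " + ", ".join(missing))
    return {name: os.environ[name] for name in names}
```

Call it once at startup so a typo in a variable name surfaces as an immediate crash rather than a silent default.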
Config files: structured but static
Config files give you structure that environment variables lack. YAML, JSON, TOML, or HCL files let you group related settings, add comments, and validate schemas.
# config/production.yaml
database:
host: prod-db.internal
port: 5432
pool_size: 20
ssl: true
logging:
level: warn
format: json
features:
new_checkout: true
dark_mode: false
You get hierarchy, types, and documentation in one place. Version control tracks every change. Code review gates modifications.
The tradeoff is deployment coupling. Changing a config file means building and deploying the application. That is fine for settings that rarely change. It is a problem for anything you want to adjust without a redeploy.
Config services: dynamic and centralized
Config services like Consul, etcd, AWS AppConfig, and Spring Cloud Config sit between your application and its configuration. The app fetches config at startup or watches for changes at runtime.
graph LR
A["Application"] -->|"GET /config/app-name"| CS["Config Service<br/>(Consul / etcd / AppConfig)"]
CS -->|"JSON response"| A
CS --> S["Encrypted Store"]
CS --> AL["Audit Log"]
D["Developer"] -->|"Update via CLI/UI"| CS
Config service as a centralized source of truth with audit logging.
Benefits are significant:
- Dynamic updates without redeployment
- Centralized audit trail showing who changed what and when
- Access control per service, per environment
- Encryption at rest for sensitive values
- Versioning and rollback built in
The cost is operational complexity. You now depend on a config service being available. If Consul goes down, can your app start? You need caching, fallback values, and health checks around your config fetches.
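The caching-and-fallback pattern can be small. A sketch assuming a hypothetical HTTP config endpoint (the URL and keys are illustrative; real clients for Consul or AppConfig provide richer APIs):

```python
import json
import urllib.request

# Safe defaults baked into code: the app can always start with these
FALLBACK = {"log_level": "info", "max_retries": 3}
_cache = dict(FALLBACK)

def fetch_config(url, timeout=2.0):
    """Fetch config from the service; fall back to the last good copy on failure."""
    global _cache
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            _cache = json.loads(resp.read())
    except OSError:
        pass  # service unreachable: keep serving cached (or fallback) values
    return _cache
```

If the config service is down at startup, the app still boots with defaults instead of crash-looping.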
Secrets vs non-sensitive config
Not all config is created equal. A log level and a database password require very different handling.
| Aspect | Non-sensitive config | Secrets |
|---|---|---|
| Storage | Config files, env vars, config service | Vault, AWS Secrets Manager, sealed secrets |
| Access control | Team-wide read | Least privilege, role-based |
| Rotation | Change when needed | Regular rotation schedule |
| Audit | Nice to have | Mandatory |
| Encryption | Optional | Required at rest and in transit |
The mistake teams make is treating everything as a secret or treating nothing as a secret. Encrypting your log level adds friction with no security benefit. Storing your database password in a plain YAML file in Git is a breach waiting to happen.
Draw a clear line. Anything that grants access to a system or contains user data is a secret. Everything else is config. Then handle each category appropriately.
For deeper coverage on managing secrets specifically, see secrets management.
Runtime vs build-time config
Build-time config is baked into the artifact. Think of compiled feature flags, minified JavaScript bundles with API endpoints, or Docker images with embedded config files. Once built, these values cannot change without rebuilding.
Runtime config is read when the application starts or while it runs. Environment variables, config service fetches, and file watches all deliver runtime config.
graph TD
subgraph Build Time
BC["Build Config<br/>API_URL, CDN_HOST"] --> BA["Build Artifact<br/>(Docker image, JS bundle)"]
end
subgraph Runtime
BA --> APP["Running Application"]
RC["Runtime Config<br/>DB_URL, LOG_LEVEL, FLAGS"] --> APP
end
Build-time config is frozen in the artifact. Runtime config can vary per deploy.
The rule of thumb: minimize build-time config. Every value baked into a build means a new artifact for each environment. If staging and production need different API endpoints and those endpoints are build-time config, you need two separate builds. That violates the twelve-factor principle of one build, many deploys.
Some values must be build-time. Frontend apps that compile JavaScript need API URLs at build time unless you use a config injection pattern at startup. Backend services have almost no reason for build-time config beyond the language and framework version.
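The startup injection pattern mentioned above can be as simple as writing a tiny script from environment variables when the container starts, so one frontend build serves every environment. A hedged sketch (file and variable names are illustrative):

```python
import json
import os

def render_runtime_config(out_path="config.js"):
    """Write window.APP_CONFIG from env vars at container start, before serving static files."""
    cfg = {"apiUrl": os.environ.get("API_URL", "https://api.example.com")}
    with open(out_path, "w") as f:
        f.write("window.APP_CONFIG = " + json.dumps(cfg) + ";\n")
```

The compiled bundle reads window.APP_CONFIG at load time, turning a build-time value into a runtime one.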
Feature flags as config
Feature flags are a special category of configuration. They control what code paths execute, who sees new features, and how rollouts progress.
Simple flags are booleans:
{
"new_checkout_flow": true,
"experimental_search": false
}
Advanced flags carry targeting rules:
{
"new_checkout_flow": {
"enabled": true,
"rollout_percentage": 25,
"allowed_regions": ["us-east-1", "eu-west-1"],
"excluded_user_ids": [101, 202]
}
}
Feature flags sit in a gray area between config and code. They change application behavior. They have dependencies. They accumulate technical debt if left in place after a full rollout.
Treat flags with discipline:
- Every flag has an owner and an expiration date. No orphan flags.
- Remove flags after full rollout. A flag at 100% for six months is dead code with extra steps.
- Test both paths. Your CI pipeline should exercise the on and off states.
- Separate operational flags from release flags. A circuit breaker flag that disables a failing dependency is operational. A flag that shows a new UI is a release mechanism. They have different lifecycles.
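The targeting rules in the advanced flag above can be evaluated deterministically by hashing the user ID into a rollout bucket, so the same user always gets the same answer. A sketch (the rule shape mirrors the JSON example; this is not a real flag SDK):

```python
import hashlib

def flag_enabled(flag, user_id, region):
    """Evaluate a flag dict shaped like the new_checkout_flow example."""
    if not flag["enabled"]:
        return False
    if user_id in flag.get("excluded_user_ids", []):
        return False
    regions = flag.get("allowed_regions")
    if regions and region not in regions:
        return False
    # Deterministic bucket 0-99: stable per user across requests and restarts
    bucket = int(hashlib.sha256(str(user_id).encode()).hexdigest(), 16) % 100
    return bucket < flag.get("rollout_percentage", 100)

flag = {
    "enabled": True,
    "rollout_percentage": 25,
    "allowed_regions": ["us-east-1", "eu-west-1"],
    "excluded_user_ids": [101, 202],
}
```

Raising rollout_percentage from 25 to 50 adds new users without flipping anyone who already had the feature.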
Putting it all together
A healthy config strategy layers these mechanisms:
graph TD
D["Defaults in Code"] --> CF["Config Files<br/>(version-controlled)"]
CF --> EV["Environment Variables<br/>(per-environment overrides)"]
EV --> CS["Config Service<br/>(dynamic runtime config)"]
CS --> SM["Secrets Manager<br/>(credentials, keys, tokens)"]
SM --> FF["Feature Flag Service<br/>(rollout control)"]
style SM fill:#f96,stroke:#333
Config hierarchy: each layer overrides the one below it.
Defaults live in code. Config files capture environment-specific overrides. Environment variables handle deployment-level tuning. A config service provides dynamic changes without redeploys. A secrets manager handles credentials. Feature flags control rollout mechanics.
Each layer has a clear purpose. Each layer overrides the one below it. And each layer has appropriate access controls, audit logging, and encryption for the sensitivity of the data it holds.
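Mechanically, the layering reduces to a deep merge where later sources win. A minimal sketch with illustrative keys:

```python
def deep_merge(base, override):
    """Recursively merge override onto base; values from later layers win."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

defaults = {"logging": {"level": "info"}, "pool_size": 10}
file_config = {"logging": {"level": "warn"}}   # environment-specific file
env_overrides = {"pool_size": 20}              # per-deployment tuning

config = deep_merge(deep_merge(defaults, file_config), env_overrides)
# config == {"logging": {"level": "warn"}, "pool_size": 20}
```

The nested logging dict is merged key by key rather than replaced wholesale, which is what lets a file override one setting without restating the rest.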
Common mistakes
Committing .env files to Git. Add .env to .gitignore on day one. Provide a .env.example with placeholder values instead.
Using the same secret across environments. If staging and production share a database password, a staging breach becomes a production breach.
No schema validation. Applications should fail fast if required config is missing or malformed. Do not let a missing DATABASE_URL surface as a cryptic connection error ten minutes after startup.
Config drift. Manually tweaking a production server’s config creates invisible differences. All config changes should flow through version control or a config service.
What comes next
Configuration is one half of the problem. The other half is managing the credentials that config points to. Rotating database passwords, issuing short-lived tokens, and auditing who accessed what. That is secrets management, and it builds directly on everything covered here.