Secrets management in practice
Every application has secrets: database credentials, API keys, TLS certificates, encryption keys. How you store, distribute, and rotate these secrets determines whether a single compromise cascades into a full breach.
HashiCorp Vault deployment patterns
Vault is one of the most widely adopted secrets management platforms. It centralizes secrets storage, provides dynamic credentials, and maintains an audit log of every access.
Architecture
```mermaid
graph TB
    subgraph Vault Cluster
        V1[Vault Node 1<br/>Active]
        V2[Vault Node 2<br/>Standby]
        V3[Vault Node 3<br/>Standby]
        V1 --- V2
        V2 --- V3
        V1 --- V3
    end
    subgraph Storage Backend
        C[(Consul / Raft)]
    end
    subgraph Consumers
        K8s[Kubernetes Pods]
        CI[CI/CD Pipelines]
        Dev[Developer CLI]
    end
    V1 --> C
    K8s --> V1
    CI --> V1
    Dev --> V1
    style V1 fill:#2ecc71,color:#fff
    style V2 fill:#3498db,color:#fff
    style V3 fill:#3498db,color:#fff
```

Vault HA cluster with Raft storage. One active node handles requests while standby nodes replicate data and provide failover.
Deploy Vault with integrated Raft storage for simplicity:
```hcl
# vault-config.hcl
storage "raft" {
  path    = "/vault/data"
  node_id = "vault-1"
}

listener "tcp" {
  address       = "0.0.0.0:8200"
  tls_cert_file = "/vault/tls/tls.crt"
  tls_key_file  = "/vault/tls/tls.key"
}

api_addr     = "https://vault-1.vault.svc:8200"
cluster_addr = "https://vault-1.vault.svc:8201"
```
Vault on Kubernetes
The official Helm chart deploys Vault in HA mode with integrated Raft storage:

```bash
helm repo add hashicorp https://helm.releases.hashicorp.com
helm install vault hashicorp/vault \
  --set server.ha.enabled=true \
  --set server.ha.raft.enabled=true \
  --set server.ha.replicas=3
```

To enable auto-unseal, add a `seal "awskms"` stanza to the server configuration the chart renders, with `kms_key_id` pointing at your KMS key (for example, `alias/vault-unseal`). AWS KMS auto-unseal eliminates the operational burden of manual unsealing after restarts.
Authentication methods
Vault supports multiple authentication backends. For Kubernetes workloads, use the Kubernetes auth method:
```bash
vault auth enable kubernetes

vault write auth/kubernetes/config \
  kubernetes_host="https://kubernetes.default.svc"

vault write auth/kubernetes/role/api-app \
  bound_service_account_names=api-app \
  bound_service_account_namespaces=production \
  policies=api-app-policy \
  ttl=1h
```
Pods authenticate using their service account token. No long-lived credentials are needed. Vault validates the token with the Kubernetes API server.
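Under the hood, the login exchange is a single HTTP call: the pod POSTs its service account JWT to Vault's Kubernetes auth endpoint and receives a Vault client token in return. A minimal sketch using only the standard library (the Vault address and role name are illustrative):

```python
import json
import urllib.request

# Default mount path for the projected service account token inside a pod.
SA_TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"

def build_login_payload(role: str, jwt: str) -> dict:
    """Request body for POST /v1/auth/kubernetes/login."""
    return {"role": role, "jwt": jwt}

def kubernetes_login(vault_addr: str, role: str, token_path: str = SA_TOKEN_PATH) -> str:
    """Exchange the pod's service account token for a Vault client token."""
    with open(token_path) as f:
        jwt = f.read().strip()
    req = urllib.request.Request(
        f"{vault_addr}/v1/auth/kubernetes/login",
        data=json.dumps(build_login_payload(role, jwt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # The client token is returned under auth.client_token.
    return body["auth"]["client_token"]
```

In production you would typically use a client library such as hvac or the Vault Agent sidecar instead of hand-rolling this call, but the wire exchange is the same.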
Secrets engines
KV (Key-Value): Static secrets with versioning:
```bash
vault kv put secret/api-app/database \
  username="app_user" \
  password="s3cret"
```
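Note that the KV v2 API nests the stored keys under a `data.data` envelope in its JSON responses, with version metadata alongside. A small helper to unwrap it (the response shape shown matches the secret written above; the helper name is illustrative):

```python
def unwrap_kv2(response: dict) -> dict:
    """Extract the stored key/value pairs from a KV v2 read response.

    KV v2 wraps the secret under data.data; data.metadata carries the
    version information that powers Vault's secret versioning.
    """
    return response["data"]["data"]

# Example response shape from GET /v1/secret/data/api-app/database
response = {
    "data": {
        "data": {"username": "app_user", "password": "s3cret"},
        "metadata": {"version": 1},
    }
}
creds = unwrap_kv2(response)
```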
Database: Dynamic credentials with automatic expiry:
```bash
vault write database/config/postgres \
  plugin_name=postgresql-database-plugin \
  connection_url="postgresql://{{username}}:{{password}}@db:5432/app" \
  allowed_roles="api-app" \
  username="vault_admin" \
  password="admin_pass"

vault write database/roles/api-app \
  db_name=postgres \
  creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; GRANT SELECT, INSERT, UPDATE ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
  default_ttl="1h" \
  max_ttl="24h"
```
Each application instance gets unique credentials that automatically expire. If one set is compromised, blast radius is limited to one instance for one hour.
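Because dynamic credentials expire, clients should track the lease and refetch before it runs out rather than waiting for an authentication failure. A sketch of a lease-aware cache (the injected reader callable and the 80% refresh threshold are assumptions, not a specific library API):

```python
import time

class DynamicCreds:
    """Caches dynamic credentials and refetches before the lease expires."""

    REFRESH_FRACTION = 0.8  # refetch once 80% of the lease has elapsed

    def __init__(self, read_creds, clock=time.monotonic):
        self._read_creds = read_creds  # callable returning a Vault-style response
        self._clock = clock
        self._creds = None
        self._expires_at = 0.0

    def get(self) -> dict:
        # Refetch on first use or once the refresh threshold has passed.
        if self._creds is None or self._clock() >= self._expires_at:
            resp = self._read_creds()  # e.g. a read of database/creds/api-app
            self._creds = resp["data"]
            self._expires_at = self._clock() + resp["lease_duration"] * self.REFRESH_FRACTION
        return self._creds
```

Refreshing before expiry means the application never presents a dead credential, which keeps the connection-pool failure path below as a backstop rather than the normal case.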
External Secrets Operator
The External Secrets Operator (ESO) synchronizes secrets from external providers into Kubernetes Secrets:
```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: api-database-creds
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: api-database-creds
  data:
    - secretKey: username
      remoteRef:
        key: secret/data/api-app/database
        property: username
    - secretKey: password
      remoteRef:
        key: secret/data/api-app/database
        property: password
```
ESO polls Vault and updates the Kubernetes Secret automatically. Combined with pod restart policies, applications pick up new credentials without manual intervention.
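One common restart trigger is a pod-template annotation containing a hash of the secret data: when the Secret changes, a deploy step recomputes the hash, and the changed pod template forces a rolling restart. A sketch of the hash computation (the annotation key is illustrative):

```python
import hashlib
import json

def secret_checksum(secret_data: dict) -> str:
    """Deterministic hash of a Secret's data, suitable for a pod annotation."""
    # Sort keys so the hash depends on content, not dict ordering.
    canonical = json.dumps(secret_data, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()

def rollout_annotations(secret_data: dict) -> dict:
    # Changing this annotation changes the pod template, triggering a rollout.
    return {"checksum/api-database-creds": secret_checksum(secret_data)}
```

Tools like Reloader automate this pattern by watching Secrets and patching the referencing workloads for you.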
ClusterSecretStore
```yaml
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: vault-backend
spec:
  provider:
    vault:
      server: "https://vault.vault.svc:8200"
      path: "secret"
      version: "v2"
      auth:
        kubernetes:
          mountPath: "kubernetes"
          role: "external-secrets"
          # Service account ESO presents to Vault when authenticating
          serviceAccountRef:
            name: external-secrets
            namespace: external-secrets
```
AWS Secrets Manager
For AWS-native workloads, Secrets Manager integrates with IAM for access control:
```python
import boto3
import json

client = boto3.client("secretsmanager")

def get_secret(secret_name):
    response = client.get_secret_value(SecretId=secret_name)
    return json.loads(response["SecretString"])

# Usage
db_creds = get_secret("production/api/database")
```
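Fetching on every request adds latency, and Secrets Manager API calls are billed and rate-limited, so it pays to cache reads for a short TTL. A sketch of a per-secret cache wrapping the client (the 5-minute TTL is an assumption; AWS also publishes caching client libraries that handle this):

```python
import json
import time

class CachedSecrets:
    """Wraps a Secrets Manager client with a simple per-secret TTL cache."""

    def __init__(self, client, ttl_seconds=300, clock=time.monotonic):
        self._client = client
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # secret name -> (expires_at, decoded value)

    def get_secret(self, name: str) -> dict:
        entry = self._cache.get(name)
        if entry and self._clock() < entry[0]:
            return entry[1]  # cache hit, still fresh
        resp = self._client.get_secret_value(SecretId=name)
        value = json.loads(resp["SecretString"])
        self._cache[name] = (self._clock() + self._ttl, value)
        return value
```

A short TTL also bounds how stale an application can be after a rotation: with a 5-minute TTL, every instance is on the new credential within 5 minutes of the store updating.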
Enable automatic rotation with a Lambda function:
```bash
aws secretsmanager rotate-secret \
  --secret-id production/api/database \
  --rotation-lambda-arn arn:aws:lambda:us-east-1:123:function:rotate-db \
  --rotation-rules AutomaticallyAfterDays=30
```
Secret rotation without downtime
Rotation fails when applications hold stale credentials. Two patterns prevent this:
Dual-credential rotation. Maintain two active credentials simultaneously. The rotation process creates a new credential, waits for propagation, then deactivates the old one:
- Create new credential (version N+1)
- Update secret store with new credential
- Wait for all consumers to pick up version N+1
- Verify old credential is no longer in use
- Revoke old credential (version N)
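The steps above can be sketched as an orchestration function. Each step is an injected callable, so the sequence is testable and the safety check between propagation and revocation is explicit (the callables are placeholders for your own tooling, not a specific library API):

```python
def rotate_dual_credential(create, publish, consumers_updated, old_in_use, revoke,
                           wait=lambda: None):
    """Dual-credential rotation: the new credential goes live before the old one dies."""
    new_cred = create()             # 1. create version N+1
    publish(new_cred)               # 2. update the secret store
    while not consumers_updated():  # 3. wait for consumers to pick up N+1
        wait()
    if old_in_use():                # 4. verify version N is idle before revoking
        raise RuntimeError("old credential still in use; aborting revocation")
    revoke()                        # 5. revoke version N
    return new_cred
```

The in-use check in step 4 is the part most rotations skip, and it is what turns a botched propagation into a visible abort instead of a production outage.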
Connection pool refresh. Applications check credential validity before using connections from the pool. On authentication failure, refresh credentials from the secret store and rebuild the connection:
```python
class ManagedPool:
    """Connection pool that refreshes credentials when authentication fails."""

    def get_connection(self):
        try:
            return self.pool.getconn()
        except AuthenticationError:
            # Credentials were likely rotated out from under us:
            # fetch fresh ones and rebuild the pool before retrying.
            self.refresh_credentials()
            self.rebuild_pool()
            return self.pool.getconn()

    def refresh_credentials(self):
        # Request new short-lived credentials from Vault's database engine.
        self.creds = vault_client.read("database/creds/api-app")
```
Detecting leaked secrets
Despite best practices, secrets leak. Monitor for them:
GitHub secret scanning detects over 200 token patterns in public and private repositories. It alerts immediately and can auto-revoke tokens from participating providers.
TruffleHog continuous monitoring:
```bash
# Scan for verified secrets across all branches
trufflehog git file://. --only-verified --since-commit HEAD~100
```
Log monitoring. Secrets accidentally logged are a common vector. Configure log sanitization:
```python
import logging
import re

class SecretFilter(logging.Filter):
    """Redacts credential-like patterns from log messages before emission."""

    patterns = [
        re.compile(r'password["\s:=]+\S+', re.IGNORECASE),
        re.compile(r'(api[_-]?key|token)["\s:=]+\S+', re.IGNORECASE),
    ]

    def filter(self, record):
        # Rewrite the message in place; returning True keeps the record.
        for pattern in self.patterns:
            record.msg = pattern.sub("[REDACTED]", str(record.msg))
        return True

# Attach to the root logger so every handler sees sanitized records
logging.getLogger().addFilter(SecretFilter())
```
Emergency response
When a secret is confirmed leaked, execute this runbook:
- Rotate immediately. Do not wait for impact assessment. Generate new credentials and deploy them.
- Revoke the leaked credential. Ensure it cannot be used even if an attacker has it.
- Audit access logs. Determine if the credential was used between leak and rotation.
- Identify the leak source. Was it committed to a repo? Logged? Sent in a message?
- Remediate the source. Add scanning rules to prevent recurrence.
- Document the incident. Include timeline, impact, and prevention measures.
The goal is to minimize the window between leak and rotation. Automated detection and rotation can reduce this window from days to minutes.
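Much of the containment half of that runbook can be driven by code. The sketch below executes the first three steps in order and records a timeline for the incident writeup; the step callables are placeholders for your own rotation, revocation, and log-query tooling:

```python
import time

def respond_to_leak(secret_id, rotate, revoke, audit, clock=time.time):
    """Run the containment steps of the leak runbook, recording a timeline."""
    timeline = []

    def step(name, action):
        result = action(secret_id)
        timeline.append({"step": name, "at": clock(), "result": result})
        return result

    step("rotate", rotate)        # 1. rotate immediately, before impact assessment
    step("revoke", revoke)        # 2. revoke the leaked credential
    usage = step("audit", audit)  # 3. check access logs for use of the old credential
    # Source identification, remediation, and documentation (steps 4-6) stay manual.
    return {"timeline": timeline, "suspicious_usage": usage}
```

Wiring this to your scanner's webhook is what shrinks the leak-to-rotation window from days to minutes.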
What comes next
The next article on cloud security posture management covers monitoring your cloud infrastructure for security misconfigurations. You will learn about CSPM tools, detecting common cloud security issues like public storage buckets and over-privileged IAM roles, and continuous compliance monitoring.