Storage in Kubernetes

In this series (14 parts)
  1. Why Kubernetes exists
  2. Kubernetes architecture
  3. Core Kubernetes objects
  4. Kubernetes networking
  5. Storage in Kubernetes
  6. Kubernetes configuration and secrets
  7. Resource management and autoscaling
  8. Kubernetes workload types
  9. Kubernetes observability
  10. Kubernetes security
  11. Helm and package management
  12. GitOps with ArgoCD
  13. Kubernetes cluster operations
  14. Service mesh concepts

When a container restarts, its filesystem resets. Any data written inside the container is gone. That is fine for stateless web servers. It is a problem for databases, message queues, and anything that needs to survive a pod restart.

The storage model

Kubernetes decouples storage requests from storage provisioning using three objects.

graph LR
  Pod["Pod"] --> PVC["PersistentVolumeClaim"]
  PVC --> |"Bound"| PV["PersistentVolume"]
  SC["StorageClass"] --> |"Dynamic provisioning"| PV
  PV --> Disk["Underlying Storage (EBS, GCE PD, NFS)"]

A pod requests storage through a PVC. The PVC binds to a PV, which maps to actual disk. StorageClass automates PV creation.

PersistentVolume (PV) represents a piece of storage in the cluster. It could be an AWS EBS volume, a GCE persistent disk, or an NFS share. PVs exist independently of pods.

PersistentVolumeClaim (PVC) is a request for storage. A pod references a PVC, and Kubernetes binds it to a matching PV.

StorageClass defines how storage is provisioned. Instead of pre-creating PVs, you define a class and Kubernetes creates volumes on demand.

PersistentVolume

A manually provisioned PV:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-pv
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  # hostPath stores data on one node's local disk; suitable for single-node testing only.
  hostPath:
    path: /data/postgres

The persistentVolumeReclaimPolicy controls what happens when the PVC is deleted:

Policy   Behavior
Retain   PV is kept. Data is preserved. Manual cleanup required.
Delete   PV and underlying storage are deleted.
Recycle  Deprecated. Data is scrubbed and PV is made available again.

PersistentVolumeClaim

A PVC requests storage by size and access mode:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  storageClassName: manual

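Kubernetes looks for a PV whose capacity is at least the requested size and whose access modes and storage class match the claim. Once bound, the relationship is exclusive: no other PVC can use that PV, even if the claim asked for less than the PV's full capacity.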
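To pin a claim to one particular pre-created volume instead of letting Kubernetes choose, a PVC can name the PV directly through spec.volumeName. The following is a minimal sketch that reuses the postgres-pv volume defined earlier and would be used in place of postgres-pvc. The claim name postgres-pvc-pinned is hypothetical, chosen only for illustration.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc-pinned   # hypothetical name, not from the article
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  storageClassName: manual    # must match the PV's storageClassName
  volumeName: postgres-pv     # bind to this PV only
```

The size, access mode, and storage class still have to be compatible with the named PV. If they are not, or if the PV is already bound to another claim, the PVC stays Pending.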

Access modes

Mode              Abbreviation  Description
ReadWriteOnce     RWO           Mounted read-write by a single node
ReadOnlyMany      ROX           Mounted read-only by many nodes
ReadWriteMany     RWX           Mounted read-write by many nodes
ReadWriteOncePod  RWOP          Mounted read-write by a single pod (K8s 1.22+)

Most cloud block storage only supports RWO. For RWX you need a network filesystem like NFS, EFS, or CephFS.
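As an illustration, a claim for shared storage differs from the earlier examples only in its access mode and class. This sketch assumes that a StorageClass named shared-nfs, backed by an RWX-capable driver, already exists in the cluster. Both that class name and the claim name are hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-assets            # hypothetical name
  namespace: production
spec:
  accessModes:
    - ReadWriteMany              # pods on different nodes can mount it read-write
  storageClassName: shared-nfs   # assumed to exist and to map to an RWX-capable backend
  resources:
    requests:
      storage: 100Gi
```

If the class points at a block-storage driver that only supports RWO, the claim will typically remain Pending rather than bind.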

StorageClass and dynamic provisioning

Manually creating PVs does not scale. StorageClasses let Kubernetes provision volumes automatically when a PVC is created.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
  encrypted: "true"
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer

Key fields:

  • provisioner: The CSI driver that creates the actual volume.
  • parameters: Provider-specific settings (disk type, IOPS, encryption).
  • volumeBindingMode: WaitForFirstConsumer delays provisioning until a pod is scheduled, ensuring the volume is created in the correct availability zone.
  • allowVolumeExpansion: Permits resizing PVCs after creation.

A PVC using this class:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 20Gi

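Because the class uses WaitForFirstConsumer, this PVC stays Pending until a pod that uses it is scheduled. At that point Kubernetes asks the ebs.csi.aws.com driver to provision a 20Gi gp3 volume in that node's availability zone. No manual PV creation is needed.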
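Two variations on the fast-ssd class are worth sketching: keeping the underlying disk when a claim is deleted, and making a class the cluster default so that PVCs without a storageClassName use it. The name fast-ssd-retain is hypothetical. The annotation is the standard Kubernetes marker for a default class.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd-retain   # hypothetical name
  annotations:
    # Marks this class as the default for PVCs that omit storageClassName.
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"
reclaimPolicy: Retain     # dynamically provisioned PVs inherit this policy
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
```

With Retain, deleting a PVC leaves both the PV and the EBS volume behind, so cleanup becomes a manual step. Only one class in a cluster should carry the default annotation.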

Using PVCs in pods

Mount a PVC into a container:

apiVersion: v1
kind: Pod
metadata:
  name: app-with-storage
  namespace: production   # a pod can only use PVCs in its own namespace
spec:
  containers:
    - name: app
      image: myregistry/app:1.0.0
      volumeMounts:
        - name: data
          mountPath: /var/lib/app/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data

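The volume persists across pod restarts. If the pod is rescheduled to a different node, Kubernetes detaches the volume from the old node and attaches it to the new one, provided the storage backend allows it. A zonal block volume such as EBS, for example, can only follow the pod to nodes in the same availability zone.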
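When the same claim is used from a Deployment rather than a bare pod, a rolling update briefly runs the old and new pods side by side. With a ReadWriteOnce volume this can stall if the two pods land on different nodes. One common way around it, sketched here under the assumption of a single-replica workload, is the Recreate strategy. The Deployment name and label are hypothetical. The image and claim are the ones used above.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-with-storage     # hypothetical name
  namespace: production      # same namespace as the app-data PVC
spec:
  replicas: 1                # an RWO volume attaches to one node at a time
  strategy:
    type: Recreate           # stop the old pod before starting the new one
  selector:
    matchLabels:
      app: app-with-storage
  template:
    metadata:
      labels:
        app: app-with-storage
    spec:
      containers:
        - name: app
          image: myregistry/app:1.0.0
          volumeMounts:
            - name: data
              mountPath: /var/lib/app/data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: app-data
```

The trade-off is a short window of downtime on every rollout, since the old pod is terminated before its replacement starts.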

StatefulSet and stable storage

Deployments treat pods as interchangeable. StatefulSets give each pod a stable identity and dedicated storage. This matters for databases, distributed caches, and any workload where each replica needs its own data.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: production
spec:
  serviceName: postgres-headless
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16
          ports:
            - containerPort: 5432
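          env:
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
            # The postgres image will not initialize a database without a password.
            # postgres-credentials is a hypothetical Secret; create it beforehand.
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: password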
          volumeMounts:
            - name: pgdata
              mountPath: /var/lib/postgresql/data
          resources:
            requests:
              cpu: "500m"
              memory: "1Gi"
            limits:
              cpu: "1"
              memory: "2Gi"
  volumeClaimTemplates:
    - metadata:
        name: pgdata
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 50Gi
---
apiVersion: v1
kind: Service
metadata:
  name: postgres-headless
  namespace: production
spec:
  clusterIP: None
  selector:
    app: postgres
  ports:
    - port: 5432
      targetPort: 5432

StatefulSet guarantees:

  • Pods are named postgres-0, postgres-1, postgres-2. The names are stable across restarts.
  • Each pod gets its own PVC from volumeClaimTemplates. postgres-0 always gets pgdata-postgres-0.
  • With the default podManagementPolicy (OrderedReady), pods are created in order and deleted in reverse order. postgres-1 waits for postgres-0 to be Running and Ready.
  • The headless service creates DNS records like postgres-0.postgres-headless.production.svc.cluster.local.
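One consequence of per-pod claims is that, by default, the PVCs outlive the pods: scaling down or deleting the StatefulSet leaves the pgdata-postgres-N claims in place. On clusters where the StatefulSet PVC retention feature is available (an assumption to check against your Kubernetes version), that behavior can be set explicitly. The fragment below is a sketch meant to be merged into the spec of the postgres StatefulSet, not a complete manifest.

```yaml
# Fragment of the postgres StatefulSet spec; merge it alongside serviceName and replicas.
spec:
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain   # keep the PVCs when the StatefulSet itself is deleted
    whenScaled: Delete    # remove the PVC of a pod dropped by a scale-down
```

Retain for both fields matches the default behavior. Choosing Delete for a database is a deliberate decision, since the data on the removed replica's volume goes with the claim when the StorageClass reclaim policy is Delete.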

CSI (Container Storage Interface)

CSI is the standard interface between Kubernetes and storage providers. Each provider ships a CSI driver that handles volume creation, attachment, and deletion.

Common CSI drivers:

Provider  Driver                  Volume types
AWS       ebs.csi.aws.com         EBS (gp2, gp3, io1, io2)
AWS       efs.csi.aws.com         EFS (NFS-based, RWX)
GCP       pd.csi.storage.gke.io   Persistent Disk
Azure     disk.csi.azure.com      Managed Disk
Ceph      rbd.csi.ceph.com        Ceph RBD

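CSI drivers run as pods in the cluster. A typical driver has a controller component that creates, deletes, and attaches volumes, and a node component on every node that registers with the kubelet and mounts them. All of these storage operations go through a gRPC interface.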

Expanding volumes

If your StorageClass has allowVolumeExpansion: true, you can resize a PVC:

kubectl patch pvc app-data -n production \
  -p '{"spec":{"resources":{"requests":{"storage":"40Gi"}}}}'

Some drivers require a pod restart for the filesystem to expand. Others support online expansion. Check your CSI driver documentation. Volumes can only grow: a request for a smaller size is rejected.
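If manifests are kept in version control, the same resize can be made declaratively by raising the request in the PVC manifest and re-applying it. A sketch using the app-data claim from earlier, with only the storage value changed:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 40Gi   # was 20Gi; the new value must be larger than the current size
```

Keeping the manifest in step with the live object also prevents a later apply from trying to set the request back to the old, smaller value.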

What comes next

Storage keeps your data safe. But applications also need configuration and credentials. The next article covers Kubernetes configuration and secrets: ConfigMaps, Secrets, and external secret management tools.
