Expose pod volumes / volumeClaimTemplates for persistent or SSD-backed storage #43

@jensens

Problem

spec.storage[].type=file emits -s <name>=file,<path>,<size> to varnishd, but there is currently no way to back <path> with anything other than the operator's built-in varnish-workdir EmptyDir at /var/lib/varnish (internal/controller/statefulset.go#L97-L100).

That's fine as a first step (on nodes whose kubelet ephemeral storage sits on NVMe/SSD, the file ends up on SSD anyway), but it has real limits:

  • No persistence across pod restart. For a pure HTTP cache that's usually acceptable; for larger working sets the warmup cost is non-trivial.
  • No emptyDir.sizeLimit, no control over the EmptyDir medium, no way to cap disk use below the node's ephemeral-storage capacity.
  • No choice of StorageClass — on clusters with a dedicated SSD StorageClass (e.g. Hetzner hcloud-volumes, csi-driver-provisioned NVMe) there's no way to say "use that, 100 Gi, for the cache spill file".
  • No separation between OS disk and cache disk — heavy file-backed caches compete with node ephemeral use (logs, image layers).

Proposal

Add pod-volume / volume-claim support on the VinylCache CRD. Two layered options, smallest useful shape first:

Option A (minimum): spec.pod.volumes + spec.pod.volumeMounts

Passthrough of arbitrary []corev1.Volume and []corev1.VolumeMount. Users bring their own PVC, hostPath, or emptyDir with sizeLimit, and point spec.storage[].path at a location under the mount path. Simple to implement, fully generic. Downside: users manage the PVC lifecycle externally.
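
For comparison with the Option B example later in this issue, a rough sketch of what an Option A manifest might look like. The spec.pod.volumes and spec.pod.volumeMounts fields are the ones proposed here and do not exist yet; the volume name cache-scratch and the claim name in the comment are placeholders.

```yaml
apiVersion: vinyl.bluedynamics.eu/v1alpha1
kind: VinylCache
spec:
  replicas: 2
  storage:
    - name: disk
      type: file
      # Must live under the user-supplied mount below.
      path: /var/lib/varnish-cache/disk.bin
      size: 80Gi
  pod:
    # Proposed field: entries are passed through as corev1.Volume.
    volumes:
      - name: cache-scratch          # placeholder name
        emptyDir:
          sizeLimit: 100Gi           # kubelet evicts the pod if usage exceeds this
        # Alternative to emptyDir: a user-managed claim, e.g.
        # persistentVolumeClaim:
        #   claimName: my-cache-pvc  # placeholder; one claim is shared by all
        #                            # replicas, so this mainly suits replicas: 1
    # Proposed field: entries are passed through as corev1.VolumeMount.
    volumeMounts:
      - name: cache-scratch
        mountPath: /var/lib/varnish-cache
```

The emptyDir variant works with any replica count because each pod gets its own scratch volume. A single pre-created PVC does not give per-replica storage, which is the gap Option B closes.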

Option B (ideal): spec.volumeClaimTemplates

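StatefulSet-native volumeClaimTemplates[] passthrough. The operator passes these to the StatefulSet, and the StatefulSet controller creates one PVC per pod (<claim>-<vc>-<ord>) that is reattached to the same ordinal across restarts. By default Kubernetes retains these PVCs on scale-down or StatefulSet deletion unless persistentVolumeClaimRetentionPolicy says otherwise. Users additionally declare volumeMounts referencing the claim name. This is the right model for cache-per-replica on persistent SSD.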

Option B subsumes A for the SSD-persistent case but is more work — Option A alone unlocks SSD-backed file storage via user-provisioned PVCs.

Example end-state (Option B)

apiVersion: vinyl.bluedynamics.eu/v1alpha1
kind: VinylCache
spec:
  replicas: 2
  storage:
    - name: mem
      type: malloc
      size: 1500M
    - name: disk
      type: file
      path: /var/lib/varnish-cache/disk.bin
      size: 80Gi
  volumeClaimTemplates:
    - metadata:
        name: cache-ssd
      spec:
        accessModes: [ReadWriteOnce]
        storageClassName: hcloud-volumes
        resources:
          requests:
            storage: 100Gi
  pod:
    volumeMounts:
      - name: cache-ssd
        mountPath: /var/lib/varnish-cache

Validation / webhook considerations

  • Reject spec.storage[].path values that resolve into the reserved mounts (/var/lib/varnish, /tmp, /run/vinyl, /etc/varnish/*) — those are operator-owned.
  • Reject the resource if volumeClaimTemplates and pod.volumes declare a volume with the same name.
  • Warn if storage[].type=file is declared without any user volume covering its path (i.e. file will land in EmptyDir) — fine, but worth surfacing.
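
To make the two rejection rules concrete, here is a sketch of a manifest that the proposed webhook would be expected to refuse. None of this validation exists yet, and the field names follow the proposal above rather than a released CRD.

```yaml
apiVersion: vinyl.bluedynamics.eu/v1alpha1
kind: VinylCache
spec:
  storage:
    - name: disk
      type: file
      # Rule 1: path sits inside the operator-owned /var/lib/varnish mount.
      path: /var/lib/varnish/disk.bin
      size: 80Gi
  volumeClaimTemplates:
    - metadata:
        name: cache-ssd
      spec:
        accessModes: [ReadWriteOnce]
        resources:
          requests:
            storage: 100Gi
  pod:
    volumes:
      # Rule 2: same volume name as the claim template above.
      - name: cache-ssd
        emptyDir: {}
```

Moving the path under a user-mounted directory and renaming or dropping the duplicate volume would clear both errors.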

Out of scope for this issue

  • Multi-tier eviction policies, cache warming, snapshotting. Just plumbing the volume surface through.

Context

Came up while extending @bluedynamics/cdk8s-plone's PloneVinylCache construct to expose spec.storage (bluedynamics/cdk8s-plone#148). For a stage environment on Hetzner kup6s we'd like to trial an SSD-backed spill file on a dedicated hcloud-volumes PVC rather than sharing node ephemeral storage.
