From a9d0a6c9bf7118e0670d908d72399e52c1c84377 Mon Sep 17 00:00:00 2001
From: Tamal Saha
Date: Tue, 30 Jun 2026 23:02:33 +0600
Subject: [PATCH] docs: add DocumentDB guides

Add user-facing guides for KubeDB managed DocumentDB (MongoDB wire-protocol
compatible, PostgreSQL-backed), modeled on the Postgres guides with
mongosh/port-10260 verification.

Topics: custom configuration (secret/inline/tuning), restart, horizontal &
vertical scaling, reconfigure, rotate authentication, volume expansion,
storage migration, automatic (raft) failover, and compute & storage
autoscaling.

Each guide embeds live output captured from a running cluster (ops-request
status, pod roles, PVC sizes, secret rotation, mongosh pings).

Example manifests added under docs/examples/documentdb.

Signed-off-by: Tamal Saha
---
 .../compute/autoscaling-compute-object.yaml |  33 ++
 .../compute/autoscaling-compute.yaml |  33 ++
 .../storage/autoscaling-storage-object.yaml |  33 ++
 .../storage/autoscaling-storage.yaml |  33 ++
 .../configuration/cluster-config-secret.yaml |  27 ++
 .../documentdb-custom-config-secret.yaml |   9 +
 .../standalone-config-inline.yaml |  30 ++
 .../standalone-config-secret.yaml |  27 ++
 .../standalone-config-tuning.yaml |  31 ++
 .../cluster-reconfigure-remove.yaml |  11 +
 .../reconfigure/cluster-reconfigure.yaml |  13 +
 .../reconfigure/standalone-reconfigure.yaml |  13 +
 .../documentdb/restart/cluster-longhorn.yaml |  25 ++
 .../documentdb/restart/cluster-restart.yaml |   9 +
 docs/examples/documentdb/restart/cluster.yaml |  25 ++
 .../restart/standalone-restart.yaml |   9 +
 .../documentdb/restart/standalone.yaml |  25 ++
 .../cluster-rotate-auth.yaml |   9 +
 .../standalone-rotate-auth.yaml |   9 +
 .../cluster-hscale-down.yaml |  11 +
 .../horizontal-scaling/cluster-hscale-up.yaml |  11 +
 .../cluster-vertical-scaling.yaml |  23 +
 .../standalone-vertical-scaling.yaml |  23 +
 .../cluster-storage-migration.yaml |  13 +
 .../standalone-storage-migration.yaml |  13 +
 .../cluster-volume-expansion.yaml |  12 +
 .../standalone-volume-expansion.yaml |  12 +
 docs/guides/documentdb/_index.md |  10 +
 docs/guides/documentdb/autoscaler/_index.md |  10 +
 .../documentdb/autoscaler/compute/index.md | 400 ++++++++++++++++++
 .../documentdb/autoscaler/storage/index.md | 350 +++++++++++++++
 .../guides/documentdb/configuration/_index.md |  10 +
 .../configuration/using-config-file.md | 282 ++++++++++++
 .../failure-and-disaster-recovery/_index.md |  10 +
 .../failure-and-disaster-recovery/failover.md | 166 ++++++++
 docs/guides/documentdb/reconfigure/_index.md |  10 +
 .../documentdb/reconfigure/reconfigure.md | 159 +++++++
 docs/guides/documentdb/restart/_index.md |  10 +
 docs/guides/documentdb/restart/restart.md | 179 ++++++++
 .../rotate-authentication/_index.md |  10 +
 .../rotate-authentication.md | 162 +++++++
 docs/guides/documentdb/scaling/_index.md |  10 +
 .../scaling/horizontal-scaling/_index.md |  10 +
 .../horizontal-scaling/horizontal-scaling.md | 178 ++++++++
 .../scaling/vertical-scaling/_index.md |  10 +
 .../vertical-scaling/vertical-scaling.md | 157 +++++++
 .../documentdb/storage-migration/_index.md |  10 +
 .../storage-migration/storage-migration.md | 170 ++++++++
 .../documentdb/volume-expansion/_index.md |  10 +
 .../volume-expansion/volume-expansion.md | 143 +++++++
 50 files changed, 2988 insertions(+)
 create mode 100644 docs/examples/documentdb/autoscaler/compute/autoscaling-compute-object.yaml
 create mode 100644 docs/examples/documentdb/autoscaler/compute/autoscaling-compute.yaml
 create mode 100644 docs/examples/documentdb/autoscaler/storage/autoscaling-storage-object.yaml
 create mode 100644 docs/examples/documentdb/autoscaler/storage/autoscaling-storage.yaml
 create mode 100644 docs/examples/documentdb/configuration/cluster-config-secret.yaml
 create mode 100644 docs/examples/documentdb/configuration/documentdb-custom-config-secret.yaml
 create mode 100644 docs/examples/documentdb/configuration/standalone-config-inline.yaml
 create mode 100644 docs/examples/documentdb/configuration/standalone-config-secret.yaml
 create mode 100644 docs/examples/documentdb/configuration/standalone-config-tuning.yaml
 create mode 100644 docs/examples/documentdb/reconfigure/cluster-reconfigure-remove.yaml
 create mode 100644 docs/examples/documentdb/reconfigure/cluster-reconfigure.yaml
 create mode 100644 docs/examples/documentdb/reconfigure/standalone-reconfigure.yaml
 create mode 100644 docs/examples/documentdb/restart/cluster-longhorn.yaml
 create mode 100644 docs/examples/documentdb/restart/cluster-restart.yaml
 create mode 100644 docs/examples/documentdb/restart/cluster.yaml
 create mode 100644 docs/examples/documentdb/restart/standalone-restart.yaml
 create mode 100644 docs/examples/documentdb/restart/standalone.yaml
 create mode 100644 docs/examples/documentdb/rotate-authentication/cluster-rotate-auth.yaml
 create mode 100644 docs/examples/documentdb/rotate-authentication/standalone-rotate-auth.yaml
 create mode 100644 docs/examples/documentdb/scaling/horizontal-scaling/cluster-hscale-down.yaml
 create mode 100644 docs/examples/documentdb/scaling/horizontal-scaling/cluster-hscale-up.yaml
 create mode 100644 docs/examples/documentdb/scaling/vertical-scaling/cluster-vertical-scaling.yaml
 create mode 100644 docs/examples/documentdb/scaling/vertical-scaling/standalone-vertical-scaling.yaml
 create mode 100644 docs/examples/documentdb/storage-migration/cluster-storage-migration.yaml
 create mode 100644 docs/examples/documentdb/storage-migration/standalone-storage-migration.yaml
 create mode 100644 docs/examples/documentdb/volume-expansion/cluster-volume-expansion.yaml
 create mode 100644 docs/examples/documentdb/volume-expansion/standalone-volume-expansion.yaml
 create mode 100644 docs/guides/documentdb/_index.md
 create mode 100644 docs/guides/documentdb/autoscaler/_index.md
 create mode 100644 docs/guides/documentdb/autoscaler/compute/index.md
 create mode 100644 docs/guides/documentdb/autoscaler/storage/index.md
 create mode 100644 docs/guides/documentdb/configuration/_index.md
 create mode 100644 docs/guides/documentdb/configuration/using-config-file.md
 create mode 100644 docs/guides/documentdb/failure-and-disaster-recovery/_index.md
 create mode 100644 docs/guides/documentdb/failure-and-disaster-recovery/failover.md
 create mode 100644 docs/guides/documentdb/reconfigure/_index.md
 create mode 100644 docs/guides/documentdb/reconfigure/reconfigure.md
 create mode 100644 docs/guides/documentdb/restart/_index.md
 create mode 100644 docs/guides/documentdb/restart/restart.md
 create mode 100644 docs/guides/documentdb/rotate-authentication/_index.md
 create mode 100644 docs/guides/documentdb/rotate-authentication/rotate-authentication.md
 create mode 100644 docs/guides/documentdb/scaling/_index.md
 create mode 100644 docs/guides/documentdb/scaling/horizontal-scaling/_index.md
 create mode 100644 docs/guides/documentdb/scaling/horizontal-scaling/horizontal-scaling.md
 create mode 100644 docs/guides/documentdb/scaling/vertical-scaling/_index.md
 create mode 100644 docs/guides/documentdb/scaling/vertical-scaling/vertical-scaling.md
 create mode 100644 docs/guides/documentdb/storage-migration/_index.md
 create mode 100644 docs/guides/documentdb/storage-migration/storage-migration.md
 create mode 100644 docs/guides/documentdb/volume-expansion/_index.md
 create mode 100644 docs/guides/documentdb/volume-expansion/volume-expansion.md

diff --git a/docs/examples/documentdb/autoscaler/compute/autoscaling-compute-object.yaml b/docs/examples/documentdb/autoscaler/compute/autoscaling-compute-object.yaml
new file mode 100644
index 000000000..7cd9c482f
--- /dev/null
+++ b/docs/examples/documentdb/autoscaler/compute/autoscaling-compute-object.yaml
@@ -0,0 +1,33 @@
+# Base DocumentDB for COMPUTE autoscaling tests.
+# Resources are deliberately LOW (cpu 500m / memory 1Gi) so a compute autoscaler
+# with minAllowed above these values deterministically recommends a scale-UP to
+# minAllowed (the recommendation floor), which creates a VerticalScaling ops request.
+# local-path is fine for compute tests (no volume expansion involved).
+apiVersion: kubedb.com/v1alpha2
+kind: DocumentDB
+metadata:
+  name: dcdb
+  namespace: demo
+spec:
+  version: 'pg17-0.109.0'
+  storageType: Durable
+  deletionPolicy: Delete
+  replicas: 3
+  podTemplate:
+    spec:
+      containers:
+      - name: documentdb
+        resources:
+          requests:
+            cpu: 500m
+            memory: 1Gi
+          limits:
+            cpu: 500m
+            memory: 1Gi
+  storage:
+    storageClassName: "local-path"
+    accessModes:
+    - ReadWriteOnce
+    resources:
+      requests:
+        storage: 5Gi
diff --git a/docs/examples/documentdb/autoscaler/compute/autoscaling-compute.yaml b/docs/examples/documentdb/autoscaler/compute/autoscaling-compute.yaml
new file mode 100644
index 000000000..a2251f4bc
--- /dev/null
+++ b/docs/examples/documentdb/autoscaler/compute/autoscaling-compute.yaml
@@ -0,0 +1,33 @@
+# COMPUTE (vertical) autoscaler for DocumentDB.
+# The compute loop creates a VPA named after the DB's petset (= db name "dcdb"),
+# polls VPA recommendations, and when the recommendation differs from the current
+# request by more than resourceDiffPercentage (and the pod is older than
+# podLifeTimeThreshold, OR the current request is outside the min/max band),
+# it creates a VerticalScaling DocumentDBOpsRequest named dcops-dcdb-.
+#
+# Deterministic scale-UP: the base DB requests 500m/1Gi which is BELOW minAllowed
+# (600m/1.5Gi) here, so the recommendation floor pushes it up to minAllowed.
+apiVersion: autoscaling.kubedb.com/v1alpha1
+kind: DocumentDBAutoscaler
+metadata:
+  name: dcdb-compute-autoscaler
+  namespace: demo
+spec:
+  databaseRef:
+    name: dcdb
+  opsRequestOptions:
+    timeout: 5m
+    apply: IfReady # IfReady | Always
+  compute:
+    documentdb:
+      trigger: "On" # "On" | "Off"
+      podLifeTimeThreshold: 1m
+      resourceDiffPercentage: 5
+      minAllowed:
+        cpu: 600m
+        memory: 1.5Gi
+      maxAllowed:
+        cpu: "2"
+        memory: 3Gi
+      controlledResources: ["cpu", "memory"]
+      containerControlledValues: "RequestsAndLimits" # RequestsAndLimits | RequestsOnly
diff --git a/docs/examples/documentdb/autoscaler/storage/autoscaling-storage-object.yaml b/docs/examples/documentdb/autoscaler/storage/autoscaling-storage-object.yaml
new file mode 100644
index 000000000..74938b09e
--- /dev/null
+++ b/docs/examples/documentdb/autoscaler/storage/autoscaling-storage-object.yaml
@@ -0,0 +1,33 @@
+# Base DocumentDB for STORAGE autoscaling tests.
+# Storage autoscaling issues a VolumeExpansion ops request, which REQUIRES an
+# expandable StorageClass (allowVolumeExpansion: true). local-path is NOT
+# expandable, so use longhorn (installed in this cluster). Start small (2Gi) so
+# the volume fills past usageThreshold quickly when you write data.
+apiVersion: kubedb.com/v1alpha2
+kind: DocumentDB
+metadata:
+  name: dcdb
+  namespace: demo
+spec:
+  version: 'pg17-0.109.0'
+  storageType: Durable
+  deletionPolicy: Delete
+  replicas: 3
+  podTemplate:
+    spec:
+      containers:
+      - name: documentdb
+        resources:
+          requests:
+            cpu: 500m
+            memory: 1Gi
+          limits:
+            cpu: 500m
+            memory: 1Gi
+  storage:
+    storageClassName: "longhorn"
+    accessModes:
+    - ReadWriteOnce
+    resources:
+      requests:
+        storage: 2Gi
diff --git a/docs/examples/documentdb/autoscaler/storage/autoscaling-storage.yaml b/docs/examples/documentdb/autoscaler/storage/autoscaling-storage.yaml
new file mode 100644
index 000000000..80939f851
--- /dev/null
+++ b/docs/examples/documentdb/autoscaler/storage/autoscaling-storage.yaml
@@ -0,0 +1,33 @@
+# STORAGE autoscaler for DocumentDB.
+# The storage loop reads PVC usage from the custom-metrics API (volume_used_percentage)
+# for the DB's pods. When usage% > usageThreshold, it computes a new size from
+# scalingRules and creates a VolumeExpansion DocumentDBOpsRequest (capped at upperBound)
+# using expansionMode.
+#
+# IMPORTANT: the DocumentDB storage autoscaler computes the scaled size ONLY from
+# `scalingRules[].threshold` (see GetVolumeForOpsReq) -- the simpler top-level
+# `scalingThreshold` field is NOT read by this controller path. You MUST provide
+# scalingRules or NO ops request is ever created. A single rule with empty
+# appliesUpto applies to all sizes; threshold "50%" grows capacity by 50%.
+#
+# REQUIRES: an expandable StorageClass (provision the DB with autoscaling-storage-object.yaml)
+# AND the custom.metrics.k8s.io API (storage-metrics-apiserver) -- metrics-server
+# alone is NOT enough. The autoscaler ServiceAccount also needs get/list on
+# custom.metrics.k8s.io (added to ClusterRole kubedb-kubedb-autoscaler).
+apiVersion: autoscaling.kubedb.com/v1alpha1
+kind: DocumentDBAutoscaler
+metadata:
+  name: dcdb-storage-autoscaler
+  namespace: demo
+spec:
+  databaseRef:
+    name: dcdb
+  storage:
+    documentdb:
+      trigger: "On" # "On" | "Off"
+      usageThreshold: 60 # scale when PVC usage% is > 60
+      scalingRules: # REQUIRED: drives the new size (scalingThreshold is ignored here)
+      - appliesUpto: "" # empty = applies to all current sizes
+        threshold: 50% # grow capacity by 50% (e.g. 2Gi -> 3Gi)
+      expansionMode: "Online" # Online | Offline (Online needs online-resize-capable CSI)
+      upperBound: 10Gi # never grow past this
diff --git a/docs/examples/documentdb/configuration/cluster-config-secret.yaml b/docs/examples/documentdb/configuration/cluster-config-secret.yaml
new file mode 100644
index 000000000..fcf94369c
--- /dev/null
+++ b/docs/examples/documentdb/configuration/cluster-config-secret.yaml
@@ -0,0 +1,27 @@
+apiVersion: kubedb.com/v1alpha2
+kind: DocumentDB
+metadata:
+  name: documentdb-cls-sample
+  namespace: demo
+spec:
+  version: 'pg17-0.109.0'
+  storageType: Durable
+  deletionPolicy: Delete
+  replicas: 3
+  configuration:
+    secretName: documentdb-custom-config
+  podTemplate:
+    spec:
+      containers:
+      - name: documentdb
+        resources:
+          requests:
+            cpu: 500m
+            memory: 2Gi
+  storage:
+    storageClassName: "local-path"
+    accessModes:
+    - ReadWriteOnce
+    resources:
+      requests:
+        storage: 5Gi
diff --git a/docs/examples/documentdb/configuration/documentdb-custom-config-secret.yaml b/docs/examples/documentdb/configuration/documentdb-custom-config-secret.yaml
new file mode 100644
index 000000000..15ae1644f
--- /dev/null
+++ b/docs/examples/documentdb/configuration/documentdb-custom-config-secret.yaml
@@ -0,0 +1,9 @@
+apiVersion: v1
+kind: Secret
+metadata:
+  name: documentdb-custom-config
+  namespace: demo
+stringData:
+  user.conf: |
+    max_connections=250
+    work_mem=8MB
diff --git a/docs/examples/documentdb/configuration/standalone-config-inline.yaml b/docs/examples/documentdb/configuration/standalone-config-inline.yaml
new file mode 100644
index 000000000..2c8448a6d
--- /dev/null
+++ b/docs/examples/documentdb/configuration/standalone-config-inline.yaml
@@ -0,0 +1,30 @@
+apiVersion: kubedb.com/v1alpha2
+kind: DocumentDB
+metadata:
+  name: documentdb-sa-sample
+  namespace: demo
+spec:
+  version: 'pg17-0.109.0'
+  storageType: Durable
+  deletionPolicy: Delete
+  replicas: 1
+  configuration:
+    inline:
+      user.conf: |
+        max_connections=300
+        work_mem=16MB
+  podTemplate:
+    spec:
+      containers:
+      - name: documentdb
+        resources:
+          requests:
+            cpu: 500m
+            memory: 2Gi
+  storage:
+    storageClassName: "local-path"
+    accessModes:
+    - ReadWriteOnce
+    resources:
+      requests:
+        storage: 5Gi
diff --git a/docs/examples/documentdb/configuration/standalone-config-secret.yaml b/docs/examples/documentdb/configuration/standalone-config-secret.yaml
new file mode 100644
index 000000000..b3682a2ba
--- /dev/null
+++ b/docs/examples/documentdb/configuration/standalone-config-secret.yaml
@@ -0,0 +1,27 @@
+apiVersion: kubedb.com/v1alpha2
+kind: DocumentDB
+metadata:
+  name: documentdb-sa-sample
+  namespace: demo
+spec:
+  version: 'pg17-0.109.0'
+  storageType: Durable
+  deletionPolicy: Delete
+  replicas: 1
+  configuration:
+    secretName: documentdb-custom-config
+  podTemplate:
+    spec:
+      containers:
+      - name: documentdb
+        resources:
+          requests:
+            cpu: 500m
+            memory: 2Gi
+  storage:
+    storageClassName: "local-path"
+    accessModes:
+    - ReadWriteOnce
+    resources:
+      requests:
+        storage: 5Gi
diff --git a/docs/examples/documentdb/configuration/standalone-config-tuning.yaml b/docs/examples/documentdb/configuration/standalone-config-tuning.yaml
new file mode 100644
index 000000000..d519f1ccc
--- /dev/null
+++ b/docs/examples/documentdb/configuration/standalone-config-tuning.yaml
@@ -0,0 +1,31 @@
+apiVersion: kubedb.com/v1alpha2
+kind: DocumentDB
+metadata:
+  name: documentdb-sa-sample
+  namespace: demo
+spec:
+  version: 'pg17-0.109.0'
+  storageType: Durable
+  deletionPolicy: Delete
+  replicas: 1
+  configuration:
+    tuning:
+      profile: oltp # web | oltp | dw | mixed | desktop
+      storageType: ssd # ssd | hdd | san
+      maxConnections: 200
+      disableAutoTune: false
+  podTemplate:
+    spec:
+      containers:
+      - name: documentdb
+        resources:
+          requests:
+            cpu: 500m
+            memory: 2Gi
+  storage:
+    storageClassName: "local-path"
+    accessModes:
+    - ReadWriteOnce
+    resources:
+      requests:
+        storage: 5Gi
diff --git a/docs/examples/documentdb/reconfigure/cluster-reconfigure-remove.yaml b/docs/examples/documentdb/reconfigure/cluster-reconfigure-remove.yaml
new file mode 100644
index 000000000..a0b366142
--- /dev/null
+++ b/docs/examples/documentdb/reconfigure/cluster-reconfigure-remove.yaml
@@ -0,0 +1,11 @@
+apiVersion: ops.kubedb.com/v1alpha1
+kind: DocumentDBOpsRequest
+metadata:
+  name: documentdb-cls-reconfigure-remove
+  namespace: demo
+spec:
+  type: Reconfigure
+  databaseRef:
+    name: documentdb-cls-sample
+  configuration:
+    removeCustomConfig: true
diff --git a/docs/examples/documentdb/reconfigure/cluster-reconfigure.yaml b/docs/examples/documentdb/reconfigure/cluster-reconfigure.yaml
new file mode 100644
index 000000000..5a4c3e051
--- /dev/null
+++ b/docs/examples/documentdb/reconfigure/cluster-reconfigure.yaml
@@ -0,0 +1,13 @@
+apiVersion: ops.kubedb.com/v1alpha1
+kind: DocumentDBOpsRequest
+metadata:
+  name: documentdb-cls-reconfigure
+  namespace: demo
+spec:
+  type: Reconfigure
+  databaseRef:
+    name: documentdb-cls-sample
+  configuration:
+    applyConfig:
+      user.conf: |
+        max_connections=250
diff --git a/docs/examples/documentdb/reconfigure/standalone-reconfigure.yaml b/docs/examples/documentdb/reconfigure/standalone-reconfigure.yaml
new file mode 100644
index 000000000..38c8dbc3a
--- /dev/null
+++ b/docs/examples/documentdb/reconfigure/standalone-reconfigure.yaml
@@ -0,0 +1,13 @@
+apiVersion: ops.kubedb.com/v1alpha1
+kind: DocumentDBOpsRequest
+metadata:
+  name: documentdb-sa-reconfigure
+  namespace: demo
+spec:
+  type: Reconfigure
+  databaseRef:
+    name: documentdb-sa-sample
+  configuration:
+    applyConfig:
+      user.conf: |
+        max_connections=250
diff --git a/docs/examples/documentdb/restart/cluster-longhorn.yaml b/docs/examples/documentdb/restart/cluster-longhorn.yaml
new file mode 100644
index 000000000..95ee1a8d1
--- /dev/null
+++ b/docs/examples/documentdb/restart/cluster-longhorn.yaml
@@ -0,0 +1,25 @@
+apiVersion: kubedb.com/v1alpha2
+kind: DocumentDB
+metadata:
+  name: documentdb-cls-sample
+  namespace: demo
+spec:
+  version: 'pg17-0.109.0'
+  storageType: Durable
+  deletionPolicy: Delete
+  replicas: 3
+  podTemplate:
+    spec:
+      containers:
+      - name: documentdb
+        resources:
+          requests:
+            cpu: 500m
+            memory: 2Gi
+  storage:
+    storageClassName: "longhorn"
+    accessModes:
+    - ReadWriteOnce
+    resources:
+      requests:
+        storage: 5Gi
diff --git a/docs/examples/documentdb/restart/cluster-restart.yaml b/docs/examples/documentdb/restart/cluster-restart.yaml
new file mode 100644
index 000000000..0ff4a899c
--- /dev/null
+++ b/docs/examples/documentdb/restart/cluster-restart.yaml
@@ -0,0 +1,9 @@
+apiVersion: ops.kubedb.com/v1alpha1
+kind: DocumentDBOpsRequest
+metadata:
+  name: documentdb-cls-restart
+  namespace: demo
+spec:
+  type: Restart
+  databaseRef:
+    name: documentdb-cls-sample
diff --git a/docs/examples/documentdb/restart/cluster.yaml b/docs/examples/documentdb/restart/cluster.yaml
new file mode 100644
index 000000000..2ec9ae352
--- /dev/null
+++ b/docs/examples/documentdb/restart/cluster.yaml
@@ -0,0 +1,25 @@
+apiVersion: kubedb.com/v1alpha2
+kind: DocumentDB
+metadata:
+  name: documentdb-cls-sample
+  namespace: demo
+spec:
+  version: 'pg17-0.109.0'
+  storageType: Durable
+  deletionPolicy: Delete
+  replicas: 3
+  podTemplate:
+    spec:
+      containers:
+      - name: documentdb
+        resources:
+          requests:
+            cpu: 500m
+            memory: 2Gi
+  storage:
+    storageClassName: "local-path"
+    accessModes:
+    - ReadWriteOnce
+    resources:
+      requests:
+        storage: 5Gi
diff --git a/docs/examples/documentdb/restart/standalone-restart.yaml b/docs/examples/documentdb/restart/standalone-restart.yaml
new file mode 100644
index 000000000..f0b6c5d10
--- /dev/null
+++ b/docs/examples/documentdb/restart/standalone-restart.yaml
@@ -0,0 +1,9 @@
+apiVersion: ops.kubedb.com/v1alpha1
+kind: DocumentDBOpsRequest
+metadata:
+  name: documentdb-sa-restart
+  namespace: demo
+spec:
+  type: Restart
+  databaseRef:
+    name: documentdb-sa-sample
diff --git a/docs/examples/documentdb/restart/standalone.yaml b/docs/examples/documentdb/restart/standalone.yaml
new file mode 100644
index 000000000..1003598d1
--- /dev/null
+++ b/docs/examples/documentdb/restart/standalone.yaml
@@ -0,0 +1,25 @@
+apiVersion: kubedb.com/v1alpha2
+kind: DocumentDB
+metadata:
+  name: documentdb-sa-sample
+  namespace: demo
+spec:
+  version: 'pg17-0.109.0'
+  storageType: Durable
+  deletionPolicy: Delete
+  replicas: 1
+  podTemplate:
+    spec:
+      containers:
+      - name: documentdb
+        resources:
+          requests:
+            cpu: 500m
+            memory: 2Gi
+  storage:
+    storageClassName: "local-path"
+    accessModes:
+    - ReadWriteOnce
+    resources:
+      requests:
+        storage: 5Gi
diff --git a/docs/examples/documentdb/rotate-authentication/cluster-rotate-auth.yaml b/docs/examples/documentdb/rotate-authentication/cluster-rotate-auth.yaml
new file mode 100644
index 000000000..f7c41f02e
--- /dev/null
+++ b/docs/examples/documentdb/rotate-authentication/cluster-rotate-auth.yaml
@@ -0,0 +1,9 @@
+apiVersion: ops.kubedb.com/v1alpha1
+kind: DocumentDBOpsRequest
+metadata:
+  name: documentdb-cls-rotate-auth
+  namespace: demo
+spec:
+  type: RotateAuth
+  databaseRef:
+    name: documentdb-cls-sample
diff --git a/docs/examples/documentdb/rotate-authentication/standalone-rotate-auth.yaml b/docs/examples/documentdb/rotate-authentication/standalone-rotate-auth.yaml
new file mode 100644
index 000000000..38a718e22
--- /dev/null
+++ b/docs/examples/documentdb/rotate-authentication/standalone-rotate-auth.yaml
@@ -0,0 +1,9 @@
+apiVersion: ops.kubedb.com/v1alpha1
+kind: DocumentDBOpsRequest
+metadata:
+  name: documentdb-sa-rotate-auth
+  namespace: demo
+spec:
+  type: RotateAuth
+  databaseRef:
+    name: documentdb-sa-sample
diff --git a/docs/examples/documentdb/scaling/horizontal-scaling/cluster-hscale-down.yaml b/docs/examples/documentdb/scaling/horizontal-scaling/cluster-hscale-down.yaml
new file mode 100644
index 000000000..2b1dfabe9
--- /dev/null
+++ b/docs/examples/documentdb/scaling/horizontal-scaling/cluster-hscale-down.yaml
@@ -0,0 +1,11 @@
+apiVersion: ops.kubedb.com/v1alpha1
+kind: DocumentDBOpsRequest
+metadata:
+  name: documentdb-cls-hscale-down
+  namespace: demo
+spec:
+  type: HorizontalScaling
+  databaseRef:
+    name: documentdb-cls-sample
+  horizontalScaling:
+    replicas: 3
diff --git a/docs/examples/documentdb/scaling/horizontal-scaling/cluster-hscale-up.yaml b/docs/examples/documentdb/scaling/horizontal-scaling/cluster-hscale-up.yaml
new file mode 100644
index 000000000..cc35c4965
--- /dev/null
+++ b/docs/examples/documentdb/scaling/horizontal-scaling/cluster-hscale-up.yaml
@@ -0,0 +1,11 @@
+apiVersion: ops.kubedb.com/v1alpha1
+kind: DocumentDBOpsRequest
+metadata:
+  name: documentdb-cls-hscale-up
+  namespace: demo
+spec:
+  type: HorizontalScaling
+  databaseRef:
+    name: documentdb-cls-sample
+  horizontalScaling:
+    replicas: 5
diff --git a/docs/examples/documentdb/scaling/vertical-scaling/cluster-vertical-scaling.yaml b/docs/examples/documentdb/scaling/vertical-scaling/cluster-vertical-scaling.yaml
new file mode 100644
index 000000000..d3fe6b99d
--- /dev/null
+++ b/docs/examples/documentdb/scaling/vertical-scaling/cluster-vertical-scaling.yaml
@@ -0,0 +1,23 @@
+apiVersion: ops.kubedb.com/v1alpha1
+kind: DocumentDBOpsRequest
+metadata:
+  name: documentdb-cls-vscale
+  namespace: demo
+spec:
+  type: VerticalScaling
+  databaseRef:
+    name: documentdb-cls-sample
+  verticalScaling:
+    documentdb:
+      resources:
+        requests:
+          cpu: 600m
+          memory: 2.5Gi
+        limits:
+          cpu: "1"
+          memory: 2.5Gi
+    coordinator:
+      resources:
+        requests:
+          cpu: 100m
+          memory: 256Mi
diff --git a/docs/examples/documentdb/scaling/vertical-scaling/standalone-vertical-scaling.yaml b/docs/examples/documentdb/scaling/vertical-scaling/standalone-vertical-scaling.yaml
new file mode 100644
index 000000000..dcdebd35e
--- /dev/null
+++ b/docs/examples/documentdb/scaling/vertical-scaling/standalone-vertical-scaling.yaml
@@ -0,0 +1,23 @@
+apiVersion: ops.kubedb.com/v1alpha1
+kind: DocumentDBOpsRequest
+metadata:
+  name: documentdb-sa-vscale
+  namespace: demo
+spec:
+  type: VerticalScaling
+  databaseRef:
+    name: documentdb-sa-sample
+  verticalScaling:
+    documentdb:
+      resources:
+        requests:
+          cpu: 600m
+          memory: 2.5Gi
+        limits:
+          cpu: "1"
+          memory: 2.5Gi
+    coordinator:
+      resources:
+        requests:
+          cpu: 100m
+          memory: 256Mi
diff --git a/docs/examples/documentdb/storage-migration/cluster-storage-migration.yaml b/docs/examples/documentdb/storage-migration/cluster-storage-migration.yaml
new file mode 100644
index 000000000..18d9ad803
--- /dev/null
+++ b/docs/examples/documentdb/storage-migration/cluster-storage-migration.yaml
@@ -0,0 +1,13 @@
+apiVersion: ops.kubedb.com/v1alpha1
+kind: DocumentDBOpsRequest
+metadata:
+  name: documentdb-cls-storage-migration
+  namespace: demo
+spec:
+  type: StorageMigration
+  databaseRef:
+    name: documentdb-cls-sample
+  timeout: 10m
+  migration:
+    storageClassName: standard-custom
+    oldPVReclaimPolicy: Delete
diff --git a/docs/examples/documentdb/storage-migration/standalone-storage-migration.yaml b/docs/examples/documentdb/storage-migration/standalone-storage-migration.yaml
new file mode 100644
index 000000000..ee8e32f38
--- /dev/null
+++ b/docs/examples/documentdb/storage-migration/standalone-storage-migration.yaml
@@ -0,0 +1,13 @@
+apiVersion: ops.kubedb.com/v1alpha1
+kind: DocumentDBOpsRequest
+metadata:
+  name: documentdb-sa-storage-migration
+  namespace: demo
+spec:
+  type: StorageMigration
+  databaseRef:
+    name: documentdb-sa-sample
+  timeout: 10m
+  migration:
+    storageClassName: standard-custom
+    oldPVReclaimPolicy: Delete
diff --git a/docs/examples/documentdb/volume-expansion/cluster-volume-expansion.yaml b/docs/examples/documentdb/volume-expansion/cluster-volume-expansion.yaml
new file mode 100644
index 000000000..9749d431c
--- /dev/null
+++ b/docs/examples/documentdb/volume-expansion/cluster-volume-expansion.yaml
@@ -0,0 +1,12 @@
+apiVersion: ops.kubedb.com/v1alpha1
+kind: DocumentDBOpsRequest
+metadata:
+  name: documentdb-cls-volume-expansion
+  namespace: demo
+spec:
+  type: VolumeExpansion
+  databaseRef:
+    name: documentdb-cls-sample
+  volumeExpansion:
+    mode: Offline
+    documentdb: 10Gi
diff --git a/docs/examples/documentdb/volume-expansion/standalone-volume-expansion.yaml b/docs/examples/documentdb/volume-expansion/standalone-volume-expansion.yaml
new file mode 100644
index 000000000..286398814
--- /dev/null
+++ b/docs/examples/documentdb/volume-expansion/standalone-volume-expansion.yaml
@@ -0,0 +1,12 @@
+apiVersion: ops.kubedb.com/v1alpha1
+kind: DocumentDBOpsRequest
+metadata:
+  name: documentdb-sa-volume-expansion
+  namespace: demo
+spec:
+  type: VolumeExpansion
+  databaseRef:
+    name: documentdb-sa-sample
+  volumeExpansion:
+    mode: Offline
+    documentdb: 10Gi
diff --git a/docs/guides/documentdb/_index.md b/docs/guides/documentdb/_index.md
new file mode 100644
index 000000000..b00e8c749
--- /dev/null
+++ b/docs/guides/documentdb/_index.md
@@ -0,0 +1,10 @@
+---
+title: DocumentDB
+menu:
+  docs_{{ .version }}:
+    identifier: dc-documentdb-guides
+    name: DocumentDB
+    parent: guides
+    weight: 10
+menu_name: docs_{{ .version }}
+---
diff --git a/docs/guides/documentdb/autoscaler/_index.md b/docs/guides/documentdb/autoscaler/_index.md
new file mode 100644
index 000000000..a2978a1aa
--- /dev/null
+++ b/docs/guides/documentdb/autoscaler/_index.md
@@ -0,0 +1,10 @@
+---
+title: Autoscaling DocumentDB
+menu:
+  docs_{{ .version }}:
+    identifier: dc-auto-scaling
+    name: Autoscaling
+    parent: dc-documentdb-guides
+    weight: 240
+menu_name: docs_{{ .version }}
+---
diff --git a/docs/guides/documentdb/autoscaler/compute/index.md b/docs/guides/documentdb/autoscaler/compute/index.md
new file mode 100644
index 000000000..dbff6a5d6
--- /dev/null
+++ b/docs/guides/documentdb/autoscaler/compute/index.md
@@ -0,0 +1,400 @@
+---
+title: DocumentDB Compute Autoscaling
+menu:
+  docs_{{ .version }}:
+    identifier: dc-auto-compute
+    name: Compute Autoscaling
+    parent: dc-auto-scaling
+    weight: 20
+menu_name: docs_{{ .version }}
+section_menu_id: guides
+---
+
+> New to KubeDB? Please start [here](/docs/README.md).
+
+# Autoscaling the Compute Resource of a DocumentDB Cluster
+
+This guide shows you how to use `KubeDB` to autoscale the compute resources (CPU and memory) of a `DocumentDB` cluster.
+
+## Before You Begin
+
+- You need a Kubernetes cluster, and the `kubectl` command-line tool must be configured to communicate with it.
+
+- Install the `KubeDB` Provisioner, Ops-Manager and Autoscaler operators in your cluster following the steps [here](/docs/setup/README.md).
+
+- Install `Metrics Server` from [here](https://github.com/kubernetes-sigs/metrics-server#installation).
+
+- You should be familiar with the following `KubeDB` concepts:
+
+To keep everything isolated, we are going to use a separate namespace called `demo` throughout this tutorial.
+
+```bash
+$ kubectl create ns demo
+namespace/demo created
+```
+
+> A DocumentDB exposes the MongoDB wire protocol (port `10260`, TLS) backed by an internal PostgreSQL engine. Every pod runs two containers: `documentdb` (the data plane that the autoscaler tunes) and `documentdb-coordinator`. The `DocumentDBAutoscaler` `spec.compute.documentdb` block targets the `documentdb` container.
+
+## How Compute Autoscaling Works
+
+The `DocumentDBAutoscaler` compute loop is VPA-driven:
+
+1. The Autoscaler operator runs an in-process VerticalPodAutoscaler recommender for the DB's PetSet (named after the DB, `dcdb`).
The generated recommendation is published in the autoscaler's own `status.vpas` — this cluster has no standalone `VerticalPodAutoscaler` CRD, so you read the recommendation directly from the `DocumentDBAutoscaler` object. +2. When the recommendation differs from the current request by more than `resourceDiffPercentage` (and the pod is older than `podLifeTimeThreshold`, **or** the current request sits outside the `minAllowed`/`maxAllowed` band), the operator creates a `VerticalScaling` `DocumentDBOpsRequest` named `dcops-dcdb-`. +3. The Ops-Manager operator applies the new resources by rolling the PetSet pods one at a time. + +This guide demonstrates a deterministic **scale-up to the recommendation floor**: the base database requests `500m`/`1Gi`, which is *below* the autoscaler's `minAllowed` of `600m`/`1.5Gi`. The recommendation is therefore capped *up* to `minAllowed`, which guarantees an ops request is created regardless of actual load. + +## Deploy DocumentDB Cluster + +Here, we are going to deploy a `DocumentDB` cluster with 3 replicas and deliberately low compute resources (`500m`/`1Gi`). Below is the YAML of the `DocumentDB` CR that we are going to create, + +```yaml +apiVersion: kubedb.com/v1alpha2 +kind: DocumentDB +metadata: + name: dcdb + namespace: demo +spec: + version: 'pg17-0.109.0' + storageType: Durable + deletionPolicy: Delete + replicas: 3 + podTemplate: + spec: + containers: + - name: documentdb + resources: + requests: + cpu: 500m + memory: 1Gi + limits: + cpu: 500m + memory: 1Gi + storage: + storageClassName: "local-path" + accessModes: + - ReadWriteOnce + resources: + requests: + storage: 5Gi +``` + +Let's create the `DocumentDB` CR we have shown above, + +```bash +$ kubectl apply -f https://github.com/kubedb/docs/raw/{{< param "info.version" >}}/docs/examples/documentdb/autoscaler/compute/autoscaling-compute-object.yaml +documentdb.kubedb.com/dcdb created +``` + +Now, wait until `dcdb` has status `Ready`. 
i.e., + +```bash +$ kubectl get docdb -n demo +NAME NAMESPACE VERSION STATUS AGE +dcdb demo pg17-0.109.0 Ready 113s +``` + +Let's check the resources of the `documentdb` container in the pod, + +```bash +$ kubectl get pod -n demo dcdb-0 -o jsonpath='{range .spec.containers[?(@.name=="documentdb")]}{.resources}{"\n"}{end}' +{"limits":{"cpu":"500m","memory":"1Gi"},"requests":{"cpu":"500m","memory":"1Gi"}} +``` + +You can see from the above output that the resources are the same as the ones we assigned while deploying the DocumentDB. + +We are now ready to apply the `DocumentDBAutoscaler` CR to set up compute autoscaling for this database. + +## Compute Resource Autoscaling + +Here, we are going to set up compute resource autoscaling using a `DocumentDBAutoscaler` Object. + +#### Create DocumentDBAutoscaler Object + +In order to set up compute resource autoscaling for this database cluster, we have to create a `DocumentDBAutoscaler` CR with our desired configuration. Below is the YAML of the `DocumentDBAutoscaler` object that we are going to create, + +```yaml +apiVersion: autoscaling.kubedb.com/v1alpha1 +kind: DocumentDBAutoscaler +metadata: + name: dcdb-compute-autoscaler + namespace: demo +spec: + databaseRef: + name: dcdb + opsRequestOptions: + timeout: 5m + apply: IfReady + compute: + documentdb: + trigger: "On" + podLifeTimeThreshold: 1m + resourceDiffPercentage: 5 + minAllowed: + cpu: 600m + memory: 1.5Gi + maxAllowed: + cpu: "2" + memory: 3Gi + controlledResources: ["cpu", "memory"] + containerControlledValues: "RequestsAndLimits" +``` + +Here, + +- `spec.databaseRef.name` specifies that we are performing the compute resource scaling operation on the `dcdb` database. +- `spec.compute.documentdb.trigger` specifies that compute autoscaling is enabled for this database. +- `spec.compute.documentdb.podLifeTimeThreshold` specifies the minimum age that at least one pod must reach before a vertical scaling can be initiated.
+- `spec.compute.documentdb.resourceDiffPercentage` specifies the minimum resource difference (in percentage) between the current and recommended resources required to trigger an update. The default is 10%. +- `spec.compute.documentdb.minAllowed` specifies the minimum allowed resources for the database. Here it is set **above** the deployed resources, so the recommendation floor forces a scale-up. +- `spec.compute.documentdb.maxAllowed` specifies the maximum allowed resources for the database. +- `spec.compute.documentdb.controlledResources` specifies the resources that are controlled by the autoscaler. +- `spec.compute.documentdb.containerControlledValues` specifies which resource values should be controlled. The default is `RequestsAndLimits`. +- `spec.opsRequestOptions.apply` has two supported values: `IfReady` & `Always`. Use `IfReady` to process the opsRequest only when the database is Ready, and `Always` to process it irrespective of the database state. +- `spec.opsRequestOptions.timeout` specifies the maximum time for each step of the opsRequest. 
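
The interaction between `minAllowed`/`maxAllowed` and `resourceDiffPercentage` can be sketched as a small shell check. This is only one reading of the rule described under "How Compute Autoscaling Works", not the operator's code: the `clamp` and `needs_scaling` helpers are hypothetical, values are plain integers in a single unit (millicores here), and the `podLifeTimeThreshold` age check is left out.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; not part of KubeDB. All values are integers in one
# unit (for example millicores for CPU or Mi for memory).

# clamp VALUE MIN MAX -> VALUE limited to the range [MIN, MAX]
clamp() {
  local v=$1 lo=$2 hi=$3
  if (( v < lo )); then echo "$lo"
  elif (( v > hi )); then echo "$hi"
  else echo "$v"
  fi
}

# needs_scaling CURRENT UNCAPPED MIN MAX DIFF_PERCENT
# Succeeds (exit 0) when the current request sits outside the allowed band,
# or when the capped target differs from it by more than DIFF_PERCENT percent.
needs_scaling() {
  local cur=$1 uncapped=$2 lo=$3 hi=$4 pct=$5
  local target diff
  target=$(clamp "$uncapped" "$lo" "$hi")
  if (( cur < lo || cur > hi )); then return 0; fi
  diff=$(( target > cur ? target - cur : cur - target ))
  (( diff * 100 > cur * pct ))
}

# CPU values from this guide: current 500m, uncapped target 182m,
# band 600m..2000m, resourceDiffPercentage 5.
clamp 182 600 2000
if needs_scaling 500 182 600 2000 5; then echo "scale"; else echo "keep"; fi
```

With the guide's numbers the capped target is `600` and the check reports `scale`, because the current `500m` request is already below `minAllowed`.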
+ +Let's create the `DocumentDBAutoscaler` CR we have shown above, + +```bash +$ kubectl apply -f https://github.com/kubedb/docs/raw/{{< param "info.version" >}}/docs/examples/documentdb/autoscaler/compute/autoscaling-compute.yaml +documentdbautoscaler.autoscaling.kubedb.com/dcdb-compute-autoscaler created +``` + +#### Verify Autoscaling is set up successfully + +Let's check that the `documentdbautoscaler` resource is created successfully, + +```bash +$ kubectl get documentdbautoscaler -n demo +NAME AGE +dcdb-compute-autoscaler 11s + +$ kubectl describe documentdbautoscaler dcdb-compute-autoscaler -n demo +Name: dcdb-compute-autoscaler +Namespace: demo +Labels: +Annotations: +API Version: autoscaling.kubedb.com/v1alpha1 +Kind: DocumentDBAutoscaler +Metadata: + Creation Timestamp: 2026-06-30T13:58:55Z + Generation: 1 + Owner References: + API Version: kubedb.com/v1alpha2 + Block Owner Deletion: true + Controller: true + Kind: DocumentDB + Name: dcdb +Spec: + Compute: + Documentdb: + Container Controlled Values: RequestsAndLimits + Controlled Resources: + cpu + memory + Max Allowed: + Cpu: 2 + Memory: 3Gi + Min Allowed: + Cpu: 600m + Memory: 1.5Gi + Pod Life Time Threshold: 1m + Resource Diff Percentage: 5 + Trigger: On + Database Ref: + Name: dcdb + Ops Request Options: + Apply: IfReady + Max Retries: 1 + Timeout: 5m +Status: + Checkpoints: + Cpu Histogram: + Bucket Weights: + Index: 1 + Weight: 5995 + Index: 2 + Weight: 10000 + Index: 3 + Weight: 7164 + Reference Timestamp: 2026-06-30T14:00:00Z + Total Weight: 1.1832553241627992 + First Sample Start: 2026-06-30T13:59:04Z + Last Sample Start: 2026-06-30T14:02:03Z + Last Update Time: 2026-06-30T14:02:23Z + Ref: + Container Name: documentdb-coordinator + Vpa Object Name: dcdb + Total Samples Count: 11 + Version: v3 + Conditions: + Last Transition Time: 2026-06-30T13:59:55Z + Message: Successfully created DocumentDBOpsRequest demo/dcops-dcdb-y87ecq + Observed Generation: 1 + Reason: CreateOpsRequest + Status: True + 
Type: CreateOpsRequest + Vpas: + Conditions: + Last Transition Time: 2026-06-30T13:59:22Z + Status: True + Type: RecommendationProvided + Recommendation: + Container Recommendations: + Container Name: documentdb-coordinator + Lower Bound: + Cpu: 50m + Memory: 131072k + Target: + Cpu: 50m + Memory: 131072k + Uncapped Target: + Cpu: 50m + Memory: 131072k + Upper Bound: + Cpu: 23700m + Memory: 30735427949 + Container Name: documentdb + Lower Bound: + Cpu: 600m + Memory: 1536Mi + Target: + Cpu: 600m + Memory: 1536Mi + Uncapped Target: + Cpu: 182m + Memory: 131072k + Upper Bound: + Cpu: 2 + Memory: 3Gi + Vpa Name: dcdb +Events: +``` + +So, the `documentdbautoscaler` resource is created successfully. + +We can verify from the above output that `status.vpas` contains the `RecommendationProvided` condition set to `True`, and `status.vpas[].recommendation.containerRecommendations` holds the actual recommendation. Notice the `documentdb` container `Target` of `600m`/`1536Mi` — the uncapped target (`182m`/`131072k`) was well below the band, so it was floored *up* to `minAllowed`. The `status.conditions` already reports `Successfully created DocumentDBOpsRequest demo/dcops-dcdb-y87ecq`. + +The Autoscaler operator continuously watches the recommendation and creates a `DocumentDBOpsRequest` based on it whenever the pod resources need to be scaled up or down. + +Let's watch the `documentdbopsrequest` in the demo namespace to see if any `documentdbopsrequest` object is created. + +```bash +$ kubectl get documentdbopsrequest -n demo +NAME TYPE STATUS AGE +dcops-dcdb-y87ecq VerticalScaling Progressing 13s +``` + +Let's wait for the ops request to become successful. + +```bash +$ kubectl get documentdbopsrequest -n demo +NAME TYPE STATUS AGE +dcops-dcdb-y87ecq VerticalScaling Successful 2m55s +``` + +We can see from the above output that the `DocumentDBOpsRequest` has succeeded. 
If we describe the `DocumentDBOpsRequest` (or print its YAML) we get an overview of the steps that were followed to scale the database. + +```bash +$ kubectl get documentdbopsrequest -n demo dcops-dcdb-y87ecq -o yaml +apiVersion: ops.kubedb.com/v1alpha1 +kind: DocumentDBOpsRequest +metadata: + name: dcops-dcdb-y87ecq + namespace: demo + ownerReferences: + - apiVersion: autoscaling.kubedb.com/v1alpha1 + blockOwnerDeletion: true + controller: true + kind: DocumentDBAutoscaler + name: dcdb-compute-autoscaler +spec: + apply: IfReady + databaseRef: + name: dcdb + maxRetries: 1 + timeout: 5m0s + type: VerticalScaling + verticalScaling: + documentdb: + resources: + limits: + cpu: 600m + memory: 1536Mi + requests: + cpu: 600m + memory: 1536Mi +status: + conditions: + - message: Vertical Scaling is in progress + reason: Running + status: "True" + type: Running + - message: Successfully Set Raft Key OpsRequestProgressing + reason: SetRaftKeyOpsRequestProgressing + status: "True" + type: SetRaftKeyOpsRequestProgressing + - message: Successfully updated petsets resources + reason: UpdatePetSets + status: "True" + type: UpdatePetSets + - message: VerticalScaleSucceeded + reason: VerticalScale + status: "True" + type: VerticalScale + - message: Successfully Restarted Read Replicas + reason: RestartReadReplicas + status: "True" + type: RestartReadReplicas + - message: Successfully Vertically Scaled Database + reason: Successful + status: "True" + type: Successful + - message: Successfully Unset Raft Key OpsRequestProgressing + reason: UnsetRaftKeyOpsRequestProgressing + status: "True" + type: UnsetRaftKeyOpsRequestProgressing + observedGeneration: 1 + phase: Successful +``` + +Notice that the ops request body carries exactly the floored target (`600m`/`1536Mi`), and the rollout walks the cluster pod by pod (`SetRaftKeyOpsRequestProgressing` → `UpdatePetSets` → per-pod readiness checks → `RestartReadReplicas`) so the DocumentDB cluster stays available throughout. 
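
Instead of re-running `kubectl get` by hand, the wait for the ops request can be scripted. The sketch below assumes `kubectl` v1.23 or newer (for `--for=jsonpath`); `wait_for_opsrequest` is a hypothetical wrapper name, and the default namespace and timeout are arbitrary choices.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper, not part of KubeDB. Requires kubectl v1.23+ for
# `--for=jsonpath`.
# Usage: wait_for_opsrequest NAME [NAMESPACE] [TIMEOUT]
wait_for_opsrequest() {
  local name=$1 namespace=${2:-demo} timeout=${3:-10m}
  # Blocks until .status.phase equals Successful, or fails after TIMEOUT.
  kubectl wait "documentdbopsrequest/${name}" \
    --namespace "$namespace" \
    --for=jsonpath='{.status.phase}'=Successful \
    --timeout="$timeout"
}

# Example (needs a live cluster):
#   wait_for_opsrequest dcops-dcdb-y87ecq demo 10m
```

Note that an ops request that ends in a failed phase would only surface here as a timeout, so checking `status.conditions` afterwards is still worthwhile.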
+ +Now, let's verify from the Pod and the DocumentDB object that the resources of the cluster database have been updated to the desired state. + +```bash +$ kubectl get pod -n demo dcdb-0 -o jsonpath='{range .spec.containers[?(@.name=="documentdb")]}{.resources}{"\n"}{end}' +{"limits":{"cpu":"600m","memory":"1536Mi"},"requests":{"cpu":"600m","memory":"1536Mi"}} + +$ kubectl get docdb -n demo dcdb -o json | jq -c '.spec.podTemplate.spec.containers[] | {name:.name, resources:.resources}' +{"name":"documentdb","resources":{"limits":{"cpu":"600m","memory":"1536Mi"},"requests":{"cpu":"600m","memory":"1536Mi"}}} +{"name":"documentdb-coordinator","resources":{"limits":{"memory":"256Mi"},"requests":{"cpu":"200m","memory":"256Mi"}}} +``` + +The above output verifies that we have successfully autoscaled the compute resources of the DocumentDB cluster database from `500m`/`1Gi` to `600m`/`1.5Gi`. + +Finally, let's confirm the database is healthy over the MongoDB wire protocol: + +```bash +$ PASS=$(kubectl get secret -n demo dcdb-auth -o jsonpath='{.data.password}' | base64 -d) +$ kubectl exec -n demo dcdb-0 -c documentdb -- mongosh \ + "mongodb://default_user:${PASS}@localhost:10260/?tls=true&tlsAllowInvalidCertificates=true" \ + --quiet --eval 'db.runCommand({ ping: 1 })' +{ ok: 1 } +``` + +## Cleaning Up + +To clean up the Kubernetes resources created by this tutorial, run: + +```bash +kubectl delete documentdbautoscaler -n demo dcdb-compute-autoscaler +kubectl delete documentdb -n demo dcdb +kubectl delete ns demo +``` + +## Next Steps + +- Learn how to autoscale the storage of a DocumentDB cluster in the [Storage Autoscaling](/docs/guides/documentdb/autoscaler/storage/index.md) guide. +- Want to hack on KubeDB? Check our [contribution guidelines](/docs/CONTRIBUTING.md). 
diff --git a/docs/guides/documentdb/autoscaler/storage/index.md b/docs/guides/documentdb/autoscaler/storage/index.md new file mode 100644 index 000000000..37bb2e450 --- /dev/null +++ b/docs/guides/documentdb/autoscaler/storage/index.md @@ -0,0 +1,350 @@ +--- +title: DocumentDB Storage Autoscaling +menu: + docs_{{ .version }}: + identifier: dc-auto-storage + name: Storage Autoscaling + parent: dc-auto-scaling + weight: 30 +menu_name: docs_{{ .version }} +section_menu_id: guides +--- + +> New to KubeDB? Please start [here](/docs/README.md). + +# Storage Autoscaling of a DocumentDB Cluster + +This guide will show you how to use `KubeDB` to autoscale the storage of a `DocumentDB` cluster. + +## Before You Begin + +- At first, you need to have a Kubernetes cluster, and the `kubectl` command-line tool must be configured to communicate with your cluster. + +- Install `KubeDB` Provisioner, Ops-Manager and Autoscaler operator in your cluster following the steps [here](/docs/setup/README.md). + +- Install `Metrics Server` from [here](https://github.com/kubernetes-sigs/metrics-server#installation), and the **custom metrics API** (`custom.metrics.k8s.io`) backed by the KubeDB storage-metrics apiserver. The storage autoscaler reads PVC usage from this API — `metrics-server` alone is **not** enough. + +- You must have a `StorageClass` that supports volume expansion (`allowVolumeExpansion: true`). + +- You should be familiar with the following `KubeDB` concepts: + +To keep everything isolated, we are going to use a separate namespace called `demo` throughout this tutorial. + +```bash +$ kubectl create ns demo +namespace/demo created +``` + +> A DocumentDB exposes the MongoDB wire protocol (port `10260`, TLS) backed by an internal PostgreSQL engine. Each pod runs the `documentdb` and `documentdb-coordinator` containers, and the data directory (`/var/pv`) lives on the per-pod PVC `data-dcdb-`. 
+ +## How Storage Autoscaling Works + +The `DocumentDBAutoscaler` storage loop is **PVC-usage-driven**: + +1. Every reconcile, the Autoscaler operator reads the `volume_used_percentage` metric for each of the DB's PVCs from `custom.metrics.k8s.io`. +2. When a PVC's usage exceeds `usageThreshold`, the operator computes a new size from `scalingRules` and creates a `VolumeExpansion` `DocumentDBOpsRequest` (capped at `upperBound`) using the configured `expansionMode`. +3. The Ops-Manager operator performs the expansion. With `expansionMode: Online` and an online-resize-capable CSI (longhorn here), the PVCs grow without taking the database offline. + +> **IMPORTANT — the new size comes from `scalingRules[].threshold`, not `scalingThreshold`.** The DocumentDB storage autoscaler computes the scaled size only from `scalingRules`. The simpler top-level `scalingThreshold` field is **not** read by this controller path, so you must provide `scalingRules` or no ops request is ever created. A single rule with an empty `appliesUpto` applies to all current sizes; `threshold: 50%` grows capacity by 50%. + +> **IMPORTANT — RBAC for the custom metrics API.** The autoscaler ServiceAccount must be allowed to `get`/`list` on `custom.metrics.k8s.io`. If this permission is missing, the operator logs `custom metrics API returned 403 Forbidden` and silently never creates an ops request. Add the rule to the autoscaler's ClusterRole (e.g. `kubedb-kubedb-autoscaler`): +> +> ```yaml +> - apiGroups: ["custom.metrics.k8s.io"] +> resources: ["*"] +> verbs: ["get", "list"] +> ``` + +## Storage Autoscaling of Cluster Database + +At first, verify that your cluster has a storage class that supports volume expansion. 
+ +```bash +$ kubectl get storageclass +NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE +local-path (default) rancher.io/local-path Delete WaitForFirstConsumer false 22d +longhorn driver.longhorn.io Delete Immediate true 18d +``` + +We can see the `longhorn` storage class has `ALLOWVOLUMEEXPANSION` set to `true`, and it supports online volume expansion, so we will use it. You can install longhorn from [here](https://longhorn.io/docs/). + +#### Deploy DocumentDB Cluster + +In this section, we are going to deploy a `DocumentDB` cluster with 3 replicas and a small `2Gi` volume on `longhorn`. Below is the YAML of the `DocumentDB` CR that we are going to create, + +```yaml +apiVersion: kubedb.com/v1alpha2 +kind: DocumentDB +metadata: + name: dcdb + namespace: demo +spec: + version: 'pg17-0.109.0' + storageType: Durable + deletionPolicy: Delete + replicas: 3 + podTemplate: + spec: + containers: + - name: documentdb + resources: + requests: + cpu: 500m + memory: 1Gi + limits: + cpu: 500m + memory: 1Gi + storage: + storageClassName: "longhorn" + accessModes: + - ReadWriteOnce + resources: + requests: + storage: 2Gi +``` + +Let's create the `DocumentDB` CR we have shown above, + +```bash +$ kubectl apply -f https://github.com/kubedb/docs/raw/{{< param "info.version" >}}/docs/examples/documentdb/autoscaler/storage/autoscaling-storage-object.yaml +documentdb.kubedb.com/dcdb created +``` + +Now, wait until `dcdb` has status `Ready`. i.e., + +```bash +$ kubectl get docdb -n demo +NAME NAMESPACE VERSION STATUS AGE +dcdb demo pg17-0.109.0 Ready 2m56s +``` + +Let's check the PVC sizes of the cluster, + +```bash +$ kubectl get pvc -n demo | grep dcdb +data-dcdb-0 Bound pvc-de4bfaa2-ea8e-4db5-b352-72abe3ab5b67 2Gi RWO longhorn 2m47s +data-dcdb-1 Bound pvc-ad3b996c-3ffe-460c-8da3-ea14d534d217 2Gi RWO longhorn 2m +data-dcdb-2 Bound pvc-e36556ef-80aa-49ff-91ac-ad07f237e203 2Gi RWO longhorn 93s +``` + +You can see all three PVCs have `2Gi` of storage.
We are now ready to apply the `DocumentDBAutoscaler` CR to set up storage autoscaling for this database. + +### Storage Autoscaling + +Here, we are going to set up storage autoscaling using a `DocumentDBAutoscaler` Object. + +#### Create DocumentDBAutoscaler Object + +In order to set up storage autoscaling for this cluster database, we have to create a `DocumentDBAutoscaler` CR with our desired configuration. Below is the YAML of the `DocumentDBAutoscaler` object that we are going to create, + +```yaml +apiVersion: autoscaling.kubedb.com/v1alpha1 +kind: DocumentDBAutoscaler +metadata: + name: dcdb-storage-autoscaler + namespace: demo +spec: + databaseRef: + name: dcdb + storage: + documentdb: + trigger: "On" + usageThreshold: 60 + scalingRules: + - appliesUpto: "" + threshold: 50% + expansionMode: "Online" + upperBound: 10Gi +``` + +Here, + +- `spec.databaseRef.name` specifies that we are performing storage autoscaling on the `dcdb` database. +- `spec.storage.documentdb.trigger` specifies that storage autoscaling is enabled for this database. +- `spec.storage.documentdb.usageThreshold` specifies the storage usage threshold — when a PVC's usage exceeds `60%`, storage autoscaling is triggered. +- `spec.storage.documentdb.scalingRules` drives the **new size**. A rule with an empty `appliesUpto` applies to every current size, and `threshold: 50%` grows the capacity by 50%. +- `spec.storage.documentdb.expansionMode` specifies the expansion mode of the `VolumeExpansion` `DocumentDBOpsRequest`. longhorn supports online volume expansion, so it is set to `Online`. +- `spec.storage.documentdb.upperBound` caps how large the volume may ever grow (`10Gi`). 
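
The sizing described by `scalingRules` and `upperBound` comes down to simple arithmetic, sketched below. It assumes the growth is a plain percentage of the current capacity in bytes, capped at the upper bound; `next_size` is a hypothetical helper, not a KubeDB function.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not KubeDB code: grow a capacity by a percentage and
# cap the result at an upper bound. All sizes are plain byte counts.
next_size() {
  local current=$1 percent=$2 upper=$3
  local grown=$(( current + current * percent / 100 ))
  if (( grown > upper )); then grown=$upper; fi
  echo "$grown"
}

GI=$(( 1024 * 1024 * 1024 ))

# Nominal 2Gi volume, 50% rule, 10Gi cap.
next_size $(( 2 * GI )) 50 $(( 10 * GI ))   # prints 3221225472 (3Gi)

# An 8Gi volume would grow to 12Gi, so the 10Gi upperBound takes over.
next_size $(( 8 * GI )) 50 $(( 10 * GI ))   # prints 10737418240 (10Gi)

# A reported capacity slightly under the nominal 2Gi.
next_size 2040373248 50 $(( 10 * GI ))      # prints 3060559872
```

The last call matters in practice: the size requested later in this guide is `3060559872` bytes, which is exactly 1.5 times `2040373248`, so the base for the calculation may be the capacity the metrics report and not the nominal `2Gi`.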
+ +Let's create the `DocumentDBAutoscaler` CR we have shown above, + +```bash +$ kubectl apply -f https://github.com/kubedb/docs/raw/{{< param "info.version" >}}/docs/examples/documentdb/autoscaler/storage/autoscaling-storage.yaml +documentdbautoscaler.autoscaling.kubedb.com/dcdb-storage-autoscaler created +``` + +#### Storage Autoscaling is set up successfully + +Let's check that the `documentdbautoscaler` resource is created successfully, + +```bash +$ kubectl get documentdbautoscaler -n demo +NAME AGE +dcdb-storage-autoscaler 8s + +$ kubectl describe documentdbautoscaler dcdb-storage-autoscaler -n demo +Name: dcdb-storage-autoscaler +Namespace: demo +API Version: autoscaling.kubedb.com/v1alpha1 +Kind: DocumentDBAutoscaler +Metadata: + Owner References: + API Version: kubedb.com/v1alpha2 + Block Owner Deletion: true + Controller: true + Kind: DocumentDB + Name: dcdb +Spec: + Database Ref: + Name: dcdb + Storage: + Documentdb: + Expansion Mode: Online + Scaling Rules: + Applies Upto: + Threshold: 50% + Trigger: On + Upper Bound: 10Gi + Usage Threshold: 60 +Events: +``` + +So, the `documentdbautoscaler` resource is created successfully. + +Now, for this demo, we are going to manually fill up the persistent volumes to exceed the `usageThreshold` using the `dd` command. The DocumentDB data directory is mounted at `/var/pv` (PVC `data-dcdb-`). The autoscaler evaluates usage per PVC, so we fill all three replicas. + +```bash +$ for p in dcdb-0 dcdb-1 dcdb-2; do + kubectl exec -n demo $p -c documentdb -- sh -c 'dd if=/dev/zero of=/var/pv/_fill bs=1M count=1500; sync; df -h /var/pv' + done +... +/dev/longhorn/pvc-de4bfaa2-ea8e-4db5-b352-72abe3ab5b67 2.0G 1.8G 180M 91% /var/pv +/dev/longhorn/pvc-ad3b996c-3ffe-460c-8da3-ea14d534d217 2.0G 1.7G 212M 90% /var/pv +/dev/longhorn/pvc-e36556ef-80aa-49ff-91ac-ad07f237e203 2.0G 1.8G 180M 90% /var/pv +``` + +So, from the above output the storage usage of each PVC is around `90%`, which exceeds the `usageThreshold` of `60%`. 
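
The same comparison can be made by hand before waiting for the operator. The sketch below is a manual check only: the autoscaler reads `volume_used_percentage` from the custom metrics API and does not use `df`, and the `used_percent` and `over_threshold` helpers are hypothetical names.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for a manual check; the autoscaler itself does not
# use df.

# used_percent PATH -> integer Use% of the filesystem that holds PATH
used_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# over_threshold PERCENT THRESHOLD -> succeeds when PERCENT exceeds THRESHOLD
over_threshold() {
  (( $1 > $2 ))
}

# The df readings captured above, against usageThreshold: 60.
for pct in 91 90 90; do
  if over_threshold "$pct" 60; then echo "${pct}% exceeds 60%"; fi
done
```

Inside a pod, `used_percent /var/pv` would give the figure for the data volume, for example through `kubectl exec -n demo dcdb-0 -c documentdb -- df -P /var/pv`.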
+ +On its next reconcile, the autoscaler reads the per-PVC usage from the custom metrics API (visible in the operator logs) and creates the ops request: + +``` +storage_autoscaler.go:77] Running storage Autoscaler for demo/dcdb-storage-autoscaler, referred database = dcdb +storage_metrics.go:105] LENGTH OF PVCS 3 +storage_metrics.go:119] USED SPACE 89.98 +storage_metrics.go:119] USED SPACE 74.039 +storage_metrics.go:119] USED SPACE 88.326 +client.go:88] Creating ops.kubedb.com/v1alpha1, Kind=DocumentDBOpsRequest demo/dcops-dcdb-w5q6tl. +``` + +Let's watch the `documentdbopsrequest` in the demo namespace to see if any `documentdbopsrequest` object is created. After some time you'll see that a `documentdbopsrequest` of type `VolumeExpansion` is created based on the `scalingRules`. + +```bash +$ kubectl get documentdbopsrequest -n demo +NAME TYPE STATUS AGE +dcops-dcdb-w5q6tl VolumeExpansion Progressing 13s +``` + +Let's wait for the ops request to become successful. + +```bash +$ kubectl get documentdbopsrequest -n demo +NAME TYPE STATUS AGE +dcops-dcdb-w5q6tl VolumeExpansion Successful 3m43s +``` + +We can see from the above output that the `DocumentDBOpsRequest` has succeeded. If we print its YAML we get an overview of the steps that were followed to expand the volume. 
+ +```bash +$ kubectl get documentdbopsrequest -n demo dcops-dcdb-w5q6tl -o yaml +apiVersion: ops.kubedb.com/v1alpha1 +kind: DocumentDBOpsRequest +metadata: + name: dcops-dcdb-w5q6tl + namespace: demo + ownerReferences: + - apiVersion: autoscaling.kubedb.com/v1alpha1 + blockOwnerDeletion: true + controller: true + kind: DocumentDBAutoscaler + name: dcdb-storage-autoscaler +spec: + apply: IfReady + databaseRef: + name: dcdb + maxRetries: 1 + type: VolumeExpansion + volumeExpansion: + documentdb: "3060559872" + mode: Online +status: + conditions: + - message: Volume Expansion is in progress + reason: Running + status: "True" + type: Running + - message: Successfully Set Raft Key OpsRequestProgressing + type: SetRaftKeyOpsRequestProgressing + - message: list pvc; ConditionStatus:True + type: ListPvc + - message: is pvc data-dcdb-0 updated; ConditionStatus:True + type: IsPvcData-dcdb-0Updated + - message: is pvc data-dcdb-1 updated; ConditionStatus:True + type: IsPvcData-dcdb-1Updated + - message: is pvc data-dcdb-2 updated; ConditionStatus:True + type: IsPvcData-dcdb-2Updated + - message: 'Online Volume Expansion performed successfully in DocumentDB pods for + DocumentDBOpsRequest: demo/dcops-dcdb-w5q6tl' + type: VolumeExpansion + - message: is petset ready; ConditionStatus:True + type: IsPetsetReady + - message: PetSet is recreated + type: ReadyPetSets + - message: Successfully Expanded Volume. + reason: Successful + status: "True" + type: Successful + observedGeneration: 1 + phase: Successful +``` + +Notice that the ops request body carries the computed size `3060559872` bytes (≈ `2.85Gi`), the result of growing the `2Gi` volume by the `50%` `scalingRules` threshold. The figure is exactly 1.5 × `2040373248` bytes (≈ `1.90Gi`), which suggests the growth is applied to the capacity reported for the volume and not to the nominal `2Gi` (which would give `3Gi`). The request also sets `mode: Online`, so the expansion happens while the cluster stays available. + +Now, let's verify from the PVCs that the volume of the cluster database has expanded.
+ +```bash +$ kubectl get pvc -n demo | grep dcdb +data-dcdb-0 Bound pvc-de4bfaa2-ea8e-4db5-b352-72abe3ab5b67 2920Mi RWO longhorn 27m +data-dcdb-1 Bound pvc-ad3b996c-3ffe-460c-8da3-ea14d534d217 2920Mi RWO longhorn 26m +data-dcdb-2 Bound pvc-e36556ef-80aa-49ff-91ac-ad07f237e203 2920Mi RWO longhorn 26m + +$ kubectl exec -n demo dcdb-0 -c documentdb -- df -h /var/pv +Filesystem Size Used Avail Use% Mounted on +/dev/longhorn/pvc-de4bfaa2-ea8e-4db5-b352-72abe3ab5b67 2.8G 1.8G 1.1G 63% /var/pv +``` + +The above output verifies that we have successfully autoscaled the volume of the DocumentDB cluster database from `2Gi` to `2920Mi` (≈ `2.85Gi`). With the larger volume the same data now sits at `63%` usage, below the threshold, so no further expansion is triggered. + +Finally, let's confirm the database is healthy over the MongoDB wire protocol: + +```bash +$ PASS=$(kubectl get secret -n demo dcdb-auth -o jsonpath='{.data.password}' | base64 -d) +$ kubectl exec -n demo dcdb-0 -c documentdb -- mongosh \ + "mongodb://default_user:${PASS}@localhost:10260/?tls=true&tlsAllowInvalidCertificates=true" \ + --quiet --eval 'db.runCommand({ ping: 1 })' +{ ok: 1 } +``` + +## Cleaning Up + +To clean up the Kubernetes resources created by this tutorial, run: + +```bash +kubectl delete documentdbautoscaler -n demo dcdb-storage-autoscaler +kubectl delete documentdb -n demo dcdb +kubectl delete ns demo +``` + +## Next Steps + +- Learn how to autoscale the compute resources of a DocumentDB cluster in the [Compute Autoscaling](/docs/guides/documentdb/autoscaler/compute/index.md) guide. +- Want to hack on KubeDB? Check our [contribution guidelines](/docs/CONTRIBUTING.md). 
diff --git a/docs/guides/documentdb/configuration/_index.md b/docs/guides/documentdb/configuration/_index.md new file mode 100644 index 000000000..e518c11cd --- /dev/null +++ b/docs/guides/documentdb/configuration/_index.md @@ -0,0 +1,10 @@ +--- +title: Run DocumentDB with Custom Configuration +menu: + docs_{{ .version }}: + identifier: dc-configuration + name: Custom Configuration + parent: dc-documentdb-guides + weight: 130 +menu_name: docs_{{ .version }} +--- diff --git a/docs/guides/documentdb/configuration/using-config-file.md b/docs/guides/documentdb/configuration/using-config-file.md new file mode 100644 index 000000000..19cc1d0c0 --- /dev/null +++ b/docs/guides/documentdb/configuration/using-config-file.md @@ -0,0 +1,282 @@ +--- +title: Run DocumentDB with Custom Configuration +menu: + docs_{{ .version }}: + identifier: dc-configuration-using-config-file + name: Custom Configuration + parent: dc-configuration + weight: 10 +menu_name: docs_{{ .version }} +section_menu_id: guides +--- + +> New to KubeDB? Please start [here](/docs/README.md). + +# Run DocumentDB with Custom Configuration + +KubeDB DocumentDB speaks the **MongoDB wire protocol** (port `10260`, TLS) on top of an +internal **PostgreSQL** storage engine (port `9712`, not exposed). Because the storage engine +is Postgres, you tune a DocumentDB instance with ordinary Postgres-style `key=value` settings +placed under a **`user.conf`** key — exactly the way you would tune KubeDB Postgres (it is +*not* a `mongod.conf`). + +KubeDB exposes three ways to supply custom configuration at provision time, and they layer on +top of each other in a fixed precedence: + +```text +auto-tuning / built-in defaults < configuration.secretName < configuration.inline +``` + +Anything you set **inline** wins over a referenced **Secret**, which in turn wins over the +**auto-tuned / default** values. 
The operator merges every supplied source and renders the +final files into a per-instance config Secret that is mounted into every pod. + +| Source | Field | Precedence | +| ------------------------------- | ------------------------------- | ---------- | +| Auto-tuning / built-in defaults | `spec.configuration.tuning` | lowest | +| Secret | `spec.configuration.secretName` | middle | +| Inline | `spec.configuration.inline` | highest | + +## Before You Begin + +- You need a Kubernetes cluster, and the `kubectl` command-line tool must be configured to + communicate with your cluster. If you do not already have a cluster, you can create one by + using [kind](https://kind.sigs.k8s.io/docs/user/quick-start/). + +- Install KubeDB in your cluster following the steps [here](/docs/setup/README.md). + +- To keep things isolated, this tutorial uses a separate namespace called `demo`: + + ```bash + $ kubectl create ns demo + namespace/demo created + ``` + +> Note: YAML files used in this tutorial are stored in [docs/examples/documentdb](https://github.com/kubedb/docs/tree/{{< param "info.version" >}}/docs/examples/documentdb) folder in GitHub repository [kubedb/docs](https://github.com/kubedb/docs). + +## Configuration via a Secret (cluster) + +Create a Secret whose single `user.conf` key carries your Postgres settings, then reference it +from the DocumentDB object with `spec.configuration.secretName`. KubeDB merges `user.conf` into +the rendered server configuration for **every** replica. 
+ +```yaml +apiVersion: v1 +kind: Secret +metadata: + name: documentdb-custom-config + namespace: demo +stringData: + user.conf: | + max_connections=250 + work_mem=8MB +``` + +```yaml +apiVersion: kubedb.com/v1alpha2 +kind: DocumentDB +metadata: + name: documentdb-cls-sample + namespace: demo +spec: + version: 'pg17-0.109.0' + storageType: Durable + deletionPolicy: Delete + replicas: 3 + configuration: + secretName: documentdb-custom-config + podTemplate: + spec: + containers: + - name: documentdb + resources: + requests: + cpu: 500m + memory: 2Gi + storage: + storageClassName: "local-path" + accessModes: + - ReadWriteOnce + resources: + requests: + storage: 5Gi +``` + +Apply both: + +```bash +$ kubectl apply -f documentdb-custom-config-secret.yaml +secret/documentdb-custom-config created +$ kubectl apply -f cluster-config-secret.yaml +documentdb.kubedb.com/documentdb-cls-sample created +``` + +### Inspect the rendered configuration + +The `spec.configuration` block on the object confirms which Secret is wired in: + +```bash +$ kubectl get docdb -n demo documentdb-cls-sample -o jsonpath='{.spec.configuration}' +{"secretName":"documentdb-custom-config"} +``` + +The Secret holds the `user.conf` that KubeDB feeds into each replica: + +```bash +$ kubectl get secret -n demo documentdb-custom-config -o jsonpath='{.data.user\.conf}' | base64 -d +max_connections=250 +work_mem=8MB +``` + +KubeDB also provisions the cluster's two auth secrets alongside it — `documentdb-cls-sample-auth` +(the MongoDB-compatibility `default_user`) and `documentdb-cls-sample-admin-auth` (the backend +admin): + +```bash +$ kubectl get secret -n demo | grep documentdb-cls-sample +documentdb-cls-sample-admin-auth kubernetes.io/basic-auth 2 34m +documentdb-cls-sample-auth kubernetes.io/basic-auth 2 34m +``` + +### Verify the database is serving + +Connect over the MongoDB wire protocol (TLS, port `10260`) with the `default_user` credentials +from `-auth` and ping: + +```bash +$ PASS=$(kubectl get 
secret -n demo documentdb-cls-sample-auth -o jsonpath='{.data.password}' | base64 -d) +$ kubectl exec -n demo documentdb-cls-sample-0 -c documentdb -- \ + mongosh "mongodb://default_user:${PASS}@localhost:10260/?tls=true&tlsAllowInvalidCertificates=true" \ + --quiet --eval 'db.runCommand({ ping: 1 })' +{ ok: 1 } +``` + +The primary accepts MongoDB-protocol traffic with the custom configuration applied. + +Tear the instance down before the next example: + +```bash +$ kubectl delete docdb -n demo documentdb-cls-sample +documentdb.kubedb.com "documentdb-cls-sample" deleted +``` + +## Configuration inline + +The inline form embeds the same Postgres settings directly in the DocumentDB spec under +`spec.configuration.inline`. Inline values take precedence over a referenced Secret. + +```yaml +apiVersion: kubedb.com/v1alpha2 +kind: DocumentDB +metadata: + name: documentdb-sa-sample + namespace: demo +spec: + version: 'pg17-0.109.0' + storageType: Durable + deletionPolicy: Delete + replicas: 1 + configuration: + inline: + user.conf: | + max_connections=300 + work_mem=16MB + podTemplate: + spec: + containers: + - name: documentdb + resources: + requests: + cpu: 500m + memory: 2Gi + storage: + storageClassName: "local-path" + accessModes: + - ReadWriteOnce + resources: + requests: + storage: 5Gi +``` + +```bash +$ kubectl apply -f standalone-config-inline.yaml +documentdb.kubedb.com/documentdb-sa-sample created +``` + +On a healthy instance the rendered `user.conf` would show `max_connections=300` / +`work_mem=16MB`, overriding any Secret-supplied values. + +## Configuration via auto-tuning + +The tuning form lets KubeDB compute Postgres settings for you from a workload profile and the +underlying storage characteristics instead of hand-writing `user.conf`. The operator runs a +`pgtune`-style calculation and renders the result into `pgtune.conf` (which sits at the lowest +precedence, so an explicit Secret or inline `user.conf` still overrides it). 
+ +```yaml +apiVersion: kubedb.com/v1alpha2 +kind: DocumentDB +metadata: + name: documentdb-sa-sample + namespace: demo +spec: + version: 'pg17-0.109.0' + storageType: Durable + deletionPolicy: Delete + replicas: 1 + configuration: + tuning: + profile: oltp # web | oltp | dw | mixed | desktop + storageType: ssd # ssd | hdd | san + maxConnections: 200 + disableAutoTune: false + podTemplate: + spec: + containers: + - name: documentdb + resources: + requests: + cpu: 500m + memory: 2Gi + storage: + storageClassName: "local-path" + accessModes: + - ReadWriteOnce + resources: + requests: + storage: 5Gi +``` + +```bash +$ kubectl apply -f standalone-config-tuning.yaml +documentdb.kubedb.com/documentdb-sa-sample created +``` + +On a healthy instance the auto-tuner emits a `pgtune.conf` derived from `profile: oltp`, +`storageType: ssd`, and `maxConnections: 200` (tuned `shared_buffers`, `effective_cache_size`, +`work_mem`, `max_connections=200`, etc.). + +> [!NOTE] +> **Standalone provisioning limitation in the test environment.** On the cluster used to +> capture this guide, standalone (`replicas: 1`) DocumentDB instances on version `pg17-0.109.0` +> did not finish bootstrapping: the standalone PetSet is rendered with only the `documentdb` +> container (the `documentdb-coordinator` sidecar that runs `initdb` on the clustered topology +> is absent), so the internal PostgreSQL data directory is never created and port `10260` never +> opens. The inline and tuning YAML above are the intended procedure; the live rendered-config +> inspection was therefore captured on the 3-replica (cluster) topology shown in the first +> section. The configuration mechanics (`user.conf` key, three sources, precedence) are +> identical for standalone and cluster. 
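
When a standalone instance stalls as described in the note above, one quick diagnostic is to list the containers in the pod spec and check whether the coordinator sidecar was rendered. The helper below is a minimal sketch and not part of KubeDB: `has_coordinator` is a hypothetical name, and the pod name in the usage line assumes the same ordinal naming seen on the cluster pods.

```bash
# Sketch (hypothetical helper): report whether a pod spec includes the
# documentdb-coordinator container.
# Usage: has_coordinator <namespace> <pod>
has_coordinator() {
  # jsonpath with [*] prints the container names separated by spaces
  names=$(kubectl get pod -n "$1" "$2" -o jsonpath='{.spec.containers[*].name}')
  case " $names " in
    *" documentdb-coordinator "*) echo "present" ;;
    *)                            echo "absent" ;;
  esac
}
```

For example, `has_coordinator demo documentdb-sa-sample-0` would print `absent` on a pod that was rendered with only the `documentdb` container.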
+ +## Cleaning Up + +```bash +kubectl delete docdb -n demo documentdb-cls-sample --ignore-not-found +kubectl delete docdb -n demo documentdb-sa-sample --ignore-not-found +kubectl delete secret -n demo documentdb-custom-config --ignore-not-found +kubectl delete ns demo +``` + +## Next Steps + +- Apply configuration to a running database with the [Reconfigure](/docs/guides/documentdb/reconfigure/) OpsRequest. +- [Restart](/docs/guides/documentdb/restart/) a DocumentDB database. diff --git a/docs/guides/documentdb/failure-and-disaster-recovery/_index.md b/docs/guides/documentdb/failure-and-disaster-recovery/_index.md new file mode 100644 index 000000000..0de0cf3f8 --- /dev/null +++ b/docs/guides/documentdb/failure-and-disaster-recovery/_index.md @@ -0,0 +1,10 @@ +--- +title: DocumentDB Failure and Disaster Recovery Scenarios +menu: + docs_{{ .version }}: + identifier: dc-failure-disaster-recovery + name: Failover and DR Scenarios + parent: dc-documentdb-guides + weight: 150 +menu_name: docs_{{ .version }} +--- diff --git a/docs/guides/documentdb/failure-and-disaster-recovery/failover.md b/docs/guides/documentdb/failure-and-disaster-recovery/failover.md new file mode 100644 index 000000000..b0b0b8ada --- /dev/null +++ b/docs/guides/documentdb/failure-and-disaster-recovery/failover.md @@ -0,0 +1,166 @@ +--- +title: DocumentDB Automatic Failover +menu: + docs_{{ .version }}: + identifier: dc-failure-disaster-recovery-failover + name: Automatic Failover + parent: dc-failure-disaster-recovery + weight: 10 +menu_name: docs_{{ .version }} +section_menu_id: guides +--- + +> New to KubeDB? Please start [here](/docs/README.md). + +# Automatic Failover in a DocumentDB Cluster + +A `DocumentDB` cluster is self-healing. Each pod runs a `documentdb-coordinator` container that +participates in a **Raft** consensus group; the Raft leader's pod is labelled +`kubedb.com/role=primary` and runs the writable PostgreSQL engine, while the others are +`standby` replicas streaming from it. 
If the primary pod dies, the surviving coordinators detect +the loss, elect a new leader, promote the healthiest standby to primary, and re-label the pods — +with **no operator action and no OpsRequest required**. This guide forces a failover and shows +that committed data survives it. + +> This is a cluster-only scenario; a standalone (`replicas: 1`) DocumentDB has no standby to fail +> over to. + +## Before You Begin + +- You need a Kubernetes cluster and the `kubectl` CLI configured to talk to it. +- Install KubeDB following the steps [here](/docs/setup/README.md). +- This tutorial uses a namespace called `demo` (`kubectl create ns demo`). +- Deploy a 3-replica `DocumentDB` cluster (`documentdb-cls-sample`) and wait for it to become + `Ready`. + +> Note: YAML files used in this tutorial are stored in [docs/examples/documentdb](https://github.com/kubedb/docs/tree/{{< param "info.version" >}}/docs/examples/documentdb) folder in GitHub repository [kubedb/docs](https://github.com/kubedb/docs). + +## Identify the current leader + +The leader is the pod labelled `kubedb.com/role=primary`: + +```bash +$ kubectl get pods -n demo -l app.kubernetes.io/instance=documentdb-cls-sample -L kubedb.com/role +NAME READY STATUS RESTARTS AGE ROLE +documentdb-cls-sample-0 2/2 Running 0 4m4s primary +documentdb-cls-sample-1 2/2 Running 0 99s standby +documentdb-cls-sample-2 2/2 Running 0 2m48s standby +``` + +`documentdb-cls-sample-0` is the leader. 
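
Reading the role from the table works for a one-off check. In a script it can be simpler to let the label selector pick the primary. The function below is a sketch under that assumption: `current_primary` is a hypothetical name, and it relies only on the two labels already shown in the command above.

```bash
# Sketch (hypothetical helper): print the name of the pod currently labelled
# kubedb.com/role=primary for a given DocumentDB.
# Usage: current_primary <namespace> <documentdb-name>
current_primary() {
  kubectl get pods -n "$1" \
    -l "app.kubernetes.io/instance=$2,kubedb.com/role=primary" \
    -o jsonpath='{.items[*].metadata.name}'
}
```

`current_primary demo documentdb-cls-sample` prints an empty string while no pod carries the `primary` label, which is what you would expect to see briefly during a re-election.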
+ +## Write a test document on the primary + +```bash +$ PASS=$(kubectl get secret -n demo documentdb-cls-sample-auth -o jsonpath='{.data.password}' | base64 -d) +$ kubectl exec -n demo documentdb-cls-sample-0 -c documentdb -- \ + mongosh "mongodb://default_user:${PASS}@localhost:10260/?tls=true&tlsAllowInvalidCertificates=true" \ + --quiet --eval ' + db.getSiblingDB("failover").coll.insertOne({k:"before-failover", ts:new Date()}); + printjson(db.getSiblingDB("failover").coll.findOne({k:"before-failover"}));' +{ + _id: ObjectId('6a43e4bd2bb67b71d58563b1'), + k: 'before-failover', + ts: ISODate('2026-06-30T15:46:05.334Z') +} +``` + +## Force a failover + +Simulate a node loss by force-deleting the leader pod: + +```bash +$ kubectl delete pod -n demo documentdb-cls-sample-0 --grace-period=0 --force +pod "documentdb-cls-sample-0" force deleted from demo namespace +``` + +## Watch the re-election + +Within a few seconds a new primary is elected. The database briefly reports `Critical` (it has +lost a quorum member) and returns to `Ready` once a new leader is serving: + +```bash +$ # poll: kubectl get pods ... 
-L kubedb.com/role + kubectl get docdb +[t+0s ] db=Critical primary='' sample-0=0/2 (terminating) sample-1=standby sample-2=standby +[t+15s] db=Critical primary='documentdb-cls-sample-2' sample-0=2/2 (rejoining) sample-1=standby sample-2=primary +[t+45s] db=Ready primary='documentdb-cls-sample-2' sample-0=standby sample-1=standby sample-2=primary +``` + +Final topology — `documentdb-cls-sample-2` is the new primary and the old leader has rejoined as +a standby: + +```bash +$ kubectl get pods -n demo -l app.kubernetes.io/instance=documentdb-cls-sample -L kubedb.com/role +NAME READY STATUS RESTARTS AGE ROLE +documentdb-cls-sample-0 2/2 Running 0 56s standby +documentdb-cls-sample-1 2/2 Running 0 2m37s standby +documentdb-cls-sample-2 2/2 Running 0 3m46s primary +``` + +The coordinator log on the new primary tells the whole story: the Raft leader change is +detected, the **healthiest** node (lowest LSN diff) is chosen, the PostgreSQL engine is +promoted, and the pod is re-labelled `primary`: + +```bash +$ kubectl logs -n demo documentdb-cls-sample-2 -c documentdb-coordinator | grep -iE 'leader|elect|primary|promot' +on_Leader_change.go:71] *** Raft Leader Changed **** Checking if I can run as primary*** My current Role is standby +on_Leader_change.go:350] Healthiest node detected documentdb-cls-sample-2 with LSN diff 0 bytes +ha_postgres.go:286] Previous primary from this node is : documentdb-cls-sample-0 +ha_postgres.go:288] new elected primary is :documentdb-cls-sample-2. +ha_postgres.go:381] I am the healthiest one and I am the primary. +ha_postgres.go:760] This pod is now a primary +exec_utils.go:159] demo/documentdb-cls-sample-2 is promoted as primary +ha_postgres.go:800] Successfully patched pod demo/documentdb-cls-sample-2 to role "primary" on attempt 1 +health.go:209] Timeline missmatch identified. 
proposing new leader timeline = 3 +``` + +## Verify data continuity + +Reconnect to the **new** primary and read the document written before the failover — it is +intact: + +```bash +$ kubectl exec -n demo documentdb-cls-sample-2 -c documentdb -- \ + mongosh "mongodb://default_user:${PASS}@localhost:10260/?tls=true&tlsAllowInvalidCertificates=true" \ + --quiet --eval 'printjson(db.getSiblingDB("failover").coll.findOne({k:"before-failover"}));' +{ + _id: ObjectId('6a43e4bd2bb67b71d58563b1'), + k: 'before-failover', + ts: ISODate('2026-06-30T15:46:05.334Z') +} +``` + +The cluster is back to `Ready` with all conditions healthy: + +```bash +$ kubectl get docdb -n demo documentdb-cls-sample +NAME NAMESPACE VERSION STATUS AGE +documentdb-cls-sample demo pg17-0.109.0 Ready 13m + +$ kubectl get docdb -n demo documentdb-cls-sample \ + -o jsonpath='{range .status.conditions[*]}{.type}={.status} :: {.message}{"\n"}{end}' +ProvisioningStarted=True :: The KubeDB operator has started the provisioning of DocumentDB: demo/documentdb-cls-sample +ReplicaReady=True :: All replicas are ready for DocumentDB demo/documentdb-cls-sample +AcceptingConnection=True :: The DocumentDB: demo/documentdb-cls-sample is accepting client requests. +Ready=True :: The DocumentDB: demo/documentdb-cls-sample is ready. +Provisioned=True :: The DocumentDB: demo/documentdb-cls-sample is successfully provisioned. +``` + +## Summary + +Failover is automatic and fast: the Raft group elected a new leader and promoted the healthiest +standby within seconds, the operator never had to intervene, and the previously-committed write +survived the loss of the original primary. The old pod rejoined the cluster as a standby once it +was rescheduled. + +## Cleaning Up + +```bash +kubectl delete documentdb -n demo documentdb-cls-sample +kubectl delete ns demo +``` + +## Next Steps + +- [Horizontal scaling](/docs/guides/documentdb/scaling/horizontal-scaling/) of a DocumentDB cluster. 
+- [Restart](/docs/guides/documentdb/restart/) a DocumentDB database. diff --git a/docs/guides/documentdb/reconfigure/_index.md b/docs/guides/documentdb/reconfigure/_index.md new file mode 100644 index 000000000..266db024c --- /dev/null +++ b/docs/guides/documentdb/reconfigure/_index.md @@ -0,0 +1,10 @@ +--- +title: Reconfigure DocumentDB +menu: + docs_{{ .version }}: + identifier: dc-reconfigure + name: Reconfigure + parent: dc-documentdb-guides + weight: 170 +menu_name: docs_{{ .version }} +--- diff --git a/docs/guides/documentdb/reconfigure/reconfigure.md b/docs/guides/documentdb/reconfigure/reconfigure.md new file mode 100644 index 000000000..101700835 --- /dev/null +++ b/docs/guides/documentdb/reconfigure/reconfigure.md @@ -0,0 +1,159 @@ +--- +title: Reconfigure DocumentDB +menu: + docs_{{ .version }}: + identifier: dc-reconfigure-details + name: Reconfigure DocumentDB + parent: dc-reconfigure + weight: 10 +menu_name: docs_{{ .version }} +section_menu_id: guides +--- + +> New to KubeDB? Please start [here](/docs/README.md). + +# Reconfigure DocumentDB + +KubeDB lets you change the runtime configuration of a running `DocumentDB` database without +recreating it, using a `DocumentDBOpsRequest` of type `Reconfigure`. The underlying engine is +PostgreSQL (DocumentDB speaks the MongoDB wire protocol on top of it), so the tunables you pass +are PostgreSQL parameters supplied through a `user.conf` fragment. + +Two modes are supported: + +- **Apply a custom config** — `spec.configuration.applyConfig` merges the keys you provide into + the running configuration. +- **Remove the custom config** — `spec.configuration.removeCustomConfig: true` drops any + previously applied custom configuration and returns the database to its defaults. + +## Before You Begin + +- You need a Kubernetes cluster and the `kubectl` CLI configured to talk to it. +- Install KubeDB following the steps [here](/docs/setup/README.md). 
+- This tutorial uses a namespace called `demo` (`kubectl create ns demo`). +- Deploy a `DocumentDB` cluster (`documentdb-cls-sample`) and wait for it to become `Ready`. + +> Note: YAML files used in this tutorial are stored in [docs/examples/documentdb](https://github.com/kubedb/docs/tree/{{< param "info.version" >}}/docs/examples/documentdb) folder in GitHub repository [kubedb/docs](https://github.com/kubedb/docs). + +## Inspecting the current configuration + +`max_connections` is a convenient parameter to watch because it has a visible default of `100`. +You can read it from the internal PostgreSQL engine (port `9712`, backend-only) using the admin +credentials from `-admin-auth`: + +```bash +$ ADMINU=$(kubectl get secret -n demo documentdb-cls-sample-admin-auth -o jsonpath='{.data.username}' | base64 -d) +$ ADMINP=$(kubectl get secret -n demo documentdb-cls-sample-admin-auth -o jsonpath='{.data.password}' | base64 -d) +$ kubectl exec -n demo documentdb-cls-sample-0 -c documentdb -- \ + bash -lc "PGPASSWORD='$ADMINP' psql -h localhost -p 9712 -U '$ADMINU' -d postgres -tAc 'show max_connections'" +100 +``` + +## Apply a custom configuration + +```yaml +apiVersion: ops.kubedb.com/v1alpha1 +kind: DocumentDBOpsRequest +metadata: + name: documentdb-cls-reconfigure + namespace: demo +spec: + type: Reconfigure + databaseRef: + name: documentdb-cls-sample + configuration: + applyConfig: + user.conf: | + max_connections=250 +``` + +```bash +$ kubectl apply -f cluster-reconfigure.yaml +documentdbopsrequest.ops.kubedb.com/documentdb-cls-reconfigure created +``` + +The operator performs a careful, leader-aware rollout: it transfers Raft leadership to the first +pod, pauses the `documentdb-coordinator` so it does not trigger an automatic failover during the +restart, then evicts the pod so it comes back with the new configuration mounted: + +```bash +$ kubectl get dcops -n demo documentdb-cls-reconfigure \ + -o jsonpath='{range .status.conditions[*]}{.type}={.status} :: 
{.message}{"\n"}{end}' +Running=True :: Reconfiguring DocumentDB Database +ReconcileDocumentDBDatabase=True :: Successfully Reconciled DocumentDB Database +TransferLeaderShipToFirstNodeBeforeCoordinatorPaused=True :: Successfully Transferred Leadership to first pod before documentdb-coordinator paused +PausePgCoordinatorBeforeCustomRestart=True :: Successfully Pause DocumentDB-Coordinator Before Custom Restart +EvictPod=True :: evict pod; ConditionStatus:True +CheckPodReady--documentdb-cls-sample-0=False :: check pod ready; ConditionStatus:False; PodName:documentdb-cls-sample-0 +``` + +## Remove a custom configuration + +Once a custom configuration has been applied, you remove it and return to the defaults with: + +```yaml +apiVersion: ops.kubedb.com/v1alpha1 +kind: DocumentDBOpsRequest +metadata: + name: documentdb-cls-reconfigure-remove + namespace: demo +spec: + type: Reconfigure + databaseRef: + name: documentdb-cls-sample + configuration: + removeCustomConfig: true +``` + +```bash +$ kubectl apply -f cluster-reconfigure-remove.yaml +documentdbopsrequest.ops.kubedb.com/documentdb-cls-reconfigure-remove created +``` + +The operator performs the same leader-aware rolling restart, dropping the custom-config volume +from each pod so `max_connections` returns to its default of `100`. 
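
To confirm the effective value after either OpsRequest reaches `Successful`, the `psql` check from the inspection step can be wrapped in a small function. This is a sketch only: `show_pg_setting` is a hypothetical name, and it reuses the port (`9712`), container name, and admin secret suffix shown earlier in this guide.

```bash
# Sketch (hypothetical helper): read one PostgreSQL setting from the backend
# engine of a DocumentDB pod, using the admin credentials.
# Usage: show_pg_setting <namespace> <documentdb-name> <pod> <parameter>
show_pg_setting() {
  ns="$1"; db="$2"; pod="$3"; param="$4"
  user=$(kubectl get secret -n "$ns" "${db}-admin-auth" -o jsonpath='{.data.username}' | base64 -d)
  pass=$(kubectl get secret -n "$ns" "${db}-admin-auth" -o jsonpath='{.data.password}' | base64 -d)
  # Note: the single-quoted PGPASSWORD breaks if the password itself contains
  # a single quote; this mirrors the quoting used in the inspection command.
  kubectl exec -n "$ns" "$pod" -c documentdb -- \
    bash -lc "PGPASSWORD='$pass' psql -h localhost -p 9712 -U '$user' -d postgres -tAc 'show $param'"
}
```

For example, `show_pg_setting demo documentdb-cls-sample documentdb-cls-sample-0 max_connections` should report `250` after a successful apply and `100` after a successful removal.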
+ +> [!CAUTION] +> **Known limitation on the `pg17-0.109.0` build used to capture this guide.** The `Reconfigure` +> OpsRequest did not converge: the operator transferred leadership and paused the coordinator +> correctly, but when it recreated the first pod with the new custom-config volume, the +> referenced config `Secret` was never created, so the pod was stuck in `Init` with a +> `FailedMount` and the OpsRequest ended in `Failed`: +> +> ```bash +> $ kubectl get events -n demo --field-selector involvedObject.name=documentdb-cls-sample-0 | grep FailedMount +> Warning FailedMount pod/documentdb-cls-sample-0 MountVolume.SetUp failed for volume "custom-config" : secret "documentdb-cls-sample-c037f1" not found +> +> $ kubectl get dcops -n demo documentdb-cls-reconfigure +> NAME TYPE STATUS AGE +> documentdb-cls-reconfigure Reconfigure Failed 11m +> ``` +> +> Because `Reconfigure` pauses the coordinator before restarting the primary, a mid-flight +> failure can leave the database `NotReady` with the coordinator waiting to be resumed. Since +> OpsRequests are admitted only while the database is `Ready`, a follow-up OpsRequest cannot +> clear that state — recovery requires recreating the `DocumentDB`. Validate `Reconfigure` +> against a non-production cluster on this build before relying on it. The YAML and rollout +> mechanics above are the intended workflow; provisioning custom configuration up front (see +> [Custom Configuration](/docs/guides/documentdb/configuration/using-config-file/)) is +> unaffected. + +## Standalone + +The same `DocumentDBOpsRequest` applies to a standalone (`replicas: 1`) instance — point +`spec.databaseRef.name` at `documentdb-sa-sample`. On this build standalone instances did not +finish bootstrapping (see the [Restart](/docs/guides/documentdb/restart/) guide), so the +standalone `Reconfigure` could not be exercised live. 
+ +## Cleaning Up + +```bash +kubectl delete documentdbopsrequest -n demo documentdb-cls-reconfigure documentdb-cls-reconfigure-remove --ignore-not-found +kubectl delete documentdb -n demo documentdb-cls-sample +kubectl delete ns demo +``` + +## Next Steps + +- Provision a database with [Custom Configuration](/docs/guides/documentdb/configuration/using-config-file/). +- [Restart](/docs/guides/documentdb/restart/) a DocumentDB database. diff --git a/docs/guides/documentdb/restart/_index.md b/docs/guides/documentdb/restart/_index.md new file mode 100644 index 000000000..50fd2235b --- /dev/null +++ b/docs/guides/documentdb/restart/_index.md @@ -0,0 +1,10 @@ +--- +title: Restart DocumentDB +menu: + docs_{{ .version }}: + identifier: dc-restart + name: Restart + parent: dc-documentdb-guides + weight: 140 +menu_name: docs_{{ .version }} +--- diff --git a/docs/guides/documentdb/restart/restart.md b/docs/guides/documentdb/restart/restart.md new file mode 100644 index 000000000..930888375 --- /dev/null +++ b/docs/guides/documentdb/restart/restart.md @@ -0,0 +1,179 @@ +--- +title: Restart DocumentDB +menu: + docs_{{ .version }}: + identifier: dc-restart-details + name: Restart DocumentDB + parent: dc-restart + weight: 10 +menu_name: docs_{{ .version }} +section_menu_id: guides +--- + +> New to KubeDB? Please start [here](/docs/README.md). + +# Restart DocumentDB + +KubeDB supports restarting every pod of a `DocumentDB` database through a +`DocumentDBOpsRequest` of type `Restart`. This is useful after a node-level change, to pick up +a rotated certificate, or simply to clear transient state without deleting the database. The +operator drains and recreates the pods one at a time, always keeping a Raft leader available, +so the MongoDB wire endpoint (port `10260`) stays serviceable throughout. + +## Before You Begin + +- You need a Kubernetes cluster and the `kubectl` CLI configured to talk to it. 
If you do not + have one, create it with [kind](https://kind.sigs.k8s.io/docs/user/quick-start/). +- Install KubeDB following the steps [here](/docs/setup/README.md). +- This tutorial uses a namespace called `demo`: + + ```bash + $ kubectl create ns demo + namespace/demo created + ``` + +> Note: YAML files used in this tutorial are stored in [docs/examples/documentdb](https://github.com/kubedb/docs/tree/{{< param "info.version" >}}/docs/examples/documentdb) folder in GitHub repository [kubedb/docs](https://github.com/kubedb/docs). + +## Deploy a DocumentDB cluster + +```yaml +apiVersion: kubedb.com/v1alpha2 +kind: DocumentDB +metadata: + name: documentdb-cls-sample + namespace: demo +spec: + version: 'pg17-0.109.0' + storageType: Durable + deletionPolicy: Delete + replicas: 3 + storage: + storageClassName: "longhorn" + accessModes: + - ReadWriteOnce + resources: + requests: + storage: 5Gi +``` + +```bash +$ kubectl apply -f cluster.yaml +documentdb.kubedb.com/documentdb-cls-sample created +``` + +The cluster has three pods — one Raft `primary` and two `standby` replicas. Each pod runs `2/2` +containers: the `documentdb` engine and the `documentdb-coordinator` (the Raft member that +participates in leader election): + +```bash +$ kubectl get pods -n demo -l app.kubernetes.io/instance=documentdb-cls-sample -L kubedb.com/role +NAME READY STATUS RESTARTS AGE ROLE +documentdb-cls-sample-0 2/2 Running 0 2m23s primary +documentdb-cls-sample-1 2/2 Running 0 2m standby +documentdb-cls-sample-2 2/2 Running 0 93s standby +``` + +## Create the Restart OpsRequest + +```yaml +apiVersion: ops.kubedb.com/v1alpha1 +kind: DocumentDBOpsRequest +metadata: + name: documentdb-cls-restart + namespace: demo +spec: + type: Restart + databaseRef: + name: documentdb-cls-sample +``` + +- `spec.type` specifies the type of the OpsRequest. +- `spec.databaseRef` holds the name of the `DocumentDB` (it must be in the same namespace). 
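
Because KubeDB admits OpsRequests only against a `Ready` database, it can be worth checking the phase before submitting the request. The sketch below assumes the `STATUS` column printed by `kubectl get docdb` is backed by `.status.phase`; that field path is an assumption here, and `db_phase` and `require_ready` are hypothetical helper names.

```bash
# Sketch (hypothetical helpers): refuse to continue unless the DocumentDB
# reports Ready. Assumes the phase is exposed at .status.phase.
# Usage: require_ready <namespace> <documentdb-name>
db_phase() {
  kubectl get documentdb -n "$1" "$2" -o jsonpath='{.status.phase}'
}

require_ready() {
  phase=$(db_phase "$1" "$2")
  if [ "$phase" = "Ready" ]; then
    return 0
  fi
  echo "DocumentDB $1/$2 is '$phase', not Ready" >&2
  return 1
}
```

A typical use is to chain it in front of the apply: `require_ready demo documentdb-cls-sample && kubectl apply -f cluster-restart.yaml`.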
+ +```bash +$ kubectl apply -f cluster-restart.yaml +documentdbopsrequest.ops.kubedb.com/documentdb-cls-restart created +``` + +Watch the OpsRequest until it reports `Successful` (`dcops` is the short name for +`DocumentDBOpsRequest`; `docdb` is the short name for `DocumentDB`): + +```bash +$ kubectl get dcops -n demo documentdb-cls-restart -w +NAME TYPE STATUS AGE +documentdb-cls-restart Restart Progressing 20s +documentdb-cls-restart Restart Successful 3m52s +``` + +## What happened + +The operator restarts the **standbys first**, then transfers Raft leadership off the current +primary (a controlled failover) before restarting it last, so a writable leader is always +present. The status conditions tell the whole story: + +```bash +$ kubectl get dcops -n demo documentdb-cls-restart \ + -o jsonpath='{range .status.conditions[*]}{.type}={.status} :: {.message}{"\n"}{end}' +Restart=True :: DocumentDB ops request is restarting pods +ResumePGCoordinator=True :: successfully resumed documentdb-coordinator +SetRaftKeyOpsRequestProgressing=True :: Successfully Set Raft Key OpsRequestProgressing +EvictPod=True :: evict pod; ConditionStatus:True +GetPrimary=True :: get primary; ConditionStatus:True +TransferLeader=True :: transfer leader; ConditionStatus:True +TransferLeaderForFailover=True :: transfer leader for failover; ConditionStatus:True +CheckIsMaster=True :: check is master; ConditionStatus:True +FailoverDone=True :: failover is done successfully +RestartNodes=True :: Successfully restarted all nodes +Successful=True :: Successfully completed the modification process. +UnsetRaftKeyOpsRequestProgressing=True :: Successfully Unset Raft Key OpsRequestProgressing +``` + +## After the restart + +All three pods are freshly recreated and back to `2/2 Running`. 
Because leadership was +transferred during the rolling restart, the `primary` role has moved to a different pod — this +is expected and harmless: + +```bash +$ kubectl get pods -n demo -l app.kubernetes.io/instance=documentdb-cls-sample -L kubedb.com/role +NAME READY STATUS RESTARTS AGE ROLE +documentdb-cls-sample-0 2/2 Running 0 66s standby +documentdb-cls-sample-1 2/2 Running 0 2m51s primary +documentdb-cls-sample-2 2/2 Running 0 2m1s standby +``` + +The database answers the MongoDB wire protocol immediately after the restart: + +```bash +$ PASS=$(kubectl get secret -n demo documentdb-cls-sample-auth -o jsonpath='{.data.password}' | base64 -d) +$ kubectl exec -n demo documentdb-cls-sample-0 -c documentdb -- \ + mongosh "mongodb://default_user:${PASS}@localhost:10260/?tls=true&tlsAllowInvalidCertificates=true" \ + --quiet --eval 'db.runCommand({ ping: 1 })' +{ ok: 1 } +``` + +## Standalone + +The method of restarting a standalone (`replicas: 1`) and a cluster database is exactly the +same — point `spec.databaseRef.name` at the standalone instance (`documentdb-sa-sample`). + +> [!NOTE] +> On the build used to capture this guide (`pg17-0.109.0`), standalone instances did not finish +> bootstrapping: the standalone PetSet is rendered without the `documentdb-coordinator` sidecar, +> so the internal PostgreSQL is never initialized and the database never reaches `Ready`. +> Because KubeDB admits OpsRequests only against a `Ready` database, the standalone variant +> could not be exercised live (a `Restart` request stayed `Pending`); the cluster procedure +> above applies verbatim once a standalone instance is healthy. + +## Cleaning Up + +```bash +kubectl delete documentdbopsrequest -n demo documentdb-cls-restart +kubectl delete documentdb -n demo documentdb-cls-sample +kubectl delete ns demo +``` + +## Next Steps + +- [Vertical scaling](/docs/guides/documentdb/scaling/vertical-scaling/) of a DocumentDB cluster. 
+- [Horizontal scaling](/docs/guides/documentdb/scaling/horizontal-scaling/) of a DocumentDB cluster. diff --git a/docs/guides/documentdb/rotate-authentication/_index.md b/docs/guides/documentdb/rotate-authentication/_index.md new file mode 100644 index 000000000..1f21b4489 --- /dev/null +++ b/docs/guides/documentdb/rotate-authentication/_index.md @@ -0,0 +1,10 @@ +--- +title: Rotate Authentication DocumentDB +menu: + docs_{{ .version }}: + identifier: dc-rotate-authentication + name: Rotate Authentication + parent: dc-documentdb-guides + weight: 160 +menu_name: docs_{{ .version }} +--- diff --git a/docs/guides/documentdb/rotate-authentication/rotate-authentication.md b/docs/guides/documentdb/rotate-authentication/rotate-authentication.md new file mode 100644 index 000000000..fc2acb505 --- /dev/null +++ b/docs/guides/documentdb/rotate-authentication/rotate-authentication.md @@ -0,0 +1,162 @@ +--- +title: Rotate Authentication DocumentDB +menu: + docs_{{ .version }}: + identifier: dc-rotate-authentication-details + name: Rotate Authentication + parent: dc-rotate-authentication + weight: 10 +menu_name: docs_{{ .version }} +section_menu_id: guides +--- + +> New to KubeDB? Please start [here](/docs/README.md). + +# Rotate Authentication Credentials of DocumentDB + +KubeDB can rotate the credentials of a `DocumentDB` database on demand with a +`DocumentDBOpsRequest` of type `RotateAuth`. The operator generates a fresh password, applies it +to the database, rolls the pods so every replica picks up the change, and preserves the previous +password under a `password.prev` key so you can reconcile any clients that still hold the old +secret. 
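
Reading a key such as `password.prev` with `kubectl` needs the dot escaped in the JSONPath expression. The function below is a small sketch of that pattern: `secret_value` is a hypothetical name and it uses only `kubectl get secret` and `base64 -d`.

```bash
# Sketch (hypothetical helper): print the decoded value of one key in a Secret.
# Dots in the key are escaped so that "password.prev" is treated as a single key.
# Usage: secret_value <namespace> <secret> <key>
secret_value() {
  key=$(printf '%s' "$3" | sed 's/\./\\./g')
  kubectl get secret -n "$1" "$2" -o jsonpath="{.data.$key}" | base64 -d
}
```

For example, `secret_value demo documentdb-cls-sample-admin-auth password.prev` prints nothing before the first rotation and the previous admin password afterwards.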
+ +## Two credentials, one rotated + +A DocumentDB database ships with **two** auth secrets (this is a key difference from KubeDB +Postgres, which has a single auth secret): + +| Secret | User | Purpose | +| --- | --- | --- | +| `-auth` | `default_user` | application / MongoDB-wire login (port `10260`) | +| `-admin-auth` | `documentdb` | internal admin / backend-PostgreSQL superuser | + +> [!IMPORTANT] +> **`RotateAuth` rotates only the `-admin-auth` secret. The application `-auth` secret is +> left untouched.** This is demonstrated explicitly below. After the OpsRequest completes, +> re-read `-admin-auth` to obtain the new admin password. + +## Before You Begin + +- You need a Kubernetes cluster and the `kubectl` CLI configured to talk to it. +- Install KubeDB following the steps [here](/docs/setup/README.md). +- This tutorial uses a namespace called `demo` (`kubectl create ns demo`). +- Deploy a `DocumentDB` cluster (`documentdb-cls-sample`) and wait for it to become `Ready`. + +> Note: YAML files used in this tutorial are stored in [docs/examples/documentdb](https://github.com/kubedb/docs/tree/{{< param "info.version" >}}/docs/examples/documentdb) folder in GitHub repository [kubedb/docs](https://github.com/kubedb/docs). 
+ +## Credentials before rotation + +```bash +$ kubectl get secret -n demo documentdb-cls-sample-admin-auth -o jsonpath='{.data.password}' | base64 -d +EY1imAac)vqps)Ez +$ kubectl get secret -n demo documentdb-cls-sample-auth -o jsonpath='{.data.password}' | base64 -d +DQShSsn0Dqq7Uf*F +``` + +The `-admin-auth` secret has no `password.prev` key yet (nothing has been rotated): + +```bash +$ kubectl get secret -n demo documentdb-cls-sample-admin-auth -o jsonpath='{.data.password\.prev}' + # (empty) +``` + +## Create the RotateAuth OpsRequest + +```yaml +apiVersion: ops.kubedb.com/v1alpha1 +kind: DocumentDBOpsRequest +metadata: + name: documentdb-cls-rotate-auth + namespace: demo +spec: + type: RotateAuth + databaseRef: + name: documentdb-cls-sample +``` + +```bash +$ kubectl apply -f cluster-rotate-auth.yaml +documentdbopsrequest.ops.kubedb.com/documentdb-cls-rotate-auth created + +$ kubectl get dcops -n demo documentdb-cls-rotate-auth +NAME TYPE STATUS AGE +documentdb-cls-rotate-auth RotateAuth Successful 3m34s +``` + +The status conditions show the new credential being generated, applied to the primary, written +into the PetSet, and then a rolling restart so all replicas pick it up: + +```bash +$ kubectl get dcops -n demo documentdb-cls-rotate-auth \ + -o jsonpath='{range .status.conditions[*]}{.type}={.status} :: {.message}{"\n"}{end}' +RotateAuth=True :: DocumentDB ops request has started to rotate auth for documentdb +UpdateCredential=True :: Successfully generated new credentials +ApplyNewCredential=True :: Successfully applied rotated credential to the database primary +UpdatePetSets=True :: Successfully updated petsets for rotate auth type +EvictPod=True :: evict pod; ConditionStatus:True +CheckPodReady=True :: check pod ready; ConditionStatus:True +RestartNodes=True :: Successfully restarted all the nodes +RestartReadReplicas=True :: Successfully Restarted Read Replicas +Successful=True :: Successfully Rotated DocumentDB Auth Secret 
+UnsetRaftKeyOpsRequestProgressing=True :: Successfully Unset Raft Key OpsRequestProgressing +``` + +## Credentials after rotation + +The admin password has changed, and the **old admin password is retained under +`password.prev`**: + +```bash +$ kubectl get secret -n demo documentdb-cls-sample-admin-auth -o jsonpath='{.data.password}' | base64 -d +ELKnwAUT.I85QJ4g +$ kubectl get secret -n demo documentdb-cls-sample-admin-auth -o jsonpath='{.data.password\.prev}' | base64 -d +EY1imAac)vqps)Ez +``` + +The application `-auth` secret is **unchanged** — same password as before, and no +`password.prev` was written: + +```bash +$ kubectl get secret -n demo documentdb-cls-sample-auth -o jsonpath='{.data.password}' | base64 -d +DQShSsn0Dqq7Uf*F # identical to the "before" value +$ kubectl get secret -n demo documentdb-cls-sample-auth -o jsonpath='{.data.password\.prev}' + # (empty) +``` + +Because the application credential did not change, existing MongoDB-wire clients keep working +with no reconfiguration: + +```bash +$ PASS=$(kubectl get secret -n demo documentdb-cls-sample-auth -o jsonpath='{.data.password}' | base64 -d) +$ kubectl exec -n demo documentdb-cls-sample-1 -c documentdb -- \ + mongosh "mongodb://default_user:${PASS}@localhost:10260/?tls=true&tlsAllowInvalidCertificates=true" \ + --quiet --eval 'db.runCommand({ ping: 1 })' +{ ok: 1 } +``` + +## Summary + +`RotateAuth` is a safe, targeted operation: it rotates the **admin** credential only +(`-admin-auth`), keeps the prior value in `password.prev` for a grace window, and leaves the +application login (`-auth`) and all client connections undisturbed. + +## Standalone + +The same `DocumentDBOpsRequest` applies to a standalone (`replicas: 1`) instance — point +`spec.databaseRef.name` at `documentdb-sa-sample`. 
On this build standalone instances did not +finish bootstrapping (see the [Restart](/docs/guides/documentdb/restart/) guide), so the +standalone `RotateAuth` could not be exercised live; the behavior above applies once a +standalone instance is healthy. + +## Cleaning Up + +```bash +kubectl delete documentdbopsrequest -n demo documentdb-cls-rotate-auth +kubectl delete documentdb -n demo documentdb-cls-sample +kubectl delete ns demo +``` + +## Next Steps + +- [Restart](/docs/guides/documentdb/restart/) a DocumentDB database. +- Provision a database with [Custom Configuration](/docs/guides/documentdb/configuration/using-config-file/). diff --git a/docs/guides/documentdb/scaling/_index.md b/docs/guides/documentdb/scaling/_index.md new file mode 100644 index 000000000..8b4b796da --- /dev/null +++ b/docs/guides/documentdb/scaling/_index.md @@ -0,0 +1,10 @@ +--- +title: Scaling DocumentDB +menu: + docs_{{ .version }}: + identifier: guides-documentdb-scaling + name: Scaling DocumentDB + parent: dc-documentdb-guides + weight: 175 +menu_name: docs_{{ .version }} +--- diff --git a/docs/guides/documentdb/scaling/horizontal-scaling/_index.md b/docs/guides/documentdb/scaling/horizontal-scaling/_index.md new file mode 100644 index 000000000..1dc14cf9b --- /dev/null +++ b/docs/guides/documentdb/scaling/horizontal-scaling/_index.md @@ -0,0 +1,10 @@ +--- +title: Horizontal Scaling +menu: + docs_{{ .version }}: + identifier: guides-documentdb-scaling-horizontal + name: Horizontal Scaling + parent: guides-documentdb-scaling + weight: 10 +menu_name: docs_{{ .version }} +--- diff --git a/docs/guides/documentdb/scaling/horizontal-scaling/horizontal-scaling.md b/docs/guides/documentdb/scaling/horizontal-scaling/horizontal-scaling.md new file mode 100644 index 000000000..a9eabe3a8 --- /dev/null +++ b/docs/guides/documentdb/scaling/horizontal-scaling/horizontal-scaling.md @@ -0,0 +1,178 @@ +--- +title: Horizontal Scaling DocumentDB +menu: + docs_{{ .version }}: + identifier: 
guides-documentdb-scaling-horizontal-details + name: Horizontal Scaling + parent: guides-documentdb-scaling-horizontal + weight: 10 +menu_name: docs_{{ .version }} +section_menu_id: guides +--- + +> New to KubeDB? Please start [here](/docs/README.md). + +# Horizontal Scaling of a DocumentDB Cluster + +Horizontal scaling changes the **number of replicas** in a `DocumentDB` cluster. KubeDB drives +this through a `DocumentDBOpsRequest` of type `HorizontalScaling`. Because a DocumentDB cluster +forms a Raft group (managed by the `documentdb-coordinator` container in each pod), scaling is +not just a matter of adding or removing Kubernetes pods — the operator also **adds or removes +Raft members** so the consensus group always reflects the live set of replicas. + +> Horizontal scaling applies to the clustered topology only; a standalone (`replicas: 1`) +> DocumentDB has no replicas to scale. + +## Before You Begin + +- You need a Kubernetes cluster and the `kubectl` CLI configured to talk to it. +- Install KubeDB following the steps [here](/docs/setup/README.md). +- This tutorial uses a namespace called `demo` (`kubectl create ns demo`). +- Deploy a 3-replica `DocumentDB` cluster (`documentdb-cls-sample`) and wait for it to become + `Ready` before proceeding. + +> Note: YAML files used in this tutorial are stored in [docs/examples/documentdb](https://github.com/kubedb/docs/tree/{{< param "info.version" >}}/docs/examples/documentdb) folder in GitHub repository [kubedb/docs](https://github.com/kubedb/docs). 
+ +## Starting point: 3 replicas + +```bash +$ kubectl get pods -n demo -l app.kubernetes.io/instance=documentdb-cls-sample -L kubedb.com/role +NAME READY STATUS RESTARTS AGE ROLE +documentdb-cls-sample-0 2/2 Running 0 97s standby +documentdb-cls-sample-1 2/2 Running 0 3m22s primary +documentdb-cls-sample-2 2/2 Running 0 2m32s standby + +$ kubectl get docdb -n demo documentdb-cls-sample -o jsonpath='{.spec.replicas}' +3 +``` + +## Scale up: 3 → 5 + +```yaml +apiVersion: ops.kubedb.com/v1alpha1 +kind: DocumentDBOpsRequest +metadata: + name: documentdb-cls-hscale-up + namespace: demo +spec: + type: HorizontalScaling + databaseRef: + name: documentdb-cls-sample + horizontalScaling: + replicas: 5 +``` + +```bash +$ kubectl apply -f cluster-hscale-up.yaml +documentdbopsrequest.ops.kubedb.com/documentdb-cls-hscale-up created + +$ kubectl get dcops -n demo documentdb-cls-hscale-up +NAME TYPE STATUS AGE +documentdb-cls-hscale-up HorizontalScaling Successful 3m31s +``` + +Two new pods are provisioned (`-3`, `-4`) and **joined to the Raft group** as standbys. 
The +status conditions show the new members being added via the coordinator: + +```bash +$ kubectl get dcops -n demo documentdb-cls-hscale-up \ + -o jsonpath='{range .status.conditions[*]}{.type}={.status} :: {.message}{"\n"}{end}' +Running=True :: DocumentDB ops request is horizontally scaling database +GetCurrentLeader--documentdb-cls-sample-0=True :: get current leader; ConditionStatus:True +AddRaftNode--documentdb-cls-sample-3=True :: add raft node; ConditionStatus:True; PodName:documentdb-cls-sample-3 +PatchPetset=True :: patch petset; ConditionStatus:True +AddRaftNode--documentdb-cls-sample-4=True :: add raft node; ConditionStatus:True; PodName:documentdb-cls-sample-4 +HorizontalScaleUp=True :: Successfully Horizontally Scaled Up +Successful=True :: Successfully Horizontally Scaled DocumentDB +``` + +The cluster now runs five pods — one primary and four standbys: + +```bash +$ kubectl get pods -n demo -l app.kubernetes.io/instance=documentdb-cls-sample -L kubedb.com/role +NAME READY STATUS RESTARTS AGE ROLE +documentdb-cls-sample-0 2/2 Running 0 5m7s standby +documentdb-cls-sample-1 2/2 Running 0 6m52s primary +documentdb-cls-sample-2 2/2 Running 0 6m2s standby +documentdb-cls-sample-3 2/2 Running 0 2m56s standby +documentdb-cls-sample-4 2/2 Running 0 106s standby +``` + +## Scale down: 5 → 3 + +```yaml +apiVersion: ops.kubedb.com/v1alpha1 +kind: DocumentDBOpsRequest +metadata: + name: documentdb-cls-hscale-down + namespace: demo +spec: + type: HorizontalScaling + databaseRef: + name: documentdb-cls-sample + horizontalScaling: + replicas: 3 +``` + +```bash +$ kubectl apply -f cluster-hscale-down.yaml +documentdbopsrequest.ops.kubedb.com/documentdb-cls-hscale-down created + +$ kubectl get dcops -n demo documentdb-cls-hscale-down +NAME TYPE STATUS AGE +documentdb-cls-hscale-down HorizontalScaling Successful 2m43s +``` + +On the way down the operator first **removes the surplus Raft members**, then deletes their +pods and their PVCs, so no orphaned storage is 
left behind: + +```bash +$ kubectl get dcops -n demo documentdb-cls-hscale-down \ + -o jsonpath='{range .status.conditions[*]}{.type}={.status} :: {.message}{"\n"}{end}' +Running=True :: DocumentDB ops request is horizontally scaling database +GetCurrentRaftLeader--documentdb-cls-sample-0=True :: get current raft leader; ConditionStatus:True +RemoveRaftNode--documentdb-cls-sample-4=True :: remove raft node; ConditionStatus:True; PodName:documentdb-cls-sample-4 +PatchPetset=True :: patch petset; ConditionStatus:True +DeletePvc--documentdb-cls-sample-4=True :: delete pvc; ConditionStatus:True; PodName:documentdb-cls-sample-4 +RemoveRaftNode--documentdb-cls-sample-3=True :: remove raft node; ConditionStatus:True; PodName:documentdb-cls-sample-3 +DeletePvc--documentdb-cls-sample-3=True :: delete pvc; ConditionStatus:True; PodName:documentdb-cls-sample-3 +HorizontalScaleDown=True :: Successfully Horizontally Scaled Down +Successful=True :: Successfully Horizontally Scaled DocumentDB +``` + +Back to the original three-pod topology, still fully serviceable: + +```bash +$ kubectl get pods -n demo -l app.kubernetes.io/instance=documentdb-cls-sample -L kubedb.com/role +NAME READY STATUS RESTARTS AGE ROLE +documentdb-cls-sample-0 2/2 Running 0 8m1s standby +documentdb-cls-sample-1 2/2 Running 0 9m46s primary +documentdb-cls-sample-2 2/2 Running 0 8m56s standby + +$ PASS=$(kubectl get secret -n demo documentdb-cls-sample-auth -o jsonpath='{.data.password}' | base64 -d) +$ kubectl exec -n demo documentdb-cls-sample-1 -c documentdb -- \ + mongosh "mongodb://default_user:${PASS}@localhost:10260/?tls=true&tlsAllowInvalidCertificates=true" \ + --quiet --eval 'db.runCommand({ ping: 1 })' +{ ok: 1 } +``` + +## Key takeaway + +Raft membership grows and shrinks **in lockstep** with the replica count. 
On scale-up the +coordinator runs `AddRaftNode` for each new pod before it counts toward the quorum; on +scale-down it runs `RemoveRaftNode` before the pod is removed and then deletes the pod's PVC. +In the runs above, the pod labeled `primary` (`documentdb-cls-sample-1`) kept its role through +both operations, and the MongoDB endpoint answered a `ping` once the cluster was back at three +replicas. + +## Cleaning Up + +```bash +kubectl delete documentdbopsrequest -n demo documentdb-cls-hscale-up documentdb-cls-hscale-down +kubectl delete documentdb -n demo documentdb-cls-sample +kubectl delete ns demo +``` + +## Next Steps + +- [Vertical scaling](/docs/guides/documentdb/scaling/vertical-scaling/) of a DocumentDB cluster. +- [Compute autoscaling](/docs/guides/documentdb/autoscaler/compute/) of a DocumentDB cluster. diff --git a/docs/guides/documentdb/scaling/vertical-scaling/_index.md b/docs/guides/documentdb/scaling/vertical-scaling/_index.md new file mode 100644 index 000000000..110b56970 --- /dev/null +++ b/docs/guides/documentdb/scaling/vertical-scaling/_index.md @@ -0,0 +1,10 @@ +--- +title: Vertical Scaling +menu: + docs_{{ .version }}: + identifier: guides-documentdb-scaling-vertical + name: Vertical Scaling + parent: guides-documentdb-scaling + weight: 20 +menu_name: docs_{{ .version }} +--- diff --git a/docs/guides/documentdb/scaling/vertical-scaling/vertical-scaling.md b/docs/guides/documentdb/scaling/vertical-scaling/vertical-scaling.md new file mode 100644 index 000000000..0f3235801 --- /dev/null +++ b/docs/guides/documentdb/scaling/vertical-scaling/vertical-scaling.md @@ -0,0 +1,157 @@ +--- +title: Vertical Scaling DocumentDB +menu: + docs_{{ .version }}: + identifier: guides-documentdb-scaling-vertical-details + name: Vertical Scaling + parent: guides-documentdb-scaling-vertical + weight: 10 +menu_name: docs_{{ .version }} +section_menu_id: guides +--- + +> New to KubeDB? Please start [here](/docs/README.md).
+ +# Vertical Scaling of a DocumentDB Cluster + +Vertical scaling changes the **CPU and memory** allocated to the containers in a `DocumentDB` +database. A DocumentDB pod runs two containers that can be sized independently: + +- `documentdb` — the database engine (MongoDB wire protocol over internal PostgreSQL). +- `documentdb-coordinator` — the Raft member that handles leader election and membership. + +A `DocumentDBOpsRequest` of type `VerticalScaling` lets you set new resource requests/limits for +either or both. The operator rolls the change out pod by pod (evicting standbys first, the +primary last) so the cluster stays available. + +## Before You Begin + +- You need a Kubernetes cluster and the `kubectl` CLI configured to talk to it. +- Install KubeDB following the steps [here](/docs/setup/README.md). +- This tutorial uses a namespace called `demo` (`kubectl create ns demo`). +- Deploy a `DocumentDB` cluster (`documentdb-cls-sample`) and wait for it to become `Ready`. + +> Note: YAML files used in this tutorial are stored in [docs/examples/documentdb](https://github.com/kubedb/docs/tree/{{< param "info.version" >}}/docs/examples/documentdb) folder in GitHub repository [kubedb/docs](https://github.com/kubedb/docs). 
+ +## Resources before + +```bash +$ kubectl get docdb -n demo documentdb-cls-sample \ + -o jsonpath='{range .spec.podTemplate.spec.containers[*]}{.name}: requests={.resources.requests} limits={.resources.limits}{"\n"}{end}' +documentdb: requests={"cpu":"500m","memory":"2Gi"} limits={"memory":"2Gi"} +documentdb-coordinator: requests={"cpu":"200m","memory":"256Mi"} limits={"memory":"256Mi"} +``` + +## Create the VerticalScaling OpsRequest + +This request bumps the `documentdb` engine and, at the same time, *lowers* the coordinator's CPU +request — both containers are addressed in one OpsRequest: + +```yaml +apiVersion: ops.kubedb.com/v1alpha1 +kind: DocumentDBOpsRequest +metadata: + name: documentdb-cls-vscale + namespace: demo +spec: + type: VerticalScaling + databaseRef: + name: documentdb-cls-sample + verticalScaling: + documentdb: + resources: + requests: + cpu: 600m + memory: 2.5Gi + limits: + cpu: "1" + memory: 2.5Gi + coordinator: + resources: + requests: + cpu: 100m + memory: 256Mi +``` + +```bash +$ kubectl apply -f cluster-vertical-scaling.yaml +documentdbopsrequest.ops.kubedb.com/documentdb-cls-vscale created + +$ kubectl get dcops -n demo documentdb-cls-vscale +NAME TYPE STATUS AGE +documentdb-cls-vscale VerticalScaling Successful 3m33s +``` + +The status conditions show the PetSet being patched and each pod being evicted and re-checked +for readiness before the next is touched: + +```bash +$ kubectl get dcops -n demo documentdb-cls-vscale \ + -o jsonpath='{range .status.conditions[*]}{.type}={.status} :: {.message}{"\n"}{end}' +Running=True :: Vertical Scaling is in progress +UpdatePetSets=True :: Successfully updated petsets resources +EvictPod=True :: evict pod; ConditionStatus:True +CheckPodReady=True :: check pod ready; ConditionStatus:True +CheckReplicaFunc=True :: check replica func; ConditionStatus:True +VerticalScale=True :: VerticalScaleSucceeded +RestartReadReplicas=True :: Successfully Restarted Read Replicas +Successful=True :: Successfully 
Vertically Scaled Database +``` + +## Resources after + +Both containers reflect the new sizing (note `2.5Gi` is normalized to its binary equivalent +`2560Mi`, and the `documentdb` container now carries a CPU limit of `1`): + +```bash +$ kubectl get docdb -n demo documentdb-cls-sample \ + -o jsonpath='{range .spec.podTemplate.spec.containers[*]}{.name}: requests={.resources.requests} limits={.resources.limits}{"\n"}{end}' +documentdb: requests={"cpu":"600m","memory":"2560Mi"} limits={"cpu":"1","memory":"2560Mi"} +documentdb-coordinator: requests={"cpu":"100m","memory":"256Mi"} limits={"memory":"256Mi"} +``` + +The live pod spec matches — the change propagated all the way to the running containers: + +```bash +$ kubectl get pod -n demo documentdb-cls-sample-0 \ + -o jsonpath='{range .spec.containers[*]}{.name}: req={.resources.requests} lim={.resources.limits}{"\n"}{end}' +documentdb: req={"cpu":"600m","memory":"2560Mi"} lim={"cpu":"1","memory":"2560Mi"} +documentdb-coordinator: req={"cpu":"100m","memory":"256Mi"} lim={"memory":"256Mi"} +``` + +The cluster remains healthy and accepts MongoDB traffic after the rollout: + +```bash +$ PASS=$(kubectl get secret -n demo documentdb-cls-sample-auth -o jsonpath='{.data.password}' | base64 -d) +$ kubectl exec -n demo documentdb-cls-sample-0 -c documentdb -- \ + mongosh "mongodb://default_user:${PASS}@localhost:10260/?tls=true&tlsAllowInvalidCertificates=true" \ + --quiet --eval 'db.runCommand({ ping: 1 })' +{ ok: 1 } +``` + +## Standalone + +The same `DocumentDBOpsRequest` works for a standalone (`replicas: 1`) instance — point +`spec.databaseRef.name` at the standalone database (`documentdb-sa-sample`) and address the +`documentdb` (and optionally `coordinator`) container under `spec.verticalScaling`. 
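+
+As a rough illustration of the paragraph above, a standalone request might look like the
+following. This is an untested sketch: the request name `documentdb-sa-vscale` is hypothetical,
+the resource values are copied from the cluster example, and it assumes the standalone database
+`documentdb-sa-sample` has reached `Ready`.
+
+```yaml
+apiVersion: ops.kubedb.com/v1alpha1
+kind: DocumentDBOpsRequest
+metadata:
+  name: documentdb-sa-vscale   # hypothetical name, chosen for this sketch
+  namespace: demo
+spec:
+  type: VerticalScaling
+  databaseRef:
+    name: documentdb-sa-sample # the standalone database
+  verticalScaling:
+    documentdb:
+      resources:
+        requests:
+          cpu: 600m
+          memory: 2.5Gi
+        limits:
+          cpu: "1"
+          memory: 2.5Gi
+    # a `coordinator` block is optional and is omitted in this sketch
+```
+
+The example manifest `standalone-vertical-scaling.yaml` added by this patch is the authoritative
+version; its contents are not shown in this guide and may differ from the sketch.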
+ +> [!NOTE] +> On the build used to capture this guide (`pg17-0.109.0`), standalone instances did not finish +> bootstrapping (the standalone PetSet omits the `documentdb-coordinator` sidecar, so the +> internal PostgreSQL is never initialized and the database never reaches `Ready`). Because +> OpsRequests are admitted only against a `Ready` database, the standalone variant could not be +> exercised live; the cluster procedure above applies verbatim once a standalone instance is +> healthy. + +## Cleaning Up + +```bash +kubectl delete documentdbopsrequest -n demo documentdb-cls-vscale +kubectl delete documentdb -n demo documentdb-cls-sample +kubectl delete ns demo +``` + +## Next Steps + +- [Horizontal scaling](/docs/guides/documentdb/scaling/horizontal-scaling/) of a DocumentDB cluster. +- [Compute autoscaling](/docs/guides/documentdb/autoscaler/compute/) of a DocumentDB cluster. diff --git a/docs/guides/documentdb/storage-migration/_index.md b/docs/guides/documentdb/storage-migration/_index.md new file mode 100644 index 000000000..2ee73485c --- /dev/null +++ b/docs/guides/documentdb/storage-migration/_index.md @@ -0,0 +1,10 @@ +--- +title: Storage Migration +menu: + docs_{{ .version }}: + identifier: dc-storage-migration + name: Storage Migration + parent: dc-documentdb-guides + weight: 210 +menu_name: docs_{{ .version }} +--- diff --git a/docs/guides/documentdb/storage-migration/storage-migration.md b/docs/guides/documentdb/storage-migration/storage-migration.md new file mode 100644 index 000000000..e574343c7 --- /dev/null +++ b/docs/guides/documentdb/storage-migration/storage-migration.md @@ -0,0 +1,170 @@ +--- +title: Storage Migration DocumentDB +menu: + docs_{{ .version }}: + identifier: dc-storage-migration-details + name: Storage Migration + parent: dc-storage-migration + weight: 10 +menu_name: docs_{{ .version }} +section_menu_id: guides +--- + +> New to KubeDB? Please start [here](/docs/README.md). 
+ +# Storage Migration of DocumentDB + +`StorageMigration` moves a `DocumentDB` database from one StorageClass to another without losing +data, using a `DocumentDBOpsRequest` of type `StorageMigration`. This is the tool you reach for +when you need to change the storage backend of a running database — for example moving from one +CSI provisioner to another. This guide migrates a 3-node cluster from `longhorn` to +`standard-custom`. + +Unlike provisioning a fresh replica (which seeds standbys with `pg_basebackup`), storage +migration performs a **block-level copy of each existing PVC** into a new PVC on the target +StorageClass, one pod at a time, then re-points the pod at the migrated volume. The data +directory is copied verbatim, so the migrated replica does not have to re-stream a base backup. + +## Before You Begin + +- You need a Kubernetes cluster and the `kubectl` CLI configured to talk to it. +- Install KubeDB following the steps [here](/docs/setup/README.md). +- This tutorial uses a namespace called `demo` (`kubectl create ns demo`). +- Deploy a `DocumentDB` cluster (`documentdb-cls-sample`) and wait for it to become `Ready`. +- Confirm both the source and target StorageClasses exist (`kubectl get sc`). + +> Note: YAML files used in this tutorial are stored in [docs/examples/documentdb](https://github.com/kubedb/docs/tree/{{< param "info.version" >}}/docs/examples/documentdb) folder in GitHub repository [kubedb/docs](https://github.com/kubedb/docs). 
+ +## PVCs before + +The cluster is on `longhorn`, `10Gi` per replica: + +```bash +$ kubectl get pvc -n demo -l app.kubernetes.io/instance=documentdb-cls-sample \ + -o custom-columns=NAME:.metadata.name,SIZE:.status.capacity.storage,SC:.spec.storageClassName,STATUS:.status.phase +NAME SIZE SC STATUS +data-documentdb-cls-sample-0 10Gi longhorn Bound +data-documentdb-cls-sample-1 10Gi longhorn Bound +data-documentdb-cls-sample-2 10Gi longhorn Bound + +$ kubectl get docdb -n demo documentdb-cls-sample -o jsonpath='{.spec.storage.storageClassName}' +longhorn +``` + +## Create the StorageMigration OpsRequest + +`migration.storageClassName` is the target; `oldPVReclaimPolicy: Delete` cleans up the source +PersistentVolumes once their data has been copied: + +```yaml +apiVersion: ops.kubedb.com/v1alpha1 +kind: DocumentDBOpsRequest +metadata: + name: documentdb-cls-storage-migration + namespace: demo +spec: + type: StorageMigration + databaseRef: + name: documentdb-cls-sample + timeout: 10m + migration: + storageClassName: standard-custom + oldPVReclaimPolicy: Delete +``` + +```bash +$ kubectl apply -f cluster-storage-migration.yaml +documentdbopsrequest.ops.kubedb.com/documentdb-cls-storage-migration created + +$ kubectl get dcops -n demo documentdb-cls-storage-migration +NAME TYPE STATUS AGE +documentdb-cls-storage-migration StorageMigration Successful 8m13s +``` + +## What happened + +The operator migrates **standbys first and the primary last**, switching leadership off the +primary just before its turn so the cluster stays writable throughout. For each pod it creates a +new PVC on the target StorageClass, mounts it in a temporary helper pod, runs a `migrator` job +to copy the data directory, deletes the old PVC, binds the new volume under the original PVC +name, recreates the pod, and waits for it to be ready.
The condition stream (trimmed) captures the loop: + +```bash +$ kubectl get dcops -n demo documentdb-cls-storage-migration \ + -o jsonpath='{range .status.conditions[*]}{.type}={.status} :: {.message}{"\n"}{end}' +Running=True :: StorageClass migration is in progress +PetSetDeleted--documentdb-cls-sample=True :: pet set deleted +GetStorageClass=True :: get storage class +# --- per replica (sample-0 shown) --- +PVCCreated--data-migrate-documentdb-cls-sample-0=True :: p v c created +PodCreated--pvcmounter-documentdb-cls-sample-0=True :: pod created +JobCreated--migrator-documentdb-cls-sample-0=True :: job created +JobDeleted--migrator-documentdb-cls-sample-0=True :: job deleted +PVCDeleted--data-documentdb-cls-sample-0=True :: p v c deleted +PVCCreated--data-documentdb-cls-sample-0=True :: p v c created +PodCreated--documentdb-cls-sample-0=True :: pod created +PodReady--documentdb-cls-sample-0=True :: pod ready +PodMigrationCompleted-documentdb-cls-sample-0=True :: PVC Migration Completed for documentdb-cls-sample-0 +# --- leadership switched before migrating the last (primary) pod --- +SwitchPrimary--documentdb-cls-sample-2=True :: Successfully switched primary from documentdb-cls-sample-2 to documentdb-cls-sample-0 before its migration +PodMigrationCompleted-documentdb-cls-sample-2=True :: PVC Migration Completed for documentdb-cls-sample-2 +StorageMigration=True :: Successfully migrated StorageClass for DocumentDB Database +Successful=True :: Successfully Migrated DocumentDB StorageClass +UnsetRaftKeyOpsRequestProgressing=True :: Successfully Unset Raft Key OpsRequestProgressing +``` + +## PVCs after + +All three data volumes are now backed by `standard-custom`, keeping their `10Gi` size and +original PVC names, and the `DocumentDB` object reflects the new StorageClass: + +```bash +$ kubectl get pvc -n demo -l app.kubernetes.io/instance=documentdb-cls-sample \ + -o 
custom-columns=NAME:.metadata.name,SIZE:.status.capacity.storage,SC:.spec.storageClassName,STATUS:.status.phase +NAME SIZE SC STATUS +data-documentdb-cls-sample-0 10Gi standard-custom Bound +data-documentdb-cls-sample-1 10Gi standard-custom Bound +data-documentdb-cls-sample-2 10Gi standard-custom Bound + +$ kubectl get docdb -n demo documentdb-cls-sample -o jsonpath='sc={.spec.storage.storageClassName} phase={.status.phase}' +sc=standard-custom phase=Ready +``` + +The cluster is `Ready`, all pods `2/2`, and previously written data survived the migration +intact: + +```bash +$ PASS=$(kubectl get secret -n demo documentdb-cls-sample-auth -o jsonpath='{.data.password}' | base64 -d) +$ kubectl exec -n demo documentdb-cls-sample-0 -c documentdb -- \ + mongosh "mongodb://default_user:${PASS}@localhost:10260/?tls=true&tlsAllowInvalidCertificates=true" \ + --quiet --eval 'printjson(db.runCommand({ping:1}));' +{ ok: 1 } +``` + +> [!NOTE] +> In the test environment, migrating to `standard-custom` (backed by the `local-path` +> provisioner) succeeded and the cluster returned to `Ready`, even though a *freshly provisioned* +> multi-replica DocumentDB on `local-path` can fail to bring up standbys (because `pg_basebackup` +> trips a mount check on that provisioner). Storage migration avoids that path entirely — it +> copies the already-initialized data directory block-for-block rather than re-seeding the +> standby. + +## Standalone + +The same `DocumentDBOpsRequest` applies to a standalone (`replicas: 1`) instance — point +`spec.databaseRef.name` at `documentdb-sa-sample`. On this build standalone instances did not +finish bootstrapping (see the [Restart](/docs/guides/documentdb/restart/) guide), so the +standalone migration could not be exercised live. 
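+
+For reference, a standalone request could be sketched as below. It mirrors the cluster manifest
+shown earlier with only the names changed. The request name `documentdb-sa-storage-migration` is
+hypothetical, the target StorageClass is assumed to be the same `standard-custom`, and the
+sketch has not been run against a live standalone instance.
+
+```yaml
+apiVersion: ops.kubedb.com/v1alpha1
+kind: DocumentDBOpsRequest
+metadata:
+  name: documentdb-sa-storage-migration # hypothetical name, chosen for this sketch
+  namespace: demo
+spec:
+  type: StorageMigration
+  databaseRef:
+    name: documentdb-sa-sample          # the standalone database
+  timeout: 10m
+  migration:
+    storageClassName: standard-custom   # assumed target, as in the cluster example
+    oldPVReclaimPolicy: Delete
+```
+
+The example manifest `standalone-storage-migration.yaml` added by this patch is the
+authoritative version; its contents are not shown in this guide and may differ from the sketch.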
+ +## Cleaning Up + +```bash +kubectl delete documentdbopsrequest -n demo documentdb-cls-storage-migration +kubectl delete documentdb -n demo documentdb-cls-sample +kubectl delete ns demo +``` + +## Next Steps + +- [Volume expansion](/docs/guides/documentdb/volume-expansion/) of a DocumentDB cluster. +- [Storage autoscaling](/docs/guides/documentdb/autoscaler/storage/) of a DocumentDB cluster. diff --git a/docs/guides/documentdb/volume-expansion/_index.md b/docs/guides/documentdb/volume-expansion/_index.md new file mode 100644 index 000000000..cdd0235b5 --- /dev/null +++ b/docs/guides/documentdb/volume-expansion/_index.md @@ -0,0 +1,10 @@ +--- +title: Volume Expansion +menu: + docs_{{ .version }}: + identifier: dc-volume-expansion + name: Volume Expansion + parent: dc-documentdb-guides + weight: 200 +menu_name: docs_{{ .version }} +--- diff --git a/docs/guides/documentdb/volume-expansion/volume-expansion.md b/docs/guides/documentdb/volume-expansion/volume-expansion.md new file mode 100644 index 000000000..5203712a9 --- /dev/null +++ b/docs/guides/documentdb/volume-expansion/volume-expansion.md @@ -0,0 +1,143 @@ +--- +title: Volume Expansion DocumentDB +menu: + docs_{{ .version }}: + identifier: dc-volume-expansion-details + name: Volume Expansion + parent: dc-volume-expansion + weight: 10 +menu_name: docs_{{ .version }} +section_menu_id: guides +--- + +> New to KubeDB? Please start [here](/docs/README.md). + +# Volume Expansion of DocumentDB + +When a `DocumentDB` database is provisioned on an **expandable** StorageClass you can grow its +data volumes in place with a `DocumentDBOpsRequest` of type `VolumeExpansion` — no +backup/restore and no manual PVC editing required. This guide expands a 3-node cluster from +`5Gi` to `10Gi` per replica. + +> [!IMPORTANT] +> The StorageClass must allow volume expansion (`allowVolumeExpansion: true`). This guide uses +> `longhorn`, which is expandable. 
The default `local-path` StorageClass on many clusters is +> **not** expandable — check with `kubectl get sc` and use an expandable class. + +## Before You Begin + +- You need a Kubernetes cluster and the `kubectl` CLI configured to talk to it. +- Install KubeDB following the steps [here](/docs/setup/README.md). +- This tutorial uses a namespace called `demo` (`kubectl create ns demo`). +- Deploy a `DocumentDB` cluster (`documentdb-cls-sample`) on an **expandable** StorageClass + (`longhorn`) and wait for it to become `Ready`. + +> Note: YAML files used in this tutorial are stored in [docs/examples/documentdb](https://github.com/kubedb/docs/tree/{{< param "info.version" >}}/docs/examples/documentdb) folder in GitHub repository [kubedb/docs](https://github.com/kubedb/docs). + +## PVCs before + +```bash +$ kubectl get pvc -n demo -l app.kubernetes.io/instance=documentdb-cls-sample \ + -o custom-columns=NAME:.metadata.name,SIZE:.status.capacity.storage,SC:.spec.storageClassName,STATUS:.status.phase +NAME SIZE SC STATUS +data-documentdb-cls-sample-0 5Gi longhorn Bound +data-documentdb-cls-sample-1 5Gi longhorn Bound +data-documentdb-cls-sample-2 5Gi longhorn Bound +``` + +## Create the VolumeExpansion OpsRequest + +`mode: Offline` tells the operator to recreate the pods around the resize (the PetSet is deleted +and recreated so the larger PVCs are picked up cleanly). 
The `documentdb` field carries the new +target size: + +```yaml +apiVersion: ops.kubedb.com/v1alpha1 +kind: DocumentDBOpsRequest +metadata: + name: documentdb-cls-volume-expansion + namespace: demo +spec: + type: VolumeExpansion + databaseRef: + name: documentdb-cls-sample + volumeExpansion: + mode: Offline + documentdb: 10Gi +``` + +```bash +$ kubectl apply -f cluster-volume-expansion.yaml +documentdbopsrequest.ops.kubedb.com/documentdb-cls-volume-expansion created + +$ kubectl get dcops -n demo documentdb-cls-volume-expansion +NAME TYPE STATUS AGE +documentdb-cls-volume-expansion VolumeExpansion Successful 4m49s +``` + +The status conditions walk through the offline-expansion mechanics: the operator deletes the +PetSet, then for each replica it deletes the pod, expands the PVC, recreates the pod, and waits +for it to become ready, before finally recreating the PetSet: + +```bash +$ kubectl get dcops -n demo documentdb-cls-volume-expansion \ + -o jsonpath='{range .status.conditions[*]}{.type}={.status} :: {.message}{"\n"}{end}' +Running=True :: Volume Expansion is in progress +DeletePetset=True :: delete petset; ConditionStatus:True +IsPvcData-documentdb-cls-sample-0Updated=True :: is pvc data-documentdb-cls-sample-0 updated; ConditionStatus:True +CreatePod=True :: create pod; ConditionStatus:True +IsPodReady=True :: is pod ready; ConditionStatus:True +IsPvcData-documentdb-cls-sample-2Updated=True :: is pvc data-documentdb-cls-sample-2 updated; ConditionStatus:True +IsPvcData-documentdb-cls-sample-1Updated=True :: is pvc data-documentdb-cls-sample-1 updated; ConditionStatus:True +VolumeExpansion=True :: Offline Volume Expansion performed successfully in DocumentDB pods +ReadyPetSets=True :: PetSet is recreated +Successful=True :: Successfully Expanded Volume. 
+``` + +## PVCs after + +All three data volumes are now `10Gi`, and the `DocumentDB` object's storage request is updated +to match: + +```bash +$ kubectl get pvc -n demo -l app.kubernetes.io/instance=documentdb-cls-sample \ + -o custom-columns=NAME:.metadata.name,SIZE:.status.capacity.storage,SC:.spec.storageClassName,STATUS:.status.phase +NAME SIZE SC STATUS +data-documentdb-cls-sample-0 10Gi longhorn Bound +data-documentdb-cls-sample-1 10Gi longhorn Bound +data-documentdb-cls-sample-2 10Gi longhorn Bound + +$ kubectl get docdb -n demo documentdb-cls-sample -o jsonpath='{.spec.storage.resources.requests.storage}' +10Gi +``` + +The cluster is healthy and serving traffic after the expansion: + +```bash +$ PASS=$(kubectl get secret -n demo documentdb-cls-sample-auth -o jsonpath='{.data.password}' | base64 -d) +$ kubectl exec -n demo documentdb-cls-sample-0 -c documentdb -- \ + mongosh "mongodb://default_user:${PASS}@localhost:10260/?tls=true&tlsAllowInvalidCertificates=true" \ + --quiet --eval 'db.runCommand({ ping: 1 })' +{ ok: 1 } +``` + +## Standalone + +The same `DocumentDBOpsRequest` works for a standalone (`replicas: 1`) instance on an expandable +StorageClass — point `spec.databaseRef.name` at `documentdb-sa-sample`. On this build standalone +instances did not finish bootstrapping (see the [Restart](/docs/guides/documentdb/restart/) +guide), so the standalone expansion could not be exercised live; the cluster procedure applies +once a standalone instance is healthy. + +## Cleaning Up + +```bash +kubectl delete documentdbopsrequest -n demo documentdb-cls-volume-expansion +kubectl delete documentdb -n demo documentdb-cls-sample +kubectl delete ns demo +``` + +## Next Steps + +- [Storage migration](/docs/guides/documentdb/storage-migration/) to a different StorageClass. +- [Storage autoscaling](/docs/guides/documentdb/autoscaler/storage/) of a DocumentDB cluster.