Add MariaDB distributed disaster recovery (DC-DR) guide #939

Open
tamalsaha wants to merge 1 commit into master from pr/mariadb-dc-dr-docs
@tamalsaha tamalsaha commented Jun 30, 2026

Member

Summary

Adds a new Disaster Recovery (DC-DR) subsection under the MariaDB distributed guide, documenting the cross-data-center disaster recovery design for distributed MariaDB.

New section docs/guides/mariadb/distributed/disaster-recovery/:

  • overview/index.md (weight 10): architecture and concepts. Covers why a stretched Galera cluster is fragile across DCs, the self-contained Galera cluster per Member DC with strictly intra-DC quorum, the dr-controlplane three-site etcd quorum and the single primary-dc Lease, the projected marker ConfigMap fail-closed fence (namespace dc-failover, 30s TTL), the net-new leader-to-leader asynchronous GTID cross-DC link and its required Galera-to-Galera GTID config, role labeling so that only the active DC resolves on the primary Service, the Arbiter DC and per-DC garbd, planned switchover (zero RPO) vs. unplanned failover vs. SST re-seed failback, the cross-DC lag guard, and status.disasterRecovery. Includes a text architecture diagram (no image files).
  • setup/index.md (weight 20): how-to that enables DC-DR, applies a PlacementPolicy with failoverPolicy (mode TwoDC, trigger scope Global) and Member/Arbiter distributionRules, creates the distributed MariaDB CR, and verifies exactly one writable DC (Lease, status, role labels, standby read-only fence, write rejection). Also shows the planned switchover annotation.
  • setup/yamls/: example PlacementPolicy (two Member DCs plus one Arbiter DC) and a distributed MariaDB manifest, mirroring the design's key patterns.
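The fail-closed fence mentioned in the overview bullet can be illustrated with a minimal Python sketch. Only the 30s TTL and the idea of a marker naming the primary DC come from the description above; the `Marker` type, its field names, and `is_writable` are hypothetical and are not the KubeDB implementation.

```python
from dataclasses import dataclass
from typing import Optional

MARKER_TTL_SECONDS = 30.0  # TTL stated in the PR description


@dataclass(frozen=True)
class Marker:
    """Hypothetical view of the projected primary-dc marker as seen by one DC."""
    primary_dc: str    # DC currently named as primary (assumed field)
    renewed_at: float  # time of the last successful projection, in seconds (assumed field)


def is_writable(local_dc: str, marker: Optional[Marker], now: float) -> bool:
    """Fail closed: allow writes only with a fresh marker that names this DC."""
    if marker is None:
        return False  # no marker at all: stay read-only
    if now - marker.renewed_at > MARKER_TTL_SECONDS:
        return False  # stale marker: the Lease may have moved elsewhere
    return marker.primary_dc == local_dc
```

The point of the design is that every uncertain case (missing marker, expired marker, marker naming another DC) resolves to read-only, so a partitioned DC cannot keep accepting writes on stale information.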

The section sorts at weight 40, between the distributed overview (10) and autoscaler (47).

All pages use the standard Hugo docs_{{ .version }} menu templating and the New to KubeDB? lead line. No em-dashes. Internal links use root-absolute /docs/... paths.

Summary by CodeRabbit

  • Documentation
    • Added a new Disaster Recovery (DC-DR) guide section for MariaDB.
    • Published an overview explaining cross-data-center failover, replication, fencing, and recovery behavior.
    • Added a setup guide with step-by-step DC-DR configuration and verification instructions.
    • Included example manifests for a distributed MariaDB deployment and placement policy.

Signed-off-by: Tamal Saha <tamal@appscode.com>

coderabbitai Bot commented Jun 30, 2026


📝 Walkthrough

This PR adds new documentation for MariaDB Cross Data Center Disaster Recovery (DC-DR) under the distributed MariaDB guides, including an index page, an architecture overview, a step-by-step setup guide, and example YAML manifests for a PlacementPolicy and a DC-DR-enabled MariaDB custom resource.

Changes

MariaDB DC-DR Documentation

| Layer / File(s) | Summary |
|---|---|
| Documentation index: docs/guides/mariadb/distributed/disaster-recovery/_index.md | Adds front-matter and menu entry for the DC-DR documentation section. |
| Architecture overview: docs/guides/mariadb/distributed/disaster-recovery/overview/index.md | Explains DC-DR architecture: intra-DC Galera quorum, dr-controlplane/Lease-based failover authority, GTID async cross-DC replication, fail-closed fencing, role labeling, Arbiter DC, operational flows (switchover/failover/failback), lag guard, status fields, and enablement annotation. |
| Setup guide: docs/guides/mariadb/distributed/disaster-recovery/setup/index.md | Walks through prerequisites, creating a PlacementPolicy, deploying a DC-DR MariaDB CR, verifying single-writable-DC behavior, triggering switchover, and cleanup. |
| Example manifests: docs/guides/mariadb/distributed/disaster-recovery/setup/yamls/mariadb.yaml, .../yamls/placement-policy.yaml | Adds sample MariaDB CR and PlacementPolicy YAML referenced by the setup guide. |

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant dr-controlplane
  participant ActiveDC as Active DC Galera
  participant StandbyDC as Standby DC Galera
  Client->>dr-controlplane: Switchover annotation
  dr-controlplane->>dr-controlplane: Update primary-dc Lease
  dr-controlplane->>ActiveDC: Quiesce writes, GTID catch-up
  ActiveDC->>StandbyDC: Async GTID replication
  dr-controlplane->>StandbyDC: Promote to Primary
  Client->>StandbyDC: Route writes to new active DC
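The planned switchover in the diagram can be modeled as a small state transition. This is a sketch under stated assumptions: `DC`, `Cluster`, and `planned_switchover` are hypothetical names, the GTID position is reduced to a single integer, and the ordering of the Lease update relative to the quiesce step is an assumption, since this page does not specify how the controller handles partial failure.

```python
from dataclasses import dataclass


@dataclass
class DC:
    name: str
    gtid_seq: int           # simplified stand-in for a MariaDB GTID position
    read_only: bool = True  # standby DCs are fenced read-only


@dataclass
class Cluster:
    lease_holder: str       # DC currently named by the primary-dc Lease
    dcs: dict[str, DC]


def planned_switchover(cluster: Cluster, target: str) -> None:
    """Move the writable role to `target` without losing writes (zero RPO)."""
    active = cluster.dcs[cluster.lease_holder]
    standby = cluster.dcs[target]

    # Quiesce first so the active DC's GTID position stops advancing.
    active.read_only = True

    # Refuse to promote a standby that has not caught up; the caller retries
    # once asynchronous replication has drained.
    if standby.gtid_seq < active.gtid_seq:
        raise RuntimeError(f"{target} is behind by {active.gtid_seq - standby.gtid_seq}")

    # Hand over the Lease, then open the new active DC for writes.
    cluster.lease_holder = target
    standby.read_only = False
```

Because the old active DC is made read-only before the catch-up check, a failed attempt leaves no DC writable rather than two, which matches the single-writable-DC goal described in the PR.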

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Poem

A rabbit hops 'cross data centers two,
Carrying GTIDs, fresh and true.
One Lease to hold, one DC to write,
Standbys stay quiet, fenced up tight.
Docs now bloom like clover fair —
Hop along, the failover's there! 🐇⚡

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly matches the main change: adding a MariaDB distributed disaster recovery guide. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (7)
docs/guides/mariadb/distributed/disaster-recovery/overview/index.md (1)

86-91: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add language tags to fenced code blocks.

Several fenced code blocks lack language identifiers, which triggers markdownlint warnings and can cause incorrect syntax highlighting:

  • Line 86: Use text for the ConfigMap structure pseudo-code.
  • Line 188: Use yaml for the annotation example.
  • Line 237: Use text for the ASCII architecture diagram.
  • Line 274: Use yaml for the annotation example.
📝 Proposed fixes
Line 86:
-```
+```text
 ConfigMap  primary-dc   (namespace dc-failover, on each spoke)

Line 188:
-```
+```yaml
 dr.kubedb.com/switchover-to: <dc>

Line 237:
-```
+```text
                           dr-controlplane (3 site etcd quorum)

Line 274:
-```
+```yaml
 dr.kubedb.com/enabled: "true"

Also applies to: 188-190, 237-262, 274-276
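To confirm the fix locally before re-running the linter, a short Python check can list opening fences that carry no language tag. This is a minimal sketch, not markdownlint itself: `untagged_fences` is a hypothetical helper, and it handles only simple backtick and tilde fences with up to three spaces of indentation.

```python
import re

# An opening or closing fence: up to three spaces, then 3+ backticks or tildes.
FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})(.*)$")


def untagged_fences(markdown: str) -> list[int]:
    """Return 1-based line numbers of opening fences with no language tag."""
    missing = []
    open_marker = None  # fence marker of the block we are inside, if any
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        match = FENCE.match(line)
        if not match:
            continue
        marker, info = match.group(1), match.group(2).strip()
        if open_marker is None:
            open_marker = marker
            if not info:
                missing.append(lineno)
        elif marker[0] == open_marker[0] and len(marker) >= len(open_marker) and not info:
            open_marker = None  # matching closing fence
    return missing
```

Running it over overview/index.md before the change should report the four opening fences listed above, and nothing once each fence has a `text` or `yaml` tag.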

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/guides/mariadb/distributed/disaster-recovery/overview/index.md` around
lines 86 - 91, Add language identifiers to the fenced code blocks in the
disaster-recovery overview markdown so markdownlint stops flagging them and
syntax highlighting is correct; update the ConfigMap pseudo-code and ASCII
diagram blocks to use text, and the annotation example blocks to use yaml. Check
the affected fenced sections in this document and ensure each fence has the
appropriate language tag.
docs/guides/mariadb/distributed/disaster-recovery/setup/index.md (5)

235-235: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Hyphenate compound modifier in heading: "read-only".

"read only" should be "read-only" for correct compound-modifier hyphenation in the heading.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/guides/mariadb/distributed/disaster-recovery/setup/index.md` at line
235, Update the heading in the disaster recovery setup guide to use the
hyphenated compound modifier “read-only” instead of “read only”. This is a
straightforward wording fix in the markdown heading “Confirm the standby DC is
read only and following”; change only the heading text so it reads naturally and
consistently with standard compound-modifier usage.

67-67: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Hyphenate compound modifier: "self-contained".

"self contained Galera cluster" should be "self-contained Galera cluster" for correct compound-modifier hyphenation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/guides/mariadb/distributed/disaster-recovery/setup/index.md` at line 67,
The phrase in the disaster-recovery setup doc uses an unhyphenated compound
modifier; update the wording around the “self contained Galera cluster” text to
“self-contained Galera cluster” for correct grammar. Locate the sentence
describing each cluster and adjust only that compound adjective in the markdown
content.

43-43: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Hyphenate compound modifier: "three-site".

"three site etcd quorum" should be "three-site etcd quorum" for correct compound-modifier hyphenation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/guides/mariadb/distributed/disaster-recovery/setup/index.md` at line 43,
Update the wording in the affected disaster-recovery setup text to hyphenate the
compound modifier by changing the phrase used with the etcd quorum to
“three-site etcd quorum”; keep the rest of the sentence unchanged and ensure the
copy in the same section remains consistent with that terminology.

268-268: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Prefer "deliberately" over "on purpose".

"on purpose" is colloquial; "deliberately" is more precise for technical documentation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/guides/mariadb/distributed/disaster-recovery/setup/index.md` at line
268, Replace the colloquial phrase “on purpose” in the disaster-recovery setup
prose with “deliberately” to keep the documentation precise and technical;
update the sentence in the affected paragraph without changing its meaning.

15-15: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Descriptive link text recommended.

The "here" link text is not descriptive per markdownlint MD059. Consider: "Please start with the KubeDB quickstart guide."

- > **New to KubeDB?** Please start [here](/docs/README.md).
+ > **New to KubeDB?** Please start with the [KubeDB quickstart guide](/docs/README.md).
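The same kind of link-text finding can be caught across the whole docs tree with a small Python check. This is a sketch only: `generic_links` is a hypothetical helper, and the set of link texts treated as generic is an illustrative assumption rather than the linter's exact configuration.

```python
import re

# Link texts treated as non-descriptive (illustrative set, an assumption).
GENERIC_TEXTS = {"here", "click here", "link", "more"}

# Inline Markdown links of the form [text](target), without nested brackets.
LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def generic_links(markdown: str) -> list[tuple[int, str, str]]:
    """Return (line number, link text, target) for each generic link text."""
    hits = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        for text, target in LINK.findall(line):
            if text.strip().lower() in GENERIC_TEXTS:
                hits.append((lineno, text, target))
    return hits
```

Applied to the lead line quoted in the diff, the original wording is flagged and the proposed wording is not.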
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/guides/mariadb/distributed/disaster-recovery/setup/index.md` at line 15,
The introductory markdown link text is too generic and should be made
descriptive to satisfy the link-text lint rule. Update the sentence containing
the link to /docs/README.md in this guide so it uses meaningful text like the
KubeDB quickstart guide, and keep the rest of the sentence intact while making
the anchor text descriptive enough to indicate what the target page is.
docs/guides/mariadb/distributed/disaster-recovery/setup/yamls/placement-policy.yaml (1)

12-13: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Hyphenate in comment: "self-contained".

"self contained" should be "self-contained" in the comment describing the Galera cluster.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@docs/guides/mariadb/distributed/disaster-recovery/setup/yamls/placement-policy.yaml`
around lines 12 - 13, Update the comment in placement-policy.yaml so the
description of the Galera cluster uses the hyphenated form “self-contained”
instead of “self contained.” Keep the change limited to the surrounding
failoverPolicy commentary and preserve the existing meaning and wording
otherwise.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 9a0f9ce6-6739-45db-b7d2-1079689a7685

📥 Commits

Reviewing files that changed from the base of the PR and between 405b88b and 85f3806.

📒 Files selected for processing (5)
  • docs/guides/mariadb/distributed/disaster-recovery/_index.md
  • docs/guides/mariadb/distributed/disaster-recovery/overview/index.md
  • docs/guides/mariadb/distributed/disaster-recovery/setup/index.md
  • docs/guides/mariadb/distributed/disaster-recovery/setup/yamls/mariadb.yaml
  • docs/guides/mariadb/distributed/disaster-recovery/setup/yamls/placement-policy.yaml

@github-actions

Visit the preview URL for this PR (updated for commit 85f3806):

https://kubedb-v2-hugo--pr939-pr-mariadb-dc-dr-doc-jaw1axzd.web.app

(expires Tue, 07 Jul 2026 21:44:14 GMT)

🔥 via Firebase Hosting GitHub Action 🌎

Sign: 0f29ae8ae0bd54a99bf2b223b6833be47acd5943
