Add RabbitMQ cross data center disaster recovery (DC-DR) docs #947
tamalsaha wants to merge 1 commit into
Conversation
Signed-off-by: Tamal Saha <tamal@appscode.com>
📝 Walkthrough
This pull request adds new RabbitMQ Disaster Recovery documentation to the docs site: a navigation index page, a comprehensive DC-DR user guide, a conceptual overview page, and an operational runbook with scenario-based procedures for handling failover, switchover, and failback.
Changes: RabbitMQ DC-DR Documentation
Estimated code review effort: 2 (Simple) | ~12 minutes
Sequence Diagram(s): Not applicable. This PR consists solely of documentation additions with no code or control-flow changes.
Compact metadata
Related issues: None referenced.
Related PRs: None referenced.
Suggested labels: documentation
Suggested reviewers: None specified.
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Actionable comments posted: 4
🧹 Nitpick comments (6)
docs/guides/rabbitmq/dr/guide/index.md (1)
25-25: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Use descriptive link text instead of "here".
Replace "here" with descriptive text like "the KubeDB getting started guide" for accessibility and to satisfy markdownlint MD059.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/guides/rabbitmq/dr/guide/index.md` at line 25, The intro link text is too generic and should be made descriptive to satisfy markdownlint MD059 and improve accessibility. Update the link in the guide index to use meaningful text instead of “here”, such as referring to the KubeDB getting started guide, while keeping the same destination path.
docs/guides/rabbitmq/dr/runbook/index.md (4)
15-16: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value
Add hyphen to "cross-data-center".
"cross data center disaster recovery" should be hyphenated as "cross-data-center disaster recovery" for correct compound modifier usage.
-Scenario-by-scenario procedures for operating a RabbitMQ workload in cross data center
-disaster recovery (DC-DR) mode.
+Scenario-by-scenario procedures for operating a RabbitMQ workload in cross-data-center
+disaster recovery (DC-DR) mode.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/guides/rabbitmq/dr/runbook/index.md` around lines 15 - 16, The runbook text uses “cross data center” as a compound modifier, so update the wording in the RabbitMQ DR guide to hyphenate it consistently as “cross-data-center disaster recovery.” Make the edit in the affected introductory sentence in the docs content so the phrase reads naturally and matches the preferred terminology throughout the guide.
23-23: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value
Use descriptive link text.
"here" is non-descriptive link text. Use meaningful text that describes the destination.
-> **New to KubeDB?** Please start [here](/docs/README.md).
+> **New to KubeDB?** Please [start with the KubeDB introduction](/docs/README.md).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/guides/rabbitmq/dr/runbook/index.md` at line 23, The link text in the runbook introduction is too generic; update the Markdown link in the introductory line to use descriptive text that names the destination instead of “here.” Adjust the sentence containing the reference to the docs README so the linked text clearly describes what readers will find, keeping the same target path.
246-247: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value
Clarify annotation removal syntax.
The command `kubectl annotate rabbitmq -n demo rm-dcdr dr.kubedb.com/switchover-to-` uses the trailing `-` syntax to remove an annotation, which is correct kubectl syntax but may confuse readers. Consider adding a brief note that the trailing hyphen removes the annotation.
- `kubectl annotate rabbitmq -n demo rm-dcdr dr.kubedb.com/switchover-to-`. The active
+ `kubectl annotate rabbitmq -n demo rm-dcdr dr.kubedb.com/switchover-to-` (the trailing `-` removes the annotation). The active
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/guides/rabbitmq/dr/runbook/index.md` around lines 246 - 247, Clarify the kubectl annotation removal syntax in the RabbitMQ DR runbook example: the command using kubectl annotate on rm-dcdr with dr.kubedb.com/switchover-to- should explicitly note that the trailing hyphen removes the annotation. Update the surrounding prose in the runbook section to mention this syntax so readers understand the command without mistaking it for a typo.
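The set and remove forms of the annotation command differ only in the last argument, which a small helper can make explicit. This is a minimal sketch, not part of the PR: `annotate_args` is a hypothetical helper, and the target value `dc-b` is an assumption chosen for illustration; the resource kind, name, namespace, and annotation key come from the runbook excerpt.

```python
from typing import Optional


def annotate_args(kind: str, name: str, namespace: str, key: str,
                  value: Optional[str] = None) -> list[str]:
    """Build the argv for `kubectl annotate`.

    With a value, the last argument is KEY=VALUE (set the annotation).
    Without a value, the last argument is KEY- (remove the annotation).
    """
    args = ["kubectl", "annotate", kind, "-n", namespace, name]
    if value is None:
        args.append(f"{key}-")        # trailing hyphen removes the annotation
    else:
        args.append(f"{key}={value}")
    return args


KEY = "dr.kubedb.com/switchover-to"

# "dc-b" is an assumed target DC name, used only for illustration.
set_cmd = annotate_args("rabbitmq", "rm-dcdr", "demo", KEY, "dc-b")
remove_cmd = annotate_args("rabbitmq", "rm-dcdr", "demo", KEY)

print(" ".join(set_cmd))
print(" ".join(remove_cmd))
```

The helper only builds the argument list; running it against a cluster would need something like `subprocess.run(remove_cmd, check=True)`.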
324-325: 🎯 Functional Correctness | 🔵 Trivial | 💤 Low value
Clarify which cluster to check federation links on in split-write diagnosis.
In scenario 12, the command checks `rm-dcdr-dc-b-0`, but during a split-write scenario, the wrong-direction upstream could be on either DC. Consider noting that you should check both DCs' federation links, or clarify why dc-b specifically is the right target.
-kubectl exec -n demo rm-dcdr-dc-b-0 -- rabbitmqctl list_federation_links # both directions must not be enabled
+kubectl exec -n demo rm-dcdr-dc-b-0 -- rabbitmqctl list_federation_links # check both DCs; both directions must not be enabled
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/guides/rabbitmq/dr/runbook/index.md` around lines 324 - 325, Clarify the split-write federation check in the runbook: the current use of kubectl exec on rm-dcdr-dc-b-0 implies only dc-b should be inspected, but the wrong-direction upstream may exist on either cluster. Update the surrounding guidance in the scenario 12 section to explicitly say to check federation links on both DCs (or explain why rm-dcdr-dc-b-0 is the correct target), and keep the RabbitMQ troubleshooting command aligned with that guidance.
docs/guides/rabbitmq/dr/overview/index.md (1)
31-31: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value
Descriptive link text improves accessibility.
"here" is non-descriptive for screen-reader users scanning links. Consider "Please start with the KubeDB introduction" or similar.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/guides/rabbitmq/dr/overview/index.md` at line 31, Update the link text in the documentation overview to use descriptive wording instead of “here” so it is accessible for screen-reader users. In the markdown content near the “New to KubeDB?” note, change the anchor text to something like “KubeDB introduction” while keeping the same target, and make sure the sentence still reads naturally.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/guides/rabbitmq/dr/guide/index.md`:
- Line 15: The guide text uses the compound modifier “cross data center” and
should be hyphenated as “cross-data-center” before “disaster recovery.” Update
the wording in the RabbitMQ DR guide content to use the hyphenated form
consistently, keeping the existing meaning intact.
- Around line 39-49: The DC-name contract lists the marker with an inconsistent
path, so update the marker identifier in the guide to match the DC-DR Overview
contract. Use the existing “The DC-name contract” section and the
`data.activeDC` bullet as the target for correction, ensuring it consistently
uses the same `activeDC` identifier everywhere across the docs.
- Around line 149-159: The federation policy example uses the wrong definition
key for the upstream reference, so update the example in the RabbitMQ DR guide
to match the operator’s single upstream configuration. In the policy snippet
near the federation example, replace the use of federation-upstream-set in the
definition block with federation-upstream unless you are explicitly documenting
a federation-upstream-set runtime parameter elsewhere; keep the example aligned
with the actual upstream name dcdr-upstream-from-dc-a and the surrounding
federation policy text.
In `@docs/guides/rabbitmq/dr/overview/index.md`:
- Line 76: The description of classic queues is too absolute in the RabbitMQ DR
overview: it says they are non-replicated, but the doc should reflect that
classic queues do not automatically replicate and can be mirrored via deprecated
classic queue mirroring. Update the wording in the overview text to use the
existing classic-queue explanation consistently, avoiding any claim that classic
queues are inherently non-replicated or equivalent to quorum-style replication.
---
Nitpick comments:
In `@docs/guides/rabbitmq/dr/guide/index.md`:
- Line 25: The intro link text is too generic and should be made descriptive to
satisfy markdownlint MD059 and improve accessibility. Update the link in the
guide index to use meaningful text instead of “here”, such as referring to the
KubeDB getting started guide, while keeping the same destination path.
In `@docs/guides/rabbitmq/dr/overview/index.md`:
- Line 31: Update the link text in the documentation overview to use descriptive
wording instead of “here” so it is accessible for screen-reader users. In the
markdown content near the “New to KubeDB?” note, change the anchor text to
something like “KubeDB introduction” while keeping the same target, and make
sure the sentence still reads naturally.
In `@docs/guides/rabbitmq/dr/runbook/index.md`:
- Around line 15-16: The runbook text uses “cross data center” as a compound
modifier, so update the wording in the RabbitMQ DR guide to hyphenate it
consistently as “cross-data-center disaster recovery.” Make the edit in the
affected introductory sentence in the docs content so the phrase reads naturally
and matches the preferred terminology throughout the guide.
- Line 23: The link text in the runbook introduction is too generic; update the
Markdown link in the introductory line to use descriptive text that names the
destination instead of “here.” Adjust the sentence containing the reference to
the docs README so the linked text clearly describes what readers will find,
keeping the same target path.
- Around line 246-247: Clarify the kubectl annotation removal syntax in the
RabbitMQ DR runbook example: the command using kubectl annotate on rm-dcdr with
dr.kubedb.com/switchover-to- should explicitly note that the trailing hyphen
removes the annotation. Update the surrounding prose in the runbook section to
mention this syntax so readers understand the command without mistaking it for a
typo.
- Around line 324-325: Clarify the split-write federation check in the runbook:
the current use of kubectl exec on rm-dcdr-dc-b-0 implies only dc-b should be
inspected, but the wrong-direction upstream may exist on either cluster. Update
the surrounding guidance in the scenario 12 section to explicitly say to check
federation links on both DCs (or explain why rm-dcdr-dc-b-0 is the correct
target), and keep the RabbitMQ troubleshooting command aligned with that
guidance.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 8efbf509-06f7-44df-9a79-d8713a94c0b1
📒 Files selected for processing (4)
docs/guides/rabbitmq/dr/_index.md
docs/guides/rabbitmq/dr/guide/index.md
docs/guides/rabbitmq/dr/overview/index.md
docs/guides/rabbitmq/dr/runbook/index.md
# Running RabbitMQ in DC-DR Mode: User Guide

This guide covers every aspect of operating a distributed RabbitMQ in cross data center
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Fix hyphenation: "cross-data-center".
"cross data center" should be hyphenated as "cross-data-center" when used as a compound modifier before "disaster recovery."
🧰 Tools
🪛 LanguageTool
[grammar] ~15-~15: Use a hyphen to join words.
Context: ...perating a distributed RabbitMQ in cross data center disaster recovery (DC-DR) mo...
(QB_NEW_EN_HYPHEN)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/guides/rabbitmq/dr/guide/index.md` at line 15, The guide text uses the
compound modifier “cross data center” and should be hyphenated as
“cross-data-center” before “disaster recovery.” Update the wording in the
RabbitMQ DR guide content to use the hyphenated form consistently, keeping the
existing meaning intact.
## The DC-name contract

One string identifies a data center everywhere. **Keep these identical:**

- the OCM spoke cluster name
- the agent `--dc-name`
- the primary-DC Lease `holderIdentity`
- the marker `data.activeDC`
- the pod label `open-cluster-management.io/cluster-name`
- the `PlacementPolicy` `distributionRule.clusterName`
🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
Cross-file inconsistency: marker field name contradicts overview.
The guide lists `data.activeDC` (Line 46), but the DC-DR Overview documents the contract as `activeDC` without the `data.` prefix. Reconcile the marker path so both pages use the same identifier.
As per path instructions, cross-file contract consistency is required for DR configuration accuracy.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/guides/rabbitmq/dr/guide/index.md` around lines 39 - 49, The DC-name
contract lists the marker with an inconsistent path, so update the marker
identifier in the guide to match the DC-DR Overview contract. Use the existing
“The DC-name contract” section and the `data.activeDC` bullet as the target for
correction, ensuring it consistently uses the same `activeDC` identifier
everywhere across the docs.
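The "keep these identical" rule in the DC-name contract can be checked mechanically once the six values are collected. The following is a minimal sketch under stated assumptions: the dictionary keys are hypothetical labels for the six places listed in the contract, the values are made up, and the example assumes `dc-a` is the active DC so that the Lease holder and the marker are expected to carry the same string as the local identifiers.

```python
from collections import Counter


def find_dc_name_mismatches(identifiers: dict[str, str]) -> dict[str, str]:
    """Return the entries whose value differs from the most common value.

    Comparison is exact (case-sensitive), because the contract asks for
    one identical string everywhere.
    """
    if not identifiers:
        return {}
    expected, _ = Counter(identifiers.values()).most_common(1)[0]
    return {key: value for key, value in identifiers.items() if value != expected}


# Hypothetical values, collected by hand from each of the six places.
contract = {
    "ocm_spoke_cluster_name": "dc-a",
    "agent_dc_name_flag": "dc-a",
    "lease_holder_identity": "dc-a",
    "marker_active_dc": "dc-A",  # differs only in case, still a mismatch
    "pod_cluster_name_label": "dc-a",
    "placement_policy_cluster_name": "dc-a",
}

mismatches = find_dc_name_mismatches(contract)
print(mismatches)
```

The check says nothing about where the marker field lives (`data.activeDC` versus `activeDC`); that naming question still has to be settled in the docs themselves.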
```jsonc
// federation policy, set by the operator on the standby (dc-b) cluster
{
  "name": "dcdr-federation",
  "pattern": "^(?!amq\\.).*", // federate user queues and exchanges
  "apply-to": "queues",
  "definition": {
    "federation-upstream-set": "dcdr-upstream-from-dc-a"
  }
}
```
🎯 Functional Correctness | 🔴 Critical | ⚡ Quick win
Fix federation policy: `federation-upstream-set` should be `federation-upstream`.
The policy example references `"federation-upstream-set": "dcdr-upstream-from-dc-a"`, but `federation-upstream-set` expects the name of a federation-upstream-set runtime parameter (a collection), not a single federation-upstream name. Since the operator defines a single upstream (`dcdr-upstream-from-dc-a`), the policy should use `"federation-upstream"` to reference it directly.
 "definition": {
- "federation-upstream-set": "dcdr-upstream-from-dc-a"
+ "federation-upstream": "dcdr-upstream-from-dc-a"
 }
Alternatively, if the operator actually defines a federation-upstream-set containing this upstream, document that component instead and keep the policy as-is.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/guides/rabbitmq/dr/guide/index.md` around lines 149 - 159, The
federation policy example uses the wrong definition key for the upstream
reference, so update the example in the RabbitMQ DR guide to match the
operator’s single upstream configuration. In the policy snippet near the
federation example, replace the use of federation-upstream-set in the definition
block with federation-upstream unless you are explicitly documenting a
federation-upstream-set runtime parameter elsewhere; keep the example aligned
with the actual upstream name dcdr-upstream-from-dc-a and the surrounding
federation policy text.
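To make the suggested fix concrete, the sketch below builds the policy body with the single-upstream key and checks which names the pattern selects. It is an illustration only: `build_federation_policy` is a hypothetical helper, the queue names are made up, and Python's `re` stands in for the broker's own regular-expression engine, so the match results are indicative rather than authoritative.

```python
import json
import re


def build_federation_policy(upstream_name: str) -> dict:
    """Build a policy body that references one upstream by name."""
    return {
        "name": "dcdr-federation",
        "pattern": r"^(?!amq\.).*",  # skip names starting with "amq."
        "apply-to": "queues",
        # Single upstream: use "federation-upstream", not "federation-upstream-set".
        "definition": {"federation-upstream": upstream_name},
    }


policy = build_federation_policy("dcdr-upstream-from-dc-a")
pattern = re.compile(policy["pattern"])

# Hypothetical names, used only to exercise the pattern.
matches_user_queue = pattern.match("orders") is not None
matches_builtin = pattern.match("amq.direct") is not None

print(json.dumps(policy, indent=2))
print(matches_user_queue, matches_builtin)
```

If the operator does define a named upstream set, the same helper would take the set name and emit `federation-upstream-set` instead; the two keys are alternatives within one policy definition.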
group. The Raft group never crosses the DC boundary, so inter-DC latency or a
partition can never flap queue leadership or stall a queue. There is no cross-DC
RabbitMQ voter. Use quorum queues (not classic queues) so intra-DC HA survives node
loss; classic queues are non-replicated.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Classic queue replication claim is inaccurate.
Classic queues can be mirrored via classic queue mirroring (though deprecated). The statement that they are "non-replicated" overstates the case and may mislead readers evaluating HA options. Prefer "classic queues do not automatically replicate" or "classic queues lack quorum-style replication."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/guides/rabbitmq/dr/overview/index.md` at line 76, The description of
classic queues is too absolute in the RabbitMQ DR overview: it says they are
non-replicated, but the doc should reflect that classic queues do not
automatically replicate and can be mirrored via deprecated classic queue
mirroring. Update the wording in the overview text to use the existing
classic-queue explanation consistently, avoiding any claim that classic queues
are inherently non-replicated or equivalent to quorum-style replication.
Adds documentation for KubeDB RabbitMQ cross data center disaster recovery (DC-DR), mirroring the structure of the Kafka DR docs and adapting it to RabbitMQ semantics.
New pages under `docs/guides/rabbitmq/dr/`:
- `_index.md` menu entry (Disaster Recovery).
- `overview/index.md`: concept overview and quick start. Explains why RabbitMQ DR is the async-replication camp (no cluster-wide primary; quorum queues run their own intra-DC Raft), the five architecture rules, DC roles (Member/Arbiter), the single-CR single-endpoint model, deploy walkthrough, `status.disasterRecovery`, failover, planned switchover, failback, cleanup.
- `guide/index.md`: full user guide. Components, DC-name contract, deployment, operator-managed Federation upstreams and policies, connecting/publishing over AMQP, consumers resuming after a flip, monitoring federation lag, the publish fence (permission or listener gate), planned switchover, failback, scaling and day-2 ops, limitations.
- `runbook/index.md`: 12 scenario-by-scenario procedures plus an escalation checklist.

RabbitMQ-specific mechanics vs the Kafka template:
- `Connector` CRD; the operator manages federation runtime parameters and policies.
- `status.disasterRecovery` reports `activeDC`, `phase`, and per-DC `nodesReady`/`federationLagMessages`/`writable`/`healthy`.
- Planned switchover is annotation-triggered (`dr.kubedb.com/switchover-to`), no Switchover ops type.

The pages hedge that the distributed substrate and DC-DR layer are net-new and forward-looking, matching the Kafka docs' tone.
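The `status.disasterRecovery` fields named in this description are enough to flag a split-write condition, where more than one DC reports itself writable. The sketch below is a rough illustration, not the operator's schema: only the field names `activeDC`, `phase`, `nodesReady`, `federationLagMessages`, `writable`, and `healthy` come from the description, while the `dataCenters` map, the `phase` value, and all numbers are hypothetical.

```python
def writable_dcs(dr_status: dict) -> list[str]:
    """Names of the DCs that report writable=True."""
    per_dc = dr_status.get("dataCenters", {})  # hypothetical key for the per-DC map
    return [name for name, dc in per_dc.items() if dc.get("writable")]


def summarize(dr_status: dict) -> dict:
    """Condense the status into the facts an operator checks first."""
    writable = writable_dcs(dr_status)
    active = dr_status.get("activeDC")
    return {
        "activeDC": active,
        "writable": writable,
        "splitWrite": len(writable) > 1,          # two writable DCs is the bad case
        "activeIsOnlyWriter": writable == [active],
    }


# Hypothetical status payload; the layout and values are assumptions.
sample = {
    "activeDC": "dc-a",
    "phase": "Ready",
    "dataCenters": {
        "dc-a": {"nodesReady": 3, "federationLagMessages": 0, "writable": True, "healthy": True},
        "dc-b": {"nodesReady": 3, "federationLagMessages": 12, "writable": False, "healthy": True},
    },
}

summary = summarize(sample)
print(summary)
```

Feeding it a payload where both DCs are writable flips `splitWrite` to `True`, which is the situation the runbook's split-write scenario is meant to diagnose.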
Summary by CodeRabbit