1. Identifying a damaged remote link becomes tricky if its inode is cached:
   - Try to open the link normally; if there is a problem while opening, mark it as damaged.
   - If it opens successfully, damage may still exist: the open can succeed simply because the inode is cached. In that case take the opened inode and scrub its ancestors recursively; if any ancestor is damaged, the remote link is marked as damaged (a rough sketch of this flow is given below).
   - While scrubbing, flags are maintained in the inode indicating whether the scrub is backward, forward, or both.
   - This backward scrubbing only works in a read-only scrub, i.e. without the repair flag and with the mds_scrub_hard_link ceph option turned on.
   - A new damage type is introduced that makes it possible to identify multiple links pointing to the same inode, which was not possible previously.
2. mds_damage_log_to_file and mds_damage_log_file are used to persistently write damages out to a file, as it is not safe to keep them in memory.
3. A missing dirfrag can make the scrub recur, so a flag `from_scrub` is used to identify when a dirfrag fetch comes from the scrub function.
Fixes: https://tracker.ceph.com/issues/68611
Signed-off-by: Md Mahamudur Rahaman Sajib <mahamudur.sajib@croit.io>
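A minimal standalone sketch of the open-then-backward-scrub flow described in item 1 above; every name here (Inode, RemoteLink, open_link, ancestors_damaged, the toy cache) is a hypothetical simplification for illustration, not the actual MDS types or the code in this PR.

```cpp
#include <cstdint>
#include <iostream>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the MDS structures involved.
using inodeno_t = uint64_t;

struct Inode {
  inodeno_t ino = 0;
  Inode *parent = nullptr;   // ancestor chain
  bool damaged = false;
};

struct RemoteLink {
  std::string path;
  inodeno_t ino;
};

// Try to "open" the link target: here it just means finding the inode in a
// toy cache. In the real system a cached inode can open successfully even
// when on-disk metadata above it is damaged.
std::optional<Inode *> open_link(const RemoteLink &link,
                                 std::vector<Inode> &cache) {
  for (auto &in : cache)
    if (in.ino == link.ino)
      return &in;
  return std::nullopt;
}

// Backward scrub: walk the ancestors of the opened inode; if any ancestor is
// damaged, the remote link itself is treated as damaged.
bool ancestors_damaged(const Inode *in) {
  for (const Inode *p = in->parent; p != nullptr; p = p->parent)
    if (p->damaged)
      return true;
  return false;
}

int main() {
  // Toy hierarchy: root <- dir (damaged) <- file, all cached.
  std::vector<Inode> cache(3);
  cache[0] = {1, nullptr, false};    // root
  cache[1] = {2, &cache[0], true};   // damaged directory
  cache[2] = {3, &cache[1], false};  // link target, itself intact

  RemoteLink link{"/dir/hardlink", 3};
  std::vector<std::string> damage_log;

  auto in = open_link(link, cache);
  if (!in) {
    damage_log.push_back(link.path + ": open failed");
  } else if (ancestors_damaged(*in)) {
    // Open succeeded only because the inode was cached; the recursive
    // ancestor check still surfaces the damage.
    damage_log.push_back(link.path + ": damaged ancestor");
  }

  for (const auto &d : damage_log)
    std::cout << d << '\n';
  return 0;
}
```

The point the sketch tries to capture is that a successful open alone proves nothing when the inode is cached; only the recursive ancestor check can reveal that the on-disk path to the link target is broken.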
```diff
 } else if (dir->get_version() == 0) {
   dout(20) << __func__ << " barebones " << *dir << dendl;
-  dir->fetch_keys({}, gather.new_sub());
+  dir->fetch_keys({}, gather.new_sub(), true);
```

This (plus all the changes in CDir.*) looks like a pretty independent fix; IMO it's better to make it a standalone commit.
```cpp
    add_remote_link_damage(remote_link_path, remote_ino);
    header->inc_scrubbed_remote_link_count();
  }
  in->scrub_reset_remote_links();
```

Redundant: scrub_finished() does the reset on its own.
```cpp
void CInode::scrub_add_remote_link(
    std::vector<std::pair<std::string, inodeno_t>> &&remote_links) {
  for (auto& [remote_link_path, remote_ino]: remote_links) {
```

Why not use the whole pair:

```cpp
for (auto& p : remote_links) {
  emplace_back(std::move(p));
}
```
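A self-contained illustration of the pattern the reviewer suggests; ScrubState, pending, and add_remote_links are hypothetical names invented for this sketch, not the actual CInode members. Moving the whole pair lets the std::string inside it be moved into the destination container instead of copied.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins to make the snippet self-contained.
using inodeno_t = uint64_t;
using remote_link_vec = std::vector<std::pair<std::string, inodeno_t>>;

struct ScrubState {
  remote_link_vec pending;  // hypothetical member holding the queued links

  // Move each whole pair: the std::string inside it is moved rather than
  // copied into the destination vector.
  void add_remote_links(remote_link_vec &&links) {
    for (auto &p : links)
      pending.emplace_back(std::move(p));
  }
};

int main() {
  remote_link_vec links;
  links.emplace_back("/dir/hardlink_a", 0x10000000001ULL);
  links.emplace_back("/dir/hardlink_b", 0x10000000002ULL);

  ScrubState st;
  st.add_remote_links(std::move(links));
  assert(st.pending.size() == 2);
  assert(st.pending[0].first == "/dir/hardlink_a");
  return 0;
}
```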
```cpp
if (done) {
  dout(20) << __func__ << " dir inode, done" << dendl;
  in->set_forward_scrub(false);
  dequeue(in);
```

Shouldn't we call `scrub_dir_inode_final(in)` here if `!remote_links().empty()`?
```cpp
log_to_file = _log_to_file;
if (log_to_file) {
  log_file_opened = open_damage_log_file(fout, log_file);
}
```

For the sake of completeness, IMO it's better to implement closing the log file as well.
```cpp
void set_log_to_file(bool _log_to_file) {
  log_to_file = _log_to_file;
  if (log_to_file) {
    log_file_opened = open_damage_log_file(fout, log_file);
```

IMO log_file_opened is better assigned inside open_damage_log_file (and a symmetric close method, if any).
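A rough standalone sketch of that suggestion; DamageLog and its member and helper names are hypothetical stand-ins, not the actual DamageTable code. The open/close helpers own the log_file_opened flag, so the setter only toggles the setting and the close path stays symmetric.

```cpp
#include <fstream>
#include <string>

// Hypothetical, simplified stand-in for the damage-table logging state.
class DamageLog {
  std::ofstream fout;
  std::string log_file = "/tmp/mds_damage.log";  // illustrative path
  bool log_to_file = false;
  bool log_file_opened = false;

  // The helpers own log_file_opened, so set_log_to_file() stays trivial
  // and the flag cannot drift out of sync with the stream state.
  void open_damage_log_file() {
    if (log_file_opened)
      return;
    fout.open(log_file, std::ios::app);
    log_file_opened = fout.is_open();
  }

  void close_damage_log_file() {
    if (!log_file_opened)
      return;
    fout.close();
    log_file_opened = false;
  }

public:
  void set_log_to_file(bool enable) {
    log_to_file = enable;
    if (log_to_file)
      open_damage_log_file();
    else
      close_damage_log_file();  // symmetric close path
  }

  void log(const std::string &msg) {
    if (log_file_opened)
      fout << msg << '\n';
  }
};

int main() {
  DamageLog dl;
  dl.set_log_to_file(true);
  dl.log("example damage entry");
  dl.set_log_to_file(false);
  return 0;
}
```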
```cpp
      remote_links.erase(df_remote_link_it);
    }
  }
  remote_links.erase(remote_link_entry->ino);
```

Apparently either this remote_links.erase() call or the one above is redundant.
```cpp
if (df_remote_link_it != remote_links.end()) {
  auto damage_it = df_remote_link_it->second.find(entry->path);
  if (damage_it != df_remote_link_it->second.end()) {
    df_remote_link_it->second.erase(entry->path);
```

Why not use damage_it for the erase() call?
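A tiny standalone illustration of the point, with a hypothetical map in place of the real structure: once find() has returned an iterator, erasing through that iterator avoids repeating the key lookup that erase(key) would perform.

```cpp
#include <cassert>
#include <map>
#include <string>

int main() {
  // Hypothetical inner map: damaged path -> damage id.
  std::map<std::string, int> damages = {{"/dir/file_a", 1}, {"/dir/file_b", 2}};

  auto damage_it = damages.find("/dir/file_a");
  if (damage_it != damages.end()) {
    // Erase via the iterator we already have; damages.erase("/dir/file_a")
    // would repeat the lookup that find() just did.
    damages.erase(damage_it);
  }

  assert(damages.size() == 1);
  assert(damages.count("/dir/file_a") == 0);
  return 0;
}
```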
```cpp
    starttime(mono_clock::now()),
    ioc(ioc)
  {
MDSRank::MDSRank(mds_rank_t whoami_, ceph::fair_mutex &mds_lock_,
```

IMO the original version was more readable...
@sajibreadd - this can be closed in favor of #6, right?