@xy2953396112
Contributor

@xy2953396112 xy2953396112 commented Dec 8, 2025

What changes were proposed in this pull request?

Avoid regular expressions when determining the storage type of a DiskFileInfo.

Why are the changes needed?

When sending a heartbeat, the Worker iterates over all FileInfo objects and runs regex matching against a large number of them to check whether each file is an HDFS file. This regex matching reduces heartbeat processing efficiency.
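
To illustrate the idea, here is a minimal sketch rather than the actual Celeborn code. The class, method, and pattern names (`StorageTypeCheck`, `FileInfoSketch`, `HDFS_PATTERN`) and the `hdfs://` prefix are assumptions made for this example. It contrasts a per-call regex match with a plain prefix check, and with resolving the storage type once at construction so that a later scan only reads a field.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; none of these names are taken from the Celeborn code base.
final class StorageTypeCheck {
    // Assumed pattern for HDFS paths, used only for this example.
    private static final Pattern HDFS_PATTERN = Pattern.compile("^hdfs://.*");

    enum StorageType { LOCAL, HDFS }

    // Regex-based check: runs the matcher on every call.
    static boolean isHdfsByRegex(String path) {
        return HDFS_PATTERN.matcher(path).matches();
    }

    // Prefix-based check: a plain string comparison, no regex engine involved.
    static boolean isHdfsByPrefix(String path) {
        return path.startsWith("hdfs://");
    }

    // File info that resolves its storage type once, at construction time.
    static final class FileInfoSketch {
        final String path;
        final StorageType storageType;

        FileInfoSketch(String path) {
            this.path = path;
            this.storageType = isHdfsByPrefix(path) ? StorageType.HDFS : StorageType.LOCAL;
        }

        // Repeated checks only compare an enum field.
        boolean isHdfs() {
            return storageType == StorageType.HDFS;
        }
    }
}
```

With the type stored on the object, a loop over many file infos no longer touches the path string at all.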

Does this PR resolve a correctness bug?

No.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

CI.

@xy2953396112 xy2953396112 changed the title [CELEBORN-2235] Avoiding regular expressions for DiskFileInfo storage… [CELEBORN-2236] Avoiding regular expressions for DiskFileInfo storage… Dec 8, 2025
@SteNicholas SteNicholas changed the title [CELEBORN-2236] Avoiding regular expressions for DiskFileInfo storage… [CELEBORN-2236] Avoiding regular expressions for DiskFileInfo storage type determination Dec 11, 2025
@xy2953396112 xy2953396112 force-pushed the CELEBORN-2236 branch 2 times, most recently from 7d9f544 to 8b883fe Compare December 15, 2025 12:21
@xy2953396112
Contributor Author

@RexXiong @ErikFang Thanks~, PTAL.

@RexXiong
Contributor

@RexXiong @ErikFang Thanks~, PTAL.

I agree with serializing the storage type first, but it's also important to note that storageType was not serialized before, which could lead to compatibility issues if it is used directly. We need to think about this situation.
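
One possible way to handle the compatibility concern, sketched under assumptions and not necessarily what this PR ended up doing: treat a missing serialized storage type as "written by an older version" and fall back to inferring the type from the path. The names `StorageTypeCompat` and `resolve`, the use of `null` for an absent field, and the `hdfs://` prefix are all hypothetical.

```java
// Hypothetical illustration; names and serialized representation are assumptions.
final class StorageTypeCompat {
    enum StorageType { LOCAL, HDFS }

    // serializedType is null or empty when the record predates the new field.
    static StorageType resolve(String serializedType, String path) {
        if (serializedType != null && !serializedType.isEmpty()) {
            // Newer records carry the type explicitly; trust it.
            return StorageType.valueOf(serializedType);
        }
        // Older records: infer the type from the path, as was done before.
        return path.startsWith("hdfs://") ? StorageType.HDFS : StorageType.LOCAL;
    }
}
```

The fallback keeps old records readable, while new records skip path inspection entirely.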

@xy2953396112 xy2953396112 force-pushed the CELEBORN-2236 branch 4 times, most recently from 491cee8 to 7422104 Compare December 22, 2025 11:55
Contributor

@RexXiong RexXiong left a comment

LGTM, thanks!

@xy2953396112 xy2953396112 force-pushed the CELEBORN-2236 branch 4 times, most recently from 5bc3db7 to b8fee72 Compare December 24, 2025 10:04
@xy2953396112 xy2953396112 reopened this Dec 24, 2025
@xy2953396112 xy2953396112 force-pushed the CELEBORN-2236 branch 3 times, most recently from 16ebffd to 508cdea Compare December 25, 2025 02:38
@codecov

codecov bot commented Dec 25, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 67.15%. Comparing base (ffff5bb) to head (508cdea).
⚠️ Report is 16 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3562      +/-   ##
==========================================
+ Coverage   67.05%   67.15%   +0.10%     
==========================================
  Files         357      357              
  Lines       21779    21860      +81     
  Branches     1930     1943      +13     
==========================================
+ Hits        14602    14677      +75     
- Misses       6160     6164       +4     
- Partials     1017     1019       +2     

@RexXiong
Contributor

RexXiong commented Jan 2, 2026

Thanks, merged to main (v0.7.0).

4 participants