consumer: correct the order of ddl and dml in storage consumer by wk989898 · Pull Request #4328 · pingcap/ticdc

wk989898 · 2026-03-02T09:06:10Z

What problem does this PR solve?

Issue Number: close #4327

What is changed and how it works?

DML events must be executed before the DDL event that has the same commitTs.

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No code

Questions

Will it cause performance regression or break compatibility?

Do you need to update user documentation, design documentation or monitoring documentation?

Release note

Please refer to [Release Notes Language Style Guide](https://pingcap.github.io/tidb-dev-guide/contribute-to-tidb/release-notes-style-guide.html) to write a quality release note.

If you don't think this PR needs a release note then fill it with `None`.

Summary by CodeRabbit

Bug Fixes
- Updated the processing sequence to ensure data modifications are handled before schema changes within the same table version, improving consistency in database operation execution.

gemini-code-assist · 2026-03-02T09:06:25Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue in the storage consumer where the order of processing Data Definition Language (DDL) and Data Manipulation Language (DML) events was incorrect. The changes introduce a revised sorting mechanism to ensure that for any given table version, all DML operations are processed and applied before any corresponding DDL operations, thereby maintaining data integrity and consistency during replication.

Highlights

- Corrected DDL/DML Ordering: The processing order of DDL (Data Definition Language) and DML (Data Manipulation Language) events in the storage consumer has been corrected. For the same table version, DML operations will now be consumed before DDL operations.
- Refactored Sorting Logic: The complex sorting logic for `DmlPathKey` has been extracted from an anonymous function within `handleNewFiles` into a new, dedicated helper function named `dmlPathKeyLess`.
- Improved Code Clarity: Internal comments related to the DDL/DML ordering and the purpose of the fake DML key for schema files have been updated to accurately reflect the new processing strategy.

Changelog

cmd/storage-consumer/consumer.go
- Updated comments to clarify the intended DDL/DML processing order, specifically noting that DMLs should precede DDLs for the same table version.
- Refactored the inline sorting logic for DmlPathKey into a new, standalone function dmlPathKeyLess.
- Implemented the dmlPathKeyLess function to correctly sort DmlPathKey instances, ensuring DML files are ordered before schema (DDL) files for identical table versions.
- Adjusted the call to sort.Slice in handleNewFiles to utilize the newly introduced dmlPathKeyLess function.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.

coderabbitai · 2026-03-02T09:06:30Z

Walkthrough

The change refactors DML path key sorting in the storage consumer by replacing field-by-field comparisons with a dedicated comparator function (dmlPathKeyLess) that ensures DML files are consumed before DDL/schema entries for the same table version, and updates related schema file key handling.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Sort Logic Refactoring `cmd/storage-consumer/consumer.go` | Introduces `dmlPathKeyLess` comparator function to centralize DML path key ordering logic. Implements special handling to prioritize DML file consumption over DDL/schema entries for identical table versions. Updates schema.json handling to use DML keys with explicit empty Date fields and adjusts related code paths and documentation. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested labels

lgtm, approved, size/S

Suggested reviewers

3AceShowHand
lidezhu
hongyunyan

Poem

🐰 Whiskers twitching with delight,
Files now sorted just right!
DML hops before DDL's way,
Schema knows its proper day. ✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | The PR description includes the required issue number reference and a brief explanation of changes, but lacks detail on implementation, test coverage specifics, and release notes. | Provide more details on what was changed and how it works, specify which tests were added/updated, and clarify whether this requires a release note or if it should be marked as 'None'. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main change: correcting DDL and DML ordering in the storage consumer, which matches the file changes that modify sorting semantics. |
| Linked Issues check | ✅ Passed | The PR addresses issue `#4327` by fixing DDL/DML ordering in the storage consumer. The code changes implement a comparator function to ensure correct consumption order (DML before DDL for same table version). |
| Out of Scope Changes check | ✅ Passed | All changes in consumer.go are focused on correcting DDL/DML ordering logic and are within the scope of fixing the unstable storage test referenced in issue `#4327`. |

gemini-code-assist

Code Review

This pull request corrects the processing order of DDL and DML events in the storage consumer, ensuring that for the same table version, DML events are processed before DDL events. This is achieved by refactoring the sorting logic into a new dmlPathKeyLess function with the corrected comparison logic. The change is clear and addresses the issue. My main feedback is to add unit tests for the new critical sorting function to ensure its correctness and prevent future regressions.

gemini-code-assist · 2026-03-02T09:08:23Z

cmd/storage-consumer/consumer.go

+func dmlPathKeyLess(lhs, rhs cloudstorage.DmlPathKey) bool {
+	if lhs.TableVersion != rhs.TableVersion {
+		return lhs.TableVersion < rhs.TableVersion
+	}
+
+	// For the same table version(commitTs), execute schema(DDL) after all DML files.
+	if lhs.PartitionNum != rhs.PartitionNum {
+		if lhs.PartitionNum == fakePartitionNumForSchemaFile {
+			return false
+		}
+		if rhs.PartitionNum == fakePartitionNumForSchemaFile {
+			return true
+		}
+		return lhs.PartitionNum < rhs.PartitionNum
+	}
+
+	if lhs.Date != rhs.Date {
+		return lhs.Date < rhs.Date
+	}
+	if lhs.Schema != rhs.Schema {
+		return lhs.Schema < rhs.Schema
+	}
+	return lhs.Table < rhs.Table
+}


The new function dmlPathKeyLess contains critical sorting logic for ensuring DDLs are processed after DMLs. While the logic appears correct, it would be beneficial to add a unit test for this function to prevent future regressions. The function is pure and should be straightforward to test with various DmlPathKey combinations, including cases with schema files and DML files for the same and different table versions.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

cmd/storage-consumer/consumer.go (1)

547-547: ⚠️ Potential issue | 🔴 Critical

Handle DML flush errors before continuing to DDL.

Line 547 drops the flushDMLEvents error. If flush fails, processing can continue and execute later DDL, breaking ordering guarantees under failure paths.

Suggested fix

-		c.flushDMLEvents(ctx, tableID)
+		if err := c.flushDMLEvents(ctx, tableID); err != nil {
+			return errors.Trace(err)
+		}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cmd/storage-consumer/consumer.go` at line 547, The call to
c.flushDMLEvents(ctx, tableID) currently ignores any returned error which can
allow subsequent DDL to run and break ordering; update the call site to capture
the error (err := c.flushDMLEvents(ctx, tableID)), check it, and if non-nil
log/return/propagate the error (e.g., return fmt.Errorf or propagate the
existing function's error) so processing stops and DDL is not executed when DML
flush fails, preserving ordering guarantees; ensure you reference
c.flushDMLEvents and the enclosing function's error flow when adding the
early-return or error propagation.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e1fdee2 and bfbfee6.

📒 Files selected for processing (1)

cmd/storage-consumer/consumer.go

ti-chi-bot · 2026-03-02T12:15:50Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: 3AceShowHand, wlwilliamx

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [3AceShowHand,wlwilliamx]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot · 2026-03-02T12:15:51Z

[LGTM Timeline notifier]

Timeline:

2026-03-02 09:51:19 UTC: ☑️ agreed by 3AceShowHand.
2026-03-02 12:15:50 UTC: ☑️ agreed by wlwilliamx.

wk989898 · 2026-03-02T13:56:12Z

/retest

ti-chi-bot · 2026-03-02T14:59:24Z

@wk989898: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-unit-test-next-gen | `bfbfee6` | link | unknown | `/test pull-unit-test-next-gen` |
| pull-cdc-pulsar-integration-heavy | `bfbfee6` | link | unknown | `/test pull-cdc-pulsar-integration-heavy` |
| pull-cdc-storage-integration-heavy | `bfbfee6` | link | unknown | `/test pull-cdc-storage-integration-heavy` |
| pull-unit-test | `bfbfee6` | link | unknown | `/test pull-unit-test` |
| pull-cdc-storage-integration-light | `bfbfee6` | link | unknown | `/test pull-cdc-storage-integration-light` |

Refactor comments and sorting logic for clarity

bfbfee6

ti-chi-bot bot added the do-not-merge/needs-triage-completed and release-note labels Mar 2, 2026

ti-chi-bot bot added the size/M label Mar 2, 2026

gemini-code-assist bot reviewed Mar 2, 2026

ti-chi-bot bot removed the do-not-merge/needs-triage-completed label Mar 2, 2026

coderabbitai bot reviewed Mar 2, 2026

3AceShowHand approved these changes Mar 2, 2026

ti-chi-bot bot added the needs-1-more-lgtm and approved labels Mar 2, 2026

wlwilliamx approved these changes Mar 2, 2026

ti-chi-bot bot added the lgtm label and removed the needs-1-more-lgtm label Mar 2, 2026

wk989898 added the do-not-merge/hold label Mar 2, 2026

wk989898 wants to merge 1 commit into master from wk989898-patch-1
