importinto, external: run merge serially with 64-file groups #68135

joechenrh wants to merge 11 commits into pingcap:master
Conversation
Skipping CI for Draft Pull Request.
📝 Walkthrough

For merge-sort, concurrency is hardcoded to 1 and MaxMergingFilesPerThread is reduced (250 → 64); Prometheus histograms were added to observe merge-sort read and write durations, and engine send-to-out timing was added. Separately, import runtime stats were introduced and threaded through job generation and workers, with final summary logging.

Changes:
- Merge Concurrency Serialization & I/O Metrics
- Local Import Runtime Stats
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant doImport
    participant JobGenerator
    participant JobPool
    participant Worker
    participant RuntimeStats
    doImport->>RuntimeStats: create
    doImport->>JobGenerator: pass RuntimeStats
    JobGenerator->>JobPool: send jobs (timed)
    JobPool->>Worker: create with RuntimeStats
    Worker->>RuntimeStats: record durations
    doImport->>doImport: log runtime summary from RuntimeStats
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Hi @joechenrh. Thanks for your PR. PRs from untrusted users cannot be marked as trusted.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
35034c8 to 6ec846d (Compare); 6ec846d to afd283e (Compare)
This reverts commit db77b9d.
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: (none). The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files.
@joechenrh I've received your pull request and will start the review. I'll conduct a thorough review covering code quality, potential issues, and implementation details. ⏳ This process typically takes 10-30 minutes depending on the complexity of the changes. ℹ️ Learn more details on Pantheon AI.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/dxf/importinto/task_executor.go`:
- Around line 525-533: The call to external.MergeOverlappingFiles currently
hardcodes the merge-task concurrency to 1 (the third arg), which serializes
merges; change this to use a configurable or CPU-based value (e.g.,
mergeTaskConcurrency from config or runtime.NumCPU()/max(1, cpu/…)) and keep a
separate per-task reader-open cap (the parameter that controls reader-open
bursts) set to 1 to limit startup open pressure; update the call site of
external.MergeOverlappingFiles (the invocation with wctx, sm.DataFiles, 1, ...)
to pass mergeTaskConcurrency for task-level parallelism and the reader-open
concurrency value for per-task limit so merge parallelism is preserved while
bounding reader-open bursts.
In `@pkg/lightning/backend/external/merge.go`:
- Around line 40-41: Update the in-file worked memory estimate comment to
reflect the reduced MaxMergingFilesPerThread value (MaxMergingFilesPerThread =
64) so the example matches current constants; locate the existing worked memory
estimate comment in the same file (merge.go) and change the numeric example from
250 to an appropriate value derived from 64 (or recalc the example using
MaxMergingFilesPerThread) and clarify the units so future readers/operators see
an accurate, maintainable performance trade-off tied to
MaxMergingFilesPerThread.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: dccadb49-3c70-4be0-b39a-c0978ed7cf67
📒 Files selected for processing (2)
pkg/dxf/importinto/task_executor.go
pkg/lightning/backend/external/merge.go
```go
	1,
	false,
	onDup,
)

// ...

if err = external.MergeOverlappingFiles(
	wctx,
	sm.DataFiles,
	int(m.GetResource().CPU.Capacity()), // the concurrency used to split subtask
	1, // run merge tasks serially
```
This change serializes merge execution, not just reader-open bursts.
Setting both concurrencies to 1 fully disables merge parallelism and can significantly increase merge wall time. If the goal is to cap startup open pressure to 64, keep merge-task concurrency configurable (or CPU-based) and only bound per-task reader-open concurrency.
As per coding guidelines, important performance trade-offs should be explicit and maintainable for future readers.
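The reviewer's suggestion above (keep task-level parallelism CPU-based, and bound only the per-task reader-open cap) can be sketched as follows. This is an illustrative sketch, not the TiDB API: `mergeConcurrency`, `taskConcurrency`, and `readerOpenConcurrency` are hypothetical names introduced here.

```go
package main

import (
	"fmt"
	"runtime"
)

// mergeConcurrency derives task-level merge parallelism from the available
// CPUs, capped by the number of file groups so idle workers are never spawned.
// Illustrative only; the real value would come from config or resource limits.
func mergeConcurrency(fileGroups int) int {
	c := runtime.NumCPU()
	if c > fileGroups {
		c = fileGroups
	}
	if c < 1 {
		c = 1
	}
	return c
}

func main() {
	// Task-level parallelism stays CPU-based...
	taskConcurrency := mergeConcurrency(8)
	// ...while the per-task reader-open cap is pinned to 1 to limit
	// object-storage open bursts, as the comment suggests.
	readerOpenConcurrency := 1
	fmt.Println(taskConcurrency, readerOpenConcurrency)
}
```

The point of the split is that serializing reader opens bounds the storage burst without giving up merge-task parallelism.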
```go
// single thread.
MaxMergingFilesPerThread = 64
```
Update the in-file memory example to match the new merge-file cap.
After reducing MaxMergingFilesPerThread to 64, the worked memory estimate comment (later in this file) still uses 250, which is now misleading for operators/debugging.
As per coding guidelines, comments should capture important performance trade-offs and stay maintainable for future readers.
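One way to keep such a worked estimate from drifting is to derive it from the constant itself. The sketch below assumes a hypothetical 2 MiB per-reader buffer purely for illustration; the real per-reader buffer size lives in the external package and may differ.

```go
package main

import "fmt"

const (
	// maxMergingFilesPerThread mirrors the reduced cap from merge.go.
	maxMergingFilesPerThread = 64
	// readBufSizeMiB is a placeholder value; the actual per-reader
	// buffer size is defined elsewhere and may differ.
	readBufSizeMiB = 2
)

func main() {
	// Tie the worked memory example to the constant so the comment
	// cannot go stale when the cap changes again (as it did, 250 -> 64).
	fmt.Printf("~%d MiB per merging thread\n",
		maxMergingFilesPerThread*readBufSizeMiB)
}
```

With these assumed numbers the estimate comes to ~128 MiB per merging thread; the real figure should be recomputed from the package's actual buffer sizes.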
Codecov Report

❌ Patch coverage is … Additional details and impacted files:

```
@@             Coverage Diff              @@
##            master    #68135      +/-   ##
================================================
- Coverage   77.7620%  77.2375%   -0.5246%
================================================
  Files          1990      1985         -5
  Lines        551774    558722      +6948
================================================
+ Hits         429071    431543      +2472
- Misses       121783    126386      +4603
+ Partials        920       793       -127
================================================
```
Flags with carried forward coverage won't be shown. Click here to find out more.
Actionable comments posted: 2
🧹 Nitpick comments (1)
pkg/lightning/backend/local/import_runtime_stats.go (1)
41-47: 💤 Low value

The `sendCnt` parameter is always identical to `jobCnt` at the call site; consider collapsing. In `local.go` the call is always `addGenerateStats(len(jobs), generateDur, time.Since(sendStart), len(jobs))`, meaning `generateJobCnt` and `sendJobToWorkerChCnt` in `zapFields()` will always be equal, which adds noise to the log output without providing additional signal. Either drop the separate `sendCnt` tracking or document why the two counts can diverge.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/lightning/backend/local/import_runtime_stats.go` around lines 41 - 47, The addGenerateStats function records both generateJobCnt and sendJobToWorkerChCnt but sendCnt is always passed as len(jobs) at the call site, so collapse the duplicate metric: remove the sendCnt parameter from addGenerateStats and either (a) increment sendJobToWorkerChCnt from jobCnt inside addGenerateStats (use the existing jobCnt value) or (b) drop sendJobToWorkerChCnt entirely and remove its usage in zapFields (and any related metric registration). Update all callers (e.g., the call in local.go that currently does addGenerateStats(len(jobs), generateDur, time.Since(sendStart), len(jobs))) to the new signature and adjust zapFields to avoid logging a redundant count or add a short comment if you intentionally keep both because they may diverge in future.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/lightning/backend/external/engine.go`:
- Around line 414-419: The logged "loadedKVs" value is taken from
len(e.memKVsAndBuffers.kvs) after that slice is cleared, so it will always be
zero; fix by capturing the count before the cleanup (e.g., assign loadedKVsCount
:= len(e.memKVsAndBuffers.kvs) before the slice is cleared) and then use that
captured loadedKVsCount in the log call that builds "load range batch done"
(replace the current len(...) usage with the saved variable).
In `@pkg/lightning/backend/local/import_runtime_stats.go`:
- Around line 1-88: Add the new source file import_runtime_stats.go to the
go_library srcs in the pkg/lightning/backend/local BUILD.bazel so Bazel can see
the definitions (e.g., importRuntimeStats, addGenerateStats, zapFields); open
the BUILD.bazel's go_library that currently lists "iterator.go" and
"job_worker.go" and insert "import_runtime_stats.go", into the srcs list in
alphabetical order between "iterator.go" and "job_worker.go" and then run bazel
build to verify the undefined symbol errors are resolved.
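The engine.go fix described above (capture the KV count before the cleanup clears the slice) can be sketched as follows. The names `loadAndCount` and `loadedKVsCount` are illustrative, not the real engine fields.

```go
package main

import "fmt"

// loadAndCount demonstrates the snapshot-before-cleanup pattern: the count
// is captured while the slice is still populated, then the cleanup clears
// the backing slice, and the saved value is what gets logged.
func loadAndCount(kvs []int) int {
	loadedKVsCount := len(kvs) // capture before cleanup
	kvs = kvs[:0]              // cleanup: the slice is cleared here
	_ = kvs
	// the "load range batch done" log line would use loadedKVsCount,
	// not len(kvs), which is now always zero
	return loadedKVsCount
}

func main() {
	fmt.Println(loadAndCount([]int{10, 20, 30}))
}
```

Without the snapshot, the logged value is taken after the clear and is always zero, which is exactly the bug the review comment flags.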
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 1e759cda-62cf-410a-8a6f-adf2758441aa
📒 Files selected for processing (4)
pkg/lightning/backend/external/engine.go
pkg/lightning/backend/local/import_runtime_stats.go
pkg/lightning/backend/local/job_worker.go
pkg/lightning/backend/local/local.go
pkg/lightning/backend/local/import_runtime_stats.go (new file under review):

```go
// Copyright 2026 PingCAP, Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//	http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package local

import (
	"time"

	"go.uber.org/atomic"
	"go.uber.org/zap"
)

// importRuntimeStats tracks ingest runtime breakdown to locate bottlenecks.
type importRuntimeStats struct {
	generateBatchCnt        atomic.Int64
	generateJobCnt          atomic.Int64
	generateJobsDuration    atomic.Int64
	sendJobToWorkerChCnt    atomic.Int64
	sendJobToWorkerChDur    atomic.Int64
	workerJobRunCnt         atomic.Int64
	workerJobRunDuration    atomic.Int64
	workerWriteCnt          atomic.Int64
	workerWriteDuration     atomic.Int64
	workerIngestCnt         atomic.Int64
	workerIngestDuration    atomic.Int64
	retryableWriteErrorCnt  atomic.Int64
	retryableIngestErrorCnt atomic.Int64
}

func (s *importRuntimeStats) addGenerateStats(jobCnt int, genDur, sendDur time.Duration, sendCnt int) {
	s.generateBatchCnt.Inc()
	s.generateJobCnt.Add(int64(jobCnt))
	s.generateJobsDuration.Add(genDur.Nanoseconds())
	s.sendJobToWorkerChCnt.Add(int64(sendCnt))
	s.sendJobToWorkerChDur.Add(sendDur.Nanoseconds())
}

func (s *importRuntimeStats) addWorkerJobRunDur(d time.Duration) {
	s.workerJobRunCnt.Inc()
	s.workerJobRunDuration.Add(d.Nanoseconds())
}

func (s *importRuntimeStats) addWriteDur(d time.Duration) {
	s.workerWriteCnt.Inc()
	s.workerWriteDuration.Add(d.Nanoseconds())
}

func (s *importRuntimeStats) addIngestDur(d time.Duration) {
	s.workerIngestCnt.Inc()
	s.workerIngestDuration.Add(d.Nanoseconds())
}

func (s *importRuntimeStats) incRetryableWriteErr() {
	s.retryableWriteErrorCnt.Inc()
}

func (s *importRuntimeStats) incRetryableIngestErr() {
	s.retryableIngestErrorCnt.Inc()
}

func (s *importRuntimeStats) zapFields() []zap.Field {
	return []zap.Field{
		zap.Int64("generate-batches", s.generateBatchCnt.Load()),
		zap.Int64("generated-jobs", s.generateJobCnt.Load()),
		zap.Duration("generate-jobs-total", time.Duration(s.generateJobsDuration.Load())),
		zap.Int64("send-job-to-worker-ch-count", s.sendJobToWorkerChCnt.Load()),
		zap.Duration("send-job-to-worker-ch-total", time.Duration(s.sendJobToWorkerChDur.Load())),
		zap.Int64("worker-job-run-count", s.workerJobRunCnt.Load()),
		zap.Duration("worker-job-run-total", time.Duration(s.workerJobRunDuration.Load())),
		zap.Int64("worker-write-count", s.workerWriteCnt.Load()),
		zap.Duration("worker-write-total", time.Duration(s.workerWriteDuration.Load())),
		zap.Int64("worker-ingest-count", s.workerIngestCnt.Load()),
		zap.Duration("worker-ingest-total", time.Duration(s.workerIngestDuration.Load())),
		zap.Int64("retryable-write-errors", s.retryableWriteErrorCnt.Load()),
		zap.Int64("retryable-ingest-errors", s.retryableIngestErrorCnt.Load()),
	}
}
```
🧩 Analysis chain

🏁 Script executed:

```shell
#!/bin/bash
# Find BUILD.bazel in the local package and check if import_runtime_stats.go is listed
fd -e bazel "BUILD" pkg/lightning/backend/local/ | xargs grep -l "go_library" 2>/dev/null | head -5
echo "--- current srcs in BUILD.bazel ---"
fd -e bazel "BUILD" pkg/lightning/backend/local/ | xargs grep -A 80 "go_library" | grep "\.go"
```

Repository: pingcap/tidb
Length of output: 590

🏁 Script executed:

```shell
cat pkg/lightning/backend/local/BUILD.bazel | grep -A 200 "go_library" | head -150
```

Repository: pingcap/tidb
Length of output: 5023

🏁 Script executed:

```shell
ls -la pkg/lightning/backend/local/import_runtime_stats.go 2>&1
```

Repository: pingcap/tidb
Length of output: 157
Add import_runtime_stats.go to the srcs list in pkg/lightning/backend/local/BUILD.bazel.
The file exists but is missing from the go_library targets' srcs declaration, causing Bazel builds to fail with undefined symbol errors. Insert "import_runtime_stats.go", in alphabetical order between "iterator.go" and "job_worker.go".
[FORMAT CHECKER NOTIFICATION] 📖 For more info, you can check the "Contribute Code" section in the development guide.
@joechenrh: The following tests failed. Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
What problem does this PR solve?
Issue Number: ref #00000
Problem Summary:

For `IMPORT INTO` global-sort merge-sort workloads with many overlap files, concurrent merge workers can create a high burst of object-storage reads and cause unstable throughput.

What changed and how does it work?

- `NewMergeOperator(..., concurrency=1, ...)`
- `MergeOverlappingFiles(..., concurrency=1, ...)`
- `MaxMergingFilesPerThread` reduced from `250` to `64`

Net effect: merge runs serially over groups of at most 64 files, bounding the burst of concurrent object-storage reads.
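The grouping effect of the new cap can be sketched as below. `groupFiles` is an illustrative helper, not the external package's actual code; it only shows how a file list splits into at-most-64-file groups that are then merged one group at a time.

```go
package main

import "fmt"

// groupFiles splits data files into fixed-size groups. With the new
// MaxMergingFilesPerThread = 64, each serial merge step handles at most
// 64 files, so only that many readers are ever open at once.
func groupFiles(files []string, maxPerGroup int) [][]string {
	var groups [][]string
	for len(files) > 0 {
		n := maxPerGroup
		if n > len(files) {
			n = len(files)
		}
		groups = append(groups, files[:n])
		files = files[n:]
	}
	return groups
}

func main() {
	files := make([]string, 150) // e.g. 150 overlap files
	groups := groupFiles(files, 64)
	fmt.Println(len(groups), len(groups[0]), len(groups[2]))
}
```

With 150 files this yields three groups of 64, 64, and 22 files, merged serially rather than all at once.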
Check List
Tests
- `go test ./pkg/lightning/backend/external -run 'TestMergeKVIter|TestMergeKVIterPassWrongParam|TestSplitDataFiles|TestByteReader|TestByteReaderAuxBuf' -count=1`
- `go test ./pkg/dxf/importinto -run 'TestGetOnDupForKVGroup' -count=1`
- `IMPORT INTO` global-sort: compare merge-sort throughput/latency stability and object-storage retry/reset behavior before/after.

Side effects
Documentation
Release note