Skip to content

Conversation

@dekuu5
Copy link

@dekuu5 dekuu5 commented Jan 29, 2026

Which issue does this PR close?

Rationale for this change

The SpillPool test spill::spill_pool::channel was failing indeterministically due to a race condition between SpillFile's coordination logic and the spawn_buffered background task used in the stream reader.

What changes are included in this PR?

read_spill_as_stream now returns a normal stream not a buffered stream

Are these changes tested?

No, the fix is for an existing test mentioned in the issue. I wasn't able to find the bug initially, but I found it by stress-testing the original test 100 times in parallel.

Are there any user-facing changes?

no

i still think we should add a buffer stream that is aware of the coordination of the writer and reader but i can't warp my head around it yet.

Signed-off-by: Ahmed hossam <ahmed.hossambahig@gmail.com>
Copilot AI review requested due to automatic review settings January 29, 2026 18:27
@github-actions github-actions bot added the physical-plan Changes to the physical-plan crate label Jan 29, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes an indeterministic test failure in SpillPool caused by a race condition between the writer and reader coordination logic when using a buffered stream.

Changes:

  • Removed the spawn_buffered wrapper from read_spill_as_stream to return an unbuffered stream
  • Removed the unused import of spawn_buffered from spill_manager.rs

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Member

@martin-g martin-g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is not too complex maybe add a test case verifying that there is no racing any more

max_record_batch_memory,
)));

Ok(spawn_buffered(stream, self.batch_read_buffer_capacity))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems batch_read_buffer_capacity is no more used.
It could be deprecated or maybe even removed.
https://github.com/dekuu5/datafusion/blob/1b8ef43fdd1424a3e4fe2db213fec4e7228788b0/datafusion/physical-plan/src/sorts/multi_level_merge.rs#L276 is a no-op

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

physical-plan Changes to the physical-plan crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Indeterministic test failure in SpillPool

2 participants