Optimize indexed OR/Contains predicates to prevent cross-collection scan leakage in single-file mode#66

Merged
mrdevrobot merged 7 commits into main from copilot/fix-bson-scan-issues on Apr 23, 2026

Copilot AI commented Apr 21, 2026

  • Reproduce current branch state and run baseline build/tests to identify unrelated failures.
  • Extend the Contains IN optimization to support the indexable key types Guid and byte[].
  • De-duplicate InValues in optimizer while preserving source order.
  • Ensure FetchAsync and CountByPredicateAsync use distinct probe keys to prevent duplicate yields/over-counting.
  • Add focused regression tests for duplicate Contains values and Guid/byte[] IN optimization behavior.
  • Run targeted tests for touched query optimizer/collection paths.
  • Run final validation + reply to new PR comments with commit hash.
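The over-counting risk in the fourth item is easiest to see with a toy index. The sketch below is illustrative only: `ProbeCountSketch` and its dictionary-backed index are hypothetical stand-ins, not BLite's B-Tree or its `CountByPredicateAsync` implementation. It assumes a count is computed by summing the matches of each probe key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an index: key -> ids of the documents holding that key.
public static class ProbeCountSketch
{
    public static int CountByProbes(
        IReadOnlyDictionary<int, List<int>> index,
        IEnumerable<int> probeKeys,
        bool distinctKeys)
    {
        // Probing the same key twice adds its matches twice,
        // so the keys are collapsed before probing when requested.
        var keys = distinctKeys ? probeKeys.Distinct() : probeKeys;

        var count = 0;
        foreach (var key in keys)
        {
            if (index.TryGetValue(key, out var ids))
                count += ids.Count;
        }
        return count;
    }
}
```

With probe keys 30, 30, 40 against an index holding two documents for 30 and one for 40, the non-distinct path counts 5 while the distinct path counts 3.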

Copilot AI and others added 2 commits April 21, 2026 21:31
…n scan fallback

Agent-Logs-Url: https://github.com/EntglDb/BLite/sessions/cb144a09-ce2a-422f-864e-23e1bf3a2320

Co-authored-by: mrdevrobot <12503462+mrdevrobot@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix queries with OR/Contains to avoid BSON scan errors" to "Optimize indexed OR/Contains predicates to prevent cross-collection scan leakage in single-file mode" on Apr 21, 2026
Copilot AI requested a review from mrdevrobot April 21, 2026 21:36
@mrdevrobot mrdevrobot requested a review from Copilot April 22, 2026 19:48

Copilot AI left a comment

Pull request overview

This PR extends BLite’s query optimizer and execution paths to keep indexed OR-of-equality and Contains (“IN”) predicates on B-Tree index probes, avoiding fallback BSON scans that could materialize cross-collection “phantom” rows in single-file mode.

Changes:

  • Add InValues support to IndexOptimizer to represent discrete multi-key probes for OR-equality and Contains predicates.
  • Update DocumentCollection.FetchAsync and CountByPredicateAsync to execute InValues as per-key index probes (no BSON scan fallback).
  • Add optimizer unit tests and regression tests validating cross-collection isolation for these query shapes.
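As a rough illustration of the first change, the sketch below merges two equality plans joined by OR into a single multi-key plan. `PlanSketch` and `OrMergeSketch` are hypothetical names invented here; the real `OptimizationResult` in `IndexOptimizer.cs` may be shaped quite differently.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified plan: one indexed property plus the keys to probe.
public sealed class PlanSketch
{
    public PlanSketch(string property, IReadOnlyList<object> inValues)
    {
        Property = property;
        InValues = inValues;
    }

    public string Property { get; }
    public IReadOnlyList<object> InValues { get; }
}

public static class OrMergeSketch
{
    // Merges `left OR right` into one multi-key plan.
    // Returns null when the sides target different properties,
    // because a single index could not serve both of them.
    public static PlanSketch? MergeOr(PlanSketch left, PlanSketch right)
    {
        if (!string.Equals(left.Property, right.Property, StringComparison.Ordinal))
            return null;

        // Default equality is enough for ints and strings;
        // byte[] keys would need a value-aware comparer instead.
        var seen = new HashSet<object>();
        var merged = new List<object>();
        foreach (var value in left.InValues.Concat(right.InValues))
        {
            if (seen.Add(value))
                merged.Add(value);
        }
        return new PlanSketch(left.Property, merged);
    }
}
```

Under these assumptions, `Age == 30 || Age == 40 || Age == 30` reduces to one plan on `Age` with keys 30 and 40, in source order, which an executor could then run as two index probes rather than a scan.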

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/BLite.Tests/IndexOptimizationTests.cs | Adds unit tests asserting OR and Contains optimize to InValues. |
| tests/BLite.Tests/CrossCollectionQueryIsolationTests.cs | New regression tests reproducing prior cross-collection leakage scenario and validating isolation. |
| src/BLite.Core/Query/IndexOptimizer.cs | Introduces OptimizationResult.InValues, OR merge to multi-probe, and Contains→IN predicate parsing. |
| src/BLite.Core/Collections/DocumentCollection.cs | Executes InValues plans via per-key index probes for fetch and count to avoid scan fallback. |
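For readers unfamiliar with how a Contains call can be recognized as an IN predicate, here is a minimal sketch built on `System.Linq.Expressions`. It is not the parser in `IndexOptimizer.cs`: `ContainsParseSketch` and `PersonSketch` are hypothetical, and the sketch only accepts a constant or captured collection whose tested item is a direct member of the lambda parameter.

```csharp
#nullable enable
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical document type used only for the example.
public sealed class PersonSketch
{
    public int Age { get; set; }
}

public static class ContainsParseSketch
{
    // Tries to read `values.Contains(x.Member)` as (member name, values).
    public static bool TryParseContains<T>(
        Expression<Func<T, bool>> predicate,
        out string memberName,
        out List<object> values)
    {
        memberName = "";
        values = new List<object>();

        if (!(predicate.Body is MethodCallExpression call) || call.Method.Name != "Contains")
            return false;

        // Instance form: list.Contains(x.P). Static form: Enumerable.Contains(source, x.P).
        Expression source;
        Expression item;
        if (call.Object != null && call.Arguments.Count == 1)
        {
            source = call.Object;
            item = call.Arguments[0];
        }
        else if (call.Object == null && call.Arguments.Count == 2)
        {
            source = call.Arguments[0];
            item = call.Arguments[1];
        }
        else
        {
            return false;
        }

        // The tested item must be a member read directly off the lambda parameter.
        if (!(item is MemberExpression member) || member.Expression != predicate.Parameters[0])
            return false;

        if (!TryReadCaptured(source, out var captured))
            return false;
        if (captured is string || !(captured is IEnumerable sequence))
            return false;

        foreach (var value in sequence)
        {
            if (value != null)
                values.Add(value);
        }
        memberName = member.Member.Name;
        return true;
    }

    // Reads a constant, or a local captured in a closure field, without compiling anything.
    private static bool TryReadCaptured(Expression expression, out object? value)
    {
        if (expression is ConstantExpression constant)
        {
            value = constant.Value;
            return true;
        }
        if (expression is MemberExpression { Expression: ConstantExpression closure, Member: FieldInfo field })
        {
            value = field.GetValue(closure.Value);
            return true;
        }
        value = null;
        return false;
    }
}
```

Given a captured `List<int> ages` holding 30 and 40, the predicate `p => ages.Contains(p.Age)` parses to member `Age` with those two values, while a predicate such as `p => p.Age == 30` is rejected. The extracted values would still need de-duplication before being used as probe keys.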

Comment thread src/BLite.Core/Query/IndexOptimizer.cs
Comment thread src/BLite.Core/Collections/DocumentCollection.cs
Comment thread src/BLite.Core/Collections/DocumentCollection.cs
Comment thread src/BLite.Core/Query/IndexOptimizer.cs Outdated
@mrdevrobot

@copilot add corrections based on comments in this PR

@mrdevrobot

@copilot apply changes based on the comments in this thread

Agent-Logs-Url: https://github.com/EntglDb/BLite/sessions/38548de0-4db8-4009-bf02-5519c136a64c

Co-authored-by: mrdevrobot <12503462+mrdevrobot@users.noreply.github.com>

Copilot AI commented Apr 23, 2026

> @copilot add corrections based on comments in this PR

Applied the requested review-thread corrections in commit 5092a7f: added Guid/byte[] IN key support, deduped optimizer/runtime IN probes (including byte[] value-aware dedupe), and added regression tests for duplicate Contains values and Guid/byte[] optimization paths.

> @copilot apply changes based on the comments in this thread

Done in commit 5092a7f. The changes from that review thread are now applied and validated with targeted + full test runs.
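The byte[] value-aware de-duplication mentioned in the reply above matters because .NET arrays compare by reference by default, so two byte[] keys with identical contents would otherwise both survive. The following is a minimal sketch of one way to handle that while preserving source order; `KeyValueComparerSketch` and `InValuesDedupeSketch` are hypothetical names and do not reflect the code in commit 5092a7f.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Compares byte[] keys by content and every other key with its default equality.
public sealed class KeyValueComparerSketch : IEqualityComparer<object>
{
    bool IEqualityComparer<object>.Equals(object? x, object? y)
    {
        if (x is byte[] a && y is byte[] b)
            return a.AsSpan().SequenceEqual(b);
        return object.Equals(x, y);
    }

    int IEqualityComparer<object>.GetHashCode(object obj)
    {
        if (obj is byte[] bytes)
        {
            // Hash the contents so equal arrays land in the same bucket.
            var hash = new HashCode();
            foreach (var b in bytes)
                hash.Add(b);
            return hash.ToHashCode();
        }
        return obj.GetHashCode();
    }
}

public static class InValuesDedupeSketch
{
    // Keeps the first occurrence of each key and preserves source order.
    public static List<object> DistinctInOrder(IEnumerable<object> values)
    {
        var seen = new HashSet<object>(new KeyValueComparerSketch());
        var result = new List<object>();
        foreach (var value in values)
        {
            if (seen.Add(value))
                result.Add(value);
        }
        return result;
    }
}
```

In this sketch, a key list containing two separate `byte[] { 1, 2 }` instances, the integer 7 twice, and the same Guid twice collapses to three keys in their original order.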

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • av-build-tel-api-v1.avaloniaui.net
    • Triggering command: /usr/share/dotnet/dotnet dotnet exec --runtimeconfig /home/REDACTED/.nuget/packages/avalonia.buildservices/11.3.2/tools/netstandard2.0/runtimeconfig.json /home/REDACTED/.nuget/packages/avalonia.buildservices/11.3.2/tools/netstandard2.0/Avalonia.BuildServices.Collector.dll (dns block)

Copilot AI requested a review from mrdevrobot April 23, 2026 07:24
@mrdevrobot mrdevrobot marked this pull request as ready for review April 23, 2026 08:07
@mrdevrobot mrdevrobot merged commit b23b019 into main Apr 23, 2026
1 check passed

Development

Successfully merging this pull request may close these issues.

Queries with OR/Contains fall back to BSON scan and return cross-collection phantom objects in single-file mode

3 participants