Optimize indexed OR/Contains predicates to prevent cross-collection scan leakage in single-file mode #66
…n scan fallback Agent-Logs-Url: https://github.com/EntglDb/BLite/sessions/cb144a09-ce2a-422f-864e-23e1bf3a2320 Co-authored-by: mrdevrobot <12503462+mrdevrobot@users.noreply.github.com>
Pull request overview
This PR extends BLite’s query optimizer and execution paths to keep indexed OR-of-equality and Contains (“IN”) predicates on B-Tree index probes, avoiding fallback BSON scans that could materialize cross-collection “phantom” rows in single-file mode.
Changes:
- Add `InValues` support to `IndexOptimizer` to represent discrete multi-key probes for `OR`-equality and `Contains` predicates.
- Update `DocumentCollection.FetchAsync` and `CountByPredicateAsync` to execute `InValues` as per-key index probes (no BSON scan fallback).
- Add optimizer unit tests and regression tests validating cross-collection isolation for these query shapes.
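
The idea behind these changes can be illustrated with a small, language-neutral sketch. It is written in Python rather than BLite's C#, and every name in it (`Eq`, `Or`, `In`, `to_in_values`, `fetch_by_probes`, the dict-based index) is hypothetical and not taken from the BLite source. It only shows the general shape: an OR-of-equality or IN predicate on one indexed field is flattened into a list of probe keys, and rows are then fetched with one index lookup per key instead of a scan.

```python
from dataclasses import dataclass

# Hypothetical predicate nodes; not BLite types.
@dataclass(frozen=True)
class Eq:
    field: str
    value: object

@dataclass(frozen=True)
class In:
    field: str
    values: tuple

@dataclass(frozen=True)
class Or:
    left: object
    right: object

def to_in_values(pred, indexed_field):
    """Return probe keys if pred is an OR-of-equality / IN on indexed_field, else None."""
    if isinstance(pred, Eq):
        return [pred.value] if pred.field == indexed_field else None
    if isinstance(pred, In):
        return list(pred.values) if pred.field == indexed_field else None
    if isinstance(pred, Or):
        left = to_in_values(pred.left, indexed_field)
        right = to_in_values(pred.right, indexed_field)
        if left is None or right is None:
            return None  # one branch is not indexable on this field
        return left + right
    return None

def fetch_by_probes(index, keys):
    """One lookup per key against this collection's own index (modelled as a dict)."""
    for key in keys:
        yield from index.get(key, [])

orders_index = {"open": ["o1", "o3"], "paid": ["o2"]}
pred = Or(Eq("status", "open"), In("status", ("paid", "void")))
keys = to_in_values(pred, "status")
rows = list(fetch_by_probes(orders_index, keys))
```

Because every row comes from a lookup in the collection's own index, a plan of this shape has no step at which documents belonging to another collection could be read, which is the isolation property the regression tests are described as checking.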
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tests/BLite.Tests/IndexOptimizationTests.cs | Adds unit tests asserting OR and Contains optimize to InValues. |
| tests/BLite.Tests/CrossCollectionQueryIsolationTests.cs | New regression tests reproducing prior cross-collection leakage scenario and validating isolation. |
| src/BLite.Core/Query/IndexOptimizer.cs | Introduces OptimizationResult.InValues, OR merge to multi-probe, and Contains→IN predicate parsing. |
| src/BLite.Core/Collections/DocumentCollection.cs | Executes InValues plans via per-key index probes for fetch and count to avoid scan fallback. |
@copilot add corrections based on comments in this PR
@copilot apply changes based on the comments in this thread
Agent-Logs-Url: https://github.com/EntglDb/BLite/sessions/38548de0-4db8-4009-bf02-5519c136a64c Co-authored-by: mrdevrobot <12503462+mrdevrobot@users.noreply.github.com>
Applied the requested review-thread corrections in commit
Done in commit
Warning: Firewall rules blocked me from connecting to one or more addresses.
…glDb/BLite into copilot/fix-bson-scan-issues
- … `Contains` IN optimization type support to include indexable key types (`Guid`, `byte[]`).
- … `InValues` in optimizer while preserving source order.
- `FetchAsync` and `CountByPredicateAsync` use distinct probe keys to prevent duplicate yields/over-counting.
- … `Contains` values and Guid/byte[] IN optimization behavior.
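
A brief sketch of why distinct probe keys matter, again in Python as a language-neutral illustration rather than BLite's actual C# code. The helper names (`distinct_in_order`, `count_by_probes`) and the dict-based indexes are assumptions made for this example. If the same key appears twice in a `Contains` list and is probed twice, its rows are yielded or counted twice; deduplicating the keys while keeping first-occurrence order avoids that.

```python
import uuid

def distinct_in_order(values):
    """Drop repeated probe keys, keeping each key's first position."""
    # dict preserves insertion order (Python 3.7+)
    return list(dict.fromkeys(values))

def count_by_probes(index, values):
    """Count matches with one index probe per distinct key."""
    return sum(len(index.get(key, ())) for key in distinct_in_order(values))

a, b = uuid.UUID(int=1), uuid.UUID(int=2)
guid_index = {a: ["doc1", "doc2"], b: ["doc3"]}   # GUID-keyed index
blob_index = {b"\x01": ["doc4"]}                  # byte-string-keyed index

probe_values = [a, b, a]  # key `a` is listed twice
naive = sum(len(guid_index.get(k, ())) for k in probe_values)  # probes `a` twice
deduped = count_by_probes(guid_index, probe_values)
```

Here `naive` over-counts because key `a` is probed twice, while `deduped` probes each key once. Python's `bytes` compare by content, so the byte-string case deduplicates without extra work; in C#, `byte[]` instances compare by reference by default, so an equivalent implementation would presumably need a content-based comparer.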