InsertBulkAsync fails with "Not enough space" due to an unrecoverable desync between in-memory FSI and physical page (v4.3) #58

@LeoYang06

Package version

4.3.0

Affected package

BLite (client SDK)

.NET version

10.0

Description

Hi! We are performing heavy load testing on the v4.3 release. During a massive data ingestion process, we encountered a recurring exception that completely blocks further inserts.

It appears that there is a state desync between the in-memory Free Space Index (FSI) and the transactional Physical Page Layout.

The Exception Log:

System.InvalidOperationException: Not enough space: need 609, have 14 | PageId=14251 | SlotCount=62 | Start=520 | End=534 | FSI=3273
   at BLite.Core.Collections.DocumentCollection`2.InsertIntoPage(UInt32 pageId, Byte[] data, ITransaction transaction)
   at BLite.Core.Collections.DocumentCollection`2.InsertDataCore(...)
   at BLite.Core.Collections.DocumentCollection`2.InsertBulkInternal(...)

Objective Observations from the Logs:
Our application logs show this exact same exception firing repeatedly over several seconds on subsequent insert attempts. The reported page state (PageId=14251, SlotCount=62, Start=520, End=534, FSI=3273) is identical across all failures, so nothing changes between attempts.

Based on the DocumentCollection.cs source code, this creates a "poisoned cache loop":

  1. The allocator queries _fsi, which claims the page has 3273 bytes free, and routes the document to PageId=14251.
  2. InsertIntoPage reads the page from storage. The physical header shows only 14 bytes available (End 534 - Start 520 = 14), far less than the 609 bytes required.
  3. It throws an InvalidOperationException at line 1107 because freeSpace < requiredSpace.
  4. The transaction aborts. However, because the insert threw, the in-memory _fsi entry is never corrected. The next insert queries _fsi again, is routed to the same page, and fails with the same exception.
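
To make the loop concrete, here is a minimal, self-contained sketch of the four steps above, together with one possible mitigation: correcting the stale FSI entry from the physical header when the mismatch is detected. All type and member names here (PageHeader, FreeSpaceIndexSketch, InsertSketch.TryInsert) are hypothetical stand-ins we made up for illustration. They are not BLite's actual types, and the real allocator may work differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a page header; not BLite's real layout.
public sealed record PageHeader(int FreeSpaceStart, int FreeSpaceEnd)
{
    // Physical free space, as in the exception: End - Start.
    public int FreeBytes => FreeSpaceEnd - FreeSpaceStart;
}

// Hypothetical in-memory free space index: pageId -> cached free bytes.
public sealed class FreeSpaceIndexSketch
{
    private readonly Dictionary<uint, int> _freeBytes = new();

    public void Update(uint pageId, int freeBytes) => _freeBytes[pageId] = freeBytes;

    // Returns a page whose cached free space fits, or null if none does.
    public uint? FindPage(int requiredBytes)
    {
        foreach (var (pageId, free) in _freeBytes)
        {
            if (free >= requiredBytes) return pageId;
        }
        return null;
    }
}

public static class InsertSketch
{
    // Returns true if the document fits in the physical page. Otherwise it
    // overwrites the stale FSI entry with the physical value before reporting
    // failure, so the next lookup no longer selects the same page.
    public static bool TryInsert(
        FreeSpaceIndexSketch fsi, uint pageId, PageHeader header, int requiredBytes)
    {
        if (header.FreeBytes >= requiredBytes) return true;

        fsi.Update(pageId, header.FreeBytes); // assumed mitigation: self-heal the cache
        return false;
    }
}
```

With the values from our log (FSI=3273, Start=520, End=534, need 609), FindPage(609) first returns page 14251, TryInsert fails because only 14 bytes are physically free, and a second FindPage(609) then returns null instead of routing to the same page again. Without the Update call inside TryInsert, the second lookup would return 14251 again, which matches the behavior we observe.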

Possible Context (for your reference):
Since _fsi is an in-memory structure and the physical pages are transaction-isolated, we suspect this desync can happen when a transaction performs operations that increase a page's FSI value (e.g., Deletes or Updates) and is subsequently rolled back. The physical page correctly reverts via WAL, but the in-memory _fsi.Update() call might not be reverted, leaving the FSI entry overly optimistic.
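
If the rollback theory holds, one way to keep the two in sync would be to stage FSI changes per transaction and publish them only on commit. The sketch below illustrates that idea under our own assumptions: TransactionalFsiSketch, Stage, Commit, and Rollback are hypothetical names, transactions are identified by a plain ulong, and there is no locking, so it is single-threaded only. It does not reflect how BLite's FSI or ITransaction are actually implemented.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of a rollback-aware free space index.
public sealed class TransactionalFsiSketch
{
    // Values visible to the allocator: only committed state.
    private readonly Dictionary<uint, int> _committed = new();

    // Per-transaction staged changes: txId -> (pageId -> free bytes).
    private readonly Dictionary<ulong, Dictionary<uint, int>> _pending = new();

    // Records a change without making it visible to other readers.
    public void Stage(ulong txId, uint pageId, int freeBytes)
    {
        if (!_pending.TryGetValue(txId, out var changes))
        {
            changes = new Dictionary<uint, int>();
            _pending[txId] = changes;
        }
        changes[pageId] = freeBytes;
    }

    // Publishes the transaction's staged values into the committed view.
    public void Commit(ulong txId)
    {
        if (!_pending.Remove(txId, out var changes)) return;
        foreach (var (pageId, freeBytes) in changes)
        {
            _committed[pageId] = freeBytes;
        }
    }

    // Discards staged values, mirroring the physical page reverting via WAL.
    public void Rollback(ulong txId) => _pending.Remove(txId);

    // Returns the committed free bytes, or 0 for an unknown page.
    public int GetFreeBytes(uint pageId) =>
        _committed.TryGetValue(pageId, out var free) ? free : 0;
}
```

In this model, a transaction that deletes documents from page 14251 would stage the larger free-space value (say 3273), but a rollback drops that staged value and the allocator keeps seeing the last committed figure (14). The trade-off is that a transaction cannot reuse space it freed itself until it commits, unless lookups also consult that transaction's own pending entries.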

Could you please take a look at how FSI maintains consistency with page headers, especially during transaction rollbacks or concurrent modifications in v4.3?

Thanks for your continuous hard work on the engine!
