(feat): PSDB Layout 2 #266
Conversation
- Fix bool vector serialization: include partial byte in buffer when there are unfilled bits - Fix type mismatch in system.go: change FP32Vector default decoder from FP16Vector to Float32Vector - Add bitmap bounds checking: validate bitmap indices before access in deserialized_psdb_layout2.go - Check bounds in countSetBitsBefore() to prevent panic on undersized bitmaps - Check bounds in skipStringVectorsInDense() before bitmap access - Check bounds in GetNumericVectorFeature() and GetBoolVectorFeature() bitmap iteration Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- LLD_TEMPLATE_FOR_WORD.txt: Word-compatible template with 686 lines - WORD_IMPORT_GUIDE.txt: Step-by-step import instructions - lld_doc_generation_prompt.md: Comprehensive LLM prompt with 12 sections - LLD_TEMPLATE.md: Markdown version for GitHub documentation - COMMIT_ANALYSIS.md: Detailed analysis of 14 commits - README.md: Quick start guide for all usage workflows - INDEX.txt: Master navigation index - GENERATION_SUMMARY.txt: Feature highlights Covers PSDB Layout V2 & V3 bitmap-based sparse feature encoding. Ready for Word document creation or LLM-based generation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- psdb_shadow_compare.go: return pooled PSDB to pool on Build() error (pool leak fix) - system/system.go: fix GetFP32Vector using FP16 decoder for FP32 defaults - perm_storage_datablock_v3.go: fix sparse bool vector partial byte truncation, add setupBoolDtypeLastIdx; make setupHeadersLayout2 idempotent (overwrite instead of append on repeat calls) - cache_storage_datablock_v2.go: add uint16 overflow guard for payloads > 64KB - perm_storage_datablock_v2.go: fix comment (byte 8 bit 3 → byte 9 bit 0) - deserialized_psdb_v3.go: add bounds checks for pos in GetStringVectorFeature and GetBoolVectorFeature - layout_comparison_test.go: guard divide-by-zero in Conclusion block; use safe type assertion for DeserializedPSDBLayout2; remove hardcoded rationale numbers - redis.go: preserve lifetime-TTL (ExpiryAt==0) PSDBs when building Redis CSDBs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- cache_storage_datablock_v2_test.go: replace hardcoded 3-byte defaults with ddb.GetDataType().Size() for all GetNumericScalarFeature calls (14 sites)
- deserialized_psdb_v2_test.go: same fix for d.GetDataType().Size() (12 sites)
- handler/feature/persist.go: fix metric tags to use colon-separated format ("key:value") matching metric.TagAsString() convention
- deserialized_psdb_v3.go: add comments explaining dual bit-ordering in GetBoolVectorFeature (bitmap=LSB-first, dense payload=MSB-first)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
WalkthroughThis pull request introduces a Layout V2 serialization format for Permanent Storage Data Blocks with bitmap-based sparse feature encoding, abstract PSDBBlock interface to unify Layout V1/V2 deserialization, and shadow comparison infrastructure to evaluate Layout 2 benefits. API updates thread bitmap data through feature parsing and retrieval chains, while configuration extensions enable layout comparison sampling. Changes
🚥 Pre-merge checks | ✅ 1✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
♻️ Duplicate comments (2)
online-feature-store/internal/data/blocks/layout_comparison_results.txt (1)
1443-1443:⚠️ Potential issue | 🟡 MinorFix the double percent sign typo.
Line 1443 contains
0%%which should be0%.📝 Proposed fix
- • Overhead only in edge cases with 0%% defaults + • Overhead only in edge cases with 0% defaults🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@online-feature-store/internal/data/blocks/layout_comparison_results.txt` at line 1443, Replace the stray double-percent typo "0%%" in the text snippet "• Overhead only in edge cases with 0%% defaults" with a single percent sign so it reads "0%"; search for the literal "0%%" occurrence and update it to "0%" (no other changes).online-feature-store/internal/data/blocks/deserialized_psdb_v2.go (1)
339-354:⚠️ Potential issue | 🟡 MinorMissing bounds check for
posparameter persists.Per the past review comment,
GetNumericVectorFeaturestill lacks an explicit bounds check forpos < 0 || pos >= len(vectorLengths)before accessingvectorLengths[pos]at line 349. A negativeposwould cause the loop to never match and then panic.This was acknowledged but not addressed in this PR. Consider adding the defensive check:
🛡️ Proposed fix
func (dd *DeserializedPSDB) GetNumericVectorFeature(pos int, vectorLengths []uint16, defaultValue []byte) ([]byte, error) { + if pos < 0 || pos >= len(vectorLengths) { + return nil, fmt.Errorf("position %d out of bounds for vectorLengths (len=%d)", pos, len(vectorLengths)) + } data := dd.OriginalData size := dd.DataType.Size() var start int = 0🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@online-feature-store/internal/data/blocks/deserialized_psdb_v2.go` around lines 339 - 354, GetNumericVectorFeature is missing a defensive bounds check for the pos parameter; add an explicit check at the start of the function to verify pos >= 0 and pos < len(vectorLengths) and return a clear error (e.g. "position out of bounds") if it fails, before the loop or any access to vectorLengths[pos], so the later loop and the computation of end cannot panic when pos is negative or >= len(vectorLengths).
🧹 Nitpick comments (2)
online-feature-store/internal/data/blocks/perm_storage_datablock_v2.go (1)
59-86: Potential issue:Clear()readslayoutVersionbefore resetting it.The function captures
headerLenbased onp.layoutVersion(lines 60-63), then resetsp.layoutVersion = 0(line 64). This is correct for determining the current buffer trim size. However, ifClear()is called on a freshly allocated struct wherelayoutVersion == 0,headerLendefaults toPSDBLayout1HeaderBytes(9), which may not match the actual buffer size if it was pre-allocated for layout 2.This is likely fine since the builder initializes the buffer appropriately, but worth noting for defensive coding.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@online-feature-store/internal/data/blocks/perm_storage_datablock_v2.go` around lines 59 - 86, PermStorageDataBlock.Clear determines headerLen from p.layoutVersion which can be zero for newly allocated structs, causing an incorrect trim; update Clear (function PermStorageDataBlock.Clear) to infer headerLen defensively from the buffer instead of only from p.layoutVersion: if p.layoutVersion != 0 use it as now, otherwise inspect len(p.buf) and choose PSDBLayout2HeaderBytes when buf is large enough, fall back to PSDBLayout1HeaderBytes or to len(p.buf) if smaller; then proceed to reset p.layoutVersion and trim p.buf accordingly. Ensure references to PSDBLayout1HeaderBytes, PSDBLayout2HeaderBytes, p.layoutVersion and p.buf are used in the new logic.online-feature-store/internal/handler/feature/persist.go (1)
6-6: Consider usingmath/rand/v2or explicit seeding for deterministic behavior in tests.
math/randuses a global source that defaults to a deterministic seed (1) in Go <1.20 and auto-seeds in Go ≥1.20. For production sampling this is fine, but be aware that test determinism may vary across Go versions.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@online-feature-store/internal/handler/feature/persist.go` at line 6, The import of "math/rand" should be made explicit for deterministic tests: either switch the import to "math/rand/v2" and update any calls, or (preferably) make randomness injectable or explicitly seed the global source; locate uses of rand (e.g., rand.Intn, rand.Float64) in persist.go and then either (A) replace global calls with an injected *rand.Rand instance (created with rand.New(rand.NewSource(seed)) in tests) or (B) add an init() that calls rand.Seed(time.Now().UnixNano()) for production and ensure tests call rand.Seed(fixedSeed) or use rand.NewSource(fixedSeed) to guarantee deterministic behavior across Go versions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@online-feature-store/internal/data/blocks/deserialized_psdb_v2.go`:
- Around line 339-354: GetNumericVectorFeature is missing a defensive bounds
check for the pos parameter; add an explicit check at the start of the function
to verify pos >= 0 and pos < len(vectorLengths) and return a clear error (e.g.
"position out of bounds") if it fails, before the loop or any access to
vectorLengths[pos], so the later loop and the computation of end cannot panic
when pos is negative or >= len(vectorLengths).
In `@online-feature-store/internal/data/blocks/layout_comparison_results.txt`:
- Line 1443: Replace the stray double-percent typo "0%%" in the text snippet "•
Overhead only in edge cases with 0%% defaults" with a single percent sign so it
reads "0%"; search for the literal "0%%" occurrence and update it to "0%" (no
other changes).
---
Nitpick comments:
In `@online-feature-store/internal/data/blocks/perm_storage_datablock_v2.go`:
- Around line 59-86: PermStorageDataBlock.Clear determines headerLen from
p.layoutVersion which can be zero for newly allocated structs, causing an
incorrect trim; update Clear (function PermStorageDataBlock.Clear) to infer
headerLen defensively from the buffer instead of only from p.layoutVersion: if
p.layoutVersion != 0 use it as now, otherwise inspect len(p.buf) and choose
PSDBLayout2HeaderBytes when buf is large enough, fall back to
PSDBLayout1HeaderBytes or to len(p.buf) if smaller; then proceed to reset
p.layoutVersion and trim p.buf accordingly. Ensure references to
PSDBLayout1HeaderBytes, PSDBLayout2HeaderBytes, p.layoutVersion and p.buf are
used in the new logic.
In `@online-feature-store/internal/handler/feature/persist.go`:
- Line 6: The import of "math/rand" should be made explicit for deterministic
tests: either switch the import to "math/rand/v2" and update any calls, or
(preferably) make randomness injectable or explicitly seed the global source;
locate uses of rand (e.g., rand.Intn, rand.Float64) in persist.go and then
either (A) replace global calls with an injected *rand.Rand instance (created
with rand.New(rand.NewSource(seed)) in tests) or (B) add an init() that calls
rand.Seed(time.Now().UnixNano()) for production and ensure tests call
rand.Seed(fixedSeed) or use rand.NewSource(fixedSeed) to guarantee deterministic
behavior across Go versions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5ebe8c3d-6d77-4bbe-8d02-62c692be9364
📒 Files selected for processing (7)
online-feature-store/internal/data/blocks/cache_storage_datablock_v2.goonline-feature-store/internal/data/blocks/deserialized_psdb_v2.goonline-feature-store/internal/data/blocks/deserialized_psdb_v3.goonline-feature-store/internal/data/blocks/layout_comparison_results.txtonline-feature-store/internal/data/blocks/perm_storage_datablock_v2.goonline-feature-store/internal/data/blocks/psdb_shadow_compare.goonline-feature-store/internal/handler/feature/persist.go
✅ Files skipped from review due to trivial changes (1)
- online-feature-store/internal/data/blocks/deserialized_psdb_v3.go
🚧 Files skipped from review as they are similar to previous changes (2)
- online-feature-store/internal/data/blocks/psdb_shadow_compare.go
- online-feature-store/internal/data/blocks/cache_storage_datablock_v2.go
What type of PR is this?
📂 Modules Affected
horizon(Real-time systems / networking)online-feature-store(Feature serving infra)trufflebox-ui(Admin panel / UI)infra(Docker, CI/CD, GCP/AWS setup)docs(Documentation updates)___________Added tests
Key Requirement Doc
Downstream Impact
Additional Information
Summary by CodeRabbit
New Features
Configuration