fix(schema-pack): embed bundled pack YAMLs in the compiled binary (+ entity-slug orphan floor) #2267
Open
JiraiyaETH wants to merge 1 commit into
…ty-slug orphan floor
LAYER 0 — schema-pack compile bug (the foundational filing/taxonomy outage).
Bundled pack YAMLs were never embedded in `bun build --compile` binaries: the
locators resolved paths via `fileURLToPath(import.meta.url)+existsSync`, which
points at `/$bunfs/root/...` where the YAML isn't on disk. So `gbrain schema
active` => "unknown schema pack: gbrain-base" (the active pack extends
gbrain-base, so the bundled parent failing took the whole resolution down),
put_page silently degraded (skipped type validation → 32-vs-15 type sprawl),
and brain-taxonomist was neutered.
- New central registry src/core/schema-pack/bundled-packs.ts imports all 7
bundled YAMLs via `import … with { type: 'file' }` (Bun embeds the bytes;
the import is a readable path in both `bun run` and `--compile`). Mirrors the
proven WASM-embed pattern.
- Route EVERY pack-load surface through it: load-active defaultPackLocator,
CLI schema show/validate/list/use/fork (packPathByName + runList), MCP
list_schema_packs + schema_lint named-pack, and the read-only mutability set
(was 3 names, now all 7).
- put_page FAIL-CLOSED: a configured pack (resolution.source !== 'default')
that won't load now throws instead of silently degrading; default-pack load
failure loud-warns. (Stops the silent-degrade that accumulated the drift.)
- Functional regression guard scripts/check-packs-embedded.sh +
packs-smoketest.ts (compiles a probe, loads all bundled packs through the
compiled binary). Wired into check:all, run-verify-parallel (the CI gate),
ci-local. Updated the BUNDLED_PACK_NAMES size test 3 → 7. Docs refreshed.
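
The fail-closed rule for put_page can be sketched as follows. This is a minimal illustration, not the code in this PR: the `PackResolution` and `LoadedPack` shapes, the `'config'` source value, and the `loadPackForPutPage` name are all assumptions made for the example. Only the `source !== 'default'` distinction comes from the description above.

```typescript
// Hypothetical shapes for illustration; the real gbrain types are not shown here.
type PackSource = 'default' | 'config';

interface PackResolution {
  name: string;
  source: PackSource;
}

interface LoadedPack {
  name: string;
  pageTypes: string[];
}

// A loader is assumed to throw when the pack cannot be found or parsed.
type PackLoader = (name: string) => LoadedPack;

function loadPackForPutPage(
  resolution: PackResolution,
  load: PackLoader,
  warn: (msg: string) => void = console.warn,
): LoadedPack | null {
  try {
    return load(resolution.name);
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    if (resolution.source !== 'default') {
      // Configured pack: fail closed so the caller cannot write unvalidated pages.
      throw new Error(`schema pack "${resolution.name}" failed to load: ${reason}`);
    }
    // Default pack: keep going, but make the failure visible.
    warn(`default schema pack "${resolution.name}" failed to load: ${reason}`);
    return null;
  }
}
```

Returning `null` only on the default path keeps the previous behavior available where no pack was explicitly configured, while a configured pack can no longer degrade without an error.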
DEFERRED (noted): registry resolvePack returns the child manifest unchanged for
extends/borrow_from, so gbrain-recommended/everything resolve with 0 page types
— not the outage (jarvis-operational declares its types directly); reconciling
would change the live type set right before deploy.
LAYER 1a — entity-slug floor: no literal-null / hyphen-flattened orphans.
- extract.ts: a model-emitted `entity` of "null"/"none"/whitespace is coerced
to JSON null (fact stays unbound, never a `null`-slug orphan).
- entities/resolve.ts: non-entity tokens → null binding (kept unbound via the
existing backstop legacy bucket, not dropped); floor uses slugifyEntityPath
which preserves an explicit ENTITY-prefix path (companies/Hermes Agent →
companies/hermes-agent) instead of flattening to companies-hermes-agent, but
flattens non-prefix slashes (A/B Partners → a-b-partners) so it can't mint
arbitrary nested pages past the stub guard.
- Read path (recall + MCP list_facts): a null resolution returns no facts
instead of querying the raw string and surfacing legacy orphan rows.
- Tests: junk→null, path-preserve vs flatten, backstop "null"→entity_slug NULL.
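
The floor described in the bullets above could look roughly like this. It is a sketch under stated assumptions: the function names `coerceEntity` and `slugifyEntityPathSketch`, the case-insensitive token match, and the exact slug regex are illustrative choices, not the PR's implementation. The prefix set mirrors the hardcoded people/ + companies/ mentioned in this commit.

```typescript
// Assumed prefix set, taken from the hardcoded people/ + companies/ noted in the commit.
const ENTITY_PREFIXES = ['people/', 'companies/'] as const;

// Assumption: junk tokens are matched case-insensitively.
const NON_ENTITY_TOKENS = new Set(['null', 'none']);

// Coerce a model-emitted entity to null when it is junk ("null", "none", whitespace).
function coerceEntity(raw: unknown): string | null {
  if (typeof raw !== 'string') return null;
  const trimmed = raw.trim();
  if (trimmed === '' || NON_ENTITY_TOKENS.has(trimmed.toLowerCase())) return null;
  return trimmed;
}

// Lowercase, collapse every non-alphanumeric run (including "/") to one hyphen.
function slugifySegment(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, '');
}

// Keep a known entity prefix as a path segment; flatten every other slash.
function slugifyEntityPathSketch(raw: string): string | null {
  const entity = coerceEntity(raw);
  if (entity === null) return null;
  const lower = entity.toLowerCase();
  for (const prefix of ENTITY_PREFIXES) {
    if (lower.startsWith(prefix)) {
      const rest = slugifySegment(entity.slice(prefix.length));
      return rest === '' ? null : prefix + rest;
    }
  }
  const slug = slugifySegment(entity);
  return slug === '' ? null : slug;
}

console.log(slugifyEntityPathSketch('companies/Hermes Agent')); // companies/hermes-agent
console.log(slugifyEntityPathSketch('A/B Partners'));           // a-b-partners
console.log(slugifyEntityPathSketch('null'));                   // null
```

Because only the first recognized prefix survives as a path separator, an input such as `companies/a/b` still flattens to a single page under `companies/` in this sketch, so it cannot create deeper nesting.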
DEFERRED (noted): deriving the entity prefix set from the active pack's
path_prefixes instead of the hardcoded people/+companies/ (matches the live
pack today).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The bug (Layer 0: the foundational outage)

Bundled schema-pack YAMLs are never embedded in `bun build --compile` binaries. The pack locators resolve paths via `fileURLToPath(import.meta.url)` + `existsSync`, which points at `/$bunfs/root/...` in a compiled binary where the YAML is not on disk. Symptoms in a deployed binary:

- `gbrain schema active` → `unknown schema pack: gbrain-base`. Any active pack that `extends: gbrain-base` fails to resolve because the bundled parent can't load, taking the whole resolution down.
- `put_page` silently degraded: the pack-load `try`/`catch` swallowed the failure, reverted to a hardcoded prefix table, and skipped type validation → accumulating page-type sprawl.
- `gbrain schema show gbrain-base` and the lens packs (`gbrain-creator/investor/engineer/everything`) were unreachable from the CLI/MCP.

`build:schema` is not the cause (it regenerates Postgres DDL); a plain rebuild reproduces the bug. The fix is a source-level embed.

The fix

- `src/core/schema-pack/bundled-packs.ts` imports all 7 bundled pack YAMLs via `import … with { type: 'file' }` (Bun embeds the bytes; the import evaluates to a readable path in both `bun run` and `--compile`). Mirrors the existing WASM-embed pattern (`src/core/chunkers/code.ts` + `scripts/check-wasm-embedded.sh`). Exports `BUNDLED_PACK_PATHS`, `BUNDLED_PACK_LIST`, `BUNDLED_PACK_NAMES`, `bundledPackPath()`.
- Every pack-load surface is routed through it: `load-active.ts` `defaultPackLocator`, the CLI `schema show/validate/list/use/fork` (`packPathByName` + `runList`), MCP `list_schema_packs` + `schema_lint` named-pack, and the read-only mutability set in `mutate.ts` (was 3 names, now all 7).
- `put_page` fails closed: a configured pack (resolution source ≠ `default`) that won't load now throws instead of silently degrading; a default-pack load failure loud-warns.
- `scripts/check-packs-embedded.sh` + `scripts/packs-smoketest.ts`: compiles a probe and loads every bundled pack through the compiled binary (not `strings | grep`). Wired into `check:all`, `run-verify-parallel`, and `ci-local`. Updated the `BUNDLED_PACK_NAMES` size test 3 → 7. Docs refreshed.

Layer 1a: entity-slug floor (no literal-`null`/hyphen-flattened fact orphans)

- `extract.ts`: a model-emitted `entity` of `"null"`/`"none"`/whitespace is coerced to JSON null (the fact stays unbound, never a `null`-slug orphan).
- `entities/resolve.ts`: non-entity tokens → null binding (kept unbound via the existing legacy bucket, not dropped); the floor uses `slugifyEntityPath`, which preserves an explicit entity-prefix path (`companies/Acme Co` → `companies/acme-co`) but flattens non-prefix slashes (`A/B Partners` → `a-b-partners`) so it can't mint arbitrary nested pages past the stub guard.
- Read path (`recall` + MCP `list_facts`): a null resolution returns no facts instead of querying the raw string and surfacing legacy orphan rows.

Deferred (called out, not in this PR)

- `registry.ts` `resolvePack` returns the child manifest unchanged for `extends`/`borrow_from`, so `gbrain-recommended`/`gbrain-everything` resolve with 0 page types. Not the outage (operational packs declare their types directly), but worth a follow-up to honor the lens-pack design.
- Deriving the entity prefix set from the active pack's `path_prefixes` instead of the hardcoded `people/` + `companies/`.

Tests

typecheck clean · `scripts/check-packs-embedded.sh` passes (compiled-binary probe) · the schema-pack/facts/entity suites pass (246 tests). Verified live: a freshly compiled binary resolves the configured active pack and `schema validate <lens-pack>` reads its embedded `/$bunfs/root/...` yaml.

🤖 Generated with Claude Code