feat(core): progressive disclosure for system prompt by user-prompt keywords#109
…eywords

The full create-mode system prompt is ~41 KB / ~10k tokens. Small-context models (e.g. minimax-m2.5:free at 8k ctx) hard-fail on it; even large-context models dilute their attention and ignore most of the instructions.

composeSystemPrompt() now accepts an optional userPrompt. When provided in create mode, the composer builds:

- Layer 1 (always, ~12 KB): identity, workflow, output-rules, design-methodology, pre-flight, editmode-protocol, safety, plus a new condensed antiSlopDigest section (forbidden-list bullets only).
- Layer 2 (keyword-matched): chart-rendering + dashboard-ambient-signals for dashboard cues; iOS starter template for mobile cues; single-page / big-numbers / customer-quotes craft subsections for marketing cues; logos subsection for brand cues. No keyword match → fall back to the full craft directives (better safe than sorry).
- Layer 3 (deferred TODO): retry-on-quality-fail injection of full ANTI_SLOP + ARTIFACT_TYPES.

Measurements for sample prompts:

- full (no userPrompt): 41,327 chars
- "做个数据看板" ("make a data dashboard"): 22,614 chars (55%)
- "iOS 移动端 onboarding" ("iOS mobile onboarding"): 21,744 chars (53%)
- "indie marketing landing page": 19,815 chars (48%)
- "随便做点东西" ("just make something", no keyword): 24,464 chars (59%)

When userPrompt is omitted, or mode is tweak / revise, the output is byte-identical to before, so back-compat is complete. The drift contract for PROMPT_SECTION_FILES is preserved (the new antiSlopDigest section gets its own .v1.txt).

Adds 9 vitest cases covering back-compat, layer 1 invariants, each keyword bucket, no-keyword fallback, the 25 KB regression guard, and mode tweak/revise pass-through. All 158 core tests pass.
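The layering described in this commit message can be pictured with a small sketch. This is not the code from the PR: `composeSectionsSketch`, the `ROUTES` table, the section names, and the keyword patterns are all hypothetical stand-ins, and the function returns section names rather than prompt text so the routing logic is easy to see.

```typescript
// Hypothetical sketch of progressive disclosure; names and patterns are assumptions.
type Mode = "create" | "tweak" | "revise";

interface ComposeInput {
  mode: Mode;
  userPrompt?: string;
}

// Stand-ins for the always-on sections (the real list is longer).
const BASE_SECTIONS: string[] = ["identity", "workflow", "outputRules", "safety"];

// Each route maps a keyword pattern to the craft sections it unlocks.
// No `g` flag, so RegExp.prototype.test stays stateless between calls.
const ROUTES: Array<{ pattern: RegExp; sections: string[] }> = [
  { pattern: /\b(dashboard|chart)s?\b|数据|看板/i, sections: ["chartRendering", "dashboardAmbientSignals"] },
  { pattern: /\b(ios|mobile)\b|移动端/i, sections: ["iosStarterTemplate"] },
  { pattern: /\b(marketing|landing)\b/i, sections: ["singlePage", "bigNumbers", "customerQuotes"] },
  { pattern: /\b(brand|logo)s?\b/i, sections: ["logos"] },
];

function allCraftSections(): string[] {
  const out: string[] = [];
  for (const route of ROUTES) {
    out.push(...route.sections);
  }
  return out;
}

function composeSectionsSketch(input: ComposeInput): string[] {
  // Back-compat path: no userPrompt, or any mode other than create, gets the full prompt.
  if (input.mode !== "create" || input.userPrompt === undefined) {
    return [...BASE_SECTIONS, "antiSlopFull", "artifactTypes", ...allCraftSections()];
  }

  // Layer 2: collect the sections whose keyword pattern matches the user prompt.
  const matched: string[] = [];
  for (const route of ROUTES) {
    if (route.pattern.test(input.userPrompt)) {
      matched.push(...route.sections);
    }
  }

  // No keyword match: fall back to every craft section rather than guessing.
  const layer2 = matched.length > 0 ? matched : allCraftSections();

  // Layer 1 carries the condensed digest in place of the full anti-slop text.
  return [...BASE_SECTIONS, "antiSlopDigest", ...layer2];
}
```

In this sketch a dashboard prompt receives only the two dashboard sections on top of Layer 1, while a prompt with no recognised keyword receives all craft sections, mirroring the fallback described above.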
Findings
- [Major] Dashboard keyword regex has substring false positives: graph matches paragraph and metric matches asymmetric/biometric, so unrelated prompts can be mis-routed into dashboard/chart instructions. This changes prompt composition unexpectedly and can degrade output quality.
  Evidence: packages/core/src/prompts/index.ts:847
  Suggested fix:
  const KEYWORDS_DASHBOARD = /\b(dashboard|chart|graph|plot|visualization|analytics|metric|kpi)s?\b|数据|看板|图表/i;
- [Minor] Tests cover positive keyword matches but miss false-positive guards for the substring collisions introduced by regex routing.
  Evidence: packages/core/src/generate.test.ts:1419
  Suggested fix:
  it('does not trigger dashboard routing on substring collisions', () => {
    const p = composeSystemPrompt({ mode: 'create', userPrompt: 'improve paragraph rhythm and asymmetric spacing' });
    expect(p).not.toContain('Chart rendering contract');
    expect(p).not.toContain('Dashboard ambient signals');
  });
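To make the substring collision concrete, the following sketch runs both patterns quoted in this review against a few sample strings. The two regular expressions are copied from the diff and from the suggested fix; the constant names `UNBOUNDED` and `BOUNDED` and the sample strings are illustrative assumptions.

```typescript
// Pattern as it appears in the diff: no word boundaries, so it matches inside longer words.
const UNBOUNDED =
  /(dashboard|chart|graph|plot|visualization|数据|看板|图表|analytics|metric|KPI)/i;

// Pattern from the suggested fix: \b guards the Latin terms only.
// The CJK alternatives stay outside the \b group because JavaScript's \b is defined
// in terms of \w ([A-Za-z0-9_]); between two CJK characters there is no boundary.
const BOUNDED =
  /\b(dashboard|chart|graph|plot|visualization|analytics|metric|kpi)s?\b|数据|看板|图表/i;

const samples: string[] = [
  "improve paragraph rhythm and asymmetric spacing", // collision case from the review
  "biometric login screen",
  "show weekly metrics as graphs",
  "做个数据看板",
];

for (const s of samples) {
  console.log(JSON.stringify(s), "unbounded:", UNBOUNDED.test(s), "bounded:", BOUNDED.test(s));
}
```

One trade-off worth noting: with the bounded pattern, inflected forms other than a plain plural (for example "plotting") no longer match, so the keyword list may need explicit variants if those should still route to the dashboard sections.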
Summary
- Review mode: initial
- 2 issues found (1 Major, 1 Minor), both introduced by the regex-based progressive routing change.
- docs/VISION.md and docs/PRINCIPLES.md: Not found in repo/docs in this checkout.
Testing
- Not run (automation)
// ---------------------------------------------------------------------------

const KEYWORDS_DASHBOARD =
  /(dashboard|chart|graph|plot|visualization|数据|看板|图表|analytics|metric|KPI)/i;
KEYWORDS_DASHBOARD currently matches substrings (graph in paragraph, metric in asymmetric), which can route unrelated prompts into dashboard mode. Consider word boundaries for Latin terms while keeping CJK terms as-is:
const KEYWORDS_DASHBOARD =
  /\b(dashboard|chart|graph|plot|visualization|analytics|metric|kpi)s?\b|数据|看板|图表/i;

  }
});

it('dashboard prompt: includes chart rendering, excludes iOS starter', () => {
Please add a negative routing test for substring collisions to prevent regressions (e.g., paragraph/asymmetric should not trigger dashboard routing):
it('does not trigger dashboard routing on substring collisions', () => {
  const p = composeSystemPrompt({ mode: 'create', userPrompt: 'improve paragraph rhythm and asymmetric spacing' });
  expect(p).not.toContain('Chart rendering contract');
  expect(p).not.toContain('Dashboard ambient signals');
});
Findings
- None.
Summary
- Review mode: initial
- No blocking/major/minor/nit issues found in added/modified lines of this diff.
- docs/VISION.md and docs/PRINCIPLES.md: Not found in repo/docs.
- Residual risk: progressive-routing behavior is covered by unit tests in composeSystemPrompt(), but this PR does not include provider-integrated smoke coverage.
Testing
- Not run (automation)
open-codesign Bot
Summary
The full create-mode system prompt was ~41 KB / ~10k tokens. That hard-fails small-context models (e.g. minimax-m2.5:free at 8k ctx) and dilutes attention on every other model so most directives get ignored. This PR splits the prompt into layers and emits only the layers that match the user's prompt keywords.
composeSystemPrompt() gains an optional userPrompt field. When provided in create mode, the composer assembles the prompt in layers; the always-on layer includes a condensed antiSlopDigest (forbidden-list bullets only, ~1.5 KB).

Back-compat: userPrompt undefined OR mode tweak / revise → byte-identical to the pre-PR prompt.

Section sizes addressed (the bloat)
Measured prompt size
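Sample prompts:
- 做个数据看板 ("make a data dashboard")
- iOS 移动端 onboarding ("iOS mobile onboarding")
- indie marketing landing page
- 随便做点东西 ("just make something", no keyword)

A regression-guard test asserts that a matched dashboard prompt stays under 25 KB.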
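The percentages quoted in the commit message and the 25 KB guard can both be expressed as small helpers. The sketch below is an assumption-laden illustration, not the test from the PR: `percentOfFull` and `assertWithinBudget` are hypothetical names, the budget is taken to be 25,000 characters, and size is measured with `String.length` (UTF-16 code units), which undercounts bytes for CJK text.

```typescript
// Full create-mode prompt size in characters, as quoted in the commit message.
const FULL_PROMPT_CHARS = 41327;

// Assumed budget: the source says "25 KB" without stating whether that is 25,000 or 25,600.
const DASHBOARD_BUDGET_CHARS = 25000;

// Size of a composed prompt as a rounded percentage of the full prompt.
function percentOfFull(chars: number): number {
  return Math.round((chars / FULL_PROMPT_CHARS) * 100);
}

// Throws when a composed prompt exceeds the budget; a test can call this as its guard.
function assertWithinBudget(prompt: string, budget: number = DASHBOARD_BUDGET_CHARS): void {
  if (prompt.length >= budget) {
    throw new Error(`prompt is ${prompt.length} chars, budget is ${budget}`);
  }
}

// Character counts reported in the commit message for the four sample prompts.
console.log(percentOfFull(22614)); // 55
console.log(percentOfFull(21744)); // 53
console.log(percentOfFull(19815)); // 48
console.log(percentOfFull(24464)); // 59
```

A guard like this catches the case where a later edit to a Layer 1 or dashboard section quietly pushes the matched prompt back toward the full size.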
Tests
Adds 9 new vitest cases (composeSystemPrompt() — progressive disclosure), including:
- … userPrompt is omitted
- tweak ignores userPrompt
- revise ignores userPrompt

Total @open-codesign/core tests: 158 passing (was 149).

The drift contract for PROMPT_SECTION_FILES is preserved: the new antiSlopDigest constant ships its own anti-slop-digest.v1.txt.

The apps/desktop consumer is wired through automatically: generate() now passes input.prompt as userPrompt to the composer.

PRINCIPLES checklist (§5b)
- … userPrompt is undefined; existing 88 generate.test.ts cases unchanged.
- … packages/core/src/prompts/; new section follows the same .v1.txt + TS-constant pattern as the others.
- … composeFull / composeCreateProgressive), keyword routing extracted into planKeywordMatches.

Test plan
- pnpm --filter @open-codesign/core test: 158/158 pass
- pnpm --filter @open-codesign/core typecheck: clean
- pnpm exec biome check on changed files: clean (cognitive complexity within budget)