
Plan A: migrate demo onto canonical generateSandboxedUi rails (Fable 5)#85

Merged: jerelvelarde merged 10 commits into main from jerel/cranky-kalam-f06e99 on Jun 10, 2026



@jerelvelarde

Collaborator

Implements Plan A (Improve the Demo from the Canonical): the demo now runs on the shipped canonical rails (generateSandboxedUi → OpenGenerativeUIMiddleware → activity renderer) instead of its parallel widgetRenderer implementation, without regressing the demo's strengths. Every demo improvement is now upstreamable.

What changed (by phase)

| Phase | Ships |
| --- | --- |
| 0 + A1 | Test infra (vitest × 3 packages + pytest; the repo previously had zero tests), CopilotKit pinned to 1.55.2-next.1, new @repo/design-system package single-sourcing the THEME/SVG/FORM CSS + importmap (kills the "keep in sync" fork in apps/mcp) |
| A2 | openGenerativeUI: true on the runtime (auto-applies the middleware); demo-owned activity renderer registered via renderActivityMessages override, which keeps CSS-first gating and adds design-system + importmap injection, idiomorph preview morphing, and continuous ResizeObserver autosize; Zod-validated sendPrompt/openLink sandbox functions (https-only + origin allowlist); agent prompt switched to the ordered param contract |
| A3 + A4 | widget-renderer.tsx custom iframe rail deleted (guard test pins zero references); export overlay rides activity content; all three skills rewritten for the procedural channel (css → html → jsFunctions toolbox → jsExpressions stepwise, parameterized generators for one-expression refinement turns) |
| Review | 6-lens adversarial review, 12 confirmed findings fixed, including an ExportOverlay remount that destroyed the live sandbox at completion, export script scoping, CSP restoration, CI test wiring, and Dockerfile/render.yaml workspace builds |
| Model switch | Agent runs Anthropic Claude Fable 5 (claude-fable-5) via langchain-anthropic; ANTHROPIC_API_KEY replaces OPENAI_API_KEY everywhere |
| Live-run fixes | Found by running the acceptance prompt end-to-end: copilotkit 0.1.78→0.1.94 (AG-UI Context serialization crash), new ConsecutiveSystemMessagesMiddleware (Anthropic rejects non-consecutive system messages; the middleware also repairs streamed zero-length thinking blocks), single-build prompt rule (followUp loop), max_tokens=64000 (the default 4096 truncated tool args mid-stream, the root cause of blank widgets) |

Plan acceptance criteria

  1. Canonical rails, proven live: "Build an airplane that has pitch, yaw, roll" produces a working Three.js widget through generateSandboxedUi activity events (verified in-browser: full protocol Acknowledge → plan card → placeholders → build → narrate; screenshot in session)
  2. CSS-first gating: test-pinned — html chunks streaming without cssComplete create no sandbox
  3. No host-window access: websandbox null-origin iframe (sandbox="allow-scripts"); all bridge calls Zod-validated, openLink https-only + env origin allowlist
  4. Exactly one design-system CSS copy: enforced by a repo-scan test (A1 found the two forks had already drifted)
  5. Non-regressions: plan card, narration protocol, charts hybrid, export overlay, template library, HITL, QR flow — all traced and test-covered
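
Criterion 2 can be read as a small predicate over the streamed activity content. The sketch below is an illustration in Python, not the demo's TypeScript renderer: the helper name `should_create_sandbox` is hypothetical, and it assumes the content is a dict carrying an `html` field and a boolean `cssComplete` flag. Requiring some html before mounting is also an assumption; the PR only pins that html without `cssComplete` creates no sandbox.

```python
from typing import Any, Mapping


def should_create_sandbox(content: Mapping[str, Any]) -> bool:
    """CSS-first gate: mount a sandbox only after css has fully arrived.

    Hypothetical helper. Field names follow the PR text; the function
    itself only illustrates the rule and is not the shipped renderer.
    """
    css_done = content.get("cssComplete") is True
    has_html = bool(content.get("html"))  # assumption: nothing to show without html
    return css_done and has_html
```

With this shape, html chunks that stream in before `cssComplete` simply leave the predicate false, so unstyled markup is never shown.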

Test coverage

116 vitest (app 97 / design-system 15 / mcp 4) + 45 pytest, all written red-green. CI now runs both JS and Python suites.

Notes for reviewers

  • The plan's transient A2 feature flag was skipped — single-PR migration made it dead ceremony; the phased commits preserve the rollout shape.
  • LOADING_PHRASES cycling and placeholderMessages both provide a "generating" voice — left as a UX decision.
  • apps/mcp/Dockerfile was already broken pre-migration (separate task).
  • Local run requires ANTHROPIC_API_KEY in the repo-root .env (see .env.example).

🤖 Generated with Claude Code

jerelvelarde and others added 8 commits June 10, 2026 08:19
…n system

- Add vitest (apps/app jsdom, apps/mcp node) + pytest (apps/agent) + turbo test task
- Pin @copilotkit/{react-core,react-ui,runtime} to 1.55.2-next.1 (canonical
  OpenGenerativeUI rails: generateSandboxedUi, OpenGenerativeUIMiddleware)
- New @repo/design-system package single-sources THEME_CSS / SVG_CLASSES_CSS /
  FORM_STYLES_CSS / importmap; apps/app and apps/mcp now consume it, killing
  the "keep in sync" fork in apps/mcp/src/renderer.ts (plan acceptance #4,
  enforced by a repo-scan test)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- CopilotRuntime now sets openGenerativeUI: true (auto-applies
  OpenGenerativeUIMiddleware; options extracted to a tested builder)
- Demo-owned activity renderer for "open-generative-ui" events overrides the
  built-in via renderActivityMessages: keeps CSS-first gating + adds design
  system & importmap injection, idiomorph preview morphing, and continuous
  ResizeObserver autosize (demo non-regressions)
- Zod-validated sandboxFunctions: sendPrompt (chat bridge via CustomEvent +
  OpenGenUIPromptBridge) and openLink (https-only + optional origin allowlist)
- OPEN_GEN_UI_DESIGN_SKILL replaces the default shadcn designSkill
- Python agent prompt contract switched widgetRenderer -> generateSandboxedUi
  (ordered params, css-first, Websandbox.connection.remote bridge, dynamic
  imports), protocol & quality bar preserved; prompt hoisted to src/prompt.py
- 60 app tests + 15 design-system tests + 8 pytest, all red-green
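
The openLink rule above (https-only plus an optional origin allowlist) could look roughly like the following. This is a Python illustration of the rule only; the real check lives in the TypeScript app and is Zod-validated. The function signature, the treatment of an empty allowlist as "any https origin", and the origin string format are all assumptions.

```python
from urllib.parse import urlsplit


def is_allowed_link_url(url: str, allowed_origins: frozenset[str] = frozenset()) -> bool:
    """Accept only https URLs, optionally restricted to listed origins.

    Assumption: an empty allowlist means any https origin is accepted,
    because the commit message describes the allowlist as optional.
    """
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    if parts.scheme != "https" or not parts.hostname:
        return False
    if not allowed_origins:
        return True
    origin = f"https://{parts.hostname}"
    if port not in (None, 443):
        origin += f":{port}"  # non-default ports are part of the origin
    return origin in allowed_origins
```

Comparing whole origins rather than hostname substrings avoids lookalike hosts such as `example.com.evil.test` slipping through.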

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…channel

A3 (apps/app):
- Delete widget-renderer.tsx and its useComponent registration; guard test
  pins zero widgetRenderer/widget-renderer references in src
- Export overlay now rides the canonical activity content via
  assembleStandaloneHtmlFromActivity (importmap -> design system -> css ->
  Websandbox stub -> html -> jsFunctions/jsExpressions)
- Legacy open-link postMessage handler hardened via isAllowedLinkUrl
- template-card imports THEME_CSS from @repo/design-system

A4 (apps/agent):
- All three SKILL.md files rewritten to generateSandboxedUi ordered contract:
  css-first reveal, jsFunctions toolbox + jsExpressions stepwise invocations,
  parameterized generators for one-expression refinement turns, Websandbox
  bridge calls, dynamic-import library guidance; quality bar preserved
- 17 new pytest contract tests (red-green); 25 total green

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… wiring

Adversarial review of the full migration diff (6 lenses, 12 confirmed
critical/major findings, all fixed red-green):

- ExportOverlay now always wraps the renderer (ready={isComplete}) — the
  conditional wrap remounted the container at completion and destroyed the
  live final sandbox iframe
- Exported standalone HTML emits one classic script (jsFunctions as globals,
  matching live sandbox.run semantics; expressions in an async IIFE) and
  escapes </style> in generated css
- Legacy four-CDN CSP meta restored on the final sandbox frame
- prompt.py + skills no longer teach top-level await in jsFunctions /
  jsExpressions (classic-script semantics; snippets wrapped in async fns)
- Seed templates use the Websandbox bridge form; apply_template appends a
  canonical-translation usage note
- Vestigial unvalidated open-link postMessage handler removed from page.tsx
- turbo lint/test dependsOn ^build; CI now runs pnpm test + new pytest job
- Dockerfile.app + render.yaml build @repo/design-system before the app
- README + docs/generative-ui.md rewritten to the canonical rails
- New coverage: process-partial-html (12), seed-templates, MCP renderer,
  resize clamping/non-finite guard, morph-enter entrance parity

Suites: 97 app + 15 design-system + 4 mcp vitest, 32 pytest — all green.
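
The export fixes above (assembly order, one classic script, `</style>` escaping) can be sketched as follows. This is a hedged Python illustration, not the app's `assembleStandaloneHtmlFromActivity`: the function name, the default stub body, the `<\/` escaping technique, and the assumption that `jsExpressions` is a list of expression strings are all mine.

```python
import json


def assemble_standalone_html(
    content: dict,
    design_system_css: str,
    importmap: dict,
    websandbox_stub_js: str = "window.Websandbox = { connection: { remote: {} } };",
) -> str:
    """Build a standalone page: importmap, design system, css, stub, html, script."""

    def esc(text: str, tag: str) -> str:
        # Assumed technique: "<\/" stops generated text closing its own element.
        return text.replace(f"</{tag}", f"<\\/{tag}")

    functions = content.get("jsFunctions", "")
    expressions = content.get("jsExpressions", [])  # assumed: list of strings

    # One classic script: functions stay global, expressions run in order
    # inside an async IIFE so they can await without top-level await.
    steps = "\n".join(f"  await ({e.strip().rstrip(';')});" for e in expressions)
    script = esc(f"{functions}\n(async () => {{\n{steps}\n}})();", "script")

    return "\n".join([
        "<!doctype html>",
        f'<script type="importmap">{esc(json.dumps(importmap), "script")}</script>',
        f"<style>{esc(design_system_css, 'style')}</style>",
        f"<style>{esc(content.get('css', ''), 'style')}</style>",
        f"<script>{websandbox_stub_js}</script>",
        content.get("html", ""),
        f"<script>{script}</script>",
    ])
```

Keeping `jsFunctions` at the top level of a classic script mirrors the "functions as globals" behavior the commit describes for the live sandbox, while the IIFE gives the expressions an async context.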

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- New src/model.py factory: ChatAnthropic, default claude-fable-5,
  LLM_MODEL env override preserved (red-green tested)
- Drop langchain-openai + openai deps; add langchain-anthropic
- ANTHROPIC_API_KEY replaces OPENAI_API_KEY everywhere: render.yaml,
  Makefile, README, CONTRIBUTING, CLAUDE.md, docs (getting-started,
  deployment, bring-to-your-app, agent-state, agent-tools)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Found by running the airplane acceptance prompt live on Fable 5 (all
red-green tested, 44 pytest green):

- Bump copilotkit 0.1.78 -> 0.1.94 + ag-ui-langgraph -> >=0.0.38: 0.1.78's
  CopilotKitMiddleware crashes (TypeError) serializing the AG-UI Context
  pydantic models the canonical frontend now sends as agent context
- New src/anthropic_compat.py ConsecutiveSystemMessagesMiddleware:
  (1) reorders the middleware's "App Context" SystemMessage (appended after
  the human turn by the add_messages reducer) to the head — Anthropic rejects
  non-consecutive system messages; (2) repairs streamed zero-length thinking
  blocks missing the "thinking" field, which Anthropic 400s on tool-loop
  replay
- Prompt: generateSandboxedUi at most once per request — the followUp run
  returning "UI generated" caused Fable to rebuild the widget in a loop
  (observed 5+ rebuilds; legacy useComponent rail had no followUp runs)
- .claude/launch.json for the preview server
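
The two repairs attributed to ConsecutiveSystemMessagesMiddleware can be illustrated on plain dict messages. This is a simplified sketch, not the code in src/anthropic_compat.py: the real middleware operates on LangChain message objects, the function name is hypothetical, and filling the missing field with an empty string is an assumed repair strategy.

```python
from typing import Any


def normalize_for_anthropic(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Hoist system messages to the head and patch incomplete thinking blocks."""
    # (1) Keep relative order within each group; system messages go first.
    system = [m for m in messages if m.get("role") == "system"]
    rest = [m for m in messages if m.get("role") != "system"]

    repaired = []
    for message in system + rest:
        content = message.get("content")
        if isinstance(content, list):
            # (2) Assumed repair: add an empty "thinking" field when it is missing.
            content = [
                {**block, "thinking": ""}
                if isinstance(block, dict)
                and block.get("type") == "thinking"
                and "thinking" not in block
                else block
                for block in content
            ]
            message = {**message, "content": content}  # copy, do not mutate input
        repaired.append(message)
    return repaired
```

The function returns new message dicts rather than editing in place, which keeps the stored conversation state untouched if the same history is replayed later.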

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Root cause of the blank 3D canvas: langchain-anthropic's default
max_tokens (4096) truncated generateSandboxedUi tool args mid-stream —
css and html arrived but jsFunctions/jsExpressions were cut off, so the
middleware never emitted htmlComplete and the renderer never left its
preview sandbox (no final sandbox, no JS execution, no autosize). The
same truncation caused the earlier rebuild loop: each followUp turn saw
its own amputated tool call.

Diagnosed by replaying the exact streaming delta sequence against the
real renderer (passed — renderer was sound) and then extracting the live
activity content via React fiber: keys stopped at html with no
htmlComplete. Verified fixed live: full 19KB final frame, continuous
resize reports, working Three.js airplane with pitch/yaw/roll controls.
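
The "keys stopped at html with no htmlComplete" observation amounts to a simple check over the extracted activity content. The helper below is a hypothetical diagnostic written for illustration, not something in the repo; it assumes the content is a dict and that the ordered parameters are `css`, `html`, `jsFunctions`, `jsExpressions`, as the PR describes.

```python
ORDERED_PARAMS = ("css", "html", "jsFunctions", "jsExpressions")


def diagnose_activity(content: dict) -> dict:
    """Summarize how far a streamed generateSandboxedUi call got."""
    received = [key for key in ORDERED_PARAMS if key in content]
    missing = [key for key in ORDERED_PARAMS if key not in content]
    return {
        "received": received,
        "missing": missing,
        # Without htmlComplete the renderer stays in its preview sandbox.
        "final_sandbox_expected": bool(content.get("htmlComplete")),
    }
```

For the truncated run described above, such a check would report `css` and `html` as received, both JS parameters as missing, and no final sandbox expected.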

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Production safety net for the Fable 5 cutover: flipping LLM_MODEL to a
gpt-* name in the Render dashboard falls back to OpenAI without a code
change. OPENAI_API_KEY re-declared in render.yaml (sync: false) so the
existing dashboard value survives blueprint sync.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@jerelvelarde

Collaborator Author

Added a production fallback in e54c1ff: LLM_MODEL names starting with gpt- route to ChatOpenAI, everything else to ChatAnthropic (default claude-fable-5). OPENAI_API_KEY is re-declared in render.yaml (sync: false) so the existing dashboard value survives blueprint sync. Prod can therefore fall back to GPT-5 by flipping LLM_MODEL in the Render dashboard, with no code change required.
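
The routing rule can be sketched without importing either provider package. This is a minimal illustration, not the contents of src/model.py: `resolve_model` is a hypothetical name, and it returns a provider label where the real factory would construct ChatOpenAI or ChatAnthropic.

```python
import os
from typing import Mapping

DEFAULT_MODEL = "claude-fable-5"


def resolve_model(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Return (provider, model_name) following the routing rule described above."""
    # LLM_MODEL overrides the default; blank values fall back to the default.
    name = env.get("LLM_MODEL", "").strip() or DEFAULT_MODEL
    provider = "openai" if name.startswith("gpt-") else "anthropic"
    return provider, name
```

Passing the environment as a mapping keeps the rule easy to test without touching real process variables.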

Deploy note: ANTHROPIC_API_KEY must be set on the open-generative-ui-agent Render service before or at merge, or chat requests will fail at the model call (the service still boots and passes health checks).

🤖 Generated with Claude Code

The Render service's dashboard build command (pre-blueprint-sync) runs
plain `pnpm --filter @repo/app build`, which does not build workspace
dependencies — @repo/design-system dist/ is missing and next build dies
with Module not found. Building the dependency inside the app's build
script makes every invocation path work: turbo (^build, now redundant
but cheap), the render.yaml buildCommand, the stale dashboard command,
and Dockerfile.app.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
jerelvelarde temporarily deployed to jerel/cranky-kalam-f06e99 - open-generative-ui-app PR #85 with Render on June 10, 2026 22:28 (Destroyed)
jerelvelarde merged commit 53720d6 into main on Jun 10, 2026
7 checks passed