Plan A: migrate demo onto canonical generateSandboxedUi rails (Fable 5) #85
Merged
…n system
- Add vitest (apps/app jsdom, apps/mcp node) + pytest (apps/agent) + turbo test task
- Pin @copilotkit/{react-core,react-ui,runtime} to 1.55.2-next.1 (canonical
OpenGenerativeUI rails: generateSandboxedUi, OpenGenerativeUIMiddleware)
- New @repo/design-system package single-sources THEME_CSS / SVG_CLASSES_CSS /
FORM_STYLES_CSS / importmap; apps/app and apps/mcp now consume it, killing
the "keep in sync" fork in apps/mcp/src/renderer.ts (plan acceptance #4,
enforced by a repo-scan test)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- CopilotRuntime now sets openGenerativeUI: true (auto-applies OpenGenerativeUIMiddleware; options extracted to a tested builder)
- Demo-owned activity renderer for "open-generative-ui" events overrides the built-in via renderActivityMessages: keeps CSS-first gating + adds design system & importmap injection, idiomorph preview morphing, and continuous ResizeObserver autosize (demo non-regressions)
- Zod-validated sandboxFunctions: sendPrompt (chat bridge via CustomEvent + OpenGenUIPromptBridge) and openLink (https-only + optional origin allowlist)
- OPEN_GEN_UI_DESIGN_SKILL replaces the default shadcn designSkill
- Python agent prompt contract switched widgetRenderer -> generateSandboxedUi (ordered params, css-first, Websandbox.connection.remote bridge, dynamic imports), protocol & quality bar preserved; prompt hoisted to src/prompt.py
- 60 app tests + 15 design-system tests + 8 pytest, all red-green
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…channel
A3 (apps/app):
- Delete widget-renderer.tsx and its useComponent registration; guard test pins zero widgetRenderer/widget-renderer references in src
- Export overlay now rides the canonical activity content via assembleStandaloneHtmlFromActivity (importmap -> design system -> css -> Websandbox stub -> html -> jsFunctions/jsExpressions)
- Legacy open-link postMessage handler hardened via isAllowedLinkUrl
- template-card imports THEME_CSS from @repo/design-system
A4 (apps/agent):
- All three SKILL.md files rewritten to generateSandboxedUi ordered contract: css-first reveal, jsFunctions toolbox + jsExpressions stepwise invocations, parameterized generators for one-expression refinement turns, Websandbox bridge calls, dynamic-import library guidance; quality bar preserved
- 17 new pytest contract tests (red-green); 25 total green
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… wiring
Adversarial review of the full migration diff (6 lenses, 12 confirmed
critical/major findings, all fixed red-green):
- ExportOverlay now always wraps the renderer (ready={isComplete}) — the
conditional wrap remounted the container at completion and destroyed the
live final sandbox iframe
- Exported standalone HTML emits one classic script (jsFunctions as globals,
matching live sandbox.run semantics; expressions in an async IIFE) and
escapes </style> in generated css
- Legacy four-CDN CSP meta restored on the final sandbox frame
- prompt.py + skills no longer teach top-level await in jsFunctions /
jsExpressions (classic-script semantics; snippets wrapped in async fns)
- Seed templates use the Websandbox bridge form; apply_template appends a
canonical-translation usage note
- Vestigial unvalidated open-link postMessage handler removed from page.tsx
- turbo lint/test dependsOn ^build; CI now runs pnpm test + new pytest job
- Dockerfile.app + render.yaml build @repo/design-system before the app
- README + docs/generative-ui.md rewritten to the canonical rails
- New coverage: process-partial-html (12), seed-templates, MCP renderer,
resize clamping/non-finite guard, morph-enter entrance parity
Suites: 97 app + 15 design-system + 4 mcp vitest, 32 pytest — all green.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- New src/model.py factory: ChatAnthropic, default claude-fable-5, LLM_MODEL env override preserved (red-green tested)
- Drop langchain-openai + openai deps; add langchain-anthropic
- ANTHROPIC_API_KEY replaces OPENAI_API_KEY everywhere: render.yaml, Makefile, README, CONTRIBUTING, CLAUDE.md, docs (getting-started, deployment, bring-to-your-app, agent-state, agent-tools)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Found by running the airplane acceptance prompt live on Fable 5 (all red-green tested, 44 pytest green):
- Bump copilotkit 0.1.78 -> 0.1.94 + ag-ui-langgraph -> >=0.0.38: 0.1.78's CopilotKitMiddleware crashes (TypeError) serializing the AG-UI Context pydantic models the canonical frontend now sends as agent context
- New src/anthropic_compat.py ConsecutiveSystemMessagesMiddleware: (1) reorders the middleware's "App Context" SystemMessage (appended after the human turn by the add_messages reducer) to the head, because Anthropic rejects non-consecutive system messages; (2) repairs streamed zero-length thinking blocks missing the "thinking" field, which Anthropic 400s on tool-loop replay
- Prompt: generateSandboxedUi at most once per request. The followUp run returning "UI generated" caused Fable to rebuild the widget in a loop (observed 5+ rebuilds; legacy useComponent rail had no followUp runs)
- .claude/launch.json for the preview server
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
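The two repairs described for ConsecutiveSystemMessagesMiddleware can be illustrated with a minimal sketch. This is not the code in src/anthropic_compat.py: it uses plain dicts instead of LangChain message objects, the function names `hoist_system_messages` and `repair_thinking_blocks` are hypothetical, and filling the missing field with an empty string is an assumption about how the repair is done.

```python
from typing import Any

Message = dict[str, Any]


def hoist_system_messages(messages: list[Message]) -> list[Message]:
    """Move every system message to the head, keeping relative order.

    Assumption: messages are plain dicts with a "role" key.
    """
    system = [m for m in messages if m.get("role") == "system"]
    rest = [m for m in messages if m.get("role") != "system"]
    return system + rest


def repair_thinking_blocks(content: list[Message]) -> list[Message]:
    """Add an empty "thinking" field to thinking blocks that lack one.

    Assumption: an empty string is an acceptable placeholder value.
    """
    repaired = []
    for block in content:
        if block.get("type") == "thinking" and "thinking" not in block:
            block = {**block, "thinking": ""}  # copy, do not mutate input
        repaired.append(block)
    return repaired
```

With a history of system, user, system ("App Context"), the first function returns both system messages first, so the list no longer interleaves a system message after the human turn.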
Root cause of the blank 3D canvas: langchain-anthropic's default max_tokens (4096) truncated generateSandboxedUi tool args mid-stream. css and html arrived but jsFunctions/jsExpressions were cut off, so the middleware never emitted htmlComplete and the renderer never left its preview sandbox (no final sandbox, no JS execution, no autosize). The same truncation caused the earlier rebuild loop: each followUp turn saw its own amputated tool call.
Diagnosed by replaying the exact streaming delta sequence against the real renderer (it passed, so the renderer was sound) and then extracting the live activity content via React fiber: keys stopped at html with no htmlComplete.
Verified fixed live: full 19KB final frame, continuous resize reports, working Three.js airplane with pitch/yaw/roll controls.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
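The "keys stopped at html" symptom can be checked mechanically. The sketch below is a hypothetical diagnostic aid, not code from this PR: `EXPECTED_ORDER` and `missing_params` are illustrative names, and treating all four parameters as expected on every call is an assumption based on the ordered contract named in the commits above.

```python
# Ordered parameters of the generateSandboxedUi contract as named in this PR.
# Assumption: a complete call carries all four.
EXPECTED_ORDER = ("css", "html", "jsFunctions", "jsExpressions")


def missing_params(tool_args: dict[str, object]) -> list[str]:
    """Return the expected parameters absent from a tool call, in order."""
    return [name for name in EXPECTED_ORDER if name not in tool_args]


# A call cut off after html, as in the truncated stream described above.
truncated = {"css": "body { margin: 0 }", "html": "<canvas></canvas>"}
print(missing_params(truncated))
```

A non-empty result on a finished tool call suggests the output budget ran out mid-arguments rather than a renderer fault.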
Production safety net for the Fable 5 cutover: flipping LLM_MODEL to a gpt-* name in the Render dashboard falls back to OpenAI without a code change. OPENAI_API_KEY re-declared in render.yaml (sync: false) so the existing dashboard value survives blueprint sync.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
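The selection rule described here (default claude-fable-5, LLM_MODEL override, gpt-* names routed to OpenAI) can be sketched as a pure function. This is an illustration only, not the contents of src/model.py: `resolve_model` is a hypothetical name, and treating a blank LLM_MODEL as unset is an assumption.

```python
import os
from collections.abc import Mapping

DEFAULT_MODEL = "claude-fable-5"  # default named in this PR


def resolve_model(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Return (provider, model_name) for the agent's chat model.

    Assumption: an empty or whitespace-only LLM_MODEL counts as unset.
    """
    name = env.get("LLM_MODEL", "").strip() or DEFAULT_MODEL
    provider = "openai" if name.startswith("gpt-") else "anthropic"
    return provider, name
```

Passing the environment as an argument keeps the rule testable without touching the process environment; a real factory would then construct the matching chat model from the returned pair.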
Added a production fallback in e54c1ff: Deploy note: 🤖 Generated with Claude Code
The Render service's dashboard build command (pre-blueprint-sync) runs plain `pnpm --filter @repo/app build`, which does not build workspace dependencies, so @repo/design-system dist/ is missing and next build dies with Module not found. Building the dependency inside the app's build script makes every invocation path work: turbo (^build, now redundant but cheap), the render.yaml buildCommand, the stale dashboard command, and Dockerfile.app.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Implements Plan A (Improve the Demo from the Canonical): the demo now runs on the shipped canonical rails (`generateSandboxedUi` → `OpenGenerativeUIMiddleware` → activity renderer) instead of its parallel `widgetRenderer` implementation, without regressing the demo's strengths. Every demo improvement is now upstreamable.
What changed (by phase)
- … `1.55.2-next.1`, new `@repo/design-system` package single-sourcing the THEME/SVG/FORM CSS + importmap (kills the "keep in sync" fork in `apps/mcp`)
- `openGenerativeUI: true` on the runtime (auto-applies the middleware); demo-owned activity renderer registered via `renderActivityMessages` override: keeps CSS-first gating and adds design-system + importmap injection, idiomorph preview morphing, continuous ResizeObserver autosize; Zod-validated `sendPrompt` / `openLink` sandbox functions (https-only + origin allowlist); agent prompt switched to the ordered param contract
- `widget-renderer.tsx` custom iframe rail deleted (guard test pins zero references); export overlay rides activity content; all three skills rewritten for the procedural channel (css → html → `jsFunctions` toolbox → `jsExpressions` stepwise, parameterized generators for one-expression refinement turns)
- … `claude-fable-5`) via `langchain-anthropic`; `ANTHROPIC_API_KEY` replaces `OPENAI_API_KEY` everywhere
- `copilotkit` 0.1.78 → 0.1.94 (AG-UI Context serialization crash), new `ConsecutiveSystemMessagesMiddleware` (Anthropic rejects non-consecutive system messages; repairs streamed zero-length thinking blocks), single-build prompt rule (followUp loop), `max_tokens=64000` (default 4096 truncated tool args mid-stream, the root cause of blank widgets)
Plan acceptance criteria
- … `generateSandboxedUi` activity events (verified in-browser: full protocol Acknowledge → plan card → placeholders → build → narrate; screenshot in session)
- … `cssComplete` create no sandbox
- … `sandbox="allow-scripts"`); all bridge calls Zod-validated, `openLink` https-only + env origin allowlist
Test coverage
116 vitest (app 97 / design-system 15 / mcp 4) + 45 pytest, all written red-green. CI now runs both JS and Python suites.
Notes for reviewers
- `LOADING_PHRASES` cycling and `placeholderMessages` both provide a "generating" voice; left as a UX decision.
- `apps/mcp/Dockerfile` was already broken pre-migration (separate task).
- … `ANTHROPIC_API_KEY` in the repo-root `.env` (see `.env.example`).
🤖 Generated with Claude Code