feat(skills): teach temporal thinking and visual variety#716

Open
miguel-heygen wants to merge 2 commits into main from feat/skill-temporal-thinking

@miguel-heygen
Collaborator

Summary

Addresses the core gap where LLMs default to slide-like layouts (centered text over dark background, same layout every scene). The main hyperframes skill now teaches agents to think through frames in time rather than composing pages.

What's new in SKILL.md

Temporal map requirement (Step 3 in Plan)

  • Write a one-line-per-second description of what the viewer sees before any HTML
  • Forces the agent to think about visual interest at each moment, not just content structure
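A minimal sketch of how the one-line-per-second requirement could be checked mechanically. The `MapLine` shape, the `missingSeconds` helper, and the sample map are hypothetical illustrations, not something SKILL.md defines:

```typescript
// Hypothetical shape for one line of a temporal map; not taken from SKILL.md.
interface MapLine {
  second: number; // whole second, counted from 0
  sees: string; // what the viewer sees during that second
}

// Returns every second in [0, durationSeconds) that has no non-empty description.
function missingSeconds(lines: MapLine[], durationSeconds: number): number[] {
  const covered = new Set(
    lines.filter((l) => l.sees.trim().length > 0).map((l) => l.second),
  );
  const missing: number[] = [];
  for (let s = 0; s < durationSeconds; s++) {
    if (!covered.has(s)) missing.push(s);
  }
  return missing;
}

// Made-up map for a 4-second composition; second 2 is deliberately left out.
const draftMap: MapLine[] = [
  { second: 0, sees: "Black frame, thin accent line draws in from the left" },
  { second: 1, sees: "Headline snaps in, oversized, cropped by the frame edge" },
  { second: 3, sees: "Cut to split frame: chart on the left, caption on the right" },
];

console.log(missingSeconds(draftMap, 4)); // [2]
```

A gap in the result marks a moment the plan never describes, which is where a composition tends to sit static.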

"Think in Frames, Not Pages" section

  • Slideshow trap: explicit anti-patterns (same layout, same animation, same color temp, no surprise)
  • Scene variety checklist: 7 layout types to rotate between (statement, full-bleed image, split frame, kinetic type, data beat, terminal/code, atmospheric)
  • One focus per frame: billboard-per-beat principle
  • Beat duration guide: impact (0.7-1.8s), content (2-4s), atmosphere (4-8s)
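The beat duration guide above can be read as a small lookup table. The sketch below uses the three ranges from this description; the `Beat` type, the `beatsOutOfRange` helper, and the sample plan are assumptions made for illustration:

```typescript
type BeatKind = "impact" | "content" | "atmosphere";

// Ranges in seconds, as listed in the beat duration guide.
const BEAT_RANGES: Record<BeatKind, [number, number]> = {
  impact: [0.7, 1.8],
  content: [2, 4],
  atmosphere: [4, 8],
};

// Hypothetical record for one planned beat.
interface Beat {
  kind: BeatKind;
  seconds: number;
}

// Returns the beats whose duration falls outside the range for their kind.
function beatsOutOfRange(beats: Beat[]): Beat[] {
  return beats.filter(({ kind, seconds }) => {
    const [min, max] = BEAT_RANGES[kind];
    return seconds < min || seconds > max;
  });
}

// Made-up plan: the 5-second content beat exceeds the 2-4s range.
const plan: Beat[] = [
  { kind: "impact", seconds: 1.2 },
  { kind: "content", seconds: 2 },
  { kind: "content", seconds: 5 },
  { kind: "atmosphere", seconds: 6 },
];

console.log(beatsOutOfRange(plan)); // only the 5-second content beat
```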

Easing vocabulary table

  • Intent-based ease selection (snap, overshoot, soft land, mechanical, spring, dramatic) instead of defaulting to power2.out on everything
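As a rough illustration of intent-based selection, the sketch below maps the six named intents to GSAP ease strings. The intent names come from this description; the specific ease chosen for each intent is an assumption and may differ from the table in SKILL.md:

```typescript
type EaseIntent =
  | "snap"
  | "overshoot"
  | "soft land"
  | "mechanical"
  | "spring"
  | "dramatic";

// Assumed mapping for illustration only. Each value is a valid GSAP ease
// string, but the pairing with an intent is a guess, not the skill's table.
const EASE_BY_INTENT: Record<EaseIntent, string> = {
  snap: "power4.out",
  overshoot: "back.out(1.7)",
  "soft land": "sine.out",
  mechanical: "none",
  spring: "elastic.out(1, 0.5)",
  dramatic: "expo.inOut",
};

// Look up the ease for an intent instead of reaching for power2.out by habit.
function easeFor(intent: EaseIntent): string {
  return EASE_BY_INTENT[intent];
}

console.log(easeFor("overshoot")); // "back.out(1.7)"
```

The returned string would be passed as the `ease` property of a tween, for example `gsap.to(target, { y: 0, ease: easeFor("snap") })`.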

What's NOT changed

  • All existing rules, data attributes, composition structure, transition rules — untouched
  • House style and motion principles refs stay as-is (the new sections complement, not replace)

Test plan

  • A/B comparison: same prompt rendered with old vs new skill (in progress, will post results)
  • Verify temporal map step doesn't slow down simple edits (the "skip straight to rules" escape hatch is preserved)

🤖 Generated with Claude Code

Addresses the gap where LLMs default to slide-like layouts (centered
text over dark background repeated for every scene). The main skill
now teaches:

- Temporal map: write what the viewer sees per second before any HTML
- Slideshow trap: explicit anti-patterns and how to break them
- Scene variety: table of layout types to rotate between
- One focus per frame: billboard-per-beat principle
- Beat duration guide: impact/content/atmosphere timing
- Easing vocabulary: intent-based ease selection instead of power2.out

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Agents default to CSS rectangles for illustrations, producing amateur
visuals. The skill now:

- Mandates inline SVG over CSS shapes for any non-text visual
- Provides a table of SVG patterns per visual need (diagrams, node
  graphs, data viz, icons, decoratives, waveforms)
- Requires 3-layer depth per scene (background + content + accent)
- Includes the stroke draw-on pattern inline since it's the most
  commonly needed SVG animation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Collaborator

@vanceingalls left a comment

First review at e466b8d3. CI is mostly "skipping" (Format, Lint, Build, Test, Typecheck, CLI smoke, all Windows checks) because the PR only touches skills/hyperframes/SKILL.md, which doesn't trigger those code paths. The required checks that did run (Analyze, CodeQL, Detect changes, Format, Semantic PR title) are green.

Audited

  • skills/hyperframes/SKILL.md end-to-end (+109/-4)

Strengths

  • Temporal map step is the right shape. Step 3 in the Plan (SKILL.md:51-52) explicitly demands a one-line-per-second viewer description before any HTML. The example block (0.0s Black → title fades up…) is concrete enough that an agent can copy it; abstract enough that it adapts to any subject. This is the kind of forcing function that the system-prompt rules can enforce — the agent CAN'T write HTML without first emitting the map.
  • "Slideshow trap" anti-patterns are calibrated specifically. "Same layout repeated → restructure" / "Same animation repeated → each scene needs its own entrance character" / "Same color temp" / "No surprise" — these are the four most common LLM defaults this skill is fighting. Naming them by their failure mode is more effective than abstract "make it dynamic" advice.
  • Easing vocabulary table maps intent → ease instead of "default to power2.out everywhere." snap / overshoot / soft land / mechanical / spring / dramatic is the right level of abstraction — an agent can pick from six named affects without memorizing ease curves.
  • <HARD-GATE> block at :64 is preserved — the existing "verify you have a visual identity" gate stays, and the new step 3 doesn't slip past it. Good additive layering.
  • Beat duration guide (impact 0.7-1.8s / content 2-4s / atmosphere 4-8s) gives the agent timing anchors. Without these the default of "2s per beat" averages everything to slideshow rhythm.

Important — this PR has been superseded by #762

#762 ("fix(cli): add source discriminator to telemetry events") includes the same two commits as this PR (3073d0ab + e466b8d3) plus one additional commit (33f809f0, the telemetry fix). #762's history is a strict superset of #716's.

If #762 merges first, this PR becomes a no-op. If this PR merges first, #762's skills-portion vanishes from the diff (becomes telemetry-only). Either flow works, but the merge queue should know — pick a target and close the other.

My recommendation: land this PR first (skills changes carry review-and-rollback risk that is separate from telemetry, so ship them independently). Then split #762 down to telemetry-only, fix its three failing required checks, and land that separately.

Important — no positive-pin test on the prompt-text changes (Rule 9)

This PR changes the prompt text the LLM agent reads to plan compositions. Per Rule 9, prompt-text changes need a positive-pin test that asserts on the specific wording — generic "the skill loads" coverage isn't enough.

Concrete asks:

  • A test that asserts "Write a temporal map first" is present in the loaded skill.
  • A test that asserts "slideshow trap" (lowercase, exact phrase) is present.
  • A test that asserts the easing-vocabulary section has the six named affects.

This is the kind of regression that ships silently otherwise — a future wording polish or merge conflict could drop the temporal-map gate and no one would catch it until the agent's output regresses to slideshows. The HF skill is the agent's primary input — pin it.

Carve-out caveat: if the team treats the hyperframes skill as still finding its voice and is doing wording polish per merge, scope the pins to the concept (temporal map, slideshow trap, easing vocabulary) rather than exact phrases. That trades brittleness for survival across polish passes.
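A minimal sketch of what such pins could look like, covering both the exact-phrase and the concept-level variants. The `missingPins` helper and the `sampleSkill` stand-in text are hypothetical; a real test would read skills/hyperframes/SKILL.md and run inside whatever test runner the repo uses:

```typescript
// Exact phrases and affect names taken from the asks above.
const EXACT_PINS = ["Write a temporal map first", "slideshow trap"];
const EASING_AFFECTS = [
  "snap",
  "overshoot",
  "soft land",
  "mechanical",
  "spring",
  "dramatic",
];

// Returns the pins that do not occur in the text.
// With caseSensitive = false the check tolerates capitalization changes.
function missingPins(
  text: string,
  pins: string[],
  caseSensitive = true,
): string[] {
  const haystack = caseSensitive ? text : text.toLowerCase();
  return pins.filter(
    (pin) => !haystack.includes(caseSensitive ? pin : pin.toLowerCase()),
  );
}

// Stand-in for the loaded skill text (made up, not the real SKILL.md).
const sampleSkill = [
  "3. Write a temporal map first: one line per second of what the viewer sees.",
  "## Think in Frames, Not Pages",
  "Avoid the Slideshow Trap.",
  "Easing: snap, overshoot, soft land, mechanical, spring, dramatic",
].join("\n");

console.log(missingPins(sampleSkill, EXACT_PINS)); // ["slideshow trap"]
console.log(missingPins(sampleSkill, EXACT_PINS, false)); // []
console.log(missingPins(sampleSkill, EASING_AFFECTS)); // []
```

The first call shows the brittleness trade-off: a heading capitalized as "Slideshow Trap" fails the exact lowercase pin but passes the case-insensitive one.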

Nit

  • The ## Think in Frames, Not Pages section starts at :64 but the cross-reference from step 3 is See "Think in Frames" below (different wording). Either rename the section or update the reference for grep-findability.

Verdict

Verdict: APPROVE
Reasoning: The temporal-map step + slideshow-trap anti-patterns + easing-vocabulary are exactly the right shape for fighting the LLM's default-to-slides bias. The Rule 9 prompt-text pinning is the only material gap. PR is a strict subset of #762 — pick one to merge and close the other.

— Vai
