feat(analysis): add visual timing QA #73

Open
RayJiang4S wants to merge 9 commits into calesthio:main from RayJiang4S:codex/visual-timing-qa

@RayJiang4S

Summary

  • add visual_timing_qa, a post-render helper for cue-based visual timing review
  • support suggest_cues, dry_run, review, and annotate operations
  • generate review markdown, contact sheets, reviewer summaries, and structured annotation notes
  • document when to use/skip this workflow and wire it into the explainer QA guidance

Notes

This is intentionally human-in-the-loop. It does not call an external vision model or make automatic creative approvals; it organizes cue windows so reviewers can verify whether key visual states match narration.
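
As a rough illustration of what organizing cue windows could look like, the sketch below computes a few frame sample times around each cue so a reviewer can compare the frames against narration. The `Cue` fields, the `cue_window` helper, and the default padding are assumptions made for this example; they are not the actual interface of visual_timing_qa.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    """Hypothetical cue record: a narration moment plus the visual state expected then."""
    cue_id: str
    time_s: float      # when narration reaches the cue
    expectation: str   # what the reviewer should see on screen


def cue_window(cue: Cue, duration_s: float, pad_s: float = 0.5, samples: int = 3) -> List[float]:
    """Return evenly spaced sample times in [time - pad, time + pad], clamped to the video."""
    start = max(0.0, cue.time_s - pad_s)
    end = min(duration_s, cue.time_s + pad_s)
    if samples < 2 or end <= start:
        # Degenerate window: fall back to a single sample at the window start.
        return [round(start, 3)]
    step = (end - start) / (samples - 1)
    return [round(start + i * step, 3) for i in range(samples)]
```

No frame is judged automatically here; the helper only decides which timestamps to extract for the human reviewer.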

Test Plan

  • .venv/bin/python -m pytest tests/tools/test_visual_timing_qa.py -q
  • .venv/bin/python -m py_compile tools/analysis/visual_timing_qa.py tests/tools/test_visual_timing_qa.py

@RayJiang4S RayJiang4S requested a review from calesthio as a code owner May 12, 2026 08:39
Add conservative rule-based checks to visual_timing_qa review output so obvious timing and subtitle risks are surfaced before human review.
Add a reviewer-friendly HTML review surface with per-cue image navigation,
filters, inline decisions, copied review payloads, and machine-readable
action items for downstream video adjustment workflows.
@RayJiang4S

Author

Follow-up update for the Visual Timing QA review workflow.

This update turns the generated HTML from a passive contact-sheet artifact into an actionable reviewer surface:

  • Adds a reviewer-friendly HTML page with per-cue cards, frame thumbnails, and grouped lightbox navigation.
  • Adds filters for auto-review status and human-review status.
  • Adds inline human decisions: pass, needs adjustment, and wrong expectation.
  • Keeps review efficient by treating unreviewed items as an implicit pass on submit, so reviewers only mark exceptions.
  • Exports/copies a static JSON review payload from the HTML page for agent/workflow handoff.
  • Extends annotate to accept annotations_path and write summary plus action_items into review_notes.json.
  • Localizes review UI and heuristic messages from subtitle/narration language, while preserving English fallback.
  • Removes the contact-sheet overview link from the main UI in favor of per-cue frame review; contact sheets are still generated as artifacts.
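
To make the payload handoff concrete, here is a minimal sketch of how reviewer decisions might be folded into a summary plus action_items. The payload shape, the decision strings, and the `action` values are assumptions for illustration only; the real schema written to review_notes.json may differ.

```python
from typing import Dict, List

# Assumed decision labels, mirroring the three inline choices on the review page.
DECISIONS = ("pass", "needs_adjustment", "wrong_expectation")

# Hypothetical mapping from a non-pass decision to a follow-up action.
ACTIONS = {
    "needs_adjustment": "adjust_timing",
    "wrong_expectation": "fix_cue_definition",
}


def build_review_notes(cue_ids: List[str], decisions: Dict[str, dict],
                       implicit_pass: bool = True) -> dict:
    """Summarize decisions and collect one action item per non-passing cue."""
    summary = {name: 0 for name in DECISIONS}
    summary["unreviewed"] = 0
    action_items = []
    for cue_id in cue_ids:
        entry = decisions.get(cue_id)
        if entry is None:
            # Untouched cue: count as pass only when implicit pass is enabled.
            summary["pass" if implicit_pass else "unreviewed"] += 1
            continue
        decision = entry["decision"]
        if decision not in DECISIONS:
            raise ValueError(f"unknown decision for {cue_id}: {decision}")
        summary[decision] += 1
        if decision != "pass":
            action_items.append({
                "cue_id": cue_id,
                "action": ACTIONS[decision],
                "note": entry.get("note", ""),
            })
    return {"summary": summary, "action_items": action_items}
```

With this shape, a reviewer who marks a single cue as needing adjustment produces exactly one action item, and every other cue is counted as passed.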

The intended production loop is:

  1. Generate a Visual Timing QA review page after rendering.
  2. Reviewer marks only problematic cues and submits the copied JSON payload.
  3. The agent/workflow runs annotate with that payload.
  4. review_notes.json.action_items becomes the machine-readable handoff for adjusting cue timing, animation timing, subtitle rendering, or cue definitions.
  5. Re-render and run a focused recheck until action_items is empty.
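
The loop above can be sketched as a small driver that repeats until no action items remain. Both callables are placeholders supplied by the caller: `review_round` stands in for render, review, and annotate, and `apply_fixes` stands in for whatever adjusts the video. Neither name comes from the tool itself.

```python
from typing import Callable, List


def review_until_clean(review_round: Callable[[int], dict],
                       apply_fixes: Callable[[List[dict]], None],
                       max_rounds: int = 5) -> int:
    """Run review rounds until action_items is empty; return the round that passed."""
    for round_no in range(1, max_rounds + 1):
        # Assumed to return a dict shaped like review_notes.json.
        notes = review_round(round_no)
        items = notes.get("action_items", [])
        if not items:
            return round_no
        # Hand the machine-readable items to the adjustment step, then re-review.
        apply_fixes(items)
    raise RuntimeError(f"action_items still non-empty after {max_rounds} rounds")
```

The `max_rounds` cap is a design choice for the sketch, so a cue that can never pass does not keep the loop running indefinitely.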

Validated locally with:

python -m pytest tests/tools/test_visual_timing_qa.py -q

Result: 11 passed.

Add machine-readable completion state for visual timing annotations so review rounds can continue until every cue passes. Honor unreviewed_policy=PASS submissions from the interactive review page and document the rerender/review loop.
@RayJiang4S

Author

Added a generic English demo screenshot for the Visual Timing QA review page, so the PR does not rely on any private project footage or internal UI.

The latest update also makes the human review loop explicit and iterative: annotate now reports review_complete, next_operation, pending_review_cues, and missing_review_cues. If any cue still needs timing/layout/expectation changes, the workflow sends the video back for revision, then reruns review and annotate until every cue passes.

The interactive page's unreviewed_policy=PASS is now honored by the tool as well, so untouched cues can be recorded as passed when the reviewer only marks problematic items.
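
A minimal sketch of how such a completion state could be derived, assuming decisions arrive as a plain mapping from cue id to a decision string. The four output keys follow the field names mentioned above; the `next_operation` values and the function name are hypothetical.

```python
from typing import Dict, List, Optional


def completion_state(cue_ids: List[str], decisions: Dict[str, str],
                     unreviewed_policy: Optional[str] = None) -> dict:
    """Derive review completion fields from per-cue decisions."""
    missing = [c for c in cue_ids if c not in decisions]
    if unreviewed_policy == "PASS":
        # Untouched cues are recorded as passed, so nothing counts as missing.
        decisions = {**decisions, **{c: "pass" for c in missing}}
        missing = []
    pending = [c for c in cue_ids if c in decisions and decisions[c] != "pass"]
    complete = not pending and not missing
    if complete:
        next_operation = None
    elif pending:
        next_operation = "revise_and_review"  # hypothetical value
    else:
        next_operation = "review"             # hypothetical value
    return {
        "review_complete": complete,
        "next_operation": next_operation,
        "pending_review_cues": pending,
        "missing_review_cues": missing,
    }
```

Under these assumptions, a submission that marks nothing and sets `unreviewed_policy` to `"PASS"` is reported as complete, while any non-pass decision keeps the cue in `pending_review_cues`.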

[Image: generic English demo screenshot of the Visual Timing QA review page]

Add process and node-flow review prompts plus initial-review heuristics so visual timing QA can flag states that may complete before narration reaches the cue.
@RayJiang4S

Author

Updated this branch with the early-complete timing QA follow-up from the latest promo review.

What changed:

  • Added an early_complete_process_check capability marker for visual timing QA.
  • Added process/node-flow review prompts so reviewers explicitly check whether the final state has already completed before narration reaches the cue.
  • Included early-complete metadata in Markdown and HTML review output.
  • Added rule-based initial-review notes for static process windows and cases that appear settled by the target frame.
  • Covered the new behavior in tests/tools/test_visual_timing_qa.py.
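
For readers unfamiliar with the idea, an early-complete heuristic might look roughly like the sketch below. It assumes each sampled frame in a cue window comes with a change score relative to the previous sample (0 means identical); the input format, the threshold, and the note strings are invented for this example and are not taken from the PR's implementation.

```python
from typing import List, Tuple


def early_complete_notes(frame_changes: List[Tuple[float, float]],
                         target_time_s: float,
                         static_threshold: float = 0.01) -> List[str]:
    """Flag windows that never move, or whose motion ends before the target frame."""
    if not frame_changes:
        return []
    moving_times = [t for t, change in frame_changes if change >= static_threshold]
    if not moving_times:
        # Nothing changes across the window: the process may have finished earlier.
        return ["static_process_window"]
    if max(moving_times) < target_time_s:
        # The last visible change happens before narration reaches the cue.
        return ["settled_before_target"]
    return []
```

The notes only surface candidates; a reviewer still decides whether the visual state really completed too early.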

Validation:

  • .venv/bin/python -m pytest tests/tools/test_visual_timing_qa.py -q passed: 14 tests.
  • py_compile passed for tools/analysis/visual_timing_qa.py and tests/tools/test_visual_timing_qa.py.
  • The broader run over tests/tools/test_visual_timing_qa.py and tests/contracts/test_phase3_contracts.py (with -q) still hits the pre-existing provider catalog mismatch, because this local registry includes doubao; the failure is unrelated to this PR.

Add agent-facing review routing, stale visual-state checks, and contact sheet lightbox navigation for visual timing QA.