
Sovereign portfolio: 3 video tools (demo_mockup + wan22_mlx_video) + MPS support #87

Open
timgfallon1Ai wants to merge 5 commits into calesthio:main from timgfallon1Ai:sovereign/video-tools-2026-05-24


Summary

The first three commits add video-production tools used by Tim Fallon's Sovereign portfolio (ATX Mats / GLI / GBB / Sovereign Investments tenants). Each of these commits is self-contained and additive: no existing tool behavior changes.

1. tools/video/demo_mockup.py (519 lines, commit a00e0f9)

Sovereign-native equivalent of openvid (https://github.com/CristianOlivera1/openvid), a browser-based demo/mockup creator, built here as a CORE-tier BaseTool that lives in the OpenMontage tool registry instead of as a Next.js app. Pure FFmpeg backbone (zoompan for Ken Burns motion, xfade for crossfades, concat for the final stitch). Branded intro/outro text is rendered via Pillow → PNG → ffmpeg overlay filter, deliberately avoiding ffmpeg drawtext because Homebrew ffmpeg often ships without libfreetype. Smoke-tested end to end: 2 stills + intro + outro → 7-second H.264 1920×1080 mp4, rendered in 1.54 s.

2. tools/graphics/local_diffusion.py MPS autoselect (commit 6bca593)

Adds an Apple Silicon MPS device branch to the existing diffusers + StableDiffusionPipeline path. Previously the code only checked torch.cuda.is_available() and then fell back to CPU, leaving M-series machines unaccelerated. Also handles the MPS-specific torch.Generator(device='mps') limitation by using a CPU generator and letting the pipeline handle device transfer. Device detection tested on M5 Max 128GB / M4 32GB.

3. tools/video/wan22_mlx_video.py (360 lines, commit cf5f1fe)

New GENERATE-tier video tool wrapping Prince Canuma's mlx-video package (https://github.com/Blaizzy/mlx-video) for Apple-Silicon-native Wan 2.2 text-to-video / image-to-video generation. Pure MLX, no PyTorch dependency, runs on M-series unified memory. Supports TI2V-5B / T2V-14B / I2V-14B variants with per-variant metadata. Fallback chain: wan_video → ltx_video_local → ltx_video_modal → image_selector. Auto-registers in tool_registry under name wan22_mlx_video.
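
A minimal sketch of how a fallback chain like this could be resolved, assuming the registry offers some per-tool availability check. The `first_available` helper and its `is_available` callback are hypothetical names used for illustration; only the tool names come from the description above.

```python
from typing import Callable, Optional, Sequence

# Tool names are taken from the PR description; the resolver is illustrative only.
PRIMARY = "wan22_mlx_video"
FALLBACK_CHAIN = ("wan_video", "ltx_video_local", "ltx_video_modal", "image_selector")


def first_available(
    primary: str,
    fallbacks: Sequence[str],
    is_available: Callable[[str], bool],
) -> Optional[str]:
    """Return the first tool in [primary, *fallbacks] that reports itself usable."""
    for name in (primary, *fallbacks):
        if is_available(name):
            return name
    return None  # nothing in the chain can run on this machine


# Example: a machine where only two of the fallback tools are usable.
installed = {"ltx_video_local", "image_selector"}
chosen = first_available(PRIMARY, FALLBACK_CHAIN, lambda name: name in installed)
print(chosen)
```

Passing the availability check in as a callback keeps the ordering logic separate from however the registry actually reports status.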

Test plan

  • All three modules import cleanly under Python 3.12 + the OpenMontage tool-registry discovery path
  • demo_mockup end-to-end smoke test produces a valid 7s mp4
  • wan22_mlx_video.get_status() correctly reports UNAVAILABLE when mlx-video isn't installed (graceful degradation)
  • local_diffusion device autoselect priority verified: CUDA → MPS → CPU
  • Reviewer to confirm coding-conventions alignment with broader OpenMontage style

Provenance

These commits originated in Sovereign's 2026-05-23 photo/video stack audit. Built on M5 Max 128 GB; tested against M4 32 GB venv (Python 3.12). Co-authored by Claude Opus 4.7 (1M context).

🤖 Generated with Claude Code

timgfallon1Ai and others added 3 commits May 24, 2026 11:00
Adds a new CORE-tier video tool that produces polished product-demo
videos from screenshots / product stills via FFmpeg Ken Burns motion +
PIL-rendered branded intro/outro cards.

Pattern inspired by https://github.com/CristianOlivera1/openvid (browser-
based demo/mockup creator) but native to the OpenMontage tool registry —
no Next.js / browser dependency. Agents and the marketing ensemble drive
this tool directly via the standard BaseTool execute() interface.

Use cases for the Sovereign portfolio:
- ATX Mats product showcases (mat screenshots → polished demo reels)
- GLI LED-sign listing videos (product images → Amazon-style demos)
- GBB facility tours (gym photos → social-ready videos)
- Sovereign Mind app showcase (app screenshots → product demo)
- Tenant marketing-sleeve "drop screenshot, get demo" workflow

Implementation notes:
- Pure FFmpeg backbone (zoompan for Ken Burns motion, xfade for
  crossfades, concat demuxer for final stitch)
- Branded intro/outro text rendered via Pillow → PNG → ffmpeg overlay
  filter, deliberately avoiding ffmpeg `drawtext` (Homebrew ffmpeg ships
  without libfreetype on many systems)
- Inputs: stills list, output_path, optional title/subtitle/cta,
  brand_primary_hex (drives intro/outro background — pulled from
  TenantBrand.palette[0] when invoked from the marketing ensemble)
- Deterministic given identical inputs
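
To make the zoompan step above concrete, here is a hedged sketch of how one Ken Burns segment command might be assembled. It is not the code in demo_mockup.py: the `ken_burns_argv` helper, the `max_zoom` default, and the file names are assumptions for illustration. The zoompan options used (`z`, `x`, `y`, `d`, `s`, `fps`) are standard ffmpeg filter options; the 1920x1080 / 30 fps defaults mirror the smoke-test output.

```python
from typing import List


def ken_burns_argv(
    still_path: str,
    out_path: str,
    seconds: float,
    width: int = 1920,
    height: int = 1080,
    fps: int = 30,
    max_zoom: float = 1.2,  # assumed value, not taken from the PR
) -> List[str]:
    """Build an ffmpeg command that slowly zooms into the centre of one still."""
    frames = round(seconds * fps)
    # Spread the zoom evenly so it approaches max_zoom by the last frame.
    step = (max_zoom - 1.0) / max(frames, 1)
    zoompan = (
        f"zoompan=z='min(zoom+{step:.6f},{max_zoom})'"
        f":x='iw/2-(iw/zoom/2)':y='ih/2-(ih/zoom/2)'"
        f":d={frames}:s={width}x{height}:fps={fps}"
    )
    return [
        "ffmpeg", "-y",
        "-i", still_path,
        "-vf", zoompan,
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        out_path,
    ]


# The command is only built here, not executed.
argv = ken_burns_argv("still_01.png", "seg_01.mp4", seconds=3.0)
print(" ".join(argv))
```

Building the argument list as data (rather than a shell string) avoids quoting problems and makes the filter graph easy to unit-test without invoking ffmpeg.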

Smoke test (2 stills + intro + outro, 8.5s requested duration):
  success: True
  segments: 4
  output: 7.0s mp4, h264, 1920x1080, 30fps, 371KB
  duration_seconds: 1.54
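
One reading consistent with these numbers (not confirmed by the commit message) is that crossfades overlap adjacent segments, so the output is shorter than the sum of the requested segment durations. The sketch below shows that arithmetic; the per-segment durations and the 0.5 s fade length are assumptions chosen only so that four segments total 8.5 s.

```python
from typing import List, Sequence


def xfade_offsets(durations: Sequence[float], fade: float) -> List[float]:
    """Time (seconds) in the combined stream at which each crossfade starts."""
    offsets = []
    elapsed = 0.0
    for k, seg in enumerate(durations[:-1], start=1):
        elapsed += seg
        # Each earlier crossfade has already shortened the timeline by `fade`.
        offsets.append(elapsed - k * fade)
    return offsets


def crossfaded_length(durations: Sequence[float], fade: float) -> float:
    """Total length after overlapping each adjacent pair by `fade` seconds."""
    if not durations:
        return 0.0
    return sum(durations) - (len(durations) - 1) * fade


# Assumed split: intro, two stills, outro (8.5 s requested in total).
segments = [1.5, 2.75, 2.75, 1.5]
print(xfade_offsets(segments, fade=0.5))
print(crossfaded_length(segments, fade=0.5))
```

Under these assumptions three overlaps of 0.5 s remove 1.5 s, which would turn 8.5 s of requested material into a 7.0 s file.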

Pairs with: audio_mixer (voiceover overlay), remotion_caption_burn
(captions), auto_reframe (re-aspect for multi-channel publish).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The 2026-05-23 portfolio audit flagged this tool as "Scaffolding only —
no diffusers import; HF integration stub" but the actual code is a
working diffusers + StableDiffusionPipeline implementation. The real
gap was that device selection only checked torch.cuda and fell back to
CPU — leaving Apple Silicon (M4 / M5 Max) with no GPU acceleration.

This patch adds an MPS detection branch before CPU fallback:
- CUDA → fp16
- MPS (Apple Silicon) → fp16 + enable_attention_slicing()
- CPU → fp32

Also handles the MPS-specific torch.Generator quirk: MPS doesn't
support device='mps' generators, so we use a CPU generator when on MPS
and let the pipeline handle device transfer.
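
The selection order described above can be sketched as a small pure function. This is an illustration, not the patch itself: `DevicePlan` and `plan_device` are hypothetical names, and in real code the two flags would come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`. The generator device on CUDA is an assumption; the commit message only specifies the CPU generator for MPS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DevicePlan:
    device: str              # where the pipeline runs
    dtype: str               # "float16" or "float32"
    attention_slicing: bool  # whether to call enable_attention_slicing()
    generator_device: str    # where the seeded RNG lives


def plan_device(cuda_available: bool, mps_available: bool) -> DevicePlan:
    """Pick a device in priority order: CUDA, then MPS, then CPU."""
    if cuda_available:
        # Generator placement on CUDA is assumed here, not stated in the PR.
        return DevicePlan("cuda", "float16", False, "cuda")
    if mps_available:
        # On MPS the seeded generator stays on the CPU, as described above.
        return DevicePlan("mps", "float16", True, "cpu")
    return DevicePlan("cpu", "float32", False, "cpu")


# Example: an Apple Silicon machine with no CUDA device.
print(plan_device(cuda_available=False, mps_available=True))
```

Keeping the decision separate from the torch calls makes the CUDA → MPS → CPU priority testable on any machine, including ones without a GPU.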

Verified on M5 Max 128GB / M4 32GB:
  - MPS available (torch.backends.mps.is_available())
  - SD2.1-base runs at ~1-2 it/s with fp16 on M5 Max
  - CPU fallback path unchanged

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a new GENERATE-tier video tool wrapping Prince Canuma's mlx-video
package (https://github.com/Blaizzy/mlx-video) for Apple-Silicon-native
Wan 2.2 text-to-video / image-to-video generation. Pure MLX, no PyTorch
dependency, runs on M-series unified memory.

Why separate from existing wan_video.py:
- wan_video.py wraps diffusers + PyTorch Wan 2.1 (CUDA-first, MPS has
  rough edges for video diffusion)
- wan22_mlx_video wraps mlx-video's Wan 2.2 path — faster on Apple
  Silicon, supports newer Wan 2.2 architectures (T2V-14B, TI2V-5B,
  I2V-14B)
- gbb-os / atx-os / GLI-OS marketing ensembles call
  registry.get("wan22_mlx_video") by name — this provides the impl

Implementation:
- BaseTool subclass with SEEDED determinism + LOCAL_GPU runtime
- get_status() gates on platform.machine() in (arm64, aarch64) + dynamic
  import of mlx + mlx_video — never crashes at module import time
- execute() builds subprocess invocation of `python -m
  mlx_video.wan_2.generate` with prompt / dims / frames / steps / seed
  / reference_image / extra_args pass-through
- WAN22_MLX_VARIANTS dict covers TI2V-5B (M4-friendly default),
  T2V-14B, I2V-14B with per-variant hf_id / vram / quality / speed
  metadata
- Fallback chain: wan_video → ltx_video_local → ltx_video_modal →
  image_selector
- Returns clean install instructions (mlx-video install + huggingface-cli
  model download) when deps absent
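
The status gating described in the list above can be approximated as follows. This is a sketch under assumptions, not the code in wan22_mlx_video.py: `mlx_video_status` is a hypothetical name, the "AVAILABLE" label is assumed (the PR only names UNAVAILABLE), and the check uses `importlib.util.find_spec` so that nothing heavy is imported at module load time.

```python
import importlib.util
import platform
from typing import Callable, Optional, Tuple

REQUIRED_MODULES = ("mlx", "mlx_video")


def mlx_video_status(
    machine: Optional[str] = None,
    find_spec: Callable[[str], object] = importlib.util.find_spec,
) -> Tuple[str, str]:
    """Return (status, reason) without importing mlx or mlx_video."""
    arch = machine if machine is not None else platform.machine()
    if arch not in ("arm64", "aarch64"):
        return "UNAVAILABLE", f"requires an arm64/aarch64 machine, found {arch!r}"
    # find_spec returns None for a missing top-level package instead of raising.
    missing = [name for name in REQUIRED_MODULES if find_spec(name) is None]
    if missing:
        return "UNAVAILABLE", "missing modules: " + ", ".join(missing)
    return "AVAILABLE", "mlx and mlx_video are importable"


# Reports the status of whatever machine runs this snippet.
print(mlx_video_status())
```

The `machine` and `find_spec` parameters exist only so the gate can be exercised in tests on non-Apple hardware.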

Verified:
- Imports cleanly + smoke-tested status reporting (UNAVAILABLE without
  mlx-video installed — correct)
- Auto-registers in tool_registry under name "wan22_mlx_video"
- registry.get("wan22_mlx_video") returns the tool — unblocks the
  marketing-ensemble call sites in gbb-os/atx-os/GLI-OS Phase C
- Real model run deferred (mlx-video install + multi-GB Wan 2.2 weight
  download required for end-to-end test)

Pairs with: local_diffusion (for FLUX-style stills to feed into I2V),
demo_mockup (for screenshot-driven product demos), audio_mixer
(voiceover overlay).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@timgfallon1Ai timgfallon1Ai requested a review from calesthio as a code owner May 24, 2026 16:57
timgfallon1Ai and others added 2 commits May 24, 2026 13:36
_render_text_card was migrated to the Pillow-PNG + ffmpeg-overlay path
because Homebrew ffmpeg ships without libfreetype (no drawtext filter
available). The drawtext escape helper became unreachable code.
Removed for clarity; no callers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tests

Completes the OpenMontage half of the B8 (Sovereign DAM Phase 2)
plan: every successful artifact from an in-scope generation tool now
auto-registers in the per-tenant Sovereign DAM and gets a
result.data["dam_asset_id"] for downstream marketing-ensemble
dedup.

Tools wired (3-edit pattern: import hook + spread schema fragment +
wrap success ToolResult):
  - tools/video/demo_mockup.py        capability=video_post     → composed_video
  - tools/audio/audio_mixer.py        capability=audio_processing → audio
  - tools/graphics/flux_image.py      capability=image_generation → still
  - tools/graphics/local_diffusion.py capability=image_generation → still
  - tools/video/wan22_mlx_video.py    capability=video_generation → motion_clip
  - tools/graphics/flux2_klein_mlx.py capability=image_generation → still (new tool)

tools/dam_hook.py is the single import surface; if/when the DAM is
extracted to a sibling sovereign-dam repo, only this file changes.
Hook is a no-op when sovereign_swarm.dam isn't importable, when the
caller omits tenant_key, when the capability is unmapped, when the
artifact path doesn't exist, or when registry.register raises — DAM
auto-registration is observability, never gating.
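
A minimal sketch of the "observability, never gating" behaviour described above. It is not the contents of tools/dam_hook.py: `register_artifact`, its parameters, and the `register` callback signature are hypothetical. The capability-to-kind mapping is copied from the tool list in this commit message.

```python
import os
from typing import Any, Callable, Dict, Optional

# Mapping taken from the "Tools wired" list above.
CAPABILITY_TO_KIND = {
    "video_post": "composed_video",
    "audio_processing": "audio",
    "image_generation": "still",
    "video_generation": "motion_clip",
}


def register_artifact(
    data: Dict[str, Any],
    artifact_path: str,
    capability: str,
    tenant_key: Optional[str],
    register: Optional[Callable[..., str]],
) -> Dict[str, Any]:
    """Return `data`, plus "dam_asset_id" when registration succeeds.

    Every failure mode returns `data` unchanged, so the tool result is
    never blocked by the DAM.
    """
    if register is None or not tenant_key:
        return data  # DAM not importable, or caller omitted tenant_key
    kind = CAPABILITY_TO_KIND.get(capability)
    if kind is None or not os.path.exists(artifact_path):
        return data  # unmapped capability, or artifact missing on disk
    try:
        asset_id = register(tenant_key=tenant_key, path=artifact_path, kind=kind)
    except Exception:
        return data  # registration errors are swallowed by design
    return {**data, "dam_asset_id": asset_id}


# Example with a stand-in registry function; "." is used as a path that exists.
result = register_artifact({"output": "."}, ".", "video_post", "atx",
                           lambda **kw: "asset-123")
print(result)
```

Returning a new dict rather than mutating `data` keeps the success path and every no-op path easy to compare in tests.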

tests/test_dam_hook.py covers the full negative surface plus three
happy-path assertions. SOVEREIGN_DAM_ROOT env override is honored so
tests run against a tmp DAM rather than the real one. 10 tests, all
green against the atx-os venv with PYTHONPATH including sovereign-swarm.
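
The environment override mentioned above could look roughly like this. Only the variable name SOVEREIGN_DAM_ROOT comes from the commit message; the `dam_root` helper and the default path are hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping


def dam_root(default: Path, env: Mapping[str, str] = os.environ) -> Path:
    """Use SOVEREIGN_DAM_ROOT when set and non-empty, else the default root."""
    override = env.get("SOVEREIGN_DAM_ROOT")
    return Path(override) if override else default


# Example: a test pointing the DAM at a temporary directory.
# Both paths below are placeholders, not real project locations.
print(dam_root(Path("dam"), env={"SOVEREIGN_DAM_ROOT": "tmp_dam"}))
```

Accepting the environment as a parameter lets tests pass a plain dict instead of mutating `os.environ`.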

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>