Pull#65

Open
dramankhanna91-svg wants to merge 7 commits into calesthio:main from dramankhanna91-svg:main

Conversation

@dramankhanna91-svg

No description provided.

Aman Khanna and others added 7 commits April 14, 2026 18:54
Plan for implementing 3 Google Flow account slots with sequential
fallback (session_1/2/3.json). Covers _google_flow_base.py helpers,
google_flow_setup.py --slot arg, and execute() refactor in both
google_flow_image and google_flow_video.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
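
The sequential fallback across the three session slots described in this commit could look roughly like the sketch below. It is not the code from `_google_flow_base.py`: `slot_path`, `run_with_fallback`, and the `attempt` callback are hypothetical names, and only the `session_1/2/3.json` naming comes from the commit message.

```python
from pathlib import Path
from typing import Callable, List, TypeVar

T = TypeVar("T")

# Assumption: three slots, named session_1.json .. session_3.json as in the commit message.
SLOT_COUNT = 3


def slot_path(session_dir: Path, slot: int) -> Path:
    """Return the session file for a given slot number (hypothetical helper)."""
    return session_dir / f"session_{slot}.json"


def run_with_fallback(session_dir: Path, attempt: Callable[[Path], T]) -> T:
    """Try each saved session in slot order and return the first successful result.

    `attempt` receives the session file path and should raise on failure
    (for example when the session has expired).
    """
    errors: List[str] = []
    for slot in range(1, SLOT_COUNT + 1):
        path = slot_path(session_dir, slot)
        if not path.exists():
            # A slot that was never set up is skipped rather than treated as fatal.
            errors.append(f"slot {slot}: no session file")
            continue
        try:
            return attempt(path)
        except Exception as exc:
            # Record the failure and fall through to the next slot.
            errors.append(f"slot {slot}: {exc}")
    raise RuntimeError("all Google Flow slots failed: " + "; ".join(errors))
```

A caller would pass its generation routine as `attempt`; the aggregated error message shows why each slot was skipped or failed.
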
Split into dedicated image and video tools sharing a base module:
- google_flow_image.py (v1.2.0): Nano Banana image generation; batch
  quantity x1-x4, all images downloaded, ingredients limit enforced
- google_flow_video.py (v0.5.0): Veo video generation; frames/ingredients
  mutual exclusion, camera motion, continue_prompt, batch download, 720p-4K
- _google_flow_base.py: shared Playwright helpers (session check, settings
  panel, frame/ingredient upload, download submenu, camera motion)
- google_flow_setup.py: one-time login + session save
- GOOGLE_FLOW_TOOLS.md: human-readable capability map

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New tools:
- tools/video/heygen_browser_video.py: HeyGen avatar video via Playwright
- tools/video/magic_hour_video.py: Magic Hour video via Playwright
- tools/graphics/giphy_search.py: Giphy animated GIF search

New agent skills (Layer 3):
- .agents/skills/google-flow/SKILL.md: Google Flow prompting + parameter guide
- .agents/skills/giphy/SKILL.md: Giphy search guidance
- .agents/skills/magic-hour/SKILL.md: Magic Hour browser automation guide
- .agents/skills/heygen/SKILL.md: updated HeyGen skill

New meta/tool skills:
- skills/meta/browser-tools-setup.md: Playwright session setup guide
- skills/tools/browser-tools.md: browser tool usage patterns

Core updates:
- tools/base_tool.py: extended BaseTool with new fields
- AGENT_GUIDE.md, PROJECT_CONTEXT.md: updated for new capabilities
- CHANGELOG.md, CODE_REVIEW.md, docs/useapi_net_analysis.md: added

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- google-flow/SKILL.md: add Usage Rules section (no 4k, camera motion
  off by default, continue_prompt director-only, first_frame/last_frame
  for continuity only with fail-and-drop behavior, ingredients only for
  user-provided references); update download quality table to drop 4k;
  expand parameter docs with validated defaults from committed code
- skills/tools/browser-tools.md: add inline rules note to google_flow_video
  entry reflecting same constraints
- CHANGELOG.md: add commit SHAs (3aa57ab, d7b4f1b), expand google_flow
  entries with version numbers and full feature list, mark 720p bug fixed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…magen

- piper_tts: resolve voice model path from ~/.local/share/piper-tts/voices,
  ~/.piper/models, or $PIPER_MODELS_DIR before passing to --model flag
- google_flow_image: fix NameError — final_prompt → prompt (line 264)
- google_flow_video + google_flow_image: raise page.goto timeout 30s → 150s,
  switch waitUntil networkidle → domcontentloaded, add session-expired detection
- video_compose + remotion_caption_burn: add _check_libass() cached pre-flight;
  gracefully skip subtitle filter with RuntimeWarning when libass absent
- google_tts: add Gemini AI Studio mode via generativelanguage.googleapis.com;
  GEMINI_API_KEY/GOOGLE_API_KEY → gemini_studio, GOOGLE_APPLICATION_CREDENTIALS
  → cloud_tts; result includes mode field
- google_imagen: add dual-mode (gemini_studio / vertex_ai) + Gemini Pro and
  Flash image generation via response_modalities IMAGE; expose model aliases
  gemini-pro and gemini-flash in input schema; cost estimates updated
- .env.example: document GEMINI_API_KEY, GOOGLE_APPLICATION_CREDENTIALS,
  PIPER_MODELS_DIR, HeyGen MCP OAuth note, Gemini video generation (Veo 2/3)
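
As an illustration of the `piper_tts` change above, a voice name can be resolved to a model file by searching the listed directories before it is handed to `--model`. This is a sketch, not the committed code: the function names are hypothetical, and two details are assumptions that the commit message does not state, namely that `$PIPER_MODELS_DIR` takes precedence over the home-directory locations and that model files use the `.onnx` extension.

```python
import os
from pathlib import Path
from typing import List, Optional


def candidate_dirs() -> List[Path]:
    """Directories to search, in an assumed order of precedence."""
    dirs: List[Path] = []
    env_dir = os.environ.get("PIPER_MODELS_DIR")
    if env_dir:
        # Assumption: an explicit environment override is checked first.
        dirs.append(Path(env_dir).expanduser())
    dirs.append(Path.home() / ".local" / "share" / "piper-tts" / "voices")
    dirs.append(Path.home() / ".piper" / "models")
    return dirs


def resolve_voice_model(voice: str) -> Optional[Path]:
    """Return the path of the model file for `voice`, or None if not found."""
    direct = Path(voice).expanduser()
    if direct.is_file():
        # The caller already passed a usable file path.
        return direct
    # Assumption: voice models are stored as <voice>.onnx.
    filename = voice if voice.endswith(".onnx") else f"{voice}.onnx"
    for directory in candidate_dirs():
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None
```

Returning `None` instead of raising leaves it to the caller to decide whether a missing model is an error or a reason to fall back to another TTS provider.
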

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
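
The credential-to-mode mapping that this commit describes for `google_tts` can be summarized in a few lines. This is a minimal sketch with a hypothetical function name; the commit message does not say which mode wins when both kinds of credentials are present, so preferring the API key here is an assumption.

```python
from typing import Mapping, Optional


def select_tts_mode(env: Mapping[str, str]) -> Optional[str]:
    """Pick a backend mode from the available credentials.

    Returns "gemini_studio", "cloud_tts", or None when nothing is configured.
    """
    # Assumption: an API key takes priority if both credential types are set.
    if env.get("GEMINI_API_KEY") or env.get("GOOGLE_API_KEY"):
        return "gemini_studio"
    if env.get("GOOGLE_APPLICATION_CREDENTIALS"):
        return "cloud_tts"
    return None
```

Taking the environment as a mapping argument, for example `select_tts_mode(os.environ)`, keeps the selection logic easy to test without touching real credentials.
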
- GOOGLE_FLOW_EMAIL: browser-based Google Flow setup
- HIGGSFIELD_API_KEY/SECRET + combined HIGGSFIELD_KEY
- MAGICHOUR_API_KEY: Magic Hour video effects
- COVERR, POND5, VIDEVO, NARA, NASA stock footage keys
- GIPHY_API_KEY: GIF overlay support

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>