Merged
89 commits
4e4fde7
Merge branch 'main' into dev
navidshad Feb 24, 2026
033beb8
feat: Implement background task management for preprocessing and upda…
navidshad Feb 24, 2026
092948e
Merge pull request #34 from navidshad/CU-86ewqdkec_Implement-Backgrou…
navidshad Mar 27, 2026
5926744
feat: introduce Vue Flow graph-based chat interface for parallel task…
navidshad Mar 27, 2026
d717e4c
feat: Implement robust cross-platform scenedetect binary and module p…
navidshad Mar 27, 2026
4f2595e
feat: Implement retry functionality for message processing and prepro…
navidshad Mar 27, 2026
ad82624
feat: Add extensive debug logging and improve asynchronous handling w…
navidshad Mar 27, 2026
5439720
feat: Implement video playback and download functionality in result n…
navidshad Mar 27, 2026
a87994b
feat: redesign graph nodes with interactive video previews and add Co…
navidshad Mar 29, 2026
57087e0
feat: add draggable handle to ConversationNode and restrict drag inte…
navidshad Mar 29, 2026
193c978
feat: implement persistent graph node positioning with drag-and-drop …
navidshad Mar 29, 2026
104eee1
feat: add version and file type badges to ResultNode and include vers…
navidshad Mar 29, 2026
ed0c467
feat: add markdown rendering support to ConversationNode and ResultNo…
navidshad Mar 29, 2026
063aa69
Merge pull request #35 from navidshad/CU-86ex2bna2_Support-parallel-t…
navidshad Mar 29, 2026
07ba7d6
chore: release 1.1.6 [skip ci]
github-actions[bot] Mar 29, 2026
1bc2dbd
Merge remote-tracking branch 'origin/main' into dev
navidshad Mar 29, 2026
7b8a56c
feat: implement automated thumbnail generation pipeline, add Gemini 3…
navidshad Mar 29, 2026
6f3a146
feat: implement recursive message branch deletion and add UI controls…
navidshad Mar 29, 2026
719b7c7
Merge pull request #36 from navidshad/CU-86ex2rmyn_Implement-Thumbnai…
navidshad Mar 30, 2026
30b811b
refactor: replace SRT format with line-based transcript format for im…
navidshad Mar 30, 2026
12ce900
feat: add waitForEnrichTranscript pipeline phase and remove inline tr…
navidshad Mar 30, 2026
9bd4d73
feat: introduce EnrichedTimelineSegment and update transcript enrichm…
navidshad Mar 30, 2026
b80559b
Merge pull request #37 from navidshad/CU-86ewqdkht_Fix-transcript-par…
navidshad Mar 30, 2026
874788d
refactor: redesign ResultNode UI with media-centric layout and enhanc…
navidshad Mar 31, 2026
eaf3698
refactor: rename videoUrl to mediaContentUrl and add image support to…
navidshad Mar 31, 2026
54efd4b
feat: upgrade timeline segments to include visual metadata and update…
navidshad Mar 31, 2026
7ce8506
refactor: update GraphChatPage header layout with constrained title w…
navidshad Mar 31, 2026
54ac31a
Merge pull request #38 from navidshad/CU-86ex3gkk8_Add-cost-detail-Vi…
navidshad Mar 31, 2026
2341b3b
feat: integrate AI-generated release notes into the GitHub release wo…
navidshad Mar 31, 2026
347ce54
feat: implement global design system with custom colors, glassmorphis…
navidshad Mar 31, 2026
e9cd917
docs: update PilotUI documentation with source URLs, improved navigat…
navidshad Apr 1, 2026
0026d25
feat: upgrade chat inputs to auto-resizing textareas with consistent …
navidshad Apr 1, 2026
4105e65
feat: add copy-to-clipboard functionality to conversation messages #8…
navidshad Apr 1, 2026
e25f0dc
Merge pull request #41 from navidshad/CU-86ex4147w_Support-multi-line…
navidshad Apr 1, 2026
8a42dc1
feat: add temporary directory safety checks and UI warnings for unsta…
navidshad Apr 1, 2026
d9e6579
Merge pull request #42 from navidshad/CU-86ex41t5v_Add-Wrning-for-tem…
navidshad Apr 1, 2026
c84b74e
feat: add system instruction support to Gemini adapter and integrate …
navidshad Apr 1, 2026
69b59a4
feat: implement pipeline cancellation support using AbortSignal acros…
navidshad Apr 1, 2026
33f5fe5
refactor: ensure usage is recorded immediately and add stop confirmat…
navidshad Apr 1, 2026
96915c0
refactor: organize temporary files into subdirectories, implement imm…
navidshad Apr 1, 2026
9fa85a1
Merge branch 'dev' of https://github.com/navidshad/vgtu-video-summari…
navidshad Apr 1, 2026
e6cad9b
feat: add AbortSignal to task context for cancellation support
navidshad Apr 1, 2026
e06201b
feat: add status field to background tasks and display it in MediaNod…
navidshad Apr 1, 2026
b1c2bcb
feat: implement video URL download support using yt-dlp integration #…
SomiVista Apr 1, 2026
6f63783
feat: implement video resolution selection by adding format fetching …
SomiVista Apr 1, 2026
57b6856
refactor: remove hover-based opacity transitions and update overlay z…
navidshad Apr 1, 2026
7ff8b36
feat: add video metadata retrieval and display in ResultNode component
navidshad Apr 1, 2026
cc1bbae
feat: implement chunked silence segments and update prompt instructio…
navidshad Apr 1, 2026
e5d7b47
feat: add real-time download progress tracking and UI visualization f…
SomiVista Apr 1, 2026
e5bcb86
refactor: make pipeline execution asynchronous and add loading state …
SomiVista Apr 2, 2026
9b4fd3c
Merge remote-tracking branch 'origin/dev' into CU-86ex3gqx6_Implement…
SomiVista Apr 3, 2026
09a22da
refactor: replace manual child_process spawning with ytdlp-nodejs wra…
SomiVista Apr 3, 2026
0a4910d
feat: implement robust retry logic with exponential backoff for Gemin…
navidshad Apr 3, 2026
d546e4e
Merge pull request #45 from navidshad/CU-86ex3gw92_Add-gemini-batch-s…
navidshad Apr 3, 2026
23a3b9d
Merge remote-tracking branch 'origin/dev' into CU-86ex3gqx6_Implement…
navidshad Apr 3, 2026
71d2439
refactor: improve yt-dlp download reliability with path normalization…
navidshad Apr 3, 2026
a5f55f0
refactor: decompose ResultNode into specialized SummaryNode, Thumbnai…
navidshad Apr 3, 2026
aa50005
Merge pull request #44 from navidshad/CU-86ex3gqx6_Implement-link-sup…
navidshad Apr 3, 2026
33b697e
feat: integrate multi-image processing pipeline and image-only graph …
navidshad Apr 4, 2026
0ecfb61
refactor: redesign AttachmentModal using shared components and unify …
navidshad Apr 4, 2026
17341c6
refactor: replace manual input implementations with BaseMessageInput …
navidshad Apr 4, 2026
b2ba3ae
style: increase grid column count in AttachmentModal for better layou…
navidshad Apr 6, 2026
19bbec7
feat: implement multimodal intent recognition and intelligent referen…
navidshad Apr 6, 2026
bc4cf92
Merge pull request #46 from navidshad/CU-86ex50815_Implement-image-su…
navidshad Apr 6, 2026
7da8681
chore: rebrand project to FrameFlow and update documentation accordingly
navidshad Apr 6, 2026
08da113
refactor: update UploadPage UI text and improve code formatting
navidshad Apr 6, 2026
bcc229c
feat: update temporary directory path to include FrameFlow subdirecto…
navidshad Apr 6, 2026
fe60fb7
style: standardize disabled button states across UI pages with consis…
navidshad Apr 6, 2026
e541405
feat: implement real-time thread updates across windows and improve a…
navidshad Apr 6, 2026
ab07cc3
feat: implement automatic thread path repair and synchronization when…
navidshad Apr 6, 2026
da3a464
refactor: include frame paths in scene descriptions to improve refere…
navidshad Apr 6, 2026
1c01678
refactor: implement lazy initialization for managers and add missing …
navidshad Apr 6, 2026
09617f0
refactor: improve background task error handling, retry logic, and tr…
navidshad Apr 7, 2026
5b8c23c
refactor: implement dynamic node height calculation for graph layout …
navidshad Apr 7, 2026
591d1d4
feat: extract model text output from Gemini adapter and display it in…
navidshad Apr 7, 2026
03bb720
feat: update prompt instructions to use generic descriptors instead o…
navidshad Apr 7, 2026
ddf8a91
feat: enable Gemini thinking mode and implement robust response text …
navidshad Apr 7, 2026
726cf45
feat: skip image extraction task if image text data is already cached
navidshad Apr 7, 2026
a47a399
feat: add support for image iteration and refinement by passing attac…
navidshad Apr 7, 2026
a1086a7
refactor: centralize temporary directory path constants and restrict …
navidshad Apr 7, 2026
905cfce
feat: implement image upscaling functionality with Gemini creative re…
navidshad Apr 7, 2026
16b2677
feat: enable thinking configuration with 8000 budget in image generat…
navidshad Apr 7, 2026
91e899b
docs: update README with new interface screenshots and project attrib…
navidshad Apr 7, 2026
f4049fc
docs: reorder README header elements and reposition banner image
navidshad Apr 7, 2026
0268bc2
docs: update project banner and description layout in README
navidshad Apr 7, 2026
c28b12f
style: update MediaNode metadata text colors to support light mode an…
navidshad Apr 7, 2026
572d04f
docs: remove Brain pipeline diagram and reproducibility link from README
navidshad Apr 7, 2026
aeabbc7
docs: simplify dashboard screenshot styling in README
navidshad Apr 7, 2026
188f225
docs: add supported media formats and optimization details to README
navidshad Apr 7, 2026
221 changes: 175 additions & 46 deletions .agent/skills/frontend_design/lib-vue-components.md

Large diffs are not rendered by default.

70 changes: 69 additions & 1 deletion .github/workflows/release.yml
@@ -37,12 +37,80 @@ jobs:
         run: |
           git push origin main --follow-tags

+      - name: Generate AI Release Notes
+        id: generate-notes
+        env:
+          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY_FOR_PR_DESC_GENERATOR }}
+          MODEL_NAME: "gpt-5.4-nano-2026-03-17"
+        run: |
+          # Get the new tag from the previous step
+          NEW_TAG=${{ steps.version-bump.outputs.version }}
+
+          # Find the previous tag (excluding the one we just created)
+          PREV_TAG=$(git describe --tags --abbrev=0 $NEW_TAG^ 2>/dev/null || echo "")
+
+          echo "Current Tag: $NEW_TAG"
+          echo "Previous Tag: $PREV_TAG"
+
+          if [ -z "$PREV_TAG" ]; then
+            # If no previous tag, take all commits
+            COMMITS=$(git log --oneline --no-merges)
+          else
+            # Get commits between the two tags, excluding the version bump commit itself
+            COMMITS=$(git log $PREV_TAG..$NEW_TAG^ --oneline --no-merges)
+          fi
+
+          if [ -z "$COMMITS" ]; then
+            COMMITS="No changes detected."
+          fi
+
+          echo "=== Commits to summarize ==="
+          echo "$COMMITS"
+          echo "============================"
+
+          # Prepare JSON payload for OpenAI API
+          JSON_PAYLOAD=$(jq -n --arg commits "$COMMITS" --arg model "$MODEL_NAME" '{
+            model: $model,
+            messages: [
+              {
+                role: "system",
+                content: "You are a professional release manager. Summarize commit messages into clear, user-friendly release notes with markdown sections for Features, Bug Fixes, and Improvements."
+              },
+              {
+                role: "user",
+                content: ("Summarize these commits into professional release notes. Use bullet points and be concise.\n\nCommits:\n" + $commits)
+              }
+            ],
+            temperature: 0.7
+          }')
+
+          # Call OpenAI API
+          RESPONSE=$(curl -s -X POST "https://api.openai.com/v1/chat/completions" \
+            -H "Content-Type: application/json" \
+            -H "Authorization: Bearer ${OPENAI_API_KEY}" \
+            -d "$JSON_PAYLOAD")
+
+          # Extract notes from response
+          NOTES=$(echo "$RESPONSE" | jq -r '.choices[0].message.content' 2>/dev/null || echo "")
+
+          # Fallback if AI fails or returns null
+          if [ "$NOTES" == "null" ] || [ -z "$NOTES" ]; then
+            echo "Warning: AI generation failed or returned empty. Falling back to commit list."
+            echo "API Response: $RESPONSE"
+            NOTES="### Changes\n\n$COMMITS"
+          fi
+
+          # Output the notes for the next step
+          echo "changelog<<EOF" >> $GITHUB_OUTPUT
+          echo "$NOTES" >> $GITHUB_OUTPUT
+          echo "EOF" >> $GITHUB_OUTPUT
+
       - name: Create Release
         uses: softprops/action-gh-release@v2
         with:
           tag_name: ${{ steps.version-bump.outputs.version }}
           name: Release ${{ steps.version-bump.outputs.version }}
-          generate_release_notes: true
+          body: ${{ steps.generate-notes.outputs.changelog }}
           draft: false
           prerelease: false
         env:
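
Review note on the `Generate AI Release Notes` step: two details in its output handling may be worth a second look. First, if the step runs under bash (the usual default on Linux runners, though the runner is not visible in this hunk), `"### Changes\n\n$COMMITS"` keeps `\n` as two literal characters, and plain `echo` does not expand them, so the fallback release body would likely show `\n\n` as text. Second, the fixed `EOF` delimiter would end the multi-line output early if the notes ever contain a line that is exactly `EOF`. The sketch below shows one possible way to handle both; the function names `build_fallback_notes` and `write_multiline_output` are hypothetical and not part of the workflow.

```shell
# Sketch only: both helpers are hypothetical names, not code from this PR.

# Build the fallback body with real newlines. printf expands \n in its
# format string, whereas "\n" inside double quotes stays literal in bash.
build_fallback_notes() {
  local commits="$1"
  printf '### Changes\n\n%s' "$commits"
}

# Append a multi-line value to a GitHub Actions output file using a
# per-run delimiter, so a line reading "EOF" inside the value cannot
# terminate it early.
write_multiline_output() {
  local name="$1" value="$2" file="$3"
  local delim="NOTES_EOF_$$_$(date +%s)"
  {
    printf '%s<<%s\n' "$name" "$delim"
    printf '%s\n' "$value"
    printf '%s\n' "$delim"
  } >> "$file"
}
```

Inside the step, this could be used as `NOTES=$(build_fallback_notes "$COMMITS")` in the fallback branch, followed by `write_multiline_output changelog "$NOTES" "$GITHUB_OUTPUT"` in place of the three `echo` lines.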
114 changes: 60 additions & 54 deletions README.md
@@ -1,85 +1,91 @@
# 🌊 FrameFlow

# 🎬 VGTU Video Summarization
![Project Process](./docs/imgs/process.jpeg)

<img src="./docs/screenshots/01_light-theme.jpg" width="100%" alt="FrameFlow Banner" />

---

A high-fidelity platform that transforms long-form video content into concise, meaningful highlights. By leveraging **Google Gemini's** multimodal intelligence and precise **FFmpeg** engineering, it provides a seamless chat-based refinement experience.
**FrameFlow** is a high-fidelity multimedia platform that bridges the gap between raw video/image assets and creative intelligence. By fusing **Google Gemini's** multimodal brain with precise **FFmpeg** engineering, FrameFlow transforms how you consume, extract, and generate media.

<div>

[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
[![Platform: Electron](https://img.shields.io/badge/Platform-Electron-lightgrey.svg)](https://www.electronjs.org/)
[![Framework: Vue 3](https://img.shields.io/badge/Framework-Vue%203-4fc08d.svg)](https://vuejs.org/)
[![AI: Google Gemini](https://img.shields.io/badge/AI-Google%20Gemini-4285F4.svg)](https://deepmind.google/technologies/gemini/)

**Key Use Cases:**
- 🎓 **Academic Hub**: Condense 2-hour technical lectures into 5-minute study guides.
- 📱 **Content Creation**: Generate social media teasers from raw footage with natural language.
- 🔍 **Quick Review**: Rapidly navigate long meetings or webinars for specific insights.
</div>

---
## 🧭 Quick Links & Navigation

| Topic | Resource / Section | Description |
| :--- | :--- | :--- |
| 🏗 **Architecture** | [**Technical Deep-Dive**](./docs/architecture.md) | Pipeline logic, intent nodes, and iterative generation. |
| 🎨 **UI & UX** | [**Design Overview**](./docs/ui_ux.md) | Frontend components, state, and user interaction flow. |
| 🚀 **Setup** | [**Setup Guide**](./docs/setup.md) | Prerequisites and environment installation instructions. |
| 📦 **Repository** | [**Deliverables**](#-deliverables) | Formal project components and file structure. |
| 🧠 **AI Logic** | [**The Pipeline**](#-the-ai-pipeline-highlights) | Logic overview of the 4-phase summarization engine. |
| 🛠 **Verifiability** | [**Reproducibility**](#-reproducibility) | Ensuring consistent results across environments. |
| 📸 **Demo** | [**Final Screenshot**](#-final-snapshot) | Visual overview of the chat and video editor interface. |
## 🚀 The Three Pillars of FrameFlow

---
FrameFlow is built on three core intelligence layers, designed for creators, researchers, and developers.

## 📦 Deliverables
This formal homework project delivers a complete production-grade ecosystem:
* **Production Code**: Electron desktop app written in Vue 3 & TypeScript.
* **AI Engine**: A 4-phase pipeline (Extraction, Intent, Generation, Assembly).
* **Reproduction Tools**: Download the [Sample Videos Folder](https://drive.google.com/drive/folders/1g2Cp533NPQPtngLvnCuP5T8PZNc-FTZK?usp=sharing) (includes full and short versions) and use the app for a 4-phase trace.
* **Visual Documentation**: Fully documented [Architecture](./docs/architecture.md) and [UI/UX Flow](./docs/ui_ux.md).
### 1. 🎞️ Video to Short Video
Transform long-form content into concise, meaningful highlights.
- **Academic Precision**: Condense 2-hour technical lectures into 5-minute study guides.
- **Meeting Recap**: Rapidly navigate long webinars for specific insights.
- **Narrative Awareness**: AI understands scene transitions and audio context simultaneously.

### 2. 📸 Video to Thumbnail
Extract and generate high-fidelity visual assets from any video source.
- **Auto-Enrichment**: AI analyzes scene quality to extract the most representative frames.
- **Professional Thumbnails**: Generate YouTube-ready or presentation-grade thumbnails with AI-driven composition.
- **Batch Processing**: Extract hundreds of scene-indexed images in seconds.

### 3. 🎨 Images to Image
Leverage multimodal prompts to transform existing images or generate new ones from scratch.
- **Visual Continuity**: Use existing frames as structural references for new generations.
- **Prompt-Driven Flow**: Refine images using natural language within a unified chat-graph interface.
- **Multimodal Fusion**: Combine video context with external image uploads for hybrid creativity.

---

## 🧠 The AI Pipeline (Highlights)
## 🧩 Supported Inputs

Our unique **4-Phase Engine** ensures that every summary is contextually accurate:
* **Intent Recognition**: Uses a "Brain" node to distinguish between chat and generation, preventing token waste.
* **Iterative Refinement**: Supports an **Edit Mode** that performs a technical "diff" on previous timelines for perfect consistency.
* **Multimodal Fusion**: Processes visual scene transitions, audio transcripts, and user context simultaneously.
FrameFlow handles a wide range of media formats and sources:

> [!TIP]
> **Deep Dive:** Check out the **[Architecture Deep-Dive](./docs/architecture.md)** for Mermaid diagrams and logic breakdowns.
- **Video Formats**: Native support for `.mp4`, `.avi`, `.mov`, and `.webm`.
- **Online Sources**: YouTube, Google Drive, and direct media URLs (via `yt-dlp`).
- **Images**: High-fidelity `.jpg`, `.png`, and `.webp` for structural reference and multimodal generation.
- **Optimization**: High-res videos are automatically downscaled (480p) to ensure lightning-fast AI analysis without losing metadata.

---

## 🎨 UX Highlights
The interface is designed for **transparency** and **iterative control**:
* **Version History**: Switch between generated versions instantly to find the best cut.
* **Live Token Metrics**: Monitor AI usage costs and token counts in real-time.
* **Zero-Config Preprocessing**: Automatic scene detection and transcript extraction upon upload.
## 🎨 Premium Experience (UX)

---
FrameFlow isn't just a tool; it's an iterative workspace:

## 🛠 Reproducibility
To guarantee identical behavior across different environments:
* **Sample Data**: Download our [Main Reference Videos Folder](https://drive.google.com/drive/folders/1g2Cp533NPQPtngLvnCuP5T8PZNc-FTZK?usp=sharing) (includes full and short versions) to test the pipeline.
* **JSON Enforcement**: Strict schemas ensure deterministic AI responses.
* **Precision Slicing**: FFmpeg settings calibrated for frame-accurate cuts.
* **Dependency Guard**: Locked environments via `package-lock.json` and `.npmrc`.
* **Reference Stability**: Edit mode always builds upon a fixed "Seed" timeline to avoid hallucinations.
- **Vue Flow Graph Interface**: Manage parallel tasks and version branches visually.
- **Live Metrics**: Monitor AI token usage and processing costs in real-time.
- **Zero-Config Preprocessing**: Automatic scene detection and transcript extraction.
- **Ambient Design**: A sleek, dark-mode-first interface with glassmorphism and smooth animations.

---

## 🚀 Getting Started
Check the **[Installation & Setup Guide](./docs/setup.md)** to configure:
1. **Environment**: Node.js and Gemini API Key.
2. **Tools**: FFmpeg and PySceneDetect for your OS.
3. **Launch**: `npm install && npm run dev`.
## 🧭 Navigation & Setup

| Section | Link | Purpose |
| :--- | :--- | :--- |
| 🏗 **Architecture** | [**Deep-Dive**](./docs/architecture.md) | Pipeline logic, intent nodes, and iterative generation. |
| 🚀 **Installation** | [**Setup Guide**](./docs/setup.md) | Node.js, Gemini API, FFmpeg, and yt-dlp setup. |
| 🎨 **UI/UX** | [**Design Overview**](./docs/ui_ux.md) | Frontend components and interaction flow. |

---

## 📸 Final Snapshot
## 📸 Interface Preview

<div align="center">
<img src="./docs/imgs/screenshot_chatpage.png" width="800px" alt="App Screenshot" />
<img src="./docs/screenshots/01_dark-theme.jpg" alt="FrameFlow Dashboard" />
</div>

---

## 📜 License
Licensed under the MIT License - see [LICENSE](LICENSE) for details.
## 📜 License & Credits

FrameFlow is licensed under the **MIT License**. Created by [navidshad](https://github.com/navidshad) and his classmates as part of a high-fidelity AI engineering initiative at Vilnius Gediminas Technical University (VGTU).

---

> [!TIP]
> **Pro Choice:** Check the **[Architecture Deep-Dive](./docs/architecture.md)** to see how we handle multimodal intent recognition and technical "diffs" for consistency.
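
Review note on the new README's "Optimization" bullet: it says high-res videos are downscaled to 480p without losing metadata, but the FFmpeg arguments FrameFlow actually uses are not part of this diff. As a rough illustration only, a downscale of that kind could look like the sketch below. The helper name `downscale_480p` and the `FFMPEG_BIN` override are assumptions made for this example; the flags themselves (`-vf scale`, `-map_metadata`, `-c:a copy`) are standard FFmpeg options.

```shell
# Sketch only: hypothetical helper, not FrameFlow's actual preprocessing code.
# FFMPEG_BIN is an assumed override so the command can be swapped or stubbed.
downscale_480p() {
  local input="$1" output="$2"
  # scale=-2:480 sets the height to 480 and lets ffmpeg choose an even
  # width that preserves the aspect ratio.
  # -map_metadata 0 copies global metadata from the first input.
  # -c:a copy leaves the audio stream untouched; -y overwrites the output.
  "${FFMPEG_BIN:-ffmpeg}" -nostdin -y -i "$input" \
    -vf "scale=-2:480" \
    -map_metadata 0 \
    -c:a copy \
    "$output"
}
```

A call such as `downscale_480p lecture.mp4 lecture_480p.mp4` would re-encode the video stream at the lower resolution while keeping the original audio.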
Binary file added docs/screenshots/01_dark-theme.jpg
Binary file added docs/screenshots/01_light-theme.jpg
6 changes: 3 additions & 3 deletions docs/setup.md
@@ -1,6 +1,6 @@
 # 🛠 Installation & Setup Guide

-This guide provides detailed instructions on how to set up the environment and run the **VGTU Video Summarization** application.
+This guide provides detailed instructions on how to set up the environment and run the **FrameFlow** application.

 ---

@@ -51,8 +51,8 @@ The application relies on **FFmpeg** for video processing and **PySceneDetect**

 1. **Clone the repository**:
    ```bash
-   git clone https://github.com/navidshad/vgtu-video-summarization.git
-   cd vgtu-video-summarization
+   git clone https://github.com/navidshad/frameflow.git
+   cd frameflow
    ```

 2. **Install dependencies**: