AI Content Generator

Text-to-Image, Text-to-Video, and Text-to-Speech in a single pipeline. Runs on Apple Silicon (MPS), NVIDIA GPUs (CUDA), or a free Google Colab T4.

What It Does

Generates images, videos, and narrated audio from text prompts, all from one Gradio interface. It is designed to run at no cost on Google Colab (T4 GPU) or locally on an M1/M2/M3 Mac with Apple MPS acceleration.

Capabilities

Mode	Model	Speed	Quality / Notes
Text-to-Image	SDXL-Turbo (Stability AI)	~2s (MPS)	High
Text-to-Video	ModelScope text-to-video	~30s (T4)	Medium
Text-to-Speech	Bark (Suno)	~10s	High
Text-to-Speech	Edge TTS (Microsoft)	~1s	Fast, free
Combined	Video + Audio narration	n/a	Full pipeline

Architecture

Text Prompt
     │
     ├──▶ ImageGenerator  ──▶ SDXL-Turbo (stabilityai/sdxl-turbo)
     │                         AutoPipelineForText2Image
     │                         Apple MPS / CUDA / CPU
     │
     ├──▶ VideoGenerator  ──▶ ModelScope (damo-vilab/text-to-video-ms-1.7b)
     │                         TextToVideoSDPipeline
     │
     ├──▶ AudioGenerator  ──▶ Bark (suno/bark) — high quality
     │                    ──▶ Edge TTS          — fast, free
     │
     └──▶ Pipeline        ──▶ Combined: video + audio → final output
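
The combined step in the diagram joins a generated video with a narration track. The contents of src/pipeline.py are not shown in this README, so the following is only a minimal sketch of one common way to do that, assuming the ffmpeg command-line tool is installed. The function names `build_mux_command` and `mux` and all file paths are hypothetical and are not taken from the repository.

```python
import shutil
import subprocess
from pathlib import Path


def build_mux_command(video_path: str, audio_path: str, output_path: str) -> list:
    """Build an ffmpeg command that attaches an audio track to a video file."""
    return [
        "ffmpeg",
        "-y",               # overwrite the output file if it already exists
        "-i", video_path,   # input 0: the generated video
        "-i", audio_path,   # input 1: the narration audio
        "-map", "0:v:0",    # take the first video stream from input 0
        "-map", "1:a:0",    # take the first audio stream from input 1
        "-c:v", "copy",     # copy the video stream without re-encoding
        "-c:a", "aac",      # encode the audio as AAC
        "-shortest",        # stop when the shorter of the two streams ends
        output_path,
    ]


def mux(video_path: str, audio_path: str, output_path: str) -> Path:
    """Run ffmpeg to produce the combined file (hypothetical helper)."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_mux_command(video_path, audio_path, output_path), check=True)
    return Path(output_path)
```

Keeping the command construction separate from the subprocess call makes the arguments easy to inspect or log before anything is executed.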

Quick Start

Option 1: Google Colab (Recommended — Free T4 GPU)

1. Open notebooks/AI_Content_Generator_MVP.ipynb and upload it to Google Colab.
2. Select Runtime → Change runtime type → T4 GPU.
3. Run all cells. A public Gradio URL is generated automatically.

Option 2: Local (Mac M1/M2/M3)

git clone https://github.com/apuroopy1-prog/ai-video-generator.git
cd ai-video-generator

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

python app.py
# Open http://localhost:7860

Project Structure

ai-video-generator/
├── app.py                          # Main Gradio web UI
├── quick_start.py                  # Minimal Python API example
├── requirements.txt
├── notebooks/
│   └── AI_Content_Generator_MVP.ipynb   # Colab notebook
├── src/
│   ├── image_generator.py          # SDXL-Turbo text-to-image
│   ├── video_generator.py          # ModelScope text-to-video
│   ├── audio_generator.py          # Bark + Edge TTS
│   └── pipeline.py                 # Combined video+audio pipeline
├── configs/                        # Model config files
├── huggingface-space/              # HuggingFace Spaces deployment
└── outputs/                        # Generated files (gitignored)

Python API

from src.image_generator import ImageGenerator
from src.video_generator import VideoGenerator
from src.audio_generator import AudioGenerator

# Generate image
img_gen = ImageGenerator()
image = img_gen.generate("A futuristic city at sunset, cyberpunk style")

# Generate video
vid_gen = VideoGenerator()
video = vid_gen.generate("A robot walking through a forest")

# Generate speech
audio_gen = AudioGenerator()
audio = audio_gen.generate("Welcome to the future of AI content creation")
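
For readers curious about what a wrapper like `ImageGenerator` might do internally, here is a rough sketch built on the Diffusers `AutoPipelineForText2Image` class and the `stabilityai/sdxl-turbo` checkpoint named in the architecture diagram. It is an assumption about the general approach, not the repository's actual code: `sdxl_turbo_call_kwargs` and `generate_image` are hypothetical names, and the single-step, zero-guidance settings follow the usual SDXL-Turbo usage rather than anything stated in this README.

```python
MODEL_ID = "stabilityai/sdxl-turbo"


def sdxl_turbo_call_kwargs(prompt: str) -> dict:
    """Call arguments commonly used with SDXL-Turbo (assumed, not from the repo).

    SDXL-Turbo is distilled for single-step sampling without
    classifier-free guidance, hence one step and guidance_scale=0.0.
    """
    return {"prompt": prompt, "num_inference_steps": 1, "guidance_scale": 0.0}


def generate_image(prompt: str, device: str = "cpu"):
    """Load the pipeline and return a PIL image (hypothetical helper).

    Heavy imports live inside the function so that importing this
    module does not require torch or diffusers to be installed.
    """
    import torch
    from diffusers import AutoPipelineForText2Image

    # Half precision on CUDA saves memory; float32 elsewhere is the safer choice.
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = AutoPipelineForText2Image.from_pretrained(MODEL_ID, torch_dtype=dtype)
    pipe = pipe.to(device)
    return pipe(**sdxl_turbo_call_kwargs(prompt)).images[0]
```

The first call to `generate_image` downloads the model weights from the HuggingFace Hub, so it needs network access and several gigabytes of disk space.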

Hardware Requirements

Hardware	Image	Video	Audio
Apple M1/M2/M3 (MPS)	✅ Fast	✅ Slow	✅
NVIDIA GPU (CUDA)	✅ Fast	✅ Fast	✅
Google Colab T4	✅ Fast	✅ Fast	✅
CPU only	✅ Very slow	⚠️ Very slow	✅
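
The table above implies that the code picks a backend at runtime. A minimal sketch of such a check with PyTorch might look like the following. The preference order (CUDA, then MPS, then CPU) and the function names `pick_device` and `detect_device` are assumptions for illustration; the repository may select devices differently.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Choose a device string, preferring CUDA, then MPS, then CPU (assumed order)."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for available backends; fall back to CPU if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    cuda = torch.cuda.is_available()
    # Older PyTorch builds have no MPS backend, so guard the attribute lookup.
    mps = hasattr(torch.backends, "mps") and torch.backends.mps.is_available()
    return pick_device(cuda, mps)
```

Calling `detect_device()` once at startup and passing the result to each generator keeps the selection logic in one place.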

Tech Stack

Component	Technology
Text-to-Image	Diffusers, SDXL-Turbo, AutoPipelineForText2Image
Text-to-Video	ModelScope, TextToVideoSDPipeline
Text-to-Speech	Bark (suno), Edge TTS
UI	Gradio
Deep Learning	PyTorch (MPS + CUDA)
Models	HuggingFace Hub

Built By

Apuroop Yarabarla — AI/ML Engineer & AI Product Owner
