Image Generation OpenAPI Server

OpenAPI server for AI image generation and editing. Requests can be routed through a LiteLLM proxy (for cost tracking) or sent directly to the OpenAI and Google Gemini APIs.

Features

Image Generation: Create images from text prompts
Image Editing: Edit existing images with mask-based inpainting (OpenAI) or prompt-based editing (Gemini)
Multi-Provider: OpenAI (GPT-Image-1.5, GPT-Image-1) and Google Gemini
LiteLLM Integration: Unified API with cost tracking
Open WebUI Integration: Auto-upload images to Open WebUI's file storage
Dynamic Models: Auto-discovers available models from LiteLLM

Quick Start

Docker (Recommended)

Clone and configure:

git clone <your-repo>
cd openapi-image-gen
cp .env.example .env
# Edit .env with your API keys

Start the server:

docker-compose up -d

Access the API:

API Docs: http://localhost:8000/docs
Health Check: http://localhost:8000/health

Manual Setup

Install dependencies:

python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt

Configure environment:

cp .env.example .env
# Edit .env with your configuration

Run the server:
```
uvicorn app.main:app --reload
```

API Examples

Generate Image

curl -X POST "http://localhost:8000/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A mountain landscape at sunset",
    "model": "gpt-image-1",
    "aspect_ratio": "16:9"
  }'
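
The same request can be sketched in Python using only the standard library. The helper name `build_generate_request` and its defaults are illustrative assumptions; only the `/generate` path and the three JSON fields come from the curl example above. The function builds the request without sending it, since the response schema is not described in this section.

```python
import json
import urllib.request


def build_generate_request(prompt, model="gpt-image-1", aspect_ratio="16:9",
                           base_url="http://localhost:8000"):
    """Build (but do not send) a POST /generate request with a JSON body.

    Hypothetical helper: the field names mirror the curl example.
    """
    payload = {"prompt": prompt, "model": model, "aspect_ratio": aspect_ratio}
    return urllib.request.Request(
        url=f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_generate_request("A mountain landscape at sunset")
# To send it against a running server:
#     with urllib.request.urlopen(request) as response:
#         print(response.read().decode("utf-8"))
```

Keeping construction separate from sending makes the payload easy to inspect or log before any provider cost is incurred.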

Edit Image (Mask-based, OpenAI)

curl -X POST "http://localhost:8000/edit" \
  -F "image=@photo.png" \
  -F "mask=@mask.png" \
  -F "prompt=Replace the sky with a sunset" \
  -F "provider=openai" \
  -F "model=gpt-image-1"

Edit Image (Prompt-based, Gemini)

curl -X POST "http://localhost:8000/edit" \
  -F "image=@photo.png" \
  -F "prompt=Make the background more colorful" \
  -F "provider=gemini"

Edit via URL Reference

curl -X POST "http://localhost:8000/edit" \
  -F "image_url=http://localhost:8000/images/abc123.png" \
  -F "prompt=Add a hat to the person" \
  -F "provider=gemini"
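
Because the URL-reference variant sends no file, it is small enough to sketch with the Python standard library. The helpers `encode_form_fields` and `build_edit_by_url_request` are hypothetical names; the form fields (`image_url`, `prompt`, `provider`) are taken from the curl command above. This is a minimal text-only multipart encoder, not a general-purpose one, and the request is built but not sent.

```python
import urllib.request
import uuid


def encode_form_fields(fields, boundary=None):
    """Encode text-only fields as multipart/form-data.

    Returns (body_bytes, content_type_header). File parts are not handled.
    """
    boundary = boundary or uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")  # blank line separates part headers from the value
        lines.append(str(value))
    lines.append(f"--{boundary}--")  # closing boundary
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def build_edit_by_url_request(image_url, prompt, provider="gemini",
                              base_url="http://localhost:8000"):
    """Build (but do not send) a POST /edit request that references an image URL."""
    body, content_type = encode_form_fields(
        {"image_url": image_url, "prompt": prompt, "provider": provider}
    )
    return urllib.request.Request(
        url=f"{base_url}/edit",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

For the file-upload variants (`image=@photo.png`, `mask=@mask.png`), an HTTP client library with built-in multipart support is likely a better fit than extending this encoder.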

List Models

curl "http://localhost:8000/models"

Provider Configuration

LiteLLM (Primary - Recommended)

LiteLLM provides unified access, cost tracking, and rate limiting:

LITELLM_BASE_URL=http://litellm:4000
LITELLM_API_KEY=your-key  # Optional

Direct APIs (Fallback)

Configure direct access for fallback when LiteLLM is unavailable:

OPENAI_API_KEY=sk-...
GEMINI_API_KEY=...
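
As a rough illustration of how these variables combine, the sketch below lists which backends a given environment would enable, with LiteLLM first and the direct APIs after it. The function name `configured_backends` and the returned labels are assumptions made for this example; the server's actual startup logic may differ.

```python
import os


def configured_backends(env=None):
    """Return enabled backends in priority order: LiteLLM first, then direct APIs.

    Hypothetical helper: a variable counts as "set" when it is non-empty.
    """
    env = os.environ if env is None else env
    backends = []
    if env.get("LITELLM_BASE_URL"):
        backends.append("litellm")
    if env.get("OPENAI_API_KEY"):
        backends.append("openai")
    if env.get("GEMINI_API_KEY"):
        backends.append("gemini")
    return backends
```

Passing a plain dictionary instead of reading `os.environ` keeps the check easy to exercise in tests.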

Direct Provider Fallback

LiteLLM doesn't support all provider features. Enable DIRECT_PROVIDER_FALLBACK to automatically use the native provider API when needed:

DIRECT_PROVIDER_FALLBACK=true
GEMINI_API_KEY=...  # Required for Gemini fallback

Currently applies to:

Gemini aspect ratios: 16:9, 9:16, 4:3, 3:4 (LiteLLM only supports 1:1)
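
The routing rule described above can be sketched as a small decision function. This is an illustration of the documented behavior, not the server's actual code: `choose_backend` and the labels `"litellm"` and `"direct"` are hypothetical, and the ratio set is copied from the list above.

```python
# Aspect ratios listed above as needing the native Gemini API.
GEMINI_NATIVE_ONLY_RATIOS = frozenset({"16:9", "9:16", "4:3", "3:4"})


def choose_backend(provider, aspect_ratio, fallback_enabled, gemini_key_set):
    """Return "direct" when the native Gemini API is needed and usable, else "litellm".

    fallback_enabled stands for DIRECT_PROVIDER_FALLBACK=true and
    gemini_key_set for a non-empty GEMINI_API_KEY.
    """
    needs_native = (
        provider == "gemini" and aspect_ratio in GEMINI_NATIVE_ONLY_RATIOS
    )
    if needs_native and fallback_enabled and gemini_key_set:
        return "direct"
    return "litellm"
```

Under this sketch, a 16:9 Gemini request without the fallback enabled still goes to LiteLLM; how the server handles that case (for example, whether it returns a 1:1 image or an error) is not stated here.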

Supported Models

OpenAI

| Model | Generation | Editing | Notes |
|---|---|---|---|
| gpt-image-1.5 | Yes | Yes (mask) | Latest; quality: auto/high/medium/low |
| gpt-image-1 | Yes | Yes (mask) | Fast, supports inpainting |
| gpt-image-1-mini | Yes | Yes (mask) | Cost-efficient variant |
| ~~dall-e-3~~ | Yes | No | Deprecated; shutting down May 12, 2026 |
| ~~dall-e-2~~ | Yes | Yes (mask) | Deprecated; shutting down May 12, 2026 |

Google Gemini

| Model | Generation | Editing | Notes |
|---|---|---|---|
| gemini-2.5-flash-image | Yes | Yes (prompt) | Fast, wide aspect ratio support |
| gemini-3-pro-image-preview | Yes | Yes (prompt) | Preview, up to 4K resolution |
| ~~gemini-2.0-flash-preview-image-generation~~ | Yes | Yes (prompt) | Deprecated; shutting down March 31, 2026 |

See CONFIGURATION.md for details.
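
For client code that needs to pick a model, the two tables above can be captured as a lookup. This is a hand-copied sketch: the dictionary layout and the helper names `editing_mode` and `active_models` are assumptions, and the `/models` endpoint remains the authoritative source at runtime.

```python
# Copied from the Supported Models tables; "editing" is None where editing is unsupported.
MODELS = {
    "gpt-image-1.5": {"provider": "openai", "editing": "mask", "deprecated": False},
    "gpt-image-1": {"provider": "openai", "editing": "mask", "deprecated": False},
    "gpt-image-1-mini": {"provider": "openai", "editing": "mask", "deprecated": False},
    "dall-e-3": {"provider": "openai", "editing": None, "deprecated": True},
    "dall-e-2": {"provider": "openai", "editing": "mask", "deprecated": True},
    "gemini-2.5-flash-image": {"provider": "gemini", "editing": "prompt", "deprecated": False},
    "gemini-3-pro-image-preview": {"provider": "gemini", "editing": "prompt", "deprecated": False},
    "gemini-2.0-flash-preview-image-generation": {
        "provider": "gemini", "editing": "prompt", "deprecated": True,
    },
}


def editing_mode(model):
    """Return "mask", "prompt", or None for a known model; raise KeyError otherwise."""
    return MODELS[model]["editing"]


def active_models(provider=None):
    """List non-deprecated models, optionally filtered by provider."""
    return [
        name for name, info in MODELS.items()
        if not info["deprecated"] and (provider is None or info["provider"] == provider)
    ]
```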

Open WebUI Integration

In Open WebUI, go to Settings > Tools.
Add a new OpenAPI server:
- URL: http://your-server:8000/openapi.json
- Name: Image Generation
The tools then appear automatically in the chat interface.

For admin/global tools, configure the server in Admin Settings using a URL that the Open WebUI server itself can reach.

Native Image Display (Recommended)

For the best experience (images display natively in chat with download/save):

Generate an API key in Open WebUI: Settings > Account > API Keys
Configure in .env:

OPENWEBUI_MODE=true
OPENWEBUI_BASE_URL=http://open-webui:3000
OPENWEBUI_API_KEY=your-owui-api-key

Images are uploaded directly to Open WebUI's file storage and displayed inline, just like the built-in image generation.

If OPENWEBUI_BASE_URL or OPENWEBUI_API_KEY is not set, the server falls back to returning the image as a base64 HTMLResponse, which is displayed in an iframe.
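
The choice between the two display paths can be sketched as follows. The function names, the `"upload"` and `"base64_html"` labels, and the exact HTML markup are assumptions for illustration; the server's real fallback page may look different, and `OPENWEBUI_MODE` is not modeled here.

```python
import base64


def delivery_mode(env):
    """Return "upload" when both Open WebUI settings are present, else "base64_html"."""
    if env.get("OPENWEBUI_BASE_URL") and env.get("OPENWEBUI_API_KEY"):
        return "upload"
    return "base64_html"


def base64_img_html(image_bytes, mime="image/png"):
    """Embed raw image bytes in an <img> tag using a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f'<img src="data:{mime};base64,{encoded}" alt="generated image">'
```

A data URI keeps the fallback self-contained, at the cost of roughly a third more bytes than the raw image because of the base64 encoding.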

API Endpoints