Veo 3.1 API with APIDot

Build with the Veo 3.1 API using APIDot: cURL, Node.js, polling, webhooks, pricing, and fal.ai comparison in one production-oriented GitHub repo.

Get API Key | API Docs | Model Page | Main Examples

Why this repo exists

Run Veo 3.1 text-to-video, frame-guided video generation, and reference image workflows from one APIDot page across Lite, Fast, and Quality. Use the same async API pattern to test prompts, compare modes, and move the workflow into production without changing how your integration works.

This repository turns that workflow into runnable server-side examples: a verified cURL request, a native Node.js polling example, webhook receiver notes, prompt examples, pricing context, and production integration guardrails.

Overview

Veo 3.1 is Google DeepMind's advanced video generation model for short clips that need synchronized audio, cinematic motion, and stronger visual continuity. It is built for prompts where camera direction, scene physics, sound, and story intent should work together instead of being assembled later.

Its core strengths include native audio generation, text-to-video creation, first/last frame control, reference image guidance, cinematic camera language, realistic lighting, and more stable motion. This makes Veo 3.1 useful for product launches, short ads, social videos, storyboard previews, game teasers, explainers, and previs clips.

On APIDot, Veo 3.1 is available through veo3.1-lite, veo3.1-fast, and veo3.1-quality for fixed 8-second jobs with 16:9 or 9:16 output and 720p, 1080p, or 4k resolution options. Use Lite for text-only iteration, Fast for text, frame, or reference image workflows, and Quality when premium text or frame-guided output matters most.

Capabilities

Generating native audio so dialogue, ambience, and effects stay synchronized: Veo 3.1 can generate synchronized audio together with the video, including dialogue, ambient sound, and scene-aware effects. This helps the first output feel closer to a real short clip instead of a silent motion preview.
Turning detailed prompts into cinematic 8-second scenes with clear direction: Veo 3.1 can turn structured prompts into short videos with subject action, environment, lighting, camera movement, and mood. It is useful when teams need to test concepts quickly while still describing the shot like a real video brief.
Using first and last frames to control motion paths and transitions: With frame-guided generation, Veo 3.1 can animate from a supplied starting frame or move between first and last frames. This is useful for product reveals, storyboard transitions, before-and-after motion, and clips where the beginning or ending must stay controlled.
Guiding subjects, products, and style with reference images for lower drift: Veo 3.1 can use reference images to guide the subject, product look, scene style, or brand direction. This helps reduce visual drift when the generated video needs to stay closer to a character, object, or visual system.
Directing camera movement, framing, lighting, and pacing with cinematic language: Veo 3.1 responds well to prompts that describe tracking shots, close-ups, overhead angles, portrait framing, lighting cues, pacing, and atmosphere. This makes it stronger for clips where camera intent matters, not just for making a scene move.
Maintaining believable physics, lighting, and continuity across the full clip: Veo 3.1 is designed for more believable motion, scene physics, shadows, and visual continuity. It is useful for short ads, product scenes, action moments, explainers, and other clips where the generated video needs to feel coherent across the full 8 seconds.

Common use cases

This repo is for teams that already know they need short-form AI video and want direct answers to the operational questions: which Veo 3.1 variant to choose, when to use text-only versus frame or reference image mode, and how to plug the result into an async API workflow.

Short ad videos for product launches and paid campaigns
Product demo videos and ecommerce launch clips
Social media short videos with stronger pacing and shot direction
Storyboard tests and film previs clips
Game teaser videos and concept reveal clips
Training, explainer, and internal comms videos

Pricing on APIDot

Catalog price: starting at $0.05 per video. All prices are per video and depend on the variant and resolution: veo3.1-lite costs 10 credits ($0.05) at 720p/1080p and 30 credits ($0.15) at 4K; veo3.1-fast costs 20 credits ($0.10) at 720p/1080p and 60 credits ($0.30) at 4K; veo3.1-quality costs 120 credits ($0.60) at 720p/1080p and 400 credits ($2.00) at 4K.

This README uses the pricing data currently published in the APIDot model catalog. Check the APIDot model page before high-volume production runs.

Model-specific pricing

veo3.1-lite: per video | 720p/1080p: 10 credits ($0.05), 4K: 30 credits ($0.15)
veo3.1-fast: per video | 720p/1080p: 20 credits ($0.10), 4K: 60 credits ($0.30)
veo3.1-quality: per video | 720p/1080p: 120 credits ($0.60), 4K: 400 credits ($2.00)
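
The price list above can be turned into a small lookup for budgeting a batch before it is submitted. This is a minimal sketch that assumes the credit values in this README are still current; `PRICE_CREDITS`, `USD_PER_CREDIT`, and `estimateCost` are illustrative names, not part of the APIDot API.

```javascript
// Credits per video, copied from the catalog snapshot in this README.
// 720p and 1080p share one price ("standard"); 4K is priced separately.
const PRICE_CREDITS = {
  "veo3.1-lite": { standard: 10, "4k": 30 },
  "veo3.1-fast": { standard: 20, "4k": 60 },
  "veo3.1-quality": { standard: 120, "4k": 400 },
};

// Derived from the snapshot: 10 credits are listed as $0.05.
const USD_PER_CREDIT = 0.005;

function estimateCost(model, resolution, videoCount = 1) {
  const tiers = PRICE_CREDITS[model];
  if (!tiers) throw new Error(`Unknown model: ${model}`);
  if (!["720p", "1080p", "4k"].includes(resolution)) {
    throw new Error(`Unknown resolution: ${resolution}`);
  }
  const perVideo = resolution === "4k" ? tiers["4k"] : tiers.standard;
  const credits = perVideo * videoCount;
  // Round to whole cents so floating-point noise does not leak into the total.
  const usd = Math.round(credits * USD_PER_CREDIT * 100) / 100;
  return { credits, usd };
}

// Example: five 4K clips on veo3.1-quality.
console.log(estimateCost("veo3.1-quality", "4k", 5)); // { credits: 2000, usd: 10 }
```

Treat the result as an estimate only, since the catalog prices can change.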

APIDot vs fal.ai

For tiers that have fal.ai comparison data in the APIDot catalog, APIDot's listed price is up to 89% lower. Treat this as a catalog snapshot and verify current pricing on both platforms before launch.

Tier	APIDot listed price	fal.ai listed price	Note
veo3.1-lite, 720p/1080p	$0.05	$0.40	APIDot is 88% lower in this tier
veo3.1-fast, 720p/1080p	$0.10	$0.75	APIDot is 87% lower in this tier
veo3.1-fast, 4K	$0.30	$2.80	APIDot is 89% lower in this tier
veo3.1-quality, 720p/1080p	$0.60	$3.20	APIDot is 81% lower in this tier
veo3.1-quality, 4K	$2.00	$4.80	APIDot is 58% lower in this tier
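
The percentages in the Note column follow from the two listed prices. As a quick check, the sketch below recomputes them from the table values, using whole cents to keep the arithmetic exact; `TIERS` and `percentLower` are illustrative names.

```javascript
// Listed prices in US cents, taken from the comparison table above.
const TIERS = [
  { tier: "veo3.1-lite, 720p/1080p", apidotCents: 5, falCents: 40 },
  { tier: "veo3.1-fast, 720p/1080p", apidotCents: 10, falCents: 75 },
  { tier: "veo3.1-fast, 4K", apidotCents: 30, falCents: 280 },
  { tier: "veo3.1-quality, 720p/1080p", apidotCents: 60, falCents: 320 },
  { tier: "veo3.1-quality, 4K", apidotCents: 200, falCents: 480 },
];

// Saving relative to the higher listed price, rounded to a whole percent.
function percentLower(apidotCents, falCents) {
  return Math.round(((falCents - apidotCents) / falCents) * 100);
}

for (const { tier, apidotCents, falCents } of TIERS) {
  console.log(`${tier}: ${percentLower(apidotCents, falCents)}% lower`);
}
```

Rerun the same calculation with current prices before quoting a figure, because both catalogs can change independently.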

Quick start

cp .env.example .env
# Edit .env and set APIDOT_API_KEY
cd node
npm start

The same request shape is available as a copy-paste cURL example in curl/generate.md.

API workflow

flowchart LR
    A[Submit generation request] --> B[Receive data.task_id]
    B --> C{Delivery mode}
    C -->|Polling| D[Check task status]
    C -->|Webhook| E[Receive callback_url event]
    D --> F[Read result URL from finished task]
    E --> F

Use polling for local tests and webhook delivery for production queues. Store data.task_id before the first status check so retries, callbacks, and result URLs can be reconciled safely.

Minimal API request

Submit to APIDot's unified async generation endpoint:

POST https://api.apidot.ai/api/generate/submit
Authorization: Bearer <APIDOT_API_KEY>
Content-Type: application/json

Primary payload:

{
  "model": "veo3.1-lite",
  "callback_url": "https://your-domain.com/callback",
  "input": {
    "prompt": "A miniature city waking up at sunrise, cinematic light, smooth camera motion",
    "duration": 8,
    "aspect_ratio": "16:9",
    "resolution": "720p"
  }
}
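
In Node.js 18 or later, the same request can be sent with the built-in `fetch`. The sketch below is one possible shape rather than the repo's own implementation: `buildSubmitRequest` and `submitJob` are hypothetical helper names, and the error handling assumes that a successful response carries `data.task_id` as described under "Response and errors".

```javascript
const SUBMIT_URL = "https://api.apidot.ai/api/generate/submit";

// Build the fetch() arguments without sending anything, so the request
// can be inspected or unit-tested before it goes over the network.
function buildSubmitRequest(apiKey, payload) {
  if (!apiKey) throw new Error("APIDOT_API_KEY is not set");
  return {
    url: SUBMIT_URL,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(payload),
    },
  };
}

// Send the request and return data.task_id.
async function submitJob(apiKey, payload) {
  const { url, options } = buildSubmitRequest(apiKey, payload);
  const res = await fetch(url, options);
  const body = await res.json();
  if (!res.ok || !body.data || !body.data.task_id) {
    throw new Error(`Submit failed (HTTP ${res.status}): ${JSON.stringify(body)}`);
  }
  return body.data.task_id;
}
```

A call such as `submitJob(process.env.APIDOT_API_KEY, payload)` resolves to the task ID, which should be stored before any status check is made.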

Veo 3.1 uses the shared APIDot submit-and-track workflow, so veo3.1-lite, veo3.1-fast, and veo3.1-quality jobs all go through the same endpoint. All three support text-to-video. veo3.1-fast also supports frame and reference image-to-video, veo3.1-quality supports frame image-to-video only, and veo3.1-lite is text-to-video only.

Model IDs and request variants

veo3.1-lite text-to-video

{
  "model": "veo3.1-lite",
  "callback_url": "https://your-domain.com/callback",
  "input": {
    "prompt": "A miniature city waking up at sunrise, cinematic light, smooth camera motion",
    "duration": 8,
    "aspect_ratio": "16:9",
    "resolution": "720p"
  }
}

veo3.1-fast text-to-video

{
  "model": "veo3.1-fast",
  "callback_url": "https://your-domain.com/callback",
  "input": {
    "prompt": "Dolphins jumping in a bright blue ocean with native ambient sound",
    "duration": 8,
    "aspect_ratio": "16:9",
    "resolution": "720p"
  }
}

veo3.1-fast frame image-to-video

{
  "model": "veo3.1-fast",
  "callback_url": "https://your-domain.com/callback",
  "input": {
    "prompt": "Animate the subject from the first frame into a smooth final pose",
    "duration": 8,
    "aspect_ratio": "16:9",
    "resolution": "1080p",
    "generate_type": "frame",
    "image_urls": [
      "https://cdn.example.com/first-frame.webp",
      "https://cdn.example.com/last-frame.webp"
    ]
  }
}

veo3.1-fast reference image-to-video

{
  "model": "veo3.1-fast",
  "callback_url": "https://your-domain.com/callback",
  "input": {
    "prompt": "Create a dynamic product scene using the supplied visual references",
    "duration": 8,
    "aspect_ratio": "16:9",
    "resolution": "1080p",
    "generate_type": "reference",
    "image_urls": [
      "https://cdn.example.com/reference-1.webp",
      "https://cdn.example.com/reference-2.webp",
      "https://cdn.example.com/reference-3.webp"
    ]
  }
}

veo3.1-quality text-to-video

{
  "model": "veo3.1-quality",
  "callback_url": "https://your-domain.com/callback",
  "input": {
    "prompt": "A premium cinematic landscape reveal with natural camera motion and native ambience",
    "duration": 8,
    "aspect_ratio": "16:9",
    "resolution": "4k"
  }
}

veo3.1-quality frame image-to-video

{
  "model": "veo3.1-quality",
  "callback_url": "https://your-domain.com/callback",
  "input": {
    "prompt": "Create a premium cinematic continuation from the supplied first frame",
    "duration": 8,
    "aspect_ratio": "16:9",
    "resolution": "4k",
    "generate_type": "frame",
    "image_urls": [
      "https://cdn.example.com/first-frame.webp"
    ]
  }
}

Request parameters

Field	Type	Required	Description
model	string	yes	Target Veo 3.1 variant: `veo3.1-lite`, `veo3.1-fast`, or `veo3.1-quality`.
callback_url	string	no	Optional webhook callback URL for terminal task updates.
input	object	yes	Container for Veo 3.1 generation parameters.
input.prompt	string	yes	Main prompt describing scene, motion, audio intent, and target aesthetic.
input.duration	number	no	Requested duration in seconds. Veo 3.1 is fixed at 8 seconds.
input.aspect_ratio	string	no	Supported values are `16:9` and `9:16`.
input.resolution	string	no	Supported values are `720p`, `1080p`, and `4k`.
input.generate_type	string	no	Optional image-to-video mode. Use `frame` for first/last-frame control or `reference` for reference images. Do not send this field for text-to-video or `veo3.1-lite`; `reference` is only supported by `veo3.1-fast`.
input.image_urls	string[]	no	Optional image inputs for supported image-to-video runs. `frame` supports 1-2 images; `reference` supports 1-3 images. Do not send this field to `veo3.1-lite`.
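
The rules in this table can be checked on the client before a request is sent, which avoids spending a round trip on a payload that would come back as a 400. The following is a sketch of such a check, not the API's own validation: `validateRequest` is a hypothetical helper, and it is deliberately strict (for example, it treats `generate_type` without images as an error, which the API may or may not reject).

```javascript
// Image-to-video modes each model accepts, per the parameter table above.
const IMAGE_MODES = {
  "veo3.1-lite": [],
  "veo3.1-fast": ["frame", "reference"],
  "veo3.1-quality": ["frame"],
};
const MAX_IMAGES = { frame: 2, reference: 3 };

// Returns a list of problems; an empty list means the payload looks valid.
function validateRequest(payload) {
  const errors = [];
  const model = payload && payload.model;
  const input = (payload && payload.input) || {};
  const modes = IMAGE_MODES[model];

  if (!modes) errors.push("model must be veo3.1-lite, veo3.1-fast, or veo3.1-quality");
  if (typeof input.prompt !== "string" || input.prompt.trim() === "") {
    errors.push("input.prompt is required");
  }
  if (input.duration !== undefined && input.duration !== 8) {
    errors.push("input.duration is fixed at 8 seconds");
  }
  if (input.aspect_ratio !== undefined && !["16:9", "9:16"].includes(input.aspect_ratio)) {
    errors.push("input.aspect_ratio must be 16:9 or 9:16");
  }
  if (input.resolution !== undefined && !["720p", "1080p", "4k"].includes(input.resolution)) {
    errors.push("input.resolution must be 720p, 1080p, or 4k");
  }

  const type = input.generate_type;
  const images = input.image_urls;
  if (type === undefined && images === undefined) return errors; // plain text-to-video

  if (!Object.keys(MAX_IMAGES).includes(type)) {
    errors.push('input.generate_type must be "frame" or "reference" when images are sent');
    return errors;
  }
  if (modes && !modes.includes(type)) {
    errors.push(`${model} does not support generate_type "${type}"`);
  }
  const count = Array.isArray(images) ? images.length : 0;
  if (count < 1 || count > MAX_IMAGES[type]) {
    errors.push(`"${type}" needs 1-${MAX_IMAGES[type]} entries in input.image_urls`);
  }
  return errors;
}
```

Returning a list instead of throwing on the first problem lets a caller show every issue at once.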

Practical integration notes

Use veo3.1-lite for the lowest-cost text-only iteration, veo3.1-fast for fast text or image-guided runs, and veo3.1-quality for premium output.
Keep prompts explicit about motion, camera perspective, native audio intent, and visual continuity.
Do not send input.generate_type or input.image_urls to veo3.1-lite.
Use input.generate_type: "frame" for first/last-frame workflows. Use reference only with veo3.1-fast.

Polling and webhooks

APIDot media generation is asynchronous. Store data.task_id immediately after submit, poll /api/generate/status/{task_id} for local tests, and use callback_url webhooks for production queues where users may leave the page before completion.
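
A polling loop for local tests might look like the sketch below. Several details are assumptions, not confirmed by this README: that the status endpoint lives on the same host as the submit endpoint, that it accepts a GET with the same Bearer header, and that the response body carries `data.status`. The names of the terminal statuses are not documented here, so `pollUntilDone` takes them as an argument instead of hard-coding them.

```javascript
// Assumed base URL: the documented status path on the submit endpoint's host.
const STATUS_BASE = "https://api.apidot.ai/api/generate/status/";

// Assumed request shape: GET with the Bearer header, JSON body with a data object.
async function fetchStatus(apiKey, taskId) {
  const res = await fetch(STATUS_BASE + encodeURIComponent(taskId), {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!res.ok) throw new Error(`Status check failed: HTTP ${res.status}`);
  return (await res.json()).data;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Call getStatus() until it reports one of terminalStatuses,
// or give up once the next wait would pass the timeout.
async function pollUntilDone(
  getStatus,
  terminalStatuses,
  { intervalMs = 5000, timeoutMs = 600000 } = {}
) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const task = await getStatus();
    if (terminalStatuses.includes(task.status)) return task;
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Task still "${task.status}" after ${timeoutMs} ms`);
    }
    await sleep(intervalMs);
  }
}
```

A typical call would be `pollUntilDone(() => fetchStatus(apiKey, taskId), terminalStatuses)`, with `terminalStatuses` filled in from the APIDot API docs. Passing the status fetcher as a function also makes the loop easy to test with a stub.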

Webhook handlers should verify task ownership, persist callback events, return 2xx quickly, and be idempotent because duplicate deliveries can happen.
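
The core of such a handler can be kept separate from the HTTP framework. The sketch below shows the ownership check and the idempotency rule in isolation. It assumes that the callback body includes `task_id` and `status` fields (the actual callback schema is not shown in this README), and it uses an in-memory `Map` as a stand-in for a real database; `handleCallback` is a hypothetical name.

```javascript
// tasks: Map of task_id -> { userId, status, events: [] }, filled at submit time.
function handleCallback(tasks, event) {
  const record = event ? tasks.get(event.task_id) : undefined;

  // Ownership check: reject callbacks for tasks this server never submitted.
  if (!record) return { statusCode: 404, applied: false };

  // Persist every delivery for auditing, including duplicates.
  record.events.push(event);

  // Idempotency: a status that is already stored is acknowledged but not re-applied.
  if (record.status === event.status) return { statusCode: 200, applied: false };

  record.status = event.status;
  return { statusCode: 200, applied: true };
}
```

The returned `statusCode` is what the HTTP layer should send back immediately; slower work such as downloading the result can then be queued only when `applied` is true.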

Response and errors

code: HTTP-style status code. Successful submits return 200.
data.task_id: Async task identifier returned immediately after submission.
data.status: Initial task status, typically not_started.
data.created_time: ISO 8601 timestamp for task creation.

Common error classes:

400 invalid_request: Missing fields, unsupported generate mode, or invalid image input combinations.
401 authentication_error: Missing, expired, or invalid Bearer API key.
402 insufficient_credits: The current prepaid balance cannot cover the Veo 3.1 request.
429 rate_limited: The API key is temporarily above the allowed submission rate.
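
One way to act on these classes is a lookup that tells the caller whether a retry makes sense. The codes and error names below come from the list above; the retry decisions and the handling of unlisted codes are assumptions of this sketch, and `classifySubmitResult` is a hypothetical helper.

```javascript
// Error classes from this README, each paired with an assumed retry policy.
const ERROR_ACTIONS = {
  400: { type: "invalid_request", retry: false, hint: "Fix the payload before resubmitting." },
  401: { type: "authentication_error", retry: false, hint: "Check the Bearer API key." },
  402: { type: "insufficient_credits", retry: false, hint: "Top up the prepaid balance." },
  429: { type: "rate_limited", retry: true, hint: "Wait, then resubmit with backoff." },
};

function classifySubmitResult(code) {
  if (code === 200) return { ok: true, retry: false };
  const known = ERROR_ACTIONS[code];
  if (known) return { ok: false, ...known };
  // Unlisted codes: treat 5xx as transient and everything else as final (assumption).
  return {
    ok: false,
    type: "unknown",
    retry: code >= 500,
    hint: "See the APIDot API docs for this code.",
  };
}

console.log(classifySubmitResult(429).retry); // true
console.log(classifySubmitResult(400).retry); // false
```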

Production notes

Keep APIDot API keys in server-side environment variables.
Persist task_id, selected model, request payload, user ID, and status together.
Poll at a moderate interval during tests, and rely on webhooks for durable result delivery in production.
Validate source media URLs before submitting requests that depend on source files.
Avoid logging API keys, private prompts, private media URLs, or callback URLs.
Retry transient network failures with backoff, but do not retry unchanged invalid payloads.
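
The last note can be sketched as a small retry wrapper with exponential backoff and jitter. `withBackoff` and its options are illustrative names, and the default delays are arbitrary starting points, not values recommended by APIDot. The `isRetryable` callback is where invalid payloads are excluded from retries.

```javascript
// Retry an async operation with capped exponential backoff and full jitter.
async function withBackoff(
  operation,
  { retries = 4, baseMs = 500, maxMs = 8000, isRetryable = () => true } = {}
) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // Give up when attempts run out or the error is not transient.
      if (attempt >= retries || !isRetryable(err)) throw err;
      // The delay cap doubles per attempt: baseMs, 2x, 4x, ... up to maxMs.
      const capMs = Math.min(maxMs, baseMs * 2 ** attempt);
      // Full jitter: wait a random time between 0 and the cap.
      await new Promise((resolve) => setTimeout(resolve, Math.random() * capMs));
    }
  }
}
```

A caller would wrap the submit call, for example `withBackoff(() => submit(payload), { isRetryable })`, where `submit` is whatever function sends the request and `isRetryable` returns false for errors such as `invalid_request`.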

FAQ

What is the Veo 3.1 Video API on APIDot?

It is APIDot's unified entry point for veo3.1-lite, veo3.1-fast, and veo3.1-quality. You can test Veo 3.1 text-to-video, frame-guided generation, and supported reference image workflows in one place, then keep the same async integration pattern when you move the workload into production.

Which inputs, generation modes, and output settings are supported?

The current page is built around fixed 8-second generation with 16:9 and 9:16 output, plus 720p, 1080p, or 4k depending on the variant. veo3.1-lite is text-to-video only. veo3.1-fast supports text, frame-guided generation, and reference images. veo3.1-quality supports text plus frame-guided generation, but not reference mode.

Can I use reference images with every Veo 3.1 variant?

No. Reference image video generation is available only on veo3.1-fast. If you want the lowest-cost way to test a text idea, use veo3.1-lite. If you want higher-fidelity frame-guided generation for a more polished clip, veo3.1-quality is the better fit, but it does not support reference mode.

When should I use frame mode instead of reference mode?

Use frame mode when you already know the opening shot, the closing shot, or the transition you want to control. Use reference mode when the bigger goal is keeping the subject, product, or visual style consistent across generations. On APIDot, frame mode works on veo3.1-fast and veo3.1-quality, while reference mode is limited to veo3.1-fast.

Does Veo 3.1 generate native audio, and do failed jobs get charged?

Yes, Veo 3.1 generates native audio; it is one of the headline strengths of the family. As for billing, credits on APIDot are intended to be charged for successful generations, not for failed jobs.

Does the API use an async job pattern?

Yes. You submit the request, receive a task ID, and then either poll the status endpoint or wait for a callback. That makes Veo 3.1 easier to plug into queue-based systems, content pipelines, and other production workflows without blocking the request lifecycle.

How do teams choose between Lite, Fast, and Quality?

A practical rule is: use Lite for cheap text-only exploration, Fast when you need the broadest working mode with image guidance and frequent iteration, and Quality when the final video matters more than turnaround. Teams often move through them in that same order as a project becomes more defined.

What kinds of prompts and camera directions work well with Veo 3.1?

The safest pattern is to be explicit about subject, action, camera perspective, movement, lighting, atmosphere, and audio intent. Prompts that use film language such as tracking shot, overhead shot, close-up, slow push-in, backlight, or ambient street sound tend to give Veo 3.1 clearer instructions than short generic prompts.

Which Veo 3.1 model ID should I use?

veo3.1-lite is text-only and cost-efficient. veo3.1-fast supports text-to-video plus frame/reference image-to-video. veo3.1-quality supports text-to-video and frame image-to-video for the highest-quality output.

How does `input.generate_type` work?

Only include input.generate_type when sending images. Use frame for up to two images: first frame and optional last frame. Use reference for up to three reference images, and only with veo3.1-fast.

Can `veo3.1-lite` use images?

No. veo3.1-lite should not receive input.generate_type or input.image_urls; submit text prompts only.

Is APIDot the creator of the underlying model?

No. This is an APIDot integration repository for calling Veo 3.1 through APIDot. Google DeepMind is listed as the model provider in the APIDot catalog. Use the APIDot model page for current availability, pricing, and usage terms.

Related APIDot model API repositories

Model	Repository
Seedance 2	seedance-2-api
Sora 2 Official	sora-2-official-api
Happy Horse	happy-horse-api