Build with the Veo 3.1 API using APIDot: cURL, Node.js, polling, webhooks, pricing, and fal.ai comparison in one production-oriented GitHub repo.
Get API Key | API Docs | Model Page | Main Examples
Run Veo 3.1 text-to-video, frame-guided video generation, and reference image workflows from one APIDot page across Lite, Fast, and Quality. Use the same async API pattern to test prompts, compare modes, and move the workflow into production without changing how your integration works.
This repository turns that workflow into runnable server-side examples: a verified cURL request, a native Node.js polling example, webhook receiver notes, prompt examples, pricing context, and production integration guardrails.
Veo 3.1 is Google DeepMind's advanced video generation model for short clips that need synchronized audio, cinematic motion, and stronger visual continuity. It is built for prompts where camera direction, scene physics, sound, and story intent should work together instead of being assembled later.
Its core strengths include native audio generation, text-to-video creation, first/last frame control, reference image guidance, cinematic camera language, realistic lighting, and more stable motion. This makes Veo 3.1 useful for product launches, short ads, social videos, storyboard previews, game teasers, explainers, and previs clips.
On APIDot, Veo 3.1 is available through veo3.1-lite, veo3.1-fast, and veo3.1-quality for fixed 8-second jobs with 16:9 or 9:16 output and 720p, 1080p, or 4k resolution options. Use Lite for text-only iteration, Fast for text, frame, or reference image workflows, and Quality when premium text or frame-guided output matters most.
- Generating native audio so dialogue, ambience, and effects stay synchronized: Veo 3.1 can generate synchronized audio together with the video, including dialogue, ambient sound, and scene-aware effects. This helps the first output feel closer to a real short clip instead of a silent motion preview.
- Turning detailed prompts into cinematic 8-second scenes with clear direction: Veo 3.1 can turn structured prompts into short videos with subject action, environment, lighting, camera movement, and mood. It is useful when teams need to test concepts quickly while still describing the shot like a real video brief.
- Using first and last frames to control motion paths and transitions: With frame-guided generation, Veo 3.1 can animate from a supplied starting frame or move between first and last frames. This is useful for product reveals, storyboard transitions, before-and-after motion, and clips where the beginning or ending must stay controlled.
- Guiding subjects, products, and style with reference images for lower drift: Veo 3.1 can use reference images to guide the subject, product look, scene style, or brand direction. This helps reduce visual drift when the generated video needs to stay closer to a character, object, or visual system.
- Directing camera movement, framing, lighting, and pacing with cinematic language: Veo 3.1 responds well to prompts that describe tracking shots, close-ups, overhead angles, portrait framing, lighting cues, pacing, and atmosphere. This makes it stronger for clips where camera intent matters, not just for making a scene move.
- Maintaining believable physics, lighting, and continuity across the full clip: Veo 3.1 is designed for more believable motion, scene physics, shadows, and visual continuity. It is useful for short ads, product scenes, action moments, explainers, and other clips where the generated video needs to feel coherent across the full 8 seconds.
This page is for teams that already know they need short-form AI video and want direct answers to the operational questions: which Veo 3.1 variant to choose, when to use text-only versus frame or reference image mode, and how to plug the result into an async API workflow.
- Short ad videos for product launches and paid campaigns
- Product demo videos and ecommerce launch clips
- Social media short videos with stronger pacing and shot direction
- Storyboard tests and film previs clips
- Game teaser videos and concept reveal clips
- Training, explainer, and internal comms videos
Catalog price: starting at $0.05 per video. The per-variant pricing snapshot is listed below.
This README uses the pricing data currently published in the APIDot model catalog. Check the APIDot model page before high-volume production runs.
- veo3.1-lite: per video | 720p/1080p: 10 credits ($0.05), 4K: 30 credits ($0.15)
- veo3.1-fast: per video | 720p/1080p: 20 credits ($0.10), 4K: 60 credits ($0.30)
- veo3.1-quality: per video | 720p/1080p: 120 credits ($0.60), 4K: 400 credits ($2.00)
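For budgeting, the snapshot above can be turned into a small lookup. This is a minimal sketch, assuming the listed credit values are still current and that one credit equals $0.005 (implied by 10 credits = $0.05). `estimateCost`, `CREDITS_PER_VIDEO`, and `USD_PER_CREDIT` are hypothetical names, not part of any APIDot SDK.

```javascript
// Credits per video, copied from the pricing snapshot above (may be outdated).
const CREDITS_PER_VIDEO = {
  "veo3.1-lite": { standard: 10, "4k": 30 },
  "veo3.1-fast": { standard: 20, "4k": 60 },
  "veo3.1-quality": { standard: 120, "4k": 400 },
};

// ASSUMPTION: implied by the snapshot, where 10 credits = $0.05.
const USD_PER_CREDIT = 0.005;

const RESOLUTIONS = ["720p", "1080p", "4k"];

// Estimates credits and USD for a batch of videos at one model and resolution.
function estimateCost(model, resolution, videoCount) {
  const tier = CREDITS_PER_VIDEO[model];
  if (!tier) throw new Error(`Unknown model: ${model}`);
  if (!RESOLUTIONS.includes(resolution)) {
    throw new Error(`Unknown resolution: ${resolution}`);
  }
  // 720p and 1080p share one price; 4k is priced separately.
  const key = resolution === "4k" ? "4k" : "standard";
  const credits = tier[key] * videoCount;
  // Round to cents to avoid floating-point noise.
  const usd = Math.round(credits * USD_PER_CREDIT * 100) / 100;
  return { credits, usd };
}
```

For example, `estimateCost("veo3.1-fast", "1080p", 50)` returns `{ credits: 1000, usd: 5 }`.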
For tiers with fal.ai comparison data in the APIDot catalog, APIDot shows up to 89% lower listed price. Treat this as a catalog snapshot and verify current pricing before launch.
| Tier | APIDot listed price | fal.ai listed price | Note |
|---|---|---|---|
| veo3.1-lite, 720p/1080p | $0.05 | $0.40 | APIDot is 88% lower in this tier |
| veo3.1-fast, 720p/1080p | $0.10 | $0.75 | APIDot is 87% lower in this tier |
| veo3.1-fast, 4K | $0.30 | $2.80 | APIDot is 89% lower in this tier |
| veo3.1-quality, 720p/1080p | $0.60 | $3.20 | APIDot is 81% lower in this tier |
| veo3.1-quality, 4K | $2.00 | $4.80 | APIDot is 58% lower in this tier |
```bash
cp .env.example .env
# Edit .env and set APIDOT_API_KEY
cd node
npm start
```
The same request shape is available as a copy-paste cURL example in curl/generate.md.
```mermaid
flowchart LR
  A[Submit generation request] --> B[Receive data.task_id]
  B --> C{Delivery mode}
  C -->|Polling| D[Check task status]
  C -->|Webhook| E[Receive callback_url event]
  D --> F[Read result URL from finished task]
  E --> F
```
Use polling for local tests and webhook delivery for production queues. Store data.task_id before the first status check so retries, callbacks, and result URLs can be reconciled safely.
Submit to APIDot's unified async generation endpoint:
```http
POST https://api.apidot.ai/api/generate/submit
Authorization: Bearer <APIDOT_API_KEY>
Content-Type: application/json
```
Primary payload:
```json
{
  "model": "veo3.1-lite",
  "callback_url": "https://your-domain.com/callback",
  "input": {
    "prompt": "A miniature city waking up at sunrise, cinematic light, smooth camera motion",
    "duration": 8,
    "aspect_ratio": "16:9",
    "resolution": "720p"
  }
}
```

Submit Veo 3.1 fast, lite, and quality jobs through APIDot's unified async generation endpoint.
Veo 3.1 uses the shared APIDot submit-and-track workflow. veo3.1-fast, veo3.1-lite, and veo3.1-quality all support text-to-video. veo3.1-fast also supports frame and reference image-to-video, while veo3.1-quality supports frame image-to-video only. veo3.1-lite is text-to-video only.
Text-to-video with veo3.1-lite:

```json
{
  "model": "veo3.1-lite",
  "callback_url": "https://your-domain.com/callback",
  "input": {
    "prompt": "A miniature city waking up at sunrise, cinematic light, smooth camera motion",
    "duration": 8,
    "aspect_ratio": "16:9",
    "resolution": "720p"
  }
}
```

Text-to-video with veo3.1-fast:

```json
{
  "model": "veo3.1-fast",
  "callback_url": "https://your-domain.com/callback",
  "input": {
    "prompt": "Dolphins jumping in a bright blue ocean with native ambient sound",
    "duration": 8,
    "aspect_ratio": "16:9",
    "resolution": "720p"
  }
}
```

Frame-guided video with veo3.1-fast (first and last frame):

```json
{
  "model": "veo3.1-fast",
  "callback_url": "https://your-domain.com/callback",
  "input": {
    "prompt": "Animate the subject from the first frame into a smooth final pose",
    "duration": 8,
    "aspect_ratio": "16:9",
    "resolution": "1080p",
    "generate_type": "frame",
    "image_urls": [
      "https://cdn.example.com/first-frame.webp",
      "https://cdn.example.com/last-frame.webp"
    ]
  }
}
```

Reference image video with veo3.1-fast:

```json
{
  "model": "veo3.1-fast",
  "callback_url": "https://your-domain.com/callback",
  "input": {
    "prompt": "Create a dynamic product scene using the supplied visual references",
    "duration": 8,
    "aspect_ratio": "16:9",
    "resolution": "1080p",
    "generate_type": "reference",
    "image_urls": [
      "https://cdn.example.com/reference-1.webp",
      "https://cdn.example.com/reference-2.webp",
      "https://cdn.example.com/reference-3.webp"
    ]
  }
}
```

Text-to-video with veo3.1-quality at 4k:

```json
{
  "model": "veo3.1-quality",
  "callback_url": "https://your-domain.com/callback",
  "input": {
    "prompt": "A premium cinematic landscape reveal with natural camera motion and native ambience",
    "duration": 8,
    "aspect_ratio": "16:9",
    "resolution": "4k"
  }
}
```

Frame-guided video with veo3.1-quality (first frame only):

```json
{
  "model": "veo3.1-quality",
  "callback_url": "https://your-domain.com/callback",
  "input": {
    "prompt": "Create a premium cinematic continuation from the supplied first frame",
    "duration": 8,
    "aspect_ratio": "16:9",
    "resolution": "4k",
    "generate_type": "frame",
    "image_urls": [
      "https://cdn.example.com/first-frame.webp"
    ]
  }
}
```

| Field | Type | Required | Description |
|---|---|---|---|
| model | string | yes | Target Veo 3.1 variant: veo3.1-lite, veo3.1-fast, or veo3.1-quality. |
| callback_url | string | no | Optional webhook callback URL for terminal task updates. |
| input | object | yes | Container for Veo 3.1 generation parameters. |
| input.prompt | string | yes | Main prompt describing scene, motion, audio intent, and target aesthetic. |
| input.duration | number | no | Requested duration in seconds. Veo 3.1 is fixed at 8 seconds. |
| input.aspect_ratio | string | no | Supported values are 16:9 and 9:16. |
| input.resolution | string | no | Supported values are 720p, 1080p, and 4k. |
| input.generate_type | string | no | Optional image-to-video mode. Use frame for first/last-frame control or reference for reference images. Do not send this field for text-to-video or veo3.1-lite; reference is only supported by veo3.1-fast. |
| input.image_urls | string[] | no | Optional image inputs for supported image-to-video runs. frame supports 1-2 images; reference supports 1-3 images. Do not send this field to veo3.1-lite. |
- Use `veo3.1-lite` for the lowest-cost text-only iteration, `veo3.1-fast` for fast text or image-guided runs, and `veo3.1-quality` for premium output.
- Keep prompts explicit about motion, camera perspective, native audio intent, and visual continuity.
- Do not send `input.generate_type` or `input.image_urls` to `veo3.1-lite`.
- Use `input.generate_type: "frame"` for first/last-frame workflows. Use `reference` only with `veo3.1-fast`.
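The field rules above can be checked locally before a request is submitted, which avoids spending a round trip on a payload that would be rejected. The following is a minimal sketch of such a pre-flight check. `validatePayload` is a hypothetical helper, and it encodes only the constraints stated in this README; the API itself may enforce more.

```javascript
const MODELS = ["veo3.1-lite", "veo3.1-fast", "veo3.1-quality"];
const ASPECT_RATIOS = ["16:9", "9:16"];
const RESOLUTIONS = ["720p", "1080p", "4k"];
// Maximum image count per generate_type, as listed in the field table.
const MAX_IMAGES = { frame: 2, reference: 3 };

// Returns a list of problems; an empty list means the payload passes these checks.
function validatePayload(payload) {
  const errors = [];
  const input = payload.input || {};
  const mode = input.generate_type;
  const images = input.image_urls;

  if (!MODELS.includes(payload.model)) errors.push("unknown model");
  if (typeof input.prompt !== "string" || input.prompt.trim() === "") {
    errors.push("input.prompt is required");
  }
  if (input.duration !== undefined && input.duration !== 8) {
    errors.push("input.duration is fixed at 8 seconds");
  }
  if (input.aspect_ratio !== undefined && !ASPECT_RATIOS.includes(input.aspect_ratio)) {
    errors.push("input.aspect_ratio must be 16:9 or 9:16");
  }
  if (input.resolution !== undefined && !RESOLUTIONS.includes(input.resolution)) {
    errors.push("input.resolution must be 720p, 1080p, or 4k");
  }

  // veo3.1-lite is text-to-video only.
  if (payload.model === "veo3.1-lite" && (mode !== undefined || images !== undefined)) {
    errors.push("veo3.1-lite accepts neither generate_type nor image_urls");
  }

  if (mode !== undefined) {
    if (mode !== "frame" && mode !== "reference") {
      errors.push("input.generate_type must be frame or reference");
    } else {
      if (mode === "reference" && payload.model !== "veo3.1-fast") {
        errors.push("reference mode is only supported by veo3.1-fast");
      }
      const count = Array.isArray(images) ? images.length : 0;
      if (count < 1 || count > MAX_IMAGES[mode]) {
        errors.push(`${mode} mode needs 1-${MAX_IMAGES[mode]} image URLs`);
      }
    }
  }
  return errors;
}
```

Returning a list instead of throwing on the first problem lets a caller show every issue at once.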
APIDot media generation is asynchronous. Store data.task_id immediately after submit, poll /api/generate/status/{task_id} for local tests, and use callback_url webhooks for production queues where users may leave the page before completion.
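For local tests, the polling side of this workflow might look like the sketch below. It makes several assumptions that this README does not confirm: the status path lives on the same `https://api.apidot.ai` host as the submit endpoint, it is read with an authenticated GET, the response carries `data.status`, and the terminal status names are `finished` and `failed`. `pollTask` and its options are hypothetical names; check the APIDot docs for the real status values.

```javascript
// ASSUMPTION: same host as the submit endpoint; the path comes from this README.
const STATUS_BASE = "https://api.apidot.ai/api/generate/status/";

// ASSUMPTION: terminal status names are placeholders, not documented here.
const TERMINAL_STATUSES = new Set(["finished", "failed"]);

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Polls until the task reaches a terminal status or maxAttempts is exhausted.
// fetchImpl is injectable so the loop can be exercised without network access.
async function pollTask(taskId, apiKey, options = {}) {
  const { intervalMs = 5000, maxAttempts = 60, fetchImpl = fetch } = options;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await fetchImpl(STATUS_BASE + encodeURIComponent(taskId), {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    if (!res.ok) throw new Error(`Status check failed with HTTP ${res.status}`);
    const body = await res.json();
    const status = body.data && body.data.status;
    if (TERMINAL_STATUSES.has(status)) return body.data;
    // Wait between checks, but not after the final attempt.
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error(`Task ${taskId} did not finish after ${maxAttempts} checks`);
}
```

The bounded attempt count keeps a stuck task from polling forever; the 5-second default interval is an arbitrary starting point, not an APIDot recommendation.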
Webhook handlers should verify task ownership, persist callback events, return 2xx quickly, and be idempotent because duplicate deliveries can happen.
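The ownership and idempotency checks described above can be sketched as a pure function that an HTTP handler would call. This sketch assumes the callback body carries `task_id` and `status` fields, which this README does not confirm; the in-memory `Map` and `Set` stand in for a real database. `handleCallback` is a hypothetical name.

```javascript
// knownTasks: Map of task_id -> stored record for tasks this service submitted.
// processedEvents: Set of event keys that were already persisted.
// ASSUMPTION: the callback body has task_id and status; check the webhook docs.
function handleCallback(event, knownTasks, processedEvents) {
  const taskId = event && event.task_id;

  // Ownership check: ignore callbacks for tasks this service never submitted.
  if (!taskId || !knownTasks.has(taskId)) {
    return { httpStatus: 404, processed: false };
  }

  // Idempotency: a repeated task/status pair is acknowledged but not reapplied.
  const eventKey = `${taskId}:${event.status}`;
  if (processedEvents.has(eventKey)) {
    return { httpStatus: 200, processed: false };
  }

  processedEvents.add(eventKey);
  knownTasks.set(taskId, { ...knownTasks.get(taskId), status: event.status });
  return { httpStatus: 200, processed: true };
}
```

Duplicate deliveries still receive a 2xx response so the sender stops retrying. Answering unknown tasks with 404 is a design choice in this sketch rather than an APIDot requirement.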
- `code`: HTTP-style status code. Successful submits return `200`.
- `data.task_id`: Async task identifier returned immediately after submission.
- `data.status`: Initial task status, typically `not_started`.
- `data.created_time`: ISO 8601 timestamp for task creation.
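A submit call that reads these fields could look like the following sketch. It assumes the JSON body has the shape `{ code, data: { task_id, status, created_time } }`, inferred from the field list above, and it relies on the global `fetch` available in Node.js 18 and later. `submitJob` is a hypothetical helper name.

```javascript
// Submit endpoint as given in this README.
const SUBMIT_URL = "https://api.apidot.ai/api/generate/submit";

// Sends one generation request and returns the task fields to persist.
// fetchImpl is injectable so the function can be tested without network access.
async function submitJob(payload, apiKey, fetchImpl = fetch) {
  const res = await fetchImpl(SUBMIT_URL, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  });
  const body = await res.json();

  // ASSUMPTION: success means HTTP 2xx, body.code === 200, and a task_id present.
  if (!res.ok || body.code !== 200 || !body.data || !body.data.task_id) {
    const err = new Error(`Submit failed with HTTP ${res.status}`);
    err.status = res.status; // lets callers tell 400/401/402 from 429
    err.body = body;
    throw err;
  }

  return {
    taskId: body.data.task_id,
    status: body.data.status,
    createdTime: body.data.created_time,
  };
}
```

Store the returned `taskId` before starting any status check, so a crash between submit and the first poll does not orphan the task.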
Common error classes:
- 400 invalid_request: Missing fields, unsupported generate mode, or invalid image input combinations.
- 401 authentication_error: Missing, expired, or invalid Bearer API key.
- 402 insufficient_credits: The current prepaid balance cannot cover the Veo 3.1 request.
- 429 rate_limited: The API key is temporarily above the allowed submission rate.
- Keep APIDot API keys in server-side environment variables.
- Persist task_id, selected model, request payload, user ID, and status together.
- Poll at a moderate interval during tests, and rely on webhooks for durable delivery in production.
- Validate source media URLs before submitting requests that depend on source files.
- Avoid logging API keys, private prompts, private media URLs, or callback URLs.
- Retry transient network failures with backoff, but do not retry unchanged invalid payloads.
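The retry guidance above can be sketched as a small wrapper. It assumes that errors thrown by the wrapped operation carry an HTTP `status` property when the server responded and none when the network failed; that is a convention of this sketch, not an APIDot contract. `withRetry` and its options are hypothetical names.

```javascript
// Error classes from this README that will not succeed on an unchanged retry:
// 400 invalid_request, 401 authentication_error, 402 insufficient_credits.
const NON_RETRYABLE = new Set([400, 401, 402]);

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs operation(), retrying with exponential backoff on retryable failures.
async function withRetry(operation, options = {}) {
  const { maxAttempts = 4, baseDelayMs = 500 } = options;
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // Network errors (no status) and 429 rate limits are treated as transient.
      const retryable = !NON_RETRYABLE.has(err.status);
      if (!retryable || attempt >= maxAttempts) throw err;
      // Delay doubles each attempt: baseDelayMs, 2x, 4x, ...
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
}
```

Adding random jitter to the delay is a common refinement when many workers may retry at the same moment.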
The APIDot Veo 3.1 page is the unified entry point for veo3.1-lite, veo3.1-fast, and veo3.1-quality. You can test Veo 3.1 text-to-video, frame-guided generation, and supported reference image workflows in one place, then keep the same async integration pattern when you move the workload into production.
The current page is built around fixed 8-second generation with 16:9 and 9:16 output, plus 720p, 1080p, or 4k depending on the variant. veo3.1-lite is text-to-video only. veo3.1-fast supports text, frame-guided generation, and reference images. veo3.1-quality supports text plus frame-guided generation, but not reference mode.
Reference image video generation is available only on veo3.1-fast, not on every variant. If you want the lowest-cost way to test a text idea, use veo3.1-lite. If you want higher-fidelity frame-guided generation for a more polished clip, veo3.1-quality is the better fit, but it does not support reference mode.
Use frame mode when you already know the opening shot, the closing shot, or the transition you want to control. Use reference mode when the bigger goal is keeping the subject, product, or visual style consistent across generations. On APIDot, frame mode works on veo3.1-fast and veo3.1-quality, while reference mode is limited to veo3.1-fast.
Veo 3.1 generates native audio, which is one of the headline strengths of the model family. On APIDot, credits are intended for successful generations rather than failed jobs.
The API is asynchronous: you submit the request, receive a task ID, and then either poll the status endpoint or wait for a callback. That makes Veo 3.1 easier to plug into queue-based systems, content pipelines, and other production workflows without blocking the request lifecycle.
A practical rule is: use Lite for cheap text-only exploration, Fast when you need the broadest working mode with image guidance and frequent iteration, and Quality when the final video matters more than turnaround. Teams often move through them in that same order as a project becomes more defined.
The safest pattern is to be explicit about subject, action, camera perspective, movement, lighting, atmosphere, and audio intent. Prompts that use film language such as tracking shot, overhead shot, close-up, slow push-in, backlight, or ambient street sound tend to give Veo 3.1 clearer instructions than short generic prompts.
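One way to keep prompts consistently explicit is to assemble them from named parts. The sketch below is illustrative only: `buildPrompt` and its field names are hypothetical, and the comma-joined format simply mirrors the example prompts in this README rather than any documented Veo 3.1 requirement.

```javascript
// Order in which prompt parts are emitted; all values are expected to be strings.
const PROMPT_PARTS = ["subject", "action", "camera", "lighting", "atmosphere", "audio"];

// Joins the non-empty parts into one comma-separated prompt string.
function buildPrompt(parts) {
  return PROMPT_PARTS
    .map((key) => (parts[key] || "").trim())
    .filter((text) => text.length > 0)
    .join(", ");
}
```

For example, `buildPrompt({ subject: "A miniature city", action: "waking up at sunrise", camera: "slow push-in", audio: "ambient street sound" })` returns `"A miniature city, waking up at sunrise, slow push-in, ambient street sound"`.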
veo3.1-lite is text-only and cost-efficient. veo3.1-fast supports text-to-video plus frame/reference image-to-video. veo3.1-quality supports text-to-video and frame image-to-video for the highest-quality output.
Only include input.generate_type when sending images. Use frame for up to two images: first frame and optional last frame. Use reference for up to three reference images, and only with veo3.1-fast.
veo3.1-lite does not accept image inputs. It should not receive input.generate_type or input.image_urls; submit text prompts only.
This is an APIDot integration repository for calling Veo 3.1 through APIDot. Google DeepMind is listed as the model provider in the APIDot catalog. Use the APIDot model page for current availability, pricing, and usage terms.
- APIDot: https://apidot.ai
- Veo 3.1 model page: https://apidot.ai/models/veo-3-1
- Veo 3.1 API docs: https://apidot.ai/docs/veo-3-1
- APIDot quickstart: https://apidot.ai/docs/quickstart
- APIDot webhooks: https://apidot.ai/docs/webhooks
- Main APIDot examples repo: https://github.com/APIDotAI/apidot-examples
More video API examples from APIDot:
| Model | Repository |
|---|---|
| Seedance 2 | seedance-2-api |
| Sora 2 Official | sora-2-official-api |
| Happy Horse | happy-horse-api |