Inference scaling by nicolaus-huang · Pull Request #830 · hpcaitech/Open-Sora

nicolaus-huang · 2025-03-19T10:42:16Z

Inference Scaling

Implementation of an inference-time scaling method, inspired by Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps. It spends more computational resources at inference time to get better results. Use it by specifying the sampling option, for example:

torchrun --nproc_per_node 4 --standalone scripts/diffusion/inference.py configs/diffusion/inference/t2i2v_768px_inference_scaling.py --save-dir samples --dataset.data-path assets/texts/sora.csv

Comparison columns: Original | num_subtree=3, num_scaling_steps=5, num_noise=1, time=16min | num_subtree=7, num_scaling_steps=8, num_noise=1, time=1h
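
To make the idea of trading compute for quality concrete, here is a minimal toy sketch of a verifier-guided search. It is not the code in this PR. The parameter names `num_subtree`, `num_scaling_steps` and `num_noise` are taken from the comparison above, but the meaning given to them here (branches per step, number of search rounds, noise samples per branch) is an assumption, and `perturb`, `toy_verifier` and `search` are hypothetical helpers that stand in for the real denoising and scoring models.

```python
import random
from typing import Callable, List, Tuple

Latent = List[float]


def perturb(latent: Latent, rng: random.Random, scale: float = 0.5) -> Latent:
    """Hypothetical branch step: add Gaussian noise to a latent."""
    return [x + rng.gauss(0.0, scale) for x in latent]


def toy_verifier(latent: Latent) -> float:
    """Hypothetical verifier: higher is better, maximal at the origin."""
    return -sum(x * x for x in latent)


def search(
    init: Latent,
    verifier: Callable[[Latent], float],
    num_subtree: int,
    num_scaling_steps: int,
    num_noise: int,
    seed: int = 0,
) -> Tuple[Latent, float, int]:
    """Greedy search: each round branches from the best latent found so far.

    Returns the best latent, its score, and the number of candidates scored.
    The meaning assigned to the three parameters is an assumption.
    """
    rng = random.Random(seed)
    best, best_score = init, verifier(init)
    calls = 0
    for _ in range(num_scaling_steps):      # search rounds
        parent = best                       # all branches in a round share one parent
        for _ in range(num_subtree):        # branches per round
            for _ in range(num_noise):      # noise samples per branch
                cand = perturb(parent, rng)
                score = verifier(cand)
                calls += 1
                if score > best_score:      # keep the best candidate seen so far
                    best, best_score = cand, score
    return best, best_score, calls


if __name__ == "__main__":
    start = [2.0, -1.0, 0.5]
    for cfg in [(3, 5, 1), (7, 8, 1)]:
        _, score, calls = search(start, toy_verifier, *cfg)
        print(cfg, "candidates scored:", calls, "best score:", round(score, 3))
```

Under this toy counting the two settings score 3 × 5 × 1 = 15 and 7 × 8 × 1 = 56 candidates, a ratio of about 3.7. That is close to the reported 16min to 1h increase, but the PR does not state its cost model, so treat the match as suggestive only.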

Devindelarocka

Cinematic Video Prompt
Aspect Ratio: 16:9 horizontal
Style: Photorealistic, raw found-footage aesthetic
Camera: First-person perspective (FPV), selfie-style, low angle looking up at the character
Device Simulation: iPhone handheld, slight shake for realism

Scene Concept
Setting:

A lively beach party at sunset in Maui. Golden light reflects off the ocean, tiki torches flicker in the background, and reggae music drifts from a nearby beach bar.
People are laughing in the distance, surfboards stacked against a palm tree, and the faint sound of waves crashing adds depth.

Character:

A Grizzly Bear with a surfer vibe: bright floral Hawaiian shirt, colorful surfer shorts, mirrored sunglasses.
He’s holding a half-empty coconut drink with a tiny umbrella and a slice of pineapple.

Camera Movement:

Starts with the camera held low, angled upward toward the bear’s face.
Slight wobble as if the person filming is tipsy.
Occasional lens flare from the sunset for realism.

Expression Progression (8 seconds):

Curious – Bear leans in, squints at the camera, tilts his head.
Drunk/Intoxicated – He sways, grins lazily, tongue slightly out, sunglasses slipping.
Wide-eyed Panic – Suddenly startled by a crab crawling on his foot.
Surprised & Dumbfounded – Mouth agape, sunglasses slide down his nose, coconut spills slightly.

Dialogue (Lip-Synced, Jamaican Reggae Accent, No Subtitles):

Bear:
“Ey mon… you ever seen a bear surf? Ha! Dis coconut talkin’ louder than me head, mon… Whoa—what’s dat?! Crab attack! Jah bless, dis beach is wild!”

Soundscape:

Ambient reggae beats in the background.
Ocean waves, distant laughter, and a sudden “snap” sound when the crab appears.
Bear’s voice is deep, rhythmic, and playful with exaggerated Jamaican inflection.

nicolaus-huang added 5 commits March 17, 2025 15:37

inference scaling init

bb64366

add configs and clean code

a2e4e16

add inference scaling doc

681a3da

fix bug

134c985

clean config

e660956

Devindelarocka approved these changes Oct 30, 2025

nicolaus-huang wants to merge 5 commits into main from inference-scaling
