sonic_Combine

Date: 2025-06-16

(1) Algorithm source

(2) Environment setup

2.1 Python >= 3.10

2.2 Requires a CUDA environment, CUDA >= 11.8

2.3 Install the Python dependencies

pip install -r requirements1.txt

2.4 Build the custom ops:

cd src/utilslive/dependencies/XPose/models/UniPose/ops
python setup.py build install
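
Before building the ops, it can help to confirm that the interpreter and CUDA toolkit meet the versions listed in 2.1 and 2.2. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and it assumes PyTorch is among the packages installed by requirements1.txt.

```python
import sys

MIN_PYTHON = (3, 10)  # from section 2.1
MIN_CUDA = (11, 8)    # from section 2.2


def parse_version(text):
    """Turn a version string such as '11.8' into a tuple of ints, e.g. (11, 8)."""
    return tuple(int(part) for part in text.split(".")[:2])


def check_environment():
    """Return a list of human-readable problems; an empty list means no problem was found."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        found = f"{sys.version_info[0]}.{sys.version_info[1]}"
        problems.append(f"Python {found} found, need >= 3.10")
    try:
        import torch  # assumption: installed via requirements1.txt
    except ImportError:
        problems.append("PyTorch is not importable, cannot check CUDA")
        return problems
    cuda = torch.version.cuda  # None on CPU-only builds
    if cuda is None or not torch.cuda.is_available():
        problems.append("no usable CUDA build or device detected")
    elif parse_version(cuda) < MIN_CUDA:
        problems.append(f"CUDA {cuda} found, need >= 11.8")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks OK"]:
        print(line)
```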

(3) Usage

3.1 Inference

python sonic_full_inference_v2.py

3.2 Model download

Download the Sonic models from https://github.com/jixiaozhong/Sonic
Download the yolov8x-seg.pt model
Arrange the downloaded files as follows:

main_package_name
  ├──checkpoints
  │  ├──Sonic
  │  │  ├──audio2bucket.pth
  │  │  ├──audio2token.pth
  │  │  ├──unet.pth
  │  ├──stable-video-diffusion-img2vid-xt
  │  │  ├──...
  │  ├──whisper-tiny
  │  │  ├──...
  │  ├──RIFE
  │  │  ├──flownet.pkl
  │  ├──yoloface_v5m.pt
  ├──pretrained_weights
  │  ├──yolov8x-seg.pt
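
A missing weight file usually only surfaces once inference starts, so a quick layout check can save time. The sketch below is illustrative and not part of the repository: `missing_weights` is a hypothetical helper, and the paths are copied from the tree above. The two directories whose contents are elided (`...`) are only checked for existence.

```python
from pathlib import Path

# Files named explicitly in the directory tree above.
EXPECTED_FILES = [
    "checkpoints/Sonic/audio2bucket.pth",
    "checkpoints/Sonic/audio2token.pth",
    "checkpoints/Sonic/unet.pth",
    "checkpoints/RIFE/flownet.pkl",
    "checkpoints/yoloface_v5m.pt",
    "pretrained_weights/yolov8x-seg.pt",
]

# Directories whose contents the tree does not list.
EXPECTED_DIRS = [
    "checkpoints/stable-video-diffusion-img2vid-xt",
    "checkpoints/whisper-tiny",
]


def missing_weights(root="."):
    """Return the expected paths (relative to root) that are not present."""
    root = Path(root)
    missing = [p for p in EXPECTED_FILES if not (root / p).is_file()]
    missing += [p for p in EXPECTED_DIRS if not (root / p).is_dir()]
    return missing


if __name__ == "__main__":
    for path in missing_weights():
        print("missing:", path)
```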

3.3 Request parameters

Parameter	Type	Description
image_path	string	Input image path
audio_path	string	Input audio path
output_path	string	Output directory path
crop_save_path	string	Path where the cropped image is saved
min_resolution	int	Minimum resolution (default 448)
inference_steps	int	Number of inference steps (default 15)
animal_signal	bool	Whether to run in animal mode
pastback	bool	Whether to paste the result back onto the original image
mult_people	bool	Whether to run in multi-person mode
dynamic_scale	float	app mode: amplitude of the facial dynamics
face_boxes	List	List of face boxes, where each element is a BoundingBox structure
crop_size_ration	string	ve mode: crop image ratio, such as "448:448"
custom_box	BoundingBox	vikapp crop mode: custom crop box, given as a BoundingBox structure
crop_app	bool	Whether to use vikapp crop mode
no_human_face_run	bool	Whether to continue running when no human face is detected
full_image_inference	bool	Whether to process the full image
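
To show how the request parameters fit together, here is a minimal sketch of a request object. It is an illustration only, not the repository's code. The README does not define the fields of BoundingBox, so the `x1, y1, x2, y2` layout is an assumption. Only the defaults for `min_resolution` and `inference_steps` come from the table, and every other default is a placeholder. `SonicRequest` and `parse_crop_ratio` are hypothetical names.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class BoundingBox:
    # Assumed corner layout; the README does not specify the fields.
    x1: int
    y1: int
    x2: int
    y2: int


@dataclass
class SonicRequest:
    image_path: str
    audio_path: str
    output_path: str
    crop_save_path: str
    min_resolution: int = 448        # default stated in the table
    inference_steps: int = 15        # default stated in the table
    animal_signal: bool = False      # placeholder default
    pastback: bool = False           # placeholder default
    mult_people: bool = False        # placeholder default
    dynamic_scale: float = 1.0       # placeholder default
    face_boxes: List[BoundingBox] = field(default_factory=list)
    crop_size_ration: str = "448:448"
    custom_box: Optional[BoundingBox] = None
    crop_app: bool = False           # placeholder default
    no_human_face_run: bool = False  # placeholder default
    full_image_inference: bool = False  # placeholder default


def parse_crop_ratio(text):
    """Split a ratio string such as '448:448' into a pair of ints."""
    first, second = text.split(":")
    return int(first), int(second)


request = SonicRequest(
    image_path="examples/face.png",   # illustrative paths
    audio_path="examples/speech.wav",
    output_path="res_path",
    crop_save_path="tmp_path/crop.png",
    face_boxes=[BoundingBox(10, 20, 210, 260)],
)
payload = asdict(request)  # plain dict, nested BoundingBox values included
```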

3.4 Response parameters

Parameter	Type	Description
output_video_path	string	Output video path or 'false'
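
Because output_video_path carries either a path or the string 'false', callers need to tell the two apart before using the value. A minimal sketch, assuming 'false' means that no video was produced (the README does not say so explicitly). `resolve_output` is a hypothetical helper.

```python
from typing import Optional


def resolve_output(output_video_path: str) -> Optional[str]:
    """Return the video path, or None when the response holds the string 'false'."""
    if output_video_path.strip().lower() == "false":
        return None
    return output_video_path


video = resolve_output("res_path/result.mp4")  # illustrative path
if video is None:
    print("no video was produced")
else:
    print("video written to", video)
```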

3.5 What's New

Reduced memory usage: a result video is now saved every 500 frames. The interval is adjustable through each_process_video_frames_number in Sonic_full.
Added processing options for portraits and animals: face_boxes, crop_size_ration, no_human_face_run, custom_box and crop_app. See the table in 3.3 Request parameters for a description of each.
Added image format detection and conversion.
Implemented full image inference.
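
The idea of saving a result video every 500 frames can be pictured as splitting the frame range into fixed-size chunks, so that only one chunk is held in memory at a time. The sketch below illustrates that splitting under this assumption and is not the code in Sonic_full. `frame_chunks` is a hypothetical helper, and its `chunk_size` argument plays the role of each_process_video_frames_number.

```python
def frame_chunks(total_frames, chunk_size=500):
    """Return (start, end) frame ranges, end exclusive, of at most chunk_size frames."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, total_frames))
        for start in range(0, total_frames, chunk_size)
    ]


# Each range would be rendered and written out before the next one starts.
for start, end in frame_chunks(1200):
    print(f"frames {start}..{end - 1}")
```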

(4) Algorithm information

4.1 Device information

CPU	GPU
Intel Core i7-13700KF	NVIDIA GeForce RTX 4090

4.2 Runtime information

Model	Resolution	Steps	Audio duration	Processing time	GPU memory
	448 × 512	15	10 s	145 s	17 GB
	1920 × 1080	15	15 s	165 s	17 GB
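
The two measurements above can be normalized to processing seconds per second of audio, which makes them easier to compare. A small sketch using only the numbers from the table (`seconds_per_audio_second` is a hypothetical helper):

```python
# Figures copied from the runtime table above.
RUNS = [
    {"resolution": "448 x 512", "audio_s": 10, "process_s": 145},
    {"resolution": "1920 x 1080", "audio_s": 15, "process_s": 165},
]


def seconds_per_audio_second(run):
    """Processing time divided by audio duration."""
    return run["process_s"] / run["audio_s"]


for run in RUNS:
    ratio = seconds_per_audio_second(run)
    print(f'{run["resolution"]}: {ratio:.1f} s of processing per second of audio')
```

With these figures the ratios are 14.5 and 11.0 respectively.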
