Fix height/width swap and other bugs in dataset utils #910

Open
Mr-Neutr0n wants to merge 1 commit into hpcaitech:main from Mr-Neutr0n:fix/dataset-utils-bugs

@Mr-Neutr0n

Summary

Found a few bugs in opensora/datasets/utils.py while reading through the dataset code:

  • rescale_image_by_path() and rescale_video_by_path(): These pass (width, height) to transforms.Resize(), but torchvision expects the size argument as (height, width). This causes images/videos to be resized with swapped dimensions.

  • rand_size_crop_arr(): The w_start calculation bounds the random horizontal crop offset with height instead of width. Depending on the aspect ratio, the crop can either run past the image's right edge or be limited to a narrower range of offsets than the image allows.

  • download_url(): Passes encoding="utf-8" to open() in binary write mode ("wb"). Binary mode doesn't accept an encoding argument, so the call raises a ValueError at runtime.
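
The first bullet comes down to an ordering convention: callers often hold a size as (width, height), while the size sequence passed to transforms.Resize() is read as (height, width). A minimal sketch of the conversion follows. Note that to_resize_size is a hypothetical helper used only for illustration; it is not a function in opensora/datasets/utils.py, and the actual fix may simply reorder the tuple inline.

```python
from typing import Tuple


def to_resize_size(width: int, height: int) -> Tuple[int, int]:
    """Return the (height, width) pair that torchvision's Resize expects."""
    return (height, width)


# Hypothetical usage; the Resize call is left as a comment so the sketch
# runs without torchvision installed:
#   transforms.Resize(to_resize_size(width=640, height=360))
size = to_resize_size(width=640, height=360)
print(size)  # (360, 640)
```

Using keyword arguments at the call site makes it harder to swap the two values by accident.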

Changes

  • Swap (width, height) to (height, width) in transforms.Resize() calls in both rescale_image_by_path and rescale_video_by_path
  • Replace height with width in the w_start boundary calculation in rand_size_crop_arr
  • Remove the invalid encoding="utf-8" argument from the binary open() call in download_url
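
The download_url change can be checked in isolation with the standard library alone. The sketch below is an assumption-laden illustration, not the code from this PR: encoding_rejected_in_binary_mode and write_chunks are hypothetical names, and the real download_url presumably streams chunks from an HTTP response rather than from a list.

```python
def encoding_rejected_in_binary_mode(path: str) -> bool:
    """Return True if open() refuses an encoding argument in binary mode."""
    try:
        # Binary mode takes no encoding, so this open() call raises ValueError.
        with open(path, "wb", encoding="utf-8"):
            pass
    except ValueError:
        return True
    return False


def write_chunks(path: str, chunks) -> int:
    """Write an iterable of bytes objects to path and return the byte count."""
    written = 0
    with open(path, "wb") as f:  # binary mode, so no encoding argument
        for chunk in chunks:
            written += f.write(chunk)
    return written
```

The encoding argument only makes sense for text mode, where Python has to translate between str and bytes; downloaded payloads are already bytes.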

Test plan

  • Verified each fix by reading the relevant torchvision and Python docs
  • Run dataset preprocessing pipeline with image/video rescaling to confirm correct output dimensions
  • Run random size crop augmentation to confirm crops stay within bounds
  • Run URL download to confirm no runtime error
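
For the crop item in the test plan, one possible bounds check is sketched below. It assumes the vertical offset is drawn from [0, image height - crop height] and the horizontal offset from [0, image width - crop width]. The names random_crop_box and crop_in_bounds are hypothetical and do not reproduce the real rand_size_crop_arr, which operates on image arrays.

```python
import random


def random_crop_box(img_h, img_w, crop_h, crop_w, rng=None):
    """Pick a random crop box (top, left, bottom, right) inside the image."""
    rng = rng or random.Random()
    if crop_h > img_h or crop_w > img_w:
        raise ValueError("crop size must not exceed image size")
    h_start = rng.randint(0, img_h - crop_h)  # vertical offset bounded by heights
    w_start = rng.randint(0, img_w - crop_w)  # horizontal offset bounded by widths
    return h_start, w_start, h_start + crop_h, w_start + crop_w


def crop_in_bounds(box, img_h, img_w):
    """Check that a (top, left, bottom, right) box lies inside the image."""
    top, left, bottom, right = box
    return 0 <= top <= bottom <= img_h and 0 <= left <= right <= img_w


# A wide image (240 x 640) is the case where mixing up height and width
# in the horizontal bound would matter most.
rng = random.Random(0)
boxes = [random_crop_box(240, 640, 200, 600, rng) for _ in range(1000)]
print(all(crop_in_bounds(b, 240, 640) for b in boxes))  # True
```

Testing with a clearly non-square image is the design choice here: on square inputs, a height/width mix-up in the bound produces the same numbers and goes unnoticed.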

- Fix rescale_image_by_path and rescale_video_by_path passing (width, height)
  to transforms.Resize(), which expects (height, width)
- Fix rand_size_crop_arr using height instead of width for w_start boundary
- Fix download_url passing encoding="utf-8" to binary write mode "wb"