Fix height/width swap and other bugs in dataset utils #910
Open
Mr-Neutr0n wants to merge 1 commit into hpcaitech:main from
- Fix rescale_image_by_path and rescale_video_by_path passing (width, height) to transforms.Resize(), which expects (height, width)
- Fix rand_size_crop_arr using height instead of width for w_start boundary
- Fix download_url passing encoding="utf-8" to binary write mode "wb"
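To make the size-order point concrete, here is a minimal sketch rather than the code in opensora/datasets/utils.py. The helper names to_resize_size and resized_shape are hypothetical and exist only for illustration. The only library behavior relied on is that transforms.Resize takes its size as (height, width).

```python
def to_resize_size(width: int, height: int) -> tuple[int, int]:
    """Return the size in the (height, width) order that transforms.Resize expects.

    Hypothetical helper: callers that think in (width, height) can use it
    to avoid passing the tuple in the wrong order.
    """
    return (height, width)


def resized_shape(width: int, height: int) -> tuple[int, ...]:
    """Resize a dummy frame and return the resulting tensor shape.

    Requires torch and torchvision; they are imported here so the helper
    above stays usable without them.
    """
    import torch
    from torchvision import transforms

    frame = torch.zeros(3, 48, 64)  # (channels, height, width)
    resize = transforms.Resize(to_resize_size(width, height))
    return tuple(resize(frame).shape)


if __name__ == "__main__":
    # Requesting width=32, height=16 should give a (3, 16, 32) tensor,
    # i.e. (channels, height, width).
    print(resized_shape(width=32, height=16))
```

Passing (width, height) directly would instead produce a frame 32 pixels tall and 16 pixels wide, which is the swap this commit addresses.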
Summary
Found a few bugs in opensora/datasets/utils.py while reading through the dataset code:

- rescale_image_by_path() and rescale_video_by_path(): These pass (width, height) to transforms.Resize(), but torchvision expects the size argument as (height, width). As a result, images and videos are resized with swapped dimensions.
- rand_size_crop_arr(): The w_start calculation uses height instead of width to bound the random horizontal crop offset. The crop window along the width axis is therefore constrained by the height value, which can produce out-of-bounds crops or overly restricted cropping, depending on the aspect ratio.
- download_url(): Passes encoding="utf-8" to open() in binary write mode ("wb"). Binary mode does not accept an encoding argument, so this raises a ValueError at runtime.

Changes

- Changed (width, height) to (height, width) in the transforms.Resize() calls in both rescale_image_by_path and rescale_video_by_path
- Replaced height with width in the w_start boundary calculation in rand_size_crop_arr
- Removed the encoding="utf-8" argument from the binary open() call in download_url

Test plan
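One way the w_start fix could be checked, sketched under assumptions: random_crop_origin below is a hypothetical stand-in and not the actual rand_size_crop_arr, whose signature is not shown here. It draws each offset from a range bounded by the matching dimension, and the loop verifies that the crop window always stays inside a wide image.

```python
import random


def random_crop_origin(height, width, crop_h, crop_w, rng=random):
    """Pick a random top-left corner so the crop fits inside the image.

    Hypothetical stand-in for the offset logic: h_start is bounded by
    height and w_start by width.
    """
    if crop_h > height or crop_w > width:
        raise ValueError("crop size exceeds image size")
    h_start = rng.randint(0, height - crop_h)  # inclusive upper bound
    w_start = rng.randint(0, width - crop_w)   # bounded by width, not height
    return h_start, w_start


if __name__ == "__main__":
    rng = random.Random(0)
    height, width = 100, 300   # wide image
    crop_h, crop_w = 80, 200
    # Bounding w_start by height here would give 100 - 200 = -100,
    # an empty range, so the aspect ratio matters.
    for _ in range(1000):
        h_start, w_start = random_crop_origin(height, width, crop_h, crop_w, rng)
        assert h_start + crop_h <= height
        assert w_start + crop_w <= width
    print("all crops in bounds")
```

A non-square image is the useful case, since with height equal to width the two bounds coincide and the bug stays hidden.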