
stbir_resize in resizeImage may compute a wrong result #116

Description

@lix19937

stbir_resize, as called from resizeImage, may compute a wrong result.

https://github.com/NVIDIA/TensorRT-Edge-LLM/blob/release/0.8.0/cpp/multimodal/qwenViTRunner.cpp#L438-L449

            if (doResize)
            {
                auto [resizedHeight, resizedWidth] = getResizedImageSize(image.height, image.width);
                rt::imageUtils::resizeImage(
                    image, mResizedImageHost, resizedWidth, resizedHeight, rt::imageUtils::InterpolationMode::kBICUBIC);
                formatPatch(mResizedImageHost, imageGridTHWs, imageTokenLengths, cuSeqlensData, cuSeqlensSize,
                    maxSeqLen, stream);
            }
            else
            {
                formatPatch(image, imageGridTHWs, imageTokenLengths, cuSeqlensData, cuSeqlensSize, maxSeqLen, stream);
            }

When image.height == resizedHeight && image.width == resizedWidth, execution still enters the resizeImage function:
https://github.com/NVIDIA/TensorRT-Edge-LLM/blob/release/0.8.0/cpp/runtime/imageUtils.cpp#L104-L127

    if (mode == InterpolationMode::kBICUBIC)
    {
        stbir_resize(image.data(), image.width, image.height, kINPUT_STRIDE_BYTES, resizedImage.data(), newWidth,
            newHeight, kOUTPUT_STRIDE_BYTES, STBIR_RGB, STBIR_TYPE_UINT8, STBIR_EDGE_CLAMP, STBIR_FILTER_CATMULLROM);
    }

In that case the output can be wrong or unstable even when the input is fixed; the STBIR_FILTER_CATMULLROM filter can cause data fluctuations.
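As background on the filter named above: the Catmull-Rom kernel has negative lobes, so real resampling can overshoot, but at integer sample offsets its ideal weights are 1 at the center and 0 elsewhere. Skipping the call when the sizes already match should therefore not lose anything. Below is a minimal sketch of the kernel, assuming the standard Keys cubic with a = -0.5. `catmullRomWeight` is a hypothetical helper for illustration, not an stb function, and I have not verified that stb's internal arithmetic reproduces these exact weights.

```cpp
#include <cmath>

// Hypothetical helper (not part of stb or TensorRT-Edge-LLM):
// Catmull-Rom cubic kernel, i.e. the Keys cubic with a = -0.5.
// x is the distance from the output sample center to an input sample center.
double catmullRomWeight(double x)
{
    x = std::fabs(x);
    if (x < 1.0)
    {
        // Inner segment: equals 1 at x == 0.
        return 1.5 * x * x * x - 2.5 * x * x + 1.0;
    }
    if (x < 2.0)
    {
        // Outer segment: negative lobe, equals 0 at x == 1 and x == 2.
        return -0.5 * x * x * x + 2.5 * x * x - 4.0 * x + 2.0;
    }
    // Outside the 4-tap support.
    return 0.0;
}
```

At half-pixel offsets the weights are 0.5625 and -0.0625; the negative value is where overshoot comes from when the image really is resampled.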


So how to fix it?
A suggested change, for reference:

            if (doResize)
            {
                auto [resizedHeight, resizedWidth] = getResizedImageSize(image.height, image.width);
                if (resizedHeight != image.height || resizedWidth != image.width)
                {
                    rt::imageUtils::resizeImage(
                        image, mResizedImageHost, resizedWidth, resizedHeight, rt::imageUtils::InterpolationMode::kBICUBIC);
                    formatPatch(mResizedImageHost, imageGridTHWs, imageTokenLengths, cuSeqlensData, cuSeqlensSize,
                        maxSeqLen, stream);
                }
                else
                {
                    formatPatch(image, imageGridTHWs, imageTokenLengths, cuSeqlensData, cuSeqlensSize, maxSeqLen, stream);
                }
            }
            else
            {
                formatPatch(image, imageGridTHWs, imageTokenLengths, cuSeqlensData, cuSeqlensSize, maxSeqLen, stream);
            }
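The same guard can be factored out so that formatPatch is called in only one place. This is a rough sketch, not the project's code: `Image`, `needsResize` and `selectPatchSource` are hypothetical names, and the resize step is passed in as a callback, so nothing is assumed about the real rt::imageUtils types beyond the (image, output, width, height) argument order shown in the snippets above.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for the project's image type (assumption for this sketch).
struct Image
{
    int32_t height{0};
    int32_t width{0};
    std::vector<uint8_t> pixels;
};

// True only when the target size differs from the source size.
bool needsResize(Image const& image, int32_t targetHeight, int32_t targetWidth)
{
    return image.height != targetHeight || image.width != targetWidth;
}

// Callback signature mirrors the argument order used in the issue:
// (source, destination, newWidth, newHeight).
using ResizeFn = std::function<void(Image const&, Image&, int32_t, int32_t)>;

// Returns the image that should be handed to patch formatting:
// the original when no resampling is needed, otherwise the resized scratch buffer.
Image const& selectPatchSource(Image const& image, Image& resizedScratch, int32_t targetHeight,
    int32_t targetWidth, ResizeFn const& resizeFn)
{
    if (!needsResize(image, targetHeight, targetWidth))
    {
        // Same dimensions: skip the filter entirely and keep the original bytes.
        return image;
    }
    resizeFn(image, resizedScratch, targetWidth, targetHeight);
    return resizedScratch;
}
```

With something like this, the caller would pass the returned reference to formatPatch once, instead of repeating the call in each branch.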
