stbir_resize in resizeImage may produce incorrect results
https://github.com/NVIDIA/TensorRT-Edge-LLM/blob/release/0.8.0/cpp/multimodal/qwenViTRunner.cpp#L438-L449
if (doResize)
{
    auto [resizedHeight, resizedWidth] = getResizedImageSize(image.height, image.width);
    rt::imageUtils::resizeImage(
        image, mResizedImageHost, resizedWidth, resizedHeight, rt::imageUtils::InterpolationMode::kBICUBIC);
    formatPatch(mResizedImageHost, imageGridTHWs, imageTokenLengths, cuSeqlensData, cuSeqlensSize,
        maxSeqLen, stream);
}
else
{
    formatPatch(image, imageGridTHWs, imageTokenLengths, cuSeqlensData, cuSeqlensSize, maxSeqLen, stream);
}
When image.height == resizedHeight && image.width == resizedWidth, the code still calls the resizeImage function:
https://github.com/NVIDIA/TensorRT-Edge-LLM/blob/release/0.8.0/cpp/runtime/imageUtils.cpp#L104-L127
if (mode == InterpolationMode::kBICUBIC)
{
    stbir_resize(image.data(), image.width, image.height, kINPUT_STRIDE_BYTES, resizedImage.data(), newWidth,
        newHeight, kOUTPUT_STRIDE_BYTES, STBIR_RGB, STBIR_TYPE_UINT8, STBIR_EDGE_CLAMP, STBIR_FILTER_CATMULLROM);
}
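In this case the output is wrong or unstable even though the input is identical on every run. The STBIR_FILTER_CATMULLROM filter appears to make the pixel values fluctuate, even though no resampling is needed when the source and target sizes match.
How can this be fixed?
One possible fix is to skip the resize when the dimensions are unchanged: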
if (doResize)
{
    auto [resizedHeight, resizedWidth] = getResizedImageSize(image.height, image.width);
    if (resizedHeight != image.height || resizedWidth != image.width)
    {
        rt::imageUtils::resizeImage(
            image, mResizedImageHost, resizedWidth, resizedHeight, rt::imageUtils::InterpolationMode::kBICUBIC);
        formatPatch(mResizedImageHost, imageGridTHWs, imageTokenLengths, cuSeqlensData, cuSeqlensSize,
            maxSeqLen, stream);
    }
    else
    {
        formatPatch(image, imageGridTHWs, imageTokenLengths, cuSeqlensData, cuSeqlensSize, maxSeqLen, stream);
    }
}
else
{
    formatPatch(image, imageGridTHWs, imageTokenLengths, cuSeqlensData, cuSeqlensSize, maxSeqLen, stream);
}
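An alternative would be to put the guard inside the resize helper itself, so that every caller gets the identity shortcut. The following is only a minimal sketch of that idea and is not taken from the repository: SimpleImage, needsResize, and resizeOrCopy are hypothetical names, the image is assumed to be tightly packed 8-bit RGB, and the actual resampler (for example a call to stbir_resize) is passed in as a callback so the sketch stays self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the runtime's image type (tightly packed 8-bit RGB).
struct SimpleImage
{
    int width{0};
    int height{0};
    std::vector<std::uint8_t> pixels; // width * height * 3 bytes
};

// Returns true only when the target size differs from the source size.
inline bool needsResize(int srcWidth, int srcHeight, int dstWidth, int dstHeight)
{
    return srcWidth != dstWidth || srcHeight != dstHeight;
}

// Hypothetical wrapper: copies the pixels verbatim when the size is unchanged,
// otherwise allocates the destination and delegates to the supplied resampler.
template <typename ResampleFn>
void resizeOrCopy(SimpleImage const& src, SimpleImage& dst, int dstWidth, int dstHeight, ResampleFn resample)
{
    assert(src.pixels.size() == static_cast<std::size_t>(src.width) * src.height * 3);

    dst.width = dstWidth;
    dst.height = dstHeight;

    if (!needsResize(src.width, src.height, dstWidth, dstHeight))
    {
        // Identity case: a bit-exact copy, no filter is applied.
        dst.pixels = src.pixels;
        return;
    }

    dst.pixels.assign(static_cast<std::size_t>(dstWidth) * dstHeight * 3, 0);
    resample(src, dst); // e.g. a lambda that wraps the real resize call
}
```

With this shape, the caller in qwenViTRunner.cpp could keep its original structure, at the cost of one extra copy in the identity case; the caller-side check shown above avoids that copy.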