Skip to content

bugfix: ensure tensor contiguous layout before protobuf serialization.#1468

Open
a120092009 wants to merge 2 commits into
jd-opensource:mainfrom
a120092009:fix/flux-mlu
Open

bugfix: ensure tensor contiguous layout before protobuf serialization.#1468
a120092009 wants to merge 2 commits into
jd-opensource:mainfrom
a120092009:fix/flux-mlu

Conversation

@a120092009
Copy link
Copy Markdown
Collaborator

NHWC (channels_last) tensors pass raw bytes directly into the proto buffer when serialized via torch_to_proto. On deserialization the reconstructed tensor assumes NCHW (contiguous) layout, which corrupts the channel order for bfloat16 and float16 image data. Force a contiguous copy before reading data_ptr() so the byte stream always matches the expected row-major order.

NHWC (channels_last) tensors pass raw bytes directly into the proto
buffer when serialized via torch_to_proto. On deserialization the
reconstructed tensor assumes NCHW (contiguous) layout, which corrupts
the channel order for bfloat16 and float16 image data. Force a
contiguous copy before reading data_ptr() so the byte stream always
matches the expected row-major order.
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request ensures that tensors are contiguous before serializing raw bytes in torch_to_proto, specifically for kBFloat16 and kHalf types. The review feedback correctly identifies that the current implementation performs a redundant .contiguous() call for all tensor types at the top of the function, which can lead to performance regressions. Additionally, the feedback points out violations of the repository style guide regarding the use of auto for torch::Tensor and the placement of variable declarations. It is recommended to move the contiguity check locally within the specific switch cases where it is required.

Comment thread xllm/core/util/utils.cpp Outdated
Comment thread xllm/core/util/utils.cpp
Comment thread xllm/core/util/utils.cpp
…ranches.

Avoid redundant contiguous() for types where set_data_to_contents already
handles the conversion, and use explicit torch::Tensor type per style guide.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants