Skip to content

hipfile: Allow buffer offsets when issuing IO to an unregistered buffer#177

Draft
kurtmcmillan wants to merge 6 commits intodevelopfrom
kumcmill/tempbuffer-size
Draft

hipfile: Allow buffer offsets when issuing IO to an unregistered buffer#177
kurtmcmillan wants to merge 6 commits intodevelopfrom
kumcmill/tempbuffer-size

Conversation

@kurtmcmillan
Copy link
Collaborator

When hipfile creates a temporary buffer to handle an IO the IO size is passed into Buffer's constructor as the buffer's size. Before issuing the IO, the backends (fastpath/fallback) check if the IO would overflow the buffer. This check fails with temporary buffers because buffer_size < io_size + offset when 0 < offset.

There are two options here.

1. hipfile does not store the size of the buffer in the Buffer object. This option
relies on the bounds checking performed by the THUNK to ensure that the IO
operation fits within the bounds of the buffer. This makes IO with an
unregistered buffer a little faster because hipFile does not need to call
hipMemGetAdressRange on every IO. However, if the THUNK detects that an IO
would overflow the buffer the error is lost as the error propagates back up
the stack and hipfile can only return -hipFileDriverError or
-hipfileInternalError.

2. hipfile stores the size of the buffer in the Buffer object. hipFile performs
its own bounds checking on each IO. This allows hipfile to return
-hipfileInvalidValue when an IO would overflow the buffer. The downsides are IO
to unregistered buffers now have to call hipMemGetAdressRange() on every IO and
the bounds check duplicates the check done in the THUNK.

hipfile will go with the second option because (1) IO with
unregistered buffers is expected to e slower than with registered buffers and
(2) it is better to give clients a more precise error (-hipfileInvalidValue).
The IO path checks that IO will not overflow. It is not clear what value this
check provides.
…dBuffer()

Rename getBuffer(ptr) to getRegisteredBuffer() now that
BufferMap::getBuffer(ptr, length, flags) no longer uses length nor flags.

This rename is necessary as a future commit will remove the unused parameters.
…fer()

Rename getBuffer(ptr) to getRegisteredBuffer() now that
BufferMap::getBuffer(ptr, length, flags) no longer uses length nor flags.

This rename is necessary as a future commit will remove the unused parameters.
Copilot AI review requested due to automatic review settings January 30, 2026 21:33
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes an issue where hipfile would incorrectly reject IO operations to unregistered buffers when a non-zero buffer offset was specified. The fix simplifies the buffer handling logic by removing the length parameter from buffer lookup functions and having temporary buffers automatically determine their size by querying the HIP runtime.

Changes:

  • Introduced getRegisteredBuffer() method to explicitly retrieve only registered buffers
  • Modified getBuffer() to automatically determine buffer size via HIP runtime queries instead of requiring a length parameter
  • Removed validation that rejected temporary buffers with offsets by eliminating the size comparison check

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
test/system/io.cpp Added new test cases verifying IO operations with buffer offsets work correctly and fail appropriately on overflow
test/amd_detail/state_mt.cpp Updated test calls to use new getRegisteredBuffer/getBuffer API without length parameters
test/amd_detail/mstate.h Updated mock interface to reflect new API signatures
test/amd_detail/mbuffer.h Updated mock interface to reflect new API signatures
test/amd_detail/hipfile-api.cpp Removed test for invalid buffer length validation and updated mock expectations
test/amd_detail/fallback.cpp Updated test calls to use getRegisteredBuffer for explicitly registered buffers
test/amd_detail/buffer.cpp Updated tests to use getRegisteredBuffer, removed test for length validation, and added test for temporary buffer creation
test/CMakeLists.txt Added new io.cpp test file to build
src/amd_detail/state.h Updated method signatures to remove length/flags parameters and added getRegisteredBuffer
src/amd_detail/state.cpp Renamed getBuffer to getRegisteredBuffer and simplified getBuffer implementation
src/amd_detail/hipfile.cpp Updated call to getFileAndBuffer without length parameter
src/amd_detail/buffer.h Added new Buffer constructor and updated BufferMap method signatures
src/amd_detail/buffer.cpp Implemented new Buffer constructor that queries HIP runtime for size, simplified getBuffer logic
src/amd_detail/batch/batch.cpp Updated call to getFileAndBuffer without length parameter

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

gpu_id = _attrs.device;

HipMemAddressRange alp{Context<Hip>::get()->hipMemGetAddressRange(buffer)};
length = alp.size - (reinterpret_cast<uintptr_t>(alp.base) - reinterpret_cast<uintptr_t>(buffer));
Copy link

Copilot AI Jan 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pointer arithmetic expression (reinterpret_cast<uintptr_t>(alp.base) - reinterpret_cast<uintptr_t>(buffer)) should be replaced with the more readable reinterpret_cast<uintptr_t>(buffer) - reinterpret_cast<uintptr_t>(alp.base) to clarify that we're calculating the offset of buffer from the base, then add it to the size calculation as length = alp.size - (reinterpret_cast<uintptr_t>(buffer) - reinterpret_cast<uintptr_t>(alp.base)). This makes the intent clearer: the available length is the total size minus the offset into the allocation.

Suggested change
length = alp.size - (reinterpret_cast<uintptr_t>(alp.base) - reinterpret_cast<uintptr_t>(buffer));
length = alp.size - (reinterpret_cast<uintptr_t>(buffer) - reinterpret_cast<uintptr_t>(alp.base));

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant