hipfile: Allow buffer offsets when issuing IO to an unregistered buffer#177
hipfile: Allow buffer offsets when issuing IO to an unregistered buffer#177kurtmcmillan wants to merge 6 commits intodevelopfrom
Conversation
There are two options here. 1. hipfile does not store the size of the buffer in the Buffer object. This option relies on the bounds checking performed by the THUNK to ensure that the IO operation fits within the bounds of the buffer. This makes IO with an unregistered buffer a little faster because hipFile does not need to call hipMemGetAdressRange on every IO. However, if the THUNK detects that an IO would overflow the buffer the error is lost as the error propagates back up the stack and hipfile can only return -hipFileDriverError or -hipfileInternalError. 2. hipfile stores the size of the buffer in the Buffer object. hipFile performs its own bounds checking on each IO. This allows hipfile to return -hipfileInvalidValue when an IO would overflow the buffer. The downsides are IO to unregistered buffers now have to call hipMemGetAdressRange() on every IO and the bounds check duplicates the check done in the THUNK. hipfile will go with the second option because (1) IO with unregistered buffers is expected to e slower than with registered buffers and (2) it is better to give clients a more precise error (-hipfileInvalidValue).
The IO path checks that IO will not overflow. It is not clear what value this check provides.
…dBuffer() Rename getBuffer(ptr) to getRegisteredBuffer() now that BufferMap::getBuffer(ptr, length, flags) no longer uses length nor flags. This rename is necessary as a future commit will remove the unused parameters.
…fer() Rename getBuffer(ptr) to getRegisteredBuffer() now that BufferMap::getBuffer(ptr, length, flags) no longer uses length nor flags. This rename is necessary as a future commit will remove the unused parameters.
There was a problem hiding this comment.
Pull request overview
This PR fixes an issue where hipfile would incorrectly reject IO operations to unregistered buffers when a non-zero buffer offset was specified. The fix simplifies the buffer handling logic by removing the length parameter from buffer lookup functions and having temporary buffers automatically determine their size by querying the HIP runtime.
Changes:
- Introduced
getRegisteredBuffer()method to explicitly retrieve only registered buffers - Modified
getBuffer()to automatically determine buffer size via HIP runtime queries instead of requiring a length parameter - Removed validation that rejected temporary buffers with offsets by eliminating the size comparison check
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| test/system/io.cpp | Added new test cases verifying IO operations with buffer offsets work correctly and fail appropriately on overflow |
| test/amd_detail/state_mt.cpp | Updated test calls to use new getRegisteredBuffer/getBuffer API without length parameters |
| test/amd_detail/mstate.h | Updated mock interface to reflect new API signatures |
| test/amd_detail/mbuffer.h | Updated mock interface to reflect new API signatures |
| test/amd_detail/hipfile-api.cpp | Removed test for invalid buffer length validation and updated mock expectations |
| test/amd_detail/fallback.cpp | Updated test calls to use getRegisteredBuffer for explicitly registered buffers |
| test/amd_detail/buffer.cpp | Updated tests to use getRegisteredBuffer, removed test for length validation, and added test for temporary buffer creation |
| test/CMakeLists.txt | Added new io.cpp test file to build |
| src/amd_detail/state.h | Updated method signatures to remove length/flags parameters and added getRegisteredBuffer |
| src/amd_detail/state.cpp | Renamed getBuffer to getRegisteredBuffer and simplified getBuffer implementation |
| src/amd_detail/hipfile.cpp | Updated call to getFileAndBuffer without length parameter |
| src/amd_detail/buffer.h | Added new Buffer constructor and updated BufferMap method signatures |
| src/amd_detail/buffer.cpp | Implemented new Buffer constructor that queries HIP runtime for size, simplified getBuffer logic |
| src/amd_detail/batch/batch.cpp | Updated call to getFileAndBuffer without length parameter |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| gpu_id = _attrs.device; | ||
|
|
||
| HipMemAddressRange alp{Context<Hip>::get()->hipMemGetAddressRange(buffer)}; | ||
| length = alp.size - (reinterpret_cast<uintptr_t>(alp.base) - reinterpret_cast<uintptr_t>(buffer)); |
There was a problem hiding this comment.
The pointer arithmetic expression (reinterpret_cast<uintptr_t>(alp.base) - reinterpret_cast<uintptr_t>(buffer)) should be replaced with the more readable reinterpret_cast<uintptr_t>(buffer) - reinterpret_cast<uintptr_t>(alp.base) to clarify that we're calculating the offset of buffer from the base, then add it to the size calculation as length = alp.size - (reinterpret_cast<uintptr_t>(buffer) - reinterpret_cast<uintptr_t>(alp.base)). This makes the intent clearer: the available length is the total size minus the offset into the allocation.
| length = alp.size - (reinterpret_cast<uintptr_t>(alp.base) - reinterpret_cast<uintptr_t>(buffer)); | |
| length = alp.size - (reinterpret_cast<uintptr_t>(buffer) - reinterpret_cast<uintptr_t>(alp.base)); |
When hipfile creates a temporary buffer to handle an IO the IO size is passed into Buffer's constructor as the buffer's size. Before issuing the IO, the backends (fastpath/fallback) check if the IO would overflow the buffer. This check fails with temporary buffers because buffer_size < io_size + offset when 0 < offset.