
RFC: Segment aware vm addressing #480

Open
rjmansfield wants to merge 2 commits into google:main from rjmansfield:segment-aware-vm-addressing
@rjmansfield
Contributor

Bloaty currently cannot correctly analyze Mach-O universal binaries. When processing a universal binary containing multiple architectures (e.g. arm64 and x86_64), each architecture slice should have its own virtual address space. However, Bloaty's current implementation uses a single flat address space, so the slices' overlapping addresses conflict and produce incorrect results. As suggested in #153 (comment), this adds a segment id and updates the logic to handle multiple address spaces.

The bulk of the changes introduces a VMAddr structure, which contains a segment identifier and an address. The more challenging and complex changes are to ComputeRollup, which required relaxing the logic for secondary maps to accommodate things like padding or gaps (these maps trigger asserts in the previous code). With the segment infrastructure changes in place, the Mach-O changes were fairly straightforward.
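To illustrate the idea, here is a minimal sketch of a segment-qualified address and of why a flat key collides. The field names, the ordering operators, and the helper functions `FlatEntryCount` and `SegmentedEntryCount` are assumptions made for illustration; they are not the definitions in this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// Hypothetical sketch: field names are assumptions, not the patch's code.
struct VMAddr {
  uint32_t segment_id;  // which address space (e.g. one per architecture slice)
  uint64_t address;     // virtual address within that address space

  // Order by segment first, then by address, so that all entries of one
  // segment are contiguous in an ordered container.
  friend bool operator<(const VMAddr& a, const VMAddr& b) {
    return std::tie(a.segment_id, a.address) <
           std::tie(b.segment_id, b.address);
  }
  friend bool operator==(const VMAddr& a, const VMAddr& b) {
    return a.segment_id == b.segment_id && a.address == b.address;
  }
};

// Flat address space: two slices that start at the same virtual address
// share one key, so the second insert overwrites the first.
inline std::size_t FlatEntryCount() {
  std::map<uint64_t, std::string> flat;
  flat[0x100000000] = "arm64 __TEXT";
  flat[0x100000000] = "x86_64 __TEXT";
  return flat.size();
}

// Segment-aware keys: the same virtual address in two segments yields two
// distinct keys, so both entries survive.
inline std::size_t SegmentedEntryCount() {
  std::map<VMAddr, std::string> segmented;
  segmented[VMAddr{0, 0x100000000}] = "arm64 __TEXT";
  segmented[VMAddr{1, 0x100000000}] = "x86_64 __TEXT";
  return segmented.size();
}
```

In this sketch the flat map ends up with a single entry while the segment-aware map keeps both, which is the conflict described above in miniature.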

@haberman Does this approach seem reasonable to you? I hope my understanding of the rollup algorithm and the changes are correct. All of the existing tests are passing for me locally but it's possible I've missed something, so any feedback would be appreciated.

This patch introduces the infrastructure for segment aware virtual
memory addressing, enabling support for universal Mach-O binaries
where multiple address spaces need to coexist.

The relaxations in ComputeRollup are necessary because with
segment separated address spaces, secondary maps can have gaps
when switching between segments.
This lets Bloaty correctly analyze universal binaries where each architecture has its
own address space.
 // RangeMap maps
 //
-//   [uint64_t, uint64_t) -> std::string, [optional other range base]
+//   [VMAddr, uint64_t) -> std::string, [optional other range base]
Member


I think it would be simpler to make RangeMap mostly unaware of segments, and instead make the container holding the RangeMap contain a map of segment_id -> RangeMap.

I think there is only one place where RangeMap needs to know about segments, and that is other_start, which will indeed need to know which segment the other start belongs to (if the "other" domain is the file, then the segment will always be 0).
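A rough sketch of what this suggestion could look like follows. `SimpleRangeMap` is a deliberately simplified stand-in, not Bloaty's actual RangeMap, and `SegmentedMap`, `ForSegment`, `other_segment`, and `BuildExample` are hypothetical names; the addresses and file offsets are made up for illustration. Overlap handling is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Simplified, segment-unaware range map (NOT Bloaty's RangeMap). The only
// segment-related state is which segment the "other" start belongs to.
class SimpleRangeMap {
 public:
  struct Entry {
    uint64_t size;
    std::string label;
    uint32_t other_segment;  // segment of the translated ("other") address
    uint64_t other_start;    // start of the range in the "other" domain
  };

  void AddRange(uint64_t addr, uint64_t size, std::string label,
                uint32_t other_segment, uint64_t other_start) {
    entries_[addr] = Entry{size, std::move(label), other_segment, other_start};
  }

  // Returns the entry whose range [start, start + size) covers addr,
  // or nullptr if no range covers it.
  const Entry* Find(uint64_t addr) const {
    auto it = entries_.upper_bound(addr);
    if (it == entries_.begin()) return nullptr;
    --it;
    if (addr - it->first >= it->second.size) return nullptr;
    return &it->second;
  }

 private:
  std::map<uint64_t, Entry> entries_;
};

// Hypothetical container: one segment-unaware map per segment id.
class SegmentedMap {
 public:
  SimpleRangeMap& ForSegment(uint32_t segment_id) { return maps_[segment_id]; }

  const SimpleRangeMap::Entry* Find(uint32_t segment_id, uint64_t addr) const {
    auto it = maps_.find(segment_id);
    return it == maps_.end() ? nullptr : it->second.Find(addr);
  }

 private:
  std::map<uint32_t, SimpleRangeMap> maps_;
};

// Two slices mapped at the same virtual address, each translating into the
// file domain, which always uses segment 0.
inline SegmentedMap BuildExample() {
  SegmentedMap vm_map;
  vm_map.ForSegment(0).AddRange(0x100000000, 0x1000, "arm64 __TEXT", 0, 0x4000);
  vm_map.ForSegment(1).AddRange(0x100000000, 0x1000, "x86_64 __TEXT", 0,
                                0x10000);
  return vm_map;
}
```

With this layout, a lookup first selects the per-segment map and then proceeds exactly as in the flat case, so the range logic itself does not need to change.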
