-
Notifications
You must be signed in to change notification settings - Fork 41
[AIE2PS] Fix alignment of 64-bit spills and improve CSR spills of L registers #1001
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
5107a0e
ba81d31
5bcacc0
20545cc
cac0f5e
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -85,13 +85,85 @@ void AIE2PSFrameLowering::determineCalleeSaves(MachineFunction &MF, | |
| BitVector &SavedRegs, | ||
| RegScavenger *RS) const { | ||
| TargetFrameLowering::determineCalleeSaves(MF, SavedRegs, RS); | ||
| Register FPReg = STI.getRegisterInfo()->getFrameRegister(MF); | ||
| const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo(); | ||
| const MachineFrameInfo &MFI = MF.getFrameInfo(); | ||
|
|
||
| // If there is a frame pointer (dynamic stack allocation), p7 will be used as | ||
| // a frame pointer. The register allocator will not be able to see the | ||
| // When both L registers and their sub-GPRs are in the CSR list, we need to | ||
| // decide whether to save as L register or individual GPRs. | ||
| // | ||
| // Strategy: | ||
| // - If only one GPR of the pair is used: save just that GPR | ||
| // - If both GPRs are used AND function has calls: use L register save | ||
| // (stack spill is required, 1 L spill is more efficient than 2 GPR spills) | ||
| // - If both GPRs are used AND no calls: use individual GPR saves | ||
| // (allows GPR-to-GPR spilling via scratch registers) | ||
| // Build the list of callee-saved L registers from the callee-saved regs | ||
| // provided by CSR list. | ||
| SmallVector<MCPhysReg, 4> CalleeSavedLRegs; | ||
| const MCPhysReg *CSRegs = MF.getRegInfo().getCalleeSavedRegs(); | ||
| for (unsigned I = 0; CSRegs[I]; ++I) { | ||
| MCPhysReg Reg = CSRegs[I]; | ||
| if (AIE2PS::eLRegClass.contains(Reg)) | ||
| CalleeSavedLRegs.push_back(Reg); | ||
| } | ||
|
|
||
| for (MCPhysReg LReg : CalleeSavedLRegs) { | ||
| // Get the two GPR subregisters of this L register | ||
| MCPhysReg EvenGPR = TRI->getSubReg(LReg, AIE2PS::sub_l_even); | ||
| MCPhysReg OddGPR = TRI->getSubReg(LReg, AIE2PS::sub_l_odd); | ||
|
|
||
| // Check what's marked for saving by the base determineCalleeSaves. | ||
| // This already reflects which registers are actually clobbered. | ||
| const bool LRegMarked = SavedRegs.test(LReg); | ||
| const bool EvenMarked = SavedRegs.test(EvenGPR); | ||
| const bool OddMarked = SavedRegs.test(OddGPR); | ||
|
|
||
| if (!LRegMarked && !EvenMarked && !OddMarked) | ||
| continue; | ||
|
|
||
| SavedRegs.reset(EvenGPR); | ||
| SavedRegs.reset(OddGPR); | ||
| SavedRegs.reset(LReg); | ||
|
|
||
| assert((!(EvenMarked || OddMarked) || LRegMarked) && | ||
| "sub-reg mark without L pair mark violates invariant"); | ||
|
|
||
| // Determine if both subregisters actually need saving. | ||
| // LRegMarked alone doesn't mean both - check individual GPR marks. | ||
| const bool BothNeeded = | ||
| (EvenMarked && OddMarked) || (LRegMarked && !EvenMarked && !OddMarked); | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we have both L and R at the same time? Can we assert?
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The check
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It means, there could be cases L register and corresponding R register marked at the same time, right?
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Exactly, but only one of them. This is a big limitation of the generic determineCalleeSaves as it will happily mark all super registers for saving even if only of their subregisters is actually used. So far this has not been an issue as our CSR list only contained atomic registers.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nit: You could add an assert |
||
|
|
||
| // When there is calls we mark the L register so that we get a single | ||
| // spill instead of 2. When there are no calls, we prefer marking the | ||
| // subregisters since they can be copied to non CSR registers instead of | ||
| // spilled to memory (There is no move instruction between L registers). | ||
| // For the call case we have no choice but to spill anyway since we don't | ||
| // know which registers the callee is going to use. | ||
| if (BothNeeded) { | ||
| // Both subregisters need saving. | ||
| if (MFI.hasCalls()) { | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don`t fully understand why we have to do special handling with a call in the function, since we are working on callee saved registers. Would be nice to include a comment for future refernce.
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As discussed offline, when there is calls we mark the L register so that we get a single spill instead of 2. When there are no calls, we prefer marking the subregsiters since they can be copied to non CSR registers instead of spilled to memory (There is no move instruction between L registers). For the call case we have no choice but to spill anyway since we don't know which registers the callee is going to use. I'll add this as a comment. |
||
| // Use L register save. Stack spill is required (scratch regs | ||
| // clobbered by calls), so 1 L spill is more efficient than 2 GPR | ||
| // spills. | ||
| SavedRegs.set(LReg); | ||
|
khallouh marked this conversation as resolved.
|
||
| } else { | ||
| // No calls: use individual GPRs for GPR-to-GPR copy. | ||
| SavedRegs.set(EvenGPR); | ||
| SavedRegs.set(OddGPR); | ||
| } | ||
| } else if (EvenMarked) { | ||
| SavedRegs.set(EvenGPR); | ||
| } else { | ||
| assert(OddMarked); | ||
| SavedRegs.set(OddGPR); | ||
| } | ||
| } | ||
| // If there is a frame pointer (dynamic stack allocation), p7 will be used | ||
| // as a frame pointer. The register allocator will not be able to see the | ||
| // redefinition of p7 as the prologue and the epilogue are emitted after the | ||
| // register allocation. Thus, we make sure to spill p7 at the beginning of the | ||
| // function body and restore it at the end by adding it in SavedRegs. | ||
| // register allocation. Thus, we make sure to spill p7 at the beginning of | ||
| // the function body and restore it at the end by adding it in SavedRegs. | ||
| const Register FPReg = TRI->getFrameRegister(MF); | ||
| if (hasFP(MF)) | ||
| SavedRegs.set(FPReg); | ||
| } | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.