| author | Quentin Colombet <qcolombet@apple.com> | 2019-11-12 16:32:12 -0800 |
|---|---|---|
| committer | Quentin Colombet <qcolombet@apple.com> | 2019-11-13 11:17:56 -0800 |
| commit | de94cda81bde8556bd847a37b0a1f83eaeceaf5b (patch) | |
| tree | 265e4cb8c92d9117ae183d9429c186d948c78516 /llvm/lib | |
| parent | 2bf9b9a5a3a4d3817e44d31579a6cd5d67907b2c (diff) | |
| download | bcm5719-llvm-de94cda81bde8556bd847a37b0a1f83eaeceaf5b.tar.gz bcm5719-llvm-de94cda81bde8556bd847a37b0a1f83eaeceaf5b.zip | |
[LiveInterval] Allow updating subranges with slightly out-dated IR
During register coalescing, we update the live-intervals on-the-fly.
To do that, we are in this strange mode where the live-intervals can
be slightly out-of-sync (more precisely, they are forward-looking)
compared to what the IR actually represents.
This happens because the register coalescer only updates the IR when
it is done updating the live-intervals, and it has to do it this way
because updating the IR on-the-fly would clobber some of the
information about what the live-ranges being updated look like.
This is problematic for updates that rely on the IR to accurately
represent the state of the live-ranges. Right now, we have only
one of those: stripValuesNotDefiningMask.
To reconcile this need with the out-of-sync IR, this patch introduces a
new argument to LiveInterval::refineSubRanges that allows the code
doing the live-range updates to reason about what the code will
look like after the coalescer has rewritten the registers.
Essentially this captures how a subregister index will be offset
to match its position in a new register class.
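For reference, the updated entry point, as it appears in the LiveInterval.cpp hunk below, looks roughly like this (a sketch of the definition's signature only, not a standalone compilable snippet):

```cpp
// ComposeSubRegIdx is the new argument: 0 keeps the old behavior (no
// composition), while a non-zero subregister index describes how the lane
// masks read off the IR must be offset once the coalescer rewrites the
// registers.
void LiveInterval::refineSubRanges(
    BumpPtrAllocator &Allocator, LaneBitmask LaneMask,
    std::function<void(LiveInterval::SubRange &)> Apply,
    const SlotIndexes &Indexes, const TargetRegisterInfo &TRI,
    unsigned ComposeSubRegIdx) {
  // ... splits the overlapping subranges and strips stale VNIs (see the diff).
}
```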
E.g., let's say we want to merge:
V1.sub1:<2 x s32> = COPY V2.sub3:<4 x s32>
We do that by choosing a class where sub1:<2 x s32> and sub3:<4 x s32>
overlap, i.e., by choosing a class where we can find "offset + 1 == 3".
Put differently, we align V2's sub3 with V1's sub1:
V2: sub0 sub1 sub2 sub3
V1: <offset>  sub0 sub1
This offset will look like a composed subregidx in the class:
V1.sub1:<2 x s32> = COPY V2.sub3:<4 x s32>
 => V1.(composed sub2 with sub1):<4 x s32> = COPY V2.sub3:<4 x s32>
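A minimal sketch of how such a composed index can be obtained through the generic TargetRegisterInfo API; `DstIdx` and `UseSubIdx` are illustrative names standing for the <offset> and sub1 above, not identifiers quoted from this patch:

```cpp
// Compose the offset the coalescer introduces for V1 (DstIdx, the "sub2"
// offset in the picture above) with the operand's own index (UseSubIdx,
// i.e. sub1) to get the index the operand will use in the wider class.
unsigned ComposedIdx = TRI->composeSubRegIndices(DstIdx, UseSubIdx);

// The same composition expressed on lane masks, which is what the
// live-range update code actually works with.
LaneBitmask ComposedMask = TRI->composeSubRegIndexLaneMask(
    DstIdx, TRI->getSubRegIndexLaneMask(UseSubIdx));
```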
Now, because we do not rewrite the uses and defs of V1 right away, all
the checks on V1 need to account for this offset to match what the live
intervals intend to capture.
Prior to this patch, we would fail to recognize the uses and defs of V1
and would end up with machine verifier errors ("No live segment at def").
This could lead to miscompiles, as we would drop some live-ranges and
thus miss some interferences.
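Concretely, the fix teaches stripValuesNotDefiningMask to shift the lane mask it reads off the not-yet-rewritten operand before comparing it against the subrange's mask. A minimal extract of that logic, mirroring the LiveInterval.cpp hunk below (`MOI` is the operand iterator used there):

```cpp
// Lane mask exactly as the (still un-rewritten) IR spells it.
LaneBitmask OrigMask = TRI.getSubRegIndexLaneMask(MOI->getSubReg());
// Shift it to where those lanes will live after the coalescer rewrites the
// register; ComposeSubRegIdx == 0 means no rewrite is pending.
LaneBitmask ExpectedDefMask =
    ComposeSubRegIdx
        ? TRI.composeSubRegIndexLaneMask(ComposeSubRegIdx, OrigMask)
        : OrigMask;
// Only operands whose shifted mask overlaps the subrange's lanes count as
// defining it; the others are skipped as before.
if ((ExpectedDefMask & LaneMask).none())
  continue;
```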
For this problem to trigger, we need to reach stripValuesNotDefiningMask
while having a mismatch between the IR and the live-ranges (i.e.,
we have to apply a subreg offset to the IR).
This requires the following three conditions:
1. An update of overlapping subreg lanes: e.g., dsub0 == <ssub0, ssub1>
2. An update involving tuple registers with a possibility to coalesce the
subreg index: e.g., v1.dsub_1 == v2.dsub_3
3. Subreg liveness enabled.
Condition #1 is what forces the subranges to be refined, which requires
looking at the IR to decide what is alive and what is not, i.e., calling
stripValuesNotDefiningMask. Condition #2 is what introduces the subreg
offset between the IR and the information the coalescer maintains for
the live-ranges.
None of the targets that currently use subreg liveness (i.e., the targets
that fulfill #3: Hexagon, AMDGPU, PowerPC, and SystemZ, IIRC) expose #1
and #2, so this patch also artificially enables subreg liveness for ARM,
so that a nice test case can be attached.
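The ARM enabling is gated behind the new off-by-default `-arm-enable-subreg-liveness` flag added in the ARMSubtarget.cpp hunk below. A hypothetical way to exercise it on a MIR test; the input file name and triple are made up for illustration, only the flag comes from this patch and the pass/verifier options are standard llc switches:

```
llc -mtriple=armv7-unknown-linux-gnueabi -verify-machineinstrs \
    -arm-enable-subreg-liveness -run-pass=simple-register-coalescing \
    coalesce-overlapping-subregs.mir -o -
```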
Diffstat (limited to 'llvm/lib')
| mode | file | lines changed |
|---|---|---|
| -rw-r--r-- | llvm/lib/CodeGen/LiveInterval.cpp | 19 |
| -rw-r--r-- | llvm/lib/CodeGen/RegisterCoalescer.cpp | 12 |
| -rw-r--r-- | llvm/lib/Target/ARM/ARMSubtarget.cpp | 5 |
| -rw-r--r-- | llvm/lib/Target/ARM/ARMSubtarget.h | 3 |
4 files changed, 29 insertions, 10 deletions
```diff
diff --git a/llvm/lib/CodeGen/LiveInterval.cpp b/llvm/lib/CodeGen/LiveInterval.cpp
index 54ac46f2e7c..930dc116205 100644
--- a/llvm/lib/CodeGen/LiveInterval.cpp
+++ b/llvm/lib/CodeGen/LiveInterval.cpp
@@ -883,7 +883,8 @@ void LiveInterval::clearSubRanges() {
 static void stripValuesNotDefiningMask(unsigned Reg, LiveInterval::SubRange &SR,
                                        LaneBitmask LaneMask,
                                        const SlotIndexes &Indexes,
-                                       const TargetRegisterInfo &TRI) {
+                                       const TargetRegisterInfo &TRI,
+                                       unsigned ComposeSubRegIdx) {
   // Phys reg should not be tracked at subreg level.
   // Same for noreg (Reg == 0).
   if (!Register::isVirtualRegister(Reg) || !Reg)
@@ -905,7 +906,12 @@ static void stripValuesNotDefiningMask(unsigned Reg, LiveInterval::SubRange &SR,
         continue;
       if (MOI->getReg() != Reg)
         continue;
-      if ((TRI.getSubRegIndexLaneMask(MOI->getSubReg()) & LaneMask).none())
+      LaneBitmask OrigMask = TRI.getSubRegIndexLaneMask(MOI->getSubReg());
+      LaneBitmask ExpectedDefMask =
+          ComposeSubRegIdx
+              ? TRI.composeSubRegIndexLaneMask(ComposeSubRegIdx, OrigMask)
+              : OrigMask;
+      if ((ExpectedDefMask & LaneMask).none())
         continue;
       hasDef = true;
       break;
@@ -924,7 +930,8 @@ static void stripValuesNotDefiningMask(unsigned Reg, LiveInterval::SubRange &SR,
 void LiveInterval::refineSubRanges(
     BumpPtrAllocator &Allocator, LaneBitmask LaneMask,
     std::function<void(LiveInterval::SubRange &)> Apply,
-    const SlotIndexes &Indexes, const TargetRegisterInfo &TRI) {
+    const SlotIndexes &Indexes, const TargetRegisterInfo &TRI,
+    unsigned ComposeSubRegIdx) {
   LaneBitmask ToApply = LaneMask;
   for (SubRange &SR : subranges()) {
     LaneBitmask SRMask = SR.LaneMask;
@@ -944,8 +951,10 @@ void LiveInterval::refineSubRanges(
       MatchingRange = createSubRangeFrom(Allocator, Matching, SR);
       // Now that the subrange is split in half, make sure we
       // only keep in the subranges the VNIs that touch the related half.
-      stripValuesNotDefiningMask(reg, *MatchingRange, Matching, Indexes, TRI);
-      stripValuesNotDefiningMask(reg, SR, SR.LaneMask, Indexes, TRI);
+      stripValuesNotDefiningMask(reg, *MatchingRange, Matching, Indexes, TRI,
+                                 ComposeSubRegIdx);
+      stripValuesNotDefiningMask(reg, SR, SR.LaneMask, Indexes, TRI,
+                                 ComposeSubRegIdx);
     }
     Apply(*MatchingRange);
     ToApply &= ~Matching;
diff --git a/llvm/lib/CodeGen/RegisterCoalescer.cpp b/llvm/lib/CodeGen/RegisterCoalescer.cpp
index c44a302c499..e25f0638d68 100644
--- a/llvm/lib/CodeGen/RegisterCoalescer.cpp
+++ b/llvm/lib/CodeGen/RegisterCoalescer.cpp
@@ -225,7 +225,8 @@ namespace {
     /// @p ToMerge will occupy in the coalescer register. @p LI has its subrange
     /// lanemasks already adjusted to the coalesced register.
     void mergeSubRangeInto(LiveInterval &LI, const LiveRange &ToMerge,
-                           LaneBitmask LaneMask, CoalescerPair &CP);
+                           LaneBitmask LaneMask, CoalescerPair &CP,
+                           unsigned DstIdx);

     /// Join the liveranges of two subregisters. Joins @p RRange into
     /// @p LRange, @p RRange may be invalid afterwards.
@@ -3271,7 +3272,8 @@ void RegisterCoalescer::joinSubRegRanges(LiveRange &LRange, LiveRange &RRange,
 void RegisterCoalescer::mergeSubRangeInto(LiveInterval &LI,
                                           const LiveRange &ToMerge,
                                           LaneBitmask LaneMask,
-                                          CoalescerPair &CP) {
+                                          CoalescerPair &CP,
+                                          unsigned ComposeSubRegIdx) {
   BumpPtrAllocator &Allocator = LIS->getVNInfoAllocator();
   LI.refineSubRanges(
       Allocator, LaneMask,
@@ -3284,7 +3286,7 @@ void RegisterCoalescer::mergeSubRangeInto(LiveInterval &LI,
           joinSubRegRanges(SR, RangeCopy, SR.LaneMask, CP);
         }
       },
-      *LIS->getSlotIndexes(), *TRI);
+      *LIS->getSlotIndexes(), *TRI, ComposeSubRegIdx);
 }

 bool RegisterCoalescer::isHighCostLiveInterval(LiveInterval &LI) {
@@ -3350,12 +3352,12 @@ bool RegisterCoalescer::joinVirtRegs(CoalescerPair &CP) {
   if (!RHS.hasSubRanges()) {
     LaneBitmask Mask = SrcIdx == 0 ? CP.getNewRC()->getLaneMask()
                                    : TRI->getSubRegIndexLaneMask(SrcIdx);
-    mergeSubRangeInto(LHS, RHS, Mask, CP);
+    mergeSubRangeInto(LHS, RHS, Mask, CP, DstIdx);
   } else {
     // Pair up subranges and merge.
     for (LiveInterval::SubRange &R : RHS.subranges()) {
       LaneBitmask Mask = TRI->composeSubRegIndexLaneMask(SrcIdx, R.LaneMask);
-      mergeSubRangeInto(LHS, R, Mask, CP);
+      mergeSubRangeInto(LHS, R, Mask, CP, DstIdx);
     }
   }
   LLVM_DEBUG(dbgs() << "\tJoined SubRanges " << LHS << "\n");
diff --git a/llvm/lib/Target/ARM/ARMSubtarget.cpp b/llvm/lib/Target/ARM/ARMSubtarget.cpp
index c9316a71bdf..eb4d39b01cb 100644
--- a/llvm/lib/Target/ARM/ARMSubtarget.cpp
+++ b/llvm/lib/Target/ARM/ARMSubtarget.cpp
@@ -72,6 +72,9 @@
 static cl::opt<bool> ForceFastISel("arm-force-fast-isel",
                                    cl::init(false), cl::Hidden);

+static cl::opt<bool> EnableSubRegLiveness("arm-enable-subreg-liveness",
+                                          cl::init(false), cl::Hidden);
+
 /// initializeSubtargetDependencies - Initializes using a CPU and feature string
 /// so that we can use initializer lists for subtarget initialization.
 ARMSubtarget &ARMSubtarget::initializeSubtargetDependencies(StringRef CPU,
@@ -379,6 +382,8 @@ bool ARMSubtarget::enableMachineScheduler() const {
   return useMachineScheduler();
 }

+bool ARMSubtarget::enableSubRegLiveness() const { return EnableSubRegLiveness; }
+
 // This overrides the PostRAScheduler bit in the SchedModel for any CPU.
 bool ARMSubtarget::enablePostRAScheduler() const {
   if (enableMachineScheduler())
diff --git a/llvm/lib/Target/ARM/ARMSubtarget.h b/llvm/lib/Target/ARM/ARMSubtarget.h
index 8478665979f..f582a92f656 100644
--- a/llvm/lib/Target/ARM/ARMSubtarget.h
+++ b/llvm/lib/Target/ARM/ARMSubtarget.h
@@ -806,6 +806,9 @@ public:
   /// True for some subtargets at > -O0.
   bool enablePostRAMachineScheduler() const override;

+  /// Check whether this subtarget wants to use subregister liveness.
+  bool enableSubRegLiveness() const override;
+
   /// Enable use of alias analysis during code generation (during MI
   /// scheduling, DAGCombine, etc.).
   bool useAA() const override { return true; }
```

