diff options
| author | Chandler Carruth <chandlerc@gmail.com> | 2015-07-22 03:32:42 +0000 |
|---|---|---|
| committer | Chandler Carruth <chandlerc@gmail.com> | 2015-07-22 03:32:42 +0000 |
| commit | ccffdaf7edbba53e1a5cc35257e0d4a13accdb89 (patch) | |
| tree | 9292396cbe122d914dfd6af1aa7b4bbbbb735be0 /llvm/lib | |
| parent | 33e0f7ef94124a53404555879666cef0e370f8c2 (diff) | |
| download | bcm5719-llvm-ccffdaf7edbba53e1a5cc35257e0d4a13accdb89.tar.gz bcm5719-llvm-ccffdaf7edbba53e1a5cc35257e0d4a13accdb89.zip | |
[SROA] Fix a nasty pile of bugs to do with big-endian, different alloca
types and loads, loads or stores widened past the size of an alloca,
etc.
This started off with a bug report about big-endian behavior with
bitfields and loads and stores to a { i32, i24 } struct. An initial
attempt to fix this was sent for review in D10357, but that didn't
really get to the root of the problem.
The core issue was that canConvertValue and convertValue in SROA were
handling different bitwidth integers by doing a zext of the integer. It
wouldn't do a trunc though, only a zext! This would in turn lead SROA to
form an i24 load from an i24 alloca, zext it to i32, and then use it.
This would at least produce the wrong value for big-endian systems.
One of my many false starts here was to correct the computation for
big-endian systems by shifting. But this doesn't actually work because
the original code has a 64-bit store to the entire 8 bytes, and a 32-bit
load of the last 4 bytes, and because the alloc size is 8 bytes, we
can't lose that last (least significant if bigendian) byte! The real
problem here is that we're forming an i24 load in SROA which is actually
not sufficiently wide to load all of the necessary bits here. The source
has an i32 load, and SROA needs to form that as well.
The straightforward way to do this is to disable the zext logic in
canConvertValue and convertValue, forcing us to actually load all
32-bits. This seems like a really good change, but it in turn breaks
several other parts of SROA.
First in the chain of knock-on failures, we had places where we were
doing integer-widening promotion even though some of the integer loads
or stores extended *past the end* of the alloca's memory! There was even
a comment about preventing this, but it only prevented the case where
the type had a different bit size from its store size. So I added checks
to handle the cases where we actually have a widened load or store and
to avoid trying to special integer widening promotion in those cases.
Second, we actually rely on the ability to promote in the face of loads
past the end of an alloca! This is important so that we can (for
example) speculate loads around PHI nodes to do more promotion. The bits
loaded are garbage, but as long as they aren't used and the alignment is
suitable high (which it wasn't in the test case!) this is "fine". And we
can't stop promoting here, lots of things stop working well if we do. So
we need to add specific logic to handle the extension (and truncation)
case, but *only* where that extension or truncation are over bytes that
*are outside the alloca's allocated storage* and thus totally bogus to
load or store.
And of course, once we add back this correct handling of extension or
truncation, we need to correctly handle bigendian systems to avoid
re-introducing the exact bug that started us off on this chain of misery
in the first place, but this time even more subtle as it only happens
along speculated loads atop a PHI node.
I've ported an existing test for PHI speculation to the big-endian test
file and checked that we get that part correct, and I've added several
more interesting big-endian test cases that should help check that we're
getting this correct.
Fun times.
llvm-svn: 242869
Diffstat (limited to 'llvm/lib')
| -rw-r--r-- | llvm/lib/Transforms/Scalar/SROA.cpp | 63 |
1 files changed, 52 insertions, 11 deletions
diff --git a/llvm/lib/Transforms/Scalar/SROA.cpp b/llvm/lib/Transforms/Scalar/SROA.cpp index d1a0a82b9b0..947513a3657 100644 --- a/llvm/lib/Transforms/Scalar/SROA.cpp +++ b/llvm/lib/Transforms/Scalar/SROA.cpp @@ -1847,10 +1847,17 @@ static unsigned getAdjustedAlignment(Instruction *I, uint64_t Offset, static bool canConvertValue(const DataLayout &DL, Type *OldTy, Type *NewTy) { if (OldTy == NewTy) return true; - if (IntegerType *OldITy = dyn_cast<IntegerType>(OldTy)) - if (IntegerType *NewITy = dyn_cast<IntegerType>(NewTy)) - if (NewITy->getBitWidth() >= OldITy->getBitWidth()) - return true; + + // For integer types, we can't handle any bit-width differences. This would + // break both vector conversions with extension and introduce endianness + // issues when in conjunction with loads and stores. + if (isa<IntegerType>(OldTy) && isa<IntegerType>(NewTy)) { + assert(cast<IntegerType>(OldTy)->getBitWidth() != + cast<IntegerType>(NewTy)->getBitWidth() && + "We can't have the same bitwidth for different int types"); + return false; + } + if (DL.getTypeSizeInBits(NewTy) != DL.getTypeSizeInBits(OldTy)) return false; if (!NewTy->isSingleValueType() || !OldTy->isSingleValueType()) @@ -1885,10 +1892,8 @@ static Value *convertValue(const DataLayout &DL, IRBuilderTy &IRB, Value *V, if (OldTy == NewTy) return V; - if (IntegerType *OldITy = dyn_cast<IntegerType>(OldTy)) - if (IntegerType *NewITy = dyn_cast<IntegerType>(NewTy)) - if (NewITy->getBitWidth() > OldITy->getBitWidth()) - return IRB.CreateZExt(V, NewITy); + assert(!(isa<IntegerType>(OldTy) && isa<IntegerType>(NewTy)) && + "Integer types must be the exact same to convert."); // See if we need inttoptr for this type pair. A cast involving both scalars // and vectors requires and additional bitcast. @@ -2134,6 +2139,9 @@ static bool isIntegerWideningViableForSlice(const Slice &S, if (LoadInst *LI = dyn_cast<LoadInst>(U->getUser())) { if (LI->isVolatile()) return false; + // We can't handle loads that extend past the allocated memory. + if (DL.getTypeStoreSize(LI->getType()) > Size) + return false; // Note that we don't count vector loads or stores as whole-alloca // operations which enable integer widening because we would prefer to use // vector widening instead. @@ -2152,6 +2160,9 @@ static bool isIntegerWideningViableForSlice(const Slice &S, Type *ValueTy = SI->getValueOperand()->getType(); if (SI->isVolatile()) return false; + // We can't handle stores that extend past the allocated memory. + if (DL.getTypeStoreSize(ValueTy) > Size) + return false; // Note that we don't count vector loads or stores as whole-alloca // operations which enable integer widening because we would prefer to use // vector widening instead. @@ -2585,6 +2596,7 @@ private: Type *TargetTy = IsSplit ? Type::getIntNTy(LI.getContext(), SliceSize * 8) : LI.getType(); + const bool IsLoadPastEnd = DL.getTypeStoreSize(TargetTy) > SliceSize; bool IsPtrAdjusted = false; Value *V; if (VecTy) { @@ -2592,13 +2604,27 @@ private: } else if (IntTy && LI.getType()->isIntegerTy()) { V = rewriteIntegerLoad(LI); } else if (NewBeginOffset == NewAllocaBeginOffset && - canConvertValue(DL, NewAllocaTy, LI.getType())) { + NewEndOffset == NewAllocaEndOffset && + (canConvertValue(DL, NewAllocaTy, TargetTy) || + (IsLoadPastEnd && NewAllocaTy->isIntegerTy() && + TargetTy->isIntegerTy()))) { LoadInst *NewLI = IRB.CreateAlignedLoad(&NewAI, NewAI.getAlignment(), LI.isVolatile(), LI.getName()); if (LI.isVolatile()) NewLI->setAtomic(LI.getOrdering(), LI.getSynchScope()); - V = NewLI; + + // If this is an integer load past the end of the slice (which means the + // bytes outside the slice are undef or this load is dead) just forcibly + // fix the integer size with correct handling of endianness. + if (auto *AITy = dyn_cast<IntegerType>(NewAllocaTy)) + if (auto *TITy = dyn_cast<IntegerType>(TargetTy)) + if (AITy->getBitWidth() < TITy->getBitWidth()) { + V = IRB.CreateZExt(V, TITy, "load.ext"); + if (DL.isBigEndian()) + V = IRB.CreateShl(V, TITy->getBitWidth() - AITy->getBitWidth(), + "endian_shift"); + } } else { Type *LTy = TargetTy->getPointerTo(); LoadInst *NewLI = IRB.CreateAlignedLoad(getNewAllocaSlicePtr(IRB, LTy), @@ -2718,10 +2744,25 @@ private: if (IntTy && V->getType()->isIntegerTy()) return rewriteIntegerStore(V, SI); + const bool IsStorePastEnd = DL.getTypeStoreSize(V->getType()) > SliceSize; StoreInst *NewSI; if (NewBeginOffset == NewAllocaBeginOffset && NewEndOffset == NewAllocaEndOffset && - canConvertValue(DL, V->getType(), NewAllocaTy)) { + (canConvertValue(DL, V->getType(), NewAllocaTy) || + (IsStorePastEnd && NewAllocaTy->isIntegerTy() && + V->getType()->isIntegerTy()))) { + // If this is an integer store past the end of slice (and thus the bytes + // past that point are irrelevant or this is unreachable), truncate the + // value prior to storing. + if (auto *VITy = dyn_cast<IntegerType>(V->getType())) + if (auto *AITy = dyn_cast<IntegerType>(NewAllocaTy)) + if (VITy->getBitWidth() > AITy->getBitWidth()) { + if (DL.isBigEndian()) + V = IRB.CreateLShr(V, VITy->getBitWidth() - AITy->getBitWidth(), + "endian_shift"); + V = IRB.CreateTrunc(V, AITy, "load.trunc"); + } + V = convertValue(DL, IRB, V, NewAllocaTy); NewSI = IRB.CreateAlignedStore(V, &NewAI, NewAI.getAlignment(), SI.isVolatile()); |

