author    George Burgess IV <george.burgess.iv@gmail.com>  2017-06-09 03:56:15 +0000
committer George Burgess IV <george.burgess.iv@gmail.com>  2017-06-09 03:56:15 +0000
commit    a20352e13ed895e0c67a328a5da2dc3e5481ee58 (patch)
tree      6de91f6be0f28abc2c483c6c485cda409554b572 /llvm/lib/Transforms/Vectorize
parent    cb9327b02db9a13379bb83b499166aff2930050a (diff)
[LoopVectorize] Don't preserve nsw/nuw flags on shrunken ops.
If we're shrinking a binary operation, it may be the case that the new
operation wraps where the old one didn't. If this happens, the behavior
should still be well-defined. So, we can't always carry wrapping flags
with us when we shrink operations. If we do, we get incorrect
optimizations in cases like:

void foo(const unsigned char *from, unsigned char *to, int n) {
  for (int i = 0; i < n; i++)
    to[i] = from[i] - 128;
}

which gets optimized to:

void foo(const unsigned char *from, unsigned char *to, int n) {
  for (int i = 0; i < n; i++)
    to[i] = from[i] | 128;
}

Because:
- InstCombine turned `sub i32 %from.i, 128` into
  `add nuw nsw i32 %from.i, -128`.
- LoopVectorize vectorized the add to be `add nuw nsw <16 x i8>` with a
  vector full of `i8 -128`s.
- InstCombine took advantage of the fact that the newly-shrunken add
  "couldn't wrap", and changed the `add` to an `or`.

InstCombine seems happy to figure out whether we can add nuw/nsw on its
own, so I just decided to drop the flags. There are already a number of
places in LoopVectorize where we rely on InstCombine to clean up.

llvm-svn: 305053
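To make the miscompile concrete, here is a minimal standalone check (not
part of the commit; this driver is a hypothetical illustration) comparing
the original `from[i] - 128` against the "optimized" `from[i] | 128` for
every possible byte value:

#include <cstdint>
#include <cstdio>

int main() {
  for (unsigned x = 0; x < 256; ++x) {
    uint8_t sub = uint8_t(x - 128); // original: subtraction wraps modulo 256
    uint8_t orr = uint8_t(x | 128); // the incorrectly "optimized" form
    if (sub != orr)
      std::printf("x=%3u: x-128=%3u but x|128=%3u\n", x, unsigned(sub),
                  unsigned(orr));
  }
  return 0;
}

The two expressions agree only while x < 128. For any input with the high
bit already set they diverge (e.g. x = 200 gives 200 - 128 = 72, but
200 | 128 = 200), which is exactly the wrapping case that the stale
nuw/nsw flags were promising could never happen.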
Diffstat (limited to 'llvm/lib/Transforms/Vectorize')
-rw-r--r--  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 4040be10a14..1abdb248485 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -3814,7 +3814,11 @@ void InnerLoopVectorizer::truncateToMinimalBitwidths() {
       if (auto *BO = dyn_cast<BinaryOperator>(I)) {
         NewI = B.CreateBinOp(BO->getOpcode(), ShrinkOperand(BO->getOperand(0)),
                              ShrinkOperand(BO->getOperand(1)));
-        cast<BinaryOperator>(NewI)->copyIRFlags(I);
+
+        // Any wrapping introduced by shrinking this operation shouldn't be
+        // considered undefined behavior. So, we can't unconditionally copy
+        // arithmetic wrapping flags to NewI.
+        cast<BinaryOperator>(NewI)->copyIRFlags(I, /*IncludeWrapFlags=*/false);
       } else if (auto *CI = dyn_cast<ICmpInst>(I)) {
         NewI =
             B.CreateICmp(CI->getPredicate(), ShrinkOperand(CI->getOperand(0)),
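For reference, a minimal sketch of the pattern the patched code relies on
(a standalone toy, not LoopVectorize itself; the function name `f` and the
value names are invented for illustration): rebuild the binary operator at
the narrow width, then copy the original instruction's IR flags with
IncludeWrapFlags=false so nuw/nsw are left behind:

#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/Verifier.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

int main() {
  LLVMContext Ctx;
  Module M("shrink-demo", Ctx);
  Type *I32 = Type::getInt32Ty(Ctx);
  Type *I8 = Type::getInt8Ty(Ctx);

  // i8 f(i32 %x): models an `add nuw nsw i32` that we want to shrink to
  // i8, much like truncateToMinimalBitwidths() does for vector ops.
  Function *F = Function::Create(FunctionType::get(I8, {I32}, false),
                                 Function::ExternalLinkage, "f", &M);
  IRBuilder<> B(BasicBlock::Create(Ctx, "entry", F));

  Value *X = &*F->arg_begin();
  auto *Wide = cast<BinaryOperator>(
      B.CreateAdd(X, ConstantInt::getSigned(I32, -128), "wide",
                  /*HasNUW=*/true, /*HasNSW=*/true));

  // Truncate both operands, then rebuild the operation at the narrow width.
  Value *LHS = B.CreateTrunc(Wide->getOperand(0), I8);
  Value *RHS = B.CreateTrunc(Wide->getOperand(1), I8);
  auto *NewI = cast<BinaryOperator>(
      B.CreateBinOp(Wide->getOpcode(), LHS, RHS, "narrow"));

  // The key line from the patch: copy the other IR flags, but not
  // nuw/nsw, because the narrow add may wrap where the wide one couldn't.
  NewI->copyIRFlags(Wide, /*IncludeWrapFlags=*/false);

  B.CreateRet(NewI);
  M.print(outs(), nullptr); // %narrow prints as a plain `add i8`, no flags
  return verifyModule(M, &errs());
}

Printing the module shows %wide still carrying nuw nsw while %narrow
carries neither, which is the state the patch leaves the shrunken ops in
for InstCombine to re-derive the flags later if they are actually valid.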