| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This change passes down ACT to SampleProfileLoader for the new PM. Also remove the default value for SampleProfileLoader class as it is not used.
Reviewers: eraman, davidxl
Reviewed By: eraman
Subscribers: sanjoy, llvm-commits
Differential Revision: https://reviews.llvm.org/D37773
llvm-svn: 313080
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The current promoteLoopAccessesToScalars method receives an AliasSet, but
the information used is in fact a list of Value*, known to must-alias.
Create the list ahead of time to make this method independent of the AliasSet class.
While there is no functionality change, this adds overhead for creating
a set of Value*, when promotion would normally exit earlier.
This is meant as a first refactoring step in order to start replacing
AliasSetTracker with MemorySSA.
And while the end goal is to redesign LICM, the first few steps will focus on
adding MemorySSA as an alternative to the AliasSetTracker using most of the
existing functionality.
Reviewers: mkuper, danielcdh, dberlin
Subscribers: sanjoy, chandlerc, gberry, davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D35439
llvm-svn: 313075
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When the MaxVectorSize > ConstantTripCount, we should just clamp the
vectorization factor to be the ConstantTripCount.
This vectorizes loops where TripCount <= TinyTripCountThreshold and TripCount < MaxVF.
Earlier we were finding the maximum vector width, which could be greater than
the trip count itself. The Loop vectorizer does all the work for generating a
vectorizable loop, but in the end we would always choose the scalar loop (since
the VF > trip count). This allows us to choose the VF keeping in mind the trip
count if available.
This is a fix on top of rL312472.
Reviewers: Ayal, zvi, hfinkel, dneilson
Reviewed by: Ayal
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37702
llvm-svn: 313046
|
| |
|
|
|
|
| |
Reduces number of loops during instructions analysis.
llvm-svn: 313035
|
| |
|
|
|
|
|
|
|
|
| |
symbol constants.
The rationale is the same as for r312967.
Differential Revision: https://reviews.llvm.org/D37408
llvm-svn: 312968
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
symbol constants.
Not all targets support the use of absolute symbols to export
constants. In particular, ARM has a wide variety of constant encodings
that cannot currently be relocated by linkers. So instead of exporting
the constants using symbols, export them directly in the summary.
The values of the constants are left as zeroes on targets that support
symbolic exports.
This may result in more cache misses when targeting those architectures
as a result of arbitrary changes in constant values, but this seems
somewhat unavoidable for now.
Differential Revision: https://reviews.llvm.org/D37407
llvm-svn: 312967
|
| |
|
|
| |
llvm-svn: 312878
|
| |
|
|
|
|
|
|
|
| |
It now knows the tricks of both functions.
Also, fix a bug that considered allocas in a non-zero address space to always be non-null.
Differential Revision: https://reviews.llvm.org/D37628
llvm-svn: 312869
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented
as an independent pass, so there's no stretching of scope and feature creep for an existing pass.
I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost
this same functionality as an addition to CGP in the motivating example of PR31028:
https://bugs.llvm.org/show_bug.cgi?id=31028
The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow
more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and
undo the hoisting that is done here.
Decomposing remainder may allow removing some code from the backend (PPC and possibly others).
Differential Revision: https://reviews.llvm.org/D37121
llvm-svn: 312862
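As a rough illustration of the remainder decomposition mentioned in the message above (a sketch only, not the pass's actual code), a remainder can be rewritten in terms of a division on the same operands. The names `DivRem` and `divRemDecomposed` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: quotient and remainder of the same operands.
struct DivRem {
  int32_t Quot;
  int32_t Rem;
};

// Computes both results with a single division. The remainder is
// decomposed as X - (X / Y) * Y, which agrees with C++ truncating division.
// Precondition: Y != 0 and (X, Y) != (INT32_MIN, -1).
inline DivRem divRemDecomposed(int32_t X, int32_t Y) {
  assert(Y != 0 && "division by zero");
  assert(!(X == INT32_MIN && Y == -1) && "signed overflow");
  int32_t Quot = X / Y;
  int32_t Rem = X - Quot * Y; // same value as X % Y
  return {Quot, Rem};
}
```

On a target without a combined div/rem operation, this form needs only one division plus a multiply and a subtract, which is presumably why keeping the div and rem close together matters.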
|
| |
|
|
|
|
| |
function (which is too slow)
llvm-svn: 312855
|
| |
|
|
| |
llvm-svn: 312853
|
| |
|
|
|
|
|
|
|
|
|
|
| |
analysis of vector to be vectorized.
Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev, davide
Subscribers: llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D37212
llvm-svn: 312802
|
| |
|
|
| |
llvm-svn: 312793
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
SLP vectorizer supports horizontal reductions for Add/FAdd binary
operations. Patch adds support for horizontal min/max reductions.
Function getReductionCost() is split into getArithmeticReductionCost() for
binary operation reductions and getMinMaxReductionCost() for min/max
reductions.
Patch fixes PR26956.
Differential revision: https://reviews.llvm.org/D27846
llvm-svn: 312791
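For readers unfamiliar with the term, a horizontal max reduction in scalar source form is a chain of compare-and-select operations over adjacent elements. The sketch below shows that shape; the function name `horizontalMax8` is hypothetical, and whether the SLP vectorizer actually transforms such code depends on the target and its cost model.

```cpp
#include <cassert>
#include <cstdint>

// Scalar form of a horizontal max reduction over eight adjacent elements:
// each step is a compare followed by a select of the larger value.
inline int32_t horizontalMax8(const int32_t *A) {
  assert(A != nullptr && "expects a pointer to at least eight elements");
  int32_t M = A[0] > A[1] ? A[0] : A[1];
  M = M > A[2] ? M : A[2];
  M = M > A[3] ? M : A[3];
  M = M > A[4] ? M : A[4];
  M = M > A[5] ? M : A[5];
  M = M > A[6] ? M : A[6];
  M = M > A[7] ? M : A[7];
  return M;
}
```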
|
| |
|
|
|
|
|
|
| |
Re-applying after the found bug was fixed.
Differential Revision: https://reviews.llvm.org/D36215
llvm-svn: 312783
|
| |
b/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
index f72a808..9fa49fd 100644
--- a/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
+++ b/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
@@ -450,20 +450,10 @@ struct LoopStructure {
// equivalent to:
//
// intN_ty inc = IndVarIncreasing ? 1 : -1;
- // pred_ty predicate = IndVarIncreasing
- // ? IsSignedPredicate ? ICMP_SLT : ICMP_ULT
- // : IsSignedPredicate ? ICMP_SGT : ICMP_UGT;
+ // pred_ty predicate = IndVarIncreasing ? ICMP_SLT : ICMP_SGT;
//
- //
- // for (intN_ty iv = IndVarStart; predicate(IndVarBase, LoopExitAt);
- // iv = IndVarNext)
+ // for (intN_ty iv = IndVarStart; predicate(iv, LoopExitAt); iv = IndVarBase)
// ... body ...
- //
- // Here IndVarBase is either current or next value of the induction variable.
- // in the former case, IsIndVarNext = false and IndVarBase points to the
- // Phi node of the induction variable. Otherwise, IsIndVarNext = true and
- // IndVarBase points to IV increment instruction.
- //
Value *IndVarBase;
Value *IndVarStart;
@@ -471,13 +461,12 @@ struct LoopStructure {
Value *LoopExitAt;
bool IndVarIncreasing;
bool IsSignedPredicate;
- bool IsIndVarNext;
LoopStructure()
: Tag(""), Header(nullptr), Latch(nullptr), LatchBr(nullptr),
LatchExit(nullptr), LatchBrExitIdx(-1), IndVarBase(nullptr),
IndVarStart(nullptr), IndVarStep(nullptr), LoopExitAt(nullptr),
- IndVarIncreasing(false), IsSignedPredicate(true), IsIndVarNext(false) {}
+ IndVarIncreasing(false), IsSignedPredicate(true) {}
template <typename M> LoopStructure map(M Map) const {
LoopStructure Result;
@@ -493,7 +482,6 @@ struct LoopStructure {
Result.LoopExitAt = Map(LoopExitAt);
Result.IndVarIncreasing = IndVarIncreasing;
Result.IsSignedPredicate = IsSignedPredicate;
- Result.IsIndVarNext = IsIndVarNext;
return Result;
}
@@ -841,42 +829,21 @@ LoopStructure::parseLoopStructure(ScalarEvolution &SE,
return false;
};
- // `ICI` can either be a comparison against IV or a comparison of IV.next.
- // Depending on the interpretation, we calculate the start value differently.
+ // `ICI` is interpreted as taking the backedge if the *next* value of the
+ // induction variable satisfies some constraint.
- // Pair {IndVarBase; IsIndVarNext} semantically designates whether the latch
- // comparisons happens against the IV before or after its value is
- // incremented. Two valid combinations for them are:
- //
- // 1) { phi [ iv.start, preheader ], [ iv.next, latch ]; false },
- // 2) { iv.next; true }.
- //
- // The latch comparison happens against IndVarBase which can be either current
- // or next value of the induction variable.
const SCEVAddRecExpr *IndVarBase = cast<SCEVAddRecExpr>(LeftSCEV);
bool IsIncreasing = false;
bool IsSignedPredicate = true;
- bool IsIndVarNext = false;
ConstantInt *StepCI;
if (!IsInductionVar(IndVarBase, IsIncreasing, StepCI)) {
FailureReason = "LHS in icmp not induction variable";
return None;
}
- const SCEV *IndVarStart = nullptr;
- // TODO: Currently we only handle comparison against IV, but we can extend
- // this analysis to be able to deal with comparison against sext(iv) and such.
- if (isa<PHINode>(LeftValue) &&
- cast<PHINode>(LeftValue)->getParent() == Header)
- // The comparison is made against current IV value.
- IndVarStart = IndVarBase->getStart();
- else {
- // Assume that the comparison is made against next IV value.
- const SCEV *StartNext = IndVarBase->getStart();
- const SCEV *Addend = SE.getNegativeSCEV(IndVarBase->getStepRecurrence(SE));
- IndVarStart = SE.getAddExpr(StartNext, Addend);
- IsIndVarNext = true;
- }
+ const SCEV *StartNext = IndVarBase->getStart();
+ const SCEV *Addend = SE.getNegativeSCEV(IndVarBase->getStepRecurrence(SE));
+ const SCEV *IndVarStart = SE.getAddExpr(StartNext, Addend);
const SCEV *Step = SE.getSCEV(StepCI);
ConstantInt *One = ConstantInt::get(IndVarTy, 1);
@@ -1060,7 +1027,6 @@ LoopStructure::parseLoopStructure(ScalarEvolution &SE,
Result.IndVarIncreasing = IsIncreasing;
Result.LoopExitAt = RightValue;
Result.IsSignedPredicate = IsSignedPredicate;
- Result.IsIndVarNext = IsIndVarNext;
FailureReason = nullptr;
@@ -1350,9 +1316,8 @@ LoopConstrainer::RewrittenRangeInfo LoopConstrainer::changeIterationSpaceEnd(
BranchToContinuation);
NewPHI->addIncoming(PN->getIncomingValueForBlock(Preheader), Preheader);
- auto *FixupValue =
- LS.IsIndVarNext ? PN->getIncomingValueForBlock(LS.Latch) : PN;
- NewPHI->addIncoming(FixupValue, RRI.ExitSelector);
+ NewPHI->addIncoming(PN->getIncomingValueForBlock(LS.Latch),
+ RRI.ExitSelector);
RRI.PHIValuesAtPseudoExit.push_back(NewPHI);
}
@@ -1735,10 +1700,7 @@ bool InductiveRangeCheckElimination::runOnLoop(Loop *L, LPPassManager &LPM) {
}
LoopStructure LS = MaybeLoopStructure.getValue();
const SCEVAddRecExpr *IndVar =
- cast<SCEVAddRecExpr>(SE.getSCEV(LS.IndVarBase));
- if (LS.IsIndVarNext)
- IndVar = cast<SCEVAddRecExpr>(SE.getMinusSCEV(IndVar,
- SE.getSCEV(LS.IndVarStep)));
+ cast<SCEVAddRecExpr>(SE.getMinusSCEV(SE.getSCEV(LS.IndVarBase), SE.getSCEV(LS.IndVarStep)));
Optional<InductiveRangeCheck::Range> SafeIterRange;
Instruction *ExprInsertPt = Preheader->getTerminator();
diff --git a/test/Transforms/IRCE/latch-comparison-against-current-value.ll b/test/Transforms/IRCE/latch-comparison-against-current-value.ll
deleted file mode 100644
index afea0e6..0000000
--- a/test/Transforms/IRCE/latch-comparison-against-current-value.ll
+++ /dev/null
@@ -1,182 +0,0 @@
-; RUN: opt -verify-loop-info -irce-print-changed-loops -irce -S < %s 2>&1 | FileCheck %s
-
-; Check that IRCE is able to deal with loops where the latch comparison is
-; done against current value of the IV, not the IV.next.
-
-; CHECK: irce: in function test_01: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting>
-; CHECK: irce: in function test_02: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting>
-; CHECK-NOT: irce: in function test_03: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting>
-; CHECK-NOT: irce: in function test_04: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting>
-
-; SLT condition for increasing loop from 0 to 100.
-define void @test_01(i32* %arr, i32* %a_len_ptr) #0 {
-
-; CHECK: test_01
-; CHECK: entry:
-; CHECK-NEXT: %exit.mainloop.at = load i32, i32* %a_len_ptr, !range !0
-; CHECK-NEXT: [[COND2:%[^ ]+]] = icmp slt i32 0, %exit.mainloop.at
-; CHECK-NEXT: br i1 [[COND2]], label %loop.preheader, label %main.pseudo.exit
-; CHECK: loop:
-; CHECK-NEXT: %idx = phi i32 [ %idx.next, %in.bounds ], [ 0, %loop.preheader ]
-; CHECK-NEXT: %idx.next = add nuw nsw i32 %idx, 1
-; CHECK-NEXT: %abc = icmp slt i32 %idx, %exit.mainloop.at
-; CHECK-NEXT: br i1 true, label %in.bounds, label %out.of.bounds.loopexit1
-; CHECK: in.bounds:
-; CHECK-NEXT: %addr = getelementptr i32, i32* %arr, i32 %idx
-; CHECK-NEXT: store i32 0, i32* %addr
-; CHECK-NEXT: %next = icmp slt i32 %idx, 100
-; CHECK-NEXT: [[COND3:%[^ ]+]] = icmp slt i32 %idx, %exit.mainloop.at
-; CHECK-NEXT: br i1 [[COND3]], label %loop, label %main.exit.selector
-; CHECK: main.exit.selector:
-; CHECK-NEXT: %idx.lcssa = phi i32 [ %idx, %in.bounds ]
-; CHECK-NEXT: [[COND4:%[^ ]+]] = icmp slt i32 %idx.lcssa, 100
-; CHECK-NEXT: br i1 [[COND4]], label %main.pseudo.exit, label %exit
-; CHECK-NOT: loop.preloop:
-; CHECK: loop.postloop:
-; CHECK-NEXT: %idx.postloop = phi i32 [ %idx.copy, %postloop ], [ %idx.next.postloop, %in.bounds.postloop ]
-; CHECK-NEXT: %idx.next.postloop = add nuw nsw i32 %idx.postloop, 1
-; CHECK-NEXT: %abc.postloop = icmp slt i32 %idx.postloop, %exit.mainloop.at
-; CHECK-NEXT: br i1 %abc.postloop, label %in.bounds.postloop, label %out.of.bounds.loopexit
-
-entry:
- %len = load i32, i32* %a_len_ptr, !range !0
- br label %loop
-
-loop:
- %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ]
- %idx.next = add nsw nuw i32 %idx, 1
- %abc = icmp slt i32 %idx, %len
- br i1 %abc, label %in.bounds, label %out.of.bounds
-
-in.bounds:
- %addr = getelementptr i32, i32* %arr, i32 %idx
- store i32 0, i32* %addr
- %next = icmp slt i32 %idx, 100
- br i1 %next, label %loop, label %exit
-
-out.of.bounds:
- ret void
-
-exit:
- ret void
-}
-
-; ULT condition for increasing loop from 0 to 100.
-define void @test_02(i32* %arr, i32* %a_len_ptr) #0 {
-
-; CHECK: test_02
-; CHECK: entry:
-; CHECK-NEXT: %exit.mainloop.at = load i32, i32* %a_len_ptr, !range !0
-; CHECK-NEXT: [[COND2:%[^ ]+]] = icmp ult i32 0, %exit.mainloop.at
-; CHECK-NEXT: br i1 [[COND2]], label %loop.preheader, label %main.pseudo.exit
-; CHECK: loop:
-; CHECK-NEXT: %idx = phi i32 [ %idx.next, %in.bounds ], [ 0, %loop.preheader ]
-; CHECK-NEXT: %idx.next = add nuw nsw i32 %idx, 1
-; CHECK-NEXT: %abc = icmp ult i32 %idx, %exit.mainloop.at
-; CHECK-NEXT: br i1 true, label %in.bounds, label %out.of.bounds.loopexit1
-; CHECK: in.bounds:
-; CHECK-NEXT: %addr = getelementptr i32, i32* %arr, i32 %idx
-; CHECK-NEXT: store i32 0, i32* %addr
-; CHECK-NEXT: %next = icmp ult i32 %idx, 100
-; CHECK-NEXT: [[COND3:%[^ ]+]] = icmp ult i32 %idx, %exit.mainloop.at
-; CHECK-NEXT: br i1 [[COND3]], label %loop, label %main.exit.selector
-; CHECK: main.exit.selector:
-; CHECK-NEXT: %idx.lcssa = phi i32 [ %idx, %in.bounds ]
-; CHECK-NEXT: [[COND4:%[^ ]+]] = icmp ult i32 %idx.lcssa, 100
-; CHECK-NEXT: br i1 [[COND4]], label %main.pseudo.exit, label %exit
-; CHECK-NOT: loop.preloop:
-; CHECK: loop.postloop:
-; CHECK-NEXT: %idx.postloop = phi i32 [ %idx.copy, %postloop ], [ %idx.next.postloop, %in.bounds.postloop ]
-; CHECK-NEXT: %idx.next.postloop = add nuw nsw i32 %idx.postloop, 1
-; CHECK-NEXT: %abc.postloop = icmp ult i32 %idx.postloop, %exit.mainloop.at
-; CHECK-NEXT: br i1 %abc.postloop, label %in.bounds.postloop, label %out.of.bounds.loopexit
-
-entry:
- %len = load i32, i32* %a_len_ptr, !range !0
- br label %loop
-
-loop:
- %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ]
- %idx.next = add nsw nuw i32 %idx, 1
- %abc = icmp ult i32 %idx, %len
- br i1 %abc, label %in.bounds, label %out.of.bounds
-
-in.bounds:
- %addr = getelementptr i32, i32* %arr, i32 %idx
- store i32 0, i32* %addr
- %next = icmp ult i32 %idx, 100
- br i1 %next, label %loop, label %exit
-
-out.of.bounds:
- ret void
-
-exit:
- ret void
-}
-
-; Same as test_01, but comparison happens against IV extended to a wider type.
-; This test ensures that IRCE rejects it and does not falsely assume that it was
-; a comparison against iv.next.
-; TODO: We can actually extend the recognition to cover this case.
-define void @test_03(i32* %arr, i64* %a_len_ptr) #0 {
-
-; CHECK: test_03
-
-entry:
- %len = load i64, i64* %a_len_ptr, !range !1
- br label %loop
-
-loop:
- %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ]
- %idx.next = add nsw nuw i32 %idx, 1
- %idx.ext = sext i32 %idx to i64
- %abc = icmp slt i64 %idx.ext, %len
- br i1 %abc, label %in.bounds, label %out.of.bounds
-
-in.bounds:
- %addr = getelementptr i32, i32* %arr, i32 %idx
- store i32 0, i32* %addr
- %next = icmp slt i32 %idx, 100
- br i1 %next, label %loop, label %exit
-
-out.of.bounds:
- ret void
-
-exit:
- ret void
-}
-
-; Same as test_02, but comparison happens against IV extended to a wider type.
-; This test ensures that IRCE rejects it and does not falsely assume that it was
-; a comparison against iv.next.
-; TODO: We can actually extend the recognition to cover this case.
-define void @test_04(i32* %arr, i64* %a_len_ptr) #0 {
-
-; CHECK: test_04
-
-entry:
- %len = load i64, i64* %a_len_ptr, !range !1
- br label %loop
-
-loop:
- %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ]
- %idx.next = add nsw nuw i32 %idx, 1
- %idx.ext = sext i32 %idx to i64
- %abc = icmp ult i64 %idx.ext, %len
- br i1 %abc, label %in.bounds, label %out.of.bounds
-
-in.bounds:
- %addr = getelementptr i32, i32* %arr, i32 %idx
- store i32 0, i32* %addr
- %next = icmp ult i32 %idx, 100
- br i1 %next, label %loop, label %exit
-
-out.of.bounds:
- ret void
-
-exit:
- ret void
-}
-
-!0 = !{i32 0, i32 50}
-!1 = !{i64 0, i64 50}
llvm-svn: 312775
|
| |
|
|
|
|
|
|
|
|
|
| |
comdat.
This is required when targeting COFF, as the comdat name must match
one of the names of the symbols in the comdat.
Differential Revision: https://reviews.llvm.org/D37550
llvm-svn: 312767
|
| |
|
|
|
|
|
| |
Many of these uses can get by with forward declarations. Hopefully this
speeds up compilation after adding a single intrinsic.
llvm-svn: 312759
|
| |
|
|
|
|
|
|
|
|
| |
r312318 - Debug info for variables whose type is shrinked to bool
r312325, r312424, r312489 - Test case for r312318
Revision 312318 introduced a null dereference bug.
Details in https://bugs.llvm.org/show_bug.cgi?id=34490
llvm-svn: 312758
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Consider this type of a loop:
for (...) {
...
if (...) continue;
...
}
Normally, the "continue" would branch to the loop control code that
checks whether the loop should continue iterating and which contains
the (often) unique loop latch branch. In certain cases jump threading
can "thread" the inner branch directly to the loop header, creating
a second loop latch. Loop canonicalization would then transform this
loop into a loop nest. The problem with this is that in such a loop
nest neither loop is countable even if the original loop was. This
may inhibit subsequent loop optimizations and be detrimental to
performance.
Differential Revision: https://reviews.llvm.org/D36404
llvm-svn: 312664
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
null constants
This is a preliminary step towards solving the remaining part of PR27145 - IR for isfinite():
https://bugs.llvm.org/show_bug.cgi?id=27145
In order to solve that one more generally, we need to add matching for and/or of fcmp ord/uno
with a constant operand.
But while looking at those patterns, I realized we were missing a canonicalization for nonzero
constants. Rather than limiting to just folds for constants, we're adding a general value
tracking method for this based on an existing DAG helper.
By transforming everything to 0.0, we can simplify the existing code in foldLogicOfFCmps()
and pick up missing vector folds.
Differential Revision: https://reviews.llvm.org/D37427
llvm-svn: 312591
|
| |
|
|
| |
llvm-svn: 312575
|
| |
|
|
|
|
| |
enable reuse in a future patch.
llvm-svn: 312518
|
| |
|
|
|
|
|
|
|
|
| |
for sure we're going to use it and avoid an unnecessary call to m_APInt.
Instead of creating a Constant and then calling m_APInt with it (which will always return true), just create an APInt initially and use that for the checks in the isSelect01 function. If it turns out we do need the Constant, create it from the APInt.
This is a refactor for a future patch that will do some more checks of the constant values here.
llvm-svn: 312517
|
| |
|
|
|
|
| |
detect self-cycles of phi nodes. We also need to not ignore certain types of arguments when testing whether the phi has a backedge or was originally constant.
llvm-svn: 312510
|
| |
|
|
|
|
| |
aggregate value simplification
llvm-svn: 312509
|
| |
|
|
| |
llvm-svn: 312508
|
| |
|
|
|
|
| |
finding is done. Where we had it before, we would stop looking when we hit the original instruction, but skip it. Now we skip it and keep looking.
llvm-svn: 312507
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Improve how MaxVF is computed while taking into account that MaxVF should not be larger than the loop's trip count.
Other than saving on compile-time by pruning the possible MaxVF candidates, this patch fixes pr34438 which exposed the following flow:
1. Short trip count identified -> Don't bail out, set OptForSize:=True to avoid tail-loop and runtime checks.
2. Compute MaxVF returned 16 on a target supporting AVX512.
3. OptForSize -> choose VF:=MaxVF.
4. Bail out because TripCount = 8, VF = 16, TripCount % VF !=0 means we need a tail loop.
With this patch step 2. will choose MaxVF=8 based on TripCount.
Reviewers: Ayal, dorit, mkuper, hfinkel
Reviewed By: hfinkel
Subscribers: hfinkel, llvm-commits
Differential Revision: https://reviews.llvm.org/D37425
llvm-svn: 312472
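The four-step flow described in the message above can be sketched in isolation. This is a simplified model, not the vectorizer's code: the helpers `powerOf2Floor`, `needsTailLoop`, and `clampMaxVF` are hypothetical, and rounding the trip count down to a power of two is an assumption made here for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: largest power of two that is <= N (N must be non-zero).
inline uint32_t powerOf2Floor(uint32_t N) {
  assert(N != 0);
  uint32_t P = 1;
  while (P <= N / 2)
    P *= 2;
  return P;
}

// True when vectorizing by VF leaves iterations over for a scalar tail loop.
inline bool needsTailLoop(uint32_t TripCount, uint32_t VF) {
  assert(VF != 0);
  return TripCount % VF != 0;
}

// Hypothetical clamp. TripCount == 0 stands for "trip count unknown".
inline uint32_t clampMaxVF(uint32_t TargetMaxVF, uint32_t TripCount) {
  if (TripCount != 0 && TripCount < TargetMaxVF)
    return powerOf2Floor(TripCount);
  return TargetMaxVF;
}
```

With TripCount = 8 and a target maximum of 16, the unclamped choice needs a tail loop, while the clamped choice of 8 divides the trip count evenly.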
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Debug information can be, and was, corrupted when the runtime
remainder loop was fully unrolled. This is because a !null node can
be created instead of a unique one describing the loop. In this case,
the original node gets incorrectly updated with the NewLoopID
metadata.
When the remainder loop is about to be fully unrolled anyway,
there is no need to add loop metadata for it.
Differential Revision: https://reviews.llvm.org/D37338
llvm-svn: 312471
|
| |
|
|
|
|
| |
See https://reviews.llvm.org/rL312411 for related InstSimplify tests.
llvm-svn: 312421
|
| |
|
|
|
|
|
|
| |
In addition to removing chunks of duplicated code, we don't
want these to diverge. If there's a fold for one, there
should be a fold of the other via DeMorgan's Laws.
llvm-svn: 312420
|
| |
|
|
|
|
|
|
|
| |
We had these locals:
Value *Op0RHS = LHS->getOperand(1);
Value *Op1LHS = RHS->getOperand(0);
...so we confusingly transposed the meaning of left/right and op0/op1.
llvm-svn: 312418
|
| |
|
|
|
|
| |
LLVM transforms this into a bit test which is a lot faster and smaller.
llvm-svn: 312417
|
| |
|
|
| |
llvm-svn: 312416
|
| |
|
|
|
|
|
|
| |
This makes it easier to see that they're almost duplicates.
As with the similar icmp functions, there should be identical
folds for both logic ops because those are DeMorganized variants.
llvm-svn: 312415
|
| |
|
|
| |
llvm-svn: 312414
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
should have no leaders.
Summary:
After a discussion with Rekka, I believe this (or a small variant)
should fix the remaining phi-of-ops problems.
Rekka's algorithm for completeness relies on looking up expressions
that should have no leader, and expecting it to fail (i.e. looking up
expressions that can't exist in a predecessor, and expecting it to
find nothing).
Unfortunately, sometimes these expressions can be simplified to
constants, but we need the lookup to fail anyway. Additionally, our
simplifier outsmarts this by taking these "not quite right"
expressions, and simplifying them into other expressions or walking
through phis, etc. In the past, we've sometimes been able to find
leaders for these expressions, incorrectly.
This change causes us not to try to phi-of-ops such expressions.
We determine safety by seeing if they depend on a phi node in our
block.
This is not perfect; we can do a bit better, but this should be a
"correctness start" that we can then improve. It also requires a
bunch of caching that I'd eventually like to eliminate.
The right solution, longer term, to the simplifier issues, is to make
the query interface for the instruction simplifier/constant folder
have the flags we need, so that we can keep most things going, but
turn off the possibly-invalid parts (threading through phis, etc).
This is an issue in another wrong code bug as well.
Reviewers: davide, mcrosier
Subscribers: sanjoy, llvm-commits
Differential Revision: https://reviews.llvm.org/D37175
llvm-svn: 312401
|
| |
|
|
|
|
| |
Use warnings; other minor fixes (NFC).
llvm-svn: 312383
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
truncate instructions
This patch teaches decomposeBitTestICmp to look through truncate instructions on the input to the compare. If a truncate is found it will now return the pre-truncated Value and appropriately extend the APInt mask.
This allows some code that duplicated this functionality to be removed from InstSimplify.
This allows InstCombine's bit test combining code to match a pre-truncate Value with the same Value appearing with an 'and' on another icmp. It also allows us to combine a truncate to i16 and a truncate to i8. This also required removing the type check from the beginning of getMaskedTypeForICmpPair, but I believe that's OK because we still have to find two values from the input to each icmp that are equal before we'll do any transformation. So the type check was really just serving as an early out.
There was one user of decomposeBitTestICmp that didn't want to look through truncates, so I've added a flag to prevent that behavior when necessary.
Differential Revision: https://reviews.llvm.org/D37158
llvm-svn: 312382
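To make the "look through truncate" idea concrete, here is a small model in plain C++ rather than LLVM IR. It assumes the mask is zero-extended to the wider type, and the function names are hypothetical. A compare on the truncated value, modelling `icmp ugt (trunc i32 X to i8), 127`, is equivalent to a bit test on the pre-truncated value.

```cpp
#include <cstdint>

// Compare on the truncated value: models
// `icmp ugt (trunc i32 X to i8), 127`.
inline bool cmpOnTrunc(uint32_t X) {
  return static_cast<uint8_t>(X) > 127u;
}

// Decomposed form on the pre-truncated value: the 8-bit mask 0x80 is
// zero-extended to 32 bits, giving `(X & 0x00000080) != 0`.
inline bool bitTestOnWide(uint32_t X) {
  return (X & 0x00000080u) != 0;
}
```

Because both forms now refer to the same wide value, a second compare that masks `X` directly can be matched against this one.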
|
| |
|
|
|
|
|
|
|
|
| |
getMaskedTypeForICmpPair.
A future patch will make the code look through truncates feeding the compare. So the compares might be different types but the pretruncated types might be the same.
This should be safe because we still require the same Value* to be used truncated or not in both compares. So that serves to ensure the types are the same.
llvm-svn: 312381
|
| |
|
|
|
|
|
|
| |
ConstantInt, make sure we use the type from the Value* that was also returned from decomposeBitTestICmp.
Previously we used the type from the LHS of the compare, but a future patch will change decomposeBitTestICmp to look through truncates so it will return a pretruncated Value* and the type needs to match that.
llvm-svn: 312380
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: When we backtranslate expressions, we can't use the predicateinfo, since we are evaluating them in a different context.
Reviewers: davide, mcrosier
Subscribers: sanjoy, Prazek, llvm-commits
Differential Revision: https://reviews.llvm.org/D37174
llvm-svn: 312352
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
LoopVectorizer is creating casts between vec<ptr> and vec<float> types
on ARM when compiling OpenCV. It is illegal to directly cast a
floating-point type to a pointer type even if the types have the same
size, and doing so causes a crash. Fix the crash with a two-step cast:
bitcast to integer, then cast the integer to pointer/float.
Fixes PR33804.
Reviewers: mkuper, Ayal, dlj, rengolin, srhines
Reviewed By: rengolin
Subscribers: aemerson, kristof.beyls, mkazantsev, Meinersbur, rengolin, mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D35498
llvm-svn: 312331
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
MergeICmps.cpp(68,15): error: chosen constructor is explicit in copy-initialization
return {};
APInt.h(339,12): note: explicit constructor declared here
explicit APInt() : BitWidth(1) { U.VAL = 0; }
^
MergeICmps.cpp(56,9): note: in implicit initialization of field 'Offset' with omitted
initializer
APInt Offset;
^
llvm-svn: 312326
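The diagnostic above can be reproduced in miniature. In the sketch below, `ExplicitInt` stands in for `APInt` and `Atom` is a hypothetical stand-in for the struct holding the `Offset` field. It shows one possible way to make `return {};` well-formed, which is not necessarily the fix that was applied.

```cpp
#include <cstdint>

// Stand-in for llvm::APInt: the default constructor is explicit.
struct ExplicitInt {
  explicit ExplicitInt() : Value(0) {}
  uint64_t Value;
};

// Hypothetical stand-in for the struct that holds the `Offset` field.
struct Atom {
  // A user-provided default constructor makes Atom a non-aggregate, so
  // `return {};` value-initializes through this constructor instead of
  // copy-initializing `Offset` from an empty initializer list.
  Atom() : Base(nullptr), Offset() {}
  const void *Base;
  ExplicitInt Offset;
};

inline Atom makeEmptyAtom() {
  return {}; // accepted here; rejected by Clang if Atom were an aggregate
}
```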
|
| |
|
|
|
|
|
|
|
|
| |
turns chains of integer
Add missing header.
This reverts commit 86dd6335cf7607af22f383a9a8e072ba929848cf.
llvm-svn: 312322
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This patch provides such debug information for integer
variables whose type is shrunk to bool by providing a
DWARF expression which returns either the constant initial
value or the other value.
Patch by Nikola Prica.
Differential Revision: https://reviews.llvm.org/D35994
llvm-svn: 312318
|
| |
|
|
|
|
|
|
|
|
| |
of integer"
Break build
This reverts commit d07ab866f7f88f81e49046d691a80dcd32d7198b.
llvm-svn: 312317
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
comparisons into memcmp.
Thanks to recent improvements in the LLVM codegen, the memcmp is typically
inlined as a chain of efficient hardware comparisons.
This typically benefits C++ member or nonmember operator==().
For now this is disabled by default until:
- https://bugs.llvm.org/show_bug.cgi?id=33329 is complete
- Benchmarks show that this is always useful.
Differential Revision:
https://reviews.llvm.org/D33987
llvm-svn: 312315
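The kind of source this targets can be sketched as follows. Both functions compute the same result when the compared fields are contiguous with no padding, which the `static_assert` checks for this hypothetical `Pair` type. This only models the idea and is not taken from the pass.

```cpp
#include <cstdint>
#include <cstring>

// Two adjacent integer fields with no padding between them.
struct Pair {
  int32_t First;
  int32_t Second;
};
static_assert(sizeof(Pair) == 2 * sizeof(int32_t), "no padding expected");

// Source-level form: a chain of integer comparisons, as a typical
// operator==() would be written.
inline bool equalChain(const Pair &A, const Pair &B) {
  return A.First == B.First && A.Second == B.Second;
}

// Merged form: a single memcmp over the contiguous bytes.
inline bool equalMemcmp(const Pair &A, const Pair &B) {
  return std::memcmp(&A, &B, sizeof(Pair)) == 0;
}
```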
|
| |
|
|
|
|
| |
warnings; other minor fixes. Also affected in files (NFC).
llvm-svn: 312289
|