| Commit message | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
| |
This was discussed as part of D62880. The basic thought is that computing the BE taken count after widening should produce (on average) an equally good backedge taken count as the one before widening. Since there's only one test in the suite which is impacted by this change, and it's essentially equivalent codegen, that seems to be a reasonable assertion. This change was separated from r362971 so that if this turns out to be problematic, the triggering piece is obvious and easily revertible.
For the nestedIV example from elim-extend.ll, we end up with the following BE counts:
BEFORE: (-2 + (-1 * %innercount) + %limit)
AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>)
Note that the BEFORE count has type i32, and the AFTER count has type i64. Truncating the i64 value produces the i32 value.
llvm-svn: 362975
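To make the truncation claim concrete, here is a minimal sketch that evaluates both expressions for the nestedIV example with plain integers. The function names are hypothetical and this is not SCEV code; unsigned arithmetic stands in for i32/i64 wraparound.

```cpp
#include <cassert>
#include <cstdint>

// BEFORE: (-2 + (-1 * %innercount) + %limit), evaluated as i32.
// Unsigned arithmetic is used so that wraparound is well defined in C++.
uint32_t beCountBefore(int32_t innercount, int32_t limit) {
  uint32_t ic = static_cast<uint32_t>(innercount);
  uint32_t lim = static_cast<uint32_t>(limit);
  return static_cast<uint32_t>(-2) + (static_cast<uint32_t>(-1) * ic) + lim;
}

// AFTER: (-1 + (sext i32 (-1 + %limit) to i64)
//            + (-1 * (sext i32 %innercount to i64))<nsw>), evaluated as i64.
uint64_t beCountAfter(int32_t innercount, int32_t limit) {
  // (-1 + %limit) is still an i32 operation, so it wraps at 32 bits.
  uint32_t limMinus1 = static_cast<uint32_t>(limit) - 1u;
  // sext i32 ... to i64
  int64_t sextLim = static_cast<int64_t>(static_cast<int32_t>(limMinus1));
  int64_t sextIc = static_cast<int64_t>(innercount);
  // Negating a sign-extended i32 cannot overflow in i64 (hence <nsw>).
  return static_cast<uint64_t>(-1) + static_cast<uint64_t>(sextLim) +
         static_cast<uint64_t>(-sextIc);
}

// trunc i64 ... to i32
uint32_t truncToI32(uint64_t v) { return static_cast<uint32_t>(v); }
```

Both forms agree modulo 2^32 for any inputs, which is all the truncation statement requires; the i64 form simply carries extra sign-extension work.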
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change does the plumbing to wire an ExitingBB parameter through the LFTR implementation, and reorganizes the code to work in terms of a set of individual loop exits. Most of it is fairly obvious, but there's one key complexity which makes it worthy of consideration. The actual multi-exit LFTR patch is in D62625 for context.
Specifically, it turns out the existing code uses the backedge taken count from before an IV is widened. Oddly, we can end up with a different (more expensive, but semantically equivalent) BE count for the loop when re-querying after widening. For the nestedIV example from elim-extend, we end up with the following BE counts:
BEFORE: (-2 + (-1 * %innercount) + %limit)
AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>)
This is the only test in tree which seems sensitive to this difference. The result of using the wider BETC on this example is that we actually produce slightly better code. :)
In review, we decided to accept that test change. This patch is structured to preserve the old behavior, but a separate change will immediately follow with the behavior change. (I wanted it separate for problem attribution purposes.)
Differential Revision: https://reviews.llvm.org/D62880
llvm-svn: 362971
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Move some code around, in preparation for later fixes
to the non-integral addrspace handling (D59661)
Patch By Jameson Nash <jameson@juliacomputing.com>
Reviewed By: reames, loladiro
Differential Revision: https://reviews.llvm.org/D59729
llvm-svn: 362853
|
| |
|
|
|
|
| |
This is a really silly bug that even a simple test w/an unconditional latch would have caught. I tried to guard against the case, but put it in the wrong if check. Oops.
llvm-svn: 362727
|
| |
|
|
|
|
|
|
|
|
| |
The AllConstant check needs to be moved out of the if/else if chain to
avoid a test regression. The "there is no SimplifyZExt" comment
puzzles me, since there is SimplifyCastInst. Additionally, the
Simplify* calls seem to not see the operand as constant, so this needs
to be tried if the simplify failed.
llvm-svn: 362653
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
handling for SwitchInst"
This reverts commit 5b32f60ec31ce136edac6f693538aeb6039f4ad0.
The fix is in commit 4f9e68148bd0dada2d6997625432385918ac2e2c.
This patch fixes the CorrelatedValuePropagation pass to keep
prof branch_weights metadata of SwitchInst consistent.
It makes use of SwitchInstProfUpdateWrapper.
New tests are added.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D62126
llvm-svn: 362583
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D62858
llvm-svn: 362558
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D62819
llvm-svn: 362449
|
| |
|
|
|
|
| |
(And remember to actually mark the first one static.)
llvm-svn: 362415
|
| |
|
|
| |
llvm-svn: 362411
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix for https://bugs.llvm.org/show_bug.cgi?id=31181 and partial fix
for LFTR poison handling issues in general.
When LFTR moves a condition from pre-inc to post-inc, it may now
depend on value that is poison due to nowrap flags. To avoid this,
we clear any nowrap flag that SCEV cannot prove for the post-inc
addrec.
Additionally, LFTR may switch to a different IV that is dynamically
dead and as such may be arbitrarily poison. This patch will correct
nowrap flags in some but not all cases where this happens. This is
related to the adoption of IR nowrap flags for the pre-inc addrec.
(See some of the switch_to_different_iv tests, where flags are not
dropped or insufficiently dropped.)
Finally, there are likely similar issues with the handling of GEP
inbounds, but we don't have a test case for this yet.
Differential Revision: https://reviews.llvm.org/D60935
llvm-svn: 362292
|
| |
|
|
| |
llvm-svn: 362285
|
| |
|
|
| |
llvm-svn: 362284
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At the moment, LoopPredication completely bails out if it sees a latch of the form:
%cmp = icmp ne %iv, %N
br i1 %cmp, label %loop, label %exit
OR
%cmp = icmp ne %iv.next, %NPlus1
br i1 %cmp, label %loop, label %exit
This is unfortunate since this is exactly the form that LFTR likes to produce. So, go ahead and recognize simple cases where we can.
For pre-increment loops, we leverage the fact that LFTR likes canonical counters (i.e. those starting at zero) and a (presumed) range fact on RHS to discharge the check trivially.
For post-increment forms, the key insight is in remembering that LFTR had to insert a (N+1) for the RHS. CVP can hopefully prove that add nsw/nuw (if there's appropriate range on N to start with). This leaves us both with the post-inc IV and the RHS involving an nsw/nuw add, and SCEV can discharge that with no problem.
This does still need to be extended to handle non-one steps, or other harder patterns of variable (but range restricted) starting values. That'll come later.
Differential Revision: https://reviews.llvm.org/D62748
llvm-svn: 362282
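As a rough illustration of why the pre-increment `icmp ne` latch is harmless for a canonical counter, the sketch below counts body executions for a `ne` latch and for an unsigned `<` latch. It assumes an IV starting at zero with step one and treats the values as unsigned; the function names are hypothetical and this is not LoopPredication code.

```cpp
#include <cassert>
#include <cstdint>

// Latch of the form:  %cmp = icmp ne %iv, %N ; br i1 %cmp, %loop, %exit
// with a canonical IV (starts at 0, step 1).
uint64_t bodyRunsWithNeLatch(uint32_t n) {
  uint64_t runs = 0;
  uint32_t iv = 0;
  bool keepGoing;
  do {
    ++runs;                // loop body
    keepGoing = (iv != n); // pre-increment compare
    ++iv;
  } while (keepGoing);
  return runs;
}

// Same loop, but with the latch expressed as an unsigned range check.
uint64_t bodyRunsWithUltLatch(uint32_t n) {
  uint64_t runs = 0;
  uint32_t iv = 0;
  bool keepGoing;
  do {
    ++runs;               // loop body
    keepGoing = (iv < n); // icmp ult %iv, %N
    ++iv;
  } while (keepGoing);
  return runs;
}
```

Under these assumptions both loops run the body N+1 times, so the equality latch carries the same range information as the inequality form.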
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
If we can determine that a saturating add/sub will not overflow based
on range analysis, convert it into a simple binary operation. This is
a sibling transform to the existing with.overflow handling.
Reapplying this with an additional check that the saturating intrinsic
has integer type, as LVI currently does not support vector types.
Differential Revision: https://reviews.llvm.org/D62703
llvm-svn: 362263
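A minimal sketch of the idea on 8-bit unsigned values: if the known upper bounds of both operands cannot sum past the type maximum, the saturating add and the plain add agree. The helper names and the use of simple inclusive upper bounds are assumptions for illustration; the pass itself relies on LVI ranges.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Reference semantics of an unsigned saturating add on 8-bit values.
uint8_t uaddSat(uint8_t a, uint8_t b) {
  unsigned sum = static_cast<unsigned>(a) + static_cast<unsigned>(b);
  return sum > 255u ? static_cast<uint8_t>(255) : static_cast<uint8_t>(sum);
}

// Plain (wrapping) add, i.e. the simple binary operation.
uint8_t plainAdd(uint8_t a, uint8_t b) {
  return static_cast<uint8_t>(static_cast<unsigned>(a) + b);
}

// Hypothetical stand-in for the range query: operands are known to lie in
// [0, maxA] and [0, maxB] respectively.
bool rangeProvesNoOverflow(uint8_t maxA, uint8_t maxB) {
  return static_cast<unsigned>(maxA) + maxB <=
         std::numeric_limits<uint8_t>::max();
}

// Exhaustively check that both operations agree on the given ranges.
bool agreeOnRange(uint8_t maxA, uint8_t maxB) {
  for (unsigned a = 0; a <= maxA; ++a)
    for (unsigned b = 0; b <= maxB; ++b)
      if (uaddSat(static_cast<uint8_t>(a), static_cast<uint8_t>(b)) !=
          plainAdd(static_cast<uint8_t>(a), static_cast<uint8_t>(b)))
        return false;
  return true;
}
```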
|
| |
|
|
|
|
|
|
| |
Noticed on D62703. LVI only handles plain integers, not vectors of
integers. This was previously not an issue, because vector support
for with.overflow is only a relatively recent addition.
llvm-svn: 362261
|
| |
|
|
|
|
|
|
| |
This reverts commit 1e692d1777ae34dcb93524b5798651a29defae09.
Causes assertion failure in builtins-wasm.c clang test.
llvm-svn: 362254
|
| |
|
|
|
|
|
|
|
|
| |
If we can determine that a saturating add/sub will not overflow
based on range analysis, convert it into a simple binary operation.
This is a sibling transform to the existing with.overflow handling.
Differential Revision: https://reviews.llvm.org/D62703
llvm-svn: 362242
|
| |
|
|
|
|
|
| |
Change argument from WithOverflowInst to BinaryOpIntrinsic, so this
function can also be used for saturating math intrinsics.
llvm-svn: 362152
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
I'm adding ORE to memset/memcpy formation, with tests,
but mainly this is split off from D61144.
Reviewers: reames, anemet, thegameg, craig.topper
Reviewed By: thegameg
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62631
llvm-svn: 362092
|
| |
|
|
|
|
| |
Split off from D61144
llvm-svn: 362091
|
| |
|
|
| |
llvm-svn: 362031
|
| |
|
|
|
|
|
|
| |
runOnNoncountableLoop()
Split off from D61144
llvm-svn: 362022
|
| |
|
|
| |
llvm-svn: 361990
|
| |
|
|
| |
llvm-svn: 361957
|
| |
|
|
|
|
|
|
|
|
|
|
| |
handling for SwitchInst"
This reverts commit 53f2f3286572cb879b3861d7c15480e4d830dd3b.
As reported on D62126, this causes assertion failures if the switch
has incorrect branch_weights metadata, which may happen as a result
of other transforms not handling it correctly yet.
llvm-svn: 361881
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
SwitchInst
This patch fixes the CorrelatedValuePropagation pass to keep
prof branch_weights metadata of SwitchInst consistent.
It makes use of SwitchInstProfUpdateWrapper.
New tests are added.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D62126
llvm-svn: 361808
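The consistency requirement can be sketched with a toy model in which every case removal also removes the matching weight. The struct below is hypothetical and only mirrors the layout of branch_weights (default destination first, then one weight per case); it is not the SwitchInstProfUpdateWrapper API, and it erases in order rather than reproducing how SwitchInst reorders cases.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a switch with profile weights (hypothetical type).
struct SwitchSketch {
  std::vector<int> caseValues;
  // weights[0] belongs to the default destination,
  // weights[i + 1] belongs to caseValues[i].
  std::vector<uint32_t> weights;

  // Remove a case together with its weight so the two stay in sync.
  void removeCase(std::size_t idx) {
    assert(idx < caseValues.size() && "case index out of range");
    caseValues.erase(caseValues.begin() + static_cast<std::ptrdiff_t>(idx));
    weights.erase(weights.begin() + static_cast<std::ptrdiff_t>(idx + 1));
  }

  // The metadata is consistent when there is one weight per successor.
  bool weightsConsistent() const {
    return weights.size() == caseValues.size() + 1;
  }
};
```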
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code to preserve LCSSA PHIs currently only properly supports
reduction PHIs and PHIs for values defined outside the latches.
This patch improves the LCSSA PHI handling to cover PHIs for values
defined in the latches.
Fixes PR41725.
Reviewers: efriedma, mcrosier, davide, jdoerfert
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D61576
llvm-svn: 361743
|
| |
|
|
|
|
|
|
|
|
| |
The guaranteed no-wrap region is never empty, it always contains at
least zero, so these optimizations don't ever apply.
To make this more obviously true, replace the conservative return
in makeGNWR with an assertion.
llvm-svn: 361698
|
| |
|
|
|
|
|
|
|
| |
Just a minor refactoring to use the new helper method
DataLayout::typeSizeEqualsStoreSize(). This is done when
checking if getTypeSizeInBits is equal/non-equal to
getTypeStoreSizeInBits.
llvm-svn: 361613
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change relaxes the checks for hasOnlyUniformBranches such that our
region is uniform if:
1. All conditional branches that are direct children are uniform.
2. And either:
a. All sub-regions are uniform.
b. There is at most one conditional branch among the direct
children.
Differential Revision: https://reviews.llvm.org/D62198
llvm-svn: 361610
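The relaxed rule can be written down as a small predicate. This is only a sketch of the boolean condition described above; the summary struct and its field names are hypothetical and do not correspond to the pass's data structures.

```cpp
// Hypothetical summary of a region for the purpose of this sketch.
struct RegionSummary {
  bool directCondBranchesUniform; // rule 1: all direct-child branches uniform
  bool subRegionsUniform;         // rule 2a: all sub-regions uniform
  unsigned numDirectCondBranches; // input to rule 2b
};

// Region is uniform if rule 1 holds and either 2a or 2b holds.
bool hasOnlyUniformBranchesSketch(const RegionSummary &R) {
  if (!R.directCondBranchesUniform)
    return false;
  return R.subRegionsUniform || R.numDirectCondBranches <= 1;
}
```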
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The DeadStoreElimination pass now skips
PartialStoreMerging when stores overlap according to
OW_PartialEarlierWithFullLater and at least one of
the stores has a store size that differs
from the size of the type being stored.
This solves problems seen in
https://bugs.llvm.org/show_bug.cgi?id=41949
for which we in the past could end up with
mis-compiles or assertions.
The content and location of the padding bits is not
formally described (or is undefined) in the LangRef
at the moment. So the solution is based on the fact
that we cannot assume anything about the padding bits
when a store clobbers more memory than
indicated by the type of the value that is stored
(such as storing an i6 using an 8-bit store instruction).
Fixes: https://bugs.llvm.org/show_bug.cgi?id=41949
Reviewers: spatel, efriedma, fhahn
Reviewed By: efriedma
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62250
llvm-svn: 361605
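The size condition can be sketched as follows, assuming the store size is the type size rounded up to whole bytes. The helper names are hypothetical stand-ins written for this note, not the DataLayout or DSE implementation.

```cpp
#include <cassert>
#include <cstdint>

// Store size in bits: the type size rounded up to a whole number of bytes
// (assumption for this sketch).
uint64_t storeSizeInBits(uint64_t typeSizeInBits) {
  return (typeSizeInBits + 7) / 8 * 8;
}

// True when a store of this type writes no padding bits.
bool typeSizeEqualsStoreSizeSketch(uint64_t typeSizeInBits) {
  return storeSizeInBits(typeSizeInBits) == typeSizeInBits;
}

// Merging is only attempted when neither store involves padding bits.
bool mayMergePartialStores(uint64_t earlierTypeBits, uint64_t laterTypeBits) {
  return typeSizeEqualsStoreSizeSketch(earlierTypeBits) &&
         typeSizeEqualsStoreSizeSketch(laterTypeBits);
}
```

For an i6, the store size is 8 bits while the type size is 6, so the two extra bits are padding and the merge is skipped.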
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Mirror tuning option from old pass manager in new pass manager.
Reviewers: chandlerc
Subscribers: mehdi_amini, jlebar, zzheng, dmgreen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61612
llvm-svn: 361560
|
| |
|
|
|
|
|
|
| |
bounds, step, induction variable, and guard branch.
This reverts r361517 (git commit 2049e4dd8f61100f88f14db33bd95d197bcbfbbc)
llvm-svn: 361553
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
step, induction variable, and guard branch.
Summary:
This PR extends the loop object with more utilities to get loop bounds, step, induction variable, and guard branch. Some passes already try to obtain the loop induction variable on their own, e.g. loop interchange. It would be useful to have a common place to get this information. Moreover, loop fusion (https://reviews.llvm.org/D55851) is planning to use getGuard() to extend the kinds of loops it is able to fuse, e.g. a rotated loop with a non-constant upper bound, which would have a loop guard.
/// Example:
/// for (int i = lb; i < ub; i+=step)
/// <loop body>
/// --- pseudo LLVMIR ---
/// beforeloop:
/// guardcmp = (lb < ub)
/// if (guardcmp) goto preheader; else goto afterloop
/// preheader:
/// loop:
/// i1 = phi[{lb, preheader}, {i2, latch}]
/// <loop body>
/// i2 = i1 + step
/// latch:
/// cmp = (i2 < ub)
/// if (cmp) goto loop
/// exit:
/// afterloop:
///
/// getBounds
/// getInitialIVValue --> lb
/// getStepInst --> i2 = i1 + step
/// getStepValue --> step
/// getFinalIVValue --> ub
/// getCanonicalPredicate --> '<'
/// getDirection --> Increasing
/// getGuard --> if (guardcmp) goto loop; else goto afterloop
/// getInductionVariable --> i1
/// getAuxiliaryInductionVariable --> {i1}
/// isCanonical --> false
Committed on behalf of @Whitney (Whitney Tsang).
Reviewers: kbarton, hfinkel, dmgreen, Meinersbur, jdoerfert, syzaara, fhahn
Reviewed By: kbarton
Subscribers: tvvikram, bmahjour, etiotto, fhahn, jsji, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60565
llvm-svn: 361517
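To show why a rotated loop needs the guard, the sketch below runs the source-level loop and a hand-rotated version that follows the pseudo IR above, and records the IV values seen by the body. It assumes a positive step; the function names are hypothetical and the code does not use the Loop API.

```cpp
#include <cassert>
#include <vector>

// Source form: for (int i = lb; i < ub; i += step) <loop body>
std::vector<int> sourceLoop(int lb, int ub, int step) {
  std::vector<int> visited;
  for (int i = lb; i < ub; i += step)
    visited.push_back(i);
  return visited;
}

// Rotated form, following the pseudo IR: guard, then a do-while loop whose
// latch compares the incremented value against ub.
std::vector<int> rotatedLoop(int lb, int ub, int step) {
  std::vector<int> visited;
  bool guardcmp = lb < ub;    // beforeloop
  if (!guardcmp)
    return visited;           // afterloop
  int i1 = lb;                // phi, incoming value from preheader
  bool cmp;
  do {
    visited.push_back(i1);    // loop body
    int i2 = i1 + step;       // step instruction
    cmp = i2 < ub;            // latch compare
    i1 = i2;                  // phi, incoming value from latch
  } while (cmp);
  return visited;
}
```

Without the guard, the rotated form would run the body once even when lb >= ub, which is the case the guard branch filters out.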
|
| |
|
|
|
|
|
|
|
|
| |
`fadd` and `fsub` have recently (r351850) been added as `atomicrmw`
operations. This diff adds lowering cases for them to the LowerAtomic
transform.
Patch by Josh Berdine!
llvm-svn: 361512
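In spirit, lowering an `atomicrmw` to non-atomic code means a load, the arithmetic operation, and a store, with the instruction's result being the original value. A minimal sketch on plain floats, with hypothetical function names and no atomicity:

```cpp
#include <cassert>

// Non-atomic equivalent of `atomicrmw fadd`: returns the old value.
float loweredAtomicRMWFAdd(float *ptr, float val) {
  float orig = *ptr;  // load
  *ptr = orig + val;  // fadd + store
  return orig;        // atomicrmw yields the value before the update
}

// Non-atomic equivalent of `atomicrmw fsub`: returns the old value.
float loweredAtomicRMWFSub(float *ptr, float val) {
  float orig = *ptr;  // load
  *ptr = orig - val;  // fsub + store
  return orig;
}
```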
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: gchatelet, spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62287
llvm-svn: 361490
|
| |
|
|
| |
llvm-svn: 361366
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Because the sort order was not strongly stable on the RHS, whether the
chain could merge would depend on the order of the blocks in the Phi.
EXPENSIVE_CHECKS would shuffle the blocks before sorting, resulting in
non-deterministic merging.
Reviewers: gchatelet
Subscribers: hiraditya, llvm-commits, RKSimon
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62193
llvm-svn: 361281
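The general point can be sketched with pairs: if the comparator breaks ties on the second key, the sorted result no longer depends on the input order, so a pre-sort shuffle cannot change it. The `Cmp` alias and function name are hypothetical and stand in for the real comparison blocks.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// (lhs key, rhs key) for one comparison; hypothetical representation.
using Cmp = std::pair<int, int>;

// Sort by lhs, breaking ties on rhs, so that the order is total and the
// result is independent of the input permutation.
std::vector<Cmp> sortByLhsThenRhs(std::vector<Cmp> v) {
  std::sort(v.begin(), v.end(), [](const Cmp &a, const Cmp &b) {
    if (a.first != b.first)
      return a.first < b.first;
    return a.second < b.second;
  });
  return v;
}
```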
|
| |
|
|
|
|
| |
Broke some bots.
llvm-svn: 361263
|
| |
|
|
|
|
|
|
| |
Also handle self-move. This is required so that llvm::sort can work
with EXPENSIVE_CHECKS, as it will do a random shuffle of the input
which can result in self-moves.
llvm-svn: 361257
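A self-move guard in a move assignment operator typically looks like the sketch below. `Chain` is a hypothetical type invented for this note, not the class touched by the commit.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical movable type that owns a list of block ids.
struct Chain {
  std::vector<int> blocks;

  Chain() = default;
  explicit Chain(std::vector<int> b) : blocks(std::move(b)) {}
  Chain(Chain &&) = default;

  Chain &operator=(Chain &&other) {
    // Guard against self-move: without it, the object could be left in a
    // moved-from state after `x = std::move(x)`.
    if (this != &other) {
      blocks = std::move(other.blocks);
      other.blocks.clear();
    }
    return *this;
  }
};
```

A shuffle that swaps an element with itself can trigger exactly this kind of assignment, which is why the guard matters when the input is shuffled before sorting.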
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: In preparation for D60318 .
Reviewers: gchatelet, efriedma
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62068
llvm-svn: 361239
|
| |
|
|
| |
llvm-svn: 361103
|
| |
|
|
| |
llvm-svn: 361024
|
| |
|
|
|
|
|
|
|
|
| |
With a fix for PR41917: The predecessor list was changing under our feet.
- for (BasicBlock *Pred : predecessors(EntryBlock_)) {
+ while (!pred_empty(EntryBlock_)) {
+ BasicBlock* const Pred = *pred_begin(EntryBlock_);
llvm-svn: 361009
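The shape of the fix can be sketched on a toy graph: instead of iterating over a container that each step mutates, keep taking the first remaining element until the container is empty. The `Graph` struct and its members are hypothetical and only model the predecessor list of an entry block.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model: predecessors of the old entry block get redirected to a new
// entry block, which removes them from the old predecessor list.
struct Graph {
  std::vector<int> predsOfEntry;
  std::vector<int> predsOfNewEntry;

  void redirect(int pred) {
    predsOfEntry.erase(
        std::remove(predsOfEntry.begin(), predsOfEntry.end(), pred),
        predsOfEntry.end());
    predsOfNewEntry.push_back(pred);
  }
};

// Safe pattern: re-query the list on every iteration instead of holding
// iterators across a mutation.
void redirectAllPreds(Graph &g) {
  while (!g.predsOfEntry.empty()) {
    int pred = g.predsOfEntry.front();
    g.redirect(pred);
  }
}
```

A range-based for loop over `predsOfEntry` would hold iterators that `redirect` invalidates, which is the analogue of the predecessor list changing underneath the original loop.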
|
| |
|
|
| |
llvm-svn: 360978
|
| |
|
|
|
|
| |
Using dominance vs a set membership check is indistinguishable from a compile time perspective, and the two queries return equivalent results. Simplify code by using the existing function.
llvm-svn: 360976
|
| |
|
|
| |
llvm-svn: 360972
|
| |
|
|
|
|
| |
I'm slowly wrapping my head around this code, and am making comment improvements where I can.
llvm-svn: 360968
|
| |
|
|
|
|
| |
It caused PR41917.
llvm-svn: 360963
|