| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
| |
This will later hold more general logic to parse conjunctions of range
checks.
llvm-svn: 270802
|
| |
|
|
| |
llvm-svn: 270798
|
| |
|
|
|
|
|
| |
While here, convert the logic of the pass to use static function(s).
This is in preparation for porting this pass to the new PM.
llvm-svn: 270734
|
| |
|
|
|
|
| |
long time.
llvm-svn: 270677
|
| |
|
|
| |
llvm-svn: 270647
|
| |
|
|
|
|
|
|
| |
more time.
This reverts commit r270577.
llvm-svn: 270630
|
| |
|
|
|
|
| |
Make `GuardWideningImpl::RangeCheck` into a class and add accessors.
llvm-svn: 270611
|
| |
|
|
|
|
|
| |
This is better layering, since the caller needs to check if the index
was an add-rec anyway.
llvm-svn: 270582
|
| |
|
|
|
|
|
|
| |
analysis by default.
Chromium builds are still hitting the assert in PR27874.
llvm-svn: 270577
|
| |
|
|
|
|
|
|
|
| |
default.""
This reverts commit r270512 and reapplies r270478. Originally it caused
PR27847, but it was fixed in r270517.
llvm-svn: 270518
|
| |
|
|
|
|
| |
This caused PR27847.
llvm-svn: 270512
|
| |
|
|
|
|
|
|
|
|
| |
This changes IRCE to optimize uses, and not branches. This change is
NFCI since the uses we do inspect are in practice only ever going to be
the condition use in conditional branches; but this flexibility will
later allow us to analyze more complex expressions than just a direct
branch on a range check.
llvm-svn: 270500
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch turns on LoopUnrollAnalyzer by default. To mitigate compile
time regressions, I chose very conservative thresholds for now. Later we
can make them more aggressive, but it might require being smarter in
which loops we're optimizing. E.g. currently the biggest issue is that
with more agressive thresholds we unroll many cold loops, which
increases compile time for no performance benefit (performance of those
loops is improved, but it doesn't matter since they are cold).
Test results for compile time(using 4 samples to reduce noise):
```
MultiSource/Benchmarks/VersaBench/ecbdes/ecbdes 5.19%
SingleSource/Benchmarks/Polybench/medley/reg_detect/reg_detect 4.19%
MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow 3.39%
MultiSource/Applications/JM/lencod/lencod 1.47%
MultiSource/Benchmarks/Fhourstones-3_1/fhourstones3_1 -6.06%
```
I didn't see any performance changes in the testsuite, but it improves
some internal tests.
Reviewers: hfinkel, chandlerc
Subscribers: llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D20482
llvm-svn: 270478
|
| |
|
|
|
|
|
|
| |
The InductiveRangeCheck struct is only five words long; so passing these
around value is fine. The allocator makes the code look more complex
than it is.
llvm-svn: 270309
|
| |
|
|
| |
llvm-svn: 270308
|
| |
|
|
|
|
|
|
|
|
|
| |
I had used `std::remove_if` under the assumption that it moves the
predicate matching elements to the end, but actaully the elements
remaining towards the end (after the iterator returned by
`std::remove_if`) are indeterminate. Fix the bug (and make the code
more straightforward) by using a temporary SmallVector, and add a test
case demonstrating the issue.
llvm-svn: 270306
|
| |
|
|
|
|
| |
Inline getAnalysisUsage() while I'm here.
llvm-svn: 270231
|
| |
|
|
| |
llvm-svn: 270228
|
| |
|
|
|
|
| |
porting this pass to the new PM.
llvm-svn: 270225
|
| |
|
|
| |
llvm-svn: 270155
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Sequences of range checks expressed using guards, like
guard((I - 2) u< L)
guard((I - 1) u< L)
guard((I + 0) u< L)
guard((I + 1) u< L)
guard((I + 2) u< L)
can sometimes be combined into a smaller sequence:
guard((I - 2) u< L AND (I + 2) u< L)
if we can prove that (I - 2) u< L AND (I + 2) u< L implies all of checks
expressed in the previous sequence.
This change teaches GuardWidening to do this kind of merging when
feasible.
llvm-svn: 270151
|
| |
|
|
| |
llvm-svn: 270074
|
| |
|
|
|
|
|
| |
`ConstantRange::getEquivalentICmp` is more general, and better
factored.
llvm-svn: 270019
|
| |
|
|
|
|
|
| |
PredicatePassProbability is a better name for what LikelyBranchWeight
was trying to express.
llvm-svn: 269999
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Implement guard widening in LLVM. Description from GuardWidening.cpp:
The semantics of the `@llvm.experimental.guard` intrinsic lets LLVM
transform it so that it fails more often that it did before the
transform. This optimization is called "widening" and can be used hoist
and common runtime checks in situations like these:
```
%cmp0 = 7 u< Length
call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ]
call @unknown_side_effects()
%cmp1 = 9 u< Length
call @llvm.experimental.guard(i1 %cmp1) [ "deopt"(...) ]
...
```
to
```
%cmp0 = 9 u< Length
call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ]
call @unknown_side_effects()
...
```
If `%cmp0` is false, `@llvm.experimental.guard` will "deoptimize" back
to a generic implementation of the same function, which will have the
correct semantics from that point onward. It is always _legal_ to
deoptimize (so replacing `%cmp0` with false is "correct"), though it may
not always be profitable to do so.
NB! This pass is a work in progress. It hasn't been tuned to be
"production ready" yet. It is known to have quadriatic running time and
will not scale to large numbers of guards
Reviewers: reames, atrick, bogner, apilipenko, nlewycky
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D20143
llvm-svn: 269997
|
| |
|
|
|
|
|
|
|
| |
branches, along with their operands.
Previously, we didn't add their and their operands cost, which could've
resulted in unrolling loops for no actual benefit.
llvm-svn: 269985
|
| |
|
|
| |
llvm-svn: 269937
|
| |
|
|
|
|
| |
Patch by JakeVanAdrighem. Thanks!
llvm-svn: 269847
|
| |
|
|
|
|
|
| |
Guards are expected to basically never fail. Reflect this in the branch
probabilities in their lowered form.
llvm-svn: 269791
|
| |
|
|
|
|
|
|
|
|
| |
This is assertion is no longer necessary since we never record
constants in the live set anyway. (They are never recorded in
the initial live set, and constant bases are removed near line 2119)
Differential Revision: http://reviews.llvm.org/D20293
llvm-svn: 269764
|
| |
|
|
| |
llvm-svn: 269624
|
| |
|
|
| |
llvm-svn: 269591
|
| |
|
|
|
|
|
|
|
|
|
|
| |
TargetLibraryInfoWrapperPass is a dependency of
SCCP but it's not listed as such. Chandler pointed
out this is an easy mistake to make which only
surfaces in weird crashes with some flag combinations.
This code will go away anyway at some point in the
future, but as long as it's (still) exercised, try
to make it correct.
llvm-svn: 269589
|
| |
|
|
| |
llvm-svn: 269578
|
| |
|
|
| |
llvm-svn: 269511
|
| |
|
|
|
|
|
|
|
|
| |
increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the...""
This reverts commit r269395.
Try to reapply with a fix from chapuni.
llvm-svn: 269486
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Rename DataLayout::getLargestLegalIntTypeSize to DataLayout::getLargestLegalIntTypeSizeInBits() to prevent similar mistakes fixed in r269433.
Reviewers: joker.eph, mcrosier
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D20248
llvm-svn: 269456
|
| |
|
|
| |
llvm-svn: 269445
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: This change fix the bug in isProfitableToUseMemset() where MaxIntSize shoule be in byte, not bit.
Reviewers: arsenm, joker.eph, mcrosier
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D20176
llvm-svn: 269433
|
| |
|
|
|
|
|
|
|
|
|
| |
tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..."
This reverts commit r269388.
It caused some bots to fail, I'm reverting it until I investigate the
issue.
llvm-svn: 269395
|
| |
|
|
|
|
|
| |
This should fix some compile-time regressions after r267672. Thanks to
Chris Matthews for bisecting it.
llvm-svn: 269392
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the...
Summary:
...loop after the last iteration.
This is really hard to do correctly. The core problem is that we need to
model liveness through the induction PHIs from iteration to iteration in
order to get the correct results, and we need to correctly de-duplicate
the common subgraphs of instructions feeding some subset of the
induction PHIs. All of this can be driven either from a side effect at
some iteration or from the loop values used after the loop finishes.
This patch implements this by storing the forward-propagating analysis
of each instruction in a cache to recall whether it was free and whether
it has become live and thus counted toward the total unroll cost. Then,
at each sink for a value in the loop, we recursively walk back through
every value that feeds the sink, including looping back through the
iterations as needed, until we have marked the entire input graph as
live. Because we cache this, we never visit instructions more than twice
-- once when we analyze them and put them into the cache, and once when
we count their cost towards the unrolled loop. Also, because the cache
is only two bits and because we are dealing with relatively small
iteration counts, we can store all of this very densely in memory to
avoid this from becoming an excessively slow analysis.
The code here is still pretty gross. I would appreciate suggestions
about better ways to factor or split this up, I've stared too long at
the algorithmic side to really have a good sense of what the design
should probably look at.
Also, it might seem like we should do all of this bottom-up, but I think
that is a red herring. Specifically, the simplification power is *much*
greater working top-down. We can forward propagate very effectively,
even across strange and interesting recurrances around the backedge.
Because we use data to propagate, this doesn't cause a state space
explosion. Doing this level of constant folding, etc, would be very
expensive to do bottom-up because it wouldn't be until the last moment
that you could collapse everything. The current solution is essentially
a top-down simplification with a bottom-up cost accounting which seems
to get the best of both worlds. It makes the simplification incremental
and powerful while leaving everything dead until we *know* it is needed.
Finally, a core property of this approach is its *monotonicity*. At all
times, the current UnrolledCost is a conservatively low estimate. This
ensures that we will never early-exit from the analysis due to exceeding
a threshold when if we had continued, the cost would have gone back
below the threshold. These kinds of bugs can cause incredibly hard to
track down random changes to behavior.
We could use a techinque similar (but much simpler) within the inliner
as well to avoid considering speculated code in the inline cost.
Reviewers: chandlerc
Subscribers: sanjoy, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D11758
llvm-svn: 269388
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Ported DA to the new PM by splitting the former DependenceAnalysis Pass
into a DependenceInfo result type and DependenceAnalysisWrapperPass type
and adding a new PM-style DependenceAnalysis analysis pass returning the
DependenceInfo.
Patch by Philip Pfaffe, most of the review by Justin.
Differential Revision: http://reviews.llvm.org/D18834
llvm-svn: 269370
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D20025
llvm-svn: 269322
|
| |
|
|
|
|
|
|
|
| |
Shifts beyond the bitwidth are undef but SCCP resolved them to zero.
Instead, DTRT and resolve them to undef.
This reimplements the transform which caused PR27712.
llvm-svn: 269269
|
| |
|
|
|
|
|
|
| |
defined."
This reverts commit r269105 as it caused PR27712.
llvm-svn: 269252
|
| |
|
|
|
|
|
| |
This reverts commit r269125. It was in my tree when I ran "git svn dcommit".
It's really still under review.
llvm-svn: 269127
|
| |
|
|
|
|
|
|
| |
Sort of the BB-local equivalent to idiom-recognizer: if we have a basic-block
that really implements a memcpy operation, backends can benefit from seeing
this.
llvm-svn: 269125
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before r268509, Clang would disable the loop unroll pass when optimizing
for size. That commit enabled it to be able to support unroll pragmas
in -Os builds. However, this regressed binary size in one of Chromium's
DLLs with ~100 KB.
This restores the original behaviour of no unrolling at -Os, but doing it
in LLVM instead of Clang makes more sense, and also allows the pragmas to
keep working.
Differential revision: http://reviews.llvm.org/D20115
llvm-svn: 269124
|
| |
|
|
|
|
|
|
|
| |
This patch extend loopreroll to allow the instruction chain
of loop control only IV has sext.
Differential Revision: http://reviews.llvm.org/D19820
llvm-svn: 269121
|