| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
| |
https://bugs.llvm.org/show_bug.cgi?id=42399
https://rise4fun.com/Alive/kBb
https://rise4fun.com/Alive/1SB
llvm-svn: 364430
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Doing better separation of Cost and Threshold.
Cost counts the abstract complexity of live instructions, while Threshold is an upper bound of complexity that inlining is comfortable to pay.
There are two parts:
- huge 15K last-call-to-static bonus is no longer subtracted from Cost
but rather is now added to Threshold.
That makes much more sense, as the cost of inlining (Cost) is not changed by the fact
that internal function is called once. It only changes the likelyhood of this inlining
being profitable (Threshold).
- bonus for calls proved-to-be-inlinable into callee is no longer subtracted from Cost
but added to Threshold instead.
While calculations are somewhat different, overall InlineResult should stay the same since Cost >= Threshold compares the same.
Reviewers: eraman, greened, chandlerc, yrouban, apilipenko
Reviewed By: apilipenko
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60740
llvm-svn: 364422
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
opt pipeline."
Breaks sanitizers:
libFuzzer :: cxxstring.test
libFuzzer :: memcmp.test
libFuzzer :: recommended-dictionary.test
libFuzzer :: strcmp.test
libFuzzer :: value-profile-mem.test
libFuzzer :: value-profile-strcmp.test
llvm-svn: 364416
|
| |
|
|
|
|
|
|
|
| |
This allows later passes (in particular InstCombine) to optimize more
cases.
One that's important to us is `memcmp(p, q, constant) < 0` and memcmp(p, q, constant) > 0.
llvm-svn: 364412
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch generalizes the UnrollLoop utility to support loops that exit
from the header instead of the latch. Usually, LoopRotate would take care
of must of those cases, but in some cases (e.g. -Oz), LoopRotate does
not kick in.
Codesize impact looks relatively neutral on ARM64 with -Oz + LTO.
Program master patch diff
External/S.../CFP2006/447.dealII/447.dealII 629060.00 627676.00 -0.2%
External/SPEC/CINT2000/176.gcc/176.gcc 1245916.00 1244932.00 -0.1%
MultiSourc...Prolangs-C/simulator/simulator 86100.00 86156.00 0.1%
MultiSourc...arks/Rodinia/backprop/backprop 66212.00 66252.00 0.1%
MultiSourc...chmarks/Prolangs-C++/life/life 67276.00 67312.00 0.1%
MultiSourc...s/Prolangs-C/compiler/compiler 69824.00 69788.00 -0.1%
MultiSourc...Prolangs-C/assembler/assembler 86672.00 86696.00 0.0%
Reviewers: efriedma, vsk, paquette
Reviewed By: paquette
Differential Revision: https://reviews.llvm.org/D61962
llvm-svn: 364398
|
| |
|
|
|
|
|
| |
https://bugs.llvm.org/show_bug.cgi?id=42391
https://rise4fun.com/Alive/9E2
llvm-svn: 364393
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-> icmp eq/ne (and %x, (lshr -C1, C2)), 0.
Simplify 'shl' inequality test into 'and' equality test.
This pattern happens in the middle-end while simplifying bitfield access,
Exposed in https://reviews.llvm.org/D63505
https://rise4fun.com/Alive/6uz
Reviewers: lebedev.ri, efriedma
Reviewed By: lebedev.ri
Subscribers: spatel, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63675
llvm-svn: 364348
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This follows up the transform from rL363956 to use the ctpop intrinsic when checking for power-of-2-or-zero.
This is matching the isPowerOf2() patterns used in PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314
But there's at least 1 instcombine follow-up needed to match the alternate form:
(v & (v - 1)) == 0;
We should have all of the backend expansions handled with:
rL364319
(x86-specific changes still needed for optimal code based on subtarget)
And the larger patterns to exclude zero as a power-of-2 are joining with this change after:
rL364153 ( D63660 )
rL364246
Differential Revision: https://reviews.llvm.org/D63777
llvm-svn: 364341
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This provides the low-level support to start using MVE vector types in
LLVM IR, loading and storing them, passing them to __asm__ statements
containing hand-written MVE vector instructions, and *if* you have the
hard-float ABI turned on, using them as function parameters.
(In the soft-float ABI, vector types are passed in integer registers,
and combining all those 32-bit integers into a q-reg requires support
for selection DAG nodes like insert_vector_elt and build_vector which
aren't implemented yet for MVE. In fact I've also had to add
`arm_aapcs_vfpcc` to a couple of existing tests to avoid that
problem.)
Specifically, this commit adds support for:
* spills, reloads and register moves for MVE vector registers
* ditto for the VPT predication mask that lives in VPR.P0
* make all the MVE vector types legal in ISel, and provide selection
DAG patterns for BITCAST, LOAD and STORE
* make loads and stores of scalar FP types conditional on
`hasFPRegs()` rather than `hasVFP2Base()`. As a result a few
existing tests needed their llc command lines updating to use
`-mattr=-fpregs` as their method of turning off all hardware FP
support.
Reviewers: dmgreen, samparker, SjoerdMeijer
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60708
llvm-svn: 364329
|
| |
|
|
|
|
|
|
|
| |
The expensive buildbots highlighted the mir tests were broken, which
I've now updated and added --verify-machineinstrs to them. This also
uncovered a couple of bugs in the backend pass, so these have also
been fixed.
llvm-svn: 364323
|
| |
|
|
|
|
|
|
|
|
| |
This is a pre-commit of the tests introduced by the SuperNode SLP patch D63661.
Committed on behalf of @vporpo (Vasileios Porpodas)
Differential Revision: https://reviews.llvm.org/D63664
llvm-svn: 364320
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce three pseudo instructions to be used during DAG ISel to
represent v8.1-m low-overhead loops. One maps to set_loop_iterations
while loop_decrement_reg is lowered to two, so that we can separate
the decrement and branching operations. The pseudo instructions are
expanded pre-emission, where we can still decide whether we actually
want to generate a low-overhead loop, in a new pass:
ARMLowOverheadLoops. The pass currently bails, reverting to an sub,
icmp and br, in the cases where a call or stack spill/restore happens
between the decrement and branching instructions, or if the loop is
too large.
Differential Revision: https://reviews.llvm.org/D63476
llvm-svn: 364288
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
C2), C1.
Summary:
'shl' inequality test
```
icmp ult/uge (shl %x, C2), C1 iff C1 is power of two
```
can be simplified as 'and' equality test
```
icmp eq/ne (and %x, (lshr -C1, C2)), 0.
```
Reviewers: lebedev.ri, efriedma
Reviewed By: lebedev.ri
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63670
llvm-svn: 364256
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
u</u>= (-C) earlier.
Summary:
To generate simplified IR, make sure fold
(X & ~C) ==/!= 0 --> X u</u>= C+1
is scheduled before fold
((X << Y) & C) == 0 -> (X & (C >> Y)) == 0.
https://rise4fun.com/Alive/7ZN
Reviewers: lebedev.ri, efriedma, spatel, craig.topper
Reviewed By: lebedev.ri
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63505
llvm-svn: 364255
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the Demorgan'd 'not' of the pattern handled in:
D63660 / rL364153
This is another intermediate IR step towards solving PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314
We can test if a value is not a power-of-2 using ctpop(X) > 1,
so combining that with an is-zero check of the input is the
same as testing if not exactly 1 bit is set:
(X == 0) || (ctpop(X) u> 1) --> ctpop(X) != 1
llvm-svn: 364246
|
| |
|
|
|
|
| |
Alive says this is OK.
llvm-svn: 364235
|
| |
|
|
|
|
| |
Alive says this is OK.
llvm-svn: 364234
|
| |
|
|
|
|
| |
Alive says this is OK.
llvm-svn: 364233
|
| |
|
|
| |
llvm-svn: 364232
|
| |
|
|
|
|
|
|
|
|
|
| |
Inference of nowrap flags in CVP has been disabled, because it
triggered a bug in LFTR (https://bugs.llvm.org/show_bug.cgi?id=31181).
This issue has been fixed in D60935, so we should be able to reenable
nowrap flag inference now.
Differential Revision: https://reviews.llvm.org/D62776
llvm-svn: 364228
|
| |
|
|
| |
llvm-svn: 364227
|
| |
|
|
|
|
| |
Prep work for upcoming patch D63505.
llvm-svn: 364224
|
| |
|
|
|
|
| |
(tests for D63733)
llvm-svn: 364220
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D63609
llvm-svn: 364219
|
| |
|
|
|
|
|
|
|
|
| |
Prefer the more exact intrinsic to remove a use of the input value
and possibly make further transforms easier (we will still need
to match patterns with funnel-shift of wider types as pieces of
bswap, especially if we want to canonicalize to funnel-shift with
constant shift amount). Discussed in D46760.
llvm-svn: 364187
|
| |
|
|
| |
llvm-svn: 364184
|
| |
|
|
|
|
|
|
| |
trunc(lshr) handling - if the shift is out of range (undefined) then bail like we do for non-constant shifts.
Fixes OSS Fuzz #15217
llvm-svn: 364181
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Handle smul.fix.sat in the scalarizer. This is done by
adding smul.fix.sat to the set of "isTriviallyVectorizable"
intrinsics.
The addition of smul.fix.sat in isTriviallyVectorizable and
hasVectorInstrinsicScalarOpd can also be seen as a preparation
to be able to use hasVectorInstrinsicScalarOpd in ConstantFolding.
Reviewers: rengolin, RKSimon, dblaikie
Reviewed By: rengolin
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63704
llvm-svn: 364177
|
| |
|
|
| |
llvm-svn: 364156
|
| |
|
|
|
|
| |
In rL364135, I taught IndVars to fold exiting branches in loops with a zero backedge taken count (i.e. loops that only run one iteration). This extends that to eliminate the dead comparison left around.
llvm-svn: 364155
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is another intermediate IR step towards solving PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314
We can test if a value is power-of-2-or-0 using ctpop(X) < 2,
so combining that with a non-zero check of the input is the
same as testing if exactly 1 bit is set:
(X != 0) && (ctpop(X) u< 2) --> ctpop(X) == 1
Differential Revision: https://reviews.llvm.org/D63660
llvm-svn: 364153
|
| |
|
|
|
|
|
|
|
|
| |
This turned out to be surprisingly effective. I was originally doing this just for completeness sake, but it seems like there are a lot of cases where SCEV's exit count reasoning is stronger than it's isKnownPredicate reasoning.
Once this is in, I'm thinking about trying to build on the same infrastructure to eliminate provably untaken checks. There may be something generally interesting here.
Differential Revision: https://reviews.llvm.org/D63618
llvm-svn: 364135
|
| |
|
|
|
|
| |
The limit for the pointer case is incorrect.
llvm-svn: 364128
|
| |
|
|
|
|
|
|
|
| |
This reverts r364084 (git commit 5698921be2d567f6abf925479ac9f5a376d6d74f)
It caused crashes while compiling a file in Chrome. Reduction
forthcoming.
llvm-svn: 364111
|
| |
|
|
|
|
|
|
|
|
| |
This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example).
Committed on behalf of @vporpo (Vasileios Porpodas)
Differential Revision: https://reviews.llvm.org/D60897
llvm-svn: 364084
|
| |
|
|
| |
llvm-svn: 364083
|
| |
|
|
| |
llvm-svn: 364082
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
```
%a = sub i32 31, %x
%r = shl i32 1, %a
=>
%d = shl i32 1, 31
%r = lshr i32 %d, %x
Done: 1
Optimization is correct!
```
https://rise4fun.com/Alive/btZm
Reviewers: spatel, lebedev.ri, nikic
Reviewed By: lebedev.ri
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63652
llvm-svn: 364073
|
| |
|
|
| |
llvm-svn: 364069
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Signedness does not change number of trailing zeros.
Reviewers: spatel, lebedev.ri, nikic
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D63546
llvm-svn: 364064
|
| |
|
|
|
|
|
| |
Patch based on suggestion by James Molloy (@jmolloy) in:
https://bugs.llvm.org/show_bug.cgi?id=42346
llvm-svn: 364062
|
| |
|
|
| |
llvm-svn: 364060
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The motivation for this was to propagate fast-math flags like nnan and
ninf on vector floating point operations to the corresponding scalar
operations to take advantage of follow-on optimizations. But I think
the same argument applies to all of our IR flags: if they apply to the
vector operation then they also apply to all the individual scalar
operations, and they might enable follow-on optimizations.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63593
llvm-svn: 364051
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
LLVM Allows Targets to provide information that guides optimisations
made to LLVM IR. This is done with callbacks on a TargetTransformInfo object.
This patch adds a TargetTransformInfo class for RISC-V. This will allow us to
implement RISC-V specific callbacks as they become necessary.
This commit also adds the getIntImmCost callbacks, and tests them with a simple
constant hoisting test. Our immediate costs are on the conservative side, for
the moment, but we prevent hoisting in most circumstances anyway.
Previous review was on D63007
Reviewers: asb, luismarques
Reviewed By: asb
Subscribers: ributzka, MaskRay, llvm-commits, Jim, benna, psnobl, jocewei, PkmX, rkruppe, the_o, brucehoult, MartinMosbeck, rogfer01, edward-jones, zzheng, jrtc27, shiva0217, kito-cheng, niosHD, sabuasal, apazos, simoncook, johnrusso, rbar, hiraditya, mgorny
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63433
llvm-svn: 364046
|
| |
|
|
|
|
|
|
| |
Also, add a FIXME for the unsafe transform on a unary FNeg. A unary FNeg can only be transformed to a FMul by -1.0 when the nnan flag is present. The unary FNeg project is a WIP, so the unsafe transformation is acceptable until that work is complete.
The bogus assert with introduced in D63445.
llvm-svn: 363998
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As discussed in PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314
Improving the canonicalization for these patterns:
rL363956
...means we should adjust/enhance the related simplification.
https://rise4fun.com/Alive/w1cp
Name: isPow2 or zero
%x = and i32 %xx, 2048
%a = add i32 %x, -1
%r = and i32 %a, %x
=>
%r = i32 0
llvm-svn: 363997
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The getClobberingMemoryAccess API checks for clobbering accesses in a loop by walking the backedge. This may check if a memory access is being
clobbered by the loop in a previous iteration, depending how smart AA got over the course of the updates in MemorySSA (it does not occur when built from scratch).
If no clobbering access is found inside the loop, it will optimize to an access outside the loop. This however does not mean that access is safe to sink.
Given:
```
for i
load a[i]
store a[i]
```
The access corresponding to the load can be optimized to outside the loop, and the load can be hoisted. But it is incorrect to sink it.
In order to sink the load, we'd need to check no Def clobbers the Use in the same iteration. With this patch we currently restrict sinking to either
Defs not existing in the loop, or Defs preceding the load in the same block. An easy extension is to ensure the load (Use) post-dominates all Defs.
Caught by PR42294.
This issue also shed light on the converse problem: hoisting stores in this same scenario would be illegal. With this patch we restrict
hoisting of stores to the case when their corresponding Defs are dominating all Uses in the loop.
Reviewers: george.burgess.iv
Subscribers: jlebar, Prazek, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63582
llvm-svn: 363982
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
I added a canonicalization to create this general pattern in:
rL363956
But as noted in PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314#c11
...we have a (potentially expensive) simplification for the version
of the code that we just canonicalized away from, so we should
add/adjust that code to match.
llvm-svn: 363981
|
| |
|
|
| |
llvm-svn: 363978
|
| |
|
|
| |
llvm-svn: 363967
|