| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
| |
Target-specific types should not be vectorized. As a practical matter,
these types are already register matched (at least in the x86 case),
and codegen does not always work correctly (at least in the ppc case,
and this is not worth fixing because ppc_fp128 is currently broken and
will probably go away soon).
llvm-svn: 155729
|
| |
|
|
| |
llvm-svn: 155727
|
| |
|
|
| |
llvm-svn: 155725
|
| |
|
|
|
|
|
| |
The limit is set to an arbitrary 1000 recursion depth to avoid stack overflow
issues. <rdar://problem/11286839>.
llvm-svn: 155722
|
| |
|
|
| |
llvm-svn: 155701
|
| |
|
|
| |
llvm-svn: 155698
|
| |
|
|
|
|
|
|
|
|
|
| |
The required checks are moved to ChainInstruction() itself and the
policy decisions are moved to IVChain::isProfitableInc().
Also cache the ExprBase in IVChain to avoid frequent recomputations.
No functional change intended.
llvm-svn: 155676
|
| |
|
|
|
|
| |
No functional change intended.
llvm-svn: 155675
|
| |
|
|
|
|
|
|
|
|
| |
(x & y) | (x ^ y) -> x | y
(x & y) + (x ^ y) -> x | y
Patch by Manman Ren.
rdar://10770603
llvm-svn: 155674
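The two folds above rest on the fact that x & y and x ^ y never share a set bit, so adding them cannot carry and gives the same result as OR-ing them. A minimal sketch that checks both identities exhaustively on 8-bit values follows; the width, the helper names, and the use of Python are assumptions made for illustration and are not taken from the patch.

```python
from itertools import product

BITS = 8                      # assumed width, small enough to check exhaustively
MASK = (1 << BITS) - 1


def or_via_or(x, y):
    """(x & y) | (x ^ y), truncated to BITS bits."""
    return ((x & y) | (x ^ y)) & MASK


def or_via_add(x, y):
    """(x & y) + (x ^ y), truncated to BITS bits."""
    return ((x & y) + (x ^ y)) & MASK


def check_all():
    """Return True if both forms equal x | y for every pair of BITS-bit values."""
    return all(
        or_via_or(x, y) == (x | y) and or_via_add(x, y) == (x | y)
        for x, y in product(range(1 << BITS), repeat=2)
    )


if __name__ == "__main__":
    print(check_all())
```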
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
elements to minimize the number of multiplies required to compute the
final result. This uses a heuristic to attempt to form near-optimal
binary exponentiation-style multiply chains. While there are some cases
it misses, it seems to do at least a decent job on a very diverse range of
inputs.
Initial benchmarks show no interesting regressions, and an 8%
improvement on SPASS. Let me know if any other interesting results (in
either direction) crop up!
Credit to Richard Smith for the core algorithm, and helping code the
patch itself.
llvm-svn: 155616
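For readers unfamiliar with binary exponentiation, the sketch below shows the generic square-and-multiply scheme and counts the multiplies it performs. It is not the heuristic from this patch, which is not shown in the log; the function name and the counting convention are assumptions made for illustration.

```python
def pow_with_count(x, n):
    """Compute x**n by square-and-multiply; return (value, number of multiplies).

    Generic textbook scheme, assumed here only to illustrate why a
    binary-exponentiation-style chain needs far fewer multiplies than
    the n - 1 of a naive left-to-right product.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    result = None      # no factor accumulated yet
    base = x           # holds x**(2**k) on iteration k
    mults = 0
    while n:
        if n & 1:
            if result is None:
                result = base            # first factor costs no multiply
            else:
                result = result * base
                mults += 1
        n >>= 1
        if n:                            # square only if more bits remain
            base = base * base
            mults += 1
    return result, mults


if __name__ == "__main__":
    value, mults = pow_with_count(3, 15)
    print(value, mults)
```

Under this counting, x**15 takes 6 multiplies instead of 14, and x**8 takes 3 instead of 7.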
|
| |
|
|
| |
llvm-svn: 155567
|
| |
|
|
|
|
|
|
| |
in poor taste.
Talking through some alternate solutions with Chandler.
llvm-svn: 155530
|
| |
|
|
|
|
|
| |
of a precise count. Also, move RRInfo's Partial field into PtrState,
now that it won't increase the size.
llvm-svn: 155513
|
| |
|
|
|
|
|
|
| |
These lists exclude invoke unwind edges and loop backedges which
are being ignored. This makes it easier to ignore them
consistently.
llvm-svn: 155500
|
| |
|
|
|
|
| |
<rdar://problem/11291436>.
llvm-svn: 155468
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Original commit message:
Defer some shl transforms to DAGCombine.
The shl instruction is used to represent multiplication by a constant
power of two as well as bitwise left shifts. Some InstCombine
transformations would turn an shl instruction into a bit mask operation,
making it difficult for later analysis passes to recognize the
constant multiplication.
Disable those shl transformations, deferring them to DAGCombine time.
An 'shl X, C' instruction is now treated mostly the same way as 'mul X, C'.
These transformations are deferred:
(X >>? C) << C --> X & (-1 << C) (When X >> C has multiple uses)
(X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2) (When C2 > C1)
(X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2) (When C1 > C2)
The corresponding exact transformations are preserved, just like
div-exact + mul:
(X >>?,exact C) << C --> X
(X >>?,exact C1) << C2 --> X << (C2-C1)
(X >>?,exact C1) << C2 --> X >>?,exact (C1-C2)
The disabled transformations could also prevent the instruction selector
from recognizing rotate patterns in hash functions and cryptographic
primitives. I have a test case for that, but it is too fragile.
llvm-svn: 155362
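The preserved exact transformations can be sanity-checked on small integers. The sketch below models "exact" as "the shifted-out low bits of X are all zero" and uses a logical right shift on 8-bit values; the width, the helper names, and the restriction to logical shifts are assumptions made for illustration, not part of the commit.

```python
BITS = 8                      # assumed width for an exhaustive check
MASK = (1 << BITS) - 1


def shl(x, c):
    """Left shift truncated to BITS bits."""
    return (x << c) & MASK


def lshr(x, c):
    """Logical right shift on a BITS-bit value."""
    return (x & MASK) >> c


def exact_forms_agree():
    """Check the three exact rewrites for every X whose low C1 bits are zero."""
    for x in range(1 << BITS):
        for c1 in range(BITS):
            if x & ((1 << c1) - 1):
                continue                  # shift by c1 would not be exact
            for c2 in range(BITS):
                lhs = shl(lshr(x, c1), c2)
                if c1 == c2:
                    rhs = x               # (X >>exact C) << C --> X
                elif c2 > c1:
                    rhs = shl(x, c2 - c1) # --> X << (C2-C1)
                else:
                    rhs = lshr(x, c1 - c2)  # --> X >>exact (C1-C2)
                if lhs != rhs:
                    return False
    return True


if __name__ == "__main__":
    print(exact_forms_agree())
```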
|
| |
|
|
|
|
| |
the compiled source file.
llvm-svn: 155346
|
| |
|
|
| |
llvm-svn: 155341
|
| |
|
|
|
|
|
|
|
| |
While the patch was perfect and defect free, it exposed a really nasty
bug in X86 SelectionDAG that caused an llc crash when compiling lencod.
I'll put the patch back in after fixing the SelectionDAG problem.
llvm-svn: 155181
|
| |
|
|
| |
llvm-svn: 155166
|
| |
|
|
|
|
| |
loop repeatedly making the same change. This is for rdar://11256239.
llvm-svn: 155160
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The shl instruction is used to represent multiplication by a constant
power of two as well as bitwise left shifts. Some InstCombine
transformations would turn an shl instruction into a bit mask operation,
making it difficult for later analysis passes to recognize the
constant multiplication.
Disable those shl transformations, deferring them to DAGCombine time.
An 'shl X, C' instruction is now treated mostly the same way as 'mul X, C'.
These transformations are deferred:
(X >>? C) << C --> X & (-1 << C) (When X >> C has multiple uses)
(X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2) (When C2 > C1)
(X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2) (When C1 > C2)
The corresponding exact transformations are preserved, just like
div-exact + mul:
(X >>?,exact C) << C --> X
(X >>?,exact C1) << C2 --> X << (C2-C1)
(X >>?,exact C1) << C2 --> X >>?,exact (C1-C2)
The disabled transformations could also prevent the instruction selector
from recognizing rotate patterns in hash functions and cryptographic
primitives. I have a test case for that, but it is too fragile.
llvm-svn: 155136
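The three deferred rewrites listed above each replace a shift pair with a shift and a mask. A minimal sketch that confirms the two sides agree for every 8-bit value and every pair of shift amounts follows; it models only the logical right shift, and the width and helper names are assumptions made for illustration.

```python
BITS = 8                      # assumed width for an exhaustive check
MASK = (1 << BITS) - 1


def shl(x, c):
    """Left shift truncated to BITS bits."""
    return (x << c) & MASK


def lshr(x, c):
    """Logical right shift on a BITS-bit value."""
    return (x & MASK) >> c


def clear_low(c):
    """The constant (-1 << C) truncated to BITS bits: low C bits zero, rest one."""
    return (-1 << c) & MASK


def deferred_forms_agree():
    """Compare (X >> C1) << C2 against its masked form for all inputs."""
    for x in range(1 << BITS):
        for c1 in range(BITS):
            for c2 in range(BITS):
                lhs = shl(lshr(x, c1), c2)
                if c1 == c2:
                    rhs = x & clear_low(c1)
                elif c2 > c1:
                    rhs = shl(x, c2 - c1) & clear_low(c2)
                else:
                    rhs = lshr(x, c1 - c2) & clear_low(c2)
                if lhs != rhs:
                    return False
    return True


if __name__ == "__main__":
    print(deferred_forms_agree())
```

The masked forms are equivalent but no longer contain an shl by the original amount, which is why the message says they hide the constant multiplication from later analyses.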
|
| |
|
|
|
|
| |
a function with arguments. This fixes rdar://11265785.
llvm-svn: 155073
|
| |
|
|
|
|
|
|
|
|
|
| |
If the loop contains invoke instructions, whose unwind edge escapes the loop,
then don't try to unswitch the loop. Doing so may cause the unwind edge to be
split, which not only is non-trivial but doesn't preserve loop simplify
information.
Fixes PR12573
llvm-svn: 154987
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This introduces a threshold of 200 IV Users, which is very
conservative but should be sufficient to avoid serious compile time
sink or stack overflow. The llvm test-suite with LTO never exceeds 190
users per loop.
The bug doesn't relate to a specific type of loop. Checking in an
arbitrary giant loop as a unit test would be silly.
Fixes rdar://11262507.
llvm-svn: 154983
|
| |
|
|
|
|
| |
also fix SimplifyLibCalls to use TLI rather than compile-time conditionals to enable optimizations on floor, ceil, round, rint, and nearbyint
llvm-svn: 154960
|
| |
|
|
| |
llvm-svn: 154810
|
| |
|
|
| |
llvm-svn: 154793
|
| |
|
|
|
|
| |
suggested by Duncan).
llvm-svn: 154787
|
| |
|
|
|
|
|
|
|
|
| |
When vectorizing pointer types it is important to realize that potential
pairs cannot be connected via the address pointer argument of a load or store.
This is because even after vectorization, the address is still a scalar because
the address of the higher half of the pair is implicit from the address of the
lower half (it need not be, and should not be, explicitly computed).
llvm-svn: 154735
|
| |
|
|
| |
llvm-svn: 154734
|
| |
|
|
| |
llvm-svn: 154700
|
| |
|
|
| |
llvm-svn: 154687
|
| |
|
|
|
|
|
| |
their argument as "escape" points for objc_retainBlock optimization.
This fixes rdar://11229925.
llvm-svn: 154682
|
| |
|
|
|
|
|
|
|
|
| |
As has been suggested by Duncan and others, Early-CSE and GVN should
do similar redundancy elimination, but Early-CSE is much less expensive.
Most of my autovectorization benchmarks show a performance regression, but
all of these are < 0.1%, and so I think that it is still worth using
the less expensive pass.
llvm-svn: 154673
|
| |
|
|
|
|
|
| |
library return value optimization for phi uses. Even when the
phi itself is not dominated, the specific use may be dominated.
llvm-svn: 154647
|
| |
|
|
|
|
|
|
| |
obviously cannot know that this code is present, let alone used. So prevent the
internalize pass from internalizing those global values which code-gen may
insert.
llvm-svn: 154645
|
| |
|
|
|
|
|
| |
optimizing autorelease calls on phi nodes with null operands.
This fixes rdar://11207070.
llvm-svn: 154642
|
| |
|
|
| |
llvm-svn: 154522
|
| |
|
|
|
|
| |
Yea, 'NumCallerCallersAnalyzed' isn't a great name, suggestions welcome.
llvm-svn: 154492
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
- don't instrument reads from constant globals.
Saves ~1.5% of instrumented instructions on CPU2006
(counting static instructions, not their execution).
- don't instrument reads from vtable (which is a global constant too).
Saves ~5%.
I did not measure the run-time impact of this,
but it is certainly non-negative.
llvm-svn: 154444
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
a write to the same temp follows in the same BB.
Also add stats printing.
On Spec CPU2006 this optimization saves roughly 4% of instrumented reads
(which is 3% of all instrumented accesses):
Writes : 161216
Reads : 446458
Reads-before-write: 18295
llvm-svn: 154418
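The quoted percentages follow directly from the three counters. A short worked check is below, assuming that "all instrumented accesses" means reads plus writes; the function name is hypothetical.

```python
def read_before_write_savings(writes, reads, reads_before_write):
    """Return the saved fraction of reads and of all accesses (reads + writes)."""
    share_of_reads = reads_before_write / reads
    share_of_all = reads_before_write / (reads + writes)
    return share_of_reads, share_of_all


if __name__ == "__main__":
    # Counters copied from the commit message above.
    of_reads, of_all = read_before_write_savings(161216, 446458, 18295)
    print(f"{of_reads:.1%} of reads, {of_all:.1%} of all accesses")
```

With these counters the saving is about 4.1% of instrumented reads and about 3.0% of all instrumented accesses, consistent with the rounded figures in the message.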
|
| |
|
|
|
|
|
|
| |
Take this opportunity to generalize the indirectbr bailout logic for
loop transformations. CFG transformations will never get indirectbr
right, and there's no point trying.
llvm-svn: 154386
|
| |
|
|
| |
llvm-svn: 154385
|
| |
|
|
|
|
|
|
|
|
|
|
| |
GEPs, bit casts, and stores reaching it but no other instructions. These
often show up during the iterative processing of the inliner, SROA, and
DCE. Once we hit this point, we can completely remove the alloca. These
were actually showing up in the final, fully optimized code in a bunch
of inliner tests I've been working on, and notably they show up after
LLVM finishes optimizing away all function calls involved in
hash_combine(a, b).
llvm-svn: 154285
|
| |
|
|
| |
llvm-svn: 154249
|
| |
|
|
|
|
|
|
|
|
|
| |
simplification has been performed. This is a bit less efficient
(requires another ilist walk of the basic blocks) but shouldn't matter
in practice. More importantly, it's just too much work to keep track of
all the various ways the return instructions can be mutated while
simplifying them. This fixes yet another crasher, reported by Daniel
Dunbar.
llvm-svn: 154179
|
| |
|
|
|
|
| |
The modifications are a lot more trivial than they appear to be in the diff!
llvm-svn: 154174
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
dead code, including dead return instructions in some cases. Otherwise,
we end up having a bogus pointer to a return instruction that blows up
much further down the road.
It turns out that this pattern is both simpler to code, easier to update
in the face of enhancements to the inliner cleanup, and likely cheaper
given that it won't add dead instructions to the list.
Thanks to John Regehr's numerous test cases for teasing this out.
llvm-svn: 154157
|
| |
|
|
|
|
| |
testcase slightly less trivial. This fixes rdar://11171718.
llvm-svn: 154118
|