| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
case as well as the case of a nonnull attribute (already handled but not tested).
llvm-svn: 209193
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
types larger than i128.
Currently the X86 backend doesn't support types larger than i128 very well. For
example an i192 multiply will assert in codegen when the 2nd argument is a constant and the constant got hoisted.
This fix changes the cost model to never hoist constants for types larger than
i128. Once the codegen issues have been resolved, the cost model can be updated
to allow also larger types.
This is related to <rdar://problem/16954938>
llvm-svn: 209162
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D3815
llvm-svn: 209150
|
| |
|
|
|
|
|
|
|
| |
The cost model conservatively assumes that it will always get scalarized and
that's about as good as we can get with the generic TTI; reasoning whether a
shuffle with an efficient lowering is available is hard. We can override that
conservative estimate for some targets in the future.
llvm-svn: 209125
|
| |
|
|
|
|
|
|
|
|
|
| |
This removes TODO added in r208849 [http://reviews.llvm.org/D3629]
MIN(MIN(A, 97), 23) -> MIN(A, 23)
MAX(MAX(A, 23), 97) -> MAX(A, 97)
Differential Revision: http://reviews.llvm.org/D3785
llvm-svn: 209110
|
| |
|
|
|
|
| |
It broke clang selfhosting even after r209065.
llvm-svn: 209067
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently LLVM will generally merge GEPs. This allows backends to use more
complex addressing modes. In some cases this is not happening because there
is PHI inbetween the two GEPs:
GEP1--\
|-->PHI1-->GEP3
GEP2--/
This patch checks to see if GEP1 and GEP2 are similiar enough that they can be
cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123):
GEP1--\ --\ --\
|-->PHI1-->GEP3 ==> |-->PHI2->GEP12->GEP3 == > |-->PHI2->GEP123
GEP2--/ --/ --/
This also breaks certain use chains that are preventing GEP->GEP merges that the
the existing instcombine would merge otherwise.
Tests included.
rdar://15547484
llvm-svn: 209049
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows us to put dynamic initializers for weak data into the same
comdat group as the data being initialized. This is necessary for MSVC
ABI compatibility. Once we have comdats for guard variables, we can use
the combination to help GlobalOpt fire more often for weak data with
guarded initialization on other platforms.
Reviewers: nlewycky
Differential Revision: http://reviews.llvm.org/D3499
llvm-svn: 209015
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch changes the design of GlobalAlias so that it doesn't take a
ConstantExpr anymore. It now points directly to a GlobalObject, but its type is
independent of the aliasee type.
To avoid changing all alias related tests in this patches, I kept the common
syntax
@foo = alias i32* @bar
to mean the same as now. The cases that used to use cast now use the more
general syntax
@foo = alias i16, i32* @bar.
Note that GlobalAlias now behaves a bit more like GlobalVariable. We
know that its type is always a pointer, so we omit the '*'.
For the bitcode, a nice surprise is that we were writing both identical types
already, so the format change is minimal. Auto upgrade is handled by looking
through the casts and no new fields are needed for now. New bitcode will
simply have different types for Alias and Aliasee.
One last interesting point in the patch is that replaceAllUsesWith becomes
smart enough to avoid putting a ConstantExpr in the aliasee. This seems better
than checking and updating every caller.
A followup patch will delete getAliasedGlobal now that it is redundant. Another
patch will add support for an explicit offset.
llvm-svn: 209007
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Analyze the range of values produced by ashr/lshr cst, %V when it is
being used in an icmp.
Reviewers: nicholas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3774
llvm-svn: 209000
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The dividend in an sdiv tells us the largest and smallest possible
results. Use this fact to optimize comparisons against an sdiv with a
constant dividend.
Reviewers: nicholas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3795
llvm-svn: 208999
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r208934.
The patch depends on aliases to GEPs with non zero offsets. That is not
supported and fairly broken.
The good news is that GlobalAlias is being redesigned and will have support
for offsets, so this patch should be a nice match for it.
llvm-svn: 208978
|
| |
|
|
|
|
|
|
|
|
|
| |
This commit implements two command line switches -global-merge-on-external
and -global-merge-aligned, and both of them are false by default, so this
optimization is disabled by default for all targets.
For ARM64, some back-end behaviors need to be tuned to get this optimization
further enabled.
llvm-svn: 208934
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The allocas going out of scope are immediately killed by the return
instruction.
This is a resend of r208912, which was committed accidentally.
Reviewers: chandlerc
Differential Revision: http://reviews.llvm.org/D3792
llvm-svn: 208920
|
| |
|
|
|
|
|
|
| |
This reverts commit r208912.
It was committed accidentally without review.
llvm-svn: 208914
|
| |
|
|
|
|
|
|
|
|
|
| |
The allocas going out of scope are immediately killed by the return
instruction.
Reviewers: chandlerc
Differential Revision: http://reviews.llvm.org/D3630
llvm-svn: 208912
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The interesting case is what happens when you inline a musttail call
through a musttail call site. In this case, we can't break perfect
forwarding or allow any stack growth.
Instead of merging control flow from the inlined return instruction
after a musttail call into the body of the caller, leave the inlined
return instruction in the caller so that the musttail call stays in the
tail position.
More work is required in http://reviews.llvm.org/D3630 to handle the
case where the inlined function has dynamic allocas or byval arguments.
Reviewers: chandlerc
Differential Revision: http://reviews.llvm.org/D3491
llvm-svn: 208910
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
much more effectively when trying to constant fold a load of a constant.
Previously, we only handled bitcasts by trying to find a totally generic
byte representation of the constant and use that. Now, we look through
the bitcast to see what constant we might fold the load into, and then
try to form a constant expression cast of the found value that would be
equivalent to loading the value.
You might wonder why on earth this actually matters. Well, turns out
that the Itanium ABI causes us to create a single array for a vtable
where the first elements are virtual base offsets, followed by the
virtual function pointers. Because the array is homogenous the element
type is consistently i8* and we inttoptr the virtual base offsets into
the initial elements.
Then constructors bitcast these pointers to i64 pointers prior to
loading them. Boom, no more constant folding of virtual base offsets.
This is the first fix to LLVM to address the *insane* performance Eric
Niebler discovered with Clang on his range comprehensions[1]. There is
more to come though, this doesn't *really* fix the problem fully.
[1]: http://ericniebler.com/2014/04/27/range-comprehensions/
llvm-svn: 208856
|
| |
|
|
|
|
| |
sanitizer-x86_64-linux-bootstrap/builds/3399
llvm-svn: 208852
|
| |
|
|
|
|
|
|
|
| |
MIN(MIN(A, 23), 97) -> MIN(A, 23)
MAX(MAX(A, 97), 23) -> MAX(A, 97)
Differential Revision: http://reviews.llvm.org/D3629
llvm-svn: 208849
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
if ((x & C) == 0) x |= C becomes x |= C
if ((x & C) != 0) x ^= C becomes x &= ~C
if ((x & C) == 0) x ^= C becomes x |= C
if ((x & C) != 0) x &= ~C becomes x &= ~C
if ((x & C) == 0) x &= ~C becomes nothing
Z3 Verifications code for above transform
http://rise4fun.com/Z3/Pmsh
Differential Revision: http://reviews.llvm.org/D3717
llvm-svn: 208848
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This gets rid of a sub instruction by moving the negation to the
constant when valid.
Reviewers: nicholas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3773
llvm-svn: 208827
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We know that -(zext V) will always be <= zero, simplify signed icmps
that have these.
Uncovered using http://www.cs.utah.edu/~regehr/souper/
Reviewers: nicholas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3754
llvm-svn: 208809
|
| |
|
|
|
|
| |
This resolves PR19737.
llvm-svn: 208762
|
| |
|
|
|
|
| |
This fires exactly once in a clang bootstrap, but covers a few different results from http://www.cs.utah.edu/~regehr/souper/
llvm-svn: 208750
|
| |
|
|
|
|
| |
This fix resolves PR19730.
llvm-svn: 208666
|
| |
|
|
| |
llvm-svn: 208658
|
| |
|
|
| |
llvm-svn: 208644
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tested by comparing make check VERBOSE=1 before and after to make sure
no tests are missed. (VERBOSE=1 prints the list of tests.)
Only one test :( remains where .cpp is required:
tools/llvm-cov/range_based_for.cpp:// RUN: llvm-cov range_based_for.cpp | FileCheck %s --check-prefix=STDOUT
The topic was discussed in this thread:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140428/214905.html
llvm-svn: 208621
|
| |
|
|
|
|
|
|
| |
In transformation:
BinOp(shuffle(v1,undef), shuffle(v2,undef)) -> shuffle(BinOp(v1, v2),undef)
type of the undef argument must be same as type of BinOp.
llvm-svn: 208531
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Do not apply transformation:
BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2))
if operands v1 and v2 are of different size.
This change fixes PR19717, which was caused by r208488.
llvm-svn: 208518
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This patch enables transformations:
BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2))
BinOp(shuffle(v1), const1) -> shuffle(BinOp, const2)
They allow to eliminate extra shuffles in some cases.
Differential Revision: http://reviews.llvm.org/D3525
llvm-svn: 208488
|
| |
|
|
|
|
|
|
|
|
|
|
| |
unreachable code.
There is no total ordering if the CFG is disconnected. We don't care if we
catch all CSE opportunities in dead code either so just exclude ignore them in
the assert.
PR19646
llvm-svn: 208461
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Since ExtractValue is not included in ComputeSpeculationCost CFGs containing
ExtractValueInsts cannot be simplified. In particular this interacts with
InstCombineCompare's tendency to insert add.with.overflow intrinsics for
certain idiomatic math operations, preventing optimization.
This patch adds ExtractValue to the ComputeSpeculationCost. Test case included
rdar://14853450
llvm-svn: 208434
|
| |
|
|
|
|
|
|
| |
instructions.
And one more test added.
llvm-svn: 208355
|
| |
|
|
| |
llvm-svn: 208308
|
| |
|
|
| |
llvm-svn: 208298
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The old method used by X86TTI to determine partial-unrolling thresholds was
messy (because it worked by testing target features), and also would not
correctly identify the target CPU if certain target features were disabled.
After some discussions on IRC with Chandler et al., it was decided that the
processor scheduling models were the right containers for this information
(because it is often tied to special uop dispatch-buffer sizes).
This does represent a small functionality change:
- For generic x86-64 (which uses the SB model and, thus, will get some
unrolling).
- For AMD cores (because they still currently use the SB scheduling model)
- For Haswell (based on benchmarking by Louis Gerbarg, it was decided to bump
the default threshold to 50; we're working on a test case for this).
Otherwise, nothing has changed for any other targets. The logic, however, has
been moved into BasicTTI, so other targets may now also opt-in to this
functionality simply by setting LoopMicroOpBufferSize in their processor
model definitions.
llvm-svn: 208289
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Visibilities of `hidden` and `protected` are meaningless for symbols
with local linkage.
- Change the assembler to reject non-default visibility on symbols
with local linkage.
- Change the bitcode reader to auto-upgrade `hidden` and `protected`
to `default` when the linkage is local.
- Update LangRef.
<rdar://problem/16141113>
llvm-svn: 208263
|
| |
|
|
|
|
| |
rdar://problem/11861387
llvm-svn: 208214
|
| |
|
|
|
|
|
|
|
|
| |
calls marked tail in the IR to 470k, however this improvement does not carry into an improvement of the call/jmp ratio on x86. The most common pattern is a tail call + br to a block with nothing but a 'ret'.
The number of tail call to loop conversions remains the same (1618 by my count).
The new algorithm does a local scan over the use-def chains to identify local "alloca-derived" values, as well as points where the alloca could escape. Then, a visit over the CFG marks blocks as being before or after the allocas have escaped, and annotates the calls accordingly.
llvm-svn: 208017
|
| |
|
|
| |
llvm-svn: 207984
|
| |
|
|
|
|
| |
<rdar://problem/16812145>
llvm-svn: 207983
|
| |
|
|
|
|
|
|
|
| |
Visibility is meaningless when the linkage is local. Change
`-internalize` to reset the visibility to `default`.
<rdar://problem/16141113>
llvm-svn: 207979
|
| |
|
|
| |
llvm-svn: 207969
|
| |
|
|
| |
llvm-svn: 207966
|
| |
|
|
|
|
|
|
|
|
|
| |
limit unrolling.
Otherwise we use the same threshold as for complete unrolling, which is
way too high. This made us unroll any loop smaller than 150 instructions
by 8 times, but only if someone specified -march=core2 or better,
which happens to be the default on darwin.
llvm-svn: 207940
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When can't assume a vectorized tree is rooted in an instruction. The IRBuilder
could have constant folded it. When we rebuild the build_vector (the series of
InsertElement instructions) use the last original InsertElement instruction. The
vectorized tree root is guaranteed to be before it.
Also, we can't assume that the n-th InsertElement inserts the n-th element into
a vector.
This reverts r207746 which reverted the revert of the revert of r205018 or so.
Fixes the test case in PR19621.
llvm-svn: 207939
|
| |
|
|
|
|
|
| |
This patch adds support to recognize and vectorize intrinsic math functions in SLPVectorizer.
Review: http://reviews.llvm.org/D3560 and http://reviews.llvm.org/D3559
llvm-svn: 207901
|
| |
|
|
|
|
|
| |
See PR19608 for the details but to summarize it was easy to modify the .ll
file to get the desired def-use ordering.
llvm-svn: 207887
|