A catchswitch is a terminator; instructions cannot be inserted after it.
llvm-svn: 265158
llvm-svn: 264944
llvm-svn: 264943
llvm-svn: 264124
CatchSwitches are not splittable; we cannot insert casts, etc. before them.
This fixes PR26992.
llvm-svn: 263874
This patch enhances InstCombine to handle the following case:
A -> B bitcast
PHI
B -> A bitcast
llvm-svn: 263734
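
A minimal IR sketch of the pattern (function, value names, and types are
illustrative, not from the patch): two A -> B bitcasts feeding a PHI whose
result is bitcast back to A can be folded so the PHI operates on the
original type directly.
define float @bitcast_phi(i1 %c, float %x, float %y) {
entry:
  %xi = bitcast float %x to i32            ; A -> B
  %yi = bitcast float %y to i32            ; A -> B
  br i1 %c, label %t, label %f
t:
  br label %m
f:
  br label %m
m:
  %p = phi i32 [ %xi, %t ], [ %yi, %f ]
  %r = bitcast i32 %p to float             ; B -> A
  ret float %r
}
After the combine, the PHI is formed over float values and all three
bitcasts disappear.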
llvm-svn: 263585
Summary:
Previously we had a notion of convergent functions but not of convergent
calls. This is insufficient to correctly analyze calls where the target
is unknown, e.g. indirect calls.
Now a call is convergent if it targets a known-convergent function, or
if it's explicitly marked as convergent. As usual, we can remove
convergent where we can prove that no convergent operations are
performed in the call.
Originally landed as r261544, then reverted in r261549 for (incidental)
build breakage. Re-landed here with no changes.
Reviewers: chandlerc, jingyue
Subscribers: llvm-commits, tra, jhen, hfinkel
Differential Revision: http://reviews.llvm.org/D17739
llvm-svn: 263481
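
A small IR sketch of the new rule (declarations are illustrative): a call
is convergent if the callee is known-convergent, or, when the target is
unknown as with indirect calls, if the call site itself carries the
attribute.
declare void @known_convergent() convergent

define void @f(void ()* %fp) {
  call void @known_convergent()            ; convergent via the callee
  call void %fp() convergent               ; indirect call, marked explicitly
  ret void
}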
This reapplies r263258, which was reverted in r263321 because
of issues on the Clang side.
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263393
This follows up on the related AVX instruction transforms, but this
one is too strange to do anything more with. Intel's behavioral
description of this instruction in its Software Developer's Manual
is tragi-comic.
llvm-svn: 263340
commit ae14bf6488e8441f0f6d74f00455555f6f3943ac
Author: Mehdi Amini <mehdi.amini@apple.com>
Date: Fri Mar 11 17:15:50 2016 +0000
Remove PreserveNames template parameter from IRBuilder
Summary:
Following r263086, we are now relying on a flag on the Context to
discard Value names in release builds.
Reviewers: chandlerc
Subscribers: mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D18023
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263258 91177308-0d34-0410-b5e6-96231b3b80d8
until we can figure out what to do about clang and Release build testing.
This reverts commit 263258.
llvm-svn: 263321
Summary:
Following r263086, we are now relying on a flag on the Context to
discard Value names in release builds.
Reviewers: chandlerc
Subscribers: mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D18023
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263258
This was originally a pointer to support pass managers which didn't use
AnalysisManagers. However, that doesn't realistically come up much and
the complexity of supporting it doesn't really make sense.
In fact, *many* parts of the pass manager were just assuming the pointer
was never null already. This at least makes it much more explicit and
clear.
llvm-svn: 263219
Since the names are used in a loop, this does more work in debug builds. In
release builds value names are generally discarded so we don't have to do
the concatenation at all. It's also simpler code, no functional change
intended.
llvm-svn: 263215
Extract out a generic interface from a recently landed patch and document a TODO in case compile time becomes a problem.
llvm-svn: 263062
When checking whether an smin is positive, we can move the comparison to one of the inputs if the other is known positive. If the known positive one is the min, then the other can't be negative. If the other is the min, then we compute the min.
Differential Revision: http://reviews.llvm.org/D17873
llvm-svn: 263059
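
A minimal IR sketch, assuming %b is known non-negative (the smin is written
in its icmp+select form; names are illustrative):
define i1 @smin_positive(i32 %a, i32 %x) {
  %b = and i32 %x, 255                     ; %b is known non-negative
  %c = icmp slt i32 %a, %b
  %min = select i1 %c, i32 %a, i32 %b      ; smin(%a, %b)
  %pos = icmp sgt i32 %min, -1             ; folds to: icmp sgt i32 %a, -1
  ret i1 %pos
}
Since %b >= 0, smin(%a, %b) is non-negative exactly when %a is, so the
comparison moves to %a.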
As part of r251146 InstCombine was extended to call computeKnownBits on
every value in the function to determine whether it happens to be
constant. This increases typical compile time by 1-3% (5% in irgen+opt
time) in my measurements. On the other hand this case did not trigger
once in the whole llvm-testsuite.
This patch introduces the notion of ExpensiveCombines which are only
enabled for OptLevel > 2. I removed the check in InstructionSimplify as
that is called from various places where the OptLevel is not known but
given the rarity of the situation I think a check in InstCombine is
enough.
Differential Revision: http://reviews.llvm.org/D16835
llvm-svn: 263047
Original commit message:
calculate builtin_object_size if argument is a removable pointer
This patch fixes calculation of the correct value for the builtin_object_size
function when the pointer is used only in a builtin_object_size call and never
after that.
Patch by Strahinja Petrovic.
Differential Revision: http://reviews.llvm.org/D17337
Reland the original change with a small modification (first do a null check
and then do the cast) to satisfy ubsan.
llvm-svn: 263011
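
A hedged IR sketch of the case being fixed (2016-era intrinsic signature;
names are illustrative): the pointer feeds only the objectsize call, so the
allocation is removable, yet the size must still be computed.
declare i64 @llvm.objectsize.i64.p0i8(i8*, i1)
declare noalias i8* @malloc(i64)

define i64 @f() {
  %p = call i8* @malloc(i64 16)
  %size = call i64 @llvm.objectsize.i64.p0i8(i8* %p, i1 false)
  ret i64 %size                            ; %p is never used after this call
}
Here %size should fold to 16 even though %p itself is dead.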
This reverts commit r262670 due to compile failure.
llvm-svn: 262916
This patch enhances InstCombine to handle the following case:
A -> B bitcast
PHI
B -> A bitcast
llvm-svn: 262670
Given that we're not actually reducing the instruction count in the included
regression tests, I think we would call this a canonicalization step.
The motivation comes from the example in PR26702:
https://llvm.org/bugs/show_bug.cgi?id=26702
If we hoist the bitwise logic ahead of the bitcast, the previously unoptimizable
example of:
define <4 x i32> @is_negative(<4 x i32> %x) {
%lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
%not = xor <4 x i32> %lobit, <i32 -1, i32 -1, i32 -1, i32 -1>
%bc = bitcast <4 x i32> %not to <2 x i64>
%notnot = xor <2 x i64> %bc, <i64 -1, i64 -1>
%bc2 = bitcast <2 x i64> %notnot to <4 x i32>
ret <4 x i32> %bc2
}
Simplifies to the expected:
define <4 x i32> @is_negative(<4 x i32> %x) {
%lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
ret <4 x i32> %lobit
}
Differential Revision: http://reviews.llvm.org/D17583
llvm-svn: 262645
Summary: This is the last step toward supporting aggregate memory access in instcombine. This explodes stores of arrays into a series of stores, one per element, allowing them to be optimized.
Reviewers: joker.eph, reames, hfinkel, majnemer, mgrang
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D17828
llvm-svn: 262530
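
A sketch of the unpacking on a two-element array (names are illustrative):
store [2 x i32] %v, [2 x i32]* %p
becomes, conceptually:
  %v0 = extractvalue [2 x i32] %v, 0
  %p0 = getelementptr inbounds [2 x i32], [2 x i32]* %p, i64 0, i64 0
  store i32 %v0, i32* %p0
  %v1 = extractvalue [2 x i32] %v, 1
  %p1 = getelementptr inbounds [2 x i32], [2 x i32]* %p, i64 0, i64 1
  store i32 %v1, i32* %p1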
Summary: This is another step toward improving fca support. This unpacks loads of arrays into a series of loads of the array's elements.
Reviewers: chandlerc, joker.eph, majnemer, reames, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15890
llvm-svn: 262521
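
The load-side counterpart, sketched on the same two-element array:
%v = load [2 x i32], [2 x i32]* %p
becomes, conceptually, per-element loads reassembled with insertvalue:
  %e0 = load i32, i32* %p0
  %t = insertvalue [2 x i32] undef, i32 %e0, 0
  %e1 = load i32, i32* %p1
  %v = insertvalue [2 x i32] %t, i32 %e1, 1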
asm output that is broken by this change
llvm-svn: 262440
shifts (PR26701)
As noted in the code comment, I don't think we can extend the transform we do
for *scalar* integer comparisons to *vector* integer comparisons, because it
might pessimize the general case.
Exhibit A for an incomplete integer comparison ISA remains x86 SSE/AVX: it only has EQ and GT
for integer vectors.
But we should now recognize all the variants of this construct and produce the optimal code
for the cases shown in:
https://llvm.org/bugs/show_bug.cgi?id=26701
llvm-svn: 262424
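
One variant of the construct, sketched in IR (names are illustrative): an
'is negative' vector compare plus sign-extension is the arithmetic-shift
pattern from the is_negative example above.
define <4 x i32> @isneg(<4 x i32> %x) {
  %cmp = icmp slt <4 x i32> %x, zeroinitializer
  %ext = sext <4 x i1> %cmp to <4 x i32>
  ret <4 x i32> %ext             ; emitted as: ashr %x, <31, 31, 31, 31>
}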
Summary: SampleProfile pass needs to be performed after InstructionCombiningPass, which helps eliminate un-inlinable function calls.
Reviewers: davidxl, dnovillo
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D17742
llvm-svn: 262419
Most portions of InstCombine properly propagate fast math flags, but
apparently the vector scalarization section was overlooked.
llvm-svn: 262376
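
A sketch of the scalarization case (names are illustrative): the fast flag
on the vector op must survive onto the scalar op.
%add = fadd fast <2 x float> %a, %b
%e = extractelement <2 x float> %add, i32 0
scalarizes to, conceptually:
  %a0 = extractelement <2 x float> %a, i32 0
  %b0 = extractelement <2 x float> %b, i32 0
  %e = fadd fast float %a0, %b0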
Revert r262337 as "check-llvm ubsan" step failed on
sanitizer-x86_64-linux-fast buildbot.
llvm-svn: 262349
This patch fixes calculation of the correct value for the builtin_object_size
function when the pointer is used only in a builtin_object_size call and never
after that.
Patch by Strahinja Petrovic.
Differential Revision: http://reviews.llvm.org/D17337
llvm-svn: 262337
Continuation of:
http://reviews.llvm.org/rL262269
llvm-svn: 262273
| |
The intended effect of this patch in conjunction with:
http://reviews.llvm.org/rL259392
http://reviews.llvm.org/rL260145
is that customers using the AVX intrinsics in C will benefit from combines when
the load mask is constant:
__m128 mload_zeros(float *f) {
return _mm_maskload_ps(f, _mm_set1_epi32(0));
}
__m128 mload_fakeones(float *f) {
return _mm_maskload_ps(f, _mm_set1_epi32(1));
}
__m128 mload_ones(float *f) {
return _mm_maskload_ps(f, _mm_set1_epi32(0x80000000));
}
__m128 mload_oneset(float *f) {
return _mm_maskload_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0));
}
...so none of the above will actually generate a masked load for optimized code.
This is the masked load counterpart to:
http://reviews.llvm.org/rL262064
llvm-svn: 262269
We ended up removing a save/restore pair around an inalloca call,
leading to a miscompile in Chromium.
llvm-svn: 262095
Replicate everything for integers...because x86.
Continuation of:
http://reviews.llvm.org/rL262064
llvm-svn: 262077
The intended effect of this patch in conjunction with:
http://reviews.llvm.org/rL259392
http://reviews.llvm.org/rL260145
is that customers using the AVX intrinsics in C will benefit from combines when
the store mask is constant:
void mstore_zero_mask(float *f, __m128 v) {
_mm_maskstore_ps(f, _mm_set1_epi32(0), v);
}
void mstore_fake_ones_mask(float *f, __m128 v) {
_mm_maskstore_ps(f, _mm_set1_epi32(1), v);
}
void mstore_ones_mask(float *f, __m128 v) {
_mm_maskstore_ps(f, _mm_set1_epi32(0x80000000), v);
}
void mstore_one_set_elt_mask(float *f, __m128 v) {
_mm_maskstore_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0), v);
}
...so none of the above will actually generate a masked store for optimized code.
Differential Revision: http://reviews.llvm.org/D17485
llvm-svn: 262064
This is part of the payoff for the refactoring in:
http://reviews.llvm.org/rL261649
http://reviews.llvm.org/rL261707
In addition to removing a pile of duplicated code, the xor case was
missing the optimization for vector types because it checked
"SrcTy->isIntegerTy()" rather than "SrcTy->isIntOrIntVectorTy()"
like 'and' and 'or' were already doing.
This solves part of:
https://llvm.org/bugs/show_bug.cgi?id=26702
llvm-svn: 261750
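
A sketch of the kind of fold the vector check enables (zext chosen for the
illustration; the logic op commutes with the cast either way):
%za = zext <4 x i16> %a to <4 x i32>
%zb = zext <4 x i16> %b to <4 x i32>
%x = xor <4 x i32> %za, %zb
can become:
  %nx = xor <4 x i16> %a, %b
  %x = zext <4 x i16> %nx to <4 x i32>
With the isIntegerTy() check, the vector version was skipped even though it
is just as valid as the scalar one.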
This is a part of the refactoring to unify isSafeToLoadUnconditionally and isDereferenceablePointer functions. In subsequent change I'm going to eliminate isDerferenceableAndAlignedPointer from Loads API, leaving isSafeToLoadSpecualtively the only function to check is load instruction can be speculated.
Reviewed By: hfinkel
Differential Revision: http://reviews.llvm.org/D16180
llvm-svn: 261736
Note: The 'and' case in foldCastedBitwiseLogic() is inheriting one extra
check from the nearly identical 'or' case:
if (!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src))
But I'm not sure how to expose that difference in a regression test.
Without that check, the 'or' path will infinite loop on:
test/Transforms/InstCombine/zext-or-icmp.ll
because the zext-or-icmp fold is attempting a reverse transform.
The refactoring should extend to the 'xor' case next to solve part of
PR26702.
llvm-svn: 261707
Less indenting, named local variables, more descriptive names.
llvm-svn: 261659
llvm-svn: 261652
This is a straight cut and paste of the existing code and is intended to
be the first step in solving part of PR26702:
https://llvm.org/bugs/show_bug.cgi?id=26702
We should be able to reuse most of this and delete the nearly identical
existing code in visitOr(). Then, we can enhance visitXor() to use the
same code too.
llvm-svn: 261649
This reverts r261544, which was causing a test failure in
Transforms/FunctionAttrs/readattrs.ll.
llvm-svn: 261549
Summary:
Previously we had a notion of convergent functions but not of convergent
calls. This is insufficient to correctly analyze calls where the target
is unknown, e.g. indirect calls.
Now a call is convergent if it targets a known-convergent function, or
if it's explicitly marked as convergent. As usual, we can remove
convergent where we can prove that no convergent operations are
performed in the call.
Reviewers: chandlerc, jingyue
Subscribers: hfinkel, jhen, tra, llvm-commits
Differential Revision: http://reviews.llvm.org/D17317
llvm-svn: 261544
llvm-svn: 261484
Originally part of:
http://reviews.llvm.org/D17485
We need this when simplifying masked memory ops too.
llvm-svn: 261483
the lowest vector element
llvm-svn: 261460
more places to prevent gratuitous re-"runs" of these passes.
The passes themselves don't do any work when run, but we keep spending
time scheduling and running these needlessly when we really don't need
to do so.
This is the first patch towards fixing the really horrible loop pass
pipeline fragmentation pointed out by Sanjoy in PR24804.
llvm-svn: 261302
Cleanup for upcoming Clang warning -Wcomma. No functionality change intended.
llvm-svn: 261270
llvm-svn: 261156
Summary: Stores and loads unpacked by instcombine do not always have the right alignment. This explicitly computes the alignment and sets it.
Reviewers: dblaikie, majnemer, reames, hfinkel, joker.eph
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D17326
llvm-svn: 261139
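
A sketch of why the alignment must be recomputed per element (assumed
layout: i32 elements, base alignment 8; names are illustrative):
store [2 x i32] %v, [2 x i32]* %p, align 8
unpacks to, conceptually:
  store i32 %v0, i32* %p0, align 8         ; byte offset 0
  store i32 %v1, i32* %p1, align 4         ; byte offset 4 limits this to 4
Each element gets the largest alignment consistent with the base alignment
and the element's offset.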
In some cases, InstCombine replaces a sequence of xor/sub instructions
followed by a cmp instruction with a single cmp instruction.
However, this replacement may produce a suboptimal result, especially when
the xor/sub has more than one use, as discussed in
bug 26465 (https://llvm.org/bugs/show_bug.cgi?id=26465).
This patch makes the replacement happen only when the xor/sub has a single
use.
Differential Revision: http://reviews.llvm.org/D16915
Patch by Taewook Oh!
llvm-svn: 260695
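
A minimal sketch of the guarded fold (names are illustrative):
define i1 @one_use(i32 %a, i32 %b) {
  %x = xor i32 %a, %b
  %c = icmp eq i32 %x, 0                   ; folds to: icmp eq i32 %a, %b
  ret i1 %c
}
If %x had another use, the xor would remain live and the fold would merely
add an instruction, so it is now applied only in the single-use case.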