| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
llvm-svn: 244725
|
|
|
|
|
|
|
|
|
|
|
|
| |
As discussed in D11886, this patch moves the SSE/AVX vector blend folding to instcombiner from PerformINTRINSIC_WO_CHAINCombine (which allows us to remove this completely).
InstCombiner already had partial support for this; I only had to add support for zero (ConstantAggregateZero) masks and for the case where both selection inputs are the same (allowing us to ignore the mask).
I also moved all the relevant combine tests into InstCombine/blend_x86.ll
Differential Revision: http://reviews.llvm.org/D11934
llvm-svn: 244723
|
|
|
|
|
|
|
|
|
|
| |
`InstCombiner::OptimizeOverflowCheck` was asserting an
invariant (operands to binary operations are ordered by decreasing
complexity) that wasn't really an invariant. Fix this by instead having
`InstCombiner::OptimizeOverflowCheck` establish the invariant if it does
not hold.
llvm-svn: 244676
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The select pattern recognition in ValueTracking (as used by InstCombine
and SelectionDAGBuilder) only knew about integer patterns. This teaches
it about minimum and maximum operations.
matchSelectPattern() has been extended to return a struct containing the
existing Flavor and a new enum defining the pattern's behavior when
given one NaN operand.
C minnum() is defined to return the non-NaN operand in this case, but
the idiomatic C "a < b ? a : b" would return the NaN operand.
ARM and AArch64 at least have different instructions for these different cases.
llvm-svn: 244580
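The NaN distinction described above can be reproduced in plain C++. This is a minimal sketch, not LLVM code: `selectMin` and `libraryMin` are hypothetical helper names, and `std::fmin` stands in for the minnum semantics the message refers to.

```cpp
#include <cmath>
#include <limits>

// Select-based minimum. Every ordered comparison with NaN is false, so this
// returns `b` whenever either operand is NaN (the NaN itself if `b` is NaN).
inline double selectMin(double a, double b) { return a < b ? a : b; }

// Library minimum. std::fmin returns the non-NaN operand when exactly one
// operand is NaN.
inline double libraryMin(double a, double b) { return std::fmin(a, b); }
```

With a NaN in the second operand, `selectMin(1.0, NaN)` yields NaN while `libraryMin(1.0, NaN)` yields 1.0, which is why the two forms cannot be treated as the same pattern.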
|
|
|
|
|
|
|
|
| |
As discussed in D11760, this patch moves the (V)PSRA(WD) arithmetic shift-by-constant folding to InstCombine to match the logical shift implementations.
Differential Revision: http://reviews.llvm.org/D11886
llvm-svn: 244495
|
|
|
|
| |
llvm-svn: 244402
|
|
|
|
|
|
|
| |
Found by inspection, this change should not affect the existing
landingpad behavior.
llvm-svn: 244391
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes the sse2/avx2 vector shift by constant instcombine call to correctly deal with the fact that the shift amount is formed from the entire lower 64 bits of the count operand and not just the lowest element, as it currently assumes.
e.g.
%1 = tail call <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32> %v, <4 x i32> <i32 15, i32 15, i32 15, i32 15>)
In this case, (V)PSRLD doesn't perform a lshr by 15 but in fact attempts to shift by 64424509455 ((15 << 32) | 15) - giving a zero result.
In addition, this review also recognizes shift-by-zero from a ConstantAggregateZero type (PR23821).
Differential Revision: http://reviews.llvm.org/D11760
llvm-svn: 244341
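The arithmetic in the message above can be checked with a small model. This is a sketch under stated assumptions, not the InstCombine implementation: `effectiveShiftCount` and `psrldLane` are hypothetical names, and the model covers only one 32-bit lane of a logical right shift whose count is taken from the low 64 bits of the count vector.

```cpp
#include <cstdint>

// Hypothetical helper: the 64-bit count the instruction sees when the two
// lowest 32-bit lanes of the count vector hold lane0 and lane1.
constexpr std::uint64_t effectiveShiftCount(std::uint32_t lane0,
                                            std::uint32_t lane1) {
  return (static_cast<std::uint64_t>(lane1) << 32) | lane0;
}

// Model of one 32-bit lane: a count of 32 or more produces zero,
// otherwise an ordinary logical right shift.
constexpr std::uint32_t psrldLane(std::uint32_t value, std::uint64_t count) {
  return count >= 32 ? 0u : (value >> count);
}
```

A splat of 15 gives a count of `(15 << 32) | 15`, far above 31, so every lane becomes zero. A true shift by 15 needs 15 in the lowest lane and 0 in the next one.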
|
|
|
|
|
|
|
|
| |
After r244074, we now have a successors() method to iterate over
all the successors of a TerminatorInst. This commit changes a bunch
of eligible loops to use it.
llvm-svn: 244260
|
|
|
|
| |
llvm-svn: 244021
|
|
|
|
|
|
|
|
| |
function. NFCI.
This will make some upcoming bugfixes + improvements easier to manage.
llvm-svn: 243962
|
|
|
|
| |
llvm-svn: 243424
|
|
|
|
| |
llvm-svn: 243306
|
|
|
|
|
|
|
|
| |
Now that we are generating sane codegen for vector sext/zext nodes on SSE targets, this patch uses instcombine to replace the SSE41/AVX2 pmovsx and pmovzx intrinsics with the equivalent native IR code.
Differential Revision: http://reviews.llvm.org/D11503
llvm-svn: 243303
|
|
|
|
|
|
| |
to match AMD docs. NFCI.
llvm-svn: 243226
|
|
|
|
|
|
|
| |
This exposes further optimization opportunities if the selects are
correlated.
llvm-svn: 242235
|
|
|
|
| |
llvm-svn: 242008
|
|
|
|
| |
llvm-svn: 242007
|
|
|
|
|
|
| |
Fixes PR24083
llvm-svn: 241955
|
|
|
|
|
|
|
| |
This is important to fold away the slow case of complex multiplies
emitted by clang.
llvm-svn: 241911
|
|
|
|
| |
llvm-svn: 241887
|
|
|
|
|
|
|
|
| |
Not doing this can lead to misoptimizations down the line, e.g. because
of range metadata on the replacing load excluding values that are valid
for the load that is being replaced.
llvm-svn: 241886
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Fixes PR23809. Without passing the context to SimplifyICmpInst, we would
use the assume to prove that the condition feeding the assume is
trivially true (see isValidAssumeForContext in ValueTracking.cpp),
causing the removal of the assume which may be useful for later
optimizations.
Test Plan: pr23800.ll
Reviewers: hfinkel, majnemer
Reviewed By: hfinkel
Subscribers: henryhu, llvm-commits, wengxt, broune, meheff, eliben
Differential Revision: http://reviews.llvm.org/D10695
llvm-svn: 240683
|
|
|
|
| |
llvm-svn: 240480
|
|
|
|
| |
llvm-svn: 240478
|
|
|
|
|
|
| |
Apparently, the style needs to be agreed upon first.
llvm-svn: 240390
|
|
|
|
|
|
|
| |
This came up when examining some code generated by clang's IRGen for
certain member pointers.
llvm-svn: 240369
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch is generated using this command:
tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
-checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
llvm/lib/
Thanks to Eugene Kosov for the original patch!
llvm-svn: 240137
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The personality routine currently lives in the LandingPadInst.
This isn't desirable because:
- All LandingPadInsts in the same function must have the same
personality routine. This means that each LandingPadInst beyond the
first has an operand which produces no additional information.
- There is ongoing work to introduce EH IR constructs other than
LandingPadInst. Moving the personality routine off of any one
particular Instruction and onto the parent function seems a lot better
than having N different places a personality function can sneak onto an
exceptional function.
Differential Revision: http://reviews.llvm.org/D10429
llvm-svn: 239940
|
|
|
|
|
|
|
|
|
|
|
| |
The original change broke clang side tests. I will be submitting those momentarily. This change includes post commit feedback on the original change from Pete Cooper.
Original Submission comments:
If a parameter to a function is known non-null, use the existing parameter attributes to record that fact at the call site. This has no optimization benefit by itself - that I know of - but is an enabling change for http://reviews.llvm.org/D9129.
Differential Revision: http://reviews.llvm.org/D9132
llvm-svn: 239849
|
|
|
|
|
|
| |
I forgot to update some clang test cases. I'll fix and resubmit tomorrow.
llvm-svn: 239800
|
|
|
|
|
|
|
|
| |
If a parameter to a function is known non-null, use the existing parameter attributes to record that fact at the call site. This has no optimization benefit by itself - that I know of - but is an enabling change for http://reviews.llvm.org/D9129.
Differential Revision: http://reviews.llvm.org/D9132
llvm-svn: 239795
|
|
|
|
|
|
|
|
| |
There were several SelectInst combines that always returned an existing
instruction instead of modifying an old one or creating a new one.
These are prime candidates for moving to InstSimplify.
llvm-svn: 239229
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we have (select a, b, c), it is sometimes valid to simplify this to a
single select operand. However, doing so is only valid if the
computation doesn't inject poison into the computation.
It might be helpful to consider the following example:
(select (icmp ne %i, INT_MAX), (add nsw %i, 1), INT_MIN)
The select is equivalent to (add %i, 1) but not (add nsw %i, 1).
Self hosting on x86_64 revealed that this occurs very, very rarely so
bailing out is hopefully pretty reasonable.
llvm-svn: 239215
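The example in the message above can be modeled with fixed-width integers. This is an illustrative sketch only: `selectForm`, `wrappingAddOne`, and `addNswOverflows` are hypothetical names, and "poison" is represented simply as a flag saying the no-signed-wrap add would overflow.

```cpp
#include <cstdint>
#include <limits>

// (select (icmp ne i, INT_MAX), (add nsw i, 1), INT_MIN)
inline std::int32_t selectForm(std::int32_t i) {
  return i != std::numeric_limits<std::int32_t>::max()
             ? i + 1 // never overflows here: i is not INT_MAX
             : std::numeric_limits<std::int32_t>::min();
}

// (add i, 1) with wrap-around, computed in 64 bits to avoid signed overflow.
inline std::int32_t wrappingAddOne(std::int32_t i) {
  const std::int64_t wide = static_cast<std::int64_t>(i) + 1;
  if (wide > std::numeric_limits<std::int32_t>::max())
    return static_cast<std::int32_t>(wide - (std::int64_t{1} << 32));
  return static_cast<std::int32_t>(wide);
}

// True when (add nsw i, 1) would signed-overflow, i.e. produce poison.
inline bool addNswOverflows(std::int32_t i) {
  return i == std::numeric_limits<std::int32_t>::max();
}
```

`selectForm` and `wrappingAddOne` agree on every input, including `INT_MAX`. Replacing the select with the flagged add would be wrong only at `INT_MAX`, the one input where `addNswOverflows` is true.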
|
|
|
|
|
|
|
|
|
| |
This reverts commit r239141. This commit was an attempt to reintroduce
a previous patch that broke many self-hosting bots with clang timeouts,
but it still has slowdown issues, at least on ARM, increasing the
compilation time (stage 2, clang's) by 5x.
llvm-svn: 239175
|
|
|
|
|
|
|
|
| |
This change is NFC because both the ``break;`` and the fall through end
up returning immediately. However, this helps clarify intent and also
ensures correctness in case more ``case`` blocks are added later.
llvm-svn: 239172
|
|
|
|
|
|
| |
PR23751 was caused by a missing ``break;`` in r234388.
llvm-svn: 239171
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I don't have the IR which is causing the build bot breakage but I can
postulate as to why they are timing out:
1. SimplifyWithOpReplaced was stripping flags from the simplified value.
2. visitSelectInstWithICmp was overriding SimplifyWithOpReplaced because
its simplification wasn't correct.
3. InstCombine would revisit the add instruction and note that it can
rederive the flags.
4. By modifying the value, we chose to revisit instructions which reuse
the value. One of the instructions is the original select, causing
LLVM to never reach fixpoint.
Instead, strip the flags only when we are sure we are going to perform
the simplification.
llvm-svn: 239141
|
|
|
|
|
|
|
|
|
| |
This is breaking a lot of build bots and is causing very long-running
compiles (infinite loops)?
Likely, we shouldn't return nullptr?
llvm-svn: 239139
|
|
|
|
|
|
|
|
|
|
|
| |
We cleverly handle cases where computation done in one argument of a select
instruction is suitable for the other operand, thus obviating the need
of the select and the comparison. However, the other operand cannot
have flags.
This fixes PR23757.
llvm-svn: 239115
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the type isn't trivially moveable emplace can skip a potentially
expensive move. It also saves a couple of characters.
Call sites were found with the ASTMatcher + some semi-automated cleanup.
memberCallExpr(
argumentCountIs(1), callee(methodDecl(hasName("push_back"))),
on(hasType(recordDecl(has(namedDecl(hasName("emplace_back")))))),
hasArgument(0, bindTemporaryExpr(
hasType(recordDecl(hasNonTrivialDestructor())),
has(constructExpr()))),
unless(isInTemplateInstantiation()))
No functional change intended.
llvm-svn: 238602
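A small sketch of the saving described above, assuming a hypothetical `Tracked` type that counts its own move constructions (none of these names come from the patch). Capacity is reserved up front so that reallocation does not add moves of its own.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical type with a non-trivial move that counts how often it runs.
struct Tracked {
  static int moves; // number of move constructions observed so far
  std::string name;
  explicit Tracked(std::string n) : name(std::move(n)) {}
  Tracked(Tracked &&other) noexcept : name(std::move(other.name)) { ++moves; }
  Tracked(const Tracked &) = default;
};
int Tracked::moves = 0;

// push_back of a temporary: construct the temporary, then move it in.
inline int movesForPushBack() {
  std::vector<Tracked> v;
  v.reserve(1);
  const int before = Tracked::moves;
  v.push_back(Tracked("a"));
  return Tracked::moves - before;
}

// emplace_back: construct the element directly in the vector's storage.
inline int movesForEmplaceBack() {
  std::vector<Tracked> v;
  v.reserve(1);
  const int before = Tracked::moves;
  v.emplace_back("a");
  return Tracked::moves - before;
}
```

In this model `push_back` performs one move construction and `emplace_back` performs none, which is the move the message says can be skipped.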
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we only fold a BitCast into a Load when the BitCast is its
only user.
Do the same for any no-op cast.
Differential Revision: http://reviews.llvm.org/D9152
llvm-svn: 238452
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
InstCombine transforms A *nsw B +nsw A *nsw C to A *nsw (B + C).
This is incorrect: e.g. if A = -1, B = 1, C = INT_SMAX, then
nothing in the LHS overflows, but the multiplication in the RHS overflows.
We need to first make sure that we won't multiply by INT_SMAX + 1.
Test case `add_of_mul` contributed by Sanjoy Das.
This fixes PR23635.
Differential Revision: http://reviews.llvm.org/D9629
llvm-svn: 238066
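The counterexample above can be verified by doing the arithmetic in 64 bits. This is a sketch with hypothetical helper names (`fitsInI32`, `lhsNoOverflow`, `rhsMulNoOverflow`), assuming 32-bit operands. It is not the check the patch adds.

```cpp
#include <cstdint>
#include <limits>

// True if v is representable as a signed 32-bit integer.
constexpr bool fitsInI32(std::int64_t v) {
  return v >= std::numeric_limits<std::int32_t>::min() &&
         v <= std::numeric_limits<std::int32_t>::max();
}

// LHS: A *nsw B +nsw A *nsw C. All three operations must stay in range.
constexpr bool lhsNoOverflow(std::int32_t a, std::int32_t b, std::int32_t c) {
  return fitsInI32(std::int64_t(a) * b) && fitsInI32(std::int64_t(a) * c) &&
         fitsInI32(std::int64_t(a) * b + std::int64_t(a) * c);
}

// Reduce a sum of two i32 values back into i32 range with wrap-around.
constexpr std::int64_t wrapToI32(std::int64_t v) {
  return v > std::numeric_limits<std::int32_t>::max()
             ? v - (std::int64_t{1} << 32)
             : (v < std::numeric_limits<std::int32_t>::min()
                    ? v + (std::int64_t{1} << 32)
                    : v);
}

// RHS: A *nsw (B + C), where the inner add wraps. Checks only the multiply.
constexpr bool rhsMulNoOverflow(std::int32_t a, std::int32_t b,
                                std::int32_t c) {
  return fitsInI32(std::int64_t(a) * wrapToI32(std::int64_t(b) + c));
}
```

For A = -1, B = 1, C = INT_SMAX, `lhsNoOverflow` is true while `rhsMulNoOverflow` is false: the inner sum wraps to INT_SMIN, and multiplying that by -1 is out of range.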
|
|
|
|
|
|
|
|
|
| |
This change does a few things:
- Move some InstCombine transforms to InstSimplify
- Run SimplifyCall from within InstCombine::visitCallInst
- Teach InstSimplify to fold [us]mul_with_overflow(X, undef) to 0.
llvm-svn: 237995
|
|
|
|
|
|
|
|
|
| |
A refactoring made @llvm.ssub.with.overflow.i32(i32 %X, i32 0) transform
into undef instead of %X.
This fixes PR23624.
llvm-svn: 237968
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make sure if we're truncating a constant that would then be sign extended
that the sign extension of the truncated constant is the same as the
original constant.
> Canonicalize min/max expressions correctly.
>
> This patch introduces a canonical form for min/max idioms where one operand
> is extended or truncated. This often happens when the other operand is a
> constant. For example:
>
> %1 = icmp slt i32 %a, i32 0
> %2 = sext i32 %a to i64
> %3 = select i1 %1, i64 %2, i64 0
>
> Would now be canonicalized into:
>
> %1 = icmp slt i32 %a, i32 0
> %2 = select i1 %1, i32 %a, i32 0
> %3 = sext i32 %2 to i64
>
> This builds upon a patch posted by David Majnemer
> (https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass
> passively stopped instcombine from ruining canonical patterns. This
> patch additionally actively makes instcombine canonicalize too.
>
> Canonicalization of expressions involving a change in type from int->fp
> or fp->int are not yet implemented.
llvm-svn: 237821
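The condition in the first paragraph of this message (a truncated constant must sign-extend back to the original) can be sketched as a round-trip test. The helper names `truncThenSext` and `survivesTruncSext` are hypothetical, and the widths (64 to 32 bits) are chosen only for illustration.

```cpp
#include <cstdint>

// Model `trunc i64 -> i32` followed by `sext i32 -> i64`, using only
// unsigned conversions so every step is well defined.
inline std::int64_t truncThenSext(std::int64_t c) {
  const std::uint32_t low =
      static_cast<std::uint32_t>(static_cast<std::uint64_t>(c));
  const std::int64_t wide = static_cast<std::int64_t>(low);
  // If bit 31 is set, the sign extension is negative.
  return (low & 0x80000000u) ? wide - (std::int64_t{1} << 32) : wide;
}

// True if the constant is unchanged by the truncate/sign-extend round trip.
inline bool survivesTruncSext(std::int64_t c) { return truncThenSext(c) == c; }
```

Constants such as 0 and -1 survive the round trip, while 2147483648 does not: it comes back as -2147483648, so narrowing a comparison against it would change its meaning.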
|
|
|
|
|
|
| |
This caused PR23583.
llvm-svn: 237739
|
|
|
|
|
|
| |
init only
llvm-svn: 237624
|
|
|
|
|
|
|
|
|
| |
SimplifyDemandedBits was "simplifying" a constant by removing just sign bits.
This caused a canonicalization race between different parts of instcombine.
Fix and regression test added - third time lucky?
llvm-svn: 237539
|
|
|
|
|
|
|
|
| |
The AArch64 LNT bot is unhappy - I've found that the problem is in
SimplifyDemandedBits, but that's going to require another code review
so reverting in the meantime.
llvm-svn: 237528
|