| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
| |
The original costs stopped at SSE42, I've added conservative estimates for everything down to SSE1/SSE2 and moved some of the SSE42 costs to SSE41 (really only the addition of PCMPGT makes any difference).
I've also added missing vXi8 costs (we use PHMINPOSUW for i8/i16 for scarily quick results) and 256-bit vector costs for AVX1.
llvm-svn: 360528
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We hit undefined references building with ThinLTO when one source file
contained explicit instantiations of a template method (weak_odr) but
there were also implicit instantiations in another file (linkonce_odr),
and the latter was the prevailing copy. In this case the symbol was
marked hidden when the prevailing linkonce_odr copy was promoted to
weak_odr. It led to unsats when the resulting shared library was linked
with other code that contained a reference (expecting to be resolved due
to the explicit instantiation).
Add a CanAutoHide flag to the GV summary to allow the thin link to
identify when all copies are eligible for auto-hiding (because they were
all originally linkonce_odr global unnamed addr), and only do the
auto-hide in that case.
Most of the changes here are due to plumbing the new flag through the
bitcode and llvm assembly, and resulting test changes. I augmented the
existing auto-hide test to check for this situation.
Reviewers: pcc
Subscribers: mehdi_amini, inglorion, eraman, dexonsmith, arphaman, dang, llvm-commits, steven_wu, wmi
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59709
llvm-svn: 360466
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D61784
llvm-svn: 360461
|
| |
|
|
|
|
| |
We currently don't calcuate result ranges for these binary operators.
llvm-svn: 360460
|
| |
|
|
|
|
|
| |
One half of the bound is already computed correctly for these
tests, the other isn't.
llvm-svn: 360445
|
| |
|
|
| |
llvm-svn: 360437
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The test case checks were produced by the update_test_checks.py
scripts and I assumed that is sufficient. However, the behaviour
is different with different default target triples. Specify the
triple explicitly in the test case.
If this doesn't clean up the build bot breaks, I'll remove the test
case until I can get to the bottom of why the behaviour on build bots
is different from my machine.
llvm-svn: 360434
|
| |
|
|
| |
llvm-svn: 360433
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Constant expressions may not be added in strict postorder as the
forward instruction scan order. Thus, for a constant express (CE0), if
its operand (CE1) is used in an previous instruction, they are not in
postorder. However, different from
`cloneInstructionWithNewAddressSpace`,
`cloneConstantExprWithNewAddressSpace` doesn't bookkeep uninferred
instructions for later resolving. That results in failure of inferring
constant address.
- This patch adds the support to infer constant expression operand
recursively, since there won't be loop, if that operand is another
constant expression.
Reviewers: arsenm
Subscribers: jholewinski, jvesely, wdng, nhaehnle, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61760
llvm-svn: 360431
|
| |
|
|
|
|
|
| |
This patch just adds a test case to show the differences in code emitted
by opt before and after https://reviews.llvm.org/D61726.
llvm-svn: 360426
|
| |
|
|
| |
llvm-svn: 360424
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This test file has a long history of edits from changes outside
of vectorization, and it would happen again with the proposal in
D61726.
End-to-end testing shouldn't be happening in a test file that is
specifically checking for vector masked load/store ops.
Larger-scale testing goes in PhaseOrdering or the test-suite.
I've hopefully preserved the intent by taking what was completely
unoptimized IR in some tests and passing that through the -O1
pipeline. That becomes the input IR, and now we just run the loop
vectorizer and verify that the vector masked ops are produced as
expected.
llvm-svn: 360340
|
| |
|
|
|
|
| |
And use a more compact name for the tested struct.
llvm-svn: 360319
|
| |
|
|
| |
llvm-svn: 360314
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D61685
llvm-svn: 360281
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
InsertBinop tries to move insertion-points out of loops for expressions
that are loop-invariant. This patch adds a new parameter, IsSafeToHost,
to guard that hoisting. This allows callers to suppress that hoisting
for unsafe situations, such as divisions that may have a zero
denominator.
This fixes PR38697.
Differential Revision: https://reviews.llvm.org/D55232
llvm-svn: 360280
|
| |
|
|
| |
llvm-svn: 360279
|
| |
|
|
| |
llvm-svn: 360275
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reassociation's NegateValue moved instructions to the beginning of
blocks (after PHIs) without checking for exception handling pads.
It's possible for reassociation to move something into an exception
handling block so we need to make sure we don't move things too early
in the block. This change advances the insertion point past any
exception handling pads.
If the block we want to move into contains a catchswitch, we cannot
move into it. In that case just create a new neg as if we had not
found an existing neg to move.
Differential Revision: https://reviews.llvm.org/D61089
llvm-svn: 360262
|
| |
|
|
|
|
|
|
|
| |
This reverts commit 3b137a495686bd6018d115ea82fb8bb7718349fd.
As reported in https://reviews.llvm.org/D60846, this is causing
miscompiles.
llvm-svn: 360260
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we fold a branch/switch to an unconditional branch to another dead block we
replace the branch with unreachable, to avoid attempting to fold the
unconditional branch.
Reviewers: davide, efriedma, mssimpso, jdoerfert
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D61300
llvm-svn: 360232
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: davidxl
Reviewed By: davidxl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61661
llvm-svn: 360223
|
| |
|
|
|
|
|
|
|
| |
Improve isKnownNonZero for integers in order to improve cttz
optimizations.
Differential Revision: https://reviews.llvm.org/D60846
llvm-svn: 360222
|
| |
|
|
|
|
| |
From the LangRef: "Returns NaN only if both operands are NaN."
llvm-svn: 360206
|
| |
|
|
| |
llvm-svn: 360204
|
| |
|
|
| |
llvm-svn: 360203
|
| |
|
|
| |
llvm-svn: 360197
|
| |
|
|
| |
llvm-svn: 360190
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(X | C1) + C2 --> (X | C1) ^ C1 iff (C1 == -C2)
I verified the correctness using Alive:
https://rise4fun.com/Alive/YNV
This transform enables the following transform that already exists in
instcombine:
(X | Y) ^ Y --> X & ~Y
As a result, the full expected transform is:
(X | C1) + C2 --> X & ~C1 iff (C1 == -C2)
There already exists the transform in the sub case:
(X | Y) - Y --> X & ~Y
However this does not trigger in the case where Y is constant due to an earlier
transform:
X - (-C) --> X + C
With this new add fold, both the add and sub constant cases are handled.
Patch by Chris Dawson.
Differential Revision: https://reviews.llvm.org/D61517
llvm-svn: 360185
|
| |
|
|
|
|
|
|
|
| |
Fundamentally/generally, we should not have to rely on bailouts/crippling of
folds. In this particular case, I think we always recognize the inverted
predicate min/max pattern, so there should not be any loss of optimization.
Codegen looks better because we are eliminating an fneg.
llvm-svn: 360180
|
| |
|
|
| |
llvm-svn: 360173
|
| |
|
|
| |
llvm-svn: 360170
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
loop even after completion
Summary:
Bug: https://bugs.llvm.org/show_bug.cgi?id=39024
The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:
A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.
In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.
Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel
Reviewed By: hfinkel
Subscribers: bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits
Tags: #llvm, #debug-info
Differential Revision: https://reviews.llvm.org/D60831
llvm-svn: 360162
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently we express umin as `~umax(~x, ~y)`. However, this becomes
a problem for operands in non-integral pointer spaces, because `~x`
is not something we can compute for `x` non-integral. However, since
comparisons are generally still allowed, we are actually able to
express `umin(x, y)` directly as long as we don't try to express is
as a umax. Support this by adding an explicit umin/smin representation
to SCEV. We do this by factoring the existing getUMax/getSMax functions
into a new function that does all four. The previous two functions were
largely identical.
Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D50167
llvm-svn: 360159
|
| |
|
|
| |
llvm-svn: 360149
|
| |
|
|
|
|
|
|
|
|
| |
sink function calls without used results (PR41259)"
This reverts r357452 (git commit 21eb771dcb5c11d7500fa6ad551c97a921997f05).
This was causing strange optimization-related test failures on an internal test. Will followup with more details offline.
llvm-svn: 360086
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't always get this:
Cond ? -X : -Y --> -(Cond ? X : Y)
...even with the legacy IR form of fneg in the case with extra uses,
and we miss matching with the newer 'fneg' instruction because we
are expecting binops through the rest of the path.
Differential Revision: https://reviews.llvm.org/D61604
llvm-svn: 360075
|
| |
|
|
| |
llvm-svn: 360058
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D61573
llvm-svn: 360053
|
| |
|
|
| |
llvm-svn: 360052
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Fixes PR40699.
Reviewers: gchatelet
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61585
llvm-svn: 360021
|
| |
|
|
|
|
|
|
|
|
| |
Optimization pass lib/Transforms/IPO/GlobalOpt.cpp needs to insert
DW_OP_deref_size instead of DW_OP_deref to be compatible with big-endian
targets for same reasons as in D59687.
Differential Revision: https://reviews.llvm.org/D60611
llvm-svn: 360013
|
| |
|
|
| |
llvm-svn: 359982
|
| |
|
|
| |
llvm-svn: 359970
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(2nd try)
This is a subset of the original commit from rL359879
which was reverted because it could crash when using the 'RemovedInstructions'
structure that enables delayed deletion of dead instructions. The motivating
compile-time win does not require that change though. We should get most of
that win from this change alone.
Using/updating a dominator tree to match math overflow patterns may be very
expensive in compile-time (because of the way CGP uses a DT), so just handle
the single-block case.
See post-commit thread for rL354298 for more details:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646276.html
Differential Revision: https://reviews.llvm.org/D61075
llvm-svn: 359969
|
| |
|
|
|
|
|
|
| |
block"
This reverts commit r359879, which introduced a compiler crash.
llvm-svn: 359908
|
| |
|
|
| |
llvm-svn: 359897
|
| |
|
|
| |
llvm-svn: 359881
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using/updating a dominator tree to match math overflow patterns may be very
expensive in compile-time (because of the way CGP uses a DT), so just handle
the single-block case.
Also, we were restarting the iterator loops when doing the overflow intrinsic
transforms by marking the dominator tree for update. That was done to prevent
iterating over a removed instruction. But we can postpone the deletion using
the existing "RemovedInsts" structure, and that means we don't need to update
the DT.
See post-commit thread for rL354298 for more details:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646276.html
Differential Revision: https://reviews.llvm.org/D61075
llvm-svn: 359879
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Inalloca parameters require special handling in some optimizations.
This change causes globalopt to strip the inalloca attribute from
function parameters when it is safe to do so, removes the special
handling for inallocas from argpromotion, and replaces it with a
simple check that causes argpromotion to skip functions that receive
inallocas (for when the pass is invoked on code that didn't run
through globalopt first). This also avoids a case where argpromotion
would incorrectly try to pass an inalloca in a register.
Fixes PR41658.
Reviewers: rnk, efriedma
Reviewed By: rnk
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61286
llvm-svn: 359743
|