| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
shuffles
Summary: There are at least 2 ways to express the same shuffle. Various pieces of code explicit check for both option, but other places do not when they would benefit from doing it. This patches refactor the codebase to use buildLegalVectorShuffle in order to make that behavior more consistent.
Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri
Subscribers: javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66804
llvm-svn: 370190
|
|
|
|
|
|
|
|
| |
arguments. NFCI.
These were always being passed in and it allowed me to add the explicit tag to stop a cppcheck warning about 1 argument constructors.
llvm-svn: 370189
|
|
|
|
|
|
|
|
|
|
| |
Copied directly from the IR version.
Most of the testcases I've added for this are somewhat problematic
because they really end up testing the yet to be implemented version
for MUL_I24/MUL_U24.
llvm-svn: 370099
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(-X) * (-Y) + Z --> X * Y + Z
This is a missing optimization that shows up as a potential regression in D66050,
so we should solve it first. We appear to be partly missing this fold in IR as well.
We do handle the simpler case already:
(-X) * (-Y) --> X * Y
And it might be beneficial to make the constraint less conservative (eg, if both
operands are cheap, but not necessarily cheaper), but that causes infinite looping
for the existing fmul transform.
Differential Revision: https://reviews.llvm.org/D66755
llvm-svn: 370071
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
parallelizeChainedStores
Summary: As per title.
Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66659
llvm-svn: 370056
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
relegalization.
Summary: As per title.
Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66702
llvm-svn: 370040
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
visitGetElementPtr.
ConstantDataVector is a specialized verison of ConstantVector
that stores data in a packed array of bits instead of as
individual pointers to other Constants. But we really shouldn't
expose that if we can void it. And we should handle regular
ConstantVector equally well.
This removes a dyn_cast to ConstantDataVector and just calls
getSplatValue directly on a Constant* if the type is a vector.
llvm-svn: 370018
|
|
|
|
| |
llvm-svn: 370017
|
|
|
|
|
|
|
| |
This change causes instrumented builds of Clang to have a fatal error in the
backend. https://reviews.llvm.org/D66537 has the details.
llvm-svn: 370006
|
|
|
|
|
|
|
|
|
|
| |
undef/constant), (concat Y, undef/constant) -> concat (VBinOp X, Y), VecC
This improves the combine I included in D66504 to handle constants in the upper operands of the concat. If we can constant fold them away we can pull the concat after the bin op. This helps with chains of madd reductions on X86 from loop unrolling. The loop madd reduction pattern creates pmaddwd with half the width of the add that follows it using zeroes to fill the upper bits. If we have two of these added together we can pull the zeroes through the accumulating add and then shrink it.
Differential Revision: https://reviews.llvm.org/D66680
llvm-svn: 369937
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This comes as a first step toward processing the DAG nodes in topological orders. Doing so ensure that arguments of a node are combined before the node itself is combined, which exposes ore opportunities for optimization and/or reduce the amount of patterns a node has to match for.
DAGCombiner adding nodes to the worklist is various places causes the nodes to be in a different order from what is expected. In addition, this is reduant because these nodes end up being added to the worklist anyways due to the machinery at line 1621.
Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66537
llvm-svn: 369927
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
inserting into undef
Summary:
Concat_vectors is more canonical during early DAG combine. For example, its what's used by SelectionDAGBuilder when converting IR shuffles into SelectionDAG shuffles when element counts between inputs and mask don't match. We also have combines in DAGCombiner than can pull concat_vectors through a shuffle. See partitionShuffleOfConcats. So it seems like concat_vectors is a better operation to use here. I had to teach DAGCombiner's SimplifyVBinOp to also handle concat_vectors with undef. I haven't checked yet if we can remove the INSERT_SUBVECTOR version in there or not.
I didn't want to mess with the other caller of getShuffleHalfVectors that's used during shuffle lowering where insert_subvector probably is what we want to produce so I've enabled this via a boolean passed to the function.
Reviewers: spatel, RKSimon
Reviewed By: RKSimon
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66504
llvm-svn: 369872
|
|
|
|
|
|
|
|
|
|
| |
These can turn up during multiplication legalization. In principle
these should also apply to smul_lohi, but I wasn't able to figure
out how to produce those with the necessary operands.
Differential Revision: https://reviews.llvm.org/D66380
llvm-svn: 369864
|
|
|
|
| |
llvm-svn: 369803
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SimplifyDemandedVectorElts simplification. Add a hack to X86 to avoid a regression
Patch showing the effect of enabling bool vector oversimplification.
Non-VLX builds can simplify a kshift shuffle, but VLX builds simplify:
insert_subvector v8i zeroinitializer, v2i --> insert_subvector v8i undef, v2i
Preventing the removal of the AND to clear the upper bits of result
Differential Revision: https://reviews.llvm.org/D53022
llvm-svn: 369780
|
|
|
|
|
|
|
|
| |
If the accumulator and either of the multiply operands are negatable then we can we negate the entire expression.
Differential Revision: https://reviews.llvm.org/D63141
llvm-svn: 369746
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
computations
Summary: These nodes end up being processed regardless due to DAGCombiner ensuring arguments are processed. This changes the order in which nodes are processed, which fixes an issue on PowerPC.
Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri, mcberg2017, stefanp, hfinkel
Subscribers: nemanjai, MaskRay, jsji, steven.zhang, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66548
llvm-svn: 369662
|
|
|
|
|
|
|
|
|
|
| |
The patch introduces MakeLibCallOptions struct as suggested by @efriedma on D65497.
The struct contain argument flags which will pass to makeLibCall function.
The patch should not has any functionality changes.
Differential Revision: https://reviews.llvm.org/D65795
llvm-svn: 369622
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
These calls change the order in which some nodes are processed and so have an effect on codegen.
The change in fixup-bw-copy.ll is due to (and (load anyext)) gets transformed into (load zext) while previously the and was removed by SimplifyDemandedBits, so the (load anyext) remained.
Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66543
llvm-svn: 369561
|
|
|
|
| |
llvm-svn: 369520
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(concat_vectors X, Y), undef)) -> (concat_vectors X, Y, undef, undef)
I also had to add a new combine to X86's combineExtractSubvector to prevent a regression.
This helps our vXi1 code see the full concat operation and allow it optimize undef to a zero if there is already a zero in the concat. This helped us use a movzx instead of an AND in some of the tests. In those tests, one concat comes from SelectionDAGBuilder and the second comes from type legalization of v4i1->i4 bitcasts which uses an additional concat. Though these changes weren't my original motivation.
I'm looking at making X86ISelLowering's narrowShuffle emit a concat_vectors instead of an insert_subvector since concat_vectors is more canonical during early DAG combine. This patch helps prevent a regression from my experiments with that.
Differential Revision: https://reviews.llvm.org/D66456
llvm-svn: 369459
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The general fold is only valid for positive divisors.
Which effectively means, it is invalid for `INT_MIN` divisors,
and we currently bailout if we see them.
But that is too strict, we can just fix-up the results.
For that, let's do a second computation 'in parallel':
```
Name: srem -> and
Pre: isPowerOf2(C)
%o = srem i8 %X, C
%r = icmp eq %o, 0
=>
%n = and i8 %X, C-1
%r = icmp eq %n, 0
```
https://rise4fun.com/Alive/Sup
And then just blend results: if the divisor was `INT_MIN`,
pick the value we got via bit-test,
else pick the value from general fold.
There's interesting observation - `ISD::ROTR` is set to
`LegalizeAction::Expand` before AVX512, so we should not
treat `INT_MIN` divisor as even; and as it can be seen
while `@test_srem_odd_even_one` improves on all run-lines,
`@test_srem_odd_even_INT_MIN` only improves for AVX512.
Reviewers: RKSimon, craig.topper, spatel
Reviewed By: RKSimon
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66300
llvm-svn: 369268
|
|
|
|
| |
llvm-svn: 369204
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add the following DAGCombiner folds for mulfix being
one of SMULFIX/UMULFIX/SMULFIXSAT:
(mulfix x, undef, scale) -> 0
(mulfix x, 0, scale) -> 0
Also added canonicalization of constants to RHS.
Reviewers: RKSimon, craig.topper, spatel
Reviewed By: RKSimon
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66052
llvm-svn: 369103
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).
Partial reverts in:
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
PPCFastISel.cpp - No Register::operator-=()
PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor
Manual fixups in:
ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&
Depends on D65919
Reviewers: arsenm, bogner, craig.topper, RKSimon
Reviewed By: arsenm
Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65962
llvm-svn: 369041
|
|
|
|
|
|
|
|
| |
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.
llvm-svn: 369013
|
|
|
|
|
|
| |
SmallBitVector/BitVector types aren't used at all in the cpp file.
llvm-svn: 369008
|
|
|
|
|
|
| |
BitVector type isn't used at all in the cpp file.
llvm-svn: 369007
|
|
|
|
|
|
|
|
| |
Set the StartIdx type to size_t so that it matches the StoreNodes SmallVector size() and index types.
Silences the MSVC analyzer warning that unsigned increment might overflow before exceeding size_t on 64-bit targets - this isn't likely to happen but it means we use consistent types and reduces the warning "noise" a little.
llvm-svn: 368998
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the last step in an FP add reduction allows reassociation and doesn't care
about -0.0, then we are free to recognize that computation as a reduction
that may reorder the intermediate steps.
This is requested directly by PR42705:
https://bugs.llvm.org/show_bug.cgi?id=42705
and solves PR42947 (if horizontal math instructions are actually faster than
the alternative):
https://bugs.llvm.org/show_bug.cgi?id=42947
Differential Revision: https://reviews.llvm.org/D66236
llvm-svn: 368995
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a ptrmask intrinsic which allows masking out bits of a
pointer that must be zero when accessing it, because of ABI alignment
requirements or a restriction of the meaningful bits of a pointer
through the data layout.
This avoids doing a ptrtoint/inttoptr round trip in some cases (e.g. tagged
pointers) and allows us to not lose information about the underlying
object.
Reviewers: nlopes, efriedma, hfinkel, sanjoy, jdoerfert, aqjune
Reviewed by: sanjoy, jdoerfert
Differential Revision: https://reviews.llvm.org/D59065
llvm-svn: 368986
|
|
|
|
|
|
| |
vector zeroinitializer indices in addition to scalar zeroes.
llvm-svn: 368926
|
|
|
|
| |
llvm-svn: 368905
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This implements an optimization described in Hacker's Delight 10-17:
when `C` is constant, the result of `X % C == 0` can be computed
more cheaply without actually calculating the remainder.
The motivation is discussed here: https://bugs.llvm.org/show_bug.cgi?id=35479.
One huge caveat: this signed case is only valid for positive divisors.
While we can freely negate negative divisors, we can't negate `INT_MIN`,
so for now if `INT_MIN` is encountered, we bailout.
As a follow-up, it should be possible to handle that more gracefully
via extra `and`+`setcc`+`select`.
This passes llvm's test-suite, and from cursory(!) cross-examination
the folds (the assembly) match those of GCC, and manual checking via alive
did not reveal any issues (other than the `INT_MIN` case)
Reviewers: RKSimon, spatel, hermord, craig.topper, xbolva00
Reviewed By: RKSimon, xbolva00
Subscribers: xbolva00, thakis, javed.absar, hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65366
llvm-svn: 368702
|
|
|
|
|
|
|
|
| |
The comment initially matched the code, but the code was incorrect
and was fixed after the initial revert back back when it was introduced,
but the comment was never updated.
llvm-svn: 368701
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SimplifyMultipleUseDemandedBits for ISD::EXTRACT_VECTOR_ELT"
This introduced a false positive MemorySanitizer warning about use of
uninitialized memory in a vectorized crc function in Chromium. That suggests
maybe something is not right with this transformation. See
https://crbug.com/992853#c7 for a reproducer.
This also reverts the follow-up commits r368307 and r368308 which
depended on this.
> This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract.
>
> In particular this helps remove some unnecessary scalar->vector->scalar patterns.
>
> The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen are to be refactored in the near-term and isn't considered a major issue.
>
> Differential Revision: https://reviews.llvm.org/D65887
llvm-svn: 368660
|
|
|
|
|
|
| |
for ISD::TRUNCATE
llvm-svn: 368553
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
After the commits that changed x86 backend to widen vectors
instead of using promotion some of our downstream tests
started to fail. It was noticed that WidenVectorResult has
been missing support for SMULFIX/UMULFIX/SMULFIXSAT. This
patch adds the missing functionality.
Reviewers: craig.topper, RKSimon
Reviewed By: craig.topper
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66051
llvm-svn: 368540
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the codegen part of fixing:
https://bugs.llvm.org/show_bug.cgi?id=32939
Even with the optimal/canonical IR that is ideally created by D65954,
we would reverse that transform in DAGCombiner and end up with the same
asm on AArch64 or x86.
I see 2 options for trying to correct this:
1. Limit isNegatibleForFree() by special-casing the fmul pattern (this patch).
2. Avoid creating (fmul X, 2.0) in the 1st place by adding a special-case
transform to SelectionDAG::getNode() and/or SelectionDAGBuilder::visitFMul()
that matches the transform done by DAGCombiner.
This seems like the less intrusive patch, but if there's some other reason to
prefer 1 option over the other, we can change to the other option.
Differential Revision: https://reviews.llvm.org/D66016
llvm-svn: 368490
|
|
|
|
|
|
|
|
|
| |
This is handled at node creation time (similar to X/1.0)
after:
rL357029
(no fast-math-flags needed)
llvm-svn: 368443
|
|
|
|
|
|
|
|
|
|
| |
mload/mstore/mgather/mscatter from DAGCombiner to the type legalizer.
We may be able to look to how VSELECT is handled to further
improve this, but this appears to be neutral or an improvement
on the test cases we have.
llvm-svn: 368344
|
|
|
|
| |
llvm-svn: 368343
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for ISD::EXTRACT_VECTOR_ELT
This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract.
In particular this helps remove some unnecessary scalar->vector->scalar patterns.
The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen are to be refactored in the near-term and isn't considered a major issue.
Differential Revision: https://reviews.llvm.org/D65887
llvm-svn: 368276
|
|
|
|
|
|
| |
with a fix to clear the SDNode map when SelectionDAG is cleared.
llvm-svn: 368230
|
|
|
|
|
|
|
|
| |
for ISD::VECTOR_SHUFFLE
In particular this helps the SSE vector shift cvttps2dq+add+shl pattern by avoiding the need for zeros in shuffle style extensions to vXi32 types as we'll be shifting out those bits anyway
llvm-svn: 368155
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
https://reviews.llvm.org/D65698
This adds a KnownBits analysis pass for GISel. This was done as a
pass (compared to static functions) so that we can add other features
such as caching queries(within a pass and across passes) in the future.
This patch only adds the basic pass boiler plate, and implements a lazy
non caching knownbits implementation (ported from SelectionDAG). I've
also hooked up the AArch64PreLegalizerCombiner pass to use this - there
should be no compile time regression as the analysis is lazy.
llvm-svn: 368065
|
|
|
|
|
|
|
|
| |
undemanded ops
If we demand no bits/elts from an Op, just return UNDEF
llvm-svn: 368043
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch changes the DAG legalizer to respect the operation actions
set by the target for strict floating-point operations. (Currently, the
legalizer will usually fall back to mutate to the non-strict action
(which is assumed to be legal), and only skip mutation if the strict
operation is marked legal.)
With this patch, if whenever a strict operation is marked as Legal or
Custom, it is passed to the target as usual. Only if it is marked as
Expand will the legalizer attempt to mutate to the non-strict operation.
Note that this will now fail if the non-strict operation is itself
marked as Custom -- the target will have to provide a Custom definition
for the strict operation then as well.
Reviewed By: hfinkel
Differential Revision: https://reviews.llvm.org/D65226
llvm-svn: 368012
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Before this patch MGATHER/MSCATTER is capable of representing all
common addressing modes, but only when illegal types are used.
This patch adds an IndexType property so more representations
are available when using legal types only.
Original modes:
vector of bases
base + vector of signed scaled offsets
New modes:
base + vector of signed unscaled offsets
base + vector of unsigned scaled offsets
base + vector of unsigned unscaled offsets
The current behaviour of addressing modes for gather/scatter remains
unchanged.
Patch by Paul Walker.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D65636
llvm-svn: 368008
|
|
|
|
| |
llvm-svn: 367974
|