| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
SimplifySetCC could shrink a load without checking for
profitability or legality of such shink with a target.
Added checks to prevent shrinking of aligned scalar loads
in AMDGPU below dword as scalar engine does not support it.
Differential Revision: https://reviews.llvm.org/D53846
llvm-svn: 345778
|
|
|
|
|
|
|
|
|
| |
lowerRangeToAssertZExt currently relies on something like EarlyCSE having
eliminated the constant range [0,1). At -O0 this leads to an assert.
Differential Revision: https://reviews.llvm.org/D53888
llvm-svn: 345770
|
|
|
|
|
|
| |
builds. NFC
llvm-svn: 345761
|
|
|
|
| |
llvm-svn: 345758
|
|
|
|
|
|
| |
We should be using the getShiftAmountTy value type for shift amounts.
llvm-svn: 345756
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: RKSimon, spatel, javed.absar, craig.topper, t.p.northover
Reviewed By: RKSimon
Subscribers: craig.topper, llvm-commits
Differential Revision: https://reviews.llvm.org/D52504
llvm-svn: 345721
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D53216
llvm-svn: 345650
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Normalize the offset for endianess before checking
if the store cover the load in ForwardStoreValueToDirectLoad.
Without this we missed out on some optimizations for big
endian targets. If for example having a 4 bytes store followed
by a 1 byte load, loading the least significant byte from the
store, the STCoversLD check would fail (see @test4 in
test/CodeGen/AArch64/load-store-forwarding.ll).
This patch also fixes a problem seen in an out-of-tree target.
The target has i40 as a legal type, it is big endian,
and the StoreSize for i40 is 48 bits. So when normalizing
the offset for endianess we need to take the StoreSize into
account (assuming that padding added when storing into
a larger StoreSize always is added at the most significant
end).
Reviewers: niravd
Reviewed By: niravd
Subscribers: javed.absar, kristof.beyls, llvm-commits, uabelho
Differential Revision: https://reviews.llvm.org/D53776
llvm-svn: 345636
|
|
|
|
| |
llvm-svn: 345623
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Narrowing vector binops came up in the demanded bits discussion in D52912.
I don't think we're going to be able to do this transform in IR as a canonicalization
because of the risk of creating unsupported widths for vector ops, but we already have
a DAG TLI hook to allow what I was hoping for: isExtractSubvectorCheap(). This is
currently enabled for x86, ARM, and AArch64 (although only x86 has existing regression
test diffs).
This is artificially limited to not look through bitcasts because there are so many
test diffs already, but that's marked with a TODO and is a small follow-up.
Differential Revision: https://reviews.llvm.org/D53784
llvm-svn: 345602
|
|
|
|
| |
llvm-svn: 345598
|
|
|
|
|
|
|
|
|
|
| |
Similar to FoldCONCAT_VECTORS, this patch adds FoldBUILD_VECTOR to simplify cases that can avoid the creation of the BUILD_VECTOR - if all the operands are UNDEF or if the BUILD_VECTOR simplifies to a copy.
This exposed an assumption in some AMDGPU code that getBuildVector was guaranteed to be a BUILD_VECTOR node that I've tried to handle.
Differential Revision: https://reviews.llvm.org/D53760
llvm-svn: 345578
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Tests by @spatel, thanks
Reviewers: spatel, RKSimon
Reviewed By: spatel
Subscribers: sdardis, atanasyan, llvm-commits, spatel
Differential Revision: https://reviews.llvm.org/D52668
llvm-svn: 345575
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vector output type and a vector input type that needs to be widened
Summary: Previously if we had a bitcast vector output type that needs promotion and a vector input type that needs widening we would just do a stack store and load to handle the conversion. We can do a little better if we can widen the bitcast to a legal vector type the same size as the widened input type. Then we can do the bitcast between this widened type and the widened input type. Afterwards we can extract_subvector back to the original output and any_extend that. Type legalization will then circle back and handle promotion of the extract_subvector and the any_extend will just be removed. This will avoid going through the stack and allows us to remove a custom version of this legalization from X86.
Reviewers: efriedma, RKSimon
Reviewed By: efriedma
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D53229
llvm-svn: 345567
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add an intrinsic that takes 2 integers and perform saturation subtraction on
them.
This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.
Differential Revision: https://reviews.llvm.org/D53783
llvm-svn: 345512
|
|
|
|
| |
llvm-svn: 345481
|
|
|
|
|
|
| |
TargetLowering::expandUINT_TO_FP.
llvm-svn: 345478
|
|
|
|
|
|
|
|
| |
Add vector support to TargetLowering::expandFP_TO_UINT.
This exposes an issue in X86TargetLowering::LowerVSELECT which was assuming that the select mask was the same width as the LHS/RHS ops - as long as the result is a sign splat we can easily sext/trunk this.
llvm-svn: 345473
|
|
|
|
|
|
|
|
| |
Enable constant folding when both operands are vectors of constants.
Turn into FNEG/FABS when the RHS is a splat constant vector.
llvm-svn: 345469
|
|
|
|
|
|
|
|
| |
TargetLowering::expandFP_TO_UINT. NFCI.
First step towards fixing PR17686 and adding vector support.
llvm-svn: 345452
|
|
|
|
|
|
|
| |
We can extend this code to handle many more cases
if an extract is cheap, so prepping for that change.
llvm-svn: 345430
|
|
|
|
|
|
|
|
|
|
|
|
| |
illegal setccs. Add checks for valid setccs
The DAGTypeLegalizer::getSETCCWidenedResultTy was widening the MaskVT, but the code in convertMask called after getSETCCWidenedResultTy had no idea this widening had occurred. So none of the operands were widened when convertMask created new setccs with the widened VT.
This patch removes the widening and adds some asserts to getNode to validate the types of setccs to prevent issues like this in the future.
Differential Revision: https://reviews.llvm.org/D53743
llvm-svn: 345428
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The "dead" markings allow existing target-independent optimizations,
like MachineSink, to trigger more frequently. The CPSR defs would have
eventually been marked dead by LiveVariables, so this only affects
optimizations before regalloc.
The ARMBaseInstrInfo.cpp change is fixing a bug which is only visible
with this change: the transform adds a use to an otherwise dead def
of CPSR. This is covered by existing regression tests.
thumb2-tbh.ll breaks for Thumb1 due to MachineLICM changing the
generated code; I'll fix it in D53452.
Differential Revision: https://reviews.llvm.org/D53453
llvm-svn: 345420
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This adds support for LSDA (exception table) generation for wasm EH.
Wasm EH mostly follows the structure of Itanium-style exception tables,
with one exception: a call site table entry in wasm EH corresponds to
not a call site but a landing pad.
In wasm EH, the VM is responsible for stack unwinding. After an
exception occurs and the stack is unwound, the control flow is
transferred to wasm 'catch' instruction by the VM, after which the
personality function is called from the compiler-generated code. (Refer
to WasmEHPrepare pass for more information on this part.)
This patch:
- Changes wasm.landingpad.index intrinsic to take a token argument, to
make this 1:1 match with a catchpad instruction
- Stores landingpad index info and catch type info MachineFunction in
before instruction selection
- Lowers wasm.lsda intrinsic to an MCSymbol pointing to the start of an
exception table
- Adds WasmException class with overridden methods for table generation
- Adds support for LSDA section in Wasm object writer
Reviewers: dschuff, sbc100, rnk
Subscribers: mgorny, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D52748
llvm-svn: 345345
|
|
|
|
|
|
|
|
|
| |
Replacing BinaryOperator::isFNeg(...) to avoid regressions when we
separate FNeg from the FSub IR instruction.
Differential Revision: https://reviews.llvm.org/D53650
llvm-svn: 345295
|
|
|
|
|
|
|
|
| |
As noticed on D52965, the SINT_TO_FP i64 to f32 legalization code has been dead for years - protected by an assert.
Differential Revision: https://reviews.llvm.org/D53703
llvm-svn: 345290
|
|
|
|
| |
llvm-svn: 345257
|
|
|
|
|
|
|
|
|
|
|
|
| |
As suggested on D52965, this patch moves the i64 to f64 UINT_TO_FP expansion code from LegalizeDAG into TargetLowering and makes it available to LegalizeVectorOps as well.
Not only does this help perform X86 lowering as a true vectorization instead of (partially vectorized) scalar conversions, it avoids the HADDPD op from the scalar code which can be slow on most targets.
The AVX512F does have the vcvtusi2sdq scalar operation but we don't unroll to use it as it seems to only help for the v2f64 case - otherwise the unrolling cost will certainly be too high. My feeling is that we should leave it to the vectorizers - and if it generates the vector UINT_TO_FP we should use it.
Differential Revision: https://reviews.llvm.org/D53649
llvm-svn: 345256
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Changes all uses of minnan/maxnan to minimum/maximum
globally. These names emphasize that the semantic difference between
these operations is more than just NaN-propagation.
Reviewers: arsenm, aheejin, dschuff, javed.absar
Subscribers: jholewinski, sdardis, wdng, sbc100, jgravelle-google, jrtc27, atanasyan, llvm-commits
Differential Revision: https://reviews.llvm.org/D53112
llvm-svn: 345218
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Depends on D52765.
Reviewers: aheejin, dschuff
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D52768
llvm-svn: 345210
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Until now, we've only checked whether merging stores would cause a cycle via
the value argument, but the address and indexed offset arguments are also
capable of creating cycles in some situations.
The addresses are all base+offset with notionally the same base, but the base
SDNode may still be different (e.g. via an indexed load in one case, and an
ISD::ADD elsewhere). This allows cycles to creep in if one of these sources
depends on another.
The indexed offset is usually undef (representing a non-indexed store), but on
some architectures (e.g. 32-bit ARM-mode ARM) it can be an arbitrary value,
again allowing dependency cycles to creep in.
llvm-svn: 345200
|
|
|
|
|
|
|
|
| |
Add a SimplifyDemandedBitsForTargetNode callback to handle target nodes.
Differential Revision: https://reviews.llvm.org/D53643
llvm-svn: 345179
|
|
|
|
|
|
|
|
| |
Use SrcVT/DestVT types and correct shift type.
Part of prep work for D52965
llvm-svn: 345158
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When implementing memset's today we often see this pattern:
$x0 = MOV 0xXYXYXYXYXYXYXYXY
store $x0, ...
$w1 = MOV 0xXYXYXYXY
store $w1, ...
We first create a 64bit constant in a 64bit register with all bytes the
same and then create a 32bit constant with all bytes the same in a 32bit
register. In many targets we could just access the lower byte of the
64bit register instead.
- Ideally this would be handled by the ConstantHoist pass but it runs
too early when memset isn't expanded yet.
- The memset expansion code already had this optimization implemented,
however SelectionDAG constantfolding would constantfold the
"trunc(bigconstnat)" pattern to "smallconstant".
- This patch makes the memset expansion mark the constant as Opaque and
stop DAGCombiner from constant folding in this situation. (Similar to
how ConstantHoisting marks things as Opaque to avoid folding
ADD/SUB/etc.)
Differential Revision: https://reviews.llvm.org/D53181
llvm-svn: 345102
|
|
|
|
|
|
|
|
| |
As suggested on D53258, this patch move the CTPOP expansion code from SelectionDAGLegalize to TargetLowering to allow it to be reused by the VectorLegalizer.
Proper vector support will be added by D53258.
llvm-svn: 345066
|
|
|
|
|
|
|
|
| |
As suggested on D53258, this patch shares common CTLZ expansion code between VectorLegalizer and SelectionDAGLegalize by putting it in TargetLowering.
Extension to D53474
llvm-svn: 345060
|
|
|
|
|
|
|
|
|
| |
Vector types are not possible here because this code only starts
matching from the scalar bool value of a conditional branch, but
this is another step towards completely removing the fake binop
queries for not/neg/fneg.
llvm-svn: 345041
|
|
|
|
| |
llvm-svn: 345040
|
|
|
|
|
|
|
|
|
|
| |
As suggested on D53258, this patch demonstrates sharing common CTTZ expansion code between VectorLegalizer and SelectionDAGLegalize by putting it in TargetLowering.
I intend to move CTLZ and (scalar) CTPOP over as well and then update D53258 accordingly.
Differential Revision: https://reviews.llvm.org/D53474
llvm-svn: 345039
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add an intrinsic that takes 2 integers and perform unsigned saturation
addition on them.
This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.
Differential Revision: https://reviews.llvm.org/D53340
llvm-svn: 344971
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I've included a fix to DAGCombiner::ForwardStoreValueToDirectLoad that I believe will prevent the previous miscompile.
Original commit message:
Theoretically this was done to simplify the amount of isel patterns that were needed. But it also meant a substantial number of our isel patterns have to match an explicit bitcast. By making the vXi32/vXi16/vXi8 types legal for loads, DAG combiner should be able to change the load type to rem
I had to add some additional plain load instruction patterns and a few other special cases, but overall the isel table has reduced in size by ~12000 bytes. So it looks like this promotion was hurting us more than helping.
I still have one crash in vector-trunc.ll that I'm hoping @RKSimon can help with. It seems to relate to using getTargetConstantFromNode on a load that was shrunk due to an extract_subvector combine after the constant pool entry was created. So we end up decoding more mask elements than the lo
I'm hoping this patch will simplify the number of patterns needed to remove the and/or/xor promotion.
Reviewers: RKSimon, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits, RKSimon
Differential Revision: https://reviews.llvm.org/D53306
llvm-svn: 344965
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce new versions that follow the IEEE semantics
to help with legalization that may need quieted inputs.
There are some regressions from inserting unnecessary
canonicalizes when these are matched from fast math
fcmp + select which should be fixed in a future commit.
llvm-svn: 344914
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a late backend subset of the IR transform added with:
D52439
We can confirm that the conversion to a 'trunc' is correct by running:
$ opt -instcombine -data-layout="e"
(assuming the IR transforms are correct; change "e" to "E" for big-endian)
As discussed in PR39016:
https://bugs.llvm.org/show_bug.cgi?id=39016
...the pattern may emerge during legalization, so that's we are waiting for an
insertelement to become a scalar_to_vector in the pattern matching here.
The DAG allows for fun variations that are not possible in IR. Result types for
extracts and scalar_to_vector don't necessarily match input types, so that means
we have to be a bit more careful in the transform (see code comments).
The tests show that we don't handle cases that require a shift (as we did in the
IR version). I've left that as a potential follow-up because I'm not sure if
that's a real concern at this late stage.
Differential Revision: https://reviews.llvm.org/D53201
llvm-svn: 344872
|
|
|
|
|
|
|
|
| |
This reverts commit r344575.
Newly introduced test eh-lsda.ll.test fails with use-after-free under
ASAN build.
llvm-svn: 344639
|
|
|
|
|
|
|
|
|
|
|
| |
Add an intrinsic that takes 2 integers and perform saturation addition on them.
This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.
Differential Revision: https://reviews.llvm.org/D53053
llvm-svn: 344629
|
|
|
|
|
|
|
|
| |
Use SrcVT/DestVT types, correct shift type and AND instead of ZERO_EXTEND_IN_REG.
Part of prep work for D52965
llvm-svn: 344602
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This adds support for LSDA (exception table) generation for wasm EH.
Wasm EH mostly follows the structure of Itanium-style exception tables,
with one exception: a call site table entry in wasm EH corresponds to
not a call site but a landing pad.
In wasm EH, the VM is responsible for stack unwinding. After an
exception occurs and the stack is unwound, the control flow is
transferred to wasm 'catch' instruction by the VM, after which the
personality function is called from the compiler-generated code. (Refer
to WasmEHPrepare pass for more information on this part.)
This patch:
- Changes wasm.landingpad.index intrinsic to take a token argument, to
make this 1:1 match with a catchpad instruction
- Stores landingpad index info and catch type info MachineFunction in
before instruction selection
- Lowers wasm.lsda intrinsic to an MCSymbol pointing to the start of an
exception table
- Adds WasmException class with overridden methods for table generation
- Adds support for LSDA section in Wasm object writer
Reviewers: dschuff, sbc100, rnk
Subscribers: mgorny, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D52748
llvm-svn: 344575
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is intended to make the backend on par with functionality that was
added to the IR version of SimplifyDemandedVectorElts in:
rL343727
...and the original motivation is that we need to improve demanded-vector-elements
in several ways to avoid problems that would be exposed in D51553.
Differential Revision: https://reviews.llvm.org/D52912
llvm-svn: 344541
|
|
|
|
| |
llvm-svn: 344534
|
|
|
|
|
|
| |
The transform doesn't work if the vector constant has undef elements.
llvm-svn: 344532
|