| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
support. NFCI.
Prior to initial work to add vector expansion support, remove assumptions that we're working on scalar types.
llvm-svn: 346139
|
|
|
|
|
|
|
|
| |
This patch adds support for expanding vector CTPOP instructions and removes the x86 'bitmath' lowering which replicates the same expansion.
Differential Revision: https://reviews.llvm.org/D53258
llvm-svn: 345869
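
For readers unfamiliar with the "bitmath" expansion mentioned above, the following is a minimal scalar sketch of the classic parallel bit-count sequence for a 32-bit value. The name `popcount32` is hypothetical and this is not the LLVM code itself; a vector expansion would presumably apply the same steps to every lane.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: counts set bits using only shifts, masks, adds and
// one multiply, i.e. operations that also exist as vector instructions.
constexpr uint32_t popcount32(uint32_t v) {
  v = v - ((v >> 1) & 0x55555555u);                 // 2-bit partial sums
  v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u); // 4-bit partial sums
  v = (v + (v >> 4)) & 0x0F0F0F0Fu;                 // 8-bit partial sums
  return (v * 0x01010101u) >> 24;                   // add the four bytes
}
```

The final multiply sums the four byte counts into the top byte, which is why the result is shifted right by 24.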
|
|
|
|
|
|
|
|
|
|
|
|
| |
SimplifySetCC could shrink a load without checking with the target
whether such a shrink is profitable or legal.
Added checks to prevent shrinking of aligned scalar loads
in AMDGPU below dword, as the scalar engine does not support it.
Differential Revision: https://reviews.llvm.org/D53846
llvm-svn: 345778
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add an intrinsic that takes 2 integers and performs saturation subtraction on
them.
This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.
Differential Revision: https://reviews.llvm.org/D53783
llvm-svn: 345512
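
As an illustration of what saturation subtraction means, here is a small sketch for 32-bit integers. The commit text does not say whether the intrinsic is signed, unsigned or both, so both variants are shown; the names `ssub_sat32` and `usub_sat32` are hypothetical and are not the intrinsic names.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Signed: compute in a wider type, then clamp to the int32_t range.
constexpr int32_t ssub_sat32(int32_t a, int32_t b) {
  const int64_t wide = static_cast<int64_t>(a) - static_cast<int64_t>(b);
  if (wide > std::numeric_limits<int32_t>::max())
    return std::numeric_limits<int32_t>::max();
  if (wide < std::numeric_limits<int32_t>::min())
    return std::numeric_limits<int32_t>::min();
  return static_cast<int32_t>(wide);
}

// Unsigned: the only way to leave the range is to go below zero.
constexpr uint32_t usub_sat32(uint32_t a, uint32_t b) {
  return a > b ? a - b : 0u;
}
```

Instead of wrapping around, the result sticks at the nearest representable bound, which is the behaviour fixed point arithmetic typically wants.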
|
|
|
|
|
|
| |
TargetLowering::expandUINT_TO_FP.
llvm-svn: 345478
|
|
|
|
|
|
|
|
| |
Add vector support to TargetLowering::expandFP_TO_UINT.
This exposes an issue in X86TargetLowering::LowerVSELECT which was assuming that the select mask was the same width as the LHS/RHS ops - as long as the result is a sign splat we can easily sext/trunc this.
llvm-svn: 345473
|
|
|
|
|
|
|
|
| |
TargetLowering::expandFP_TO_UINT. NFCI.
First step towards fixing PR17686 and adding vector support.
llvm-svn: 345452
|
|
|
|
|
|
|
|
|
|
|
|
| |
As suggested on D52965, this patch moves the i64 to f64 UINT_TO_FP expansion code from LegalizeDAG into TargetLowering and makes it available to LegalizeVectorOps as well.
Not only does this help perform X86 lowering as a true vectorization instead of (partially vectorized) scalar conversions, it avoids the HADDPD op from the scalar code which can be slow on most targets.
AVX512F does have the vcvtusi2sdq scalar operation but we don't unroll to use it as it seems to only help for the v2f64 case - otherwise the unrolling cost will certainly be too high. My feeling is that we should leave it to the vectorizers - and if they generate the vector UINT_TO_FP we should use it.
Differential Revision: https://reviews.llvm.org/D53649
llvm-svn: 345256
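
To make the i64 to f64 expansion more concrete, here is a scalar sketch of one well-known branch-free way to do this conversion using only integer ORs and two floating-point operations. It assumes IEEE-754 binary64 `double` and round-to-nearest; the helper names are hypothetical, and the commit message does not confirm that the moved code follows exactly these steps.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret a 64-bit pattern as a double (assumes IEEE-754 binary64).
inline double bits_to_double(uint64_t bits) {
  double d;
  std::memcpy(&d, &bits, sizeof d);
  return d;
}

// Hypothetical scalar model of an unsigned 64-bit to double conversion.
inline double uint64_to_double(uint64_t x) {
  const uint64_t kTwoP52 = 0x4330000000000000ULL;        // bits of 2^52
  const uint64_t kTwoP84 = 0x4530000000000000ULL;        // bits of 2^84
  const uint64_t kTwoP84PlusP52 = 0x4530000000100000ULL; // bits of 2^84 + 2^52

  const uint64_t lo = x & 0xFFFFFFFFULL;
  const uint64_t hi = x >> 32;

  // OR-ing 32 payload bits into the mantissa gives exact values:
  const double lo_flt = bits_to_double(lo | kTwoP52); // 2^52 + lo
  const double hi_flt = bits_to_double(hi | kTwoP84); // 2^84 + hi * 2^32

  // Exact subtraction: hi * 2^32 - 2^52.
  const double hi_sub = hi_flt - bits_to_double(kTwoP84PlusP52);

  // The only rounding step: (2^52 + lo) + (hi * 2^32 - 2^52) = x.
  return lo_flt + hi_sub;
}
```

Because every step is a lane-wise OR, subtract or add, the same sequence can be applied to a whole vector at once, which fits the motivation given above for avoiding per-element scalar conversions.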
|
|
|
|
|
|
|
|
| |
Add a SimplifyDemandedBitsForTargetNode callback to handle target nodes.
Differential Revision: https://reviews.llvm.org/D53643
llvm-svn: 345179
|
|
|
|
|
|
|
|
| |
As suggested on D53258, this patch moves the CTPOP expansion code from SelectionDAGLegalize to TargetLowering to allow it to be reused by the VectorLegalizer.
Proper vector support will be added by D53258.
llvm-svn: 345066
|
|
|
|
|
|
|
|
| |
As suggested on D53258, this patch shares common CTLZ expansion code between VectorLegalizer and SelectionDAGLegalize by putting it in TargetLowering.
Extension to D53474
llvm-svn: 345060
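
As background, one common way to expand CTLZ when only shifts, ORs and a population count are available is to smear the highest set bit downwards and count the zeros that remain. The sketch below shows that idea for 32 bits; the function names are hypothetical and the commit message does not state that the shared code uses exactly this sequence.

```cpp
#include <cassert>
#include <cstdint>

// Simple reference population count (clears the lowest set bit each round).
constexpr unsigned count_ones32(uint32_t v) {
  unsigned n = 0;
  for (; v != 0; v &= v - 1)
    ++n;
  return n;
}

// Hypothetical CTLZ expansion: after smearing, every bit at or below the
// highest set bit is 1, so the leading zeros are exactly the bits still 0.
constexpr unsigned ctlz32(uint32_t x) {
  x |= x >> 1;
  x |= x >> 2;
  x |= x >> 4;
  x |= x >> 8;
  x |= x >> 16;
  return count_ones32(~x);
}
```

Note that this form returns 32 for an input of zero, so it needs no special case.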
|
|
|
|
|
|
|
|
|
|
| |
As suggested on D53258, this patch demonstrates sharing common CTTZ expansion code between VectorLegalizer and SelectionDAGLegalize by putting it in TargetLowering.
I intend to move CTLZ and (scalar) CTPOP over as well and then update D53258 accordingly.
Differential Revision: https://reviews.llvm.org/D53474
llvm-svn: 345039
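
For illustration, CTTZ can be expressed through a population count with the identity cttz(x) = ctpop(~x & (x - 1)): the mask selects exactly the zero bits below the lowest set bit. The names below are hypothetical, and this is a sketch of the identity rather than a copy of the shared expansion code.

```cpp
#include <cassert>
#include <cstdint>

// Simple reference population count (clears the lowest set bit each round).
constexpr unsigned set_bits32(uint32_t v) {
  unsigned n = 0;
  for (; v != 0; v &= v - 1)
    ++n;
  return n;
}

// Hypothetical CTTZ expansion: x - 1 flips the trailing zeros to ones and
// clears the lowest set bit; AND-ing with ~x keeps only those flipped bits.
constexpr unsigned cttz32(uint32_t x) {
  return set_bits32(~x & (x - 1u));
}
```

For x == 0 the mask is all ones, so the result is 32 without any extra check.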
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add an intrinsic that takes 2 integers and performs unsigned saturation
addition on them.
This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.
Differential Revision: https://reviews.llvm.org/D53340
llvm-svn: 344971
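
A minimal sketch of unsigned saturation addition for 32-bit values follows. The name `uadd_sat32` is hypothetical and only models the semantics described above, not how the intrinsic is lowered.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned addition wraps on overflow, and a wrapped sum is always smaller
// than either operand, so one comparison detects the overflow.
constexpr uint32_t uadd_sat32(uint32_t a, uint32_t b) {
  const uint32_t sum = a + b;
  return sum < a ? 0xFFFFFFFFu : sum;
}
```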
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce new versions that follow the IEEE semantics
to help with legalization that may need quieted inputs.
There are some regressions from inserting unnecessary
canonicalizes when these are matched from fast math
fcmp + select which should be fixed in a future commit.
llvm-svn: 344914
|
|
|
|
|
|
|
|
|
|
|
| |
Add an intrinsic that takes 2 integers and performs saturation addition on them.
This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.
Differential Revision: https://reviews.llvm.org/D53053
llvm-svn: 344629
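
Assuming this commit covers the signed case (the message only says "saturation addition"), the semantics can be sketched as follows for 32-bit integers. The name `sadd_sat32` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Check for overflow before adding, because signed overflow is undefined
// behaviour in C++ and must not actually happen.
constexpr int32_t sadd_sat32(int32_t a, int32_t b) {
  constexpr int32_t kMax = std::numeric_limits<int32_t>::max();
  constexpr int32_t kMin = std::numeric_limits<int32_t>::min();
  if (b > 0 && a > kMax - b)
    return kMax; // would overflow upwards
  if (b < 0 && a < kMin - b)
    return kMin; // would overflow downwards
  return a + b;
}
```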
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is intended to make the backend on par with functionality that was
added to the IR version of SimplifyDemandedVectorElts in:
rL343727
...and the original motivation is that we need to improve demanded-vector-elements
in several ways to avoid problems that would be exposed in D51553.
Differential Revision: https://reviews.llvm.org/D52912
llvm-svn: 344541
|
|
|
|
|
|
| |
Help stop bugs like rL343935 by making the 'original' DemandedBits arg more obviously not the mask that is actually used.
llvm-svn: 344138
|
|
|
|
|
|
| |
Part of a minor cleanup to make all the switch statements more consistent prior to improving vector support.
llvm-svn: 344136
|
|
|
|
|
|
|
|
|
|
| |
SimplifyDemandedBits/SimplifyDemandedVectorElts
Similar to what already happens in the DAGCombiner wrappers, this patch adds the root nodes back onto the worklist if the DCI wrappers' SimplifyDemandedBits/SimplifyDemandedVectorElts were successful.
Differential Revision: https://reviews.llvm.org/D53026
llvm-svn: 344132
|
|
|
|
|
|
|
|
| |
SimplifyDemandedBits
Fix for AVX1 masked load/store regression on D52964
llvm-svn: 344043
|
|
|
|
|
|
|
|
|
|
|
|
| |
SimplifyDemandedVectorElts simplification
rL343913 was using SimplifyDemandedBits's original demanded mask instead of the adjusted 'NewMask' that accounts for multiple uses of the op (those variable names really need improving....).
Annoyingly many of the test changes (back to pre-rL343913 state) are actually safe - but only because their multiple uses are all by PMULDQ/PMULUDQ.
Thanks to Jan Vesely (@jvesely) for bisecting the bug.
llvm-svn: 343935
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
simplification
This patch enables SimplifyDemandedBits to call SimplifyDemandedVectorElts in cases where the demanded bits mask covers entire elements of a bitcasted source vector.
There are a couple of cases here where simplification at a deeper level (such as through bitcasts) prevents further simplification - CommitTargetLoweringOpt only adds immediate uses/users back to the worklist when we might want to combine the original caller again to see what else it can simplify.
As well as that I had to disable handling of bool vectors until SimplifyDemandedVectorElts better supports some of their opcodes (SETCC, shifts etc.).
Fixes PR39178
Differential Revision: https://reviews.llvm.org/D52935
llvm-svn: 343913
|
|
|
|
|
|
|
|
|
|
|
| |
Adding NonNull as attributes to returned pointers has the unfortunate side
effect of disabling tail calls. This patch ignores the NonNull attribute when
we decide whether to tail merge, in the same way that we ignore the NoAlias
attribute, as it has no effect on the call sequence.
Differential Revision: https://reviews.llvm.org/D52238
llvm-svn: 343091
|
|
|
|
|
|
|
|
|
| |
This was trying to scalarize a scalar FP type,
resulting in an assert.
Fixes unaligned f64 stack stores for AMDGPU.
llvm-svn: 342132
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
condition operand
This is the DAG equivalent of D51433.
If we know we're not using all vector lanes, use that knowledge to potentially simplify a vselect condition.
The reduction/horizontal tests show that we are eliminating AVX1 operations on the upper half of 256-bit
vectors because we don't need those anyway.
I'm not sure what the pr34592 test is showing. That's run with -O0; is SimplifyDemandedVectorElts supposed
to be running there?
Differential Revision: https://reviews.llvm.org/D51696
llvm-svn: 341762
|
|
|
|
|
|
|
|
| |
Fix remaining cases not committed in https://reviews.llvm.org/D49574
Differential Revision: https://reviews.llvm.org/D50659
llvm-svn: 341380
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
A follow-up for D49266 / rL337166 + D49497 / rL338044.
This is still the same pattern to check for the [lack of]
signed truncation, but in this case the constants and the predicate
are negated.
https://rise4fun.com/Alive/BDV
https://rise4fun.com/Alive/n7Z
Reviewers: spatel, craig.topper, RKSimon, javed.absar, efriedma, dmgreen
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D51532
llvm-svn: 341287
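
As background on what a "signed truncation check" looks like, the sketch below shows three equivalent ways to ask whether a 32-bit value survives truncation to 8 bits and sign extension back. The exact patterns matched by the patch are not spelled out in the message above, so these forms and their names are illustrative assumptions; the third one uses negated constants and the opposite unsigned predicate, in the spirit of the description.

```cpp
#include <cassert>
#include <cstdint>

// Form 1: shift left then arithmetic shift right, compare with the input.
// (Right-shifting a negative value is arithmetic on mainstream compilers
// and guaranteed to be so since C++20.)
constexpr bool fits_i8_shift(int32_t x) {
  const int32_t shl = static_cast<int32_t>(static_cast<uint32_t>(x) << 24);
  return (shl >> 24) == x;
}

// Form 2: bias by 2^7 and do one unsigned "less than 2^8" comparison.
constexpr bool fits_i8_add(int32_t x) {
  return static_cast<uint32_t>(x) + 128u < 256u;
}

// Form 3: negated constants and predicate: x + (-2^7) u>= -(2^8).
constexpr bool fits_i8_negated(int32_t x) {
  return static_cast<uint32_t>(x) - 128u >= 0xFFFFFF00u;
}
```

All three return true exactly for x in [-128, 127].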
|
|
|
|
|
|
| |
This reduces most of the sdiv stages (the MULHS, shifts etc.) to just zero/identity values and uses the numerator scale factor to multiply by +1/-1.
llvm-svn: 340260
|
|
|
|
|
|
| |
Fuzz tests have detected an issue, currently working on a fix.
llvm-svn: 340195
|
|
|
|
|
|
|
|
|
|
| |
This patch refactors the existing TargetLowering::BuildSDIV base implementation to support non-uniform constant vector denominators.
This is the last patch necessary to close PR36545
Differential Revision: https://reviews.llvm.org/D50765
llvm-svn: 339908
|
|
|
|
|
|
| |
Pull out magic factor calculators into a helper function, use 0/+1/-1 multiplication factor to (optionally) add/sub the numerator.
llvm-svn: 339898
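
The role of the 0/+1/-1 factor can be illustrated with a scalar model of signed division by a constant via a "magic" multiplier. In this sketch each lane carries its own magic value, numerator factor and shift, which is what makes non-uniform divisors possible. The struct, the function names and the hard-coded constants (the well-known 32-bit values for 3, 7 and -7) are assumptions for illustration, not LLVM's data structures, and the helper that computes the constants is out of scope.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-lane parameters for signed division by a constant.
struct SDivLane {
  int32_t magic;  // multiplier used in the high-half multiply
  int32_t factor; // 0, +1 or -1: how much of the numerator to add back
  int shift;      // arithmetic right shift applied afterwards
};

constexpr SDivLane kDivBy3{0x55555556, 0, 0};
constexpr SDivLane kDivBy7{-1840700269 /* 0x92492493 */, 1, 2};
constexpr SDivLane kDivByNeg7{0x6DB6DB6D, -1, 2};

// Signed high half of a 32x32 multiply (arithmetic >> on negatives is
// assumed; it is guaranteed since C++20).
constexpr int32_t mulhs(int32_t a, int32_t b) {
  return static_cast<int32_t>((static_cast<int64_t>(a) * b) >> 32);
}

constexpr int32_t sdiv_lane(int32_t n, SDivLane lane) {
  // Multiply-high, then optionally add or subtract the numerator.
  int64_t q = static_cast<int64_t>(mulhs(n, lane.magic)) +
              static_cast<int64_t>(n) * lane.factor;
  q >>= lane.shift;
  // Adding the sign bit rounds negative quotients towards zero.
  const int32_t q32 = static_cast<int32_t>(q);
  return q32 + static_cast<int32_t>(static_cast<uint32_t>(q32) >> 31);
}
```

A vector version would hold one `SDivLane` per element and apply `sdiv_lane` element-wise, so lanes with different divisors share the same instruction sequence.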
|
|
|
|
| |
llvm-svn: 339813
|
|
|
|
|
|
| |
Pull out some types to match layout in TargetLowering::BuildUDIV. Early step towards adding non-uniform vector support.
llvm-svn: 339763
|
|
|
|
|
|
|
|
| |
scalar/vector magic value collection. NFCI.
Use the same ISD::matchUnaryPredicate pattern that was used in D50392.
llvm-svn: 339758
|
|
|
|
|
|
|
|
| |
This patch refactors the existing BuildExactSDIV implementation to support non-uniform constant vector denominators.
Differential Revision: https://reviews.llvm.org/D50392
llvm-svn: 339756
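
For context, an exact signed division (one where the remainder is known to be zero) can be done without a divide: shift out the divisor's trailing zero bits, then multiply by the modular inverse of the remaining odd part. The sketch below models this for one 32-bit lane; the function names are hypothetical, and a non-uniform vector version would simply use a different shift and inverse per lane.

```cpp
#include <cassert>
#include <cstdint>

// Multiplicative inverse of an odd value modulo 2^32 via Newton iteration.
// The starting guess `odd` is correct to 3 bits; each step doubles that.
constexpr uint32_t inverse_mod_2_32(uint32_t odd) {
  uint32_t inv = odd;
  for (int i = 0; i < 4; ++i)
    inv *= 2u - odd * inv;
  return inv;
}

// Hypothetical model of exact sdiv. Preconditions: d != 0 and n % d == 0.
// Arithmetic >> on negatives and modular narrowing casts are assumed
// (both are guaranteed since C++20).
constexpr int32_t exact_sdiv(int32_t n, int32_t d) {
  assert(d != 0);
  int shift = 0;
  while (((static_cast<uint32_t>(d) >> shift) & 1u) == 0u)
    ++shift;
  const uint32_t odd = static_cast<uint32_t>(d >> shift); // keeps the sign
  const uint32_t q = static_cast<uint32_t>(n >> shift) * inverse_mod_2_32(odd);
  return static_cast<int32_t>(q);
}
```

The multiplication only gives the right answer because the division is exact; for inputs with a nonzero remainder the result is meaningless.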
|
|
|
|
|
|
|
|
| |
for XOR. NFCI
We were checking for all bits being Known by checking Known.Zero|Known.One, but if all the bits are known then the value should be a Constant and we can just check for that instead.
llvm-svn: 339509
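
The reasoning above can be sketched with a toy known-bits record: if every bit is either known zero or known one, the value is fully determined and equals the known-one mask. The struct and function names are hypothetical stand-ins and do not reproduce LLVM's KnownBits class.

```cpp
#include <cassert>
#include <cstdint>

// Toy model: a bit set in Zero is known to be 0, a bit set in One is known
// to be 1, and a bit set in neither is unknown.
struct KnownBits32 {
  uint32_t Zero;
  uint32_t One;
};

// The check the message describes: are all 32 bits known?
constexpr bool all_bits_known(KnownBits32 k) {
  return (k.Zero | k.One) == 0xFFFFFFFFu;
}

// When all bits are known, the value is simply the known-one mask.
constexpr uint32_t constant_value(KnownBits32 k) {
  return k.One;
}
```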
|
|
|
|
| |
llvm-svn: 339508
|
|
|
|
|
|
| |
As requested in D50392, pull the magic constant calculations out into a helper function.
llvm-svn: 339346
|
|
|
|
|
|
|
|
| |
Provide a pass-through of the numerator for divide by one cases - this is the same approach we take in DAGCombiner::visitSDIVLike.
I investigated whether we could achieve this by magic MULHU/SRL values but nothing appeared to work as we don't have a way for MULHU(x,c) -> x
llvm-svn: 339254
|
|
|
|
|
|
|
|
| |
As requested in D50392, this is a minor refactor to BuildExactSDIV to stop taking the uniform constant APInt divisor and instead extract it locally.
I also clean up the operands and valuetypes to better match BuildUDiv (and BuildSDIV in the near future).
llvm-svn: 339246
|
|
|
|
|
|
| |
We're not handling the UDIV by one special case properly - for now just early out.
llvm-svn: 339229
|
|
|
|
|
|
| |
This was missed in D49248
llvm-svn: 339146
|
|
|
|
|
|
|
|
|
|
| |
This patch refactors the existing TargetLowering::BuildUDIV base implementation to support non-uniform constant vector denominators.
It also includes a fold for MULHU by pow2 constants to SRL which can now more readily occur from BuildUDIV.
Differential Revision: https://reviews.llvm.org/D49248
llvm-svn: 339121
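
The MULHU fold mentioned above rests on a simple identity: the high half of x * 2^k equals x shifted right by (bitwidth - k). A scalar 32-bit sketch, with hypothetical function names, could look like this:

```cpp
#include <cassert>
#include <cstdint>

// Unsigned high half of a 32x32 multiply.
constexpr uint32_t mulhu(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>((static_cast<uint64_t>(a) * b) >> 32);
}

// Equivalent logical shift for a multiplier of 2^k, valid for 1 <= k <= 31.
// (k == 0 is excluded: mulhu(x, 1) is 0, and a shift by 32 is undefined.)
constexpr uint32_t mulhu_pow2_as_srl(uint32_t x, unsigned k) {
  assert(k >= 1 && k <= 31);
  return x >> (32u - k);
}
```

Replacing the multiply with a shift is cheaper on most targets, which is presumably why the fold is worthwhile once power-of-two multipliers appear in individual lanes.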
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a parameter for testing specifically for
sNaNs - at least one instruction pattern on AMDGPU
needs to check specifically for this.
Also handle more cases, and add a target hook
for custom nodes, similar to the hooks for known
bits.
llvm-svn: 338910
|
|
|
|
|
|
|
|
| |
First step towards a BuildSDIV equivalent to D49248 for non-uniform vector support - this just pushes the splat detection down into TargetLowering::BuildSDIV where it's still used.
Differential Revision: https://reviews.llvm.org/D50185
llvm-svn: 338838
|
|
|
|
| |
llvm-svn: 338352
|
|
|
|
|
|
|
|
| |
BuildSDIV/BuildUDIV/etc.
The vector contains the SDNodes that these functions create. The number of nodes is always a small number so we should use SmallVector to avoid a heap allocation.
llvm-svn: 338329
|
|
|
|
|
|
| |
BuildUDIV was already correct.
llvm-svn: 338304
|
|
|
|
|
|
| |
BuildSDIVPow2.
llvm-svn: 338303
|
|
|
|
|
|
|
|
|
|
| |
BuildSDIV/BuildUDIV.
This removes the need for an assert to ensure the pointer isn't null.
Years ago we had ifs that checked the pointer was non-null before every access to the vector. These checks were removed and replaced with a single assert. But a reference seems more suitable here.
llvm-svn: 338205
|