summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* [Intrinsic] Signed Fixed Point Multiplication IntrinsicLeonard Chan2018-12-121-5/+70
| | | | | | | | | | | | Add an intrinsic that takes 2 signed integers with the scale of them provided as the third argument and performs fixed point multiplication on them. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Differential Revision: https://reviews.llvm.org/D54719 llvm-svn: 348912
* [TargetLowering] Add ISD::EXTRACT_VECTOR_ELT support to SimplifyDemandedBitsSimon Pilgrim2018-12-111-0/+19
| | | | | | | | Let SimplifyDemandedBits attempt to simplify all elements of a vector extraction. Part of PR39689. llvm-svn: 348839
* [TargetLowering] Add UNDEF folding to SimplifyDemandedVectorEltsSimon Pilgrim2018-12-101-1/+6
| | | | | | | | | | If all the demanded elements of the SimplifyDemandedVectorElts are known to be UNDEF, we can simplify to an ISD::UNDEF node. Zero constant folding will be handled in a future patch - its a little trickier as we often have bitcasted zero values. Differential Revision: https://reviews.llvm.org/D55511 llvm-svn: 348784
* [TargetLowering] Remove ISD::ANY_EXTEND/ANY_EXTEND_VECTOR_INREG opcodes from ↵Simon Pilgrim2018-12-051-2/+0
| | | | | | | | SimplifyDemandedVectorElts These have no test coverage and the KnownZero flags can't be guaranteed unlike SIGN/ZERO_EXTEND cases. llvm-svn: 348361
* [SelectionDAG] Initial support for FSHL/FSHR funnel shift opcodes (PR39467)Simon Pilgrim2018-12-051-0/+48
| | | | | | | | | | This is an initial patch to add a minimum level of support for funnel shifts to the SelectionDAG and to begin wiring it up to the X86 SHLD/SHRD instructions. Some partial legalization code has been added to handle the case for 'SlowSHLD' where we want to expand instead and I've added a few DAG combines so we don't get regressions from the existing DAG builder expansion code. Differential Revision: https://reviews.llvm.org/D54698 llvm-svn: 348353
* [TargetLowering] SimplifyDemandedVectorElts - don't alter DemandedElts maskSimon Pilgrim2018-12-051-2/+3
| | | | | | | | Fix potential issue with the ISD::INSERT_VECTOR_ELT case tweaking the DemandedElts mask instead of using a local copy - so later uses of the mask use the tweaked version..... Noticed while investigating adding zero/undef folding to SimplifyDemandedVectorElts and the altered DemandedElts mask was causing mismatches. llvm-svn: 348348
* [SelectionDAG] Redefine isGAPlusOffset in terms of unwrapAddress. NFCI.Nirav Dave2018-12-041-1/+4
| | | | llvm-svn: 348288
* [TargetLowering] expandFP_TO_UINT - avoid FPE due to out of range conversion ↵Simon Pilgrim2018-12-041-11/+30
| | | | | | | | | | | | | | (PR17686) PR17686 demonstrates that for some targets FP exceptions can fire in cases where the FP_TO_UINT is expanded using a FP_TO_SINT instruction. The existing code converts both the inrange and outofrange cases using FP_TO_SINT and then selects the result, this patch changes this for 'strict' cases to pre-select the FP_TO_SINT input and the offset adjustment. The X87 cases don't need the strict flag but generates much nicer code with it.... Differential Revision: https://reviews.llvm.org/D53794 llvm-svn: 348251
* [TargetLowering] Add SimplifyDemandedVectorElts support to EXTEND opcodesSimon Pilgrim2018-12-041-0/+17
| | | | | | | | Add support for ISD::*_EXTEND and ISD::*_EXTEND_VECTOR_INREG opcodes. The extra broadcast in trunc-subvector.ll will be fixed in an upcoming patch. llvm-svn: 348246
* [SelectionDAG] Improve SimplifyDemandedBits to SimplifyDemandedVectorElts ↵Simon Pilgrim2018-12-011-5/+3
| | | | | | | | | | | | simplification D52935 introduced the ability for SimplifyDemandedBits to call SimplifyDemandedVectorElts through BITCASTs if the demanded bit mask entirely covered the sub element. This patch relaxes this to demanding an element if we need any bit from it. Differential Revision: https://reviews.llvm.org/D54761 llvm-svn: 348073
* [TargetLowering] SimplifyDemandedBits - only reduce known bits for integer ↵Simon Pilgrim2018-11-211-1/+3
| | | | | | | | constants Avoids fuzzing crash found by Mikael Holmén. llvm-svn: 347393
* [DAGCombine] Add calls to SimplifyDemandedVectorElts from ↵Simon Pilgrim2018-11-201-1/+1
| | | | | | | | visitINSERT_SUBVECTOR (PR37989) This uncovered an off-by-one typo in SimplifyDemandedVectorElts's INSERT_SUBVECTOR handling as its bounds check was bailing on safe indices. llvm-svn: 347313
* [TargetLowering] Improve SimplifyDemandedVectorElts/SimplifyDemandedBits supportSimon Pilgrim2018-11-201-0/+17
| | | | | | | | | | For bitcast nodes from larger element types, add the ability for SimplifyDemandedVectorElts to call SimplifyDemandedBits by merging the elts mask to a bits mask. I've raised https://bugs.llvm.org/show_bug.cgi?id=39689 to deal with the few places where SimplifyDemandedBits's lack of vector handling is a problem. Differential Revision: https://reviews.llvm.org/D54679 llvm-svn: 347301
* [TargetLowering] expandFP_TO_UINT - improve fp16 supportSimon Pilgrim2018-11-191-10/+18
| | | | | | | | | | As discussed on D53794, for float types with ranges smaller than the destination integer type, then we should be able to just use a regular FP_TO_SINT opcode. I thought we'd need to provide MSA test cases for very small integer types as well (fp16 -> i8 etc.), but it turns out that promotion will kick in so they're unnecessary. Differential Revision: https://reviews.llvm.org/D54703 llvm-svn: 347251
* [TargetLowering] Cleanup more of the EXTEND demanded bits cases so that they ↵Simon Pilgrim2018-11-161-10/+11
| | | | | | | | match. NFCI. Use the same variable names etc. llvm-svn: 347045
* [TargetLowering] Begin generalizing TargetLowering::expandFP_TO_SINT ↵Simon Pilgrim2018-11-051-26/+26
| | | | | | | | support. NFCI. Prior to initial work to add vector expansion support, remove assumptions that we're working on scalar types. llvm-svn: 346139
* [LegalizeDAG] Add generic vector CTPOP expansion (PR32655)Simon Pilgrim2018-11-011-2/+13
| | | | | | | | This patch adds support for expanding vector CTPOP instructions and removes the x86 'bitmath' lowering which replicates the same expansion. Differential Revision: https://reviews.llvm.org/D53258 llvm-svn: 345869
* Check shouldReduceLoadWidth from SimplifySetCCStanislav Mekhanoshin2018-10-311-1/+2
| | | | | | | | | | | | SimplifySetCC could shrink a load without checking for profitability or legality of such shink with a target. Added checks to prevent shrinking of aligned scalar loads in AMDGPU below dword as scalar engine does not support it. Differential Revision: https://reviews.llvm.org/D53846 llvm-svn: 345778
* [Intrinsic] Signed and Unsigned Saturation Subtraction IntirnsicsLeonard Chan2018-10-291-16/+36
| | | | | | | | | | | | Add an intrinsic that takes 2 integers and perform saturation subtraction on them. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Differential Revision: https://reviews.llvm.org/D53783 llvm-svn: 345512
* [TargetLowering] Move i64/vXi64 to f32/vXf32 UINT_TO_FP handling to ↵Simon Pilgrim2018-10-281-29/+72
| | | | | | TargetLowering::expandUINT_TO_FP. llvm-svn: 345478
* [VectorLegalizer] Enable TargetLowering::expandFP_TO_UINT support.Simon Pilgrim2018-10-281-0/+5
| | | | | | | | Add vector support to TargetLowering::expandFP_TO_UINT. This exposes an issue in X86TargetLowering::LowerVSELECT which was assuming that the select mask was the same width as the LHS/RHS ops - as long as the result is a sign splat we can easily sext/trunk this. llvm-svn: 345473
* [TargetLowering] Move LegalizeDAG FP_TO_UINT handling to ↵Simon Pilgrim2018-10-271-0/+31
| | | | | | | | TargetLowering::expandFP_TO_UINT. NFCI. First step towards fixing PR17686 and adding vector support. llvm-svn: 345452
* [TargetLowering] Improve vXi64 UINT_TO_FP vXf64 support (P38226)Simon Pilgrim2018-10-251-0/+42
| | | | | | | | | | | | As suggested on D52965, this patch moves the i64 to f64 UINT_TO_FP expansion code from LegalizeDAG into TargetLowering and makes it available to LegalizeVectorOps as well. Not only does this help perform X86 lowering as a true vectorization instead of (partially vectorized) scalar conversions, it avoids the HADDPD op from the scalar code which can be slow on most targets. The AVX512F does have the vcvtusi2sdq scalar operation but we don't unroll to use it as it seems to only help for the v2f64 case - otherwise the unrolling cost will certainly be too high. My feeling is that we should leave it to the vectorizers - and if it generates the vector UINT_TO_FP we should use it. Differential Revision: https://reviews.llvm.org/D53649 llvm-svn: 345256
* [TargetLowering] Add SimplifyDemandedBitsForTargetNode callbackSimon Pilgrim2018-10-241-0/+24
| | | | | | | | Add a SimplifyDemandedBitsForTargetNode callback to handle target nodes. Differential Revision: https://reviews.llvm.org/D53643 llvm-svn: 345179
* [LegalizeDAG] Share Vector/Scalar CTPOP ExpansionSimon Pilgrim2018-10-231-0/+49
| | | | | | | | As suggested on D53258, this patch move the CTPOP expansion code from SelectionDAGLegalize to TargetLowering to allow it to be reused by the VectorLegalizer. Proper vector support will be added by D53258. llvm-svn: 345066
* [LegalizeDAG] Share Vector/Scalar CTLZ ExpansionSimon Pilgrim2018-10-231-0/+53
| | | | | | | | As suggested on D53258, this patch shares common CTLZ expansion code between VectorLegalizer and SelectionDAGLegalize by putting it in TargetLowering. Extension to D53474 llvm-svn: 345060
* [LegalizeDAG] Share Vector/Scalar CTTZ ExpansionSimon Pilgrim2018-10-231-0/+55
| | | | | | | | | | As suggested on D53258, this patch demonstrates sharing common CTTZ expansion code between VectorLegalizer and SelectionDAGLegalize by putting it in TargetLowering. I intend to move CTLZ and (scalar) CTPOP over as well and then update D53258 accordingly. Differential Revision: https://reviews.llvm.org/D53474 llvm-svn: 345039
* [Intrinsic] Unigned Saturation Addition IntrinsicLeonard Chan2018-10-221-22/+27
| | | | | | | | | | | | Add an intrinsic that takes 2 integers and perform unsigned saturation addition on them. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Differential Revision: https://reviews.llvm.org/D53340 llvm-svn: 344971
* DAG: Change behavior of fminnum/fmaxnum nodesMatt Arsenault2018-10-221-0/+29
| | | | | | | | | | | Introduce new versions that follow the IEEE semantics to help with legalization that may need quieted inputs. There are some regressions from inserting unnecessary canonicalizes when these are matched from fast math fcmp + select which should be fixed in a future commit. llvm-svn: 344914
* [Intrinsic] Signed Saturation Addition IntrinsicLeonard Chan2018-10-161-0/+43
| | | | | | | | | | | Add an intrinsic that takes 2 integers and perform saturation addition on them. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Differential Revision: https://reviews.llvm.org/D53053 llvm-svn: 344629
* [SelectionDAG] allow FP binops in SimplifyDemandedVectorEltsSanjay Patel2018-10-151-1/+6
| | | | | | | | | | | | This is intended to make the backend on par with functionality that was added to the IR version of SimplifyDemandedVectorElts in: rL343727 ...and the original motivation is that we need to improve demanded-vector-elements in several ways to avoid problems that would be exposed in D51553. Differential Revision: https://reviews.llvm.org/D52912 llvm-svn: 344541
* [TargetLowering] SimplifyDemandedBits - rename demanded mask args. NFCI.Simon Pilgrim2018-10-101-80/+89
| | | | | | Help stop bugs like rL343935 by making the 'original' DemandedBits arg more obviously not the mask that is actually used. llvm-svn: 344138
* [TargetLowering] SimplifyDemandedBits - pull out repeated getOperands. NFCI.Simon Pilgrim2018-10-101-119/+119
| | | | | | Part of a minor cleanup to make all the switch statements more consistent prior to improving vector support. llvm-svn: 344136
* [TargetLowering] Add root node back to work list after successful ↵Simon Pilgrim2018-10-101-2/+6
| | | | | | | | | | SimplifyDemandedBits/SimplifyDemandedVectorElts Similar to what already happens in the DAGCombiner wrappers, this patch adds the root nodes back onto the worklist if the DCI wrappers' SimplifyDemandedBits/SimplifyDemandedVectorElts were successful. Differential Revision: https://reviews.llvm.org/D53026 llvm-svn: 344132
* [SelectionDAG] Add SIGN_EXTEND_VECTOR_INREG and CONCAT_VECTORS support to ↵Simon Pilgrim2018-10-091-0/+30
| | | | | | | | SimplifyDemandedBits Fix for AVX1 masked load/store regression on D52964 llvm-svn: 344043
* [SelectionDAG] Respect multiple uses in SimplifyDemandedBits to ↵Simon Pilgrim2018-10-071-1/+1
| | | | | | | | | | | | SimplifyDemandedVectorElts simplification rL343913 was using SimplifyDemandedBits's original demanded mask instead of the adjusted 'NewMask' that accounts for multiple uses of the op (those variable names really need improving....). Annoyingly many of the test changes (back to pre-rL343913 state) are actually safe - but only because their multiple uses are all by PMULDQ/PMULUDQ. Thanks to Jan Vesely (@jvesely) for bisecting the bug. llvm-svn: 343935
* [SelectionDAG] Add SimplifyDemandedBits to SimplifyDemandedVectorElts ↵Simon Pilgrim2018-10-061-10/+46
| | | | | | | | | | | | | | | | simplification This patch enables SimplifyDemandedBits to call SimplifyDemandedVectorElts in cases where the demanded bits mask covers entire elements of a bitcasted source vector. There are a couple of cases here where simplification at a deeper level (such as through bitcasts) prevents further simplification - CommitTargetLoweringOpt only adds immediate uses/users back to the worklist when we might want to combine the original caller again to see what else it can simplify. As well as that I had to disable handling of bool vector until SimplifyDemandedVectorElts better supports some of their opcodes (SETCC, shifts etc.). Fixes PR39178 Differential Revision: https://reviews.llvm.org/D52935 llvm-svn: 343913
* [CodeGen] Enable tail calls for functions with NonNull attributes.David Green2018-09-261-1/+3
| | | | | | | | | | | Adding NonNull as attributes to returned pointers has the unfortunate side effect of disabling tail calls. This patch ignores the NonNull attribute when we decide whether to tail merge, in the same way that we ignore the NoAlias attribute, as it has no affect on the call sequence. Differential Revision: https://reviews.llvm.org/D52238 llvm-svn: 343091
* DAG: Fix expansion of unaligned FP loads and storesMatt Arsenault2018-09-131-4/+6
| | | | | | | | | This was trying to scalarizing a scalar FP type, resulting in an assert. Fixes unaligned f64 stack stores for AMDGPU. llvm-svn: 342132
* [SelectionDAG] enhance vector demanded elements to look at a vector select ↵Sanjay Patel2018-09-091-4/+12
| | | | | | | | | | | | | | | | condition operand This is the DAG equivalent of D51433. If we know we're not using all vector lanes, use that knowledge to potentially simplify a vselect condition. The reduction/horizontal tests show that we are eliminating AVX1 operations on the upper half of 256-bit vectors because we don't need those anyway. I'm not sure what the pr34592 test is showing. That's run with -O0; is SimplifyDemandedVectorElts supposed to be running there? Differential Revision: https://reviews.llvm.org/D51696 llvm-svn: 341762
* [CodeGen] Fix remaining zext() assertions in SelectionDAGScott Linder2018-09-041-14/+12
| | | | | | | | Fix remaining cases not committed in https://reviews.llvm.org/D49574 Differential Revision: https://reviews.llvm.org/D50659 llvm-svn: 341380
* [DAGCombine] optimizeSetCCOfSignedTruncationCheck(): handle inverted patternRoman Lebedev2018-09-021-4/+18
| | | | | | | | | | | | | | | | | | | | | | Summary: A follow-up for D49266 / rL337166 + D49497 / rL338044. This is still the same pattern to check for the [lack of] signed truncation, but in this case the constants and the predicate are negated. https://rise4fun.com/Alive/BDV https://rise4fun.com/Alive/n7Z Reviewers: spatel, craig.topper, RKSimon, javed.absar, efriedma, dmgreen Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51532 llvm-svn: 341287
* [TargetLowering] Add BuildSDiv support for division by one or negone.Simon Pilgrim2018-08-211-15/+27
| | | | | | This reduces most of the sdiv stages (the MULHS, shifts etc.) to just zero/identity values and use the numerator scale factor to multiply by +1/-1. llvm-svn: 340260
* [TargetLowering] Disable BuildSDiv division by one or negone.Simon Pilgrim2018-08-201-1/+2
| | | | | | Fuzz tests have detected an issue, currently working on a fix. llvm-svn: 340195
* [TargetLowering] Add support for non-uniform vectors to BuildSDIVSimon Pilgrim2018-08-161-10/+24
| | | | | | | | | | This patch refactors the existing TargetLowering::BuildSDIV base implementation to support non-uniform constant vector denominators. This is the last patch necessary to close PR36545 Differential Revision: https://reviews.llvm.org/D50765 llvm-svn: 339908
* [TargetLowering] Refactor BuildSDIV in preparation for D50765. NFCI.Simon Pilgrim2018-08-161-24/+36
| | | | | | Pull out magic factor calculators into a helper function, use 0/+1/-1 multiplication factor to (optionally) add/sub the numerator. llvm-svn: 339898
* DAG: Use getObjectOffset helperMatt Arsenault2018-08-151-4/+1
| | | | llvm-svn: 339813
* [TargetLowering] Minor cleanup of TargetLowering::BuildSDIV. NFCI.Simon Pilgrim2018-08-151-21/+20
| | | | | | Pull out some types to match layout in TargetLowering::BuildUDIV. Early step towards adding non-uniform vector support. llvm-svn: 339763
* [TargetLowering] Minor refactor to TargetLowering::BuildUDIV to merge ↵Simon Pilgrim2018-08-151-41/+31
| | | | | | | | scalar/vector magic value collection. NFCI. Use the same ISD::matchUnaryPredicate pattern that was used in D50392. llvm-svn: 339758
* [TargetLowering] Add support for non-uniform vectors to BuildExactSDIVSimon Pilgrim2018-08-151-12/+24
| | | | | | | | This patch refactors the existing BuildExactSDIV implementation to support non-uniform constant vector denominators. Differential Revision: https://reviews.llvm.org/D50392 llvm-svn: 339756
OpenPOWER on IntegriCloud