path: root/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
Commit message | Author | Age | Files | Lines
...
* [TargetLowering] improve the default expansion of uaddsat/usubsat | Sanjay Patel | 2019-03-17 | 1 | -0/+11
  This is a subset of what was proposed in: D59006
  ...and may overlap with test changes from: D59174
  ...but it seems like a good general optimization to turn selects into bitwise-logic when possible because we never know exactly what can happen at this stage of DAG combining depending on how the target has defined things.
  Differential Revision: https://reviews.llvm.org/D59066
  llvm-svn: 356332
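To illustrate the select-to-bitwise-logic idea in the commit above, here is a minimal scalar sketch on 32-bit integers. It is not the LLVM code: the function names are hypothetical, and it assumes the overflow flag can be turned into an all-ones or all-zero mask.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of the idea; this is not the LLVM implementation.

// Select form: pick the saturated value when the addition wrapped.
uint32_t uaddsat_select(uint32_t a, uint32_t b) {
  uint32_t sum = a + b;            // wraps modulo 2^32
  bool overflow = sum < a;         // an unsigned add wrapped iff the sum got smaller
  return overflow ? UINT32_MAX : sum;
}

// Bitwise form: OR the sum with a mask that is all-ones on overflow.
uint32_t uaddsat_bitwise(uint32_t a, uint32_t b) {
  uint32_t sum = a + b;
  uint32_t mask = 0u - static_cast<uint32_t>(sum < a);  // 0xFFFFFFFF or 0
  return sum | mask;
}

// Bitwise form for subtraction: clear the difference when it wrapped.
uint32_t usubsat_bitwise(uint32_t a, uint32_t b) {
  uint32_t diff = a - b;           // wraps modulo 2^32
  uint32_t mask = 0u - static_cast<uint32_t>(a < b);    // 0xFFFFFFFF or 0
  return diff & ~mask;
}
```

Both forms compute the same value; the bitwise form simply avoids a select node, which is the point the commit message makes.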
* [SelectionDAG] Add SimplifyDemandedBits handling for ISD::SCALAR_TO_VECTOR | Simon Pilgrim | 2019-03-15 | 1 | -0/+13
  Fixes a lot of constant folding mismatches between i686 and x86_64
  llvm-svn: 356273
* [DAG] Move integer setcc %x, %x folding into FoldSetCC | Simon Pilgrim | 2019-03-13 | 1 | -5/+2
  First step towards PR40800 - I intend to move the float case in a separate future patch.
  I had to tweak the (overly reduced) thumb2 test and the x86 widening test change is annoying (no longer rematerializable) but we should address this separately.
  Differential Revision: https://reviews.llvm.org/D59244
  llvm-svn: 356040
* [SDAG] Expand pow2 mulo using shifts | Nikita Popov | 2019-03-12 | 1 | -4/+23
  Expand MULO with constant power of two operand into a shift. The overflow is checked with (x << shift) >> shift == x, where the right shift will be logical for umulo and arithmetic for smulo (with exception for multiplications by signed_min).
  Differential Revision: https://reviews.llvm.org/D59041
  llvm-svn: 355937
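The overflow check described above can be sketched on 32-bit scalars as follows. The struct and function names are hypothetical, and the signed version deliberately leaves out the signed_min multiplier that the commit treats as a special case.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result types; not LLVM code.
struct UMulResult { uint32_t value; bool overflow; };
struct SMulResult { int32_t value; bool overflow; };

// Unsigned multiply by 2^shift: shift left, then check that a logical
// right shift recovers the original operand.
UMulResult umulo_pow2(uint32_t x, unsigned shift) {
  assert(shift < 32);
  uint32_t shifted = x << shift;
  return {shifted, (shifted >> shift) != x};
}

// Signed multiply by 2^shift: same idea with an arithmetic right shift.
// shift == 31 would correspond to a signed_min multiplier and is not handled.
// Note: the conversion back to int32_t and the right shift of a negative
// value are only fully specified from C++20 on; earlier standards leave them
// implementation-defined (two's complement and arithmetic shift in practice).
SMulResult smulo_pow2(int32_t x, unsigned shift) {
  assert(shift < 31);
  int32_t shifted = static_cast<int32_t>(static_cast<uint32_t>(x) << shift);
  return {shifted, (shifted >> shift) != x};
}
```

For example, `smulo_pow2(-0x40000000, 1)` yields the minimum 32-bit value without reporting overflow, while `smulo_pow2(0x40000000, 1)` does report overflow.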
* [SDAG][AArch64] Legalize VECREDUCE | Nikita Popov | 2019-03-11 | 1 | -0/+58
  Fixes https://bugs.llvm.org/show_bug.cgi?id=36796.
  Implement basic legalizations (PromoteIntRes, PromoteIntOp, ExpandIntRes, ScalarizeVecOp, WidenVecOp) for VECREDUCE opcodes. There are more legalizations missing (esp float legalizations), but there's no way to test them right now, so I'm not adding them.
  This also includes a few more changes to make this work somewhat reasonably:
    * Add support for expanding VECREDUCE in SDAG. Usually experimental.vector.reduce is expanded prior to codegen, but if the target does have native vector reduce, it may of course still be necessary to expand due to legalization issues. This uses a shuffle reduction if possible, followed by a naive scalar reduction.
    * Allow the result type of integer VECREDUCE to be larger than the vector element type. For example we need to be able to reduce a v8i8 into a (nominally) i32 result type on AArch64.
    * Use the vector operand type rather than the scalar result type to determine the action, so we can control exactly which vector types are supported. Also change the legalize vector op code to handle operations that only have vector operands, but no vector results, as is the case for VECREDUCE.
    * Default VECREDUCE to Expand. On AArch64 (only target using VECREDUCE), explicitly specify for which vector types the reductions are supported.
  This does not handle anything related to VECREDUCE_STRICT_*.
  Differential Revision: https://reviews.llvm.org/D58015
  llvm-svn: 355860
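The "shuffle reduction followed by a naive scalar reduction" strategy mentioned above can be modelled roughly as below for an integer add reduction. This is only a sketch: `reduce_add` and `minVectorWidth` are hypothetical, with `minVectorWidth` standing in for whatever the target considers the narrowest usable vector type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a two-phase add reduction; not the SDAG code.
int32_t reduce_add(std::vector<int32_t> v, std::size_t minVectorWidth) {
  assert(!v.empty());
  // Shuffle phase: fold the upper half onto the lower half while the
  // halved vector is still at least minVectorWidth lanes wide.
  while (v.size() % 2 == 0 && v.size() / 2 >= minVectorWidth) {
    std::size_t half = v.size() / 2;
    for (std::size_t i = 0; i < half; ++i) {
      // Add in unsigned arithmetic so wrap-around is well defined.
      v[i] = static_cast<int32_t>(static_cast<uint32_t>(v[i]) +
                                  static_cast<uint32_t>(v[i + half]));
    }
    v.resize(half);
  }
  // Scalar phase: reduce the remaining lanes one by one.
  uint32_t acc = 0;
  for (int32_t x : v) acc += static_cast<uint32_t>(x);
  return static_cast<int32_t>(acc);
}
```

With eight lanes and `minVectorWidth` of 2, the loop halves the vector twice and the final two lanes are summed in the scalar phase.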
* [DAG] Move SetCC NaN handling into FoldSetCC | Simon Pilgrim | 2019-03-11 | 1 | -15/+1
  llvm-svn: 355845
* [DAG] TargetLowering::SimplifySetCC - call FoldSetCC early to handle constant/commute folds. | Simon Pilgrim | 2019-03-11 | 1 | -13/+6
  Noticed while looking at PR40800 (and also D57921)
  llvm-svn: 355828
* [TargetLowering] simplify code for uaddsat/usubsat expansion; NFC | Sanjay Patel | 2019-03-06 | 1 | -17/+13
  We had 2 local variable names for the same type.
  llvm-svn: 355516
* [TargetLowering] simplify code for uaddsat/usubsat expansion; NFC | Sanjay Patel | 2019-03-06 | 1 | -8/+5
  llvm-svn: 355508
* Use SDValue::getConstantOperandAPInt helper where possible. NFCI. | Simon Pilgrim | 2019-03-02 | 1 | -5/+3
  llvm-svn: 355267
* Add support for computing "zext of value" in KnownBits. NFCI | Bjorn Pettersson | 2019-02-28 | 1 | -4/+4
  Summary: The description of KnownBits::zext() and KnownBits::zextOrTrunc() has confusingly been telling that the operation is equivalent to zero extending the value we're tracking. That has not been true, instead the user has been forced to explicitly set the extended bits as known zero afterwards.
  This patch adds a second argument to KnownBits::zext() and KnownBits::zextOrTrunc() to control if the extended bits should be considered as known zero or as unknown.
  Reviewers: craig.topper, RKSimon
  Reviewed By: RKSimon
  Subscribers: javed.absar, hiraditya, jdoerfert, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D58650
  llvm-svn: 355099
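The difference between the two modes described above can be sketched with a toy known-bits record. The structs and the `zext8to16` helper are hypothetical stand-ins, not the `llvm::KnownBits` class; only the idea of a flag controlling the extended bits comes from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Toy model: a set bit in `zero` means "known to be 0", a set bit in `one`
// means "known to be 1", and a bit set in neither is unknown.
struct KnownBits8  { uint8_t zero;  uint8_t one;  };
struct KnownBits16 { uint16_t zero; uint16_t one; };

// Widen 8-bit knowledge to 16 bits. The flag decides whether the new upper
// bits are recorded as known zero (a true zero extension of the value) or
// left unknown (only the width of the record changes).
KnownBits16 zext8to16(KnownBits8 k, bool extendedBitsAreKnownZero) {
  assert((k.zero & k.one) == 0);  // a bit cannot be known 0 and known 1
  KnownBits16 r{k.zero, k.one};
  if (extendedBitsAreKnownZero)
    r.zero |= 0xFF00;
  return r;
}
```

Passing `false` reproduces the older behaviour the summary describes, where the caller had to mark the upper bits as zero afterwards.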
* [SDAG] Support vector UMULO/SMULO | Nikita Popov | 2019-02-20 | 1 | -16/+24
  Second part of https://bugs.llvm.org/show_bug.cgi?id=40442.
  This adds an extra UnrollVectorOverflowOp() method to SDAG, because the general UnrollOverflowOp() method can't deal with multiple results.
  Additionally we need to expand UMULO/SMULO during vector op legalization, as it may result in unrolling, which may need additional type legalization.
  Differential Revision: https://reviews.llvm.org/D57997
  llvm-svn: 354513
* [SelectionDAG] Extract [US]MULO expansion into TL method; NFC | Nikita Popov | 2019-02-17 | 1 | -1/+121
  In preparation for supporting vector expansion.
  Add an isPostTypeLegalization flag to makeLibCall(), because this expansion relies on the legalized form using MERGE_VALUES. Drop the corresponding variant of ExpandLibCall, which is no longer used.
  Differential Revision: https://reviews.llvm.org/D58006
  llvm-svn: 354226
* [X86] Fix LowerAsmOutputForConstraint. | Nirav Dave | 2019-02-15 | 1 | -1/+1
  Summary: Update Flag when generating cc output. Fixes PR40737.
  Reviewers: rnk, nickdesaulniers, craig.topper, spatel
  Subscribers: hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D58283
  llvm-svn: 354163
* Fix 80-column limit in SimplifyDemandedBits/SimplifyDemandedVectorElts. NFCI. | Simon Pilgrim | 2019-02-15 | 1 | -70/+78
  llvm-svn: 354152
* [CallSite removal] Migrate the statepoint GC infrastructure to use the `CallBase` class rather than `CallSite` wrappers. | Chandler Carruth | 2019-02-11 | 1 | -12/+12
  I pushed this change down through most of the statepoint infrastructure, completely removing the use of CallSite where I could reasonably do so. I ended up making a couple of cut-points: generic call handling (instcombine, TLI, SDAG). As soon as it hit truly generic handling with users outside the immediate code, I simply transitioned into or out of a `CallSite` to make this a reasonable sized chunk.
  Differential Revision: https://reviews.llvm.org/D56122
  llvm-svn: 353660
* [CodeGen][X86] Don't scalarize vector saturating add/sub | Nikita Popov | 2019-02-10 | 1 | -15/+6
  Now that we have vector support for [US](ADD|SUB)O we no longer need to scalarize when expanding [US](ADD|SUB)SAT.
  This matches what the cost model already does.
  Differential Revision: https://reviews.llvm.org/D57348
  llvm-svn: 353651
* [TargetLowering] refactor setcc folds to fix another miscompile (PR40657) | Sanjay Patel | 2019-02-10 | 1 | -55/+55
  SimplifySetCC still has much room for improvement, but this should fix the remaining problem examples from: https://bugs.llvm.org/show_bug.cgi?id=40657
  The initial fix for this problem was rL353615.
  llvm-svn: 353639
* [TargetLowering] add tests to show effect of setcc sub->shift; NFC | Sanjay Patel | 2019-02-09 | 1 | -1/+0
  There's effectively no difference for the cases with variables. We just trade a sub for an add on those. But the case with a subtract from constant would require an extra move instruction on x86, so this looks like a reasonable generic combine.
  llvm-svn: 353619
* [TargetLowering] avoid miscompile in setcc transform (PR40657) | Sanjay Patel | 2019-02-09 | 1 | -1/+3
  llvm-svn: 353615
* Revert "[SelectionDAG] Extract [US]MULO expansion into TL method; NFC" | Nikita Popov | 2019-02-09 | 1 | -109/+0
  This reverts commit r353611.
  Triggers an assertion during the libcall expansion on ARM.
  llvm-svn: 353612
* [SelectionDAG] Extract [US]MULO expansion into TL method; NFC | Nikita Popov | 2019-02-09 | 1 | -0/+109
  In preparation for supporting vector expansion.
  Also drop a variant of ExpandLibCall, of which the MULO expansions were the only user.
  llvm-svn: 353611
* Implementation of asm-goto support in LLVM | Craig Topper | 2019-02-08 | 1 | -1/+5
  This patch accompanies the RFC posted here: http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html
  This patch adds a new CallBr IR instruction to support asm-goto inline assembly like gcc as used by the linux kernel. This instruction is both a call instruction and a terminator instruction with multiple successors. Only inline assembly usage is supported today.
  This also adds a new INLINEASM_BR opcode to SelectionDAG and MachineIR to represent an INLINEASM block that is also considered a terminator instruction.
  There will likely be more bug fixes and optimizations to follow this, but we felt it had reached a point where we would like to switch to an incremental development model.
  Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii
  Differential Revision: https://reviews.llvm.org/D53765
  llvm-svn: 353563
* [TargetLowering] Use ISD::FSHR in expandFixedPointMul | Simon Pilgrim | 2019-02-08 | 1 | -5/+2
  Replace OR(SHL,SRL) pattern with ISD::FSHR (legalization expands this later if necessary) - this helps with the scale == 0 'undefined' drop-through case that was discussed on D55720.
  llvm-svn: 353546
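Why a funnel shift sidesteps the scale == 0 case can be seen in a small scalar sketch. The helpers below are hypothetical (an unsigned 32-bit model, not the code in expandFixedPointMul): an explicit OR(SHL,SRL) would need a left shift by the full bit width when the scale is zero, whereas a funnel shift right takes its amount modulo the width and simply returns the low half.

```cpp
#include <cassert>
#include <cstdint>

// Funnel shift right on a 64-bit value given as (hi, lo): returns the low
// 32 bits of (hi:lo) >> amount, with the amount taken modulo 32.
uint32_t fshr32(uint32_t hi, uint32_t lo, unsigned amount) {
  amount %= 32;
  if (amount == 0) return lo;  // avoids the out-of-range shift by 32 below
  return (hi << (32 - amount)) | (lo >> amount);
}

// Hypothetical unsigned fixed-point multiply: form the 64-bit product, then
// drop `scale` fractional bits. The result is truncated, not saturated.
uint32_t umulfix32(uint32_t a, uint32_t b, unsigned scale) {
  assert(scale < 32);
  uint64_t wide = static_cast<uint64_t>(a) * b;
  uint32_t hi = static_cast<uint32_t>(wide >> 32);
  uint32_t lo = static_cast<uint32_t>(wide);
  return fshr32(hi, lo, scale);
}
```

For instance, with 16 fractional bits, `umulfix32(0x18000, 0x20000, 16)` multiplies 1.5 by 2.0, and with a scale of 0 the function degenerates to an ordinary truncating multiply.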
* [TargetLowering] Add SimplifyDemandedBits funnel shift support | Simon Pilgrim | 2019-02-08 | 1 | -0/+39
  llvm-svn: 353539
* [InlineAsm][X86] Add backend support for X86 flag output parameters. | Nirav Dave | 2019-02-06 | 1 | -0/+6
  Allow custom handling of inline assembly output parameters and add X86 flag parameter support.
  llvm-svn: 353307
* [Intrinsic] Unsigned Fixed Point Multiplication Intrinsic | Leonard Chan | 2019-02-04 | 1 | -9/+21
  Add an intrinsic that takes 2 unsigned integers with the scale of them provided as the third argument and performs fixed point multiplication on them.
  This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics.
  Differential Revision: https://reviews.llvm.org/D55625
  llvm-svn: 353059
* [TargetLowering] try harder to determine undef elements of vector binops | Sanjay Patel | 2019-02-01 | 1 | -7/+61
  This might be the start of tracking all vector element constants generally if we take it to its logical conclusion, but let's stop here and make sure this is correct/beneficial so far.
  The affected tests require a convoluted path before they get simplified currently because we don't call SimplifyDemandedVectorElts() from binops directly and don't modify the binop operands directly in SimplifyDemandedVectorElts().
  That's why the tests all have a trailing shuffle to induce a chain reaction of transforms. So something like this is happening:
    1. Improve the knowledge of undefs in the binop via a SimplifyDemandedVectorElts() call that originates from a shuffle.
    2. Transfer that undef knowledge back to the shuffle mask user as more undef lanes.
    3. Combine the modified shuffle by calling SimplifyDemandedVectorElts() again.
    4. Translate the improved shuffle mask as undemanded lanes of build vector constants causing those to become full undef constants.
    5. Simplify the binop now that it has a full undef operand.
  As we can see from the unchanged 'and' and 'or' tests, tracking undefs alone isn't a full solution. We would need to track zero and all-ones constants to improve those opcodes. We'd probably need to track NaN for FP ops too (assuming we don't have fast-math-flags set).
  Differential Revision: https://reviews.llvm.org/D57066
  llvm-svn: 352880
* [Intrinsic] Expand SMULFIX to MUL, MULH[US], or [US]MUL_LOHI on vector arguments | Leonard Chan | 2019-01-31 | 1 | -14/+12
  For zero scale SMULFIX, expand into MUL which produces better code for X86.
  For vector arguments, expand into MUL if SMULFIX is provided with a zero scale. Otherwise, expand into MULH[US] or [US]MUL_LOHI.
  Differential Revision: https://reviews.llvm.org/D56987
  llvm-svn: 352783
* Adjust documentation for git migration. | James Y Knight | 2019-01-29 | 1 | -1/+1
  This fixes most references to the paths:
    llvm.org/svn/
    llvm.org/git/
    llvm.org/viewvc/
    github.com/llvm-mirror/
    github.com/llvm-project/
    reviews.llvm.org/diffusion/
  to instead point to https://github.com/llvm/llvm-project.
  This is *not* a trivial substitution, because additionally, all the checkout instructions had to be migrated to instruct users on how to use the monorepo layout, setting LLVM_ENABLE_PROJECTS instead of checking out various projects into various subdirectories.
  I've attempted to not change any scripts here, only documentation. The scripts will have to be addressed separately.
  Additionally, I've deleted one document which appeared to be outdated and unneeded: lldb/docs/building-with-debug-llvm.txt
  Differential Revision: https://reviews.llvm.org/D57330
  llvm-svn: 352514
* [CodeGen][X86] Expand UADDSAT to NOT+UMIN+ADD | Nikita Popov | 2019-01-28 | 1 | -0/+6
  Followup to D56636, this time handling the UADDSAT case by expanding uadd.sat(a, b) to umin(a, ~b) + b.
  Differential Revision: https://reviews.llvm.org/D56869
  llvm-svn: 352409
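The identity uadd.sat(a, b) = umin(a, ~b) + b can be checked with a short scalar sketch. The function name is hypothetical; the reasoning is that ~b is the largest value that can be added to b without wrapping, so clamping a to it makes the final add safe.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of the NOT+UMIN+ADD expansion; not LLVM code.
uint32_t uaddsat_umin(uint32_t a, uint32_t b) {
  uint32_t headroom = static_cast<uint32_t>(~b);  // UINT32_MAX - b
  return std::min(a, headroom) + b;               // cannot wrap
}
```

When a exceeds the headroom, the sum is exactly `~b + b`, which is the all-ones saturated value.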
* [TargetLowering] Rename getExpandedFixedPointMultiplication to expandFixedPointMul. NFCI. | Simon Pilgrim | 2019-01-24 | 1 | -2/+1
  Match the (much shorter) name used in various legalization methods.
  llvm-svn: 352056
* Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. | Chandler Carruth | 2019-01-19 | 1 | -4/+3
  We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
  Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
  llvm-svn: 351636
* Reapply "[CodeGen][X86] Expand USUBSAT to UMAX+SUB, also for vectors" | Nikita Popov | 2019-01-15 | 1 | -4/+16
  Related to https://bugs.llvm.org/show_bug.cgi?id=40123.
  Rather than scalarizing, expand a vector USUBSAT into UMAX+SUB, which produces much better code for X86.
  Reapplying with updated SLPVectorizer tests.
  Differential Revision: https://reviews.llvm.org/D56636
  llvm-svn: 351219
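The UMAX+SUB expansion named in this commit corresponds to the identity usub.sat(a, b) = umax(a, b) - b, sketched here on 32-bit scalars. The function name is hypothetical and the sketch is scalar only, whereas the commit applies the expansion to vectors.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of the UMAX+SUB expansion; not LLVM code.
uint32_t usubsat_umax(uint32_t a, uint32_t b) {
  // If a < b the max is b and the result is 0; otherwise it is a - b.
  return std::max(a, b) - b;
}
```

Because the minuend is never smaller than b, the subtraction cannot wrap and no overflow flag or select is needed.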
* Revert "[CodeGen][X86] Expand USUBSAT to UMAX+SUB, also for vectors" | Nikita Popov | 2019-01-14 | 1 | -16/+4
  This reverts commit r351125.
  I missed test changes in an SLPVectorizer test, due to the cost model changes. Reverting for now.
  llvm-svn: 351129
* [CodeGen][X86] Expand USUBSAT to UMAX+SUB, also for vectors | Nikita Popov | 2019-01-14 | 1 | -4/+16
  Related to https://bugs.llvm.org/show_bug.cgi?id=40123.
  Rather than scalarizing, expand a vector USUBSAT into UMAX+SUB, which produces much better code for X86.
  Differential Revision: https://reviews.llvm.org/D56636
  llvm-svn: 351125
* [X86] Rename overly verbose method; NFC | Nikita Popov | 2019-01-13 | 1 | -2/+1
  As suggested on D56636.
  llvm-svn: 351021
* Use getShiftAmountTy for shift amounts. | Simon Pilgrim | 2019-01-12 | 1 | -1/+2
  llvm-svn: 351005
* [X86][AARCH64] Improve ISD::ABS support | Simon Pilgrim | 2019-01-12 | 1 | -0/+20
  This patch takes some of the code from D49837 to allow us to enable ISD::ABS support for all SSE vector types.
  Differential Revision: https://reviews.llvm.org/D56544
  llvm-svn: 350998
* Remove check for single use in ShrinkDemandedConstant | Stanislav Mekhanoshin | 2019-01-09 | 1 | -3/+0
  This moves the check for single use from the general ShrinkDemandedConstant to the BE because of the AArch64 regression after D56289/rL350475. After several hours of experiments I did not come up with a testcase failing on any other targets if the check is not performed.
  Moreover, a direct call to ShrinkDemandedConstant is not really needed and is superseded by SimplifyDemandedBits.
  Differential Revision: https://reviews.llvm.org/D56406
  llvm-svn: 350684
* [TargetLowering][AMDGPU] Remove the SimplifyDemandedBits function that takes a User and OpIdx. Stop using it in AMDGPU target for simplifyI24. | Craig Topper | 2019-01-07 | 1 | -50/+0
  As we saw in D56057 when we tried to use this function on X86, it's unsafe. It allows the operand node to have multiple users, but doesn't prevent recursing past the first node when it does have multiple users. This can cause other simplifications earlier in the graph without regard to what bits are needed by the other users of the first node. Ideally all we should do to the first node if it has multiple uses is bypass it when it's not needed by the user we started from. Doing any other transformation that SimplifyDemandedBits can do like turning ZEXT/SEXT into AEXT would result in an increase in instructions.
  Fortunately, we already have a function that can do just that, GetDemandedBits. It will only make transformations that involve bypassing a node.
  This patch changes AMDGPU's simplifyI24, to use a combination of GetDemandedBits to handle the multiple use simplifications. And then uses the regular SimplifyDemandedBits on each operand to handle simplifications allowed when the operand only has a single use.
  Unfortunately, GetDemandedBits simplifies constants more aggressively than SimplifyDemandedBits. This caused the -7 constant in the changed test to be simplified to remove the upper bits. I had to modify computeKnownBits to account for this by ignoring the upper 8 bits of the input.
  Differential Revision: https://reviews.llvm.org/D56087
  llvm-svn: 350560
* Added single use check to ShrinkDemandedConstant | Stanislav Mekhanoshin | 2019-01-05 | 1 | -0/+3
  Fixes cvt_f32_ubyte combine. performCvtF32UByteNCombine() could shrink source node to demanded bits only even if there are other uses.
  Differential Revision: https://reviews.llvm.org/D56289
  llvm-svn: 350475
* [SelectionDAG] Always use the version of computeKnownBits that returns a value. NFCI. | Simon Pilgrim | 2018-12-21 | 1 | -5/+4
  Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version.
  llvm-svn: 349907
* [TargetLowering] Fix propagation of undefs in zero extension ops (PR40091) | Simon Pilgrim | 2018-12-19 | 1 | -0/+14
  As described on PR40091, we have several places where zext (and zext_vector_inreg) fold an undef input into an undef output. For zero extensions this is incorrect as the output should guarantee to at least have the new upper bits set to zero.
  SimplifyDemandedVectorElts is the worst offender (and it's the most likely to cause new undefs to appear) but DAGCombiner's tryToFoldExtendOfConstant has a similar issue.
  Thanks to @dmgreen for catching this.
  Differential Revision: https://reviews.llvm.org/D55883
  llvm-svn: 349625
* [TargetLowering] Fallback from SimplifyDemandedVectorElts to SimplifyDemandedBits | Simon Pilgrim | 2018-12-18 | 1 | -1/+8
  For opcodes not covered by SimplifyDemandedVectorElts, SimplifyDemandedBits might be able to help now that it supports demanded elts as well.
  llvm-svn: 349466
* NFC: remove unused variable | JF Bastien | 2018-12-17 | 1 | -1/+0
  D55768 removed its use.
  llvm-svn: 349377
* [TargetLowering] Add DemandedElts mask to SimplifyDemandedBits (PR40000) | Simon Pilgrim | 2018-12-17 | 1 | -42/+120
  This is an initial patch to add the necessary support for a DemandedElts argument to SimplifyDemandedBits, more closely matching computeKnownBits and to help improve vector codegen.
  I've added only a small amount of the changes necessary to get at least one test to update - a lot more can be done but I'd like to add these methodically with proper test coverage, at the same time the hope is to slowly move some/all of SimplifyDemandedVectorElts into SimplifyDemandedBits as well.
  Differential Revision: https://reviews.llvm.org/D55768
  llvm-svn: 349374
* [TargetLowering] Add ISD::OR + ISD::XOR handling to SimplifyDemandedVectorElts | Simon Pilgrim | 2018-12-15 | 1 | -0/+2
  Differential Revision: https://reviews.llvm.org/D55600
  llvm-svn: 349264
* [TargetLowering] Add ISD::ROTL/ROTR vector expansion | Simon Pilgrim | 2018-12-13 | 1 | -0/+45
  Move existing rotation expansion code into TargetLowering and set it up for vectors as well.
  Ideally this would share more of the funnel shift expansion, but we handle the shift amount modulo quite differently at the moment.
  Begun removing x86 vector rotate custom lowering to use the expansion.
  llvm-svn: 349025
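For readers unfamiliar with how a rotate is expanded into shifts, a common scalar formulation is sketched below. It is an assumption that the patch uses this exact shape; the function names are hypothetical and the sketch covers a single 32-bit lane only. Masking both shift amounts keeps every shift in range, including for a rotate amount of zero.

```cpp
#include <cassert>
#include <cstdint>

// Rotate left: (x << (c mod 32)) | (x >> (-c mod 32)).
uint32_t rotl32(uint32_t x, unsigned amount) {
  unsigned left = amount & 31u;
  unsigned right = (0u - amount) & 31u;  // 32 - left, or 0 when left is 0
  return (x << left) | (x >> right);
}

// Rotate right: the mirror image of rotl32.
uint32_t rotr32(uint32_t x, unsigned amount) {
  unsigned right = amount & 31u;
  unsigned left = (0u - amount) & 31u;
  return (x >> right) | (x << left);
}
```

The masking trick relies on the bit width being a power of two; other widths would need an explicit remainder instead.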
* [TargetLowering] Add ISD::AND handling to SimplifyDemandedVectorElts | Simon Pilgrim | 2018-12-12 | 1 | -0/+16
  If either of the operand elements are zero then we know the result element is going to be zero (even if the other element is undef).
  Differential Revision: https://reviews.llvm.org/D55558
  llvm-svn: 348926
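The rule stated above can be modelled with per-lane bitmasks. This is a rough sketch under assumptions: `LaneFacts` and `and_lanes` are hypothetical, bit i of each mask describes lane i, and treating a lane as undef only when both inputs are undef is this sketch's choice rather than something the commit message states.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-lane facts: bit i set in `undef` means lane i is undef,
// bit i set in `zero` means lane i is known to be zero.
struct LaneFacts { uint32_t undef; uint32_t zero; };

// Combine the facts of the two operands of a lane-wise AND.
LaneFacts and_lanes(LaneFacts lhs, LaneFacts rhs) {
  LaneFacts result;
  // A zero lane on either side forces a zero result lane.
  result.zero = lhs.zero | rhs.zero;
  // A known-zero result lane is never reported as undef, even if the
  // other operand's lane was undef.
  result.undef = lhs.undef & rhs.undef & ~result.zero;
  return result;
}
```

In the test values below, lane 1 is zero on the left and undef on the right, and the result records it as zero rather than undef.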