path: root/llvm/lib/CodeGen/SelectionDAG
Commit message | Author | Age | Files | Lines
...
* [NFC] Refactor visitIntrinsicCall so it doesn't return a const char* | Guillaume Chatelet | 2019-05-20 | 2 | -141/+148
    Summary: API simplification
    Reviewers: courbet
    Subscribers: hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D61306
    llvm-svn: 361140
* [DebugInfoMetadata] Refactor DIExpression::prepend constants (NFC) | Petar Jovanovic | 2019-05-20 | 2 | -4/+3
    Refactor DIExpression::With* into a flag enum in order to be less error-prone to use (as discussed on D60866).
    Patch by Djordje Todorovic.
    Differential Revision: https://reviews.llvm.org/D61943
    llvm-svn: 361137
* Revert "[NFC] Refactor visitIntrinsicCall so it doesn't return a const char*" | Guillaume Chatelet | 2019-05-20 | 2 | -145/+138
    This reverts commit 706d3cd6388cc3446aab282f3af879862b10cbed.
    llvm-svn: 361130
* [NFC] Refactor visitIntrinsicCall so it doesn't return a const char* | Guillaume Chatelet | 2019-05-20 | 2 | -138/+145
    Summary: API simplification
    Reviewers: courbet
    Subscribers: hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D61306
    llvm-svn: 361129
* [DAGCombiner] visitShiftByConstant(): drop bogus signbit check | Roman Lebedev | 2019-05-17 | 1 | -18/+9
    Summary: That check claims that the transform is illegal otherwise. That isn't true:
    1. For `ISD::ADD`, we only process `ISD::SHL` outer shift => sign bit does not matter https://rise4fun.com/Alive/K4A
    2. For `ISD::AND`, there is no restriction on constants: https://rise4fun.com/Alive/Wy3
    3. For `ISD::OR`, there is no restriction on constants: https://rise4fun.com/Alive/GOH
    4. For `ISD::XOR`, there is no restriction on constants: https://rise4fun.com/Alive/ml6
    So, why is it there then?
    This changes the testcase that was touched by @spatel in rL347478, but I'm not sure that test tests anything particular?
    Reviewers: RKSimon, spatel, craig.topper, jojo, rengolin
    Reviewed By: spatel
    Subscribers: javed.absar, llvm-commits, spatel
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D61918
    llvm-svn: 361044
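The visitShiftByConstant() entry above argues that moving a constant shift through a binary operator does not depend on the sign bit of the constant. A minimal C++ sketch of that claim on 32-bit unsigned values follows. The helper names are hypothetical and this is not the DAGCombiner code; it only mirrors the algebra for the `ISD::ADD` under `ISD::SHL` case and for one bitwise operator under a logical right shift.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper names for illustration; none of these are LLVM APIs.

// (shl (add x, c1), c2)
inline std::uint32_t shlOfAdd(std::uint32_t x, std::uint32_t c1, unsigned c2) {
  assert(c2 < 32 && "shift amount must be smaller than the bit width");
  return (x + c1) << c2;
}

// (add (shl x, c2), (shl c1, c2))
inline std::uint32_t addOfShl(std::uint32_t x, std::uint32_t c1, unsigned c2) {
  assert(c2 < 32 && "shift amount must be smaller than the bit width");
  return (x << c2) + (c1 << c2);
}

// (srl (xor x, c1), c2)
inline std::uint32_t srlOfXor(std::uint32_t x, std::uint32_t c1, unsigned c2) {
  assert(c2 < 32 && "shift amount must be smaller than the bit width");
  return (x ^ c1) >> c2;
}

// (xor (srl x, c2), (srl c1, c2))
inline std::uint32_t xorOfSrl(std::uint32_t x, std::uint32_t c1, unsigned c2) {
  assert(c2 < 32 && "shift amount must be smaller than the bit width");
  return (x >> c2) ^ (c1 >> c2);
}
```

Unsigned arithmetic wraps modulo 2^32, so both forms of each pair agree even when `c1` has its top bit set, which is the situation the removed check rejected.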
* [CodeGen] Fixed de-optimization of legalize subvector extract | Tim Renouf | 2019-05-16 | 1 | -0/+18
    The recent introduction of v3i32 etc as an MVT, and its use in AMDGPU 3-dword memory instructions, caused a de-optimization problem for code with such a load that then bitcasts via vector of i8, because v12i8 is not an MVT so it legalizes the bitcast by widening it.
    This commit adds the ability to widen a bitcast using extract_subvector on the result, so the value does not need to go via memory.
    Differential Revision: https://reviews.llvm.org/D60457
    Change-Id: Ie4abb7760547e54a2445961992eafc78e80d4b64
    llvm-svn: 360942
* [CodeGen] Add lround/llround builtins | Adhemerval Zanella | 2019-05-16 | 6 | -0/+144
    This patch adds ISD::LROUND and ISD::LLROUND along with new intrinsics. The changes are as straightforward as for the other floating-point rounding functions, with just some adjustments required to handle the return value being an integer.
    The idea is to optimize lround/llround generation for AArch64 in a subsequent patch. The current semantics simply route the call to the libm symbol.
    llvm-svn: 360889
* [codeview] Fix SDNode representation of annotation labels | Reid Kleckner | 2019-05-15 | 2 | -1/+3
    Before this change, they were erroneously constructed with the EH_LABEL SDNode opcode, which caused other passes to interact with them in incorrect ways. See the FIXME about fastisel that this addresses in the existing test case.
    Fixes PR41890
    llvm-svn: 360818
* [DAGCombiner][NFC] Add a comment. | Clement Courbet | 2019-05-15 | 1 | -0/+2
    As suggested in D61846.
    llvm-svn: 360755
* [SDAG] fix unused variable warning and unneeded indirection; NFC | Sanjay Patel | 2019-05-14 | 2 | -3/+3
    llvm-svn: 360640
* [SDAG, x86] allow targets to override test for binop opcodes | Sanjay Patel | 2019-05-14 | 2 | -9/+10
    This follows the pattern of the existing isCommutativeBinOp().
    x86 shows improvements from vector narrowing for the min/max opcodes.
    llvm-svn: 360639
* [TargetLowering] Handle multi depth GEPs w/ inline asm constraints | Nick Desaulniers | 2019-05-13 | 1 | -38/+33
    Summary: X86TargetLowering::LowerAsmOperandForConstraint had better support than TargetLowering::LowerAsmOperandForConstraint for arbitrary-depth getelementptr expressions with the "i", "n", and "s" extended inline assembly constraints. Hoist its support from the derived class into the base class.
    Link: https://github.com/ClangBuiltLinux/linux/issues/469
    Reviewers: echristo, t.p.northover
    Reviewed By: t.p.northover
    Subscribers: t.p.northover, E5ten, kees, jyknight, nemanjai, javed.absar, eraman, hiraditya, jsji, llvm-commits, void, craig.topper, nathanchance, srhines
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D61560
    llvm-svn: 360604
* [TargetLowering] Add SimplifyDemandedBits support for ZERO_EXTEND_VECTOR_INREG | Simon Pilgrim | 2019-05-13 | 1 | -0/+24
    More work for PR39709.
    llvm-svn: 360592
* [DAGCombiner] narrow vector binop with inserts/extract | Sanjay Patel | 2019-05-13 | 1 | -1/+33
    We catch most of these patterns (on x86 at least) by matching a concat vectors opcode early in combining, but the pattern may emerge later using insert subvector instead.
    The AVX1 diffs for add/sub overflow show another missed narrowing pattern. That one may be falling through the cracks because of combine ordering and multiple uses.
    llvm-svn: 360585
* Add constrained fptrunc and fpext intrinsics. | Kevin P. Neal | 2019-05-13 | 7 | -29/+245
    The new fptrunc and fpext intrinsics are constrained versions of the regular fptrunc and fpext instructions.
    Reviewed by: Andrew Kaylor, Craig Topper, Cameron McInally, Conner Abbot
    Approved by: Craig Topper
    Differential Revision: https://reviews.llvm.org/D55897
    llvm-svn: 360581
* TargetLowering::SimplifyDemandedBits - early-out for UNDEF ops. NFCI. | Simon Pilgrim | 2019-05-13 | 1 | -3/+5
    llvm-svn: 360579
* [DAGCombiner] Fix invalid alias analysis. | Clement Courbet | 2019-05-13 | 1 | -3/+2
    Summary: When we know for sure whether two addresses do or do not alias, we should immediately return from DAGCombiner::isAlias().
    I think this comes from a bad copy/paste. Sorry for not catching that during the code review.
    Fixes PR41855.
    Reviewers: niravd, gchatelet, EricWF
    Subscribers: hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D61846
    llvm-svn: 360566
* Recommit r358887 "[TargetLowering][AMDGPU][X86] Improve SimplifyDemandedBits bitcast handling" | Craig Topper | 2019-05-13 | 1 | -1/+25
    I've included a new fix in X86RegisterInfo to prevent PR41619 without reintroducing r359392. We might be able to improve that in the base class implementation of shouldRewriteCopySrc somehow. But this hopefully enables forward progress on SimplifyDemandedBits improvements for now.
    Original commit message:
    This patch adds support for BigBitWidth -> SmallBitWidth bitcasts, splitting the DemandedBits/Elts accordingly.
    The AMDGPU backend needed an extra (srl (and x, c1 << c2), c2) -> (and (srl x, c2), c1) combine to encourage BFE creation. I investigated putting this in DAGComb, but it caused a lot of noise on other targets: some improvements, some regressions.
    The X86 changes are all definite wins.
    llvm-svn: 360552
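The AMDGPU combine quoted in the recommit above, (srl (and x, c1 << c2), c2) -> (and (srl x, c2), c1), can be checked on plain integers. The following is a small sketch on 32-bit unsigned values with hypothetical helper names; it illustrates the bit-level identity only, not the backend implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper names for illustration; not LLVM APIs.

// (srl (and x, c1 << c2), c2): mask the field in place, then shift it down.
inline std::uint32_t maskThenShift(std::uint32_t x, std::uint32_t c1, unsigned c2) {
  assert(c2 < 32 && "shift amount must be smaller than the bit width");
  return (x & (c1 << c2)) >> c2;
}

// (and (srl x, c2), c1): shift the field down, then mask it.
inline std::uint32_t shiftThenMask(std::uint32_t x, std::uint32_t c1, unsigned c2) {
  assert(c2 < 32 && "shift amount must be smaller than the bit width");
  return (x >> c2) & c1;
}
```

The second form is the shift-then-mask shape of a bit-field extract, which is presumably why it helps BFE selection. Any bits of `c1` that `c1 << c2` would shift out are already zero in `x >> c2`, so the two forms agree for every input.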
* [DAGCombiner] try to move bitcast after extract_subvector | Sanjay Patel | 2019-05-12 | 1 | -0/+24
    I noticed that we were failing to narrow an x86 ymm math op in a case similar to the 'madd' test diff. That is because a bitcast is sitting between the math and the extract subvector and thwarting our pattern matching for narrowing:
      t56: v8i32 = add t59, t58
      t68: v4i64 = bitcast t56
      t73: v2i64 = extract_subvector t68, Constant:i64<2>
      t96: v4i32 = bitcast t73
    There are a few wins and neutral diffs in the other tests.
    Differential Revision: https://reviews.llvm.org/D61806
    llvm-svn: 360541
* [DAG] Add SimplifyDemandedBits support for BITREVERSE | Simon Pilgrim | 2019-05-11 | 1 | -0/+10
    Pulled out of D58017 while I continue to investigate the BSWAP regression on PPC.
    llvm-svn: 360534
* SelectionDAGISel::CodeGenAndEmitDAG - remove unused variable. NFCI. | Simon Pilgrim | 2019-05-11 | 1 | -3/+0
    llvm-svn: 360514
* Revert [DAGCombiner] Avoid creating large tokenfactors in visitTokenFactor | Jordan Rupprecht | 2019-05-10 | 1 | -3/+2
    This reverts r360171 (git commit a9d6c32eafc645c55b07eb50698c428e14c0bffd).
    A repro showing the asan/msan failures is forthcoming.
    llvm-svn: 360481
* [LegalizeVectorOps] Remove calls to LegalizeOp on the return value from ExpandLoad/ExpandStore. | Craig Topper | 2019-05-10 | 1 | -2/+2
    We already updated the LegalizedNodes map at the end of the Expand call. This would have marked the new node as being mapped to itself. So the LegalizeOp call will find that and immediately return.
    llvm-svn: 360472
* [SDAG] Recursively legalize both vector mulo results | Nikita Popov | 2019-05-10 | 1 | -3/+7
    Split out from D61692 per RKSimon's suggestion.
    Vector op legalization will automatically recursively legalize the returned SDValue, but we need to take care of the other results ourselves. Otherwise it will end up getting legalized only during op legalization, by which point it might be too late (though I'm not aware of any specific cases right now).
    There are codegen differences because expansion occurs earlier now and we don't get a DAGCombiner run in between.
    Differential Revision: https://reviews.llvm.org/D61744
    llvm-svn: 360470
* [DAGCombiner] reduce code duplication; NFC | Sanjay Patel | 2019-05-10 | 1 | -10/+8
    llvm-svn: 360462
* SelectionDAG: accommodate atomic floating stores. | Tim Northover | 2019-05-10 | 1 | -1/+4
    We were applying a pointer truncation to floating types, which crashed LLVM. That is Not A Good Thing(TM).
    llvm-svn: 360421
* [CodeGen] Add comment about FSUB <-> FNEG xforms | Cameron McInally | 2019-05-09 | 1 | -0/+4
    Differential Revision: https://reviews.llvm.org/D61741
    llvm-svn: 360366
* [DAGCombiner] Limit number of nodes explored as store candidates. | Florian Hahn | 2019-05-09 | 1 | -2/+5
    To find the candidates to merge stores we iterate over all nodes in a chain for each store, which leads to quadratic compile times for large basic blocks with a large number of stores.
    Reviewers: niravd, spatel, craig.topper
    Reviewed By: niravd
    Differential Revision: https://reviews.llvm.org/D61511
    llvm-svn: 360357
* [SelectionDAG] Expand ADD/SUBCARRY | Leonard Chan | 2019-05-09 | 1 | -0/+42
    This patch allows for expansion of ADDCARRY and SUBCARRY when the target does not support it.
    Differential Revision: https://reviews.llvm.org/D61411
    llvm-svn: 360303
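For readers unfamiliar with what expanding ADDCARRY and SUBCARRY involves, here is one standard way to build add-with-carry and subtract-with-borrow from plain 32-bit arithmetic and comparisons. This is a sketch under assumptions: the struct and function names are hypothetical, and the node sequence the patch actually emits may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type and helpers for illustration; not LLVM APIs.
struct CarryResult {
  std::uint32_t value;  // low 32 bits of the result
  bool carry;           // carry-out (or borrow-out for subtraction)
};

// lhs + rhs + carryIn, using only wrapping adds and unsigned compares.
inline CarryResult addCarry(std::uint32_t lhs, std::uint32_t rhs, bool carryIn) {
  std::uint32_t sum = lhs + rhs;                     // may wrap
  std::uint32_t sum2 = sum + (carryIn ? 1u : 0u);    // may wrap again
  bool carry1 = sum < lhs;                           // first add wrapped
  bool carry2 = carryIn && sum2 == 0;                // adding the carry-in wrapped
  return {sum2, carry1 || carry2};
}

// lhs - rhs - borrowIn, using only wrapping subtracts and unsigned compares.
inline CarryResult subCarry(std::uint32_t lhs, std::uint32_t rhs, bool borrowIn) {
  std::uint32_t diff = lhs - rhs;                    // may wrap
  std::uint32_t diff2 = diff - (borrowIn ? 1u : 0u); // may wrap again
  bool borrow1 = lhs < rhs;                          // first subtract wrapped
  bool borrow2 = borrowIn && diff == 0;              // subtracting the borrow-in wrapped
  return {diff2, borrow1 || borrow2};
}

inline void carryExamples() {
  assert(addCarry(0xFFFFFFFFu, 0u, true).carry);   // 2^32 - 1 + 1 overflows
  assert(subCarry(0u, 0u, true).carry);            // 0 - 0 - 1 borrows
}
```

Chaining `addCarry` over successive 32-bit limbs, feeding each carry-out into the next carry-in, gives multi-word addition, which is the usual reason a target wants these nodes.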
* [SelectionDAG] fold 'fneg undef' to undef | Sanjay Patel | 2019-05-08 | 1 | -0/+4
    This is extracted from the original draft of D61419 with some additional tests.
    We don't currently get this in IR (it's conservatively turned into a NaN), but presumably that'll get updated as we add real IR support for 'fneg' rather than 'fsub -0.0, x'.
    The x86-32 run shows the following, and I haven't looked further to see why, but that seems to be independent:
      Legalizing: t1: f32 = undef
      Trying to expand node
      Creating fp constant: t4: f32 = ConstantFP<0.000000e+00>
    Differential Revision: https://reviews.llvm.org/D61516
    llvm-svn: 360296
* [FastISel][X86] Support FNeg instruction in target independent fast isel handling | Craig Topper | 2019-05-08 | 1 | -0/+3
    This patch adds support for calling selectFNeg for FNeg instructions in addition to the fsub idiom.
    Differential Revision: https://reviews.llvm.org/D61624
    llvm-svn: 360273
* [LegalizeDAG] Assert non-power-of-2 load/store op splits are in range. NFCI. | Simon Pilgrim | 2019-05-08 | 1 | -2/+6
    Fixes static analyzer undefined/out-of-range shift warnings.
    llvm-svn: 360245
* Fix cppcheck operator precedence warning. NFCI. | Simon Pilgrim | 2019-05-08 | 1 | -2/+2
    llvm-svn: 360234
* [NFC] Add a static function to do the endian check | QingShan Zhang | 2019-05-08 | 1 | -15/+37
    Add a new function to do the endian check, because a later patch that I will commit also needs it.
    Differential Revision: https://reviews.llvm.org/D61236
    llvm-svn: 360226
* [DAGCombiner] Avoid creating large tokenfactors in visitTokenFactor | Florian Hahn | 2019-05-07 | 1 | -2/+3
    When simplifying TokenFactors, we potentially iterate over all operands of a large number of TokenFactors. This causes quadratic compile times in some cases and the large token factors cause additional scalability problems elsewhere.
    This patch adds some limits to the number of nodes explored for the cases mentioned above.
    Reviewers: niravd, spatel, craig.topper
    Reviewed By: niravd
    Differential Revision: https://reviews.llvm.org/D61397
    llvm-svn: 360171
* Avoid use-after-move warnings by using swap instead. NFCI. | Simon Pilgrim | 2019-05-07 | 1 | -2/+5
    Swap should be as quick in these cases, and leaves the original variables in a known (empty) state.
    llvm-svn: 360164
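The swap-instead-of-move idea in the entry above can be sketched with a standard container. The `Worklist` type below is hypothetical and is not the code touched by the commit; it only shows why swapping with a fresh empty container leaves the source in a known state.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical worklist type for illustration; not LLVM code.
struct Worklist {
  std::vector<int> pending;

  // Hands the pending items to the caller and leaves `pending` empty.
  std::vector<int> takePending() {
    std::vector<int> taken;     // starts out empty
    std::swap(taken, pending);  // constant-time buffer exchange
    return taken;               // `pending` now holds the empty vector
  }
};

inline void worklistExample() {
  Worklist w;
  w.pending = {1, 2, 3};
  std::vector<int> items = w.takePending();
  assert(items.size() == 3);
  assert(w.pending.empty());  // safe to reuse, no use-after-move diagnostic
}
```

A moved-from `std::vector` is only guaranteed to be in a valid but unspecified state, which is what static analyzers flag; after the swap the old variable is definitely the empty vector.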
* [FastISel][X86] If selectFNeg fails, fall back to SelectionDAG not treating it as an fsub. | Craig Topper | 2019-05-07 | 1 | -8/+9
    Summary: If fneg lowering for fsub -0.0, x fails we currently fall back to treating it as an fsub. This has different behavior for NaNs than the xor with sign bit trick we normally try to do. On X86, the xor trick for double fails fast-isel in 32-bit mode with sse2 due to 64 bit integer types not being available. With -O2 we would always use an xorpd for this case. If we use subsd, this creates an observable behavior difference between -O0 and -O2. So fall back to SelectionDAG if we can't fast-isel it; that way SelectionDAG will use the xorpd.
    I believe this patch is restoring the behavior prior to r345295 from last October. This was missed then because our fast isel case in 32-bit mode aborted fast-isel earlier for another reason. But I've added new tests to cover that.
    Reviewers: andrew.w.kaylor, cameron.mcinally, spatel, efriedma
    Reviewed By: cameron.mcinally
    Subscribers: hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D61622
    llvm-svn: 360111
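The "xor with sign bit trick" mentioned in the FastISel entry above can be sketched in portable C++ for a 64-bit IEEE-754 double. The helper names are hypothetical, and the sketch assumes `double` is the IEEE-754 binary64 format; it shows the bit manipulation only, not the instruction selection.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration; assumes IEEE-754 binary64 doubles.

// Returns the raw bit pattern of a double.
inline std::uint64_t bitsOf(double x) {
  std::uint64_t bits;
  static_assert(sizeof bits == sizeof x, "sketch assumes a 64-bit double");
  std::memcpy(&bits, &x, sizeof bits);
  return bits;
}

// Negates by flipping only the sign bit, the way an xorpd with a sign mask does.
inline double fnegViaXor(double x) {
  std::uint64_t bits = bitsOf(x) ^ 0x8000000000000000ULL;
  std::memcpy(&x, &bits, sizeof x);
  return x;
}

inline void fnegExample() {
  assert(fnegViaXor(1.5) == -1.5);
  // +0.0 becomes -0.0: only the sign bit is set afterwards.
  assert(bitsOf(fnegViaXor(0.0)) == 0x8000000000000000ULL);
}
```

Because the xor touches nothing but the sign bit, a NaN input keeps its payload and gets its sign flipped. A real subtraction such as -0.0 - x goes through the floating-point unit, where the sign and payload of a NaN result are not pinned down in the same way, which is the observable difference the commit describes.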
* [FastISel] Pass the fneg input operand to hasTrivialKill in FastISel::selectFNeg. | Craig Topper | 2019-05-06 | 1 | -1/+1
    We're trying to calculate the kill flag for OpReg which is the input so we need to pass the input here.
    llvm-svn: 360097
* Fix pr33010, a 2 year old crashing regression | Philip Reames | 2019-05-06 | 1 | -0/+4
    The problem was that we were creating a CMOV64rr <TargetFrameIndex>, <TargetFrameIndex>. The entire point of a TFI is that address code is not generated, so there's no way to legalize/lower this. Instead, simply prevent its creation.
    Arguably, we shouldn't be using *Target*FrameIndices in StatepointLowering at all, but that's a much deeper change.
    llvm-svn: 360090
* [SelectionDAG][X86] Support inline assembly returning an mmx register into a type with fewer than 64 bits. | Craig Topper | 2019-05-06 | 1 | -0/+8
    It's possible to use the 'y' mmx constraint with a type narrower than 64-bits.
    This patch supports this by bitcasting the mmx type to 64-bits and then truncating to the desired type.
    There are probably other missing type combinations we need to support, but this is the case we have a bug report for.
    Fixes PR41748.
    Differential Revision: https://reviews.llvm.org/D61582
    llvm-svn: 360069
* Revert r359392 and r358887 | Craig Topper | 2019-05-06 | 1 | -25/+1
    Reverts "[X86] Remove (V)MOV64toSDrr/m and (V)MOVDI2SSrr/m. Use 128-bit result MOVD/MOVQ and COPY_TO_REGCLASS instead"
    Reverts "[TargetLowering][AMDGPU][X86] Improve SimplifyDemandedBits bitcast handling"
    Eric Christopher and Jorge Gorbe Moya reported some issues with these patches to me off list.
    Removing the CodeGenOnly instructions has changed how fneg is handled during fast-isel with sse/sse2. We're now emitting fsub -0.0, x instead of moving to the integer domain (in a GPR), xoring the sign bit, and then moving back to xmm. This is because the fast isel table no longer contains an entry for (f32/f64 bitcast (i32/i64)) so the target independent fneg code fails. The use of fsub changes the behavior of nan with respect to -O2 codegen which will always use a pxor.
    NOTE: We still have a difference with double with -m32 since the move to GPR doesn't work there. I'll file a separate PR for that and add test cases.
    Since removing the CodeGenOnly instructions was fixing PR41619, I'm reverting r358887 which exposed that PR. Though I wouldn't be surprised if that bug can still be hit independent of that.
    This should hopefully get Google back to green. I'll work with Simon and other X86 folks to figure out how to move forward again.
    llvm-svn: 360066
* [SDAG][AArch64] Boolean and/or reduce to umax/min reduce (PR41635) | Nikita Popov | 2019-05-06 | 1 | -0/+12
    This addresses one half of https://bugs.llvm.org/show_bug.cgi?id=41635 by combining a VECREDUCE_AND/OR into VECREDUCE_UMIN/UMAX (if latter is legal but former is not) for zero-or-all-ones boolean reductions (which are detected based on sign bits).
    Differential Revision: https://reviews.llvm.org/D61398
    llvm-svn: 360054
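The equivalence behind the reduction combine above is easy to check on scalar code: when every lane is either zero or all ones, an AND reduction equals the unsigned minimum and an OR reduction equals the unsigned maximum. The following sketch uses a hypothetical four-lane array type and helper names; it demonstrates the arithmetic fact only, not the SelectionDAG combine.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical 4-lane "vector" for illustration; not an LLVM type.
using Lanes = std::array<std::uint32_t, 4>;

inline std::uint32_t reduceAnd(const Lanes &v) {
  std::uint32_t acc = 0xFFFFFFFFu;
  for (std::uint32_t lane : v) acc &= lane;
  return acc;
}

inline std::uint32_t reduceOr(const Lanes &v) {
  std::uint32_t acc = 0u;
  for (std::uint32_t lane : v) acc |= lane;
  return acc;
}

inline std::uint32_t reduceUMin(const Lanes &v) {
  return *std::min_element(v.begin(), v.end());
}

inline std::uint32_t reduceUMax(const Lanes &v) {
  return *std::max_element(v.begin(), v.end());
}

inline void reductionExample() {
  // Every lane is 0 or all ones, as produced by a vector comparison.
  Lanes mixed = {{0xFFFFFFFFu, 0u, 0xFFFFFFFFu, 0xFFFFFFFFu}};
  assert(reduceAnd(mixed) == reduceUMin(mixed));  // both 0
  assert(reduceOr(mixed) == reduceUMax(mixed));   // both all ones
}
```

The identity fails for arbitrary lane values (for example AND of 1 and 2 is 0 while the minimum is 1), which is why the combine is restricted to zero-or-all-ones booleans.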
* [SelectionDAG] Replace llvm_unreachable at the end of getCopyFromParts with a report_fatal_error. | Craig Topper | 2019-05-06 | 1 | -1/+1
    Based on PR41748, not all cases are handled in this function.
    llvm_unreachable is treated as an optimization hint that can prune code paths in a release build. This causes weird behavior when PR41748 is encountered on a release build. It appears to generate an fp_round instruction from the floating point code.
    Making this a report_fatal_error prevents incorrect optimization of the code and will instead generate a message to file a bug report.
    llvm-svn: 360008
* [SelectionDAG] Use any_of/all_of where possible. NFCI. | Simon Pilgrim | 2019-05-05 | 1 | -14/+4
    llvm-svn: 359974
* [DAGCombine] Remove repeated variables. NFCI. | Simon Pilgrim | 2019-05-03 | 1 | -8/+3
    llvm-svn: 359915
* [TargetLowering] SimplifySetCC - remove repeated variable. NFCI. | Simon Pilgrim | 2019-05-03 | 1 | -2/+1
    Also reduce scope of Temp variable.
    llvm-svn: 359911
* [SelectionDAG] CreateTopologicalOrder - don't use iterator | Simon Pilgrim | 2019-05-03 | 1 | -10/+6
    We shouldn't use an iterator to loop across a std::vector when the same loop is adding elements to that std::vector.
    Found by cppcheck.
    llvm-svn: 359900
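The hazard named in the CreateTopologicalOrder entry above is that `push_back` may reallocate a `std::vector` and invalidate every outstanding iterator. A minimal sketch of the index-based alternative follows; the function and its halving rule are hypothetical stand-ins for "visit a node and append its successors" and are not the LLVM implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical worklist expansion for illustration; not LLVM code.
// Each visited value n > 1 appends n / 2 to the same vector being walked.
inline std::vector<int> expandWorklist(std::vector<int> order) {
  // Index-based loop: order.size() is re-read on every iteration and an
  // index stays meaningful even if push_back reallocates the storage.
  for (std::size_t i = 0; i != order.size(); ++i) {
    int n = order[i];  // copy the element; a reference could dangle after push_back
    if (n > 1)
      order.push_back(n / 2);
  }
  return order;
}

inline void worklistGrowthExample() {
  std::vector<int> result = expandWorklist({8});
  assert(result.size() == 4);  // 8, 4, 2, 1
  assert(result.back() == 1);
}
```

Writing the same loop with `for (auto it = order.begin(); it != order.end(); ++it)` would be undefined behavior as soon as a reallocation happened, which is the pattern cppcheck reports.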
* [TargetLowering] ShrinkDemandedConstant - reduce scope of TLO.DAG variable. NFCI. | Simon Pilgrim | 2019-05-03 | 1 | -3/+2
    Only ever used in one block.
    llvm-svn: 359890
* [TargetLowering] expandUnalignedStore - cleanup EVT variables. NFCI. | Simon Pilgrim | 2019-05-03 | 1 | -23/+18
    Avoid duplicated EVTs and rename Store/Load VTs to avoid -Wshadow warnings.
    llvm-svn: 359877
* [SelectionDAG] Use INT_MIN as (1 << 31) is UB for signed integers. NFCI. | Simon Pilgrim | 2019-05-03 | 1 | -2/+2
    llvm-svn: 359873
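The INT_MIN entry above concerns shifting a signed 1 into the sign bit of a 32-bit `int`, which C and older C++ wording treat as undefined or implementation-dependent. The sketch below shows two ways to avoid the signed shift; the helper names are hypothetical, a 32-bit two's-complement `int` is assumed, and this is not the code changed by the commit.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

static_assert(sizeof(int) * CHAR_BIT == 32, "sketch assumes a 32-bit int");

// Hypothetical helpers for illustration; not LLVM APIs.

// Sign-bit mask computed in unsigned arithmetic, where the shift is well defined.
inline std::uint32_t signBitMask32() {
  return std::uint32_t{1} << 31;
}

// Compares against INT_MIN directly instead of writing (1 << 31).
inline bool isMostNegative(int v) {
  return v == INT_MIN;
}

inline void signBitExample() {
  assert(signBitMask32() == 0x80000000u);
  assert(isMostNegative(INT_MIN));
  assert(!isMostNegative(0));
}
```

Using the named constant also states the intent ("the most negative value") more directly than a shift expression does.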