summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG
Commit message (Collapse)AuthorAgeFilesLines
* [X86] Update SelectionDAGDumper to print the extension type and expanding ↵Craig Topper2019-01-241-0/+30
| | | | | | flag for masked loads. Add truncating and compressing for masked stores. llvm-svn: 352029
* [DAGCombine] Enable more pre-indexed storesSam Parker2019-01-231-1/+7
| | | | | | | | | | | The current check in CombineToPreIndexedLoadStore is too conversative, preventing a pre-indexed store when the base pointer is a predecessor of the value being stored. Instead, we should check the pointer operand of the store. Differential Revision: https://reviews.llvm.org/D56719 llvm-svn: 351933
* [LegalizeTypes] Add debug prints to the top of PromoteFloatOperand and ↵Craig Topper2019-01-221-0/+12
| | | | | | | | | | PromoteFloatResult. Also add debug prints in the default case of the switches in these routines. Most if not all of the type legalization handlers already do this so this makes promoting floats consistent llvm-svn: 351890
* [SelectionDAGBuilder] Defer C_Register Assignments to be in line withNirav Dave2019-01-221-13/+3
| | | | | | those of C_RegisterClass. NFCI. llvm-svn: 351854
* Codegen support for atomicrmw fadd/fsubMatt Arsenault2019-01-223-0/+5
| | | | llvm-svn: 351851
* [DAGCombiner] narrow vector binop with 2 insert subvector operandsSanjay Patel2019-01-221-1/+24
| | | | | | | | | | | | | | | | | | vecbo (insertsubv undef, X, Z), (insertsubv undef, Y, Z) --> insertsubv VecC, (vecbo X, Y), Z This is another step in generic vector narrowing. It's also a step towards more horizontal op formation specifically for x86 (although we still failed to match those in the affected tests). The scalarization cases are also not optimal (we should be scalarizing those), but it's still an improvement to use a narrower vector op when we know part of the result must be constant because both inputs are undef in some vector lanes. I think a similar match but checking for a constant operand might help some of the cases in D51553. Differential Revision: https://reviews.llvm.org/D56875 llvm-svn: 351825
* [DAGCombiner] fix crash when converting build vector to shuffleSanjay Patel2019-01-211-5/+11
| | | | | | | | | | The regression test is reduced from the example shown in D56281. This does raise a question as noted in the test file: do we want to handle this pattern? I don't have a motivating example for that on x86 yet, but it seems like we could have that pattern there too, so we could avoid the back-and-forth using a shuffle. llvm-svn: 351753
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-1932-128/+96
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* [SelectionDAG] Updates for -dag-dump-verboseBjorn Pettersson2019-01-182-35/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch makes some changes related to -dag-dump-verbose. Main use case has been when debugging how SelectionDAG is dealing with debug info (SDDbgValue nodes). 1) We now print the number of DbgValues that are mapped to each SDNode. 2) Removed duplicated printing of DebugLoc (nowadays DebugLoc is printed also when not using -dag-dump-verbose). 3) Renamed SDDbgValue::dump to SDDbgValue::print, and added a new SDDbgValue::dump that will start a new line after calling print. 4) SDDbgValue::print now prints "Order", and it also prints some additional information when kind is CONST/FRAMEIX/VREG. 5) SelectionDAG::dump() now dumps all SDDbgValue nodes after the list of SDNodes (both "regular" and "ByVal" SDDbgValue:s). Invalidated nodes are not printed. 6) Prohibit inline printing of SDNode operands that has SDDbgValue nodes associated to them. Reviewers: jmorse, aprantl Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56793 llvm-svn: 351581
* [SelectionDAG] Split very large token factors for chained stores to 64k chunks.Florian Hahn2019-01-181-1/+1
| | | | | | | | | | | | | | Similar to D55073. Without this change, the DAG combiner crashes on code with more than 64k of stores in a single basic block that form parallelizable chains. No test case, as it would be very IR file. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D56740 llvm-svn: 351571
* [SelectionDAGBuilder] Cleanup InlineAsm Output generation. NFCI.Nirav Dave2019-01-181-115/+104
| | | | | | | | Defer inline asm's output fixup work until after we've generated the inline asm node itself. Remove StoresToEmit, IndirectStoresToEmit, and RetValRegs in favor of using ConstraintOperands. llvm-svn: 351558
* [SelectionDAG] Add getTokenFactor, which splits nodes with > 64k operands.Florian Hahn2019-01-182-13/+14
| | | | | | | | | This functionality is required at multiple places which potentially create large operand lists, like SelectionDAGBuilder or DAGCombiner. Differential Revision: https://reviews.llvm.org/D56739 llvm-svn: 351552
* [SelectionDAG] Add static getMaxNumOperands function to SDNode.Florian Hahn2019-01-182-3/+2
| | | | | | | | | | | | | | | | | | | Summary: Use this helper to make sure we use the same value at various places. This will likely be needed at more places were we currently crash because we use more operands than possible. Also makes it easier to change in the future. Reviewers: RKSimon, craig.topper, efriedma, aemerson Reviewed By: RKSimon Subscribers: hiraditya, arsenm, llvm-commits Differential Revision: https://reviews.llvm.org/D56859 llvm-svn: 351537
* [ScheduleDAGRRList] Do not preschedule the node has ADJCALLSTACKDOWN parentShiva Chen2019-01-181-0/+23
| | | | | | | | | | | | We should not pre-scheduled the node has ADJCALLSTACKDOWN parent, or else, when bottom-up scheduling, ADJCALLSTACKDOWN and ADJCALLSTACKUP may hold CallResource too long and make other calls can't be scheduled. If there's no other available node to schedule, the scheduler will try to rename the register by creating copy to avoid the conflict which will fail because CallResource is not a real physical register. llvm-svn: 351527
* Allow FP types for atomicrmw xchgMatt Arsenault2019-01-173-1/+47
| | | | llvm-svn: 351427
* [COFF, ARM64] Implement support for SEH extensions __try/__except/__finallyMandeep Singh Grang2019-01-161-0/+2
| | | | | | | | | | | | | | | | | Summary: This patch supports MS SEH extensions __try/__except/__finally. The intrinsics localescape and localrecover are responsible for communicating escaped static allocas from the try block to the handler. We need to preserve frame pointers for SEH. So we create a new function/property HasLocalEscape. Reviewers: rnk, compnerd, mstorsjo, TomTan, efriedma, ssijaric Reviewed By: rnk, efriedma Subscribers: smeenai, jrmuizel, alex, majnemer, ssijaric, ehsan, dmajor, kristina, javed.absar, kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D53540 llvm-svn: 351370
* [DebugInfo] Allow creation of DBG_VALUEs in blocks where the operand is not usedJeremy Morse2019-01-161-5/+6
| | | | | | | | | | | | | dbg.value intrinsics can appear in blocks where their operand is not used, meaning the operand never receives an SDNode, and thus no DBG_VALUE will be created. Get around this by looking to see whether the operand has already been allocated a virtual register. This allows dbg.values of Phi node and Values that are used across basic blocks to successfully be translated into DBG_VALUEs. Differential Revision: https://reviews.llvm.org/D56678 llvm-svn: 351358
* [SelectionDAG] Update check in createOperands to reflect max() is a valid value.Florian Hahn2019-01-161-1/+1
| | | | | | | | | | | | | | | The value returned by max() is the last valid value, adjust the comparison accordingly. The code added in D55073 creates TokenFactors with max() operands. Reviewers: aemerson, efriedma, RKSimon, craig.topper Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D56738 llvm-svn: 351318
* [DAGCombine] Fix ReduceLoadWidth for shifted offsetsSam Parker2019-01-161-12/+8
| | | | | | | | | | | | ReduceLoadWidth can trigger using a shifted mask is used and this requires that the function return a shl node to correct for the offset. However, the way that this was implemented meant that the returned result could be an existing node, which would be incorrect. This fixes the method of inserting the new node and replacing uses. Differential Revision: https://reviews.llvm.org/D50432 llvm-svn: 351310
* Reapply "[CodeGen][X86] Expand USUBSAT to UMAX+SUB, also for vectors"Nikita Popov2019-01-152-4/+28
| | | | | | | | | | | | | Related to https://bugs.llvm.org/show_bug.cgi?id=40123. Rather than scalarizing, expand a vector USUBSAT into UMAX+SUB, which produces much better code for X86. Reapplying with updated SLPVectorizer tests. Differential Revision: https://reviews.llvm.org/D56636 llvm-svn: 351219
* [SelectionDAG] Check membership of register in class for singleNirav Dave2019-01-151-6/+1
| | | | | | | | | register constraints. NFCI. Now that X86's ST(7) constraints are fixed this check can be reinstated. llvm-svn: 351207
* [DAGCombiner] reduce buildvec of zexted extracted element to shuffleSanjay Patel2019-01-151-0/+75
| | | | | | | | | | | | | | | The motivating case for this is shown in the first regression test. We are transferring to scalar and back rather than just zero-extending with 'vpmovzxdq'. That's a special-case for a more general pattern as shown here. In all tests, we're avoiding the vector-scalar-vector moves in favor of vector ops. We aren't producing optimal shuffle code in some cases though, so the patch is limited to reduce regressions. Differential Revision: https://reviews.llvm.org/D56281 llvm-svn: 351198
* Revert "[CodeGen][X86] Expand USUBSAT to UMAX+SUB, also for vectors"Nikita Popov2019-01-142-28/+4
| | | | | | | | | This reverts commit r351125. I missed test changes in an SLPVectorizer test, due to the cost model changes. Reverting for now. llvm-svn: 351129
* [CodeGen][X86] Expand USUBSAT to UMAX+SUB, also for vectorsNikita Popov2019-01-142-4/+28
| | | | | | | | | | | Related to https://bugs.llvm.org/show_bug.cgi?id=40123. Rather than scalarizing, expand a vector USUBSAT into UMAX+SUB, which produces much better code for X86. Differential Revision: https://reviews.llvm.org/D56636 llvm-svn: 351125
* Reland "Refactor GetRegistersForValue. NFCI."Nirav Dave2019-01-141-55/+44
| | | | | | Remove over-strictification class membership check. llvm-svn: 351074
* [DAGCombiner] Add (sub_sat x, x) -> 0 combineSimon Pilgrim2019-01-141-0/+4
| | | | llvm-svn: 351073
* [DAGCombiner] Enable sub saturation constant foldingSimon Pilgrim2019-01-142-1/+8
| | | | llvm-svn: 351072
* [DAGCombiner] Add add/sub saturation undef handlingSimon Pilgrim2019-01-142-0/+14
| | | | | | | | Match ConstantFolding.cpp: (add_sat x, undef) -> -1 (sub_sat x, undef) -> 0 llvm-svn: 351070
* [DAGCombiner] Enable add saturation constant foldingSimon Pilgrim2019-01-142-2/+5
| | | | llvm-svn: 351060
* [DAGCombiner] Add add saturation constant folding tests.Simon Pilgrim2019-01-141-2/+3
| | | | | | Exposes an issue with sadd_sat for computeOverflowKind, so I've disabled it for now. llvm-svn: 351057
* [SelectionDAG] Add type sanity assertions for add/sub saturation node creation.Simon Pilgrim2019-01-141-0/+4
| | | | llvm-svn: 351055
* [DAGCombiner] If add_sat(x,y) can't overflow -> add(x,y)Simon Pilgrim2019-01-131-0/+4
| | | | | NOTE: We need more powerful signed overflow detection in computeOverflowKind llvm-svn: 351026
* Fix unused variable warning. NFCI.Simon Pilgrim2019-01-131-1/+0
| | | | llvm-svn: 351025
* [DAGCombiner] Some very basic add/sub saturation combines.Simon Pilgrim2019-01-131-0/+64
| | | | | | Handle combines with zero and constant canonicalization for adds. llvm-svn: 351024
* [LegalizeDAG] Remove 'NeedInvert' code from expansion of BR_CC. Replace with ↵Craig Topper2019-01-131-4/+1
| | | | | | | | | | | | an assert. I accidentally triggered this code while doing some experiments and it doesn't look lke it could possibly work. It calls 'getNOT' on a node that should be a CondCode. I think to do this right we would need to swap the branch target and the fallthrough target. But that's not easy to do. Or we could create an explicit SetCC and feed that into a new BR_CC? llvm-svn: 351022
* [X86] Rename overly verbose method; NFCNikita Popov2019-01-133-8/+5
| | | | | | As suggested on D56636. llvm-svn: 351021
* [DAGCombiner] fold insert_subvector of insert_subvectorSanjay Patel2019-01-121-0/+8
| | | | | | | | | | | | | | | | | | | This pattern: t33: v8i32 = insert_subvector undef:v8i32, t35, Constant:i64<0> t21: v16i32 = insert_subvector undef:v16i32, t33, Constant:i64<0> ...shows up in PR33758: https://bugs.llvm.org/show_bug.cgi?id=33758 ...although this patch doesn't make any difference to the final result on that yet. In the affected tests here, it looks like it just makes RA wiggle. But we might as well squash this to prevent it interfering with other pattern-matching. Differential Revision: https://reviews.llvm.org/D56604 llvm-svn: 351008
* Use getShiftAmountTy for shift amounts.Simon Pilgrim2019-01-121-1/+2
| | | | llvm-svn: 351005
* [X86][AARCH64] Improve ISD::ABS supportSimon Pilgrim2019-01-123-0/+43
| | | | | | | | This patch takes some of the code from D49837 to allow us to enable ISD::ABS support for all SSE vector types. Differential Revision: https://reviews.llvm.org/D56544 llvm-svn: 350998
* [Legalizer] Use correct ValueType of SELECT_CC node during Float promotionPirama Arumuga Nainar2019-01-111-3/+3
| | | | | | | | | | | | | | | | | Summary: When legalizing the result of a SELECT_CC node by promoting the floating-point type, use the promoted-to type rather than the original type. Fix PR40273. Reviewers: efriedma, majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56566 llvm-svn: 350951
* Revert "[SelectionDAGBuilder] Refactor GetRegistersForValue. NFCI."Martin Storsjo2019-01-111-42/+60
| | | | | | | This reverts commit r350841, as it actually had functional changes and broke compilation. See PR40290. llvm-svn: 350921
* [DAGCombiner] simplify code; NFCSanjay Patel2019-01-101-11/+11
| | | | llvm-svn: 350844
* [SelectionDAGBuilder] Refactor GetRegistersForValue. NFCI.Nirav Dave2019-01-101-60/+42
| | | | llvm-svn: 350841
* [SelectionDAGBuilder] Fix formatting. NFC.Nirav Dave2019-01-101-1/+2
| | | | llvm-svn: 350839
* [SelectionDAGBuilder] Refactor visitInlineAsm. NFC.Nirav Dave2019-01-101-45/+24
| | | | llvm-svn: 350837
* [opaque pointer types] Remove some calls to generic Type subtype accessors.James Y Knight2019-01-101-7/+7
| | | | | | | | | | | | That is, remove many of the calls to Type::getNumContainedTypes(), Type::subtypes(), and Type::getContainedType(N). I'm not intending to remove these accessors -- they are useful/necessary in some cases. However, removing the pointee type from pointers would potentially break some uses, and reducing the number of calls makes it easier to audit. llvm-svn: 350835
* Remove check for single use in ShrinkDemandedConstantStanislav Mekhanoshin2019-01-091-3/+0
| | | | | | | | | | | | | | | This removes check for single use from general ShrinkDemandedConstant to the BE because of the AArch64 regression after D56289/rL350475. After several hours of experiments I did not come up with a testcase failing on any other targets if check is not performed. Moreover, direct call to ShrinkDemandedConstant is not really needed and superceed by SimplifyDemandedBits. Differential Revision: https://reviews.llvm.org/D56406 llvm-svn: 350684
* [TargetLowering][AMDGPU] Remove the SimplifyDemandedBits function that takes ↵Craig Topper2019-01-071-50/+0
| | | | | | | | | | | | | | a User and OpIdx. Stop using it in AMDGPU target for simplifyI24. As we saw in D56057 when we tried to use this function on X86, it's unsafe. It allows the operand node to have multiple users, but doesn't prevent recursing past the first node when it does have multiple users. This can cause other simplifications earlier in the graph without regard to what bits are needed by the other users of the first node. Ideally all we should do to the first node if it has multiple uses is bypass it when its not needed by the user we started from. Doing any other transformation that SimplifyDemandedBits can do like turning ZEXT/SEXT into AEXT would result in an increase in instructions. Fortunately, we already have a function that can do just that, GetDemandedBits. It will only make transformations that involve bypassing a node. This patch changes AMDGPU's simplifyI24, to use a combination of GetDemandedBits to handle the multiple use simplifications. And then uses the regular SimplifyDemandedBits on each operand to handle simplifications allowed when the operand only has a single use. Unfortunately, GetDemandedBits simplifies constants more aggressively than SimplifyDemandedBits. This caused the -7 constant in the changed test to be simplified to remove the upper bits. I had to modify computeKnownBits to account for this by ignoring the upper 8 bits of the input. Differential Revision: https://reviews.llvm.org/D56087 llvm-svn: 350560
* [LegalizeVectorOps] Add FSHL/FSHR to the list of vector operations that ↵Craig Topper2019-01-061-0/+2
| | | | | | | | should be handled. The FSHL/FSHR nodes are handled in the expand function, but they need to also be listed in the code that queries for the operation action too. llvm-svn: 350490
* Added single use check to ShrinkDemandedConstantStanislav Mekhanoshin2019-01-051-0/+3
| | | | | | | | | Fixes cvt_f32_ubyte combine. performCvtF32UByteNCombine() could shrink source node to demanded bits only even if there are other uses. Differential Revision: https://reviews.llvm.org/D56289 llvm-svn: 350475
OpenPOWER on IntegriCloud