summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG
Commit message (Collapse)AuthorAgeFilesLines
* Merging r339536:Hans Wennborg2018-08-161-8/+11
| | | | | | | | | | | | ------------------------------------------------------------------------ r339536 | ctopper | 2018-08-13 08:53:49 +0200 (Mon, 13 Aug 2018) | 3 lines [SelectionDAG] In PromoteFloatOp_BITCAST, insert a bitcast after the fp_to_fp16 in case the result type isn't a scalar integer. This is another variation of PR38533. In this case, the result type of the bitcast is legal and 16-bits wide, but not a scalar integer. So we need to emit the convert to i16 and then bitcast it to the true result type. This new bitcast will be further type legalized if necessary. ------------------------------------------------------------------------ llvm-svn: 339857
* Merging r339535:Hans Wennborg2018-08-161-2/+2
| | | | | | | | | | | | | | ------------------------------------------------------------------------ r339535 | ctopper | 2018-08-13 08:53:47 +0200 (Mon, 13 Aug 2018) | 5 lines [SelectionDAG] In PromoteIntRes_BITCAST, when the input is TypePromoteFloat, make sure the output type is scalar. For vectors, use a store and load of temporary. Previously if the result type was a vector, we emitted a FP_TO_FP16 with a vector result type which isn't valid. This is basically the opposite case of the root cause of PR38533. ------------------------------------------------------------------------ llvm-svn: 339856
* Merging r339533:Hans Wennborg2018-08-161-2/+4
| | | | | | | | | | | | | | ------------------------------------------------------------------------ r339533 | ctopper | 2018-08-13 07:26:49 +0200 (Mon, 13 Aug 2018) | 5 lines [SelectionDAG] In PromoteFloatRes_BITCAST, insert a bitcast before the fp16_to_fp in case the input type isn't an i16. The bitcast can be further legalized as needed. Fixes PR38533. ------------------------------------------------------------------------ llvm-svn: 339855
* Merging r339600:Hans Wennborg2018-08-141-2/+2
| | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r339600 | scott.linder | 2018-08-13 20:44:21 +0200 (Mon, 13 Aug 2018) | 8 lines [CodeGen] Fix assert in SelectionDAG::computeKnownBits Fix SelectionDAG::computeKnownBits asserting when handling EXTRACT_SUBVECTOR when zero extending the demanded elements mask if it is already as long as the source vector. Differential Revision: https://reviews.llvm.org/D49574 ------------------------------------------------------------------------ llvm-svn: 339664
* Merging r339225:Hans Wennborg2018-08-131-16/+37
| | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r339225 | thopre | 2018-08-08 11:35:26 +0200 (Wed, 08 Aug 2018) | 11 lines Support inline asm with multiple 64bit output in 32bit GPR Summary: Extend fix for PR34170 to support inline assembly with multiple output operands that do not naturally go in the register class it is constrained to (eg. double in a 32-bit GPR as in the PR). Reviewers: bogner, t.p.northover, lattner, javed.absar, efriedma Reviewed By: efriedma Subscribers: efriedma, tra, eraman, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D45437 ------------------------------------------------------------------------ llvm-svn: 339539
* Merging r338915:Hans Wennborg2018-08-071-11/+28
| | | | | | | | | | | | | | ------------------------------------------------------------------------ r338915 | ctopper | 2018-08-03 22:14:18 +0200 (Fri, 03 Aug 2018) | 5 lines [SelectionDAG] Teach LegalizeVectorTypes to widen the mask input to a masked store. The mask operand is visited before the data operand so we need to be able to widen it. Fixes PR38436. ------------------------------------------------------------------------ llvm-svn: 339106
* Merging r338665:Hans Wennborg2018-08-071-16/+12
| | | | | | | | | | | | | | | | | | | | ------------------------------------------------------------------------ r338665 | lliu0 | 2018-08-02 03:54:12 +0200 (Thu, 02 Aug 2018) | 11 lines Fix FCOPYSIGN expansion In expansion of FCOPYSIGN, the shift node is missing when the two operands of FCOPYSIGN are of the same size. We should always generate shift node (if the required shift bit is not zero) to put the sign bit into the right position, regardless of the size of underlying types. Differential Revision: https://reviews.llvm.org/D49973 ------------------------------------------------------------------------ llvm-svn: 339098
* DAG: Correct pointer type used for stack slotMatt Arsenault2018-07-311-1/+2
| | | | | | | | | | Correct the address space for the inserted argument stack slot. AMDGPU seems to not do anything with this information, so I don't think this was breaking anything. llvm-svn: 338428
* DAG: Fix PromoteFloatResult for fcanonicalizeMatt Arsenault2018-07-311-1/+2
| | | | llvm-svn: 338382
* Test commit.Hsiangkai Wang2018-07-311-1/+1
| | | | llvm-svn: 338352
* [DAGCombiner][TargetLowering] Pass a SmallVector instead of a std::vector to ↵Craig Topper2018-07-302-9/+8
| | | | | | | | BuildSDIV/BuildUDIV/etc. The vector contains the SDNodes that these functions create. The number of nodes is always a small number so we should use SmallVector to avoid a heap allocation. llvm-svn: 338329
* [DAGCombiner] transform sub-of-shifted-signbit to addSanjay Patel2018-07-301-0/+11
| | | | | | | | | | | | | | | | This is exchanging a sub-of-1 with add-of-minus-1: https://rise4fun.com/Alive/plKAH This is another step towards improving select-of-constants codegen (see D48970). x86 is the motivating target, and those diffs all appear to be wins. PPC and AArch64 look neutral. I've limited this to early combining (!LegalOperations) in case a target wants to reverse it, but I think canonicalizing to 'add' is more likely to produce further transforms because we have more folds for 'add'. Differential Revision: https://reviews.llvm.org/D49924 llvm-svn: 338317
* [TargetLowering] In BuildSDIV, add the MULHS/SMUL_LOHI to the Created vector.Craig Topper2018-07-301-0/+3
| | | | | | BuildUDIV was already correct. llvm-svn: 338304
* [DAGCombiner][PowerPC][AArch64] Pass Created vector by reference to ↵Craig Topper2018-07-302-2/+2
| | | | | | BuildSDIVPow2. llvm-svn: 338303
* Revert r338222 "[DAGCombiner] Remove unnecessary calls to AddToWorklist."Craig Topper2018-07-301-8/+46
| | | | | | | | Thinking about it more it might be possible for the later nodes to be folded in getNode in such a way that the other created nodes are left dead. This can cause use counts to be incorrect on nodes that aren't dead. So its probably safer to leave this alone. llvm-svn: 338298
* Remove trailing spaceFangrui Song2018-07-308-43/+43
| | | | | | sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h} llvm-svn: 338293
* [DAGCombiner] Bug 31275- Extract a shift from a constant mul or udiv if a ↵David Bolvansky2018-07-301-17/+156
| | | | | | | | | | | | | | | | | | | rotate can be formed Summary: Attempt to extract a shrl from a udiv or a shl from a mul if this allows a rotate to be formed. This targets cases where the input to a rotate pattern was a mul or udiv by a constant and InstCombine merged one of the shifts with the op. Patch by: sameconrad (Sam Conrad) Reviewers: RKSimon, craig.topper, spatel, lebedev.ri, javed.absar Reviewed By: lebedev.ri Subscribers: efriedma, kparzysz, llvm-commits Differential Revision: https://reviews.llvm.org/D47681 llvm-svn: 338270
* Reapply "Fix crash on inline asm with 64bit matching input in 32bit GPR"Thomas Preud'homme2018-07-301-9/+23
| | | | | | | | | | | | This reapplies commit r338206 reverted by r338214 since the bug that r338206 uncovered has been fixed in r338268. Add support for inline assembly with matching input operand that do not naturally go in the register class it is constrained to (eg. double in a 32-bit GPR). Note that regular input is already handled by existing code. llvm-svn: 338269
* [DAGCombiner] Remove unnecessary calls to AddToWorklist.Craig Topper2018-07-291-46/+8
| | | | | | | | The DAGCombiner has a mechanism for ensuring all nodes have been visited at least once. Every time a node is visited, it makes sure its operands have been in the worklist at least once. This ensures that when multiple nodes are created by a combine, only the last node needs to be returned. The earlier nodes can all be found Through this operand check. These means we don't need to explicitly add nodes to the worklist when a combine creates multiple nodes. I've removed the most obvious cases here. There are probably more than can be removed. llvm-svn: 338222
* revert r338206 because the test does not passSanjay Patel2018-07-291-23/+9
| | | | | | | Example of bot failure: http://lab.llvm.org:8011/builders/clang-cmake-armv8-quick/builds/5107/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Ainline-asm-operand-implicit-cast.ll llvm-svn: 338214
* Fix crash on inline asm with 64bit matching input in 32bit GPRThomas Preud'homme2018-07-281-9/+23
| | | | | | | | | Add support for inline assembly with matching input operand that do not naturally go in the register class it is constrained to (eg. double in a 32-bit GPR). Note that regular input is already handled by existing code. llvm-svn: 338206
* [SelectionDAG] Pass std::vector by reference instead of by pointer to ↵Craig Topper2018-07-282-18/+14
| | | | | | | | | | BuildSDIV/BuildUDIV. This removes the need for an assert to ensure the pointer isn't null. Years ago we had ifs the checked the pointer was non-null before very access to the vector. These checks were removed and replaced with a single assert. But a reference seems more suitable here. llvm-svn: 338205
* DAG: Add calling convention argument to calling convention funcsMatt Arsenault2018-07-285-104/+128
| | | | | | | | This seems like a pretty glaring omission, and AMDGPU wants to treat kernels differently from other calling conventions. llvm-svn: 338194
* [DAGCombiner] Teach DAG combiner that A-(B-C) can be folded to A+(C-B)Craig Topper2018-07-281-0/+6
| | | | | | | | This can be useful since addition is commutable, and subtraction is not. This matches a transform that is also done by InstCombine. llvm-svn: 338181
* [DAGCombiner] fold 'not' with signbit mathSanjay Patel2018-07-271-0/+45
| | | | | | | | | | | | | | | | | | | This is a follow-up suggested in D48970. Alive proofs: https://rise4fun.com/Alive/sII We can eliminate an instruction in the usual select-of-constants to bit hack transform by adjusting the add/sub with constant. This is always a win. There are more transforms that are likely wins, but they may need target hooks in case some targets do not benefit. This is another step towards making up for canonicalizing to select-of-constants in rL331486. llvm-svn: 338132
* DAG: Remove unnecessary .str()Matt Arsenault2018-07-271-1/+1
| | | | llvm-svn: 338112
* [SelectionDAGBuilder] Add masked loads to PendingLoads rather than calling ↵Craig Topper2018-07-261-4/+2
| | | | | | | | | | DAG.setRoot. Masked loads are calling DAG.getRoot rather than calling SelectionDAGBuilder::getRoot, which means the PendingLoads weren't emptied to update the root and create any needed TokenFactor. So it would be incorrect to call setRoot for the masked load. This patch instead adds the masked load to PendingLoads so that the root doesn't get update until a store or scatter or something happens.. Alternatively, we could call SelectionDAGBuilder::getRoot before it, but that would create unnecessary serialization. llvm-svn: 338085
* [SelectionDAG] Add MLOAD/MSTORE/MGATHER/MSCATTER to AddNodeIDCustom to ↵Craig Topper2018-07-261-0/+28
| | | | | | properly calculate their folding set ID to allow them to be CSEd. llvm-svn: 338080
* [DAGCombiner] Remove some calls to AddToWorklist that should be unnecessary.Craig Topper2018-07-261-3/+0
| | | | | | The DAGCombiner has a system for ensuring all nodes are visited. It doesn't require an AddToWorkList for every node that is created by a combine. llvm-svn: 338079
* [DebugInfo] LowerDbgDeclare: Add derefs when handling CallInst usersVedant Kumar2018-07-264-33/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LowerDbgDeclare inserts a dbg.value before each use of an address described by a dbg.declare. When inserting a dbg.value before a CallInst use, however, it fails to append DW_OP_deref to the DIExpression. The DW_OP_deref is needed to reflect the fact that a dbg.value describes a source variable directly (as opposed to a dbg.declare, which relies on pointer indirection). This patch adds in the DW_OP_deref where needed. This results in the correct values being shown during a debug session for a program compiled with ASan and optimizations (see https://reviews.llvm.org/D49520). Note that ConvertDebugDeclareToDebugValue is already correct -- no changes there were needed. One complication is that SelectionDAG is unable to distinguish between direct and indirect frame-index (FRAMEIX) SDDbgValues. This patch also fixes this long-standing issue in order to not regress integration tests relying on the incorrect assumption that all frame-index SDDbgValues are indirect. This is a necessary fix: the newly-added DW_OP_derefs cannot be lowered properly otherwise. Basically the fix prevents a direct SDDbgValue with DIExpression(DW_OP_deref) from being dereferenced twice by a debugger. There were a handful of tests relying on this incorrect "FRAMEIX => indirect" assumption which actually had incorrect DW_AT_locations: these are all fixed up in this patch. Testing: - check-llvm, and an end-to-end test using lldb to debug an optimized program. - Existing unit tests for DIExpression::appendToStack fully cover the new DIExpression::append utility. - check-debuginfo (the debug info integration tests) Differential Revision: https://reviews.llvm.org/D49454 llvm-svn: 338069
* [DAGCombine] optimizeSetCCOfSignedTruncationCheck(): handle ule,ugt CondCodes.Roman Lebedev2018-07-261-9/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: A follow-up for D49266 / rL337166. At least one of these cases is more canonical, so we really do have to handle it. https://godbolt.org/g/pkzP3X https://rise4fun.com/Alive/pQyhZZ We won't get to these cases with I1 being -1, as that will be constant-folded to true or false. I'm also not sure we actually hit the 'ule' case, but i think the worst think that could happen is that being dead code. Reviewers: spatel, craig.topper, RKSimon, javed.absar, efriedma Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49497 llvm-svn: 338044
* [SelectionDAG] try to convert funnel shift directly to rotate if legalSanjay Patel2018-07-251-1/+10
| | | | | | | | | | | If the DAGCombiner's rotate matching was working as expected, I don't think we'd see any test diffs here. This sidesteps the issue of custom lowering for rotates raised in PR38243: https://bugs.llvm.org/show_bug.cgi?id=38243 ...by only dealing with legal operations. llvm-svn: 337966
* Fix corruption of result number in LegalizeVectorOps.cppUlrich Weigand2018-07-251-1/+2
| | | | | | | | | | | | When VectorLegalizer::LegalizeOp creates a new SDValue after iterating over its arguments, we need to refer to the same result number of the new node that the original value used. Reviewed by: cameron.mcinally Differential Revision: https://reviews.llvm.org/D49805 llvm-svn: 337939
* Fix PR34170: Crash on inline asm with 64bit output in 32bit GPRThomas Preud'homme2018-07-251-20/+36
| | | | | | | | Add support for inline assembly with output operand that do not naturally go in the register class it is constrained to (eg. double in a 32-bit GPR as in the PR). llvm-svn: 337903
* [SelectionDAG] Reduce DanglingDebugInfo memory traffic, NFCVedant Kumar2018-07-231-19/+23
| | | | | | | This avoids approx. 2 x 10^5 DenseMap insertions in both non-debug and debug -O2 builds of the sqlite3 amalgamation. llvm-svn: 337751
* [Legalize] Elide MERGE_VALUES created by scalarizeVectorLoad.Nirav Dave2018-07-232-3/+10
| | | | | | | scalarizeVectorLoad creates MERGE_VALUES nodes which are immediately decomposed in expandLoad. Elide the node in these cases. llvm-svn: 337708
* [FPEnv] Legalize double width StrictFP vector operationsCameron McInally2018-07-232-0/+70
| | | | | | Differential Revision: https://reviews.llvm.org/D48809 llvm-svn: 337698
* [SelectionDAGBuilder] Use APInt::isZero instead of comparing ↵Craig Topper2018-07-221-1/+1
| | | | | | | | APInt::getZExtValue to 0 in a place where we can't be sure contents of the APInt fit in a uint64_t. This is used on an extract vector element index which is most cases is going to be an i32 or i64 and the element will be a valid element number. But it is possible to construct IR with a larger type and large out of range value. llvm-svn: 337652
* [SelectionDAGBuilder] Restrict vector reduction check to types with a power ↵Craig Topper2018-07-221-0/+4
| | | | | | | | of 2 number of elements. The check for the shuffles usages probably isn't correct for non power of 2 vectors. llvm-svn: 337651
* [DAG] Avoid Node Update assertion due to AND simplificationNirav Dave2018-07-201-3/+5
| | | | | | | | | | | | | | | | | Check for construction-time folding for incomplete AND nodes in BackwardsPropagateMask. Fixes PR38185. Reviewers: RKSimon, samparker Reviewed By: samparker Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D49444 llvm-svn: 337563
* [DAG] Fix Memory ordering check in ReduceLoadOpStore.Nirav Dave2018-07-201-16/+18
| | | | | | | | | | | | | | | | | | When merging through a TokenFactor we need to check that the load may be ordered such that no other aliasing memory operations may happen. It is not sufficient to just check that the load is a member of the chain token factor as it there may be a indirect chain. Require the load's chain has only one use. This fixes PR37826. Reviewers: spatel, davide, efriedma, craig.topper, RKSimon Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D49388 llvm-svn: 337560
* [DAGCombiner] Fold X - (-Y *Z) -> X + (Y * Z)Craig Topper2018-07-201-0/+18
| | | | llvm-svn: 337518
* Skip out of SimplifyDemandedBits for BITCAST of f16 to i16Stephen Canon2018-07-191-0/+1
| | | | | | | | Mirrors the existing exit path for f128, avoiding a crash later on. Differential Revision: https://reviews.llvm.org/D49524 llvm-svn: 337506
* [DAGCombiner] Teach DAGCombiner that A-(-B) is A+B.Craig Topper2018-07-191-0/+5
| | | | | | We already knew A+(-B) is A-B in visitAdd. This does the opposite for visitSub. llvm-svn: 337502
* [ScheduleDAG] Fix unfolding of SUnits to already existent nodes.Nirav Dave2018-07-181-18/+30
| | | | | | | | | | | | | | | | | Summary: If unfolding an SUnit results in both load or the operation using it which already exist in the DAG, abort the unfold if they are already scheduled. If not, make sure we don't add duplicate dependencies. This fixes PR37916. Reviewers: davide, eli.friedman, fhahn, bogner Subscribers: MatzeB, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D48666 llvm-svn: 337409
* [DAGCombiner] Call SimplifyDemandedVectorElts from EXTRACT_VECTOR_ELTSimon Pilgrim2018-07-171-4/+23
| | | | | | | | If we are only extracting vector elements via EXTRACT_VECTOR_ELT(s) we may be able to use SimplifyDemandedVectorElts to avoid unnecessary vector ops. Differential Revision: https://reviews.llvm.org/D49262 llvm-svn: 337258
* [Intrinsics] define funnel shift IR intrinsics + DAG builder supportSanjay Patel2018-07-161-0/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | As discussed here: http://lists.llvm.org/pipermail/llvm-dev/2018-May/123292.html http://lists.llvm.org/pipermail/llvm-dev/2018-July/124400.html We want to add rotate intrinsics because the IR expansion of that pattern is 4+ instructions, and we can lose pieces of the pattern before it gets to the backend. Generalizing the operation by allowing 2 different input values (plus the 3rd shift/rotate amount) gives us a "funnel shift" operation which may also be a single hardware instruction. Initially, I thought we needed to define new DAG nodes for these ops, and I spent time working on that (much larger patch), but then I concluded that we don't need it. At least as a first step, we have all of the backend support necessary to match these ops...because it was required. And shepherding these through the IR optimizer is the primary concern, so the IR intrinsics are likely all that we'll ever need. There was also a question about converting the intrinsics to the existing ROTL/ROTR DAG nodes (along with improving the oversized shift documentation). Again, I don't think that's strictly necessary (as the test results here prove). That can be an efficiency improvement as a small follow-up patch. So all we're left with is documentation, definition of the IR intrinsics, and DAG builder support. Differential Revision: https://reviews.llvm.org/D49242 llvm-svn: 337221
* [CodeGen] Fix inconsistent declaration parameter nameFangrui Song2018-07-165-32/+32
| | | | llvm-svn: 337200
* [X86][AArch64][DAGCombine] Unfold 'check for [no] signed truncation' patternRoman Lebedev2018-07-161-0/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: [[ https://bugs.llvm.org/show_bug.cgi?id=38149 | PR38149 ]] As discussed in https://reviews.llvm.org/D49179#1158957 and later, the IR for 'check for [no] signed truncation' pattern can be improved: https://rise4fun.com/Alive/gBf ^ that pattern will be produced by Implicit Integer Truncation sanitizer, https://reviews.llvm.org/D48958 https://bugs.llvm.org/show_bug.cgi?id=21530 in signed case, therefore it is probably a good idea to improve it. But the IR-optimal patter does not lower efficiently, so we want to undo it.. This handles the simple pattern. There is a second pattern with predicate and constants inverted. NOTE: we do not check uses here. we always do the transform. Reviewers: spatel, craig.topper, RKSimon, javed.absar Reviewed By: spatel Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D49266 llvm-svn: 337166
* Avoid losing Hi part when expanding VAARG nodes on big endian machinesDaniel Cederman2018-07-161-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | Summary: If the high part of the load is not used the offset to the next element will not be set correctly. For example, on Sparc V8, the following code will read val2 from offset 4 instead of 8. ``` int val = __builtin_va_arg(va, long long); int val2 = __builtin_va_arg(va, int); ``` Reviewers: jyknight Reviewed By: jyknight Subscribers: fedor.sergeev, jrtc27, llvm-commits Differential Revision: https://reviews.llvm.org/D48595 llvm-svn: 337161
OpenPOWER on IntegriCloud