summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG
Commit message (Collapse)AuthorAgeFilesLines
...
* [NFC] small addendum to r334242, FMF propagationMichael Berg2018-08-011-1/+1
| | | | llvm-svn: 338604
* [SelectionDAG] fix bug in translating funnel shift with non-power-of-2 typeSanjay Patel2018-08-011-31/+39
| | | | | | | | | | | | | | | | | | The bug is visible in the constant-folded x86 tests. We can't use the negated shift amount when the type is not power-of-2: https://rise4fun.com/Alive/US1r ...so in that case, use the regular lowering that includes a select to guard against a shift-by-bitwidth. This path is improved by only calculating the modulo shift amount once now. Also, improve the rotate (with power-of-2 size) lowering to use a negate rather than subtract from bitwidth. This improves the codegen whether we have a rotate instruction or not (although we can still see that we're not matching to a legal rotate in all cases). llvm-svn: 338592
* [SelectionDAG] Make binop reduction matcher available to all targetsSimon Pilgrim2018-08-011-0/+58
| | | | | | | | There is nothing x86-specific about this code, so it'd be nice to make this available for other targets to use in the future (and get it out of X86ISelLowering!). Differential Revision: https://reviews.llvm.org/D50083 llvm-svn: 338586
* [FPEnv] Widen illegal width StrictFP vector operations as neededCameron McInally2018-08-012-58/+205
| | | | | | Differential Revision: https://reviews.llvm.org/D49806 llvm-svn: 338562
* DAG: Correct pointer type used for stack slotMatt Arsenault2018-07-311-1/+2
| | | | | | | | | | Correct the address space for the inserted argument stack slot. AMDGPU seems to not do anything with this information, so I don't think this was breaking anything. llvm-svn: 338428
* DAG: Fix PromoteFloatResult for fcanonicalizeMatt Arsenault2018-07-311-1/+2
| | | | llvm-svn: 338382
* Test commit.Hsiangkai Wang2018-07-311-1/+1
| | | | llvm-svn: 338352
* [DAGCombiner][TargetLowering] Pass a SmallVector instead of a std::vector to ↵Craig Topper2018-07-302-9/+8
| | | | | | | | BuildSDIV/BuildUDIV/etc. The vector contains the SDNodes that these functions create. The number of nodes is always a small number so we should use SmallVector to avoid a heap allocation. llvm-svn: 338329
* [DAGCombiner] transform sub-of-shifted-signbit to addSanjay Patel2018-07-301-0/+11
| | | | | | | | | | | | | | | | This is exchanging a sub-of-1 with add-of-minus-1: https://rise4fun.com/Alive/plKAH This is another step towards improving select-of-constants codegen (see D48970). x86 is the motivating target, and those diffs all appear to be wins. PPC and AArch64 look neutral. I've limited this to early combining (!LegalOperations) in case a target wants to reverse it, but I think canonicalizing to 'add' is more likely to produce further transforms because we have more folds for 'add'. Differential Revision: https://reviews.llvm.org/D49924 llvm-svn: 338317
* [TargetLowering] In BuildSDIV, add the MULHS/SMUL_LOHI to the Created vector.Craig Topper2018-07-301-0/+3
| | | | | | BuildUDIV was already correct. llvm-svn: 338304
* [DAGCombiner][PowerPC][AArch64] Pass Created vector by reference to ↵Craig Topper2018-07-302-2/+2
| | | | | | BuildSDIVPow2. llvm-svn: 338303
* Revert r338222 "[DAGCombiner] Remove unnecessary calls to AddToWorklist."Craig Topper2018-07-301-8/+46
| | | | | | | | Thinking about it more it might be possible for the later nodes to be folded in getNode in such a way that the other created nodes are left dead. This can cause use counts to be incorrect on nodes that aren't dead. So its probably safer to leave this alone. llvm-svn: 338298
* Remove trailing spaceFangrui Song2018-07-308-43/+43
| | | | | | sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h} llvm-svn: 338293
* [DAGCombiner] Bug 31275- Extract a shift from a constant mul or udiv if a ↵David Bolvansky2018-07-301-17/+156
| | | | | | | | | | | | | | | | | | | rotate can be formed Summary: Attempt to extract a shrl from a udiv or a shl from a mul if this allows a rotate to be formed. This targets cases where the input to a rotate pattern was a mul or udiv by a constant and InstCombine merged one of the shifts with the op. Patch by: sameconrad (Sam Conrad) Reviewers: RKSimon, craig.topper, spatel, lebedev.ri, javed.absar Reviewed By: lebedev.ri Subscribers: efriedma, kparzysz, llvm-commits Differential Revision: https://reviews.llvm.org/D47681 llvm-svn: 338270
* Reapply "Fix crash on inline asm with 64bit matching input in 32bit GPR"Thomas Preud'homme2018-07-301-9/+23
| | | | | | | | | | | | This reapplies commit r338206 reverted by r338214 since the bug that r338206 uncovered has been fixed in r338268. Add support for inline assembly with matching input operand that do not naturally go in the register class it is constrained to (eg. double in a 32-bit GPR). Note that regular input is already handled by existing code. llvm-svn: 338269
* [DAGCombiner] Remove unnecessary calls to AddToWorklist.Craig Topper2018-07-291-46/+8
| | | | | | | | The DAGCombiner has a mechanism for ensuring all nodes have been visited at least once. Every time a node is visited, it makes sure its operands have been in the worklist at least once. This ensures that when multiple nodes are created by a combine, only the last node needs to be returned. The earlier nodes can all be found Through this operand check. These means we don't need to explicitly add nodes to the worklist when a combine creates multiple nodes. I've removed the most obvious cases here. There are probably more than can be removed. llvm-svn: 338222
* revert r338206 because the test does not passSanjay Patel2018-07-291-23/+9
| | | | | | | Example of bot failure: http://lab.llvm.org:8011/builders/clang-cmake-armv8-quick/builds/5107/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Ainline-asm-operand-implicit-cast.ll llvm-svn: 338214
* Fix crash on inline asm with 64bit matching input in 32bit GPRThomas Preud'homme2018-07-281-9/+23
| | | | | | | | | Add support for inline assembly with matching input operand that do not naturally go in the register class it is constrained to (eg. double in a 32-bit GPR). Note that regular input is already handled by existing code. llvm-svn: 338206
* [SelectionDAG] Pass std::vector by reference instead of by pointer to ↵Craig Topper2018-07-282-18/+14
| | | | | | | | | | BuildSDIV/BuildUDIV. This removes the need for an assert to ensure the pointer isn't null. Years ago we had ifs the checked the pointer was non-null before very access to the vector. These checks were removed and replaced with a single assert. But a reference seems more suitable here. llvm-svn: 338205
* DAG: Add calling convention argument to calling convention funcsMatt Arsenault2018-07-285-104/+128
| | | | | | | | This seems like a pretty glaring omission, and AMDGPU wants to treat kernels differently from other calling conventions. llvm-svn: 338194
* [DAGCombiner] Teach DAG combiner that A-(B-C) can be folded to A+(C-B)Craig Topper2018-07-281-0/+6
| | | | | | | | This can be useful since addition is commutable, and subtraction is not. This matches a transform that is also done by InstCombine. llvm-svn: 338181
* [DAGCombiner] fold 'not' with signbit mathSanjay Patel2018-07-271-0/+45
| | | | | | | | | | | | | | | | | | | This is a follow-up suggested in D48970. Alive proofs: https://rise4fun.com/Alive/sII We can eliminate an instruction in the usual select-of-constants to bit hack transform by adjusting the add/sub with constant. This is always a win. There are more transforms that are likely wins, but they may need target hooks in case some targets do not benefit. This is another step towards making up for canonicalizing to select-of-constants in rL331486. llvm-svn: 338132
* DAG: Remove unnecessary .str()Matt Arsenault2018-07-271-1/+1
| | | | llvm-svn: 338112
* [SelectionDAGBuilder] Add masked loads to PendingLoads rather than calling ↵Craig Topper2018-07-261-4/+2
| | | | | | | | | | DAG.setRoot. Masked loads are calling DAG.getRoot rather than calling SelectionDAGBuilder::getRoot, which means the PendingLoads weren't emptied to update the root and create any needed TokenFactor. So it would be incorrect to call setRoot for the masked load. This patch instead adds the masked load to PendingLoads so that the root doesn't get update until a store or scatter or something happens.. Alternatively, we could call SelectionDAGBuilder::getRoot before it, but that would create unnecessary serialization. llvm-svn: 338085
* [SelectionDAG] Add MLOAD/MSTORE/MGATHER/MSCATTER to AddNodeIDCustom to ↵Craig Topper2018-07-261-0/+28
| | | | | | properly calculate their folding set ID to allow them to be CSEd. llvm-svn: 338080
* [DAGCombiner] Remove some calls to AddToWorklist that should be unnecessary.Craig Topper2018-07-261-3/+0
| | | | | | The DAGCombiner has a system for ensuring all nodes are visited. It doesn't require an AddToWorkList for every node that is created by a combine. llvm-svn: 338079
* [DebugInfo] LowerDbgDeclare: Add derefs when handling CallInst usersVedant Kumar2018-07-264-33/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LowerDbgDeclare inserts a dbg.value before each use of an address described by a dbg.declare. When inserting a dbg.value before a CallInst use, however, it fails to append DW_OP_deref to the DIExpression. The DW_OP_deref is needed to reflect the fact that a dbg.value describes a source variable directly (as opposed to a dbg.declare, which relies on pointer indirection). This patch adds in the DW_OP_deref where needed. This results in the correct values being shown during a debug session for a program compiled with ASan and optimizations (see https://reviews.llvm.org/D49520). Note that ConvertDebugDeclareToDebugValue is already correct -- no changes there were needed. One complication is that SelectionDAG is unable to distinguish between direct and indirect frame-index (FRAMEIX) SDDbgValues. This patch also fixes this long-standing issue in order to not regress integration tests relying on the incorrect assumption that all frame-index SDDbgValues are indirect. This is a necessary fix: the newly-added DW_OP_derefs cannot be lowered properly otherwise. Basically the fix prevents a direct SDDbgValue with DIExpression(DW_OP_deref) from being dereferenced twice by a debugger. There were a handful of tests relying on this incorrect "FRAMEIX => indirect" assumption which actually had incorrect DW_AT_locations: these are all fixed up in this patch. Testing: - check-llvm, and an end-to-end test using lldb to debug an optimized program. - Existing unit tests for DIExpression::appendToStack fully cover the new DIExpression::append utility. - check-debuginfo (the debug info integration tests) Differential Revision: https://reviews.llvm.org/D49454 llvm-svn: 338069
* [DAGCombine] optimizeSetCCOfSignedTruncationCheck(): handle ule,ugt CondCodes.Roman Lebedev2018-07-261-9/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: A follow-up for D49266 / rL337166. At least one of these cases is more canonical, so we really do have to handle it. https://godbolt.org/g/pkzP3X https://rise4fun.com/Alive/pQyhZZ We won't get to these cases with I1 being -1, as that will be constant-folded to true or false. I'm also not sure we actually hit the 'ule' case, but i think the worst think that could happen is that being dead code. Reviewers: spatel, craig.topper, RKSimon, javed.absar, efriedma Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49497 llvm-svn: 338044
* [SelectionDAG] try to convert funnel shift directly to rotate if legalSanjay Patel2018-07-251-1/+10
| | | | | | | | | | | If the DAGCombiner's rotate matching was working as expected, I don't think we'd see any test diffs here. This sidesteps the issue of custom lowering for rotates raised in PR38243: https://bugs.llvm.org/show_bug.cgi?id=38243 ...by only dealing with legal operations. llvm-svn: 337966
* Fix corruption of result number in LegalizeVectorOps.cppUlrich Weigand2018-07-251-1/+2
| | | | | | | | | | | | When VectorLegalizer::LegalizeOp creates a new SDValue after iterating over its arguments, we need to refer to the same result number of the new node that the original value used. Reviewed by: cameron.mcinally Differential Revision: https://reviews.llvm.org/D49805 llvm-svn: 337939
* Fix PR34170: Crash on inline asm with 64bit output in 32bit GPRThomas Preud'homme2018-07-251-20/+36
| | | | | | | | Add support for inline assembly with output operand that do not naturally go in the register class it is constrained to (eg. double in a 32-bit GPR as in the PR). llvm-svn: 337903
* [SelectionDAG] Reduce DanglingDebugInfo memory traffic, NFCVedant Kumar2018-07-231-19/+23
| | | | | | | This avoids approx. 2 x 10^5 DenseMap insertions in both non-debug and debug -O2 builds of the sqlite3 amalgamation. llvm-svn: 337751
* [Legalize] Elide MERGE_VALUES created by scalarizeVectorLoad.Nirav Dave2018-07-232-3/+10
| | | | | | | scalarizeVectorLoad creates MERGE_VALUES nodes which are immediately decomposed in expandLoad. Elide the node in these cases. llvm-svn: 337708
* [FPEnv] Legalize double width StrictFP vector operationsCameron McInally2018-07-232-0/+70
| | | | | | Differential Revision: https://reviews.llvm.org/D48809 llvm-svn: 337698
* [SelectionDAGBuilder] Use APInt::isZero instead of comparing ↵Craig Topper2018-07-221-1/+1
| | | | | | | | APInt::getZExtValue to 0 in a place where we can't be sure contents of the APInt fit in a uint64_t. This is used on an extract vector element index which is most cases is going to be an i32 or i64 and the element will be a valid element number. But it is possible to construct IR with a larger type and large out of range value. llvm-svn: 337652
* [SelectionDAGBuilder] Restrict vector reduction check to types with a power ↵Craig Topper2018-07-221-0/+4
| | | | | | | | of 2 number of elements. The check for the shuffles usages probably isn't correct for non power of 2 vectors. llvm-svn: 337651
* [DAG] Avoid Node Update assertion due to AND simplificationNirav Dave2018-07-201-3/+5
| | | | | | | | | | | | | | | | | Check for construction-time folding for incomplete AND nodes in BackwardsPropagateMask. Fixes PR38185. Reviewers: RKSimon, samparker Reviewed By: samparker Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D49444 llvm-svn: 337563
* [DAG] Fix Memory ordering check in ReduceLoadOpStore.Nirav Dave2018-07-201-16/+18
| | | | | | | | | | | | | | | | | | When merging through a TokenFactor we need to check that the load may be ordered such that no other aliasing memory operations may happen. It is not sufficient to just check that the load is a member of the chain token factor as it there may be a indirect chain. Require the load's chain has only one use. This fixes PR37826. Reviewers: spatel, davide, efriedma, craig.topper, RKSimon Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D49388 llvm-svn: 337560
* [DAGCombiner] Fold X - (-Y *Z) -> X + (Y * Z)Craig Topper2018-07-201-0/+18
| | | | llvm-svn: 337518
* Skip out of SimplifyDemandedBits for BITCAST of f16 to i16Stephen Canon2018-07-191-0/+1
| | | | | | | | Mirrors the existing exit path for f128, avoiding a crash later on. Differential Revision: https://reviews.llvm.org/D49524 llvm-svn: 337506
* [DAGCombiner] Teach DAGCombiner that A-(-B) is A+B.Craig Topper2018-07-191-0/+5
| | | | | | We already knew A+(-B) is A-B in visitAdd. This does the opposite for visitSub. llvm-svn: 337502
* [ScheduleDAG] Fix unfolding of SUnits to already existent nodes.Nirav Dave2018-07-181-18/+30
| | | | | | | | | | | | | | | | | Summary: If unfolding an SUnit results in both load or the operation using it which already exist in the DAG, abort the unfold if they are already scheduled. If not, make sure we don't add duplicate dependencies. This fixes PR37916. Reviewers: davide, eli.friedman, fhahn, bogner Subscribers: MatzeB, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D48666 llvm-svn: 337409
* [DAGCombiner] Call SimplifyDemandedVectorElts from EXTRACT_VECTOR_ELTSimon Pilgrim2018-07-171-4/+23
| | | | | | | | If we are only extracting vector elements via EXTRACT_VECTOR_ELT(s) we may be able to use SimplifyDemandedVectorElts to avoid unnecessary vector ops. Differential Revision: https://reviews.llvm.org/D49262 llvm-svn: 337258
* [Intrinsics] define funnel shift IR intrinsics + DAG builder supportSanjay Patel2018-07-161-0/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | As discussed here: http://lists.llvm.org/pipermail/llvm-dev/2018-May/123292.html http://lists.llvm.org/pipermail/llvm-dev/2018-July/124400.html We want to add rotate intrinsics because the IR expansion of that pattern is 4+ instructions, and we can lose pieces of the pattern before it gets to the backend. Generalizing the operation by allowing 2 different input values (plus the 3rd shift/rotate amount) gives us a "funnel shift" operation which may also be a single hardware instruction. Initially, I thought we needed to define new DAG nodes for these ops, and I spent time working on that (much larger patch), but then I concluded that we don't need it. At least as a first step, we have all of the backend support necessary to match these ops...because it was required. And shepherding these through the IR optimizer is the primary concern, so the IR intrinsics are likely all that we'll ever need. There was also a question about converting the intrinsics to the existing ROTL/ROTR DAG nodes (along with improving the oversized shift documentation). Again, I don't think that's strictly necessary (as the test results here prove). That can be an efficiency improvement as a small follow-up patch. So all we're left with is documentation, definition of the IR intrinsics, and DAG builder support. Differential Revision: https://reviews.llvm.org/D49242 llvm-svn: 337221
* [CodeGen] Fix inconsistent declaration parameter nameFangrui Song2018-07-165-32/+32
| | | | llvm-svn: 337200
* [X86][AArch64][DAGCombine] Unfold 'check for [no] signed truncation' patternRoman Lebedev2018-07-161-0/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: [[ https://bugs.llvm.org/show_bug.cgi?id=38149 | PR38149 ]] As discussed in https://reviews.llvm.org/D49179#1158957 and later, the IR for 'check for [no] signed truncation' pattern can be improved: https://rise4fun.com/Alive/gBf ^ that pattern will be produced by Implicit Integer Truncation sanitizer, https://reviews.llvm.org/D48958 https://bugs.llvm.org/show_bug.cgi?id=21530 in signed case, therefore it is probably a good idea to improve it. But the IR-optimal patter does not lower efficiently, so we want to undo it.. This handles the simple pattern. There is a second pattern with predicate and constants inverted. NOTE: we do not check uses here. we always do the transform. Reviewers: spatel, craig.topper, RKSimon, javed.absar Reviewed By: spatel Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D49266 llvm-svn: 337166
* Avoid losing Hi part when expanding VAARG nodes on big endian machinesDaniel Cederman2018-07-161-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | Summary: If the high part of the load is not used the offset to the next element will not be set correctly. For example, on Sparc V8, the following code will read val2 from offset 4 instead of 8. ``` int val = __builtin_va_arg(va, long long); int val2 = __builtin_va_arg(va, int); ``` Reviewers: jyknight Reviewed By: jyknight Subscribers: fedor.sergeev, jrtc27, llvm-commits Differential Revision: https://reviews.llvm.org/D48595 llvm-svn: 337161
* [DAGCombiner] fix typo in comment; NFCSanjay Patel2018-07-151-1/+1
| | | | llvm-svn: 337132
* [DAGCombiner] extend(ifpositive(X)) -> shift-right (not X)Sanjay Patel2018-07-151-0/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is almost the same as an existing IR canonicalization in instcombine, so I'm assuming this is a good early generic DAG combine too. The motivation comes from reduced bit-hacking for select-of-constants in IR after rL331486. We want to restore that functionality in the DAG as noted in the commit comments for that change and the llvm-dev discussion here: http://lists.llvm.org/pipermail/llvm-dev/2018-July/124433.html The PPC and AArch tests show that those targets are already doing something similar. x86 will be neutral in the minimal case and generally better when this pattern is extended with other ops as shown in the signbit-shift.ll tests. Note the asymmetry: we don't include the (extend (ifneg X)) transform because it already exists in SimplifySelectCC(), and that is verified in the later unchanged tests in the signbit-shift.ll files. Without the 'not' op, the general transform to use a shift is always a win because that's a single instruction. Alive proofs: https://rise4fun.com/Alive/ysli Name: if pos, get -1 %c = icmp sgt i16 %x, -1 %r = sext i1 %c to i16 => %n = xor i16 %x, -1 %r = ashr i16 %n, 15 Name: if pos, get 1 %c = icmp sgt i16 %x, -1 %r = zext i1 %c to i16 => %n = xor i16 %x, -1 %r = lshr i16 %n, 15 Differential Revision: https://reviews.llvm.org/D48970 llvm-svn: 337130
* CodeGen: Remove pipeline dependencies on StackProtector; NFCMatthias Braun2018-07-131-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | This re-applies r336929 with a fix to accomodate for the Mips target scheduling multiple SelectionDAG instances into the pass pipeline. PrologEpilogInserter and StackColoring depend on the StackProtector analysis being alive from the point it is run until PEI, which requires that they are all scheduled in the same FunctionPassManager. Inserting a (machine) ModulePass between StackProtector and PEI results in these passes being in separate FunctionPassManagers and the StackProtector is not available for PEI. PEI and StackColoring don't use much information from the StackProtector pass, so transfering the required information to MachineFrameInfo is cleaner than keeping the StackProtector pass around. This commit moves the SSP layout information to MFI instead of keeping it in the pass. This patch set (D37580, D37581, D37582, D37583, D37584, D37585, D37586, D37587) is a first draft of the pagerando implementation described in http://lists.llvm.org/pipermail/llvm-dev/2017-June/113794.html. Patch by Stephen Crane <sjc@immunant.com> Differential Revision: https://reviews.llvm.org/D49256 llvm-svn: 336964
OpenPOWER on IntegriCloud