summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG
Commit message (Collapse)AuthorAgeFilesLines
* [Debug info] Transfer DI to fragment expressions for split integer values.Jonas Devlieghere2017-08-173-19/+62
| | | | | | | | | This patch teaches the SDag type legalizer how to split up debug info for integer values that are split into a hi and lo part. Differential Revision: https://reviews.llvm.org/D36805 llvm-svn: 311102
* Improve line debug info when translating a CaseBlock to SDNodes.Adrian Prantl2017-08-172-9/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The SelectionDAGBuilder translates various conditional branches into CaseBlocks which are then translated into SDNodes. If a conditional branch results in multiple CaseBlocks only the first CaseBlock is translated into SDNodes immediately, the rest of the CaseBlocks are put in a queue and processed when all LLVM IR instructions in the basic block have been processed. When a CaseBlock is transformed into SDNodes the SelectionDAGBuilder is queried for the current LLVM IR instruction and the resulting SDNodes are annotated with the debug info of the current instruction (if it exists and has debug metadata). When the deferred CaseBlocks are processed, the SelectionDAGBuilder does not have a current LLVM IR instruction, and the resulting SDNodes will not have any debuginfo. As DwarfDebug::beginInstruction() outputs a .loc directive for the first instruction in a labeled block (typically the case for something coming from a CaseBlock) this tends to produce a line-0 directive. This patch changes the handling of CaseBlocks to store the current instruction's debug info into the CaseBlock when it is created (and the SelectionDAGBuilder knows the current instruction) and to always use the stored debug info when translating a CaseBlock to SDNodes. Patch by Frej Drejhammar! Differential Revision: https://reviews.llvm.org/D36671 llvm-svn: 311097
* [DAGCombiner] Add support for non-uniform constant vectors to (mul x, (1 << ↵Simon Pilgrim2017-08-171-5/+9
| | | | | | c)) -> x << c llvm-svn: 311083
* [SelectionDAG] Teach the vector-types operand scalarizer about SETCCElad Cohen2017-08-172-0/+34
| | | | | | | | | | | | | | | | When v1i1 is legal (e.g. AVX512) the legalizer can reach a case where a v1i1 SETCC with an illgeal vector type operand wasn't scalarized (since v1i1 is legal) but its operands does have to be scalarized. This used to assert because SETCC was missing from the vector operand scalarizer. This patch attemps to teach the legalizer to handle these cases by scalazring the operands, converting the node into a scalar SETCC node. Differential revision: https://reviews.llvm.org/D36651 llvm-svn: 311071
* Add strictfp attribute to prevent unwanted optimizations of libm callsAndrew Kaylor2017-08-141-3/+3
| | | | | | Differential Revision: https://reviews.llvm.org/D34163 llvm-svn: 310885
* [DAGCombine] Do not try to deduplicate commutative operations if both ↵Amaury Sechet2017-08-141-3/+3
| | | | | | | | | | | | | | operand are the same. Summary: It is creating useless work as the commuted nodes is the same as the node we are working on in that case. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33840 llvm-svn: 310832
* [SelectionDAG] combine vextract (v1iX extract_subvector(vNiX, Idx))Elad Cohen2017-08-141-0/+9
| | | | | | | | | into vextract(vNiX,Idx) when creating vextract with getNode(). This case appeared in AVX512 after fixing pr33349 in r310552. Differential revision: https://reviews.llvm.org/D36571 llvm-svn: 310828
* Revert "[DAGCombiner] Extending pattern detection for vector shuffle ↵Elad Cohen2017-08-141-47/+2
| | | | | | | | (REAPPLIED)" This reverts commit r310782. llvm-svn: 310822
* [X86][ARM][TargetLowering] Add SrcVT to isExtractSubvectorCheapCraig Topper2017-08-131-1/+1
| | | | | | | | | | | | | | | | | Summary: Without the SrcVT its hard to know what is really being asked for. For example if your target has 128, 256, and 512 bit vectors. Maybe extracting 128 from 256 is cheap, but maybe extracting 128 from 512 is not. For x86 we do support extracting a quarter of a 512-bit register. But for i1 vectors we don't have isel patterns for extracting arbitrary pieces. So we need this to have a correct implementation of isExtractSubvectorCheap for mask vectors. Reviewers: RKSimon, zvi, efriedma Reviewed By: RKSimon Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D36649 llvm-svn: 310793
* [DAGCombiner] Extending pattern detection for vector shuffle (REAPPLIED)Simon Pilgrim2017-08-121-2/+47
| | | | | | | | | | | | If all the operands of a BUILD_VECTOR extract elements from same vector then split the vector efficiently based on the maximum vector access index. Reapplied with fix to only work with simple value types. Committed on behalf of @jbhateja (Jatin Bhateja) Differential Revision: https://reviews.llvm.org/D35788 llvm-svn: 310782
* [x86] use more shift or LEA for select-of-constants (2nd try)Sanjay Patel2017-08-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous rev (r310208) failed to account for overflow when subtracting the constants to see if they're suitable for shift/lea. This version add a check for that and more test were added in r310490. We can convert any select-of-constants to math ops: http://rise4fun.com/Alive/d7d For this patch, I'm enhancing an existing x86 transform that uses fake multiplies (they always become shl/lea) to avoid cmov or branching. The current code misses cases where we have a negative constant and a positive constant, so this is just trying to plug that hole. The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start with a select in IR, create a select DAG node, convert it into a sext, convert it back into a select, and then lower it to sext machine code. Some notes about the test diffs: 1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR. 2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. We could avoid the push/pop in some cases if we used 'movzbl %al' instead of an xor on a different reg? That's a post-DAG problem though. 3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if that's a regression, but those would always be nearly equivalent. 4. pr22338.ll and sext-i1.ll - These tests have undef operands, so we don't actually care about these diffs. 5. sbb.ll - This shows a win for what is likely a common case: choose -1 or 0. 6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again. 7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops. Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass. Differential Revision: https://reviews.llvm.org/D35340 llvm-svn: 310717
* Improve handling of insert_subvector of bitcast valuesNirav Dave2017-08-111-0/+35
| | | | | | | | | | | | Fix insert_subvector / extract_subvector merges of bitcast values. Reviewers: efriedma, craig.topper, RKSimon Subscribers: RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D34571 llvm-svn: 310711
* [X86][DAG] Switch X86 Target to post-legalized store mergeNirav Dave2017-08-111-1/+2
| | | | | | | | | | | | | | | | Move store merge to happen after intrinsic lowering to allow lowered stores to be merged. Some regressions due in MergeConsecutiveStores to missing insert_subvector that are addressed in follow up patch. Reviewers: craig.topper, efriedma, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34559 llvm-svn: 310710
* [DAGCombiner] Remove shuffle support from simplifyShuffleMaskSimon Pilgrim2017-08-111-2/+0
| | | | | | | | rL310372 enabled simplifyShuffleMask to support undef shuffle mask inputs, but its causing hangs. Removing support until I can triage the problem llvm-svn: 310699
* Revert "[DAG] Cleanup unused nodes after store merge. NFCI."Nirav Dave2017-08-101-11/+1
| | | | | | This reverts commit r310648 which causes an unexpected assertion failure llvm-svn: 310659
* [DAG] Relax type restriction for store mergeNirav Dave2017-08-101-24/+64
| | | | | | | | | | | | | | Summary: Allow stores of bitcastable types to be merged by peeking through BITCAST nodes and recasting stored values constant and vector extract nodes as necessary. Reviewers: jyknight, hfinkel, efriedma, RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34569 llvm-svn: 310655
* [DAG] Cleanup unused nodes after store merge. NFCI.Nirav Dave2017-08-101-1/+11
| | | | llvm-svn: 310648
* [DAG] Rewrite expression. NFC.Nirav Dave2017-08-101-2/+2
| | | | llvm-svn: 310608
* [X86] Keep dependencies when constructing loads in combineStoreNirav Dave2017-08-101-5/+6
| | | | | | | | | | | | | | | | Summary: Preserve chain dependecies between old and new loads constructed to prevent loads from reordering below later stores. Fixes PR34088. Reviewers: craig.topper, spatel, RKSimon, efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36528 llvm-svn: 310604
* [SelectionDAG] Allow constant folding for implicitly truncating BUILD_VECTOR ↵Guy Blank2017-08-101-2/+16
| | | | | | | | | | | | | nodes. In FoldConstantArithmetic, handle BUILD_VECTOR nodes that do implicit truncation on the elements. This is similar to what is done in FoldConstantVectorArithmetic. Differential Revision: https://reviews.llvm.org/D36506 llvm-svn: 310593
* [SelectionDAG] When scalarizing vselect, don't assert onElad Cohen2017-08-101-1/+15
| | | | | | | | | | | | | | | | | | | | | | | a legal cond operand. When scalarizing the result of a vselect, the legalizer currently expects to already have scalarized the operands. While this is true for the true/false operands (which have the same type as the result), it is not case for the condition operand. On X86 AVX512, v1i1 is legal - this leads to operations such as '< N x type> vselect < N x i1> < N x type> < N x type>' where < N x type > is illegal to hit an assertion during the scalarization. The handling is similar to r205625. This also exposes the fact that (v1i1 extract_subvector) should be legal and selectable on AVX512 - We do this by custom lowering to vector_extract_elt. This still leaves us in some cases with redundant dag nodes which will be combined in a separate soon to come patch. This fixes pr33349. Differential revision: https://reviews.llvm.org/D36511 llvm-svn: 310552
* Reduce variable scope by moving declaration into if clauseDavid Blaikie2017-08-091-8/+8
| | | | llvm-svn: 310506
* [DAG] Explicitly cleanup merged load values during store merge. NFCI.Nirav Dave2017-08-091-2/+8
| | | | llvm-svn: 310474
* [DAG] Introduce peekThroughBitcast function. NFCI.Nirav Dave2017-08-081-23/+14
| | | | llvm-svn: 310405
* [DAG] Update comments. NFC.Nirav Dave2017-08-081-8/+9
| | | | llvm-svn: 310404
* [DAGCombiner] simplifyShuffleMask - handle UNDEF inputs from shuffles as ↵Simon Pilgrim2017-08-081-11/+10
| | | | | | | | well as BUILD_VECTOR Minor extension to D36393 llvm-svn: 310372
* [DAGCombiner] Simplify shuffle mask index if the referenced input element is ↵Simon Pilgrim2017-08-081-0/+36
| | | | | | | | | | UNDEF Fixes one of the cases in PR34041. Differential Revision: https://reviews.llvm.org/D36393 llvm-svn: 310344
* [x86] revert r310208 to investigate test-suite failures (PR34105 / PR34097) Sanjay Patel2017-08-071-1/+1
| | | | llvm-svn: 310264
* [DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector.Nirav Dave2017-08-071-12/+35
| | | | | | | | | | | | | | | | Relanding after case to insert explicit truncation as necessary. Allow SCALAR_TO_VECTOR of EXTRACT_VECTOR_ELT to reduce to EXTRACT_SUBVECTOR of vector shuffle when output is smaller. Marginally improves vector shuffle computations. Reviewers: efriedma, RKSimon, spatel Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D35566 llvm-svn: 310256
* [SelectionDAG] reset NewNodesMustHaveLegalTypes flag between basic blocksGuy Blank2017-08-071-0/+3
| | | | | | | | | | | | | The NewNodesMustHaveLegalTypes flag is set to false at the beginning of CodeGenAndEmitDAG, and set to true after legalizing types. But before calling CodeGenAndEmitDAG we build the DAG for the basic block. So for the first basic block NewNodesMustHaveLegalTypes would be 'false' during the SDAG building, and for all other basic blocks it would be 'true'. This patch sets the flag to false before SDAG building each basic block. Differential Revision: https://reviews.llvm.org/D33435 llvm-svn: 310239
* [x86] use more shift or LEA for select-of-constantsSanjay Patel2017-08-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can convert any select-of-constants to math ops: http://rise4fun.com/Alive/d7d For this patch, I'm enhancing an existing x86 transform that uses fake multiplies (they always become shl/lea) to avoid cmov or branching. The current code misses cases where we have a negative constant and a positive constant, so this is just trying to plug that hole. The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start with a select in IR, create a select DAG node, convert it into a sext, convert it back into a select, and then lower it to sext machine code. Some notes about the test diffs: 1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR. 2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. I think we could avoid the push/pop in some cases if we used 'movzbl %al' instead of an xor on a different reg? That's a post-DAG problem though. 3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if that's a regression, but I think those would always be nearly equivalent. 4. pr22338.ll and sext-i1.ll - These tests have undef operands, so I don't think we actually care about these diffs. 5. sbb.ll - This shows a win for what I think is a common case: choose -1 or 0. 6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again. 7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops. Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass. Differential Revision: https://reviews.llvm.org/D35340 llvm-svn: 310208
* Revert r310058, it caused PR34073.Nico Weber2017-08-041-47/+2
| | | | llvm-svn: 310118
* [DAGCombiner] Extending pattern detection for vector shuffle.Simon Pilgrim2017-08-041-2/+47
| | | | | | | | | | If all the operands of a BUILD_VECTOR extract elements from same vector then split the vector efficiently based on the maximum vector access index. Committed on behalf of @jbhateja (Jatin Bhateja) Differential Revision: https://reviews.llvm.org/D35788 llvm-svn: 310058
* DAG: Provide access to Pass instance from SelectionDAGMatt Arsenault2017-08-032-2/+4
| | | | | | This allows accessing an analysis pass during lowering. llvm-svn: 309991
* [DAG] Allow merging of stores of vector loadsNirav Dave2017-08-031-6/+0
| | | | | | | | | | | | | Remove restriction disallowing merging of stores vector loads into larger store of larger vector load. Reviewers: RKSimon, efriedma, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36158 llvm-svn: 309951
* [DAG] Improve candidate pruning in store merge failure case. NFCINirav Dave2017-08-021-20/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During store merge we construct a sorted list of consecutive store candidates and consider subsequences for merging into a single store. For each subsequence we check if the stored value type is legal the merged store would have valid and fast and if the constructed value to be stored is valid. The only properties that affect this check between subsequences is the size of the subsequence, the alignment of the first store, the alignment of the stored load value (when merging stores-of-loads), and whether the merged value is a constant zero. If we do not find a viable mergeable subsequence starting from the first store of length N, we know that a subsequence starting at a later store of length N will also fail unless the new store's alignment, the new load's alignment (if we're merging store-of-loads), or we've dropped stores of nonzero value and could construct a merged stores of zero (for merging constants). As a result if we fail to find a valid subsequence starting from the first store we can safely skip considering subsequences that start with subsequent stores unless one of the above properties is true. This significantly (2x) improves compile time in some pathological cases. Reviewers: RKSimon, efriedma, zvi, spatel, waltl Subscribers: grandinj, llvm-commits Differential Revision: https://reviews.llvm.org/D35901 llvm-svn: 309830
* [DAG] Refactor store merge subexpressions. NFC.Nirav Dave2017-08-021-23/+28
| | | | | | Distribute various expressions across ifs. llvm-svn: 309777
* DAG: Undo and->or combine with FrameIndexesMatt Arsenault2017-08-021-0/+9
| | | | | | | | | | | | | | This pattern shows up when lowering byval copies on AMDGPU. The byval object access is split into 4-byte chunks, adding a constant offset to the FixedStack base. When some of the offsets turn into ors, this prevents combining the constant offsets. This makes it not apparent that the object is there when matching addressing modes, so it ends up using a scratch wave offset relative access and the lengthy frame index expansion for that. llvm-svn: 309775
* Use helper function instead of manually constructing DBG_VALUEs (NFC)Adrian Prantl2017-08-011-5/+2
| | | | | | rdar://problem/33580047 llvm-svn: 309757
* [DAG] Factor out common expressions. NFC.Nirav Dave2017-08-011-19/+21
| | | | llvm-svn: 309740
* [DebugInfo] Don't turn dbg.declare into DBG_VALUE for static allocasReid Kleckner2017-08-011-0/+7
| | | | | | | | | | | | | | | | Summary: We already have information about static alloca stack locations in our side table. Emitting instructions for them is inefficient, and it only happens when the address of the alloca has been materialized within the current block, which isn't often. Reviewers: aprantl, probinson, dblaikie Subscribers: jfb, dschuff, sbc100, jgravelle-google, hiraditya, llvm-commits, aheejin Differential Revision: https://reviews.llvm.org/D36117 llvm-svn: 309729
* Pull out VectorNumElements value. NFC.Nirav Dave2017-08-011-13/+9
| | | | llvm-svn: 309719
* Revert "[DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector."Nirav Dave2017-08-011-26/+11
| | | | | | | This reverts commit r309680 which appears to be raising an assertion in the test-suite. llvm-svn: 309717
* [DAG] Convert extload check to equivalent type check. NFC.Nirav Dave2017-08-011-5/+10
| | | | | | Replace check with check that consuming store has the same type. llvm-svn: 309708
* [DAG] Move extload check in store merge. NFC.Nirav Dave2017-08-011-5/+3
| | | | | | Move candidate check from later check to initial candidate check. llvm-svn: 309698
* [DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector.Nirav Dave2017-08-011-11/+26
| | | | | | | | | | | | | | | Summary: Allow SCALAR_TO_VECTOR of EXTRACT_VECTOR_ELT to reduce to EXTRACT_SUBVECTOR of vector shuffle when output is smaller. Marginally improves vector shuffle computations. Reviewers: efriedma, RKSimon, spatel Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D35566 llvm-svn: 309680
* [ScheduleDAG] Don't schedule node with physical register interferenceEli Friedman2017-08-011-25/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | https://reviews.llvm.org/D31536 didn't really solve the problem it was trying to solve; it got rid of the assertion failure, but we were still scheduling the DAG incorrectly (mixing together instructions from different calls), leading to a MachineVerifier failure. In order to schedule the DAG correctly, we have to make sure we don't schedule a node which should be blocked by an interference. Fix ScheduleDAGRRList::PickNodeToScheduleBottomUp so it doesn't pick a node like that. The added call to FindAvailableNode() is the key change here; this makes sure we don't try to schedule a call while we're in the middle of scheduling a different call. I'm not sure this is the right approach; in particular, I'm not sure how to prove we don't end up with an infinite loop of repeatedly backtracking. This also reverts the code change from D31536. It doesn't do anything useful: we should never schedule an ADJCALLSTACKDOWN unless we've already scheduled the corresponding ADJCALLSTACKUP. Differential Revision: https://reviews.llvm.org/D33818 llvm-svn: 309642
* [SelectionDAG][mips] Fix PR33883Simon Dardis2017-07-311-15/+24
| | | | | | | | | | | | | | | PR33883 shows that calls to intrinsic functions should not have their vector arguments or returns subject to ABI changes required by the target. This resolves PR33883. Thanks to Alex Crichton for reporting the issue! Reviewers: zoran.jovanovic, atanasyan Differential Revision: https://reviews.llvm.org/D35765 llvm-svn: 309561
* [SelectionDAG][X86] CombineBT - more aggressively determine demanded bitsSimon Pilgrim2017-07-291-0/+12
| | | | | | | | | | | | This patch is in 2 parts: 1 - replace combineBT's use of SimplifyDemandedBits (hasOneUse only) with SelectionDAG::GetDemandedBits to more aggressively determine the lower bits used by BT. 2 - update SelectionDAG::GetDemandedBits to support ANY_EXTEND - if the demanded bits are only in the non-extended portion, then peek through and demand from the source value and then ANY_EXTEND that if we found a match. Differential Revision: https://reviews.llvm.org/D35896 llvm-svn: 309486
* Remove the unused offset from DBG_VALUE (NFC)Adrian Prantl2017-07-283-6/+8
| | | | | | | Followup to r309426. rdar://problem/33580047 llvm-svn: 309450
OpenPOWER on IntegriCloud