summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
Commit message (Collapse)AuthorAgeFilesLines
* {DAGCombiner] Fold (rot x, 0) -> xSimon Pilgrim2017-07-051-0/+4
| | | | llvm-svn: 307184
* [DAGCombiner] visitRotate patch to optimize pair of ROTR/ROTL instructions ↵Andrew Zhogin2017-07-051-0/+19
| | | | | | | | | | into one with combined shift operand. For two ROTR operations with shifts C1, C2; combined shift operand will be (C1 + C2) % bitsize. Differential revision: https://reviews.llvm.org/D12833 llvm-svn: 307179
* fix trivial typos in comments; NFCHiroshi Inoue2017-07-041-1/+1
| | | | llvm-svn: 307094
* [DAGCombiner] Intermediate variables in visitRotate promoted to the ↵Andrew Zhogin2017-07-041-6/+9
| | | | | | function's begin. NFC precommit for D12833. llvm-svn: 307091
* DAGCombine: Combine BUILD_VECTOR to TRUNCATEZvi Rackover2017-07-031-0/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Add a combine for creating a truncate to replace a build_vector composed of extracts with indices that form a stride-2^N series. Example: v8i32 V = ... v4i32 build_vector((extract_elt V, 0), (extract_elt V, 2), (extract_elt V, 4), (extract_elt V, 6)) --> v4i32 truncate (bitcast V to v4i64) Related discussion in llvm-dev about canonicalizing shuffles to truncates in LLVM IR: http://lists.llvm.org/pipermail/llvm-dev/2017-January/108936.html. Reviewers: spatel, RKSimon, efriedma, igorb, craig.topper, wolfgangp, delena Reviewed By: delena Subscribers: guyblank, delena, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D34077 llvm-svn: 307036
* [DAG] Fold FrameIndex offset into BaseIndexOffset analysis. NFCI.Nirav Dave2017-06-291-11/+11
| | | | | | | | | | | Relanding after restricting equalBaseIndex to not erroneuosly consider a FrameIndices stemming from alloca from being comparable as its offset is set post-selectionDAG. Pull FrameIndex comparision reasoning from DAGCombiner::isAlias to general BaseIndexOffset. llvm-svn: 306688
* Fold fneg and fabs like multiplicationsStanislav Mekhanoshin2017-06-281-0/+46
| | | | | | | | | | | Given no NaNs and no signed zeroes it folds: (fmul X, (select (fcmp X > 0.0), -1.0, 1.0)) -> (fneg (fabs X)) (fmul X, (select (fcmp X > 0.0), 1.0, -1.0)) -> (fabs X) Differential Revision: https://reviews.llvm.org/D34579 llvm-svn: 306592
* Revert "[DAG] Fold FrameIndex offset into BaseIndexOffset analysis. NFCI."Nirav Dave2017-06-281-11/+23
| | | | | | This reverts commit r306498 which appears to cause a compilrt-rt test failures llvm-svn: 306501
* Allow to truncate left shift with non-constant shift amountStanislav Mekhanoshin2017-06-281-10/+12
| | | | | | | | | | | That is pretty common for clang to produce code like (shl %x, (and %amt, 31)). In this situation we can still perform trunc (shl) into shl (trunc) conversion given the known value range of shift amount. Differential Revision: https://reviews.llvm.org/D34723 llvm-svn: 306499
* [DAG] Fold FrameIndex offset into BaseIndexOffset analysis. NFCI.Nirav Dave2017-06-281-23/+11
| | | | | | | Pull FrameIndex comparision reasoning from DAGCombiner::isAlias to general BaseIndexOffset. llvm-svn: 306498
* [SelectionDAG] set dereferenceable flag in MergeConsecutiveStores to fix ↵Hiroshi Inoue2017-06-271-2/+12
| | | | | | | | | | | | assetion failure When SelectionDAG merges consecutive stores and loads in MergeConsecutiveStores, it does not set dereferenceable flag for a created load instruction. This results in an assertion failure if SelectionDAG commonizes this load instruction with other load instructions, as well as it may miss optimization opportunities. This patch sat dereferenceable flag for the newly created load instruction if all the load instructions to be merged are dereferenceable. Differential Revision: https://reviews.llvm.org/D34679 llvm-svn: 306404
* DAGCombine: Make sure we only eliminate trunc/extend when the scales of ↵Wolfgang Pieb2017-06-261-5/+9
| | | | | | | | | | | | truncation and extension match. This fixes PR33368. Reviewer: rksimon Differential Revision: https://reviews.llvm.org/D34069 llvm-svn: 306345
* [DAG] Add Target Store Merge pass ordering functionNirav Dave2017-06-221-1/+2
| | | | | | | Allow targets to specify if they should merge stores before or after legalization. llvm-svn: 306006
* [DAG] Move BaseIndexOffset into separate Libarary. NFC.Nirav Dave2017-06-211-114/+1
| | | | | | | Move BaseIndexOffset analysis out of DAGCombiner for use in other files. llvm-svn: 305921
* [DAG] Remove Node csonstruction from BaseIndexOffset match. NFCI.Nirav Dave2017-06-211-52/+69
| | | | | | | | Move GlobalAddress Offset decomposition from initial match into comparision check and removing the possibility of constructing a new offseted global address when examining addresses. llvm-svn: 305917
* [DAGCombiner] Add another combine from build vector to shuffleGuy Blank2017-06-211-0/+5
| | | | | | | Add support for combining a build vector to a shuffle. When the build vector is of extracted elements from 2 vectors (vec1, vec2) where vec2 is 2 times smaller than vec1. llvm-svn: 305883
* [DAG] Simplify BaseIndexOffset. NFCI.Nirav Dave2017-06-201-59/+57
| | | | | | Remove tail calls and cleanup codeflow. llvm-svn: 305768
* Allow truncated and extend memory operations in Store Merge. NFCI.Nirav Dave2017-06-191-24/+46
| | | | | | | | | | As all store merges checks are based on the memory operation performed, allow use of truncated stores and extended loads as valid input candidates for merging. Relanding after fixing selection between truncated and normal store. llvm-svn: 305701
* [SelectionDAG] Use APInt::isSubsetOf. NFCCraig Topper2017-06-161-1/+1
| | | | llvm-svn: 305606
* [SelectionDAG] Use APInt::isNullValue/isOneValue. NFCCraig Topper2017-06-161-2/+2
| | | | llvm-svn: 305605
* Revert "[DAG] Allow truncated and extend memory operations in Store Merge. ↵Ahmed Bougacha2017-06-151-21/+10
| | | | | | | | NFCI." This reverts commit r305468, as it caused PR33475. llvm-svn: 305527
* [DAG] As StoreMerge now generates only legal nodes remove unecessary guard ↵Nirav Dave2017-06-151-4/+2
| | | | | | when run post-legalization NFCI. llvm-svn: 305477
* [DAG] Defer Pre/Post IndexStore merge to after mergestore. NFCI.Nirav Dave2017-06-151-4/+4
| | | | | | | | In preparation for doing storemerge post-legalization, reorder visitSTORE passes to move pre/post-index combining after store merge. Reordered passes other than store merge are unaffected. llvm-svn: 305473
* [DAG] Allow truncated and extend memory operations in Store Merge. NFCI.Nirav Dave2017-06-151-10/+21
| | | | | | | | As all store merges checks are based on the memory operation performed, allow use of truncated stores and extended loads as valid input candidates for merging. llvm-svn: 305468
* [DAG] Make MergeStores generate legalized stores. NFCI.Nirav Dave2017-06-151-4/+21
| | | | | | | Realized merged stores as truncstores if store will be realized as such by legalization. llvm-svn: 305467
* [DAG] Use correct size for truncated store merge of load. NFCI.Nirav Dave2017-06-151-2/+2
| | | | | | | Avoid non-legal memory ops by checking correct size when merging stores of loads into a extload-truncstore pair. llvm-svn: 305466
* [DAG] add helper to bind memop chains; NFCISanjay Patel2017-06-121-15/+1
| | | | | | | | | | This step is just intended to reduce code duplication rather than change any functionality. A follow-up would be to replace PPCTargetLowering::spliceIntoChain() usage with this new helper. Differential Revision: https://reviews.llvm.org/D33649 llvm-svn: 305192
* [DAGCombine] Make sure we check the ResNo from UADDO before combiningAmaury Sechet2017-06-111-1/+2
| | | | | | | | | | | | Summary: UADDO has 2 result, and one must check the result no before doing any kind of combine. Without it, the transform is invalid. Reviewers: joerg Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34088 llvm-svn: 305162
* [DAG] Improve Store Merge candidate pruning. NFC.Nirav Dave2017-06-071-3/+15
| | | | | | | | | When considering merging stores values are the results of loads only consider stores whose values come from loads from the same base. This fixes much of the longer compile times in PR33330. llvm-svn: 304934
* [DAG] Move SelectionDAG::isCommutativeBinOp to TargetLowering.Simon Pilgrim2017-06-071-1/+1
| | | | | | | | This will allow commutation of target-specific DAG nodes in future patches Differential Revision: https://reviews.llvm.org/D33882 llvm-svn: 304911
* [DAGCombine] Fix unchecked calls to DAGCombiner::*ExtPromoteOperandSanjay Patel2017-06-051-6/+6
| | | | | | | | | | | | | | | | | Other calls to DAGCombiner::*PromoteOperand check the result, but here it could cause an assertion in getNode. Falling back to any extend in this case instead of failing outright seems correct to me. No test case because: The failure was triggered by an out of tree backend. In order to trigger it, a backend would need to overload TargetLowering::IsDesirableToPromoteOp to return true for a type for which ISD::SIGN_EXTEND_INREG is marked illegal. In tree, only X86 overloads and sometimes returns true for MVT::i16 yet it marks setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16 , Legal);. Patch by Jacob Young! Differential Revision: https://reviews.llvm.org/D33633 llvm-svn: 304723
* [SDAG] Fix CombineTo ordering in visitZERO_EXTEND and visitSIGN_EXTENDNirav Dave2017-06-011-15/+8
| | | | | | | | | | | | Reorder CombineTo Calls to prevent references to stale/deleted SDNodes which caused undue assertions. Reviewers: dbabokin Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D31625 llvm-svn: 304460
* DAG: Remove pointless type checkMatt Arsenault2017-06-011-1/+1
| | | | | | These are only integer operations. llvm-svn: 304417
* Only generate addcarry node when it is legal.Amaury Sechet2017-06-011-7/+11
| | | | | | | | | | | | | | | Summary: This is a problem uncovered by stage2 testing. ADDCARRY end up being generated on target that do not support it. The patch that introduced the problem has other patches layed on top of it, so we want to fix the issue rather than revert it to avoid creating a lor of churn. A regression test will be added shortly, but this is committed as this in order to get the build back to green promptly. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33770 llvm-svn: 304409
* Do not legalize large setcc with setcce, introduce setcccarry and do it with ↵Amaury Sechet2017-06-011-0/+15
| | | | | | | | | | | | | | | | | usubo/setcccarry. Summary: This is a continuation of the work started in D29872 . Passing the carry down as a value rather than as a glue allows for further optimizations. Introducing setcccarry makes the use of addc/subc unecessary and we can start the removal process. This patch only introduce the optimization strictly required to get the same level of optimization as was available before nothing more. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33374 llvm-svn: 304404
* [DAGCombine] Refactor common addcarry pattern.Amaury Sechet2017-06-011-0/+35
| | | | | | | | | | | | Summary: This pattern is no very useful per se, but it exposes optimization for toehr patterns that wouldn't kick in otherwize. It's very common and worth optimizing for. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32756 llvm-svn: 304402
* [DAGCombine] (add/uaddo X, Carry) -> (addcarry X, 0, Carry)Amaury Sechet2017-06-011-0/+49
| | | | | | | | | | | | | | | Summary: This enables further transforms. Depends on D32916 Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32925 llvm-svn: 304401
* [DAG] Avoid use of stale store.Nirav Dave2017-05-311-2/+2
| | | | | | | | | | | | | | | | | | Correct references to alignment of store which may be deleted in a previous iteration of merge. Instead use first store that would be merged. Corrects pr33172's use-after-poison caught by ASan. Reviewers: spatel, hfinkel, RKSimon Reviewed By: RKSimon Subscribers: thegameg, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33686 llvm-svn: 304299
* [DAGCombiner] fix load narrowing transform to exclude loads with extensionSanjay Patel2017-05-291-1/+2
| | | | | | | | | | The extending load possibility was missed in: https://reviews.llvm.org/rL304072 We might want to handle this cases as a follow-up, but bailing out for now to avoid miscompiling. llvm-svn: 304153
* [DAGCombiner] use narrow load to avoid vector extractSanjay Patel2017-05-271-0/+50
| | | | | | | | | | | | | | | | | | If we have (extract_subvector(load wide vector)) with no other users, that can just be (load narrow vector). This is intentionally conservative. Follow-ups may loosen the one-use constraint to account for the extract cost or just remove the one-use check. The memop chain updating is based on code that already exists multiple times in x86 lowering, so that should be pulled into a helper function as a follow-up. Background: this is a potential improvement noticed via regressions caused by making x86's peekThroughBitcasts() not loop on consecutive bitcasts (see comments in D33137). Differential Revision: https://reviews.llvm.org/D33578 llvm-svn: 304072
* Make helper functions static. NFC.Benjamin Kramer2017-05-261-5/+6
| | | | llvm-svn: 304029
* [DAGCombiner] use narrow vector ops to eliminate concat/extract (PR32790)Sanjay Patel2017-05-261-0/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the best case: extract (binop (concat X1, X2), (concat Y1, Y2)), N --> binop XN, YN ...we kill all of the extract/concat and just have narrow binops remaining. If only one of the binop operands is amenable, this transform is still worthwhile because we kill some of the extract/concat. Optional bitcasting makes the code more complicated, but there doesn't seem to be a way to avoid that. The TODO about extending to more than bitwise logic is there because we really will regress several x86 tests including madd, psad, and even a plain integer-multiply-by-2 or shift-left-by-1. I don't think there's anything fundamentally wrong with this patch that would cause those regressions; those folds are just missing or brittle. If we extend to more binops, I found that this patch will fire on at least one non-x86 regression test. There's an ARM NEON test in test/CodeGen/ARM/coalesce-subregs.ll with a pattern like: t5: v2f32 = vector_shuffle<0,3> t2, t4 t6: v1i64 = bitcast t5 t8: v1i64 = BUILD_VECTOR Constant:i64<0> t9: v2i64 = concat_vectors t6, t8 t10: v4f32 = bitcast t9 t12: v4f32 = fmul t11, t10 t13: v2i64 = bitcast t12 t16: v1i64 = extract_subvector t13, Constant:i32<0> There was no functional change in the codegen from this transform from what I could see though. For the x86 test changes: 1. PR32790() is the closest call. We don't reduce the AVX1 instruction count in that case, but we improve throughput. Also, on a core like Jaguar that double-pumps 256-bit ops, there's an unseen win because two 128-bit ops have the same cost as the wider 256-bit op. SSE/AVX2/AXV512 are not affected which is expected because only AVX1 has the extract/concat ops to match the pattern. 2. do_not_use_256bit_op() is the best case. Everyone wins by avoiding the concat/extract. Related bug for IR filed as: https://bugs.llvm.org/show_bug.cgi?id=33026 3. The SSE diffs in vector-trunc-math.ll are just scheduling/RA, so nothing real AFAICT. 4. The AVX1 diffs in vector-tzcnt-256.ll are all the same pattern: we reduced the instruction count by one in each case by eliminating two insert/extract while adding one narrower logic op. https://bugs.llvm.org/show_bug.cgi?id=32790 Differential Revision: https://reviews.llvm.org/D33137 llvm-svn: 303997
* [DAG] Move legal type checks in store merge to be checked onlyNirav Dave2017-05-261-2/+4
| | | | | | on non-legal cases. NFC. llvm-svn: 303994
* [DAG] Prevent crashes when merging constant stores with high-bit set. NFC.Nirav Dave2017-05-241-2/+2
| | | | llvm-svn: 303802
* [DAG] Add AddressSpace parameter to canMergeStoresTo. NFC.Nirav Dave2017-05-231-7/+10
| | | | llvm-svn: 303673
* [DAG] Add canMergeStoresTo predicate checks. NFCI.Nirav Dave2017-05-231-4/+6
| | | | | | Propagate canMergeStoresTo checks to missing cases in StoreMerge. llvm-svn: 303668
* [DAG] Rework store merge to loop on load candidates. NFCI.Nirav Dave2017-05-221-202/+225
| | | | | | | Continue to consider remaining candidate merges until all possible merges have been considered. llvm-svn: 303560
* [DAGCombine] (addcarry 0, 0, X) -> (ext/trunc X)Amaury Sechet2017-05-191-0/+11
| | | | | | | | | | | | | | | Summary: While this makes some case better and some case worse - so it's unclear if it is a worthy combine just by itself - this is a useful canonicalisation. As per discussion in D32756 . Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32916 llvm-svn: 303441
* Elide stores which are overwritten without being observed.Nirav Dave2017-05-161-7/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In SelectionDAG, when a store is immediately chained to another store to the same address, elide the first store as it has no observable effects. This is causes small improvements dealing with intrinsics lowered to stores. Test notes: * Many testcases overwrite store addresses multiple times and needed minor changes, mainly making stores volatile to prevent the optimization from optimizing the test away. * Many X86 test cases optimized out instructions associated with associated with va_start. * Note that test_splat in CodeGen/AArch64/misched-stp.ll no longer has dependencies to check and can probably be removed and potentially replaced with another test. Reviewers: rnk, john.brawn Subscribers: aemerson, rengolin, qcolombet, jyknight, nemanjai, nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33206 llvm-svn: 303198
* [DAG] Prune deleted nodes in TokenFactorNirav Dave2017-05-161-8/+3
| | | | | | Fix visitTokenFactor to correctly remove deleted nodes. NFC. llvm-svn: 303181
OpenPOWER on IntegriCloud