summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* [DAGCombine] Improve select (not Cond), N1, N2 -> select Cond, N2, N1 foldSimon Pilgrim2019-03-061-9/+15
| | | | | | | | Move the x86 combine from D58974 into the DAGCombine VSELECT code and update the SELECT version to use the isBooleanFlip helper as well. Requested by @spatel on D59006 llvm-svn: 355533
* [DAGCombiner] Enable UADDO/USUBO vector combine supportSimon Pilgrim2019-03-061-11/+8
| | | | | | Differential Revision: https://reviews.llvm.org/D58965 llvm-svn: 355517
* [DAGCombiner] Add SADDO/SSUBO combine supportSimon Pilgrim2019-03-061-0/+54
| | | | | | | | Basic constant handling folds, for both scalars and vectors Differential Revision: https://reviews.llvm.org/D58967 llvm-svn: 355506
* [DAGCombiner] Enable SMULO/UMULO vector combine support (PR40442)Simon Pilgrim2019-03-061-2/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D58968 llvm-svn: 355495
* [DAGCombiner][X86][SystemZ][AArch64] Combine some cases of (bitcast ↵Craig Topper2019-03-041-6/+9
| | | | | | | | | | (build_vector constants)) between legalize types and legalize dag. This patch enables combining integer bitcasts of integer build vectors when the new scalar type is legal. I've avoided floating point because the implementation bitcasts float to int along the way and we would need to check the intermediate types for legality Differential Revision: https://reviews.llvm.org/D58884 llvm-svn: 355324
* [DAG] Fix constant store folding to handle non-byte sizes.Nirav Dave2019-02-261-8/+8
| | | | | | | | | | | | | | | | Avoid crashes from zero-byte values due to sub-byte store sizes. Reviewers: uabelho, courbet, rnk Reviewed By: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58626 llvm-svn: 354884
* Fix a sign compare warning breaking the -Werror build.Andrea Di Biagio2019-02-251-1/+1
| | | | | | The warning was introduced at r354793. llvm-svn: 354810
* [DAGCombine] Add undef shuffle elt support to partitionShuffleOfConcatsSimon Pilgrim2019-02-251-28/+29
| | | | | | | | Support undef shuffle mask indices in the shuffle(concat_vectors, concat_vectors) -> concat_vectors fold Differential Revision: https://reviews.llvm.org/D58585 llvm-svn: 354793
* [NFC] Fix typos: preceeding -> precedingJordan Rupprecht2019-02-231-4/+4
| | | | llvm-svn: 354715
* Disable big-endian constant store merges from rL354676.Nirav Dave2019-02-221-10/+11
| | | | llvm-svn: 354677
* [DAGCombine] Fold overlapping constant storesNirav Dave2019-02-221-0/+26
| | | | | | | | | | | | | | | Fold a smaller constant store into larger constant stores immediately preceeding it. Reviewers: rnk, courbet Subscribers: javed.absar, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58468 llvm-svn: 354676
* [DAGCombiner] prevent infinite looping by truncating 'and' (PR40793)Sanjay Patel2019-02-211-2/+3
| | | | | | | | | | | | | | This fold can occur during legalization, so it can fight with promotion to the larger type. It apparently takes a special sequence and subtarget to avoid more basic simplifications that would hide the problem. But there's a bigger question raised here: why does distributeTruncateThroughAnd() even exist? It duplicates functionality from a more minimal pattern that we already have. But getting rid of this function requires some preliminary steps. https://bugs.llvm.org/show_bug.cgi?id=40793 llvm-svn: 354594
* [DAGCombine] Generalize Dead Store to overlapping stores.Nirav Dave2019-02-201-14/+17
| | | | | | | | | | | | | | | | | | Summary: Remove stores that are immediately overwritten by larger stores. Reviewers: courbet, rnk Reviewed By: rnk Subscribers: javed.absar, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58467 llvm-svn: 354518
* Re-land the refactoring part of r354244 "[DAGCombiner] Eliminate dead stores ↵Clement Courbet2019-02-201-35/+5
| | | | | | | | to stack." This is an NFC. llvm-svn: 354476
* Revert r354244 "[DAGCombiner] Eliminate dead stores to stack."Clement Courbet2019-02-181-90/+35
| | | | | | Breaks some bots. llvm-svn: 354245
* [DAGCombiner] Eliminate dead stores to stack.Clement Courbet2019-02-181-35/+90
| | | | | | | | | | | | | | | Summary: A store to an object whose lifetime is about to end can be removed. See PR40550 for motivation. Reviewers: niravd Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D57541 llvm-svn: 354244
* [DAGCombiner] convert logic-of-setcc into bit magic (PR40611)Sanjay Patel2019-02-121-0/+26
| | | | | | | | | | | | | | | | | | | | If we're comparing some value for equality against 2 constants and those constants have an absolute difference of just 1 bit, then we can offset and mask off that 1 bit and reduce to a single compare against zero: and/or (setcc X, C0, ne), (setcc X, C1, ne/eq) --> setcc ((add X, -C1), ~(C0 - C1)), 0, ne/eq https://rise4fun.com/Alive/XslKj This transform is disabled by default using a TLI hook ("convertSetCCLogicToBitwiseLogic()"). That should be overridden for AArch64, MIPS, Sparc and possibly others based on the asm shown in: https://bugs.llvm.org/show_bug.cgi?id=40611 llvm-svn: 353859
* Move some classes into anonymous namespaces. NFC.Benjamin Kramer2019-02-111-0/+2
| | | | llvm-svn: 353710
* [DAG] Add optional AllowUndefs to isNullOrNullSplatSimon Pilgrim2019-02-101-5/+1
| | | | | | No change in default behaviour (AllowUndefs = false) llvm-svn: 353646
* [DAGCombine] Simplify funnel shifts with undef/zero args to bitshiftsSimon Pilgrim2019-02-101-2/+41
| | | | | | | | Now that we have SimplifyDemandedBits support for funnel shifts (rL353539), we need to simplify funnel shifts back to bitshifts in cases where either argument has been folded to undef/zero. Differential Revision: https://reviews.llvm.org/D58009 llvm-svn: 353645
* [DAGCombine] Optimize pow(X, 0.75) to sqrt(X) * sqrt(sqrt(X))Nemanja Ivanovic2019-02-081-4/+14
| | | | | | | | | | | | The sqrt case is faster and we already do this for the case where the exponent is 0.25. This adds the 0.75 case which is also not sensitive to signed zeros. Patch by Whitney Tsang (Whitney) Differential revision: https://reviews.llvm.org/D57434 llvm-svn: 353557
* [TargetLowering] Add SimplifyDemandedBits funnel shift support Simon Pilgrim2019-02-081-0/+4
| | | | llvm-svn: 353539
* Revert r353416 "[DAG] Cleanup unused nodes on failed store-to-load forward ↵Nirav Dave2019-02-081-21/+9
| | | | | | | | combine." This cleanup causes out-of-tree crashes. llvm-svn: 353527
* [DAGCombiner] (add (umax X, C), -C) --> (usubsat X, C) (PR40111)Simon Pilgrim2019-02-071-0/+12
| | | | | | | | | | Move the (add (umax X, C), -C) --> (usubsat X, C) X86 combine into generic DAGCombiner First of a number of saturated arithmetic folds that can be moved out of X86-specific code for PR40111. Differential Revision: https://reviews.llvm.org/D57754 llvm-svn: 353457
* Revert "[DAG] Cleanup of unused node in SimplifySelectCC."Nirav Dave2019-02-071-15/+8
| | | | | | Causes ASAN use-after-poison errors. llvm-svn: 353442
* [DAGCombiner] fold add/sub with bool operand based on target's boolean contentsSanjay Patel2019-02-071-3/+16
| | | | | | | | | | | | | | | | | | | I noticed that we are missing this canonicalization in IR: rL352515 ...and then realized that we don't get this right in SDAG either, so this has to be fixed first regardless of what we choose to do in IR. The existing fold was limited to scalars and using the wrong predicate to guard the transform. We have a boolean contents TLI query that can be used to decide which direction to fold. This may eventually lead back to the problems/question in: https://bugs.llvm.org/show_bug.cgi?id=40486 ...but it makes no difference to that yet. Differential Revision: https://reviews.llvm.org/D57401 llvm-svn: 353433
* [DAG] Cleanup of unused node in SimplifySelectCC.Nirav Dave2019-02-071-8/+15
| | | | llvm-svn: 353428
* [DAG] Cleanup unused node on failed SELECT Combine.Nirav Dave2019-02-071-0/+6
| | | | llvm-svn: 353426
* [DAG] Cleanup unused nodes on failed store-to-load forward combine.Nirav Dave2019-02-071-9/+21
| | | | llvm-svn: 353416
* [DAG] Immediately cleanup unused nodes from extend-based combines.Nirav Dave2019-02-061-2/+7
| | | | llvm-svn: 353338
* [DAGCombine][NFC] GatherAllAliases should take a LSBaseSDNode.Clement Courbet2019-02-061-8/+8
| | | | | | | GatherAllAliases only makes sense for LSBaseSDNode. Enforce it with static typing instead of runtime cast. llvm-svn: 353291
* [DAGCombiner] Discard pointer info when combining extract_vector_elt of a ↵Craig Topper2019-02-051-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | vector load when the index isn't constant Summary: If the index isn't constant, this transform inserts a multiply and an add on the index to calculating the base pointer for a scalar load. But we still create a memory operand with an offset of 0 and the size of the scalar access. But the access is really to an unknown offset within the original access size. This can cause the machine scheduler to incorrectly calculate dependencies between this load and other accesses. In the case we saw, there was a 32 byte vector store that was split into two 16 byte stores, one with offset 0 and one with offset 16. The size of the memory operand for both was 16. The scheduler correctly detected the alias with the offset 0 store, but not the offset 16 store. This patch discards the pointer info so we don't incorrectly detect aliasing. I wasn't sure if we could keep using the original offset and size without risking some other transform on the load changing the size. I tried to reduce a test case, but there's still a lot of memory operations needed to get the scheduler to do the bad reordering. So it looked pretty fragile to maintain. Reviewers: efriedma Reviewed By: efriedma Subscribers: arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57616 llvm-svn: 353124
* [DAGCombine] Add ADD(SUB,SUB) combinesSimon Pilgrim2019-02-041-0/+12
| | | | | | | | Noticed while investigating PR40483, and fixes the basic test case from the bug - but not a more general case. We're pretty weak at dealing with ADD/SUB combines compared to the SimplifyAssociativeOrCommutative/SimplifyUsingDistributiveLaws abilities that InstCombine can manage. llvm-svn: 353044
* [SDAG] Add SDNode/SDValue getConstantOperandAPInt helper. NFCI.Simon Pilgrim2019-02-021-1/+1
| | | | | | | | We already have the getConstantOperandVal helper which returns a uint64_t, but along comes the fuzzer and inserts a i128 -1 constant or something and the whole thing asserts....... I've updated a few obvious cases, and tried to make use of the const reference where possible, but there's more to do. A number of existing oss-fuzz tickets should be fixed if we start using APInt and perform value clamping where necessary. llvm-svn: 352961
* [DAGCombine] Avoid CombineZExtLogicopShiftLoad if there is free ZEXTGuozhi Wei2019-01-311-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes pr39098. For the attached test case, CombineZExtLogicopShiftLoad can optimize it to t25: i64 = Constant<1099511627775> t35: i64 = Constant<0> t0: ch = EntryToken t57: i64,ch = load<(load 4 from `i40* undef`, align 8), zext from i32> t0, undef:i64, undef:i64 t58: i64 = srl t57, Constant:i8<1> t60: i64 = and t58, Constant:i64<524287> t29: ch = store<(store 5 into `i40* undef`, align 8), trunc to i40> t57:1, t60, undef:i64, undef:i64 But later visitANDLike transforms it to t25: i64 = Constant<1099511627775> t35: i64 = Constant<0> t0: ch = EntryToken t57: i64,ch = load<(load 4 from `i40* undef`, align 8), zext from i32> t0, undef:i64, undef:i64 t61: i32 = truncate t57 t63: i32 = srl t61, Constant:i8<1> t64: i32 = and t63, Constant:i32<524287> t65: i64 = zero_extend t64 t58: i64 = srl t57, Constant:i8<1> t60: i64 = and t58, Constant:i64<524287> t29: ch = store<(store 5 into `i40* undef`, align 8), trunc to i40> t57:1, t60, undef:i64, undef:i64 And it triggers CombineZExtLogicopShiftLoad again, causes a dead loop. Both forms should generate same instructions, CombineZExtLogicopShiftLoad generated IR looks cleaner. But it looks more difficult to prevent visitANDLike to do the transform, so I prevent CombineZExtLogicopShiftLoad to do the transform if the ZExt is free. Differential Revision: https://reviews.llvm.org/D57491 llvm-svn: 352792
* [DAG] Aggressively cleanup dangling node in CombineZExtLogicopShiftLoad.Nirav Dave2019-01-311-0/+4
| | | | | | | | | | | | | While dangling nodes will eventually be pruned when they are considered, leaving them disables combines requiring single-use. Reviewers: Carrot, spatel, craig.topper, RKSimon, efriedma Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D57520 llvm-svn: 352784
* [DAGCombiner] sub X, 0/1 --> add X, 0/-1Sanjay Patel2019-01-301-10/+22
| | | | | | | | | | This extends the existing transform for: add X, 0/1 --> sub X, 0/-1 ...to allow the sibling subtraction fold. This pattern could regress with the proposed change in D57401. llvm-svn: 352680
* [DAGCombiner] fold extract_subvector of extract_subvectorSanjay Patel2019-01-291-0/+13
| | | | | | | | | | | | | | | This is the sibling fold for insert-of-insert that was added with D56604. Now that we have x86 shuffle narrowing (D57156), this change shows improvements for lots of AVX512 reduction code (not sure that we would ever expect extract-of-extract otherwise). There's a small regression in some of the partial-permute tests (extracting followed by splat). That is tracked by PR40500: https://bugs.llvm.org/show_bug.cgi?id=40500 Differential Revision: https://reviews.llvm.org/D57336 llvm-svn: 352528
* [NFC] TLI query with default(on) behavior wrt DAG combines for fmin/fmax ↵Michael Berg2019-01-281-3/+7
| | | | | | target control llvm-svn: 352396
* [DAGCombine] Enable more pre-indexed storesSam Parker2019-01-231-1/+7
| | | | | | | | | | | The current check in CombineToPreIndexedLoadStore is too conversative, preventing a pre-indexed store when the base pointer is a predecessor of the value being stored. Instead, we should check the pointer operand of the store. Differential Revision: https://reviews.llvm.org/D56719 llvm-svn: 351933
* [DAGCombiner] narrow vector binop with 2 insert subvector operandsSanjay Patel2019-01-221-1/+24
| | | | | | | | | | | | | | | | | | vecbo (insertsubv undef, X, Z), (insertsubv undef, Y, Z) --> insertsubv VecC, (vecbo X, Y), Z This is another step in generic vector narrowing. It's also a step towards more horizontal op formation specifically for x86 (although we still failed to match those in the affected tests). The scalarization cases are also not optimal (we should be scalarizing those), but it's still an improvement to use a narrower vector op when we know part of the result must be constant because both inputs are undef in some vector lanes. I think a similar match but checking for a constant operand might help some of the cases in D51553. Differential Revision: https://reviews.llvm.org/D56875 llvm-svn: 351825
* [DAGCombiner] fix crash when converting build vector to shuffleSanjay Patel2019-01-211-5/+11
| | | | | | | | | | The regression test is reduced from the example shown in D56281. This does raise a question as noted in the test file: do we want to handle this pattern? I don't have a motivating example for that on x86 yet, but it seems like we could have that pattern there too, so we could avoid the back-and-forth using a shuffle. llvm-svn: 351753
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-191-4/+3
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* [SelectionDAG] Split very large token factors for chained stores to 64k chunks.Florian Hahn2019-01-181-1/+1
| | | | | | | | | | | | | | Similar to D55073. Without this change, the DAG combiner crashes on code with more than 64k of stores in a single basic block that form parallelizable chains. No test case, as it would be very IR file. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D56740 llvm-svn: 351571
* [DAGCombine] Fix ReduceLoadWidth for shifted offsetsSam Parker2019-01-161-12/+8
| | | | | | | | | | | | ReduceLoadWidth can trigger using a shifted mask is used and this requires that the function return a shl node to correct for the offset. However, the way that this was implemented meant that the returned result could be an existing node, which would be incorrect. This fixes the method of inserting the new node and replacing uses. Differential Revision: https://reviews.llvm.org/D50432 llvm-svn: 351310
* [DAGCombiner] reduce buildvec of zexted extracted element to shuffleSanjay Patel2019-01-151-0/+75
| | | | | | | | | | | | | | | The motivating case for this is shown in the first regression test. We are transferring to scalar and back rather than just zero-extending with 'vpmovzxdq'. That's a special-case for a more general pattern as shown here. In all tests, we're avoiding the vector-scalar-vector moves in favor of vector ops. We aren't producing optimal shuffle code in some cases though, so the patch is limited to reduce regressions. Differential Revision: https://reviews.llvm.org/D56281 llvm-svn: 351198
* [DAGCombiner] Add (sub_sat x, x) -> 0 combineSimon Pilgrim2019-01-141-0/+4
| | | | llvm-svn: 351073
* [DAGCombiner] Enable sub saturation constant foldingSimon Pilgrim2019-01-141-1/+6
| | | | llvm-svn: 351072
* [DAGCombiner] Add add/sub saturation undef handlingSimon Pilgrim2019-01-141-0/+8
| | | | | | | | Match ConstantFolding.cpp: (add_sat x, undef) -> -1 (sub_sat x, undef) -> 0 llvm-svn: 351070
* [DAGCombiner] Enable add saturation constant foldingSimon Pilgrim2019-01-141-2/+3
| | | | llvm-svn: 351060
OpenPOWER on IntegriCloud