summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* [DAGCombine] Move load checks on store of loads into candidateNirav Dave2018-05-151-15/+13
| | | | | | | | | | search. NFCI. Migrate single-use and non-volatility, non-indexed requirements on stores of immediate store values to candidate collection pass from later stage. llvm-svn: 332392
* Rename DEBUG macro to LLVM_DEBUG.Nicola Zaghen2018-05-141-52/+32
| | | | | | | | | | | | | | | | The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240
* [DAGCombiner] Set the right SDLoc on extended SETCC uses (7/N)Vedant Kumar2018-05-111-19/+13
| | | | | | | | | | | | | | | | | | | | ExtendSetCCUses updates SETCC nodes which use a load (OriginalLoad) to reflect a simplification to the load (ExtLoad). Based on my reading, ExtendSetCCUses may create new nodes to extend a constant attached to a SETCC. It also creates fresh SETCC nodes which refer to any updated operands. ISTM that the location applied to the new constant and SETCC nodes should be the same as the location of the ExtLoad. This was suggested by Adrian in https://reviews.llvm.org/D45995. Part of: llvm.org/PR37262 Differential Revision: https://reviews.llvm.org/D46216 llvm-svn: 332119
* [DAGCombiner] Set the right SDLoc on a newly-created sextload (6/N)Vedant Kumar2018-05-111-38/+8
| | | | | | | | | | | | | | | | This teaches tryToFoldExtOfLoad to set the right location on a newly-created extload. With that in place, the logic for performing a certain ([s|z]ext (load ...)) combine becomes identical for sexts and zexts, and we can get rid of one copy of the logic. The test case churn is due to dependencies on IROrders inherited from the wrong SDLoc. Part of: llvm.org/PR37262 Differential Revision: https://reviews.llvm.org/D46158 llvm-svn: 332118
* [DAGCombiner] Factor out duplicated logic for an extload combine, NFC (5/N)Vedant Kumar2018-05-111-38/+52
| | | | | | | | | | | | | | | | Part of the logic for combining (zext (load ...)) and (sext (load ...)) is duplicated. This creates problems because bugs in one version have to be fixed again in the other version. To address this, as a first step, I've extracted the duplicate logic into a helper. I'll fix the debug location bug in the helper and eliminate the copy of its logic in a followup. Part of: llvm.org/PR37262 Differential Revision: https://reviews.llvm.org/D46157 llvm-svn: 332117
* [DAG] Avoid using deleted node in rebuildSetCCNirav Dave2018-05-101-8/+21
| | | | | | | | | | | | | | | | | Summary: The combine in rebuildSetCC may be combined to another node leaving our references stale. Keep a handle on it to avoid stale references. Fixes PR36602. Reviewers: dbabokin, RKSimon, eli.friedman, davide Subscribers: hiraditya, uabelho, JesperAntonsson, qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D46404 llvm-svn: 331985
* [DAGCombiner] In visitBITCAST when trying to constant fold the bitcast, only ↵Craig Topper2018-05-091-11/+11
| | | | | | | | | | call getBitcast if its an fp->int or int->fp conversion even when before legalize ops. Previously if !LegalOperations we would blindly call getBitcast and hope that getNode would constant fold it. But if the conversion is between a vector and a scalar, getNode has no simplification. This means we would just get back the original N. We would then return that N which would make the caller of visitBITCAST think that we used CombineTo and did our own worklist management. This prevents target specific optimizations from being called for vector/scalar bitcasts until after legal operations. llvm-svn: 331896
* [DAGCombine] Change store merge candidates check cut off to 1024.Amara Emerson2018-05-091-1/+1
| | | | | | | | | | | The previous value of 8192 resulted in severe compile time hits in some pathological cases. rdar://39781410 Differential Revision: https://reviews.llvm.org/D46581 llvm-svn: 331888
* [DAGCombiner] Masked merge: enhance handling of 'andn' with immediatesRoman Lebedev2018-05-071-4/+14
| | | | | | | | | | | | | | | | | | | | | Summary: Split off from D46031. The previous patch, D46493, completely disabled unfolding in case of immediates. But we can do better: {F6120274} {F6120277} https://rise4fun.com/Alive/xJS Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D46494 llvm-svn: 331685
* [DagCombiner] Not all 'andn''s work with immediates.Roman Lebedev2018-05-071-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Split off from D46031. In masked merge case, this degrades IPC by decreasing instruction count. {F6108777} The next patch should be able to recover and improve this. This also affects the transform @spatel have added in D27489 / rL289738, and the test coverage for X86 was missing. But after i have added it, and looked at the changes in MCA, i'm somewhat confused. {F6093591} {F6093592} {F6093593} I'd say this regression is an improvement, since `IPC` increased in that case? Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: andreadb, llvm-commits, spatel Differential Revision: https://reviews.llvm.org/D46493 llvm-svn: 331684
* [NFC][DAGCombine] unfoldMaskedMerge(): rename two variablesRoman Lebedev2018-05-061-4/+4
| | | | | | | The current names can be confused with the A and B sides of the canonical masked merge pattern. llvm-svn: 331609
* [DAGCombiner] Masked merge: don't touch "not" xor's.Roman Lebedev2018-05-051-0/+10
| | | | | | | | | | | | | | | | | | | | Summary: Split off form D46031. It seems we don't want to transform the pattern if the `xor`'s are actually `not`'s. In vector case, this breaks `andnpd` / `vandnps` patterns. That being said, we may want to re-visit this `not` handling, maybe in D46073. Reviewers: spatel, craig.topper, javed.absar Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46492 llvm-svn: 331595
* [NFC][DagCombiner] unfoldMaskedMerge(): improve readability.Roman Lebedev2018-05-051-4/+4
| | | | llvm-svn: 331588
* Fix a bunch of places where operator-> was used directly on the return from ↵Craig Topper2018-05-051-2/+2
| | | | | | | | | | dyn_cast. Inspired by r331508, I did a grep and found these. Mostly just change from dyn_cast to cast. Some cases also showed a dyn_cast result being converted to bool, so those I changed to isa. llvm-svn: 331577
* Fast Math Flag mapping into SDNodeMichael Berg2018-05-041-5/+5
| | | | | | | | | | | | | | Summary: Adding support for Fast flags in the SDNode to leverage fast math sub flag usage. Reviewers: spatel, arsenm, jbhateja, hfinkel, escha, qcolombet, echristo, wristow, javed.absar Reviewed By: spatel Subscribers: llvm-commits, rampitec, nhaehnle, tstellar, FarhanaAleen, nemanjai, javed.absar, jbhateja, hfinkel, wdng Differential Revision: https://reviews.llvm.org/D45710 llvm-svn: 331547
* [DAGCombiner] Fix SDLoc in a (zext (zextload x)) combine (4/N)Vedant Kumar2018-05-011-33/+35
| | | | | | | | | | | | | | | | The logic for this combine is almost identical to the logic for a (sext (sextload x)) combine. This commit factors out the logic so it can be shared by both combines, and corrects the SDLoc assigned in the zext version of the combine. Prior to this patch, for the given test case, we would apply the location associated with the udiv instruction to instructions which perform the load. Part of: llvm.org/PR37262 llvm-svn: 331303
* [DAGCombiner] Fix SDLoc in a (sext (sextload x)) combine (3/N)Vedant Kumar2018-05-011-3/+3
| | | | | | | | | | | | Prior to this patch, for the given test case, we would apply the location associated with the sdiv instruction to instructions which perform the load. Part of: llvm.org/PR37262. Differential Revision: https://reviews.llvm.org/D46222 llvm-svn: 331302
* [DAGCombiner] Change the SDLoc on split extloads (2/N)Vedant Kumar2018-05-011-1/+1
| | | | | | | | | | | | | | | | | | | | | In DAGCombiner, we try to simplify this pattern: ([s|z]ext (load ...)) Conceptually, a new extload which is created while splitting the load should have the same debug location as the load. Making this change affects the IROrder of the new load, causing some test case churn. In practice, the new location is never different from the location of the [s|z]ext, at least not during check-llvm or a stage2 build. Part of: llvm.org/PR37262 Differential Revision: https://reviews.llvm.org/D46156 llvm-svn: 331301
* [DAGCombiner] Set the right SDLoc on a newly-created zextload (1/N)Vedant Kumar2018-05-011-1/+1
| | | | | | | | | | | | | | | | | | | | | Setting the right SDLoc on a newly-created zextload fixes a line table bug which resulted in non-linear stepping behavior. Several backend tests contained CHECK lines which relied on the IROrder inherited from the wrong SDLoc. This patch breaks that dependence where feasbile and regenerates test cases where not. In some cases, changing a node's IROrder may alter register allocation and spill behavior. This can affect performance. I have chosen not to prevent this by applying a "known good" IROrder to SDLocs, as this may hide a more general bug in the scheduler, or cause regressions on other test inputs. rdar://33755881, Part of: llvm.org/PR37262 Differential Revision: https://reviews.llvm.org/D45995 llvm-svn: 331300
* Remove \brief commands from doxygen comments.Adrian Prantl2018-05-011-26/+26
| | | | | | | | | | | | | | | | We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272
* [DAGCombiner] rename function attribute for disabling ftrunc transformSanjay Patel2018-04-301-2/+2
| | | | | | | | | | This is the matching name change for the Clang patch at: D46236 rL331209 Differential Revision: https://reviews.llvm.org/D46237 llvm-svn: 331210
* [DAGCombiner] Fix a case of 1 in non-splat vector pow2 divisorHeejin Ahn2018-04-271-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: D42479 (rL329525) enabled SDIV combine for pow2 non-splat vector dividers. But when there is a 1 in a vector, the instruction sequence to be generated involves shifting a value by the number of its bit widths, which is undefined (https://github.com/llvm-mirror/llvm/blob/c64f4dbfe31e509f9c1092b951e524b056245af8/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L6000-L6006). Especially, in architectures that do not support vector instructions, each of element in a vector will be computed separately using scalar operations, and then the resulting value will be undef for '1' values in a vector. (All 1's vector is fine; only vectors mixed with 1 and others will be affected.) Reviewers: RKSimon, jgravelle-google Subscribers: jfb, dschuff, sbc100, jgravelle-google, llvm-commits Differential Revision: https://reviews.llvm.org/D46161 llvm-svn: 331092
* [DAGCombiner] limit ftrunc optimizations with function attributeSanjay Patel2018-04-261-0/+8
| | | | | | | | | | As noted, the attribute name is subject to change once we have the clang side implemented, but it's clear that we need some kind of attribute-based predication here based on the discussion for: rL330437 llvm-svn: 330951
* [DAGCombiner] refactor FP->int->FP folds; NFCSanjay Patel2018-04-261-16/+26
| | | | | | | | | | | | As discussed in the post-review comments for rL330437, we need to guard this fold to allow existing code to keep working with the undefined behavior that they've come to rely on. That would mean duplicating more code than we already have, so let's fix that first. llvm-svn: 330947
* [DAGCombiner][X86] When promoting loads don't use ZEXTLOAD even its legalCraig Topper2018-04-241-8/+4
| | | | | | | | | | | | We were previously prefering ZEXTLOAD over EXTLOAD if it is legal. This triggers during X86's promotion of i16->i32. Not sure about other targets. Using ZEXTLOAD can prevent folding it to SEXTLOAD later if we were to promote a sign extended operand like we would need for SRA. However, X86 doesn't currently promote i16 SRA. I was looking into doing that which is how I found this issue. This is also blocking our ability to fold 4 byte aligned EXTLOADs with "loadi32". This is what caused most of the test changes here. Differential Revision: https://reviews.llvm.org/D45585#inline-402825 llvm-svn: 330781
* [DAGCombiner] Unfold scalar masked merge if profitableRoman Lebedev2018-04-231-0/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is [[ https://bugs.llvm.org/show_bug.cgi?id=37104 | PR37104 ]]. [[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]] will introduce an IR canonicalization that is likely bad for the end assembly. Previously, `andl`+`andn`/`andps`+`andnps` / `bic`/`bsl` would be generated. (see `@out`) Now, they would no longer be generated (see `@in`). So we need to make sure that they are still generated. If the mask is constant, we do nothing. InstCombine should have unfolded it. Else, i use `hasAndNot()` TLI hook. For now, only handle scalars. https://rise4fun.com/Alive/bO6 ---- I *really* don't like the code i wrote in `DAGCombiner::unfoldMaskedMerge()`. It is super fragile. Is there something like IR Pattern Matchers for this? Reviewers: spatel, craig.topper, RKSimon, javed.absar Reviewed By: spatel Subscribers: andreadb, courbet, kristof.beyls, javed.absar, rengolin, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D45733 llvm-svn: 330646
* [DAGCombine] (float)((int) f) --> ftrunc (PR36617)Sanjay Patel2018-04-201-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was originally committed at rL328921 and reverted at rL329920 to investigate failures in Chrome. This time I've added to the ReleaseNotes to warn users of the potential of exposing UB and let me repeat that here for more exposure: Optimization of floating-point casts is improved. This may cause surprising results for code that is relying on undefined behavior. Code sanitizers can be used to detect affected patterns such as this: int main() { float x = 4294967296.0f; x = (float)((int)x); printf("junk in the ftrunc: %f\n", x); return 0; } $ clang -O1 ftrunc.c -fsanitize=undefined ; ./a.out ftrunc.c:5:15: runtime error: 4.29497e+09 is outside the range of representable values of type 'int' junk in the ftrunc: 0.000000 Original commit message: fptosi / fptoui round towards zero, and that's the same behavior as ISD::FTRUNC, so replace a pair of casts with the equivalent node. We don't have to account for special cases (NaN, INF) because out-of-range casts are undefined. Differential Revision: https://reviews.llvm.org/D44909 llvm-svn: 330437
* [DAGCombiner] Fix for oss-fuzz bugGerolf Hoflehner2018-04-171-1/+2
| | | | llvm-svn: 330178
* [DAGCombiner, PowerPC] allow X - (fpext(-Y) --> X + fpext(Y) with multiple usesSanjay Patel2018-04-151-6/+6
| | | | | | | | | | | | | This is a transform that I limited in instcombine in rL329821 because it was creating more instructions in IR when the cast has multiple uses. But if the cast is free, then we can do the transform regardless of other uses because it improves the potential throughput of the calculation by removing a dependency on the fneg. Differential Revision: https://reviews.llvm.org/D45598 llvm-svn: 330098
* [DAGCombiner] simplify code; NFCSanjay Patel2018-04-121-3/+2
| | | | llvm-svn: 329964
* revert r328921 - [DAGCombine] (float)((int) f) --> ftrunc (PR36617)Sanjay Patel2018-04-121-18/+0
| | | | | | | This change is exposing UB in source code - as was warned/predicted. :) See D44909 for discussion. Reverting while we figure out how to fix things. llvm-svn: 329920
* [DAGCombine] Improve ReduceLoad for SRLSam Parker2018-04-091-4/+34
| | | | | | | | | | | | | Recommitting r329283, third time lucky... If the SRL node is only used by an AND, we may be able to set the ExtVT to the width of the mask, making the AND redundant. To support this, another check has been added in isLegalNarrowLoad which queries whether the load is valid. Differential Revision: https://reviews.llvm.org/D41350 llvm-svn: 329551
* DAGCombiner: Combine SDIV with non-splat vector pow2 divisorZvi Rackover2018-04-081-28/+64
| | | | | | | | | | | | | | | | Summary: Extend existing SDIV combine for pow2 constant divider to handle non-splat vectors of pow2 constants. Reviewers: RKSimon, craig.topper, spatel, hfinkel, efriedma Reviewed By: RKSimon Subscribers: magabari, llvm-commits Differential Revision: https://reviews.llvm.org/D42479 llvm-svn: 329525
* [DAGCombiner] Fold (zext (and/or/xor (shl/shr (load x), cst), cst))Guozhi Wei2018-04-071-0/+78
| | | | | | | | | | | | | | In our real world application, we found the following optimization is missed in DAGCombiner (zext (and/or/xor (shl/shr (load x), cst), cst)) -> (and/or/xor (shl/shr (zextload x), (zext cst)), (zext cst)) If the user of original zext is an add, it may enable further lea optimization on x86. This patch add a new function CombineZExtLogicopShiftLoad to do this optimization. Differential Revision: https://reviews.llvm.org/D44402 llvm-svn: 329516
* [DAGCombiner] Add a combine to turn a build vector of zero extends of ↵Craig Topper2018-04-071-0/+52
| | | | | | extract vector elts into a vector zero extend and possibly an extract subvector. llvm-svn: 329509
* [CodeGen] Change std::sort to llvm::sort in response to r327219Mandeep Singh Grang2018-04-061-6/+6
| | | | | | | | | | | | | | | | | | | | | | Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: bogner, rnk, MatzeB, RKSimon Reviewed By: rnk Subscribers: JDevlieghere, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D45133 llvm-svn: 329435
* [DAGCombine] Revert r329160Sam Parker2018-04-051-26/+0
| | | | | | Again, broke the big endian stage 2 builders. llvm-svn: 329283
* [DAGCombine] Improve ReduceLoadWidth for SRLSam Parker2018-04-041-0/+26
| | | | | | | | | | | | | | | | | | Recommitting rL321259. Previosuly this caused an issue with PPCBE but I didn't receieve a reproducer and didn't have the time to follow up. If the issue appears again, please provide a reproducer so I can fix it. Original commit message: If the SRL node is only used by an AND, we may be able to set the ExtVT to the width of the mask, making the AND redundant. To support this, another check has been added in isLegalNarrowLoad which queries whether the load is valid. Differential Revision: https://reviews.llvm.org/D41350 llvm-svn: 329160
* [DAGCombine] (float)((int) f) --> ftrunc (PR36617)Sanjay Patel2018-03-311-0/+18
| | | | | | | | | | fptosi / fptoui round towards zero, and that's the same behavior as ISD::FTRUNC, so replace a pair of casts with the equivalent node. We don't have to account for special cases (NaN, INF) because out-of-range casts are undefined. Differential Revision: https://reviews.llvm.org/D44909 llvm-svn: 328921
* [SelectionDAG] Removing FABS folding from DAGCombinerSanjay Patel2018-03-301-18/+0
| | | | | | | | | | | | | | The code has bugs dealing with -0.0. Since D44550 introduced FABS pattern folding in InstCombine, this patch removes the now-redundant code that causes https://bugs.llvm.org/show_bug.cgi?id=36600. Patch by Mikhail Dvoretckii! Differential Revision: https://reviews.llvm.org/D44683 llvm-svn: 328872
* [IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to ↵Craig Topper2018-03-291-1/+1
| | | | | | | | | | | | CodeGen layer. Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it. The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly. Differential Revision: https://reviews.llvm.org/D45017 llvm-svn: 328806
* Fix layering by moving ValueTypes.h from CodeGen to IRDavid Blaikie2018-03-231-1/+1
| | | | | | ValueTypes.h is implemented in IR already. llvm-svn: 328397
* Fix layering of MachineValueType.h by moving it from CodeGen to SupportDavid Blaikie2018-03-231-1/+1
| | | | | | | | | This is used by llvm tblgen as well as by LLVM Targets, so the only common place is Support for now. (maybe we need another target for these sorts of things - but for now I'm at least making them correct & we can make them better if/when people have strong feelings) llvm-svn: 328395
* Revert "[DAGCombiner] Fold (zext (and/or/xor (shl/shr (load x), cst), cst))"Martin Storsjo2018-03-231-80/+0
| | | | | | | This reverts commit r328252. This change broke building a number of projects when targeting ARM and AArch64, see PR36873. llvm-svn: 328297
* [DAGCombiner] Fold (zext (and/or/xor (shl/shr (load x), cst), cst))Guozhi Wei2018-03-221-0/+80
| | | | | | | | | | | | | | In our real world application, we found the following optimization is missed in DAGCombiner (zext (and/or/xor (shl/shr (load x), cst), cst)) -> (and/or/xor (shl/shr (zextload x), (zext cst)), (zext cst)) If the user of original zext is an add, it may enable further lea optimization on x86. This patch add a new function CombineZExtLogicopShiftLoad to do this optimization. Differential Revision: https://reviews.llvm.org/D44402 llvm-svn: 328252
* [DAGCombiner] Fix type in comment. NFCCraig Topper2018-03-191-1/+1
| | | | llvm-svn: 327916
* [X86] Teach X86TargetLowering::targetShrinkDemandedConstant to set ↵Craig Topper2018-03-141-2/+16
| | | | | | | | | | | | non-demanded bits if it helps created an and mask that can be matched as a zero extend. I had to modify the bswap recognition to allow unshrunk masks to make this work. Fixes PR36689. Differential Revision: https://reviews.llvm.org/D44442 llvm-svn: 327530
* [DAGCombiner] Allow visitEXTRACT_SUBVECTOR to combine with BUILD_VECTORS ↵Craig Topper2018-03-131-1/+1
| | | | | | | | between LegalizeVectorOps and LegalizeDAG. BUILD_VECTORs aren't themselves legalized until LegalizeDAG so we should still be able to create an "illegal" one before that. This helps combine with BUILD_VECTORS that are introduced during LegalizeVectorOps due to unrolling. llvm-svn: 327446
* [DAGCombine] visitREM - Don't assume that one divrem isn't driving anotherSimon Pilgrim2018-03-131-3/+3
| | | | | | | | | | | Under some circumstances the divrems won't have been combined together before getting to this code. So replace the assertion with a if() guard to not expand to X-((X/C)*C) to give the other combine chance to happen. Reduced from OSS-Fuzz #6883 https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=6883 llvm-svn: 327424
* Make early exit hasPredecessorHelper return true. NFCI.Nirav Dave2018-03-091-5/+1
| | | | | | | All uses conservatively assume in early exit case that it will be a predecessor. Changing default removes checking code in all uses. llvm-svn: 327169
OpenPOWER on IntegriCloud