summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG
Commit message (Collapse)AuthorAgeFilesLines
* [DAGCombiner] revert r295336Bill Seurer2017-02-221-19/+8
| | | | | | | | | | | r295336 causes a bootstrapped clang to fail for many compilations on powerpc BE. See http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/2315 for example. Reverting as per the developer's request. llvm-svn: 295849
* DAG: Check if extract_vector_elt is legal or customMatt Arsenault2017-02-211-1/+1
| | | | | | | Avoids test regressions in future AMDGPU commits when more vector types are custom lowered. llvm-svn: 295782
* Strip trailing whitespace.Simon Pilgrim2017-02-201-1/+1
| | | | llvm-svn: 295653
* [SelectionDAG] Add scalarization support for ISD::*_EXTEND_VECTOR_INREG opcodes.Simon Pilgrim2017-02-202-0/+34
| | | | | | Thanks to Mikael Holmén for the initial test case llvm-svn: 295652
* Remove redundant call to GluedNodes.back() [NFC]Artyom Skrobov2017-02-191-2/+1
| | | | llvm-svn: 295607
* [DAGCombiner] split i1 select-of-constants from non-i1 case; NFCISanjay Patel2017-02-171-9/+25
| | | | | | I can't find any tests of the non-i1 code path, so it may be unnecessary at this point. llvm-svn: 295463
* Fix signed/unsigned comparison warning.Simon Pilgrim2017-02-171-2/+2
| | | | llvm-svn: 295453
* [DAGCombine] Recognise any_extend_vector_inreg and truncation style shuffle ↵Simon Pilgrim2017-02-171-0/+125
| | | | | | | | | | | | | | masks During legalization we are often creating shuffles (via a build_vector scalarization stage) that are "any_extend_vector_inreg" style masks, and also other masks that are the equivalent of "truncate_vector_inreg" (if we had such a thing). This patch is an attempt to match these cases to help undo the effects of just leaving shuffle lowering to handle it - which typically means we lose track of the undefined elements of the shuffles resulting in an unnecessary extension+truncation stage for widened illegal types. The 2011-10-21-widen-cmp.ll regression will be fixed by making SIGN_EXTEND_VECTOR_IN_REG legal in SSE instead of lowering them to X86ISD::VSEXT (PR31712). Differential Revision: https://reviews.llvm.org/D29454 llvm-svn: 295451
* [DAGCombiner] improve readability; NFCISanjay Patel2017-02-171-44/+32
| | | | llvm-svn: 295447
* [DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combineArtur Pilipenko2017-02-161-8/+19
| | | | | | | | | | | | Resubmit -r295314 with PowerPC and AMDGPU tests updated. Support {a|s}ext, {a|z|s}ext load nodes as a part of load combine patters. Reviewed By: filcab Differential Revision: https://reviews.llvm.org/D29591 llvm-svn: 295336
* Rever -r295314 "[DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in ↵Artur Pilipenko2017-02-161-19/+8
| | | | | | | | load combine" This change causes some of AMDGPU and PowerPC tests to fail. llvm-svn: 295316
* [DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combineArtur Pilipenko2017-02-161-8/+19
| | | | | | | | | | Support {a|s}ext, {a|z|s}ext load nodes as a part of load combine patters. Reviewed By: filcab Differential Revision: https://reviews.llvm.org/D29591 llvm-svn: 295314
* DAG: Do not scalarize fsub if fneg is legalMatt Arsenault2017-02-151-0/+15
| | | | | | Tests will be included with future commit. llvm-svn: 295242
* [DAG] Don't try to create an INSERT_SUBVECTOR with an illegal sourceMichael Kuperstein2017-02-151-1/+7
| | | | | | | | | | | | We currently can't legalize those, but we should really not be creating them in the first place, since legalization would probably look similar to the way we legalize CONCAT_VECTORS - basically replace the INSERT with a BUILD. This fixes PR311956. Differential Revision: https://reviews.llvm.org/D29961 llvm-svn: 295213
* [SelectionDAGBuilder] Simplify creation of shufflevector DAG nodes where ↵Craig Topper2017-02-151-46/+24
| | | | | | | | | | | | | | | | | | | inputs are larger than the mask Summary: The current code loops over all elements to calculate a used range. Then a second short loop looks at the ranges and determines if they can be used in a extract and creates a properly aligned start index for the extract. This range finding is unnecessary, we can just calculate a properly aligned start index for an extract for each input during the first loop. If we don't find the same start index for each indice we can't use an extract. Reviewers: zvi, RKSimon Reviewed By: zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29926 llvm-svn: 295152
* [Tablegen] Instrumenting table gen DAGGenISelDAGAditya Nandakumar2017-02-141-0/+9
| | | | | | | | | | To help assist in debugging ISEL or to prioritize GlobalISel backend work, this patch adds two more tables to <Target>GenISelDAGISel.inc - one which contains the patterns that are used during selection and the other containing include source location of the patterns Enabled through CMake varialbe LLVM_ENABLE_DAGISEL_COV llvm-svn: 295081
* Removing a redundant assignmentArtyom Skrobov2017-02-141-1/+0
| | | | llvm-svn: 295055
* swiftcc: Don't emit tail calls from callers with swifterror parametersArnold Schwaighofer2017-02-131-0/+9
| | | | | | | | | Backends don't support this yet. They would have to move to the swifterror register before the tail call to make sure it is live-in to the call. rdar://30495920 llvm-svn: 294982
* [FastISel] Add a diagnostic to warm on fallback.Quentin Colombet2017-02-131-0/+13
| | | | | | | | This is consistent with what we do for GlobalISel. That way, it is easy to see whether or not FastISel is able to fully select a function. At some point we may want to switch that to an optimization remark. llvm-svn: 294970
* [DAGCombiner] Teach DAG combine that inserting an extract_subvector result ↵Craig Topper2017-02-131-0/+6
| | | | | | into the same location of a an undef vector can just use the original input to the extract. llvm-svn: 294932
* [DAGCombiner] Remove the half vector width check for the combine of ↵Craig Topper2017-02-121-4/+3
| | | | | | | | EXTRACT_SUBVECTOR from an INSERT_SUBVECTOR. This gives more parallelism opportunities for AVX-512 when dealing with 128-bit extracts from 512-bit vectors. llvm-svn: 294930
* [TargetLowering] fix SETCC SETLT folding with FP typesSanjay Patel2017-02-121-9/+13
| | | | | | | | | | | | The bug was introduced with: https://reviews.llvm.org/rL294863 ...and manifests as a selection failure in x86, but that's actually another bug. This fix prevents wrong codegen with -0.0, but in the more common case when we have NSZ and NNAN (-ffast-math), we should still be able to fold this setcc/compare. llvm-svn: 294924
* [DAGCombiner] Make the combine of INSERT_SUBVECTOR into a CONCAT_VECTOR more ↵Craig Topper2017-02-111-16/+9
| | | | | | generic to support larger concats. llvm-svn: 294875
* [TargetLowering] check for sign-bit comparisons in SimplifyDemandedBitsSanjay Patel2017-02-111-0/+19
| | | | | | | | | | | | | | | | I don't know if anything other than x86 vectors is affected by this change, but this may allow us to remove target-specific intrinsics for blendv* (vector selects). The simplification arises from the fact that blendv* instructions only use the sign-bit when deciding which vector element to choose for the destination vector. The mechanism to fold VSELECT into SHRUNKBLEND nodes already exists in x86 lowering; this demanded bits change just enables the transform to fire more often. The original motivation starts with a bug for DSE of masked stores that seems completely unrelated, but I've explained the likely steps in this series here: https://llvm.org/bugs/show_bug.cgi?id=11210 Differential Revision: https://reviews.llvm.org/D29687 llvm-svn: 294863
* [DAGCombine] Allow vector constant folding of any value type before type ↵Simon Pilgrim2017-02-102-1/+6
| | | | | | | | | | | | | | | | legalization The patch comes in 2 parts: 1 - it makes use of the SelectionDAG::NewNodesMustHaveLegalTypes flag to tell when it can safely constant fold illegal types. 2 - it correctly resets SelectionDAG::NewNodesMustHaveLegalTypes at the start of each call to SelectionDAGISel::CodeGenAndEmitDAG so all the pre-legalization stages can make use of it - not just the first basic block that gets handled. Fix for PR30760 Differential Revision: https://reviews.llvm.org/D29568 llvm-svn: 294749
* [SelectionDAG] Dump the DAG after legalizing vector ops and after the second ↵Craig Topper2017-02-101-0/+6
| | | | | | | | | | | | | | | | | | | | | type legalization Summary: With -debug, we aren't dumping the DAG after legalizing vector ops. In particular, on X86 with AVX1 only, we don't dump the DAG after we split 256-bit integer ops into pairs of 128-bit ADDs since this occurs during vector legalization. I'm only dumping if the legalize vector ops changes something since we don't print anything during legalize vector ops. So this dump shows up right after the first type-legalization dump happens. So if nothing changed this second dump is unnecessary. Having said that though, I think we should probably fix legalize vector ops to log what its doing. Reviewers: RKSimon, eli.friedman, spatel, arsenm, chandlerc Reviewed By: RKSimon Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D29554 llvm-svn: 294711
* [SelectionDAG] Fix bugs in inverted condition splitting code.Geoff Berry2017-02-091-4/+6
| | | | | | | | | | | | | | | | | | Summary: Fix two bugs in SelectionDAGBuilder::FindMergedConditions reported by Mikael Holmen. Handle non-canonicalized xor not operation correctly (was assuming operand 0 was always the non-constant operand) and check that the negated condition is also in the same block as the original and/or instruction (as is done for and/or operands already) before proceeding with optimization. Reviewers: bogner, MatzeB, qcolombet Subscribers: mcrosier, uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D29680 llvm-svn: 294605
* [DAGCombiner] Support non-zero offset in load combineArtur Pilipenko2017-02-091-3/+8
| | | | | | | | | | | | | | | Enable folding patterns which load the value from non-zero offset: i8 *a = ... i32 val = a[4] | (a[5] << 8) | (a[6] << 16) | (a[7] << 24) => i32 val = *((i32*)(a+4)) Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D29394 llvm-svn: 294582
* [DAGCombiner] NFC. Mark ByteProvider accessors as constArtur Pilipenko2017-02-081-2/+2
| | | | llvm-svn: 294494
* [DAGCombiner] Push truncate through adde when the carry isn't used.Amaury Sechet2017-02-081-0/+12
| | | | | | | | | | | | Summary: As per title. Reviewers: mkuper, spatel, bkramer, RKSimon, zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29528 llvm-svn: 294394
* [SDAGISel] Simplify some SDAGISel code, NFCReid Kleckner2017-02-072-81/+84
| | | | | | | | | | | | Hoist entry block code for arguments and swift error values out of the basic block instruction selection loop. Lowering arguments once up front seems much more readable than doing it conditionally inside the loop. It also makes it clear that argument lowering can update StaticAllocaMap because no instructions have been selected yet. Also use range-based for loops where possible. llvm-svn: 294329
* [TargetLowering] fix formatting and comments for ShrinkDemandedConstant; NFCSanjay Patel2017-02-071-19/+20
| | | | llvm-svn: 294325
* Revert "[DAGCombiner] (add X, (adde Y, 0, Carry)) -> (adde X, Y, Carry)"Daniel Jasper2017-02-071-6/+0
| | | | | | | | | | | | | This reverts commit r294186. On an internal test, this triggers an out-of-memory error on PPC, presumably because there is another dagcombine that does the exact opposite triggering and endless loop consuming more and more memory. Chandler has started at creating a reduced test case and we'll attach it as soon as possible. llvm-svn: 294288
* [DAGCombiner] Support bswap as a part of load combine patternsArtur Pilipenko2017-02-061-0/+3
| | | | | | | | Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D29397 llvm-svn: 294201
* Add ADDC to SelectionDAG::computeKnownBits and ComputeNumSignBits.Amaury Sechet2017-02-061-1/+3
| | | | | | | | | | | | Summary: As per title. Reviewers: bkramer, sunfish, lattner, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29521 llvm-svn: 294188
* [DAGCombiner] Make DAGCombiner smarter about overflowAmaury Sechet2017-02-062-20/+30
| | | | | | | | | | | | Summary: Leverage it to transform addc into add. Reviewers: mkuper, spatel, RKSimon, zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29524 llvm-svn: 294187
* [DAGCombiner] (add X, (adde Y, 0, Carry)) -> (adde X, Y, Carry)Amaury Sechet2017-02-061-0/+6
| | | | | | | | | | | | Summary: This is extracted from D29443 . Reviewers: mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29564 llvm-svn: 294186
* [X86][SSE] Combine shuffle nodes with multiple uses if all the users are ↵Simon Pilgrim2017-02-061-0/+15
| | | | | | | | | | | | | | being combined. Currently we only combine shuffle nodes if they have a single user to prevent us from causing code bloat by splitting the shuffles into several different combines. We don't take into account that in some cases we will already have combined all the users during recursively calling up the shuffle tree. This patch keeps a list of all the shuffle nodes that have been combined so far and permits combining of further shuffle nodes if all its users are in that list. Differential Revision: https://reviews.llvm.org/D29399 llvm-svn: 294183
* [SelectionDAG] In InstrEmitter, handle EXTRACT_SUBREG of a physical register.Geoff Berry2017-02-051-8/+23
| | | | | | | | | | | | | | | | | | | Summary: Without this change, the getVR() call would hit an assert since it was being passed a physical register. Update the AArch64/ldst-opt.ll test with a case that triggers this behavior by adding a run with strict-align, which causes an unaligned STR XZR instruction to be split into byte stores, creating an EXTRACT_SUBREG of XZR that triggers the original problem. Reviewers: bogner, qcolombet, MatzeB, atrick Subscribers: aemerson, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D29495 llvm-svn: 294129
* [DAGCombiner] Leverage add's commutativityAmaury Sechet2017-02-051-6/+14
| | | | | | | | | | | | Summary: This avoid the need to duplicate all pattern and actually end up exposing some opportunity to optimize existing pattern that did not exists in both directions on an existing test case. Reviewers: mkuper, spatel, bkramer, RKSimon, zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29541 llvm-svn: 294125
* [DAGCombiner] Canonicalize the order of a chain of INSERT_SUBVECTORs.Craig Topper2017-02-041-4/+24
| | | | | | Based on similar code for INSERT_VECTOR_ELT. llvm-svn: 294110
* [DAGCombiner] Use DAG.getAnyExtOrTrunc to simplify some code. NFCCraig Topper2017-02-041-5/+1
| | | | llvm-svn: 294109
* [DAGCombiner] In visitINSERT_VECTOR_ELT, move check for BUILD_VECTOR being ↵Craig Topper2017-02-041-4/+4
| | | | | | legal below code that just canonicalizes INSERT_VECTOR_ELT without creating BUILD_VECTORS. llvm-svn: 294108
* Formatting in DAGCombiner. NFCAmaury Sechet2017-02-041-0/+2
| | | | llvm-svn: 294091
* [CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; ↵Eugene Zelenko2017-02-041-26/+62
| | | | | | | | other minor fixes (NFC). This is preparation to reduce TargetInstrInfo.h dependencies. llvm-svn: 294084
* [TLI] Robustize SDAG LibFunc proto checking by merging it into TLI.Ahmed Bougacha2017-02-031-97/+53
| | | | | | | | | | | | | | | | | | | | | | | This re-applies commit r292189, reverted in r292191. SelectionDAGBuilder recognizes libfuncs using some homegrown parameter type-checking. Use TLI instead, removing another heap of redundant code. This isn't strictly NFC, as the SDAG code was too lax. Concretely, this means changes are required to a few tests: - calling a non-variadic function via a variadic prototype isn't OK; it just happens to work on x86_64 (but not on, e.g., aarch64). - mempcpy has a size_t parameter; the SDAG code accepts any integer type, which meant using i32 on x86_64 worked. - a handful of SystemZ tests check the SDAG support for lax prototype checking: Ulrich agrees on removing them. I don't think it's worth supporting any of these (IMO) invalid testcases. Instead, fix them to be more meaningful. llvm-svn: 294028
* [SelectionDAG] Fix for PR30775: Assertion `NodeToMatch->getOpcode() !=Alexey Bataev2017-02-031-8/+12
| | | | | | | | | | | | ISD::DELETED_NODE && "NodeToMatch was removed partway through selection"' failed. NodeToMatch can be modified during matching, but code does not handle this situation. Differential Revision: https://reviews.llvm.org/D29292 llvm-svn: 294003
* Revert "In visitSTORE, always use FindBetterChain, rather than only when ↵Nirav Dave2017-02-021-372/+368
| | | | | | | | | UseAA is enabled." This reverts commit r293893 which is miscompiling lua on ARM and bootstrapping for x86-windows. llvm-svn: 293915
* Use N0 instead of N->getOperand(0) in DagCombiner::visitAdd. NFCAmaury Sechet2017-02-021-1/+1
| | | | llvm-svn: 293903
* In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵Nirav Dave2017-02-021-368/+372
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | enabled. Recommiting after fixing X86 inc/dec chain bug. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly constructs wider loads, but then promotes them to float operations which appear but requires more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as some in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 293893
OpenPOWER on IntegriCloud