summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG
Commit message (Collapse)AuthorAgeFilesLines
...
* ScheduleDAG: Cleanup dumping code; NFCMatthias Braun2018-09-195-20/+32
| | | | | | | | | | | | - Instead of having both `SUnit::dump(ScheduleDAG*)` and `ScheduleDAG::dumpNode(ScheduleDAG*)`, just keep the latter around. - Add `ScheduleDAG::dump()` and avoid code duplication in several places. Implement it for different ScheduleDAG variants. - Add `ScheduleDAG::dumpNodeName()` in favor of the `SUnit::print()` functions. They were only ever used for debug dumping and putting the function into ScheduleDAG is consistent with the `dumpNode()` change. llvm-svn: 342520
* Revert "Revert r342183 "[DAGCombine] Fix crash when store merging created an ↵Amara Emerson2018-09-171-1/+10
| | | | | | | | extract_subvector with invalid index."" Fixed the assertion failure. llvm-svn: 342397
* [DAGCombiner] try to convert pow(x, 1/3) to cbrt(x)Sanjay Patel2018-09-163-1/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | This is a follow-up suggested in D51630 and originally proposed as an IR transform in D49040. Copying the motivational statement by @evandro from that patch: "This transformation helps some benchmarks in SPEC CPU2000 and CPU2006, such as 188.ammp, 447.dealII, 453.povray, and especially 300.twolf, as well as some proprietary benchmarks. Otherwise, no regressions on x86-64 or A64." I'm proposing to add only the minimum support for a DAG node here. Since we don't have an LLVM IR intrinsic for cbrt, and there are no other DAG ways to create a FCBRT node yet, I don't think we need to worry about DAG builder, legalization, a strict variant, etc. We should be able to expand as needed when adding more functionality/transforms. For reference, these are transform suggestions currently listed in SimplifyLibCalls.cpp: // * cbrt(expN(X)) -> expN(x/3) // * cbrt(sqrt(x)) -> pow(x,1/6) // * cbrt(cbrt(x)) -> pow(x,1/9) Also, given that we bail out on long double for now, there should not be any logical differences between platforms (unless there's some platform out there that has pow() but not cbrt()). Differential Revision: https://reviews.llvm.org/D51753 llvm-svn: 342348
* Revert r342183 "[DAGCombine] Fix crash when store merging created an ↵Reid Kleckner2018-09-141-8/+1
| | | | | | | | | extract_subvector with invalid index." Causes 'isVector() && "Invalid vector type!"' assertion when building Skia in Chrome. llvm-svn: 342265
* Fix debug info for SelectionDAG legalization of DAG nodes with two results.Adrian Prantl2018-09-141-1/+1
| | | | | | | | | | | | | | | | This patch fixes the debug info handling for SelectionDAG legalization of DAG nodes with two results. When an replaced SDNode has more than one result, transferDbgValues was always copying the SDDbgValue from the first result and attaching them to all members. In reality SelectionDAG::ReplaceAllUsesWith() is given an array of SDNodes (though the type signature doesn't make this obvious (cf. the call site code in ReplaceNode()). rdar://problem/44162227 Differential Revision: https://reviews.llvm.org/D52112 llvm-svn: 342264
* fix noasserts buildAdrian Prantl2018-09-141-0/+2
| | | | llvm-svn: 342247
* SelectionDAG: Add compact SDDbgValue representation to -dag-dump-verbose outputAdrian Prantl2018-09-142-0/+19
| | | | llvm-svn: 342245
* fix typosAdrian Prantl2018-09-141-1/+1
| | | | llvm-svn: 342241
* [DAGCombine] Fix crash when store merging created an extract_subvector with ↵Amara Emerson2018-09-131-1/+8
| | | | | | | | invalid index. Differential Revision: https://reviews.llvm.org/D51831 llvm-svn: 342183
* DAG: Fix expansion of unaligned FP loads and storesMatt Arsenault2018-09-131-4/+6
| | | | | | | | | This was trying to scalarizing a scalar FP type, resulting in an assert. Fixes unaligned f64 stack stores for AMDGPU. llvm-svn: 342132
* [DAGCombiner] improve formatting for select+setcc code; NFCSanjay Patel2018-09-121-17/+15
| | | | llvm-svn: 342095
* [SelectionDAG] Remove some code from PromoteIntOp_MGATHER that handles ↵Craig Topper2018-09-121-8/+1
| | | | | | | | UpdateNodeOperands returning an existing node instead of updating. I suspect this became unecessary when the CSE of mgather was fixed in r338080. It may still be possible to hit this if we widen the element type of a gather outside of type legalization and the promote the mask of a separate gather node so they become the same. But that seems pretty unlikely. llvm-svn: 342022
* DAG: Handle odd vector sizes in calling conv splittingMatt Arsenault2018-09-101-12/+17
| | | | | | | | | | | | | | This already worked if only one register piece was used, but didn't if a type was split into multiple, unequal sized pieces. Fixes not splitting 3i16/v3f16 into two registers for AMDGPU. This will also allow fixing the ABI for 16-bit vectors in a future commit so that it's the same for all subtargets. llvm-svn: 341801
* [SelectionDAG] enhance vector demanded elements to look at a vector select ↵Sanjay Patel2018-09-091-4/+12
| | | | | | | | | | | | | | | | condition operand This is the DAG equivalent of D51433. If we know we're not using all vector lanes, use that knowledge to potentially simplify a vselect condition. The reduction/horizontal tests show that we are eliminating AVX1 operations on the upper half of 256-bit vectors because we don't need those anyway. I'm not sure what the pr34592 test is showing. That's run with -O0; is SimplifyDemandedVectorElts supposed to be running there? Differential Revision: https://reviews.llvm.org/D51696 llvm-svn: 341762
* [DAGCombiner] foldBitcastedFPLogic - Add basic vector supportSimon Pilgrim2018-09-071-8/+8
| | | | | | | | Add support for bitcasts from float type to an integer type of the same element bitwidth. There maybe cases where we need to support different widths (e.g. as SSE __m128i is treated as v2i64) - but I haven't seen cases of this in the wild yet. llvm-svn: 341652
* [DAGCombiner] try to convert pow(x, 0.25) to sqrt(sqrt(x))Sanjay Patel2018-09-051-0/+41
| | | | | | | | | | | | | | | | | | | | | This was proposed as an IR transform in D49306, but it was not clearly justifiable as a canonicalization. Here, we only do the transform when the target tells us that sqrt can be lowered with inline code. This is the basic case. Some potential enhancements are in the TODO comments: 1. Generalize the transform for other exponents (allow more than 2 sqrt calcs if that's really cheaper). 2. If we have less fast-math-flags, generate code to avoid -0.0 and/or INF. 3. Allow the transform when optimizing/minimizing size (might require a target hook to get that right). Note that by default, x86 converts single-precision sqrt calcs into sqrt reciprocal estimate with refinement. That codegen is controlled by CPU attributes and can be manually overridden. We have plenty of test coverage for that already, so I didn't bother to include extra testing for that here. AArch uses its full-precision ops in all cases (not sure if that's the intended behavior or not, but that should also be covered by existing tests). Differential Revision: https://reviews.llvm.org/D51630 llvm-svn: 341481
* DAG: Factor out helper function for odd vector sizesMatt Arsenault2018-09-041-22/+28
| | | | llvm-svn: 341392
* [CodeGen] Fix remaining zext() assertions in SelectionDAGScott Linder2018-09-042-16/+14
| | | | | | | | Fix remaining cases not committed in https://reviews.llvm.org/D49574 Differential Revision: https://reviews.llvm.org/D50659 llvm-svn: 341380
* DAG: Handle extract_vector_elt in isKnownNeverNaNMatt Arsenault2018-09-031-0/+3
| | | | llvm-svn: 341317
* [DAGCombine] optimizeSetCCOfSignedTruncationCheck(): handle inverted patternRoman Lebedev2018-09-021-4/+18
| | | | | | | | | | | | | | | | | | | | | | Summary: A follow-up for D49266 / rL337166 + D49497 / rL338044. This is still the same pattern to check for the [lack of] signed truncation, but in this case the constants and the predicate are negated. https://rise4fun.com/Alive/BDV https://rise4fun.com/Alive/n7Z Reviewers: spatel, craig.topper, RKSimon, javed.absar, efriedma, dmgreen Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51532 llvm-svn: 341287
* [DAGCombiner] Fix bad identation. NFCCraig Topper2018-08-301-1/+1
| | | | llvm-svn: 341103
* [NFC] Rename the DivergenceAnalysis to LegacyDivergenceAnalysisNicolai Haehnle2018-08-302-2/+2
| | | | | | | | | | | | | | | | | | | | Summary: This is patch 1 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433). The purpose of this patch is to free up the name DivergenceAnalysis for the new generic implementation. The generic implementation class will be shared by specialized divergence analysis classes. Patch by: Simon Moll Reviewed By: nhaehnle Subscribers: jvesely, jholewinski, arsenm, nhaehnle, mgorny, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D50434 Change-Id: Ie8146b11be2c50d5312f30e11c7a3036a15b48cb llvm-svn: 341071
* DAG: Don't use ABI copies in some contextsMatt Arsenault2018-08-301-2/+3
| | | | | | | | | | | | | If an ABI-like value is used in a different block, the type split used is not necessarily the same as the call's ABI. The value is used through an intermediate copy virtual registers from the other block. This resulted in copies with inconsistent sizes later. Fixes regressions since r338197 when AMDGPU started splitting vector types for calls. llvm-svn: 341018
* [DAGCombiner] Add X / X -> 1 & X % X -> 0 foldsSimon Pilgrim2018-08-291-1/+18
| | | | | | | | Adds more divrem folds to try and get in sync with InstructionSimplify Differential Revision: https://reviews.llvm.org/D50636 llvm-svn: 340919
* [X86] Support v2i32 gather/scatter indices with ↵Craig Topper2018-08-293-21/+46
| | | | | | | | | | | | | | | | -x86-experimental-vector-widening-legalization Summary: This is split out from D41062 to cover the code in LegalVectorTypes.cpp Reviewers: RKSimon, spatel, efriedma Reviewed By: efriedma Subscribers: sdardis, jvesely, nhaehnle, jrtc27, atanasyan, llvm-commits Differential Revision: https://reviews.llvm.org/D51337 llvm-svn: 340891
* [DAGCombine] Rework MERGE_VALUES to inline in single pass. NFCI.Nirav Dave2018-08-281-1/+4
| | | | | | | Avoid hyperlinear cost of inlining MERGE_VALUE node by constructing temporary vector and doing a single replacement. llvm-svn: 340853
* [DAG] Avoid recomputing Divergence checks. NFCI.Nirav Dave2018-08-281-6/+10
| | | | | | | When making multiple updates to the same SDNode, recompute node divergence only once after all changes have been made. llvm-svn: 340852
* [DAG] Fix updateDivergence calculationNirav Dave2018-08-281-1/+1
| | | | | | | Check correct SDNode when deciding if we should update the divergence property. llvm-svn: 340851
* [DAGCombiner][AMDGPU][Mips] Fold bitcast with volatile loads if the ↵Craig Topper2018-08-281-3/+12
| | | | | | | | | | | | | | | | | | | resulting load is legal for the target. Summary: I'm not sure if this patch is correct or if it needs more qualifying somehow. Bitcast shouldn't change the size of the load so it should be ok? We already do something similar for stores. We'll change the type of a volatile store if the resulting store is Legal or Custom. I'm not sure we should be allowing Custom there... I was playing around with converting X86 atomic loads/stores(except seq_cst) into regular volatile loads and stores during lowering. This would allow some special RMW isel patterns in X86InstrCompiler.td to be removed. But there's some floating point patterns in there that didn't work because we don't fold (f64 (bitconvert (i64 volatile load))) or (f32 (bitconvert (i32 volatile load))). Reviewers: efriedma, atanasyan, arsenm Reviewed By: efriedma Subscribers: jvesely, arsenm, sdardis, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, arichardson, jrtc27, atanasyan, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D50491 llvm-svn: 340797
* DAG: Check transformed type for forming fminnum/fmaxnum from vselectMatt Arsenault2018-08-271-2/+3
| | | | | | Follow up to r340655 to fix vector types which are split. llvm-svn: 340766
* [SelectionDAG] add helper query for binops; NFCSanjay Patel2018-08-271-11/+2
| | | | | | We will also use this in a planned enhancement for vector insertelement. llvm-svn: 340741
* [SelectionDAG][x86] turn insertelement into undef with variable index into splatSanjay Patel2018-08-261-3/+10
| | | | | | | | | | | | | | | | | | I noticed this along with the patterns in D51125, but when the index is variable, we don't convert insertelement into a build_vector. For x86, that means these get expanded at legalization time into the loading/spilling code that we see in the tests. I think it's always better to avoid going to memory on these, and we get the optimal 'broadcast' if it's available. I suspect other targets may want to look at enabling the hook. AArch64 and AMDGPU have regression tests that would be affected (although I did not check what would happen in those cases). In the most basic cases shown here, AArch64 would probably do much better with a splat. Differential Revision: https://reviews.llvm.org/D51186 llvm-svn: 340705
* [IR] Replace `isa<TerminatorInst>` with `isTerminator()`.Chandler Carruth2018-08-263-6/+6
| | | | | | | | | | | | This is a bit awkward in a handful of places where we didn't even have an instruction and now we have to see if we can build one. But on the whole, this seems like a win and at worst a reasonable cost for removing `TerminatorInst`. All of this is part of the removal of `TerminatorInst` from the `Instruction` type hierarchy. llvm-svn: 340701
* [SelectionDAG][X86] Reorder the operands the MaskedStoreSDNode to put the ↵Craig Topper2018-08-253-35/+18
| | | | | | | | | | | | | | | | | | | | | value first. Summary: Previously the value being stored is the last operand in SDNode. This causes the type legalizer to visit the mask operand before the value operand. The type legalizer was more complicated because of this since we want the type of the value to drive the decisions. This patch moves the value to be the first operand so we visit it first during type legalization. It also simplifies the type legalization code accordingly. X86 is currently the only in tree target that uses this SDNode. Not sure if there are any users out of tree. Reviewers: RKSimon, delena, hfinkel, eli.friedman Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50402 llvm-svn: 340689
* DAG: Allow matching fminnum/fmaxnum from vselectMatt Arsenault2018-08-241-8/+27
| | | | llvm-svn: 340655
* [DAGCombiner][Mips] Don't combine bitcast+store after LegalOperations when ↵Craig Topper2018-08-241-1/+1
| | | | | | | | | | | | | | the store is volatile, if the resulting store isn't Legal Previously we allowed the store to be Custom. But without knowing for sure that the Custom handling won't split the store, we shouldn't convert a volatile store. We also probably shouldn't be creating a store the requires custom handling after LegalizeOps. This could lead to an infinite loop if the custom handling was to insert a bitcast. Though I guess isStoreBitCastBeneficial could be used to block such a loop. The test changes here are due to the volatile part of this. The stores in the test are all volatile and i32 stores are marked custom, So we are no longer converting them This is related to D50491 where I was trying to allow some bitcasting of volatile loads Differential Revision: https://reviews.llvm.org/D50578 llvm-svn: 340626
* [SDAG] Add versions of computeKnownBits that return a valueJustin Bogner2018-08-241-93/+81
| | | | | | | | | | | Having the KnownBits as an output parameter is kind of awkward to use and a holdover from when it was two separate APInts. Instead, just return a KnownBits object. I'm leaving the existing interface in place for now, since updating the callers all at once would be thousands of lines of diff. llvm-svn: 340594
* [SelectionDAG] unroll unsupported vector FP ops earlier to avoid libcalls on ↵Sanjay Patel2018-08-221-7/+26
| | | | | | | | | | | | | | | | undef elements (PR38527) This solves the motivating case from: https://bugs.llvm.org/show_bug.cgi?id=38527 If we are legalizing an FP vector op that maps to 1 of the LLVM intrinsics that mimic libm calls, but we're going to end up with scalar libcalls for that vector type anyway, then we should unroll the vector op into scalars before widening. This avoids libcalls because we've lost the knowledge that some of the scalar elements are undef. Differential Revision: https://reviews.llvm.org/D50791 llvm-svn: 340469
* [ARM] Lower llvm.ctlz.i32 to a libcall when clz is not available.Eli Friedman2018-08-221-0/+15
| | | | | | | | | | The inline sequence is very long (about 70 bytes on Thumb1), so it's not really a good idea to inline it, especially when optimizing for size. Differential Revision: https://reviews.llvm.org/D47917 llvm-svn: 340458
* [WebAssembly] Don't make wasm cleanuppads into funclet entriesHeejin Ahn2018-08-211-3/+8
| | | | | | | | | | | | | | | Summary: Catchpads and cleanuppads are not funclet entries; they are only EH scope entries. We already dont't set `isEHFuncletEntry` for catchpads. This patch does the same thing for cleanuppads. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D50654 llvm-svn: 340330
* [DAGCombiner] Reduce load widths of shifted masksSam Parker2018-08-211-8/+41
| | | | | | | | | | | During combining, ReduceLoadWdith is used to combine AND nodes that mask loads into narrow loads. This patch allows the mask to be a shifted constant. This results in a narrow load which is then left shifted to compensate for the new offset. Differential Revision: https://reviews.llvm.org/D50432 llvm-svn: 340261
* [TargetLowering] Add BuildSDiv support for division by one or negone.Simon Pilgrim2018-08-211-15/+27
| | | | | | This reduces most of the sdiv stages (the MULHS, shifts etc.) to just zero/identity values and use the numerator scale factor to multiply by +1/-1. llvm-svn: 340260
* [FPEnv] Support constrained FREM intrinsicCameron McInally2018-08-203-0/+7
| | | | | | Differential Revision: https://reviews.llvm.org/D50975 llvm-svn: 340201
* [TargetLowering] Disable BuildSDiv division by one or negone.Simon Pilgrim2018-08-201-1/+2
| | | | | | Fuzz tests have detected an issue, currently working on a fix. llvm-svn: 340195
* [SelectionDAG] Reuse the Op's VT. NFCI.Simon Pilgrim2018-08-201-2/+2
| | | | llvm-svn: 340173
* [SelectionDAG] Add partial sign-bit support to ComputeNumSignBits for ↵Simon Pilgrim2018-08-201-2/+16
| | | | | | | | | | BITCAST nodes Only adds support to the existing 'large element' scalar/vector to 'small element' vector bitcasts. Handle the case where the sign bit extends to only part of the small elements. llvm-svn: 340169
* [SelectionDAG] Add basic demanded elements support to ComputeNumSignBits for ↵Simon Pilgrim2018-08-191-1/+8
| | | | | | | | | | BITCAST nodes Only adds support to the existing 'large element' scalar/vector to 'small element' vector bitcasts. The next step would be to support cases where the large elements aren't all sign bits, and determine the small element equivalent based on the demanded elements. llvm-svn: 340143
* [DebugInfo] In FastISel, convert llvm.dbg.label to DBG_LABEL MI.Hsiangkai Wang2018-08-181-0/+12
| | | | | | | | Convert llvm.dbg.label(!label_metadata) to DBG_LABEL !label_metadata. Differential Revision: https://reviews.llvm.org/D50622 llvm-svn: 340122
* [DAGCombiner] Allow divide by constant optimization on opaque constants.Craig Topper2018-08-181-2/+2
| | | | | | | | | | | | | | | | | Summary: I believe this restores the behavior we had before r339147. Fixes PR38622. Reviewers: RKSimon, chandlerc, spatel Reviewed By: chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50936 llvm-svn: 340120
* DAG: Fix isKnownNeverNaN for basic non-sNaN casesMatt Arsenault2018-08-171-12/+7
| | | | | | | fadd/fsub/fmul need to worry about infinities as well as fdiv. llvm-svn: 340085
OpenPOWER on IntegriCloud