path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
...
* [ModuloSchedule] Fix no-asserts build | James Molloy | 2019-09-04 | 1 | -5/+8
    Apologies, due to a git SNAFU this fix (dump doesn't exist and silence unused variables) stayed in my index rather than applying to rL370893.
    llvm-svn: 370894
* [ModuloSchedule] Introduce PeelingModuloScheduleExpander | James Molloy | 2019-09-04 | 2 | -6/+478
    This is the beginning of a reimplementation of ModuloScheduleExpander. It works by generating a single-block correct pipelined kernel and then peeling out the prolog and epilogs.
    This patch implements kernel generation as well as a validator that will confirm the number of phis added is the same as with the ModuloScheduleExpander.
    Prolog and epilog peeling will come in a different patch.
    Differential Revision: https://reviews.llvm.org/D67081
    llvm-svn: 370893
* [DebugInfo] LiveDebugValues: locations with different exprs should not be merged | Jeremy Morse | 2019-09-04 | 1 | -7/+17
    When comparing variable locations, LiveDebugValues currently considers only the machine location, ignoring any DIExpression applied to it. This is a problem because that DIExpression can do pretty much anything to the machine location, for example dereferencing it.
    This patch adds DIExpressions to that comparison; now variables based on the same register/memory-location but with different expressions will compare differently, and be dropped if we attempt to merge them between blocks. This reduces variable coverage-range a little, but only because we were producing broken locations.
    Differential Revision: https://reviews.llvm.org/D66942
    llvm-svn: 370877
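The comparison described above can be pictured with a small sketch. This is not the LiveDebugValues code: `VarLocSketch`, its fields, and `canMerge` are hypothetical names chosen for illustration, and the expression is modelled as a plain list of opcodes rather than a DIExpression.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a variable location: a machine register plus the
// raw opcodes of the expression applied to it (empty means "use as-is").
struct VarLocSketch {
  unsigned Reg;
  std::vector<uint64_t> ExprOps;

  // Two locations agree only if both the register and the expression match.
  bool operator==(const VarLocSketch &Other) const {
    return Reg == Other.Reg && ExprOps == Other.ExprOps;
  }
};

// Returns true when the locations from two predecessor blocks may be merged.
static bool canMerge(const VarLocSketch &A, const VarLocSketch &B) {
  return A == B;
}
```

Under this model, a location in register 5 used directly and a location in register 5 that is dereferenced compare unequal, so the variable would be dropped at the join rather than given a wrong location.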
* [LiveDebugValues][NFC] Silence an unused variable warning | Jeremy Morse | 2019-09-04 | 1 | -0/+1
    On release builds, 'MI' isn't used by anything (it's already inserted into a block by BuildMI), while on non-release builds it's used by an LLVM_DEBUG statement. Mark as explicitly used to avoid the warning.
    llvm-svn: 370870
* [GlobalISel] Fix G_SEXT narrowScalar to bail out of unsupported type combination. | Amara Emerson | 2019-09-04 | 1 | -3/+7
    Similar to the issue with G_ZEXT that was fixed earlier, this is a quick fix to fall back if the source type is not exactly half of the dest type.
    Fixes the clang-cmake-aarch64-lld bot build.
    llvm-svn: 370847
* [AArch64][GlobalISel] Legalize 128 bit divisions to libcalls. | Amara Emerson | 2019-09-03 | 1 | -4/+22
    Now that we have the infrastructure to support s128 types as parameters we can expand these to libcalls.
    Differential Revision: https://reviews.llvm.org/D66185
    llvm-svn: 370823
* [GlobalISel][CallLowering] Add support for splitting types according to calling conventions. | Amara Emerson | 2019-09-03 | 3 | -37/+157
    On AArch64, s128 types have to be split into s64 GPRs when passed as arguments. This change adds the generic support in call lowering for dealing with multiple registers, for incoming and outgoing args.
    Support for splitting return types is not yet implemented.
    Differential Revision: https://reviews.llvm.org/D66180
    llvm-svn: 370822
* [CodeGen] Use FSHR in DAGTypeLegalizer::ExpandIntRes_MULFIX | Bjorn Pettersson | 2019-09-03 | 1 | -49/+19
    Summary: Simplify the right shift of the intermediate result (given in four parts) by using funnel shift.
    There is some impact on lit tests, but that seems to be related to register allocation differences due to how FSHR is expanded on X86 (giving a slightly different operand order for the OR operations compared to the old code).
    Reviewers: leonardchan, RKSimon, spatel, lebedev.ri
    Reviewed By: RKSimon
    Subscribers: hiraditya, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, s.egerton, pzheng, bevinh, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D67036
    llvm-svn: 370813
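To make the funnel-shift idea concrete, here is a minimal sketch that shifts a wide value held in four 32-bit parts right by a small constant and keeps the low two parts. It is not the code from the patch: `fshr32` and `shiftRightLow64` are hypothetical helpers, the 32-bit part width is an assumption, and only shift amounts below the part width are handled.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: funnel shift right. Conceptually concatenates hi:lo
// into a 64-bit value, shifts it right by amt (taken modulo 32), and returns
// the low 32 bits.
static uint32_t fshr32(uint32_t hi, uint32_t lo, unsigned amt) {
  amt %= 32;
  if (amt == 0)
    return lo; // Avoids the undefined shift by 32 below.
  return (hi << (32 - amt)) | (lo >> amt);
}

// Shift a value stored as four 32-bit parts (parts[0] is least significant)
// right by amt, where 0 <= amt < 32, and return the low two parts of the
// result. Each output part needs exactly one funnel shift.
static std::array<uint32_t, 2>
shiftRightLow64(const std::array<uint32_t, 4> &parts, unsigned amt) {
  std::array<uint32_t, 2> out = {fshr32(parts[1], parts[0], amt),
                                 fshr32(parts[2], parts[1], amt)};
  return out;
}
```

Without a funnel shift, each output part would need a left shift, a right shift, and an OR written out separately, which is the kind of code the commit says it simplifies.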
* [MachinePipeliner] Add a way to unit-test the schedule emitter | James Molloy | 2019-09-03 | 3 | -0/+128
    Emitting a schedule is really hard. There are lots of corner cases to take care of; in fact, of the 60+ SWP-specific testcases in the Hexagon backend most of those are testing codegen rather than the schedule creation itself.
    One issue is that to test an emission corner case we must craft an input such that the generated schedule uses that corner case; sometimes this is very hard and convolutes testcases. Other times it is impossible but we want to test it anyway.
    This patch adds a simple test pass that will consume a module containing a loop and generate pipelined code from it. We use post-instr-symbols as a way to annotate instructions with the stage and cycle that we want to schedule them at.
    We also provide a flag that causes the MachinePipeliner to generate these annotations instead of actually emitting code; this allows us to generate an input testcase with:
        llc < %s -stop-after=pipeliner -pipeliner-annotate-for-testing -o test.mir
    And run the emission in isolation with:
        llc < test.mir -run-pass=modulo-schedule-test
    llvm-svn: 370705
* [LegalizeDAG] Pass DAG to two calls to SDNode::dump in debug prints so that they will print target specific nodes correctly. | Craig Topper | 2019-09-03 | 1 | -2/+2
    The dump methods can only print target node names correctly if they can get access to the TLI object.
    llvm-svn: 370694
* [TargetLowering][PS4] Add sincos(f) lib functions when target is PS4 | Robert Lougher | 2019-09-02 | 1 | -0/+5
    PS4 supports sincosf and sincos. Adding the library functions enables the sin(f)+cos(f) -> sincos(f) optimization.
    Differential Revision: https://reviews.llvm.org/D67009
    llvm-svn: 370675
* [DAGCombiner] try to form test+set out of shift+mask patterns | Sanjay Patel | 2019-09-02 | 1 | -0/+57
    The motivating bugs are:
        https://bugs.llvm.org/show_bug.cgi?id=41340
        https://bugs.llvm.org/show_bug.cgi?id=42697
    As discussed there, we could view this as a failure of IR canonicalization, but then we would need to implement a backend fixup with target overrides to get this right in all cases. Instead, we can just view this as a codegen opportunity. It's not even clear for x86 exactly when we should favor test+set; some CPUs have better theoretical throughput for the ALU ops than bt/test.
    This patch is made more complicated than I expected because there's an early DAGCombine for 'and' that can change types of the intermediate ops via trunc+anyext.
    Differential Revision: https://reviews.llvm.org/D66687
    llvm-svn: 370668
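The commit message does not spell out the matched patterns, so the following is only an illustration of the kind of equivalence involved, assuming a 32-bit value and a bit index below 32: extracting one bit of an inverted value with shift+mask gives the same answer as testing that bit against a mask and comparing with zero. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Shift+mask form: invert x, shift bit c down to position 0, mask it out.
// Assumes c < 32.
static bool bitClearViaShiftMask(uint32_t x, unsigned c) {
  return ((~x >> c) & 1u) != 0;
}

// Test+set form: AND x with a single-bit mask and set the result from the
// comparison with zero. Assumes c < 32.
static bool bitClearViaTest(uint32_t x, unsigned c) {
  return (x & (1u << c)) == 0;
}
```

Both functions return true exactly when bit `c` of `x` is clear; which form is cheaper depends on the target, as the commit message notes for x86.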
* [DebugInfo] LiveDebugValues: correctly discriminate kinds of variable locations | Jeremy Morse | 2019-09-02 | 2 | -4/+20
    The missing line added by this patch ensures that only spilt variable locations are candidates for being restored from the stack. Otherwise, register or constant-value information can be interpreted as a spill location, through a union.
    The added regression test replicates a scenario where this occurs: the stack load from [rsp] causes the register-location DBG_VALUE to be "restored" to rsi, when it should be left alone. See PR43058 for details.
    Un-XFAIL a test that was suffering from this from a previous patch.
    Differential Revision: https://reviews.llvm.org/D66895
    llvm-svn: 370648
* [AArch64][GlobalISel] Fix zext narrowScalar to use the right type when creating the merges. | Amara Emerson | 2019-09-02 | 1 | -3/+5
    Fixes PR43171.
    llvm-svn: 370627
* [DAGCombiner] improve throughput of shift+logic+shift | Sanjay Patel | 2019-09-01 | 1 | -0/+74
    The motivating case for this is a long way from here:
        https://bugs.llvm.org/show_bug.cgi?id=43146
    ...but I think this is where we have to start.
    We need to canonicalize/optimize sequences of shift and logic to ease pattern matching for things like bswap and improve perf in general. But without the artificial limit of '!LegalTypes' (early combining), there are a lot of test diffs, and not all are good.
    In the minimal tests added for this proposal, x86 should have better throughput in all cases. AArch64 is neutral for scalar tests because it can fold shifts into bitwise logic ops.
    There are 3 shift opcodes and 3 logic opcodes for a total of 9 possible patterns:
        https://rise4fun.com/Alive/VlI
        https://rise4fun.com/Alive/n1m
        https://rise4fun.com/Alive/1Vn
    Differential Revision: https://reviews.llvm.org/D67021
    llvm-svn: 370617
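One of the nine shift/logic combinations can be sketched as follows. This is an assumption about the shape of the rewrite (a shift of a logic op whose operand is itself a shift by constant), shown here for left shift and XOR on 32-bit values with hypothetical function names; see D67021 for the actual conditions.

```cpp
#include <cassert>
#include <cstdint>

// Original sequence: shift, logic, shift. The second shift has to wait for
// the XOR, which has to wait for the first shift. Assumes c0 + c1 < 32.
static uint32_t shiftLogicShift(uint32_t x, uint32_t y, unsigned c0,
                                unsigned c1) {
  return ((x << c0) ^ y) << c1;
}

// Rewritten sequence: the two shifts no longer depend on each other, so they
// can execute in parallel before the XOR. Assumes c0 + c1 < 32.
static uint32_t reassociated(uint32_t x, uint32_t y, unsigned c0,
                             unsigned c1) {
  return (x << (c0 + c1)) ^ (y << c1);
}
```

The instruction count is unchanged, but the dependency chain shrinks from three operations to two, which is one way to read the throughput claim in the commit message.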
* [TargetLowering] Fix Bugzilla ID 43183 to avoid soften comparison broken with constant inputs | Shiva Chen | 2019-09-01 | 2 | -45/+88
    Summary: This fixes Bugzilla ID 43183, which was triggered by the following commit:
    [RISCV] Avoid generating AssertZext for LP64 ABI when lowering floating LibCall
    llvm-svn: 370604
* [DAGCombiner] clean up code in visitShiftByConstant() | Sanjay Patel | 2019-08-31 | 1 | -25/+20
    This is not quite NFC because the SDLoc propagation is changed, but there are no regression test diffs from that.
    llvm-svn: 370587
* [DAGCombiner] Match (add X, X) as (shl X, 1) when detecting rotate. | Amaury Sechet | 2019-08-31 | 1 | -4/+20
    Summary: The combiner transforms (shl X, 1) into (add X, X).
    Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri
    Subscribers: llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D66882
    llvm-svn: 370578
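A short sketch of why the extra match is needed, assuming 32-bit unsigned values and hypothetical function names: once (shl X, 1) has been rewritten to (add X, X), a rotate-left-by-one shows up as an OR of an add and a right shift, and the rotate matcher must still recognise it.

```cpp
#include <cassert>
#include <cstdint>

// Rotate left by one in the form a rotate matcher traditionally expects:
// (or (shl x, 1), (srl x, 31)).
static uint32_t rotl1ViaShl(uint32_t x) {
  return (x << 1) | (x >> 31);
}

// The same rotate after (shl x, 1) has been turned into (add x, x).
// Unsigned addition wraps modulo 2^32, so x + x equals x << 1.
static uint32_t rotl1ViaAdd(uint32_t x) {
  return (x + x) | (x >> 31);
}
```

Both functions compute the same value for every input, so treating the add as a shift by one loses nothing.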
* [DAGCombiner] Don't create illegal narrow stores | James Molloy | 2019-08-31 | 1 | -2/+7
    Narrowing stores when the target doesn't support the narrow version forces the target to expand into a load-modify-store sequence, which is highly suboptimal. The information narrowing throws away (legality of the inverse transform) is hard to re-analyze.
    If the target doesn't support a store of the narrow type, don't narrow even in pre-legalize mode.
    No test as this is DAGCombiner and depends on target bits.
    llvm-svn: 370576
* [CodeGen] Refactor DAGTypeLegalizer::ExpandIntRes_MULFIX. NFC | Bjorn Pettersson | 2019-08-31 | 1 | -87/+92
    Restructured the code a little bit in preparation for adding UMULFIXSAT. I think it will be easier to understand the code if not interleaving the codegen for signed/unsigned/saturated cases that much.
    llvm-svn: 370569
* [MachinePipeliner] Separate schedule emission, NFC | James Molloy | 2019-08-30 | 3 | -1125/+1236
    This is the first stage in refactoring the pipeliner and making it more accessible for backends to override and control. This separates the logic and state required to *emit* a schedule from the logic that *computes* and validates a schedule.
    This will enable (a) new schedule emitters and (b) new modulo scheduling implementations to coexist.
    NFC.
    Differential Revision: https://reviews.llvm.org/D67006
    llvm-svn: 370500
* [DAGCombine] ReduceLoadWidth - remove duplicate SDLoc. NFCI. | Simon Pilgrim | 2019-08-30 | 1 | -3/+2
    SDLoc(N0) and SDLoc(cast<LoadSDNode>(N0)) should be equivalent.
    llvm-svn: 370498
* [TargetLowering] SimplifyDemandedBits ADD/SUB/MUL - correctly inherit SDNodeFlags from the original node. | Simon Pilgrim | 2019-08-30 | 1 | -4/+2
    Just disable NSW/NUW flags. This matches what we're already doing for the other situations for these nodes, it was just missed for the demanded constant case.
    Noticed by inspection - confirmed in offline discussion with @spatel.
    I've checked we have test coverage in the x86 extract-bits.ll and extract-lowbits.ll tests.
    llvm-svn: 370497
* GlobalISel: Fix missing pass dependency | Matt Arsenault | 2019-08-30 | 1 | -0/+1
    llvm-svn: 370496
* [ValueTypes] Add v16f16 and v32f16 to EVT::getEVTString and Tablegen's getEnumName | Craig Topper | 2019-08-30 | 1 | -0/+2
    Missed these when I added the enum entries.
    llvm-svn: 370494
* [DAGCombine] visitVSELECT - remove equivalent getValueType() call. NFCI. | Simon Pilgrim | 2019-08-30 | 1 | -1/+0
    llvm-svn: 370489
* [DAGCombine] visitVSELECT - remove duplicate getOperand calls. NFCI. | Simon Pilgrim | 2019-08-30 | 1 | -4/+3
    llvm-svn: 370478
* [DAGCombine] visitVSELECT - use getShiftAmountTy for shift amounts. | Simon Pilgrim | 2019-08-30 | 1 | -3/+3
    llvm-svn: 370471
* [DAGCombine] visitMULHS - use getScalarValueSizeInBits() to make safe for vector types. | Simon Pilgrim | 2019-08-30 | 1 | -1/+1
    This is hidden behind a (scalar-only) isOneConstant(N1) check at the moment, but once we get around to adding vector support we need to ensure we're dealing with the scalar bitwidth, not the total.
    llvm-svn: 370468
* [CodeGen] Introduce MachineBasicBlock::replacePhiUsesWith helper and use it. NFC | Bjorn Pettersson | 2019-08-30 | 2 | -24/+16
    Summary: Found a couple of places in the code where all the PHI nodes of a MBB are updated, replacing references to one MBB by references to another MBB instead. This patch simply refactors the code to use a common helper (MachineBasicBlock::replacePhiUsesWith) for such PHI node updates.
    Reviewers: t.p.northover, arsenm, uabelho
    Subscribers: wdng, hiraditya, jsji, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D66750
    llvm-svn: 370463
* [DAGCombine] visitMULHS/visitMULHU - isBuildVectorAllZeros doesn't mean node is all zeros | Simon Pilgrim | 2019-08-30 | 1 | -8/+8
    Return a proper zero vector, just in case some elements are undef.
    Noticed by inspection after dealing with a similar issue in PR43159.
    llvm-svn: 370460
* [LiveDebugValues] Insert entry values after bundles | David Stenberg | 2019-08-30 | 1 | -2/+1
    Summary: Change LiveDebugValues so that it inserts entry values after the bundle which contains the clobbering instruction. Previously it would insert the debug value after the bundle head using insertAfter(), breaking the bundle.
    Reviewers: djtodoro, NikolaPrica, aprantl, vsk
    Reviewed By: vsk
    Subscribers: hiraditya, llvm-commits
    Tags: #debug-info, #llvm
    Differential Revision: https://reviews.llvm.org/D66888
    llvm-svn: 370448
* [MIPS GlobalISel] Lower fptoui | Petar Avramovic | 2019-08-30 | 1 | -0/+44
    Add lower for G_FPTOUI. Algorithm is similar to the SDAG version in TargetLowering::expandFP_TO_UINT.
    Lower G_FPTOUI for MIPS32.
    Differential Revision: https://reviews.llvm.org/D66929
    llvm-svn: 370431
* [CodeGen] Fix lowering for returning the result of an extractvalue | Dan Gohman | 2019-08-30 | 1 | -1/+1
    When the number of return values exceeds the number of registers available, SelectionDAGBuilder::visitRet transforms a function's return to use a pointer to a buffer to hold return values. When the returned value is an operator such as extractvalue, the value may have a non-zero result number. Add that number to the indexing when obtaining the values to store.
    This fixes https://bugs.llvm.org/show_bug.cgi?id=43132.
    Differential Revision: https://reviews.llvm.org/D66978
    llvm-svn: 370430
* Revert [MBP] Disable aggressive loop rotate in plain mode | Jordan Rupprecht | 2019-08-29 | 1 | -80/+36
    This reverts r369664 (git commit 51f48295cbe8fa3a44db263b528dd9f7bae7bf9a)
    It causes many benchmark regressions, internally and in llvm's benchmark suite.
    llvm-svn: 370398
* GlobalISel: Don't compute known bits for non-integral GEP | Matt Arsenault | 2019-08-29 | 1 | -2/+7
    llvm-svn: 370392
* GlobalISel: Add maskedValueIsZero and signBitIsZero to known bits | Matt Arsenault | 2019-08-29 | 1 | -0/+6
    I dropped the DemandedElts since it seems to be missing from some of the new interfaces, but not others.
    llvm-svn: 370389
* GlobalISel: Add known bits to InstructionSelector | Matt Arsenault | 2019-08-29 | 1 | -1/+5
    AMDGPU uses this for some addressing mode selection patterns. The analysis run itself doesn't do anything so it seems easier to just always require this than adding a way to opt in.
    llvm-svn: 370388
* [DAGCombine] Fix shadow variable warnings. NFCI. | Simon Pilgrim | 2019-08-29 | 1 | -12/+12
    llvm-svn: 370365
* [DebugInfo] LiveDebugValues: correctly discriminate kinds of variable locations | Jeremy Morse | 2019-08-29 | 1 | -0/+1
    The missing line added by this patch ensures that only spilt variable locations are candidates for being restored from the stack. Otherwise, register or constant-value information can be interpreted as a spill location, through a union.
    The added regression test replicates a scenario where this occurs: the stack load from [rsp] causes the register-location DBG_VALUE to be "restored" to rsi, when it should be left alone. See PR43058 for details.
    Un-XFAIL a test that was suffering from this from a previous patch.
    Differential Revision: https://reviews.llvm.org/D66895
    llvm-svn: 370334
* Fix signed/unsigned comparison warning. NFCI. | Simon Pilgrim | 2019-08-29 | 1 | -1/+2
    llvm-svn: 370333
* Fix shadow variable warning. NFCI. | Simon Pilgrim | 2019-08-29 | 1 | -4/+3
    llvm-svn: 370332
* [DebugInfo] LiveDebugValues should always revisit backedges if it skips them | Jeremy Morse | 2019-08-29 | 1 | -37/+17
    The "join" method in LiveDebugValues does not attempt to join unseen predecessor blocks if their out-locations aren't yet initialized; instead, the block should be re-visited later to see if any locations have changed validity. However, because the set of blocks were all being "process"'d once before "join" saw them, that logic in "join" was actually ignoring legitimate out-locations on the first pass through. This meant that some invalidated locations were not removed from the head of loops, allowing illegal locations to persist.
    Fix this by removing the run of "process" before the main join/process loop in ExtendRanges. Now the unseen predecessors that "join" skips truly are uninitialized, and we come back to the block at a later time to re-run "join", see the @baz function added.
    This also fixes another fault where stack/register transfers in the entry block (or any other before-any-loop-block) had their transfers initially ignored, and were then never revisited. The MIR test added tests for this behaviour.
    XFail a test that exposes another bug; a fix for this is coming in D66895.
    Differential Revision: https://reviews.llvm.org/D66663
    llvm-svn: 370328
* [DAGCombiner] (insert_vector_elt (vector_shuffle X, Y), (extract_vector_elt X, N), IdxC) -> (vector_shuffle X, Y) | Amaury Sechet | 2019-08-29 | 1 | -4/+43
    Summary: This is beneficial when the shuffle is only used once and ends up being generated in a few places when some node is combined into a shuffle.
    Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri
    Subscribers: llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D66718
    llvm-svn: 370326
* LegalizeSetCCCondCode - Reduce scope of NeedSwap to fix cppcheck warning. NFCI. | Simon Pilgrim | 2019-08-29 | 1 | -1/+1
    No need for this to be defined outside the only switch case it's used in.
    llvm-svn: 370320
* [X86] Make inline assembly 'x' and 'v' constraints work for f128. | Craig Topper | 2019-08-29 | 1 | -2/+6
    Including a type legalizer fix to make bitcast operand promotion work correctly when getSoftenedFloat returns f128 instead of i128.
    Fixes PR43157
    llvm-svn: 370293
* [RISCV] Avoid generating AssertZext for LP64 ABI when lowering floating LibCall | Shiva Chen | 2019-08-28 | 2 | -8/+66
    The patch fixes the issue that RV64 didn't clear the upper bits when returning a complex floating value with the lp64 ABI.
        float _Complex complex_add(float _Complex a, float _Complex b) {
          return a + b;
        }
        RealResult = zero_extend(RealA + RealB)
        ImageResult = ImageA + ImageB
        Return (RealResult | (ImageResult << 32))
    The patch introduces the shouldExtendTypeInLibCall target hook to suppress the AssertZext generation when lowering floating LibCall.
    Thanks to Eli's comments from the Bugzilla https://bugs.llvm.org/show_bug.cgi?id=42820
    Differential Revision: https://reviews.llvm.org/D65497
    llvm-svn: 370275
* [FPEnv] Add fptosi and fptoui constrained intrinsics. | Kevin P. Neal | 2019-08-28 | 10 | -14/+141
    This implements constrained floating point intrinsics for FP to signed and unsigned integers.
    Quoting from D32319: The purpose of the constrained intrinsics is to force the optimizer to respect the restrictions that will be necessary to support things like the STDC FENV_ACCESS ON pragma without interfering with optimizations when these restrictions are not needed.
    Reviewed by: Andrew Kaylor, Craig Topper, Hal Finkel, Cameron McInally, Roman Lebedev, Kit Barton
    Approved by: Craig Topper
    Differential Revision: http://reviews.llvm.org/D63782
    llvm-svn: 370228
* [AArch64][GlobalISel] Fall back when translating musttail calls | Jessica Paquette | 2019-08-28 | 1 | -0/+1
    These are currently translated as normal function calls in AArch64. Until we have proper tail call lowering, we shouldn't translate these.
    Differential Revision: https://reviews.llvm.org/D66842
    llvm-svn: 370225
* [AMDGPU] Adjust number of SGPRs available in Calling Convention | Ryan Taylor | 2019-08-28 | 1 | -14/+4
    This reduces the number of SGPRs due to some concerns about running out of SGPRs if you make all the SGPRs that aren't reserved available for the calling convention.
    Change-Id: Idb4ca4dc72f5b6808cb524ff7270915a8de5b4c1
    llvm-svn: 370215