path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
* Fixes removal of dead elements from PressureDiff (PR37252). | Yury Gribov | 2018-09-18 | 1 | -2/+1
    Reviewed By: MatzeB
    Differential Revision: https://reviews.llvm.org/D51495
    llvm-svn: 342457
* [MachineOutliner][NFC] Don't map more illegal instrs than you have to | Jessica Paquette | 2018-09-17 | 1 | -0/+11
    We were mapping an instruction every time we saw something we couldn't map before this. Since each illegal mapping is unique, we only have to do this once. This makes it so that we don't map illegal instructions when the previous mapped instruction was illegal.
    In CTMark (AArch64), this results in 240 fewer instruction mappings on average over 619 files in total. The largest improvement is 12576 fewer mappings in one file, and the smallest is 0. The median improvement is 101 fewer mappings.
    llvm-svn: 342405
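The idea described in this commit can be sketched as follows. This is a simplified model, not the outliner's actual code: `Instr`, `mapInstructions`, and the id numbering are hypothetical names and choices made for illustration. Legal instructions share an id when they are identical, every mapped illegal instruction gets a fresh id, and an illegal instruction is skipped when the previously mapped entry was already illegal.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an instruction: its text plus a flag saying
// whether the outliner is allowed to map it.
struct Instr {
  std::string Text;
  bool Legal;
};

// Maps instructions to integers. Identical legal instructions share an id.
// A run of consecutive illegal instructions produces a single unique id.
std::vector<long> mapInstructions(const std::vector<Instr> &Instrs) {
  std::unordered_map<std::string, long> LegalIds;
  std::vector<long> Mapped;
  long NextLegal = 0;
  long NextIllegal = -1;       // counts downward so ids never collide
  bool PrevWasIllegal = false; // true if the last mapped entry was illegal

  for (const Instr &I : Instrs) {
    if (I.Legal) {
      auto It = LegalIds.find(I.Text);
      if (It == LegalIds.end())
        It = LegalIds.emplace(I.Text, NextLegal++).first;
      Mapped.push_back(It->second);
      PrevWasIllegal = false;
    } else if (!PrevWasIllegal) {
      // One unique separator is enough to break any repeated sequence.
      Mapped.push_back(NextIllegal--);
      PrevWasIllegal = true;
    }
  }
  return Mapped;
}
```

Because each illegal id is unique, a second illegal id directly after the first adds no information, which is why skipping it only shrinks the mapping.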
* Revert "Revert r342183 "[DAGCombine] Fix crash when store merging created an extract_subvector with invalid index."" | Amara Emerson | 2018-09-17 | 1 | -1/+10
    Fixed the assertion failure.
    llvm-svn: 342397
* [DebugInfo] Fix build when std::vector::iterator is a pointer | Kristina Brooks | 2018-09-16 | 1 | -1/+1
    The std::vector::iterator type may be a pointer, in which case iterator::value_type fails to compile since iterator is not a class, namespace, or enumeration.
    Patch by orivej (Orivej Desh)
    Differential Revision: https://reviews.llvm.org/D52142
    llvm-svn: 342354
* [DAGCombiner] try to convert pow(x, 1/3) to cbrt(x) | Sanjay Patel | 2018-09-16 | 4 | -1/+35
    This is a follow-up suggested in D51630 and originally proposed as an IR transform in D49040.
    Copying the motivational statement by @evandro from that patch: "This transformation helps some benchmarks in SPEC CPU2000 and CPU2006, such as 188.ammp, 447.dealII, 453.povray, and especially 300.twolf, as well as some proprietary benchmarks. Otherwise, no regressions on x86-64 or A64."
    I'm proposing to add only the minimum support for a DAG node here. Since we don't have an LLVM IR intrinsic for cbrt, and there are no other DAG ways to create a FCBRT node yet, I don't think we need to worry about DAG builder, legalization, a strict variant, etc. We should be able to expand as needed when adding more functionality/transforms.
    For reference, these are transform suggestions currently listed in SimplifyLibCalls.cpp:
      // * cbrt(expN(X)) -> expN(x/3)
      // * cbrt(sqrt(x)) -> pow(x,1/6)
      // * cbrt(cbrt(x)) -> pow(x,1/9)
    Also, given that we bail out on long double for now, there should not be any logical differences between platforms (unless there's some platform out there that has pow() but not cbrt()).
    Differential Revision: https://reviews.llvm.org/D51753
    llvm-svn: 342348
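As a rough illustration of the fold at the source level (the commit itself works on SelectionDAG nodes, not on library calls), a minimal sketch might look like this. `powOrCbrt` is a hypothetical helper name, and the exact-equality check on the exponent is an assumption about how "1/3" would be recognized.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: evaluates pow(X, Exp), but switches to cbrt when the
// exponent is exactly the double nearest to 1/3.
double powOrCbrt(double X, double Exp) {
  if (Exp == 1.0 / 3.0)
    return std::cbrt(X);
  return std::pow(X, Exp);
}
```

The two forms are not interchangeable for every input: for a negative base, `pow` with a non-integer exponent yields NaN while `cbrt` returns the real cube root. The commit message shown here does not list the preconditions the combiner checks, so this sketch does not try to model them.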
* [CodeGenPrepare] Preserve debug locs in OptimizeExtractBits | Vedant Kumar | 2018-09-15 | 1 | -1/+6
    CodeGenPrepare has a transform that sinks {lshr, trunc} pairs to make it easier for the backend to emit fancy extract-bits instructions (e.g. UBFX).
    Teach it to preserve debug locations and salvage debug values.
    llvm-svn: 342319
* [BreakFalseDeps] Fix bad formatting. NFC | Craig Topper | 2018-09-14 | 1 | -1/+1
    llvm-svn: 342293
* [codeview] Remove dead code | Reid Kleckner | 2018-09-14 | 2 | -17/+0
    llvm-svn: 342285
* Revert r342183 "[DAGCombine] Fix crash when store merging created an extract_subvector with invalid index." | Reid Kleckner | 2018-09-14 | 1 | -8/+1
    Causes 'isVector() && "Invalid vector type!"' assertion when building Skia in Chrome.
    llvm-svn: 342265
* Fix debug info for SelectionDAG legalization of DAG nodes with two results. | Adrian Prantl | 2018-09-14 | 1 | -1/+1
    This patch fixes the debug info handling for SelectionDAG legalization of DAG nodes with two results. When a replaced SDNode has more than one result, transferDbgValues was always copying the SDDbgValue from the first result and attaching them to all members. In reality SelectionDAG::ReplaceAllUsesWith() is given an array of SDNodes (though the type signature doesn't make this obvious; cf. the call site code in ReplaceNode()).
    rdar://problem/44162227
    Differential Revision: https://reviews.llvm.org/D52112
    llvm-svn: 342264
* fix noasserts build | Adrian Prantl | 2018-09-14 | 1 | -0/+2
    llvm-svn: 342247
* SelectionDAG: Add compact SDDbgValue representation to -dag-dump-verbose output | Adrian Prantl | 2018-09-14 | 2 | -0/+19
    llvm-svn: 342245
* fix typos | Adrian Prantl | 2018-09-14 | 1 | -1/+1
    llvm-svn: 342241
* [DAGCombine] Fix crash when store merging created an extract_subvector with invalid index. | Amara Emerson | 2018-09-13 | 1 | -1/+8
    Differential Revision: https://reviews.llvm.org/D51831
    llvm-svn: 342183
* [MachineInstr] In addRegisterKilled and addRegisterDead, don't remove operands from inline assembly instructions if they have an associated flag operand. | Craig Topper | 2018-09-13 | 1 | -2/+4
    INLINEASM instructions use extra operands to carry flags. If a register operand is removed without removing the flag operand, then the flags will no longer make sense.
    This patch fixes this by preventing the removal when a flag operand is present.
    The included test case was generated by MS inline assembly. Longer term maybe we should fix the inline assembly parsing to not generate redundant operands.
    Differential Revision: https://reviews.llvm.org/D51829
    llvm-svn: 342176
* DAG: Fix expansion of unaligned FP loads and stores | Matt Arsenault | 2018-09-13 | 1 | -4/+6
    This was trying to scalarize a scalar FP type, resulting in an assert.
    Fixes unaligned f64 stack stores for AMDGPU.
    llvm-svn: 342132
* ARM: align loops to 4 bytes on Cortex-M3 and Cortex-M4. | Tim Northover | 2018-09-13 | 1 | -1/+2
    The Technical Reference Manuals for these two CPUs state that branching to an unaligned 32-bit instruction incurs an extra pipeline reload penalty. That's bad.
    This also enables the optimization at -Os since it costs on average one byte per loop in return for 1 cycle per iteration, which is pretty good going.
    llvm-svn: 342127
* [DAGCombiner] improve formatting for select+setcc code; NFC | Sanjay Patel | 2018-09-12 | 1 | -17/+15
    llvm-svn: 342095
* [CGP] Ensure splitgep gives deterministic output | David Green | 2018-09-12 | 1 | -1/+1
    The output of splitLargeGEPOffsets does not appear to be deterministic because of the way that we iterate over a DenseMap. I've changed it to a MapVector for consistent output.
    The test here isn't particularly great, only showing a cosmetic difference in output. The original reproducer is much larger but shows a difference in instruction ordering, leading to different codegen.
    Differential Revision: https://reviews.llvm.org/D51851
    llvm-svn: 342043
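The reason an insertion-ordered map gives deterministic output can be shown with a small standalone sketch. `OrderedMap` below is a hypothetical, much simplified container written for illustration; it is not LLVM's MapVector, only the same general idea: a hash index for lookup plus a vector that fixes the iteration order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical insertion-ordered map: lookups go through a hash table,
// iteration walks a vector in the order keys were first inserted.
template <typename K, typename V> class OrderedMap {
  std::unordered_map<K, std::size_t> Index; // key -> position in Entries
  std::vector<std::pair<K, V>> Entries;     // entries in insertion order

public:
  using const_iterator = typename std::vector<std::pair<K, V>>::const_iterator;

  V &operator[](const K &Key) {
    auto It = Index.find(Key);
    if (It == Index.end()) {
      It = Index.emplace(Key, Entries.size()).first;
      Entries.emplace_back(Key, V());
    }
    return Entries[It->second].second;
  }

  const_iterator begin() const { return Entries.begin(); }
  const_iterator end() const { return Entries.end(); }
};
```

Iterating a plain hash map visits keys in an order that depends on hashing and, for pointer keys, on addresses that can change from run to run. Iterating the vector does not, so any code generated while walking the map comes out in the same order every time.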
* [SelectionDAG] Remove some code from PromoteIntOp_MGATHER that handles UpdateNodeOperands returning an existing node instead of updating. | Craig Topper | 2018-09-12 | 1 | -8/+1
    I suspect this became unnecessary when the CSE of mgather was fixed in r338080. It may still be possible to hit this if we widen the element type of a gather outside of type legalization and then promote the mask of a separate gather node so they become the same. But that seems pretty unlikely.
    llvm-svn: 342022
* [MachineOutliner] Add codegen size remarks to the MachineOutliner | Jessica Paquette | 2018-09-11 | 1 | -1/+103
    Since the outliner is a module pass, it doesn't get codegen size remarks like the other codegen passes do. This adds size remarks *to* the outliner.
    This is kind of a workaround, so it's peppered with FIXMEs; size remarks really ought to not ever be handled by the pass itself. However, since the outliner is the only "MachineModulePass", this works for now. Since the entire purpose of the MachineOutliner is to produce code size savings, it really ought to be included in codegen size remarks.
    If we ever go ahead and make a MachineModulePass (say, something similar to MachineFunctionPass), then all of this ought to be moved there.
    llvm-svn: 342009
* add IR flags to MI | Michael Berg | 2018-09-11 | 5 | -1/+28
    Summary: Initial support for nsw, nuw and exact flags in MI
    Reviewers: spatel, hfinkel, wristow
    Reviewed By: spatel
    Subscribers: nlopes
    Differential Revision: https://reviews.llvm.org/D51738
    llvm-svn: 341996
* [GlobalISel] Lower dbg.declare into indirect DBG_VALUE | Josh Stone | 2018-09-11 | 1 | -3/+6
    Summary: D31439 changed the semantics of dbg.declare to take the address of a variable as the first argument, making it indirect. It specifically updated FastISel for this change here:
    https://reviews.llvm.org/D31439#change-WVArzi177jPl
    GlobalISel needs to follow suit, or else it will be missing a level of indirection in the generated debuginfo. This problem was seen in a Rust debuginfo test on aarch64, since GlobalISel is used at -O0 for aarch64.
    https://github.com/rust-lang/rust/issues/49807
    https://bugzilla.redhat.com/show_bug.cgi?id=1611597
    https://bugzilla.redhat.com/show_bug.cgi?id=1625768
    Reviewers: dblaikie, aprantl, t.p.northover, javed.absar, rnk
    Reviewed By: rnk
    Subscribers: #debug-info, rovka, kristof.beyls, JDevlieghere, llvm-commits, tstellar
    Differential Revision: https://reviews.llvm.org/D51749
    llvm-svn: 341969
* [MachineOutliner][NFC] Factor out instruction mapping into its own function | Jessica Paquette | 2018-09-11 | 1 | -28/+38
    Just some tidy-up. Pull the mapper stuff into `populateMapper`. This makes it a bit easier to read what's going on in `runOnModule`.
    llvm-svn: 341959
* Add size remarks to MachineFunctionPass | Jessica Paquette | 2018-09-10 | 1 | -0/+36
    This adds per-function size remarks to codegen, similar to what we have in the IR layer as of r341588. This only impacts MachineFunctionPasses.
    This does the same thing, but for `MachineInstr`s instead of just `Instructions`. After this, when a `MachineFunctionPass` modifies the number of `MachineInstr`s in the function it ran on, you'll get a remark.
    To enable this, use the size-info analysis remark as before.
    llvm-svn: 341876
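The mechanism described above (count instructions before and after a pass, report only when the count changed) can be sketched in isolation. Everything here is hypothetical: `Function` is modeled as a list of instruction names, and `runWithSizeRemark` and the remark wording are invented for illustration rather than taken from LLVM.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical model of a machine function: just a list of instruction names.
using Function = std::vector<std::string>;

// Runs Pass on F. Returns a remark string if the instruction count changed,
// or an empty string if the pass left the count untouched.
std::string runWithSizeRemark(const std::string &PassName, Function &F,
                              const std::function<void(Function &)> &Pass) {
  const long Before = static_cast<long>(F.size());
  Pass(F);
  const long After = static_cast<long>(F.size());
  if (Before == After)
    return "";
  return PassName + ": instruction count changed from " +
         std::to_string(Before) + " to " + std::to_string(After) +
         "; delta: " + std::to_string(After - Before);
}
```

Doing the bookkeeping in the pass runner rather than in each pass is the design point the commit message makes: individual passes stay unaware of remarks.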
* Don't create a temporary vector of loop blocks just to iterate over them. | Benjamin Kramer | 2018-09-10 | 1 | -4/+2
    Loop's getBlocks returns an ArrayRef.
    llvm-svn: 341821
* DAG: Handle odd vector sizes in calling conv splitting | Matt Arsenault | 2018-09-10 | 1 | -12/+17
    This already worked if only one register piece was used, but didn't if a type was split into multiple, unequal sized pieces.
    Fixes not splitting v3i16/v3f16 into two registers for AMDGPU.
    This will also allow fixing the ABI for 16-bit vectors in a future commit so that it's the same for all subtargets.
    llvm-svn: 341801
* [SelectionDAG] enhance vector demanded elements to look at a vector select condition operand | Sanjay Patel | 2018-09-09 | 1 | -4/+12
    This is the DAG equivalent of D51433. If we know we're not using all vector lanes, use that knowledge to potentially simplify a vselect condition.
    The reduction/horizontal tests show that we are eliminating AVX1 operations on the upper half of 256-bit vectors because we don't need those anyway. I'm not sure what the pr34592 test is showing. That's run with -O0; is SimplifyDemandedVectorElts supposed to be running there?
    Differential Revision: https://reviews.llvm.org/D51696
    llvm-svn: 341762
* Fix typos. NFC | Fangrui Song | 2018-09-08 | 1 | -1/+1
    llvm-svn: 341740
* Remove addBlockByrefAddress(), it is dead code as far as clang is concerned. | Adrian Prantl | 2018-09-08 | 2 | -134/+0
    This patch removes addBlockByrefAddress(), it is dead code as far as clang is concerned: Every byref block capture is emitted with a complex expression that is equivalent to what this function does.
    rdar://problem/31629055
    Differential Revision: https://reviews.llvm.org/D51763
    llvm-svn: 341737
* [COFF] Implement llvm.global_ctors priorities for MSVC COFF targets | Reid Kleckner | 2018-09-07 | 1 | -2/+19
    Summary: MSVC and LLD sort sections ASCII-betically, so we need to use section names that sort between .CRT$XCA (the start) and .CRT$XCU (the default priority).
    In the general case, use .CRT$XCT12345 as the section name, and let the linker sort the zero-padded digits.
    Users with low priorities typically want to initialize as early as possible, so use .CRT$XCA00199 for priorities less than 200. This number is arbitrary.
    Implements PR38552.
    Reviewers: majnemer, mstorsjo
    Subscribers: hiraditya, llvm-commits
    Differential Revision: https://reviews.llvm.org/D51820
    llvm-svn: 341727
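A minimal sketch of the naming scheme described in this commit message follows. `ctorSectionName` is a hypothetical helper. Two points are assumptions rather than something the message states: that 65535 is the default priority mapped to `.CRT$XCU`, and that the five digits after the letter are the priority itself (so `.CRT$XCA00199` is the name for priority 199).

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: picks a .CRT$XC* section name so that plain ASCII
// sorting of section names orders initializers by priority.
std::string ctorSectionName(unsigned Priority) {
  if (Priority == 65535) // assumed default priority
    return ".CRT$XCU";
  char Buf[32];
  // 'A' sorts right after the .CRT$XCA start marker, 'T' just before 'U'.
  // Zero padding to five digits makes numeric and ASCII order agree.
  std::snprintf(Buf, sizeof(Buf), ".CRT$XC%c%05u",
                Priority < 200 ? 'A' : 'T', Priority);
  return Buf;
}
```

Zero padding matters because the linker compares names as strings: without it, "1000" would sort before "200".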
* [DebugInfo] Handle stack slot offsets for spilled sub-registers in LDV | David Stenberg | 2018-09-07 | 1 | -30/+57
    Summary: Extend LDV so that stack slot offsets for spilled sub-registers are added to the emitted debug locations. This is accomplished by querying InstrInfo::getStackSlotRange().
    With this change, LDV will add a DW_OP_plus_uconst operation to the expression if a sub-register is spilled. Later on, PEI will add an offset operation for the stack slot, meaning that we will get expressions of the forms:
      * {DW_OP_constu #fp-offset, DW_OP_minus, DW_OP_plus_uconst #subreg-offset}
      * {DW_OP_plus_const #fp-offset, DW_OP_minus, DW_OP_plus_uconst #subreg-offset}
    The two offset operations should ideally be merged.
    Reviewers: rnk, aprantl, stoklund
    Reviewed By: aprantl
    Subscribers: dblaikie, bjope, nemanjai, JDevlieghere, llvm-commits
    Tags: #debug-info
    Differential Revision: https://reviews.llvm.org/D51612
    llvm-svn: 341659
* [DAGCombiner] foldBitcastedFPLogic - Add basic vector support | Simon Pilgrim | 2018-09-07 | 1 | -8/+8
    Add support for bitcasts from float type to an integer type of the same element bitwidth.
    There may be cases where we need to support different widths (e.g. as SSE __m128i is treated as v2i64) - but I haven't seen cases of this in the wild yet.
    llvm-svn: 341652
* Fix argument type in MachineInstr::hasPropertyInBundle | Sven van Haastregt | 2018-09-06 | 1 | -1/+1
    The MCID::Flag enumeration now has more than 32 items, which means that the hasPropertyInBundle argument 'Mask' can overflow. This patch changes the argument to be 64 bits instead.
    Patch by Mikael Nilsson.
    Differential Revision: https://reviews.llvm.org/D51596
    llvm-svn: 341536
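The failure mode is easy to reproduce outside LLVM. The sketch below uses hypothetical names (`FlagHigh`, `maskFor`, `hasProperty32`, `hasProperty64`) and does not mirror the real MachineInstr interface; it only shows what happens to a one-bit mask for flag number 32 or higher when it passes through a 32-bit parameter.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag numbers; FlagHigh stands for any flag past the 32nd.
constexpr unsigned FlagLow = 3;
constexpr unsigned FlagHigh = 35;

// Builds a one-bit mask in 64 bits so the shift itself is well defined.
constexpr std::uint64_t maskFor(unsigned Flag) {
  return std::uint64_t(1) << Flag;
}

// Narrow parameter: any mask bit at position 32 or above is dropped.
bool hasProperty32(std::uint64_t Flags, std::uint32_t Mask) {
  return (Flags & Mask) != 0;
}

// Wide parameter: the full mask survives the call.
bool hasProperty64(std::uint64_t Flags, std::uint64_t Mask) {
  return (Flags & Mask) != 0;
}
```

With the narrow version, a query for a high flag silently returns false even when the flag is set, which is the kind of bug that produces no diagnostic at the call site.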
* [DebugInfo] Do not generate label debug info if it has been processed. | Hsiangkai Wang | 2018-09-06 | 7 | -63/+69
    In DwarfDebug::collectEntityInfo(), if the label entity is processed in DbgLabels list, it means the label is not optimized out. There is no need to generate debug info for it with null position.
    llvm-svn: 341513
* [DAGCombiner] try to convert pow(x, 0.25) to sqrt(sqrt(x)) | Sanjay Patel | 2018-09-05 | 1 | -0/+41
    This was proposed as an IR transform in D49306, but it was not clearly justifiable as a canonicalization. Here, we only do the transform when the target tells us that sqrt can be lowered with inline code.
    This is the basic case. Some potential enhancements are in the TODO comments:
    1. Generalize the transform for other exponents (allow more than 2 sqrt calcs if that's really cheaper).
    2. If we have fewer fast-math-flags, generate code to avoid -0.0 and/or INF.
    3. Allow the transform when optimizing/minimizing size (might require a target hook to get that right).
    Note that by default, x86 converts single-precision sqrt calcs into sqrt reciprocal estimate with refinement. That codegen is controlled by CPU attributes and can be manually overridden. We have plenty of test coverage for that already, so I didn't bother to include extra testing for that here. AArch uses its full-precision ops in all cases (not sure if that's the intended behavior or not, but that should also be covered by existing tests).
    Differential Revision: https://reviews.llvm.org/D51630
    llvm-svn: 341481
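The second TODO item (avoiding -0.0 and INF) is easier to see with concrete values. The sketch below is a source-level illustration only; `powQuarterViaSqrt` is a hypothetical name, and the actual transform operates on DAG nodes under fast-math flags that this message does not enumerate.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: the replacement form for pow(X, 0.25).
double powQuarterViaSqrt(double X) { return std::sqrt(std::sqrt(X)); }
```

For ordinary positive inputs the two forms agree. They diverge at the edges under IEEE-754 semantics: `pow(-0.0, 0.25)` is +0.0 while `sqrt(sqrt(-0.0))` keeps the sign and gives -0.0, and `pow(-inf, 0.25)` is +inf while `sqrt(-inf)` is NaN. That is why a version with fewer fast-math flags would need extra code for those inputs.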
* [DebugInfo] Normalize common kinds of DWARF sub-expressions. | Jonas Devlieghere | 2018-09-05 | 2 | -8/+21
    Normalize common kinds of DWARF sub-expressions to make debug info encoding a bit more compact:
      DW_OP_constu [X < 32] -> DW_OP_litX
      DW_OP_constu [all ones] -> DW_OP_lit0, DW_OP_not (64-bit only)
    Differential revision: https://reviews.llvm.org/D51640
    llvm-svn: 341457
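The two normalizations can be sketched as a small byte encoder. The opcode values are the standard DWARF ones (DW_OP_constu = 0x10, DW_OP_not = 0x20, DW_OP_lit0 = 0x30); `encodeConstant` is a hypothetical function, and the sketch assumes a 64-bit expression stack, matching the "64-bit only" note above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint8_t DW_OP_constu = 0x10;
constexpr std::uint8_t DW_OP_not = 0x20;
constexpr std::uint8_t DW_OP_lit0 = 0x30;

// Hypothetical encoder: emits the shortest of the three forms for Value.
std::vector<std::uint8_t> encodeConstant(std::uint64_t Value) {
  // Small constants have dedicated one-byte opcodes DW_OP_lit0..DW_OP_lit31.
  if (Value < 32)
    return {static_cast<std::uint8_t>(DW_OP_lit0 + Value)};
  // All ones is the bitwise complement of zero: two bytes instead of eleven.
  if (Value == UINT64_MAX)
    return {DW_OP_lit0, DW_OP_not};
  // General case: DW_OP_constu followed by the value as ULEB128.
  std::vector<std::uint8_t> Out{DW_OP_constu};
  do {
    std::uint8_t Byte = static_cast<std::uint8_t>(Value & 0x7f);
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // more bytes follow
    Out.push_back(Byte);
  } while (Value != 0);
  return Out;
}
```

The saving comes from ULEB128: a value below 32 would otherwise cost two bytes, and a 64-bit all-ones value would cost one opcode byte plus ten payload bytes.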
* Remove FrameAccess struct from hasLoadFromStackSlot | Sander de Smalen | 2018-09-05 | 4 | -36/+34
    This removes the FrameAccess struct that was added to the interface in D51537, since the PseudoValue from the MachineMemoryOperand can be safely cast to a FixedStackPseudoSourceValue.
    Reviewers: MatzeB, thegameg, javed.absar
    Reviewed By: thegameg
    Differential Revision: https://reviews.llvm.org/D51617
    llvm-svn: 341454
* [DebugInfo] Fix bug in LiveDebugVariables. | Hsiangkai Wang | 2018-09-05 | 1 | -5/+10
    In lib/CodeGen/LiveDebugVariables.cpp, it uses std::prev(MBBI) to get DebugValue's SlotIndex. However, the previous instruction may also be a debug instruction, and a debug instruction cannot be used to query SlotIndex in mi2iMap.
    Scan all debug instructions and use the first debug instruction to query SlotIndex for following debug instructions. Only handle DBG_VALUE in handleDebugValue().
    Differential Revision: https://reviews.llvm.org/D50621
    llvm-svn: 341446
* DAG: Factor out helper function for odd vector sizes | Matt Arsenault | 2018-09-04 | 1 | -22/+28
    llvm-svn: 341392
* [CodeGen] Fix remaining zext() assertions in SelectionDAG | Scott Linder | 2018-09-04 | 2 | -16/+14
    Fix remaining cases not committed in https://reviews.llvm.org/D49574
    Differential Revision: https://reviews.llvm.org/D50659
    llvm-svn: 341380
* DAG: Handle extract_vector_elt in isKnownNeverNaN | Matt Arsenault | 2018-09-03 | 1 | -0/+3
    llvm-svn: 341317
* Fix issue introduced by r341301 that broke buildbot. | Sander de Smalen | 2018-09-03 | 1 | -3/+6
    A condition in isSpillInstruction() updates a small vector rather than the 'FI' by-ref parameter, which was used in a subsequent call to 'isSpillSlotObjectIndex()'. This patch fixes the condition to check the FIs in the vector instead.
    llvm-svn: 341305
* Extend hasStoreToStackSlot with list of FI accesses. | Sander de Smalen | 2018-09-03 | 4 | -31/+42
    For instructions that spill/fill to and from multiple frame-indices in a single instruction, hasStoreToStackSlot and hasLoadFromStackSlot should return an array of accesses, rather than just the first encounter of such an access.
    This better describes FI accesses for AArch64 (paired) LDP/STP instructions.
    Reviewers: t.p.northover, gberry, thegameg, rengolin, javed.absar, MatzeB
    Reviewed By: MatzeB
    Differential Revision: https://reviews.llvm.org/D51537
    llvm-svn: 341301
* Revert "[DebugInfo] Fix bug in LiveDebugVariables." | Hsiangkai Wang | 2018-09-02 | 1 | -10/+5
    This reverts commit 8f548ff2a1819e1bc051e8218584f1a3d2cf178a.
    buildbot failure in LLVM on clang-ppc64be-linux
    http://lab.llvm.org:8011/builders/clang-ppc64le-linux/builds/19765
    llvm-svn: 341290
* [DebugInfo] Fix bug in LiveDebugVariables. | Hsiangkai Wang | 2018-09-02 | 1 | -5/+10
    In lib/CodeGen/LiveDebugVariables.cpp, it uses std::prev(MBBI) to get DebugValue's SlotIndex. However, the previous instruction may also be a debug instruction, and a debug instruction cannot be used to query SlotIndex in mi2iMap.
    Scan all debug instructions and use the first debug instruction to query SlotIndex for following debug instructions. Only handle DBG_VALUE in handleDebugValue().
    Differential Revision: https://reviews.llvm.org/D50621
    llvm-svn: 341289
* [DAGCombine] optimizeSetCCOfSignedTruncationCheck(): handle inverted pattern | Roman Lebedev | 2018-09-02 | 1 | -4/+18
    Summary: A follow-up for D49266 / rL337166 + D49497 / rL338044.
    This is still the same pattern to check for the [lack of] signed truncation, but in this case the constants and the predicate are negated.
    https://rise4fun.com/Alive/BDV
    https://rise4fun.com/Alive/n7Z
    Reviewers: spatel, craig.topper, RKSimon, javed.absar, efriedma, dmgreen
    Reviewed By: spatel
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D51532
    llvm-svn: 341287
* SafeStack: Prevent OOB reads with mem intrinsics | Vlad Tsyrklevich | 2018-08-30 | 1 | -2/+8
    Summary: Currently, the SafeStack analysis disallows out-of-bounds writes but not out-of-bounds reads for mem intrinsics like llvm.memcpy. This could cause leaks of pointers to the safe stack by leaking spilled registers/frame pointers. Check for allocas used as source or destination pointers to mem intrinsics.
    Reviewers: eugenis
    Reviewed By: eugenis
    Subscribers: pcc, llvm-commits, kcc
    Differential Revision: https://reviews.llvm.org/D51334
    llvm-svn: 341116
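The check described here, treating the source operand like the destination, can be modeled with a short sketch. The types and names (`Operand`, `isAccessInBounds`, `isMemIntrinsicInBounds`) are hypothetical and far simpler than the real SafeStack analysis; the point is only that both pointer operands of a memcpy-like call are tested against the bounds of the alloca they refer to.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one pointer operand of a mem intrinsic.
struct Operand {
  bool IsAlloca;           // does the pointer refer to a tracked alloca?
  std::uint64_t Offset;    // byte offset into that alloca
  std::uint64_t AllocSize; // size of the alloca in bytes
};

// True if [Offset, Offset + Len) lies inside an allocation of AllocSize
// bytes. Written to avoid overflow in Offset + Len.
bool isAccessInBounds(std::uint64_t Offset, std::uint64_t Len,
                      std::uint64_t AllocSize) {
  return Offset <= AllocSize && Len <= AllocSize - Offset;
}

// A memcpy-like call of Len bytes is acceptable only if every alloca-based
// operand, source as well as destination, stays in bounds.
bool isMemIntrinsicInBounds(const Operand &Dst, const Operand &Src,
                            std::uint64_t Len) {
  auto Ok = [Len](const Operand &Op) {
    return !Op.IsAlloca || isAccessInBounds(Op.Offset, Len, Op.AllocSize);
  };
  return Ok(Dst) && Ok(Src);
}
```

Checking only the destination would accept a 16-byte copy out of an 8-byte alloca, and the extra 8 bytes read could include neighboring safe-stack contents such as spilled registers.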
* [DAGCombiner] Fix bad indentation. NFC | Craig Topper | 2018-08-30 | 1 | -1/+1
    llvm-svn: 341103
* Allow inconsistent offsets for 'noreturn' basic blocks when '-verify-cfiinstrs' | Vladimir Stefanovic | 2018-08-30 | 1 | -0/+4
    With r295105, some 'noreturn' blocks (those that don't return and have no successors) may be merged. If such blocks' predecessors have different outgoing offset or register, don't report an error in CFIInstrInserter verify().
    Thanks to Vlad Tsyrklevich for reporting the issue.
    Differential Revision: https://reviews.llvm.org/D51161
    llvm-svn: 341087