summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [MachineCopyPropagation] Refactor copy tracking into a class. NFCJustin Bogner2018-09-211-99/+133
| | | | | | | | This is a bit easier to follow than handling the copy and src maps directly in the pass, and will make upcoming changes to how this is done easier to follow. llvm-svn: 342703
* [MachineCopyPropagation] Minor clang-formatting. NFCJustin Bogner2018-09-211-37/+37
| | | | llvm-svn: 342700
* Add the ability to register callbacks for removal and insertion of MachineInstrsAditya Nandakumar2018-09-202-1/+17
| | | | | | | | | https://reviews.llvm.org/D52127 This patch adds the ability to watch for insertions/deletions of MachineInstructions similar to MachineRegisterInfo. llvm-svn: 342696
* [MachineOutliner][NFC] Don't add MBBs with a size < 2 to the search spaceJessica Paquette2018-09-201-1/+5
| | | | | | | | | | | The suffix tree won't ever consider sequences with a length less than 2. Therefore, we really ought to not even consider them in the first place. Also add a FIXME explaining that this should be defined in terms of the size in B of an outlined call versus the size in B of the MBB. llvm-svn: 342688
* [RegAllocGreedy] Fix crash in tryLocalSplitWalter Lee2018-09-201-1/+5
| | | | | | | | | | tryLocalSplit only handles a single use block, but an interval may have multiple use blocks. So don't crash in that case. This fixes PR38795. Differential revision: https://reviews.llvm.org/D52277 llvm-svn: 342682
* [MachineOutliner][NFC] Move debug info emission to createOutlinedFunctionJessica Paquette2018-09-201-35/+23
| | | | | | | | | When you create an outlined function, you know everything you need to know to decide if debug info should be created. If we emit debug info in createOutlinedFunction, then we don't need to keep track of every IR function we create. llvm-svn: 342677
* [SelectionDAG] replace duplicated peekThroughBitcast helper functions; NFCISanjay Patel2018-09-202-38/+31
| | | | | | | | | | | | | | x86 had 2 versions of peekThroughBitcast. DAGCombiner had 1. Plus, it had a 1-off implementation for the one-use variant. Move the x86 versions of the code to SelectionDAG, so we don't have different copies of the code. No functional change intended. I'm putting this next to isBitwiseNot() because I am planning to use it in there. Another option is next to the helpers in the ISD namespace (eg, ISD::isConstantSplatVector()). But if there's no good reason for those to be there, I'd prefer to pull other helpers over to SelectionDAG in follow-up steps. Differential Revision: https://reviews.llvm.org/D52285 llvm-svn: 342669
* [DWARF] - Emit the correct value for DW_AT_addr_base.George Rimar2018-09-205-10/+30
| | | | | | | | | | | | | | Currently, we emit DW_AT_addr_base that points to the beginning of the .debug_addr section. That is not correct for the DWARF5 case because address table contains the header and the attribute should point to the first entry following the header. This is currently the reason why LLDB does not work with such executables correctly. Patch fixes the issue. Differential revision: https://reviews.llvm.org/D52168 llvm-svn: 342635
* [MachineVerifier] Relax checkLivenessAtDef regarding dead subreg defsBjorn Pettersson2018-09-201-21/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Consider an instruction that has multiple defs of the same vreg, but defining different subregs: %7.sub1:rc, dead %7.sub2:rc = inst Calling checkLivenessAtDef for the live interval associated with %7 incorrectly reported "live range continues after a dead def". The live range for %7 has a dead def at the slot index for "inst" even if the live range continues (given that there are later uses of %7.sub1). This patch adjusts MachineVerifier::checkLivenessAtDef to allow dead subregister definitions, unless we are checking a subrange (when tracking subregister liveness). A limitation is that we do not detect the situation when the live range continues past an instruction that defines the full virtual register by multiple dead subreg defines. I also removed some dead code related to physical register in checkLivenessAtDef. Wwe only call that method for virtual registers, so I added an assertion instead. Reviewers: kparzysz Reviewed By: kparzysz Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52237 llvm-svn: 342618
* [SelectionDAG] allow vector types with isBitwiseNot()Sanjay Patel2018-09-192-4/+5
| | | | | | | The test diff in not-and-simplify.ll is from a use in SimplifyDemandedBits, and the test diff in add.ll is from a DAGCombiner transform. llvm-svn: 342594
* MachineScheduler: Add -misched-print-dags flagMatthias Braun2018-09-191-1/+6
| | | | | | | | Add a flag to dump the schedule DAG to the debug stream. This will be used in upcoming commits to test schedule DAG mutations such as macro fusion. llvm-svn: 342589
* Copy utilities updated and added for MI flagsMichael Berg2018-09-193-1/+51
| | | | | | | | | | | | | | Summary: This patch adds a GlobalIsel copy utility into MI for flags and updates the instruction emitter for the SDAG path. Some tests show new behavior and I added one for GlobalIsel which mirrors an SDAG test for handling nsw/nuw. Reviewers: spatel, wristow, arsenm Reviewed By: arsenm Subscribers: wdng Differential Revision: https://reviews.llvm.org/D52006 llvm-svn: 342576
* [DAGCombiner][x86] add transform/hook to decompose integer multiply into ↵Sanjay Patel2018-09-191-0/+26
| | | | | | | | | | | | | | | | | | | | | shift/add This is an alternative to D37896. I don't see a way to decompose multiplies generically without a target hook to tell us when it's profitable. ARM and AArch64 may be able to remove some duplicate code that overlaps with this transform. As a first step, we're only getting the most clear wins on the vector examples requested in PR34474: https://bugs.llvm.org/show_bug.cgi?id=34474 As noted in the code comment, it's likely that the x86 constraints are tighter than necessary, but it may not always be a win to replace a pmullw/pmulld. Differential Revision: https://reviews.llvm.org/D52195 llvm-svn: 342554
* [AtomicExpandPass]: Add a hook for custom cmpxchg expansion in IRAlex Bradbury2018-09-191-11/+27
| | | | | | | | | | | | | | | | | This involves changing the shouldExpandAtomicCmpXchgInIR interface, but I have updated the in-tree backends using this hook (ARM, AArch64, Hexagon) so they will see no functional change. Previously this hook returned bool, but it now returns AtomicExpansionKind. This hook allows targets to select how a given cmpxchg is to be expanded. D48131 uses this to expand part-word cmpxchg to a target-specific intrinsic. See my associated RFC for more info on the motivation for this change <http://lists.llvm.org/pipermail/llvm-dev/2018-June/123993.html>. Differential Revision: https://reviews.llvm.org/D48130 llvm-svn: 342550
* [RISCV] Codegen for i8, i16, and i32 atomicrmw with RV32AAlex Bradbury2018-09-191-1/+37
| | | | | | | | | | | | | | | | | | | | Introduce a new RISCVExpandPseudoInsts pass to expand atomic pseudo-instructions after register allocation. This is necessary in order to ensure that register spills aren't introduced between LL and SC, thus breaking the forward progress guarantee for the operation. AArch64 does something similar for CmpXchg (though only at O0), and Mips is moving towards this approach (see D31287). See also [this mailing list post](http://lists.llvm.org/pipermail/llvm-dev/2016-May/099490.html) from James Knight, which summarises the issues with lowering to ll/sc in IR or pre-RA. See the [accompanying RFC thread](http://lists.llvm.org/pipermail/llvm-dev/2018-June/123993.html) for an overview of the lowering strategy. Differential Revision: https://reviews.llvm.org/D47882 llvm-svn: 342534
* ScheduleDAG: Cleanup dumping code; NFCMatthias Braun2018-09-1914-129/+137
| | | | | | | | | | | | - Instead of having both `SUnit::dump(ScheduleDAG*)` and `ScheduleDAG::dumpNode(ScheduleDAG*)`, just keep the latter around. - Add `ScheduleDAG::dump()` and avoid code duplication in several places. Implement it for different ScheduleDAG variants. - Add `ScheduleDAG::dumpNodeName()` in favor of the `SUnit::print()` functions. They were only ever used for debug dumping and putting the function into ScheduleDAG is consistent with the `dumpNode()` change. llvm-svn: 342520
* [PostRASink] Make sure to remove subregisters from live-ins as wellKrzysztof Parzyszek2018-09-181-2/+5
| | | | llvm-svn: 342492
* Revert r342457 "Fixes removal of dead elements from PressureDiff (PR37252)."Hans Wennborg2018-09-181-1/+2
| | | | | | | | | | | This broke the lit tests on a bunch of buildbots, e.g. http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/36679 > Reviewed By: MatzeB > > Differential Revision: https://reviews.llvm.org/D51495 llvm-svn: 342482
* [TargetLowering] Android has sincos functionsJohn Brawn2018-09-181-1/+2
| | | | | | | | | Since Android API version 9 the Android libm has had the sincos functions, so they should be recognised as libcalls and sincos optimisation should be applied. Differential Revision: https://reviews.llvm.org/D52025 llvm-svn: 342471
* Fixes removal of dead elements from PressureDiff (PR37252).Yury Gribov2018-09-181-2/+1
| | | | | | | | Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D51495 llvm-svn: 342457
* [MachineOutliner][NFC] Don't map more illegal instrs than you have toJessica Paquette2018-09-171-0/+11
| | | | | | | | | | | | | | | We were mapping an instruction every time we saw something we couldn't map before this. Since each illegal mapping is unique, we only have to do this once. This makes it so that we don't map illegal instructions when the previous mapped instruction was illegal. In CTMark (AArch64), this results in 240 fewer instruction mappings on average over 619 files in total. The largest improvement is 12576 fewer mappings in one file, and the smallest is 0. The median improvement is 101 fewer mappings. llvm-svn: 342405
* Revert "Revert r342183 "[DAGCombine] Fix crash when store merging created an ↵Amara Emerson2018-09-171-1/+10
| | | | | | | | extract_subvector with invalid index."" Fixed the assertion failure. llvm-svn: 342397
* [DebugInfo] Fix build when std::vector::iterator is a pointerKristina Brooks2018-09-161-1/+1
| | | | | | | | | | | | std::vector::iterator type may be a pointer, then iterator::value_type fails to compile since iterator is not a class, namespace, or enumeration. Patch by orivej (Orivej Desh) Differential Revision: https://reviews.llvm.org/D52142 llvm-svn: 342354
* [DAGCombiner] try to convert pow(x, 1/3) to cbrt(x)Sanjay Patel2018-09-164-1/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | This is a follow-up suggested in D51630 and originally proposed as an IR transform in D49040. Copying the motivational statement by @evandro from that patch: "This transformation helps some benchmarks in SPEC CPU2000 and CPU2006, such as 188.ammp, 447.dealII, 453.povray, and especially 300.twolf, as well as some proprietary benchmarks. Otherwise, no regressions on x86-64 or A64." I'm proposing to add only the minimum support for a DAG node here. Since we don't have an LLVM IR intrinsic for cbrt, and there are no other DAG ways to create a FCBRT node yet, I don't think we need to worry about DAG builder, legalization, a strict variant, etc. We should be able to expand as needed when adding more functionality/transforms. For reference, these are transform suggestions currently listed in SimplifyLibCalls.cpp: // * cbrt(expN(X)) -> expN(x/3) // * cbrt(sqrt(x)) -> pow(x,1/6) // * cbrt(cbrt(x)) -> pow(x,1/9) Also, given that we bail out on long double for now, there should not be any logical differences between platforms (unless there's some platform out there that has pow() but not cbrt()). Differential Revision: https://reviews.llvm.org/D51753 llvm-svn: 342348
* [CodeGenPrepare] Preserve debug locs in OptimizeExtractBitsVedant Kumar2018-09-151-1/+6
| | | | | | | | | CodeGenPrepare has a transform that sinks {lshr, trunc} pairs to make it easier for the backend to emit fancy extract-bits instructions (e.g UBFX). Teach it to preserve debug locations and salvage debug values. llvm-svn: 342319
* [BreakFalseDeps] Fix bad formatting. NFCCraig Topper2018-09-141-1/+1
| | | | llvm-svn: 342293
* [codeview] Remove dead codeReid Kleckner2018-09-142-17/+0
| | | | llvm-svn: 342285
* Revert r342183 "[DAGCombine] Fix crash when store merging created an ↵Reid Kleckner2018-09-141-8/+1
| | | | | | | | | extract_subvector with invalid index." Causes 'isVector() && "Invalid vector type!"' assertion when building Skia in Chrome. llvm-svn: 342265
* Fix debug info for SelectionDAG legalization of DAG nodes with two results.Adrian Prantl2018-09-141-1/+1
| | | | | | | | | | | | | | | | This patch fixes the debug info handling for SelectionDAG legalization of DAG nodes with two results. When an replaced SDNode has more than one result, transferDbgValues was always copying the SDDbgValue from the first result and attaching them to all members. In reality SelectionDAG::ReplaceAllUsesWith() is given an array of SDNodes (though the type signature doesn't make this obvious (cf. the call site code in ReplaceNode()). rdar://problem/44162227 Differential Revision: https://reviews.llvm.org/D52112 llvm-svn: 342264
* fix noasserts buildAdrian Prantl2018-09-141-0/+2
| | | | llvm-svn: 342247
* SelectionDAG: Add compact SDDbgValue representation to -dag-dump-verbose outputAdrian Prantl2018-09-142-0/+19
| | | | llvm-svn: 342245
* fix typosAdrian Prantl2018-09-141-1/+1
| | | | llvm-svn: 342241
* [DAGCombine] Fix crash when store merging created an extract_subvector with ↵Amara Emerson2018-09-131-1/+8
| | | | | | | | invalid index. Differential Revision: https://reviews.llvm.org/D51831 llvm-svn: 342183
* [MachineInstr] In addRegisterKilled and addRegisterDead, don't remove ↵Craig Topper2018-09-131-2/+4
| | | | | | | | | | | | | | operands from inline assembly instructions if they have an associated flag operand. INLINEASM instructions use extra operands to carry flags. If a register operand is removed without removing the flag operand, then the flags will no longer make sense. This patch fixes this by preventing the removal when a flag operand is present. The included test case was generated by MS inline assembly. Longer term maybe we should fix the inline assembly parsing to not generate redundant operands. Differential Revision: https://reviews.llvm.org/D51829 llvm-svn: 342176
* DAG: Fix expansion of unaligned FP loads and storesMatt Arsenault2018-09-131-4/+6
| | | | | | | | | This was trying to scalarizing a scalar FP type, resulting in an assert. Fixes unaligned f64 stack stores for AMDGPU. llvm-svn: 342132
* ARM: align loops to 4 bytes on Cortex-M3 and Cortex-M4.Tim Northover2018-09-131-1/+2
| | | | | | | | | | | | The Technical Reference Manuals for these two CPUs state that branching to an unaligned 32-bit instruction incurs an extra pipeline reload penalty. That's bad. This also enables the optimization at -Os since it costs on average one byte per loop in return for 1 cycle per iteration, which is pretty good going. llvm-svn: 342127
* [DAGCombiner] improve formatting for select+setcc code; NFCSanjay Patel2018-09-121-17/+15
| | | | llvm-svn: 342095
* [CGP] Ensure splitgep gives deterministic outputDavid Green2018-09-121-1/+1
| | | | | | | | | | | | | | The output of splitLargeGEPOffsets does not appear to be deterministic because of the way that we iterate over a DenseMap. I've changed it to a MapVector for consistent output. The test here isn't particularly great, only showing a consmetic difference in output. The original reproducer is much larger but show a diffierence in instruction ordering, leading to different codegen. Differential Revision: https://reviews.llvm.org/D51851 llvm-svn: 342043
* [SelectionDAG] Remove some code from PromoteIntOp_MGATHER that handles ↵Craig Topper2018-09-121-8/+1
| | | | | | | | UpdateNodeOperands returning an existing node instead of updating. I suspect this became unecessary when the CSE of mgather was fixed in r338080. It may still be possible to hit this if we widen the element type of a gather outside of type legalization and the promote the mask of a separate gather node so they become the same. But that seems pretty unlikely. llvm-svn: 342022
* [MachineOutliner] Add codegen size remarks to the MachineOutlinerJessica Paquette2018-09-111-1/+103
| | | | | | | | | | | | | | | | Since the outliner is a module pass, it doesn't get codegen size remarks like the other codegen passes do. This adds size remarks *to* the outliner. This is kind of a workaround, so it's peppered with FIXMEs; size remarks really ought to not ever be handled by the pass itself. However, since the outliner is the only "MachineModulePass", this works for now. Since the entire purpose of the MachineOutliner is to produce code size savings, it really ought to be included in codgen size remarks. If we ever go ahead and make a MachineModulePass (say, something similar to MachineFunctionPass), then all of this ought to be moved there. llvm-svn: 342009
* add IR flags to MIMichael Berg2018-09-115-1/+28
| | | | | | | | | | | | | | Summary: Initial support for nsw, nuw and exact flags in MI Reviewers: spatel, hfinkel, wristow Reviewed By: spatel Subscribers: nlopes Differential Revision: https://reviews.llvm.org/D51738 llvm-svn: 341996
* [GlobalISel] Lower dbg.declare into indirect DBG_VALUEJosh Stone2018-09-111-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: D31439 changed the semantics of dbg.declare to take the address of a variable as the first argument, making it indirect. It specifically updated FastISel for this change here: https://reviews.llvm.org/D31439#change-WVArzi177jPl GlobalISel needs to follow suit, or else it will be missing a level of indirection in the generated debuginfo. This problem was seen in a Rust debuginfo test on aarch64, since GlobalISel is used at -O0 for aarch64. https://github.com/rust-lang/rust/issues/49807 https://bugzilla.redhat.com/show_bug.cgi?id=1611597 https://bugzilla.redhat.com/show_bug.cgi?id=1625768 Reviewers: dblaikie, aprantl, t.p.northover, javed.absar, rnk Reviewed By: rnk Subscribers: #debug-info, rovka, kristof.beyls, JDevlieghere, llvm-commits, tstellar Differential Revision: https://reviews.llvm.org/D51749 llvm-svn: 341969
* [MachineOutliner][NFC] Factor out instruction mapping into its own functionJessica Paquette2018-09-111-28/+38
| | | | | | | Just some tidy-up. Pull the mapper stuff into `populateMapper`. This makes it a bit easier to read what's going on in `runOnModule`. llvm-svn: 341959
* Add size remarks to MachineFunctionPassJessica Paquette2018-09-101-0/+36
| | | | | | | | | | | | | This adds per-function size remarks to codegen, similar to what we have in the IR layer as of r341588. This only impacts MachineFunctionPasses. This does the same thing, but for `MachineInstr`s instead of just `Instructions`. After this, when a `MachineFunctionPass` modifies the number of `MachineInstr`s in the function it ran on, you'll get a remark. To enable this, use the size-info analysis remark as before. llvm-svn: 341876
* Don't create a temporary vector of loop blocks just to iterate over them.Benjamin Kramer2018-09-101-4/+2
| | | | | | Loop's getBlocks returns an ArrayRef. llvm-svn: 341821
* DAG: Handle odd vector sizes in calling conv splittingMatt Arsenault2018-09-101-12/+17
| | | | | | | | | | | | | | This already worked if only one register piece was used, but didn't if a type was split into multiple, unequal sized pieces. Fixes not splitting 3i16/v3f16 into two registers for AMDGPU. This will also allow fixing the ABI for 16-bit vectors in a future commit so that it's the same for all subtargets. llvm-svn: 341801
* [SelectionDAG] enhance vector demanded elements to look at a vector select ↵Sanjay Patel2018-09-091-4/+12
| | | | | | | | | | | | | | | | condition operand This is the DAG equivalent of D51433. If we know we're not using all vector lanes, use that knowledge to potentially simplify a vselect condition. The reduction/horizontal tests show that we are eliminating AVX1 operations on the upper half of 256-bit vectors because we don't need those anyway. I'm not sure what the pr34592 test is showing. That's run with -O0; is SimplifyDemandedVectorElts supposed to be running there? Differential Revision: https://reviews.llvm.org/D51696 llvm-svn: 341762
* Fix typos. NFCFangrui Song2018-09-081-1/+1
| | | | llvm-svn: 341740
* Remove addBlockByrefAddress(), it is dead code as far as clang is concerned.Adrian Prantl2018-09-082-134/+0
| | | | | | | | | | | | This patch removes addBlockByrefAddress(), it is dead code as far as clang is concerned: Every byref block capture is emitted with a complex expression that is equivalent to what this function does. rdar://problem/31629055 Differential Revision: https://reviews.llvm.org/D51763 llvm-svn: 341737
* [COFF] Implement llvm.global_ctors priorities for MSVC COFF targetsReid Kleckner2018-09-071-2/+19
| | | | | | | | | | | | | | | | | | | | | | | | Summary: MSVC and LLD sort sections ASCII-betically, so we need to use section names that sort between .CRT$XCA (the start) and .CRT$XCU (the default priority). In the general case, use .CRT$XCT12345 as the section name, and let the linker sort the zero-padded digits. Users with low priorities typically want to initialize as early as possible, so use .CRT$XCA00199 for prioties less than 200. This number is arbitrary. Implements PR38552. Reviewers: majnemer, mstorsjo Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D51820 llvm-svn: 341727
OpenPOWER on IntegriCloud