path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
* [DebugInfo] Generate .debug_names section when it makes sense | Pavel Labath | 2018-07-20 | 3 | -17/+41
  Summary: This patch makes us generate the debug_names section in response to some user-facing commands (previously it was only generated if explicitly selected via the -accel-tables option). My goal was to make this work for DWARF>=5 (as it's an official part of that standard), and also, as an extension, for DWARF<5 if one is explicitly tuning for lldb as a debugger (because it brings a large performance improvement there).

  This is slightly complicated by the fact that the debug_names tables are incompatible with the DWARF v4 type units (they assume that the type units are in the debug_info section), and unfortunately, right now we generate DWARF v4-style type units even for -gdwarf-5. For this reason, I disable all accelerator tables if the user requested type unit generation. I do this even for apple tables, as they have the same problem (in fact generating type units for apple targets makes us crash even before we get around to emitting the accelerator tables).

  Reviewers: JDevlieghere, aprantl, dblaikie, echristo, probinson
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D49420
  llvm-svn: 337544
* [DAGCombiner] Fold X - (-Y * Z) -> X + (Y * Z) | Craig Topper | 2018-07-20 | 1 | -0/+18
  llvm-svn: 337518
* Skip out of SimplifyDemandedBits for BITCAST of f16 to i16 | Stephen Canon | 2018-07-19 | 1 | -0/+1
  Mirrors the existing exit path for f128, avoiding a crash later on.

  Differential Revision: https://reviews.llvm.org/D49524
  llvm-svn: 337506
* [DAGCombiner] Teach DAGCombiner that A-(-B) is A+B. | Craig Topper | 2018-07-19 | 1 | -0/+5
  We already knew A+(-B) is A-B in visitAdd. This does the opposite for visitSub.

  llvm-svn: 337502
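As an illustration of why this fold is safe, the sketch below exhaustively checks A - (-B) == A + B over 8-bit wrapping arithmetic. The helper names are hypothetical and the code is not taken from DAGCombiner; it only checks the arithmetic identity.

```cpp
#include <cassert>
#include <cstdint>

// Two's-complement negation on 8 bits, done in unsigned arithmetic so that
// wrap-around is well defined. (Hypothetical helper, for illustration only.)
uint8_t wrapNeg(uint8_t V) { return static_cast<uint8_t>(0u - V); }

// Returns true if A - (-B) == A + B holds for every pair of 8-bit values.
bool subOfNegEqualsAdd() {
  for (unsigned A = 0; A < 256; ++A) {
    for (unsigned B = 0; B < 256; ++B) {
      uint8_t Sub = static_cast<uint8_t>(A - wrapNeg(static_cast<uint8_t>(B)));
      uint8_t Add = static_cast<uint8_t>(A + B);
      if (Sub != Add)
        return false;
    }
  }
  return true;
}
```

Because the identity holds under wrap-around, no overflow precondition is needed for the rewrite.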
* [DebugInfo] Dwarfv5: Avoid unnecessary base_address specifiers in rnglists | David Blaikie | 2018-07-18 | 1 | -10/+18
  Since DWARFv5 rnglists are self descriptive and have distinct encodings for base-relative (offset_pair) and absolute (start_length) entries, there's no need to use a base address specifier when describing a lone address range in a section.

  Use that, and improve the test coverage a bit here to include cases like this and others.

  llvm-svn: 337411
* [ScheduleDAG] Fix unfolding of SUnits to already existent nodes. | Nirav Dave | 2018-07-18 | 1 | -18/+30
  Summary: If unfolding an SUnit results in both load or the operation using it which already exist in the DAG, abort the unfold if they are already scheduled. If not, make sure we don't add duplicate dependencies.

  This fixes PR37916.

  Reviewers: davide, eli.friedman, fhahn, bogner
  Subscribers: MatzeB, hiraditya, llvm-commits
  Differential Revision: https://reviews.llvm.org/D48666
  llvm-svn: 337409
* [RegAlloc][NFC] Fix the help string of the option "huge-size-for-split". | Wei Mi | 2018-07-18 | 1 | -1/+2
  llvm-svn: 337402
* Fix -Wdocumentation warning. NFCI. | Simon Pilgrim | 2018-07-18 | 1 | -1/+1
  llvm-svn: 337367
* CodeGen: Don't create address significance table entries for thread-local variables. | Peter Collingbourne | 2018-07-18 | 1 | -1/+2
  The presence of these symbols in the symbol table can cause symbol type mismatch errors (or undefined symbol errors on emulated TLS targets) and they can't be ICF'd anyway.

  llvm-svn: 337338
* CodeGen: Add a target option for emitting .addrsig directives for all address-significant symbols. | Peter Collingbourne | 2018-07-17 | 1 | -0/+8
  Differential Revision: https://reviews.llvm.org/D48143
  llvm-svn: 337331
* More fixes for subreg join failure in RegCoalescer | Tim Renouf | 2018-07-17 | 1 | -4/+21
  Summary: Part of the adjustCopiesBackFrom method wasn't correctly dealing with SubRange intervals when updating.

  There are two changes. The first ensures that bogus SubRange Segments aren't propagated when encountering Segments of the form [1234r, 1234d:0) when preparing to merge value numbers. These can be removed in this case. The second forces a shrinkToUses call if SubRanges end on the copy index (instead of just the parent register).

  V2: Addressed review comments, plus MIR test instead of ll test

  Subscribers: MatzeB, qcolombet, nhaehnle
  Differential Revision: https://reviews.llvm.org/D40308
  Change-Id: I1d2b2b4beea802fce11da01edf71feb2064aab05
  llvm-svn: 337273
* [DAGCombiner] Call SimplifyDemandedVectorElts from EXTRACT_VECTOR_ELT | Simon Pilgrim | 2018-07-17 | 1 | -4/+23
  If we are only extracting vector elements via EXTRACT_VECTOR_ELT(s) we may be able to use SimplifyDemandedVectorElts to avoid unnecessary vector ops.

  Differential Revision: https://reviews.llvm.org/D49262
  llvm-svn: 337258
* [Intrinsics] define funnel shift IR intrinsics + DAG builder support | Sanjay Patel | 2018-07-16 | 1 | -0/+37
  As discussed here:
  http://lists.llvm.org/pipermail/llvm-dev/2018-May/123292.html
  http://lists.llvm.org/pipermail/llvm-dev/2018-July/124400.html

  We want to add rotate intrinsics because the IR expansion of that pattern is 4+ instructions, and we can lose pieces of the pattern before it gets to the backend. Generalizing the operation by allowing 2 different input values (plus the 3rd shift/rotate amount) gives us a "funnel shift" operation which may also be a single hardware instruction.

  Initially, I thought we needed to define new DAG nodes for these ops, and I spent time working on that (much larger patch), but then I concluded that we don't need it. At least as a first step, we have all of the backend support necessary to match these ops...because it was required. And shepherding these through the IR optimizer is the primary concern, so the IR intrinsics are likely all that we'll ever need.

  There was also a question about converting the intrinsics to the existing ROTL/ROTR DAG nodes (along with improving the oversized shift documentation). Again, I don't think that's strictly necessary (as the test results here prove). That can be an efficiency improvement as a small follow-up patch.

  So all we're left with is documentation, definition of the IR intrinsics, and DAG builder support.

  Differential Revision: https://reviews.llvm.org/D49242
  llvm-svn: 337221
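For readers unfamiliar with funnel shifts, here is a minimal 32-bit sketch. It assumes the semantics the LLVM Language Reference gives for llvm.fshl and llvm.fshr (concatenate the two inputs, shift by the amount modulo the bit width, keep the high or low half). The function names are hypothetical and this is not the DAG builder code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Funnel shift left: treat Hi:Lo as one 64-bit value, shift it left by
// (Amt mod 32), and return the upper 32 bits.
uint32_t fshl32(uint32_t Hi, uint32_t Lo, uint32_t Amt) {
  uint64_t Wide = (static_cast<uint64_t>(Hi) << 32) | Lo;
  return static_cast<uint32_t>((Wide << (Amt % 32)) >> 32);
}

// Funnel shift right: same concatenation, shift right by (Amt mod 32),
// and return the lower 32 bits.
uint32_t fshr32(uint32_t Hi, uint32_t Lo, uint32_t Amt) {
  uint64_t Wide = (static_cast<uint64_t>(Hi) << 32) | Lo;
  return static_cast<uint32_t>(Wide >> (Amt % 32));
}

// A rotate is the special case where both inputs are the same value.
uint32_t rotl32(uint32_t X, uint32_t Amt) { return fshl32(X, X, Amt); }
uint32_t rotr32(uint32_t X, uint32_t Amt) { return fshr32(X, X, Amt); }
```

Going through a 64-bit intermediate avoids the undefined shift-by-32 that the usual `(x << s) | (x >> (32 - s))` idiom hits when `s` is zero.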
* [CodeGen] Fix inconsistent declaration parameter name | Fangrui Song | 2018-07-16 | 37 | -89/+89
  llvm-svn: 337200
* [llvm] Change 2 instances of std::sort to llvm::sort | Mandeep Singh Grang | 2018-07-16 | 1 | -1/+1
  llvm-svn: 337192
* [RegAlloc] Skip global splitting if the live range is huge and its spill is trivially rematerializable. | Wei Mi | 2018-07-16 | 1 | -0/+19
  We ran into a case where machineLICM hoists a large number of live ranges outside of a big loop because it thinks those live ranges are trivially rematerializable. In regalloc, global splitting is tried first for those live ranges before they are spilled and rematerialized. Because the global splitting algorithm is quadratic, adding many global splitting candidates causes a huge compile-time increase (50s to 1400s on my local machine when compiling a module).

  However, we think that for live ranges which are very large and trivially rematerializable, it is better to just skip global splitting so as to save compile time with little chance of sacrificing performance. We use the segment size of the live range to indirectly evaluate whether global splitting of the live range can introduce high cost, and use an option as a knob to adjust the size limit threshold.

  Differential Revision: https://reviews.llvm.org/D49353
  llvm-svn: 337186
* [X86][AArch64][DAGCombine] Unfold 'check for [no] signed truncation' pattern | Roman Lebedev | 2018-07-16 | 1 | -0/+78
  Summary: PR38149 (https://bugs.llvm.org/show_bug.cgi?id=38149)

  As discussed in https://reviews.llvm.org/D49179#1158957 and later, the IR for the 'check for [no] signed truncation' pattern can be improved: https://rise4fun.com/Alive/gBf

  That pattern will be produced by the Implicit Integer Truncation sanitizer (https://reviews.llvm.org/D48958, https://bugs.llvm.org/show_bug.cgi?id=21530) in the signed case, therefore it is probably a good idea to improve it.

  But the IR-optimal pattern does not lower efficiently, so we want to undo it.

  This handles the simple pattern. There is a second pattern with predicate and constants inverted.

  NOTE: We do not check uses here. We always do the transform.

  Reviewers: spatel, craig.topper, RKSimon, javed.absar
  Reviewed By: spatel
  Subscribers: kristof.beyls, llvm-commits
  Differential Revision: https://reviews.llvm.org/D49266
  llvm-svn: 337166
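The commit message does not spell out the two forms, so the following is only a sketch under assumptions: it takes the "check for no signed truncation" test to mean "does a 16-bit value survive truncation to 8 bits and sign-extension back", and compares an add-then-unsigned-compare form against a sign-extend-then-compare form. The widths and all function names are hypothetical choices for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extends the low 8 bits of X to 16 bits using only unsigned
// arithmetic (xor/subtract trick), so no signed conversions are involved.
uint16_t sextInReg8(uint16_t X) {
  return static_cast<uint16_t>(((X & 0xFFu) ^ 0x80u) - 0x80u);
}

// Form 1 (assumed IR-friendly form): (X + 128) u< 256.
bool fitsViaAddCompare(uint16_t X) {
  return static_cast<uint16_t>(X + 128u) < 256u;
}

// Form 2 (assumed backend-friendly form): sign_extend_inreg(X) == X.
bool fitsViaSignExtend(uint16_t X) { return sextInReg8(X) == X; }

// Returns true if both forms agree for every 16-bit value.
bool formsAgreeOnAllI16() {
  for (unsigned V = 0; V <= 0xFFFFu; ++V) {
    uint16_t X = static_cast<uint16_t>(V);
    if (fitsViaAddCompare(X) != fitsViaSignExtend(X))
      return false;
  }
  return true;
}
```

Both predicates accept exactly the values whose signed interpretation lies in [-128, 127], which is why one can be rewritten into the other.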
* Avoid losing Hi part when expanding VAARG nodes on big endian machines | Daniel Cederman | 2018-07-16 | 1 | -1/+2
  Summary: If the high part of the load is not used the offset to the next element will not be set correctly.

  For example, on Sparc V8, the following code will read val2 from offset 4 instead of 8.

  ```
  int val = __builtin_va_arg(va, long long);
  int val2 = __builtin_va_arg(va, int);
  ```

  Reviewers: jyknight
  Reviewed By: jyknight
  Subscribers: fedor.sergeev, jrtc27, llvm-commits
  Differential Revision: https://reviews.llvm.org/D48595
  llvm-svn: 337161
* [AccelTable] Provide DWARF5AccelTableStaticData for dsymutil. | Jonas Devlieghere | 2018-07-16 | 1 | -39/+81
  For dsymutil we want to store offsets in the accelerator table entries rather than DIE pointers. In addition, we need a way to communicate which CU a DIE belongs to. This patch provides support for both of these issues.

  Differential revision: https://reviews.llvm.org/D49102
  llvm-svn: 337158
* Recommit r335794 "Add support for generating a call graph profile from Branch Frequency Info." with fix for removed functions. | Michael J. Spencer | 2018-07-16 | 1 | -7/+53
  llvm-svn: 337140
* [DAGCombiner] fix typo in comment; NFC | Sanjay Patel | 2018-07-15 | 1 | -1/+1
  llvm-svn: 337132
* [DAGCombiner] extend(ifpositive(X)) -> shift-right (not X) | Sanjay Patel | 2018-07-15 | 1 | -0/+37
  This is almost the same as an existing IR canonicalization in instcombine, so I'm assuming this is a good early generic DAG combine too.

  The motivation comes from reduced bit-hacking for select-of-constants in IR after rL331486. We want to restore that functionality in the DAG as noted in the commit comments for that change and the llvm-dev discussion here:
  http://lists.llvm.org/pipermail/llvm-dev/2018-July/124433.html

  The PPC and AArch tests show that those targets are already doing something similar. x86 will be neutral in the minimal case and generally better when this pattern is extended with other ops as shown in the signbit-shift.ll tests.

  Note the asymmetry: we don't include the (extend (ifneg X)) transform because it already exists in SimplifySelectCC(), and that is verified in the later unchanged tests in the signbit-shift.ll files. Without the 'not' op, the general transform to use a shift is always a win because that's a single instruction.

  Alive proofs: https://rise4fun.com/Alive/ysli

    Name: if pos, get -1
    %c = icmp sgt i16 %x, -1
    %r = sext i1 %c to i16
    =>
    %n = xor i16 %x, -1
    %r = ashr i16 %n, 15

    Name: if pos, get 1
    %c = icmp sgt i16 %x, -1
    %r = zext i1 %c to i16
    =>
    %n = xor i16 %x, -1
    %r = lshr i16 %n, 15

  Differential Revision: https://reviews.llvm.org/D48970
  llvm-svn: 337130
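The two Alive proofs above can also be checked by brute force. The sketch below mirrors them for i16 using unsigned arithmetic only; the helper names are hypothetical and the code is a sanity check of the identities, not the DAG combine itself.

```cpp
#include <cassert>
#include <cstdint>

// sext(icmp sgt X, -1): all ones when the sign bit of X is clear, else zero.
uint16_t sextOfIsNonNeg(uint16_t X) {
  return static_cast<uint16_t>((X & 0x8000u) == 0 ? 0xFFFFu : 0u);
}

// zext(icmp sgt X, -1): one when the sign bit of X is clear, else zero.
uint16_t zextOfIsNonNeg(uint16_t X) {
  return static_cast<uint16_t>((X & 0x8000u) == 0 ? 1u : 0u);
}

// xor X, -1
uint16_t notOf(uint16_t X) { return static_cast<uint16_t>(~X); }

// lshr V, 15: moves the sign bit down to bit 0.
uint16_t lshr15(uint16_t V) { return static_cast<uint16_t>(V >> 15); }

// ashr V, 15: replicates the sign bit into every bit position. Written as
// 0 - (sign bit) to avoid depending on signed right-shift behavior.
uint16_t ashr15(uint16_t V) { return static_cast<uint16_t>(0u - (V >> 15)); }

// Returns true if both rewrites hold for every 16-bit input.
bool shiftFormsMatch() {
  for (unsigned V = 0; V <= 0xFFFFu; ++V) {
    uint16_t X = static_cast<uint16_t>(V);
    if (sextOfIsNonNeg(X) != ashr15(notOf(X)))
      return false;
    if (zextOfIsNonNeg(X) != lshr15(notOf(X)))
      return false;
  }
  return true;
}
```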
* [MachineOutliner] Check the last instruction from the sequence when updating liveness | Francis Visoiu Mistrih | 2018-07-14 | 1 | -1/+1
  The MachineOutliner was doing an std::for_each from the call (inserted before the outlined sequence) to the iterator at the end of the sequence.

  std::for_each needs the iterator past the end, so the last instruction was not taken into account when propagating the liveness information.

  This fixes the machine verifier issue in machine-outliner-disubprogram.ll.

  Differential Revision: https://reviews.llvm.org/D49295
  llvm-svn: 337090
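The off-by-one described here is easy to reproduce outside of LLVM. The sketch below uses a plain std::vector as a stand-in for the instruction sequence (an assumption for illustration; the real code iterates over machine instructions) and counts how many elements std::for_each visits with each choice of end iterator.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Counts the elements std::for_each visits when given an iterator AT the
// last element (the bug) versus one PAST the last element (the fix).
// Hypothetical helper; Seq stands in for the outlined instruction sequence.
int countVisited(const std::vector<int> &Seq, bool PassPastTheEnd) {
  if (Seq.empty())
    return 0;
  auto First = Seq.begin();
  auto Last = std::prev(Seq.end()); // points at the final element
  auto End = PassPastTheEnd ? std::next(Last) : Last;
  int Visited = 0;
  // std::for_each processes the half-open range [First, End).
  std::for_each(First, End, [&Visited](int) { ++Visited; });
  return Visited;
}
```

With a four-element sequence, passing the iterator at the last element visits three elements, and passing std::next of it visits all four.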
* [SLH] Introduce a new pass to do Speculative Load Hardening to mitigate Spectre variant #1 for x86. | Chandler Carruth | 2018-07-13 | 1 | -0/+19
  There is a lengthy, detailed RFC thread on llvm-dev which discusses the high level issues. High level discussion is probably best there.

  I've split the design document out of this patch and will land it separately once I update it to reflect the latest edits and updates to the Google doc used in the RFC thread.

  This patch is really just an initial step. It isn't quite ready for prime time and is only exposed via debugging flags. It has two major limitations currently:
  1) It only supports x86-64, and only certain ABIs. Many assumptions are currently hard-coded and need to be factored out of the code here.
  2) It doesn't include any options for more fine-grained control, either of which control flow edges are significant or which loads are important to be hardened.
  3) The code is still quite rough and the testing lighter than I'd like.

  However, this is enough for people to begin using. I have had numerous requests from people to be able to experiment with this patch to understand the trade-offs it presents and how to use it. We would also like to encourage work to similar effect in other toolchains.

  The ARM folks are actively developing a system based on this for AArch64. We hope to merge this with their efforts when both are far enough along. But we also don't want to block making this available on that effort.

  Many thanks to the *numerous* people who helped along the way here. For this patch in particular, both Eric and Craig did a ton of review to even have confidence in it as an early, rough cut at this functionality.

  Differential Revision: https://reviews.llvm.org/D44824
  llvm-svn: 336990
* [LiveDebugValues] Tracking copying value between registers | Petar Jovanovic | 2018-07-13 | 1 | -52/+127
  During the execution of long functions, or functions that have a lot of inlined code, a tracked value may be transferred from one register to another. The transfer is recognized only if the destination register is a callee-saved register and if the source register is killed. We do not salvage caller-saved registers since there is a great chance that the killed register would outlive it.

  Patch by Nikola Prica.

  Differential Revision: https://reviews.llvm.org/D44016
  llvm-svn: 336978
* CodeGen: Remove pipeline dependencies on StackProtector; NFC | Matthias Braun | 2018-07-13 | 13 | -69/+83
  This re-applies r336929 with a fix to accommodate the Mips target scheduling multiple SelectionDAG instances into the pass pipeline.

  PrologEpilogInserter and StackColoring depend on the StackProtector analysis being alive from the point it is run until PEI, which requires that they are all scheduled in the same FunctionPassManager. Inserting a (machine) ModulePass between StackProtector and PEI results in these passes being in separate FunctionPassManagers and the StackProtector is not available for PEI.

  PEI and StackColoring don't use much information from the StackProtector pass, so transferring the required information to MachineFrameInfo is cleaner than keeping the StackProtector pass around. This commit moves the SSP layout information to MFI instead of keeping it in the pass.

  This patch set (D37580, D37581, D37582, D37583, D37584, D37585, D37586, D37587) is a first draft of the pagerando implementation described in http://lists.llvm.org/pipermail/llvm-dev/2017-June/113794.html.

  Patch by Stephen Crane <sjc@immunant.com>

  Differential Revision: https://reviews.llvm.org/D49256
  llvm-svn: 336964
* Revert "(HEAD -> master, origin/master, arcpatch-D37582) CodeGen: Remove pipeline dependencies on StackProtector; NFC" | Matthias Braun | 2018-07-12 | 13 | -83/+69
  This was triggering pass scheduling failures.

  This reverts commit r336929.

  llvm-svn: 336934
* CodeGen: Remove pipeline dependencies on StackProtector; NFC | Matthias Braun | 2018-07-12 | 13 | -69/+83
  PrologEpilogInserter and StackColoring depend on the StackProtector analysis being alive from the point it is run until PEI, which requires that they are all scheduled in the same FunctionPassManager. Inserting a (machine) ModulePass between StackProtector and PEI results in these passes being in separate FunctionPassManagers and the StackProtector is not available for PEI.

  PEI and StackColoring don't use much information from the StackProtector pass, so transferring the required information to MachineFrameInfo is cleaner than keeping the StackProtector pass around. This commit moves the SSP layout information to MFI instead of keeping it in the pass.

  This patch set (D37580, D37581, D37582, D37583, D37584, D37585, D37586, D37587) is a first draft of the pagerando implementation described in http://lists.llvm.org/pipermail/llvm-dev/2017-June/113794.html.

  Patch by Stephen Crane <sjc@immunant.com>

  Differential Revision: https://reviews.llvm.org/D49256
  llvm-svn: 336929
* [DWARF v5] Generate range list tables into the .debug_rnglists section. No support for split DWARF and no use of DW_FORM_rnglistx with the DW_AT_ranges attribute. | Wolfgang Pieb | 2018-07-12 | 6 | -19/+138
  Reviewer: aprantl
  Differential Revision: https://reviews.llvm.org/D49214
  llvm-svn: 336927
* [CodeGen] Emit more precise AssertZext/AssertSext nodes. | Eli Friedman | 2018-07-11 | 1 | -25/+8
  This is marginally helpful for removing redundant extensions, and the code is easier to read, so it seems like an all-around win. In the new test i8-phi-ext.ll, we used to emit an AssertSext i8; now we emit an AssertZext i2, which allows the extension of the return value to be eliminated.

  Differential Revision: https://reviews.llvm.org/D49004
  llvm-svn: 336868
* [CodeGen] Ignore debug uses in MachineCopyPropagation | Krzysztof Parzyszek | 2018-07-11 | 1 | -1/+1
  Debug uses should not count as real uses, since otherwise the presence of debug information could affect the generated code.

  llvm-svn: 336803
* [NFC][InstCombine] Converts isLegalNarrowLoad into isLegalNarrowLdSt | Diogo N. Sampaio | 2018-07-11 | 1 | -41/+55
  Reuse this function to test the correctness and profitability of reducing the width of either load or store operations.

  Reviewers: samparker
  Differential Revision: https://reviews.llvm.org/D48624
  llvm-svn: 336800
* [SelectionDAG] Add constant buildvector support to isKnownNeverZero | Simon Pilgrim | 2018-07-11 | 2 | -6/+9
  This allows us to use SelectionDAG::isKnownNeverZero in DAGCombiner::visitREM (visitSDIVLike/visitUDIVLike handle the checking for constants).

  llvm-svn: 336779
* [DAGCombiner] Support non-uniform X%C -> X-(X/C)*C folds | Simon Pilgrim | 2018-07-11 | 1 | -1/+4
  First stage in PR38057 - support non-uniform constant vectors in the combine to reuse the division-by-constant logic. We can definitely do better for srem pow2 remainders (and avoid that extra multiply....) but this at least helps keep everything on the vector unit.

  Differential Revision: https://reviews.llvm.org/D48975
  llvm-svn: 336774
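To make "non-uniform" concrete, the sketch below applies X - (X / C) * C lane by lane to a four-element vector whose divisor differs in every lane. It models the vectors with std::array and unsigned 32-bit lanes, both of which are assumptions for illustration; the names are hypothetical and this is not the combine's implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<uint32_t, 4>;

// Computes the unsigned remainder of each lane as X - (X / C) * C.
// Every lane of C must be nonzero; the lanes of C need not be equal.
Vec4 uremViaDiv(const Vec4 &X, const Vec4 &C) {
  Vec4 R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = X[I] - (X[I] / C[I]) * C[I];
  return R;
}
```

For example, `uremViaDiv({100, 101, 4000000000u, 7}, {3, 5, 7, 9})` yields the same lanes as applying `%` element by element, so the division-by-constant lowering can be reused for each lane's own constant.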
* [DAGCombiner] Add (urem X, -1) -> select(X == -1, 0, X) fold | Simon Pilgrim | 2018-07-11 | 1 | -0/+6
  llvm-svn: 336773
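The fold in the title follows from the fact that an all-ones divisor is the largest unsigned value, so the remainder is X itself unless X equals the divisor. A brute-force check over 16-bit values, with hypothetical helper names, looks like this:

```cpp
#include <cassert>
#include <cstdint>

// select(X == -1, 0, X), with -1 being the all-ones 16-bit value.
uint16_t uremByAllOnesFolded(uint16_t X) {
  return X == 0xFFFFu ? static_cast<uint16_t>(0) : X;
}

// Returns true if the folded form matches X % 0xFFFF for all 16-bit X.
bool uremFoldHolds() {
  for (unsigned V = 0; V <= 0xFFFFu; ++V) {
    uint16_t X = static_cast<uint16_t>(V);
    if (static_cast<uint16_t>(X % 0xFFFFu) != uremByAllOnesFolded(X))
      return false;
  }
  return true;
}
```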
* [DAGCombiner] Add special case fast paths for udiv x,1 and udiv x,-1 | Simon Pilgrim | 2018-07-10 | 1 | -0/+9
  udiv x,-1 was going down the (slow) BuildUDIV route resulting in unnecessary shifts.

  llvm-svn: 336701
* Revert "[AccelTable] Provide abstraction for emitting DWARF5 accelerator tables." | Jonas Devlieghere | 2018-07-10 | 1 | -56/+18
  This reverts r336529 because an alternative approach turned out to be a better fit for dsymutil.

  llvm-svn: 336698
* [DAGCombiner] visitREM - call visitSDIVLike/visitUDIVLike directly to avoid recursive combining. | Simon Pilgrim | 2018-07-10 | 1 | -12/+9
  As suggested by @efriedma on D48975 use the visitSDIVLike/visitUDIVLike functions introduced at rL336656.

  llvm-svn: 336664
* [DAGCombiner] Split SDIV/UDIV optimization expansions from the rest of the combines. NFCI. | Simon Pilgrim | 2018-07-10 | 1 | -15/+44
  As suggested by @efriedma on D48975, this patch separates the BuildDiv/Pow2 style optimizations from the rest of the visitSDIV/visitUDIV to make it easier to reuse the combines and will allow us to avoid some rather nasty node recursive combining in visitREM.

  llvm-svn: 336656
* [DWARF][NFC] Refactor range list emission to use a static helper | Wolfgang Pieb | 2018-07-10 | 1 | -57/+59
  This is prep for DWARF v5 range list emission. Emission of a single range list is moved to a static helper function.

  Reviewer: jdevlieghere
  Differential Revision: https://reviews.llvm.org/D49098
  llvm-svn: 336621
* RenameIndependentSubregs: Fix handling of undef tied operands | Mark Searles | 2018-07-09 | 1 | -6/+10
  Ensure that, when updating a tied operand pair, only that pair is updated.

  Differential Revision: https://reviews.llvm.org/D49052
  llvm-svn: 336593
* [globalisel][irtranslator] Add support for atomicrmw and (strong) cmpxchg | Daniel Sanders | 2018-07-09 | 2 | -1/+224
  Summary: This patch adds support for the atomicrmw instructions and the strong cmpxchg instruction to the IRTranslator.

  I've left out weak cmpxchg because LangRef.rst isn't entirely clear on what difference it makes to the backend. As far as I can tell from the code, it only matters to AtomicExpandPass which is run at the LLVM-IR level.

  Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar, volkan, javed.absar
  Reviewed By: qcolombet
  Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits
  Differential Revision: https://reviews.llvm.org/D40092
  llvm-svn: 336589
* [X86][TLI] DAGCombine: Unfold variable bit-clearing mask to two shifts. | Roman Lebedev | 2018-07-09 | 1 | -0/+58
  Summary: This adds a reverse transform for the instcombine canonicalizations that were added in D47980, D47981.

  As discussed later, that was worse at least for the code size, and potentially for the performance, too.

  https://rise4fun.com/Alive/Zmpl

  Reviewers: craig.topper, RKSimon, spatel
  Reviewed By: spatel
  Subscribers: reames, llvm-commits
  Differential Revision: https://reviews.llvm.org/D48768
  llvm-svn: 336585
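The commit message does not list the exact patterns, so the sketch below rests on an assumption: that the masked forms are `x & (-1 >> y)` and `x & (-1 << y)`, and that their two-shift equivalents are `(x << y) >> y` and `(x >> y) << y`. It checks that each pair computes the same 32-bit result for every shift amount below the bit width. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Clears the top Y bits of X using a variable mask (Y must be < 32).
uint32_t clearHighViaMask(uint32_t X, unsigned Y) {
  return X & (0xFFFFFFFFu >> Y);
}

// Same result with two shifts: push the top Y bits out, then shift back.
uint32_t clearHighViaShifts(uint32_t X, unsigned Y) { return (X << Y) >> Y; }

// Clears the bottom Y bits of X using a variable mask (Y must be < 32).
uint32_t clearLowViaMask(uint32_t X, unsigned Y) {
  return X & (0xFFFFFFFFu << Y);
}

// Same result with two shifts: push the bottom Y bits out, then shift back.
uint32_t clearLowViaShifts(uint32_t X, unsigned Y) { return (X >> Y) << Y; }

// Returns true if the mask and shift forms agree for all valid Y.
bool maskAndShiftFormsAgree(uint32_t X) {
  for (unsigned Y = 0; Y < 32; ++Y) {
    if (clearHighViaMask(X, Y) != clearHighViaShifts(X, Y))
      return false;
    if (clearLowViaMask(X, Y) != clearLowViaShifts(X, Y))
      return false;
  }
  return true;
}
```

The mask form needs an all-ones constant, a shift, and an AND, while the shift form needs only two shifts, which is consistent with the code-size concern mentioned in the summary.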
* [SelectionDAG] Add VT consistency checks to the creation of ISD::FMA. | Craig Topper | 2018-07-09 | 1 | -0/+3
  This is similar to what is done for binops. I don't know if this would have helped us catch the bug fixed in r336566 earlier or not, but I figured it couldn't hurt.

  llvm-svn: 336576
* [AccelTable] Provide abstraction for emitting DWARF5 accelerator tables. | Jonas Devlieghere | 2018-07-09 | 1 | -18/+56
  When emitting the DWARF accelerator tables from dsymutil, we don't have a DwarfDebug instance and we use a custom class to represent Dwarf compile units. This patch adds an interface AccelTableWriterInfo to abstract these from the Dwarf5AccelTableWriter, so we can have a custom implementation for this in dsymutil.

  Differential revision: https://reviews.llvm.org/D49031
  llvm-svn: 336529
* [AccelTable] Dwarf5AccelTableEmitter -> Writer (NFC) | Jonas Devlieghere | 2018-07-09 | 1 | -38/+38
  Renames Dwarf5AccelTableEmitter to Dwarf5AccelTableWriter as suggested in D49031.

  llvm-svn: 336525
* [SelectionDAG] Split float and integer isKnownNeverZero tests | Simon Pilgrim | 2018-07-07 | 1 | -1/+11
  Splits off isKnownNeverZeroFloat to handle +/- 0 float cases.

  This will make it easier to be more aggressive with the integer isKnownNeverZero tests (similar to ValueTracking), use computeKnownBits etc.

  Differential Revision: https://reviews.llvm.org/D48969
  llvm-svn: 336492
* Use const APInt& to avoid extra copy. NFCI. | Simon Pilgrim | 2018-07-07 | 1 | -1/+1
  As discussed on D48825.

  llvm-svn: 336491
* [DAGCombiner] Add EXTRACT_SUBVECTOR to SimplifyDemandedVectorElts | Simon Pilgrim | 2018-07-07 | 2 | -0/+22
  As discussed on PR37989, this patch adds EXTRACT_SUBVECTOR handling to TargetLowering::SimplifyDemandedVectorElts and calls it from DAGCombiner::visitEXTRACT_SUBVECTOR.

  Differential Revision: https://reviews.llvm.org/D48825
  llvm-svn: 336490
* Use Type::isIntOrPtrTy where possible, NFC | Vedant Kumar | 2018-07-06 | 1 | -5/+3
  It's a bit neater to write T.isIntOrPtrTy() over `T.isIntegerTy() || T.isPointerTy()`.

  I used Python's re.sub with this regex to update users:

    r'([\w.\->()]+)isIntegerTy\(\)\s*\|\|\s*\1isPointerTy\(\)'

  llvm-svn: 336462