path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* Sink some IntrinsicInst.h and Intrinsics.h out of llvm/include | Reid Kleckner | 2017-09-07 | 6 | -0/+7
  Many of these uses can get by with forward declarations. Hopefully this speeds up compilation after adding a single intrinsic.
  llvm-svn: 312759
* Revert r312318, r312325, r312424, r312489 | Richard Trieu | 2017-09-07 | 3 | -39/+1
  r312318 - Debug info for variables whose type is shrinked to bool
  r312325, r312424, r312489 - Test case for r312318
  Revision 312318 introduced a null dereference bug. Details in https://bugs.llvm.org/show_bug.cgi?id=34490
  llvm-svn: 312758
* Move duplicate helpers from DbgValueInst / DbgDeclareInst to DbgInfoIntrinsic | Reid Kleckner | 2017-09-07 | 1 | -28/+11
  NFC
  llvm-svn: 312754
* [DWARF] Line 0 should not have a discriminator. | Paul Robinson | 2017-09-07 | 1 | -2/+2
  It's meaningless and takes up extra space in the line table.
  Differential Revision: https://reviews.llvm.org/D37364
  llvm-svn: 312751
* [yaml2obj][ELF] Add support for symbol indexes greater than SHN_LORESERVE | Petr Hosek | 2017-09-07 | 1 | -0/+37
  Right now symbols must be either undefined or defined in a specific section. However, some symbols have section indexes like SHN_ABS. This change adds support for outputting symbols that have such section indexes.
  Patch by Jake Ehrlich
  Differential Revision: https://reviews.llvm.org/D37391
  llvm-svn: 312745
* COFF: PDB: Allow multiple modules with the same name. | Peter Collingbourne | 2017-09-07 | 1 | -18/+3
  It is possible for two modules to have the same name if they are archive members with the same name, or if we are doing LTO (in which case all modules will have the name "lto.tmp").
  Differential Revision: https://reviews.llvm.org/D37589
  llvm-svn: 312744
* Remove dead code. NFCI. | Peter Collingbourne | 2017-09-07 | 1 | -8/+0
  llvm-svn: 312740
* [CUDA] Added rudimentary support for CUDA-9 and sm_70. | Artem Belevich | 2017-09-07 | 1 | -0/+5
  For now CUDA-9 is not included in the list of CUDA versions clang searches for, so the path to CUDA-9 must be explicitly passed via --cuda-path=.
  On the LLVM side, NVPTX adds an sm_70 GPU type, which bumps the required PTX version to 6.0 but is otherwise equivalent to sm_62 at the moment.
  Differential Revision: https://reviews.llvm.org/D37576
  llvm-svn: 312734
* AMDGPU: Start selecting v_mad_mix_f32 | Matt Arsenault | 2017-09-07 | 4 | -5/+105
  llvm-svn: 312732
* DAG: Allow creating extract_vector_elt post-legalize | Matt Arsenault | 2017-09-07 | 1 | -1/+4
  Fixes some combine issues for AMDGPU where we weren't getting the many extract_vector_elt combines expected in a future patch.
  This should really be checking isOperationLegalOrCustom on the extract. That improves a number of x86 lit tests, but a few get stuck in an infinite loop from one place where a similar-looking extract is created. I have a different workaround in the backend for that which keeps many of those improvements, but also adds a few regressions.
  llvm-svn: 312730
* AMDGPU: Handle non-temporal loads and stores | Konstantin Zhuravlyov | 2017-09-07 | 1 | -23/+59
  Differential Revision: https://reviews.llvm.org/D36862
  llvm-svn: 312729
* AMDGPU: Handle more than one memory operand in SIMemoryLegalizer | Konstantin Zhuravlyov | 2017-09-07 | 2 | -58/+145
  Differential Revision: https://reviews.llvm.org/D37397
  llvm-svn: 312725
* [ARM] Remove redundant vcvt patterns. | Benjamin Kramer | 2017-09-07 | 1 | -14/+0
  These don't add any value as they're just compositions of existing patterns. However, they can confuse the cost logic in ISel, leading to duplicated vcvt instructions like in PR33199.
  llvm-svn: 312724
* [X86][LLVM] Expand support for lowerInterleavedLoad() in X86InterleavedAccess (VF{8|16|32} stride 3). | Michael Zuckerman | 2017-09-07 | 1 | -20/+193
  This patch expands the support of lowerInterleavedLoad to {8|16|32}x8i stride 3.
  LLVM creates suboptimal shuffle code-gen for AVX2. Overall, this patch is a specific fix for the pattern (Stride=3, VF={8|16|32}), and we plan to include the store (deinterleaved side).
  The patch goal is to optimize the following sequence:
    a0 b0 c0 a1 b1 c1 a2 b2 c2 a3 b3 c3 a4 b4 c4 a5 b5 c5 a6 b6 c6 a7 b7 c7
  into
    a0 a1 a2 a3 a4 a5 a6 a7
    b0 b1 b2 b3 b4 b5 b6 b7
    c0 c1 c2 c3 c4 c5 c6 c7
  Reviewers: 1. zvi 2. igor 3. guyblank 4. dorit 5. Ayal
  llvm-svn: 312722
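For reference, the data movement this lowering targets can be written as a scalar stride-3 de-interleave. The sketch below only models the intended result, not the shuffle sequence the patch emits; `deinterleave3` is a hypothetical helper name that does not appear in LLVM.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reference for a stride-3 de-interleave: splits
// {a0,b0,c0,a1,b1,c1,...} into three contiguous streams a, b and c.
// Hypothetical helper for illustration only; not part of LLVM.
template <typename T>
std::array<std::vector<T>, 3> deinterleave3(const std::vector<T> &in) {
  assert(in.size() % 3 == 0 && "input must hold whole (a,b,c) triples");
  const std::size_t vf = in.size() / 3; // elements per output stream
  std::array<std::vector<T>, 3> out;
  for (auto &stream : out)
    stream.reserve(vf);
  for (std::size_t i = 0; i < vf; ++i)
    for (std::size_t field = 0; field < 3; ++field)
      out[field].push_back(in[3 * i + field]); // field 0 = a, 1 = b, 2 = c
  return out;
}
```

With 24 input elements this yields three streams of 8, matching the a/b/c rows in the commit message.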
* [mips] Use RegisterMCAsmBackend to register all MIPS asm backends. NFC | Simon Atanasyan | 2017-09-07 | 5 | -81/+28
  This change converts the `MipsAsmBackend` constructor to the "standard" form. This makes it possible to use `RegisterMCAsmBackend` for registering the backends. Now we pass a `Triple` instance to the `MipsAsmBackend` ctor and deduce all required options like endianness and bitness from the triple. We still need to implement explicit ABI checking for providing correct options to backends.
  Differential revision: https://reviews.llvm.org/D37519
  llvm-svn: 312720
* [MachineCombiner] Update instruction depths incrementally for large BBs. | Florian Hahn | 2017-09-07 | 2 | -23/+90
  Summary: For large basic blocks with lots of combinable instructions, the MachineTraceMetrics computations in MachineCombiner can dominate the compile time, as computing the trace information is quadratic in the number of instructions in a BB and its relevant successors/predecessors.
  In most cases, knowing the instruction depth should be enough to make combination decisions. As we already iterate over all instructions in a basic block, the instruction depth can be computed incrementally. This reduces the cost of machine-combine drastically in cases where lots of instructions are combined. The major drawback is that AFAIK, computing the critical path length cannot be done incrementally. Therefore we only compute instruction depths incrementally, for basic blocks with more instructions than inc_threshold. The -machine-combiner-inc-threshold option can be used to set the threshold and allows for easier experimenting and checking if using incremental updates for all basic blocks has any impact on the performance.
  Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
  Reviewed By: fhahn
  Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
  Differential Revision: https://reviews.llvm.org/D36619
  llvm-svn: 312719
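A minimal sketch of why depths can be maintained in one forward pass, assuming a toy model in which each instruction lists the earlier instructions it reads and carries a fixed latency. `DepthInstr` and `computeDepths` are illustrative names only; the actual pass works on MachineInstr and the target scheduling model.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction (hypothetical): a fixed latency plus the indices of
// earlier instructions in the same block whose results it reads.
struct DepthInstr {
  unsigned latency;
  std::vector<std::size_t> operands;
};

// Single forward pass. The depth of instruction i depends only on
// instructions before i, so it can be computed as the block is walked.
std::vector<unsigned> computeDepths(const std::vector<DepthInstr> &block) {
  std::vector<unsigned> depth(block.size(), 0);
  for (std::size_t i = 0; i < block.size(); ++i) {
    for (std::size_t def : block[i].operands) {
      assert(def < i && "operands must be defined earlier in the block");
      depth[i] = std::max(depth[i], depth[def] + block[def].latency);
    }
  }
  return depth;
}
```

The critical path length, by contrast, also depends on instructions that come later in the block, which is consistent with the message's remark that it cannot be updated in the same way.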
* [MachineTraceMetrics] Add computeDepth function (NFCI). | Florian Hahn | 2017-09-07 | 1 | -54/+46
  Summary: This function is used in D36619 to update the instruction depths incrementally.
  Reviewers: efriedma, Gerolf, MatzeB, fhahn
  Reviewed By: fhahn
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D36696
  llvm-svn: 312714
* [Sparc][NFC] Clean up SelectCC lowering | Alex Bradbury | 2017-09-07 | 1 | -44/+40
  The ARM, BPF, MSP430, Sparc and Mips backends all use a similar code sequence for lowering SelectCC. As pointed out by @reames in D29937, this code isn't particularly clear and in most of these backends doesn't actually match the comments. This patch makes the code sequence clearer for the Sparc backend through better variable naming and more accurate comments (e.g. we are inserting triangle control flow, _not_ diamond).
  There is no functional change.
  Differential Revision: https://reviews.llvm.org/D37194
  llvm-svn: 312713
* Revert "[RegAlloc] Make sure live-ranges reflect the state of the IR when removing them" | Jonas Paulsson | 2017-09-07 | 2 | -8/+2
  This temporarily reverts commit 463fa38 (r311401). See https://bugs.llvm.org/show_bug.cgi?id=34502
  llvm-svn: 312708
* X86: Improve AVX512 fptoui lowering | Zvi Rackover | 2017-09-07 | 3 | -0/+11
  Summary: Add patterns for
    fptoui <16 x float> to <16 x i8>
    fptoui <16 x float> to <16 x i16>
  Reviewers: igorb, delena, craig.topper
  Reviewed By: craig.topper
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D37505
  llvm-svn: 312704
* [X86] Force shuffle lowering to only create X86ISD::VPERM2X128 with 64-bit element types so we can remove some patterns from isel. | Craig Topper | 2017-09-07 | 2 | -22/+5
  Intrinsic handling is still creating these nodes with 32-bit elements as well. But at least this gets rid of 8 and 16. Ideally, someday we'll convert the intrinsics to generic vector shuffles and remove the intrinsics.
  llvm-svn: 312702
* AMDGPU: Don't legalize i16 extloads to i32 with legal i16 | Matt Arsenault | 2017-09-07 | 3 | -1/+8
  Keeping non-i16 extloads makes it easier to match some new gfx9 load instructions.
  llvm-svn: 312699
* ModuleSummaryAnalysis: Correctly handle all function operand references. | Peter Collingbourne | 2017-09-07 | 1 | -7/+5
  The current code that handles personality functions when creating a module summary does not correctly handle the case where a function's personality function operand refers to the function indirectly (e.g. via a bitcast). This patch handles such cases by treating personality function references like any other reference, i.e. by adding them to the function's reference list. This has the minor side benefit of allowing personality functions to participate in early dead stripping.
  We do this by calling findRefEdges on the function itself. This way we also end up handling other function operands (specifically prefix data and prologue data) for free.
  Differential Revision: https://reviews.llvm.org/D37553
  llvm-svn: 312698
* [X86] Remove patterns for selecting a v8f32 X86ISD::MOVSS or v4f64 X86ISD::MOVSD. | Craig Topper | 2017-09-07 | 2 | -48/+0
  I don't think we ever generate these. If we did, I would expect we would also be able to generate v16f32 and v8f64, but we don't have those patterns.
  llvm-svn: 312694
* ARM: track globals promoted to coalesced const pool entries | Saleem Abdulrasool | 2017-09-07 | 3 | -13/+27
  Globals that are promoted to an ARM constant pool may alias with another existing constant pool entry. We need to keep a reference to all globals that were promoted to each constant pool value so that we can emit a distinct label for each promoted global. These labels are necessary so that debug info can refer to the promoted global without an undefined reference during linking.
  Patch by Stephen Crane!
  llvm-svn: 312692
* Object: Downgrade invalid weak externals from an assert fail to an llvm::Error when creating an irsymtab. | Peter Collingbourne | 2017-09-07 | 1 | -3/+6
  This fixes bitcode emission for modules containing invalid weak externals.
  llvm-svn: 312686
* InstSimplify: canonicalize is idempotent | Matt Arsenault | 2017-09-07 | 1 | -0/+1
  llvm-svn: 312685
* LTO: Remove unnecessary Windows support code. | Peter Collingbourne | 2017-09-07 | 1 | -15/+0
  I empirically verified that open files can in fact be renamed on Windows with sys::fs::rename, so remove the incorrect code and comment.
  llvm-svn: 312683
* [Pass] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). | Eugene Zelenko | 2017-09-06 | 2 | -32/+43
  llvm-svn: 312679
* [AMDGPU] Use v_pk_max_f16 for fcanonicalize | Stanislav Mekhanoshin | 2017-09-06 | 1 | -5/+10
  Differential Revision: https://reviews.llvm.org/D37325
  llvm-svn: 312676
* [WebAssembly] Only treat imports/exports as symbols when reading relocatable object files | Sam Clegg | 2017-09-06 | 2 | -34/+68
  This change treats imported and exported functions and globals as symbol table entries only when the object has a "linking" section (i.e. it is a relocatable object file).
  In this case all globals must be of type I32 and initialized with i32.const. This was previously being assumed but not checked for and was causing a failure on big endian machines due to using the wrong value of the union.
  See: https://bugs.llvm.org/show_bug.cgi?id=34487
  Differential Revision: https://reviews.llvm.org/D37497
  llvm-svn: 312674
* Remove redundant `llvm::`, add comments and simplify the return type of a function. | Rui Ueyama | 2017-09-06 | 1 | -29/+32
  No functional change intended.
  llvm-svn: 312673
* Insert IMPLICIT_DEFS for undef uses in tail merging | Matthias Braun | 2017-09-06 | 6 | -84/+138
  Tail merging can convert an undef use into a normal one when creating a common tail. Doing so can make the register live out from a block which previously contained the undef use. To keep the liveness up-to-date, insert IMPLICIT_DEFs in such blocks when necessary.
  To enable this patch the computeLiveIns() function which used to compute live-ins for a block and set them immediately is split into new functions:
  - computeLiveIns() just computes the live-ins in a LivePhysRegs set.
  - addLiveIns() applies the live-ins to a block live-in list.
  - computeAndAddLiveIns() is a convenience function combining the other two functions and behaving like computeLiveIns() before this patch.
  Based on a patch by Krzysztof Parzyszek <kparzysz@codeaurora.org>
  Differential Revision: https://reviews.llvm.org/D37034
  llvm-svn: 312668
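The compute/apply split described above can be sketched on a toy model. Everything below is an assumption made for illustration: `RegSet`, `ToyMI` and `ToyBlock` stand in for LivePhysRegs, MachineInstr and MachineBasicBlock, and the function names merely echo the ones in the message without matching their real signatures.

```cpp
#include <cassert>
#include <set>
#include <vector>

using RegSet = std::set<unsigned>; // stand-in for LivePhysRegs (assumption)

struct ToyMI {   // hypothetical instruction: registers written and read
  RegSet defs, uses;
};
struct ToyBlock { // hypothetical basic block
  std::vector<ToyMI> instrs;
  RegSet liveOuts;
  RegSet liveIns;
};

// Compute only: walk the block backwards from its live-outs,
// removing defined registers and adding used ones.
RegSet computeLiveIns(const ToyBlock &block) {
  RegSet live = block.liveOuts;
  for (auto it = block.instrs.rbegin(); it != block.instrs.rend(); ++it) {
    for (unsigned reg : it->defs)
      live.erase(reg);
    live.insert(it->uses.begin(), it->uses.end());
  }
  return live;
}

// Apply only: record an already computed set on the block.
void addLiveIns(ToyBlock &block, const RegSet &live) {
  block.liveIns.insert(live.begin(), live.end());
}

// Convenience wrapper combining the two steps.
void computeAndAddLiveIns(ToyBlock &block) {
  assert(block.liveIns.empty() && "toy precondition: no live-ins recorded yet");
  addLiveIns(block, computeLiveIns(block));
}
```

Separating the two steps lets a caller inspect the computed set before deciding what to write back, which is presumably what the IMPLICIT_DEF insertion needs.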
* Disable jump threading into loop headers | Krzysztof Parzyszek | 2017-09-06 | 1 | -4/+9
  Consider this type of a loop:
    for (...) {
      ...
      if (...)
        continue;
      ...
    }
  Normally, the "continue" would branch to the loop control code that checks whether the loop should continue iterating and which contains the (often) unique loop latch branch. In certain cases jump threading can "thread" the inner branch directly to the loop header, creating a second loop latch. Loop canonicalization would then transform this loop into a loop nest. The problem with this is that in such a loop nest neither loop is countable even if the original loop was. This may inhibit subsequent loop optimizations and be detrimental to performance.
  Differential Revision: https://reviews.llvm.org/D36404
  llvm-svn: 312664
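A concrete instance of the loop shape described above, as a sketch; `sumPositive` is an illustrative name and the comments describe the usual source-level control flow rather than any particular IR the compiler produces.

```cpp
#include <cassert>

// A countable loop: the trip count n is known on entry.
int sumPositive(const int *data, int n) {
  assert((data != nullptr || n == 0) && "need a buffer when n > 0");
  int sum = 0;
  for (int i = 0; i < n; ++i) {
    if (data[i] <= 0)
      continue; // goes to the loop control (++i, then i < n), not past it
    sum += data[i];
  }
  return sum;
}
```

As long as the `continue` path goes through the shared increment and condition, the loop has a single back edge. The concern in the message is a transformation that gives the `continue` path its own back edge to the header.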
* [X86] Move more isel patterns to X86InstrVecCompiler.td. NFC | Craig Topper | 2017-09-06 | 3 | -437/+184
  This moves more of our subvector insert/extract tricks to X86InstrVecCompiler.td and refactors them into multiclasses.
  llvm-svn: 312661
* [AMDGPU] Fixed encoding of v_pk_mul_f16 in fcanonicalize | Stanislav Mekhanoshin | 2017-09-06 | 1 | -1/+1
  Differential Revision: https://reviews.llvm.org/D37522
  llvm-svn: 312660
* [IfConversion] Remove kill flags from common instructions as well | Krzysztof Parzyszek | 2017-09-06 | 1 | -4/+6
  When if-converting a diamond, two separate blocks will be placed back to back to form straight-line code. To ensure correctness of the liveness information, any registers that are live in the second block should not be killed in the first block, even if they were in the original code.
  Additionally, when the two blocks share common instructions at the beginning, these instructions will not be duplicated, but only placed once, before both of the blocks. Since the function "isIdenticalTo" (as used here) ignores kill flags, the common initial code in one block may have a kill flag for a register that is live in the other block. Because the code that removes kill flags only runs for the non-common parts of the predicated blocks, a kill flag mismatch in the common code could still lead to a live register being killed prematurely.
  llvm-svn: 312654
* [X86] Actually add the new file that was supposed to go with r312649. | Craig Topper | 2017-09-06 | 1 | -0/+179
  llvm-svn: 312650
* [X86] Introduce a new td file to hold some of the non-instruction patterns from SSE and AVX512 | Craig Topper | 2017-09-06 | 3 | -211/+1
  This patch moves some similar non-instruction patterns from X86InstrSSE.td and X86InstrAVX512.td to a common file. This is intended as a starting point. There are many other optimization patterns that exist in both files that we could move here.
  Differential Revision: https://reviews.llvm.org/D37455
  llvm-svn: 312649
* Fix PR33878: BasicAA incorrectly assumes different address spaces don't alias | Nuno Lopes | 2017-09-06 | 1 | -5/+0
  Remove code that assumed that a nullptr of address space != 0 couldn't alias with a non-null pointer. This is incorrect, since nothing can be concluded about a null pointer in an address space != 0. This code was written before address spaces were introduced.
  Differential Revision: https://reviews.llvm.org/D37518
  llvm-svn: 312648
* Minor style fixes in lib/Support/**/Program.(inc|cpp). | Alexander Kornienko | 2017-09-06 | 3 | -72/+70
  No functional changes intended.
  llvm-svn: 312646
* [Hexagon] Add option to generate calls to "abort" for "unreachable" | Krzysztof Parzyszek | 2017-09-06 | 1 | -0/+6
  llvm-svn: 312644
* [TailCall] Allow llvm.memcpy/memset/memmove to be tail calls when the parent function returns the intrinsic's first argument. | Wei Mi | 2017-09-06 | 1 | -0/+11
  llvm.memcpy/memset/memmove return void, but they return the first argument after they are expanded as libcalls. Currently, if the parent function has any return value, llvm.memcpy cannot be turned into a tail call after expansion. This patch handles that case in SelectionDAGBuilder so that a tail call is allowed when the caller returns the same value as the first argument of llvm.memcpy.
  Differential Revision: https://reviews.llvm.org/D37406
  llvm-svn: 312641
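A source-level sketch of the kind of function this applies to, assuming the compiler represents the `std::memcpy` call with the llvm.memcpy intrinsic (common, but not guaranteed). `copyAndReturnDst` is a hypothetical name used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// The function returns exactly the value that the C library memcpy
// returns (its first argument), so a call to memcpy in tail position
// would leave the right value in the return register.
void *copyAndReturnDst(void *dst, const void *src, std::size_t n) {
  assert((n == 0 || (dst != nullptr && src != nullptr)) && "null buffer");
  std::memcpy(dst, src, n); // result unused here, like the void intrinsic
  return dst;               // same value the memcpy libcall hands back
}
```

Whether a tail call is actually emitted depends on the target, the optimization level and the calling convention; the example only shows the source pattern the message describes.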
* [AMDGPU] Fix shouldClusterMemOps to process flat loads | Stanislav Mekhanoshin | 2017-09-06 | 1 | -0/+4
  Flat loads do not have a vdata operand but have vdst instead.
  Differential Revision: https://reviews.llvm.org/D37502
  llvm-svn: 312640
* AMDGPU: Make worst-case assumption about the wait states in inline assembly | Nicolai Haehnle | 2017-09-06 | 1 | -1/+2
  Summary: Mesa still uses a hack where empty inline assembly is used as a kind of optimization barrier. This exposed a problem where not enough wait states were inserted, because the hazard recognizer implicitly assumed that each inline assembly "instruction" has at least one wait state.
  Reviewers: arsenm
  Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
  Differential Revision: https://reviews.llvm.org/D37205
  llvm-svn: 312635
* [X86][X87] Ensure x87 instructions are tagged as altering the FPSW reg | Simon Pilgrim | 2017-09-06 | 1 | -7/+8
  As noted in PR34080, a lot of x87 instructions alter the FPSW status register (or leave it in an undefined state) but aren't tagged as such in the tablegen.
  This patch tags the control word, stack, wait and math instructions as altering FPSW, which matches what the AMD APMs suggest happens.
  Differential Revision: https://reviews.llvm.org/D36414
  llvm-svn: 312629
* [RISCV][NFC] Fix sorting of includes in lib/Target/RISCV | Alex Bradbury | 2017-09-06 | 2 | -6/+6
  llvm-svn: 312624
* [DAGCombiner] When combining EXTRACT_SUBVECTOR of a BUILD_VECTOR, make sure we don't create a BUILD_VECTOR with an illegal type after type legalization. | Craig Topper | 2017-09-06 | 1 | -2/+3
  llvm-svn: 312621
* [x86] Fix PR34377 by disabling cmov conversion when we relied on it performing a zext of a register. | Chandler Carruth | 2017-09-06 | 1 | -0/+10
  On the PR there is discussion of how to more effectively handle this, but this patch prevents us from miscompiling code.
  Differential Revision: https://reviews.llvm.org/D37504
  llvm-svn: 312620
* [X86] Add more FMA3 patterns to cover a load in all 3 possible positions. | Craig Topper | 2017-09-06 | 2 | -68/+137
  This matches what we already do for AVX512. The peephole pass makes up for this in most if not all cases. But this makes isel behavior for these consistent with every other instruction.
  llvm-svn: 312613