summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [X86] Add DAG combine to form saturating VTRUNCUS/VTRUNCS from VTRUNCCraig Topper2019-10-041-0/+14
| | | | | | | | We already do this for ISD::TRUNCATE, but we can do the same for X86ISD::VTRUNC Differential Revision: https://reviews.llvm.org/D68432 llvm-svn: 373765
* [ModuloSchedule] Do not remap terminatorsJames Molloy2019-10-041-1/+1
| | | | | | | | | | This is a trivial point fix. Terminator instructions aren't scheduled, so we shouldn't expect to be able to remap them. This doesn't affect Hexagon and PPC because their terminators are always hardware loop backbranches that have no register operands. llvm-svn: 373762
* [AMDGPU][MC][GFX10][WS32] Corrected decoding of dst operand for v_cmp_*_sdwa ↵Dmitry Preobrazhensky2019-10-041-1/+2
| | | | | | | | | | | | opcodes See bug 43484: https://bugs.llvm.org/show_bug.cgi?id=43484 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D68349 llvm-svn: 373745
* Fix MSVC "not all control paths return a value" warning. NFCI.Simon Pilgrim2019-10-041-0/+1
| | | | llvm-svn: 373741
* [AMDGPU][MC][GFX10] Enabled decoding of 'null' operandDmitry Preobrazhensky2019-10-041-0/+1
| | | | | | | | | | See bug 43485: https://bugs.llvm.org/show_bug.cgi?id=43485 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D68348 llvm-svn: 373740
* ARM-Darwin: keep the frame register reserved even if not updated.Tim Northover2019-10-041-1/+1
| | | | | | | | Darwin platforms need the frame register to always point at a valid record even if it's not updated in a leaf function. Backtraces are more important than one extra GPR. llvm-svn: 373738
* [AMDGPU][MC][GFX10] Corrected definition of FLAT GLOBAL/SCRATCH instructionsDmitry Preobrazhensky2019-10-041-1/+1
| | | | | | | | | | See bug 43483: https://bugs.llvm.org/show_bug.cgi?id=43483 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D68347 llvm-svn: 373736
* Fix MSVC "not all control paths return a value" warning. NFCI.Simon Pilgrim2019-10-041-0/+2
| | | | llvm-svn: 373730
* Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI.Simon Pilgrim2019-10-041-1/+1
| | | | llvm-svn: 373729
* [DebugInfo] LiveDebugValues: move DBG_VALUE creation into VarLoc classJeremy Morse2019-10-041-107/+137
| | | | | | | | | | | | | | | | | | | | | | Rather than having a mixture of location-state shared between DBG_VALUEs and VarLoc objects in LiveDebugValues, this patch makes VarLoc the master record of variable locations. The refactoring means that the transfer of locations from one place to another is always a performed by an operation on an existing VarLoc, that produces another transferred VarLoc. DBG_VALUEs are only created at the end of LiveDebugValues, once all locations are known. As a plus, there is now only one method where DBG_VALUEs can be created. The test case added covers a circumstance that is now impossible to express in LiveDebugValues: if an already-indirect DBG_VALUE is spilt, previously it would have been restored-from-spill as a direct DBG_VALUE. We now don't lose this information along the way, as VarLocs always refer back to the "original" non-transfer DBG_VALUE, and we can always work out whether a location was "originally" indirect. Differential Revision: https://reviews.llvm.org/D67398 llvm-svn: 373727
* [DebugInfo] LiveDebugValues: defer DBG_VALUE creation during analysisJeremy Morse2019-10-041-8/+7
| | | | | | | | | | | | | | | | | | When transfering variable locations from one place to another, LiveDebugValues immediately creates a DBG_VALUE representing that transfer. This causes trouble if the variable location should subsequently be invalidated by a loop back-edge, such as in the added test case: the transfer DBG_VALUE from a now-invalid location is used as proof that the variable location is correct. This is effectively a self-fulfilling prophesy. To avoid this, defer the insertion of transfer DBG_VALUEs until after analysis has completed. Some of those transfers are still sketchy, but we don't propagate them into other blocks now. Differential Revision: https://reviews.llvm.org/D67393 llvm-svn: 373720
* AMDGPU/GlobalISel: Fix using wrong addrspace for apertureMatt Arsenault2019-10-041-1/+3
| | | | | | | This was always passing the destination flat address space, when it should be picking between the two valid source options. llvm-svn: 373716
* AMDGPU/GlobalISel: Select G_PTRTOINTMatt Arsenault2019-10-041-0/+1
| | | | llvm-svn: 373715
* AMDGPU/GlobalISel: Support wave32 waterfall loopsMatt Arsenault2019-10-041-22/+30
| | | | llvm-svn: 373714
* [X86] Enable inline memcmp() to use AVX512David Zarzycki2019-10-041-2/+1
| | | | llvm-svn: 373706
* Revert "[Symbolize] Use the local MSVC C++ demangler instead of relying on ↵Martin Storsjo2019-10-041-4/+37
| | | | | | | | | dbghelp. NFC." This reverts SVN r373698, as it broke sanitizer tests, e.g. in http://lab.llvm.org:8011/builders/sanitizer-windows/builds/52441. llvm-svn: 373701
* [AMDGPU][SILoadStoreOptimizer] NFC: Refactor codePiotr Sobczak2019-10-041-120/+80
| | | | | | | | | | | | | | | | | | | | | | | Summary: This patch fixes a potential aliasing problem in InstClassEnum, where local values were mixed with machine opcodes. Introducing InstSubclass will keep them separate and help extending InstClassEnum with other instruction types (e.g. MIMG) in the future. This patch also makes getSubRegIdxs() more concise. Reviewers: nhaehnle, arsenm, tstellar Reviewed By: arsenm Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68384 llvm-svn: 373699
* [Symbolize] Use the local MSVC C++ demangler instead of relying on dbghelp. NFC.Martin Storsjo2019-10-041-37/+4
| | | | | | | | | This allows making a couple llvm-symbolizer tests run in all environments. Differential Revision: https://reviews.llvm.org/D68133 llvm-svn: 373698
* [JITLink] Explicitly destroy bumpptr-allocated blocks to avoid a memory leak.Lang Hames2019-10-041-0/+6
| | | | llvm-svn: 373693
* [JITLink] Fix an unused variable warning.Lang Hames2019-10-041-3/+2
| | | | llvm-svn: 373692
* [JITLink] Switch from an atom-based model to a "blocks and symbols" model.Lang Hames2019-10-0415-1328/+1488
| | | | | | | | | | | | | | | | | | | | | | | | In the Atom model the symbols, content and relocations of a relocatable object file are represented as a graph of atoms, where each Atom represents a contiguous block of content with a single name (or no name at all if the content is anonymous), and where edges between Atoms represent relocations. If more than one symbol is associated with a contiguous block of content then the content is broken into multiple atoms and layout constraints (represented by edges) are introduced to ensure that the content remains effectively contiguous. These layout constraints must be kept in mind when examining the content associated with a symbol (it may be spread over multiple atoms) or when applying certain relocation types (e.g. MachO subtractors). This patch replaces the Atom model in JITLink with a blocks-and-symbols model. The blocks-and-symbols model represents relocatable object files as bipartite graphs, with one set of nodes representing contiguous content (Blocks) and another representing named or anonymous locations (Symbols) within a Block. Relocations are represented as edges from Blocks to Symbols. This scheme removes layout constraints (simplifying handling of MachO alt-entry symbols, and hopefully ELF sections at some point in the future) and simplifies some relocation logic. llvm-svn: 373689
* [RISCV] Split SP adjustment to reduce the offset of callee saved register ↵Shiva Chen2019-10-042-1/+90
| | | | | | | | | | | | | | | | | | | | spill and restore We would like to split the SP adjustment to reduce the instructions in prologue and epilogue as the following case. In this way, the offset of the callee saved register could fit in a single store. add sp,sp,-2032 sw ra,2028(sp) sw s0,2024(sp) sw s1,2020(sp) sw s3,2012(sp) sw s4,2008(sp) add sp,sp,-64 Differential Revision: https://reviews.llvm.org/D68011 llvm-svn: 373688
* LowerTypeTests: Rename local functions to avoid collisions with identically ↵Peter Collingbourne2019-10-031-0/+11
| | | | | | | | | | | named functions in ThinLTO modules. Without this we can encounter link errors or incorrect behaviour at runtime as a result of the wrong function being referenced. Differential Revision: https://reviews.llvm.org/D67945 llvm-svn: 373678
* [MemorySSA] Don't hoist stores if interfering uses (as calls) exist.Alina Sbirlea2019-10-031-1/+11
| | | | llvm-svn: 373674
* [DAGCombiner] add operation legality checks before creating shift ops (PR43542)Sanjay Patel2019-10-031-1/+6
| | | | | | | | | | | | | | As discussed on llvm-dev and: https://bugs.llvm.org/show_bug.cgi?id=43542 ...we have transforms that assume shift operations are legal and transforms to use them are profitable, but that may not hold for simple targets. In this case, the MSP430 target custom lowers shifts by repeating (many) simpler/fixed ops. That can be avoided by keeping this code as setcc/select. Differential Revision: https://reviews.llvm.org/D68397 llvm-svn: 373666
* Reland r349624: Let TableGen write output only if it changed, instead of ↵Nico Weber2019-10-031-8/+29
| | | | | | | | | | | doing so in cmake Move the write-if-changed logic behind a flag and don't pass it with the MSVC generator. msbuild doesn't have a restat optimization, so not doing write-if-change there doesn't have a cost, and it should fix whatever causes PR43385. llvm-svn: 373664
* DebugInfo: Generalize rnglist emission as a precursor to reusing it for ↵David Blaikie2019-10-031-15/+25
| | | | | | loclist emission llvm-svn: 373663
* [AArch64InstPrinter] prefer bfi to bfc for < armv8.2-aNick Desaulniers2019-10-031-1/+2
| | | | | | | | | | | | | | | | | | | Summary: Fixes pr/42576. Link: https://github.com/ClangBuiltLinux/linux/issues/697 Reviewers: t.p.northover Reviewed By: t.p.northover Subscribers: kristof.beyls, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D68356 llvm-svn: 373655
* [PowerPC] Adjust the naming and operand order of fnmsub patternsJinsong Ji2019-10-031-18/+18
| | | | | | | | | | | | | | | | | | | | | | Summary: This is follow up patch of https://reviews.llvm.org/D67595. Adjust naming and the Commutable operands for additional patterns to make it easier to read. The testcase update also show that we can save some unecessary fmr as well. Reviewers: #powerpc, steven.zhang, hfinkel, nemanjai Reviewed By: #powerpc, nemanjai Subscribers: wuzish, hiraditya, kbarton, MaskRay, shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68112 llvm-svn: 373652
* [NFC] Fix unused variable in release buildsJordan Rupprecht2019-10-031-1/+2
| | | | llvm-svn: 373646
* [X86] Add v32i8 shuffle lowering strategy to recognize two v4i64 vectors ↵Craig Topper2019-10-031-0/+44
| | | | | | | | | | | | | truncated to v4i8 and concatenated into the lower 8 bytes with undef/zero upper bytes. This patch recognizes the shuffle pattern we get from a v8i64->v8i8 truncate when v8i64 isn't a legal type. With VLX we can use two VTRUNCs, unpckldq, and a insert_subvector. Diffrential Revision: https://reviews.llvm.org/D68374 llvm-svn: 373645
* [X86] matchShuffleWithSHUFPD - use Zeroable element mask directly. NFCI.Simon Pilgrim2019-10-031-7/+7
| | | | | | | | | | We can make use of the Zeroable mask to indicate which elements we can safely set to zero instead of creating a target shuffle mask on the fly. This only leaves one user of createTargetShuffleMask which we can hopefully get rid of in a similar manner. This is part of the work to fix PR43024 and allow us to use SimplifyDemandedElts to simplify shuffle chains - we need to get to a point where the target shuffle masks isn't adjusted by its source inputs in setTargetShuffleZeroElements but instead we cache them in a parallel Zeroable mask. llvm-svn: 373641
* AMDGPU/GlobalISel: Handle RegBankSelect of G_INSERT_VECTOR_ELTMatt Arsenault2019-10-031-5/+77
| | | | llvm-svn: 373639
* AMDGPU/GlobalISel: Split 64-bit vector extracts during RegBankSelectMatt Arsenault2019-10-032-167/+271
| | | | | | | | Register indexing 64-bit elements is possible on the SALU, but not the VALU. Handle splitting this into two 32-bit indexes. Extend waterfall loop handling to allow moving a range of instructions. llvm-svn: 373638
* AMDGPU/GlobalISel: Allow VGPR to index SGPR registerMatt Arsenault2019-10-031-4/+6
| | | | | | | | We can still do a waterfall loop over the index if using a VGPR to index an SGPR. The result will still be a VGPR, but we can avoid the wide copy of the source register to a VGPR. llvm-svn: 373637
* AMDGPU/GlobalISel: Fix mutationIsSane assert v8s8 andMatt Arsenault2019-10-031-2/+3
| | | | | | This would try to do FewerElements to v9s8 llvm-svn: 373635
* AMDGPU/SILoadStoreOptimizer: Optimize scanning for mergeable instructionsTom Stellard2019-10-031-82/+185
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This adds a pre-pass to this optimization that scans through the basic block and generates lists of mergeable instructions with one list per unique address. In the optimization phase instead of scanning through the basic block for mergeable instructions, we now iterate over the lists generated by the pre-pass. The decision to re-optimize a block is now made per list, so if we fail to merge any instructions with the same address, then we do not attempt to optimize them in future passes over the block. This will help to reduce the time this pass spends re-optimizing instructions. In one pathological test case, this change reduces the time spent in the SILoadStoreOptimizer from 0.2s to 0.03s. This restructuring will also make it possible to implement further solutions in this pass, because we can now add less expensive checks to the pre-pass and filter instructions out early which will avoid the need to do the expensive scanning during the optimization pass. For example, checking for adjacent offsets is an inexpensive test we can move to the pre-pass. Reviewers: arsenm, pendingchaos, rampitec, nhaehnle, vpykhtin Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65961 llvm-svn: 373630
* [ModuloSchedule] removeBranch() *before* creating the trip count conditionJames Molloy2019-10-031-2/+1
| | | | | | | | | | | | | | The Hexagon code assumes there's no existing terminator when inserting its trip count condition check. This causes swp-stages5.ll to break. The generated code looks good to me, it is likely a permutation. I have disabled the new codegen path to keep everything green and will investigate along with the other 3-4 tests that have different codegen. Fixes expensive-checks build. llvm-svn: 373629
* [BPF] Handle offset reloc endpoint ending in the middle of chain properlyYonghong Song2019-10-031-118/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During studying support for bitfield, I found an issue for an example like the one in test offset-reloc-middle-chain.ll. struct t1 { int c; }; struct s1 { struct t1 b; }; struct r1 { struct s1 a; }; #define _(x) __builtin_preserve_access_index(x) void test1(void *p1, void *p2, void *p3); void test(struct r1 *arg) { struct s1 *ps = _(&arg->a); struct t1 *pt = _(&arg->a.b); int *pi = _(&arg->a.b.c); test1(ps, pt, pi); } The IR looks like: %0 = llvm.preserve.struct.access(base, ...) %1 = llvm.preserve.struct.access(%0, ...) %2 = llvm.preserve.struct.access(%1, ...) using %0, %1 and %2 In this case, we need to generate three relocatiions corresponding to chains: (%0), (%0, %1) and (%0, %1, %2). After collecting all the chains, the current implementation process each chain (in a map) with code generation sequentially. For example, after (%0) is processed, the code may look like: %0 = base + special_global_variable // llvm.preserve.struct.access(base, ...) is delisted // from the instruction stream. %1 = llvm.preserve.struct.access(%0, ...) %2 = llvm.preserve.struct.access(%1, ...) using %0, %1 and %2 When processing chain (%0, %1), the current implementation tries to visit intrinsic llvm.preserve.struct.access(base, ...) to get some of its properties and this caused segfault. This patch fixed the issue by remembering all necessary information (kind, metadata, access_index, base) during analysis phase, so in code generation phase there is no need to examine the intrinsic call instructions. This also simplifies the code. Differential Revision: https://reviews.llvm.org/D68389 llvm-svn: 373621
* Revert "[Alignment][NFC] Allow constexpr Align"Guillaume Chatelet2019-10-031-1/+1
| | | | | | This reverts commit b3af236fb5fc6e50fcc1b54d868f0bff557f3fb1. llvm-svn: 373619
* [RISCV] Add obsolete aliases of fscsr, frcsr (fssr, frsr)Edward Jones2019-10-031-0/+6
| | | | | | | | These old aliases were renamed, but are still used by some projects (eg newlib). Differential Revision: https://reviews.llvm.org/D68392 llvm-svn: 373618
* [yaml2obj] - Add a Size tag support for SHT_LLVM_ADDRSIG sections.George Rimar2019-10-032-6/+12
| | | | | | | | | It allows using "Size" with or without "Content" in YAML descriptions of SHT_LLVM_ADDRSIG sections. Differential revision: https://reviews.llvm.org/D68334 llvm-svn: 373610
* Recommit r373598 "[yaml2obj/obj2yaml] - Add support for SHT_LLVM_ADDRSIG ↵George Rimar2019-10-032-0/+67
| | | | | | | | | | | | | | | | | sections." Fix: call `consumeError()` for a case missed. Original commit message: SHT_LLVM_ADDRSIG is described here: https://llvm.org/docs/Extensions.html#sht-llvm-addrsig-section-address-significance-table This patch teaches tools to dump them and to parse the YAML declarations of such sections. Differential revision: https://reviews.llvm.org/D68333 llvm-svn: 373606
* [PGO] Refactor Value Profiling into a plugin based oracle and create a well ↵Bardia Mahjour2019-10-035-120/+280
| | | | | | | | | | | | | | | | | | | | | | defined API for the plugins. Summary: This PR creates a utility class called ValueProfileCollector that tells PGOInstrumentationGen and PGOInstrumentationUse what to value-profile and where to attach the profile metadata. It then refactors logic scattered in PGOInstrumentation.cpp into two plugins that plug into the ValueProfileCollector. Authored By: Wael Yehia <wyehia@ca.ibm.com> Reviewer: davidxl, tejohnson, xur Reviewed By: davidxl, tejohnson, xur Subscribers: llvm-commits Tag: #llvm Differential Revision: https://reviews.llvm.org/D67920 Patch By Wael Yehia <wyehia@ca.ibm.com> llvm-svn: 373601
* [AArch64][SVE] Adding patterns for floating point SVE add instructions.Ehsan Amiri2019-10-032-12/+14
| | | | llvm-svn: 373600
* Revert r373598 "[yaml2obj/obj2yaml] - Add support for SHT_LLVM_ADDRSIG ↵George Rimar2019-10-032-67/+0
| | | | | | | | | sections." It broke BB: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/18655/steps/test/logs/stdio llvm-svn: 373599
* [yaml2obj/obj2yaml] - Add support for SHT_LLVM_ADDRSIG sections.George Rimar2019-10-032-0/+67
| | | | | | | | | | | SHT_LLVM_ADDRSIG is described here: https://llvm.org/docs/Extensions.html#sht-llvm-addrsig-section-address-significance-table This patch teaches tools to dump them and to parse the YAML declarations of such sections. Differential revision: https://reviews.llvm.org/D68333 llvm-svn: 373598
* [Alignment][NFC] Remove StoreInst::setAlignment(unsigned)Guillaume Chatelet2019-10-0313-36/+31
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, bollu, jdoerfert Subscribers: hiraditya, asbirlea, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68268 llvm-svn: 373595
* [mips] Push `fixup_Mips_LO16` fixup for `jialc` and `jic` instructionsSimon Atanasyan2019-10-031-2/+5
| | | | llvm-svn: 373591
* [AArch64] Static (de)allocation of SVE stack objects.Sander de Smalen2019-10-036-13/+173
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adds support to AArch64FrameLowering to allocate fixed-stack SVE objects. The focus of this patch is purely to allow the stack frame to allocate/deallocate space for scalable SVE objects. More dynamic allocation (at compile-time, i.e. determining placement of SVE objects on the stack), or resolving frame-index references that include scalable-sized offsets, are left for subsequent patches. SVE objects are allocated in the stack frame as a separate region below the callee-save area, and above the alignment gap. This is done so that the SVE objects can be accessed directly from the FP at (runtime) VL-based offsets to benefit from using the VL-scaled addressing modes. The layout looks as follows: +-------------+ | stack arg | +-------------+ | Callee Saves| | X29, X30 | (if available) |-------------| <- FP (if available) | : | | SVE area | | : | +-------------+ |/////////////| alignment gap. | : | | Stack objs | | : | +-------------+ <- SP after call and frame-setup SVE and non-SVE stack objects are distinguished using different StackIDs. The offsets for objects with TargetStackID::SVEVector should be interpreted as purely scalable offsets within their respective SVE region. Reviewers: thegameg, rovka, t.p.northover, efriedma, rengolin, greened Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D61437 llvm-svn: 373585
OpenPOWER on IntegriCloud