summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* LowerTypeTests: Improve performance by optimising type metadata queries.Peter Collingbourne2016-12-061-88/+129
| | | | | | | | | | | | | | | | | | | | | | Requesting metadata for a global is a relatively expensive operation as it involves a map lookup, but it's one that we need to do relatively frequently in this pass to collect the list of type metadata nodes associated with a global. This change improves the performance of type metadata queries by prebuilding data structures that keep the global together with its list of type metadata, and changing the pass to use that data structure wherever we were previously passing global references around. This change also eliminates some O(N^2) behavior by collecting the list of globals associated with each type identifier during the first pass over the list of globals rather than visiting each global to compute that list every time we add a new type identifier. Reduces pass runtime on a module containing Chrome's vtables from over 60s to 0.9s. Differential Revision: https://reviews.llvm.org/D27484 llvm-svn: 288859
* [CodeGen] Fix result type for SMULO/UMULO legalizationEli Friedman2016-12-061-0/+9
| | | | | | | | | | | | | | | On some platforms (like MSP430) the second element of the result structure for SMULO/UMULO may have a shorter type than the one returned by SetCC. We need to truncate it to the right type, or else some incorrect code may be generated later on. This fixes issue https://github.com/rust-lang/rust/issues/37829 Patch by Vadzim Dambrouski! Differential Revision: https://reviews.llvm.org/D27154 llvm-svn: 288857
* AMDGPU: Fix operand name for v_interp_*Matt Arsenault2016-12-061-16/+16
| | | | | | Other VOP instructions call the output vdst llvm-svn: 288856
* [InstSimplify] fixed (?) to not mutate icmpsSanjay Patel2016-12-061-10/+4
| | | | | | | | | | | | | | As Eli noted in the post-commit thread for r288833, the use of swapOperands() may not be allowed in InstSimplify, so I'm removing those calls here pending further review. The swap mutates the icmp, and there doesn't appear to be precedent for instruction mutation in InstSimplify. I didn't actually have any tests for those cases, so I'm adding a few here. llvm-svn: 288855
* AMDGPU/SI: Set correct value for amd_kernel_code_t::kernarg_segment_alignmentTom Stellard2016-12-062-0/+9
| | | | | | | | | | Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D27416 llvm-svn: 288852
* [BDCE/DebugInfo] Preserve llvm.dbg.value's argument.Davide Italiano2016-12-061-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | BDCE has two phases: 1. It asks SimplifyDemandedBits if all the bits of an instruction are dead, and if so, replaces all its uses with the constant zero. 2. Then, it asks SimplifyDemandedBits again if the instruction is really dead (no side effects etc..) and if so, eliminates it. Now, in 1) if all the bits of an instruction are dead, we may end up replacing a dbg use: %call = tail call i32 (...) @g() #4, !dbg !15 tail call void @llvm.dbg.value(metadata i32 %call, i64 0, metadata !8, metadata !16), !dbg !17 -> %call = tail call i32 (...) @g() #4, !dbg !15 tail call void @llvm.dbg.value(metadata i32 0, i64 0, metadata !8, metadata !16), !dbg !17 but not eliminating the call because it may have arbitrary side effects. In other words, we lose some debug informations. This patch fixes the problem making sure that BDCE does nothing with the instruction if it has side effects and no non-dbg uses. Differential Revision: https://reviews.llvm.org/D27471 llvm-svn: 288851
* AMDGPU/SI: Don't move copies of immediates to the VALUTom Stellard2016-12-061-1/+43
| | | | | | | | | | | | | | | Summary: If we write an immediate to a VGPR and then copy the VGPR to an SGPR, we can replace the copy with a S_MOV_B32 sgpr, imm, rather than moving the copy to the SALU. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D27272 llvm-svn: 288849
* GlobalISel: correctly handle small args via memory.Tim Northover2016-12-061-1/+1
| | | | | | | We were rounding size in bits down rather than up, leading to 0-sized slots for i1 (assert!) and bugs for other types not byte-aligned. llvm-svn: 288848
* [X86] Prefer reduced width multiplication over pmulld on SilvermontZvi Rackover2016-12-064-3/+22
| | | | | | | | | | | | | | | | | | | | Summary: Prefer expansions such as: pmullw,pmulhw,unpacklwd,unpackhwd over pmulld. On Silvermont [source: Optimization Reference Manual]: PMULLD has a throughput of 1/11 [instruction/cycles]. PMULHUW/PMULHW/PMULLW have a throughput of 1/2 [instruction/cycles]. Fixes pr31202. Analysis of this issue was done by Fahana Aleen. Reviewers: wmi, delena, mkuper Subscribers: RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D27203 llvm-svn: 288844
* [DAGCombine] Add (sext_in_reg (zext x)) -> (sext x) combineSimon Pilgrim2016-12-061-0/+9
| | | | | | | | Handle the case where a sign extension has ended up being split into separate stages (typically to get around vector legal ops) and a zext + sext_in_reg gets inserted. Differential Revision: https://reviews.llvm.org/D27461 llvm-svn: 288842
* [InstSimplify] add folds for and-of-icmps with same operandsSanjay Patel2016-12-061-0/+33
| | | | | | | | | | | All of these (and a few more) are already handled by InstCombine, but we shouldn't have to wait until then to simplify these because they're cheap to deal with here in InstSimplify. This is the 'and' sibling of the earlier 'or' patch: https://reviews.llvm.org/rL288833 llvm-svn: 288841
* GlobalISel: fall back gracefully when we hit unhandled legalizer default.Tim Northover2016-12-061-1/+3
| | | | llvm-svn: 288840
* [SelectionDAG] We can ignore knownbits from an undef shuffle vector index if ↵Simon Pilgrim2016-12-061-3/+3
| | | | | | we don't actually demand that element llvm-svn: 288839
* GlobalISel: handle G_SEQUENCE fallbacks gracefully.Tim Northover2016-12-062-0/+7
| | | | | | | | | | There were two problems: + AArch64 was reusing random data from its binary op tables, which is complete nonsense for G_SEQUENCE. + Even when AArch64 gave up and said it couldn't handle G_SEQUENCE, the generic code asserted. llvm-svn: 288836
* GlobalISel: allow G_SELECT instructions for pointers.Tim Northover2016-12-061-4/+5
| | | | llvm-svn: 288835
* GlobalISel: stop the legalizer from trying to handle oddly-sized types.Tim Northover2016-12-061-0/+5
| | | | | | | | It'll almost immediately fail because it always tries to half/double the size until it finds a legal one. Unfortunately, this triggers an assertion preventing the DAG fallback from being possible. llvm-svn: 288834
* [InstSimplify] add folds for or-of-icmps with same operandsSanjay Patel2016-12-061-2/+34
| | | | | | | | All of these (and a few more) are already handled by InstCombine, but we shouldn't have to wait until then to simplify these because they're cheap to deal with here in InstSimplify. llvm-svn: 288833
* Avoid repeated calls to Op.getOpcode(). NFCI.Simon Pilgrim2016-12-061-3/+4
| | | | llvm-svn: 288814
* [globalisel][aarch64] Fix unintended assumptions about PartialMappingIdx. NFC.Daniel Sanders2016-12-062-18/+21
| | | | | | | | | | | | | | | | | | | | | | Summary: This is NFC but prevents assertions when PartialMappingIdx is tablegen-erated. The assumptions were: 1) FirstGPR is 0 2) FirstGPR is the first of the First* enumerators. GPR32 is changed to 1 to demonstrate that assumption #1 is fixed. #2 will be covered by a subsequent patch that tablegen-erates information and swaps the order of GPR and FPR as a side effect. Depends on D27336 Reviewers: ab, t.p.northover, qcolombet Subscribers: aemerson, rengolin, vkalintiris, dberris, rovka, llvm-commits Differential Revision: https://reviews.llvm.org/D27337 llvm-svn: 288812
* [globalisel][aarch64] Replace magic numbers with corresponding enumerators ↵Daniel Sanders2016-12-061-29/+49
| | | | | | | | | | | | in ValMappings. NFC Reviewers: ab, t.p.northover, qcolombet Subscribers: aemerson, rengolin, vkalintiris, dberris, llvm-commits, rovka Differential Revision: https://reviews.llvm.org/D27336 llvm-svn: 288810
* [globalisel][aarch64] Correct argument names in comments.Daniel Sanders2016-12-061-3/+3
| | | | llvm-svn: 288809
* [ARM] Better error message for invalid flag-preserving Thumb1 instsOliver Stannard2016-12-061-1/+4
| | | | | | | | | | When we see a non flag-setting instruction for which only the flag-setting version is available in Thumb1, we should give a better error message than "invalid instruction". Differential Revision: https://reviews.llvm.org/D27414 llvm-svn: 288805
* [X86][AVX512] Detect repeated constant patterns in BUILD_VECTOR suitable for ↵Ayman Musa2016-12-061-3/+116
| | | | | | | | | | | | broadcasting. Check if a build_vector node includes a repeated constant pattern and replace it with a broadcast of that pattern. For example: "build_vector <0, 1, 2, 3, 0, 1, 2, 3>" would be replaced by "broadcast <0, 1, 2, 3>" Differential Revision: https://reviews.llvm.org/D26802 llvm-svn: 288804
* [PowerPC] Improvements for BUILD_VECTOR Vol. 4Nemanja Ivanovic2016-12-062-27/+126
| | | | | | | | | | | | This is the final patch in the series of patches that improves BUILD_VECTOR handling on PowerPC. This adds a few peephole optimizations to remove redundant instructions. It also adds a large test case which encompasses a large set of code patterns that build vectors - this test case was the motivator for this series of patches. Differential Revision: https://reviews.llvm.org/D26066 llvm-svn: 288800
* [globalisel][aarch64] Prefix PartialMappingIdx enumerators with 'PMI_' to ↵Daniel Sanders2016-12-062-66/+70
| | | | | | | | fit coding standards. This also stops things like 'None' polluting the llvm::AArch64 namespace. llvm-svn: 288799
* Fix MSVC bool to uint64_t promotion warningSimon Pilgrim2016-12-061-1/+1
| | | | llvm-svn: 288796
* [LCG] Add some much needed asserts and verify runs to uncoverChandler Carruth2016-12-061-4/+12
| | | | | | | | | | | | | | | | | a hilarious bug and fix it. We somehow were never verifying the RefSCCs newly formed when splitting an existing one apart, and when verifying them we weren't really checking the SCC indices mapping effectively. If we had been, it would have been blindingly obvious that right after putting something int `RC.SCCs` we should update `RC.SCCIndices` instead of `SCCIndices` which we were about to clear and rebuild anyways. =[ Anyways, this is thoroughly covered by existing tests now that we actually verify things properly. llvm-svn: 288795
* [framelowering] Improve tracking of first CS pop instruction.Florian Hahn2016-12-061-5/+8
| | | | | | | | | | | | Summary: This patch makes sure FirstCSPop and MBBI never point to DBG_VALUE instructions, which affected the code generated. Reviewers: mkuper, aprantl, MatzeB Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27343 llvm-svn: 288794
* Add missing parens in assert.Sam McCall2016-12-061-1/+1
| | | | | | | | | | Summary: Add missing parens in assert, which warn in GCC. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27448 llvm-svn: 288792
* [PM] Basic cleanups to CGSCC update code, NFC.Chandler Carruth2016-12-061-41/+36
| | | | | | | Just using InstIterator, simpler loop structures, and making better use of the visit callback infrastructure. llvm-svn: 288790
* [X86] Remove another weird scalar sqrt/rcp/rsqrt pattern.Craig Topper2016-12-061-6/+0
| | | | | | | | This pattern turned a vector sqrt/rcp/rsqrt operation of sse_load_f32/f64 into the the scalar instruction for the operation and put undef into the upper bits. For correctness, the resulting code should still perform the sqrt/rcp/rsqrt on the upper bits after the load is extended since that's what the operation asked for. Particularly in the case where the upper bits are 0, in that case we need calculate the sqrt/rcp/rsqrt of the zeroes and keep the result in the upper-bits. This implies we should be using the packed instruction still. The only test case for this pattern is one I just added so there was no coverage of this. llvm-svn: 288784
* [X86] Remove bad pattern that caused 128-bit loads being used by scalar ↵Craig Topper2016-12-061-3/+0
| | | | | | | | sqrt/rcp/rsqrt intrinsics to select the memory form of the corresponding instruction and violate the semantics of the intrinsic. The intrinsics are supposed to pass the upper bits straight through to their output register. This means we need to make sure we still perform the 128-bit load to get those upper bits to pass to give to the instruction since the memory form of the instruction only reads 32 or 64 bits. llvm-svn: 288781
* [X86] Correct pattern for VSQRTSSr_Int, VSQRTSDr_Int, VRCPSSr_Int, and ↵Craig Topper2016-12-061-1/+1
| | | | | | | | | | VRSQRTSSr_Int to not have an IMPLICIT_DEF on the first input. The semantics of the intrinsic are clear and not undefined. The intrinsic takes one argument, the lower bits are affected by the operation and the upper bits should be passed through. The instruction itself takes two operands, the high bits of the first operand are passed through and the low bits of the second operand are modified by the operation. To match this to the intrinsic we should pass the single intrinsic input to both operands. I had to remove the stack folding test for these instructions since they depended on the incorrect behavior. The same register is now used for both inputs so the load can't be folded. llvm-svn: 288779
* [ObjectYAML] First bit of support for encoding DWARF in MachOChris Bieneman2016-12-061-1/+16
| | | | | | This patch adds the starting support for encoding data from the MachO __DWARF segment. The first section supported is the __debug_str section because it is the simplest. llvm-svn: 288774
* [X86] Remove scalar logical op alias instructions. Just use ↵Craig Topper2016-12-064-56/+83
| | | | | | | | | | | | | | | | | | | COPY_FROM/TO_REGCLASS and the normal packed instructions instead Summary: This patch removes the scalar logical operation alias instructions. We can just use reg class copies and use the normal packed instructions instead. This removes the need for putting these instructions in the execution domain fixing tables as was done recently. I removed the loadf64_128 and loadf32_128 patterns as DAG combine creates a narrower load for (extractelt (loadv4f32)) before we ever get to isel. I plan to add similar patterns for AVX512DQ in a future commit to allow use of the larger register class when available. Reviewers: spatel, delena, zvi, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27401 llvm-svn: 288771
* [CMake] Cleanup TableGen include flagsChris Bieneman2016-12-061-0/+2
| | | | | | | | | | It is kinda crazy to have llvm/include and llvm/lib/Target in the include path for every tablegen invocation for every tablegen-like tool. This patch removes those flags from the tablgen function that is called everywhere by instead creating a variable LLVM_TABLEGEN_FLAGS which is setup in the LLVM source directories. This removes TableGen.cmake's dependency on LLVM_MAIN_SRC_DIR, and LLVM_MAIN_INCLUDE_DIR. llvm-svn: 288770
* [LVI] Remove dead code in mergeInPhilip Reames2016-12-061-18/+0
| | | | | | Integers are expressed in the lattice via constant ranges. They can never be represented by constants or not-constants; those are reserved for non-integer types. This code has been dead for literaly years. llvm-svn: 288767
* [LVI] Extract a helper functionPhilip Reames2016-12-061-32/+22
| | | | | | Extracting a helper function out of solveBlockValue makes the contract around the cache much easier to understand. llvm-svn: 288766
* [LVI] Hide the last markX function on LVILatticeValPhilip Reames2016-12-061-10/+10
| | | | | | This completes a small series of patches to hide the stateful updates of LVILatticeVal from the consuming code. The only remaining stateful API is mergeIn. llvm-svn: 288765
* [LVI] Hide a confusing internal interfacePhilip Reames2016-12-061-16/+19
| | | | llvm-svn: 288764
* [llvm] Fix D26214: Move error handling out of MC and to the callers.Mandeep Singh Grang2016-12-061-14/+5
| | | | | | | | | | | | Summary: Related clang patch; https://reviews.llvm.org/D27360 Reviewers: t.p.northover, grosbach, compnerd, echristo Subscribers: compnerd, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D27359 llvm-svn: 288763
* [LVI] Remove duplicate code using existing helper functionPhilip Reames2016-12-061-6/+2
| | | | llvm-svn: 288761
* Revert "[SCCP] Remove manual folding of terminator instructions."Davide Italiano2016-12-061-2/+27
| | | | | | This reverts commit r288725 as it broke a bot. llvm-svn: 288759
* AMDGPU: Don't required structured CFGMatt Arsenault2016-12-061-2/+3
| | | | | | | | | | | The structured CFG is just an aid to inserting exec mask modification instructions, once that is done we don't really need it anymore. We also do not analyze blocks with terminators that modify exec, so this should only be impacting true branches. llvm-svn: 288744
* Summary: Currently there is no way to disable deprecated warning from asm ↵Weiming Zhao2016-12-051-1/+2
| | | | | | | | | | | | | | | | | | | | like this clang -target arm deprecated-asm.s -c deprecated-asm.s:30:9: warning: use of SP or PC in the list is deprecated stmia r4!, {r12-r14} We have to have an option what can disable it. Patched by Yin Ma! Reviewers: joey, echristo, weimingz Subscribers: llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D27219 llvm-svn: 288734
* [libFuzzer] refactor the code to allow collecting features in different ↵Kostya Serebryany2016-12-053-37/+42
| | | | | | ways. Also initialize a couple of Fuzzer:: members that might have been used uninitialized :( llvm-svn: 288731
* GlobalISel: avoid looking too closely at PHIs when we bail.Tim Northover2016-12-051-9/+11
| | | | | | | | The function used to finish off PHIs by adding the relevant basic blocks can fail if we're aborting and still don't actually have the needed MachineBasicBlocks. So avoid trying in that case. llvm-svn: 288727
* [SCCP] Remove manual folding of terminator instructions.Davide Italiano2016-12-051-27/+2
| | | | | | | | | | | | | There are two cases handled here: 1) a branch on undef 2) a switch with an undef condition. Both cases are currently handled by ResolvedUndefsIn. If we have a branch on undef, we force its value to false (which is trivially foldable). If we have a switch on undef, we force to the first constant (which is also foldable). llvm-svn: 288725
* [TableGen] Centralize/Unify error handling.Davide Italiano2016-12-051-23/+21
| | | | llvm-svn: 288724
* [pdb] handle missing pdb streams more gracefullyBob Haarman2016-12-052-30/+75
| | | | | | | | | | | | Summary: The code we use to read PDBs assumed that streams we ask it to read exist, and would read memory outside a vector and crash if this wasn't the case. This would, for example, cause llvm-pdbdump to crash on PDBs generated by lld. This patch handles such cases more gracefully: the PDB reading code in LLVM now reports errors when asked to get a stream that is not present, and llvm-pdbdump will report missing streams and continue processing streams that are present. Reviewers: ruiu, zturner Subscribers: thakis, amccarth Differential Revision: https://reviews.llvm.org/D27325 llvm-svn: 288722
OpenPOWER on IntegriCloud