summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* Simplify debug_loc.dwo handling slightly.David Blaikie2014-04-013-8/+3
| | | | llvm-svn: 205322
* DebugInfo: Avoid creating unnecessary/empty line tables and remove the ↵David Blaikie2014-04-011-11/+2
| | | | | | | | | | | | special case of '0' in DwarfCompileUnit::initStmtList by just always using a label difference This moves one case of raw text checking down into the MCStreamer interfaces in the form of a virtual function, even if we ultimately end up consolidating on the one-or-many line tables issue one day, this is nicer in the interim. This just generally streamlines a bunch of use cases into a common code path. llvm-svn: 205287
* LTO type uniquing: store the Decl field of a DIImportedEntity as a DIRef.Adrian Prantl2014-04-011-1/+1
| | | | | | | | | | No other functionality changes, DIBuilder testcase is included in a paired CFE commit. This relaxes the assertion in isScopeRef to also accept subclasses of DIScope. llvm-svn: 205279
* [Stackmaps] Update the stackmap format to use 64-bit relocations for the ↵Juergen Ributzka2014-03-311-20/+36
| | | | | | | | | | | | function address and properly align all entries. This commit updates the stackmap format to version 1 to indicate the reorganizaion of several fields. This was done in order to align stackmap entries to their natural alignment and to minimize padding. Fixes <rdar://problem/16005902> llvm-svn: 205254
* Change shouldSplitVectorElementType to better match the description.Matt Arsenault2014-03-311-1/+1
| | | | | | Pass the entire vector type, and not just the element. llvm-svn: 205247
* Add an optional ability to expand larger BUILD_VECTORs with shufflesHal Finkel2014-03-311-20/+117
| | | | | | | | | | | | | | | | | | | | | | | This adds the ability to expand large (meaning with more than two unique defined values) BUILD_VECTOR nodes in terms of SCALAR_TO_VECTOR and (legal) vector shuffles. There is now no limit of the size we are capable of expanding this way, although we don't currently do this for vectors with many unique values because of the default implementation of TLI's shouldExpandBuildVectorWithShuffles function. There is currently no functional change to any existing targets because the new capabilities are not used unless some target overrides the TLI shouldExpandBuildVectorWithShuffles function. As a result, I've not included a test case for the new functionality in this commit, but regression tests will (at least) be added soon when I commit support for the PPC QPX vector instruction set. The benefit of committing this now is that it makes the shouldExpandBuildVectorWithShuffles callback, which had to be added for other reasons regardless, fully functional. I suspect that other targets will also benefit from tuning the heuristic. llvm-svn: 205243
* Add a TLI hook to control when BUILD_VECTOR might be expanded using shufflesHal Finkel2014-03-311-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two general methods for expanding a BUILD_VECTOR node: 1. Use SCALAR_TO_VECTOR on the defined scalar values and then shuffle them together. 2. Build the vector on the stack and then load it. Currently, we use a fixed heuristic: If there are only one or two unique defined values, then we attempt an expansion in terms of SCALAR_TO_VECTOR and vector shuffles (provided that the required shuffle mask is legal). Otherwise, always expand via the stack. Even when SCALAR_TO_VECTOR is not legal, this can still be a good idea depending on what tricks the target can play when lowering the resulting shuffle. If the target can't do anything special, however, and if SCALAR_TO_VECTOR is expanded via the stack, this heuristic leads to sub-optimal code (two stack loads instead of one). Because only the target knows whether the SCALAR_TO_VECTORs and shuffles for a build vector of a particular type are likely to be optimial, this adds a new TLI function: shouldExpandBuildVectorWithShuffles which takes the vector type and the count of unique defined values. If this function returns true, then method (1) will be used, subject to the constraint that all of the necessary shuffles are legal (as determined by isShuffleMaskLegal). If this function returns false, then method (2) is always used. This commit does not enhance the current code to support expanding a build_vector with more than two unique values using shuffles, but I'll commit an implementation of the more-general case shortly. llvm-svn: 205230
* Disable each MachineFunctionPass for 'optnone' functions, unless thatPaul Robinson2014-03-3114-0/+42
| | | | | | | pass normally runs at optimization level None, or is part of the register allocation pipeline. llvm-svn: 205228
* Look at shuffles of build_vectors in DAGCombiner::visitEXTRACT_VECTOR_ELTHal Finkel2014-03-311-7/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | When the loop vectorizer vectorizes code that uses the loop induction variable, we often end up with IR like this: %b1 = insertelement <2 x i32> undef, i32 %v, i32 0 %b2 = shufflevector <2 x i32> %b1, <2 x i32> undef, <2 x i32> zeroinitializer %i = add <2 x i32> %b2, <i32 2, i32 3> If the add in this example is not legal (as is the case on PPC with VSX), it will be scalarized, and we'll end up with a number of extract_vector_elt nodes with the vector shuffle as the input operand, and that vector shuffle is fed by one or more build_vector nodes. By the time that vector operations are expanded, visitEXTRACT_VECTOR_ELT will not create new extract_vector_elt by looking through the vector shuffle (to make sure that no illegal operations are created), and so the extract_vector_elt -> vector shuffle -> build_vector is never simplified to an operand of the build vector. By looking at build_vectors through a shuffle we fix this particular situation, preventing a vector from being built, only to be deconstructed again (for the scalarized add) -- an expensive proposition when this all needs to be done via the stack. We probably want a more comprehensive fix here where we look back recursively through any shuffles to any build_vectors or scalar_to_vectors, etc. but that can come later. llvm-svn: 205179
* Make use of previously generated stores in ↵Hal Finkel2014-03-301-4/+33
| | | | | | | | | | | | | | | | | SelectionDAGLegalize::ExpandExtractFromVectorThroughStack When expanding EXTRACT_VECTOR_ELT and EXTRACT_SUBVECTOR using SelectionDAGLegalize::ExpandExtractFromVectorThroughStack, we store the entire vector and then load the piece we want. This is fine in isolation, but generating a new store (and corresponding stack slot) for each extraction ends up producing code of poor quality. When we scalarize a vector operation (using SelectionDAG::UnrollVectorOp for example) we generate one EXTRACT_VECTOR_ELT for each element in the vector. This used to generate one stored copy of the vector for each element in the vector. Now we search the uses of the vector for a suitable store before generating a new one, which results in much more efficient scalarization code. llvm-svn: 205153
* Avoid storing Twines.Benjamin Kramer2014-03-291-22/+19
| | | | | | While there nested ifs into a helper function. No functionality change. llvm-svn: 205108
* CodeGen: add sensible defaults for the ISD::FROUND operationTim Northover2014-03-291-0/+9
| | | | | | Some exotic types didn't know how to handle FROUND, which ARM64 uses. llvm-svn: 205088
* CodeGenPrep: wrangle IR to exploit AArch64 tbz/tbnz inst.Tim Northover2014-03-292-0/+81
| | | | | | | | | | | | | | | | Given IR like: %bit = and %val, #imm-with-1-bit-set %tst = icmp %bit, 0 br i1 %tst, label %true, label %false some targets can emit just a single instruction (tbz/tbnz in the AArch64 case). However, with ISel acting at the basic-block level, all three instructions need to be together for this to be possible. This adds another transformation to CodeGenPrep to expose these opportunities, if targets opt in via the hook. llvm-svn: 205086
* Provide a target override for the cost of using a callee-saved registerManman Ren2014-03-271-2/+6
| | | | | | | | | for the first time. Thanks Andy for the discussion. rdar://16162005 llvm-svn: 204979
* Canonicalise Windows target triple spellingsSaleem Abdulrasool2014-03-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Construct a uniform Windows target triple nomenclature which is congruent to the Linux counterpart. The old triples are normalised to the new canonical form. This cleans up the long-standing issue of odd naming for various Windows environments. There are four different environments on Windows: MSVC: The MS ABI, MSVCRT environment as defined by Microsoft GNU: The MinGW32/MinGW32-W64 environment which uses MSVCRT and auxiliary libraries Itanium: The MSVCRT environment + libc++ built with Itanium ABI Cygnus: The Cygwin environment which uses custom libraries for everything The following spellings are now written as: i686-pc-win32 => i686-pc-windows-msvc i686-pc-mingw32 => i686-pc-windows-gnu i686-pc-cygwin => i686-pc-windows-cygnus This should be sufficiently flexible to allow us to target other windows environments in the future as necessary. llvm-svn: 204977
* Register Allocator: refactoring and add comments.Manman Ren2014-03-271-35/+58
| | | | | | | | No functionality change. Thanks Andy for reviewing. rdar://16162005 llvm-svn: 204962
* DebugInfo: TargetOptions/MCAsmInfo support for compressed debug info sectionsDavid Blaikie2014-03-271-0/+3
| | | | llvm-svn: 204957
* Prevent alias from pointing to weak aliases.Rafael Espindola2014-03-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds back r204781. Original message: Aliases are just another name for a position in a file. As such, the regular symbol resolutions are not applied. For example, given define void @my_func() { ret void } @my_alias = alias weak void ()* @my_func @my_alias2 = alias void ()* @my_alias We produce without this patch: .weak my_alias my_alias = my_func .globl my_alias2 my_alias2 = my_alias That is, in the resulting ELF file my_alias, my_func and my_alias are just 3 names pointing to offset 0 of .text. That is *not* the semantics of IR linking. For example, linking in a @my_alias = alias void ()* @other_func would require the strong my_alias to override the weak one and my_alias2 would end up pointing to other_func. There is no way to represent that with aliases being just another name, so the best solution seems to be to just disallow it, converting a miscompile into an error. llvm-svn: 204934
* This is a fix for PR# 19051. I noticed code gen differences due to code ↵Ekaterina Romanova2014-03-261-1/+1
| | | | | | motion when running tests with and without the debug info at O2. The problem is in branch folding. A loop wanted to skip the debug info, but actually it didn't do so. llvm-svn: 204865
* Add comments. Addressing review comments from Evan on r204690.Manman Ren2014-03-261-0/+5
| | | | llvm-svn: 204864
* Fix for incorrect address sinking in the presence of potential overflows.Jim Grosbach2014-03-261-1/+8
| | | | | | | | | | | | | In some cases it is possible for CGP to attempt to reuse a base address from another basic block. In those cases we have to be sure that all the address math was either done at the same bit width, or that none of it overflowed before it was extended. Patch by Louis Gerbarg <lgg@apple.com> rdar://16307442 llvm-svn: 204833
* Add @llvm.clear_cache builtinRenato Golin2014-03-261-0/+2
| | | | | | | | | | | | | | | | | Implementing the LLVM part of the call to __builtin___clear_cache which translates into an intrinsic @llvm.clear_cache and is lowered by each target, either to a call to __clear_cache or nothing at all incase the caches are unified. Updating LangRef and adding some tests for the implemented architectures. Other archs will have to implement the method in case this builtin has to be compiled for it, since the default behaviour is to bail unimplemented. A Clang patch is required for the builtin to be lowered into the llvm intrinsic. This will be done next. llvm-svn: 204802
* Follow-up to r204790: don't try to emit line tables if there are no ↵Timur Iskhodzhanov2014-03-261-2/+9
| | | | | | functions with DI in the TU llvm-svn: 204795
* Fix PR19239 - Add support for generating debug info for functions without ↵Timur Iskhodzhanov2014-03-262-16/+12
| | | | | | lexical scopes and/or debug info at all llvm-svn: 204790
* Revert "Prevent alias from pointing to weak aliases."Rafael Espindola2014-03-261-1/+1
| | | | | | | | | This reverts commit r204781. I will follow up to with msan folks to see what is what they were trying to do with aliases to weak aliases. llvm-svn: 204784
* Prevent alias from pointing to weak aliases.Rafael Espindola2014-03-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Aliases are just another name for a position in a file. As such, the regular symbol resolutions are not applied. For example, given define void @my_func() { ret void } @my_alias = alias weak void ()* @my_func @my_alias2 = alias void ()* @my_alias We produce without this patch: .weak my_alias my_alias = my_func .globl my_alias2 my_alias2 = my_alias That is, in the resulting ELF file my_alias, my_func and my_alias are just 3 names pointing to offset 0 of .text. That is *not* the semantics of IR linking. For example, linking in a @my_alias = alias void ()* @other_func would require the strong my_alias to override the weak one and my_alias2 would end up pointing to other_func. There is no way to represent that with aliases being just another name, so the best solution seems to be to just disallow it, converting a miscompile into an error. llvm-svn: 204781
* blockfreq: Implement Pass::releaseMemory()Duncan P. N. Exon Smith2014-03-251-10/+10
| | | | | | | | | | Implement Pass::releaseMemory() in BlockFrequencyInfo and MachineBlockFrequencyInfo. Just delete the private implementation when not in use. Switch to a std::unique_ptr to make the logic more clear. <rdar://problem/14292693> llvm-svn: 204741
* blockfreq: Use const in MachineBlockFrequencyInfoDuncan P. N. Exon Smith2014-03-252-10/+10
| | | | | | <rdar://problem/14292693> llvm-svn: 204740
* [DAG] Keep the opaque constant flag when performing unary constant folding ↵Juergen Ributzka2014-03-251-6/+12
| | | | | | | | | | | | operations. Usually opaque constants shouldn't be folded, unless they are simple unary operations that don't create new constants. Although this shouldn't drop the opaque constant flag. This commit fixes this. Related to <rdar://problem/14774662> llvm-svn: 204737
* Fix creating illegal setcc cond codes.Matt Arsenault2014-03-251-10/+18
| | | | | | | | | | | | | | If GT/UGT or LT/ULT were set to expand, a comparison with a constant would replace it with the illegal cond code. There are several more places later in this function that will have the same basic problem. Theoretically R600 should hit this problem for a test, but for some reason it doesn't. llvm-svn: 204727
* WinCOFF: Add support for -fdata-sectionsDavid Majnemer2014-03-251-3/+9
| | | | | | | | | | | | This is a pretty straight forward translation for COFF, we just need to stick the data in a COMDAT section marked as IMAGE_COMDAT_SELECT_NODUPLICATES. N.B. We must be careful to avoid sticking entities with private linkage in COMDAT groups. COFF is pretty hostile to the renaming of entities so we must be careful to disallow GlobalVariables with unstable names. llvm-svn: 204703
* DebugInfo: Add GNU_addr_base and GNU_ranges_base only when there are ↵David Blaikie2014-03-251-13/+11
| | | | | | | | addresses or ranges Based on code review feedback from Eric in r204672. llvm-svn: 204702
* DebugInfo: Support debug_loc under fissionDavid Blaikie2014-03-253-15/+34
| | | | | | | | | | | | | | | | | | | | | | Implement debug_loc.dwo, as well as llvm-dwarfdump support for dumping this section. Outlined in the DWARF5 spec and http://gcc.gnu.org/wiki/DebugFission the debug_loc.dwo section has more variation than the standard debug_loc, allowing 3 different forms of entry (plus the end of list entry). GCC seems to, and Clang certainly, only use one form, so I've just implemented dumping support for that for now. It wasn't immediately obvious that there was a good refactoring to share the implementation of dumping support between debug_loc and debug_loc.dwo, so they're separate for now - ideas welcome or I may come back to it at some point. As per a comment in the code, we could choose different forms that may reduce the number of debug_addr entries we emit, but that will require further study. llvm-svn: 204697
* DebugInfo: Remove unnecessary zero-size checkDavid Blaikie2014-03-251-3/+0
| | | | | | | | | This seems excessive - switching section isn't expensive (or if it is we're already being wasteful, since we emitted the debug_loc section symbol earlier anyway) and otherwise there's no work that happens in this function when the list is empty. llvm-svn: 204696
* Register Allocator: check other options before using a CSR for the first time.Manman Ren2014-03-251-6/+57
| | | | | | | | | | | | | | | | | | | When register allocator's stage is RS_Spill, we choose spill over using the CSR for the first time, if the spill cost is lower than CSRCost. When register allocator's stage is < RS_Split, we choose pre-splitting over using the CSR for the first time, if the cost of splitting is lower than CSRCost. CSRCost is set with command-line option "regalloc-csr-first-time-cost". The default value is 0 to generate the same codes as before this commit. With a value of 15 (1 << 14 is the entry frequency), I measured performance gain of 3% on 253.perlbmk and 1.7% on 197.parser, with instrumented PGO, on an arm device. rdar://16162005 llvm-svn: 204690
* Register Allocator: refactoring (no functionality change).Manman Ren2014-03-241-6/+30
| | | | | | | | | Factor out two functions calculateRegionSplitCost and doRegionSplit from tryRegionSplit. These two functions will be used in coming patches. rdar://16162005 llvm-svn: 204684
* DebugInfo: Simplify debug loc list handling by keeping separate listsDavid Blaikie2014-03-243-33/+17
| | | | | | | Rather than using a flat list with "empty" entries (ala the actual on-disk format), keep separate lists for each variable. llvm-svn: 204680
* DwarfDebug: Simplify debug_loc mergingDavid Blaikie2014-03-243-29/+13
| | | | | | | | | | No functional change intended. Merging up-front rather than delaying this task until later. This just seems simpler and more efficient (avoiding growing the debug loc list only to have to skip over those post-merged entries, etc). llvm-svn: 204679
* Get rid of an unnecessary use of the * and & operators.Adrian Prantl2014-03-241-1/+1
| | | | llvm-svn: 204673
* DebugInfo: Add DW_AT_GNU_ranges_base to skeleton CUsDavid Blaikie2014-03-241-4/+9
| | | | | | | | | This is used to avoid relocations in the dwo file by allowing DW_AT_ranges specified in debug_info.dwo to be relative to this base address. (r204667 implements the base-relative DW_AT_ranges side of this) llvm-svn: 204672
* DebugInfo: Implement relative addressing for DW_AT_ranges under fissionDavid Blaikie2014-03-241-2/+9
| | | | | | | | This removes the debug_ranges relocations from debug_info.dwo (but doesn't implement the DW_AT_GNU_ranges_base which is also necessary for correct functioning) llvm-svn: 204668
* DebugInfo: Don't emit relocations to abbreviations in debug_info.dwoDavid Blaikie2014-03-242-2/+6
| | | | llvm-svn: 204667
* DwarfDebug: Remove an unused parameterDavid Blaikie2014-03-242-9/+4
| | | | llvm-svn: 204665
* Remove unused parameterDavid Blaikie2014-03-243-10/+6
| | | | llvm-svn: 204663
* SelectionDAG: Allow promotion of SELECT nodes from float to int typesTom Stellard2014-03-241-1/+2
| | | | | | | | And vice-versa, as long as the types are the same width. There are a few R600 tests that will cover this. llvm-svn: 204616
* WinCOFF: Add support for -ffunction-sectionsDavid Majnemer2014-03-231-4/+9
| | | | | | | | This is a pretty straight forward translation for COFF, we just need to stick the function in a COMDAT section marked as IMAGE_COMDAT_SELECT_NODUPLICATES. llvm-svn: 204565
* remove a bunch of unused private methodsNuno Lopes2014-03-233-21/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | found with a smarter version of -Wunused-member-function that I'm playwing with. Appologies in advance if I removed someone's WIP code. include/llvm/CodeGen/MachineSSAUpdater.h | 1 include/llvm/IR/DebugInfo.h | 3 lib/CodeGen/MachineSSAUpdater.cpp | 10 -- lib/CodeGen/PostRASchedulerList.cpp | 1 lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | 10 -- lib/IR/DebugInfo.cpp | 12 -- lib/MC/MCAsmStreamer.cpp | 2 lib/Support/YAMLParser.cpp | 39 --------- lib/TableGen/TGParser.cpp | 16 --- lib/TableGen/TGParser.h | 1 lib/Target/AArch64/AArch64TargetTransformInfo.cpp | 9 -- lib/Target/ARM/ARMCodeEmitter.cpp | 12 -- lib/Target/ARM/ARMFastISel.cpp | 84 -------------------- lib/Target/Mips/MipsCodeEmitter.cpp | 11 -- lib/Target/Mips/MipsConstantIslandPass.cpp | 12 -- lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp | 21 ----- lib/Target/NVPTX/NVPTXISelDAGToDAG.h | 2 lib/Target/PowerPC/PPCFastISel.cpp | 1 lib/Transforms/Instrumentation/AddressSanitizer.cpp | 2 lib/Transforms/Instrumentation/BoundsChecking.cpp | 2 lib/Transforms/Instrumentation/MemorySanitizer.cpp | 1 lib/Transforms/Scalar/LoopIdiomRecognize.cpp | 8 - lib/Transforms/Scalar/SCCP.cpp | 1 utils/TableGen/CodeEmitterGen.cpp | 2 24 files changed, 2 insertions(+), 261 deletions(-) llvm-svn: 204560
* [DAG] Fix an assertion failure caused by an invalid cast in method ↵Andrea Di Biagio2014-03-222-8/+7
| | | | | | | | | | | | 'BuildVectorSDNode::isConstantSplat' This patch renames method 'isConstantSplat' as 'getConstantSplatValue' (mainly for consistency reasons), and rewrites its logic to ensure that we always perform a legal 'cast<ConstantSDNode>'. Added test shift-combine-crash.ll to verify that DAGCombiner no longer crashes with an assertion failure in the attempt to simplify a vector shift by a vector of all undef counts. llvm-svn: 204536
* Delete stale comment. Thanks, Eric!Adrian Prantl2014-03-211-1/+0
| | | | llvm-svn: 204530
* Dwarf Debug: Remove some cargo-cult type uniquing. Scopes do not haveAdrian Prantl2014-03-211-1/+1
| | | | | | | an ID, so this is a noop. Thanks Manman for catching this! llvm-svn: 204528
OpenPOWER on IntegriCloud