summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* DebugInfo: Use a 64 bit type for the subrangeDavid Blaikie2014-04-031-4/+4
| | | | | | | | | | | While we were encoding 64 bit values (data8) in the subrange itself, using a 32 bit type for the subrange was still confusing the gdb. Oh, and make it unsigned too. As the comment points out, this could be pushed into the frontend so that it would be 32 or 64 bit as appropriate, etc. llvm-svn: 205512
* [CodeGen] Fix peephole optimizer bug introduced in r205481. Fixes PR19318.Lang Hames2014-04-031-9/+11
| | | | | | | | I should have read that comment a little more carefully. ;) Regression test in the works, committing in the mean time to un-break people. llvm-svn: 205511
* Account for scalarization costs in BasicTTI::getMemoryOpCost for extending ↵Hal Finkel2014-04-031-2/+24
| | | | | | | | | | | | | | | | | | | | vector loads When a vector type legalizes to a larger vector type, and the target does not support the associated extending load (or truncating store), then legalization will scalarize the load (or store) resulting in an associated scalarization cost. BasicTTI::getMemoryOpCost needs to account for this. Between this, and r205487, PowerPC on the P7 with VSX enabled shows: MultiSource/Benchmarks/PAQ8p/paq8p: 43% speedup SingleSource/Benchmarks/BenchmarkGame/puzzle: 51% speedup SingleSource/UnitTests/Vectorizer/gcc-loops 28% speedup (some of these are new; some of these, such as PAQ8p, just reverse regressions that VSX support would trigger) llvm-svn: 205495
* Fix multi-register costs in BasicTTI::getCastInstrCostHal Finkel2014-04-021-1/+2
| | | | | | | | | | | | | For an cast (extension, etc.), the currently logic predicts a low cost if the associated operation (keyed on the destination type) is legal (or promoted). This is not true when the number of values required to legalize the type is changing. For example, <8 x i16> being sign extended by <8 x i32> is not generically cheap on PPC with VSX, even though sign extension to v4i32 is legal, because two output v4i32 values are required compared to the single v8i16 input value, and without custom logic in the target, this conversion will scalarize. llvm-svn: 205487
* [CodeGen] Teach the peephole optimizer to remember (and exploit) all foldingLang Hames2014-04-021-35/+44
| | | | | | | | opportunities in the current basic block, rather than just the last one seen. <rdar://problem/16478629> llvm-svn: 205481
* Add comments and test case for [DAG] Keep the opaque constant flag when ↵Juergen Ributzka2014-04-021-1/+5
| | | | | | performing unary constant folding operations (r204737). llvm-svn: 205474
* Simplify resolveFrameIndex() signature.Jim Grosbach2014-04-021-1/+1
| | | | | | | | Just pass a MachineInstr reference rather than an MBB iterator. Creating a MachineInstr& is the first thing every implementation did anyway. llvm-svn: 205453
* ARM: Add support for segmented stacksOliver Stannard2014-04-021-0/+3
| | | | | | Patch by Alex Crichton, ILyoan, Luqman Aden and Svetoslav. llvm-svn: 205430
* clarify commentAdrian Prantl2014-04-021-1/+2
| | | | llvm-svn: 205429
* Adjust comments regarding non-relocated abbrev offset in debug_info.dwoDavid Blaikie2014-04-022-2/+4
| | | | | | | | I'm not sure the comment in the implementation really adds a lot of value (it's clear that we emit zero when no symbol is provided, but it doesn't explain why we would do that). Happy to iterate. llvm-svn: 205386
* Split debug_loc and debug_loc.dwo emission into two separate functionsDavid Blaikie2014-04-022-21/+32
| | | | | | Based on code review feedback from Eric Christopher on r204697 llvm-svn: 205385
* DebugInfo: Introduce DebugLocList to encapsulate a list of DebugLocEntries ↵David Blaikie2014-04-025-12/+39
| | | | | | | | | | | | and an MC Label to refer to them This removes the magic-number-esque code creating/retrieving the same label for a debug_loc entry from two places and removes the last small piece of reusable logic from emitDebugLoc so that there will be less duplication when refactoring it into two functions (one for debug_loc, the other for debug_loc.dwo). llvm-svn: 205382
* Add a doxygen comment to DebugLocEntry::Merge.Adrian Prantl2014-04-011-0/+3
| | | | llvm-svn: 205374
* DebugLocEntry: Actually merge the loc entry when returning true.David Blaikie2014-04-011-1/+5
| | | | | | | | | | Seems we didn't have any test coverage for merging... awesome. So I added some - but hit an llvm-objdump bug while I was there. I'm choosing not to shave that yak right now. Code review feedback/bug catch by Adrian Prantl in r205360. llvm-svn: 205373
* Fix accidental fallthrough in DebugLocEntry::hasSameValueOrLocationDavid Blaikie2014-04-011-5/+10
| | | | | | | | | | No test case (this would invoke UB by examining uninitialized members, etc, at best - and this code is apparently untested anyway - I'm about to fix that) Code review feedback from Adrian Prantl on r205360. llvm-svn: 205367
* Remove unused function DebugLocEntry::isEmptyDavid Blaikie2014-04-011-3/+0
| | | | llvm-svn: 205365
* Refactor out the comparison of the location/value in a DebugLocEntryDavid Blaikie2014-04-011-18/+19
| | | | llvm-svn: 205364
* DebugInfo: Split DebugLocEntry into its own file.David Blaikie2014-04-012-85/+113
| | | | | | | It seems big enough that it deserves its own file - but it is header only, so there's no need for another cpp file, etc. llvm-svn: 205360
* DwarfDebug: Prevent DebugLocEntry merging from coalescing two differentAdrian Prantl2014-04-011-2/+9
| | | | | | | | constants into only the first one. rdar://14874886. llvm-svn: 205357
* Make isSetCCEquivalent respect the TargetBooleanContentsMatt Arsenault2014-04-011-19/+22
| | | | llvm-svn: 205336
* Add helpers for checking if a value is a target boolean constant.Matt Arsenault2014-04-011-0/+48
| | | | llvm-svn: 205335
* DebugInfo: Factor out common functionality for rendering debug_loc and ↵David Blaikie2014-04-012-10/+17
| | | | | | | | | debug_loc.dwo location list entries In preparation for refactoring this function into two, one for debug_loc, one for debug_loc.dwo. llvm-svn: 205324
* Cleanup remaining use of removed variable to fix the buildDavid Blaikie2014-04-011-1/+1
| | | | llvm-svn: 205323
* Simplify debug_loc.dwo handling slightly.David Blaikie2014-04-013-8/+3
| | | | llvm-svn: 205322
* DebugInfo: Avoid creating unnecessary/empty line tables and remove the ↵David Blaikie2014-04-011-11/+2
| | | | | | | | | | | | special case of '0' in DwarfCompileUnit::initStmtList by just always using a label difference This moves one case of raw text checking down into the MCStreamer interfaces in the form of a virtual function, even if we ultimately end up consolidating on the one-or-many line tables issue one day, this is nicer in the interim. This just generally streamlines a bunch of use cases into a common code path. llvm-svn: 205287
* LTO type uniquing: store the Decl field of a DIImportedEntity as a DIRef.Adrian Prantl2014-04-011-1/+1
| | | | | | | | | | No other functionality changes, DIBuilder testcase is included in a paired CFE commit. This relaxes the assertion in isScopeRef to also accept subclasses of DIScope. llvm-svn: 205279
* [Stackmaps] Update the stackmap format to use 64-bit relocations for the ↵Juergen Ributzka2014-03-311-20/+36
| | | | | | | | | | | | function address and properly align all entries. This commit updates the stackmap format to version 1 to indicate the reorganizaion of several fields. This was done in order to align stackmap entries to their natural alignment and to minimize padding. Fixes <rdar://problem/16005902> llvm-svn: 205254
* Change shouldSplitVectorElementType to better match the description.Matt Arsenault2014-03-311-1/+1
| | | | | | Pass the entire vector type, and not just the element. llvm-svn: 205247
* Add an optional ability to expand larger BUILD_VECTORs with shufflesHal Finkel2014-03-311-20/+117
| | | | | | | | | | | | | | | | | | | | | | | This adds the ability to expand large (meaning with more than two unique defined values) BUILD_VECTOR nodes in terms of SCALAR_TO_VECTOR and (legal) vector shuffles. There is now no limit of the size we are capable of expanding this way, although we don't currently do this for vectors with many unique values because of the default implementation of TLI's shouldExpandBuildVectorWithShuffles function. There is currently no functional change to any existing targets because the new capabilities are not used unless some target overrides the TLI shouldExpandBuildVectorWithShuffles function. As a result, I've not included a test case for the new functionality in this commit, but regression tests will (at least) be added soon when I commit support for the PPC QPX vector instruction set. The benefit of committing this now is that it makes the shouldExpandBuildVectorWithShuffles callback, which had to be added for other reasons regardless, fully functional. I suspect that other targets will also benefit from tuning the heuristic. llvm-svn: 205243
* Add a TLI hook to control when BUILD_VECTOR might be expanded using shufflesHal Finkel2014-03-311-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two general methods for expanding a BUILD_VECTOR node: 1. Use SCALAR_TO_VECTOR on the defined scalar values and then shuffle them together. 2. Build the vector on the stack and then load it. Currently, we use a fixed heuristic: If there are only one or two unique defined values, then we attempt an expansion in terms of SCALAR_TO_VECTOR and vector shuffles (provided that the required shuffle mask is legal). Otherwise, always expand via the stack. Even when SCALAR_TO_VECTOR is not legal, this can still be a good idea depending on what tricks the target can play when lowering the resulting shuffle. If the target can't do anything special, however, and if SCALAR_TO_VECTOR is expanded via the stack, this heuristic leads to sub-optimal code (two stack loads instead of one). Because only the target knows whether the SCALAR_TO_VECTORs and shuffles for a build vector of a particular type are likely to be optimial, this adds a new TLI function: shouldExpandBuildVectorWithShuffles which takes the vector type and the count of unique defined values. If this function returns true, then method (1) will be used, subject to the constraint that all of the necessary shuffles are legal (as determined by isShuffleMaskLegal). If this function returns false, then method (2) is always used. This commit does not enhance the current code to support expanding a build_vector with more than two unique values using shuffles, but I'll commit an implementation of the more-general case shortly. llvm-svn: 205230
* Disable each MachineFunctionPass for 'optnone' functions, unless thatPaul Robinson2014-03-3114-0/+42
| | | | | | | pass normally runs at optimization level None, or is part of the register allocation pipeline. llvm-svn: 205228
* Look at shuffles of build_vectors in DAGCombiner::visitEXTRACT_VECTOR_ELTHal Finkel2014-03-311-7/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | When the loop vectorizer vectorizes code that uses the loop induction variable, we often end up with IR like this: %b1 = insertelement <2 x i32> undef, i32 %v, i32 0 %b2 = shufflevector <2 x i32> %b1, <2 x i32> undef, <2 x i32> zeroinitializer %i = add <2 x i32> %b2, <i32 2, i32 3> If the add in this example is not legal (as is the case on PPC with VSX), it will be scalarized, and we'll end up with a number of extract_vector_elt nodes with the vector shuffle as the input operand, and that vector shuffle is fed by one or more build_vector nodes. By the time that vector operations are expanded, visitEXTRACT_VECTOR_ELT will not create new extract_vector_elt by looking through the vector shuffle (to make sure that no illegal operations are created), and so the extract_vector_elt -> vector shuffle -> build_vector is never simplified to an operand of the build vector. By looking at build_vectors through a shuffle we fix this particular situation, preventing a vector from being built, only to be deconstructed again (for the scalarized add) -- an expensive proposition when this all needs to be done via the stack. We probably want a more comprehensive fix here where we look back recursively through any shuffles to any build_vectors or scalar_to_vectors, etc. but that can come later. llvm-svn: 205179
* Make use of previously generated stores in ↵Hal Finkel2014-03-301-4/+33
| | | | | | | | | | | | | | | | | SelectionDAGLegalize::ExpandExtractFromVectorThroughStack When expanding EXTRACT_VECTOR_ELT and EXTRACT_SUBVECTOR using SelectionDAGLegalize::ExpandExtractFromVectorThroughStack, we store the entire vector and then load the piece we want. This is fine in isolation, but generating a new store (and corresponding stack slot) for each extraction ends up producing code of poor quality. When we scalarize a vector operation (using SelectionDAG::UnrollVectorOp for example) we generate one EXTRACT_VECTOR_ELT for each element in the vector. This used to generate one stored copy of the vector for each element in the vector. Now we search the uses of the vector for a suitable store before generating a new one, which results in much more efficient scalarization code. llvm-svn: 205153
* Avoid storing Twines.Benjamin Kramer2014-03-291-22/+19
| | | | | | While there nested ifs into a helper function. No functionality change. llvm-svn: 205108
* CodeGen: add sensible defaults for the ISD::FROUND operationTim Northover2014-03-291-0/+9
| | | | | | Some exotic types didn't know how to handle FROUND, which ARM64 uses. llvm-svn: 205088
* CodeGenPrep: wrangle IR to exploit AArch64 tbz/tbnz inst.Tim Northover2014-03-292-0/+81
| | | | | | | | | | | | | | | | Given IR like: %bit = and %val, #imm-with-1-bit-set %tst = icmp %bit, 0 br i1 %tst, label %true, label %false some targets can emit just a single instruction (tbz/tbnz in the AArch64 case). However, with ISel acting at the basic-block level, all three instructions need to be together for this to be possible. This adds another transformation to CodeGenPrep to expose these opportunities, if targets opt in via the hook. llvm-svn: 205086
* Provide a target override for the cost of using a callee-saved registerManman Ren2014-03-271-2/+6
| | | | | | | | | for the first time. Thanks Andy for the discussion. rdar://16162005 llvm-svn: 204979
* Canonicalise Windows target triple spellingsSaleem Abdulrasool2014-03-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Construct a uniform Windows target triple nomenclature which is congruent to the Linux counterpart. The old triples are normalised to the new canonical form. This cleans up the long-standing issue of odd naming for various Windows environments. There are four different environments on Windows: MSVC: The MS ABI, MSVCRT environment as defined by Microsoft GNU: The MinGW32/MinGW32-W64 environment which uses MSVCRT and auxiliary libraries Itanium: The MSVCRT environment + libc++ built with Itanium ABI Cygnus: The Cygwin environment which uses custom libraries for everything The following spellings are now written as: i686-pc-win32 => i686-pc-windows-msvc i686-pc-mingw32 => i686-pc-windows-gnu i686-pc-cygwin => i686-pc-windows-cygnus This should be sufficiently flexible to allow us to target other windows environments in the future as necessary. llvm-svn: 204977
* Register Allocator: refactoring and add comments.Manman Ren2014-03-271-35/+58
| | | | | | | | No functionality change. Thanks Andy for reviewing. rdar://16162005 llvm-svn: 204962
* DebugInfo: TargetOptions/MCAsmInfo support for compressed debug info sectionsDavid Blaikie2014-03-271-0/+3
| | | | llvm-svn: 204957
* Prevent alias from pointing to weak aliases.Rafael Espindola2014-03-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds back r204781. Original message: Aliases are just another name for a position in a file. As such, the regular symbol resolutions are not applied. For example, given define void @my_func() { ret void } @my_alias = alias weak void ()* @my_func @my_alias2 = alias void ()* @my_alias We produce without this patch: .weak my_alias my_alias = my_func .globl my_alias2 my_alias2 = my_alias That is, in the resulting ELF file my_alias, my_func and my_alias are just 3 names pointing to offset 0 of .text. That is *not* the semantics of IR linking. For example, linking in a @my_alias = alias void ()* @other_func would require the strong my_alias to override the weak one and my_alias2 would end up pointing to other_func. There is no way to represent that with aliases being just another name, so the best solution seems to be to just disallow it, converting a miscompile into an error. llvm-svn: 204934
* This is a fix for PR# 19051. I noticed code gen differences due to code ↵Ekaterina Romanova2014-03-261-1/+1
| | | | | | motion when running tests with and without the debug info at O2. The problem is in branch folding. A loop wanted to skip the debug info, but actually it didn't do so. llvm-svn: 204865
* Add comments. Addressing review comments from Evan on r204690.Manman Ren2014-03-261-0/+5
| | | | llvm-svn: 204864
* Fix for incorrect address sinking in the presence of potential overflows.Jim Grosbach2014-03-261-1/+8
| | | | | | | | | | | | | In some cases it is possible for CGP to attempt to reuse a base address from another basic block. In those cases we have to be sure that all the address math was either done at the same bit width, or that none of it overflowed before it was extended. Patch by Louis Gerbarg <lgg@apple.com> rdar://16307442 llvm-svn: 204833
* Add @llvm.clear_cache builtinRenato Golin2014-03-261-0/+2
| | | | | | | | | | | | | | | | | Implementing the LLVM part of the call to __builtin___clear_cache which translates into an intrinsic @llvm.clear_cache and is lowered by each target, either to a call to __clear_cache or nothing at all incase the caches are unified. Updating LangRef and adding some tests for the implemented architectures. Other archs will have to implement the method in case this builtin has to be compiled for it, since the default behaviour is to bail unimplemented. A Clang patch is required for the builtin to be lowered into the llvm intrinsic. This will be done next. llvm-svn: 204802
* Follow-up to r204790: don't try to emit line tables if there are no ↵Timur Iskhodzhanov2014-03-261-2/+9
| | | | | | functions with DI in the TU llvm-svn: 204795
* Fix PR19239 - Add support for generating debug info for functions without ↵Timur Iskhodzhanov2014-03-262-16/+12
| | | | | | lexical scopes and/or debug info at all llvm-svn: 204790
* Revert "Prevent alias from pointing to weak aliases."Rafael Espindola2014-03-261-1/+1
| | | | | | | | | This reverts commit r204781. I will follow up to with msan folks to see what is what they were trying to do with aliases to weak aliases. llvm-svn: 204784
* Prevent alias from pointing to weak aliases.Rafael Espindola2014-03-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Aliases are just another name for a position in a file. As such, the regular symbol resolutions are not applied. For example, given define void @my_func() { ret void } @my_alias = alias weak void ()* @my_func @my_alias2 = alias void ()* @my_alias We produce without this patch: .weak my_alias my_alias = my_func .globl my_alias2 my_alias2 = my_alias That is, in the resulting ELF file my_alias, my_func and my_alias are just 3 names pointing to offset 0 of .text. That is *not* the semantics of IR linking. For example, linking in a @my_alias = alias void ()* @other_func would require the strong my_alias to override the weak one and my_alias2 would end up pointing to other_func. There is no way to represent that with aliases being just another name, so the best solution seems to be to just disallow it, converting a miscompile into an error. llvm-svn: 204781
* blockfreq: Implement Pass::releaseMemory()Duncan P. N. Exon Smith2014-03-251-10/+10
| | | | | | | | | | Implement Pass::releaseMemory() in BlockFrequencyInfo and MachineBlockFrequencyInfo. Just delete the private implementation when not in use. Switch to a std::unique_ptr to make the logic more clear. <rdar://problem/14292693> llvm-svn: 204741
OpenPOWER on IntegriCloud