summaryrefslogtreecommitdiffstats
path: root/llvm
Commit message (Collapse)AuthorAgeFilesLines
* DwarfDebug: Prevent DebugLocEntry merging from coalescing two differentAdrian Prantl2014-04-012-2/+109
| | | | | | | | constants into only the first one. rdar://14874886. llvm-svn: 205357
* [PowerPC] Add some missing VSX bitcast patternsHal Finkel2014-04-012-0/+16
| | | | llvm-svn: 205352
* If isKnownWindowsMSVCEnvironment then getOS == Triple::Win32 andYaron Keren2014-04-012-3/+2
| | | | | | Environment == Triple::MSVC so it will never be MinGW or Cygwin. llvm-svn: 205349
* Implement X86TTI::getUnrollingPreferencesHal Finkel2014-04-014-10/+197
| | | | | | | | | | | | | | | | | | | | | | | | | | | This provides an initial implementation of getUnrollingPreferences for x86. getUnrollingPreferences is used by the generic (concatenation) unroller, which is distinct from the unrolling done by the loop vectorizer. Many modern x86 cores have some kind of uop cache and loop-stream detector (LSD) used to efficiently dispatch small loops, and taking full advantage of this requires unrolling small loops (small here means 10s of uops). These caches also have limits on the number of taken branches in the loop, and so we also cap the loop unrolling factor based on the maximum "depth" of the loop. This is currently calculated with a partial DFS traversal (partial because it will stop early if the path length grows too much). This is still an approximation, and one that is both conservative (because it does not account for branches eliminated via block placement) and optimistic (because it is only recording the maximum depth over minimum paths). Nevertheless, because the loops that fit in these uop caches are so small, it is not clear how much the details matter. The original set of patches posted for review produced the following test-suite performance results (from the TSVC benchmark) at that time: ControlLoops-dbl - 13% speedup ControlLoops-flt - 15% speedup Reductions-dbl - 7.5% speedup llvm-svn: 205348
* Add some additional fields to TTI::UnrollingPreferencesHal Finkel2014-04-012-4/+25
| | | | | | | | | | | | | | | | | | | In preparation for an upcoming commit implementing unrolling preferences for x86, this adds additional fields to the UnrollingPreferences structure: - PartialThreshold and PartialOptSizeThreshold - Like Threshold and OptSizeThreshold, but used when not fully unrolling. These are necessary because we need different thresholds for full unrolling from those used when partially unrolling (the full unrolling thresholds are generally going to be larger). - MaxCount - A cap on the unrolling factor when partially unrolling. This can be used by a target to prevent the unrolled loop from exceeding some resource limit independent of the loop size (such as number of branches). There should be no functionality change for any in-tree targets. llvm-svn: 205347
* Use TopTTI->getGEPCost from within getUserCostHal Finkel2014-04-011-4/+4
| | | | | | | | | | | The implementation of getUserCost had duplicated (and hard-coded) the default logic in getGEPCost. Instead, it is better to use getGEPCost directly, which limits the default logic to the implementation of one function, and allows targets to override the behavior. No functionality change intended. llvm-svn: 205346
* [mips] Add Octeon cnMips instructions mtmX and mtpXKai Nacke2014-04-013-0/+31
| | | | | | | | | Adds the Octeon cnMips instructions "load multiplier register MPLx" and "load product register Px". Includes tests. Reviews by: Daniel.Sanders@imgtec.com llvm-svn: 205343
* Support segmented stacks on Win64Reid Kleckner2014-04-012-5/+59
| | | | | | | Identical to Win32 method except the GS segment register is used for TLS instead of FS and pvArbitrary is at TEB offset 0x28 instead of 0x14. llvm-svn: 205342
* Fix missing RUN line in testMatt Arsenault2014-04-011-0/+1
| | | | llvm-svn: 205341
* isTargetWindows() renamed to isTargetKnownWindowsMSVC()Yaron Keren2014-04-016-16/+16
| | | | | | | | to reflect its current functionality. Based on Takumi NAKAMURA suggestion. llvm-svn: 205338
* Make isSetCCEquivalent respect the TargetBooleanContentsMatt Arsenault2014-04-012-19/+51
| | | | llvm-svn: 205336
* Add helpers for checking if a value is a target boolean constant.Matt Arsenault2014-04-012-0/+56
| | | | llvm-svn: 205335
* DebugInfo: Factor out common functionality for rendering debug_loc and ↵David Blaikie2014-04-012-10/+17
| | | | | | | | | debug_loc.dwo location list entries In preparation for refactoring this function into two, one for debug_loc, one for debug_loc.dwo. llvm-svn: 205324
* Cleanup remaining use of removed variable to fix the buildDavid Blaikie2014-04-011-1/+1
| | | | llvm-svn: 205323
* Simplify debug_loc.dwo handling slightly.David Blaikie2014-04-013-8/+3
| | | | llvm-svn: 205322
* ARM: rename ARMle/ARMbe with ARMLE/ARMBE, and Thumble/Thumbbe with ↵Christian Pirker2014-04-0110-114/+114
| | | | | | ThumbLE/ThumbBE llvm-svn: 205317
* ARM: teach LLVM that Cortex-A7 is very similar to A8.Tim Northover2014-04-013-9/+11
| | | | llvm-svn: 205314
* Attempting to fix r205124, which had failed asserts when built with MSVC.Aaron Ballman2014-04-011-1/+1
| | | | | | Suggestion from Yaron Keren. llvm-svn: 205313
* ARM: add cyclone CPU with ZeroCycleZeroing feature.Tim Northover2014-04-016-6/+115
| | | | | | | | The Cyclone CPU is similar to swift for most LLVM purposes, but does have two preferred instructions for zeroing a VFP register. This teaches LLVM about them. llvm-svn: 205309
* [mips] Renamed ParseAnyRegisterWithoutDollar to MatchAnyRegisterWithoutDollarDaniel Sanders2014-04-011-8/+14
| | | | | | | | | This is for consistency with other functions. The Parse* functions consume tokens and the Match* functions don't. No functional change. llvm-svn: 205305
* Fixing an MSVC warning about widening the result of a 32-bit shift ↵Aaron Ballman2014-04-011-1/+1
| | | | | | implicitly. No functional change intended. llvm-svn: 205304
* ARM64: add intrinsic for pmull (p64 x p64 = p128) operations.Tim Northover2014-04-013-2/+31
| | | | llvm-svn: 205302
* Fixing warnings in the MSVC build. No functional changes intended.Aaron Ballman2014-04-015-42/+42
| | | | llvm-svn: 205301
* [mips] Extend ParseJumpTarget to support the full symbol expression syntax.Daniel Sanders2014-04-012-27/+20
| | | | | | | | | | | | | | | | Summary: This should fix the issues the D3222 caused in lld. Testcase is based on the one that failed in the buildbot. Depends on D3233 Reviewers: matheusalmeida, vmedic Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3234 llvm-svn: 205298
* [mips] Use AsmLexer::peekTok() to resolve the conflict between $reg and $symDaniel Sanders2014-04-011-15/+10
| | | | | | | | | | | | | | | Summary: Parsing registers no longer consume the $ token before it's confirmed whether it really has a register or not, therefore it's no longer impossible to match symbols if registers were tried first. Depends on D3232 Reviewers: matheusalmeida, vmedic Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3233 llvm-svn: 205297
* [mips] Hoist Parser.Lex() calls out of MatchAnyRegisterNameWithoutDollar()Daniel Sanders2014-04-011-9/+8
| | | | | | | | | | | | | | | Summary: No functional change Depends on D3222 Reviewers: matheusalmeida, vmedic Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3232 llvm-svn: 205295
* ARM64: add patterns for more lane-wise ld1/st1 operations.Tim Northover2014-04-014-139/+268
| | | | llvm-svn: 205294
* ARM64: fix bug in ld3r (1d) SelectionDAG.Tim Northover2014-04-012-1/+32
| | | | llvm-svn: 205293
* [mips] Rewrite MipsAsmParser and MipsOperand.Daniel Sanders2014-04-0123-1069/+929
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Highlights: - Registers are resolved much later (by the render method). Prior to that point, GPR32's/GPR64's are GPR's regardless of register size. Similarly FGR32's/FGR64's/AFGR64's are FGR's regardless of register size or FR mode. Numeric registers can be anything. - All registers are parsed the same way everywhere (even when handling symbol aliasing) - One consequence is that all registers can be specified numerically almost anywhere (e.g. $fccX, $wX). The exception is symbol aliasing but that can be easily resolved. - Removes the need for the hasConsumedDollar hack - Parenthesis and Bracket suffixes are handled generically - Micromips instructions are parsed directly instead of going through the standard encodings first. - rdhwr accepts all 32 registers, and the following instructions that previously xfailed now work: ddiv, ddivu, div, divu, cvt.l.[ds], se[bh], wsbh, floor.w.[ds], c.ngl.d, c.sf.s, dsbh, dshd, madd.s, msub.s, nmadd.s, nmsub.s, swxc1 - Diagnostics involving registers point at the correct character (the $) - There's only one kind of immediate in MipsOperand. LSA immediates are handled by the predicate and renderer. Lowlights: - Hardcoded '$zero' in the div patterns is handled with a hack. MipsOperand::isReg() will return true for a k_RegisterIndex token with Index == 0 and getReg() will return ZERO for this case. Note that it doesn't return ZERO_64 on isGP64() targets. - I haven't cleaned up all of the now-unused functions. Some more of the generic parser could be removed too (integers and relocs for example). - insve.df needed a custom decoder to handle the implicit fourth operand that was needed to make it parse correctly. The difficulty was that the matcher expected a Token<'0'> but gets an Imm<0>. Adding an implicit zero solved this. Reviewers: matheusalmeida, vmedic Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3222 llvm-svn: 205292
* Recover TableGen/LangRef, make it officialRenato Golin2014-04-015-1307/+911
| | | | | | | | | | | | | Making the new TableGen documentation official and marking the old file as "Moved". Also, reverting the original LangRef as the normative formal description of the language, while keeping the "new" LangRef as LangIntro for the less inlcined to reading language grammars. We should remove TableGenFundamentals.rst one day, but for now, just a warning that it moved will have to do, while we make sure there are no more links to it from elsewhere. llvm-svn: 205289
* [x86] Do not convert to cmp32 for Atom arch by Sergey OkunevAlexey Volkov2014-04-012-4/+42
| | | | | | Differential Revision: http://llvm-reviews.chandlerc.com/D2824 llvm-svn: 205288
* DebugInfo: Avoid creating unnecessary/empty line tables and remove the ↵David Blaikie2014-04-0111-29/+32
| | | | | | | | | | | | special case of '0' in DwarfCompileUnit::initStmtList by just always using a label difference This moves one case of raw text checking down into the MCStreamer interfaces in the form of a virtual function, even if we ultimately end up consolidating on the one-or-many line tables issue one day, this is nicer in the interim. This just generally streamlines a bunch of use cases into a common code path. llvm-svn: 205287
* DebugInfo: Emit relocation to debug_line section when emitting asm for asmDavid Blaikie2014-04-016-36/+68
| | | | | | | | | | | | | | I don't think this is reachable by any frontend (why would you transform asm to asm+debug info?) but it helps tidy up some of this code, avoid the weird special case of "emit the first CU, store the label, then emit the rest" in MCDwarfLineTable::Emit by instead having the DWARF-for-assembly case use the same codepath as DwarfDebug.cpp, by registering the label of the debug_line section, thus causing it to be emitted. (with a special case in asm output to just emit the label since asm output uses the .loc directives, etc, rather than the debug_loc directly) llvm-svn: 205286
* Remove FIXMEs. The scope of a Variable is always a lexical scope; there isAdrian Prantl2014-04-011-2/+0
| | | | | | nothing to be gained from switching this over to a DIScopeRef. llvm-svn: 205281
* LTO type uniquing: store the Decl field of a DIImportedEntity as a DIRef.Adrian Prantl2014-04-015-7/+7
| | | | | | | | | | No other functionality changes, DIBuilder testcase is included in a paired CFE commit. This relaxes the assertion in isScopeRef to also accept subclasses of DIScope. llvm-svn: 205279
* Add a comment about type-uniquing ObjC types.Adrian Prantl2014-04-011-0/+2
| | | | llvm-svn: 205277
* Comment to describe the debug_loc.dwo constantsDavid Blaikie2014-03-311-0/+1
| | | | | | Code review feedback from Eric Christopher on r204697 llvm-svn: 205268
* MCJIT: ensure that cygwin is identified properlySaleem Abdulrasool2014-03-313-2/+12
| | | | | | | Cygwin is now a proper environment rather than an OS. This updates the MCJIT tests to avoid execution on Cygwin. This fixes native cygwin tests. llvm-svn: 205266
* Move partial/runtime unrolling late in the pipelineHal Finkel2014-03-314-2/+11
| | | | | | | | | | | | | | | | The generic (concatenation) loop unroller is currently placed early in the standard optimization pipeline. This is a good place to perform full unrolling, but not the right place to perform partial/runtime unrolling. However, most targets don't enable partial/runtime unrolling, so this never mattered. However, even some x86 cores benefit from partial/runtime unrolling of very small loops, and follow-up commits will enable this. First, we need to move partial/runtime unrolling late in the optimization pipeline (importantly, this is after SLP and loop vectorization, as vectorization can drastically change the size of a loop), while keeping the full unrolling where it is now. This change does just that. llvm-svn: 205264
* lit: Set a base directory for compiler-rt testsDuncan P. N. Exon Smith2014-03-311-0/+5
| | | | | | | | | Setting this parameter enables llvm-lit to run on source directories for compiler-rt test suites that implement magic in their lit.cfg. <rdar://problem/16458307> llvm-svn: 205262
* Revert "SLPVectorizer: Ignore users that are insertelements we can ↵Arnold Schwaighofer2014-03-312-89/+30
| | | | | | | | | | | | | | reschedule them" This reverts commit r205018. Conflicts: lib/Transforms/Vectorize/SLPVectorizer.cpp test/Transforms/SLPVectorizer/X86/insert-element-build-vector.ll This is breaking libclc build. llvm-svn: 205260
* Shifting into the sign bit is UB as discussed on IRC. Explicitly use theJoerg Sonnenberger2014-03-311-7/+7
| | | | | | BitWord type for the constants to avoid this. llvm-svn: 205257
* [Stackmaps] Update the stackmap format to use 64-bit relocations for the ↵Juergen Ributzka2014-03-318-137/+247
| | | | | | | | | | | | function address and properly align all entries. This commit updates the stackmap format to version 1 to indicate the reorganizaion of several fields. This was done in order to align stackmap entries to their natural alignment and to minimize padding. Fixes <rdar://problem/16005902> llvm-svn: 205254
* [X86] Adjust cost of FP_TO_UINT v4f64->v4i32 as wellAdam Nemet2014-03-312-0/+41
| | | | | | | | | Pretty obvious follow-on to r205159 to also handle conversion from double besides float. Fixes <rdar://problem/16373208> llvm-svn: 205253
* R600/SI: Remove leftover pattern splitting 64-bit ors.Matt Arsenault2014-03-311-8/+0
| | | | | | | It's now matched to the scalar 64-bit or and split later if necessary.' llvm-svn: 205252
* Register allocator: set CSRFirstUseCost to 5 for ARM64.Manman Ren2014-03-311-0/+7
| | | | | | | | | | | | A value of 5 means if we have a split or spill option that has a really low cost (1 << 14 is the entry frequency), we will choose to spill or split the really cold path before using a callee-saved register. This gives us the performance benefit on SPECInt2k and is also conservative. rdar://16162005 llvm-svn: 205248
* Change shouldSplitVectorElementType to better match the description.Matt Arsenault2014-03-316-8/+8
| | | | | | Pass the entire vector type, and not just the element. llvm-svn: 205247
* Fix MSVC warning.Rui Ueyama2014-03-311-1/+1
| | | | | | | | | This patch is to fix the following warning when compiled with MSVC 64 bit. warning C4334: '<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?) llvm-svn: 205245
* R600/SI: Implement shouldConvertConstantLoadToIntImmMatt Arsenault2014-03-314-18/+37
| | | | llvm-svn: 205244
* Add an optional ability to expand larger BUILD_VECTORs with shufflesHal Finkel2014-03-311-20/+117
| | | | | | | | | | | | | | | | | | | | | | | This adds the ability to expand large (meaning with more than two unique defined values) BUILD_VECTOR nodes in terms of SCALAR_TO_VECTOR and (legal) vector shuffles. There is now no limit of the size we are capable of expanding this way, although we don't currently do this for vectors with many unique values because of the default implementation of TLI's shouldExpandBuildVectorWithShuffles function. There is currently no functional change to any existing targets because the new capabilities are not used unless some target overrides the TLI shouldExpandBuildVectorWithShuffles function. As a result, I've not included a test case for the new functionality in this commit, but regression tests will (at least) be added soon when I commit support for the PPC QPX vector instruction set. The benefit of committing this now is that it makes the shouldExpandBuildVectorWithShuffles callback, which had to be added for other reasons regardless, fully functional. I suspect that other targets will also benefit from tuning the heuristic. llvm-svn: 205243
OpenPOWER on IntegriCloud