path: root/llvm/lib
Commit message Author Age Files Lines
* Store nodes only have 1 result.Matt Arsenault2014-07-251-1/+1
| | | | llvm-svn: 213928
* [SDAG] Start plumbing an assert into SDValues that we don't form oneChandler Carruth2014-07-251-1/+1
| | | | | | | | | | | | | with a result number outside the range of results for the node. I don't know how we managed to not really check this very basic invariant for so long, but the code is *very* broken at this point. I have over 270 test failures with the assert enabled. I'm committing it disabled so that others can join in the cleanup effort and reproduce the issues. I've also included one of the obvious fixes that I already found. More fixes to come. llvm-svn: 213926
* [ARM] In thumb mode, emit directive ".code 16" before file level inlineAkira Hatanaka2014-07-251-0/+3
| | | | | | | | | | | | assembly instructions. This is necessary to ensure ARM assembler switches to Thumb mode before it starts assembling the file level inline assembly instructions at the beginning of a .s file. <rdar://problem/17757232> llvm-svn: 213924
* Fix a warning in CoverageMappingReader.cppEhsan Akhgari2014-07-251-1/+1
| | | | llvm-svn: 213920
* [X86] Clarify some stackmap shadow optimization code as based on reviewLang Hames2014-07-252-7/+19
| | | | | | | | feedback from Eric Christopher. No functional change. llvm-svn: 213917
* [PATCH][PPC64LE] Correct little-endian usage of vmrgh* and vmrgl*.Bill Schmidt2014-07-253-48/+101
| | | | | | | | | | | | | | | | | | | | | | Because the PowerPC vmrgh* and vmrgl* instructions have a built-in big-endian bias, it is necessary to swap their inputs in little-endian mode when using them to implement a vector shuffle. This was previously missed in the vector LE implementation. There was already logic to distinguish between unary and "normal" vmrg* vector shuffles, so this patch extends that logic to use a third option: "swapped" vmrg* vector shuffles that are used for little endian in place of the "normal" ones. I've updated the vec-shuffle-le.ll test to check for the expected register ordering on the generated instructions. This bug was discovered when testing the LE and ELFv2 patches for safety if they were backported to 3.4. A different vectorization decision was made in 3.4 than on mainline trunk, and that exposed the problem. I've verified this fix takes care of that issue. llvm-svn: 213915
* Add code coverage mapping data, reader, and writer.Alex Lorenz2014-07-243-0/+828
| | | | | | | | | This patch implements the data structures, the reader and the writers for the new code coverage mapping system. The new code coverage mapping system uses the instrumentation based profiling to provide code coverage analysis. llvm-svn: 213910
* Add code coverage mapping data, reader, and writer.Alex Lorenz2014-07-242-1/+4
| | | | | | | | | This patch implements the data structures, the reader and the writers for the new code coverage mapping system. The new code coverage mapping system uses the instrumentation based profiling to provide code coverage analysis. llvm-svn: 213909
* After unrolling a loop with llvm.loop.unroll.count metadata (unroll factorMark Heffernan2014-07-241-1/+0
| | | | | | | | | | hint) the loop unroller replaces the llvm.loop.unroll.count metadata with llvm.loop.unroll.disable metadata to prevent any subsequent unrolling passes from unrolling more than the hint indicates. This patch fixes an issue where loop unrolling could be disabled for other loops as well which share the same llvm.loop metadata. llvm-svn: 213900
* Don't use 128bit functions on PPC32.Joerg Sonnenberger2014-07-241-0/+7
| | | | llvm-svn: 213899
* [SDAG] Introduce a combined set to the DAG combiner which tracks nodesChandler Carruth2014-07-241-5/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | which have successfully round-tripped through the combine phase, and use this to ensure all operands to DAG nodes are visited by the combiner, even if they are only added during the combine phase. This is critical to have the combiner reach nodes that are *introduced* during combining. Previously these would sometimes be visited and sometimes not be visited based on whether they happened to end up on the worklist or not. Now we always run them through the combiner. This fixes quite a few bad codegen test cases lurking in the suite while also being more principled. Among these, the TLS codegeneration is particularly exciting for programs that have this in the critical path like TSan-instrumented binaries (although I think they engineer to use a different TLS that is faster anyways). I've tried to check for compile-time regressions here by running llc over a merged (but not LTO-ed) clang bitcode file and observed at most a 3% slowdown in llc. Given that this is essentially a worst case (none of opt or clang are running at this phase) I think this is tolerable. The actual LTO case should be even less costly, and the cost in normal compilation should be negligible. With this combining logic, it is possible to re-legalize as we combine which is necessary to implement PSHUFB formation on x86 as a post-legalize DAG combine (my ultimate goal). Differential Revision: http://reviews.llvm.org/D4638 llvm-svn: 213898
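The idea of pairing the worklist with a "combined" set can be sketched as follows. This is a simplified illustration, not the DAGCombiner code: `Node`, `combineAll`, and the `Id` field are hypothetical names, the combine step is replaced by recording the visit order, and each node is processed only once, whereas a real combiner may revisit nodes that change.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for a DAG node: an id plus its operands.
struct Node {
  int Id;
  std::vector<Node *> Operands;
};

// Process every node reachable from the initial worklist. The Combined set
// records nodes that have already gone through the loop, so operands that
// were never queued explicitly are still visited exactly once.
std::vector<int> combineAll(const std::vector<Node *> &Initial) {
  std::vector<Node *> Worklist(Initial.rbegin(), Initial.rend());
  std::set<const Node *> Combined;
  std::vector<int> Order;
  while (!Worklist.empty()) {
    Node *N = Worklist.back();
    Worklist.pop_back();
    if (!Combined.insert(N).second)
      continue; // already processed
    Order.push_back(N->Id); // placeholder for the real combine step
    // Queue any operand that has not been through the loop yet.
    for (Node *Op : N->Operands)
      if (!Combined.count(Op))
        Worklist.push_back(Op);
  }
  return Order;
}
```

Seeding the worklist with a single root is enough to reach all of its transitive operands, which is the property the commit message describes.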
* [x86] Make vector legalization of extloads work more like the "normal"Chandler Carruth2014-07-242-153/+248
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vector operation legalization with support for custom target lowering and fallback to expand when it fails, and use this to implement sext and anyext load lowering for x86 in a more principled way. Previously, the x86 backend relied on a target DAG combine to "combine away" sextload and extload nodes prior to legalization, or would expand them during legalization with terrible code. This is particularly problematic because the DAG combine relies on running over non-canonical DAG nodes at just the right time to match several common and important patterns. It used a combine rather than lowering because we didn't have good lowering support, and to expose some tricks being employed to more combine phases. With this change it becomes a proper lowering operation, the backend marks that it can lower these nodes, and I've added support for handling the canonical forms that don't have direct legal representations such as sextload of a v4i8 -> v4i64 on AVX1. With this change, our test cases for this behavior continue to pass even after the DAG combiner begins running more systematically over every node. There is some noise caused by this in the test suite where we actually use vector extends instead of subregister extraction. This doesn't really seem like the right thing to do, but is unlikely to be a critical regression. We do regress in one case where by lowering to the target-specific patterns early we were able to combine away extraneous legal math nodes. However, this regression is completely addressed by switching to a widening based legalization which is what I'm working toward anyways, so I've just switched the test to that mode. Differential Revision: http://reviews.llvm.org/D4654 llvm-svn: 213897
* Target: invert condition for WindowsSaleem Abdulrasool2014-07-241-1/+1
| | | | | | | | | The Microsoft ABI and MSVCRT are considered the canonical C runtime and ABI. The long double routines are not part of this environment. However, cygwin and MinGW both provide supplementary implementations. Change the condition to reflect this reality. llvm-svn: 213896
* Feedback from Hans on r213815. No functionality change.Manman Ren2014-07-241-10/+11
| | | | llvm-svn: 213895
* Windows: Don't wildcard expand /? or -?Hans Wennborg2014-07-241-0/+5
| | | | | | | Even if there's a file called c:\a, we want /? to be preserved as an option, not expanded to a filename. llvm-svn: 213894
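A minimal sketch of the rule described above, assuming a hypothetical helper `shouldExpandWildcards` (the name and the exact set of pattern characters are illustrative and not taken from the patch): the help options `/?` and `-?` are excluded before any pattern check.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: decide whether a command-line argument should be
// passed through wildcard expansion.
bool shouldExpandWildcards(const std::string &Arg) {
  // "/?" and "-?" are help options and must be preserved verbatim, even if
  // a file such as c:\a would match the pattern.
  if (Arg == "/?" || Arg == "-?")
    return false;
  // Any other argument containing '*' or '?' is treated as a pattern.
  return Arg.find_first_of("*?") != std::string::npos;
}
```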
* [X86] Optimize stackmap shadows on X86.Lang Hames2014-07-244-45/+145
| | | | | | | | | | | | | | | | | | | This patch minimizes the number of nops that must be emitted on X86 to satisfy stackmap shadow constraints. To minimize the number of nops inserted, the X86AsmPrinter now records the size of the most recent stackmap's shadow in the StackMapShadowTracker class, and tracks the number of instruction bytes emitted since that stackmap instruction was encountered. Padding is emitted (if it is required at all) immediately before the next stackmap/patchpoint instruction, or at the end of the basic block. This optimization should reduce code-size and improve performance for people using the llvm stackmap intrinsic on X86. <rdar://problem/14959522> llvm-svn: 213892
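The bookkeeping described in this message can be sketched roughly as below. This is an illustration under assumptions, not the StackMapShadowTracker interface: the class name `ShadowTracker` and the methods `reset`, `count`, and `paddingNeeded` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified shadow tracker.
class ShadowTracker {
  uint32_t RequiredBytes = 0; // shadow size of the most recent stackmap
  uint32_t EmittedBytes = 0;  // instruction bytes emitted since that stackmap
  bool InShadow = false;      // true while a shadow is being tracked

public:
  // Called when a stackmap with the given shadow size is emitted.
  void reset(uint32_t ShadowBytes) {
    RequiredBytes = ShadowBytes;
    EmittedBytes = 0;
    InShadow = true;
  }

  // Called for every ordinary instruction emitted after the stackmap.
  void count(uint32_t InstBytes) {
    if (InShadow)
      EmittedBytes += InstBytes;
  }

  // Called right before the next stackmap/patchpoint or at the end of the
  // basic block. Returns how many nop bytes are still needed and stops
  // tracking the current shadow.
  uint32_t paddingNeeded() {
    if (!InShadow)
      return 0;
    InShadow = false;
    return EmittedBytes >= RequiredBytes ? 0 : RequiredBytes - EmittedBytes;
  }
};
```

Deferring the padding decision this way means instructions that naturally follow the stackmap count toward the shadow, so nops are only needed for the remainder.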
* Replace an assertion with a fatal errorReid Kleckner2014-07-241-2/+6
| | | | | | | Frontends are responsible for putting inalloca on parameters that would be passed in memory and not registers. llvm-svn: 213891
* Use the same .eh_frame encoding for 32bit PPC as on i386.Joerg Sonnenberger2014-07-241-0/+1
| | | | llvm-svn: 213890
* X86: correct library call setup for Windows itaniumSaleem Abdulrasool2014-07-241-1/+1
| | | | | | | | This target is identical to the Windows MSVC (and follows Microsoft ABI for C). Correct the library call setup for this target. The same set of library calls are missing on this environment. llvm-svn: 213883
* R600: Add FMA instructions for EvergreenMatt Arsenault2014-07-242-0/+12
| | | | llvm-svn: 213882
* X86: silence sign comparison warningSaleem Abdulrasool2014-07-241-1/+3
| | | | | | | | GCC 4.8 detected a signed compare [-Wsign-compare]. Add a cast for the destination index. Add an assert to catch a potential overflow however unlikely it may be. llvm-svn: 213878
* R600: Add new functions for splitting vector loads and stores.Matt Arsenault2014-07-244-26/+145
| | | | | | These will be used in future patches and shouldn't change anything yet. llvm-svn: 213877
* Let the integrated assembler understand .exitm, PR20426.Nico Weber2014-07-241-8/+40
| | | | llvm-svn: 213876
* Remove unused field MacroInstantiation::TheMacro. No behavior change.Nico Weber2014-07-241-11/+6
| | | | llvm-svn: 213874
* Let the integrated assembler understand .warning, PR20428.Nico Weber2014-07-241-1/+33
| | | | llvm-svn: 213873
* Include relative path for header outside the current directory.Joerg Sonnenberger2014-07-241-1/+1
| | | | llvm-svn: 213872
* Remove dead code.Rafael Espindola2014-07-242-66/+0
| | | | | | Every user has been switched to using EngineBuilder. llvm-svn: 213871
* AArch64: refactor ReconstructShuffle functionTim Northover2014-07-241-109/+124
| | | | | | | | | | | Quite a bit of cruft had accumulated as we realised the various different cases it had to handle and squeezed them in where possible. This refactoring mostly flattens the logic and special-cases. The result is slightly longer, but I think clearer. Should be no functionality change. llvm-svn: 213867
* Add scoped-noalias metadataHal Finkel2014-07-2418-9/+508
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds scoped noalias metadata. The primary motivations for this feature are: 1. To preserve noalias function attribute information when inlining 2. To provide the ability to model block-scope C99 restrict pointers Neither of these two abilities are added here, only the necessary infrastructure. In fact, there should be no change to existing functionality, only the addition of new features. The logic that converts noalias function parameters into this metadata during inlining will come in a follow-up commit. What is added here is the ability to generally specify noalias memory-access sets. Regarding the metadata, alias-analysis scopes are defined similar to TBAA nodes: !scope0 = metadata !{ metadata !"scope of foo()" } !scope1 = metadata !{ metadata !"scope 1", metadata !scope0 } !scope2 = metadata !{ metadata !"scope 2", metadata !scope0 } !scope3 = metadata !{ metadata !"scope 2.1", metadata !scope2 } !scope4 = metadata !{ metadata !"scope 2.2", metadata !scope2 } Loads and stores can be tagged with an alias-analysis scope, and also, with a noalias tag for a specific scope: ... = load %ptr1, !alias.scope !{ !scope1 } ... = load %ptr2, !alias.scope !{ !scope1, !scope2 }, !noalias !{ !scope1 } When evaluating an aliasing query, if one of the instructions is associated with an alias.scope id that is identical to the noalias scope associated with the other instruction, or is a descendant (in the scope hierarchy) of the noalias scope associated with the other instruction, then the two memory accesses are assumed not to alias. Note that if the first element of the scope metadata is a string, then it can be combined across functions and translation units. The string can be replaced by a self-reference to create globally unique scope identifiers. 
[Note: This overview is slightly stylized, since the metadata nodes really need to just be numbers (!0 instead of !scope0), and the scope lists are also global unnamed metadata.] Existing noalias metadata in a callee is "cloned" for use by the inlined code. This is necessary because the aliasing scopes are unique to each call site (because of possible control dependencies on the aliasing properties). For example, consider a function: foo(noalias a, noalias b) { *a = *b; } that gets inlined into bar() { ... if (...) foo(a1, b1); ... if (...) foo(a2, b2); } -- now just because we know that a1 does not alias with b1 at the first call site, and a2 does not alias with b2 at the second call site, we cannot let inlining these functions have the metadata imply that a1 does not alias with b2. llvm-svn: 213864
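The aliasing rule stated in this message (an alias.scope that is identical to, or a descendant of, the other access's noalias scope implies no alias) can be modelled with a small sketch. The types `Scope` and `Access` and the functions below are hypothetical and only mirror the prose of the commit message; they are not the data structures or the query code added by the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a scope node: each scope points at its parent.
struct Scope {
  const Scope *Parent; // nullptr for a root scope
};

// Hypothetical model of a tagged memory access.
struct Access {
  std::vector<const Scope *> AliasScopes; // contents of !alias.scope
  std::vector<const Scope *> NoAlias;     // contents of !noalias
};

// True if S is Ancestor itself or lies below it in the scope hierarchy.
bool isSameOrDescendant(const Scope *S, const Scope *Ancestor) {
  for (; S != nullptr; S = S->Parent)
    if (S == Ancestor)
      return true;
  return false;
}

// True if some alias.scope of A matches (or descends from) a noalias scope
// of B.
bool provesNoAlias(const Access &A, const Access &B) {
  for (const Scope *S : A.AliasScopes)
    for (const Scope *N : B.NoAlias)
      if (isSameOrDescendant(S, N))
        return true;
  return false;
}

// The query is symmetric: either direction is enough to rule out aliasing.
bool mayAlias(const Access &A, const Access &B) {
  return !provesNoAlias(A, B) && !provesNoAlias(B, A);
}
```

With the two loads from the message, the first load's alias.scope `!scope1` is identical to the second load's noalias scope `!scope1`, so this model reports that they do not alias.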
* Fixing an MSVC conversion warning about implicitly converting the shiftAaron Ballman2014-07-241-1/+1
| | | | | | results to 64-bits. No functional change intended. llvm-svn: 213863
* AA metadata refactoring (introduce AAMDNodes)Hal Finkel2014-07-2436-328/+399
| | | | | | | | | | | | | | | | | | | | In order to enable the preservation of noalias function parameter information after inlining, and the representation of block-level __restrict__ pointer information (etc.), additional kinds of aliasing metadata will be introduced. This metadata needs to be carried around in AliasAnalysis::Location objects (and MMOs at the SDAG level), and so we need to generalize the current scheme (which is hard-coded to just one TBAA MDNode*). This commit introduces only the necessary refactoring to allow for the introduction of other aliasing metadata types, but does not actually introduce any (that will come in a follow-up commit). What it does introduce is a new AAMDNodes structure to hold all of the aliasing metadata nodes associated with a particular memory-accessing instruction, and uses that structure instead of the raw MDNode* in AliasAnalysis::Location, etc. No functionality change intended. llvm-svn: 213859
* Prune redundant libdeps.NAKAMURA Takumi2014-07-243-3/+3
| | | | llvm-svn: 213857
* Prune dependency to MC from each target disassembler.NAKAMURA Takumi2014-07-245-5/+5
| | | | llvm-svn: 213856
* [ARM] Make the assembler reject unpredictable pre/post-indexed ARM STRHTilmann Scheller2014-07-241-0/+2
| | | | | | | | instructions. The ARM ARM prohibits STRH instructions with writeback into the source register. With this commit this constraint is now enforced and we stop assembling STRH instructions with unpredictable behavior. llvm-svn: 213850
* [mips] Fix ll and sc instructionsDaniel Sanders2014-07-241-4/+4
| | | | | | | | | | Summary: The ll and sc instructions for r6 and non-r6 are misplaced. This patch fixes that. Patch by Jyun-Yan You Differential Revision: http://reviews.llvm.org/D4578 llvm-svn: 213847
* R600: Match rcp node on pre-SIMatt Arsenault2014-07-243-1/+9
| | | | llvm-svn: 213844
* R600: Fix LowerSDIV24Matt Arsenault2014-07-241-51/+50
| | | | | | | | | | Use ComputeNumSignBits instead of checking for i8 / i16 which only worked when AMDIL was lying about having legal i8 / i16. If an integer is known to fit in 24-bits, we can do division faster with float ops. llvm-svn: 213843
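Why 24-bit operands allow division through float ops can be illustrated with a short host-side sketch. It assumes IEEE single-precision division that is correctly rounded; it is not the code in LowerSDIV24, and the names `fitsInSigned24` and `sdiv24` are hypothetical. A float has a 24-bit significand, so every signed 24-bit integer converts exactly, and under that assumption the rounded quotient truncates to the same value as integer division.

```cpp
#include <cassert>
#include <cstdint>

// True if X fits in a signed 24-bit field, i.e. a 32-bit value whose top
// 9 bits are all copies of the sign bit.
bool fitsInSigned24(int32_t X) {
  return X >= -(1 << 23) && X < (1 << 23);
}

// Signed division through float ops. Only valid when both operands fit in
// 24 bits and B is non-zero.
int32_t sdiv24(int32_t A, int32_t B) {
  assert(fitsInSigned24(A) && fitsInSigned24(B) && B != 0);
  float Q = static_cast<float>(A) / static_cast<float>(B);
  return static_cast<int32_t>(Q); // conversion truncates toward zero
}
```

Hardware whose float division is not correctly rounded would need extra correction steps, which this sketch leaves out.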
* Update library dependencies.NAKAMURA Takumi2014-07-249-9/+9
| | | | llvm-svn: 213832
* R600: Implement enableClusterLoads()Matt Arsenault2014-07-242-0/+7
| | | | llvm-svn: 213831
* [AArch64] Fix a bug generating incorrect instruction when building small vector.Kevin Qin2014-07-241-38/+63
| | | | | | | This bug was introduced by r211144. The element of the operand may be smaller than the element of the result, but the previous commit could only handle the opposite case. This commit handles this scenario and generates optimized code such as ZIP1. llvm-svn: 213830
* [AArch64] Disable some optimization cases for type conversion from sint toJiangning Liu2014-07-241-3/+4
| | | | | | fp, because those optimization cases are micro-architecture dependent and only make sense for Cyclone. A new predicate Cyclone is introduced in .td file. llvm-svn: 213827
* Fixed PR20411 - bug in getINSERTPS()Filipe Cabecinhas2014-07-241-0/+14
| | | | | | | | | | When we had a vector_shuffle where we had an input from each vector, we could miscompile it because we were assuming the input from V2 wouldn't be moved from where it was on the vector. Added a test case. llvm-svn: 213826
* SimplifyCFG: fix a bug in switch to table conversionManman Ren2014-07-231-4/+13
| | | | | | | | | | | | | | | | | | | We use gep to access the global array "switch.table", and the table index should be treated as unsigned. When the highest bit is 1, this commit zero-extends the index to an integer type with larger size. For a switch on i2, we used to generate: %switch.tableidx = sub i2 %0, -2 getelementptr inbounds [4 x i64]* @switch.table, i32 0, i2 %switch.tableidx It is incorrect when %switch.tableidx is 2 or 3. The fix is to generate %switch.tableidx = sub i2 %0, -2 %switch.tableidx.zext = zext i2 %switch.tableidx to i3 getelementptr inbounds [4 x i64]* @switch.table, i32 0, i3 %switch.tableidx.zext rdar://17735071 llvm-svn: 213815
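The problem with the i2 index can be reproduced with plain integer arithmetic, assuming (as the message implies) that a GEP index is read as a signed value. The helpers `asSigned` and `asSignedAfterZext` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Interpret the low Bits bits of V as a two's complement signed integer.
int64_t asSigned(uint64_t V, unsigned Bits) {
  assert(Bits >= 1 && Bits < 64);
  uint64_t Mask = (uint64_t(1) << Bits) - 1;
  uint64_t SignBit = uint64_t(1) << (Bits - 1);
  V &= Mask;
  return (V & SignBit) ? int64_t(V) - int64_t(Mask + 1) : int64_t(V);
}

// Zero-extend the Bits-wide value to Bits + 1 bits before reading it as
// signed, mirroring the zext to i3 in the fixed IR.
int64_t asSignedAfterZext(uint64_t V, unsigned Bits) {
  assert(Bits >= 1 && Bits < 63);
  uint64_t Mask = (uint64_t(1) << Bits) - 1;
  return asSigned(V & Mask, Bits + 1);
}
```

For an i2 index, the bit patterns 2 and 3 read as -2 and -1 when taken as signed, which would point before the start of the table; after zero-extension to i3 they read as 2 and 3.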
* Fix the build when building with only the ARM backend.Rafael Espindola2014-07-231-1/+1
| | | | llvm-svn: 213814
* Fix indenting.Eric Christopher2014-07-231-13/+14
| | | | llvm-svn: 213811
* Reorganize and simplify local variables.Eric Christopher2014-07-231-13/+11
| | | | llvm-svn: 213809
* Finish inverting the MC -> Object dependency.Rafael Espindola2014-07-238-9/+9
| | | | | | | There were still some disassembler bits in lib/MC, but their use of Object was only visible in the includes they used, not in the symbols. llvm-svn: 213808
* Remove the query for TargetMachine and TargetInstrInfo since we'reEric Christopher2014-07-231-3/+1
| | | | | | already inside TargetInstrInfo. llvm-svn: 213806
* ArgPromo+DebugInfo: Handle updating debug info over multiple applications ofDavid Blaikie2014-07-231-3/+7
| | | | | | | | | | | | | | | | | | | | | | argument promotion. While the subprogram map cache used by Dead Argument Elimination works there, I made a mistake when reusing it for Argument Promotion in r212128 because ArgPromo may transform functions more than once whereas DAE transforms each function only once, removing all the dead arguments in one go. To address this, ensure that the map is updated after each argument promotion. In retrospect it might be a little wasteful to create a map of all subprograms when only handling a single CGSCC, but the alternative is walking the debug info for each function in the CGSCC that gets updated. It's not clear to me what the right tradeoff is there, but since the current tradeoff seems to be working OK (and the code to keep things updated is very cheap), let's stick with that for now. llvm-svn: 213805
* [X86,AArch64] Extend vcmp w/ unary op combine to work w/ more constants.Jim Grosbach2014-07-232-6/+6
| | | | | | | | The transform to constant fold unary operations with an AND across a vector comparison applies when the constant is not a splat of a scalar as well. llvm-svn: 213800