path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* R600/SI: Add compute support for CI v2Tom Stellard2013-10-296-9/+24
| | | | | | | | v2: - Fix LDS size calculation Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 193621
* R600: Expand vector FSQRT opsTom Stellard2013-10-291-0/+1
| | | | llvm-svn: 193620
* DWARF parser: properly handle DW_FORM_ref_sig8 and fix Windows build.Alexey Samsonov2013-10-291-43/+41
| | | | | | Based on D2050 by Timur Iskhodzhanov. llvm-svn: 193619
* The asm printer has a mangler. Use it.Rafael Espindola2013-10-293-7/+4
| | | | llvm-svn: 193618
* The AsmPrinter has a Mangler. Use it.Rafael Espindola2013-10-293-8/+6
| | | | llvm-svn: 193617
* The asm printer has a mangler. Don't keep a second pointer to it.Rafael Espindola2013-10-291-10/+11
| | | | llvm-svn: 193616
* Quick-fix DebugInfo build on WindowsTimur Iskhodzhanov2013-10-291-1/+3
| MSVC can't comprehend
|   template<typename T, size_t N>
|   ArrayRef<T> makeArrayRef(const T (&Arr)[N]) { return ArrayRef<T>(Arr); }
| if Arr is
|   static const uint8_t sizes[];
| declared in a templated and defined a few lines later.
|
| I'll send a proper fix (i.e. get rid of unnecessary templates) for review soon.
|
| llvm-svn: 193604
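The array-size deduction pattern quoted in this message can be sketched in isolation. This is a minimal illustration only: `ArrayView`, `makeArrayView` and `Sizes` are hypothetical stand-ins, not LLVM's `ArrayRef` API, and the sketch does not try to reproduce the MSVC failure itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, stripped-down view over a contiguous array (not llvm::ArrayRef).
template <typename T> class ArrayView {
  const T *Data;
  std::size_t Length;

public:
  ArrayView(const T *D, std::size_t N) : Data(D), Length(N) {}

  // Construct from a C array; N is deduced from the array type.
  template <std::size_t N>
  ArrayView(const T (&Arr)[N]) : Data(Arr), Length(N) {}

  std::size_t size() const { return Length; }
  const T &operator[](std::size_t I) const { return Data[I]; }
};

// Same shape as the helper quoted in the commit message: both T and N are
// deduced from the argument, so the array must have a complete type here.
template <typename T, std::size_t N>
ArrayView<T> makeArrayView(const T (&Arr)[N]) {
  return ArrayView<T>(Arr);
}

// Assumed example data; the real table lives in DWARFFormValue.cpp.
static const std::uint8_t Sizes[] = {1, 2, 4, 8};
```

Deduction of `N` needs the array bound to be known at the call site, which is presumably why an array declared without a bound and defined later is awkward for this helper.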
* ARM: Add subtarget feature for CRCBernard Ogden2013-10-295-6/+14
| | | | | | | | Adds a subtarget feature for the CRC instructions (optional in v8-A) to the ARM (32-bit) backend. Differential Revision: http://llvm-reviews.chandlerc.com/D2036 llvm-svn: 193599
* Fix misapplied patch in r193597Anders Waldenborg2013-10-291-2/+2
| | | | | | Sorry Peter Zotov, entirely my fault. llvm-svn: 193598
* llvm-c: Make LLVM{Get,Set}Alignment work on {Load,Store}Inst tooAnders Waldenborg2013-10-291-4/+22
| | | | | | | | Patch by Peter Zotov Differential Revision: http://llvm-reviews.chandlerc.com/D1910 llvm-svn: 193597
* AArch64: add 'a' inline asm operand modifierTim Northover2013-10-291-0/+2
| | | | | | | This is used in the Linux kernel, and effectively just means "print an address". llvm-svn: 193593
* Debug Info: instead of calling addToContextOwner which constructs the contextManman Ren2013-10-291-7/+17
| | | | | | | | | | | | after the DIE creation, we construct the context first. This touches creation of namespaces and global variables. The purpose is to handle all DIE creations similarly: constructs the context first, then creates the DIE and immediately adds the DIE to its parent. We use createAndAddDIE to wrap around "new DIE(". llvm-svn: 193589
* Fix "existant" typosAlp Toker2013-10-291-1/+1
| | | | llvm-svn: 193579
* Clean up.Richard Smith2013-10-291-4/+4
| | | | llvm-svn: 193576
* DWARFFormValue.cpp: Appease gcc to give explicit constructors.NAKAMURA Takumi2013-10-291-4/+4
| | | | | | error: conversion from `const uint8_t*' to non-scalar type `llvm::ArrayRef<unsigned char>' requested llvm-svn: 193575
* ARM cost model: Unaligned vectorized double stores are expensiveArnold Schwaighofer2013-10-291-0/+15
| | | | | | | | | Updated a test case that assumed that <2 x double> would vectorize to use <4 x float>. radar://15338229 llvm-svn: 193574
* ARM cost model: Account for zero cost scalar SROA instructionsArnold Schwaighofer2013-10-292-6/+33
| | | | | | | | | By vectorizing a series of srl, or, ... instructions we have obfuscated the intention so much that the backend does not know how to fold this code away. radar://15336950 llvm-svn: 193573
* SLPVectorizer: Use vector type for vectorized memory operationsArnold Schwaighofer2013-10-291-2/+2
| | | | | | | | | No test case, because with the current cost model we don't see a difference. An upcoming ARM memory cost model change will expose and test this bug. radar://15332579 llvm-svn: 193572
* Move the STT_FILE symbols out of the normal symbol table processing forJoerg Sonnenberger2013-10-293-22/+33
| | | | | | | ELF. They can overlap with the other symbols, e.g. if a source file "foo.c" contains a function "foo" with a static variable "c". llvm-svn: 193569
* Debug Info: use createAndAddDIE to wrap around "new DIE" in DwarfDebug.Manman Ren2013-10-291-6/+5
| | | | | | | | | This commit ensures DIEs are constructed within a compile unit and immediately added to their parents. Reviewed off-list by Eric. llvm-svn: 193568
* Debug Info: use createAndAddDIE for newly-created Subprogram DIEs.Manman Ren2013-10-291-9/+5
| | | | | | | | | More patches will be submitted to convert "new DIE(" to use createAndAddDIE in DwarfCompileUnit.cpp. This will simplify implementation of addDIEEntry where we have to decide between ref4 and ref_addr, because DIEs that can be shared across CU will be added to a CU already. Reviewed off-list by Eric. llvm-svn: 193567
* Debug Info: add a helper function createAndAddDIE.Manman Ren2013-10-292-29/+28
| | | | | | | | | | | | | | It wraps around "new DIE(" and handles the bookkeeping part of the newly-created DIE. It adds the DIE to its parent, and calls insertDIE if necessary. It makes sure that bookkeeping is done at the earliest time and we should not see parentless DIEs if all constructions of DIEs go through this helper function. Later on, we can use an allocator for DIE allocation, and will only need to change createAndAddDIE instead of modifying all the "new DIE(". Reviewed off-list by Eric. llvm-svn: 193566
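The idea of a single helper that allocates a node and attaches it to its parent in one step can be sketched as follows. `Node` and `createAndAdd` are hypothetical names for illustration; the real helper operates on LLVM's DIE and compile-unit classes, whose interfaces are not shown in this log.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tree node standing in for a debug-info entry.
struct Node {
  std::string Tag;
  Node *Parent = nullptr;
  std::vector<std::unique_ptr<Node>> Children;

  explicit Node(std::string T) : Tag(std::move(T)) {}
};

// Create a node and attach it to Parent immediately, so that no parentless
// node ever escapes to the caller. All allocation is funnelled through here,
// so switching to a different allocator later touches only this function.
Node &createAndAdd(const std::string &Tag, Node &Parent) {
  std::unique_ptr<Node> N(new Node(Tag));
  N->Parent = &Parent;
  Parent.Children.push_back(std::move(N));
  return *Parent.Children.back();
}
```

Callers build the context first and then hang children off it, for example `createAndAdd("variable", createAndAdd("namespace", Root))`.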
* Merge DWARFDIE::extractFast and DWARFDIE::extract into one function.Alexey Samsonov2013-10-282-56/+10
| | | | | | | Complicated CU-DIE-specific logic in the latter was never used, and it makes sense to have safety checks for broken dwarf in the former. llvm-svn: 193563
* DWARF parser: Use ArrayRef to represent form sizes and simplify DWARFDIE::extractFast() interface.Alexey Samsonov2013-10-284-18/+11
| No functionality change.
|
| llvm-svn: 193560
* DWARF parser: since DWARF4, DW_AT_high_pc may be a constant representing function sizeAlexey Samsonov2013-10-281-4/+10
| llvm-svn: 193555
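A minimal sketch of the DWARF4 rule this commit refers to: when DW_AT_high_pc has an address-class form it is the end address itself, and when it has a constant-class form it is an offset from DW_AT_low_pc. The `FormClass`, `HighPcAttr` and `resolveHighPc` names are hypothetical and do not reflect the parser's actual interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of the DWARF4 form classes relevant to DW_AT_high_pc.
enum class FormClass { Address, Constant };

// Hypothetical decoded attribute: its form class plus the raw value.
struct HighPcAttr {
  FormClass Class;
  std::uint64_t Value;
};

// Compute the end address of a function from its low_pc and high_pc.
std::uint64_t resolveHighPc(std::uint64_t LowPc, const HighPcAttr &HighPc) {
  if (HighPc.Class == FormClass::Constant)
    return LowPc + HighPc.Value; // constant class: value is the function size
  return HighPc.Value;           // address class: value is already an address
}
```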
* DebugInfo: Introduce the notion of "form classes"Alexey Samsonov2013-10-285-67/+158
| | | | | | | | | | | | | | | | | | Summary: Use DWARF4 table of form classes to fetch attributes from DIE in a more consistent way. This shouldn't change the functionality and serves as a refactoring for upcoming change: DW_AT_high_pc has different semantics depending on its form class. Reviewers: dblaikie, echristo Reviewed By: echristo CC: echristo, llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1961 llvm-svn: 193553
* [mips] Simplify LowerFormalArguments using getRegClassFor.Akira Hatanaka2013-10-281-15/+2
| | | | | | No functionality change. llvm-svn: 193540
* Return early from getUnconditionalBranchTargetOpValue if the branch target isLang Hames2013-10-281-1/+1
| | | | | | | | | | | | | | | | | an MCExpr, in order to avoid writing an encoded zero value in the immediate field. When getUnconditionalBranchTargetOpValue is called with an MCExpr target, we don't know what the final immediate field value should be. We shouldn't explicitly set the immediate field to an encoded zero value as zero is encoded with a non-zero bit pattern. This leads to bits being set that pollute the final immediate value. The nature of the encoding is such that the polluted bits only affect very large immediate values, explaining why this hasn't caused problems earlier. Fixes <rdar://problem/15155975>. llvm-svn: 193535
* [arm] Implement eabi_attribute, cpu, and fpu directives.Logan Chien2013-10-287-265/+514
| This commit allows the ARM integrated assembler to parse and assemble code
| with .eabi_attribute, .cpu, and .fpu directives.
|
| To implement the feature, this commit moves the code from AttrEmitter to
| ARMTargetStreamers, and several new test cases related to cortex-m4,
| cortex-r5, and cortex-a15 are added.
|
| Besides, this commit also changes Subtarget->isFPOnlySP() to
| Subtarget->hasD16() to match the usage of the .fpu directive.
|
| This commit changes the test cases:
|
| * Several .eabi_attribute directives in 2010-09-29-mc-asm-header-test.ll
|   are removed because the .fpu directive already covers the functionality.
|
| * In the Cortex-A15 test case, the value for Tag_Advanced_SIMD_arch has
|   been changed from 1 to 2, which is more precise.
|
| llvm-svn: 193524
* simplify ConstantRange::getSetSize()Nuno Lopes2013-10-281-3/+0
| | | | llvm-svn: 193523
* [SystemZ] Set useAA to trueRichard Sandiford2013-10-281-0/+3
| | | | | | | | | | | | | | | | useAA significantly improves the handling of vector code that has TBAA information attached. It also helps other cases, as shown by the testsuite changes here. The only real downside I've seen is that it interferes with MergeConsecutiveStores. The problem is that that optimization works top down, starting at the first store in the chain, and looks for cases where the chain result is only used by a single related store. These related stores don't alias, so useAA will have rewritten all the later stores to use a different chain input (typically the same one as the first store). I think the advantages outweigh the disadvantages though, so for now I've just disabled alias analysis for the unaligned-01.ll test. llvm-svn: 193521
* [DAGCombiner] Respect volatility when checking for aliasesRichard Sandiford2013-10-281-18/+25
| | | | | | | | Making useAA() default to true for SystemZ showed that the combiner alias analysis wasn't handling volatile accesses. This hit many of the SystemZ tests, but I arbitrarily picked one for the purpose of this patch. llvm-svn: 193518
* Keep TBAA info when rewriting SelectionDAG loads and storesRichard Sandiford2013-10-288-191/+181
| | | | | | | | | | | | | | | | | Most SelectionDAG code drops the TBAA info when creating a new form of a load and store (e.g. during legalization, or when converting a plain load to an extending one). This patch tries to catch all cases where the TBAA information can legitimately be carried over. The patch adds alternative forms of getLoad() and getExtLoad() that take a MachineMemOperand instead of individual fields. (The corresponding getTruncStore() already exists.) The idea is to use the MachineMemOperand forms when all fields are carried over (size, pointer info, isVolatile, isNonTemporal, alignment and TBAA info). If some adjustment is being made, e.g. to narrow the load, then we still pass the individual fields but also pass the TBAA info. llvm-svn: 193517
* SCEV: Make the final add of an inbounds GEP nuw if we know that the index is positive.Benjamin Kramer2013-10-281-4/+9
| We can't do this for the general case as saying a GEP with a negative index
| doesn't have unsigned wrap isn't valid for negative indices.
|   %gep = getelementptr inbounds i32* %p, i64 -1
| But an inbounds GEP cannot run past the end of address space. So we check for
| the very common case of a positive index and make GEPs derived from that NUW.
| Together with Andy's recent non-unit stride work this lets us analyze loops
| like
|   void foo3(int *a, int *b) {
|     for (; a < b; a++) {}
|   }
|
| PR12375, PR12376.
|
| Differential Revision: http://llvm-reviews.chandlerc.com/D2033
|
| llvm-svn: 193514
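Why a negative index rules out the no-unsigned-wrap flag can be seen with plain 64-bit arithmetic. The sketch below assumes 64-bit pointers and treats the GEP as an unsigned add of base and byte offset; `addWrapsUnsigned` is a hypothetical helper, not part of ScalarEvolution.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if Base + Offset overflows when both operands are viewed as
// unsigned 64-bit values, which is the condition the nuw flag rules out.
bool addWrapsUnsigned(std::uint64_t Base, std::int64_t Offset) {
  // A negative offset becomes a huge unsigned value (two's complement).
  std::uint64_t UnsignedOffset = static_cast<std::uint64_t>(Offset);
  std::uint64_t Sum = Base + UnsignedOffset; // wraps modulo 2^64
  return Sum < Base;                         // a smaller result means it wrapped
}
```

For `getelementptr inbounds i32* %p, i64 -1` the byte offset is -4, so `addWrapsUnsigned(0x1000, -4)` is true even though the resulting address 0xFFC is perfectly valid. With a positive offset the add does not wrap for this base, matching the commit's point that an inbounds GEP cannot run past the end of the address space.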
* Prune utf8 chars in comments.NAKAMURA Takumi2013-10-282-5/+5
| | | | llvm-svn: 193512
* Prune trailing linefeeds.NAKAMURA Takumi2013-10-282-2/+0
| | | | llvm-svn: 193511
* Target/R600: Un-tab-ify.NAKAMURA Takumi2013-10-283-9/+9
| | | | llvm-svn: 193510
* Make first substantial checkin of my port of ARM constant islands code to Mips.Reed Kotler2013-10-278-12/+284
| | | | | | | | | | | | Before I just ported the shell of the pass. I've tried to keep everything nearly identical to the ARM version. I think it will be very easy to eventually merge these two and create a new more general pass that other targets can use. I have some improvements I would like to make to allow pools to be shared across functions and some other things. When I'm all done we can think about making a more general pass. More to be ported but the basic mechanism works now almost as good as gcc mips16. llvm-svn: 193509
* NVPTX: Remove unused globals.Benjamin Kramer2013-10-271-7/+3
| | | | llvm-svn: 193500
* Hexagon: Remove global state.Benjamin Kramer2013-10-271-10/+25
| | | | llvm-svn: 193499
* AVX-512: PMIN/PMAX intrinsics and patternsElena Demikhovsky2013-10-272-1/+45
| | | | | | Patch by Cameron McInally <cameron.mcinally@nyu.edu> llvm-svn: 193497
* Revert r193251 : Use address-taken to disambiguate global variable and indirect memops.Shuxin Yang2013-10-279-27/+2
| llvm-svn: 193489
* Quick look-up for block in loop.Wan Xiaofei2013-10-263-33/+11
| This patch implements quick look-up for block in loop by maintaining a hash
| set for blocks. It improves the efficiency of loop analysis a lot, the biggest
| improvement could be 5-6%(458.sjeng). Below are the compilation times for our
| benchmarks in llc before & after the patch.
|
| Benchmark      llc - trunk            llc - patched
| 401.bzip2      0.339081   100.00%     0.329657   102.86%
| 403.gcc        19.853966  100.00%     19.605466  101.27%
| 429.mcf        0.049823   100.00%     0.048451   102.83%
| 433.milc       0.514898   100.00%     0.510217   100.92%
| 444.namd       1.109328   100.00%     1.103481   100.53%
| 445.gobmk      4.988028   100.00%     4.929114   101.20%
| 456.hmmer      0.843871   100.00%     0.825865   102.18%
| 458.sjeng      0.754238   100.00%     0.714095   105.62%
| 464.h264ref    2.9668     100.00%     2.90612    102.09%
| 471.omnetpp    4.556533   100.00%     4.511886   100.99%
| bitmnp01       0.038168   100.00%     0.0357     106.91%
| idctrn01       0.037745   100.00%     0.037332   101.11%
| libquake2      3.78689    100.00%     3.76209    100.66%
| libquake_      2.251525   100.00%     2.234104   100.78%
| linpack        0.033159   100.00%     0.032788   101.13%
| matrix01       0.045319   100.00%     0.043497   104.19%
| nbench         0.333161   100.00%     0.329799   101.02%
| tblook01       0.017863   100.00%     0.017666   101.12%
| ttsprk01       0.054337   100.00%     0.053057   102.41%
|
| Reviewer : Andrew Trick <atrick@apple.com>, Hal Finkel <hfinkel@anl.gov>
| Approver : Andrew Trick <atrick@apple.com>
| Test : Pass make check-all & llvm test-suite
|
| llvm-svn: 193460
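The technique described here, keeping a hash set alongside the ordered block list so that membership tests avoid a linear scan, can be sketched roughly as below. `Block` and `LoopSketch` are hypothetical stand-ins using the standard library; the actual patch changes LLVM's loop classes and may use a different set type.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a basic block.
struct Block {
  int Id;
};

// Hypothetical loop container: an ordered block list for iteration, mirrored
// by a hash set that answers "is this block in the loop?" queries.
class LoopSketch {
  std::vector<const Block *> Blocks;
  std::unordered_set<const Block *> BlockSet;

public:
  // Keep both containers in sync; duplicates are ignored.
  void addBlock(const Block *B) {
    if (BlockSet.insert(B).second)
      Blocks.push_back(B);
  }

  // Linear scan over the list: O(n) per query.
  bool containsSlow(const Block *B) const {
    return std::find(Blocks.begin(), Blocks.end(), B) != Blocks.end();
  }

  // Hash look-up: O(1) on average per query.
  bool contains(const Block *B) const { return BlockSet.count(B) != 0; }

  std::size_t size() const { return Blocks.size(); }
};
```

The trade-off is extra memory and the need to update both containers together whenever blocks are added.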
* Fix SCEVExpander: don't try to expand quadratic recurrences outside a loop.Andrew Trick2013-10-253-14/+48
| Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu
| (affecting trunk and 3.3)
|
| When SCEV expands a recurrence outside of a loop it attempts to scale
| by the stride of the recurrence. Chained recurrences don't work that
| way. We could compute binomial coefficients, but would have to
| guarantee that the chained AddRec's are in a perfectly reduced form.
|
| llvm-svn: 193438
* Fix LSR: don't normalize quadratic recurrences.Andrew Trick2013-10-251-5/+13
| | | | | | | | | | Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu (affecting trunk and 3.3) ScalarEvolutionNormalization was attempting to normalize by adding and subtracting strides. Chained recurrences don't work that way. llvm-svn: 193437
* Handle calls and invokes in GlobalStatus.Rafael Espindola2013-10-251-0/+5
| This patch teaches GlobalStatus to analyze a call that uses the global value
| as a callee, not as an argument.
|
| With this change internalize can handle the common use of linkonce_odr
| functions. This reduces the number of linkonce_odr functions in a LTO build of
| clang (checked with the emit-llvm gold plugin option) from 1730 to 60.
|
| llvm-svn: 193436
* LoopVectorizer: Don't attempt to vectorize extractelement instructionsHal Finkel2013-10-251-2/+3
| The loop vectorizer does not currently understand how to vectorize
| extractelement instructions. The existing check, which excluded all
| vector-valued instructions, did not catch extractelement instructions because
| it checked only the return value. As a result, vectorization would proceed,
| producing illegal instructions like this:
|   %58 = extractelement <2 x i32> %15, i32 0
|   %59 = extractelement i32 %58, i32 0
| where the second extractelement is illegal because its first operand is not a
| vector.
|
| llvm-svn: 193434
* DIEHash: Summary hashing of member functionsDavid Blaikie2013-10-251-1/+1
| | | | llvm-svn: 193432
* Change MemoryBuffer::getFile to take a Twine.Rafael Espindola2013-10-251-8/+12
| | | | llvm-svn: 193429
* DIEHash: Summary hashing of nested typesDavid Blaikie2013-10-252-1/+26
| | | | llvm-svn: 193427