summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* Make helper functions static.Benjamin Kramer2015-03-093-11/+11
| | | | | | Found by -Wmissing-prototypes. NFC. llvm-svn: 231664
* R600/SI: Fix DS definitions and add missing instructionsTom Stellard2015-03-092-44/+139
| | | | llvm-svn: 231663
* R600/SI: Fix opcode for ds_read2_b64 and ds_read2st64_b64Tom Stellard2015-03-091-2/+2
| | | | llvm-svn: 231662
* Move unreferenced passes into the cpp fileBenjamin Kramer2015-03-098-115/+88
| | | | | | NFC. llvm-svn: 231661
* SymbolRewriter: Hide implementation detailsBenjamin Kramer2015-03-091-6/+6
| | | | | | NFC. llvm-svn: 231660
* R600/SI: Limit SGPRs to 80 on Tonga and IcelandMarek Olsak2015-03-096-3/+45
| | | | | | This is a candidate for stable. llvm-svn: 231659
* R600/SI: Fix getNumSGPRsAllowed for VIMarek Olsak2015-03-092-12/+24
| | | | llvm-svn: 231658
* Revert r231630 - Run LICM pass after loop unrolling pass.Kevin Qin2015-03-092-10/+7
| | | | | | As it broke llvm bootstrap. llvm-svn: 231635
* Fix a bug in the LLParser where we failed to diagnose landingpads with ↵Owen Anderson2015-03-091-6/+7
| | | | | | | | | | non-constant clause operands. Fixing this also exposed a related issue where the landingpad under construction was not cleaned up when an error was raised, which would cause bad reference errors before the error could actually be printed. llvm-svn: 231634
* [AArch64] Enable partial & runtime unrolling on cortex-a57Kevin Qin2015-03-091-0/+10
| | | | | | | | For inner one of nested loops, it is more likely to be a hot loop, and the runtime check can be promoted out from patch 0001, so the overhead is less, we can try a doubled threshold to unroll more loops. llvm-svn: 231632
* Introduce runtime unrolling disable matadata and use it to mark the scalar ↵Kevin Qin2015-03-092-2/+60
| | | | | | | | | | | loop from vectorization. Runtime unrolling is an expensive optimization which can bring benefit only if the loop is hot and iteration number is relatively large enough. For some loops, we know they are not worth to be runtime unrolled. The scalar loop from vectorization is one of the cases. llvm-svn: 231631
* Run LICM pass after loop unrolling pass.Kevin Qin2015-03-092-7/+10
| | | | | | | | | Runtime unrollng will introduce a runtime check in loop prologue. If the unrolled loop is a inner loop, then the proglogue will be inside the outer loop. LICM pass can help to promote the runtime check out if the checked value is loop invariant. llvm-svn: 231630
* InstCombine: fix fold "fcmp x, undef" to account for NaNMehdi Amini2015-03-092-10/+25
| | | | | | | | | | | | | | | | | | | | | | | | Summary: See the two test cases. ; Can fold fcmp with undef on one side by choosing NaN for the undef ; Can fold fcmp with undef on both side ; fcmp u_pred undef, undef -> true ; fcmp o_pred undef, undef -> false ; because whatever you choose for the first undef ; you can choose NaN for the other undef Reviewers: hfinkel, chandlerc, majnemer Reviewed By: majnemer Subscribers: majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D7617 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231626
* DCE: isArrayMalloc() is not used neither in LLVM nor ClangMehdi Amini2015-03-091-17/+0
| | | | | From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231624
* Simplify expressions involving boolean constants with clang-tidyDavid Blaikie2015-03-0915-20/+19
| | | | | | | | Patch by Richard (legalize at xmission dot com). Differential Revision: http://reviews.llvm.org/D8154 llvm-svn: 231617
* Teach DataLayout to infer a plausible alignment for things even when nothing ↵Owen Anderson2015-03-081-3/+14
| | | | | | is specified by the user. llvm-svn: 231613
* [X86][AVX] Fix wrong lowering of VPERM2X128 nodesAndrea Di Biagio2015-03-081-1/+9
| | | | | | | | | | | | | | | | | | | | | | There were cases where the backend computed a wrong permute mask for a VPERM2X128 node. Example: \code define <8 x float> @foo(<8 x float> %a, <8 x float> %b) { %shuffle = shufflevector <8 x float> %a, <8 x float> %b, <8 x i32> <i32 undef, i32 undef, i32 6, i32 7, i32 undef, i32 undef, i32 6, i32 7> ret <8 x float> %shuffle } \code end Before this patch, llc (with -mattr=+avx) emitted the following vperm2f128: vperm2f128 $0, %ymm0, %ymm0, %ymm0 # ymm0 = ymm0[0,1,0,1] With this patch, llc emits a vperm2f128 with a correct permute mask: vperm2f128 $17, %ymm0, %ymm0, %ymm0 # ymm0 = ymm0[2,3,2,3] Differential Revision: http://reviews.llvm.org/D8119 llvm-svn: 231601
* Make static variables const if possible. Makes them go into a read-only section.Benjamin Kramer2015-03-086-49/+34
| | | | | | Or fold them into a initializer list which has the same effect. NFC. llvm-svn: 231598
* [DAGCombiner] Add a shuffle mask commutation helper function. NFCI.Simon Pilgrim2015-03-073-55/+6
| | | | | | | | | | We have an increasing number of cases where we are creating commuted shuffle masks - all implementing nearly the same code. This patch adds a static helper function - ShuffleVectorSDNode::commuteMask() and replaces a number of cases to use it. Differential Revision: http://reviews.llvm.org/D8139 llvm-svn: 231581
* Fix the autoconf buildDavid Majnemer2015-03-073-195/+160
| | | | | | | | lib/ExecutionEngine/Targets has no Makefile, causing the autoconf build to fail. Solve this by bringing the COFF implementation of RuntimeDyld in line like the Mach-O and ELF implementations. llvm-svn: 231579
* Make the assertion macros in Verifier and Linter truly variadic.Benjamin Kramer2015-03-072-1055/+1009
| | | | | | NFC. llvm-svn: 231577
* Fix unused variable/function warningsDavid Majnemer2015-03-073-10/+7
| | | | llvm-svn: 231576
* ExecutionEngine: Preliminary support for dynamically loadable coff objectsDavid Majnemer2015-03-079-14/+444
| | | | | | | | | | Provide basic support for dynamically loadable coff objects. Only handles a subset of x64 currently. Patch by Andy Ayers! Differential Revision: http://reviews.llvm.org/D7793 llvm-svn: 231574
* Make constant arrays that are passed to functions as const.Benjamin Kramer2015-03-078-40/+30
| | | | | | | | In theory this allows the compiler to skip materializing the array on the stack. In practice clang often fails to do that, but that's a different story. NFC. llvm-svn: 231571
* Use SDValue bool check to tidyup some possible combines. NFC.Simon Pilgrim2015-03-071-6/+5
| | | | llvm-svn: 231569
* X86: Roll repetitive code into a loop. NFC.Benjamin Kramer2015-03-071-49/+16
| | | | llvm-svn: 231565
* [DAGCombiner] Fix wrong folding of AND dag nodes.Andrea Di Biagio2015-03-071-3/+7
| | | | | | | | | | | | | | | | | | | | | | | This patch fixes the logic in the DAGCombiner that folds an AND node according to rule: (and (X (load V)), C) -> (X (load V)) An AND between a vector load 'X' and a constant build_vector 'C' can be folded into the load itself only if we can prove that the AND operation is redundant. The algorithm implemented by 'visitAND' firstly computes the splat value 'S' from C, and then checks if S has the lower 'B' bits set (where B is the size in bits of the vector element type). The algorithm takes into account also the 'undef' bits in the splat mask. Unfortunately, the algorithm only worked under the assumption that the size of S is a multiple of the vector element type. With this patch, we conservatively avoid folding the AND if the splat bits are not compatible with the vector element type. Added X86 test and-load-fold.ll Differential Revision: http://reviews.llvm.org/D8085 llvm-svn: 231563
* [PM] Fixup for r231556 where I missed a dependency on intrinsicsChandler Carruth2015-03-071-0/+2
| | | | | | generation. llvm-svn: 231558
* [PM] Create a separate library for high-level pass management code.Chandler Carruth2015-03-078-2/+534
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This will provide the analogous replacements for the PassManagerBuilder and other code long term. This code is extracted from the opt tool currently, and I plan to extend it as I build up support for using the new pass manager in Clang and other places. Mailing this out for review in part to let folks comment on the terrible names here. A brief word about why I chose the names I did. The library is called "Passes" to try and make it clear that it is a high-level utility and where *all* of the passes come together and are registered in a common library. I didn't want it to be *limited* to a registry though, the registry is just one component. The class is a "PassBuilder" but this name I'm less happy with. It doesn't build passes in any traditional sense and isn't a Builder-style API at all. The class is a PassRegisterer or PassAdder, but neither of those really make a lot of sense. This class is responsible for constructing passes for registry in an analysis manager or for population of a pass pipeline. If anyone has a better name, I would love to hear it. The other candidate I looked at was PassRegistrar, but that doesn't really fit either. There is no register of all the passes in use, and so I think continuing the "registry" analog outside of the registry of pass *names* and *types* is a mistake. The objects themselves are just objects with the new pass manager. Differential Revision: http://reviews.llvm.org/D8054 llvm-svn: 231556
* [DAGCombiner] SCALAR_TO_VECTOR(EXTRACT_VECTOR_ELT(V,C)) -> VECTOR_SHUFFLESimon Pilgrim2015-03-071-0/+30
| | | | | | | | | | | | This patch attempts to convert a SCALAR_TO_VECTOR using an operand from an EXTRACT_VECTOR_ELT into a VECTOR_SHUFFLE. This prevents many cases of spilling scalar data between the gpr + simd registers. At present the optimization only accepts cases where there is no TRUNC of the scalar type (i.e. all types must match). Differential Revision: http://reviews.llvm.org/D8132 llvm-svn: 231554
* Typo.Eric Christopher2015-03-072-2/+2
| | | | llvm-svn: 231547
* Recommit r231324 with a fix to the ARM execution domain codeEric Christopher2015-03-072-16/+19
| | | | | | | | | | | | to disable lane switching if we don't actually have the instruction set we want to switch to. Models the earlier check above the conditional for the pass. The testcase is one that triggered with the assert that's added as part of the fix, use it to avoid adding a new testcase as it highlights the same problem. llvm-svn: 231539
* Do not restrict interleaved unrolling to small loops, depending on the target.Olivier Sallenave2015-03-064-0/+17
| | | | llvm-svn: 231528
* [AArch64][LoadStoreOptimizer] Generate LDP + SXTW instead of LD[U]R + LD[U]RSW.Quentin Colombet2015-03-061-11/+116
| | | | | | | | | | | Teach the load store optimizer how to sign extend a result of a load pair when it helps creating more pairs. The rational is that loads are more expensive than sign extensions, so if we gather some in one instruction this is better! <rdar://problem/20072968> llvm-svn: 231527
* DAGCombiner: Canonicalize select(and/or,x,y) depending on target.Matthias Braun2015-03-061-0/+63
| | | | | | | | | | | | | | | This is based on the following equivalences: select(C0 & C1, X, Y) <=> select(C0, select(C1, X, Y), Y) select(C0 | C1, X, Y) <=> select(C0, X, select(C1, X, Y)) Many target cannot perform and/or on the CPU flags and therefore the right side should be choosen to avoid materializign the i1 flags in an integer register. If the target can perform this operation efficiently we normalize to the left form. Differential Revision: http://reviews.llvm.org/D7622 llvm-svn: 231507
* DAGCombiner: Factor out some and/or combines.Matthias Braun2015-03-061-225/+252
| | | | | | | | | | | | | | | This is in preparation for changing visitSELECT to normalize towards select(Cond0, select(Cond1, X, Y), Y); select(Cond0, X, select(Cond1, X, Y)) which perfom an implicit and/or of the conditions. The factored function contains all DAGCombine rules which reduce two values combined by an And/Or operation to a single value. This does not include rules involving constants as visitSELECT already handles that case. Differential Revision: http://reviews.llvm.org/D8026 llvm-svn: 231506
* LoopInterchange: Remove empty method.Benjamin Kramer2015-03-061-6/+1
| | | | llvm-svn: 231503
* LoopInterchange: Rephrase instruction moving using ilist's splice and factor ↵Benjamin Kramer2015-03-061-56/+19
| | | | | | | | it into a function + Random cleanups. No functional change. llvm-svn: 231501
* ExecutionDepsFix: Indizes -> Indices.Matthias Braun2015-03-061-10/+10
| | | | | | Translate german to english. llvm-svn: 231500
* Fix typo.Eric Christopher2015-03-061-1/+1
| | | | llvm-svn: 231495
* R600/SI: Remove unused register classTom Stellard2015-03-061-7/+0
| | | | llvm-svn: 231491
* Fold init() helpers into constructors. NFC.Benjamin Kramer2015-03-062-37/+19
| | | | llvm-svn: 231486
* Avoid calls to dumpPassInfo and RegionBase<Tr>::getNameStr() in RGPassManager ifChad Rosier2015-03-061-10/+14
| | | | | | | | | | | | | | | | | | -debug-pass is not specified, as the string is only used when dumping pass information. There is a big cost of determining the name in ReginBase<Tr>:getNameStr() if the region's entry or exit block doesn't have a name. This is the case for the Release build, as names are not preserved by the front-end. RegionPass is mainly used by Polly, resulting in long compile time for one file of a customer application with the Release build (1m24s) vs Release+Asserts build (10s) when Polly is used. With this change, the compile time with the Release build went down to 8s. Patch by Sanjin Sijaric <ssijaric@codeaurora.org>! Phabricator: http://reviews.llvm.org/D8076 llvm-svn: 231485
* [ConstantRange] Teach multiply to be cleverer about signed ranges.James Molloy2015-03-061-1/+27
| | | | | | | | | | | | | Multiplication is not dependent on signedness, so just treating all input ranges as unsigned is not incorrect. However it will cause overly pessimistic ranges (such as full-set) when used with signed negative values. Teach multiply to try to interpret its inputs as both signed and unsigned, and then to take the most specific (smallest population) as its result. llvm-svn: 231483
* [AsmPrinter][TLOF] 32-bit MachO support for replacing GOT equivalentsBruno Cardoso Lopes2015-03-066-21/+89
| | | | | | | | | | | | | | | | | | | | | | | | | | Add MachO 32-bit (i.e. arm and x86) support for replacing global GOT equivalent symbol accesses. Unlike 64-bit targets, there's no GOTPCREL relocation, and access through a non_lazy_symbol_pointers section is used instead. -- before _extgotequiv: .long _extfoo _delta: .long _extgotequiv-_delta -- after _delta: .long L_extfoo$non_lazy_ptr-_delta .section __IMPORT,__pointers,non_lazy_symbol_pointers L_extfoo$non_lazy_ptr: .indirect_symbol _extfoo .long 0 llvm-svn: 231475
* [AsmPrinter][TLOF] ARM64 MachO support for replacing GOT equivalentsBruno Cardoso Lopes2015-03-065-4/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Follow up r230264 and add ARM64 support for replacing global GOT equivalent symbol accesses by references to the GOT entry for the final symbol instead, example: -- before .globl _foo _foo: .long 42 .globl _gotequivalent _gotequivalent: .quad _foo .globl _delta _delta: .long _gotequivalent-_delta -- after .globl _foo _foo: .long 42 .globl _delta Ltmp3: .long _foo@GOT-Ltmp3 llvm-svn: 231474
* [mips] [IAS] Add missing constraints and improve testing for the .module ↵Toma Tabacu2015-03-063-7/+41
| | | | | | | | | | | | | | | | | | directive. Summary: None of the .set directives can be used before the .module directives. The .set mips0/pop/push were not triggering this constraint. Also added testing for all the other implemented directives which are supposed to trigger this constraint. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7140 llvm-svn: 231465
* Change the way in which error case is being handled.Daniel Jasper2015-03-061-2/+4
| | | | | | | | | Specifically this: * Prevents an "unused" warning in non-assert builds. * In that error case return with out removing a child loop instead of looping forever. llvm-svn: 231459
* Add a new pass "Loop Interchange"Karthik Bhat2015-03-064-0/+1204
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This pass interchanges loops to provide a more cache-friendly memory access. For e.g. given a loop like - for(int i=0;i<N;i++) for(int j=0;j<N;j++) A[j][i] = A[j][i]+B[j][i]; is interchanged to - for(int j=0;j<N;j++) for(int i=0;i<N;i++) A[j][i] = A[j][i]+B[j][i]; This pass is currently disabled by default. To give a brief introduction it consists of 3 stages- LoopInterchangeLegality : Checks the legality of loop interchange based on Dependency matrix. LoopInterchangeProfitability: A very basic heuristic has been added to check for profitibility. This will evolve over time. LoopInterchangeTransform : Which does the actual transform. LNT Performance tests shows improvement in Polybench/linear-algebra/kernels/mvt and Polybench/linear-algebra/kernels/gemver becnmarks. TODO: 1) Add support for reductions and lcssa phi. 2) Improve profitability model. 3) Improve loop selection algorithm to select best loop for interchange. Currently the innermost loop is selected for interchange. 4) Improve compile time regression found in llvm lnt due to this pass. 5) Fix issues in Dependency Analysis module. A special thanks to Hal for reviewing this code. Review: http://reviews.llvm.org/D7499 llvm-svn: 231458
* X86: Form IMGREL relocations for LLVM FunctionsDavid Majnemer2015-03-061-9/+7
| | | | | | | | We supported forming IMGREL relocations from ConstantExprs involving __ImageBase if the minuend was a GlobalVariable. Extend this functionality to all GlobalObjects. llvm-svn: 231456
OpenPOWER on IntegriCloud