path: root/llvm
Commit message | Author | Age | Files | Lines
* [DAGCombiner] Fix wrong folding of AND dag nodes. | Andrea Di Biagio | 2015-03-07 | 2 | -3/+22
  This patch fixes the logic in the DAGCombiner that folds an AND node according to the rule:
    (and (X (load V)), C) -> (X (load V))

  An AND between a vector load 'X' and a constant build_vector 'C' can be folded into the load itself only if we can prove that the AND operation is redundant.

  The algorithm implemented by 'visitAND' first computes the splat value 'S' from C, and then checks whether S has the lower 'B' bits set (where B is the size in bits of the vector element type). The algorithm also takes into account the 'undef' bits in the splat mask.

  Unfortunately, the algorithm only worked under the assumption that the size of S is a multiple of the size of the vector element type. With this patch, we conservatively avoid folding the AND if the splat bits are not compatible with the vector element type.

  Added X86 test and-load-fold.ll

  Differential Revision: http://reviews.llvm.org/D8085
  llvm-svn: 231563
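  The check described in this commit message can be sketched in a few lines. This is only an illustrative model, not the DAGCombiner code: plain Python integers stand in for bit vectors, and the function name `can_fold_and` and its parameters are hypothetical. It assumes that undef bits may be treated as set, that a splat narrower than a lane is widened by repetition, and that the fold is refused whenever the splat size is not a multiple of the lane size.

```python
def can_fold_and(splat_value, splat_undef, splat_bits, lane_bits):
    """Return True if AND-ing every lane with the splat constant is a no-op.

    Hypothetical model: splat_value/splat_undef are integers holding
    splat_bits bits; lane_bits is the vector element size in bits.
    """
    # Undef bits can take any value, so assume they are set.
    value = (splat_value | splat_undef) & ((1 << splat_bits) - 1)

    # A splat narrower than one lane is widened by repeating it.
    while splat_bits < lane_bits:
        value |= value << splat_bits
        splat_bits *= 2

    # Conservative step: if the splat does not tile the lanes exactly,
    # no single per-lane mask can be derived, so refuse to fold.
    if splat_bits % lane_bits != 0:
        return False

    # AND together every lane-sized slice of the splat.
    lane_ones = (1 << lane_bits) - 1
    mask = lane_ones
    for i in range(splat_bits // lane_bits):
        mask &= (value >> (i * lane_bits)) & lane_ones

    # The AND is redundant only if all lower lane_bits bits are set.
    return mask == lane_ones
```

  In this model a 16-bit splat 0x00FF over 8-bit lanes is rejected because every second lane would be cleared, and a splat whose size is not a multiple of the lane size is rejected even when all of its bits are set.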
* [Modules] Include the header needed for make_unique, otherwise we can't | Chandler Carruth | 2015-03-07 | 1 | -0/+1
  build this header in a module.
  llvm-svn: 231561
* Teach the LLVM CMake build how to explicitly use libc++abi when using | Chandler Carruth | 2015-03-07 | 2 | -10/+17
  libc++.
  This lets me almost self-host on Linux with libc++ and libc++abi very simply. Currently, MCJIT and OrcJIT are failing due to uncaught exceptions, and the Go binding tests are failing to build due to not linking in the correct C++ standard library.
  llvm-svn: 231560
* [PM] Fixup for r231556 where I missed a dependency on intrinsics | Chandler Carruth | 2015-03-07 | 1 | -0/+2
  generation.
  llvm-svn: 231558
* [PM] Create a separate library for high-level pass management code. | Chandler Carruth | 2015-03-07 | 13 | -41/+90
  This will provide the analogous replacements for the PassManagerBuilder and other code long term. This code is extracted from the opt tool currently, and I plan to extend it as I build up support for using the new pass manager in Clang and other places.

  Mailing this out for review in part to let folks comment on the terrible names here. A brief word about why I chose the names I did.

  The library is called "Passes" to try and make it clear that it is a high-level utility and where *all* of the passes come together and are registered in a common library. I didn't want it to be *limited* to a registry though, the registry is just one component.

  The class is a "PassBuilder" but this name I'm less happy with. It doesn't build passes in any traditional sense and isn't a Builder-style API at all. The class is a PassRegisterer or PassAdder, but neither of those really makes a lot of sense. This class is responsible for constructing passes for registry in an analysis manager or for population of a pass pipeline. If anyone has a better name, I would love to hear it. The other candidate I looked at was PassRegistrar, but that doesn't really fit either. There is no register of all the passes in use, and so I think continuing the "registry" analog outside of the registry of pass *names* and *types* is a mistake. The objects themselves are just objects with the new pass manager.

  Differential Revision: http://reviews.llvm.org/D8054
  llvm-svn: 231556
* [DAGCombiner] SCALAR_TO_VECTOR(EXTRACT_VECTOR_ELT(V,C)) -> VECTOR_SHUFFLE | Simon Pilgrim | 2015-03-07 | 3 | -7/+34
  This patch attempts to convert a SCALAR_TO_VECTOR using an operand from an EXTRACT_VECTOR_ELT into a VECTOR_SHUFFLE. This prevents many cases of spilling scalar data between the gpr + simd registers.
  At present the optimization only accepts cases where there is no TRUNC of the scalar type (i.e. all types must match).
  Differential Revision: http://reviews.llvm.org/D8132
  llvm-svn: 231554
* Typo. | Eric Christopher | 2015-03-07 | 4 | -4/+4
  llvm-svn: 231547
* Remove use of misched-bench from this test and replace it with | Eric Christopher | 2015-03-07 | 1 | -1/+1
  non-temporary enabling options. This is part of removing misched-bench as an option.
  llvm-svn: 231546
* [dsymutil] Apply relocations to DIE data before cloning. | Frederic Riss | 2015-03-07 | 3 | -3/+86
  Doing this gets functions' low_pc and global variables' locations right in the output debug info. It could also get other attributes that need to be relocated (in linker terms) right, but I don't know of any other than the address attributes.
  This doesn't fix up low_pc attributes in compile_unit, lexical_block or inlined subroutine, nor does it get high_pc attributes right for functions. This will come in a subsequent commit.
  llvm-svn: 231544
* Recommit r231324 with a fix to the ARM execution domain code | Eric Christopher | 2015-03-07 | 3 | -16/+23
  to disable lane switching if we don't actually have the instruction set we want to switch to. Models the earlier check above the conditional for the pass.
  The testcase is one that triggered with the assert that's added as part of the fix, use it to avoid adding a new testcase as it highlights the same problem.
  llvm-svn: 231539
* [modules] Mark Analysis/TargetLibraryInfo.def as a textual header. | Richard Smith | 2015-03-06 | 1 | -0/+5
  llvm-svn: 231532
* [dsymutil] Support cloning DIE reference attributes. | Frederic Riss | 2015-03-06 | 3 | -10/+131
  Reference attributes are mainly handled by just creating DIEEntry attributes for them. There is a special case for DW_FORM_ref_addr attributes though, because the DIEEntry code needs a DwarfDebug code to emit them (and we don't have one as we do no CodeGen). In that case, just use DIEInteger attributes with the right form.
  llvm-svn: 231531
* [dsymutil] Set linked unit start offset early. NFC. | Frederic Riss | 2015-03-06 | 1 | -7/+8
  The start offset of a linked unit is known before starting to clone its DIEs. Handling DW_FORM_ref_addr attributes requires that this offset is set while cloning the unit. Split CompileUnit::computeOffsets() into setStartOffset() and computeNextUnitOffset() and call them respectively before cloning the DIEs and right after.
  llvm-svn: 231530
* Add DIEInteger::setValue() method. | Frederic Riss | 2015-03-06 | 1 | -0/+1
  dsymutil needs to 'patch' attribute values after creating them. Just add this trivial capability.
  llvm-svn: 231529
* Do not restrict interleaved unrolling to small loops, depending on the target. | Olivier Sallenave | 2015-03-06 | 7 | -0/+99
  llvm-svn: 231528
* [AArch64][LoadStoreOptimizer] Generate LDP + SXTW instead of LD[U]R + LD[U]RSW. | Quentin Colombet | 2015-03-06 | 2 | -11/+210
  Teach the load store optimizer how to sign extend a result of a load pair when it helps create more pairs.
  The rationale is that loads are more expensive than sign extensions, so if we gather some in one instruction this is better!
  <rdar://problem/20072968>
  llvm-svn: 231527
* fixed to test features, not CPUs | Sanjay Patel | 2015-03-06 | 1 | -3/+3
  llvm-svn: 231524
* fixed to test features, not CPUs | Sanjay Patel | 2015-03-06 | 1 | -2/+2
  llvm-svn: 231523
* loosen checking for buildbots | Sanjay Patel | 2015-03-06 | 1 | -2/+1
  llvm-svn: 231522
* fixed to test only the feature, not the feature and a CPU | Sanjay Patel | 2015-03-06 | 1 | -2/+1
  llvm-svn: 231521
* fixed to test only the feature, not the feature and a CPU | Sanjay Patel | 2015-03-06 | 1 | -4/+4
  llvm-svn: 231520
* fixed test to use FileCheck | Sanjay Patel | 2015-03-06 | 1 | -3/+8
  llvm-svn: 231519
* fixed to use CHECK-LABELs | Sanjay Patel | 2015-03-06 | 1 | -9/+9
  llvm-svn: 231517
* fixed to test only the feature, not the feature and a CPU | Sanjay Patel | 2015-03-06 | 1 | -1/+1
  llvm-svn: 231516
* fixed to test only the feature, not the feature and a CPU | Sanjay Patel | 2015-03-06 | 1 | -2/+2
  llvm-svn: 231515
* fixed to test feature, not CPU | Sanjay Patel | 2015-03-06 | 1 | -1/+1
  llvm-svn: 231513
* fixed to test features, not CPUs | Sanjay Patel | 2015-03-06 | 1 | -3/+3
  llvm-svn: 231512
* fixed test to use SSE2 attribute | Sanjay Patel | 2015-03-06 | 1 | -1/+1
  llvm-svn: 231510
* fixed to test only the feature, not the feature and a CPU | Sanjay Patel | 2015-03-06 | 1 | -1/+1
  llvm-svn: 231509
* DAGCombiner: Canonicalize select(and/or,x,y) depending on target. | Matthias Braun | 2015-03-06 | 7 | -14/+197
  This is based on the following equivalences:
    select(C0 & C1, X, Y) <=> select(C0, select(C1, X, Y), Y)
    select(C0 | C1, X, Y) <=> select(C0, X, select(C1, X, Y))

  Many targets cannot perform and/or on the CPU flags and therefore the right side should be chosen to avoid materializing the i1 flags in an integer register. If the target can perform this operation efficiently we normalize to the left form.

  Differential Revision: http://reviews.llvm.org/D7622
  llvm-svn: 231507
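  The two equivalences are easy to confirm by exhaustive enumeration. The sketch below is only a truth-table check in Python, not DAGCombiner code; `select` and `check_select_equivalences` are hypothetical helper names, and the conditions are modelled as plain booleans.

```python
from itertools import product


def select(cond, x, y):
    """Model of a select node: yields x when cond is true, else y."""
    return x if cond else y


def check_select_equivalences():
    """Check both rewrites for every combination of the two conditions."""
    x, y = "X", "Y"  # opaque placeholder values
    for c0, c1 in product([False, True], repeat=2):
        # select(C0 & C1, X, Y) <=> select(C0, select(C1, X, Y), Y)
        assert select(c0 and c1, x, y) == select(c0, select(c1, x, y), y)
        # select(C0 | C1, X, Y) <=> select(C0, X, select(C1, X, Y))
        assert select(c0 or c1, x, y) == select(c0, x, select(c1, x, y))
    return True
```

  The nested forms never combine the two conditions into a single value, which is why they suit targets that cannot perform and/or directly on CPU flags.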
* DAGCombiner: Factor out some and/or combines. | Matthias Braun | 2015-03-06 | 1 | -225/+252
  This is in preparation for changing visitSELECT to normalize towards
    select(Cond0, select(Cond1, X, Y), Y);
    select(Cond0, X, select(Cond1, X, Y))
  which perform an implicit and/or of the conditions.

  The factored function contains all DAGCombine rules which reduce two values combined by an And/Or operation to a single value. This does not include rules involving constants as visitSELECT already handles that case.

  Differential Revision: http://reviews.llvm.org/D8026
  llvm-svn: 231506
* [AsmPrinter][TLOF] Remove AArch64 test to appease buildbots | Bruno Cardoso Lopes | 2015-03-06 | 1 | -93/+0
  Follow up from r231497. Using XFAIL would still trigger fail on some buildbots. Will re-introduce it as soon as I have a fix.
  llvm-svn: 231505
* LoopInterchange: Remove empty method. | Benjamin Kramer | 2015-03-06 | 1 | -6/+1
  llvm-svn: 231503
* LoopInterchange: Rephrase instruction moving using ilist's splice and factor it into a function | Benjamin Kramer | 2015-03-06 | 1 | -56/+19
  + Random cleanups. No functional change.
  llvm-svn: 231501
* ExecutionDepsFix: Indizes -> Indices. | Matthias Braun | 2015-03-06 | 1 | -10/+10
  Translate German to English.
  llvm-svn: 231500
* [AsmPrinter][TLOF] XFAIL AArch64 test to appease buildbots | Bruno Cardoso Lopes | 2015-03-06 | 1 | -0/+5
  The checking for extgotequiv and localgotequiv rely on the emission order, which is not guaranteed because we use DenseMap to hold the GOT equivalents. XFAIL this now until I get time to use MapVector and test out the solution. In the meantime, appease buildbots.
  llvm-svn: 231497
* Fix typo. | Eric Christopher | 2015-03-06 | 1 | -1/+1
  llvm-svn: 231495
* [dsymutil] Add debug_str construction support. | Frederic Riss | 2015-03-06 | 3 | -8/+161
  With this comes the ability to correctly clone string attributes in DIEs.
  llvm-svn: 231493
* R600/SI: Remove unused register class | Tom Stellard | 2015-03-06 | 1 | -7/+0
  llvm-svn: 231491
* Fold init() helpers into constructors. NFC. | Benjamin Kramer | 2015-03-06 | 3 | -40/+19
  llvm-svn: 231486
* Avoid calls to dumpPassInfo and RegionBase<Tr>::getNameStr() in RGPassManager if | Chad Rosier | 2015-03-06 | 1 | -10/+14
  -debug-pass is not specified, as the string is only used when dumping pass information.

  There is a big cost of determining the name in RegionBase<Tr>::getNameStr() if the region's entry or exit block doesn't have a name. This is the case for the Release build, as names are not preserved by the front-end. RegionPass is mainly used by Polly, resulting in long compile time for one file of a customer application with the Release build (1m24s) vs Release+Asserts build (10s) when Polly is used. With this change, the compile time with the Release build went down to 8s.

  Patch by Sanjin Sijaric <ssijaric@codeaurora.org>!

  Phabricator: http://reviews.llvm.org/D8076
  llvm-svn: 231485
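  The idea behind this change, computing an expensive diagnostic string only when it will actually be printed, can be sketched as follows. This is a generic illustration, not the RGPassManager code: the `Region` class, the `run_region_pass` function, the `debug_pass_enabled` flag and the `log` list are all hypothetical stand-ins.

```python
class Region:
    """Hypothetical region whose name is costly to compute."""

    def __init__(self, entry_name=None):
        self.entry_name = entry_name
        self.name_computations = 0  # counts how often the name was built

    def get_name_str(self):
        # Stand-in for an expensive computation (for example, printing an
        # unnamed block to obtain a usable name).
        self.name_computations += 1
        return self.entry_name if self.entry_name else "<unnamed region>"


def run_region_pass(region, debug_pass_enabled, log):
    """Run a pass on a region, building the name only when dumping is on."""
    if debug_pass_enabled:
        # The name is needed solely for this diagnostic line.
        log.append("Executing pass on " + region.get_name_str())
    # The actual pass work would happen here and does not need the name.
```

  With the flag off, `get_name_str` is never called, which mirrors why the Release build stops paying for names it never prints.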
* [ConstantRange] Teach multiply to be cleverer about signed ranges. | James Molloy | 2015-03-06 | 3 | -3/+36
  Multiplication is not dependent on signedness, so just treating all input ranges as unsigned is not incorrect. However it will cause overly pessimistic ranges (such as full-set) when used with signed negative values.
  Teach multiply to try to interpret its inputs as both signed and unsigned, and then to take the most specific (smallest population) as its result.
  llvm-svn: 231483
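  A simplified model shows why trying both interpretations helps. The sketch below is not the ConstantRange implementation; it assumes 8-bit values by default, takes each input as an inclusive pair of bit patterns (lo, hi) with lo <= hi, and returns a wrapped interval as a hypothetical (start, size) pair, where size == 2**bits means the full set. The names `to_signed`, `truncate` and `multiply` are chosen for this illustration only.

```python
def to_signed(pattern, bits):
    """Interpret a bit pattern as a two's-complement signed integer."""
    return pattern - (1 << bits) if pattern >> (bits - 1) else pattern


def truncate(lo, hi, bits):
    """Map the exact integer interval [lo, hi] to (start, size) modulo 2**bits."""
    full = 1 << bits
    size = hi - lo + 1
    if size >= full:
        return (0, full)  # every bit pattern is possible: full set
    return (lo % full, size)


def multiply(a, b, bits=8):
    """Multiply two ranges, keeping the smaller of the two candidate results."""
    # Unsigned view: patterns are non-negative, so the extremes multiply.
    unsigned = truncate(a[0] * b[0], a[1] * b[1], bits)

    # Signed view: take the extremes of the four corner products.
    a_vals = [to_signed(p, bits) for p in range(a[0], a[1] + 1)]
    b_vals = [to_signed(p, bits) for p in range(b[0], b[1] + 1)]
    corners = [x * y
               for x in (min(a_vals), max(a_vals))
               for y in (min(b_vals), max(b_vals))]
    signed = truncate(min(corners), max(corners), bits)

    # Both are valid over-approximations; prefer the smaller population.
    return unsigned if unsigned[1] < signed[1] else signed
```

  For the pattern 0xFF (that is, -1) times the range [2, 3], the unsigned view spans 510 to 765 and collapses to the full set, while the signed view gives the two-element interval starting at pattern 253 (-3).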
* [AsmPrinter][TLOF] Make AArch64 test a bit more flexible | Bruno Cardoso Lopes | 2015-03-06 | 1 | -8/+8
  llvm-svn: 231481
* [AsmPrinter][TLOF] Split tests and move to appropriate directories | Bruno Cardoso Lopes | 2015-03-06 | 4 | -38/+164
  Follow up from r231474 and 231475 to appease buildbots
  llvm-svn: 231480
* [AsmPrinter][TLOF] 32-bit MachO support for replacing GOT equivalents | Bruno Cardoso Lopes | 2015-03-06 | 9 | -22/+177
  Add MachO 32-bit (i.e. arm and x86) support for replacing global GOT equivalent symbol accesses. Unlike 64-bit targets, there's no GOTPCREL relocation, and access through a non_lazy_symbol_pointers section is used instead.

  -- before
    _extgotequiv:
       .long _extfoo

    _delta:
       .long _extgotequiv-_delta

  -- after
    _delta:
       .long L_extfoo$non_lazy_ptr-_delta

       .section __IMPORT,__pointers,non_lazy_symbol_pointers
    L_extfoo$non_lazy_ptr:
       .indirect_symbol _extfoo
       .long 0

  llvm-svn: 231475
* [AsmPrinter][TLOF] ARM64 MachO support for replacing GOT equivalents | Bruno Cardoso Lopes | 2015-03-06 | 7 | -28/+102
  Follow up r230264 and add ARM64 support for replacing global GOT equivalent symbol accesses by references to the GOT entry for the final symbol instead, example:

  -- before
       .globl _foo
    _foo:
       .long 42

       .globl _gotequivalent
    _gotequivalent:
       .quad _foo

       .globl _delta
    _delta:
       .long _gotequivalent-_delta

  -- after
       .globl _foo
    _foo:
       .long 42

       .globl _delta
    Ltmp3:
       .long _foo@GOT-Ltmp3

  llvm-svn: 231474
* CodingStyle: Allow delegating ctors | Benjamin Kramer | 2015-03-06 | 1 | -0/+2
  Delegating constructors seem to work fine with all supported compilers.
  llvm-svn: 231473
* [mips] [IAS] Add missing constraints and improve testing for the .module directive. | Toma Tabacu | 2015-03-06 | 5 | -13/+303
  Summary:
  None of the .set directives can be used before the .module directives. The .set mips0/pop/push were not triggering this constraint.
  Also added testing for all the other implemented directives which are supposed to trigger this constraint.

  Reviewers: dsanders
  Reviewed By: dsanders
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D7140
  llvm-svn: 231465
* Change the way in which error case is being handled. | Daniel Jasper | 2015-03-06 | 1 | -2/+4
  Specifically this:
    * Prevents an "unused" warning in non-assert builds.
    * In that error case return without removing a child loop instead of looping forever.
  llvm-svn: 231459
* Add a new pass "Loop Interchange" | Karthik Bhat | 2015-03-06 | 10 | -0/+2033
  This pass interchanges loops to provide a more cache-friendly memory access.

  For example, given a loop like -
    for(int i=0;i<N;i++)
      for(int j=0;j<N;j++)
        A[j][i] = A[j][i]+B[j][i];

  is interchanged to -
    for(int j=0;j<N;j++)
      for(int i=0;i<N;i++)
        A[j][i] = A[j][i]+B[j][i];

  This pass is currently disabled by default.

  To give a brief introduction it consists of 3 stages-
    LoopInterchangeLegality : Checks the legality of loop interchange based on Dependency matrix.
    LoopInterchangeProfitability: A very basic heuristic has been added to check for profitability. This will evolve over time.
    LoopInterchangeTransform : Which does the actual transform.

  LNT Performance tests show improvement in Polybench/linear-algebra/kernels/mvt and Polybench/linear-algebra/kernels/gemver benchmarks.

  TODO:
    1) Add support for reductions and lcssa phi.
    2) Improve profitability model.
    3) Improve loop selection algorithm to select best loop for interchange. Currently the innermost loop is selected for interchange.
    4) Improve compile time regression found in llvm lnt due to this pass.
    5) Fix issues in Dependency Analysis module.

  A special thanks to Hal for reviewing this code.

  Review: http://reviews.llvm.org/D7499
  llvm-svn: 231458
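  The interchange in the example is legal because every iteration touches a distinct element, so the order of the (i, j) pairs does not affect the result. The Python sketch below mirrors the two loop nests on small nested lists to show that they compute the same matrix; it says nothing about cache behaviour, and the function names are hypothetical.

```python
def add_column_major(A, B, n):
    """Original order: i outer, j inner, so accesses stride across rows."""
    for i in range(n):
        for j in range(n):
            A[j][i] = A[j][i] + B[j][i]
    return A


def add_row_major(A, B, n):
    """Interchanged order: j outer, i inner, so each row is walked in turn."""
    for j in range(n):
        for i in range(n):
            A[j][i] = A[j][i] + B[j][i]
    return A


def make_matrix(n, offset):
    """Build an n x n test matrix with distinct entries."""
    return [[offset + row * n + col for col in range(n)] for row in range(n)]
```

  Running both versions on equal inputs, for example `add_column_major(make_matrix(3, 0), make_matrix(3, 100), 3)` and the `add_row_major` counterpart, yields identical matrices.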