path: root/llvm
Commit message | Author | Age | Files | Lines
...
* [GlobalISel][AArch64] Fold G_GEP into LDR/STR ui addressing mode.Ahmed Bougacha2017-03-273-1/+487
| | | | | | | | | | We're not to the point of supporting the load/store patterns yet (because they extensively use PatFrags). But in the meantime, we can implement some of the simplest addressing modes. llvm-svn: 298863
* [GlobalISel][AArch64] Select store of zero to WZR/XZR.Ahmed Bougacha2017-03-272-0/+67
| | | | | | These occur very frequently, and are quite trivial to catch. llvm-svn: 298862
* [AMDGPU] SISched: Update colorEndsAccordingToDependenciesValery Pykhtin2017-03-271-0/+14
| | | | | | | | Patch by Axel Davy (axel.davy@normalesup.org) Differential revision: https://reviews.llvm.org/D30150 llvm-svn: 298861
* [APInt] Move the >64 bit case for flipAllBits out of line.Craig Topper2017-03-272-5/+12
| | | | | | This is more consistent with what we do for other operations. This shrinks the opt binary on my build by ~72k. llvm-svn: 298858
* [AMDGPU] Fix SI scheduler LiveOut Refcount issueValery Pykhtin2017-03-272-0/+26
| | | | | | | | Patch by Axel Davy (axel.davy@normalesup.org) Differential revision: https://reviews.llvm.org/D30145 llvm-svn: 298857
* [GlobalISel][AArch64] Select CBZ.Ahmed Bougacha2017-03-273-3/+161
| | | | | | | | | | CBZ/CBNZ represent a substantial portion of all conditional branches. Look through G_ICMP to select them. We can't use tablegen yet because the existing patterns match an AArch64ISD node. llvm-svn: 298856
* [GlobalISel] Add a 'getConstantVRegVal' helper.Ahmed Bougacha2017-03-272-12/+24
| | | | | | Use it to compare immediate operands. llvm-svn: 298855
* [GlobalISel][AArch64] Use proper constant types in test. NFC.Ahmed Bougacha2017-03-271-2/+2
| | | | llvm-svn: 298854
* [AMDGPU][MC] Fix for Bug 28207 + LIT testsDmitry Preobrazhensky2017-03-276-18/+226
| | | | | | | | | | Enabled clamp and omod for v_cvt_* opcodes which have src0 of an integer type Reviewers: vpykhtin, arsenm Differential Revision: https://reviews.llvm.org/D31327 llvm-svn: 298852
* [AArch64] Mark mrs of TPIDR_EL0 (thread pointer) as not having side effects.Chad Rosier2017-03-273-2/+72
| | | | | | | | | Among other things, this allows Machine LICM to hoist a costly 'mrs' instruction from within a loop. Differential Revision: http://reviews.llvm.org/D31151 llvm-svn: 298851
* [AMDGPU] Get address space mapping by target triple environmentYaxun Liu2017-03-2739-290/+446
| | | | | | | | | | | | | | | | | | As we introduced target triple environment amdgiz and amdgizcl, the address space values are no longer enums. We have to decide the value by target triple. The basic idea is to use struct AMDGPUAS to represent address space values. For address space values which do not depend on the target triple, use static const members, so that they don't occupy extra memory space and are equivalent to compile time constants. Since the struct is lightweight and cheap, it can be created on the fly at the point of usage. Or it can be added as a member to a pass and created at the beginning of the run* function. Differential Revision: https://reviews.llvm.org/D31284 llvm-svn: 298846
* [InstCombine] Avoid incorrect folding of select into phi nodes when incoming ↵Anna Thomas2017-03-272-1/+43
| | | | | | | | | | | | | | | | | | | | | | element is a vector type Summary: We are incorrectly folding selects into phi nodes when the incoming value of a phi node is a constant vector. This optimization is done in `FoldOpIntoPhi` when the select condition is a phi node with constant incoming values. Without the fix, we are miscompiling (i.e. incorrectly folding the select into the phi node) when the vector contains non-zero elements. This patch fixes the miscompile and we will correctly fold based on the select vector operand (see added test cases). Reviewers: majnemer, sanjoy, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31189 llvm-svn: 298845
* Correct OptionCategoryCompare() in the command line library.Daniel Sanders2017-03-271-2/+3
| | | | | | | | | | | | | | | | | Summary: It should return <0, 0, or >0 for less-than, equal, and greater-than like strcmp() (according to the history, it used to be implemented with strcmp()) but it actually returned 0, or 1 for not-equal and equal. Reviewers: qcolombet Reviewed By: qcolombet Subscribers: qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D30996 llvm-svn: 298844
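The contract described above can be illustrated with a minimal sketch of a three-way, qsort-style comparator. The `Category` struct, `compareCategories`, and `sortCategories` are hypothetical names made up for this illustration; they are not the CommandLine library's actual types or functions.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for an option category (not the real LLVM class).
struct Category {
  const char *Name;
};

// Three-way comparator in the shape std::qsort expects: it returns a
// negative value, zero, or a positive value, exactly like strcmp().
// Returning only 0 or 1 would give the sort an inconsistent ordering.
int compareCategories(const void *LHS, const void *RHS) {
  const Category *A = *static_cast<const Category *const *>(LHS);
  const Category *B = *static_cast<const Category *const *>(RHS);
  return std::strcmp(A->Name, B->Name);
}

// Sort an array of category pointers by name.
void sortCategories(const Category **Cats, std::size_t N) {
  std::qsort(Cats, N, sizeof(const Category *), compareCategories);
}
```

A comparator that only distinguishes "equal" from "not equal" cannot tell a sort which element comes first, which is consistent with the strange category order mentioned in r298843.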
* [tablegen] Use categories on options that only matter to one emitter.Daniel Sanders2017-03-275-12/+25
| | | | | | | | | | | | | | | | Summary: The categories are emitted in a strange order in this patch due to a bug in the CommandLine library. Reviewers: ab Reviewed By: ab Subscribers: ab, llvm-commits Differential Revision: https://reviews.llvm.org/D30995 llvm-svn: 298843
* ADT: Add range helpers for pointer_ and pointee_iteratorJustin Bogner2017-03-272-0/+35
| | | | llvm-svn: 298841
* [X86][AVX2] bugzilla bug 21281 Performance regression in vector interleave ↵Gadi Haber2017-03-273-56/+67
| | | | | | | | | | | | | | | | | | | | | in AVX2 This is a patch for an on-going bugzilla bug 21281 on the generated X86 code for a matrix transpose8x8 subroutine which requires vector interleaving. The generated code in AVX2 is currently non-optimal and requires 60 instructions as opposed to only 40 instructions generated for AVX1. The patch includes a fix for the AVX2 case where vector unpack instructions use fewer operations than the vector blend operations available in AVX2. In this case using vector unpack instructions is more efficient. Reviewers: zvi delena igorb craig.topper guyblank eladcohen m_zuckerman aymanmus RKSimon llvm-svn: 298840
* [TableGen] Make CodeGenMapTable understand the namespace field of an instructionKarl-Johan Karlsson2017-03-271-8/+8
| | | | | | | | | | | | | | | | Do not force the backends to use target name as namespace. Original patch by Mattias Eriksson Reviewers: stoklund, craig.topper Reviewed By: stoklund Subscribers: materi, llvm-commits Differential Revision: https://reviews.llvm.org/D31322 llvm-svn: 298834
* [IR] Implement pairs of non-const and const methods using the const version ↵Craig Topper2017-03-2710-74/+100
| | | | | | | | instead of the non-const version. NFCI This removes a const_cast of the this pointer. llvm-svn: 298831
* [IR] Share implementation for pairs of const and non-const methods using ↵Craig Topper2017-03-274-12/+12
| | | | | | const_cast. NFCI llvm-svn: 298830
* [IR] Share implementation of pairs of const and non-const methods in ↵Craig Topper2017-03-272-63/+75
| | | | | | | | | | | | | | | | | | | BasicBlock using the const version instead of the non-const version Summary: During post-commit review of a previous change I made it was pointed out that const casting 'this' is technically a bad practice. This patch re-implements all of the methods in BasicBlock that do this to use the const BasicBlock version and const_cast the return value instead. I think there are still many other classes that do similar things. I may look at more in the future. Reviewers: dblaikie Reviewed By: dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31377 llvm-svn: 298827
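The pattern described above (implement the const overload, then have the non-const overload call it and `const_cast` the return value rather than `this`) might look roughly like the following sketch. The `Block` class and `getFirstNonZero` are hypothetical and only stand in for the kind of accessor pairs found in BasicBlock.

```cpp
#include <utility>
#include <vector>

// Hypothetical container used for illustration (not llvm::BasicBlock).
class Block {
  std::vector<int> Insts;

public:
  explicit Block(std::vector<int> I) : Insts(std::move(I)) {}

  // The const overload holds the real logic.
  const int *getFirstNonZero() const {
    for (const int &V : Insts)
      if (V != 0)
        return &V;
    return nullptr;
  }

  // The non-const overload forwards to the const one and casts the
  // *result*. Adding const to `this` is always safe; casting const away
  // from the result is fine here because this object is known to be
  // non-const in this overload.
  int *getFirstNonZero() {
    return const_cast<int *>(
        static_cast<const Block *>(this)->getFirstNonZero());
  }
};
```

The design choice is that the only `const_cast` left removes const from a value that originated in a non-const object, instead of stripping const from `this` inside a const member function.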
* [IR] Make Instruction::isAssociative method inline. Add LLVM_READONLY to the ↵Craig Topper2017-03-262-13/+5
| | | | | | static version. llvm-svn: 298826
* [Target] Remove some code probably copy/pasted from another backend.Davide Italiano2017-03-261-4/+0
| | | | llvm-svn: 298825
* [MachineScheduler] Reference the correct header.Davide Italiano2017-03-261-1/+1
| | | | llvm-svn: 298823
* Fix typo in comment; NFCSanjoy Das2017-03-261-1/+1
| | | | llvm-svn: 298819
* Fix signed/unsigned comparison warnings.Simon Pilgrim2017-03-261-1/+1
| | | | llvm-svn: 298813
* [llvm-readobj] Prefer ILT to IAT for reading COFF importsShoaib Meenai2017-03-261-6/+12
| | | | | | | | | | | | | | | | | | | | | | | We're seeing binutils ld produce binaries where the import address table's NameRVA entry is actually a VA instead (i.e. it's already base relocated), which llvm-readobj then chokes on. Both dumpbin and the Windows loader are able to handle these binaries correctly, however, and we can make llvm-readobj handle them correctly too by iterating the import lookup table (which doesn't have a relocated NameRVA) rather than the import address table. The import lookup table and the import address table are supposed to be identical on disk, and prior to r277298 the import lookup table would be used by `llvm-readobj -coff-imports` anyway, so this shouldn't have any functional change (except in the case of our malformed binaries). The import lookup table can apparently be missing when using old Borland linkers, so fall back to the import address table in that case. Resolves PR31766. Differential Revision: https://reviews.llvm.org/D31362 llvm-svn: 298812
* [LoopUnroll] Remap references in peeled iterationSerge Pavlov2017-03-262-4/+66
| | | | | | | | | References in cloned blocks must be remapped prior to dominator calculation. Differential Revision: https://reviews.llvm.org/D31281 llvm-svn: 298811
* [IR] Switch to more normal template parameter names ending in `T`Chandler Carruth2017-03-261-10/+10
| | | | | | | | | instead of `Ty`. The `Ty` suffix is much more commonly used for LLVM `Type` variable names, so this seemed like a particularly confusing collision. llvm-svn: 298808
* Fix signed/unsigned comparison warnings.Simon Pilgrim2017-03-261-2/+2
| | | | llvm-svn: 298807
* [X86][SSE] Add computeKnownBitsForTargetNode support for (V)PSLL/(V)PSRL ↵Simon Pilgrim2017-03-262-2/+26
| | | | | | instructions llvm-svn: 298806
* [X86][AVX512F] Fix reg class for VMOVSSZrr/VMOVSSZrrk and VMOVSDZrr/VMOVSDZrrkSimon Pilgrim2017-03-262-17/+16
| | | | | | | | | | Fixed -verify-machineinstrs errors in fast-isel-select-sse.ll (one of many in PR27481) The VMOVSSZrr/VMOVSSZrrk and VMOVSDZrr/VMOVSDZrrk instructions were assuming both source registers were V128X when the second is actually supposed to be FR32X/FR64X Differential Revision: https://reviews.llvm.org/D31200 llvm-svn: 298805
* Fix MSVC signed/unsigned comparison warnings.Simon Pilgrim2017-03-261-6/+9
| | | | llvm-svn: 298804
* Regenerate testSimon Pilgrim2017-03-261-1/+1
| | | | llvm-svn: 298803
* Regenerate testSimon Pilgrim2017-03-261-7/+7
| | | | | | The CHECK-DAG aren't necessary and get in the way of automated checks llvm-svn: 298802
* Regenerate tests to remove duplicated checksSimon Pilgrim2017-03-261-241/+118
| | | | llvm-svn: 298801
* [GlobalISel][X86] support G_FRAME_INDEX instruction selection.Igor Breger2017-03-267-5/+110
| | | | | | | | | | | | | | | Summary: Support G_FRAME_INDEX instruction selection. Reviewers: zvi, rovka, ab, qcolombet Reviewed By: ab Subscribers: llvm-commits, dberris, kristof.beyls, eladcohen, guyblank Differential Revision: https://reviews.llvm.org/D30980 llvm-svn: 298800
* Split the SimplifyCFG pass into two variants.Joerg Sonnenberger2017-03-2617-39/+114
| | | | | | | | | | | | | | | | | | | | | | | The first variant contains all current transformations except transforming switches into lookup tables. The second variant contains all current transformations. The switch-to-lookup-table conversion results in code that is more difficult to analyze and optimize by other passes. Most importantly, it can inhibit Dead Code Elimination. As such it is often beneficial to only apply this transformation very late. A common example is inlining, which can often result in range restrictions for the switch expression. Changes in execution time according to LNT: SingleSource/Benchmarks/Misc/fp-convert +3.03% MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk -11.20% MultiSource/Benchmarks/Olden/perimeter/perimeter -10.43% and a couple of smaller changes. For perimeter it also results in a 2.6% smaller binary. Differential Revision: https://reviews.llvm.org/D30333 llvm-svn: 298799
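To make the trade-off concrete, here is a source-level sketch of what a switch-to-lookup-table conversion amounts to. Both functions are hypothetical examples written for this note; the actual pass works on LLVM IR, not on C++ source.

```cpp
// Original form: each case is a separate control-flow edge, so later
// passes can still prune cases once the range of X becomes known
// (for example after inlining).
int viaSwitch(unsigned X) {
  switch (X) {
  case 0: return 10;
  case 1: return 20;
  case 2: return 40;
  case 3: return 80;
  default: return 0;
  }
}

// Converted form: a bounds check plus a load from a constant table.
// This is compact, but the individual cases are no longer visible as
// branches that other passes could reason about or delete.
int viaTable(unsigned X) {
  static const int Table[] = {10, 20, 40, 80};
  return X < 4 ? Table[X] : 0;
}
```

The two functions return the same value for every input, which is why running the conversion late loses nothing semantically while leaving earlier passes the more analyzable form.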
* Add check for BSD when setting LIB_NAMES for GNU ldAndrew Wilkins2017-03-261-1/+1
| | | | | | | | Patch by Koop Mast and Alex Arslan! Differential Revision: https://reviews.llvm.org/D28215 llvm-svn: 298798
* [IR] Make SwitchInst::CaseIt almost a normal iterator.Chandler Carruth2017-03-266-50/+58
| | | | | | | | | | | | | | | | | | | | | | | | | This moves it to the iterator facade utilities giving it full random access semantics, etc. It can also now be used with standard algorithms like std::all_of and std::any_of and range adaptors like llvm::reverse. Also make the semantics of iterating match what every other iterator uses and forbid decrementing past the begin iterator. This was used as a hacky way to work around iterator invalidation. However, every instance trying to do this failed to actually avoid touching invalid iterators despite the clear documentation that the removed and all subsequent iterators become invalid including the end iterator. So I've added a return of the next iterator to removeCase and rewritten the loops that were doing this to correctly follow the iterator pattern of either incrementing or removing and assigning fresh values to the iterator and the end. In one case we were trying to go backwards to make this cleaner but it doesn't actually work. I've made that code match the code we use everywhere else to remove cases as we iterate. This changes the order of cases in one test output and I moved that test to CHECK-DAG so it wouldn't care -- the order isn't semantically meaningful anyways. llvm-svn: 298791
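The "either increment or remove and reassign" loop shape described above is the same one used with standard containers. The sketch below shows it with `std::vector::erase`, which is assumed here as an analogy for `removeCase` returning the next iterator; `removeEvens` is a hypothetical helper, and this is not the SwitchInst code itself.

```cpp
#include <vector>

// Remove every even value while iterating. On removal, take the fresh
// iterator returned by erase(); otherwise advance normally. The end
// iterator is re-read on every iteration because erase() invalidates it.
void removeEvens(std::vector<int> &Cases) {
  for (auto I = Cases.begin(); I != Cases.end();) {
    if (*I % 2 == 0)
      I = Cases.erase(I); // returns the iterator after the erased element
    else
      ++I;
  }
}
```

The loop never decrements past the beginning and never touches an iterator that a removal has invalidated, which is the property the old workaround failed to guarantee.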
* [X86] Pull out repeated ScalarValueSizeInBits code. NFCI.Simon Pilgrim2017-03-251-6/+4
| | | | llvm-svn: 298783
* [X86][SSE] Combine (VSRLI (VSRAI X, Y), (NumSignBits-1)) -> (VSRLI X, ↵Simon Pilgrim2017-03-252-2/+9
| | | | | | | | | | (NumSignBits-1)) Part 3 of 3. Differential Revision: https://reviews.llvm.org/D31347 llvm-svn: 298782
* Change the default attributes for llvm.prefetch to inaccessiblemem_or_argmemonlyEric Christopher2017-03-258-25/+65
| | | | | | | | so that we can perform some optimizations across it. Fixes PR32365 llvm-svn: 298781
* [X86][SSE] Added ComputeNumSignBitsForTargetNode support for (V)PSRAISimon Pilgrim2017-03-252-2/+11
| | | | | | | | Part 2 of 3. Differential Revision: https://reviews.llvm.org/D31347 llvm-svn: 298780
* [X86][SSE] Generalised CMP+AND1 combine to ZERO/ALLBITS+MASKSimon Pilgrim2017-03-251-26/+22
| | | | | | | | | | | | Patch to generalize combinePCMPAnd1 (for handling SETCC + ZEXT cases) to work for any input that has zero/all bits set masked with an 'all low bits' mask. Replaced the implicit assumption of shift availability with a call to SupportedVectorShiftWithImm. Part 1 of 3. Differential Revision: https://reviews.llvm.org/D31347 llvm-svn: 298779
* [x86] use PMOVMSK to replace memcmp libcalls for 16-byte equalitySanjay Patel2017-03-255-44/+81
| | | | | | | | | This is the payoff for D31156 - if a target has efficient comparison instructions for vector-sized equality, we can replace memcmp calls with inline code that is both smaller and faster. Differential Revision: https://reviews.llvm.org/D31290 llvm-svn: 298775
* [X86][SSE] Add extra computeNumSignBits test case for D31311.Simon Pilgrim2017-03-251-0/+47
| | | | llvm-svn: 298774
* [InstCombine] Change the interface of SimplifyDemandedBits so that it takes ↵Craig Topper2017-03-253-46/+46
| | | | | | the instruction and operand instead of the Use. The first thing it did was get the User for the Use to get the instruction back. This requires looking through the Uses for the User using the waymarking walk. That's pretty fast, but it's probably still better to just pass the Instruction we already had. llvm-svn: 298772
* [AArch64] Refine Falkor Machine Model - Part1Balaram Makam2017-03-253-88/+422
| | | | llvm-svn: 298768
* [NewGVN] Adjust NDEBUG markers.Davide Italiano2017-03-251-2/+2
| | | | | | | This avoids 'used but not defined' warnings in Release builds with GCC. llvm-svn: 298760
* [AMDGPU] Switch data layout by triple environment amdgizYaxun Liu2017-03-253-1/+28
| | | | | | | | | | | | Switch data layout by target triple environment amdgiz and amdgizcl, indicating use of an address space mapping in which generic address space is 0. amdgiz is for non-OpenCL environment where generic address space is 0. amdgizcl is for OpenCL environment where generic address space is 0. Differential Revision: https://reviews.llvm.org/D31211 llvm-svn: 298758