path: root/llvm
Commit message | Author | Age | Files | Lines
* [X86][SSE] Create matchVectorShuffleWithUNPCK helper function. | Simon Pilgrim | 2017-02-13 | 1 | -46/+42
  Currently only used by target shuffle combining - will use it for lowering as well in a future patch.
  llvm-svn: 294943
* [X86] Improve readability of test/CodeGen/X86/lzcnt-zext-cmp.ll by adding a common check prefix ALL. NFC. | Pierre Gousseau | 2017-02-13 | 1 | -144/+106
  llvm-svn: 294938
* [X86][AVX512] Fix operand classes for some AVX512 instructions to keep consistency between VEX/EVEX versions of the same instruction. | Ayman Musa | 2017-02-13 | 1 | -17/+20
  Differential Revision: https://reviews.llvm.org/D29873
  llvm-svn: 294937
* Compile time decreasing in the case we're dealing with Machine Combiner. | Andrew V. Tischenko | 2017-02-13 | 1 | -15/+27
  Before this patch compile time was about 21s (see below). After this patch it is less than 2s (see below).

  Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz

  DAGCombiner - trunk
  time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
  real 0m1.685s

  DAGCombiner + Speed patch
  time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
  real 0m1.655s

  MachineCombiner w/o Speed patch
  time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
  real 0m21.614s

  MachineCombiner + Speed patch
  time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
  real 0m1.593s

  The test spill_fdiv.ll is attached to D29627.
  D29627 should be closed.
  llvm-svn: 294936
* [SLP] Fix for PR31690: Allow using of extra values in horizontal reductions. | Alexey Bataev | 2017-02-13 | 2 | -322/+408
  Currently, LLVM supports vectorization of horizontal reduction instructions with initial value set to 0. Patch supports vectorization of reduction with non-zero initial values. Also, it supports a vectorization of instructions with some extra arguments, like:

  ```
  float f(float x[], int a, int b) {
    float p = a % b;
    p += x[0] + 3;
    for (int i = 1; i < 32; i++)
      p += x[i];
    return p;
  }
  ```

  Patch allows vectorization of this kind of horizontal reductions.
  Differential Revision: https://reviews.llvm.org/D29727
  llvm-svn: 294934
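The reduction in the commit message above can be illustrated by hand. The sketch below is not the code the SLP vectorizer emits; it only shows the shape of the computation: the 32 array elements are summed in four lanes, and the "extra" values (`a % b` and the constant 3) are added once after the lanes are combined. The function names `reduce_scalar` and `reduce_lanes` and the lane count of 4 are assumptions made for illustration.

```c
#include <assert.h>

/* Scalar form, mirroring the function quoted in the commit message. */
float reduce_scalar(const float x[32], int a, int b) {
    assert(b != 0); /* a % b is undefined for b == 0 */
    float p = (float)(a % b);
    p += x[0] + 3;
    for (int i = 1; i < 32; i++)
        p += x[i];
    return p;
}

/* Hand-written 4-lane version (hypothetical, for illustration only):
 * the array is reduced lane by lane, and the extra values that are not
 * part of the array are added after the horizontal combine. */
float reduce_lanes(const float x[32], int a, int b) {
    assert(b != 0);
    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (int i = 0; i < 32; i += 4)
        for (int j = 0; j < 4; j++)
            lane[j] += x[i + j];
    /* Horizontal step: combine the four partial sums. */
    float sum = (lane[0] + lane[1]) + (lane[2] + lane[3]);
    /* Extra values: the non-zero initial value and the constant. */
    return sum + (float)(a % b) + 3.0f;
}
```

The two functions add the same terms in a different order. For floating-point inputs that reordering can change rounding in general, so the results are only guaranteed to match when every intermediate sum is exactly representable, for example small integer-valued inputs.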
* [DAGCombiner] Teach DAG combine that inserting an extract_subvector result into the same location of an undef vector can just use the original input to the extract. | Craig Topper | 2017-02-13 | 3 | -24/+16
  llvm-svn: 294932
* [X86] Genericize the handling of INSERT_SUBVECTOR from an EXTRACT_SUBVECTOR to support 512-bit vectors with 128-bit or 256-bit subvectors. | Craig Topper | 2017-02-13 | 6 | -33/+28
  We now detect that both the extract and insert indices are non-zero and convert to a shuffle. This will be lowered as a blend for 256-bit vectors or as a vshuf operation for 512-bit vectors.
  llvm-svn: 294931
* [DAGCombiner] Remove the half vector width check for the combine of EXTRACT_SUBVECTOR from an INSERT_SUBVECTOR. | Craig Topper | 2017-02-12 | 2 | -44/+43
  This gives more parallelism opportunities for AVX-512 when dealing with 128-bit extracts from 512-bit vectors.
  llvm-svn: 294930
* [X86] Don't let LowerEXTRACT_SUBVECTOR call getNode for EXTRACT_SUBVECTOR. | Craig Topper | 2017-02-12 | 1 | -5/+7
  This results in the simplifications inside of getNode running while we're legalizing nodes popped off the worklist during the final DAG combine. This basically makes a DAG combine like operation occur during this legalize step, but we don't handle something quite the same way. I think we don't recursively add the removed nodes to the DAG combiner worklist.
  llvm-svn: 294929
* NewGVN: Update a number of xfailed tests to either be correct or note why they fail. | Daniel Berlin | 2017-02-12 | 7 | -33/+39
  llvm-svn: 294928
* NewGVN: We really pass TBAA if we enable DCE and fix the test. | Daniel Berlin | 2017-02-12 | 1 | -3/+5
  Note that GVN eliminates no-use readonly/readnone calls, even if they are not marked nounwind. NewGVN only eliminates them if they are marked nounwind, and thus, trivially dead.
  llvm-svn: 294927
* NewGVN: Reverse order of congruence class elimination to maximize trivial deadness | Daniel Berlin | 2017-02-12 | 1 | -2/+2
  llvm-svn: 294926
* NewGVN: Use shouldSwapOperands in one more place | Daniel Berlin | 2017-02-12 | 1 | -1/+1
  llvm-svn: 294925
* [TargetLowering] fix SETCC SETLT folding with FP types | Sanjay Patel | 2017-02-12 | 2 | -9/+37
  The bug was introduced with:
  https://reviews.llvm.org/rL294863

  ...and manifests as a selection failure in x86, but that's actually another bug. This fix prevents wrong codegen with -0.0, but in the more common case when we have NSZ and NNAN (-ffast-math), we should still be able to fold this setcc/compare.
  llvm-svn: 294924
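The commit message above does not spell out the fold involved, so the following is only a sketch of why -0.0 is the awkward case for any "less than zero" fold that reasons about the sign bit. On integers, `x < 0` and "sign bit set" are the same predicate; on floating-point values they disagree for -0.0, which has its sign bit set but does not compare less than zero. The helper names are hypothetical.

```c
#include <assert.h>
#include <math.h>

/* Ordered less-than compare against zero, as an FP setlt would compute. */
int lt_zero(double x) { return x < 0.0; }

/* Sign-bit test: equivalent to `x < 0` for integers, but not for FP. */
int sign_bit_set(double x) { return signbit(x) != 0; }

/* Returns 1 when replacing the compare by a sign-bit test keeps the result. */
int sign_fold_matches(double x) { return lt_zero(x) == sign_bit_set(x); }

/* Self-check of the three cases discussed here. */
void check_cases(void) {
    assert(sign_fold_matches(-1.0));  /* negative: both predicates true   */
    assert(sign_fold_matches(2.5));   /* positive: both predicates false  */
    assert(!sign_fold_matches(-0.0)); /* -0.0: sign bit set, but not < 0  */
}
```

When signed zeros may be ignored (the NSZ case the commit message mentions), the -0.0 disagreement no longer matters, which is consistent with the note that the fold should still be possible under -ffast-math.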
* Revert accidental commit titled "testing" | Daniel Berlin | 2017-02-12 | 1 | -1/+1
  This reverts commit r294919
  llvm-svn: 294923
* NewGVN: Apply the fast math flags fix in r267113 to NewGVN as well. | Daniel Berlin | 2017-02-12 | 2 | -24/+26
  llvm-svn: 294922
* PredicateInfo: Handle critical edges | Daniel Berlin | 2017-02-12 | 6 | -107/+464
  Summary: This adds support for placing predicateinfo such that it affects critical edges. This fixes the issues mentioned by Nuno on the mailing list.
  Depends on D29519
  Reviewers: davide, nlopes
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D29606
  llvm-svn: 294921
* NewGVN: Fix missed call that should be to shouldSwapOperands | Daniel Berlin | 2017-02-12 | 1 | -1/+0
  llvm-svn: 294920
* testing | Daniel Berlin | 2017-02-12 | 1 | -1/+2
  llvm-svn: 294919
* [X86] Fix typo in function name. NFCI. | Simon Pilgrim | 2017-02-12 | 1 | -2/+2
  convertBitVectorToUnsiged -> convertBitVectorToUnsigned
  llvm-svn: 294914
* llvm-readobj: process FreeBSD core notes | Saleem Abdulrasool | 2017-02-12 | 2 | -0/+43
  Core files on FreeBSD have additional notes to capture state. Process those notes when dumping the notes.
  llvm-svn: 294909
* [AVX-512] Add various EVEX move instructions to load folding tables using the VEX equivalents as a guide. | Craig Topper | 2017-02-12 | 1 | -4/+10
  llvm-svn: 294908
* [AVX-512] Add VMOV64toSDZrm CodeGenOnly instruction based on the same instruction from AVX/SSE. | Craig Topper | 2017-02-12 | 1 | -0/+4
  I can't prove that we can select this instruction or the AVX/SSE version, but I'm adding it for consistency for now so I can continue matching the load folding tables.
  llvm-svn: 294907
* [X86] Fix a couple instruction names to use 'mr' instead of 'rm' to indicate they are stores. | Craig Topper | 2017-02-12 | 1 | -2/+2
  AVX-512 version was already named with 'mr'.
  llvm-svn: 294906
* [AVX-512] Add VPEXTRD/Q to load folding tables. | Craig Topper | 2017-02-12 | 2 | -0/+22
  llvm-svn: 294905
* [X86][SSE] Update argument names to match function name. NFCI. | Simon Pilgrim | 2017-02-12 | 1 | -12/+13
  The target shuffle match function arguments were using the term 'Ops' but the function names referred to them as 'Inputs' - use 'Inputs' consistently.
  llvm-svn: 294900
* [InstCombine] fold icmp sgt/slt (add nsw X, C2), C --> icmp sgt/slt X, (C - C2) | Sanjay Patel | 2017-02-12 | 2 | -21/+25
  I found one special case of this transform for 'slt 0', so I removed that and added the general transform.

  Alive code to check correctness:

    Name: slt_no_overflow
    Pre: WillNotOverflowSignedSub(C1, C2)
    %a = add nsw i8 %x, C2
    %b = icmp slt %a, C1
    =>
    %b = icmp slt %x, C1 - C2

    Name: sgt_no_overflow
    Pre: WillNotOverflowSignedSub(C1, C2)
    %a = add nsw i8 %x, C2
    %b = icmp sgt %a, C1
    =>
    %b = icmp sgt %x, C1 - C2

  http://rise4fun.com/Alive/MH

  Differential Revision: https://reviews.llvm.org/D29774
  llvm-svn: 294898
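The Alive proofs above can be mirrored with a brute-force check over all 8-bit values. This is a sketch written for this log, not code from the patch: it skips inputs where `x + C2` overflows i8 (where `add nsw` would yield poison) and inputs where `C1 - C2` overflows (the `WillNotOverflowSignedSub` precondition), then compares both sides of the fold. The names `fits_i8` and `count_fold_mismatches` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if v is representable as a signed 8-bit integer. */
static int fits_i8(int v) { return v >= INT8_MIN && v <= INT8_MAX; }

/* Counts inputs where the original compare and the folded compare
 * disagree, over every i8 triple (x, C1, C2) allowed by the preconditions. */
long count_fold_mismatches(void) {
    long mismatches = 0;
    for (int x = INT8_MIN; x <= INT8_MAX; x++) {
        for (int c2 = INT8_MIN; c2 <= INT8_MAX; c2++) {
            if (!fits_i8(x + c2))
                continue; /* add nsw would be poison */
            for (int c1 = INT8_MIN; c1 <= INT8_MAX; c1++) {
                if (!fits_i8(c1 - c2))
                    continue; /* Pre: WillNotOverflowSignedSub(C1, C2) */
                int8_t a = (int8_t)(x + c2);  /* %a = add nsw i8 %x, C2 */
                int8_t k = (int8_t)(c1 - c2); /* folded constant C1 - C2 */
                if ((a < c1) != (x < k)) mismatches++; /* slt case */
                if ((a > c1) != (x > k)) mismatches++; /* sgt case */
            }
        }
    }
    return mismatches;
}

/* Usage: the fold holds for every admissible input. */
void check_fold(void) { assert(count_fold_mismatches() == 0); }
```

The check only covers i8, the same width used in the Alive snippets; it does not by itself prove the transform for wider types.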
* [ValueTracking] use nonnull argument attribute to eliminate null checks | Sanjay Patel | 2017-02-12 | 4 | -16/+73
  Enhancing value tracking's analysis of null-ness was suggested in D27855, so here's a first attempt at that.

  This is part of solving:
  https://llvm.org/bugs/show_bug.cgi?id=28430

  Differential Revision: https://reviews.llvm.org/D28204
  llvm-svn: 294897
* [X86][AVX2] Add support for combining target shuffles to VPMOVZX | Simon Pilgrim | 2017-02-12 | 2 | -10/+13
  Initial 256-bit vector support - 512-bit support requires extra checks for AVX512BW support (PMOVZXBW) that will be handled in a future patch.
  llvm-svn: 294896
* AMDGPU::expandMemIntrinsicUses(): Fix an uninitialized variable. | NAKAMURA Takumi | 2017-02-12 | 1 | -1/+1
  This function returned true or undef.
  llvm-svn: 294895
* [LV/LoopAccess] Check statically if an unknown dependence distance can be proven larger than the loop-count | Dorit Nuzman | 2017-02-12 | 5 | -14/+285
  This fixes PR31098: Try to resolve statically data-dependences whose compile-time-unknown distance can be proven larger than the loop-count, instead of resorting to runtime dependence checking (which is not always possible).

  For vectorization it is sufficient to prove that the dependence distance is >= VF; but in some cases we can prune unknown dependence distances early, even before selecting the VF, and without a runtime test, by comparing the distance against the loop iteration count. Since the vectorized code will be executed only if LoopCount >= VF, proving distance >= LoopCount also guarantees that distance >= VF. This check is also equivalent to the Strong SIV Test.

  Reviewers: mkuper, anemet, sanjoy
  Differential Revision: https://reviews.llvm.org/D28044
  llvm-svn: 294892
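The argument in the commit message above can be tried on a small loop. The sketch below is an illustration under assumptions, not LoopAccess code: the loop `a[i + d] = a[i] + 1` has dependence distance `d`, `run_scalar` executes it in source order, and `run_chunked` imitates a vectorized loop by loading `VF` elements before storing any of them. When `d >= n` every store lands beyond the last element read, so both orders agree. All names and the choice `VF = 4` are hypothetical.

```c
#include <assert.h>
#include <string.h>

enum { VF = 4, N_MAX = 64 };

/* The static test described above: distance >= loop count. */
int distance_covers_loop(long distance, long trip_count) {
    return distance >= trip_count;
}

/* Source-order execution of: a[i + d] = a[i] + 1 */
void run_scalar(int *a, int n, int d) {
    for (int i = 0; i < n; i++)
        a[i + d] = a[i] + 1;
}

/* Imitation of a vectorized loop: load VF values, then store VF values. */
void run_chunked(int *a, int n, int d) {
    int i = 0;
    for (; i + VF <= n; i += VF) {
        int tmp[VF];
        for (int j = 0; j < VF; j++) tmp[j] = a[i + j] + 1;
        for (int j = 0; j < VF; j++) a[i + d + j] = tmp[j];
    }
    for (; i < n; i++) /* scalar remainder */
        a[i + d] = a[i] + 1;
}

/* Returns 1 when both execution orders leave identical array contents. */
int orders_agree(int n, int d) {
    int s[N_MAX], v[N_MAX];
    assert(n >= 0 && d >= 0 && n + d <= N_MAX);
    for (int i = 0; i < N_MAX; i++) s[i] = v[i] = i * 3;
    run_scalar(s, n, d);
    run_chunked(v, n, d);
    return memcmp(s, v, sizeof s) == 0;
}
```

With `n = 8` and `d = 8` the static test passes and the two orders agree; with `d = 1` the test fails and the chunked order reads stale values, so the results differ.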
* AVX-512: Fixed DWARF register numbers for XMM16-31 | Elena Demikhovsky | 2017-02-12 | 1 | -16/+16
  The reference is here: https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
  llvm-svn: 294890
* [LTO] Remove useless redirection from test. NFCI. | Davide Italiano | 2017-02-12 | 1 | -1/+1
  llvm-svn: 294889
* [PM] Add devirtualization-based iteration utility into the new PM's default pipeline. | Chandler Carruth | 2017-02-12 | 2 | -3/+26
  A clang with this patch built with ASan and asserts can build all of the test-suite as well, so it seems to not uncover any latent problems.
  Differential Revision: https://reviews.llvm.org/D29853
  llvm-svn: 294888
* [PM] Enable GlobalsAA in the new PM's pipeline by default. | Chandler Carruth | 2017-02-12 | 2 | -15/+10
  All the invalidation issues and bugs in this seem to be fixed; it has survived a full build of the test suite plus SPEC with asserts and ASan enabled on the Clang binary used.
  Differential Revision: https://reviews.llvm.org/D29815
  llvm-svn: 294887
* [lib/LTO] Add support for hotness optremarks in the new API. | Davide Italiano | 2017-02-12 | 2 | -0/+43
  llvm-svn: 294885
* [LTO] Simplify this test quite a bit, @func2 is unused/unneeded. | Davide Italiano | 2017-02-12 | 1 | -42/+0
  llvm-svn: 294884
* [llvm-lto2] Fix typo in error message. | Davide Italiano | 2017-02-12 | 1 | -1/+1
  llvm-svn: 294883
* [lib/LTO] Initial support for optimization remarks in the new API. | Davide Italiano | 2017-02-12 | 4 | -0/+52
  llvm-svn: 294882
* Kaleidoscope-Ch7: Add TransformUtils for llvm::createPromoteMemoryToRegisterPass() added in r294870. | NAKAMURA Takumi | 2017-02-12 | 1 | -0/+1
  llvm-svn: 294881
* [X86] Update test case I missed in r294876. | Craig Topper | 2017-02-11 | 1 | -40/+39
  llvm-svn: 294878
* [X86] Move code for using blendi for insert_subvector out to an isel pattern. | Craig Topper | 2017-02-11 | 4 | -69/+91
  This gives the DAG combiner more opportunity to optimize without needing to dig through the blend.
  llvm-svn: 294876
* [DAGCombiner] Make the combine of INSERT_SUBVECTOR into a CONCAT_VECTOR more generic to support larger concats. | Craig Topper | 2017-02-11 | 1 | -16/+9
  llvm-svn: 294875
* [X86][SSE] Use VSEXT/VZEXT constant folding for SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG | Simon Pilgrim | 2017-02-11 | 2 | -3/+7
  Preparatory step for PR31712
  llvm-svn: 294874
* [X86][SSE] Improve VSEXT/VZEXT constant folding. | Simon Pilgrim | 2017-02-11 | 8 | -48/+43
  Generalize VSEXT/VZEXT constant folding to work with any target constant bits source, not just BUILD_VECTOR.
  llvm-svn: 294873
* Update Kaleidoscope tutorial and improve Windows support | Mehdi Amini | 2017-02-11 | 16 | -196/+327
  Many quoted code blocks were not in sync with the actual toy.cpp files. Improve tutorial text slightly in several places. Added some step descriptions crucial to avoid crashes (like InitializeNativeTarget* calls). Solve/workaround problems with Windows (JIT'ed method not found, using custom and standard library functions from host process).

  Patch by: Moritz Kroll <moritz.kroll@gmx.de>
  Differential Revision: https://reviews.llvm.org/D29864
  llvm-svn: 294870
* Fix atomic-minmax-i6432.ll. | Amaury Sechet | 2017-02-11 | 1 | -2/+0
  llvm-svn: 294867
* Regen expected tests result. NFC | Amaury Sechet | 2017-02-11 | 7 | -319/+722
  llvm-svn: 294866
* Correcting several sphinx errors; should fix the LLVM documentation build. | Aaron Ballman | 2017-02-11 | 1 | -6/+8
  llvm-svn: 294865
* [X86][SSE] Add early-out when trying to match blend shuffle. NFCI. | Simon Pilgrim | 2017-02-11 | 1 | -3/+4
  llvm-svn: 294864