bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[X86][SSE] Create matchVectorShuffleWithUNPCK helper function.	Simon Pilgrim	2017-02-13	1	-46/+42
	Currently only used by target shuffle combining - will use it for lowering as well in a future patch. llvm-svn: 294943
*	[X86] Improve readability of test/CodeGen/X86/lzcnt-zext-cmp.ll by adding a ↵	Pierre Gousseau	2017-02-13	1	-144/+106
	common check prefix ALL. NFC. llvm-svn: 294938
*	[X86][AVX512] Fix operand classes for some AVX512 instructions to keep ↵	Ayman Musa	2017-02-13	1	-17/+20
	consistency between VEX/EVEX versions of the same instruction. Differential Revision: https://reviews.llvm.org/D29873 llvm-svn: 294937
*	Compile time decreasing in the case we're dealing with Machine Combiner.	Andrew V. Tischenko	2017-02-13	1	-15/+27
	Before this patch compile time was about 21s (see below). After this patch we have less than 2s (see below).
	Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz
	DAGCombiner - trunk
	time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
	real 0m1.685s
	DAGCombiner + Speed patch
	time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
	real 0m1.655s
	MachineCombiner w/o Speed patch
	time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
	real 0m21.614s
	MachineCombiner + Speed patch
	time ./llc spill_fdiv.ll -o /dev/null -enable-unsafe-fp-math
	real 0m1.593s
	The test spill_fdiv.ll is attached to D29627. D29627 should be closed. llvm-svn: 294936
*	[SLP] Fix for PR31690: Allow using of extra values in horizontal	Alexey Bataev	2017-02-13	2	-322/+408
	reductions. Currently, LLVM supports vectorization of horizontal reduction instructions with initial value set to 0. Patch supports vectorization of reduction with non-zero initial values. Also, it supports a vectorization of instructions with some extra arguments, like:
	```
	float f(float x[], int a, int b) {
	  float p = a % b;
	  p += x[0] + 3;
	  for (int i = 1; i < 32; i++)
	    p += x[i];
	  return p;
	}
	```
	Patch allows vectorization of this kind of horizontal reductions. Differential Revision: https://reviews.llvm.org/D29727 llvm-svn: 294934
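The reduction described above can be sketched outside LLVM. This is only a model, not the SLP vectorizer's actual output: it treats the 32 array elements as the vectorizable part and folds the extra values (`a % b` and the constant `3`) in after the horizontal reduction. The names `f_scalar` and `f_reduced` and the lane width of 8 are illustrative assumptions. Inputs are chosen to be exactly representable so that reassociating the floating-point adds does not change the result; in general that reassociation needs fast-math style permission.

```python
import math


def f_scalar(x, a, b):
    """Direct transcription of the C function from the commit message."""
    p = math.fmod(a, b)  # fmod truncates like C's integer % for these inputs
    p += x[0] + 3
    for i in range(1, 32):
        p += x[i]
    return p


def f_reduced(x, a, b, width=8):
    """Model of a vectorized horizontal reduction plus extra scalar values."""
    lanes = [0.0] * width
    for start in range(0, 32, width):
        for k in range(width):  # one vector add per chunk
            lanes[k] += x[start + k]
    total = sum(lanes)  # horizontal reduction across the lanes
    return total + math.fmod(a, b) + 3  # extra values added afterwards


values = [float(i) for i in range(32)]
print(f_scalar(values, 17, 5), f_reduced(values, 17, 5))
```

Both calls return the same value for this input because every partial sum is a small integer.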
*	[DAGCombiner] Teach DAG combine that inserting an extract_subvector result ↵	Craig Topper	2017-02-13	3	-24/+16
	into the same location of an undef vector can just use the original input to the extract. llvm-svn: 294932
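A small model of why this combine is sound, assuming the extract source and the insert result have the same vector type. Vectors are plain Python lists and `UNDEF` is represented by `None`; the helper names mirror the ISD node names but are otherwise hypothetical and not LLVM API.

```python
UNDEF = None


def extract_subvector(vec, index, width):
    """Return `width` consecutive elements of `vec` starting at `index`."""
    return vec[index:index + width]


def insert_subvector(base, sub, index):
    """Return a copy of `base` with `sub` written at `index`."""
    out = list(base)
    out[index:index + len(sub)] = sub
    return out


def refines(original, replacement):
    """True if `replacement` matches `original` wherever it is defined."""
    return len(original) == len(replacement) and all(
        o is UNDEF or o == r for o, r in zip(original, replacement)
    )


x = list(range(8))
combined = insert_subvector([UNDEF] * 8, extract_subvector(x, 4, 4), 4)
print(combined, refines(combined, x))
```

Every defined lane of the insert result already equals the corresponding lane of `x`, and the undef lanes may take any value, so `x` itself is a valid replacement.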
*	[X86] Genericize the handling of INSERT_SUBVECTOR from an EXTRACT_SUBVECTOR ↵	Craig Topper	2017-02-13	6	-33/+28
	to support 512-bit vectors with 128-bit or 256-bit subvectors. We now detect that both the extract and insert indices are non-zero and convert to a shuffle. This will be lowered as a blend for 256-bit vectors or as vshuf operations for 512-bit vectors. llvm-svn: 294931
*	[DAGCombiner] Remove the half vector width check for the combine of ↵	Craig Topper	2017-02-12	2	-44/+43
	EXTRACT_SUBVECTOR from an INSERT_SUBVECTOR. This gives more parallelism opportunities for AVX-512 when dealing with 128-bit extracts from 512-bit vectors. llvm-svn: 294930
*	[X86] Don't let LowerEXTRACT_SUBVECTOR call getNode for EXTRACT_SUBVECTOR.	Craig Topper	2017-02-12	1	-5/+7
	This results in the simplifications inside of getNode running while we're legalizing nodes popped off the worklist during the final DAG combine. This basically makes a DAG combine like operation occur during this legalize step, but we don't handle something quite the same way. I think we don't recursively add the removed nodes to the DAG combiner worklist. llvm-svn: 294929
*	NewGVN: Update a number of xfailed tests to either be correct or note	Daniel Berlin	2017-02-12	7	-33/+39
	why they fail. llvm-svn: 294928
*	NewGVN: We really pass TBAA if we enable DCE and fix the test. Note that GVN ↵	Daniel Berlin	2017-02-12	1	-3/+5
	eliminates no-use readonly/readnone calls, even if they are not marked nounwind. NewGVN only eliminates them if they are marked nounwind, and thus, trivially dead. llvm-svn: 294927
*	NewGVN: Reverse order of congruence class elimination to maximize trivial ↵	Daniel Berlin	2017-02-12	1	-2/+2
	deadness llvm-svn: 294926
*	NewGVN: Use shouldSwapOperands in one more place	Daniel Berlin	2017-02-12	1	-1/+1
	llvm-svn: 294925
*	[TargetLowering] fix SETCC SETLT folding with FP types	Sanjay Patel	2017-02-12	2	-9/+37
	The bug was introduced with: https://reviews.llvm.org/rL294863
	...and manifests as a selection failure in x86, but that's actually another bug. This fix prevents wrong codegen with -0.0, but in the more common case when we have NSZ and NNAN (-ffast-math), we should still be able to fold this setcc/compare. llvm-svn: 294924
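The -0.0 hazard can be illustrated in isolation. The sketch below assumes the problematic fold treated a floating-point `x < 0.0` as a pure sign-bit test; the commit message does not spell that out, so read this as an illustration of the failure class rather than a description of the LLVM code. The function names are hypothetical.

```python
import math


def compare_lt_zero(x):
    """IEEE-754 ordered less-than against zero, as SETLT would compute it."""
    return x < 0.0


def sign_bit_set(x):
    """Sign-bit test: copysign transfers only the sign bit of x."""
    return math.copysign(1.0, x) < 0.0


def disagreements(values):
    """Values for which the two predicates give different answers."""
    return [v for v in values if compare_lt_zero(v) != sign_bit_set(v)]


print(disagreements([-1.0, -0.0, 0.0, 1.0]))
```

Negative zero has its sign bit set but does not compare less than zero, so the two predicates diverge there. With no-signed-zeros and no-NaNs assumptions that input no longer matters, which is consistent with the commit's note that the fold should remain possible under -ffast-math.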
*	Revert accidental commit titled "testing"	Daniel Berlin	2017-02-12	1	-1/+1
	This reverts commit r294919 llvm-svn: 294923
*	NewGVN: Apply the fast math flags fix in r267113 to NewGVN as well.	Daniel Berlin	2017-02-12	2	-24/+26
	llvm-svn: 294922
*	PredicateInfo: Handle critical edges	Daniel Berlin	2017-02-12	6	-107/+464
	Summary: This adds support for placing predicateinfo such that it affects critical edges. This fixes the issues mentioned by Nuno on the mailing list. Depends on D29519
	Reviewers: davide, nlopes
	Subscribers: llvm-commits
	Differential Revision: https://reviews.llvm.org/D29606 llvm-svn: 294921
*	NewGVN: Fix missed call that should be to shouldSwapOperands	Daniel Berlin	2017-02-12	1	-1/+0
	llvm-svn: 294920
*	testing	Daniel Berlin	2017-02-12	1	-1/+2
	llvm-svn: 294919
*	[X86] Fix typo in function name. NFCI.	Simon Pilgrim	2017-02-12	1	-2/+2
	convertBitVectorToUnsiged - convertBitVectorToUnsigned llvm-svn: 294914
*	llvm-readobj: process FreeBSD core notes	Saleem Abdulrasool	2017-02-12	2	-0/+43
	Core files on FreeBSD have additional notes to capture state. Process those notes when dumping the notes. llvm-svn: 294909
*	[AVX-512] Add various EVEX move instructions to load folding tables using ↵	Craig Topper	2017-02-12	1	-4/+10
	the VEX equivalents as a guide. llvm-svn: 294908
*	[AVX-512] Add VMOV64toSDZrm CodeGenOnly instruction based on the same ↵	Craig Topper	2017-02-12	1	-0/+4
	instruction from AVX/SSE. I can't prove that we can select this instruction or the AVX/SSE version, but I'm adding it for consistency for now so I can continue matching the load folding tables. llvm-svn: 294907
*	[X86] Fix a couple instruction names to use 'mr' instead of 'rm' to indicate ↵	Craig Topper	2017-02-12	1	-2/+2
	they are stores. AVX-512 version was already named with 'mr'. llvm-svn: 294906
*	[AVX-512] Add VPEXTRD/Q to load folding tables.	Craig Topper	2017-02-12	2	-0/+22
	llvm-svn: 294905
*	[X86][SSE] Update argument names to match function name. NFCI.	Simon Pilgrim	2017-02-12	1	-12/+13
	The target shuffle match function arguments were using the term 'Ops' but the function names referred to them as 'Inputs' - use 'Inputs' consistently. llvm-svn: 294900
*	[InstCombine] fold icmp sgt/slt (add nsw X, C2), C --> icmp sgt/slt X, (C - C2)	Sanjay Patel	2017-02-12	2	-21/+25
	I found one special case of this transform for 'slt 0', so I removed that and added the general transform.
	Alive code to check correctness:
	Name: slt_no_overflow
	Pre: WillNotOverflowSignedSub(C1, C2)
	%a = add nsw i8 %x, C2
	%b = icmp slt %a, C1
	=>
	%b = icmp slt %x, C1 - C2
	Name: sgt_no_overflow
	Pre: WillNotOverflowSignedSub(C1, C2)
	%a = add nsw i8 %x, C2
	%b = icmp sgt %a, C1
	=>
	%b = icmp sgt %x, C1 - C2
	http://rise4fun.com/Alive/MH
	Differential Revision: https://reviews.llvm.org/D29774 llvm-svn: 294898
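The same property can be checked by brute force outside Alive. This sketch enumerates every `(x, C1, C2)` triple at a small bit width and counts the cases where the original and folded compares disagree. `wrap_signed` and `count_mismatches` are hypothetical helper names, and the default width used below is 4 bits to keep the search small; the Alive proof in the commit uses i8.

```python
def wrap_signed(value, bits):
    """Wrap an integer to a two's-complement value of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


def count_mismatches(bits, require_nsw=True, require_no_sub_overflow=True):
    """Count (x, C1, C2) triples where the folded compare disagrees."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    mismatches = 0
    for c1 in range(lo, hi + 1):
        for c2 in range(lo, hi + 1):
            # Precondition from the Alive script: C1 - C2 must not overflow.
            if require_no_sub_overflow and not lo <= c1 - c2 <= hi:
                continue
            folded_c = wrap_signed(c1 - c2, bits)
            for x in range(lo, hi + 1):
                # 'add nsw' yields poison on signed overflow; skip those x.
                if require_nsw and not lo <= x + c2 <= hi:
                    continue
                a = wrap_signed(x + c2, bits)
                if (a < c1) != (x < folded_c) or (a > c1) != (x > folded_c):
                    mismatches += 1
    return mismatches


print(count_mismatches(4))
print(count_mismatches(4, require_nsw=False))
print(count_mismatches(4, require_no_sub_overflow=False))
```

With both conditions enforced no mismatch is found, while dropping either the `nsw` assumption or the subtraction-overflow precondition produces counterexamples, which is why the fold requires both.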
*	[ValueTracking] use nonnull argument attribute to eliminate null checks	Sanjay Patel	2017-02-12	4	-16/+73
	Enhancing value tracking's analysis of null-ness was suggested in D27855, so here's a first attempt at that. This is part of solving: https://llvm.org/bugs/show_bug.cgi?id=28430 Differential Revision: https://reviews.llvm.org/D28204 llvm-svn: 294897
*	[X86][AVX2] Add support for combining target shuffles to VPMOVZX	Simon Pilgrim	2017-02-12	2	-10/+13
	Initial 256-bit vector support - 512-bit support requires extra checks for AVX512BW support (PMOVZXBW) that will be handled in a future patch. llvm-svn: 294896
*	AMDGPU::expandMemIntrinsicUses(): Fix an uninitialized variable. This ↵	NAKAMURA Takumi	2017-02-12	1	-1/+1
	function returned true or undef. llvm-svn: 294895
*	[LV/LoopAccess] Check statically if an unknown dependence distance can be	Dorit Nuzman	2017-02-12	5	-14/+285
	proven larger than the loop-count
	This fixes PR31098: Try to resolve statically data-dependences whose compile-time-unknown distance can be proven larger than the loop-count, instead of resorting to runtime dependence checking (which is not always possible).
	For vectorization it is sufficient to prove that the dependence distance is >= VF; but in some cases we can prune unknown dependence distances early, even before selecting the VF and without a runtime test, by comparing the distance against the loop iteration count. Since the vectorized code will be executed only if LoopCount >= VF, proving distance >= LoopCount also guarantees that distance >= VF. This check is also equivalent to the Strong SIV Test.
	Reviewers: mkuper, anemet, sanjoy
	Differential Revision: https://reviews.llvm.org/D28044 llvm-svn: 294892
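The argument can be sketched with a toy loop `a[i + d] = a[i] + 1`. This is a simplified model that measures the distance `d` in elements and uses a known trip count `n`; the real analysis reasons about SCEV expressions, which this sketch does not attempt to model. All function names here are hypothetical.

```python
def provably_independent(distance, trip_count):
    """Static check from the commit: distance >= LoopCount implies distance >= VF."""
    return distance >= trip_count


def run_scalar(a, n, d):
    """Reference execution, one iteration at a time."""
    a = list(a)
    for i in range(n):
        a[i + d] = a[i] + 1
    return a


def run_vectorized(a, n, d, vf):
    """Each chunk loads vf elements first, then stores vf results."""
    a = list(a)
    i = 0
    while i + vf <= n:
        loaded = [a[i + k] for k in range(vf)]  # vector load
        for k in range(vf):
            a[i + k + d] = loaded[k] + 1  # vector store
        i += vf
    for j in range(i, n):  # scalar remainder loop
        a[j + d] = a[j] + 1
    return a


data = [0] * 16
print(run_scalar(data, 8, 8) == run_vectorized(data, 8, 8, 4))
print(run_scalar(data, 8, 1) == run_vectorized(data, 8, 1, 4))
```

When `d >= n`, no iteration reads an element written by an earlier one, so the chunked execution matches the scalar one for any `vf <= n`. With `d = 1` and `vf = 4` the vector load reads stale values and the results diverge.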
*	AVX-512: Fixed DWARF register numbers for XMM16-31	Elena Demikhovsky	2017-02-12	1	-16/+16
	The reference is here: https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf llvm-svn: 294890
*	[LTO] Remove useless redirection from test. NFCI.	Davide Italiano	2017-02-12	1	-1/+1
	llvm-svn: 294889
*	[PM] Add devirtualization-based iteration utility into the new PM's	Chandler Carruth	2017-02-12	2	-3/+26
	default pipeline. A clang with this patch built with ASan and asserts can build all of the test-suite as well, so it seems to not uncover any latent problems. Differential Revision: https://reviews.llvm.org/D29853 llvm-svn: 294888
*	[PM] Enable GlobalsAA in the new PM's pipeline by default.	Chandler Carruth	2017-02-12	2	-15/+10
	All the invalidation issues and bugs in this seem to be fixed; it has survived a full build of the test suite plus SPEC with asserts and ASan enabled on the Clang binary used. Differential Revision: https://reviews.llvm.org/D29815 llvm-svn: 294887
*	[lib/LTO] Add support for hotness optremarks in the new API.	Davide Italiano	2017-02-12	2	-0/+43
	llvm-svn: 294885
*	[LTO] Simplify this test quite a bit, @func2 is unused/unneeded.	Davide Italiano	2017-02-12	1	-42/+0
	llvm-svn: 294884
*	[llvm-lto2] Fix typo in error message.	Davide Italiano	2017-02-12	1	-1/+1
	llvm-svn: 294883
*	[lib/LTO] Initial support for optimization remarks in the new API.	Davide Italiano	2017-02-12	4	-0/+52
	llvm-svn: 294882
*	Kaleidoscope-Ch7: Add TransformUtils for ↵	NAKAMURA Takumi	2017-02-12	1	-0/+1
	llvm::createPromoteMemoryToRegisterPass() added in r294870. llvm-svn: 294881
*	[X86] Update test case I missed in r294876.	Craig Topper	2017-02-11	1	-40/+39
	llvm-svn: 294878
*	[X86] Move code for using blendi for insert_subvector out to an isel ↵	Craig Topper	2017-02-11	4	-69/+91
	pattern. This gives the DAG combiner more opportunity to optimize without needing to dig through the blend. llvm-svn: 294876
*	[DAGCombiner] Make the combine of INSERT_SUBVECTOR into a CONCAT_VECTOR more ↵	Craig Topper	2017-02-11	1	-16/+9
	generic to support larger concats. llvm-svn: 294875
*	[X86][SSE] Use VSEXT/VZEXT constant folding for ↵	Simon Pilgrim	2017-02-11	2	-3/+7
	SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG
	Preparatory step for PR31712 llvm-svn: 294874
*	[X86][SSE] Improve VSEXT/VZEXT constant folding.	Simon Pilgrim	2017-02-11	8	-48/+43
	Generalize VSEXT/VZEXT constant folding to work with any target constant bits source, not just BUILD_VECTOR. llvm-svn: 294873
*	Update Kaleidoscope tutorial and improve Windows support	Mehdi Amini	2017-02-11	16	-196/+327
	Many quoted code blocks were not in sync with the actual toy.cpp files. Improve tutorial text slightly in several places. Added some step descriptions crucial to avoid crashes (like InitializeNativeTarget* calls). Solve/workaround problems with Windows (JIT'ed method not found, using custom and standard library functions from host process).
	Patch by: Moritz Kroll <moritz.kroll@gmx.de>
	Differential Revision: https://reviews.llvm.org/D29864 llvm-svn: 294870
*	Fix atomic-minmax-i6432.ll .	Amaury Sechet	2017-02-11	1	-2/+0
	llvm-svn: 294867
*	Regen expected tests result. NFC	Amaury Sechet	2017-02-11	7	-319/+722
	llvm-svn: 294866
*	Correcting several sphinx errors; should fix the LLVM documentation build.	Aaron Ballman	2017-02-11	1	-6/+8
	llvm-svn: 294865
*	[X86][SSE] Add early-out when trying to match blend shuffle. NFCI.	Simon Pilgrim	2017-02-11	1	-3/+4
	llvm-svn: 294864