bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[X86][CGP] Reduce memcmp() expansion to 2 load pairs (PR33914)	Simon Pilgrim	2017-07-25	1	-288/+122
\| \| \| \| \| \| \| \| \| \| \| \|	D35067/rL308322 attempted to support up to 4 load pairs for memcmp inlining, which resulted in regressions for some optimized libc memcmp implementations (PR33914). Until we can match these more optimal cases, this patch reduces the memcmp expansion to a maximum of 2 load pairs (which matches what we do for -Os). This patch should be considered for the 5.0.0 release branch as well. Differential Revision: https://reviews.llvm.org/D35830 llvm-svn: 308986
*	[X86] Regenerate test.	Simon Pilgrim	2017-07-25	1	-3/+5
\| \| \| \|	llvm-svn: 308981
*	[X86] Regenerate test with broadcast comments.	Simon Pilgrim	2017-07-25	1	-3/+3
\| \| \| \|	llvm-svn: 308980
*	[X86] Add 24-byte memcmp tests (PR33914)	Simon Pilgrim	2017-07-25	3	-17/+304
\| \| \| \|	llvm-svn: 308963
*	Adding base test for interleave store VF16 and expand the test for AVX512	Michael Zuckerman	2017-07-24	1	-81/+245
\| \| \| \| \| \|	This patch doesn't modify any non-test files. llvm-svn: 308909
*	[X86][AVX512] Add patterns for masked AVX512 floating point compare ↵	Ayman Musa	2017-07-24	1	-129/+4503
\| \| \| \| \| \| \| \| \| \| \|	instructions that were missing. These patterns were missed by D33188; adding them for completeness and updating the test. Differential Revision: https://reviews.llvm.org/D35179 llvm-svn: 308868
*	[CodeGen][X86] Fuchsia supports sincos* libcalls and sin+cos->sincos ↵	Petr Hosek	2017-07-23	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	optimization Patch by Roland McGrath Differential Revision: https://reviews.llvm.org/D35748 llvm-svn: 308854
*	[X86] Add patterns for memory forms of SARX/SHLX/SHRX with careful ↵	Craig Topper	2017-07-23	1	-9/+9
\| \| \| \| \| \| \| \|	complexity adjustment to keep shift by immediate using the legacy instructions. These patterns were only missing to favor using the legacy instructions when the shift was a constant. With careful adjustment of the pattern complexity we can make sure the immediate instructions still have priority over these patterns. llvm-svn: 308834
*	[X86][SSE] Add extra (sra (sra x, c1), c2) -> (sra x, (add c1, c2)) test case	Simon Pilgrim	2017-07-21	1	-0/+30
\| \| \| \| \| \|	We should be able to handle the case where some c1+c2 elements exceed the max shift and some don't by performing a clamp after the sum. llvm-svn: 308724
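The fold this test targets can be modeled one lane at a time. The following is a minimal sketch, not the DAGCombiner code: it assumes 32-bit lanes with c1 and c2 each in [0, 31], and the function names `sraTwice` and `sraFolded` are hypothetical. Clamping the sum at 31 is safe because an arithmetic shift has already replicated the sign bit across the whole lane by then.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Arithmetic shift right of one 32-bit lane. Right-shifting a negative value
// is arithmetic on mainstream compilers (and required to be since C++20).
inline int32_t sra(int32_t x, unsigned amt) {
  assert(amt < 32 && "shift amount must be less than the lane width");
  return x >> amt;
}

// Unfolded form: (sra (sra x, c1), c2).
inline int32_t sraTwice(int32_t x, unsigned c1, unsigned c2) {
  return sra(sra(x, c1), c2);
}

// Folded form: (sra x, min(c1 + c2, 31)). The clamp handles lanes where the
// sum exceeds the maximum legal shift amount.
inline int32_t sraFolded(int32_t x, unsigned c1, unsigned c2) {
  return sra(x, std::min(c1 + c2, 31u));
}
```

For a non-uniform vector constant the same computation is applied per element, so some lanes may need the clamp while others do not.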
*	[X86][SSE] Add pre-AVX2 support for (i32 bitcast(v32i1)) -> 2xMOVMSK	Simon Pilgrim	2017-07-21	4	-871/+88
\| \| \| \| \| \| \| \| \| \| \| \|	Currently we only support (i32 bitcast(v32i1)) using the AVX2 VPMOVMSKB ymm instruction. This patch adds support for splitting pre-AVX2 targets into 2 x (V)PMOVMSKB xmm instructions and merging the integer results. In future we could probably generalize this to handle more cases. Differential Revision: https://reviews.llvm.org/D35303 llvm-svn: 308723
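The split-and-merge described above can be illustrated with a scalar emulation. This is a sketch only: `movmsk16` and `movmsk32Split` are hypothetical helpers that model what (V)PMOVMSKB computes on a 128-bit register, not the actual lowering code.

```cpp
#include <cstddef>
#include <cstdint>

// Emulates (V)PMOVMSKB on one 128-bit half: collect the sign bit of each of
// the 16 bytes into the low 16 bits of the result.
inline uint32_t movmsk16(const uint8_t *bytes) {
  uint32_t mask = 0;
  for (std::size_t i = 0; i < 16; ++i)
    mask |= static_cast<uint32_t>(bytes[i] >> 7) << i;
  return mask;
}

// Pre-AVX2 strategy: take the mask of each 128-bit half separately, then
// merge the two 16-bit integer results into one 32-bit mask.
inline uint32_t movmsk32Split(const uint8_t *bytes) {
  return movmsk16(bytes) | (movmsk16(bytes + 16) << 16);
}
```

With AVX2, a single VPMOVMSKB on a ymm register produces the same 32-bit result directly.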
*	[AVX-512] Fix a bug that prevented some non-temporal loads from using the ↵	Craig Topper	2017-07-21	1	-30/+9
\| \| \| \| \| \| \| \|	movntdqa instruction. The bitconverts here had an input type of 128-bits and an output type of 256 bits. The input type should also have been 256 bits. llvm-svn: 308702
*	Add an ID field to StackObjects	Matt Arsenault	2017-07-20	1	-44/+106
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On AMDGPU SGPR spills are really spilled to another register. The spiller creates the spills to new frame index objects, which is used as a placeholder. This will eventually be replaced with a reference to a position in a VGPR to write to and the frame index deleted. It is most likely not a real stack location that can be shared with another stack object. This is a problem when StackSlotColoring decides it should combine a frame index used for a normal VGPR spill with a real stack location and a frame index used for an SGPR. Add an ID field so that StackSlotColoring has a way of knowing the different frame index types are incompatible. llvm-svn: 308673
*	[X86] Adding ISel tests for strided-shuffles with non-zero offset. NFC.	Zvi Rackover	2017-07-20	3	-0/+3641
\| \| \| \|	llvm-svn: 308672
*	[X86] Allow masks with more than 6 bits set on the x << (y & mask) ↵	Craig Topper	2017-07-20	1	-1/+0
\| \| \| \| \| \|	optimization for the 64-bit memory shifts. llvm-svn: 308657
*	[X86] Add test case to demonstrate that we don't allow masks wider than 6 ↵	Craig Topper	2017-07-20	1	-0/+15
\| \| \| \| \| \| \| \| \| \|	bits in the (shift x, (and y, mask)) patterns for the 64-bit memory form. We allow wider than 5 bits in the 16 and 32 bit store forms. And we allow wider than 6 bits on the 64-bit register form. I'm assuming this was a mistake made back in r148024. llvm-svn: 308656
*	[DAG] Handle missing transform in fold of value extension case.	Nirav Dave	2017-07-20	1	-4/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When pushing an extension of a constant bitwise operator on a load into the load, change other uses of the load value if they exist to prevent the old load from persisting. Reviewers: spatel, RKSimon, efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35030 llvm-svn: 308618
*	[DAG] Optimize away degenerate INSERT_VECTOR_ELT nodes.	Nirav Dave	2017-07-20	4	-14/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add missing vector write of vector read reduction, i.e.: (insert_vector_elt x (extract_vector_elt x idx) idx) to x Reviewers: spatel, RKSimon, efriedma Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35563 llvm-svn: 308617
*	[X86][AVX512] Improve vector rotation constant folding tests	Simon Pilgrim	2017-07-20	1	-3/+32
\| \| \| \| \| \|	Test constant folding both on node creation (which already works) and once the input nodes have been folded themselves (not working yet). llvm-svn: 308611
*	[DAGCombiner] Match ISD::SRL non-uniform constant vectors patterns using ↵	Simon Pilgrim	2017-07-20	1	-42/+10
\| \| \| \| \| \| \| \|	predicates. Use predicate matchers introduced in D35492 to match more ISD::SRL constant folds llvm-svn: 308602
*	[DAGCombiner] Match ISD::SRA non-uniform constant vectors patterns using ↵	Simon Pilgrim	2017-07-20	1	-42/+10
\| \| \| \| \| \| \| \|	predicates. Use predicate matchers introduced in D35492 to match more ISD::SRA constant folds llvm-svn: 308600
*	[DAGCombiner] Match non-uniform constant vectors using predicates.	Simon Pilgrim	2017-07-20	1	-4/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Most combines currently recognise scalar and splat-vector constants, but not non-uniform vector constants. This patch introduces a matching mechanism that uses predicates to check against BUILD_VECTOR of ConstantSDNode, as well as scalar ConstantSDNode cases. I've changed a couple of predicates to demonstrate - the combine-shl changes add currently unsupported cases, while the MatchRotate replaces an existing mechanism. Differential Revision: https://reviews.llvm.org/D35492 llvm-svn: 308598
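The idea of applying one predicate to either a scalar constant or every element of a constant vector can be sketched outside of LLVM as follows. Everything here is an assumption for illustration: `ConstOperand` and `matchAllElements` are hypothetical stand-ins, not the SelectionDAG types or the helper added by the patch.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for a constant operand: a single element models a
// scalar constant, several elements model a BUILD_VECTOR of constants.
using ConstOperand = std::vector<uint64_t>;

// Returns true only if the predicate holds for every element, so a
// non-uniform vector matches as long as each lane satisfies it individually.
inline bool matchAllElements(const ConstOperand &op,
                             const std::function<bool(uint64_t)> &pred) {
  if (op.empty())
    return false; // nothing to match
  for (uint64_t elt : op)
    if (!pred(elt))
      return false;
  return true;
}
```

A combine that previously required a splat can then accept a vector such as {1, 7, 31} for a 32-bit shift, since every lane is individually in range.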
*	[X86] Use SARX/SHLX/SHRX instructions for (shift x (and y, (BitWidth-1)))	Craig Topper	2017-07-20	2	-28/+13
\| \| \| \| \| \|	Fixes PR33841. llvm-svn: 308591
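The source pattern in question typically comes from code like the sketch below, where the shift amount is masked so that the shift is well defined in C++. The function names are hypothetical. Because the BMI2 shift instructions mask the count themselves (to 5 bits for 32-bit operands and 6 bits for 64-bit operands), the explicit AND is redundant on the hardware side.

```cpp
#include <cstdint>

// (shl x, (and y, 31)): the mask keeps the shift amount below the bit width,
// which avoids undefined behavior for amounts >= 32.
inline uint32_t shl32(uint32_t x, uint32_t y) { return x << (y & 31u); }

// (shl x, (and y, 63)): the same pattern for 64-bit operands.
inline uint64_t shl64(uint64_t x, uint64_t y) { return x << (y & 63u); }
```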
*	[X86] Add test cases for (shift x (and y, (BitWidth-1))) to the BMI2 shift test.	Craig Topper	2017-07-20	1	-0/+93
\| \| \| \| \| \|	We should use SHLX and similar instructions for these patterns, but we currently don't. llvm-svn: 308590
*	[X86] Regenerate shift-and.ll and shift-bmi2.ll using update_llc_test_checks.py.	Craig Topper	2017-07-20	2	-113/+184
\| \| \| \| \| \|	I've stripped the checks for 64-bit types in 32-bit mode to match the existing tests. llvm-svn: 308589
*	[X86] Remove outdated bug comment from a test.	Craig Topper	2017-07-20	1	-1/+0
\| \| \| \| \| \|	The test issue was fixed and the test was updated in r244577, but the comment wasn't removed. llvm-svn: 308588
*	[PEI] Add basic opt-remarks support	Francis Visoiu Mistrih	2017-07-19	2	-0/+60
\| \| \| \| \| \| \| \| \| \|	Add optimization remarks support to the PrologueEpilogueInserter. For now, emit the stack size as an analysis remark, but more additions wrt shrink-wrapping may be added. https://reviews.llvm.org/D35645 llvm-svn: 308556
*	Forgot to add triple to test in r308513.	Wolfgang Pieb	2017-07-19	1	-1/+1
\| \| \| \|	llvm-svn: 308527
*	Fixing an issue with the initialization of LexicalScopes objects when mixing ↵	Wolfgang Pieb	2017-07-19	1	-0/+61
\| \| \| \| \| \| \| \| \| \| \| \|	debug and non-debug units. Patch by Andrea DiBiagio. Differential Revision: https://reviews.llvm.org/D35637 llvm-svn: 308513
*	[X86] Don't try to scale down if that exceeds the bitwidth.	Davide Italiano	2017-07-19	1	-0/+38
\| \| \| \| \| \|	Fixes the crash reported in PR33844. llvm-svn: 308503
*	[X86][XOP] Use default AVX2 lowering for v4i64 ashr by splat constants	Simon Pilgrim	2017-07-19	1	-3/+2
\| \| \| \| \| \|	XOP shifts only support 128-bit vectors, so we were ending up with less optimal codegen requiring constants. llvm-svn: 308430
*	AMD znver1 Initial Scheduler model	Craig Topper	2017-07-19	16	-349/+2344
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch does the following: 1. Adds a skeleton scheduler model for AMD Znver1. 2. Introduces the znver1 execution units and pipes. 3. Caters for the instructions based on the generic scheduler classes. 4. Further additions to the scheduler model with instruction itineraries will be carried out incrementally based on a. Instruction types b. Registers used. 5. Since itineraries are not added based on instructions, throughput information is bound to change when incremental changes are added. 6. Scheduler testcases are modified accordingly to suit the new model. Patch by Ganesh Gopalasubramanian, with minor formatting tweaks from me. Reviewers: craig.topper, RKSimon Subscribers: javed.absar, shivaram, ddibyend, vprasad Differential Revision: https://reviews.llvm.org/D35293 llvm-svn: 308411
*	[DAG] Improve Aliasing of operations to static alloca	Nirav Dave	2017-07-18	13	-78/+99
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Re-recommitting after landing DAG extension-crash fix. Recommitting after adding check to avoid miscomputing alias information on addresses of the same base but different subindices. Memory accesses offset from frame indices may alias, e.g., we may merge writes from function arguments passed on the stack when they are contiguous. As a result, when checking aliasing, we consider the underlying frame index's offset from the stack pointer. Static allocas are realized as stack objects in SelectionDAG, but their offsets are not set until post-DAG, causing DAGCombiner's alias check to consider accesses to static allocas to frequently alias. Modify isAlias to consider access between static allocas and access from other frame objects to be considered aliasing. Many test changes are included here. Most are fixes for tests which indirectly relied on our aliasing ability and needed to be modified to preserve their original intent. The remaining tests have minor improvements due to relaxed ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll, which has a minor degradation even though the pre-legalized DAG is improved. Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand Reviewed By: rnk Subscribers: sdardis, nemanjai, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33345 llvm-svn: 308350
*	[DAG] Avoid deleting nodes before combining them.	Nirav Dave	2017-07-18	2	-2/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When replacing a node and its operand, replacing the operand node may cause the deletion of the original node, leading to an assertion failure. Case around these replacements to avoid this without relying on inspecting the DELETED_NODE opcode in various extend dagcombiner cases. Fixes PR32515. Reviewers: dbabokin, RKSimon, davide, chandlerc Subscribers: chandlerc, llvm-commits Differential Revision: https://reviews.llvm.org/D34095 llvm-svn: 308330
*	[X86][AVX] Regenerate shift test to show constant broadcast comment	Simon Pilgrim	2017-07-18	1	-5/+5
\| \| \| \|	llvm-svn: 308323
*	[x86, CGP] increase memcmp() expansion up to 4 load pairs	Simon Pilgrim	2017-07-18	2	-330/+784
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It should be a win to avoid going out to the system lib for all small memcmp() calls using scalar ops. For x86 32-bit, this means most everything up to 16 bytes. For 64-bit, that doubles because we can do 8-byte loads. Notes: Reduced from 4 to 2 loads for -Os behavior, which might not be optimal in all cases. It's effectively a question of how much do we trust the system implementation. Linux and macOS (and Windows I assume, but did not test) have optimized memcmp() code for x86, so it's probably not bad either way? PPC is using 8/4 for defaults on these. We do not expand at all for -Oz. There are still potential improvements to make for the CGP expansion IR and/or lowering such as avoiding select-of-constants (D34904) and not doing zexts to the max load type before doing a compare. We have special-case SSE/AVX codegen for (memcmp(x, y, 16/32) == 0) that will no longer be produced after this patch. I've shown the experimental justification for that change in PR33329: https://bugs.llvm.org/show_bug.cgi?id=33329#c12 TLDR: While the vector code is a likely winner, we can't guarantee that it's a winner in all cases on all CPUs, so I'm willing to sacrifice it for the greater good of expanding all small memcmp(). If we want to resurrect that codegen, it can be done by adjusting the CGP params or poking a hole to let those fall-through the CGP expansion. Committed on behalf of Sanjay Patel Differential Revision: https://reviews.llvm.org/D35067 llvm-svn: 308322
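To make "load pairs" concrete, here is a rough sketch of what expanding a 16-byte memcmp() into two 8-byte load pairs amounts to on a 64-bit target. It is an illustration under stated assumptions, not the CodeGenPrepare output: `loadBE64` and `memcmp16` are hypothetical names, and the byte-by-byte big-endian load stands in for what would be a single load plus a byte swap on little-endian x86.

```cpp
#include <cstdint>

// Load 8 bytes as a big-endian integer so that unsigned integer order matches
// the lexicographic byte order memcmp() is defined over.
inline uint64_t loadBE64(const unsigned char *p) {
  uint64_t v = 0;
  for (int i = 0; i < 8; ++i)
    v = (v << 8) | p[i];
  return v;
}

// memcmp(a, b, 16) as two load pairs: each iteration loads 8 bytes from both
// buffers and compares them, returning early on the first difference.
inline int memcmp16(const void *a, const void *b) {
  const unsigned char *pa = static_cast<const unsigned char *>(a);
  const unsigned char *pb = static_cast<const unsigned char *>(b);
  for (int off = 0; off < 16; off += 8) {
    uint64_t x = loadBE64(pa + off);
    uint64_t y = loadBE64(pb + off);
    if (x != y)
      return x < y ? -1 : 1;
  }
  return 0;
}
```

With a limit of 4 load pairs and 8-byte loads, calls up to 32 bytes would be expanded this way; anything larger still goes to the library memcmp().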
*	[X86] Add optsize and minsize memcmp tests (D35067)	Simon Pilgrim	2017-07-18	2	-0/+1419
\| \| \| \|	llvm-svn: 308311
*	[X86] Added cmov target to memcmp test	Simon Pilgrim	2017-07-18	1	-76/+32
\| \| \| \| \| \| \| \|	As discussed by @spatel on D35067: "I added the cmov attribute to the 32-bit codegen test because it removes some noise for that file. I think the intent for the SSE vs no-SSE runs is to show the potential difference for the 16 and 32 byte cases rather than the lack of cmov (which has been available for all CPUs since SSE1, so that's why it shows up automatically with -mattr=sse2)." llvm-svn: 308309
*	[DAGCombine] Fix issue with out of bound constant rotation (PR33828)	Simon Pilgrim	2017-07-18	1	-0/+48
\| \| \| \| \| \|	Take the modulo of rotations by a constant greater than or equal to the bit-width llvm-svn: 308302
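The fix amounts to reducing the rotate amount modulo the bit width before folding. A minimal scalar sketch, assuming a 32-bit value (the name `rotl32` is hypothetical):

```cpp
#include <cstdint>

// Rotate left with the amount reduced modulo the bit width, so rotating by a
// constant >= 32 behaves like rotating by that constant mod 32.
inline uint32_t rotl32(uint32_t x, unsigned amt) {
  amt %= 32;
  // Guard amt == 0: shifting right by 32 would be undefined behavior.
  return amt == 0 ? x : (x << amt) | (x >> (32 - amt));
}
```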
*	[X86][AVX512] Add ISD::ROTL/ISD::ROTR constant folding tests	Simon Pilgrim	2017-07-18	1	-0/+20
\| \| \| \|	llvm-svn: 308295
*	[X86] Add test case for PR32282	Simon Pilgrim	2017-07-18	1	-0/+104
\| \| \| \|	llvm-svn: 308286
*	[x86] Add a missing triple, without which the CPU won't parse.	Chandler Carruth	2017-07-18	1	-0/+2
\| \| \| \| \| \| \|	Notably, this is failing on our PPC build bots: http://lab.llvm.org:8011/builders/clang-ppc64le-linux/builds/8338/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Apr33772.ll llvm-svn: 308272
*	Revert r308025 due to uncovering a crash in SelectionDAG. This is filed	Chandler Carruth	2017-07-18	13	-99/+78
\| \| \| \| \| \| \| \| \|	with a minimal test case in http://llvm.org/PR33833. Original commit message: Improve Aliasing of operations to static alloca llvm-svn: 308271
*	[X86] Prevent an assertion failure if a gather intrinsic is passed a ↵	Craig Topper	2017-07-18	1	-0/+13
\| \| \| \| \| \| \| \| \| \|	non-constant scale value. This isn't legal code, but we shouldn't crash on it. Now we just don't convert the gather intrinsic if the scale isn't constant and let it go through to isel where we'll report an isel failure. Fixes PR33772. llvm-svn: 308267
*	[AArch64] Extend CallingConv::X86_64_Win64 to AArch64 as well	Martin Storsjo	2017-07-17	9	-26/+26
\| \| \| \| \| \| \| \| \| \| \| \|	Rename the enum value from X86_64_Win64 to plain Win64. The symbol exposed in the textual IR is changed from 'x86_64_win64cc' to 'win64cc', but the numeric value is kept, keeping support for old bitcode. Differential Revision: https://reviews.llvm.org/D34474 llvm-svn: 308208
*	[llvm] Remove redundant check-prefix=CHECK from tests. NFC.	Mandeep Singh Grang	2017-07-17	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: t.p.northover, oren_ben_simhon, niravd, mcrosier Reviewed By: oren_ben_simhon, mcrosier Subscribers: nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D35466 llvm-svn: 308193
*	[X86] Add LEA scheduling tests	Simon Pilgrim	2017-07-17	2	-0/+1060
\| \| \| \|	llvm-svn: 308180
*	[X86][AVX512] Add lowering of vXi32/vXi64 ISD::ROTL/ISD::ROTR	Simon Pilgrim	2017-07-17	5	-225/+264
\| \| \| \| \| \| \| \|	Add support for lowering to ISD::ROTL/ISD::ROTR, including rotate by immediate Differential Revision: https://reviews.llvm.org/D35463 llvm-svn: 308177
*	Fixed line endings. NFCI.	Simon Pilgrim	2017-07-17	1	-321/+321
\| \| \| \|	llvm-svn: 308175
*	[X86][AVX] Fix typo in vector rotate tests	Simon Pilgrim	2017-07-17	2	-17/+11
\| \| \| \| \| \|	Was preventing rotate matching llvm-svn: 308171
*	[X86][AVX512] Add constant splat vector rotate tests for D35463	Simon Pilgrim	2017-07-17	1	-0/+252
\| \| \| \|	llvm-svn: 308169