llvm-svn: 293793
Make SolveLinEquationWithOverflow take the start as a SCEV, so we can
solve more cases. With that implemented, get rid of the special case
for powers of two.
The additional functionality probably isn't particularly useful,
but it might help a little for certain cases involving pointer
arithmetic.
Differential Revision: https://reviews.llvm.org/D28884
llvm-svn: 293576
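For reference, the arithmetic this routine performs is solving A*X == B (mod 2^BW). A minimal sketch in plain C++, assuming a 64-bit bitwidth and a Newton-iteration modular inverse (names and structure are illustrative, not the actual SCEV code):

    #include <bit>
    #include <cstdint>
    #include <optional>

    // Multiplicative inverse of an odd value modulo 2^64 (Newton's method:
    // each step doubles the number of correct low bits).
    static uint64_t invertOdd(uint64_t A) {
      uint64_t X = A; // an odd A is its own inverse mod 8: 3 correct bits
      for (int I = 0; I < 5; ++I)
        X *= 2 - A * X;
      return X;
    }

    // Solve A*X == B (mod 2^64). Write A = D * 2^K with D odd; a solution
    // exists iff 2^K divides B, and then X = (B >> K) * D^-1 (mod 2^(64-K)).
    static std::optional<uint64_t> solveLinEquation(uint64_t A, uint64_t B) {
      if (A == 0)
        return std::nullopt;
      unsigned K = std::countr_zero(A);
      if (B & ((uint64_t(1) << K) - 1))
        return std::nullopt; // 2^K does not divide B: no solution
      return (B >> K) * invertOdd(A >> K); // one solution; unique mod 2^(64-K)
    }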
llvm-svn: 293504
This fixes pr31761: BasicAA deduces NoAlias
on the result of a GEP if the base pointer is itself NoAlias.
This is valid only if the NoAlias on the base pointer was
deduced with a non-sized query: that guarantees that
the pointers belong to different memory allocations
and that the GEP can't legally jump from one to the other.
Differential Revision: https://reviews.llvm.org/D29216
llvm-svn: 293293
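A minimal C++ illustration of the reasoning (my example, not the test case from the patch): when two pointers are known to come from different allocations, no amount of in-bounds pointer arithmetic on one can produce a pointer into the other.

    #include <cstdlib>

    void example(std::size_t I) {
      char *P = static_cast<char *>(std::malloc(16));
      char *Q = static_cast<char *>(std::malloc(16));
      // P and Q belong to different memory allocations, so in-bounds
      // pointer arithmetic (a GEP) rooted at P can never reach *Q:
      // the two stores below are NoAlias.
      P[I % 16] = 1;
      Q[0] = 2;
      std::free(P);
      std::free(Q);
    }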
Inlining in getAddExpr() can cause abnormally long computation times in some cases.
A new parameter, -scev-addops-inline-threshold, is introduced with a default value of 500.
Reviewers: sanjoy
Subscribers: mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D28812
llvm-svn: 293176
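The conventional way such a knob is declared in LLVM looks roughly like this (a sketch; the description string is illustrative, not copied from the patch):

    #include "llvm/Support/CommandLine.h"
    using namespace llvm;

    // Command-line threshold with the default of 500 named in the commit.
    static cl::opt<unsigned> AddOpsInlineThreshold(
        "scev-addops-inline-threshold", cl::Hidden, cl::init(500),
        cl::desc("Threshold for inlining addition operands into a SCEV"));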
supports it"
This reverts commit r292680. It is causing significantly worse
performance and test timeouts in our internal builds. I have already
routed reproduction instructions your way.
llvm-svn: 293092
bots ever since d0k fixed the CHECK lines so that it did something at
all.
It isn't actually testing SCEV directly but LSR, so move it into LSR and
the x86-specific tree of tests that already exists there. Target
dependence is common and unavoidable with the current design of LSR.
llvm-svn: 292774
become unavailable.
The AssumptionCache is now immutable but it still needs to respond to
DomTree invalidation if it ended up caching one.
This lets us remove one of the explicit invalidates of LVI but the
other one continues to avoid hitting a latent bug.
llvm-svn: 292769
llvm-svn: 292762
The colon is important.
llvm-svn: 292761
Newer PPC cores support unaligned memory access, which significantly reduces its cost. This patch handles that case in PPCTTIImpl::getMemoryOpCost.
This patch fixes pr31492.
Differential Revision: https://reviews.llvm.org/D28630
llvm-svn: 292680
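The shape of the change, in a hedged sketch (simplified; not the actual PPCTTIImpl code, and the penalty factor is invented for illustration): once the subtarget is known to handle unaligned accesses natively, the hook stops charging the misalignment penalty.

    // Illustrative-only cost logic; 'UnalignedPenalty' is an assumption.
    int getMemoryOpCostSketch(int BaseCost, bool IsMisaligned,
                              bool SubtargetHasFastUnaligned) {
      const int UnalignedPenalty = 4; // hypothetical scaling factor
      if (IsMisaligned && !SubtargetHasFastUnaligned)
        return BaseCost * UnalignedPenalty;
      return BaseCost; // newer PPC: unaligned access costs the same
    }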
To avoid regressions, make ScalarEvolution::createSCEV a bit more
clever.
Also get rid of some useless code in ScalarEvolution::howFarToZero
which was hiding this bug.
No new testcase because it's impossible to actually expose this bug:
we don't have any in-tree users of getUDivExactExpr besides the two
functions I just mentioned, and they both dodged the problem. I'll
try to add some interesting users in a followup.
Differential Revision: https://reviews.llvm.org/D28587
llvm-svn: 292449
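For context, the "exact" in getUDivExactExpr licenses algebraic cancellation that plain unsigned division would not, but only in the absence of wrap, which is exactly the kind of subtlety such folds must guard against. A small illustration (mine, not from the patch):

    #include <cstdint>

    uint64_t cancelFactor(uint64_t A) {
      uint64_t X = 8 * A;  // dividend known to be a multiple of 8
      // (8*A) /u 8 folds to A only while 8*A does not wrap; if it wraps,
      // the quotient of the wrapped value no longer equals A.
      return X / 8;
    }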
running LCSSA over them prior to running the loop pipeline.
This also teaches the loop PM to verify that LCSSA form is preserved
throughout the pipeline's run across the loop nest.
Most of the test updates just leverage this new functionality. One has to be
relaxed with the new PM as IVUsers is less powerful when it sees LCSSA input.
Differential Revision: https://reviews.llvm.org/D28743
llvm-svn: 292241
We already have patterns in place to support 128/256-bit shifts without AVX512VL
llvm-svn: 292077
costs
Keep the tests though.
llvm-svn: 292076
non-constant uniform values.
Use a shuffle(scalar_to_vector, zeroinitializer) pattern instead of shuffle(vec, zeroinitializer).
llvm-svn: 292075
First, I've moved a test of IVUsers from the LSR tree to a dedicated
IVUsers test directory. I've also simplified its RUN line now that the
new pass manager's loop PM is providing analyses on their own.
No functionality changed, but it makes subsequent changes cleaner.
llvm-svn: 292060
mark it as never invalidated in the new PM.
The old PM already required this to work, and after a discussion with
Hal this seems to really be the only sensible answer. The cache
gracefully degrades as the IR is mutated, and most things which do this
should already be incrementally updating the cache.
This gets rid of a bunch of logic preserving and testing the
invalidation of this analysis.
llvm-svn: 292039
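In the new pass manager, "never invalidated" is expressed by the result's invalidate hook returning false unconditionally. A sketch of that convention (illustrative, not AssumptionCache's actual code):

    #include "llvm/IR/Function.h"
    #include "llvm/IR/PassManager.h"

    struct GracefullyDegradingResult {
      // Returning false tells the analysis manager this result survives
      // every transformation; stale entries degrade gracefully instead.
      bool invalidate(llvm::Function &, const llvm::PreservedAnalyses &,
                      llvm::FunctionAnalysisManager::Invalidator &) {
        return false;
      }
    };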
has landed
llvm-svn: 292023
Refines max backedge-taken count if a loop like
"for (int i = 0; i != n; ++i) { /* body */ }" is rotated.
Differential Revision: https://reviews.llvm.org/D28536
llvm-svn: 291704
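A source-level view of what rotation does to that loop (illustrative): the bottom-tested form is guarded by n != 0, and that guard is what lets SCEV refine the maximum backedge-taken count.

    void before(int N) {
      for (int I = 0; I != N; ++I) { /* body */ }
    }

    void after(int N) { // roughly what loop rotation produces
      if (N != 0) {
        int I = 0;
        do {
          /* body */
          ++I;
        } while (I != N); // exit test now provably reached N times
      }
    }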
This is both easier to understand, and produces a tighter bound in certain
cases.
Differential Revision: https://reviews.llvm.org/D28393
llvm-svn: 291701
Differential Revision: https://reviews.llvm.org/D28447
llvm-svn: 291665
llvm-svn: 291663
updated instructions:
pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd.
A special optimization case replaces pmulld with a pmullw/pmulhw/pshuf sequence
when the real operand bitwidth is <= 16.
Differential Revision: https://reviews.llvm.org/D28104
llvm-svn: 291657
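A scalar view of the arithmetic behind that pmullw/pmulhw/pshuf sequence (illustrative): when both operands genuinely fit in 16 bits, each 32-bit product can be reassembled from its low and high 16-bit halves.

    #include <cstdint>

    uint32_t mul32From16(uint16_t A, uint16_t B) {
      uint32_t Product = uint32_t(A) * B;
      uint32_t Lo = Product & 0xFFFF; // per-lane result of pmullw
      uint32_t Hi = Product >> 16;    // per-lane result of pmulhw/pmulhuw
      return (Hi << 16) | Lo;         // the pshuf step interleaves Hi and Lo
    }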
The original code considered only v2i64 as slow for this feature. This patch
considers all 128-bit vector types as slow candidates.
In internal tests, extending this feature to all 128-bit vector types
resulted in an overall improvement of 1% on Exynos M1.
Differential revision: https://reviews.llvm.org/D27998
llvm-svn: 291616
llvm-svn: 291585
llvm-svn: 291468
invalid.
This fixes use-after-free bugs that will arise with any interesting use
of SCEV.
I've added a dedicated test that works diligently to trigger these kinds
of bugs in the new pass manager and also checks for them explicitly as
well as triggering ASan failures when things go squirrelly.
llvm-svn: 291426
The 'fast' costs should only apply to shifts by uniform constants (uniform non-constant shifts are lowered using the slow default implementation).
Logical shifts were not taking into account that we must mask the psrlw result, so the costs needed to be doubled.
Added missing AVX2/AVX512BW costs as well.
llvm-svn: 291391
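Why the mask is needed, in a scalar sketch (illustrative): x86 has no byte-granularity shift, so a v16i8 logical shift is done with a 16-bit psrlw plus a pand that clears the bits dragged across each byte boundary, hence the doubled cost.

    #include <cstdint>

    // Logical right shift of two packed bytes using one 16-bit shift.
    uint16_t lshrPackedBytes(uint16_t TwoBytes, unsigned C) { // C < 8
      uint16_t Shifted = TwoBytes >> C;            // psrlw analogue
      uint16_t ByteMask = uint16_t(0xFF >> C);     // valid bits per byte
      uint16_t Mask = uint16_t((ByteMask << 8) | ByteMask);
      return Shifted & Mask;                       // pand analogue
    }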
XOP was prematurely matching, doubling the cost of ashr/lshr uniform shifts.
llvm-svn: 291390
SSE41 provides pmulld, which allows a simpler pslld/paddd/cvttps2dq/pmulld pattern than SSE2's use of pmuludq.
llvm-svn: 291372
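The arithmetic behind that lowering, in scalar form (illustrative): x << c equals x * 2^c, and 2^c can be materialized by adding c to the exponent field of 1.0f, which is what pslld/paddd build before cvttps2dq and pmulld finish the job.

    #include <cstdint>
    #include <cstring>

    uint32_t shlViaMul(uint32_t X, uint32_t C) { // C < 31 for cvttps2dq
      uint32_t Bits = (C << 23) + 0x3f800000u;   // pslld + paddd on 1.0f
      float F;
      std::memcpy(&F, &Bits, sizeof(F));         // F == 2^C exactly
      uint32_t Pow2 = uint32_t(F);               // cvttps2dq analogue
      return X * Pow2;                           // pmulld: X << C
    }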
llvm-svn: 291366
We were matching against general vector shift costs before the uniform splat costs
llvm-svn: 291365
llvm-svn: 291354
v64i8 shuffles (PR31470)
llvm-svn: 291347
llvm-svn: 291269
Differential Revision: https://reviews.llvm.org/D28403
llvm-svn: 291254
Set the costs on the lowest target that supports the type.
llvm-svn: 291229
Added a test demonstrating a bug in AVX512 division costs.
llvm-svn: 291228
extract/insertion in AVX1 v4i64 MUL
Matches other MUL/ADD/SUB 256-bit case on AVX1
llvm-svn: 291149
llvm-svn: 291140
Currently only for broadcasts with input and output of the same width.
Differential Revision: https://reviews.llvm.org/D27811
llvm-svn: 291122
llvm-svn: 291117
llvm-svn: 291112
Without this CHECK line, additional regions incorrectly detected at the end
of the region tree could go unnoticed.
llvm-svn: 290994
This test case has been reduced from test/Analysis/RegionInfo/mix_1.ll and
provides a minimal example of a case that caused problems while working on
an improved version of the RegionInfo analysis. We upstream this test case,
as it can certainly be helpful in future debugging and optimization work.
Test case reduced by Pratik Bhatu <cs12b1010@iith.ac.in>
llvm-svn: 290974
Actual codegen is much better than the extract+insert patterns that were assumed.
llvm-svn: 290962
(This change was approved in https://reviews.llvm.org/D28118, but Simon asked to submit it separately.)
llvm-svn: 290812
The X86 target does not provide any target-specific cost calculation for interleave patterns. It uses the common target-independent calculation, which gives very high numbers. As a result, the scalar version is chosen in many cases. The situation on AVX-512 is even worse, since 3-source shuffles significantly reduce the real cost there.
In this patch I calculate the cost on AVX-512. It will allow comparing the interleave pattern with gather/scatter and choosing the better solution (PR31426).
* Shuffle-broadcast cost will be changed in Simon's upcoming patch.
Differential Revision: https://reviews.llvm.org/D28118
llvm-svn: 290810
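For reference, the kind of interleaved (stride-2) access this cost models, in an illustrative example; with a realistic shuffle cost the vectorizer can weigh this form against a gather/scatter or scalar version:

    void sumAdjacentPairs(float *Out, const float *In, int N) {
      for (int I = 0; I < N; ++I)
        Out[I] = In[2 * I] + In[2 * I + 1]; // stride-2 interleaved loads
    }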
llvm-svn: 290790