bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Be wary of abnormal exits from loop when exploiting UB	Sanjoy Das	2016-06-09	1	-4/+4
	We can safely rely on a NoWrap add recurrence causing UB down the road only if we know the loop does not have an exit expressed in a way that is opaque to ScalarEvolution (e.g. by a function call that conditionally calls exit(0)). I believe with this change PR28012 is fixed. Note: I had to change some llvm-lit tests in LoopReroll, since it looks like they were depending on this incorrect behavior. llvm-svn: 272237
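A minimal sketch of the kind of loop this message describes, not taken from the patch or its tests. The names `maybe_leave` and `last_safe_value` are hypothetical, and `longjmp` stands in for the `exit(0)` call mentioned above so that the example can return a value. The loop has no exit that an analysis of the loop body alone can see, yet the increment that would overflow is never executed.

```c
#include <assert.h>
#include <limits.h>
#include <setjmp.h>

static jmp_buf escape;

/* Hypothetical helper: leaves the loop "abnormally" just before the
 * induction variable could overflow. From inside the loop this is an
 * opaque call. */
static void maybe_leave(int i) {
    if (i > INT_MAX - 3)
        longjmp(escape, 1);
}

int last_safe_value(void) {
    volatile int last = 0;      /* volatile so the value survives longjmp */
    if (setjmp(escape) != 0)
        return last;
    int i = INT_MAX - 10;
    for (;;) {
        last = i;
        maybe_leave(i);         /* may never return */
        assert(i <= INT_MAX - 3);
        i += 3;                 /* only reached when it cannot overflow */
    }
}
```

Treating `i += 3` as a no-wrap recurrence whose overflow must eventually cause UB would be wrong here, because the call leaves the loop first.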
*	[LoopSimplify] Preserve LCSSA when merging exit blocks.	Michael Zolotukhin	2016-06-08	1	-0/+32
	Summary: This fixes PR26682. Also add LCSSA as a preserved pass to LoopSimplify; that looks correct to me and allows writing a test for the issue. Reviewers: chandlerc, bogner, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21112 llvm-svn: 272224
*	[SLPVectorizer] Handle GEP with differing constant index types	Michael Zolotukhin	2016-06-08	1	-0/+22
	Summary: This fixes PR27617. Bug description: The SLPVectorizer asserts on encountering GEPs with different index types, such as i8 and i64. The patch includes a simple relaxation of the assert to allow constants to be of different types, along with a regression test that will provoke the unrelaxed assert. Reviewers: nadav, mzolotukhin Subscribers: JesperAntonsson, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D20685 Patch by Jesper Antonsson! llvm-svn: 272206
*	The patch set unroll disable pragma when unroll with user specified count has been applied.	Evgeny Stupachenko	2016-06-08	1	-0/+25
	Summary: Previously SetLoopAlreadyUnrolled() set the disable pragma only if there was some loop metadata. Now it sets the pragma in all cases. This helps to prevent repeated unrolling when -unroll-count=N is given. Reviewers: mzolotukhin Differential Revision: http://reviews.llvm.org/D20765 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 272195
*	[MemCpyOpt] Do not exchange llvm.lifetime.start and llvm.memcpy	Tim Shen	2016-06-08	1	-0/+25
	Reviewers: iteratee Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21087 llvm-svn: 272192
*	Use FileCheck instead of grepping for patterns. NFC.	Easwaran Raman	2016-06-07	3	-10/+11
	llvm-svn: 272065
*	Quick fix for the test from rL272014 "[LAA] Improve non-wrapping pointer detection by handling loop-invariant case" (a couple of buildbots failed).	Andrey Turetskiy	2016-06-07	1	-1/+1
	Patch by Roman Shirokiy. llvm-svn: 272019
*	[LAA] Improve non-wrapping pointer detection by handling loop-invariant case.	Andrey Turetskiy	2016-06-07	1	-0/+65
	This fixes PR26314. This patch adds a new helper, “isNoWrap”, that detects the loop-invariant pointer case. Patch by Roman Shirokiy. Ref: https://llvm.org/bugs/show_bug.cgi?id=26314 Differential Revision: http://reviews.llvm.org/D17268 llvm-svn: 272014
*	[InstCombine][AVX2] Add support for simplifying AVX2 per-element shifts to native shifts	Simon Pilgrim	2016-06-07	1	-58/+40
	Unlike native shifts, the AVX2 per-element shift instructions VPSRAV/VPSRLV/VPSLLV handle out of range shift values (logical shifts set the result to zero, arithmetic shifts splat the sign bit). If the shift amount is constant we can sometimes convert these instructions to native shifts:
	1 - if all shift amounts are in range then the conversion is trivial.
	2 - out of range arithmetic shifts can be clamped to the (bitwidth - 1) (a legal shift amount) before conversion.
	3 - logical shifts just return zero if all elements have out of range shift amounts.
	In addition, UNDEF shift amounts are handled - either as an UNDEF shift amount in a native shift or as an UNDEF in the logical 'all out of range' zero constant special case for logical shifts.
	Differential Revision: http://reviews.llvm.org/D19675 llvm-svn: 271996
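The per-lane behaviour described above can be modelled in plain C. This is a sketch for one 32-bit lane, not the InstCombine code; the function names `lane_srlv`, `lane_sllv` and `lane_srav` are made up for illustration.

```c
#include <stdint.h>

/* Logical right shift of one 32-bit lane: out-of-range counts give zero. */
uint32_t lane_srlv(uint32_t x, uint32_t count) {
    return count > 31 ? 0u : x >> count;
}

/* Logical left shift of one 32-bit lane: out-of-range counts give zero. */
uint32_t lane_sllv(uint32_t x, uint32_t count) {
    return count > 31 ? 0u : x << count;
}

/* Arithmetic right shift of one 32-bit lane: out-of-range counts splat the
 * sign bit, which is the same as clamping the count to bitwidth - 1. */
int32_t lane_srav(int32_t x, uint32_t count) {
    if (count > 31)
        count = 31;
    /* Written with ~ so that no negative value is ever shifted. */
    return x < 0 ? ~(~x >> count) : x >> count;
}
```

With a constant count, case 1 corresponds to the in-range branch, case 2 to the clamp in `lane_srav`, and case 3 to the zero result of the logical shifts.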
*	[InstCombine][SSE] Add MOVMSK constant folding (PR27982)	Simon Pilgrim	2016-06-07	1	-0/+184
	This patch adds support for folding undef/zero/constant inputs to MOVMSK instructions. The SSE/AVX versions can be fully folded, but the MMX version can only handle undef inputs. Differential Revision: http://reviews.llvm.org/D20998 llvm-svn: 271990
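As a rough illustration of what folding a constant input means here, the sketch below computes the mask a four-lane MOVMSK would produce from constant 32-bit lane bit patterns: bit i of the result is the sign bit of lane i. The function name `movmsk_ps_const` is hypothetical and the code is a model, not the patch itself.

```c
#include <stdint.h>

/* Collect the sign bit (bit 31) of each of four 32-bit lanes into the low
 * four bits of the result, lane 0 in bit 0. */
unsigned movmsk_ps_const(const uint32_t lanes[4]) {
    unsigned mask = 0;
    for (int i = 0; i < 4; ++i)
        mask |= (unsigned)(lanes[i] >> 31) << i;
    return mask;
}
```

An all-zero input folds to 0, which matches the zero case named in the message.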
*	[InstCombine] scalarizePHI should not assume the code it sees has been CSE'd	Michael Kuperstein	2016-06-06	2	-6/+61
	scalarizePHI only looked for phis that have exactly two uses - the "latch" use, and an extract. Unfortunately, we cannot assume all equivalent extracts are CSE'd, since InstCombine itself may create an extract which is a duplicate of an existing one. This extends it to handle several distinct extracts from the same index. This should fix at least some of the performance regressions from PR27988. Differential Revision: http://reviews.llvm.org/D20983 llvm-svn: 271961
*	[LoopUnrollAnalyzer] Fix a crash in analyzeLoopUnrollCost.	Michael Zolotukhin	2016-06-06	1	-0/+16
	In some cases, when simplifying with SCEV, we might consider pointer values as just usual integer values. Thus, we might get a different type from what we had originally in the map of simplified values, and hence we need to check types before operating on the values. This fixes PR28015. llvm-svn: 271931
*	Reapply [LSR] Create fewer redundant instructions.	Geoff Berry	2016-06-06	2	-0/+81
	Summary: Fix LSRInstance::HoistInsertPosition() to check the original insert position block first for a canonical insertion point that is dominated by all inputs. This leads to SCEV being able to reuse more instructions since it currently tracks the instructions it creates for reuse by keeping a table of <Value, insert point> pairs. Originally reviewed in http://reviews.llvm.org/D18001 Reviewers: atrick Subscribers: llvm-commits, mzolotukhin, mcrosier Differential Revision: http://reviews.llvm.org/D18480 llvm-svn: 271929
*	[InstCombine] limit icmp transform to ConstantInt (PR28011)	Sanjay Patel	2016-06-06	1	-0/+28
	In r271810 ( http://reviews.llvm.org/rL271810 ), I loosened the check above this to work for any Constant rather than ConstantInt. AFAICT, that part makes sense if we can determine that the shrunken/extended constant remained equal. But it doesn't make sense for this later transform where we assume that the constant DID change. This could assert for a ConstantExpr: https://llvm.org/bugs/show_bug.cgi?id=28011 And it could be wrong for a vector as shown in the added regression test. llvm-svn: 271908
*	regenerate checks	Sanjay Patel	2016-06-06	1	-110/+159
	llvm-svn: 271904
*	regenerate checks	Sanjay Patel	2016-06-06	1	-57/+86
	llvm-svn: 271903
*	LICM: Don't sink stores out of loops that may throw.	Eli Friedman	2016-06-05	1	-0/+71
	Summary: This hasn't been caught before because it requires noalias or similarly strong alias analysis to actually reproduce. Fixes http://llvm.org/PR27952 . Reviewers: hfinkel, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20944 llvm-svn: 271858
*	Add safety check to InstCombiner::commonIRemTransforms	Sanjoy Das	2016-06-05	1	-0/+116
	Since FoldOpIntoPhi speculates the binary operation to potentially each of the predecessors of the PHI node (pulling it out of arbitrary control dependence in the process), we can FoldOpIntoPhi only if we know the operation doesn't have UB. This also brings up an interesting profitability question -- the way it is written today, commonIRemTransforms will hoist out work from dynamically dead code into code that will execute at runtime. Perhaps that isn't the best canonicalization? Fixes PR27968. llvm-svn: 271857
*	Add test case for InstCombiner::commonIRemTransforms; NFC	Sanjoy Das	2016-06-05	1	-0/+23
	The PHI case in commonIRemTransforms was untested; add a trivial test case. llvm-svn: 271856
*	[Internalize] Test that __stack_chk_{guard, fail} are not internalized.	Davide Italiano	2016-06-05	1	-0/+9
	r154645 introduced this feature without a test. This should have better coverage now. llvm-svn: 271853
*	[PM] Port IndVarSimplify to the new pass manager	Sanjoy Das	2016-06-05	3	-0/+3
	Summary: There are some rough corners, since the new pass manager doesn't have (as far as I can tell) LoopSimplify and LCSSA, so I've updated the tests to run them separately in the old pass manager in the lit tests. We also don't have an equivalent for AU.setPreservesCFG() in the new pass manager, so I've left a FIXME. Reviewers: bogner, chandlerc, davide Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20783 llvm-svn: 271846
*	[IndVars] Remove -liv-reduce	Sanjoy Das	2016-06-05	1	-56/+0
	It is an off-by-default option that no one seems to use[0], and given that SCEV directly understands the overflow intrinsics there is no real need for it anymore. [0]: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098181.html llvm-svn: 271845
*	fix checks	Sanjay Patel	2016-06-05	1	-3/+3
	update_test_checks.py got confused matching the variable names. llvm-svn: 271844
*	[InstCombine] allow vector icmp bool transforms	Sanjay Patel	2016-06-05	2	-5/+5
	llvm-svn: 271843
*	add tests to show missing vector transforms	Sanjay Patel	2016-06-05	1	-0/+20
	llvm-svn: 271842
*	regenerate checks	Sanjay Patel	2016-06-05	1	-57/+78
	llvm-svn: 271841
*	update test to use FileCheck	Sanjay Patel	2016-06-05	1	-93/+175
	llvm-svn: 271840
*	update test to use FileCheck	Sanjay Patel	2016-06-05	1	-11/+16
	llvm-svn: 271838
*	update test to FileCheck	Sanjay Patel	2016-06-05	1	-8/+12
	llvm-svn: 271837
*	[PM] Port GCOVProfiler pass to the new pass manager	Xinliang David Li	2016-06-05	7	-0/+33
	llvm-svn: 271823
*	[SimplifyCFG] Don't kill empty cleanuppads with multiple uses	David Majnemer	2016-06-04	1	-0/+24
	A basic block could contain:
	  %cp = cleanuppad []
	  cleanupret from %cp unwind to caller
	This basic block is empty and is thus a candidate for removal. However, there can be other uses of %cp outside of this basic block. This is only possible in unreachable blocks. Make our transform more correct by checking that the pad has a single user before removing the BB. This fixes PR28005. llvm-svn: 271816
*	[InstCombine] allow vector constants for cast+icmp fold	Sanjay Patel	2016-06-04	2	-8/+12
	This is step 1 of unknown towards fixing PR28001: https://llvm.org/bugs/show_bug.cgi?id=28001 llvm-svn: 271810
*	[InstCombine] add test for missing vector optimization	Sanjay Patel	2016-06-04	1	-2/+15
	llvm-svn: 271808
*	[InstCombine] add test for missing vector optimization	Sanjay Patel	2016-06-04	1	-2/+16
	llvm-svn: 271806
*	[InstCombine] minimize test case and use FileCheck	Sanjay Patel	2016-06-04	1	-27/+13
	llvm-svn: 271805
*	[Analysis] Enabled BITREVERSE as a vectorizable intrinsic	Simon Pilgrim	2016-06-04	1	-288/+633
	Allows XOP to vectorize BITREVERSE - other targets will follow as their cost models improve. llvm-svn: 271803
*	[InstCombine][MMX] Extend SimplifyDemandedUseBits MOVMSK support to MMX	Simon Pilgrim	2016-06-04	1	-0/+20
	Add the MMX implementation to the SimplifyDemandedUseBits SSE/AVX MOVMSK support added in D19614. Requires a minor tweak as llvm.x86.mmx.pmovmskb takes an x86_mmx argument - so we have to be explicit about the implied v8i8 vector type. llvm-svn: 271789
*	[InstCombine] look through bitcasts to find selects	Sanjay Patel	2016-06-03	1	-29/+17
	There was concern that creating bitcasts for the simpler potential select pattern:
	  define <2 x i64> @vecBitcastOp1(<4 x i1> %cmp, <2 x i64> %a) {
	    %a2 = add <2 x i64> %a, %a
	    %sext = sext <4 x i1> %cmp to <4 x i32>
	    %bc = bitcast <4 x i32> %sext to <2 x i64>
	    %and = and <2 x i64> %a2, %bc
	    ret <2 x i64> %and
	  }
	might lead to worse code for some targets, so this patch is matching the larger patterns seen in the test cases.
	The motivating example for this patch is this IR produced via SSE intrinsics in C:
	  define <2 x i64> @gibson(<2 x i64> %a, <2 x i64> %b) {
	    %t0 = bitcast <2 x i64> %a to <4 x i32>
	    %t1 = bitcast <2 x i64> %b to <4 x i32>
	    %cmp = icmp sgt <4 x i32> %t0, %t1
	    %sext = sext <4 x i1> %cmp to <4 x i32>
	    %t2 = bitcast <4 x i32> %sext to <2 x i64>
	    %and = and <2 x i64> %t2, %a
	    %neg = xor <4 x i32> %sext, <i32 -1, i32 -1, i32 -1, i32 -1>
	    %neg2 = bitcast <4 x i32> %neg to <2 x i64>
	    %and2 = and <2 x i64> %neg2, %b
	    %or = or <2 x i64> %and, %and2
	    ret <2 x i64> %or
	  }
	For an AVX target, this is currently:
	  vpcmpgtd %xmm1, %xmm0, %xmm2
	  vpand %xmm0, %xmm2, %xmm0
	  vpandn %xmm1, %xmm2, %xmm1
	  vpor %xmm1, %xmm0, %xmm0
	  retq
	With this patch, it becomes:
	  vpmaxsd %xmm1, %xmm0, %xmm0
	Differential Revision: http://reviews.llvm.org/D20774 llvm-svn: 271676
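The reason the and/andn/or sequence can become a single signed max is easier to see for one 32-bit lane. The sketch below is an illustration under that single-lane assumption; `blend_by_mask` and `smax32` are hypothetical names, not functions from the patch.

```c
#include <stdint.h>

/* The pattern from the message, for one lane:
 * mask = sext(a > b), result = (mask & a) | (~mask & b). */
int32_t blend_by_mask(int32_t a, int32_t b) {
    int32_t mask = -(int32_t)(a > b);   /* 0 or -1 (all bits set) */
    return (mask & a) | (~mask & b);
}

/* What the select form computes directly: a signed maximum. */
int32_t smax32(int32_t a, int32_t b) {
    return a > b ? a : b;
}
```

Both functions return the same value for every input pair, which is the equivalence the single `vpmaxsd` relies on.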
*	[InstCombine] change tests to show a more obvious transform possibility	Sanjay Patel	2016-06-02	1	-63/+62
	The original tests were intended to show a missing transform that would be solved by D20774: http://reviews.llvm.org/D20774 But it's not clear that the transform for the simpler tests is a win for all targets. Make the tests show a larger pattern that should be a win regardless of the cost of bitcast instructions. llvm-svn: 271603
*	transform obscured FP sign bit ops into a fabs/fneg using TLI hook	Sanjay Patel	2016-06-02	2	-67/+0
	This is effectively a revert of:
	  http://reviews.llvm.org/rL249702 - [InstCombine] transform masking off of an FP sign bit into a fabs() intrinsic call (PR24886)
	and:
	  http://reviews.llvm.org/rL249701 - [ValueTracking] teach computeKnownBits that a fabs() clears sign bits
	and a reimplementation as a DAG combine for targets that have IEEE754-compliant fabs/fneg instructions.
	This is intended to resolve the objections raised on the dev list: http://lists.llvm.org/pipermail/llvm-dev/2016-April/098154.html and: https://llvm.org/bugs/show_bug.cgi?id=24886#c4
	In the interest of patch minimalism, I've only partly enabled AArch64. PowerPC, MIPS, x86 and others can enable later. Differential Revision: http://reviews.llvm.org/D19391 llvm-svn: 271573
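For readers unfamiliar with the "obscured sign bit ops" in the title, the sketch below shows the source-level idiom: integer masking or flipping of bit 31 of a float's bit pattern. It assumes `float` is IEEE-754 binary32; the function names `abs_via_mask` and `neg_via_xor` are invented for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Clear the sign bit through an integer AND: the result equals fabs(x). */
float abs_via_mask(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* type-pun without aliasing issues */
    bits &= 0x7FFFFFFFu;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Flip the sign bit through an integer XOR: the result equals -x. */
float neg_via_xor(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits ^= 0x80000000u;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```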
*	[profile] value profiling bug fix -- missing icall targets in profile-use	Xinliang David Li	2016-06-02	1	-0/+11
	Inline virtual functions have linkonce_odr linkage (emitted in comdat on supporting targets). If the vtable for the class is not emitted in the defining module, the function won't be address-taken, thus its address is not recorded. At the mercy of the linker, if the per-func prf_data from this module (in comdat) is picked at link time, we will lose the mapping from function address to its hash val. This leads to missing icall promotion. The second test case (currently disabled) in compiler_rt (r271528): instrprof-icall-prom.test demonstrates the bug. The first profile-use subtest is fine due to linker order difference. With this change, no missing icall targets are found in instrumented clang's raw profile. llvm-svn: 271532
*	make icall pass name consistent /NFC	Xinliang David Li	2016-06-02	2	-4/+4
	llvm-svn: 271467
*	[MemorySSA] Port to new pass manager	Geoff Berry	2016-06-01	16	-16/+32
	Add support for the new pass manager to MemorySSA pass. Change MemorySSA to be computed eagerly upon construction. Change MemorySSAWalker to be owned by the MemorySSA object that creates it. Reviewers: dberlin, george.burgess.iv Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19664 llvm-svn: 271432
*	Revert "Claim NoAlias if two GEPs index different fields of the same struct"	Daniel Berlin	2016-06-01	1	-77/+81
	This reverts commit 2d5d6493f43eb68493a3852b8c226ac9fafdc7eb. llvm-svn: 271422
*	Claim NoAlias if two GEPs index different fields of the same struct	Daniel Berlin	2016-06-01	1	-81/+77
	Patch by Taewook Oh Summary: Patch for Bug 27478. Make BasicAliasAnalysis claim NoAlias if two GEPs index different fields of the same structure. Reviewers: hfinkel, dberlin Subscribers: dberlin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20665 llvm-svn: 271415
*	[LV] For some IVs, use vector phis instead of widening in the loop body	Michael Kuperstein	2016-06-01	8	-20/+85
	Previously, whenever we needed a vector IV, we would create it on the fly, by splatting the scalar IV and adding a step vector. Instead, we can create a real vector IV. This tends to save a couple of instructions per iteration. This only changes the behavior for the most basic case - integer primary IVs with a constant step. Differential Revision: http://reviews.llvm.org/D20315 llvm-svn: 271410
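The two strategies can be contrasted with a scalar model of a four-lane vector. This is only a sketch: `VF`, `iv_by_widening` and `iv_by_vector_phi` are names made up for the example, and each inner loop over lanes stands in for what would be a single vector instruction.

```c
#define VF 4  /* vectorization factor, assumed for illustration */

/* Old scheme: in every iteration, splat the scalar IV and add the step
 * vector <0, 1, 2, 3> * step. */
void iv_by_widening(int start, int step, int iters, int out[][VF]) {
    int scalar_iv = start;
    for (int it = 0; it < iters; ++it) {
        for (int lane = 0; lane < VF; ++lane)
            out[it][lane] = scalar_iv + lane * step;
        scalar_iv += VF * step;
    }
}

/* New scheme: build the vector IV once before the loop, then advance it
 * with one vector add per iteration (the "vector phi"). */
void iv_by_vector_phi(int start, int step, int iters, int out[][VF]) {
    int vec_iv[VF];
    for (int lane = 0; lane < VF; ++lane)
        vec_iv[lane] = start + lane * step;
    for (int it = 0; it < iters; ++it) {
        for (int lane = 0; lane < VF; ++lane) {
            out[it][lane] = vec_iv[lane];
            vec_iv[lane] += VF * step;
        }
    }
}
```

Both functions fill `out` with identical values; the second does the splat and step-vector add once instead of on every iteration.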
*	[SLP] Pass in correct alignment when query memory access cost	Guozhi Wei	2016-05-31	2	-0/+31
	This patch fixes bug https://llvm.org/bugs/show_bug.cgi?id=27897. When querying memory access cost, SLP currently always passes in an alignment value of 1 (unaligned), so it gets a very high cost for scalar memory access and wrongly vectorizes memory loads in the test case. It can be fixed by simply passing in the correct alignment. llvm-svn: 271333
*	Fix a crash in MergeFunctions related to ordering of weak/strong functions	Erik Eckstein	2016-05-31	1	-0/+47
	The assumption made in insert(), that weak functions are always inserted after strong functions, is only true in the first round of adding functions. In subsequent rounds this is no longer guaranteed, because we might remove a strong function from the tree (because it's modified) and add it later, where an equivalent weak function already exists in the tree. This change removes the assert in insert() and explicitly enforces a weak->strong order. This also removes the need for two separate loops in runOnModule(). llvm-svn: 271299
*	[IndVars] Eliminate op.with.overflow when possible (re-apply)	Sanjoy Das	2016-05-29	1	-0/+137
	Summary: If we can prove that an op.with.overflow intrinsic does not overflow, we can get rid of the intrinsic, and replace it with non-wrapping arithmetic. This was first checked in at r265913 but reverted in r265950 because it exposed some issues around how SCEV handled post-inc add recurrences. Those issues have now been fixed. Reviewers: atrick, regehr Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18685 llvm-svn: 271153
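A small source-level example of a check that is provably dead, written as a sketch rather than taken from the patch's tests. It assumes a GCC- or Clang-compatible compiler, since `__builtin_add_overflow` is a compiler builtin and not standard C; the function name `count_up` is hypothetical.

```c
/* Counts from 0 up to n. Inside the loop i < n <= INT_MAX holds, so
 * i + 1 can never overflow and the overflow branch is unreachable. */
int count_up(int n) {
    int i = 0;
    while (i < n) {
        int next;
        if (__builtin_add_overflow(i, 1, &next))
            return -1;          /* never taken: i < n implies i < INT_MAX */
        i = next;
    }
    return i;
}
```

Once the loop guard proves that the addition cannot overflow, the checked add can be replaced with a plain non-wrapping add and the early return removed.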
*	[SCEV] Don't always add no-wrap flags to post-inc add recs	Sanjoy Das	2016-05-29	1	-1/+1
	Fixes PR27315. The post-inc version of an add recurrence needs to "follow the same rules" as a normal add or subtract expression. Otherwise we miscompile programs like
	```
	int main() {
	  int a = 0;
	  unsigned a_u = 0;
	  volatile long last_value;
	  do {
	    a_u += 3;
	    last_value = (long) ((int) a_u);
	    if (will_add_overflow(a, 3)) {
	      // Leave, and don't actually do the increment, so no UB.
	      printf("last_value = %ld\n", last_value);
	      exit(0);
	    }
	    a += 3;
	  } while (a != 46);
	  return 0;
	}
	```
	This patch changes SCEV to put no-wrap flags on post-inc add recurrences only when the poison from a potential overflow will go ahead to cause undefined behavior. To avoid regressing performance too much, I've assumed infinite loops without side effects are undefined behavior to prove poison<->UB equivalence in more cases. This isn't ideal, but is not new to LLVM as a whole, and far better than the situation I'm trying to fix. llvm-svn: 271151