bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Reformat test case to be easier to migrate to typeless pointers.	David Blaikie	2015-02-15	1	-1/+4
\| \| \| \|	llvm-svn: 229275
*	InstCombine: propagate deref via new addDereferenceableAttr	Ramkumar Ramachandra	2015-02-14	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The "dereferenceable" attribute cannot be added via .addAttribute(), since it also expects a size in bytes. AttrBuilder#addAttribute or AttributeSet#addAttribute is wrapped by classes Function, InvokeInst, and CallInst. Add corresponding wrappers to AttrBuilder#addDereferenceableAttr. Having done this, propagate the dereferenceable attribute via gc.relocate, adding a test to exercise it. Note that -datalayout is required during execution over and above -instcombine, because InstCombine only optionally requires DataLayoutPass. Differential Revision: http://reviews.llvm.org/D7510 llvm-svn: 229265
*	[InstCombine] When canonicalizing gep indices, prefer zext when possible	Philip Reames	2015-02-14	1	-0/+61
\| \| \| \| \| \| \| \| \| \|	If we know that the sign bit of a value being sign extended is zero, we can use a zero extension instead. This is motivated by the fact that zero extensions are generally cheaper on x86 (and most other architectures?). We already apply a similar transform in DAGCombine, this just extends that to the IR level. This comes up when we eagerly canonicalize gep indices to the width of a machine register (i64 on x86_64). To do so, we insert sign extensions (sext) to promote smaller types. Differential Revision: http://reviews.llvm.org/D7255 llvm-svn: 229189
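The claim that a sign extension can become a zero extension when the sign bit is known to be zero can be checked on plain bit patterns. The following is a minimal sketch, not LLVM code; the helper names `sext` and `zext` are chosen here for illustration and model the IR instructions on Python integers.

```python
def sext(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend the low `from_bits` of value to a `to_bits`-wide bit pattern."""
    value &= (1 << from_bits) - 1
    sign = 1 << (from_bits - 1)
    signed = (value ^ sign) - sign          # reinterpret the bits as a signed integer
    return signed & ((1 << to_bits) - 1)    # wrap back into the wider bit pattern


def zext(value: int, from_bits: int, to_bits: int) -> int:
    """Zero-extend the low `from_bits` of value to a `to_bits`-wide bit pattern."""
    return value & ((1 << from_bits) - 1) & ((1 << to_bits) - 1)


# Sign bit clear: both extensions produce the same 64-bit pattern.
print(sext(0x7FFFFFFF, 32, 64) == zext(0x7FFFFFFF, 32, 64))
# Sign bit set: the results differ, so the rewrite would be invalid.
print(sext(0x80000000, 32, 64) == zext(0x80000000, 32, 64))
```

The two extensions only diverge when bit 31 of the narrow value is set, which is why the transform needs the sign bit to be known zero.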
*	[InstCombine] Fix regression introduced at r227197.	Andrea Di Biagio	2015-02-13	1	-0/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes a problem I accidentally introduced in an instruction combine on select instructions added at r227197. That revision taught the instruction combiner how to fold a cttz/ctlz followed by an icmp plus select into a single cttz/ctlz with flag 'is_zero_undef' cleared. However, the new rule added at r227197 would have produced wrong results in the case where a cttz/ctlz with flag 'is_zero_undef' cleared was followed by a zero-extend or truncate. In that case, the folded instruction would have been inserted in a wrong location thus leaving the CFG in an inconsistent state. This patch fixes the problem and adds two reproducible test cases to existing test 'InstCombine/select-cmp-cttz-ctlz.ll'. llvm-svn: 229124
*	[CodeGenPrepare] Removed duplicate logic. SimplifyCFG already knows how to ↵	Andrea Di Biagio	2015-02-13	4	-0/+298
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	speculate calls to cttz/ctlz. SimplifyCFG now knows how to speculate calls to intrinsic cttz/ctlz that are 'cheap' for the target. Therefore, some of the logic in CodeGenPrepare that was originally added at revision 224899 can now be removed. This patch is basically a no functional change. It removes the duplicated logic in CodeGenPrepare and converts all the existing target specific tests for cttz/ctlz into SimplifyCFG tests. Differential Revision: http://reviews.llvm.org/D7608 llvm-svn: 229105
*	[SimplifyCFG] Add test for r229099	James Molloy	2015-02-13	1	-0/+22
\| \| \| \| \| \|	Add extra test that was accidentally not staged. llvm-svn: 229101
*	[unroll] Concede defeat and disable the unroll analyzer for now.	Chandler Carruth	2015-02-13	1	-4/+4
\| \| \| \| \| \| \| \| \| \|	The issues with the new unroll analyzer are more fundamental than code cleanup, algorithm, or data structure changes. I've sent an email to the original commit thread with details and a proposal for how to redesign things. I'm disabling this for now so that we don't spend time debugging issues with it in its current state. llvm-svn: 229064
*	[InstCombine] Fix a bug when combining `icmp` from `ptrtoint`	Michael Liao	2015-02-13	1	-1/+22
\| \| \| \| \| \| \| \| \| \| \| \|	- First, there's a crash when we try to combine those pointers into `icmp` directly by creating a `bitcast`, which is invalid if the two pointers are from different address spaces. - It's not always appropriate to cast one pointer to another if they are from different address spaces, as that is not a no-op cast. Instead, we only combine `icmp` from `ptrtoint` if the two pointers are of the same address space. llvm-svn: 229063
*	[IC] Fix a bug with the instcombine canonicalizing of loads and	Chandler Carruth	2015-02-13	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	propagating of metadata. We were propagating !nonnull metadata even when the newly formed load is no longer of a pointer type. This is clearly broken and results in LLVM failing the verifier and aborting. This patch just restricts the propagation of !nonnull metadata to when we actually have a pointer type. This bug report and the initial version of this patch were provided by Charles Davis! Many thanks for finding this! We still need to add logic to round-trip the metadata correctly if we combine from pointer types to integer types and then back by using range metadata for the integer type loads. But this is the minimal and safe version of the patch, which is important so we can backport it into 3.6. llvm-svn: 229029
*	Check interleaving without relying on debug output.	Olivier Sallenave	2015-02-13	1	-3/+14
\| \| \| \|	llvm-svn: 229027
*	Testcase for r228988.	Michael Zolotukhin	2015-02-13	1	-0/+3
\| \| \| \|	llvm-svn: 228995
*	llvm/test/Transforms/LoopVectorize/PowerPC/small-loop-rdx.ll REQUIRES ↵	NAKAMURA Takumi	2015-02-13	1	-0/+1
\| \| \| \| \| \|	+Asserts due to -debug. llvm-svn: 228989
*	Change max interleave factor to 12 for POWER7 and POWER8.	Olivier Sallenave	2015-02-12	1	-0/+35
\| \| \| \|	llvm-svn: 228973
*	Fix a crash in the assumption cache when inlining indirect function calls	Bjorn Steinbrink	2015-02-12	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Instances of the AssumptionCache are per function, so we can't re-use the same AssumptionCache instance when recursing in the CallAnalyzer to analyze a different function. Instead we have to pass the AssumptionCacheTracker to the CallAnalyzer so it can get the right AssumptionCache on demand. Reviewers: hfinkel Subscribers: llvm-commits, hans Differential Revision: http://reviews.llvm.org/D7533 llvm-svn: 228957
*	Update test case.	Benjamin Kramer	2015-02-12	1	-2/+2
\| \| \| \|	llvm-svn: 228956
*	InstCombine: Allow folding of xor into icmp by changing the predicate for ↵	Benjamin Kramer	2015-02-12	1	-0/+6
\| \| \| \| \| \| \| \|	vectors The loop vectorizer can create this pattern. llvm-svn: 228954
*	Add a testcase for r228432.	Michael Zolotukhin	2015-02-12	1	-0/+34
\| \| \| \|	llvm-svn: 228951
*	[LoopRerolling] Be more forgiving with instruction order.	James Molloy	2015-02-12	1	-0/+57
\| \| \| \| \| \| \| \| \|	We can't solve the full subgraph isomorphism problem. But we can allow obvious cases, where for example two instructions of different types are out of order. Due to them having different types/opcodes, there is no ambiguity. llvm-svn: 228931
*	[TTI] Teach the cost heuristic how to query TLI to check if a zext/trunc is ↵	Andrea Di Biagio	2015-02-12	1	-0/+189
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	'free' for the target. Now that SimplifyCFG uses TTI for the cost heuristic, we can teach BasicTTIImpl how to query TLI in order to get a more accurate cost for truncates and zero-extends. Before this patch, the basic cost heuristic in TargetTransformInfoImplCRTPBase would have conservatively returned a 'default' TCC_Basic for all zero-extends, and TCC_Free for truncates on native types. This patch improves the heuristic so that we query TLI (if available) to get more accurate answers. If TLI is available, then methods 'isZExtFree' and 'isTruncateFree' can be used to check if a zext/trunc is free for the target. Added more test cases to SimplifyCFG/X86/speculate-cttz-ctlz.ll. With this change, SimplifyCFG is now able to speculate a 'cheap' cttz/ctlz immediately followed by a free zext/trunc. Differential Revision: http://reviews.llvm.org/D7585 llvm-svn: 228923
*	[slp] Fix a nasty bug in the SLP vectorizer that Joerg pointed out.	Chandler Carruth	2015-02-12	1	-0/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Apparently some code finally started to tickle this after my canonicalization changes to instcombine. The bug stems from trying to form a vector type out of scalars that aren't compatible at all. In this example, from x86_mmx values. The code in the vectorizer that checks for reasonable types was checking for aggregates or vectors, but there are lots of other types that should just never reach the vectorizer. Debugging this was made more confusing by the lie in an assert in VectorType::get() -- it isn't that the types are primitive. The types must be integer, pointer, or floating point types. No other types are allowed. I've improved the assert and added a helper to the vectorizer to handle the element type validity checks. It now re-uses the VectorType static function and then further excludes weird target-specific types that we probably shouldn't be touching here (x86_fp80 and ppc_fp128). Neither of these are really reachable anyways (neither 80-bit nor 128-bit things will get vectorized) but it seems better to just eagerly exclude such nonsense. I've added a test case, but while it definitely covers two of the paths through this code there may be more paths that would benefit from test coverage. I'm not familiar enough with the SLP vectorizer to synthesize test cases for all of these, but was able to update the code itself by inspection. llvm-svn: 228899
*	DeadArgElim: aggregate Return assessment properly.	Tim Northover	2015-02-11	1	-0/+30
\| \| \| \| \| \| \| \| \|	I mistakenly thought the liveness of each "RetVal(F, i)" depended only on F. It actually depends on the index too, which means we need to be careful about how the results are combined before return. In particular if a single Use returns Live, that counts for the entire object, at the granularity we're considering. llvm-svn: 228885
*	Reassociate: cannot negate a INT_MIN value	Mehdi Amini	2015-02-11	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When trying to canonicalize negative constants out of multiplication expressions, we need to check that the constant is not INT_MIN which cannot be negated. Reviewers: mcrosier Reviewed By: mcrosier Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7286 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 228872
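Why INT_MIN cannot be negated follows from two's-complement wraparound. The sketch below models 32-bit negation on Python integers; `neg_wrapped` is a hypothetical helper written for this illustration, not part of Reassociate.

```python
def neg_wrapped(value: int, bits: int = 32) -> int:
    """Two's-complement negation, with the result read back as a signed `bits`-wide int."""
    mask = (1 << bits) - 1
    raw = (-value) & mask                   # negate, then wrap to `bits` bits
    # Reinterpret the wrapped pattern as signed: top bit set means negative.
    return raw - (1 << bits) if raw >> (bits - 1) else raw


INT_MIN = -(1 << 31)

print(neg_wrapped(5))                   # ordinary values negate as expected
print(neg_wrapped(INT_MIN) == INT_MIN)  # INT_MIN maps back onto itself
```

Because +2^31 is not representable in 32 signed bits, the negation wraps to INT_MIN again, so a canonicalization that relies on flipping the constant's sign has to skip this value.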
*	[TTI] Improved cost heuristic for cttz/ctlz calls.	Andrea Di Biagio	2015-02-11	2	-16/+141
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch is a follow-up of r228826 (see code-review: D7506). Now that SimplifyCFG uses TargetTransformInfo for cost analysis, we have to fix the cost heuristic for intrinsic calls to cttz/ctlz. This patch defines method 'getIntrinsicCost' in BasicTTIImpl: now, BasicTTIImpl queries TLI to check if a call to cttz/ctlz is cheap for the target. Added test cases in Transforms/SimplifyCFG/X86 to verify that on x86, SimplifyCFG only speculates a call to cttz/ctlz if it is cheap. Differential Revision: http://reviews.llvm.org/D7554 llvm-svn: 228829
*	[SimplifyCFG] Swap to using TargetTransformInfo for cost	James Molloy	2015-02-11	3	-32/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	analysis. We're already using TTI in SimplifyCFG, so remove the hard-baked "cheapness" heuristic and use TTI directly. Generally NFC intended, but we're using a slightly different heuristic now so there is a slight test churn. Test changes: * combine-comparisons-by-cse.ll: Removed unneeded branch check. * 2014-08-04-muls-it.ll: Test now doesn't branch but emits muleq. * coalesce-subregs.ll: Superfluous block check. * 2008-01-02-hoist-fp-add.ll: fadd is safe to speculate. Change to udiv. * PhiBlockMerge.ll: Superfluous CFG checking code. Main checks still present. * select-gep.ll: A variable GEP is not expensive, just TCC_Basic, according to the TTI. llvm-svn: 228826
*	[LoopReroll] Introduce the concept of DAGRootSets.	James Molloy	2015-02-11	1	-0/+167
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A DAGRootSet models an induction variable being used in a rerollable loop. For example: x[i*3+0] = y1; x[i*3+1] = y2; x[i*3+2] = y3. Here the base instruction is i*3: ST[y1] uses it directly, and the roots +1 and +2 feed ST[y2] and ST[y3] respectively. There may be multiple DAGRootSets, for example: x[i*2+0] = ... (1); x[i*2+1] = ... (1); x[i*2+4] = ... (2); x[i*2+5] = ... (2); x[(i+1234)*2+5678] = ... (3); x[(i+1234)*2+5679] = ... (3). This concept is similar to the "Scale" member used previously, but allows multiple independent sets of roots based off the same induction variable. llvm-svn: 228821
*	Fix invalid LLVM IR in PruneEH tests	Reid Kleckner	2015-02-11	2	-0/+9
\| \| \| \|	llvm-svn: 228786
*	Don't promote asynch EH invokes of nounwind functions to calls	Reid Kleckner	2015-02-11	4	-15/+66
\| \| \| \| \| \| \| \| \| \| \|	If the landingpad of the invoke is using a personality function that catches asynch exceptions, then it can catch a trap. Also add some landingpads to invalid LLVM IR test cases that lack them. Over-the-shoulder reviewed by David Majnemer. llvm-svn: 228782
*	EarlyCSE: Add check lines for test added in r228760	David Majnemer	2015-02-10	1	-0/+3
\| \| \| \|	llvm-svn: 228761
*	EarlyCSE: It isn't safe to CSE across synchronization boundaries	David Majnemer	2015-02-10	1	-1/+7
\| \| \| \| \| \|	This fixes PR22514. llvm-svn: 228760
*	DeadArgElim: arguments affect all returned sub-values by default.	Tim Northover	2015-02-10	1	-0/+17
\| \| \| \| \| \| \| \| \| \|	Unless we meet an insertvalue on a path from some value to a return, that value will be live if any of the return's components are live, so all of those components must be added to the MaybeLiveUses. Previously we were deleting arguments if sub-value 0 turned out to be dead. llvm-svn: 228731
*	Add a test case for new unrolling heuristics.	Michael Zolotukhin	2015-02-10	1	-0/+59
\| \| \| \| \| \|	The heuristics were added in r228265 and r228434. llvm-svn: 228713
*	Revert r228556: InstCombine: propagate nonNull through assume	Chandler Carruth	2015-02-10	1	-37/+0
\| \| \| \| \| \| \| \| \|	This commit isn't using the correct context, and is transforming calls that are operands to loads rather than calls that are operands to an icmp feeding into an assume. I've replied on the original review thread with a very reduced test case and some thoughts on how to rework this. llvm-svn: 228677
*	PlaceSafepoints: modernize gc.result.* -> gc.result	Ramkumar Ramachandra	2015-02-09	1	-1/+16
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D7516 llvm-svn: 228625
*	Introduce more tests for PlaceSafepoints	Philip Reames	2015-02-09	3	-0/+157
\| \| \| \| \| \|	These test the two optimizations for backedge insertion currently implemented and the split backedge flag, which is currently off by default. llvm-svn: 228617
*	Minor test cleanup	Philip Reames	2015-02-09	1	-5/+4
\| \| \| \| \| \| \|	a) add gc attribute b) remove unused param llvm-svn: 228612
*	Add basic tests for PlaceSafepoints	Philip Reames	2015-02-09	1	-0/+72
\| \| \| \| \| \|	This is just adding really simple tests which should have been part of the original submission. When doing so, I discovered that I'd mistakenly removed required pieces when preparing the patch for upstream submission. I fixed two such bugs in this submission. llvm-svn: 228610
*	DeadArgElim: fix mismatch in accounting of array return types.	Tim Northover	2015-02-09	1	-1/+45
\| \| \| \| \| \| \| \| \| \| \| \|	Some parts of DeadArgElim were only considering the individual fields of StructTypes separately, but others (where insertvalue & extractvalue instructions occur) also looked into ArrayTypes. This one is an actual bug; the mismatch can lead to an argument being considered used by a return sub-value that isn't being tracked (and hence is dead by default). It then gets incorrectly eliminated. llvm-svn: 228559
*	DeadArgElim: assess uses of entire return value aggregate.	Tim Northover	2015-02-09	1	-0/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, a non-extractvalue use of an aggregate return value meant the entire return was considered live (the algorithm gave up entirely). This was correct, but conservative. It's better to actually look at that Use, making the analysis results apply to all sub-values under consideration. E.g. %val = call { i32, i32 } @whatever() [...] ret { i32, i32 } %val The return is using the entire aggregate (sub-values 0 and 1). We can still simplify @whatever if we can prove that this return is itself unused. Also unifies the logic slightly between aggregate and non-aggregate cases. llvm-svn: 228558
*	InstCombine: propagate nonNull through assume	Ramkumar Ramachandra	2015-02-09	1	-0/+37
\| \| \| \| \| \| \| \| \|	Make assume (load (call\|invoke) != null) set nonNull return attribute for the call and invoke. Also include tests. Differential Revision: http://reviews.llvm.org/D7107 llvm-svn: 228556
*	Correctly combine alias.scope metadata by a union instead of intersecting	Bjorn Steinbrink	2015-02-08	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The alias.scope metadata represents sets of things an instruction might alias with. When generically combining the metadata from two instructions the result must be the union of the original sets, because the new instruction might alias with anything any of the original instructions aliased with. Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7490 llvm-svn: 228525
*	ValueTracking: Make isBytewiseValue simpler and more powerful at the same time.	Benjamin Kramer	2015-02-07	1	-0/+15
\| \| \| \| \| \| \|	Turns out there is a simpler way of checking that all bytes in a word are equal than binary decomposition. llvm-svn: 228503
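The log does not say which check replaced the binary decomposition. One well-known candidate, assumed here purely for illustration, is to rotate the word by one byte and compare: the rotation maps every byte onto its neighbour, so the word is unchanged exactly when all bytes are equal. `rotl` and `bytewise_value` are hypothetical helpers on Python integers, not the ValueTracking API.

```python
def rotl(value: int, bits: int, amount: int) -> int:
    """Rotate a `bits`-wide value left by `amount` bits."""
    mask = (1 << bits) - 1
    value &= mask
    amount %= bits
    return ((value << amount) | (value >> (bits - amount))) & mask


def bytewise_value(value: int, bits: int):
    """Return the repeated byte if every byte of the word is equal, else None."""
    if bits == 0 or bits % 8 != 0:
        return None                         # only whole-byte widths are considered
    value &= (1 << bits) - 1
    if rotl(value, bits, 8) != value:
        return None                         # some neighbouring bytes differ
    return value & 0xFF


print(bytewise_value(0xABABABAB, 32))
print(bytewise_value(0xABCDABCD, 32))
```

Unlike halving the word repeatedly, this check does not need the width to be a power of two; any multiple of 8 bits works.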
*	Properly update AA metadata when performing call slot optimization	Bjorn Steinbrink	2015-02-07	1	-0/+22
\| \| \| \| \| \| \| \|	Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7482 llvm-svn: 228500
*	InstCombine: Combine select sequences into a single select	Matthias Braun	2015-02-06	1	-0/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Normalize select(C0, select(C1, a, b), b) -> select((C0 & C1), a, b) select(C0, a, select(C1, a, b)) -> select((C0 \| C1), a, b) This normal form may enable further combines on the And/Or and shortens paths for the values. Many targets prefer the other but can go back easily in CodeGen. Differential Revision: http://reviews.llvm.org/D7399 llvm-svn: 228409
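Both normalizations are truth-table identities, which a brute-force check over the two conditions makes explicit. This is a minimal sketch with a hypothetical `select` helper standing in for the IR instruction; it says nothing about how InstCombine matches the pattern.

```python
from itertools import product


def select(cond: bool, if_true, if_false):
    """Model of the IR select instruction on Python values."""
    return if_true if cond else if_false


def identities_hold() -> bool:
    """Check both rewrites for every combination of the two conditions."""
    a, b = "a", "b"                          # any two distinguishable values
    for c0, c1 in product([False, True], repeat=2):
        # select(C0, select(C1, a, b), b) -> select(C0 & C1, a, b)
        if select(c0, select(c1, a, b), b) != select(c0 and c1, a, b):
            return False
        # select(C0, a, select(C1, a, b)) -> select(C0 | C1, a, b)
        if select(c0, a, select(c1, a, b)) != select(c0 or c1, a, b):
            return False
    return True


print(identities_hold())
```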
*	Teach isDereferenceablePointer() to look through bitcast constant expressions.	Michael Kuperstein	2015-02-05	1	-0/+46
\| \| \| \| \| \| \| \|	This fixes a LICM regression due to the new load+store pair canonicalization. Differential Revision: http://reviews.llvm.org/D7411 llvm-svn: 228284
*	Value soft float calls as more expensive in the inliner.	Cameron Esfahani	2015-02-05	1	-0/+136
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When evaluating floating point instructions in the inliner, ask the TTI whether it is an expensive operation. By default, it's not an expensive operation. This keeps the default behavior the same as before. The ARM TTI has been updated to return back TCC_Expensive for targets which don't have hardware floating point. Reviewers: chandlerc, echristo Reviewed By: echristo Subscribers: t.p.northover, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D6936 llvm-svn: 228263
*	StructurizeCFG: Use a reverse post-order traversal	Tom Stellard	2015-02-04	3	-9/+189
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We were previously doing a post-order traversal and operating on the list in reverse, however this would occasionally cause backedges for loops to be visited before some of the other blocks in the loop. We now use a reverse post-order traversal, which avoids this issue. The reverse post-order traversal is not completely ideal, so we need to manually fix up the list to ensure that inner loop backedges are visited before outer loop backedges. llvm-svn: 228186
*	Reverting VLD1/VST1 base-updating/post-incrementing combining	Renato Golin	2015-02-04	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	This reverts patches 223862, 224198, 224203, and 224754, which were all related to the vector load/store combining and were reverted/reapplied a few times due to the same alignment problems we're seeing now. Further tests, mainly self-hosting Clang, will be needed to reapply this patch in the future. llvm-svn: 228129
*	Allow PRE to insert no-cost phi nodes	Daniel Berlin	2015-02-03	1	-0/+31
\| \| \| \|	llvm-svn: 228024
*	Add straight-line strength reduction to LLVM	Jingyue Wu	2015-02-03	1	-0/+119
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Straight-line strength reduction (SLSR) is implemented in GCC but not yet in LLVM. It has proven to effectively simplify statements derived from an unrolled loop, and can potentially benefit many other cases too. For example, LLVM unrolls #pragma unroll for (int i = 0; i < 3; ++i) { sum += foo((b + i) * s); } into sum += foo(b * s); sum += foo((b + 1) * s); sum += foo((b + 2) * s); However, no optimizations yet reduce the internal redundancy of the three expressions: b * s (b + 1) * s (b + 2) * s With SLSR, LLVM can optimize these three expressions into: t1 = b * s t2 = t1 + s t3 = t2 + s This commit is only an initial step towards implementing a series of such optimizations. I will implement more (see TODO in the file commentary) in the near future. This optimization is enabled for the NVPTX backend for now. However, I am more than happy to push it to the standard optimization pipeline after more thorough performance tests. Test Plan: test/StraightLineStrengthReduce/slsr.ll Reviewers: eliben, HaoLiu, meheff, hfinkel, jholewinski, atrick Reviewed By: jholewinski, atrick Subscribers: karthikthecool, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D7310 llvm-svn: 228016
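The rewrite in the SLSR message above rests on the identity (b + k) * s = b * s + k * s, so each expression is the previous one plus s. A minimal sketch comparing the two forms follows; the function names are made up for this example and it does not model the pass itself.

```python
def naive(b: int, s: int) -> list:
    """The three expressions as written after unrolling: one multiply each."""
    return [b * s, (b + 1) * s, (b + 2) * s]


def strength_reduced(b: int, s: int) -> list:
    """The SLSR form from the commit message: one multiply, then two adds."""
    t1 = b * s
    t2 = t1 + s
    t3 = t2 + s
    return [t1, t2, t3]


# Compare both forms over a small grid of inputs, including negatives.
agree = all(
    naive(b, s) == strength_reduced(b, s)
    for b in range(-4, 5)
    for s in range(-4, 5)
)
print(agree)
```

The identity also holds under fixed-width wraparound arithmetic, since it is a ring identity; the Python version uses unbounded integers only for simplicity.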
*	Fix: SLPVectorizer crashes with assertion when vectorizing a cmp instruction.	Erik Eckstein	2015-02-02	1	-0/+56
\| \| \| \| \| \| \| \| \|	The commit r225977 uncovered this bug. The problem was that the vectorizer tried to read the second operand of an already deleted instruction. The bug didn't show up before r225977 because the freed memory still contained a non-null pointer. With r225977 deletion of instructions is delayed and the read operand pointer is always null. llvm-svn: 227800