bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[SLPVectorizer] Try different vectorization factors for store chains	Sanjay Patel	2015-07-08	1	-7/+37
	...and set max vector register size based on target This patch is based on discussion on the llvmdev mailing list: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/087405.html and also solves: https://llvm.org/bugs/show_bug.cgi?id=17170 Several FIXME/TODO items are noted in comments as potential improvements. Differential Revision: http://reviews.llvm.org/D10950 llvm-svn: 241760
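One way to picture "trying different vectorization factors" for a store chain is a search that starts at the widest factor the target's vector register allows and halves it. The sketch below is only an illustration of that idea, not the SLPVectorizer code; `candidateFactors` and its parameters are hypothetical names, and it assumes both bit widths are powers of two.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not an LLVM API: list the vectorization factors (VFs)
// worth trying for a chain of `chainLen` consecutive stores whose elements are
// `elemBits` wide, given a target vector register of `maxVecRegBits` bits.
// Assumption: both widths are powers of two, so halving visits every
// power-of-two VF from the widest one down to 2.
std::vector<unsigned> candidateFactors(unsigned chainLen, unsigned elemBits,
                                       unsigned maxVecRegBits) {
  assert(elemBits > 0 && "element width must be non-zero");
  std::vector<unsigned> factors;
  // Start from the widest VF that fits in one register.
  for (unsigned vf = maxVecRegBits / elemBits; vf >= 2; vf /= 2) {
    // A factor is only usable if the chain holds at least one full vector.
    if (vf <= chainLen)
      factors.push_back(vf);
  }
  return factors;
}
```

With eight 32-bit stores and a 128-bit register, this yields the factors 4 and 2, widest first.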
*	[LoopVectorizer] Rename BypassBlock to VectorPH, and CheckBlock to NewVectorPH. NFCI.	Michael Zolotukhin	2015-07-08	1	-46/+46
	llvm-svn: 241742
*	[LoopVectorizer] Restructure code for emitting RT checks. NFCI.	Michael Zolotukhin	2015-07-08	1	-18/+22
	Place all code corresponding to a run-time check in one place. Previously we generated some code, then proceeded to the next check, then finished the code for the first check (like splitting blocks and generating branches). Now the code for generating a check is self-contained. llvm-svn: 241741
*	[LoopVectorizer] Remove redundant variables PastOverflowCheck and OverflowCheckAnchor. NFCI.	Michael Zolotukhin	2015-07-08	1	-11/+2
	llvm-svn: 241740
*	[LoopVectorizer] Move some code around to ease further refactoring. NFCI.	Michael Zolotukhin	2015-07-08	1	-16/+13
	llvm-svn: 241739
*	[LoopVectorizer] Remove redundant variable LastBypassBlock. NFC.	Michael Zolotukhin	2015-07-08	1	-14/+12
	llvm-svn: 241738
*	remove unnecessary temp variable; NFCI	Sanjay Patel	2015-07-05	1	-5/+4
	llvm-svn: 241415
*	use range-based for loops; NFCI	Sanjay Patel	2015-07-05	1	-8/+7
	llvm-svn: 241412
*	use range-based for loops; NFCI	Sanjay Patel	2015-07-04	1	-20/+20
	llvm-svn: 241395
*	[LoopVectorize] Use ReplaceInstWithInst() helper where appropriate.	Alexey Samsonov	2015-07-01	1	-22/+15
	This is mostly an NFC, which increases code readability (instead of saving the old terminator, generating a new one in front of the old, and deleting the old, we just call a function). However, it would additionally copy the debug location from the old instruction to the replacement, which would help PR23837. llvm-svn: 241197
*	[LoopVectorize] Pointer indices may be wider than the pointer	David Majnemer	2015-06-27	1	-3/+10
	If we are dealing with a pointer induction variable, isInductionPHI gives back a step value of Stride / size of pointer. However, we might be indexing with a legal type wider than the pointer width. Handle this by inserting casts where appropriate instead of crashing. This fixes PR23954. llvm-svn: 240877
*	Move VectorUtils from Transforms to Analysis to correct layering violation	David Blaikie	2015-06-26	2	-2/+2
	llvm-svn: 240804
*	[LoopVectorizer] Fix bailing-out condition for OptForSize case.	Michael Zolotukhin	2015-06-24	1	-4/+3
	With option OptForSize enabled, the Loop Vectorizer is not supposed to create a tail loop. The condition that checked this was incorrect and did not match the comment above it. Patch by Marianne Mailhot-Sarrasin. llvm-svn: 240556
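The rule described here can be sketched as a simple predicate: a scalar tail loop is needed whenever the trip count is not a multiple of the vectorization factor, and under OptForSize that case should bail out. This is a minimal sketch under that assumption; `needsTailLoop` and `mayVectorize` are hypothetical names, not functions from the Loop Vectorizer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers; the names are illustrative, not LLVM's.
// A scalar tail (remainder) loop is needed whenever the trip count is not a
// multiple of the vectorization factor.
bool needsTailLoop(std::uint64_t tripCount, unsigned vf) {
  assert(vf > 0 && "vectorization factor must be non-zero");
  return tripCount % vf != 0;
}

// When optimizing for size, give up rather than emit the extra tail loop.
bool mayVectorize(bool optForSize, std::uint64_t tripCount, unsigned vf) {
  return !(optForSize && needsTailLoop(tripCount, vf));
}
```

For a trip count of 18 and a factor of 4, the sketch allows vectorization only when size optimization is off.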
*	Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)	Alexander Kornienko	2015-06-23	2	-2/+2
	Apparently, the style needs to be agreed upon first. llvm-svn: 240390
*	[SLP] Vectorize for all-constant entries.	Michael Zolotukhin	2015-06-19	1	-2/+4
	Differential Revision: http://reviews.llvm.org/D10531 llvm-svn: 240144
*	Fixed/added namespace ending comments using clang-tidy. NFC	Alexander Kornienko	2015-06-19	2	-2/+2
	The patch is generated using this command:
	    tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
	      -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
	      llvm/lib/
	Thanks to Eugene Kosov for the original patch! llvm-svn: 240137
*	[PM/AA] Remove the Location typedef from the AliasAnalysis class now	Chandler Carruth	2015-06-17	1	-5/+5
	that it is its own entity in the form of MemoryLocation, and update all the callers. This is an entirely mechanical change. References to "Location" within AA subclasses become "MemoryLocation", and elsewhere "AliasAnalysis::Location" becomes "MemoryLocation". Hope that helps out-of-tree folks update. llvm-svn: 239885
*	Refactor RecurrenceInstDesc	Tyler Nowicki	2015-06-16	1	-1/+1
	Moved RecurrenceInstDesc into RecurrenceDescriptor to simplify the namespaces. llvm-svn: 239862
*	Rename Reduction variables/structures to Recurrence.	Tyler Nowicki	2015-06-16	1	-16/+16
	A reduction is a special kind of recurrence. In the loop vectorizer we currently identify basic reductions. Future patches will extend this to identifying basic recurrences. llvm-svn: 239835
*	[LoopVectorize] Revert the enabling of interleaved memory access in Loop Vectorizer, which was wrongly committed in r239514.	Hao Liu	2015-06-11	1	-1/+1
	llvm-svn: 239515
*	[AArch64] Match interleaved memory accesses into ldN/stN instructions.	Hao Liu	2015-06-11	1	-1/+1
	Add a pass AArch64InterleavedAccess to identify and match interleaved memory accesses. This pass transforms an interleaved load/store into an ldN/stN intrinsic. As the Loop Vectorizer disables optimization on interleaved accesses by default, this optimization is also disabled by default. Enable it with "-aarch64-interleaved-access-opt=true".
	E.g. transform an interleaved load (Factor = 2):
	    %wide.vec = load <8 x i32>, <8 x i32>* %ptr
	    %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6> ; Extract even elements
	    %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7> ; Extract odd elements
	Into:
	    %ld2 = { <4 x i32>, <4 x i32> } call aarch64.neon.ld2(%ptr)
	    %v0 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 0
	    %v1 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 1
	E.g. transform an interleaved store (Factor = 2):
	    %i.vec = shuffle %v0, %v1, <0, 4, 1, 5, 2, 6, 3, 7> ; Interleaved vec
	    store <8 x i32> %i.vec, <8 x i32>* %ptr
	Into:
	    %v0 = shuffle %i.vec, undef, <0, 1, 2, 3>
	    %v1 = shuffle %i.vec, undef, <4, 5, 6, 7>
	    call void aarch64.neon.st2(%v0, %v1, %ptr)
	llvm-svn: 239514
*	[LoopVectorize] Teach Loop Vectorizer about interleaved memory accesses.	Hao Liu	2015-06-08	1	-2/+699
	Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector. E.g.
	    for (i = 0; i < N; i+=2) {
	      a = A[i];     // load of even element
	      b = A[i+1];   // load of odd element
	      ...           // operations on a, b, c, d
	      A[i] = c;     // store of even element
	      A[i+1] = d;   // store of odd element
	    }
	The loads of even and odd elements are identified as an interleaved load group, which will be transformed into vectorized IR like:
	    %wide.vec = load <8 x i32>, <8 x i32>* %ptr
	    %vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
	    %vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
	The stores of even and odd elements are identified as an interleaved store group, which will be transformed into vectorized IR like:
	    %interleaved.vec = shufflevector <4 x i32> %vec.even, %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7>
	    store <8 x i32> %interleaved.vec, <8 x i32>* %ptr
	This optimization is currently disabled by default. To try it, add '-enable-interleaved-mem-accesses=true'. llvm-svn: 239291
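The effect of the shuffle masks in this commit message can be modelled in plain scalar C++. The sketch below is only a model of the lane movement for a factor of 2 with four-element vectors; `Vec4`, `Vec8`, `deinterleave` and `interleave` are hypothetical names and nothing here is vectorizer code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<int, 4>;
using Vec8 = std::array<int, 8>;

// Scalar model of the shuffle masks <0,2,4,6> and <1,3,5,7>: split one wide
// load into its even lanes and its odd lanes.
void deinterleave(const Vec8 &wide, Vec4 &even, Vec4 &odd) {
  for (std::size_t i = 0; i < even.size(); ++i) {
    even[i] = wide[2 * i];
    odd[i] = wide[2 * i + 1];
  }
}

// Scalar model of the shuffle mask <0,4,1,5,2,6,3,7> applied to the
// concatenation of `even` and `odd`: rebuild the wide vector for one store.
Vec8 interleave(const Vec4 &even, const Vec4 &odd) {
  Vec8 wide{};
  for (std::size_t i = 0; i < even.size(); ++i) {
    wide[2 * i] = even[i];
    wide[2 * i + 1] = odd[i];
  }
  return wide;
}
```

Deinterleaving the values 0 through 7 gives the even group {0, 2, 4, 6} and the odd group {1, 3, 5, 7}, and interleaving those two groups restores the original order.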
*	[PM/AA] Start refactoring AliasAnalysis to remove the analysis group and	Chandler Carruth	2015-06-04	1	-2/+2
	port it to the new pass manager. All this does is extract the inner "location" class used by AA into its own full-fledged type. This seems much cleaner as MemoryDependence and soon MemorySSA also use this heavily, and it doesn't make much sense being inside the AA infrastructure. This will also make it much easier to break apart the AA infrastructure into something that stands on its own rather than using the analysis group design. There are a few places where this makes APIs not make sense -- they were taking an AliasAnalysis pointer just to build locations. I'll try to clean those up in follow-up commits. Differential Revision: http://reviews.llvm.org/D10228 llvm-svn: 239003
*	Replace push_back(Constructor(foo)) with emplace_back(foo) for non-trivial types	Benjamin Kramer	2015-05-29	1	-1/+1
	If the type isn't trivially moveable emplace can skip a potentially expensive move. It also saves a couple of characters. Call sites were found with the ASTMatcher + some semi-automated cleanup.
	    memberCallExpr(
	        argumentCountIs(1), callee(methodDecl(hasName("push_back"))),
	        on(hasType(recordDecl(has(namedDecl(hasName("emplace_back")))))),
	        hasArgument(0, bindTemporaryExpr(
	                           hasType(recordDecl(hasNonTrivialDestructor())),
	                           has(constructExpr()))),
	        unless(isInTemplateInstantiation()))
	No functional change intended. llvm-svn: 238602
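The rewrite this commit applies can be shown on a small standalone type. The `Entry` struct and `buildEntries` function below are hypothetical and exist only to contrast the two call forms; they are not taken from the LLVM sources.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record type with a non-trivial member (std::string), so that
// building a temporary and moving it into the vector has a real cost.
struct Entry {
  std::string name;
  unsigned id;
  Entry(std::string n, unsigned i) : name(std::move(n)), id(i) {}
};

std::vector<Entry> buildEntries() {
  std::vector<Entry> entries;
  // Before: constructs a temporary Entry, then moves it into the vector.
  entries.push_back(Entry("before", 1));
  // After: forwards the arguments and constructs the Entry in place.
  entries.emplace_back("after", 2);
  return entries;
}
```

Both calls leave an equivalent element in the vector; the second form skips the temporary and its move.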
*	Change Function::getIntrinsicID() to return an Intrinsic::ID. NFC.	Pete Cooper	2015-05-20	2	-4/+4
	Now that Intrinsic::ID is a typed enum, we can forward declare it and so return it from this method. This updates all users which were either using an unsigned to store it, or had a now unnecessary cast. llvm-svn: 237810
*	[X86] Disable loop unrolling in loop vectorization pass when VF is 1.	Wei Mi	2015-05-06	1	-1/+1
	The patch disables unrolling in the loop vectorization pass when VF==1 on the x86 architecture by setting MaxInterleaveFactor to 1. Unrolling in the loop vectorization pass may introduce the cost of an overflow check, a memory boundary check and extra prologue/epilogue code when the regular unroller unrolls the loop another time. Disabling it when VF==1 removes the unnecessary cost on x86. The same can be done for other platforms after verifying that interleaving/memory bound checking is not perf critical on those platforms. Differential Revision: http://reviews.llvm.org/D9515 llvm-svn: 236613
*	Fix a couple of typos in comments.	Michael Zolotukhin	2015-04-24	1	-3/+3
	llvm-svn: 235674
*	Recommit r235458: [opaque pointer type] Avoid using PointerType::getElementType for a few cases of CallInst (reverted in r235533)	David Blaikie	2015-04-23	1	-1/+11
	Original commit message: "Calls to llvm::Value::mutateType are becoming extra-sensitive now that instructions have extra type information that will not be derived from operands or result type (alloca, gep, load, call/invoke, etc... ). The special-handling for mutateType will get more complicated as this work continues - it might be worth making mutateType virtual & pushing the complexity down into the classes that need special handling. But with only two significant uses of mutateType (vectorization and linking) this seems OK for now. Totally open to ideas/suggestions/improvements, of course. With this, and a bunch of exceptions, we can roundtrip an indirect call site through bitcode and IR. (a direct call site is actually trickier... I haven't figured out how to deal with the IR deserializer's lazy construction of Function/GlobalVariable decl's based on the type of the entity which means looking through the "pointer to T" type referring to the global)" The remapping done in ValueMapper for LTO was insufficient as the types weren't correctly mapped (though I was using the post-mapped operands, some of those operands might not have been mapped yet so the type wouldn't be post-mapped yet). Instead use the pre-mapped type and explicitly map all the types. llvm-svn: 235651
*	Move common loop utility function isInductionPHI into LoopUtils.cpp	Karthik Bhat	2015-04-23	1	-43/+0
	This patch refactors the definition of the common utility function "isInductionPHI" into LoopUtils.cpp. This fixes a compilation error when configured with -DBUILD_SHARED_LIBS=ON llvm-svn: 235577
*	Revert "[opaque pointer type] Avoid using PointerType::getElementType for a few cases of CallInst"	David Blaikie	2015-04-22	1	-11/+1
	This reverts commit r235458. It looks like this might be breaking something LTO-ish. Looking into it & will recommit with a fix/test case/etc once I've got more to go on. llvm-svn: 235533
*	[opaque pointer type] Avoid using PointerType::getElementType for a few cases of CallInst	David Blaikie	2015-04-21	1	-1/+11
	Calls to llvm::Value::mutateType are becoming extra-sensitive now that instructions have extra type information that will not be derived from operands or result type (alloca, gep, load, call/invoke, etc... ). The special-handling for mutateType will get more complicated as this work continues - it might be worth making mutateType virtual & pushing the complexity down into the classes that need special handling. But with only two significant uses of mutateType (vectorization and linking) this seems OK for now. Totally open to ideas/suggestions/improvements, of course. With this, and a bunch of exceptions, we can roundtrip an indirect call site through bitcode and IR. (a direct call site is actually trickier... I haven't figured out how to deal with the IR deserializer's lazy construction of Function/GlobalVariable decl's based on the type of the entity which means looking through the "pointer to T" type referring to the global) llvm-svn: 235458
*	[NFC] Refactor identification of reductions as common utility function.	Karthik Bhat	2015-04-20	1	-519/+30
	This patch refactors reduction identification code out of LoopVectorizer and exposes them as common utilities. No functional change. Review: http://reviews.llvm.org/D9046 llvm-svn: 235284
*	Add range iterators for post order and inverse post order. Use them	Daniel Berlin	2015-04-15	1	-3/+1
	llvm-svn: 235026
*	Reduce dyn_cast<> to isa<> or cast<> where possible.	Benjamin Kramer	2015-04-10	1	-2/+2
	No functional change intended. llvm-svn: 234586
*	[LoopAccesses] Allow analysis to complete in the presence of uniform stores	Adam Nemet	2015-04-08	1	-0/+8
	(Re-apply r234361 with a fix and a testcase for PR23157) Both run-time pointer checking and the dependence analysis are capable of dealing with uniform addresses. I.e. it's really just an orthogonal property of the loop that the analysis computes. Run-time pointer checking will only try to reason about SCEVAddRec pointers or else gives up. If the uniform pointer turns out to be a SCEVAddRec in an outer loop, the run-time checks generated will be correct (start and end bounds would be equal). In case of the dependence analysis, we work again with SCEVs. When compared against a loop-dependent address of the same underlying object, the difference of the two SCEVs won't be constant. This will result in returning an Unknown dependence for the pair. When compared against another uniform access, the difference would be constant and we should return the right type of dependence (forward/backward/etc). The change also adds support to query this property of the loop and modifies the vectorizer to use this. Patch by Ashutosh Nema! llvm-svn: 234424
*	Revert "[LoopAccesses] Allow analysis to complete in the presence of uniform stores"	Adam Nemet	2015-04-08	1	-8/+0
	This reverts commit r234361. It caused PR23157. llvm-svn: 234387
*	[LoopAccesses] Allow analysis to complete in the presence of uniform stores	Adam Nemet	2015-04-07	1	-0/+8
	Both run-time pointer checking and the dependence analysis are capable of dealing with uniform addresses. I.e. it's really just an orthogonal property of the loop that the analysis computes. Run-time pointer checking will only try to reason about SCEVAddRec pointers or else gives up. If the uniform pointer turns out to be a SCEVAddRec in an outer loop, the run-time checks generated will be correct (start and end bounds would be equal). In case of the dependence analysis, we work again with SCEVs. When compared against a loop-dependent address of the same underlying object, the difference of the two SCEVs won't be constant. This will result in returning an Unknown dependence for the pair. When compared against another uniform access, the difference would be constant and we should return the right type of dependence (forward/backward/etc). The change also adds support to query this property of the loop and modifies the vectorizer to use this. Patch by Ashutosh Nema! llvm-svn: 234361
*	[opaque pointer type] More GEP API migrations in IRBuilder uses	David Blaikie	2015-04-03	1	-7/+9
	The plan here is to push the API changes out from the common components (like Constant::getGetElementPtr and IRBuilder::CreateGEP related functions) and just update callers to either pass the type if it's obvious, or pass null. Do this with LoadInst as well and anything else that comes up, then to start porting specific uses to not pass null anymore - this may require some refactoring in each case. llvm-svn: 234042
*	Transforms: Use the new DebugLoc API, NFC	Duncan P. N. Exon Smith	2015-03-30	1	-2/+1
	Update lib/Analysis and lib/Transforms to use the new `DebugLoc` API. llvm-svn: 233587
*	Refactor Code inside LoopVectorizer's function isInductionVariable.	Karthik Bhat	2015-03-27	1	-9/+23
	This patch exposes LoopVectorizer's isInductionVariable function as common functionality. http://reviews.llvm.org/D8608 llvm-svn: 233352
*	Opaque Pointer Types: GEP API migrations to specify the gep type explicitly	David Blaikie	2015-03-24	1	-1/+2
	The changes to InstCombine do seem a bit silly - it doesn't make anything obviously better to have the caller access the pointer's element type (the thing I'm trying to remove) than the GEP itself, but it's a helpful migration step. This will allow me to more obviously lock down GEP (& Load, etc) API usage, then fix all the code that accesses pointer element types except the places that need to be removed (most of the InstCombines) anyway - at which point I'll need to just remove all that code because it won't be meaningful anymore (there will be no pointer types, so no bitcasts to combine) llvm-svn: 233126
*	Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used.	Benjamin Kramer	2015-03-23	1	-1/+1
	llvm-svn: 232998
*	Try to fix a test broken by one of my previous commits.	Michael Zolotukhin	2015-03-17	1	-0/+3
	llvm-svn: 232536
*	LoopVectorize: teach loop vectorizer to vectorize calls.	Michael Zolotukhin	2015-03-17	1	-35/+157
	The tests would be committed in a commit for http://reviews.llvm.org/D8131 Review: http://reviews.llvm.org/D8095 llvm-svn: 232530
*	LoopVectorizer: Add TargetTransformInfo.	Michael Zolotukhin	2015-03-17	1	-9/+12
	Review: http://reviews.llvm.org/D8092 llvm-svn: 232522
*	[LAA-memchecks 2/3] Move number of memcheck threshold checking to LV	Adam Nemet	2015-03-10	1	-1/+13
	Now the analysis won't "fail" if the memchecks exceed the threshold. It is the transform pass' responsibility to perform the check. This allows the transform pass to further analyze/eliminate the memchecks. E.g. in Loop distribution we only need to check pointers that end up in different partitions. Note that there is a slight change of functionality here. The logic in analyzeLoop is that if dependence checking fails due to non-constant distance between the pointers, another attempt is made to prove safety of the dependences purely using run-time checks. Before this patch we could fail the loop due to exceeding the memcheck threshold after the first step, now we only check the threshold in the client after the full analysis. There is no measurable compile-time effect but I wanted to record this here. llvm-svn: 231817
*	DataLayout is mandatory, update the API to reflect it with references.	Mehdi Amini	2015-03-10	3	-126/+108
	Summary: Now that the DataLayout is a mandatory part of the module, let's start cleaning the codebase. This patch is a first attempt at doing that. This patch is not exactly NFC as for instance some places were passing a nullptr instead of the DataLayout, possibly just because there was a default value on the DataLayout argument to many functions in the API. Even though it is not purely NFC, there is no change in the validation. I turned as many pointers to DataLayout into references as possible; this helped figure out all the places where a nullptr could come up. I initially had a local version of this patch broken into over 30 independent commits, but some later commits were cleaning the API and touching parts of the code modified in the previous commits, so it seemed cleaner without the intermediate state. Test Plan: Reviewers: echristo Subscribers: llvm-commits From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231740
*	Remove the remaining uses of abs64 and nuke it.	Benjamin Kramer	2015-03-09	1	-4/+4
	std::abs works just fine and we're already using it in many places. NFC intended. llvm-svn: 231696
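As a small illustration of why a dedicated 64-bit wrapper is unnecessary: `<cstdlib>` provides `std::abs` overloads for `long` and `long long`, so a 64-bit value keeps its full width. The `strideDistance` helper below is a hypothetical example and does not appear in the commit.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: absolute distance between two 64-bit values.
// std::abs is overloaded for long and long long in <cstdlib>, so the result
// is not truncated to int. Assumes a - b does not overflow.
std::int64_t strideDistance(std::int64_t a, std::int64_t b) {
  return std::abs(a - b);
}
```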
*	Make helper functions static.	Benjamin Kramer	2015-03-09	1	-4/+3
	Found by -Wmissing-prototypes. NFC. llvm-svn: 231664
*	Introduce runtime unrolling disable metadata and use it to mark the scalar loop from vectorization.	Kevin Qin	2015-03-09	1	-2/+52
	Runtime unrolling is an expensive optimization which can bring benefit only if the loop is hot and the iteration count is relatively large. For some loops, we know they are not worth runtime unrolling. The scalar loop from vectorization is one such case. llvm-svn: 231631