bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	Revert "[MemorySSA] Revert r293361 and r293363, as the tests fail under asan."	Daniel Berlin	2017-01-30	2	-15/+34
\| \| \| \| \| \| \|	This reverts commit r293471, reapplying r293361 and r293363 with a fix for an out-of-bounds read. llvm-svn: 293474
*	[MemorySSA] Revert r293361 and r293363, as the tests fail under asan.	Sam McCall	2017-01-30	2	-27/+12
\| \| \| \|	llvm-svn: 293471
*	[LoopVectorize] Improve getVectorCallCost() getScalarizationOverhead() call.	Jonas Paulsson	2017-01-30	1	-19/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	By calling getScalarizationOverhead with the CallInst instead of the types of its arguments, we make sure that only unique call arguments are added to the scalarization cost. getScalarizationOverhead() is extended to handle calls by only passing on the actual call arguments (which is not all the operands). This also eliminates a wrapper function with the same name. review: Hal Finkel llvm-svn: 293459
*	[MemorySSA] Correct an assertion by surrounding it with parentheses.	Davide Italiano	2017-01-30	1	-3/+2
\| \| \| \|	llvm-svn: 293453
*	[InstCombine] enable (X >>?,exact C1) << C2 --> X << (C2 - C1) for vectors with splats	Sanjay Patel	2017-01-29	1	-17/+17
\| \| \| \| \| \|	llvm-svn: 293435
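The fold named in this subject line can be sanity-checked on plain scalars. The sketch below is illustrative only and is not the InstCombine code: it uses 32-bit unsigned values with a logical right shift, treats "exact" as "no set bits are shifted out", and assumes C2 >= C1. The helper names (`isExactShift`, `shiftPair`, `foldedShift`) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// True when shifting X right by C1 discards no set bits (an "exact" shift).
bool isExactShift(uint32_t X, unsigned C1) {
  assert(C1 < 32 && "shift amount must be less than the bit width");
  return (X & ((uint32_t{1} << C1) - 1)) == 0;
}

// Original form: (X >> C1) << C2.
uint32_t shiftPair(uint32_t X, unsigned C1, unsigned C2) {
  assert(C1 < 32 && C2 < 32);
  return (X >> C1) << C2;
}

// Folded form: X << (C2 - C1). Only equivalent to shiftPair() when the
// right shift is exact and C2 >= C1.
uint32_t foldedShift(uint32_t X, unsigned C1, unsigned C2) {
  assert(C1 <= C2 && C2 < 32);
  return X << (C2 - C1);
}
```

When the right shift is not exact, the two forms can differ in the low bits, which is why the fold depends on the `exact` flag.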
*	NewGVN: Fix where newline is printed in debug printing of memory equivalence	Daniel Berlin	2017-01-29	1	-1/+1
\| \| \| \|	llvm-svn: 293428
*	[ArgPromote] Move static helpers to modern LLVM naming conventions while	Chandler Carruth	2017-01-29	1	-15/+15
\| \| \| \| \| \| \| \| \| \|	here. NFC. Simple refactoring while prepping a port to the new PM. Differential Revision: https://reviews.llvm.org/D29249 llvm-svn: 293426
*	[ArgPromote] Run clang-format to normalize remarkably idiosyncratic	Chandler Carruth	2017-01-29	1	-112/+121
\| \| \| \| \| \| \| \| \| \|	formatting that has evolved here over the past years prior to making somewhat invasive changes to thread new PM support through the business logic. Differential Revision: https://reviews.llvm.org/D29248 llvm-svn: 293425
*	[ArgPromote] Re-arrange the code in a more typical, logical way.	Chandler Carruth	2017-01-29	1	-561/+547
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This arranges the static helpers in an order where they are defined prior to their use to avoid the need of forward declarations, and collect the core pass components at the bottom below their helpers. This also folds one trivial function into the pass itself. Factoring this 'runImpl' was an attempt to help porting to the new pass manager, however in my attempt to begin this port in earnest it turned out to not be a substantial help. I think it will be easier to factor things without it. This is an NFC change and does a minimal amount of edits over all. Subsequent NFC cleanups will normalize the formatting with clang-format and improve the basic doxygen commenting. Differential Revision: https://reviews.llvm.org/D29247 llvm-svn: 293424
*	Remove inclusion of SSAUpdater from several passes.	Davide Italiano	2017-01-29	3	-3/+1
\| \| \| \| \| \| \| \|	It is, in fact, unused. Found while reviewing Danny's new SSAUpdater and porting passes to it to see what the new API looked like. llvm-svn: 293407
*	[PM] MLSM has been enabled for a while. Reclaim a cl::opt.	Davide Italiano	2017-01-28	1	-8/+2
\| \| \| \|	llvm-svn: 293401
*	[SLP] Vectorize loads of consecutive memory accesses, accessed in a non-consecutive (jumbled) way.	Mohammad Shahid	2017-01-28	1	-57/+120
\| \| \| \| \| \| \| \| \| \| \| \| \|	The jumbled scalar loads will be sorted while building the tree, and these accesses will be marked to generate a shufflevector after the vectorized load with the proper mask. Reviewers: hfinkel, mssimpso, mkuper Differential Revision: https://reviews.llvm.org/D26905 Change-Id: I9c0c8e6f91a00076a7ee1465440a3f6ae092f7ad llvm-svn: 293386
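The sort-then-shuffle idea described above can be sketched outside of LLVM. The code below is a minimal illustration, not the SLP vectorizer's implementation: loads are modelled as integer element offsets from a common base, and the function names are hypothetical. It computes the mask that restores program order after one wide load of the sorted addresses.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Given the element offsets of scalar loads in program order, return the
// shuffle mask to apply after one wide load of the sorted addresses.
// Mask[i] is the lane of the wide load holding the value that the i-th
// scalar load would have produced.
std::vector<int> jumbledLoadMask(const std::vector<int> &Offsets) {
  // Order[k] = index of the scalar load with the k-th smallest offset.
  std::vector<std::size_t> Order(Offsets.size());
  std::iota(Order.begin(), Order.end(), std::size_t{0});
  std::sort(Order.begin(), Order.end(), [&](std::size_t A, std::size_t B) {
    return Offsets[A] < Offsets[B];
  });
  std::vector<int> Mask(Offsets.size());
  for (std::size_t Lane = 0; Lane < Order.size(); ++Lane)
    Mask[Order[Lane]] = static_cast<int>(Lane);
  return Mask;
}

// True when the sorted offsets form one consecutive run, so that a single
// vector load can cover all of them.
bool isConsecutiveWhenSorted(std::vector<int> Offsets) {
  std::sort(Offsets.begin(), Offsets.end());
  for (std::size_t I = 1; I < Offsets.size(); ++I)
    if (Offsets[I] != Offsets[I - 1] + 1)
      return false;
  return true;
}
```

For loads at offsets 5, 4, 7, 6 the wide load covers offsets 4 through 7, and the mask {1, 0, 3, 2} puts each value back in its original position.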
*	[InstCombine] Merge DebugLoc when speculatively hoisting store instruction	Taewook Oh	2017-01-28	1	-8/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Along with https://reviews.llvm.org/D27804, debug locations need to be merged when hoisting store instructions as well. Not sure if just dropping debug locations would make more sense for this case, but as the branch instruction will have at least different discriminator with the hoisted store instruction, I think there will be no difference in practice. Reviewers: aprantl, andreadb, danielcdh Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29062 llvm-svn: 293372
*	Use print() instead of dump() in code	Matthias Braun	2017-01-28	1	-5/+2
\| \| \| \|	llvm-svn: 293371
*	MemorySSA: Allow movement to arbitrary places	Daniel Berlin	2017-01-28	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Extend the MemorySSAUpdater API to allow movement to arbitrary places Reviewers: davide, george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29239 llvm-svn: 293363
*	MemorySSA: Fix block numbering invalidation and replacement bugs discovered by updater	Daniel Berlin	2017-01-28	2	-11/+20
\| \| \| \| \| \|	llvm-svn: 293361
*	Cleanup dump() functions.	Matthias Braun	2017-01-28	5	-20/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We had various variants of defining dump() functions in LLVM. Normalize them (this should just consistently implement the things discussed in http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html). For reference:
	- Public headers should just declare the dump() method but not use LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
	- The definition of a dump method should look like this:
	    #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
	    LLVM_DUMP_METHOD void MyClass::dump() {
	      // print stuff to dbgs()...
	    }
	    #endif
	llvm-svn: 293359
*	MemorySSA: Move updater to its own file	Daniel Berlin	2017-01-28	3	-339/+373
\| \| \| \|	llvm-svn: 293357
*	Introduce a basic MemorySSA updater, that supports insertDef,	Daniel Berlin	2017-01-28	1	-28/+368
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	insertUse, moveBefore and moveAfter operations. Summary: This creates a basic MemorySSA updater that handles arbitrary insertion of uses and defs into MemorySSA, as well as arbitrary movement around the CFG. It replaces the current splice API. It can be made to handle arbitrary control flow changes. Currently, it uses the same updater algorithm from D28934. The main difference is that, because MemorySSA is single variable, we have the complete def and use list, and don't need anyone to give it to us as part of the API. We also have to rename stores below us in some cases. If we go that direction in that patch, I will merge all the updater implementations (using an updater_traits or something to provide the get* functions we use, called read/write in that patch). Sadly, the current SSAUpdater algorithm is way too slow to use for what we are doing here. I have updated the tests we have to basically build MemorySSA incrementally using the updater API, and make sure it still comes out the same. Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29047 llvm-svn: 293356
*	[RegisterCoalescing] Recommit the patch "Remove partial redundent copy".	Quentin Colombet	2017-01-28	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In r292621, the recommit fixes a bug related to the live interval update after the partially redundant copy is moved. This recommit solves an additional bug related to the lack of update of subranges. The original patch is to solve the performance problem described in PR27827. Register coalescing sometimes cannot remove a copy because of interference. But if we can find a reverse copy in one of the predecessor blocks of the copy, the copy is partially redundant and we may remove the copy partially by moving it to the predecessor block without the reverse copy. Differential Revision: https://reviews.llvm.org/D28585 Re-apply r292621 Revert "Revert rL292621. Caused some internal build bot failures in apple." This reverts commit r292984. Original patch: Wei Mi <wmi@google.com> Subrange fix: Mostly Matthias Braun <matze@braunis.de> llvm-svn: 293353
*	[InstCombine] move icmp transforms that might be recognized as min/max and inf-loop (PR31751)	Sanjay Patel	2017-01-27	1	-10/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a minimal patch to avoid the infinite loop in: https://llvm.org/bugs/show_bug.cgi?id=31751 But the general problem is bigger: we're not canonicalizing all of the min/max forms reported by value tracking's matchSelectPattern(), and we don't define min/max consistently. Some code uses matchSelectPattern(), other code uses matchers like m_Umax, and others have their own inline definitions which may be subtly different from any of the above. The reason that the test cases in this patch need a cast op to trigger is because we don't (yet) canonicalize all min/max forms based on matchSelectPattern() in canonicalizeMinMaxWithConstant(), but we do make min/max+cast transforms based on matchSelectPattern() in visitSelectInst(). The location of the icmp transforms that trigger the inf-loop seems arbitrary at best, so I'm moving those behind the min/max fence in visitICmpInst() as the quick fix. llvm-svn: 293345
*	Global DCE performance improvement	Mehdi Amini	2017-01-27	1	-60/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change the original algorithm so that it scales better on very large bitcode where not every instruction implies a global. The target query is "how do you get all the globals referenced by another global?" Before this patch, this was done by walking the body (or the initializer) and collecting the references. This patch instead precomputes the answer to this query for the whole module by walking the use-list of every global. Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu> Differential Revision: https://reviews.llvm.org/D28549 llvm-svn: 293328
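The use-list approach described in this commit can be sketched with standard containers. This is a simplified model, not the GlobalDCE source: globals are plain strings, the `Users` map stands in for LLVM use-lists, and both function names are hypothetical. The first function inverts the use-lists once for the whole module, and the second runs a worklist from the roots to find what is alive.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

using Global = std::string;

// Users[G] lists the globals whose body or initializer refers to G, which
// plays the role of a use-list here. Invert it once to get, for every
// global, the set of globals it references.
std::map<Global, std::set<Global>>
computeDependencies(const std::map<Global, std::vector<Global>> &Users) {
  std::map<Global, std::set<Global>> Deps;
  for (const auto &Entry : Users)
    for (const Global &User : Entry.second)
      Deps[User].insert(Entry.first);
  return Deps;
}

// Worklist-based liveness: everything reachable from the roots is alive.
std::set<Global> computeLive(const std::map<Global, std::set<Global>> &Deps,
                             const std::vector<Global> &Roots) {
  std::set<Global> Alive;
  std::vector<Global> Worklist(Roots.begin(), Roots.end());
  while (!Worklist.empty()) {
    Global G = Worklist.back();
    Worklist.pop_back();
    if (!Alive.insert(G).second)
      continue; // already visited
    auto It = Deps.find(G);
    if (It == Deps.end())
      continue; // G references no other global
    for (const Global &Dep : It->second)
      Worklist.push_back(Dep);
  }
  return Alive;
}
```

The cost of building `Deps` is proportional to the number of uses of globals rather than to the number of instructions, which is the scaling property the commit message is after.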
*	[PGO] add debug option to view raw count after prof use annotation	Xinliang David Li	2017-01-27	1	-1/+59
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D29045 llvm-svn: 293325
*	NFC: Add debug tracing for more cases where loop unrolling fails.	Anna Thomas	2017-01-27	1	-2/+8
\| \| \| \|	llvm-svn: 293313
*	[SLP] Refactoring of horizontal reduction analysis, NFC.	Alexey Bataev	2017-01-27	1	-24/+25
\| \| \| \| \| \| \| \| \| \| \|	Some checks in SLP horizontal reduction analysis function are performed several times, though it is enough to perform these checks only once during an initial attempt at adding candidate for the reduction instruction/reduced value. Differential Revision: https://reviews.llvm.org/D29175 llvm-svn: 293274
*	[LICM] When we are recomputing the alias sets for a subloop, we cannot	Chandler Carruth	2017-01-27	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	skip sub-subloops. The logic to skip subloops dated from when this code was shared with the cached case. Once it was factored out to only run in the case of recomputed subloops it became a dangerous bug. If a subsubloop contained an interfering instruction it would be silently skipped from the alias sets for LICM. With the old pass manager this was extremely hard to trigger as it would require failing to visit these subloops with the LICM pass but then visiting the outer loop somehow. I've not yet contrived any test case that actually manages to trigger this. But with the new pass manager we don't do the cross-loop caching hack that the old PM does and so we recompute alias set information from first principles. While this seems much cleaner and simpler it exposed this bug and would subtly miscompile code due to failing to correctly model the aliasing constraints of deeply nested loops. llvm-svn: 293273
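The bug class described above, skipping nested loops while recomputing per-loop information, can be illustrated with a toy loop nest. This sketch assumes a hypothetical `Loop` struct that owns its instructions and sub-loops; it is not LLVM's `Loop` or `AliasSetTracker` API. The point is only that a from-scratch recomputation has to descend through every nesting level.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified loop nest: each loop holds the instructions in
// its own blocks plus any number of nested sub-loops.
struct Loop {
  std::vector<std::string> Instructions;
  std::vector<std::unique_ptr<Loop>> SubLoops;
};

// Collect every instruction contained in L, descending through all nesting
// levels. Stopping after the first level would silently drop instructions
// in sub-subloops, which is the kind of omission the commit removes.
void collectInstructions(const Loop &L, std::vector<std::string> &Out) {
  Out.insert(Out.end(), L.Instructions.begin(), L.Instructions.end());
  for (const auto &Sub : L.SubLoops)
    collectInstructions(*Sub, Out);
}
```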
*	Fix unused variable warning.	Richard Trieu	2017-01-27	1	-0/+1
\| \| \| \|	llvm-svn: 293260
*	NewGVN: Add basic dead and redundant store elimination	Daniel Berlin	2017-01-27	1	-29/+114
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This adds basic dead and redundant store elimination to NewGVN. Unlike our current DSE, it will happily do cross-block DSE if it meets our requirements. We get a bunch of DSE's simple.ll cases, and some stuff it doesn't. Unlike DSE, however, we only try to eliminate stores of the same value to the same memory location, not just general stores to the same memory location. Reviewers: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29149 llvm-svn: 293258
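The restriction mentioned above (only stores of the same value to the same location are eliminated) can be shown with a small model. This is a sketch under strong assumptions and is unrelated to how NewGVN actually represents memory state: code is straight-line, locations are symbolic names that never alias, and `Store` and `findRedundantStores` are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct Store {
  std::string Location; // hypothetical symbolic address
  int Value;
};

// Return the indices of stores that write the value already known to be at
// that location. Assumes straight-line code and no aliasing between
// distinct location names.
std::vector<std::size_t> findRedundantStores(const std::vector<Store> &Stores) {
  std::map<std::string, int> Known; // last value stored to each location
  std::vector<std::size_t> Redundant;
  for (std::size_t I = 0; I < Stores.size(); ++I) {
    auto It = Known.find(Stores[I].Location);
    if (It != Known.end() && It->second == Stores[I].Value)
      Redundant.push_back(I); // same value, same location: removable
    else
      Known[Stores[I].Location] = Stores[I].Value;
  }
  return Redundant;
}
```

A store of a different value to the same location is kept and simply updates the known value. This sketch never removes the earlier, overwritten store.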
*	[NVPTX] [InstCombine] Add llvm_unreachable to appease MSVC.	Justin Lebar	2017-01-27	1	-0/+1
\| \| \| \|	llvm-svn: 293253
*	[NVPTX] Fix use-after-stack-free bug in InstCombineCalls.	Justin Lebar	2017-01-27	1	-1/+1
\| \| \| \| \| \|	Introduced in r293244. llvm-svn: 293251
*	Constant fold switch inst when looking for trivial conditions to unswitch on.	Xin Tong	2017-01-27	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Constant fold switch inst when looking for trivial conditions to unswitch on. Reviewers: sanjoy, chenli, hfinkel, efriedma Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D29037 llvm-svn: 293250
*	[PM] Port LoopLoadElimination to the new pass manager and wire it into	Chandler Carruth	2017-01-27	1	-27/+60
\| \| \| \| \| \| \| \| \| \| \|	the main pipeline. This is a very straightforward port. Nothing weird or surprising. This brings the number of missing passes from the new PM's pipeline down to three. llvm-svn: 293249
*	[NVPTX] Upgrade NVVM intrinsics in InstCombineCalls.	Justin Lebar	2017-01-27	1	-0/+250
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: There are many NVVM intrinsics that we can't entirely get rid of, but that nonetheless often correspond to target-generic LLVM intrinsics. For example, if flush denormals to zero (ftz) is enabled, we can convert @llvm.nvvm.ceil.ftz.f to @llvm.ceil.f32. On the other hand, if ftz is disabled, we can't do this, because @llvm.ceil.f32 will be lowered to a non-ftz PTX instruction. In this case, we can, however, simplify the non-ftz nvvm ceil intrinsic, @llvm.nvvm.ceil.f, to @llvm.ceil.f32. These transformations are particularly useful because they let us constant fold instructions that appear in libdevice, the bitcode library that ships with CUDA and essentially functions as its libm. Reviewers: tra Subscribers: hfinkel, majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D28794 llvm-svn: 293244
*	[LangRef] Make @llvm.sqrt(x) return undef, rather than have UB, for negative x.	Justin Lebar	2017-01-27	1	-5/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Some frontends emit a speculate-and-select idiom for sqrt, wherein they compute sqrt(x), check if x is negative, and select NaN if it is:
	    %cmp = fcmp olt double %a, -0.000000e+00
	    %sqrt = call double @llvm.sqrt.f64(double %a)
	    %ret = select i1 %cmp, double 0x7FF8000000000000, double %sqrt
	This is technically UB as the LangRef is written today if %a is ever less than -0. But emitting code that's compliant with the current definition of sqrt would require a branch, which would then prevent us from matching this idiom in SelectionDAG (which we do today -- ISD::FSQRT has defined behavior on negative inputs), because SelectionDAG looks at one BB at a time. Nothing in LLVM takes advantage of this undefined behavior, as far as we can tell, and the fact that llvm.sqrt has UB dates from its initial addition to the LangRef.
	Reviewers: arsenm, mehdi_amini, hfinkel Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D28797 llvm-svn: 293242
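For readers less familiar with the IR, the speculate-and-select idiom corresponds roughly to the C++ below. This is an illustration, not what any particular frontend emits; `sqrtOrNaN` is a hypothetical name, and IEEE 754 floating point is assumed so that `std::sqrt` of a negative input yields NaN rather than trapping.

```cpp
#include <cmath>
#include <limits>

// Speculate-and-select: compute the square root unconditionally, then pick
// NaN when the input compares less than -0.0.
double sqrtOrNaN(double A) {
  double Sqrt = std::sqrt(A); // speculated before the sign check
  bool IsNegative = A < -0.0; // mirrors `fcmp olt double %a, -0.0`
  return IsNegative ? std::numeric_limits<double>::quiet_NaN() : Sqrt;
}
```

Because -0.0 does not compare less than -0.0, an input of negative zero takes the speculated result and no NaN is selected.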
*	Revert a couple of InstCombine/Guard checkins	Sanjoy Das	2017-01-26	1	-29/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change reverts:
	  r293061: "[InstCombine] Canonicalize guards for NOT OR condition"
	  r293058: "[InstCombine] Canonicalize guards for AND condition"
	They miscompile cases like:
```
declare void @llvm.experimental.guard(i1, ...)

define void @test_guard_not_or(i1 %A, i1 %B) {
  %C = or i1 %A, %B
  %D = xor i1 %C, true
  call void(i1, ...) @llvm.experimental.guard(i1 %D, i32 20, i32 30)[ "deopt"() ]
  ret void
}
```
	because they do transfer the `i32 20, i32 30` parameters to newly created guard instructions. llvm-svn: 293227
*	NewGVN: Fix bug exposed by PR31761	Daniel Berlin	2017-01-26	1	-43/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This does not actually fix the testcase in PR31761 (discussion is ongoing on the testcase), but does fix a bug it exposes, where stores were not properly clobbering loads. We accomplish this by unifying the memory equivalence infrastructure back into the normal congruence infrastructure, and then properly destroying congruence classes when memory state leaders disappear. Reviewers: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29195 llvm-svn: 293216
*	[InstCombine] fold (X >>u C) << C --> X & (-1 << C)	Sanjay Patel	2017-01-26	1	-18/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We already have this fold when the lshr has one use, but it doesn't need that restriction. We may be able to remove some code from foldShiftedShift(). Also, move the similar: (X << C) >>u C --> X & (-1 >>u C) ...directly into visitLShr to help clean up foldShiftByConstOfShiftByConst(). That whole function seems questionable since it is called by commonShiftTransforms(), but there's really not much in common if we're checking the shift opcodes for every fold. llvm-svn: 293215
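Both identities mentioned in this message can be checked directly on 32-bit unsigned integers. The snippet below is a plain arithmetic sketch, not InstCombine code; the function names are hypothetical, and `~uint32_t{0}` plays the role of `-1` in the commit's notation.

```cpp
#include <cassert>
#include <cstdint>

// (X >>u C) << C clears the low C bits of X...
uint32_t shrThenShl(uint32_t X, unsigned C) {
  assert(C < 32 && "shift amount must be less than the bit width");
  return (X >> C) << C;
}
// ...which is the same as X & (-1 << C).
uint32_t clearLowBits(uint32_t X, unsigned C) {
  assert(C < 32);
  return X & (~uint32_t{0} << C);
}

// (X << C) >>u C clears the high C bits of X...
uint32_t shlThenShr(uint32_t X, unsigned C) {
  assert(C < 32);
  return (X << C) >> C;
}
// ...which is the same as X & (-1 >>u C).
uint32_t clearHighBits(uint32_t X, unsigned C) {
  assert(C < 32);
  return X & (~uint32_t{0} >> C);
}
```

Each pair of shifts becomes a single mask, and the equivalence holds regardless of how many other uses the intermediate shift has.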
*	NewGVN: Add algorithm overview	Daniel Berlin	2017-01-26	1	-0/+21
\| \| \| \|	llvm-svn: 293212
*	[InstCombine] use m_APInt to allow (X << C) >>u C --> X & (-1 >>u C) with splat vectors	Sanjay Patel	2017-01-26	1	-16/+24
\| \| \| \| \| \|	llvm-svn: 293208
*	NewGVN: Make unreachable blocks be marked with unreachable	Daniel Berlin	2017-01-26	1	-18/+13
\| \| \| \|	llvm-svn: 293196
*	[LV] Fix an issue where forming LCSSA in the place that we did would	Chandler Carruth	2017-01-26	1	-4/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	change the set of uniform instructions in the loop causing an assert failure. The problem is that the legalization checking also builds data structures mapping various facts about the loop body. The immediate cause was the set of uniform instructions. If these then change when LCSSA is formed, the data structures would already have been built and become stale. The included test case triggered an assert in loop vectorize that was reduced out of the new PM's pipeline. The solution is to form LCSSA early enough that no information is cached across the changes made. The only really obvious position is outside of the main logic to vectorize the loop. This also has the advantage of removing one case where forming LCSSA could mutate the loop but we wouldn't track that as a "Changed" state. If it is significantly advantageous to do some legalization checking prior to this, we can do a more careful positioning but it seemed best to just back off to a safe position first. llvm-svn: 293168
*	[TargetTransformInfo] Refactor and improve getScalarizationOverhead()	Jonas Paulsson	2017-01-26	1	-33/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Refactoring to remove duplications of this method. New method getOperandsScalarizationOverhead() that looks at the present unique operands and adds extract costs for them. Old behaviour was to just add extract costs for one operand of the type always, which still happens in getArithmeticInstrCost() if no operands are provided by the caller. This is a good start on improving this, but there are more places that can be improved by using getOperandsScalarizationOverhead(). Review: Hal Finkel https://reviews.llvm.org/D29017 llvm-svn: 293155
*	[X86] Add demanded elts support for the inputs to pclmul intrinsic	Craig Topper	2017-01-26	1	-0/+38
\| \| \| \| \| \| \| \|	This intrinsic uses bit 0 and bit 4 of an immediate argument to determine which bits of its inputs to read. This patch uses this information to simplify the demanded elements of the input vectors. Differential Revision: https://reviews.llvm.org/D28979 llvm-svn: 293151
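The immediate decoding that this patch relies on can be sketched as follows. The commit only says that bits 0 and 4 select which input bits are read; the mapping used here (bit 0 picks the 64-bit half of the first operand, bit 4 picks the half of the second) follows the documented PCLMULQDQ encoding and should be read as an assumption of this sketch. `PclmulDemanded` and `demandedPclmulElts` are hypothetical names, not LLVM APIs.

```cpp
#include <cstdint>

// Which 64-bit element (0 = low, 1 = high) of each <2 x i64> operand a
// carry-less multiply reads for a given immediate.
struct PclmulDemanded {
  unsigned Op0Elt; // element read from the first operand
  unsigned Op1Elt; // element read from the second operand
};

PclmulDemanded demandedPclmulElts(uint8_t Imm) {
  PclmulDemanded D;
  D.Op0Elt = (Imm >> 0) & 1u; // bit 0 selects the half of operand 0
  D.Op1Elt = (Imm >> 4) & 1u; // bit 4 selects the half of operand 1
  return D;
}
```

Only one element of each input is ever read, so the other element is not demanded and whatever computes it can be simplified away.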
*	Revert test commit	Taewook Oh	2017-01-26	1	-1/+0
\| \| \| \|	llvm-svn: 293150
*	test commit	Taewook Oh	2017-01-26	1	-3/+4
\| \| \| \|	llvm-svn: 293148
*	[PM] Simplify the new PM interface to the loop unroller and expose two	Chandler Carruth	2017-01-26	1	-3/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	factory functions for the two modes the loop unroller is actually used in in-tree: simplified full-unrolling and the entire thing including partial unrolling. I've also wired these up to nice names so you can express both of these being in a pipeline easily. This is a precursor to actually enabling these parts of the O2 pipeline. Differential Revision: https://reviews.llvm.org/D28897 llvm-svn: 293136
*	[LoopUnroll] Properly update loopinfo for runtime unrolling by 2	Michael Kuperstein	2017-01-26	3	-10/+19
\| \| \| \| \| \| \| \| \| \| \|	Even when we don't create a remainder loop (that is, when we unroll by 2), we may duplicate nested loops into the remainder. This is complicated by the fact the remainder may itself be either inserted into an outer loop, or at the top level. In the latter case, we may need to create new top-level loops. Differential Revision: https://reviews.llvm.org/D29156 llvm-svn: 293124
*	[NewGVN] Skip uses in unreachable blocks.	Davide Italiano	2017-01-26	1	-0/+6
\| \| \| \| \| \| \| \|	Otherwise we ask for a domtree node that's not there, and we crash. Differential Revision: https://reviews.llvm.org/D29145 llvm-svn: 293122
*	LowerTypeTests: Ignore external globals with type metadata.	Peter Collingbourne	2017-01-26	1	-3/+6
\| \| \| \| \| \|	Thanks to Davide Italiano for finding the problem and providing a test case. llvm-svn: 293119
*	[NewGVN] Simplify folding a lambda used only once. NFCI.	Davide Italiano	2017-01-25	1	-5/+3
\| \| \| \|	llvm-svn: 293112