bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Simplify test for sret attribute in instcombine	Reid Kleckner	2017-04-19	2	-15/+29
\| \| \| \| \| \| \| \| \|	This change is correct because the verifier requires that at most one argument be marked 'sret'. NFC, removes a use of AttributeList slot APIs. llvm-svn: 300784
*	[InstCombine] Add frem constant folding test (PR3316)	Simon Pilgrim	2017-04-19	1	-0/+9
\| \| \| \|	llvm-svn: 300757
*	[InstCombine] Add frem constant folding test (PR32177)	Simon Pilgrim	2017-04-19	1	-0/+9
\| \| \| \|	llvm-svn: 300750
*	StructurizeCFG: Directly invert cmp instructions	Matt Arsenault	2017-04-19	3	-6/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The most common case for a branch condition is a single use compare. Directly invert the branch predicate rather than adding a lot of xor i1 true which the DAG will have to fold later. This produces nicer to read structurizer output. This produces some random changes in codegen due to the DAG swapping branch conditions itself, and then does a poor job of dealing with those inverts. llvm-svn: 300732
*	[GVN] Don't coerce non-integral pointers to integers or vice versa	Sanjoy Das	2017-04-19	2	-0/+78
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: See http://llvm.org/docs/LangRef.html#non-integral-pointer-type The NewGVN test does not fail without these changes (perhaps it does try to coerce pointers <-> integers to begin with?), but I added the test case anyway. Reviewers: dberlin Subscribers: mcrosier, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D32208 llvm-svn: 300730
*	[InstSimplify] fold identity shuffles (recursing if needed)	Sanjay Patel	2017-04-19	1	-26/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch simplifies the examples from D31509 and D31927 (PR30630) and catches the basic identity shuffle tests that Zvi recently added. I'm not sure if we have something like this in DAGCombiner, but we should? It's worth noting that "MaxRecurse / RecursionLimit" is only 3 on entry at the moment. We might want to bump that up if there are longer shuffle chains like this in the wild. For now, we're ignoring shuffles that have undef mask elements because it's not clear how those should be handled. Differential Revision: https://reviews.llvm.org/D31960 llvm-svn: 300714
*	[InstSimplify] Deduce correct type for vector GEP.	Davide Italiano	2017-04-19	1	-1/+25
\| \| \| \| \| \| \| \| \| \|	InstSimplify returned the wrong type when simplifying a vector GEP and we ended up crashing when trying to replace all uses with the new value. Fixes PR32697. Differential Revision: https://reviews.llvm.org/D32180 llvm-svn: 300693
*	Regenerate test. NFCI.	Simon Pilgrim	2017-04-19	1	-8/+9
\| \| \| \|	llvm-svn: 300683
*	Revert r300657 due to crashes in stage2 of bootstraps:	Chandler Carruth	2017-04-19	1	-89/+0
\| \| \| \| \| \| \| \| \| \| \| \|	http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/2476/steps/build-stage2-LLVMgold.so/logs/stdio http://bb.pgr.jp/builders/clang-3stage-x86_64-linux/builds/15036/steps/build_llvmclang/logs/stdio I've updated the commit thread, reverting to get the bots back to green. Original commit summary: [JumpThread] We want to fold (not thread) when all predecessor go to single BB's successor. llvm-svn: 300662
*	[JumpThread] We want to fold (not thread) when all predecessor go to single ↵	Xin Tong	2017-04-19	1	-0/+89
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	BB's successor. Summary: When all predecessors go to a single successor of the current BB, we want to fold (not thread). Reviewers: efriedma, sanjoy Reviewed By: sanjoy Subscribers: dberlin, majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D30869 llvm-svn: 300657
*	[SLP vectorizer] Allow phi node reordering in tryToVectorizeList.	Easwaran Raman	2017-04-18	1	-0/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In tryToVectorizeList, under a very limited circumstance (when entered from tryToVectorizePair), the values may be reordered (swapped) and the SLP tree is built with the new order. This extends that to the case when starting from phis in vectorizeChainsInBlock when there are exactly two phis. The textual order of phi nodes shouldn't really matter. Without this change, the loop body in the accompanying test case is fully vectorized when we swap the order of the phis but not with this order. While this doesn't solve the phi-ordering problem in a general way (for more than 2 phis), this is a simple fix that piggybacks on an existing mechanism and is useful in cases like multiplying two complex numbers. Differential revision: https://reviews.llvm.org/D32065 llvm-svn: 300574
*	Make globalaa-retained.ll test catch more cases.	Nikolai Bozhenov	2017-04-18	1	-3/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: * Add checks for store. That is needed because GlobalsAA is called twice in the current pipeline with different sets of Function passes following it. However, the loads are eliminated using instcombine which happens everywhere. On the other hand, DeadStoreElimination is performed only once so by checking for store we'll be able to catch more cases when GlobalsAA is invalidated unintentionally. * Add empty function above/below the test so that we don't depend on the relative order of instcombine/dead-store-elimination and the pass that invalidates the analysis (inside the same FunctionPassManager). Reviewers: kristof.beyls Reviewed By: kristof.beyls Subscribers: llvm-commits, n.bozhenov Differential Revision: https://reviews.llvm.org/D32015 Patch by Andrei Elovikov <andrei.elovikov@intel.com> llvm-svn: 300553
*	PR32382: Fix emitting complex DWARF expressions.	Adrian Prantl	2017-04-18	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The DWARF specification knows 3 kinds of non-empty simple location descriptions: 1. Register location descriptions - describe a variable in a register - consist of only a DW_OP_reg 2. Memory location descriptions - describe the address of a variable 3. Implicit location descriptions - describe the value of a variable - end with DW_OP_stack_value & friends The existing DwarfExpression code is pretty much ignorant of these restrictions. This used to not matter because we only emitted very short expressions that we happened to get right by accident. This patch makes DwarfExpression aware of the rules defined by the DWARF standard and now chooses the right kind of location description for each expression being emitted. This would have been an NFC commit (for the existing testsuite) if not for the way that clang describes captured block variables. Based on how the previous code in LLVM emitted locations, DW_OP_deref operations that should have come at the end of the expression are put at its beginning. Fixing this means changing the semantics of DIExpression, so this patch bumps the version number of DIExpression and implements a bitcode upgrade. There are two major changes in this patch: I had to fix the semantics of dbg.declare for describing function arguments. After this patch a dbg.declare always takes the address of a variable as the first argument, even if the argument is not an alloca. When lowering a DBG_VALUE, the decision of whether to emit a register location description or a memory location description depends on the MachineLocation — register machine locations may get promoted to memory locations based on their DIExpression. (Future) optimization passes that want to salvage implicit debug location for variables may do so by appending a DW_OP_stack_value. 
For example: DBG_VALUE, [RBP-8] --> DW_OP_fbreg -8 DBG_VALUE, RAX --> DW_OP_reg0 +0 DBG_VALUE, RAX, DIExpression(DW_OP_deref) --> DW_OP_reg0 +0 All testcases that were modified were regenerated from clang. I also added source-based testcases for each of these to the debuginfo-tests repository over the last week to make sure that no synchronized bugs slip in. The debuginfo-tests compile from source and run the debugger. https://bugs.llvm.org/show_bug.cgi?id=32382 <rdar://problem/31205000> Differential Revision: https://reviews.llvm.org/D31439 llvm-svn: 300522
*	Build SymbolMap in SampleProfileLoader to help matching function names with ↵	Dehao Chen	2017-04-17	2	-0/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	suffix. Summary: If there is a suffix added to the function name (e.g. module hash added by thinLTO), we will not be able to find a match in the profile as the suffix does not exist in the profile. This patch builds a map from function name to Function *. The map includes the entry for the stripped function name so that inlineHotFunctions can find the corresponding function to promote/inline. Reviewers: davidxl, dnovillo, tejohnson Reviewed By: davidxl Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D31952 llvm-svn: 300507
*	[InstCombine] Matchers work with both ConstExpr and Instructions.	Davide Italiano	2017-04-17	1	-0/+22
\| \| \| \| \| \| \| \| \| \|	So, `cast<Instruction>` is not guaranteed to succeed. Change the code so that we create a new constant and use it in the newly created instruction, as it's done in other places in InstCombine. OK'ed by Sanjay/Craig. Fixes PR32686. llvm-svn: 300495
*	[InstSimplify] add/move tests for (icmp X, C1 & icmp X, C2); NFC	Sanjay Patel	2017-04-17	2	-20/+2912
\| \| \| \| \| \|	We simplify based on range intersection, but we're missing folds. llvm-svn: 300493
*	[CodeGenPrepare] Fix crash due to an invalid CFG	Brendon Cahoon	2017-04-17	1	-0/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The splitIndirectCriticalEdges function generates an invalid CFG when the 'Target' basic block is a loop to itself. When this occurs, the code that updates the predecessor terminator needs to update the terminator in the split basic block. This occurs when there is an edge from block D back to D. Since D is split into D0 and D1, the code needs to update the terminator in D1. But D1 is not in the OtherPreds vector, so it was not getting updated. Differential Revision: https://reviews.llvm.org/D32126 llvm-svn: 300480
*	AMDGPU: SimplifyDemandedElts for image intrinsics	Matt Arsenault	2017-04-17	1	-6/+1190
\| \| \| \| \| \| \| \|	Causes some VGPR usage improvements in shaderdb, but introduces some SGPR spilling regressions due to random scheduling changes later. llvm-svn: 300453
*	[LoopPeeling] Get rid of Phis that become invariant after N steps	Max Kazantsev	2017-04-17	1	-3/+146
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch is a generalization of the improvement introduced in rL296898. Previously, we were able to peel one iteration of a loop to get rid of a Phi that becomes an invariant on the 2nd iteration. In the more general case, if a Phi becomes invariant after N iterations, we can peel N times and turn it into an invariant. In order to do this, for every Phi in the loop's header we define the Invariant Depth value, which is calculated as follows: Given %x = phi <Inputs from above the loop>, ..., [%y, %back.edge]. If %y is a loop invariant, then Depth(%x) = 1. If %y is a Phi from the loop header, Depth(%x) = Depth(%y) + 1. Otherwise, Depth(%x) is infinite. Notice that if we peel a loop, all Phis with Depth = 1 become invariants, and all other Phis with finite depth decrease the depth by 1. Thus, peeling N first iterations allows us to turn all Phis with Depth <= N into invariants. Reviewers: reames, apilipenko, mkuper, skatkov, anna, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31613 llvm-svn: 300446
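The Invariant Depth rule in the message above can be sketched outside LLVM. This is a minimal model, not the code from the patch: `header_phis`, `loop_invariants`, `invariant_depths` and `phis_invariant_after_peeling` are hypothetical names, and values are plain strings rather than LLVM IR values.

```python
import math


def invariant_depths(header_phis, loop_invariants):
    """Compute Depth(phi) for every phi in a loop header.

    header_phis maps each header phi name to the name of its back-edge
    input (hypothetical representation). loop_invariants is the set of
    value names that are invariant in the loop.
    """
    depths = {}
    in_progress = set()

    def depth(phi):
        if phi in depths:
            return depths[phi]
        if phi in in_progress:
            # A cycle of phis never settles to an invariant.
            return math.inf
        in_progress.add(phi)
        back = header_phis[phi]
        if back in loop_invariants:
            d = 1
        elif back in header_phis:
            d = depth(back) + 1
        else:
            d = math.inf
        in_progress.discard(phi)
        depths[phi] = d
        return d

    for phi in header_phis:
        depth(phi)
    return depths


def phis_invariant_after_peeling(depths, n):
    """Phis that become invariant once the first n iterations are peeled."""
    return {phi for phi, d in depths.items() if d <= n}
```

In this model a chain `x <- y <- z <- inv` gives depths 3, 2 and 1, so peeling three iterations would turn all three phis into invariants, while a phi fed by a loop-varying instruction keeps infinite depth.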
*	[LoopPeeling] Fix condition for phi-eliminating peeling	Max Kazantsev	2017-04-17	2	-1/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When peeling loops based on phis becoming invariants, we make a wrong loop size check. UP.Threshold should be compared against the total number of instructions after the transformation, which is equal to 2 * LoopSize in case of peeling one iteration. We should also check that the maximum allowed number of peeled iterations is not zero. Reviewers: sanjoy, anna, reames, mkuper Reviewed By: mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31753 llvm-svn: 300441
*	[InstCombine] Simplify 1/X for vectors.	Craig Topper	2017-04-17	1	-2/+5
\| \| \| \|	llvm-svn: 300439
*	[InstCombine] Add test cases for missing support for simplifying 1/X for ↵	Craig Topper	2017-04-17	1	-0/+18
\| \| \| \| \| \|	vectors. NFC llvm-svn: 300438
*	[InstCombine] Add support for vector srem->urem.	Craig Topper	2017-04-17	1	-1/+1
\| \| \| \|	llvm-svn: 300437
*	[InstCombine] Add missing testcases for srem->urem conversion. The vector ↵	Craig Topper	2017-04-17	1	-0/+22
\| \| \| \| \| \|	version isn't currently supported. NFC llvm-svn: 300436
*	[InstCombine] Add support for turning vector sdiv into udiv.	Craig Topper	2017-04-17	2	-8/+5
\| \| \| \|	llvm-svn: 300435
*	[InstCombine] Add test cases for missing support for turning vector sdiv ↵	Craig Topper	2017-04-17	2	-0/+26
\| \| \| \| \| \|	into udiv. NFC llvm-svn: 300434
*	[X86][X86 intrinsics]Folding cmp(sub(a,b),0) into cmp(a,b) optimization	Michael Zuckerman	2017-04-16	1	-0/+181
\| \| \| \| \| \| \| \| \|	This patch adds a new optimization (folding cmp(sub(a,b),0) into cmp(a,b)) to the instCombineCall pass and was written specifically for X86 CMP intrinsics. Differential Revision: https://reviews.llvm.org/D31398 llvm-svn: 300422
*	[InstCombine] allow (X != C1 && X != C2) and similar patterns to match splat ↵	Sanjay Patel	2017-04-15	1	-9/+7
\| \| \| \| \| \|	vector constants llvm-svn: 300402
*	[InstCombine] add tests to show missing transforms for vectors; NFC	Sanjay Patel	2017-04-15	1	-0/+26
\| \| \| \|	llvm-svn: 300401
*	[InstCombine] (X != C1 && X != C2) --> (X \| (C1 ^ C2)) != C2	Sanjay Patel	2017-04-14	1	-10/+8
\| \| \| \| \| \| \| \| \| \|	...when C1 differs from C2 by one bit and C1 <u C2: http://rise4fun.com/Alive/Vuo And move related folds to a helper function. This reduces code duplication and will make it easier to remove the scalar-only restriction as a follow-up step. llvm-svn: 300364
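The fold in the subject line can be checked exhaustively at a small bit width. This is only a brute-force sanity check written for illustration, not the Alive proof linked above; the function names are hypothetical.

```python
def ne_pair_original(x, c1, c2):
    """The source pattern: X != C1 && X != C2."""
    return x != c1 and x != c2


def ne_pair_folded(x, c1, c2):
    """The target pattern: (X | (C1 ^ C2)) != C2."""
    return (x | (c1 ^ c2)) != c2


def fold_applies(c1, c2):
    """Precondition from the message: C1 <u C2 and they differ in one bit."""
    diff = c1 ^ c2
    return c1 < c2 and diff != 0 and (diff & (diff - 1)) == 0


def fold_holds_for_width(bits):
    """Compare both patterns for every X and every qualifying constant pair."""
    limit = 1 << bits
    for c1 in range(limit):
        for c2 in range(limit):
            if not fold_applies(c1, c2):
                continue
            for x in range(limit):
                if ne_pair_original(x, c1, c2) != ne_pair_folded(x, c1, c2):
                    return False
    return True
```

The reasoning behind the check: C1 is C2 with one bit cleared, so OR-ing that bit into X yields C2 exactly when X was C1 or C2.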
*	[InstCombine] Support folding a subtract with a constant LHS into a phi node	Craig Topper	2017-04-14	1	-9/+6
\| \| \| \| \| \| \| \| \| \| \| \|	We currently only support folding a subtract into a select but not a PHI. This fixes that. I had to fix an assumption in FoldOpIntoPhi that assumed the PHI node was always in operand 0. Now we pass it in like we do for FoldOpIntoSelect. But we still require some dancing to find the Constant when we create the BinOp or ConstantExpr. This code is similar to what we do for selects. Since I touched all call sites, this also renames FoldOpIntoPhi to foldOpIntoPhi to match coding standards. Differential Revision: https://reviews.llvm.org/D31686 llvm-svn: 300363
*	[InstCombine] Regenerate test checks using script. NFC	Craig Topper	2017-04-14	1	-3/+4
\| \| \| \|	llvm-svn: 300360
*	[InstCombine] add/move tests for and/or-of-icmps equality folds; NFC	Sanjay Patel	2017-04-14	4	-111/+139
\| \| \| \|	llvm-svn: 300357
*	Update tests for the patch.	Alexey Bataev	2017-04-14	4	-559/+576
\| \| \| \|	llvm-svn: 300351
*	NewGVN: Don't propagate over phi backedges where undef causes us to	Daniel Berlin	2017-04-14	1	-0/+33
\| \| \| \| \| \| \| \|	have >1 value, unless we can prove the phi node is cycle free. Fixes PR 32607. llvm-svn: 300299
*	Revert accidentally-committed files in r300252.	Richard Smith	2017-04-13	1	-297/+0
\| \| \| \|	llvm-svn: 300253
*	Remove all allocation and divisions from GreatestCommonDivisor	Richard Smith	2017-04-13	1	-0/+297
\| \| \| \| \| \| \| \| \| \| \|	Switch from Euclid's algorithm to Stein's algorithm for computing GCD. This avoids the (expensive) APInt division operation in favour of bit operations. Remove all memory allocation from within the GCD loop by tweaking our `lshr` implementation so it can operate in-place. Differential Revision: https://reviews.llvm.org/D31968 llvm-svn: 300252
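For readers unfamiliar with Stein's algorithm, a minimal sketch on plain Python integers follows. It illustrates the shift-and-subtract idea only and is not the APInt implementation from the patch; `binary_gcd` is a hypothetical name.

```python
def binary_gcd(a, b):
    """Greatest common divisor of two non-negative integers (Stein's algorithm)."""
    if a == 0:
        return b
    if b == 0:
        return a
    # Factor out the powers of two shared by both operands.
    shift = 0
    while ((a | b) & 1) == 0:
        a >>= 1
        b >>= 1
        shift += 1
    # Make a odd; it stays odd for the rest of the loop.
    while (a & 1) == 0:
        a >>= 1
    while b != 0:
        # Any remaining factors of two in b are not common, so drop them.
        while (b & 1) == 0:
            b >>= 1
        # Keep a <= b, then replace b with the (even) difference.
        if a > b:
            a, b = b, a
        b -= a
    return a << shift
```

Only shifts, comparisons and subtractions appear in the loop, which is the property the message relies on to avoid division.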
*	[InstCombine] Fix !prof metadata preservation for invokes	Reid Kleckner	2017-04-13	1	-12/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Bug noticed by inspection. Extend the test to handle invokes as well as calls, and rewrite it to not depend on the inliner and other passes. Also simplify the call site replacement code with CallSite, similar to what I did to dead arg elimination and arg promotion (rL300235 and rL300229). Reviewers: danielcdh, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32041 llvm-svn: 300251
*	SamplePGO: convert callsite samples map key from callsite_location to ↵	Dehao Chen	2017-04-13	2	-3/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	callsite_location+callee_name Summary: For iterative SamplePGO, an indirect call can be speculatively promoted to multiple direct calls and get inlined. All these promoted direct calls will share the same callsite location (offset+discriminator). With the current implementation, we cannot distinguish between different promotion candidates and their inlined instances. This patch adds callee_name to the key of the callsite sample map and adds helper functions to get all inlined callee samples for a given callsite location. This helps the profile annotator promote the correct targets and inline them before annotation, and ensures all indirect call targets are annotated correctly. Reviewers: davidxl, dnovillo Reviewed By: davidxl Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D31950 llvm-svn: 300240
*	[LV] Fix the vector code generation for first order recurrence	Anna Thomas	2017-04-13	2	-12/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In first order recurrences where phi's are used outside the loop, we should generate an additional vector.extract of the second last element from the vectorized phi update. This is because we require the phi itself (which is the value at the second last iteration of the vector loop) and not the phi's update within the loop. Also fix the code gen when we just unroll, but don't vectorize. Fixes PR32396. Reviewers: mssimpso, mkuper, anemet Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D31979 llvm-svn: 300238
*	[InstCombine] fold X == 0 \|\| X == -1 to one compare (PR32524)	Sanjay Patel	2017-04-13	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is effectively a retry of: https://reviews.llvm.org/rL299851 but now we have tests and an assert to make sure the bug that was exposed with that attempt will not happen again. I'll fix the code duplication and missing sibling fold next, but I want to make this change as small as possible to reduce risk since I messed it up last time. This should fix: https://bugs.llvm.org/show_bug.cgi?id=32524 llvm-svn: 300236
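The message does not spell out the single compare that replaces the pair. One well-known form is an add followed by an unsigned compare, (X + 1) <u 2; treating that as the result of this fold is an assumption here. The sketch below models fixed-width wraparound with a mask, and its function names are hypothetical.

```python
def is_zero_or_all_ones(x, bits):
    """The source pattern: X == 0 || X == -1 at the given bit width."""
    all_ones = (1 << bits) - 1  # -1 viewed as an unsigned value
    return x == 0 or x == all_ones


def is_zero_or_all_ones_folded(x, bits):
    """Assumed single-compare form: (X + 1) <u 2 with wraparound."""
    mask = (1 << bits) - 1
    return ((x + 1) & mask) < 2


def forms_agree(bits):
    """Check both forms for every value of the given width."""
    return all(
        is_zero_or_all_ones(x, bits) == is_zero_or_all_ones_folded(x, bits)
        for x in range(1 << bits)
    )
```

Adding 1 maps -1 to 0 and 0 to 1, so both accepted values land in the range [0, 2) and every other value lands outside it.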
*	[ArgPromotion] Don't drop !prof metadata on promoted calls	Reid Kleckner	2017-04-13	1	-0/+23
\| \| \| \| \| \| \| \| \| \|	Noticed by inspection while doing attribute work. DAE, InstCombineCalls, and ArgPromotion have a fair amount of duplicated code for hacking on call sites, and you can find bugs by comparing them. Add a test case for this. llvm-svn: 300229
*	[Analysis] Support bitreverse in -demanded-bits pass	Brian Gesiak	2017-04-13	1	-0/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: * Add a bitreverse case in the demanded bits analysis pass. * Add tests for the bitreverse (and bswap) intrinsic in the demanded bits pass. * Add a test case to the BDCE tests: that manipulations to high-order bits are eliminated once the bits are reversed and then right-shifted. Reviewers: mkuper, jmolloy, hfinkel, trentxintong Reviewed By: jmolloy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31857 llvm-svn: 300215
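A small model of the idea, assuming the natural rule that the bits demanded from a bitreverse operand are the bit-reversed demanded bits of its result. This is an illustrative sketch rather than the code added to the pass, and all function names are hypothetical.

```python
def bitreverse(value, bits):
    """Reverse the order of the low `bits` bits of value."""
    result = 0
    for i in range(bits):
        if value & (1 << i):
            result |= 1 << (bits - 1 - i)
    return result


def demanded_through_bitreverse(demanded_result, bits):
    """Bits demanded from the operand, given the bits demanded from the result."""
    return bitreverse(demanded_result, bits)


def demanded_through_lshr(demanded_result, shift, bits):
    """Bits demanded from the operand of a logical right shift by a constant."""
    mask = (1 << bits) - 1
    return (demanded_result << shift) & mask


def demanded_for_reverse_then_shift(shift, bits):
    """Bits of x demanded by lshr(bitreverse(x), shift) when every result bit is used."""
    all_bits = (1 << bits) - 1
    after_shift = demanded_through_lshr(all_bits, shift, bits)
    return demanded_through_bitreverse(after_shift, bits)
```

For an 8-bit value and a shift of 4, only the low four bits of x are demanded in this model, which matches the BDCE scenario in the message: work done on the high-order bits is dead once the value is reversed and shifted right.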
*	[InstCombine] add/move tests for or-of-icmps; NFC	Sanjay Patel	2017-04-13	2	-26/+61
\| \| \| \| \| \| \|	If we had these tests, the bug caused by https://reviews.llvm.org/rL299851 would have been caught sooner. There's also an assert in the code that should have caught that bug, but the assert line itself has a bug. llvm-svn: 300201
*	Re-apply "[GVNHoist] Move GVNHoist to function simplification part of pipeline."	Geoff Berry	2017-04-13	1	-0/+38
\| \| \| \| \| \|	This reverts commit r296872 now that PR32153 has been fixed. llvm-svn: 300200
*	[InstCombine] Add vector version of a test to show missing optimization.	Craig Topper	2017-04-13	1	-0/+12
\| \| \| \|	llvm-svn: 300161
*	[InstCombine] fix wrong undef handling when converting select to shuffle	Sanjay Patel	2017-04-12	1	-3/+14
\| \| \| \| \| \| \| \| \| \| \| \| \|	As discussed in: https://bugs.llvm.org/show_bug.cgi?id=32486 ...the canonicalization of vector select to shufflevector does not hold up when undef elements are present in the condition vector. Try to make the undef handling clear in the code and the LangRef. Differential Revision: https://reviews.llvm.org/D31980 llvm-svn: 300092
*	[InstCombine] Teach SimplifyDemandedInstructionBits that even if we reach an ↵	Craig Topper	2017-04-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	instruction that has multiple uses, if we know all the bits for the demanded bits for this context we can go ahead and create a constant. Currently if we reach an instruction with multiple uses we know we can't do any optimizations to that instruction itself since we only have the demanded bits for one of the users. But if we know all of the bits are zero/one for that one user we can still go ahead and create a constant to give to that user. This might then reduce the instruction to having a single use and allow additional optimizations on the other path. This picks up an additional case that r300075 didn't catch. Differential Revision: https://reviews.llvm.org/D31552 llvm-svn: 300084
*	[SystemZ] Fix more target specific tests	Renato Golin	2017-04-12	1	-0/+2
\| \| \| \|	llvm-svn: 300081
*	Teach SimplifyDemandedUseBits that adding or subtracting 0s from every bit ↵	Craig Topper	2017-04-12	1	-10/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	below the highest demanded bit can be simplified. If we are adding/subtracting 0s below the highest demanded bit we can just use the other operand and remove the operation. My primary motivation is observing that we can call ShrinkDemandedConstant for the add/sub and create a 0 constant, rather than removing the add completely. In the case I saw, we modified the constant on an add instruction to a 0, but the add is not put into the worklist. So we didn't revisit it until the next InstCombine iteration. This caused an IR modification to remove add and a subsequent iteration to be run. With this change we bypass the add in the first iteration and prevent the second iteration from changing anything. Differential Revision: https://reviews.llvm.org/D31120 llvm-svn: 300075
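The reasoning above rests on carries and borrows only moving toward higher bits. A brute-force sketch at a small width is given below, assuming the condition is that the right-hand operand is known zero in every bit up to and including the highest demanded bit; `can_bypass_add_sub` and `bypass_is_sound` are hypothetical names, not functions from InstCombine.

```python
def can_bypass_add_sub(rhs_known_zero, demanded):
    """True if the right-hand operand is known zero in every bit at or
    below the highest demanded bit, so the add/sub can be skipped."""
    if demanded == 0:
        return True
    low_mask = (1 << demanded.bit_length()) - 1
    return (rhs_known_zero & low_mask) == low_mask


def bypass_is_sound(bits):
    """Exhaustively confirm that x + c and x - c agree with x on the
    demanded bits whenever c is zero in all bits up to the highest
    demanded bit."""
    full = (1 << bits) - 1
    for demanded in range(1, full + 1):
        low_mask = (1 << demanded.bit_length()) - 1
        for c in range(full + 1):
            if c & low_mask:
                continue
            for x in range(full + 1):
                expected = x & demanded
                if (((x + c) & full) & demanded) != expected:
                    return False
                if (((x - c) & full) & demanded) != expected:
                    return False
    return True
```

With a demanded mask of 0b0110, an operand known zero in its low four bits qualifies, while one known zero only in its low two bits does not.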