bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[LoopPeeling] Fix condition for phi-eliminating peeling	Max Kazantsev	2017-04-17	2	-1/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When peeling loops based on phis becoming invariant, we make the wrong loop size check. UP.Threshold should be compared against the total number of instructions after the transformation, which is equal to 2 * LoopSize when peeling one iteration. We should also check that the maximum allowed number of peeled iterations is not zero. Reviewers: sanjoy, anna, reames, mkuper Reviewed By: mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31753 llvm-svn: 300441
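The size check described in this message can be sketched as follows. This is a minimal illustration and not the LLVM code: `loop_size`, `threshold`, and `max_peel_count` are hypothetical stand-ins for the loop's instruction count, UP.Threshold, and the maximum allowed number of peeled iterations, and the use of a non-strict comparison is an assumption.

```python
def can_peel_one_iteration(loop_size: int, threshold: int, max_peel_count: int) -> bool:
    """Decide whether peeling one iteration fits the size budget (illustrative only)."""
    # Peeling must be allowed at all.
    if max_peel_count == 0:
        return False
    # After peeling one iteration the code holds the peeled copy plus the loop,
    # so the size to compare against the threshold is 2 * loop_size.
    return 2 * loop_size <= threshold


print(can_peel_one_iteration(loop_size=10, threshold=20, max_peel_count=1))
print(can_peel_one_iteration(loop_size=11, threshold=20, max_peel_count=1))
```

Comparing `loop_size` alone against the threshold would accept loops whose peeled form is up to twice the budget, which is the mistake the message describes.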
*	[InstCombine] Simplify 1/X for vectors.	Craig Topper	2017-04-17	1	-2/+5
\| \| \| \|	llvm-svn: 300439
*	[InstCombine] Add test cases for missing support for simplifying 1/X for vectors. NFC	Craig Topper	2017-04-17	1	-0/+18
\| \| \| \| \| \|	llvm-svn: 300438
*	[InstCombine] Add support for vector srem->urem.	Craig Topper	2017-04-17	1	-1/+1
\| \| \| \|	llvm-svn: 300437
*	[InstCombine] Add missing testcases for srem->urem conversion. The vector version isn't currently supported. NFC	Craig Topper	2017-04-17	1	-0/+22
\| \| \| \| \| \|	llvm-svn: 300436
*	[InstCombine] Add support for turning vector sdiv into udiv.	Craig Topper	2017-04-17	2	-8/+5
\| \| \| \|	llvm-svn: 300435
*	[InstCombine] Add test cases for missing support for turning vector sdiv into udiv. NFC	Craig Topper	2017-04-17	2	-0/+26
\| \| \| \| \| \|	llvm-svn: 300434
*	[X86][X86 intrinsics]Folding cmp(sub(a,b),0) into cmp(a,b) optimization	Michael Zuckerman	2017-04-16	1	-0/+181
\| \| \| \| \| \| \| \| \|	This patch adds a new optimization (folding cmp(sub(a,b),0) into cmp(a,b)) to the instCombineCall pass; it was written specifically for X86 CMP intrinsics. Differential Revision: https://reviews.llvm.org/D31398 llvm-svn: 300422
*	[InstCombine] allow (X != C1 && X != C2) and similar patterns to match splat vector constants	Sanjay Patel	2017-04-15	1	-9/+7
\| \| \| \| \| \|	llvm-svn: 300402
*	[InstCombine] add tests to show missing transforms for vectors; NFC	Sanjay Patel	2017-04-15	1	-0/+26
\| \| \| \|	llvm-svn: 300401
*	[InstCombine] (X != C1 && X != C2) --> (X \| (C1 ^ C2)) != C2	Sanjay Patel	2017-04-14	1	-10/+8
\| \| \| \| \| \| \| \| \| \|	...when C1 differs from C2 by one bit and C1 <u C2: http://rise4fun.com/Alive/Vuo And move related folds to a helper function. This reduces code duplication and will make it easier to remove the scalar-only restriction as a follow-up step. llvm-svn: 300364
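The fold in this message can be checked by brute force on small bit widths. The sketch below is not taken from the patch: the function name `fold_holds` is hypothetical, and it only confirms the stated identity for every X when C1 and C2 differ in exactly one bit and C1 <u C2.

```python
def fold_holds(bits: int = 8) -> bool:
    """Exhaustively check (X != C1 && X != C2) == ((X | (C1 ^ C2)) != C2)."""
    for c2 in range(1 << bits):
        for b in range(bits):
            bit = 1 << b
            if not c2 & bit:
                continue  # C2 must own the differing bit so that C1 <u C2
            c1 = c2 ^ bit  # C1 differs from C2 in exactly this one bit
            for x in range(1 << bits):
                original = (x != c1) and (x != c2)
                folded = (x | (c1 ^ c2)) != c2
                if original != folded:
                    return False
    return True


print(fold_holds(6))
```

Setting the differing bit in X maps both C1 and C2 onto C2 and nothing else, which is why one compare is enough.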
*	[InstCombine] Support folding a subtract with a constant LHS into a phi node	Craig Topper	2017-04-14	1	-9/+6
\| \| \| \| \| \| \| \| \| \| \| \|	We currently only support folding a subtract into a select but not a PHI. This fixes that. I had to fix an assumption in FoldOpIntoPhi that the PHI node was always in operand 0. Now we pass it in like we do for FoldOpIntoSelect. But we still require some dancing to find the Constant when we create the BinOp or ConstantExpr. This code is similar to what we do for selects. Since I touched all call sites, this also renames FoldOpIntoPhi to foldOpIntoPhi to match coding standards. Differential Revision: https://reviews.llvm.org/D31686 llvm-svn: 300363
*	[InstCombine] Regenerate test checks using script. NFC	Craig Topper	2017-04-14	1	-3/+4
\| \| \| \|	llvm-svn: 300360
*	[InstCombine] add/move tests for and/or-of-icmps equality folds; NFC	Sanjay Patel	2017-04-14	4	-111/+139
\| \| \| \|	llvm-svn: 300357
*	Update tests for the patch.	Alexey Bataev	2017-04-14	4	-559/+576
\| \| \| \|	llvm-svn: 300351
*	NewGVN: Don't propagate over phi backedges where undef causes us to	Daniel Berlin	2017-04-14	1	-0/+33
\| \| \| \| \| \| \| \|	have >1 value, unless we can prove the phi node is cycle free. Fixes PR 32607. llvm-svn: 300299
*	Revert accidentally-committed files in r300252.	Richard Smith	2017-04-13	1	-297/+0
\| \| \| \|	llvm-svn: 300253
*	Remove all allocation and divisions from GreatestCommonDivisor	Richard Smith	2017-04-13	1	-0/+297
\| \| \| \| \| \| \| \| \| \| \|	Switch from Euclid's algorithm to Stein's algorithm for computing GCD. This avoids the (expensive) APInt division operation in favour of bit operations. Remove all memory allocation from within the GCD loop by tweaking our `lshr` implementation so it can operate in-place. Differential Revision: https://reviews.llvm.org/D31968 llvm-svn: 300252
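For readers unfamiliar with Stein's algorithm, here is a minimal sketch on plain non-negative Python integers. It is not the APInt implementation from the patch; the name `binary_gcd` is hypothetical. It only shows how shifts and subtraction replace division.

```python
def binary_gcd(a: int, b: int) -> int:
    """Stein's (binary) GCD for non-negative integers, using only shifts and subtraction."""
    if a == 0:
        return b
    if b == 0:
        return a
    # Factor out the power of two common to both operands.
    shift = 0
    while ((a | b) & 1) == 0:
        a >>= 1
        b >>= 1
        shift += 1
    # Make a odd; any remaining factors of two in a are not shared.
    while (a & 1) == 0:
        a >>= 1
    while b != 0:
        # Strip factors of two from b; a is odd, so they cannot be common.
        while (b & 1) == 0:
            b >>= 1
        # Keep a <= b, then subtract: gcd(a, b) == gcd(a, b - a).
        if a > b:
            a, b = b, a
        b -= a
    # Restore the common power of two.
    return a << shift


print(binary_gcd(48, 18))
```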
*	[InstCombine] Fix !prof metadata preservation for invokes	Reid Kleckner	2017-04-13	1	-12/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Bug noticed by inspection. Extend the test to handle invokes as well as calls, and rewrite it to not depend on the inliner and other passes. Also simplify the call site replacement code with CallSite, similar to what I did to dead arg elimination and arg promotion (rL300235 and rL300229). Reviewers: danielcdh, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32041 llvm-svn: 300251
*	SamplePGO: convert callsite samples map key from callsite_location to callsite_location+callee_name	Dehao Chen	2017-04-13	2	-3/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: For iterative SamplePGO, an indirect call can be speculatively promoted to multiple direct calls and get inlined. All these promoted direct calls will share the same callsite location (offset+discriminator). With the current implementation, we cannot distinguish between different promotion candidates and their inlined instances. This patch adds callee_name to the key of the callsite sample map, and adds helper functions to get all inlined callee samples for a given callsite location. This helps the profile annotator promote the correct targets and inline them before annotation, and ensures all indirect call targets are annotated correctly. Reviewers: davidxl, dnovillo Reviewed By: davidxl Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D31950 llvm-svn: 300240
*	[LV] Fix the vector code generation for first order recurrence	Anna Thomas	2017-04-13	2	-12/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In first order recurrences where phi's are used outside the loop, we should generate an additional vector.extract of the second last element from the vectorized phi update. This is because we require the phi itself (which is the value at the second last iteration of the vector loop) and not the phi's update within the loop. Also fix the code gen when we just unroll, but don't vectorize. Fixes PR32396. Reviewers: mssimpso, mkuper, anemet Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D31979 llvm-svn: 300238
*	[InstCombine] fold X == 0 \|\| X == -1 to one compare (PR32524)	Sanjay Patel	2017-04-13	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is effectively a retry of: https://reviews.llvm.org/rL299851 but now we have tests and an assert to make sure the bug that was exposed with that attempt will not happen again. I'll fix the code duplication and missing sibling fold next, but I want to make this change as small as possible to reduce risk since I messed it up last time. This should fix: https://bugs.llvm.org/show_bug.cgi?id=32524 llvm-svn: 300236
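The message does not spell out the single compare that results. One well-known equivalent, used here purely as an assumption for illustration, is the unsigned range check (X + 1) <u 2 with wraparound arithmetic. The helper name `is_zero_or_all_ones` is hypothetical.

```python
def is_zero_or_all_ones(x: int, bits: int = 8) -> bool:
    """Test X == 0 || X == -1 with one unsigned compare on a bits-wide value."""
    mask = (1 << bits) - 1
    # Adding 1 wraps -1 (all ones) to 0 and moves 0 to 1, so both land below 2.
    return ((x + 1) & mask) < 2


# Compare against the two-compare form for every 8-bit value.
print(all(is_zero_or_all_ones(x) == (x == 0 or x == 0xFF) for x in range(256)))
```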
*	[ArgPromotion] Don't drop !prof metadata on promoted calls	Reid Kleckner	2017-04-13	1	-0/+23
\| \| \| \| \| \| \| \| \| \|	Noticed by inspection while doing attribute work. DAE, InstCombineCalls, and ArgPromotion have a fair amount of duplicated code for hacking on call sites, and you can find bugs by comparing them. Add a test case for this. llvm-svn: 300229
*	[Analysis] Support bitreverse in -demanded-bits pass	Brian Gesiak	2017-04-13	1	-0/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: * Add a bitreverse case in the demanded bits analysis pass. * Add tests for the bitreverse (and bswap) intrinsic in the demanded bits pass. * Add a test case to the BDCE tests: that manipulations to high-order bits are eliminated once the bits are reversed and then right-shifted. Reviewers: mkuper, jmolloy, hfinkel, trentxintong Reviewed By: jmolloy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31857 llvm-svn: 300215
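The idea behind a bitreverse case in a demanded-bits analysis can be sketched like this: if only some bits of the reversed result are demanded, then only the mirror-image bits of the input are demanded. The helpers below are hypothetical and operate on plain integers, not on LLVM's data structures.

```python
def bitreverse(x: int, bits: int = 8) -> int:
    """Reverse the order of the low `bits` bits of x."""
    result = 0
    for i in range(bits):
        if x & (1 << i):
            result |= 1 << (bits - 1 - i)
    return result


def demanded_input_bits(demanded_output: int, bits: int = 8) -> int:
    """Input bits that can influence the demanded bits of bitreverse(input)."""
    # Output bit i comes from input bit (bits - 1 - i), so the mask is mirrored.
    return bitreverse(demanded_output, bits)


print(bin(demanded_input_bits(0b00001111)))
```

This matches the BDCE test described in the message: after a reverse followed by a right shift, the bits that were originally high-order are no longer demanded, so manipulations of them are dead.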
*	[InstCombine] add/move tests for or-of-icmps; NFC	Sanjay Patel	2017-04-13	2	-26/+61
\| \| \| \| \| \| \|	If we had these tests, the bug caused by https://reviews.llvm.org/rL299851 would have been caught sooner. There's also an assert in the code that should have caught that bug, but the assert line itself has a bug. llvm-svn: 300201
*	Re-apply "[GVNHoist] Move GVNHoist to function simplification part of pipeline."	Geoff Berry	2017-04-13	1	-0/+38
\| \| \| \| \| \|	This reverts commit r296872 now that PR32153 has been fixed. llvm-svn: 300200
*	[InstCombine] Add vector version of a test to show missing optimization.	Craig Topper	2017-04-13	1	-0/+12
\| \| \| \|	llvm-svn: 300161
*	[InstCombine] fix wrong undef handling when converting select to shuffle	Sanjay Patel	2017-04-12	1	-3/+14
\| \| \| \| \| \| \| \| \| \| \| \| \|	As discussed in: https://bugs.llvm.org/show_bug.cgi?id=32486 ...the canonicalization of vector select to shufflevector does not hold up when undef elements are present in the condition vector. Try to make the undef handling clear in the code and the LangRef. Differential Revision: https://reviews.llvm.org/D31980 llvm-svn: 300092
*	[InstCombine] Teach SimplifyDemandedInstructionBits that even if we reach an instruction that has multiple uses, if we know all the bits for the demanded bits for this context we can go ahead and create a constant.	Craig Topper	2017-04-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently if we reach an instruction with multiple uses we know we can't do any optimizations to that instruction itself since we only have the demanded bits for one of the users. But if we know all of the bits are zero/one for that one user we can still go ahead and create a constant to give to that user. This might then reduce the instruction to having a single use and allow additional optimizations on the other path. This picks up an additional case that r300075 didn't catch. Differential Revision: https://reviews.llvm.org/D31552 llvm-svn: 300084
*	[SystemZ] Fix more target specific tests	Renato Golin	2017-04-12	1	-0/+2
\| \| \| \|	llvm-svn: 300081
*	Teach SimplifyDemandedUseBits that adding or subtracting 0s from every bit below the highest demanded bit can be simplified	Craig Topper	2017-04-12	1	-10/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we are adding/subtracting 0s below the highest demanded bit we can just use the other operand and remove the operation. My primary motivation is observing that we can call ShrinkDemandedConstant for the add/sub and create a 0 constant, rather than removing the add completely. In the case I saw, we modified the constant on an add instruction to a 0, but the add is not put into the worklist. So we didn't revisit it until the next InstCombine iteration. This caused an IR modification to remove the add and a subsequent iteration to be run. With this change we bypass the add in the first iteration and prevent the second iteration from changing anything. Differential Revision: https://reviews.llvm.org/D31120 llvm-svn: 300075
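The arithmetic fact behind this simplification can be checked exhaustively on small widths. The sketch below is an illustration under stated assumptions, not the InstCombine code: `can_bypass_add` and `bypass_is_sound` are hypothetical names, and "zeros from every bit up to the highest demanded bit" is modelled as the operand having no set bits at or below that position.

```python
def can_bypass_add(b: int, demanded: int) -> bool:
    """True if b is zero in every bit position up to the highest demanded bit."""
    if demanded == 0:
        return True
    high = demanded.bit_length() - 1
    low_mask = (1 << (high + 1)) - 1
    return (b & low_mask) == 0


def bypass_is_sound(bits: int = 6) -> bool:
    """Check that a + b and a - b agree with a on all demanded bits when bypassing."""
    mask = (1 << bits) - 1
    for demanded in range(1 << bits):
        for b in range(1 << bits):
            if not can_bypass_add(b, demanded):
                continue
            for a in range(1 << bits):
                if ((a + b) & mask & demanded) != (a & demanded):
                    return False
                if ((a - b) & mask & demanded) != (a & demanded):
                    return False
    return True


print(bypass_is_sound())
```

Because the operand contributes nothing at or below the highest demanded bit, no carry or borrow can reach a demanded bit, so the other operand can stand in for the whole add or sub.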
*	[InstCombine] morph an existing instruction instead of creating a new one	Sanjay Patel	2017-04-12	3	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \|	One potential way to make InstCombine (very slightly?) faster is to recycle instructions when possible instead of creating new ones. It's not explicitly stated AFAIK, but we don't consider this an "InstSimplify". We could, however, make a new layer to house transforms like this if that makes InstCombine more manageable (just throwing out an idea; not sure how much opportunity is actually here). Differential Revision: https://reviews.llvm.org/D31863 llvm-svn: 300067
*	Fix a RUN line in new test.	Jonas Paulsson	2017-04-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	Use '2>&1 \|' and not '\|&' to pipe debug output to FileCheck Hopefully handles a "shell parser error" on llvm-clang-x86_64-expensive-checks-win test/Transforms/SLPVectorizer/SystemZ/SLP-cmp-cost-query.ll llvm-svn: 300064
*	[SLPVectorizer] Pass the right type argument to getCmpSelInstrCost()	Jonas Paulsson	2017-04-12	1	-0/+36
\| \| \| \| \| \| \| \| \| \| \| \| \|	In getEntryCost(), make the scalar type for a compare instruction that of the operands, not i1. This is needed in order to call getCmpSelInstrCost() for a compare in a sensible way, the same way as the LoopVectorizer does. New test: test/Transforms/SLPVectorizer/SystemZ/SLP-cmp-cost-query.ll Review: Matthew Simpson https://reviews.llvm.org/D31601 llvm-svn: 300061
*	[LoopVectorizer] Improve handling of branches during cost estimation.	Jonas Paulsson	2017-04-12	1	-0/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The cost for a branch after vectorization is very different depending on if the vectorizer will if-convert the block (branch is eliminated), or if scalarized and predicated blocks will be produced (branch duplicated before each block). There is also the case of remaining scalar branches, such as the back-edge branch. This patch handles these cases differently with TTI based cost estimates. Review: Matthew Simpson https://reviews.llvm.org/D31175 llvm-svn: 300058
*	[LoopVectorizer, TTI] New method supportsEfficientVectorElementLoadStore()	Jonas Paulsson	2017-04-12	1	-0/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since SystemZ supports vector element load/store instructions, there is no need for extracts/inserts if a vector load/store gets scalarized. This patch lets Target specify that it supports such instructions by means of a new TTI hook that defaults to false. The use for this is in the LoopVectorizer getScalarizationOverhead() method, which will with this patch produce a smaller sum for a vector load/store on SystemZ. New test: test/Transforms/LoopVectorize/SystemZ/load-store-scalarization-cost.ll Review: Adam Nemet https://reviews.llvm.org/D30680 llvm-svn: 300056
*	[SystemZ] TargetTransformInfo cost functions implemented.	Jonas Paulsson	2017-04-12	1	-0/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(), getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(), getInterleavedMemoryOpCost() implemented. Interleaved access vectorization enabled. BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads, in which case the cost of the z/sext instruction becomes 0. Review: Ulrich Weigand, Renato Golin. https://reviews.llvm.org/D29631 llvm-svn: 300052
*	[LoadCombine] Avoid analysing dead basic blocks	Bjorn Pettersson	2017-04-12	1	-0/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Dead basic blocks may form a loop, for which SSA form is fulfilled, but with a circular def-use chain. LoadCombine could enter an infinite loop when analysing such dead code. This patch solves the problem by simply not analysing basic blocks that aren't forward reachable from function entry in LoadCombine. Fixes https://bugs.llvm.org/show_bug.cgi?id=27065 Reviewers: mehdi_amini, chandlerc, grosser, Bigcheese, davide Reviewed By: davide Subscribers: dberlin, zzheng, bjope, grandinj, Ka-Ka, materi, jholewinski, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D31032 llvm-svn: 300034
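Restricting an analysis to forward-reachable blocks can be sketched with a plain breadth-first walk over the control-flow graph. This is a generic illustration, not the LoadCombine change itself; the dictionary-based CFG and the name `forward_reachable` are assumptions made for the example.

```python
from collections import deque


def forward_reachable(cfg: dict, entry: str) -> set:
    """Return the set of blocks reachable from `entry` by following successor edges."""
    seen = {entry}
    worklist = deque([entry])
    while worklist:
        block = worklist.popleft()
        for succ in cfg.get(block, ()):
            if succ not in seen:
                seen.add(succ)
                worklist.append(succ)
    return seen


# Two dead blocks that form a loop and are never entered from "entry".
cfg = {
    "entry": ["exit"],
    "exit": [],
    "dead1": ["dead2"],
    "dead2": ["dead1"],
}
live = forward_reachable(cfg, "entry")
print(sorted(live))
```

A pass that only visits blocks in `live` never sees the circular def-use chain inside the dead loop.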
*	ThinLTOBitcodeWriter: keep comdats together, rename if leader is renamed	Bob Haarman	2017-04-12	1	-0/+80
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: COFF requires that every comdat contain a symbol with the same name as the comdat. ThinLTOBitcodeWriter renames symbols, which may cause this requirement to be violated. This change avoids such violations by renaming comdats if their leaders are renamed. It also keeps comdats together when splitting modules. Reviewers: pcc, mehdi_amini, tejohnson Reviewed By: pcc Subscribers: rnk, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D31963 llvm-svn: 300019
*	InstSimplify: A shuffle of a splat is always the splat itself	Zvi Rackover	2017-04-11	1	-2/+75
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Fold: shuffle (splat-shuffle), undef, M --> splat-shuffle Reviewers: spatel, RKSimon, craig.topper Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31527 llvm-svn: 299990
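A small model shows why shuffling a splat yields the splat again. The sketch assumes a simplified shuffle semantics (mask index -1 and lanes drawn from the undef operand are modelled as `None`) and uses hypothetical helper names; it is not the InstSimplify implementation.

```python
UNDEF = None  # stand-in for an undef lane


def shuffle(v1: list, v2: list, mask: list) -> list:
    """Pick lanes from the concatenation of v1 and v2; a negative index gives undef."""
    both = list(v1) + list(v2)
    return [UNDEF if m < 0 else both[m] for m in mask]


def refinable_to_splat(lanes: list, value) -> bool:
    """Every lane is either the splat value or undef, so the splat is a valid result."""
    return all(lane is UNDEF or lane == value for lane in lanes)


splat = [7, 7, 7, 7]
undef_vec = [UNDEF] * 4
result = shuffle(splat, undef_vec, [3, 0, -1, 2])
print(result, refinable_to_splat(result, 7))
```

Whatever the mask selects from the first operand is the splat value, and undef lanes may be refined to any value, including that one.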
*	[LV] Avoid vectorizing first order recurrence when phi uses are outside loop	Anna Thomas	2017-04-11	2	-3/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the vectorization of first order recurrence, we vectorize such that the last element in the vector will be the one extracted to pass into the scalar remainder loop. However, this is not true when a phi (other than the primary induction variable) is used outside the loop. In such a case, we need the value from the second last iteration (i.e. the phi value), not the last iteration (which would be the phi update). I've added a test case for this. Also see PR32396. A follow up patch would generate the correct code for such cases, and turn this vectorization on. Differential Revision: https://reviews.llvm.org/D31910 Reviewers: mssimpso llvm-svn: 299985
*	[InstSimplify] add tests for chains of shuffles; NFC	Sanjay Patel	2017-04-11	1	-0/+45
\| \| \| \|	llvm-svn: 299984
*	MemorySSA: Move to Analysis, from Transforms/Utils. It's used as	Daniel Berlin	2017-04-11	22	-1510/+0
\| \| \| \| \| \| \| \|	Analysis, it has Analysis passes, and once NewGVN is made an Analysis, this removes the cross dependency from Analysis to Transform/Utils. NFC. llvm-svn: 299980
*	[AddDiscriminators] Assign discriminators to MemIntrinsic calls.	Andrea Di Biagio	2017-04-11	1	-0/+104
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before this patch, pass AddDiscriminators always avoided assigning discriminators to intrinsic calls. This was done mainly for two reasons: 1) We wanted to minimize the number of base discriminators used. 2) We wanted to avoid non-deterministic discriminator assignment for different debug levels. Unfortunately, that approach was problematic for MemIntrinsic calls. MemIntrinsic calls can be split by SROA into loads and stores, and each new load/store instruction would obtain the debug location from the original intrinsic call. If we don't assign a discriminator to MemIntrinsic calls, then we cannot correctly set the discriminator for the newly created loads and stores. This may have a negative impact on the basic block weight computation performed by the SampleLoader. This patch fixes the issue by letting MemIntrinsic calls have a discriminator. Differential Revision: https://reviews.llvm.org/D31900 llvm-svn: 299972
*	[InstCombine] Add testcases for (B&A)^A -> ~B & A and (B\|A)^A -> B & ~A	Craig Topper	2017-04-11	1	-0/+88
\| \| \| \|	llvm-svn: 299971
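The two identities named in this subject line, (B&A)^A -> ~B & A and (B|A)^A -> B & ~A, can be confirmed by brute force. The checker below is a hypothetical helper written for illustration and is unrelated to the test file the commit adds.

```python
def identities_hold(bits: int = 4) -> bool:
    """Check (B & A) ^ A == ~B & A and (B | A) ^ A == B & ~A for all bits-wide values."""
    mask = (1 << bits) - 1
    for a in range(1 << bits):
        for b in range(1 << bits):
            not_a = ~a & mask  # bitwise NOT confined to the chosen width
            not_b = ~b & mask
            if ((b & a) ^ a) != (not_b & a):
                return False
            if ((b | a) ^ a) != (b & not_a):
                return False
    return True


print(identities_hold())
```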
*	[LV] Move first order recurrence test to common folder. NFC	Anna Thomas	2017-04-11	1	-0/+0
\| \| \| \|	llvm-svn: 299969
*	revert r299851 - [InstCombine] fix matching of or-of-icmps constants (PR32524)	Sanjay Patel	2017-04-11	1	-3/+4
\| \| \| \| \| \|	This is a candidate culprit for multiple bot fails, so reverting pending investigation. llvm-svn: 299955
*	[StripDeadDebug/DIFinder] Track inlined SPs	Keno Fischer	2017-04-11	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In rL299692 I improved strip-dead-debug-info's ability to drop CUs that are not referenced from the current module. However, in doing so I neglected to realize that some SPs could be referenced entirely from inlined functions. It appears I was not the only one to make this mistake, because DebugInfoFinder doesn't find those SPs either. Fix this in DebugInfoFinder and then use that to make sure not to drop those CUs in strip-dead-debug-info. Reviewers: aprantl Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31904 llvm-svn: 299936
*	[InstCombine] Support weird size element types in dyn_castNegVal.	Craig Topper	2017-04-11	2	-4/+9
\| \| \| \|	llvm-svn: 299915
*	[LoopUnswitch] Fix a test case	Sanjoy Das	2017-04-11	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(h/t to Chandler for pointing this out) The test in question was not at all testing what it was supposed to test. We do not //care// about placing `!make.implicit` in inner constant branch (since it will be folded away anyway). We care about placing `!make.implicit` in the outer branch that switches between either version of the loop. Having said that, it is _correct_ to leave behind the `!make.implicit` in the inner branch, but there is no need to do so. llvm-svn: 299912