bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[InstCombine] allow (X != C1 && X != C2) and similar patterns to match splat ↵	Sanjay Patel	2017-04-15	1	-19/+19
\| \| \| \| \| \|	vector constants llvm-svn: 300402
*	[ProfileData] Unify getInstrProf*SectionName helpers	Vedant Kumar	2017-04-15	2	-31/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a version of D32090 that unifies all of the `getInstrProf*SectionName` helper functions. (Note: the build failures which D32090 would have addressed were fixed with r300352.) We should unify these helper functions because they are hard to use in their current form. E.g., we recently introduced more helpers to fix section naming for COFF files. This scheme doesn't totally succeed at hiding low-level details about section naming, so we should switch to an API that is easier to maintain. This is not an NFC commit because it fixes llvm-cov's testing support for COFF files (this falls out of the API change naturally). This is an area where we lack tests -- I will see about adding one as a follow up. Testing: check-clang, check-profile, check-llvm. Differential Revision: https://reviews.llvm.org/D32097 llvm-svn: 300381
*	[InstCombine] MakeAnd/Or/Xor handling to reuse previous APInt computations	Craig Topper	2017-04-14	1	-36/+46
\| \| \| \| \| \| \| \| \| \| \| \|	When checking if we should return a constant, we create some temporary APInts to see if we know all bits. But the exact computations we do are needed in several other locations in the same code. This patch moves them to named temporaries so we can reuse them. Ideally we'd write directly to KnownZero/One, but we currently seem to only write those variables after all the simplifications checks and I didn't want to change that with this patch. Differential Revision: https://reviews.llvm.org/D32094 llvm-svn: 300376
*	[IR] Make paramHasAttr to use arg indices instead of attr indices	Reid Kleckner	2017-04-14	4	-6/+6
\| \| \| \| \| \| \| \| \|	This avoids the confusing 'CS.paramHasAttr(ArgNo + 1, Foo)' pattern. Previously we were testing return value attributes with index 0, so I introduced hasReturnAttr() for that use case. llvm-svn: 300367
*	[InstCombine] (X != C1 && X != C2) --> (X \| (C1 ^ C2)) != C2	Sanjay Patel	2017-04-14	1	-36/+65
\| \| \| \| \| \| \| \| \| \|	...when C1 differs from C2 by one bit and C1 <u C2: http://rise4fun.com/Alive/Vuo And move related folds to a helper function. This reduces code duplication and will make it easier to remove the scalar-only restriction as a follow-up step. llvm-svn: 300364
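The fold described in the commit above can be checked by brute force. The following is a minimal sketch on 8-bit values, not the InstCombine code itself; the function names (`differByOneBit`, `originalForm`, `foldedForm`, `foldHoldsForAllBytes`) are hypothetical and chosen only for illustration.

```cpp
#include <cstdint>

// True when a and b differ in exactly one bit position.
bool differByOneBit(uint8_t a, uint8_t b) {
  uint8_t diff = static_cast<uint8_t>(a ^ b);
  return diff != 0 && (diff & (diff - 1)) == 0;
}

// Original pattern: two compares joined by a logical and.
bool originalForm(uint8_t x, uint8_t c1, uint8_t c2) {
  return x != c1 && x != c2;
}

// Folded pattern from the commit subject: one 'or' and one compare.
bool foldedForm(uint8_t x, uint8_t c1, uint8_t c2) {
  return static_cast<uint8_t>(x | (c1 ^ c2)) != c2;
}

// Compares both forms for every 8-bit x and every pair (c1, c2) that
// satisfies the stated preconditions: one differing bit and c1 <u c2.
bool foldHoldsForAllBytes() {
  for (unsigned c1 = 0; c1 < 256; ++c1) {
    for (unsigned c2 = c1 + 1; c2 < 256; ++c2) {
      uint8_t a = static_cast<uint8_t>(c1);
      uint8_t b = static_cast<uint8_t>(c2);
      if (!differByOneBit(a, b))
        continue;
      for (unsigned x = 0; x < 256; ++x) {
        uint8_t v = static_cast<uint8_t>(x);
        if (originalForm(v, a, b) != foldedForm(v, a, b))
          return false;
      }
    }
  }
  return true;
}
```

The precondition matters: because C1 <u C2 and they differ in one bit, that bit is set in C2 and clear in C1, so or-ing it into X maps both C1 and C2 onto C2 and nothing else.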
*	[InstCombine] Support folding a subtract with a constant LHS into a phi node	Craig Topper	2017-04-14	8	-28/+44
\| \| \| \| \| \| \| \| \| \| \| \|	We currently only support folding a subtract into a select but not a PHI. This fixes that. I had to fix an assumption in FoldOpIntoPhi that assumed the PHI node was always in operand 0. Now we pass it in like we do for FoldOpIntoSelect. But we still require some dancing to find the Constant when we create the BinOp or ConstantExpr. This code is similar to what we do for selects. Since I touched all call sites, this also renames FoldOpIntoPhi to foldOpIntoPhi to match coding standards. Differential Revision: https://reviews.llvm.org/D31686 llvm-svn: 300363
*	[InstCombine] Refactor SimplifyUsingDistributiveLaws to more explicitly skip ↵	Craig Topper	2017-04-14	1	-30/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	code when LHS/RHS aren't BinaryOperators Currently this code always makes 2 or 3 calls to tryFactorization regardless of whether the LHS/RHS are BinaryOperators. We make 3 calls when both operands are BinaryOperators with the same opcode. Or surprisingly, when neither are BinaryOperators. This is because getBinOpsForFactorization returns Instruction::BinaryOpsEnd when the operand is not a BinaryOperator. If both LHS and RHS are not BinaryOperators then they both have an Opcode of Instruction::BinaryOpsEnd. When this happens we rely on tryFactorization to early out due to A/B/C/D being null. Similar behavior occurs for the other calls, we rely on getBinOpsForFactorization having made A/B or C/D null to get tryFactorization to early out. We also rely on these null checks to check the result of getIdentityValue and early out for it. This patch refactors this to pull these checks up to SimplifyUsingDistributiveLaws so we don't rely on BinaryOpsEnd as a sentinel or this A/B/C/D null behavior. I think this makes this code easier to reason about. Should also give a tiny performance improvement for cases where the LHS or RHS isn't a BinaryOperator. Differential Revision: https://reviews.llvm.org/D31913 llvm-svn: 300353
*	[FunctionImport] assert(false) -> llvm_unreachable(). NFCI.	Davide Italiano	2017-04-14	1	-1/+1
\| \| \| \|	llvm-svn: 300344
*	Tighten the API for ScalarEvolutionNormalization	Sanjoy Das	2017-04-14	1	-4/+3
\| \| \| \|	llvm-svn: 300331
*	Remove NormalizeAutodetect; NFC	Sanjoy Das	2017-04-14	1	-9/+3
\| \| \| \| \| \| \| \| \|	It is cleaner to have a callback based system where the logic of whether an add recurrence is normalized or not lives on IVUsers. This is one step in a multi-step cleanup. llvm-svn: 300330
*	[LV] Remove implicit single basic block assumption	Gil Rapaport	2017-04-14	1	-6/+5
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch is part of D28975's breakdown - no change in output intended. LV's code currently assumes the vectorized loop is a single basic block up until predicateInstructions() is called. This patch removes two manifestations of this assumption (loop phi incoming values, dominator tree update) by replacing the use of vectorLoopBody with the vectorized loop's latch/header. Differential Revision: https://reviews.llvm.org/D32040 llvm-svn: 300310
*	[InstCombine] Use APInt::setSignBit and APInt::isNegative(). NFC	Craig Topper	2017-04-14	1	-3/+3
\| \| \| \|	llvm-svn: 300305
*	Fix test failure on windows: pass module to getInstrProfXXName calls	Xinliang David Li	2017-04-14	1	-4/+4
\| \| \| \|	llvm-svn: 300302
*	NewGVN: Don't propagate over phi backedges where undef causes us to	Daniel Berlin	2017-04-14	1	-8/+149
\| \| \| \| \| \| \| \|	have >1 value, unless we can prove the phi node is cycle free. Fixes PR 32607. llvm-svn: 300299
*	[Profile] PE binary coverage bug fix	Xinliang David Li	2017-04-13	2	-12/+11
\| \| \| \| \| \| \| \|	PR/32584 Differential Revision: https://reviews.llvm.org/D32023 llvm-svn: 300277
*	[IR] Make getParamAttributes take argument numbers, not ArgNo+1	Reid Kleckner	2017-04-13	5	-35/+34
\| \| \| \| \| \| \| \| \| \| \| \|	Add hasParamAttribute() and use it instead of hasAttribute(ArgNo+1, Kind) everywhere. The fact that the AttributeList index for an argument is ArgNo+1 should be a hidden implementation detail. NFC llvm-svn: 300272
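The indexing convention these attribute commits hide can be illustrated with a toy container. This is only a sketch: `ToyAttributeList` and its members are hypothetical and are not LLVM's `AttributeList`. It assumes just what the commit messages state, namely that slot 0 holds return attributes and argument `ArgNo` lives at slot `ArgNo + 1`.

```cpp
#include <map>
#include <set>
#include <string>

// Toy stand-in for an attribute list (hypothetical, not the LLVM class).
class ToyAttributeList {
  // Internal slot layout, kept private so callers never write ArgNo + 1.
  enum : unsigned { ReturnIndex = 0, FirstArgIndex = 1 };
  std::map<unsigned, std::set<std::string>> Slots;

  bool hasAt(unsigned Index, const std::string &Kind) const {
    auto It = Slots.find(Index);
    return It != Slots.end() && It->second.count(Kind) != 0;
  }

public:
  void addParamAttribute(unsigned ArgNo, const std::string &Kind) {
    Slots[ArgNo + FirstArgIndex].insert(Kind);
  }
  void addReturnAttribute(const std::string &Kind) {
    Slots[ReturnIndex].insert(Kind);
  }

  // Callers pass plain argument numbers; the offset stays internal.
  bool hasParamAttribute(unsigned ArgNo, const std::string &Kind) const {
    return hasAt(ArgNo + FirstArgIndex, Kind);
  }
  bool hasReturnAttribute(const std::string &Kind) const {
    return hasAt(ReturnIndex, Kind);
  }
};
```

With accessors split this way, a query for the first argument and a query for the return value can no longer be confused by an off-by-one slot index.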
*	[InstCombine] Use APInt::getBitsSetFrom instead of inverting the result of ↵	Craig Topper	2017-04-13	1	-4/+2
\| \| \| \| \| \|	getLowBitsSet. NFC llvm-svn: 300265
*	[LCSSA] Efficiently compute blocks dominating at least one exit.	Davide Italiano	2017-04-13	1	-19/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For LCSSA purposes, loop BBs not dominating any of the exits aren't interesting, as none of the values defined in these blocks can be used outside the loop. The way the code computed this information was by comparing each BB of the loop with each of the exit blocks and asking the dominator tree about their dominance relation. This is slow. A more efficient way, implemented here, is to start from the exit blocks and walk the dominator tree upwards until we hit a header. By transitivity, all the blocks we encounter in our path dominate an exit. For the testcase provided in PR31851, this reduces compile time on `opt -O2` by ~25%, going from 1m47s to 1m22s. Thanks to Dan/MichaelZ for discussions/suggesting the approach/review. Differential Revision: https://reviews.llvm.org/D31843 llvm-svn: 300255
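The upward walk described in the LCSSA commit above can be sketched on a simplified representation. This is not the LLVM implementation: blocks are plain integer ids, `Idom[B]` is assumed to hold the immediate dominator of block `B` (the function entry being its own immediate dominator), and `blocksDominatingExits` is a hypothetical name.

```cpp
#include <set>
#include <vector>

// Collects the loop blocks that dominate at least one exit block by walking
// the immediate-dominator chain from each exit up to the loop header.
std::set<int> blocksDominatingExits(const std::vector<int> &Idom,
                                    const std::set<int> &LoopBlocks,
                                    int Header,
                                    const std::vector<int> &ExitBlocks) {
  std::set<int> Result;
  for (int Exit : ExitBlocks) {
    int B = Idom[Exit];
    // Every block on the chain dominates Exit, by transitivity.
    while (LoopBlocks.count(B) != 0) {
      if (!Result.insert(B).second)
        break; // Already visited, so the rest of the chain is too.
      if (B == Header)
        break; // Nothing above the header belongs to this loop.
      B = Idom[B];
    }
  }
  return Result;
}
```

Each loop block is inserted at most once, so the cost is bounded by the number of loop blocks plus the number of exits, rather than by their product as in the pairwise dominance queries the commit replaces.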
*	Revert accidentally-committed files in r300252.	Richard Smith	2017-04-13	1	-403/+0
\| \| \| \|	llvm-svn: 300253
*	Remove all allocation and divisions from GreatestCommonDivisor	Richard Smith	2017-04-13	1	-0/+403
\| \| \| \| \| \| \| \| \| \| \|	Switch from Euclid's algorithm to Stein's algorithm for computing GCD. This avoids the (expensive) APInt division operation in favour of bit operations. Remove all memory allocation from within the GCD loop by tweaking our `lshr` implementation so it can operate in-place. Differential Revision: https://reviews.llvm.org/D31968 llvm-svn: 300252
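For reference, Stein's (binary) GCD algorithm can be sketched on fixed-width integers. This is an illustration only, assuming `uint64_t` operands instead of `APInt`; `binaryGcd` and `countTrailingZeros` are hypothetical helper names and do not reproduce the committed code.

```cpp
#include <cstdint>
#include <utility>

// Counts trailing zero bits of a nonzero value using only shifts.
unsigned countTrailingZeros(uint64_t V) {
  unsigned N = 0;
  while ((V & 1) == 0) {
    V >>= 1;
    ++N;
  }
  return N;
}

// Binary GCD: uses shifts, compares, and subtraction, with no division.
uint64_t binaryGcd(uint64_t A, uint64_t B) {
  if (A == 0)
    return B;
  if (B == 0)
    return A;
  // The power of two shared by both operands is restored at the end.
  unsigned Shift = countTrailingZeros(A | B);
  A >>= countTrailingZeros(A); // A is odd from here on.
  while (B != 0) {
    B >>= countTrailingZeros(B); // Make B odd as well.
    if (A > B)
      std::swap(A, B); // Keep A <= B so the subtraction cannot wrap.
    B -= A; // Odd minus odd is even, possibly zero.
  }
  return A << Shift;
}
```

The loop modifies its operands in place, which mirrors the commit's second point about avoiding allocation inside the GCD loop.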
*	[InstCombine] Fix !prof metadata preservation for invokes	Reid Kleckner	2017-04-13	1	-18/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Bug noticed by inspection. Extend the test to handle invokes as well as calls, and rewrite it to not depend on the inliner and other passes. Also simplify the call site replacement code with CallSite, similar to what I did to dead arg elimination and arg promotion (rL300235 and rL300229). Reviewers: danielcdh, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32041 llvm-svn: 300251
*	[LCSSA] Assert that we always have a valid loop.	Davide Italiano	2017-04-13	1	-0/+1
\| \| \| \| \| \| \|	We could otherwise add BBs not belonging to a loop in `formLCSSA` and later crash when trying to iterate the loop blocks. llvm-svn: 300244
*	[LCSSA] Remove spurious whitespaces. NFCI.	Davide Italiano	2017-04-13	1	-1/+1
\| \| \| \|	llvm-svn: 300243
*	[LCSSA] Use `auto` when the type is obvious. NFCI.	Davide Italiano	2017-04-13	1	-3/+3
\| \| \| \|	llvm-svn: 300242
*	SamplePGO: convert callsite samples map key from callsite_location to ↵	Dehao Chen	2017-04-13	1	-39/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	callsite_location+callee_name Summary: For iterative SamplePGO, an indirect call can be speculatively promoted to multiple direct calls and get inlined. All these promoted direct calls will share the same callsite location (offset+discriminator). With the current implementation, we cannot distinguish between different promotion candidates and their inlined instances. This patch adds callee_name to the key of the callsite sample map and adds helper functions to get all inlined callee samples for a given callsite location. This helps the profile annotator promote the correct targets and inline them before annotation, and ensures all indirect call targets are annotated correctly. Reviewers: davidxl, dnovillo Reviewed By: davidxl Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D31950 llvm-svn: 300240
*	[LV] Fix the vector code generation for first order recurrence	Anna Thomas	2017-04-13	2	-24/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In first order recurrences where phi's are used outside the loop, we should generate an additional vector.extract of the second last element from the vectorized phi update. This is because we require the phi itself (which is the value at the second last iteration of the vector loop) and not the phi's update within the loop. Also fix the code gen when we just unroll, but don't vectorize. Fixes PR32396. Reviewers: mssimpso, mkuper, anemet Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D31979 llvm-svn: 300238
*	[InstCombine] fold X == 0 \|\| X == -1 to one compare (PR32524)	Sanjay Patel	2017-04-13	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is effectively a retry of: https://reviews.llvm.org/rL299851 but now we have tests and an assert to make sure the bug that was exposed with that attempt will not happen again. I'll fix the code duplication and missing sibling fold next, but I want to make this change as small as possible to reduce risk since I messed it up last time. This should fix: https://bugs.llvm.org/show_bug.cgi?id=32524 llvm-svn: 300236
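The commit message above does not spell out the single compare it produces. A common way to express `X == 0 || X == -1` with one compare is `(X + 1) u< 2`, and the sketch below assumes that form; it is a brute-force check on 8-bit values with hypothetical function names, not the InstCombine code.

```cpp
#include <cstdint>

// Original pattern: two equality compares joined by a logical or.
// 0xFF is -1 in 8-bit two's complement.
bool twoCompares(uint8_t X) { return X == 0 || X == 0xFF; }

// Assumed folded pattern: add one (with wraparound), then a single
// unsigned less-than compare against 2.
bool oneCompare(uint8_t X) { return static_cast<uint8_t>(X + 1) < 2; }

// Checks that both forms agree on every 8-bit input.
bool formsAgreeForAllBytes() {
  for (unsigned V = 0; V < 256; ++V) {
    uint8_t X = static_cast<uint8_t>(V);
    if (twoCompares(X) != oneCompare(X))
      return false;
  }
  return true;
}
```

Adding one maps -1 to 0 and 0 to 1, so exactly those two inputs land below 2 in unsigned order.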
*	[DAE] Simplify call site replacement code with CallSite NFC	Reid Kleckner	2017-04-13	1	-27/+24
\| \| \| \|	llvm-svn: 300235
*	[InstCombine] Simplify attribute code with new AttributeList::get NFC	Reid Kleckner	2017-04-13	1	-31/+20
\| \| \| \|	llvm-svn: 300230
*	[ArgPromotion] Don't drop !prof metadata on promoted calls	Reid Kleckner	2017-04-13	1	-1/+4
\| \| \| \| \| \| \| \| \| \|	Noticed by inspection while doing attribute work. DAE, InstCombineCalls, and ArgPromotion have a fair amount of duplicated code for hacking on call sites, and you can find bugs by comparing them. Add a test case for this. llvm-svn: 300229
*	[InstCombine] use similar ops for related folds; NFCI	Sanjay Patel	2017-04-13	1	-10/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's less efficient to produce 'ule' than 'ult' since we know we're going to canonicalize to 'ult', but we shouldn't have duplicated code for these folds. As a trade-off, this was a pretty terrible way to make a '2'. :) if (LHSC == SubOne(RHSC)) AddC = ConstantExpr::getSub(AddOne(RHSC), LHSC); The next steps are to share the code to fix PR32524 and add the missing 'and' fold that was left out when PR14708 was fixed: https://bugs.llvm.org/show_bug.cgi?id=14708 llvm-svn: 300222
*	[InstCombine] fix assert to not always be true	Sanjay Patel	2017-04-13	1	-1/+1
\| \| \| \|	llvm-svn: 300202
*	Re-apply "[GVNHoist] Move GVNHoist to function simplification part of pipeline."	Geoff Berry	2017-04-13	1	-2/+2
\| \| \| \| \| \|	This reverts commit r296872 now that PR32153 has been fixed. llvm-svn: 300200
*	[LV] Refactor ILV to provide vectorizeInstruction(); NFC	Ayal Zaks	2017-04-13	1	-310/+302
\| \| \| \| \| \| \| \| \| \| \|	Refactoring InnerLoopVectorizer's vectorizeBlockInLoop() to provide vectorizeInstruction(). Aligning DeadInstructions with its only user. Facilitates driving the transformation by VPlan - follows https://reviews.llvm.org/D28975 and its tentative breakdown. Differential Revision: https://reviews.llvm.org/D31997 llvm-svn: 300183
*	[IR] Take func, ret, and arg attrs separately in AttributeList::get	Reid Kleckner	2017-04-13	4	-82/+55
\| \| \| \| \| \| \| \| \| \| \| \| \|	This seems like a much more natural API, based on Derek Schuff's comments on r300015. It further hides the implementation detail of AttributeList that function attributes come last and appear at index ~0U, which is easy for the user to screw up. git diff says it saves code as well: 97 insertions(+), 137 deletions(-) This also makes it easier to change the implementation, which I want to do next. llvm-svn: 300153
*	[InstCombine] Teach SimplifyMultipleUseDemandedBits to handle And/Or/Xor ↵	Craig Topper	2017-04-12	1	-11/+46
\| \| \| \| \| \| \| \|	known bits using the LHS/RHS known bits it already acquired without recursing back into computeKnownBits. This replicates the known bits and constant creation code from the single use case for these instructions and adds it here. The computeKnownBits and constant creation code for other instructions is now in the default case of the opcode switch. llvm-svn: 300094
*	[InstCombine] Remove unreachable code for turning an And where all demanded ↵	Craig Topper	2017-04-12	1	-4/+0
\| \| \| \| \| \| \| \|	bits on both sides are known to be zero into a constant 0. We already handled a superset check that included the known ones too and folded to a constant that may include ones. But it can also handle the case of no ones. llvm-svn: 300093
*	[InstCombine] fix wrong undef handling when converting select to shuffle	Sanjay Patel	2017-04-12	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	As discussed in: https://bugs.llvm.org/show_bug.cgi?id=32486 ...the canonicalization of vector select to shufflevector does not hold up when undef elements are present in the condition vector. Try to make the undef handling clear in the code and the LangRef. Differential Revision: https://reviews.llvm.org/D31980 llvm-svn: 300092
*	[InstCombine] In SimplifyMultipleUseDemandedBits, use a switch instead of ↵	Craig Topper	2017-04-12	1	-3/+11
\| \| \| \| \| \|	cascaded ifs on opcode. NFC llvm-svn: 300085
*	[InstCombine] Teach SimplifyDemandedInstructionBits that even if we reach an ↵	Craig Topper	2017-04-12	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	instruction that has multiple uses, if we know all the bits for the demanded bits for this context we can go ahead and create a constant. Currently if we reach an instruction with multiple uses we know we can't do any optimizations to that instruction itself since we only have the demanded bits for one of the users. But if we know all of the bits are zero/one for that one user we can still go ahead and create a constant to give to that user. This might then reduce the instruction to having a single use and allow additional optimizations on the other path. This picks up an additional case that r300075 didn't catch. Differential Revision: https://reviews.llvm.org/D31552 llvm-svn: 300084
*	[InstCombine] Move portion of SimplifyDemandedUseBits that deals with ↵	Craig Topper	2017-04-12	2	-76/+103
\| \| \| \| \| \|	instructions with multiple uses out to a separate method. NFCI llvm-svn: 300082
*	Teach SimplifyDemandedUseBits that adding or subtracting 0s from every bit ↵	Craig Topper	2017-04-12	1	-1/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	below the highest demanded bit can be simplified If we are adding/subtracting 0s below the highest demanded bit we can just use the other operand and remove the operation. My primary motivation is observing that we can call ShrinkDemandedConstant for the add/sub and create a 0 constant, rather than removing the add completely. In the case I saw, we modified the constant on an add instruction to a 0, but the add is not put into the worklist. So we didn't revisit it until the next InstCombine iteration. This caused an IR modification to remove the add and a subsequent iteration to be run. With this change we bypass the add in the first iteration and prevent the second iteration from changing anything. Differential Revision: https://reviews.llvm.org/D31120 llvm-svn: 300075
*	[InstCombine] morph an existing instruction instead of creating a new one	Sanjay Patel	2017-04-12	1	-7/+6
\| \| \| \| \| \| \| \| \| \| \| \|	One potential way to make InstCombine (very slightly?) faster is to recycle instructions when possible instead of creating new ones. It's not explicitly stated AFAIK, but we don't consider this an "InstSimplify". We could, however, make a new layer to house transforms like this if that makes InstCombine more manageable (just throwing out an idea; not sure how much opportunity is actually here). Differential Revision: https://reviews.llvm.org/D31863 llvm-svn: 300067
*	[SLPVectorizer] Pass the right type argument to getCmpSelInstrCost()	Jonas Paulsson	2017-04-12	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	In getEntryCost(), make the scalar type for a compare instruction that of the operands, not i1. This is needed in order to call getCmpSelInstrCost() for a compare in a sensible way, the same way as the LoopVectorizer does. New test: test/Transforms/SLPVectorizer/SystemZ/SLP-cmp-cost-query.ll Review: Matthew Simpson https://reviews.llvm.org/D31601 llvm-svn: 300061
*	[LoopVectorizer] Improve handling of branches during cost estimation.	Jonas Paulsson	2017-04-12	1	-1/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The cost for a branch after vectorization is very different depending on if the vectorizer will if-convert the block (branch is eliminated), or if scalarized and predicated blocks will be produced (branch duplicated before each block). There is also the case of remaining scalar branches, such as the back-edge branch. This patch handles these cases differently with TTI based cost estimates. Review: Matthew Simpson https://reviews.llvm.org/D31175 llvm-svn: 300058
*	[LoopVectorizer, TTI] New method supportsEfficientVectorElementLoadStore()	Jonas Paulsson	2017-04-12	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since SystemZ supports vector element load/store instructions, there is no need for extracts/inserts if a vector load/store gets scalarized. This patch lets Target specify that it supports such instructions by means of a new TTI hook that defaults to false. The use for this is in the LoopVectorizer getScalarizationOverhead() method, which will with this patch produce a smaller sum for a vector load/store on SystemZ. New test: test/Transforms/LoopVectorize/SystemZ/load-store-scalarization-cost.ll Review: Adam Nemet https://reviews.llvm.org/D30680 llvm-svn: 300056
*	[SystemZ] TargetTransformInfo cost functions implemented.	Jonas Paulsson	2017-04-12	4	-24/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(), getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(), getInterleavedMemoryOpCost() implemented. Interleaved access vectorization enabled. BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads, in which case the cost of the z/sext instruction becomes 0. Review: Ulrich Weigand, Renato Golin. https://reviews.llvm.org/D29631 llvm-svn: 300052
*	[LoadCombine] Avoid analysing dead basic blocks	Bjorn Pettersson	2017-04-12	1	-1/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Dead basic blocks may be forming a loop, for which SSA form is fulfilled, but with a circular def-use chain. LoadCombine could enter an infinite loop when analysing such dead code. This patch solves the problem by simply not analysing basic blocks that aren't forward reachable from function entry in LoadCombine. Fixes https://bugs.llvm.org/show_bug.cgi?id=27065 Reviewers: mehdi_amini, chandlerc, grosser, Bigcheese, davide Reviewed By: davide Subscribers: dberlin, zzheng, bjope, grandinj, Ka-Ka, materi, jholewinski, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D31032 llvm-svn: 300034
*	[IR] Redesign the case iterator in SwitchInst to actually be an iterator	Chandler Carruth	2017-04-12	15	-93/+81
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	and to expose a handle to represent the actual case rather than having the iterator return a reference to itself. All of this allows the iterator to be used with common STL facilities, standard algorithms, etc. Doing this exposed some missing facilities in the iterator facade that I've fixed and required some work to the actual iterator to fully support the necessary API. Differential Revision: https://reviews.llvm.org/D31548 llvm-svn: 300032
*	[InstCombine][IR] Add a commutable BinOp matcher. Use it to reduce some ↵	Craig Topper	2017-04-12	1	-2/+1
\| \| \| \| \| \|	code. NFC llvm-svn: 300030