bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[InstSimplify] fold sdiv/srem based on compare of dividend and divisor	Sanjay Patel	2017-09-14	1	-4/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This should bring signed div/rem analysis up to the same level as unsigned. We use icmp simplification to determine when the divisor is known greater than the dividend. Each positive test is followed by a negative test to show that we're not overstepping the boundaries of the known bits. There are extra tests for the signed-min-value special cases. Alive proofs: http://rise4fun.com/Alive/WI5 Differential Revision: https://reviews.llvm.org/D37713 llvm-svn: 313264
*	[InstSimplify] clean up div/rem handling; NFCI	Sanjay Patel	2017-09-14	1	-54/+44
\| \| \| \| \| \| \| \| \| \| \|	The idea to make an 'isDivZero' helper was suggested for the signed case in D37713: https://reviews.llvm.org/D37713 This clean-up makes it clear that D37713 is just filling the gap for signed div/rem, removes unnecessary code, and allows us to remove a bit of duplicated code from the planned improvement in D37713. llvm-svn: 313261
*	[PM/CGSCC] Teach the CGSCC pass manager components to gracefully handle	Chandler Carruth	2017-09-14	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	invalidated SCCs even when we do not have an updated SCC to redirect towards. This comes up in a fairly subtle and surprising circumstance: we need to have a connected but internal node in the call graph which later becomes a disconnected island, and then gets deleted. All of this needs to happen mid-CGSCC walk. Because it is disconnected, we have no way of computing a new "current" SCC when it gets deleted. Instead, we need to explicitly check for a deleted "current" SCC and bail out of the current CGSCC step. This will bubble all the way up to the post-order walk and then resume correctly. I've included minimal tests for this bug. The specific behavior matches something we've seen in the wild with the new PM combined with ThinLTO and sample PGO, but I've not yet confirmed whether this is the only issue there. llvm-svn: 313242
*	[LV] Fix maximum legal VF calculation	Alon Kom	2017-09-14	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes pr34283, which exposed that the computation of maximum legal width for vectorization was wrong, because it relied on MaxInterleaveFactor to obtain the maximum stride used in the loop, however not all strided accesses in the loop have an interleave-group associated with them. Instead of recording the maximum stride in the loop, which can be over conservative (e.g. if the access with the maximum stride is not involved in the dependence limitation), this patch tracks the actual maximum legal width imposed by accesses that are involved in dependencies. Differential Revision: https://reviews.llvm.org/D37507 llvm-svn: 313237
*	[Inliner] Add another way to compute full inline cost.	Easwaran Raman	2017-09-13	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Full inline cost is computed when -inline-cost-full is true or ORE is non-null. This patch adds another way to compute full inline cost by adding a field to InlineParams. This will be used by SampleProfileLoader to check legality of inlining a callee that it wants to inline. Reviewers: danielcdh, haicheng Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37819 llvm-svn: 313185
*	Add options to dump PGO counts in text.	Hiroshi Yamauchi	2017-09-13	1	-14/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Added text options to -pgo-view-counts and -pgo-view-raw-counts that dump block frequency and branch probability info in text. This is useful when the graph is very large and complex (the dot command crashes, lines/edges too close to tell apart, hard to navigate without textual search) or simply when text is preferred. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37776 llvm-svn: 313159
*	[ThinLTO] AliasSummary should not have any references	Teresa Johnson	2017-09-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: References should only be on the aliasee. Reviewers: pcc Subscribers: llvm-commits, inglorion Differential Revision: https://reviews.llvm.org/D37814 llvm-svn: 313158
*	[LAA] Allow more run-time alias checks by coercing pointer expressions to ↵	Silviu Baranga	2017-09-12	1	-27/+96
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	AddRecExprs Summary: LAA can only emit run-time alias checks for pointers with affine AddRec SCEV expressions. However, non-AddRecExprs can be now be converted to affine AddRecExprs using SCEV predicates. This change tries to add the minimal set of SCEV predicates in order to enable run-time alias checking. Reviewers: anemet, mzolotukhin, mkuper, sanjoy, hfinkel Reviewed By: hfinkel Subscribers: mssimpso, Ayal, dorit, roman.shirokiy, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D17080 llvm-svn: 313012
*	[ScalarEvolution] Refactor forgetLoop() to improve performance	Marcello Maggioni	2017-09-11	1	-40/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	forgetLoop() has pretty bad performance because it goes over the same instructions over and over again in particular when nested loop are involved. The refactoring changes the function to a not-recursive function and reusing the allocation for data-structures and the Visited set. NFCI Differential Revision: https://reviews.llvm.org/D37659 llvm-svn: 312920
*	[InstSimplify] reorder methods; NFC	Sanjay Patel	2017-09-11	1	-230/+229
\| \| \| \| \| \| \| \| \| \| \|	I'm trying to refactor some shared code for integer div/rem, but I keep having to scroll through fdiv. The FP ops have nothing in common with the integer ops, so I'm moving FP below everything else. While here, improve a couple of comments and fix some formatting. llvm-svn: 312913
*	[InstSimplify] refactor udiv/urem code and add tests; NFCI	Sanjay Patel	2017-09-10	1	-18/+31
\| \| \| \| \| \| \| \| \|	This removes some duplicated code and makes it easier to support signed div/rem in a similar way if we want to do that. Note that the existing comments were not accurate - we don't need a constant divisor to simplify; icmp simplification does more than that. But as the added tests show, it could go even further. llvm-svn: 312885
*	Merge isKnownNonNull into isKnownNonZero	Nuno Lopes	2017-09-09	4	-110/+102
\| \| \| \| \| \| \| \| \|	It now knows the tricks of both functions. Also, fix a bug that considered allocas of non-zero address space to be always non null Differential Revision: https://reviews.llvm.org/D37628 llvm-svn: 312869
*	[DivRempairs] add a pass to optimize div/rem pairs (PR31028)	Sanjay Patel	2017-09-09	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented as an independent pass, so there's no stretching of scope and feature creep for an existing pass. I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost this same functionality as an addition to CGP in the motivating example of PR31028: https://bugs.llvm.org/show_bug.cgi?id=31028 The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and undo the hoisting that is done here. Decomposing remainder may allow removing some code from the backend (PPC and possibly others). Differential Revision: https://reviews.llvm.org/D37121 llvm-svn: 312862
*	[TargetTransformInfo] Add a new public interface getInstructionCost	Guozhi Wei	2017-09-08	2	-562/+571
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current TargetTransformInfo can support throughput cost model and code size model, but sometimes we also need instruction latency cost model in different optimizations. Hal suggested we need a single public interface to query the different cost of an instruction. So I proposed following interface: enum TargetCostKind { TCK_RecipThroughput, ///< Reciprocal throughput. TCK_Latency, ///< The latency of instruction. TCK_CodeSize ///< Instruction code size. }; int getInstructionCost(const Instruction *I, enum TargetCostKind kind) const; All clients should mainly use this function to query the cost of an instruction, parameter <kind> specifies the desired cost model. This patch also provides a simple default implementation of getInstructionLatency. The default getInstructionLatency provides latency numbers for only small number of instruction classes, those latency numbers are only reasonable for modern OOO processors. It can be extended in following ways: Add more detail into this function. Add getXXXLatency function and call it from here. Implement target specific getInstructionLatency function. Differential Revision: https://reviews.llvm.org/D37170 llvm-svn: 312832
*	[SLP] Support for horizontal min/max reduction.	Alexey Bataev	2017-09-08	2	-49/+115
\| \| \| \| \| \| \| \| \| \| \| \| \|	SLP vectorizer supports horizontal reductions for Add/FAdd binary operations. Patch adds support for horizontal min/max reductions. Function getReductionCost() is split to getArithmeticReductionCost() for binary operation reductions and getMinMaxReductionCost() for min/max reductions. Patch fixes PR26956. Differential revision: https://reviews.llvm.org/D27846 llvm-svn: 312791
*	ModuleSummaryAnalysis: Correctly handle all function operand references.	Peter Collingbourne	2017-09-07	1	-7/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current code that handles personality functions when creating a module summary does not correctly handle the case where a function's personality function operand refers to the function indirectly (e.g. via a bitcast). This patch handles such cases by treating personality function references like any other reference, i.e. by adding them to the function's reference list. This has the minor side benefit of allowing personality functions to participate in early dead stripping. We do this by calling findRefEdges on the function itself. This way we also end up handling other function operands (specifically prefix data and prologue data) for free. Differential Revision: https://reviews.llvm.org/D37553 llvm-svn: 312698
*	InstSimplify: canonicalize is idempotent	Matt Arsenault	2017-09-07	1	-0/+1
\| \| \| \|	llvm-svn: 312685
*	Fix PR33878: BasicAA incorrectly assumes different address spaces don't alias	Nuno Lopes	2017-09-06	1	-5/+0
\| \| \| \| \| \| \| \| \|	Remove code that assumed that a nullptr of address space != 0 couldnt alias with a non-null pointer. This is incorrect, since nothing can be concluded about a null pointer in an address space != 0. This code was written before address spaces were introduced Differential Revision: https://reviews.llvm.org/D37518 llvm-svn: 312648
*	[ValueTracking, InstCombine] canonicalize fcmp ord/uno with non-NAN ops to ↵	Sanjay Patel	2017-09-05	1	-0/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	null constants This is a preliminary step towards solving the remaining part of PR27145 - IR for isfinite(): https://bugs.llvm.org/show_bug.cgi?id=27145 In order to solve that one more generally, we need to add matching for and/or of fcmp ord/uno with a constant operand. But while looking at those patterns, I realized we were missing a canonicalization for nonzero constants. Rather than limiting to just folds for constants, we're adding a general value tracking method for this based on an existing DAG helper. By transforming everything to 0.0, we can simplify the existing code in foldLogicOfFCmps() and pick up missing vector folds. Differential Revision: https://reviews.llvm.org/D37427 llvm-svn: 312591
*	[SCEV] Ensure ScalarEvolution::createAddRecFromPHIWithCastsImpl properly ↵	Daniel Neilson	2017-09-05	1	-15/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	handles out of range truncations of the start and accum values Summary: When constructing the predicate P1 in ScalarEvolution::createAddRecFromPHIWithCastsImpl() it is possible for the PHISCEV from which the predicate is constructed to be a SCEVConstant instead of a SCEVAddRec. If this happens, then the cast<SCEVAddRec>(PHISCEV) in the code will assert. Such a PHISCEV is possible if either the start value or the accumulator value is a constant value that not equal to its truncated value, and if the truncated value is zero. This patch adds tests that demonstrate the cast<> assertion, and fixes this problem by checking whether the PHISCEV is a constant before constructing the P1 predicate; if it is, then P1 is equivalent to one of P2 or P3. Additionally, if we know that the start value or accumulator value are constants then we check whether the P2 and/or P3 predicates are known false at compile time; if either is, then we bail out of constructing the AddRec. Reviewers: sanjoy, mkazantsev, silviu.baranga Reviewed By: mkazantsev Subscribers: mkazantsev, llvm-commits Differential Revision: https://reviews.llvm.org/D37265 llvm-svn: 312568
*	[Analysis, Transforms] Fix some Clang-tidy modernize and Include What You ↵	Eugene Zelenko	2017-09-01	2	-26/+69
\| \| \| \| \| \|	Use warnings; other minor fixes (NFC). llvm-svn: 312383
*	[InstCombine][InstSimplify] Teach decomposeBitTestICmp to look through ↵	Craig Topper	2017-09-01	2	-16/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	truncate instructions This patch teaches decomposeBitTestICmp to look through truncate instructions on the input to the compare. If a truncate is found it will now return the pre-truncated Value and appropriately extend the APInt mask. This allows some code to be removed from InstSimplify that was doing this functionality. This allows InstCombine's bit test combining code to match a pre-truncate Value with the same Value appear with an 'and' on another icmp. Or it allows us to combine a truncate to i16 and a truncate to i8. This also required removing the type check from the beginning of getMaskedTypeForICmpPair, but I believe that's ok because we still have to find two values from the input to each icmp that are equal before we'll do any transformation. So the type check was really just serving as an early out. There was one user of decomposeBitTestICmp that didn't want to look through truncates, so I've added a flag to prevent that behavior when necessary. Differential Revision: https://reviews.llvm.org/D37158 llvm-svn: 312382
*	ModuleSummaryAnalysis: Correctly handle refs from function inline asm to ↵	Peter Collingbourne	2017-09-01	1	-54/+56
\| \| \| \| \| \| \| \| \| \| \| \| \|	module inline asm. If a function contains inline asm and the module-level inline asm contains the definition of a local symbol, prevent the function from being imported in case the function-level inline asm refers to a symbol in the module-level inline asm. Differential Revision: https://reviews.llvm.org/D37370 llvm-svn: 312332
*	[SCEV] Add URem support to SCEV	Alexandre Isoard	2017-09-01	1	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In LLVM IR the following code: %r = urem <ty> %t, %b is equivalent to %q = udiv <ty> %t, %b %s = mul <ty> nuw %q, %b %r = sub <ty> nuw %t, %q ; (t / b) * b + (t % b) = t As UDiv, Mul and Sub are already supported by SCEV, URem can be implemented with minimal effort using that relation: %r --> (-%b * (%t /u %b)) + %t We implement two special cases: - if %b is 1, the result is always 0 - if %b is a power-of-two, we produce a zext/trunc based expression instead That is, the following code: %r = urem i32 %t, 65536 Produces: %r --> (zext i16 (trunc i32 %a to i16) to i32) Note that while this helps get a tighter bound on the range analysis and the known-bits analysis, this exposes some normalization shortcoming of SCEVs: %div = udim i32 %a, 65536 %mul = mul i32 %div, 65536 %rem = urem i32 %a, 65536 %add = add i32 %mul, %rem Will usually not be reduced. llvm-svn: 312329
*	[Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use ↵	Eugene Zelenko	2017-08-31	2	-35/+55
\| \| \| \| \| \|	warnings; other minor fixes. Also affected in files (NFC). llvm-svn: 312289
*	Remove an unnecessary const_cast.	Adam Nemet	2017-08-28	1	-1/+2
\| \| \| \| \| \|	I think that this is dating back to when emit used to take a const reference. llvm-svn: 311948
*	[Dominators] Remove redundant explicit template instantiation.	Don Hinton	2017-08-26	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Remove redundant explicit template instantiation. This was reported by Andrew Kelley building release_50 with gcc7.2.0 on MacOS: duplicate symbol llvm::DominatorTreeBase. Reviewers: kuhar, andrewrk, davide, hans Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37185 llvm-svn: 311835
*	Add options to dump block frequency/branch probability info in text.	Hiroshi Yamauchi	2017-08-26	2	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add options -print-bfi/-print-bpi that dump block frequency and branch probability info like -view-block-freq-propagation-dags and -view-machine-block-freq-propagation-dags do but in text. This is useful when the graph is very large and complex (the dot command crashes, lines/edges too close to tell apart, hard to navigate without textual search) or simply when text is preferred. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37165 llvm-svn: 311822
*	[InlineCost] Small changes to early exit condition. NFC.	Haicheng Wu	2017-08-25	1	-3/+3
\| \| \| \| \| \| \| \| \|	Change the early exit condition from Cost > Threshold to Cost >= Threshold because the inline condition is Cost < Threshold. Differential Revision: https://reviews.llvm.org/D37087 llvm-svn: 311791
*	Normlize to LF line endings.	Michael Kruse	2017-08-25	1	-1/+1
\| \| \| \| \| \| \|	Commit r297442 introduced mixed CRLF/LF line endings to two files. Normalize to to LF-only line endings. llvm-svn: 311774
*	Move accurate-sample-profile into the function attribute.	Dehao Chen	2017-08-24	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We need to have accurate-sample-profile in function attribute so that it works with LTO. Reviewers: davidxl, rsmith Reviewed By: davidxl Subscribers: sanjoy, mehdi_amini, javed.absar, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D37113 llvm-svn: 311706
*	Model cache size and associativity in TargetTransformInfo	Tobias Grosser	2017-08-24	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We add the precise cache sizes and associativity for the following Intel architectures: - Penry - Nehalem - Westmere - Sandy Bridge - Ivy Bridge - Haswell - Broadwell - Skylake - Kabylake Polly uses since several months a performance model for BLAS computations that derives optimal cache and register tile sizes from cache and latency information (based on ideas from "Analytical Modeling Is Enough for High-Performance BLIS", by Tze Meng Low published at TOMS 2016). While bootstrapping this model, these target values have been kept in Polly. However, as our implementation is now rather mature, it seems time to teach LLVM itself about cache sizes. Interestingly, L1 and L2 cache sizes are pretty constant across micro-architectures, hence a set of architecture specific default values seems like a good start. They can be expanded to more target specific values, in case certain newer architectures require different values. For now a set of Intel architectures are provided. Just as a little teaser, for a simple gemm kernel this model allows us to improve performance from 1.2s to 0.27s. For gemm kernels with less optimal memory layouts even larger speedups can be reported. Reviewers: Meinersbur, bollu, singam-sanjay, hfinkel, gareevroman, fhahn, sebpop, efriedma, asb Reviewed By: fhahn, asb Subscribers: lsaba, asb, pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D37051 llvm-svn: 311647
*	[PGO] Set edge weights for indirectbr instruction with profile counts	Rong Xu	2017-08-23	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	Current PGO only annotates the edge weight for branch and switch instructions with profile counts. We should also annotate the indirectbr instruction as all the information is there. This patch enables the annotating for indirectbr instructions. Also uses this annotation in branch probability analysis. Differential Revision: https://reviews.llvm.org/D37074 llvm-svn: 311604
*	[lib/Analysis] - Mark personality functions as live.	George Rimar	2017-08-22	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is PR33245. Case I am fixing is next: Imagine we have 2 BC files, one defines and uses personality routine, second has only declaration and also uses it. Previously algorithm computing dead symbols (llvm::computeDeadSymbols) did not know about personality routines and leaved them dead even if function that has routine was live. As a result thinLTOInternalizeAndPromoteGUID() method changed binding for such symbol to local. Later when LLD tried to link these objects it failed because one object had undefined global symbol for routine and second object contained local definition instead of global. Patch set the live root flag on the corresponding FunctionSummary for personality routines when we build the per-module summaries during the compile step. Differential revision: https://reviews.llvm.org/D36834 llvm-svn: 311432
*	[ValueTracking] Add assertions that the starting Depth in ↵	Craig Topper	2017-08-21	1	-0/+3
\| \| \| \| \| \| \| \|	isKnownToBeAPowerOfTwo and ComputeNumSignBitsImpl is not above MaxDepth The function does an equality check later to terminate the recursion, but that won't work if its starts out too high. Similar assert already exists in computeKnownBits. llvm-svn: 311400
*	[InlineCost] Add cl::opt to allow full inline cost to be computed for ↵	Haicheng Wu	2017-08-21	1	-14/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	debugging purposes. Currently, the inline cost model will bail once the inline cost exceeds the inline threshold in order to avoid unnecessary compile-time. However, when debugging it is useful to compute the full cost, so this command line option is added to override the default behavior. I took over this work from Chad Rosier (mcrosier@codeaurora.org). Differential Revision: https://reviews.llvm.org/D35850 llvm-svn: 311371
*	[InlineCost] Add more debug during inline cost computation.	Chad Rosier	2017-08-21	1	-1/+1
\| \| \| \|	llvm-svn: 311370
*	[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; ↵	Eugene Zelenko	2017-08-18	5	-105/+155
\| \| \| \| \| \|	other minor fixes (NFC). llvm-svn: 311212
*	[InstCombine] Teach ComputeNumSignBitsImpl to handle integer multiply ↵	Amjad Aboud	2017-08-18	1	-0/+11
\| \| \| \| \| \| \| \|	instruction. Differential Revision: https://reviews.llvm.org/D36679 llvm-svn: 311206
*	[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; ↵	Eugene Zelenko	2017-08-16	5	-81/+166
\| \| \| \| \| \|	other minor fixes (NFC). llvm-svn: 311048
*	[DemandedBits] simplify call; NFC	Sanjay Patel	2017-08-16	1	-1/+1
\| \| \| \|	llvm-svn: 311009
*	[InstSimplify] Teach decomposeBitTestICmp to handle non-canonical compares	Craig Topper	2017-08-14	1	-0/+28
\| \| \| \| \| \| \| \|	This adds support non-canonical compare predicates. InstSimplify can't rely on canonicalization to have occurred. Differential Revision: https://reviews.llvm.org/D36646 llvm-svn: 310893
*	Recommit r310869, "[InstSimplify][InstCombine] Modify the interface of ↵	Craig Topper	2017-08-14	3	-31/+131
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	decomposeBitTestICmp and use it in the InstSimplify" This recommits r310869, with the moved files and no extra changes. Original commit message: This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too. I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself. I also had to make decomposeBitTest support vectors since InstSimplify needs that. As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library. Differential Revision: https://reviews.llvm.org/D36593 llvm-svn: 310889
*	[InlineCost] Refactor the checks for different analyses to be a bit more	Chandler Carruth	2017-08-14	1	-62/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	localized to the code that uses those analyses. Technically, this can change behavior as we no longer require the existence of the ProfileSummaryInfo analysis to use local profile information via BFI. We didn't actually require the PSI to have an interesting profile though, so this only really impacts the behavior in non-default pass pipelines. IMO, this makes it substantially less surprising how everything works -- before an analysis that wasn't actually used had to exist to trigger any profile aware inlining. I think the new organization makes it more obvious where various checks for profile signals happen. Differential Revision: https://reviews.llvm.org/D36710 llvm-svn: 310888
*	Add strictfp attribute to prevent unwanted optimizations of libm calls	Andrew Kaylor	2017-08-14	1	-3/+3
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D34163 llvm-svn: 310885
*	Revert r310869 "[InstSimplify][InstCombine] Modify the interface of ↵	Craig Topper	2017-08-14	2	-22/+31
\| \| \| \| \| \| \| \|	decomposeBitTestICmp and use it in the InstSimplify" Failed to add the two files that moved. And then added an extra change I didn't mean to while trying to fix that. Reverting everything. llvm-svn: 310873
*	Revert r310870 "[InstCombine][InstSimplify] 'git add' two files that moved ↵	Craig Topper	2017-08-14	1	-137/+0
\| \| \| \| \| \| \| \|	in r310869." An extra change crept in here. llvm-svn: 310872
*	[InstCombine][InstSimplify] 'git add' two files that moved in r310869.	Craig Topper	2017-08-14	1	-0/+137
\| \| \| \|	llvm-svn: 310870
*	[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and ↵	Craig Topper	2017-08-14	2	-31/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	use it in the InstSimplify This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too. I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself. I also had to make decomposeBitTest support vectors since InstSimplify needs that. As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library. Differential Revision: https://reviews.llvm.org/D36593 llvm-svn: 310869
*	[ValueTracking] Don't delete assumes of side-effectful instructions	Hal Finkel	2017-08-14	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ValueTracking has to strike a balance when attempting to propagate information backwards from assumes, because if the information is trivially propagated backwards, it can appear to LLVM that the assumption is known to be true, and therefore can be removed. This is sound (because an assumption has no semantic effect except for causing UB), but prevents the assume from allowing further optimizations. The isEphemeralValueOf check exists to try and prevent this issue by not removing the source of an assumption. This tries to make it a little bit more general to handle the case of side-effectful instructions, such as in %0 = call i1 @get_val() %1 = xor i1 %0, true call void @llvm.assume(i1 %1) Patch by Ariel Ben-Yehuda, thanks! Differential Revision: https://reviews.llvm.org/D36590 llvm-svn: 310859