bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[LegacyPM] Make the 'addLoop' method accept a loop to add rather than	Chandler Carruth	2017-05-25	1	-15/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	having it internally allocate the loop. This is a much more flexible API and necessary in the new loop unswitch to reasonably support both new and old PMs in common code. It also just seems like a cleaner separation of concerns. NFC, this should just be a pure refactoring. Differential Revision: https://reviews.llvm.org/D33528 llvm-svn: 303834
*	[InstSimplify] Simplify uadd/sadd/umul/smul with overflow intrinsics when ↵	Craig Topper	2017-05-24	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	the Zero or Undef is on the LHS. Summary: This code was migrated from InstCombine a few years ago. InstCombine had nearby code that would move Constants to the RHS for these, but InstSimplify doesn't have such code on this path. Reviewers: spatel, majnemer, davide Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33473 llvm-svn: 303774
*	[ValueTracking] Convert most of the calls to computeKnownBits to use the ↵	Craig Topper	2017-05-24	4	-34/+13
\| \| \| \| \| \| \| \| \| \|	version that returns the KnownBits object. This continues the changes started when computeSignBit was replaced with this new version of computeKnownBits. Differential Revision: https://reviews.llvm.org/D33431 llvm-svn: 303773
*	[ValueTracking] Add OptimizationRemarkEmitter to the other signature for ↵	Craig Topper	2017-05-24	1	-2/+4
\| \| \| \| \| \| \| \|	computeKnownBits. This is needed for an upcoming patch. llvm-svn: 303772
*	Revert "[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start"	Diana Picus	2017-05-24	1	-59/+2
\| \| \| \| \| \|	This reverts commit r303730 because it broke all the buildbots. llvm-svn: 303747
*	[LoopVectorizer] Let target prefer scalar addressing computations.	Jonas Paulsson	2017-05-24	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The loop vectorizer usually vectorizes any instruction it can and then extracts the elements for a scalarized use. On SystemZ, all elements containing addresses must be extracted into address registers (GRs). Since this extraction is not free, it is better to have the address in a suitable register to begin with. By forcing address arithmetic instructions and loads of addresses to be scalar after vectorization, two benefits result: * No need to extract the register * LSR optimizations trigger (LSR isn't handling vector addresses currently) Benchmarking shows improvements on SystemZ with this new behaviour. Any other target could try this by returning false in the new hook prefersVectorizedAddressing(). Review: Renato Golin, Elena Demikhovsky, Ulrich Weigand https://reviews.llvm.org/D32422 llvm-svn: 303744
*	[SCEV] Do not fold dominated SCEVUnknown into AddRecExpr start	Max Kazantsev	2017-05-24	1	-2/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When folding arguments of AddExpr or MulExpr with recurrences, we rely on the fact that the loop of our base recurrence is the bottom-most in terms of domination. This assumption may be broken by an expression which is treated as invariant, and which depends on a complex Phi for which SCEVUnknown was created. If such a Phi is a loop Phi, and this loop is lower than the chosen AddRecExpr's loop, it is invalid to fold our expression with the recurrence. Another reason why it might be invalid to fold SCEVUnknown into the Phi start value is that, unlike other SCEVs, SCEVUnknown are sometimes position-bound. For example, here: for (...) { // loop phi = {A,+,B} } X = load ... Folding phi + X into {A+X,+,B}<loop> actually makes no sense, because X does not exist and cannot exist while we are iterating in loop (this memory may not even be allocated or filled at that point). It is only valid to make such a folding if X is defined before the loop. In this case the recurrence {A+X,+,B}<loop> may exist. This patch prohibits folding of SCEVUnknown (and expressions that use them) into the start value of an AddRecExpr, if this instruction is dominated by the loop. Merging the dominating unknown values is still valid. Some tests that relied on the fact that some SCEVUnknown should be folded into AddRec's are changed so that they no longer expect such behavior. llvm-svn: 303730
*	InstructionSimplify: don't speculate about Constants changing.	Tim Northover	2017-05-22	1	-0/+4
\| \| \| \| \| \| \| \| \| \|	When presented with an icmp/select pair, we can end up asking what would happen if we replaced one constant with another in an instruction. This is a mistake: while non-constant Values could become a constant, constants cannot change, and trying to do so can lead to completely invalid IR (a GEP referencing a nonexistent field in the original case). llvm-svn: 303580
*	[SCEV] Clarify behavior around max backedge taken count	Sanjoy Das	2017-05-22	1	-10/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a re-application of r303497, which was reverted in r303498. I thought it had broken a bot when it had not (the breakage did not go away with the revert). This change makes the split between the "exact" backedge taken count and the "maximum" backedge taken count a bit more obvious. Both of these are upper bounds on the number of times the loop header executes (since SCEV does not account for most kinds of abnormal control flow), but the latter is guaranteed to be a constant. There were a few places where the max backedge taken count was a non-constant; I've changed those to compute constants instead. At this point, I'm not sure if the constant max backedge count can be computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without losing precision. If it can, we can simplify even further by making `getMaxBackedgeTakenCount` a thin wrapper around `getBackedgeTakenCount` and `getUnsignedRange`. llvm-svn: 303531
*	Revert "[SCEV] Clarify behavior around max backedge taken count"	Sanjoy Das	2017-05-21	1	-37/+10
\| \| \| \| \| \| \|	This reverts commit r303497 since it breaks the msan bootstrap bot: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/1379/ llvm-svn: 303498
*	[SCEV] Clarify behavior around max backedge taken count	Sanjoy Das	2017-05-21	1	-10/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change makes the split between the "exact" backedge taken count and the "maximum" backedge taken count a bit more obvious. Both of these are upper bounds on the number of times the loop header executes (since SCEV does not account for most kinds of abnormal control flow), but the latter is guaranteed to be a constant. There were a few places where the max backedge taken count was a non-constant; I've changed those to compute constants instead. At this point, I'm not sure if the constant max backedge count can be computed by calling `getUnsignedRange(Exact).getUnsignedMax()` without losing precision. If it can, we can simplify even further by making `getMaxBackedgeTakenCount` a thin wrapper around `getBackedgeTakenCount` and `getUnsignedRange`. llvm-svn: 303497
*	Revert "Add pthread_self function prototype and make it speculatable."	Xin Tong	2017-05-21	1	-9/+0
\| \| \| \| \| \| \| \|	This reverts commit 143d7445b5dfa2f6d6c45bdbe0433d9fc531be21. Build breaking llvm-svn: 303496
*	Add pthread_self function prototype and make it speculatable.	Xin Tong	2017-05-20	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This allows pthread_self to be pulled out of a loop by LICM. Reviewers: hfinkel, arsenm, davide Reviewed By: davide Subscribers: davide, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D32782 llvm-svn: 303495
*	Fix breakage after r303461	Matthias Braun	2017-05-20	1	-1/+5
\| \| \| \| \| \| \|	- Improve wchar_t size predictions based on target triple. - Be less strict in wchar_t size verifier. llvm-svn: 303477
*	SimplifyLibCalls: Optimize wcslen	Matthias Braun	2017-05-19	2	-34/+100
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Refactor the strlen optimization code to work for both strlen and wcslen. This especially helps with programs in the wild where people pass L"string"s to const std::wstring& function parameters and the wstring constructor gets inlined. This also fixes a lingering API problem/bug in getConstantStringInfo() where zeroinitializers would always give you an empty string (without a length) back regardless of the actual length of the initializer, which did not work well in the TrimAtNul==false case, causing the PR mentioned below. Note that the fixed getConstantStringInfo() needed fixes to SelectionDAG memcpy lowering and may lead to some cases for out-of-bounds zeroinitializer accesses not getting optimized anymore. So some code with UB may produce out-of-bounds memory reads now instead of just producing zeros. The refactoring "accidentally" fixes http://llvm.org/PR32124 Differential Revision: https://reviews.llvm.org/D32839 llvm-svn: 303461
*	BasicAA: Uninserted instructions have no parent, and notDifferentParent ↵	Daniel Berlin	2017-05-19	1	-1/+4
\| \| \| \| \| \|	explicitly allows for this case, but getParent crashes when handed one. llvm-svn: 303442
*	[InstSimplify] Fix 80 column violation. NFC	Craig Topper	2017-05-19	1	-3/+4
\| \| \| \|	llvm-svn: 303433
*	[IR] De-virtualize ~Value to save a vptr	Reid Kleckner	2017-05-18	1	-2/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Implements PR889 Removing the virtual table pointer from Value saves 1% of RSS when doing LTO of llc on Linux. The impact on time was positive, but too noisy to conclusively say that performance improved. Here is a link to the spreadsheet with the original data: https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing This change makes it invalid to directly delete a Value, User, or Instruction pointer. Instead, such code can be rewritten to a null check and a call Value::deleteValue(). Value objects tend to have their lifetimes managed through iplist, so for the most part, this isn't a big deal. However, there are some places where LLVM deletes values, and those places had to be migrated to deleteValue. I have also created llvm::unique_value, which has a custom deleter, so it can be used in place of std::unique_ptr<Value>. I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which derives from User outside of lib/IR. Code in IR cannot include MemorySSA headers or call the MemoryAccess object destructors without introducing a circular dependency, so we need some level of indirection. Unfortunately, no class derived from User may have any virtual methods, because adding a virtual method would break User::getHungOffOperands(), which assumes that it can find the use list immediately prior to the User object. I've added a static_assert to the appropriate OperandTraits templates to help people avoid this trap. Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv Reviewed By: chandlerc Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D31261 llvm-svn: 303362
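The de-virtualization described above can be sketched outside of LLVM as follows. This is a minimal illustration, not LLVM's actual code: the `Kind` tag, the two derived classes, and the `UniqueValue` alias are hypothetical names chosen for the example. The base class has a non-virtual, protected destructor, so a plain `delete` on a base pointer does not compile, and destruction goes through a `deleteValue()` member that dispatches on the stored kind.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical miniature of a class hierarchy without a virtual destructor.
class Value {
public:
  enum Kind { IntKind, PairKind };
  explicit Value(Kind K) : K(K) {}
  Kind getKind() const { return K; }

  // Replacement for `delete V`: dispatches to the right derived destructor.
  void deleteValue();

protected:
  // Non-virtual and protected, so `delete` on a Value* is a compile error.
  ~Value() = default;

private:
  Kind K;
};

struct IntValue : Value {
  static int Destroyed; // counts destructor runs, for demonstration only
  IntValue() : Value(IntKind) {}
  ~IntValue() { ++Destroyed; }
};
int IntValue::Destroyed = 0;

struct PairValue : Value {
  PairValue() : Value(PairKind) {}
};

void Value::deleteValue() {
  // The kind tag plays the role the vtable used to play for destruction.
  switch (K) {
  case IntKind:
    delete static_cast<IntValue *>(this);
    break;
  case PairKind:
    delete static_cast<PairValue *>(this);
    break;
  }
}

// Custom deleter so std::unique_ptr can own a Value.
struct ValueDeleter {
  void operator()(Value *V) const { V->deleteValue(); }
};
using UniqueValue = std::unique_ptr<Value, ValueDeleter>;

// No vptr is needed for destruction in this hierarchy.
static_assert(!std::has_virtual_destructor<Value>::value,
              "Value must stay free of a virtual destructor");
```

A `UniqueValue` that goes out of scope calls `deleteValue()`, which runs the derived destructor even though the static type is `Value`.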
*	[SCEV][NFC] Remove duplication of isLoopInvariant code	Max Kazantsev	2017-05-18	1	-2/+2
\| \| \| \| \| \| \| \| \|	Replace two places that duplicate the code of isLoopInvariant method with the invocation of this method. Differential Revision: https://reviews.llvm.org/D33313 llvm-svn: 303336
*	[BPI] Reduce the probability of unreachable edge to minimal value greater than 0	Serguei Katkov	2017-05-18	1	-40/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The probability of an edge coming to an unreachable block should be as low as possible. The change reduces the probability to the minimal value greater than zero. The bug https://bugs.llvm.org/show_bug.cgi?id=32214 shows an example where the probability of an edge coming to an unreachable block is greater than for an edge going out of the loop, and it causes incorrect loop rotation. Please note that with this change the behavior of the unreachable heuristic is a bit different from the others. Specifically, before this change the sum of probabilities coming to unreachable blocks had the same weight for all branches (it was just split over all edges of this block coming to unreachable blocks). With this change it might be slightly different, but not by much, because the probability of taking a branch to an unreachable block is really small. Reviewers: chandlerc, sanjoy, vsk, congh, junbuml, davidxl, dexonsmith Reviewed By: chandlerc, dexonsmith Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D30633 llvm-svn: 303327
*	[Statistics] Add a method to atomically update a statistic that contains a ↵	Craig Topper	2017-05-18	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	maximum Summary: There are several places in the codebase that try to calculate a maximum value in a Statistic object. We currently do this in one of two ways: MaxNumFoo = std::max(MaxNumFoo, NumFoo); or MaxNumFoo = (MaxNumFoo > NumFoo) ? MaxNumFoo : NumFoo; The first version reads from MaxNumFoo once and unconditionally writes to it. The second version possibly reads it twice depending on the result of the first compare. But we have no way of knowing if the value was changed by another thread between the reads and the writes. This patch adds a method to the Statistic object that can ensure that we only store if our value is the max and the previous max didn't change after we read it. If it changed we'll recheck if our value should still be the max or not and try again. This spawned from an audit I'm trying to do of all places we use the implicit conversion to unsigned on the Statistics objects. See my previous thread on llvm-dev https://groups.google.com/forum/#!topic/llvm-dev/yfvxiorKrDQ Reviewers: dberlin, chandlerc, hfinkel, dblaikie Reviewed By: chandlerc Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D33301 llvm-svn: 303318
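The retry logic described in this message is a standard compare-and-swap loop. The sketch below shows the idea with `std::atomic`; the `MaxCounter` struct is a hypothetical stand-in and not LLVM's `Statistic` class, and the relaxed memory ordering is an assumption that suits a pure counter.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a statistic that tracks a maximum.
struct MaxCounter {
  std::atomic<unsigned> Value{0};

  // Store V only if it is larger than the current value. If another thread
  // changes Value between our read and our write, compare_exchange_weak
  // fails and reloads Prev, and the loop re-checks whether V is still larger.
  void updateMax(unsigned V) {
    unsigned Prev = Value.load(std::memory_order_relaxed);
    while (V > Prev &&
           !Value.compare_exchange_weak(Prev, V, std::memory_order_relaxed)) {
      // Prev now holds the value another thread stored; try again.
    }
  }
};
```

Unlike `MaxNumFoo = std::max(MaxNumFoo, NumFoo)`, this never overwrites a larger value that another thread stored after the read.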
*	[InstSimplify] handle all icmp i1 X, C in one place; NFCI	Sanjay Patel	2017-05-17	1	-28/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We already handled all of the new tests identically, but several of those went through a lot of unnecessary processing before getting folded. Another motivation for grouping these cases together is that InstCombine needs a similar fold. Currently, it handles the 'not' cases inefficiently which can lead to bugs as described in the post-commit comments of: https://reviews.llvm.org/D32143 llvm-svn: 303295
*	[SCEV] Always sort AddRecExprs from different loops by dominance	Max Kazantsev	2017-05-17	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Sorting of AddRecExprs by loop nesting does not make sense since we only invoke the CompareSCEVComplexity for AddRecExprs that are used by one SCEV. This guarantees that there is always a dominance relationship between them. This patch removes the sorting by nesting, which is dead code in the current usage of this function. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D33228 llvm-svn: 303235
*	[SCEV][NFC] Replace redundant dyn_cast with cast in getAddExpr	Max Kazantsev	2017-05-17	1	-14/+15
\| \| \| \| \| \| \| \|	Replace dyn_cast which is ensured by isa just one line above with cast. Differential Revision: https://reviews.llvm.org/D33231 llvm-svn: 303234
*	BitVector: add iterators for set bits	Francis Visoiu Mistrih	2017-05-17	1	-17/+16
\| \| \| \| \| \|	Differential revision: https://reviews.llvm.org/D32060 llvm-svn: 303227
*	[InstSimplify] add folds for constant mask of value shifted by constant	Sanjay Patel	2017-05-16	1	-0/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We would eventually catch these via demanded bits and computing known bits in InstCombine, but I think it's better to handle the simple cases as soon as possible as a matter of efficiency. This fold allows further simplifications based on distributed ops transforms. eg: %a = lshr i8 %x, 7 %b = or i8 %a, 2 %c = and i8 %b, 1 InstSimplify can directly fold this now: %a = lshr i8 %x, 7 Differential Revision: https://reviews.llvm.org/D33221 llvm-svn: 303213
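The fold in the example above can be checked exhaustively, since an i8 has only 256 values. The function below is a small sanity check written for this note, not part of the patch; it mirrors the three IR instructions with unsigned 8-bit arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if ((x >> 7) | 2) & 1 equals (x >> 7) for every 8-bit x.
bool maskOfShiftFoldHolds() {
  for (unsigned I = 0; I < 256; ++I) {
    uint8_t X = static_cast<uint8_t>(I);
    uint8_t A = static_cast<uint8_t>(X >> 7); // %a = lshr i8 %x, 7
    uint8_t B = static_cast<uint8_t>(A | 2);  // %b = or i8 %a, 2
    uint8_t C = static_cast<uint8_t>(B & 1);  // %c = and i8 %b, 1
    if (C != A)
      return false;
  }
  return true;
}
```

The check passes because the shift leaves only bit 0 possibly set, the `or` touches only bit 1, and the mask keeps only bit 0.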
*	[Inliner] Do not mix callsite and callee hotness based updates.	Easwaran Raman	2017-05-16	1	-15/+27
\| \| \| \| \| \| \| \| \| \|	Update threshold based on callee's hotness only when BFI is not available. Otherwise use only callsite's hotness. This makes it easier to reason about hotness related threshold updates. Differential revision: https://reviews.llvm.org/D33157 llvm-svn: 303210
*	Add hasProfileSummary and has{Sample\|Instrumentation}Profile methods	Easwaran Raman	2017-05-16	1	-1/+1
\| \| \| \| \| \| \| \|	ProfileSummaryInfo already checks whether the module has sample profile in determining profile counts. This will also be useful in inliner to clean up threshold updates. llvm-svn: 303204
*	[SCEV] Fix sorting order for AddRecExprs	Max Kazantsev	2017-05-16	1	-16/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The existing sorting order defined in CompareSCEVComplexity sorts AddRecExprs by loop depth, but does not pay attention to dominance of loops. This can lead us to the following buggy situation: for (...) { // loop1 op1 = {A,+,B} } for (...) { // loop2 op2 = {A,+,B} S = add op1, op2 } In this case there is no guarantee that in the operand list of S, op2 comes before op1 (loop depth is the same, so they will be sorted just lexicographically), so we can incorrectly treat S as a recurrence of loop1, which is wrong. This patch changes the sorting logic so that it places the dominated recs before the dominating recs. This ensures that when we pick the first recurrence in the operands order, it will be the bottom-most in terms of the dominator tree. The attached test set includes some tests that produce incorrect SCEV estimations and crashes with the old logic. Reviewers: sanjoy, reames, apilipenko, anna Reviewed By: sanjoy Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33121 llvm-svn: 303148
*	IR: Give function GlobalValue::getRealLinkageName() a less misleading name: ↵	Peter Collingbourne	2017-05-16	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	dropLLVMManglingEscape(). This function gives the wrong answer on some non-ELF platforms in some cases. The function that does the right thing lives in Mangler.h. To try to discourage people from using this function, give it a different name. Differential Revision: https://reviews.llvm.org/D33162 llvm-svn: 303134
*	[SLP] Enable 64-bit wide vectorization on AArch64	Adam Nemet	2017-05-15	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ARM Neon has native support for half-sized vector registers (64 bits). This is beneficial for example for 2D and 3D graphics. This patch adds the option to lower MinVecRegSize from 128 via a TTI in the SLP Vectorizer. * Performance Analysis This change was motivated by some internal benchmarks but it is also beneficial on SPEC and the LLVM testsuite. The results are with -O3 and PGO. A negative percentage is an improvement. The testsuite was run with a sample size of 4. SPEC * CFP2006/482.sphinx3 -3.34% A pretty hot loop is SLP vectorized resulting in nice instruction reduction. This used to be a +22% regression before rL299482. * CFP2000/177.mesa -3.34% * CINT2000/256.bzip2 +6.97% My current plan is to extend the fix in rL299482 to i16 which brings the regression down to +2.5%. There are also other problems with the codegen in this loop so there is further room for improvement. ** LLVM testsuite * SingleSource/Benchmarks/Misc/ReedSolomon -10.75% There are multiple small SLP vectorizations outside the hot code. It's a bit surprising that it adds up to 10%. Some of this may be code-layout noise. * MultiSource/Benchmarks/VersaBench/beamformer/beamformer -8.40% The opt-viewer screenshot can be seen at F3218284. We start at a colder store but the tree leads us into the hottest loop. * MultiSource/Applications/lambda-0.1.3/lambda -2.68% * MultiSource/Benchmarks/Bullet/bullet -2.18% This is using 3D vectors. * SingleSource/Benchmarks/Shootout-C++/Shootout-C++-lists +6.67% Noise, binary is unchanged. * MultiSource/Benchmarks/Ptrdist/anagram/anagram +4.90% There is an additional SLP in the cold code. The test runs for ~1sec and prints out over 2000 lines. This is most likely noise. 
* MultiSource/Applications/aha/aha +1.63% * MultiSource/Applications/JM/lencod/lencod +1.41% * SingleSource/Benchmarks/Misc/richards_benchmark +1.15% Differential Revision: https://reviews.llvm.org/D31965 llvm-svn: 303116
*	[InstSimplify] restrict icmp fold with 2 sdiv exact operands (PR32949)	Sanjay Patel	2017-05-15	1	-2/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These folds were introduced with https://reviews.llvm.org/rL127064 as part of solving: https://bugs.llvm.org/show_bug.cgi?id=9343 As shown here: http://rise4fun.com/Alive/C8 ...however, the sdiv exact case needs a stronger predicate. I opted for duplicated code instead of adding another fallthrough because I think that's easier to read (and edit in case we need/want to restrict/loosen the predicates any more). This should fix: https://bugs.llvm.org/show_bug.cgi?id=32949 https://bugs.llvm.org/show_bug.cgi?id=32948 Differential Revision: https://reviews.llvm.org/D32954 llvm-svn: 303104
*	[SCEV] Use copy initialization of APInts instead of direct initialization.	Craig Topper	2017-05-15	1	-6/+6
\| \| \| \| \| \|	This is based on post-commit feedback from r302769. llvm-svn: 303092
*	[ValueTracking] Replace all uses of ComputeSignBit with computeKnownBits.	Craig Topper	2017-05-15	4	-71/+54
\| \| \| \| \| \| \| \|	This patch finishes off the conversion of ComputeSignBit to computeKnownBits. Differential Revision: https://reviews.llvm.org/D33166 llvm-svn: 303035
*	Move some code into ScalarEvolution.cpp; NFC	Sanjoy Das	2017-05-15	1	-0/+24
\| \| \| \| \| \| \|	I need to add some asserts to these constructors that are easier to add once they're in the .cpp file. llvm-svn: 303032
*	[InstCombine] Merge duplicate functionality between InstCombine and ↵	Craig Topper	2017-05-15	1	-5/+66
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ValueTracking Summary: Merge overflow computation for signed add, appearing both in InstCombine and ValueTracking. As part of the merge, cleanup the interface for overflow checks in InstCombine. Patch by Yoav Ben-Shalom. Reviewers: craig.topper, majnemer Reviewed By: craig.topper Subscribers: takuto.ikuta, llvm-commits Differential Revision: https://reviews.llvm.org/D32946 llvm-svn: 303029
*	[InstSimplify] Add patterns for folding (A & B) \| (~A ^ B) -> (~A ^ B) and ↵	Craig Topper	2017-05-14	1	-0/+18
\| \| \| \| \| \| \| \|	its commuted variants. We already had (A & ~B) \| (A ^ B), but we missed the cases where the not was part of the xor. llvm-svn: 303004
*	[BasicAA] Alphabetize includes. NFC	Craig Topper	2017-05-14	1	-1/+1
\| \| \| \|	llvm-svn: 303002
*	[ValueTracking] Remove const_casts on several calls to computeKnownBits and ↵	Craig Topper	2017-05-13	2	-6/+3
\| \| \| \| \| \|	ComputeSignBit. NFC llvm-svn: 302991
*	[TLI] Add mapping for various '__<func>_finite' forms of the math routines ↵	Andrew Kaylor	2017-05-12	1	-0/+24
\| \| \| \| \| \| \| \| \| \|	to SVML routines Patch by Chris Chrulski Differential Revision: https://reviews.llvm.org/D31789 llvm-svn: 302957
*	[ConstantFolding] Add folding for various math '__<func>_finite' routines ↵	Andrew Kaylor	2017-05-12	1	-11/+69
\| \| \| \| \| \| \| \| \| \|	generated from -ffast-math Patch by Chris Chrulski Differential Revision: https://reviews.llvm.org/D31788 llvm-svn: 302956
*	[TLI] Add declarations for various math header file routines from ↵	Andrew Kaylor	2017-05-12	1	-0/+86
\| \| \| \| \| \| \| \| \| \|	math-finite.h that create '__<func>_finite' as functions Patch by Chris Chrulski Differential Revision: https://reviews.llvm.org/D31787 llvm-svn: 302955
*	[KnownBits] Add bit counting methods to KnownBits struct and use them where ↵	Craig Topper	2017-05-12	4	-43/+35
\| \| \| \| \| \| \| \| \| \| \| \|	possible This patch adds min/max population count, leading/trailing zero/one bit counting methods. The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give. Differential Revision: https://reviews.llvm.org/D32931 llvm-svn: 302925
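The min/max distinction in this message can be illustrated with a toy 8-bit version. The struct and function names below are hypothetical and only loosely modeled on the description above; the real KnownBits struct works on APInt values of arbitrary width.

```cpp
#include <cassert>
#include <cstdint>

// Toy known-bits record: a bit set in Zero is known to be 0, a bit set in
// One is known to be 1, and a bit set in neither is unknown.
struct KnownBits8 {
  uint8_t Zero;
  uint8_t One;
};

static unsigned popCount8(uint8_t V) {
  unsigned N = 0;
  for (; V != 0; V = static_cast<uint8_t>(V & (V - 1)))
    ++N; // clears the lowest set bit each iteration
  return N;
}

static unsigned leadingOnes8(uint8_t V) {
  unsigned N = 0;
  for (int Bit = 7; Bit >= 0 && ((V >> Bit) & 1); --Bit)
    ++N;
  return N;
}

// Minimum counts use only bits that are known for certain.
unsigned minPopulation(KnownBits8 K) { return popCount8(K.One); }
unsigned minLeadingZeros(KnownBits8 K) { return leadingOnes8(K.Zero); }

// Maximum counts assume every unknown bit takes the value that maximizes
// the count.
unsigned maxPopulation(KnownBits8 K) {
  return popCount8(static_cast<uint8_t>(~K.Zero));
}
unsigned maxLeadingZeros(KnownBits8 K) {
  return leadingOnes8(static_cast<uint8_t>(~K.One));
}
```

For a value whose top two bits are known zero and whose bits 0 and 2 are known one, the population count lies between 2 and 6, and the leading-zero count lies between 2 and 5.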
*	[BPI] Ignore remainder while distributing the remaining probability from ↵	Serguei Katkov	2017-05-12	1	-8/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	unreachable This is a follow-up patch for https://reviews.llvm.org/rL300440 to address a comment. To make the implementation consistent with other cases, we just ignore the remainder after distribution of the remaining probability between reachable edges. If we reduced the probability of some edges coming to unreachable blocks, we should distribute the remaining part across other edges coming to reachable blocks to satisfy the condition that the sum of all probabilities should be equal to one. If this remaining part is not divisible by the number of "reachable" edges, then we get this remainder. This remainder probability should be pretty small. Other cases just ignore it if the sum of probabilities is not equal to one, so we do the same. Reviewers: chandlerc, sanjoy, vsk, junbuml, reames Reviewed By: reames Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D32124 llvm-svn: 302883
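The arithmetic behind the ignored remainder can be sketched with fixed-point probabilities. The code below is an illustration under stated assumptions, not the BPI implementation: it assumes probabilities are numerators over a fixed denominator of 2^31, and the function name and parameters are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed fixed-point denominator: a numerator of Denominator means 1.0.
constexpr uint32_t Denominator = 1u << 31;

// Gives every unreachable edge the numerator UnreachableNumerator and splits
// what is left evenly across the reachable edges. Integer division drops the
// remainder, so the numerators may sum to slightly less than Denominator.
// Unreachable edges come first in the returned vector.
std::vector<uint32_t> distribute(unsigned NumUnreachable,
                                 unsigned NumReachable,
                                 uint32_t UnreachableNumerator) {
  assert(NumReachable > 0 && "need at least one reachable edge");
  uint64_t Taken = uint64_t(NumUnreachable) * UnreachableNumerator;
  assert(Taken <= Denominator && "unreachable edges exceed total probability");
  uint32_t PerReachable =
      static_cast<uint32_t>((Denominator - Taken) / NumReachable);

  std::vector<uint32_t> Result(NumUnreachable, UnreachableNumerator);
  Result.insert(Result.end(), NumReachable, PerReachable);
  return Result;
}
```

With one unreachable edge at the minimal numerator 1 and three reachable edges, each reachable edge gets 715827882 and the total falls short of the denominator by 1, which is the remainder the message says is ignored.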
*	CallGraph: Remove almost-unused field 'Root'.	Peter Collingbourne	2017-05-11	1	-29/+5
\| \| \| \|	llvm-svn: 302852
*	Restrict call metadata based hotness detection to Sample PGO mode	Teresa Johnson	2017-05-11	1	-5/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Don't use the metadata on call instructions for determining hotness unless we are in sample PGO mode, where it is needed because profile counts are not accurate. In instrumentation mode this is not necessary and does more harm than good when calls have VP metadata that hasn't been properly scaled after transformations or dropped after constant prop based devirtualization (both should be fixed, but we don't need to do this in the first place for instrumentation PGO). This required adjusting a number of tests to distinguish between sample and instrumentation PGO handling, and to add in profile summary metadata so that getProfileCount can get the summary. Reviewers: davidxl, danielcdh Subscribers: aemerson, rengolin, mehdi_amini, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D32877 llvm-svn: 302844
*	Decrease inlinecold-threshold to 45	Easwaran Raman	2017-05-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I ran the test-suite (including SPEC 2006) in PGO mode comparing cold thresholds of 225 and 45. Here are some stats on the text size: Out of 904 tests that ran, 197 see a change in text size. The average text size reduction (of all the 904 binaries) is 1.07%. Of the 197 binaries, 19 see a text size increase, as high as 18%, but most of them are small single source benchmarks. There are 3 multisource benchmarks with a >0.5% size increase (0.7, 1.3 and 2.1 are their % increases). On the other side of the spectrum, 31 benchmarks see >10% size reduction and 6 of them are MultiSource. I haven't run the test-suite with other values of inlinecold-threshold. Since we have a cold callsite threshold of 45, I picked this value. Differential revision: https://reviews.llvm.org/D33106 llvm-svn: 302829
*	[SCEV] Reduce possible APInt allocations a bit.	Craig Topper	2017-05-11	1	-7/+11
\| \| \| \|	llvm-svn: 302769
*	[SCEV] Remove unneeded 'using namespace APIntOps'.	Craig Topper	2017-05-11	1	-37/+34
\| \| \| \|	llvm-svn: 302768
*	Ensure non-null ProfileSummaryInfo passed to ModuleSummaryIndex builder	Teresa Johnson	2017-05-10	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This fixes a ubsan bot failure after r302597, which made getProfileCount non-static, but ended up invoking it on a null ProfileSummaryInfo object in some cases from buildModuleSummaryIndex. Most testing passed because the non-static getProfileCount currently doesn't access any member variables, but I found this when testing a follow on patch (D32877) that adds a member variable access. llvm-svn: 302705