bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[IRCE] Refactor out a parseRangeCheckFromCond; NFC	Sanjoy Das	2016-05-26	1	-50/+39
\| \| \| \| \| \| \|	This will later hold more general logic to parse conjunctions of range checks. llvm-svn: 270802
*	[PM] Port PartiallyInlineLibCalls to the new pass manager.	Davide Italiano	2016-05-25	2	-39/+55
\| \| \| \|	llvm-svn: 270798
*	[PM] CorrelatedValuePropagation: pass state to function. NFCI.	Davide Italiano	2016-05-25	1	-29/+16
\| \| \| \| \| \| \|	While here, convert the logic of the pass to use static function(s). This is in preparation for porting this pass to the new PM. llvm-svn: 270734
*	[X86] Remove the llvm.x86.sse2.storel.dq intrinsic. It hasn't been used in a ↵	Craig Topper	2016-05-25	1	-2/+0
\| \| \| \| \| \|	long time. llvm-svn: 270677
*	[PM] Port BDCE to the new pass manager.	Davide Italiano	2016-05-25	2	-34/+40
\| \| \| \|	llvm-svn: 270647
*	Re-enable "[LoopUnroll] Enable advanced unrolling analysis by default" one ↵	Michael Zolotukhin	2016-05-24	1	-3/+3
\| \| \| \| \| \| \| \|	more time. This reverts commit r270577. llvm-svn: 270630
*	[GuardWidening] Tighten the interface of the RangeCheck struct; NFC	Sanjoy Das	2016-05-24	1	-32/+37
\| \| \| \| \| \|	Make `GuardWideningImpl::RangeCheck` into a class and add accessors. llvm-svn: 270611
*	[IRCE] Return a Value, not SCEV from parseRangeCheck; NFC	Sanjoy Das	2016-05-24	1	-26/+10
\| \| \| \| \| \| \|	This is better layering, since the caller needs to check if the index was an add-rec anyway. llvm-svn: 270582
*	Revert r270518, which re-enabled "[LoopUnroll] Enable advanced unrolling ↵	Hans Wennborg	2016-05-24	1	-3/+3
\| \| \| \| \| \| \| \|	analysis by default. Chromium builds are still hitting the assert in PR27874. llvm-svn: 270577
*	Revert "Revert r270478 "[LoopUnroll] Enable advanced unrolling analysis by ↵	Michael Zolotukhin	2016-05-24	1	-3/+3
\| \| \| \| \| \| \| \| \|	default."" This reverts commit r270512 and reapplies r270478. Originally it caused PR27847, but it was fixed in r270517. llvm-svn: 270518
*	Revert r270478 "[LoopUnroll] Enable advanced unrolling analysis by default."	Hans Wennborg	2016-05-23	1	-3/+3
\| \| \| \| \| \|	This caused PR27847. llvm-svn: 270512
*	[IRCE] Optimize "uses" not branches; NFCI	Sanjoy Das	2016-05-23	1	-9/+10
\| \| \| \| \| \| \| \| \| \|	This changes IRCE to optimize uses, and not branches. This change is NFCI since the uses we do inspect are in practice only ever going to be the condition use in conditional branches; but this flexibility will later allow us to analyze more complex expressions than just a direct branch on a range check. llvm-svn: 270500
*	[LoopUnroll] Enable advanced unrolling analysis by default.	Michael Zolotukhin	2016-05-23	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch turns on LoopUnrollAnalyzer by default. To mitigate compile time regressions, I chose very conservative thresholds for now. Later we can make them more aggressive, but it might require being smarter in which loops we're optimizing. E.g. currently the biggest issue is that with more agressive thresholds we unroll many cold loops, which increases compile time for no performance benefit (performance of those loops is improved, but it doesn't matter since they are cold). Test results for compile time(using 4 samples to reduce noise): ``` MultiSource/Benchmarks/VersaBench/ecbdes/ecbdes 5.19% SingleSource/Benchmarks/Polybench/medley/reg_detect/reg_detect 4.19% MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow 3.39% MultiSource/Applications/JM/lencod/lencod 1.47% MultiSource/Benchmarks/Fhourstones-3_1/fhourstones3_1 -6.06% ``` I didn't see any performance changes in the testsuite, but it improves some internal tests. Reviewers: hfinkel, chandlerc Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D20482 llvm-svn: 270478
*	[IRCE] Don't use an allocator for range checks; NFC	Sanjoy Das	2016-05-21	1	-37/+28
\| \| \| \| \| \| \| \|	The InductiveRangeCheck struct is only five words long; so passing these around value is fine. The allocator makes the code look more complex than it is. llvm-svn: 270309
*	[IRCE] Don't pass IRBuilder<> where unnecessary; NFC	Sanjoy Das	2016-05-21	1	-8/+6
\| \| \| \|	llvm-svn: 270308
*	[GuardWidening] Fix incorrect use of remove_if	Sanjoy Das	2016-05-21	1	-25/+29
\| \| \| \| \| \| \| \| \| \| \|	I had used `std::remove_if` under the assumption that it moves the predicate matching elements to the end, but actaully the elements remaining towards the end (after the iterator returned by `std::remove_if`) are indeterminate. Fix the bug (and make the code more straightforward) by using a temporary SmallVector, and add a test case demonstrating the issue. llvm-svn: 270306
*	[PM/PartiallyInlineLibCalls] Fix pass dependencies.	Davide Italiano	2016-05-20	1	-10/+13
\| \| \| \| \| \|	Inline getAnalysisUsage() while I'm here. llvm-svn: 270231
*	[PartiallyInlineLibCalls] Remove dead includes. NFC.	Davide Italiano	2016-05-20	1	-2/+0
\| \| \| \|	llvm-svn: 270228
*	[PM/PartiallyInlineLibCalls] Convert to static function in preparation for ↵	Davide Italiano	2016-05-20	1	-55/+49
\| \| \| \| \| \|	porting this pass to the new PM. llvm-svn: 270225
*	Add const qualifiers to appease bots; NFC	Sanjoy Das	2016-05-19	1	-3/+3
\| \| \| \|	llvm-svn: 270155
*	[GuardWidening] Introduce range check merging	Sanjoy Das	2016-05-19	1	-0/+244
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Sequences of range checks expressed using guards, like guard((I - 2) u< L) guard((I - 1) u< L) guard((I + 0) u< L) guard((I + 1) u< L) guard((I + 2) u< L) can sometimes be combined into a smaller sequence: guard((I - 2) u< L AND (I + 2) u< L) if we can prove that (I - 2) u< L AND (I + 2) u< L implies all of checks expressed in the previous sequence. This change teaches GuardWidening to do this kind of merging when feasible. llvm-svn: 270151
*	[SCCP] Prefer class to struct.	Davide Italiano	2016-05-19	1	-2/+4
\| \| \| \|	llvm-svn: 270074
*	[GuardWidening] Use getEquivalentICmp to fold constant compares	Sanjoy Das	2016-05-19	1	-13/+23
\| \| \| \| \| \| \|	`ConstantRange::getEquivalentICmp` is more general, and better factored. llvm-svn: 270019
*	[LowerGuards] Rename variable; NFC	Sanjoy Das	2016-05-18	1	-3/+3
\| \| \| \| \| \| \|	PredicatePassProbability is a better name for what LikelyBranchWeight was trying to express. llvm-svn: 269999
*	New pass: guard widening	Sanjoy Das	2016-05-18	3	-0/+430
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Implement guard widening in LLVM. Description from GuardWidening.cpp: The semantics of the `@llvm.experimental.guard` intrinsic lets LLVM transform it so that it fails more often that it did before the transform. This optimization is called "widening" and can be used hoist and common runtime checks in situations like these: ``` %cmp0 = 7 u< Length call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ] call @unknown_side_effects() %cmp1 = 9 u< Length call @llvm.experimental.guard(i1 %cmp1) [ "deopt"(...) ] ... ``` to ``` %cmp0 = 9 u< Length call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ] call @unknown_side_effects() ... ``` If `%cmp0` is false, `@llvm.experimental.guard` will "deoptimize" back to a generic implementation of the same function, which will have the correct semantics from that point onward. It is always _legal_ to deoptimize (so replacing `%cmp0` with false is "correct"), though it may not always be profitable to do so. NB! This pass is a work in progress. It hasn't been tuned to be "production ready" yet. It is known to have quadriatic running time and will not scale to large numbers of guards Reviewers: reames, atrick, bogner, apilipenko, nlewycky Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20143 llvm-svn: 269997
*	[LoopUnrollAnalyzer] Take into account cost of instructions controlling ↵	Michael Zolotukhin	2016-05-18	1	-0/+1
\| \| \| \| \| \| \| \| \|	branches, along with their operands. Previously, we didn't add their and their operands cost, which could've resulted in unrolling loops for no actual benefit. llvm-svn: 269985
*	[PM] Port per-function SCCP to the new pass manager.	Davide Italiano	2016-05-18	2	-45/+53
\| \| \| \|	llvm-svn: 269937
*	[PM] Port DSE to the new pass manager	Justin Bogner	2016-05-17	2	-322/+340
\| \| \| \| \| \|	Patch by JakeVanAdrighem. Thanks! llvm-svn: 269847
*	[Guards] Add branch metadata when lowering	Sanjoy Das	2016-05-17	1	-0/+10
\| \| \| \| \| \| \|	Guards are expected to basically never fail. Reflect this in the branch probabilities in their lowered form. llvm-svn: 269791
*	[RewriteStatepointsForGC] Remove obsolete assertion	Igor Laevsky	2016-05-17	1	-8/+0
\| \| \| \| \| \| \| \| \| \|	This is assertion is no longer necessary since we never record constants in the live set anyway. (They are never recorded in the initial live set, and constant bases are removed near line 2119) Differential Revision: http://reviews.llvm.org/D20293 llvm-svn: 269764
*	[PM] RewriterStatepointForGC: add missing dependency.	Davide Italiano	2016-05-16	1	-0/+1
\| \| \| \|	llvm-svn: 269624
*	Move helper classes into anonymous namespaces. NFC.	Benjamin Kramer	2016-05-15	1	-1/+1
\| \| \| \|	llvm-svn: 269591
*	[PM/SCCP] Fix pass dependencies.	Davide Italiano	2016-05-15	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \|	TargetLibraryInfoWrapperPass is a dependency of SCCP but it's not listed as such. Chandler pointed out this is an easy mistake to make which only surfaces in weird crashes with some flag combinations. This code will go away anyway at some point in the future, but as long as it's (still) exercised, try to make it correct. llvm-svn: 269589
*	[SCCP] Use range-based for loops. NFC.	Davide Italiano	2016-05-14	1	-10/+10
\| \| \| \|	llvm-svn: 269578
*	[PM] Port LowerAtomic to the new pass manager.	Davide Italiano	2016-05-13	2	-42/+56
\| \| \| \|	llvm-svn: 269511
*	Revert "Revert "[Unroll] Implement a conservative and monotonically ↵	Michael Zolotukhin	2016-05-13	1	-13/+178
\| \| \| \| \| \| \| \| \| \|	increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..."" This reverts commit r269395. Try to reapply with a fix from chapuni. llvm-svn: 269486
*	Rename getLargestLegalIntTypeSize to getLargestLegalIntTypeSizeInBits(). NFC.	Jun Bum Lim	2016-05-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Rename DataLayout::getLargestLegalIntTypeSize to DataLayout::getLargestLegalIntTypeSizeInBits() to prevent similar mistakes fixed in r269433. Reviewers: joker.eph, mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20248 llvm-svn: 269456
*	[EarlyCSE] Change key type of AvailableCalls to Instruction*. NFCI.	Geoff Berry	2016-05-13	1	-3/+4
\| \| \| \|	llvm-svn: 269445
*	[MemCpyOpt] Use MaxIntSize in byte instead of bit	Jun Bum Lim	2016-05-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change fix the bug in isProfitableToUseMemset() where MaxIntSize shoule be in byte, not bit. Reviewers: arsenm, joker.eph, mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20176 llvm-svn: 269433
*	Revert "[Unroll] Implement a conservative and monotonically increasing cost ↵	Michael Zolotukhin	2016-05-13	1	-178/+13
\| \| \| \| \| \| \| \| \| \| \|	tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..." This reverts commit r269388. It caused some bots to fail, I'm reverting it until I investigate the issue. llvm-svn: 269395
*	[LoopDist] Only run LAA for loops with the pragma	Adam Nemet	2016-05-13	1	-17/+17
\| \| \| \| \| \| \|	This should fix some compile-time regressions after r267672. Thanks to Chris Matthews for bisecting it. llvm-svn: 269392
*	[Unroll] Implement a conservative and monotonically increasing cost tracking ↵	Michael Zolotukhin	2016-05-13	1	-13/+178
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the... Summary: ...loop after the last iteration. This is really hard to do correctly. The core problem is that we need to model liveness through the induction PHIs from iteration to iteration in order to get the correct results, and we need to correctly de-duplicate the common subgraphs of instructions feeding some subset of the induction PHIs. All of this can be driven either from a side effect at some iteration or from the loop values used after the loop finishes. This patch implements this by storing the forward-propagating analysis of each instruction in a cache to recall whether it was free and whether it has become live and thus counted toward the total unroll cost. Then, at each sink for a value in the loop, we recursively walk back through every value that feeds the sink, including looping back through the iterations as needed, until we have marked the entire input graph as live. Because we cache this, we never visit instructions more than twice -- once when we analyze them and put them into the cache, and once when we count their cost towards the unrolled loop. Also, because the cache is only two bits and because we are dealing with relatively small iteration counts, we can store all of this very densely in memory to avoid this from becoming an excessively slow analysis. The code here is still pretty gross. I would appreciate suggestions about better ways to factor or split this up, I've stared too long at the algorithmic side to really have a good sense of what the design should probably look at. Also, it might seem like we should do all of this bottom-up, but I think that is a red herring. Specifically, the simplification power is much greater working top-down. We can forward propagate very effectively, even across strange and interesting recurrances around the backedge. Because we use data to propagate, this doesn't cause a state space explosion. Doing this level of constant folding, etc, would be very expensive to do bottom-up because it wouldn't be until the last moment that you could collapse everything. The current solution is essentially a top-down simplification with a bottom-up cost accounting which seems to get the best of both worlds. It makes the simplification incremental and powerful while leaving everything dead until we know it is needed. Finally, a core property of this approach is its monotonicity. At all times, the current UnrolledCost is a conservatively low estimate. This ensures that we will never early-exit from the analysis due to exceeding a threshold when if we had continued, the cost would have gone back below the threshold. These kinds of bugs can cause incredibly hard to track down random changes to behavior. We could use a techinque similar (but much simpler) within the inliner as well to avoid considering speculated code in the inline cost. Reviewers: chandlerc Subscribers: sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D11758 llvm-svn: 269388
*	[PM] Port of the DepndenceAnalysis to the new PM.	Chandler Carruth	2016-05-12	2	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \|	Ported DA to the new PM by splitting the former DependenceAnalysis Pass into a DependenceInfo result type and DependenceAnalysisWrapperPass type and adding a new PM-style DependenceAnalysis analysis pass returning the DependenceInfo. Patch by Philip Pfaffe, most of the review by Justin. Differential Revision: http://reviews.llvm.org/D18834 llvm-svn: 269370
*	[PM] Make LowerAtomic a FunctionPass.	Davide Italiano	2016-05-12	1	-5/+16
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D20025 llvm-svn: 269322
*	[SCCP] Resolve shifts beyond the bitwidth to undef	David Majnemer	2016-05-12	1	-0/+16
\| \| \| \| \| \| \| \| \|	Shifts beyond the bitwidth are undef but SCCP resolved them to zero. Instead, DTRT and resolve them to undef. This reimplements the transform which caused PR27712. llvm-svn: 269269
*	Revert "[SCCP] Partially propagate informations when the input is not fully ↵	Davide Italiano	2016-05-11	1	-3/+0
\| \| \| \| \| \| \| \|	defined." This reverts commit r269105 as it caused PR27712. llvm-svn: 269252
*	Revert "MemCpyOpt: combine local load/store sequences into memcpy."	Tim Northover	2016-05-10	1	-222/+48
\| \| \| \| \| \| \|	This reverts commit r269125. It was in my tree when I ran "git svn dcommit". It's really still under review. llvm-svn: 269127
*	MemCpyOpt: combine local load/store sequences into memcpy.	Tim Northover	2016-05-10	1	-48/+222
\| \| \| \| \| \| \| \|	Sort of the BB-local equivalent to idiom-recognizer: if we have a basic-block that really implements a memcpy operation, backends can benefit from seeing this. llvm-svn: 269125
*	Loop unroller: set thresholds for optsize and minsize functions to zero	Hans Wennborg	2016-05-10	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before r268509, Clang would disable the loop unroll pass when optimizing for size. That commit enabled it to be able to support unroll pragmas in -Os builds. However, this regressed binary size in one of Chromium's DLLs with ~100 KB. This restores the original behaviour of no unrolling at -Os, but doing it in LLVM instead of Clang makes more sense, and also allows the pragmas to keep working. Differential revision: http://reviews.llvm.org/D20115 llvm-svn: 269124
*	Enable loopreroll for sext of loop control only IV	Lawrence Hu	2016-05-10	1	-12/+33
\| \| \| \| \| \| \| \| \|	This patch extend loopreroll to allow the instruction chain of loop control only IV has sext. Differential Revision: http://reviews.llvm.org/D19820 llvm-svn: 269121