bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	Make helper functions static. NFC.	Benjamin Kramer	2017-05-26	2	-3/+5
\| \| \| \|	llvm-svn: 304029
*	PMB: Run the whole-program-devirt pass during LTO at --lto-O0.	Peter Collingbourne	2017-05-26	1	-0/+6
\| \| \| \| \| \| \| \| \| \|	The whole-program-devirt pass needs to run at -O0 because only it knows about the llvm.type.checked.load intrinsic: it needs to both lower the intrinsic itself and handle it in the summary. Differential Revision: https://reviews.llvm.org/D33571 llvm-svn: 304019
*	[InstCombine] Pass the DominatorTree, AssumptionCache, and context ↵	Craig Topper	2017-05-26	3	-4/+7
\| \| \| \| \| \| \| \| \| \|	instruction to a few calls to isKnownPositive, isKnownNegative, and isKnownNonZero Every other place in InstCombine that uses these methods in ValueTracking already pass this information. This makes the remaining sites consistent. Differential Revision: https://reviews.llvm.org/D33567 llvm-svn: 304018
*	Revert rL303923 since it broke the sanitizer bootstrap build bot.	Wei Mi	2017-05-26	1	-136/+21
\| \| \| \|	llvm-svn: 303969
*	[InstCombine] Add an InstCombine specific wrapper around ↵	Craig Topper	2017-05-25	4	-14/+14
\| \| \| \| \| \| \| \|	isKnownToBeAPowerOfTwo to shorten code. NFC We have wrappers for several other ValueTracking methods that take care of passing all of the analysis and assumption cache parameters. This extends it to isKnownToBeAPowerOfTwo. llvm-svn: 303924
*	[GVN] Add phi-translate support in scalarpre.	Wei Mi	2017-05-25	1	-21/+136
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Right now scalarpre doesn't have phi-translate support, so it will miss some simple pre opportunities. Like the following testcase, current scalarpre cannot recognize the last "a * b" is fully redundent because a and b used by the last "a * b" expr are both defined by phis. long a[100], b[100], g1, g2, g3; __attribute__((pure)) long goo(); void foo(long a, long b, long c, long d) { g1 = a * b; if (__builtin_expect(g2 > 3, 0)) { a = c; b = d; g2 = a * b; } g3 = a * b; // fully redundant. } The patch adds phi-translate support in scalarpre. This is only a temporary solution before the newpre based on newgvn is available. Differential Revision: https://reviews.llvm.org/D32252 llvm-svn: 303923
*	NewGVN: Fix PR 33119, PR 33129, due to regressed undef handling	Daniel Berlin	2017-05-25	1	-22/+42
\| \| \| \| \| \|	Fix PR33120 and others by eliminating self-cycles a different way. llvm-svn: 303875
*	[InstCombine] Teach isAllocSiteRemovable to look through addrspacecasts	Artur Pilipenko	2017-05-25	1	-1/+3
\| \| \| \| \| \| \| \|	Reviewed By: reames Differential Revision: https://reviews.llvm.org/D28565 llvm-svn: 303870
*	[InstCombine] make icmp-mul fold more efficient	Sanjay Patel	2017-05-25	1	-5/+7
\| \| \| \| \| \| \| \| \| \| \|	There's probably a lot more like this (see also comments in D33338 about responsibility), but I suspect we don't usually get a visible manifestation. Given the recent interest in improving InstCombine efficiency, another potential micro-opt that could be repeated several times in this function: morph the existing icmp pred/operands instead of creating a new instruction. llvm-svn: 303860
*	[GVNSink] Pacify MSVC	James Molloy	2017-05-25	1	-1/+1
\| \| \| \| \| \|	Don't convert an unsigned to a pointer for a sentinel, use a size_t instead. llvm-svn: 303855
*	[GVNSink] Don't define operator<< in NDEBUG	James Molloy	2017-05-25	1	-0/+2
\| \| \| \| \| \| \|	Without debug macros enabled, the raw_ostream operator<< overload is unused. llvm-svn: 303852
*	[GVNSink] GVNSink pass	James Molloy	2017-05-25	6	-47/+926
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch provides an initial prototype for a pass that sinks instructions based on GVN information, similar to GVNHoist. It is not yet ready for commiting but I've uploaded it to gather some initial thoughts. This pass attempts to sink instructions into successors, reducing static instruction count and enabling if-conversion. We use a variant of global value numbering to decide what can be sunk. Consider: [ %a1 = add i32 %b, 1 ] [ %c1 = add i32 %d, 1 ] [ %a2 = xor i32 %a1, 1 ] [ %c2 = xor i32 %c1, 1 ] \ / [ %e = phi i32 %a2, %c2 ] [ add i32 %e, 4 ] GVN would number %a1 and %c1 differently because they compute different results - the VN of an instruction is a function of its opcode and the transitive closure of its operands. This is the key property for hoisting and CSE. What we want when sinking however is for a numbering that is a function of the uses of an instruction, which allows us to answer the question "if I replace %a1 with %c1, will it contribute in an equivalent way to all successive instructions?". The (new) PostValueTable class in GVN provides this mapping. This pass has some shown really impressive improvements especially for codesize already on internal benchmarks, so I have high hopes it can replace all the sinking logic in SimplifyCFG. Differential revision: https://reviews.llvm.org/D24805 llvm-svn: 303850
*	[PM/Unswitch] Fix a bug in the domtree update logic for the new unswitch	Chandler Carruth	2017-05-25	1	-14/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	pass. The original logic only considered direct successors of the hoisted domtree nodes, but that isn't really enough. If there are other basic blocks that are completely within the subtree, their successors could just as easily be impacted by the hoisting. The more I think about it, the more I think the correct update here is to hoist every block on the dominance frontier which has an idom in the chain we hoist across. However, this is subtle enough that I'd definitely appreciate some more eyes on it. Sadly, if this is the correct algorithm, it requires computing a (highly localized) dominance frontier. I've done this in the simplest (IE, least code) way I could come up with, but that may be too naive. Suggestions welcome here, dominance update algorithms are not an area I've studied much, so I don't have strong opinions. In good news, with this patch, turning on simple unswitch passes the LLVM test suite for me with asserts enabled. Differential Revision: https://reviews.llvm.org/D32740 llvm-svn: 303843
*	[LegacyPM] Make the 'addLoop' method accept a loop to add rather than	Chandler Carruth	2017-05-25	2	-2/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	having it internally allocate the loop. This is a much more flexible API and necessary in the new loop unswitch to reasonably support both new and old PMs in common code. It also just seems like a cleaner separation of concerns. NFC, this should just be a pure refactoring. Differential Revision: https://reviews.llvm.org/D33528 llvm-svn: 303834
*	Fix coverage check for full post-dominator basic blocks.	George Karpenkov	2017-05-25	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \|	Coverage instrumentation which does not instrument full post-dominators and full-dominators may skip valid paths, as the reasoning for skipping blocks may become circular. This patch fixes that, by only skipping full post-dominators with multiple predecessors, as such predecessors by definition can not be full-dominators. llvm-svn: 303827
*	[coroutines] CoroFrame.cpp conform to coding convention (s/repeat/Repeat) (NFC)	Gor Nishanov	2017-05-25	1	-3/+2
\| \| \| \|	llvm-svn: 303826
*	[coroutines] Relocate instructions that maybe spilled after coro.begin	Gor Nishanov	2017-05-25	1	-0/+75
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Frontend generates store instructions after allocas, for example: ``` define i8* @f(i64 %this) "coroutine.presplit"="1" personality i32 0 { entry: %this.addr = alloca i64 store i64 %this, i64* %this.addr .. %hdl = call i8* @llvm.coro.begin(token %id, i8* %alloc) ``` Such instructions may require spilling into coro.frame, but, coro-frame address is only available after coro.begin and thus needs to be moved after coro.begin. The only instructions that should not be moved are the arguments of coro.begin and all of their operands. Reviewers: GorNishanov, majnemer Reviewed By: GorNishanov Subscribers: llvm-commits, EricWF Differential Revision: https://reviews.llvm.org/D33527 llvm-svn: 303825
*	[coroutines] Allow rematerialization upto 4 times. Remove incorrect assert	Gor Nishanov	2017-05-24	1	-15/+19
\| \| \| \| \| \| \| \| \| \|	Reviewers: majnemer Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D33524 llvm-svn: 303819
*	[InstCombine] use m_APInt to allow icmp-mul-mul vector fold	Sanjay Patel	2017-05-24	1	-11/+12
\| \| \| \| \| \| \| \| \|	The swapped operands in the first test is a manifestation of an inefficiency for vectors that doesn't exist for scalars because the IRBuilder checks for an all-ones mask for scalars, but not vectors. llvm-svn: 303818
*	[InstCombine] Merge together the SimplifyDemandedUseBits implementations for ↵	Craig Topper	2017-05-24	1	-21/+10
\| \| \| \| \| \| \| \| \| \|	ZExt and Trunc. NFC While there avoid resizing the DemandedMask twice. Make a copy into a separate variable instead. This potentially removes an allocation on large bit widths. With the use of the zextOrTrunc methods on APInt and KnownBits these can be made almost source identical. The only difference is the zero of the upper bits for ZExt. This is similar to how its done in computeKnownBits in ValueTracking. llvm-svn: 303791
*	Fix a couple of typos in memory intrinsic optimization output (NFC)	Teresa Johnson	2017-05-24	1	-1/+1
\| \| \| \| \| \|	s/instrinsic/intrinsic llvm-svn: 303782
*	[InstCombine] Use less bitwise operations to handle Instruction::SExt in ↵	Craig Topper	2017-05-24	1	-19/+14
\| \| \| \| \| \| \| \| \| \| \| \|	SimplifyDemandedUseBits. Other improvements. The current code created a NewBits mask and used it as a mask several times. One of them just before a call to trunc making it unnecessary. A call to getActiveBits can get us the same information for the case. We also ORed with this mask later when we should have just sign extended the known bits. We also called trunc on the guaranteed to be zero KnownZeros/Ones masks entering this code. Creating appropriately sized temporary APInts is probably better. Differential Revision: https://reviews.llvm.org/D32098 llvm-svn: 303779
*	[ValueTracking] Convert most of the calls to computeKnownBits to use the ↵	Craig Topper	2017-05-24	9	-48/+21
\| \| \| \| \| \| \| \| \| \|	version that returns the KnownBits object. This continues the changes started when computeSignBit was replaced with this new version of computeKnowBits. Differential Revision: https://reviews.llvm.org/D33431 llvm-svn: 303773
*	[LV] Update type in cost model for scalarization	Matthew Simpson	2017-05-24	1	-6/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	For non-uniform instructions marked for scalarization, we should update `VectorTy` when computing instruction costs to reflect the scalar type. In addition to determining instruction costs, this type is also used to signal that all instructions in the loop will be scalarized. This currently affects memory instructions and non-pointer induction variables and their updates. (We also mark GEPs scalar after vectorization, but their cost is computed together with memory instructions.) For scalarized induction updates, this patch also scales the scalar cost by the vectorization factor, corresponding to each induction step. llvm-svn: 303763
*	[LoopVectorizer] Let target prefer scalar addressing computations.	Jonas Paulsson	2017-05-24	1	-0/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The loop vectorizer usually vectorizes any instruction it can and then extracts the elements for a scalarized use. On SystemZ, all elements containing addresses must be extracted into address registers (GRs). Since this extraction is not free, it is better to have the address in a suitable register to begin with. By forcing address arithmetic instructions and loads of addresses to be scalar after vectorization, two benefits result: * No need to extract the register * LSR optimizations trigger (LSR isn't handling vector addresses currently) Benchmarking show improvements on SystemZ with this new behaviour. Any other target could try this by returning false in the new hook prefersVectorizedAddressing(). Review: Renato Golin, Elena Demikhovsky, Ulrich Weigand https://reviews.llvm.org/D32422 llvm-svn: 303744
*	[NewGVN] Update additionalUsers when we simplify to a value.	Davide Italiano	2017-05-24	1	-0/+4
\| \| \| \| \| \| \| \| \| \|	Otherwise we don't revisit an instruction that could be simplified, and when we verify, we discover there's something that changed, i.e. what we had wasn't a maximal fixpoint. Fixes PR32836. llvm-svn: 303715
*	Revert "Disable coverage opt-out for strong postdominator blocks."	George Karpenkov	2017-05-24	1	-2/+22
\| \| \| \| \| \| \|	This reverts commit 2ed06f05fc10869dd1239cff96fcdea2ee8bf4ef. Buildbots do not like this on Linux. llvm-svn: 303710
*	[SCCP] Use the `hasAddressTaken()` version defined in `Function`.	Davide Italiano	2017-05-23	1	-1/+2
\| \| \| \| \| \| \| \| \| \|	Instead of using the SCCP homegrown one. We should eventually make the private SCCP version disappear, but that wont' be today. PR33143 tracks this issue. Add braces for consistency while here. No functional change intended. llvm-svn: 303706
*	[LIR] Use the newly `getRecurrenceVar()` helper. NFCI.	Davide Italiano	2017-05-23	1	-4/+4
\| \| \| \|	llvm-svn: 303704
*	[LIR] Strengthen the check for recurrence variable in popcnt/CTLZ.	Davide Italiano	2017-05-23	1	-9/+16
\| \| \| \| \| \| \|	Fixes PR33114. Differential Revision: https://reviews.llvm.org/D33420 llvm-svn: 303700
*	Disable coverage opt-out for strong postdominator blocks.	George Karpenkov	2017-05-23	1	-22/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Coverage instrumentation has an optimization not to instrument extra blocks, if the pass is already "accounted for" by a successor/predecessor basic block. However (https://github.com/google/sanitizers/issues/783) this reasoning may become circular, which stops valid paths from having coverage. In the worst case this can cause fuzzing to stop working entirely. This change simplifies logic to something which trivially can not have such circular reasoning, as losing valid paths does not seem like a good trade-off for a ~15% decrease in the # of instrumented basic blocks. llvm-svn: 303698
*	[InstCombine] allow icmp-xor folds for vectors (PR33138)	Sanjay Patel	2017-05-23	1	-5/+9
\| \| \| \| \| \| \| \| \|	This fixes the first part of: https://bugs.llvm.org/show_bug.cgi?id=33138 More work is needed for the bitcasted variant. llvm-svn: 303660
*	[IR] Switch AttributeList to use an array for O(1) access	Reid Kleckner	2017-05-23	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Before this change, AttributeLists stored a pair of index and AttributeSet. This is memory efficient if most arguments do not have attributes. However, it requires doing a search over the pairs to test an argument or function attribute. Profiling shows that this loop was 0.76% of the time in 'opt -O2' of sqlite3.c, because LLVM constantly tests values for nullability. This was worth about 2.5% of mid-level optimization cycles on the sqlite3 amalgamation. Here are the full perf results: https://reviews.llvm.org/P7995 Here are just the before and after cycle counts: ``` $ perf stat -r 5 ./opt_before -O2 sqlite3.bc -o /dev/null 13,274,181,184 cycles # 3.047 GHz ( +- 0.28% ) $ perf stat -r 5 ./opt_after -O2 sqlite3.bc -o /dev/null 12,906,927,263 cycles # 3.043 GHz ( +- 0.51% ) ``` This patch does not change the indices used to query attributes, as requested by reviewers. Tracking whether an index is usable for array indexing is a huge pain that affects many of the internal APIs, so it would be good to come back later and do a cleanup to remove this internal adjustment. Reviewers: pete, chandlerc Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D32819 llvm-svn: 303654
*	[JumpThreading] Safely replace uses of condition	Anna Thomas	2017-05-23	2	-2/+57
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch builds over https://reviews.llvm.org/rL303349 and replaces the use of the condition only if it is safe to do so. We should not blindly RAUW the condition if experimental.guard or assume is a use of that condition. This is because LVI may have used the guard/assume to identify the value of the condition, and RUAWing will fold the guard/assume and uses before the guards/assumes. Reviewers: sanjoy, reames, trentxintong, mkazantsev Reviewed by: sanjoy, reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33257 llvm-svn: 303633
*	[KnownBits] Use !hasConflict() in asserts in place of Zero & One == 0 or ↵	Craig Topper	2017-05-23	1	-16/+16
\| \| \| \| \| \|	similar. NFC llvm-svn: 303614
*	[LV] Report multiple reasons for not vectorizing under allowExtraAnalysis	Ayal Zaks	2017-05-23	1	-20/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The default behavior of -Rpass-analysis=loop-vectorizer is to report only the first reason encountered for not vectorizing, if one is found, at which time the vectorizer aborts its handling of the loop. This patch allows multiple reasons for not vectorizing to be identified and reported, at the potential expense of additional compile-time, under allowExtraAnalysis which can currently be turned on by Clang's -fsave-optimization-record and opt's -pass-remarks-missed. Removed from LoopVectorizationLegality::canVectorize() the redundant checking and reporting if we CantComputeNumberOfIterations, as LAI::canAnalyzeLoop() also does that. This redundancy is caught by a lit test once multiple reasons are reported. Patch initially developed by Dror Barak. Differential Revision: https://reviews.llvm.org/D33396 llvm-svn: 303613
*	Fix update VP metadata after inlining for instrumentation PGO	Teresa Johnson	2017-05-22	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: With instrumentation profiling, when updating the VP metadata after an inline, VP metadata on the inlined copy was inadvertantly having all counts zeroed out. This was causing indirect calls from code inlined during the call step to be marked as cold in the ThinLTO summaries and not imported. The CallerBFI needs to be passed down so that the CallSiteCount can be computed from the profile summary info. With Sample PGO this was working since the count is extracted from the branch weight metadata on the call being inlined (even before we stopped looking at metadata for non-sample PGO in r302844 this largely wasn't working for instrumentation PGO since only promoted indirect calls would be getting inlined and have the metadata). Added an instrumentation PGO test and renamed the sample PGO test. Reviewers: danielcdh, eraman Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D33389 llvm-svn: 303574
*	[PartialInlining] Add internal options to enable partial inlining in pass ↵	Xinliang David Li	2017-05-22	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \|	pipeline (off by default) 1. Legacy: -mllvm -enable-partial-inlining 2. New: -mllvm -enable-npm-partial-inlining -fexperimental-new-pass-manager Differential Revision: http://reviews.llvm.org/D33382 llvm-svn: 303567
*	[LoopPredication] NFC. Add extra debug output in case we fail to parse the ↵	Artur Pilipenko	2017-05-22	1	-1/+3
\| \| \| \| \| \|	range check llvm-svn: 303544
*	[LoopPredication] NFC. Move a nested struct declaration before the fields, ↵	Artur Pilipenko	2017-05-22	1	-7/+9
\| \| \| \| \| \| \| \|	clang-format a bit This will simplify the diff for an upcoming review. llvm-svn: 303543
*	[InstCombine] Cleanup the interface for overflow checks	Craig Topper	2017-05-22	4	-39/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Fix naming conventions and const correctness. This completes the changes made in rL303029. Patch by Yoav Ben-Shalom. Reviewers: craig.topper Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33377 llvm-svn: 303529
*	[SimplifyCFG] Prevent a few APInt copies on method calls that return const ↵	Craig Topper	2017-05-22	1	-2/+2
\| \| \| \| \| \|	reference. NFCI llvm-svn: 303523
*	[KnownBits] Use isNegative/isNonNegative to shorten some code. NFC	Craig Topper	2017-05-22	1	-2/+2
\| \| \| \|	llvm-svn: 303522
*	NewGVN: Fix PR 33116, the memoryphi version of bug 32838.	Daniel Berlin	2017-05-21	1	-6/+5
\| \| \| \|	llvm-svn: 303521
*	NewGVN: Cleanup some repeated code using some templated helpers	Daniel Berlin	2017-05-21	1	-40/+40
\| \| \| \|	llvm-svn: 303520
*	NewGVN: Fix printing of simplified expression	Daniel Berlin	2017-05-21	1	-1/+1
\| \| \| \|	llvm-svn: 303519
*	[InstCombine] Take in account the size in sext->lshr->trunc patterns.	Davide Italiano	2017-05-21	1	-6/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Otherwise we end up miscompiling, transforming: define i8 @tinky() { %sext = sext i1 1 to i16 %hibit = lshr i16 %sext, 15 %tr = trunc i16 %hibit to i8 ret i8 %tr } into: %sext = sext i1 1 to i8 ret i8 %sext and the first get folded to ret i8 1, while the second gets folded to ret i8 -1. Eventually we should get rid of this transform entirely, but for now, this at least fixes a know correctness bug. Differential Revision: https://reviews.llvm.org/D33338 llvm-svn: 303513
*	Revert "Add pthread_self function prototype and make it speculatable."	Xin Tong	2017-05-21	1	-12/+0
\| \| \| \| \| \| \| \|	This reverts commit 143d7445b5dfa2f6d6c45bdbe0433d9fc531be21. Build breaking llvm-svn: 303496
*	Add pthread_self function prototype and make it speculatable.	Xin Tong	2017-05-20	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This allows pthread_self to be pulled out of a loop by LICM. Reviewers: hfinkel, arsenm, davide Reviewed By: davide Subscribers: davide, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D32782 llvm-svn: 303495
*	[NewGVN] Create a StoreExpression instead of a VariableExpression.	Davide Italiano	2017-05-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	In the case where we have an operand defined by a lod of the same memory location. Historically this was a VariableExpression because we wanted to make sure they ended up in the same class, but if we create the right expression, they end up in the same class anyway. Fixes PR32897. Thanks to Dan for the detailed discussion and the fix suggestion. llvm-svn: 303475