bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Revert r193251 : Use address-taken to disambiguate global variable and indirect memops.	Shuxin Yang	2013-10-27	1	-1/+0
	llvm-svn: 193489
*	Quick look-up for block in loop.	Wan Xiaofei	2013-10-26	2	-18/+8
	This patch implements quick look-up for a block in a loop by maintaining a hash set of blocks. It improves the efficiency of loop analysis considerably; the biggest improvement could be 5-6% (458.sjeng). Below are the compilation times for our benchmarks in llc before and after the patch.
	Benchmark      llc - trunk             llc - patched
	401.bzip2       0.339081  100.00%       0.329657  102.86%
	403.gcc        19.853966  100.00%      19.605466  101.27%
	429.mcf         0.049823  100.00%       0.048451  102.83%
	433.milc        0.514898  100.00%       0.510217  100.92%
	444.namd        1.109328  100.00%       1.103481  100.53%
	445.gobmk       4.988028  100.00%       4.929114  101.20%
	456.hmmer       0.843871  100.00%       0.825865  102.18%
	458.sjeng       0.754238  100.00%       0.714095  105.62%
	464.h264ref     2.9668    100.00%       2.90612   102.09%
	471.omnetpp     4.556533  100.00%       4.511886  100.99%
	bitmnp01        0.038168  100.00%       0.0357    106.91%
	idctrn01        0.037745  100.00%       0.037332  101.11%
	libquake2       3.78689   100.00%       3.76209   100.66%
	libquake_       2.251525  100.00%       2.234104  100.78%
	linpack         0.033159  100.00%       0.032788  101.13%
	matrix01        0.045319  100.00%       0.043497  104.19%
	nbench          0.333161  100.00%       0.329799  101.02%
	tblook01        0.017863  100.00%       0.017666  101.12%
	ttsprk01        0.054337  100.00%       0.053057  102.41%
	Reviewer : Andrew Trick <atrick@apple.com>, Hal Finkel <hfinkel@anl.gov>
	Approver : Andrew Trick <atrick@apple.com>
	Test     : Pass make check-all & llvm test-suite
	llvm-svn: 193460
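The idea of keeping a hash set next to the block list can be illustrated with a minimal sketch. Everything here (`Block`, `LoopSketch`, the use of `std::unordered_set`) is a hypothetical stand-in chosen for illustration; it is not the data structure or container the patch actually uses.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a basic block; not LLVM's BasicBlock.
struct Block {
  int Id;
};

// Hypothetical loop container: blocks are kept in insertion order for
// iteration and mirrored in a hash set for membership queries.
class LoopSketch {
  std::vector<const Block *> Blocks;          // ordered list of blocks
  std::unordered_set<const Block *> BlockSet; // same blocks, for fast lookup

public:
  void addBlock(const Block *BB) {
    assert(BB && "cannot add a null block");
    // insert() reports whether the block was new, so duplicates are skipped.
    if (BlockSet.insert(BB).second)
      Blocks.push_back(BB);
  }

  // Average O(1) instead of a linear scan over Blocks.
  bool contains(const Block *BB) const { return BlockSet.count(BB) != 0; }

  const std::vector<const Block *> &getBlocks() const { return Blocks; }
};
```

The trade-off in this sketch is extra memory for the set in exchange for constant-time `contains()` queries, which loop analyses tend to issue many times per loop.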
*	Fix SCEVExpander: don't try to expand quadratic recurrences outside a loop.	Andrew Trick	2013-10-25	2	-3/+21
	Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu (affecting trunk and 3.3) When SCEV expands a recurrence outside of a loop it attempts to scale by the stride of the recurrence. Chained recurrences don't work that way. We could compute binomial coefficients, but would have to guarantee that the chained AddRec's are in a perfectly reduced form. llvm-svn: 193438
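Why scaling by the stride fails for a chained (quadratic) recurrence can be seen in a small numeric sketch. The function names below are hypothetical and this is plain integer arithmetic, not SCEV code; it only compares three ways of evaluating {A,+,B,+,C} after N iterations.

```cpp
#include <cassert>
#include <cstdint>

// Value of the chained recurrence {A,+,B,+,C} after N iterations,
// computed by stepping it: the step itself grows by C each iteration.
int64_t simulateChrec(int64_t A, int64_t B, int64_t C, unsigned N) {
  int64_t Val = A, Step = B;
  for (unsigned I = 0; I < N; ++I) {
    Val += Step;
    Step += C;
  }
  return Val;
}

// Closed form using binomial coefficients:
// A + B*binom(N,1) + C*binom(N,2).
int64_t closedFormChrec(int64_t A, int64_t B, int64_t C, unsigned N) {
  int64_t n = N;
  return A + B * n + C * (n * (n - 1) / 2);
}

// Naive "scale by the stride" evaluation, which is only valid when C == 0.
int64_t strideScaled(int64_t A, int64_t B, unsigned N) {
  return A + B * static_cast<int64_t>(N);
}
```

For {0,+,1,+,2} after 3 iterations, the stepped and closed-form values agree, while the stride-scaled value does not; for an affine recurrence (C == 0) all three agree.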
*	Handle calls and invokes in GlobalStatus.	Rafael Espindola	2013-10-25	1	-0/+5
	This patch teaches GlobalStatus to analyze a call that uses the global value as a callee, not as an argument. With this change internalize can handle the common use of linkonce_odr functions. This reduces the number of linkonce_odr functions in an LTO build of clang (checked with the emit-llvm gold plugin option) from 1730 to 60. llvm-svn: 193436
*	LoopVectorizer: Don't attempt to vectorize extractelement instructions	Hal Finkel	2013-10-25	1	-2/+3
	The loop vectorizer does not currently understand how to vectorize extractelement instructions. The existing check, which excluded all vector-valued instructions, did not catch extractelement instructions because it checked only the return value. As a result, vectorization would proceed, producing illegal instructions like this:
	  %58 = extractelement <2 x i32> %15, i32 0
	  %59 = extractelement i32 %58, i32 0
	where the second extractelement is illegal because its first operand is not a vector. llvm-svn: 193434
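The gap in a result-only check can be sketched as follows. The types and function names are hypothetical, and this is only one possible way to close the gap; the commit message does not say whether the real fix inspects operand types or tests for extractelement directly.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified type and instruction descriptions.
struct TypeSketch {
  bool IsVector;
};

struct InstSketch {
  TypeSketch Result;
  std::vector<TypeSketch> Operands;
};

// Check that looks only at the result type. An extractelement yields a
// scalar, so it passes even though it consumes a vector.
bool resultOnlyCheck(const InstSketch &I) { return !I.Result.IsVector; }

// Check that also rejects any instruction with a vector operand.
bool resultAndOperandCheck(const InstSketch &I) {
  if (I.Result.IsVector)
    return false;
  for (const TypeSketch &Op : I.Operands)
    if (Op.IsVector)
      return false;
  return true;
}
```

An instruction shaped like `extractelement <2 x i32> %v, i32 0` has a scalar result and one vector operand, so it slips through the first check and is rejected by the second.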
*	Inliner: Handle readonly attribute per argument when adding memcpy	Tom Stellard	2013-10-24	1	-10/+13
	Patch by: Vincent Lejeune llvm-svn: 193356
*	Mark vector loops as already vectorized	Renato Golin	2013-10-24	1	-0/+4
	Make sure we mark all loops (scalar and vector) when vectorizing, so that we don't try to vectorize them anymore. Also, set unroll to 1, since this is what we check for on early exit. llvm-svn: 193349
*	fix PR17635: false positive with packed structures	Nuno Lopes	2013-10-24	1	-1/+2
	LLVM optimizers may widen accesses to packed structures so that they overflow the structure itself, but the accesses should stay in bounds up to the alignment of the object. llvm-svn: 193317
*	Fix a bug in LinearFunctionTestReplace that created invalid loop exit checks.	Juergen Ributzka	2013-10-24	1	-1/+7
	Reviewed by Andy llvm-svn: 193303
*	Clarify comments in genLoopLimit.	Andrew Trick	2013-10-24	1	-3/+4
	llvm-svn: 193292
*	Fixed comment typo in GCOVProfiling.cpp	Yuchen Wu	2013-10-23	1	-1/+1
	llvm-svn: 193268
*	Use address-taken to disambiguate global variable and indirect memops.	Shuxin Yang	2013-10-23	1	-0/+1
	Major steps include:
	1) Introduce a not-addr-taken bit-field in GlobalVariable.
	2) The GlobalOpt pass sets "not-address-taken" if it proves a global variable doesn't have its address taken.
	3) AA uses this info for disambiguation.
	llvm-svn: 193251
*	Fix spelling, grammar, and match naming convention for test files.	Eric Christopher	2013-10-21	1	-3/+3
	llvm-svn: 193130
*	SimplifyCFG: Don't duplicate calls to functions marked noduplicate v2	Tom Stellard	2013-10-21	1	-0/+15
	v2: - Use CI->cannotDuplicate() llvm-svn: 193115
*	Use more type helper functions	Matt Arsenault	2013-10-21	4	-23/+23
	llvm-svn: 193109
*	Teach SimplifyCFG about address spaces	Matt Arsenault	2013-10-21	1	-5/+9
	llvm-svn: 193104
*	Optimize more linkonce_odr values during LTO.	Rafael Espindola	2013-10-21	4	-210/+200
	When a linkonce_odr value that is on the dso list is not unnamed_addr we can still look to see if anything is actually using its address. If not, it is safe to hide it. This patch implements that by moving GlobalStatus to Transforms/Utils and using it in Internalize. llvm-svn: 193090
*	Fix the predecessor removal logic in r193045.	Michael Gottesman	2013-10-21	1	-11/+9
	Some small comment/stylistic fixes are included as well. llvm-svn: 193068
*	Don't eliminate a partially redundant load if it's in a landing pad.	Bill Wendling	2013-10-21	2	-15/+7
	A landing pad can be jumped to only by the unwind edge of an invoke instruction. If we eliminate a partially redundant load in a landing pad, it will create a basic block that violates this constraint. It then leads to other problems down the line if it tries to merge that basic block with the landing pad. Avoid this by not eliminating the load in a landing pad. PR17621 llvm-svn: 193064
*	Teach simplify-cfg how to correctly create covered lookup tables for switches on iN with N >= 3.	Michael Gottesman	2013-10-20	1	-6/+28
	One optimization simplify-cfg performs is converting switches to lookup tables if the switch has > 4 cases. This is done by:
	1. Finding the max/min case value and calculating the switch case range.
	2. Creating a lookup table basic block.
	3. Performing a check in the switch's BB to see if the input value is in the switch's case range. If the input value satisfies said predicate, branch to the lookup table BB; otherwise branch to the switch's default destination BB using the default value as the result.
	The conditional check consists of subtracting the min case value of the table from any input iN value and then ensuring that said value is unsigned less than the size of the lookup table represented as an iN value. If the lookup table is a covered lookup table, the size of the table will be 2^N, which is 0 as an iN value. Thus the comparison will be an `icmp ult` of an iN value against 0, which is always false, yielding the incorrect result.
	This patch fixes this problem by recognizing if we have a covered lookup table and, if we do, unconditionally jumping to the lookup table BB, since the covering property of the lookup table implies that every input value can be handled by said BB.
	rdar://15268442 llvm-svn: 193045
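The wrap-around in the range check can be reproduced with a small sketch that models an i3 value by masking to three bits. The names (`truncToIN`, `rangeCheck`, `isCovered`, `useTable`) are hypothetical and the code models the arithmetic only; it is not the SimplifyCFG implementation.

```cpp
#include <cassert>
#include <cstdint>

// Model iN with N = 3 by keeping only the low three bits.
constexpr unsigned Bits = 3;
constexpr uint32_t Mask = (1u << Bits) - 1;
static_assert(Bits < 32, "the model needs room for 2^Bits in uint32_t");

uint32_t truncToIN(uint32_t V) { return V & Mask; }

// The check described above: (X - Min) ult TableSize, with every value
// represented in iN. A table of 2^N entries truncates to size 0 here.
bool rangeCheck(uint32_t X, uint32_t Min, uint32_t TableSize) {
  return truncToIN(X - Min) < truncToIN(TableSize);
}

// A covered table has one entry for every possible iN value.
bool isCovered(uint32_t TableSize) { return TableSize == (1u << Bits); }

// Skip the range check entirely when the table is covered.
bool useTable(uint32_t X, uint32_t Min, uint32_t TableSize) {
  return isCovered(TableSize) || rangeCheck(X, Min, TableSize);
}
```

With an eight-entry table on i3, `rangeCheck` compares against 0 and rejects every input, whereas `useTable` always selects the table; for a partial table the two agree.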
*	Perform an intelligent splice of the predecessor with the single successor.	Bill Wendling	2013-10-19	1	-1/+14
	If the predecessor's being spliced into a landing pad, then we need the PHIs to come first and the rest of the predecessor's code to come after the landing pad instruction. llvm-svn: 193035
*	Mark some command line flags as hidden	Nadav Rotem	2013-10-18	1	-3/+3
	llvm-svn: 193013
*	Rename fields of GlobalStatus to match the coding style.	Rafael Espindola	2013-10-17	1	-43/+41
	llvm-svn: 192910
*	rename SafeToDestroyConstant to isSafeToDestroyConstant and clang-format.	Rafael Espindola	2013-10-17	1	-10/+12
	llvm-svn: 192907
*	Simplify the interface of AnalyzeGlobal a bit and rename to analyzeGlobal.	Rafael Espindola	2013-10-17	1	-14/+22
	No functionality change. llvm-svn: 192906
*	[msan] Use zero-extension in shadow cast by default.	Evgeniy Stepanov	2013-10-17	1	-7/+8
	The switch to sign-extension in r192575 caused a 7% perf loss on 482.sphinx3. llvm-svn: 192882
*	tsan: implement no_sanitize_thread attribute	Dmitry Vyukov	2013-10-17	1	-1/+1
	If a function has the no_sanitize_thread attribute, do not instrument memory accesses in it. llvm-svn: 192871
*	SLPVectorizer: Don't vectorize volatile memory operations	Arnold Schwaighofer	2013-10-16	1	-3/+8
	radar://15231682 Reapply r192799, http://lab.llvm.org:8011/builders/lldb-x86_64-debian-clang/builds/8226 showed that the bot is still broken even with this out. llvm-svn: 192820
*	Revert "SLPVectorizer: Don't vectorize volatile memory operations"	Arnold Schwaighofer	2013-10-16	1	-8/+3
	This speculatively reverts commit 192799. It might have broken a linux buildbot. llvm-svn: 192816
*	SLPVectorizer: Don't vectorize volatile memory operations	Arnold Schwaighofer	2013-10-16	1	-3/+8
	radar://15231682 llvm-svn: 192799
*	[asan] Optimize accesses to global arrays with constant index	Kostya Serebryany	2013-10-16	1	-6/+33
	Summary: Given a global array G[N], which is declared in this CU and has a static initializer, avoid instrumenting accesses like G[i], where 'i' is a constant and 0<=i<N. Also add a bit of stats. This eliminates ~1% of instrumentations on SPEC2006 and also partially helps when asan is being run together with coverage.
	Reviewers: samsonov
	Reviewed By: samsonov
	CC: llvm-commits
	Differential Revision: http://llvm-reviews.chandlerc.com/D1947
	llvm-svn: 192794
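The condition under which an access can be left uninstrumented can be written down as a short predicate. The struct and function below are hypothetical and simply restate the conditions listed in the summary; they are not taken from the asan pass.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of a global array G[N].
struct GlobalArraySketch {
  bool DefinedInThisCU;      // declared in the current compilation unit
  bool HasStaticInitializer; // has a static initializer
  uint64_t NumElements;      // N
};

// True when an access G[Index] with a constant Index is provably in
// bounds (0 <= Index < N) for a global that meets both requirements.
bool canSkipInstrumentation(const GlobalArraySketch &G, int64_t Index) {
  if (!G.DefinedInThisCU || !G.HasStaticInitializer)
    return false;
  return Index >= 0 && static_cast<uint64_t>(Index) < G.NumElements;
}
```

Accesses with a non-constant index, or to globals that fail either requirement, would still be instrumented under this sketch.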
*	LoopVectorize: Properly reflect PODness in comments.	Benjamin Kramer	2013-10-15	1	-4/+4
	llvm-svn: 192717
*	Remove x86_sse42_crc32_64_8 intrinsic. It has no functional difference from x86_sse42_crc32_32_8 and was not mapped to a clang builtin.	Craig Topper	2013-10-15	1	-1/+0
	I'm not even sure why this form of the instruction is even called out explicitly in the docs. Also add AutoUpgrade support to convert it into the other intrinsic with appropriate trunc and zext. llvm-svn: 192672
*	Remove lib/Transforms/Instrumentation/ProfilingUtils.*	Rafael Espindola	2013-10-14	4	-207/+0
	They were leftover from the old profiling support. Patch by Alastair Murray. llvm-svn: 192605
*	Basic blocks typically have few predecessors. Use a SmallDenseMap to avoid a heap allocation when this is the case.	Chris Lattner	2013-10-14	1	-3/+3
	llvm-svn: 192602
*	[msan] Instrument x86._cvt intrinsics.	Evgeniy Stepanov	2013-10-14	1	-28/+149
	Currently MSan checks that arguments of cvt intrinsics are fully initialized. That's too much to ask: some of them only operate on the lower half, or even quarter, of the input register. llvm-svn: 192599
*	[msan] Fix handling of scalar select of vectors.	Evgeniy Stepanov	2013-10-14	1	-4/+4
	llvm-svn: 192575
*	SLPVectorizer: Sort PHINodes based on their opcode	Arnold Schwaighofer	2013-10-12	1	-23/+44
	Before this patch we relied on the order of phi nodes when we looked for phi nodes of the same type. This could prevent vectorization of cases where there was a phi node of a second type in between phi nodes of some type. This is important for vectorization of an internal graphics kernel. On the test suite + external on x86_64 (and on a run on armv7s) it showed no impact on either performance or compile time. radar://15024459 llvm-svn: 192537
*	LoopVectorize: Add missing INITIALIZE_PASS_DEPENDENCY macros	Tobias Grosser	2013-10-12	1	-0/+3
	Contributed-by: Peter Zotov <whitequark@whitequark.org> llvm-svn: 192536
*	Better info when debugging vectorizer	Renato Golin	2013-10-11	1	-6/+5
	llvm-svn: 192460
*	Fix a bug in Dead Argument Elimination.	Shuxin Yang	2013-10-09	1	-0/+13
	If a function seen at compile time is not necessarily the one linked to the binary being built, it is illegal to change the actual arguments passed to it. e.g.
	--------------------------
	void foo(int lol) {
	  // foo() has linkage satisfying isWeakForLinker()
	  // "lol" is not used at all.
	}
	void bar(int lol2) {
	  // xform to foo(undef) is illegal, as the compiler does not know which
	  // instance of foo() will be linked to the binary being built.
	  foo(lol2);
	}
	-----------------------------
	Such functions can be captured by isWeakForLinker(). NOTE that mayBeOverridden() is insufficient for this purpose as it doesn't include linkage types like AvailableExternallyLinkage and LinkOnceODRLinkage. Take link_odr* as an example: it indicates a set of EQUIVALENT globals that can be merged at link-time. However, the semantics of EQUIVALENT functions include parameters. Changing parameters breaks the assumption.
	Thanks to John McCall for help, especially for the explanation of the subtle differences between linkage types.
	rdar://11546243 llvm-svn: 192302
*	LoopVectorize: External uses must use the last value in a reduction cycle	Arnold Schwaighofer	2013-10-07	1	-0/+6
	Otherwise, we don't perform operations that would have been performed on the scalar version. Fixes PR17498. llvm-svn: 192133
*	Revert r191834 until we measure the effect of this on benchmarks and maybe find a better way to fix it	Alexey Samsonov	2013-10-07	1	-3/+56
	llvm-svn: 192121
*	UpdatePHINodes in BasicBlockUtils should not crash on duplicate predecessors	Hal Finkel	2013-10-04	1	-2/+6
	UpdatePHINodes has an optimization to reuse an existing PHI node, where it first deletes all of its entries and then replaces them. Unfortunately, in the case where we had duplicate predecessors (which are allowed so long as the associated PHI entries have the same value), the loop removing the existing PHI entries from the to-be-reused PHI would assert (if that PHI was not the one which had the duplicates). llvm-svn: 192001
*	SLPVectorizer: Sort inputs to commutative binary operations	Arnold Schwaighofer	2013-10-04	1	-4/+123
	Sort the operands of the other entries in the current vectorization root according to the first entry's operands' opcodes.
	  %conv0 = uitofp ...
	  %load0 = load float ...
	  = fmul %conv0, %load0
	  = fmul %load0, %conv1
	  = fmul %load0, %conv2
	Make sure that we recursively vectorize <%conv0, %conv1, %conv2> and <%load0, %load0, %load0>. This makes it more likely to obtain vectorizable trees. We have to be careful when we sort that we don't destroy 'good' existing ordering implied by source order. radar://15080067 llvm-svn: 191977
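The operand reordering can be sketched on a toy representation where each commutative operation records only the opcodes that produce its two operands. `Opcode`, `BinOpSketch`, and `sortLikeFirst` are hypothetical names, and the real pass handles more cases (including preserving good source order) than this sketch does.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical opcodes of the instructions feeding a commutative operation.
enum class Opcode { Conv, Load };

// A commutative binary operation, described by what produces each operand.
struct BinOpSketch {
  Opcode Left;
  Opcode Right;
};

// Reorder the operands of every later entry so that its left operand has
// the same opcode as the first entry's left operand, when a swap achieves
// that. Entries that already match, or cannot match, are left alone.
void sortLikeFirst(std::vector<BinOpSketch> &Ops) {
  if (Ops.empty())
    return;
  const Opcode WantLeft = Ops[0].Left;
  for (std::size_t I = 1; I < Ops.size(); ++I)
    if (Ops[I].Left != WantLeft && Ops[I].Right == WantLeft)
      std::swap(Ops[I].Left, Ops[I].Right);
}
```

Applied to the three fmul operations above, the sketch lines up all the conversions in the left column and all the loads in the right column, which is the shape the recursive vectorization step wants.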
*	Pull fptrunc's upwards through selects when one of the select's selectands was a constant.	Owen Anderson	2013-10-03	1	-0/+13
	This has a number of benefits, including producing small immediates (easier to materialize, smaller constant pools) as well as being more likely to allow the fptrunc to fuse with a preceding instruction (truncating selects are unusual). llvm-svn: 191929
*	Optimize linkonce_odr unnamed_addr functions during LTO.	Rafael Espindola	2013-10-03	3	-11/+39
	Generalize the API so we can distinguish symbols that are needed just for a DSO symbol table from those that are used from some native .o. The symbols that are only wanted for the dso symbol table can be dropped if llvm can prove every other dso has a copy (linkonce_odr) and the address is not important (unnamed_addr). llvm-svn: 191922
*	Make gep i8* X, -(ptrtoint Y) transform work with address spaces	Matt Arsenault	2013-10-03	1	-8/+10
	llvm-svn: 191920
*	Don't use runtime bounds check between address spaces.	Matt Arsenault	2013-10-02	1	-11/+49
	Don't vectorize with a runtime check if it requires a comparison between pointers with different address spaces. The values can't be assumed to be directly comparable. Previously it would create an illegal bitcast. llvm-svn: 191862
*	Apply slp vectorization on fully-vectorizable tree of height 2	Yi Jiang	2013-10-02	1	-4/+21
	llvm-svn: 191852