bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Allow DeadStoreElimination to track combinations of partial later writes	Hal Finkel	2016-06-23	1	-2/+73
	DeadStoreElimination can currently remove a small store rendered unnecessary by a later larger one, but cannot remove a larger store rendered unnecessary by a series of later smaller ones. This adds that capability.

	It works by keeping a map, used as an effective interval map, for each store later overwritten only partially, and filling in that interval map as more such stores are discovered. No additional walking or aliasing queries are used. If the map forms an interval covering the entire earlier store, then it is dead and can be removed. The map is used as an interval map by storing a mapping between the ending offset and the beginning offset of each interval.

	I discovered this problem when investigating a performance issue with code like this on PowerPC:

	  #include <complex>
	  using namespace std;

	  complex<float> bar(complex<float> C);
	  complex<float> foo(complex<float> C) {
	    return bar(C)*C;
	  }

	which produces this:

	  define void @_Z4testSt7complexIfE(%"struct.std::complex"* noalias nocapture sret %agg.result, i64 %c.coerce) {
	  entry:
	    %ref.tmp = alloca i64, align 8
	    %tmpcast = bitcast i64* %ref.tmp to %"struct.std::complex"*
	    %c.sroa.0.0.extract.shift = lshr i64 %c.coerce, 32
	    %c.sroa.0.0.extract.trunc = trunc i64 %c.sroa.0.0.extract.shift to i32
	    %0 = bitcast i32 %c.sroa.0.0.extract.trunc to float
	    %c.sroa.2.0.extract.trunc = trunc i64 %c.coerce to i32
	    %1 = bitcast i32 %c.sroa.2.0.extract.trunc to float
	    call void @_Z3barSt7complexIfE(%"struct.std::complex"* nonnull sret %tmpcast, i64 %c.coerce)
	    %2 = bitcast %"struct.std::complex"* %agg.result to i64*
	    %3 = load i64, i64* %ref.tmp, align 8
	    store i64 %3, i64* %2, align 4 ; <--- *** THIS SHOULD NOT BE HERE **
	    %_M_value.realp.i.i = getelementptr inbounds %"struct.std::complex", %"struct.std::complex"* %agg.result, i64 0, i32 0, i32 0
	    %4 = lshr i64 %3, 32
	    %5 = trunc i64 %4 to i32
	    %6 = bitcast i32 %5 to float
	    %_M_value.imagp.i.i = getelementptr inbounds %"struct.std::complex", %"struct.std::complex"* %agg.result, i64 0, i32 0, i32 1
	    %7 = trunc i64 %3 to i32
	    %8 = bitcast i32 %7 to float
	    %mul_ad.i.i = fmul fast float %6, %1
	    %mul_bc.i.i = fmul fast float %8, %0
	    %mul_i.i.i = fadd fast float %mul_ad.i.i, %mul_bc.i.i
	    %mul_ac.i.i = fmul fast float %6, %0
	    %mul_bd.i.i = fmul fast float %8, %1
	    %mul_r.i.i = fsub fast float %mul_ac.i.i, %mul_bd.i.i
	    store float %mul_r.i.i, float* %_M_value.realp.i.i, align 4
	    store float %mul_i.i.i, float* %_M_value.imagp.i.i, align 4
	    ret void
	  }

	The problem here is not just that the i64 store is unnecessary, but also that it blocks further backend optimizations of the other uses of that i64 value.

	In the future, we might want to add a special case for handling smaller accesses (e.g. using a bit vector) if the map mechanism turns out to be noticeably inefficient. A sorted vector is also a possible replacement for the map for small numbers of tracked intervals.

	Differential Revision: http://reviews.llvm.org/D18586
	llvm-svn: 273559
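The end-offset-to-start-offset map described in the commit above can be illustrated with a small standalone sketch. This is not the code from the patch: `IntervalMap` and `recordAndCheckCovered` are hypothetical names, offsets are assumed to be byte offsets from a common base, and intervals are assumed to be half-open.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical sketch, not LLVM's implementation. Each entry maps an
// interval's end offset to its start offset; intervals are half-open
// [start, end) and kept disjoint and non-adjacent.
using IntervalMap = std::map<int64_t, int64_t>;

// Records that a later store wrote [start, end) and reports whether the
// recorded stores now cover the whole earlier store
// [earlierStart, earlierEnd).
bool recordAndCheckCovered(IntervalMap &im, int64_t start, int64_t end,
                           int64_t earlierStart, int64_t earlierEnd) {
  assert(start < end && earlierStart < earlierEnd);

  // First tracked interval whose end is at or past the new start. Any
  // interval that overlaps or touches the new one is at or after this.
  auto it = im.lower_bound(start);

  // Merge every interval that overlaps or touches the new one.
  while (it != im.end() && it->second <= end) {
    start = std::min(start, it->second);
    end = std::max(end, it->first);
    it = im.erase(it);
  }
  im[end] = start;

  // Because intervals never touch, the earlier store is covered only if
  // a single interval spans it entirely.
  auto cover = im.lower_bound(earlierEnd);
  return cover != im.end() && cover->second <= earlierStart;
}
```

For the `complex<float>` case above, recording the two 4-byte stores [0, 4) and [4, 8) against the 8-byte store [0, 8) merges them into one interval, at which point the earlier store would be reported as fully overwritten. Keying on the end offset lets a single `lower_bound` call find the first candidate for merging.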
*	Fix unused variable warning by folding the temporary into the debug statement.	Eric Christopher	2016-06-23	1	-2/+2
	llvm-svn: 273523
*	[SCCP] Don't assume all Constants are ConstantInt	David Majnemer	2016-06-23	1	-8/+8
	This fixes PR28269. llvm-svn: 273521
*	[RS4GC] Use StringRef; NFC	Sanjoy Das	2016-06-22	1	-4/+3
	Spotted during random inspection. llvm-svn: 273512
*	IR: Introduce Module::global_objects().	Peter Collingbourne	2016-06-22	1	-13/+6
	This is a convenience iterator that allows clients to enumerate the GlobalObjects within a Module. Also start using it in a few places where it is obviously the right thing to use. Differential Revision: http://reviews.llvm.org/D21580 llvm-svn: 273470
*	[asan] Do not instrument accesses to profiling globals	Vedant Kumar	2016-06-22	1	-5/+14
	It's only useful to asan-itize profiling globals while debugging llvm's profiling instrumentation passes. Enabling asan along with instrprof or gcov instrumentation shouldn't incur extra overhead. This patch is in the same spirit as r264805 and r273202, which disabled tsan instrumentation of instrprof/gcov globals. Differential Revision: http://reviews.llvm.org/D21541 llvm-svn: 273444
*	Delete more dead code.	Rafael Espindola	2016-06-22	4	-80/+0
	Found by gcc 6. llvm-svn: 273402
*	[asan] Do not instrument pointers with address space attributes	Anna Zaks	2016-06-22	2	-0/+17
	Do not instrument pointers with address space attributes since we cannot track them anyway. Instrumenting them results in false positives in ASan and a compiler crash in TSan. (The compiler should not crash in any case, but that's a different problem.) llvm-svn: 273339
*	Delete some dead code.	Rafael Espindola	2016-06-21	1	-8/+0
	Found by gcc 6. llvm-svn: 273303
*	Fix PR28219: Use profile summary from reader and not compute it	Easwaran Raman	2016-06-21	1	-4/+1
	Differential Revision: http://reviews.llvm.org/D21546 llvm-svn: 273301
*	Add MemoryAccess creation and PHI creation APIs to MemorySSA	Daniel Berlin	2016-06-21	1	-3/+108
	Reviewers: george.burgess.iv, gberry, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21463 llvm-svn: 273295
*	This is part of the effort for asan to support Windows 64 bit.	Etienne Bergeron	2016-06-21	1	-0/+4
	The large offset is being tested on Windows 10 (which has a larger usable virtual address space than Windows 8 or earlier). Patch by: Wei Wang Differential Revision: http://reviews.llvm.org/D21523 llvm-svn: 273269
*	reverted the prev commit due to assertion failure	Elena Demikhovsky	2016-06-21	1	-7/+80
	llvm-svn: 273258
*	Fixed consecutive memory access detection in Loop Vectorizer.	Elena Demikhovsky	2016-06-21	1	-80/+7
	It did not correctly handle cases without a GEP. The following loop wasn't vectorized: for (int i=0; i<len; i++) *to++ = *from++; I use getPtrStride() to find the Stride for a memory access and return 0 if the Stride is not 1 or -1. Differential Revision: http://reviews.llvm.org/D20789 llvm-svn: 273257
*	Replace silly uses of 'signed' with 'int'	David Majnemer	2016-06-21	1	-1/+1
	llvm-svn: 273244
*	clang format change /NFC	Xinliang David Li	2016-06-21	1	-28/+34
	llvm-svn: 273233
*	[tsan] Do not instrument accesses to the gcov counters array	Vedant Kumar	2016-06-20	1	-0/+4
	There is a known intended race here. This is a follow-up to r264805, which disabled tsan instrumentation for updates to instrprof counters. For more background on this please see the discussion in D18164. llvm-svn: 273202
*	[InstSimplify] analyze (optionally casted) icmps to eliminate obviously ↵	Sanjay Patel	2016-06-20	1	-10/+0
	false logic (PR27869)
	By moving this transform to InstSimplify from InstCombine, we sidestep the problem/question raised by PR27869: https://llvm.org/bugs/show_bug.cgi?id=27869 ...where InstCombine turns an icmp+zext into a shift causing us to miss the fold. Credit to David Majnemer for a draft patch of the changes to InstructionSimplify.cpp. Differential Revision: http://reviews.llvm.org/D21512 llvm-svn: 273200
*	Pass AssumptionCacheTracker from SampleProfileLoader to Inliner	Dehao Chen	2016-06-20	1	-4/+17
	Summary: Inliner needs ACT when calling InlineFunction. Instead of nullptr, we need to pass it in from SampleProfileLoader. Reviewers: davidxl Subscribers: eraman, vsk, danielcdh, llvm-commits Differential Revision: http://reviews.llvm.org/D21205 llvm-svn: 273199
*	Rename to be consistent with other type names. NFC	Daniel Berlin	2016-06-20	1	-11/+12
	llvm-svn: 273194
*	InstCombine: Don't strip convergent from intrinsic callsites	Matt Arsenault	2016-06-20	1	-1/+2
	Specific instances of intrinsic calls may want to be convergent, such as certain register reads, but the intrinsic declaration is not. llvm-svn: 273188
*	Forgot to update callers of deleteDeadInstruction	David Majnemer	2016-06-20	1	-2/+2
	llvm-svn: 273163
*	Reapply "[LoopIdiom] Don't remove dead operands manually"	David Majnemer	2016-06-20	1	-9/+1
	This reverts commit r273160, reapplying r273132. RecursivelyDeleteTriviallyDeadInstructions cannot be called on a parentless Instruction. llvm-svn: 273162
*	Revert "[LoopIdiom] Don't remove dead operands manually"	Cong Liu	2016-06-20	1	-1/+2
	This reverts commit r273132. Breaks multiple tests under /llvm/test:Transforms (e.g. llvm/test:Transforms/LoopIdiom/basic.ll.test) under asan. llvm-svn: 273160
*	Fix formatting of r273144. NFC.	Patrik Hagglund	2016-06-20	1	-4/+4
	llvm-svn: 273149
*	Avoid output indeterminism between GCC and Clang builds.	Patrik Hagglund	2016-06-20	1	-3/+8
	Remove the dependency on the evaluation order of function arguments, which is unspecified.

	The following test previously failed when built with GCC (but succeeded when built with Clang):

	  ; RUN: opt -sroa -S < %s | FileCheck %s
	  target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
	  target triple = "x86_64-unknown-linux-gnu"

	  %A = type {i16}
	  @a = global %A* null
	  @b = global i16 0

	  ; CHECK-LABEL: @f1(
	  ; CHECK: alloca %A
	  ; CHECK-NEXT: extractvalue %A
	  ; CHECK-NEXT: getelementptr inbounds %A
	  define void @f1 (%A %a) {
	    %1 = alloca %A
	    store %A %a, %A* %1
	    %2 = load i16, i16* @b
	    %3 = icmp ne i16 %2, 0
	    br i1 %3, label %bb1, label %bb2

	  bb1:
	    store %A* %1, %A** @a
	    br label %bb2

	  bb2:
	    ret void
	  }

	Patch by David Stenberg.

	Differential Revision: http://reviews.llvm.org/D21226
	llvm-svn: 273144
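The class of problem fixed above can be shown in isolation. The sketch below is a generic C++ illustration, not the SROA code touched by the patch; `nextName`, `join`, and `joinDeterministic` are hypothetical names. When two argument expressions both have side effects, the compiler may evaluate them in either order, so the usual remedy is to bind each result to a named local first.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper with a visible side effect: it appends to `log`
// and derives its result from how many entries exist so far.
std::string nextName(std::vector<std::string> &log, const std::string &base) {
  log.push_back(base);
  return base + "." + std::to_string(log.size());
}

std::string join(const std::string &a, const std::string &b) {
  return a + "," + b;
}

// Order-dependent form (avoided here):
//   join(nextName(log, "a"), nextName(log, "b"));
// The two nextName calls may run in either order, so the result can
// differ between compilers.

std::string joinDeterministic(std::vector<std::string> &log) {
  // Separate statements are sequenced, so "a" is always created first.
  std::string first = nextName(log, "a");
  std::string second = nextName(log, "b");
  return join(first, second);
}
```

With the locals in place, every conforming compiler produces the same numbering, which is the property the regression test in the commit depends on.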
*	Fix for PR27940	Patrik Hagglund	2016-06-20	1	-2/+3
	After a store has been eliminated, when making sure that the instruction iterator points to a valid instruction, dbg intrinsics are now ignored as a new instruction. Patch by Henric Karlsson. Reviewed by Daniel Berlin. Differential Revision: http://reviews.llvm.org/D21076 llvm-svn: 273141
*	[LoopIdiom] Don't remove dead operands manually	David Majnemer	2016-06-20	1	-2/+1
	Removing dead instructions requires remembering which operands have already been removed. RecursivelyDeleteTriviallyDeadInstructions has this logic; don't partially reimplement it in LoopIdiomRecognize. This fixes PR28196. llvm-svn: 273132
*	Address Eli's post-commit comments	David Majnemer	2016-06-19	1	-16/+19
	Use an APInt to handle pointers of arbitrary width, and let accumulateConstantOffset handle overflow issues. llvm-svn: 273126
*	fix formatting, typo; NFC	Sanjay Patel	2016-06-19	1	-1/+1
	llvm-svn: 273118
*	[LoadCombine] Combine Loads formed from GEPS with negative indexes	David Majnemer	2016-06-19	1	-7/+10
	Change the underlying offset and comparisons to use int64_t instead of uint64_t. Patch by River Riddle! Differential Revision: http://reviews.llvm.org/D21499 llvm-svn: 273105
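Why the signedness of the offset type matters for negative GEP indexes can be seen in a small standalone sketch. It is not the LoadCombine code: `LoadAt` and `formsContiguousRun` are hypothetical names, and the check is a simplified stand-in for deciding whether loads sorted by offset are adjacent.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record of a load at a constant byte offset from a base.
template <typename OffsetT> struct LoadAt {
  OffsetT offset;
  unsigned size;
};

// Sorts the loads by offset and checks that each one begins exactly
// where the previous one ends.
template <typename OffsetT>
bool formsContiguousRun(std::vector<LoadAt<OffsetT>> loads) {
  std::sort(loads.begin(), loads.end(),
            [](const LoadAt<OffsetT> &a, const LoadAt<OffsetT> &b) {
              return a.offset < b.offset;
            });
  for (std::size_t i = 1; i < loads.size(); ++i)
    if (loads[i - 1].offset + loads[i - 1].size != loads[i].offset)
      return false;
  return true;
}
```

With `int64_t`, a 2-byte load at offset -2 sorts before a 2-byte load at offset 0 and the pair is seen as adjacent. With `uint64_t`, the same -2 becomes a very large value, sorts last, and the adjacency check fails.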
*	[sanitizers] Disable target-specific lowering of string functions.	Marcin Koscielnicki	2016-06-18	5	-8/+80
	CodeGen has hooks that allow targets to emit specialized code instead of calls to memcmp, memchr, strcpy, stpcpy, strcmp, strlen, strnlen. When ASan/MSan/TSan/ESan is in use, this sidesteps its interceptors, resulting in uninstrumented memory accesses. To avoid that, make these sanitizers mark the calls as nobuiltin. Differential Revision: http://reviews.llvm.org/D19781 llvm-svn: 273083
*	Revert "Revert "Revert "InstCombine: Reduce trunc (shl x, K) width."""	Matt Arsenault	2016-06-17	1	-22/+5
	This seems to be causing an infinite loop / crash in instcombine on some bots. llvm-svn: 273069
*	[LAA] Enable symbolic stride speculation for all LAA clients	Adam Nemet	2016-06-17	2	-17/+2
	This is a functional change for LLE and LDist. The other clients (LV, LVerLICM) already had this explicitly enabled. The temporary boolean parameter to LAA that allowed turning off speculation of symbolic strides is removed. This makes LAA's caching interface LAA::getInfo only take the loop as the parameter. This makes the interface more friendly to the new Pass Manager. The flag -enable-mem-access-versioning is moved from LV to LAA, which now allows turning off speculation globally. llvm-svn: 273064
*	Apply another batch of fixes from clang-tidy's ↵	Benjamin Kramer	2016-06-17	4	-14/+14
	performance-unnecessary-value-param. Contains some manual fixes. No functionality change intended. llvm-svn: 273047
*	Revert "Revert "InstCombine: Reduce trunc (shl x, K) width.""	Matt Arsenault	2016-06-17	1	-5/+22
	Reapply r272987. Condition should be in terms of the destination type, and the flags should not be copied. llvm-svn: 273045
*	[PM] Port MergedLoadStoreMotion to the new pass manager, take two.	Davide Italiano	2016-06-17	2	-54/+79
	This is indeed a much cleaner approach (thanks to Daniel Berlin for pointing it out, and also to David/Sean for review). Differential Revision: http://reviews.llvm.org/D21454 llvm-svn: 273032
*	Avoid duplicated map lookups. No functionality change intended.	Benjamin Kramer	2016-06-17	3	-11/+5
	llvm-svn: 273030
*	LoopSimplifyCFG: Prefer `const auto &` to `auto &`, for clarity. NFC	Justin Bogner	2016-06-17	1	-1/+2
	llvm-svn: 273023
*	[InstCombine] allow more than one use for vector bitcast folding with selects	Sanjay Patel	2016-06-17	1	-13/+35
	The motivating example for this transform is similar to D20774 where bitcasts interfere with a single cmp/select sequence, but in this case we have 2 uses of each bitcast to produce min and max ops:

	  define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) {
	    %cmp = fcmp olt <4 x float> %a, %b
	    %bc1 = bitcast <4 x float> %a to <4 x i32>
	    %bc2 = bitcast <4 x float> %b to <4 x i32>
	    %sel1 = select <4 x i1> %cmp, <4 x i32> %bc1, <4 x i32> %bc2
	    %sel2 = select <4 x i1> %cmp, <4 x i32> %bc2, <4 x i32> %bc1
	    %bc3 = bitcast <4 x float>* %ptr1 to <4 x i32>*
	    store <4 x i32> %sel1, <4 x i32>* %bc3
	    %bc4 = bitcast <4 x float>* %ptr2 to <4 x i32>*
	    store <4 x i32> %sel2, <4 x i32>* %bc4
	    ret void
	  }

	With this patch, we move the selects up to use the input args which allows getting rid of all of the bitcasts:

	  define void @minmax_bc_store(<4 x float> %a, <4 x float> %b, <4 x float>* %ptr1, <4 x float>* %ptr2) {
	    %cmp = fcmp olt <4 x float> %a, %b
	    %sel1.v = select <4 x i1> %cmp, <4 x float> %a, <4 x float> %b
	    %sel2.v = select <4 x i1> %cmp, <4 x float> %b, <4 x float> %a
	    store <4 x float> %sel1.v, <4 x float>* %ptr1, align 16
	    store <4 x float> %sel2.v, <4 x float>* %ptr2, align 16
	    ret void
	  }

	The asm for x86 SSE then improves from:

	  movaps %xmm0, %xmm2
	  cmpltps %xmm1, %xmm2
	  movaps %xmm2, %xmm3
	  andnps %xmm1, %xmm3
	  movaps %xmm2, %xmm4
	  andnps %xmm0, %xmm4
	  andps %xmm2, %xmm0
	  orps %xmm3, %xmm0
	  andps %xmm1, %xmm2
	  orps %xmm4, %xmm2
	  movaps %xmm0, (%rdi)
	  movaps %xmm2, (%rsi)

	To:

	  movaps %xmm0, %xmm2
	  minps %xmm1, %xmm2
	  maxps %xmm0, %xmm1
	  movaps %xmm2, (%rdi)
	  movaps %xmm1, (%rsi)

	The TODO comments show that we're limiting this transform only to vectors and only to bitcasts because we need to improve other transforms or risk creating worse codegen.

	Differential Revision: http://reviews.llvm.org/D21190
	llvm-svn: 273011
*	Revert "InstCombine: Reduce trunc (shl x, K) width."	Matt Arsenault	2016-06-17	1	-24/+5
	This reverts commit r272987. This might be causing crashes on some bots. llvm-svn: 272990
*	[esan\|cfrag] Add the struct field size array in StructInfo	Qin Zhao	2016-06-17	1	-2/+13
	Summary: Adds the struct field size array in struct StructInfo. Updates test struct_field_count_basic.ll. Reviewers: aizatsky Subscribers: vitalybuka, zhaoqin, kcc, eugenis, bruening, llvm-commits Differential Revision: http://reviews.llvm.org/D21341 llvm-svn: 272989
*	InstCombine: Reduce trunc (shl x, K) width.	Matt Arsenault	2016-06-17	1	-5/+24
	llvm-svn: 272987
*	[RS4GC] Pass CallSite by value instead of const ref; NFC	Sanjoy Das	2016-06-17	1	-11/+10
	That's the idiomatic LLVM pattern. llvm-svn: 272981
*	[PM] Remove support for omitting the AnalysisManager argument to new	Chandler Carruth	2016-06-17	11	-15/+25
	pass manager passes' `run` methods. This removes a bunch of SFINAE goop from the pass manager and just requires pass authors to accept `AnalysisManager<IRUnitT> &` as a dead argument. This is a small price to pay for the simplicity of the system as a whole, despite the noise that changing it causes at this stage. This will also helpfully allow us to make the signature of the run methods much more flexible for different kinds of passes to support things like intelligently updating the pass's progression over IR units. While this touches many, many files, the changes are really boring. Mostly made with the help of my trusty perl one liners. Thanks to Sean and Hal for bouncing ideas for this with me in IRC. llvm-svn: 272978
*	Use m_APInt in SimplifyCFG	Chuang-Yu Cheng	2016-06-17	1	-5/+5
	Switch from m_Constant to m_APInt per David's request. NFC. Author: Thomas Jablin (tjablin) Reviewers: majnemer cycheng http://reviews.llvm.org/D21440 llvm-svn: 272977
*	[LV] Move management of symbolic strides to LAA. NFCI	Adam Nemet	2016-06-16	2	-60/+10
	This is still NFCI, so the list of clients that allow symbolic stride speculation does not change (yes: LV and LoopVersioningLICM, no: LLE, LDist). However, since the symbolic strides are now managed by LAA rather than passed by the client, a new bool parameter is used to enable symbolic stride speculation. The existing test Transforms/LoopVectorize/version-mem-access.ll checks that stride speculation is performed for LV. The previously added test Transforms/LoopLoadElim/symbolic-stride.ll ensures that no speculation is performed for LLE. The next patch will change the functionality and turn on symbolic stride speculation in all of LAA's clients and remove the bool parameter. llvm-svn: 272970
*	[safestack] Fixup llvm.dbg.value when rewriting unsafe allocas.	Evgeniy Stepanov	2016-06-16	1	-19/+73
	When moving unsafe allocas to the unsafe stack, dbg.declare intrinsics are updated to refer to the new location. This change does the same to dbg.value intrinsics. llvm-svn: 272968
*	[LV] Make getSymbolicStrides return a pointer rather than a reference. NFC	Adam Nemet	2016-06-16	1	-5/+5
	Turns out SymbolicStrides is actually used in canVectorizeWithIfConvert before it gets set up in canVectorizeMemory. This works fine as long as SymbolicStrides resides in LV since we just have an empty map. Based on this, the conclusion is made that there are no symbolic strides, which is conservatively correct. However, once SymbolicStrides becomes part of LAI, LAI is nullptr at this point, so we need to differentiate the uninitialized state by returning a nullptr for SymbolicStrides. llvm-svn: 272966
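The distinction drawn above between "analysis has not run yet" and "analysis ran and found nothing" can be sketched with a pointer-returning accessor. The types below are hypothetical stand-ins, not the LoopVectorize or LoopAccessInfo classes; only the shape of the accessor follows the commit description.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for the analysis result that owns the strides.
struct AccessInfo {
  std::map<std::string, int> strides;
};

// Hypothetical stand-in for the client that queries the strides.
struct Legality {
  std::unique_ptr<AccessInfo> info; // stays null until the analysis runs

  // Returns nullptr while the analysis has not run, so callers can tell
  // that state apart from an empty map.
  const std::map<std::string, int> *getSymbolicStrides() const {
    return info ? &info->strides : nullptr;
  }
};

// Conservative query: with no analysis result, report no known stride.
bool hasKnownStride(const Legality &l, const std::string &ptr) {
  const std::map<std::string, int> *s = l.getSymbolicStrides();
  return s && s->count(ptr) != 0;
}
```

A reference-returning accessor would have to dereference the missing analysis result in the early call, which is why a pointer is the simpler way to express the uninitialized state here.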
*	[EarlyCSE] Minor cosmetic NFC changes	Sanjoy Das	2016-06-16	1	-2/+2
	- Avoid implicit conversion from pointer to bool
	- Add a comment when passing in a boolean value
	llvm-svn: 272955