bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[AVX-512] Remove 128-bit and 256-bit masked floating point add/sub/mul/div ↵	Craig Topper	2016-09-04	3	-235/+200
\| \| \| \| \| \| \| \|	builtins and replace with native operations. We can't do the 512-bit ones because they take a rounding mode argument that we can't represent. llvm-svn: 280635
*	[X86] Regenerate x64 mmx/f64 return value tests	Simon Pilgrim	2016-09-04	1	-17/+26
\| \| \| \|	llvm-svn: 280634
*	[AVX-512] Remove 128-bit and 256-bit masked floating point add/sub/mul/div ↵	Craig Topper	2016-09-04	5	-316/+296
\| \| \| \| \| \|	intrinsics and upgrade to native IR. llvm-svn: 280633
*	[ORC] Clone module flags metadata into the globals module in the	Lang Hames	2016-09-04	5	-10/+40
\| \| \| \| \| \| \| \|	CompileOnDemandLayer. Also contains a tweak to the orc-lazy jit in LLI to enable the test case. llvm-svn: 280632
*	[X86] Regenerate trunc-store legalization test	Simon Pilgrim	2016-09-04	1	-4/+12
\| \| \| \|	llvm-svn: 280631
*	[ELF][MIPS] Do not emit DT_REL[A]COUNT for MIPS targets	Simon Atanasyan	2016-09-04	2	-7/+13
\| \| \| \| \| \| \| \|	It looks like the MIPS dynamic loader does not support the RELCOUNT tag. Neither the gold nor the bfd linker emits this tag on MIPS. I will investigate the problem further, but for now it is better to behave like the GNU linkers. llvm-svn: 280630
*	[X86][SSE] Regenerate fcmp/uitofp combine tests	Simon Pilgrim	2016-09-04	1	-12/+25
\| \| \| \|	llvm-svn: 280629
*	[ORC] Fix an unfinished comment.	Lang Hames	2016-09-04	1	-1/+1
\| \| \| \|	llvm-svn: 280628
*	[InstCombine] recode icmp fold in a vector-friendly way; NFC	Sanjay Patel	2016-09-04	1	-22/+30
\| \| \| \| \| \| \| \| \| \| \|	The transform in question: icmp (and (trunc W), C2), C1 -> icmp (and W, C2'), C1' ...is still not enabled for vectors, thus no functional change intended. It's not clear to me if this is a good transform for vectors or even scalars in general. Changing that behavior may be a follow-on patch. llvm-svn: 280627
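The fold described above can be checked exhaustively on small bit widths. The sketch below is not the InstCombine code: it assumes that C2' and C1' are the zero-extended forms of C2 and C1 (the commit message does not spell this out) and it covers only the equality predicate. The function names are hypothetical.

```python
def trunc(value, bits):
    """Keep only the low `bits` bits, like an LLVM `trunc`."""
    return value & ((1 << bits) - 1)


def narrow_compare(w, c2, c1, narrow_bits):
    """icmp eq (and (trunc W), C2), C1 evaluated at the narrow width."""
    return (trunc(w, narrow_bits) & c2) == c1


def wide_compare(w, c2, c1):
    """icmp eq (and W, C2'), C1' at the wide width.

    Assumption: C2' and C1' are zero-extensions of C2 and C1, which for
    non-negative Python integers leaves the values unchanged.
    """
    return (w & c2) == c1


def fold_is_sound(wide_bits=8, narrow_bits=4):
    """Brute-force every W, C2 and C1 and compare both forms."""
    for w in range(1 << wide_bits):
        for c2 in range(1 << narrow_bits):
            for c1 in range(1 << narrow_bits):
                if narrow_compare(w, c2, c1, narrow_bits) != wide_compare(w, c2, c1):
                    return False
    return True
```

The two forms agree because a zero-extended mask has no bits set above the narrow width, so the high bits of W never reach the comparison. Signed predicates need extra conditions on the constants and are not modelled here.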
*	[PowerPC] During branch relaxation, recompute padding offsets before each ↵	Hal Finkel	2016-09-04	1	-7/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	iteration We used to compute the padding contributions to the block sizes during branch relaxation only at the start of the transformation. As we perform branch relaxation, we change the sizes of the blocks, and so the amount of inter-block padding might change. Accordingly, we need to recompute the (alignment-based) padding in between every iteration on our way toward the fixed point. Unfortunately, I don't have a test case (and none was provided in the bug report), and while this obviously seems needed, algorithmically, I don't have any way of generating a small and/or non-fragile regression test. llvm-svn: 280626
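A toy model of the fixed-point iteration may help show why the padding has to be recomputed. This is only a sketch under simplifying assumptions, not the PowerPC pass: each block ends in at most one branch, a branch grows by a fixed number of bytes when its target is out of short range, and `Block`, `layout`, and `relax_branches` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    size: int                     # bytes of code in the block
    align: int = 1                # required alignment of the block start
    target: Optional[int] = None  # index of the block its final branch jumps to
    relaxed: bool = False         # True once the branch uses the long form


def layout(blocks: List[Block]) -> List[int]:
    """Return each block's start offset, including alignment padding."""
    offsets = []
    position = 0
    for block in blocks:
        # Round up to the block's alignment; the gap is inter-block padding.
        position = (position + block.align - 1) // block.align * block.align
        offsets.append(position)
        position += block.size
    return offsets


def relax_branches(blocks: List[Block], short_range: int, growth: int) -> List[int]:
    """Grow out-of-range branches until nothing changes."""
    while True:
        # Offsets (and therefore padding) are recomputed on every iteration,
        # because growing one block can change the padding after it.
        offsets = layout(blocks)
        changed = False
        for index, block in enumerate(blocks):
            if block.target is None or block.relaxed:
                continue
            branch_end = offsets[index] + block.size
            if abs(offsets[block.target] - branch_end) > short_range:
                block.size += growth
                block.relaxed = True
                changed = True
        if not changed:
            return offsets
```

With blocks of sizes 4, 12, and 4, where the last block is 16-byte aligned and the first branches to it, relaxing the first branch by 4 bytes moves the last block from offset 16 to offset 32. The padding before it changes from 0 to 12 bytes, so offsets computed once at the start would be stale.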
*	revert r279960.	Igor Breger	2016-09-04	7	-296/+728
\| \| \| \| \| \|	https://llvm.org/bugs/show_bug.cgi?id=30249 llvm-svn: 280625
*	EOL fixes	Simon Pilgrim	2016-09-04	2	-54/+54
\| \| \| \|	llvm-svn: 280624
*	Strip trailing whitespace	Simon Pilgrim	2016-09-04	1	-1/+1
\| \| \| \|	llvm-svn: 280623
*	Test case for r280607 to check presence and sanity of the *_LOCK_FREE	Joerg Sonnenberger	2016-09-04	1	-0/+32
\| \| \| \| \| \|	macros. llvm-svn: 280622
*	[libcxx] Fix a data race in call_once	Kuba Brecka	2016-09-04	3	-6/+15
\| \| \| \| \| \| \| \|	call_once is using a relaxed atomic load to perform double-checked locking, which contains a data race. The fast-path load has to be an acquire atomic load. Differential Revision: https://reviews.llvm.org/D24028 llvm-svn: 280621
*	[PM] Revert r280447: Add a unittest for invalidating module analyses with an ↵	Chandler Carruth	2016-09-04	1	-96/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SCC pass. This was mistakenly committed. The world isn't ready for this test, the test code has horrible debugging code in it that should never have landed in tree, it currently passes because of bugs elsewhere, and it needs to be rewritten to not be susceptible to passing for the wrong reasons. I'll re-land this in a better form when the prerequisite patches land. So sorry that I got this mixed into a series of commits that were ready to land. I shouldn't have. =[ What's worse is that it stuck around for so long and I discovered it while fixing the underlying bug that caused it to pass. llvm-svn: 280620
*	[LCG] Clean up and make NDEBUG verify calls more rigorous with	Chandler Carruth	2016-09-04	1	-32/+38
\| \| \| \| \| \| \| \| \| \| \|	make_scope_exit now that we have that utility. This makes the code much more clear and readable by isolating the check. It also makes it easy to go through and make sure all the interesting update routines have a start and end verify so we don't slowly let the graph drift into an invalid state. llvm-svn: 280619
*	[LCG] A NFC refactoring to extract the logic for doing	Chandler Carruth	2016-09-04	1	-111/+184
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	a postorder-sequence based update after edge insertion into a generic helper function. This separates the SCC-specific logic into two fairly simple lambdas and extracts the rest into a generic helper template function. I think this is a net win on its own merits because it disentangles different pieces of the algorithm. Now there is one place that does the two-step partition to identify a set of newly connected components and at the same time update the postorder sequence. However, I'm also hoping to re-use this in an upcoming patch to update a cached post-order sequence of RefSCCs when doing the analogous update to the RefSCC graph, and I don't want to have two copies. The diff is quite messy but this really is just moving things around and making types generic rather than specific. llvm-svn: 280618
*	[InstCombine] Preserve llvm.mem.parallel_loop_access metadata when replacing	Dorit Nuzman	2016-09-04	2	-0/+67
\| \| \| \| \| \| \| \| \| \| \| \|	memcpy with ld/st. When InstCombine replaces a memcpy with loads+stores it does not copy over the llvm.mem.parallel_loop_access from the memcpy instruction. This patch fixes that. Differential Revision: https://reviews.llvm.org/D23499 llvm-svn: 280617
*	[ExecutionEngine] Move ObjectCache::anchor from MCJIT to ExecutionEngine.	Lang Hames	2016-09-04	2	-2/+3
\| \| \| \| \| \| \| \|	ObjectCache is an ExecutionEngine utility, so its anchor belongs there. The practical impact of this change is that ORC users no longer need to link MCJIT to use ObjectCaches. llvm-svn: 280616
*	Test commit.	Dorit Nuzman	2016-09-04	1	-0/+1
\| \| \| \|	llvm-svn: 280615
*	[PowerPC] Zero-extend constants in FastISel	Hal Finkel	2016-09-04	2	-4/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As it turns out, whether we zero-extend or sign-extend i8/i16 constants, which are illegal types promoted to i32 on PowerPC, is a choice constrained by assumptions within the infrastructure. Specifically, the logic in FunctionLoweringInfo::ComputePHILiveOutRegInfo assumes that constant PHI operands will be zero extended, and so, at least when materializing constants that are PHI operands, we must do the same. The rest of our fast-isel implementation does not appear to depend on the fact that we were sign-extending i8/i16 constants, and all other targets also appear to zero-extend small-bitwidth constants in fast-isel; we'll now do the same (we had been doing this only for i1 constants, and sign-extending the others). Fixes PR27721. llvm-svn: 280614
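The difference between the two extensions is easy to see on a single i8 constant. The helpers below are a hypothetical illustration of the arithmetic, not FastISel code; `upper_bits_are_zero` stands in for the kind of assumption the commit attributes to ComputePHILiveOutRegInfo.

```python
def zero_extend(value, from_bits):
    """Widen by filling the new high bits with zeros."""
    return value & ((1 << from_bits) - 1)


def sign_extend(value, from_bits, to_bits):
    """Widen by copying the sign bit into the new high bits."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):
        value -= 1 << from_bits           # reinterpret as a negative number
    return value & ((1 << to_bits) - 1)   # two's-complement bit pattern


def upper_bits_are_zero(register, from_bits):
    """What a consumer assuming zero-extension expects of the register."""
    return register >> from_bits == 0
```

For the i8 constant 0xFF promoted to i32, `zero_extend` gives 0x000000FF while `sign_extend` gives 0xFFFFFFFF. A consumer that assumes the upper 24 bits are known zero is only correct in the first case, which is why the constant materialization has to match that assumption.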
*	[Modules] Add 'freestanding' to the 'requires-declaration' feature-list.	Elad Cohen	2016-09-04	4	-0/+11
\| \| \| \| \| \| \| \| \|	This adds support for modules that require (non-)freestanding environment, such as the compiler builtin mm_malloc submodule. Differential Revision: https://reviews.llvm.org/D23871 llvm-svn: 280613
*	Apply curr_symbol.pass.cpp test fix to missed test case	Eric Fiselier	2016-09-04	1	-1/+6
\| \| \| \|	llvm-svn: 280612
*	[AVX-512] Remove masked integer add/sub/mull intrinsics and upgrade to ↵	Craig Topper	2016-09-04	11	-1968/+1853
\| \| \| \| \| \|	native IR. llvm-svn: 280611
*	Fix inliner funclet unwind memoization	Joseph Tremoulet	2016-09-04	2	-11/+303
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The inliner may need to determine where a given funclet unwinds to, and this determination may depend on other funclets throughout the funclet tree. The code that performs this walk in getUnwindDestToken memoizes results to avoid redundant computations. In the case that a funclet's unwind destination is derived from its ancestor, there's code to walk back down the tree from the ancestor updating the memo map of its descendants to record the unwind destination. This change fixes that code to account for the case that some descendant has a different unwind destination, which can happen if that unwind dest is a descendant of the EHPad being queried and thus didn't determine its unwind destination. Also update test inline-funclets.ll, which is supposed to cover such scenarios, to include a case that fails an assertion without this fix but passes with it. Fixes PR29151. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24117 llvm-svn: 280610
*	Trailing dot that shouldn't have been committed.	Joerg Sonnenberger	2016-09-04	1	-1/+1
\| \| \| \|	llvm-svn: 280609
*	Fix bad locale test data when using the newest glibc	Eric Fiselier	2016-09-04	2	-0/+15
\| \| \| \|	llvm-svn: 280608
*	PR 27200: Fix names of the atomic lock-free macros.	Joerg Sonnenberger	2016-09-04	1	-6/+6
\| \| \| \|	llvm-svn: 280607
*	XFAIL TestGdbRemoteExitCode failing tests	Todd Fiala	2016-09-04	1	-0/+2
\| \| \| \| \| \| \|	Tracked by: llvm.org/pr30271 llvm-svn: 280606
*	Mark test as XFAIL for C++03, rather than providing a dummy pass.	Marshall Clow	2016-09-04	1	-5/+2
\| \| \| \|	llvm-svn: 280605
*	[NFC] Darwin llgs support from Week of Code	Todd Fiala	2016-09-04	35	-128/+6319
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This code represents the Week of Code work I did on bringing up lldb-server LLGS support for Darwin. It does not include the Xcode project changes needed, as we don't want to throw that switch until more support is implemented (i.e. this change is inert, no build systems use it yet. I've verified on Ubuntu 16.04, macOS Xcode and macOS cmake builds). This change does some minimal refactoring of code that is shared with the Linux LLGS portion, moving it from NativeProcessLinux into NativeProcessProtocol. That code is also used by NativeProcessDarwin. Current state on Darwin:
  * Process launching is implemented. (Attach is not). Launching on devices has not yet been tested (FBS/BKS might need a bit of work).
  * Inferior waitpid monitoring and communication of exit status via MainLoop callback is implemented.
  * Memory read/write, breakpoints, thread register context, etc. are not yet implemented. This impacts process stop/resume, as the initial launch suspended immediately starts the process up and running because it doesn't know it is supposed to remain stopped.
  * I implemented the equivalent of MachThreadList as NativeThreadListDarwin, in anticipation that we might want to factor out common parts into NativeThreadList{Protocol} and share some code here. After writing it, though, the fallout from merging Mach Task/Process into a single concept plus some other minor changes makes the whole NativeThreadListDarwin concept nothing more than dead weight. I am likely going to get rid of this class and just manage it directly in NativeProcessDarwin, much like I did for NativeProcessLinux.
  * There is a stub-out call for starting a STDIO thread. That will go away and adopt the MainLoop pselect-based IOObject reading.
I am developing the fully-integrated changes in the following repo, which contains the necessary Xcode bits and the glue that enables lldb-debugserver on a macOS system: https://github.com/tfiala/lldb/tree/llgs-darwin This change also breaks out a few of the lldb-server tests into their own directory, and adds some $qHostInfo tests (not sure why I didn't write those tests back when I initially implemented that on the Linux side). llvm-svn: 280604
*	[X86] Combine some of the strings in autoupgrade code.	Craig Topper	2016-09-03	1	-35/+7
\| \| \| \|	llvm-svn: 280603
*	Cleanup : Use metadata preserving API for branch creation	Xinliang David Li	2016-09-03	1	-9/+4
\| \| \| \| \| \| \|	Use the wrapper API in IRBuilder that copies metadata to create the new branch in LoopUnswitch. llvm-svn: 280602
*	ScopInfo: Do not derive assumptions from all GEP pointer instructions	Tobias Grosser	2016-09-03	2	-107/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	... but instead rely on the assumptions that we derive for load/store instructions. Before we were able to delinearize arrays, we used GEP pointer instructions to derive information about the likely range of induction variables, which gave us more freedom during loop scheduling. Today, this is not needed any more as we delinearize multi-dimensional memory accesses and as part of this process also "assume" that all accesses to these arrays remain inbounds. The old derive-assumptions-from-GEP code has consequently become mostly redundant. We drop it both to clean up our code and to improve compile time. This change reduces the scop construction time for 3mm in no-asserts mode on my machine from 48 to 37 ms. llvm-svn: 280601
*	[Profile] preserve branch metadata lowering select in CGP	Xinliang David Li	2016-09-03	4	-8/+42
\| \| \| \| \| \| \| \| \| \|	CGP currently drops select's MD_prof profile data when generating a conditional branch, which can lead to bad code layout. The patch fixes the issue. Differential Revision: http://reviews.llvm.org/D24169 llvm-svn: 280600
*	Fix ThinLTO crash with debug info	Mehdi Amini	2016-09-03	4	-0/+87
\| \| \| \| \| \| \| \| \| \| \| \|	Because of the recent change to ODR type uniquing in the context, we can reach types defined in another module during IR linking. This triggered some assertions when we IR link without starting from an empty module. To alleviate that, we can self-map metadata defined in the destination module so that it won't be visited. Differential Revision: https://reviews.llvm.org/D23841 llvm-svn: 280599
*	Strip trailing whitespace	Simon Pilgrim	2016-09-03	1	-2/+2
\| \| \| \|	llvm-svn: 280598
*	[AVX-512] Remove masked integer mullo builtins and replace with native IR.	Craig Topper	2016-09-03	13	-126/+108
\| \| \| \|	llvm-svn: 280597
*	[AVX-512] Remove masked integer add/sub builtins and replace with native IR.	Craig Topper	2016-09-03	9	-341/+290
\| \| \| \|	llvm-svn: 280596
*	AMDGPU: Set sizes of spill pseudos	Matt Arsenault	2016-09-03	3	-3/+13
\| \| \| \|	llvm-svn: 280595
*	AMDGPU: Fix adding duplicate implicit exec uses	Matt Arsenault	2016-09-03	1	-1/+15
\| \| \| \| \| \| \| \|	I'm not sure if this should be considered a bug in copyImplicitOps or not, but implicit operands that are part of the static instruction definition should not be copied. llvm-svn: 280594
*	[AVX-512] Add integer ADD/SUB instructions to load folding tables. Add an ↵	Craig Topper	2016-09-03	3	-0/+498
\| \| \| \| \| \|	AVX512 stack folding test. llvm-svn: 280593
*	[AVX-512] Mark EVEX encoded vpcmpeq as commutable just like its AVX and SSE ↵	Craig Topper	2016-09-03	1	-21/+23
\| \| \| \| \| \|	equivalent. llvm-svn: 280592
*	Fix the attribute documentation build.	Aaron Ballman	2016-09-03	1	-0/+1
\| \| \| \|	llvm-svn: 280591
*	AMDGPU: Reduce the duration of whole-quad-mode	Nicolai Haehnle	2016-09-03	2	-82/+145
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This contains two changes that reduce the time spent in WQM, with the intention of reducing bandwidth required by VMEM loads: 1. Sampling instructions by themselves don't need to run in WQM, only their coordinate inputs need it (unless of course there is a dependent sampling instruction). The initial scanInstructions step is modified accordingly. 2. When switching back from WQM to Exact, switch back as soon as possible. This affects the logic in processBlock. This should always be a win or at worst neutral. There are also some cleanups (e.g. remove unused ExecExports) and some new debugging output. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D22092 llvm-svn: 280590
*	AMDGPU: Fix an interaction between WQM and polygon stippling	Nicolai Haehnle	2016-09-03	3	-11/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fixes a rare bug in polygon stippling with non-monolithic pixel shaders. The underlying problem is as follows: the prolog part contains the polygon stippling sequence, i.e. a kill. The main part then enables WQM based on the _reduced_ exec mask, effectively undoing most of the polygon stippling. Since we cannot know whether polygon stippling will be used, the main part of a non-monolithic shader must always return to exact mode to fix this problem. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23131 llvm-svn: 280589
*	Fix PR30202 - notify_all_at_thread_exit seg faults if run from a raw pthread ↵	Eric Fiselier	2016-09-03	3	-16/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	context. Summary: This patch allows threads not created using `std::thread` to use `std::notify_all_at_thread_exit` by ensuring the TL state has been initialized within `std::notify_all_at_thread_exit`. Additionally this patch "fixes" a potential oddity in `__thread_local_pointer::reset(pointer)`, which would previously delete the old thread local data. However there should never be old thread local data because pthread should null it out on thread exit. Unfortunately it's possible that pthread failed to do this according to the spec:
> Upon key creation, the value NULL shall be associated with the new key in all active threads. Upon thread creation, the value NULL shall be associated with all defined keys in the new thread.
>
> An optional destructor function may be associated with each key value. At thread exit, if a key value has a non-NULL destructor pointer, and the thread has a non-NULL value associated with that key, the value of the key is set to NULL, and then the function pointed to is called with the previously associated value as its sole argument. The order of destructor calls is unspecified if more than one destructor exists for a thread when it exits.
>
> If, after all the destructors have been called for all non-NULL values with associated destructors, there are still some non-NULL values with associated destructors, then the process is repeated. If, after at least {PTHREAD_DESTRUCTOR_ITERATIONS} iterations of destructor calls for outstanding non-NULL values, there are still some non-NULL values with associated destructors, implementations may stop calling destructors, or they may continue calling destructors until no non-NULL values with associated destructors exist, even though this might result in an infinite loop.
However if pthread fails to delete the value it is probably incorrect for us to do it. Destroying the value performs all of the "at thread exit" actions registered with it but we are way past "at thread exit". Reviewers: mclow.lists, bcraig, EricWF Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D24159 llvm-svn: 280588
*	Replace the Radeon GCN GPU family names by more descriptive ones	Niels Ole Salscheider	2016-09-03	1	-25/+25
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D23957 llvm-svn: 280587
*	AMDGPU: Do basic folding of class intrinsic	Matt Arsenault	2016-09-03	2	-0/+316
\| \| \| \| \| \| \|	This allows more of the OCML builtin library to be constant folded. llvm-svn: 280586