bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	R600: Implement 64bit SHL	Jan Vesely	2014-06-18	2	-0/+42
\| \| \| \| \| \| \|	v2: Use c++ style comment Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 211157
*	[msan] Handle X86 .psad. and .pmadd. intrinsics.	Evgeniy Stepanov	2014-06-18	1	-0/+55
\| \| \| \|	llvm-svn: 211156
*	DAG: move sret demotion into most basic LowerCallTo implementation.	Tim Northover	2014-06-18	1	-119/+124
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It looks like there are two versions of LowerCallTo here: the SelectionDAGBuilder one is designed to operate on LLVM IR, and the TargetLowering one in the case where everything is at DAG level. Previously, only the SelectionDAGBuilder variant could handle demoting an impossible return to sret semantics (before delegating to the TargetLowering version), but this functionality is also useful for certain libcalls (e.g. 128-bit operations on 32-bit x86). So this commit moves the sret handling down a level. rdar://problem/17242889 llvm-svn: 211155
*	Revert "Random Number Generator (llvm)"	JF Bastien	2014-06-18	3	-67/+1
\| \| \| \| \| \| \| \|	This reverts commit cccba093090d127e0b6d17473b14c264c14c5259. It causes build breakage. llvm-svn: 211146
*	Random Number Generator (llvm)	JF Bastien	2014-06-18	3	-1/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Provides an abstraction for a random number generator (RNG) that produces a stream of pseudo-random numbers. The current implementation uses C++11 facilities and is therefore not cryptographically secure. The RNG is salted with the text of the current command line invocation. In addition, a user may specify a seed (reproducible builds). In clang, the seed can be set via -frandom-seed=X In the back end, the seed can be set via -rng-seed=X This is the llvm part of the patch. clang part: D3391 Reviewers: ahomescu, rinon, nicholas, jfb Reviewed By: jfb Subscribers: jfb, perl Differential Revision: http://reviews.llvm.org/D3390 llvm-svn: 211145
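The seeding scheme described in the summary (a user seed combined with a salt taken from the command line) can be sketched with standard C++11 facilities. This is only an illustration: the helper names `makeSaltedEngine` and `nextBelow` are hypothetical, and the choice of `std::mt19937_64` with `std::seed_seq` is an assumption, since the log only says that "C++11 facilities" are used.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <string>
#include <vector>

// Hypothetical helper: builds a deterministic engine from a user seed plus a
// salt string (for example, the text of the command line).
inline std::mt19937_64 makeSaltedEngine(std::uint64_t seed,
                                        const std::string &salt) {
  std::vector<std::uint32_t> data;
  data.push_back(static_cast<std::uint32_t>(seed));        // low 32 bits
  data.push_back(static_cast<std::uint32_t>(seed >> 32));  // high 32 bits
  for (unsigned char c : salt)
    data.push_back(c);  // one word per salt character
  std::seed_seq seq(data.begin(), data.end());
  return std::mt19937_64(seq);
}

// Draws a value in [0, bound) from the engine.
inline std::uint64_t nextBelow(std::mt19937_64 &engine, std::uint64_t bound) {
  assert(bound > 0 && "bound must be positive");
  std::uniform_int_distribution<std::uint64_t> dist(0, bound - 1);
  return dist(engine);
}
```

With the same seed and the same salt, two engines built this way produce the same stream, which is the property that makes seeded builds reproducible.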
*	[AArch64] Fix a pattern match failure caused by creating improper CONCAT_VECTOR.	Kevin Qin	2014-06-18	1	-27/+39
\| \| \| \| \| \| \| \| \|	ReconstructShuffle() may wrongly create a CONCAT_VECTOR when trying to concat two v2i32 into v4i16. This commit fixes the issue and tries to generate UZP1 instead of lots of MOV and INS. The patch was initiated by Kevin Qin and refactored by Tim Northover. llvm-svn: 211144
*	Replace some assert(0)'s with llvm_unreachable.	Craig Topper	2014-06-18	18	-26/+27
\| \| \| \|	llvm-svn: 211141
*	Allow X86FastIsel to cope with 64 bit absolute relocations	Louis Gerbarg	2014-06-17	1	-10/+12
\| \| \| \| \| \| \| \| \| \| \| \|	This patch is a follow-up to r211040 & r211052. Rather than bailing out of fast isel, this patch generates an alternate instruction (movabsq) instead of the leaq. While this will always have enough room to handle the 64-bit displacement, it is generally overkill for internal symbols (most displacements will be within 32 bits), but we have no way of communicating the code model to the assembler in order to avoid flagging an absolute leal/leaq as illegal when using a symbolic displacement. llvm-svn: 211130
*	[FastISel][X86] Optimize predicates and fold CMP instructions.	Juergen Ributzka	2014-06-17	1	-13/+109
\| \| \| \| \| \| \| \| \|	This optimizes predicates for certain compares, such as fcmp oeq %x, %x to fcmp ord %x, %x. The latter one is more efficient to generate. The same optimization is applied to conditional branches. llvm-svn: 211126
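The equivalence behind this rewrite can be checked from C++: an ordered-equal comparison of a value with itself is true exactly when the value is not NaN, which is what an "ordered" self-comparison tests. A minimal sketch, assuming IEEE-754 semantics (no fast-math flags); the helper names are hypothetical and only mirror the two IR predicates.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Mirrors `fcmp oeq %x, %x`: ordered and equal.
inline bool selfOrderedEqual(double x) { return x == x; }

// Mirrors `fcmp ord %x, %x`: the operand is not NaN.
inline bool selfOrdered(double x) { return !std::isnan(x); }

// Asserts that both predicates agree on a few interesting values.
inline void checkPredicatesAgree() {
  const double samples[] = {0.0,
                            -0.0,
                            1.5,
                            std::numeric_limits<double>::infinity(),
                            -std::numeric_limits<double>::infinity(),
                            std::numeric_limits<double>::quiet_NaN(),
                            std::numeric_limits<double>::denorm_min()};
  for (double x : samples)
    assert(selfOrderedEqual(x) == selfOrdered(x));
}
```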
*	Remove more occurrences of the unused-mutex-parameter pattern.	Zachary Turner	2014-06-17	1	-16/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This pattern loses some of its usefulness when the mutex type is statically polymorphic as opposed to runtime polymorphic, as swapping out the mutex type requires changing a significant number of function parameters, and templatizing the function parameter requires the methods to be defined in the headers. Furthermore, if LLVM is compiled with threads disabled then there may even be no mutex to acquire anyway, so it should not be up to individual APIs to know whether or not acquiring a mutex is required to use those APIs to begin with. It should be up to the user of the API. llvm-svn: 211125
*	R600/SI: Make sure target flags are set on pseudo VOP3 instructions	Tom Stellard	2014-06-17	2	-14/+14
\| \| \| \|	llvm-svn: 211120
*	Merge lib/Support/WindowsError.cpp into lib/Support/ErrorHandling.cpp.	Rafael Espindola	2014-06-17	3	-92/+69
\| \| \| \| \| \| \|	The OSX ranlib warns on files with no symbols, and lib/Support/WindowsError.cpp was empty when building on non-Windows. llvm-svn: 211118
*	R600/SI: Match cttz_zero_undef	Matt Arsenault	2014-06-17	2	-1/+6
\| \| \| \|	llvm-svn: 211116
*	R600/SI: Match ctlz_zero_undef	Matt Arsenault	2014-06-17	3	-3/+8
\| \| \| \|	llvm-svn: 211115
*	R600: Use LDS and vectors for private memory	Tom Stellard	2014-06-17	17	-18/+661
\| \| \| \|	llvm-svn: 211110
*	R600/SI: Add a pattern for llvm.AMDGPU.barrier.global	Tom Stellard	2014-06-17	3	-1/+16
\| \| \| \|	llvm-svn: 211109
*	SelectionDAG: Expand i64 = FP_TO_SINT i32	Tom Stellard	2014-06-17	2	-0/+60
\| \| \| \|	llvm-svn: 211108
*	R600/SI: Re-initialize the m0 register after using it for indirect addressing	Tom Stellard	2014-06-17	1	-37/+50
\| \| \| \| \| \| \| \| \| \| \| \|	We need to store a value greater than or equal to the number of LDS bytes allocated by the shader in the m0 register in order for LDS instructions to work correctly. We always initialize m0 at the beginning of a shader, but this register is also used for indirect addressing offsets, so we need to re-initialize it any time we use indirect addressing. llvm-svn: 211107
*	[FastISel][X86] Fix previous refactoring commit (r211077)	Juergen Ributzka	2014-06-17	1	-4/+4
\| \| \| \| \| \| \|	Overlooked that fcmp_une uses an "or" instead of an "and" for combining the flags. llvm-svn: 211104
*	Fixed jump threading going to infinite loop.	Dinesh Dwivedi	2014-06-17	1	-0/+3
\| \| \| \| \| \| \| \| \|	This patch adds code to remove unreachable blocks from the function, as they may cause jump threading to get stuck in an infinite loop. Differential Revision: http://reviews.llvm.org/D3991 llvm-svn: 211103
*	Move SetTheory from utils/TableGen into lib/TableGen so Clang can use it.	James Molloy	2014-06-17	2	-0/+324
\| \| \| \|	llvm-svn: 211100
*	Fix memory leak of RegScavenger accidentally added in r211037.	James Molloy	2014-06-17	1	-1/+3
\| \| \| \|	llvm-svn: 211097
*	AArch64: estimate inline asm length during branch relaxation	Tim Northover	2014-06-17	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \|	To make sure branches are in range, we need to do a better job of estimating the length of an inline assembly block than "it's probably 1 instruction, who'd write asm with more than that?". Fortunately there's already a (highly suspect, see how many ways you can think of to break it!) callback for this purpose, which is used by the other targets. rdar://problem/17277590 llvm-svn: 211095
*	[msan] Fix a comment.	Evgeniy Stepanov	2014-06-17	1	-2/+2
\| \| \| \|	llvm-svn: 211094
*	[msan] Fix handling of multiplication by a constant with a number of trailing zeroes.	Evgeniy Stepanov	2014-06-17	1	-1/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Multiplication by an integer with a number of trailing zero bits leaves the same number of lower bits of the result initialized to zero. This change makes MSan take this into account in the case of multiplication by a compile-time constant. We don't handle the general, non-constant case because (a) it's not going to be cheap (computation-wise); (b) multiplication by a partially uninitialized value in user code is a bad idea anyway. The constant case must be handled because it arises from LLVM optimization of completely valid user code, as the test case in compiler-rt demonstrates. llvm-svn: 211092
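The property stated above can be sketched as a shadow rule, where a set shadow bit means "uninitialized". The sketch below assumes a rule that shifts the operand's shadow left by the number of trailing zeros of the constant. That leaves the low bits of the result marked as initialized, which is all the message claims. The function names are hypothetical, and this is not presented as MSan's actual code.

```cpp
#include <cassert>
#include <cstdint>

// Number of trailing zero bits in c (c must be non-zero).
inline unsigned trailingZeros(std::uint32_t c) {
  assert(c != 0 && "trailing-zero count is not defined for 0 in this sketch");
  unsigned n = 0;
  while ((c & 1u) == 0) {
    c >>= 1;
    ++n;
  }
  return n;
}

// Hypothetical shadow rule for `a * c` with a compile-time constant c.
// A set bit in the shadow means the corresponding result bit is uninitialized.
inline std::uint32_t mulByConstShadow(std::uint32_t shadowA, std::uint32_t c) {
  if (c == 0)
    return 0;  // a * 0 is always 0, so every result bit is initialized
  // The low trailingZeros(c) bits of a * c are always zero, so their shadow
  // bits are cleared by the shift.
  return shadowA << trailingZeros(c);
}
```

For example, with `c = 8` the three lowest result bits are zero for any `a`, so a fully uninitialized operand (`0xFFFFFFFF`) yields the shadow `0xFFFFFFF8`.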
*	Support: Inject LLVM_VERSION_INFO into the Support library	Justin Bogner	2014-06-17	1	-0/+4
\| \| \| \| \| \| \| \|	Mimic r116632 in passing LLVM_VERSION_INFO from the Makefile build system to the build. This improves the -version output of tools that use llvm::cl under the configure+make system. llvm-svn: 211091
*	tools: Add a space between package version and LLVM_VERSION_INFO	Justin Bogner	2014-06-17	1	-1/+1
\| \| \| \| \| \|	This reads a little strangely. Add a space to clean it up. llvm-svn: 211090
*	Convert a few loops to use ranges.	Rafael Espindola	2014-06-17	1	-18/+15
\| \| \| \|	llvm-svn: 211089
*	Add an overload for SourceMgr::PrintMessage that takes an existing diagnostic.	Jordan Rose	2014-06-17	1	-8/+11
\| \| \| \|	llvm-svn: 211087
*	Modernize doc comments for SourceMgr.	Jordan Rose	2014-06-17	1	-12/+0
\| \| \| \| \| \|	No functionality change. llvm-svn: 211086
*	[InstCombine] mark ADD with nuw if no unsigned overflow	Jingyue Wu	2014-06-17	2	-0/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: As a starting step, we only use one simple heuristic: if the sign bits of both a and b are zero, we can prove that "add a, b" does not overflow unsigned, and thus convert it to "add nuw a, b". Updated all affected tests and added two new tests (@zero_sign_bit and @zero_sign_bit2) in AddOverflow.ll. Test Plan: make check-all Reviewers: eliben, rafael, meheff, chandlerc Reviewed By: chandlerc Subscribers: chandlerc, llvm-commits Differential Revision: http://reviews.llvm.org/D4144 llvm-svn: 211084
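The heuristic is easy to confirm exhaustively at a small bit width: if both operands have a zero sign bit, each is at most half the unsigned range, so their sum cannot wrap. The following sketch checks this for every pair of 8-bit values; the helper names are hypothetical, and 8 bits is chosen only to keep the check exhaustive.

```cpp
#include <cassert>
#include <cstdint>

// True when the most significant (sign) bit of an 8-bit value is zero.
inline bool signBitIsZero(std::uint8_t v) { return (v & 0x80u) == 0; }

// True when a + b wraps around in 8-bit unsigned arithmetic.
inline bool addOverflowsUnsigned(std::uint8_t a, std::uint8_t b) {
  return static_cast<unsigned>(a) + static_cast<unsigned>(b) > 0xFFu;
}

// Asserts the heuristic for all 65536 pairs of 8-bit operands.
inline void checkZeroSignBitHeuristic() {
  for (unsigned a = 0; a < 256; ++a) {
    for (unsigned b = 0; b < 256; ++b) {
      const std::uint8_t x = static_cast<std::uint8_t>(a);
      const std::uint8_t y = static_cast<std::uint8_t>(b);
      if (signBitIsZero(x) && signBitIsZero(y))
        assert(!addOverflowsUnsigned(x, y));
    }
  }
}
```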
*	SROA: Only split loads on byte boundaries	Duncan P. N. Exon Smith	2014-06-17	1	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \| \|	r199771 accidentally broke the logic that makes sure that SROA only splits loads on byte boundaries. If such a split happens, some bits get lost when reassembling loads of wider types, causing data corruption. Move the width check up to reject such splits early, avoiding the corruption. Fixes PR19250. Patch by: Björn Steinbrink <bsteinbr@gmail.com> llvm-svn: 211082
*	[FastISel][X86] Refactor the code to get the X86 condition from a helper function. NFC.	Juergen Ributzka	2014-06-16	3	-96/+110
\| \| \| \| \| \| \| \| \|	Make use of helper functions to simplify the branch and compare instruction selection in FastISel. Also add test cases for compare and conditional branch. llvm-svn: 211077
*	Teach LoopUnrollPass to respect loop unrolling hints in metadata.	Eli Bendersky	2014-06-16	1	-87/+275
\| \| \| \| \| \| \| \| \| \| \| \| \|	[This is resubmitting r210721, which was reverted due to suspected breakage which turned out to be unrelated]. Some extra review comments were addressed. See D4090 and D4147 for more details. The Clang change that produces this metadata was committed in r210667. Patch by Mark Heffernan. llvm-svn: 211076
*	Revert r211066, 211067, 211068, 211069, 211070.	Zachary Turner	2014-06-16	4	-29/+48
\| \| \| \| \| \| \|	These were committed accidentally from the wrong branch before having a review sign-off. llvm-svn: 211072
*	Cleanup more unreferenced MutexGuard parameters on functions.	Zachary Turner	2014-06-16	3	-29/+29
\| \| \| \| \| \| \| \| \| \| \|	These parameters are intended to serve as sort of a contract that you cannot access the functions outside of a mutex. However, the entire JIT class cannot be accessed outside of a mutex anyway, and all methods acquire a lock as soon as they are entered. Since the containing class already is not intended to be thread-safe, it only serves to add code clutter. llvm-svn: 211071
*	Kill the LLVM global lock.	Zachary Turner	2014-06-16	3	-7/+23
\| \| \| \|	llvm-svn: 211069
*	Remove some code churn.	Zachary Turner	2014-06-16	1	-1/+1
\| \| \| \|	llvm-svn: 211068
*	Remove some more code out into a separate CL.	Zachary Turner	2014-06-16	4	-32/+6
\| \| \| \|	llvm-svn: 211067
*	Users of the llvm global mutex must now acquire it manually.	Zachary Turner	2014-06-16	3	-17/+8
\| \| \| \| \| \|	This allows the mutex to be acquired in a guarded, RAII fashion. llvm-svn: 211066
*	Add load/store functionality	Reed Kotler	2014-06-16	1	-7/+93
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch allows non-conversions like i1=i2; where both are global ints. In addition, arithmetic and other things start to work, since fast-isel will use existing non-fast-isel patterns from the tablegen files where applicable. In addition, i8 and i16 will work in this limited context for assignment without the need for extension (zero or sign). It does not matter how i8 or i16 are loaded (zero or sign extended), since only the 8 or 16 relevant bits are used and clang will ask for sign extension before using them in arithmetic. This is all made more complete in forthcoming patches. For example: int i, j=1, k=3; void foo() { i = j + k; } Keep in mind that this pass is not enabled right now and is experimental. It can only be enabled with a hidden llvm option, -mips-fast-isel. Test Plan: Run test-suite, loadstore2.ll and I will run some executable tests. Reviewers: dsanders Subscribers: mcrosier Differential Revision: http://reviews.llvm.org/D3856 llvm-svn: 211061
*	AArch64: Add backend intrinsic for rbit.	Jim Grosbach	2014-06-16	1	-0/+4
\| \| \| \| \| \| \| \| \|	Define an intrinsic for the frontend to use and pattern match it to the RBIT instruction. rdar://9283021 llvm-svn: 211058
*	ARM: intrinsic support for rbit.	Jim Grosbach	2014-06-16	1	-0/+5
\| \| \| \| \| \| \| \| \|	We already have an ARMISD node. Create an intrinsic to map to it so we can add support for the frontend __rbit() intrinsic. rdar://9283021 llvm-svn: 211057
*	[PPC64] Fix PR19893 - improve code generation for local function addresses	Bill Schmidt	2014-06-16	3	-21/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rafael opened http://llvm.org/bugs/show_bug.cgi?id=19893 to track non-optimal code generation for forming a function address that is local to the compile unit. The existing code was treating both local and non-local functions identically. This patch fixes the problem by properly identifying local functions and generating the proper addis/addi code. I also noticed that Rafael's earlier changes to correct the surrounding code in PPCISelLowering.cpp were also needed for fast instruction selection in PPCFastISel.cpp, so this patch fixes that code as well. The existing test/CodeGen/PowerPC/func-addr.ll is modified to test the new code generation. I've added a -O0 run line to test the fast-isel code as well. Tested on powerpc64[le]-unknown-linux-gnu with no regressions. llvm-svn: 211056
*	Since the DataLayout is always found off of the subtarget go ahead	Eric Christopher	2014-06-16	1	-7/+3
\| \| \| \| \| \|	and query the base target machine implementation for it. llvm-svn: 211055
*	Clean up some unnecessary mutex guards.	Zachary Turner	2014-06-16	1	-25/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These were being used as unreferenced parameters to enforce that the methods must not be called without holding a mutex, but all of the methods in question were internal, and the methods were only exposed through an interface whose entire purpose was to serialize access to these structures, so expecting the methods to be accessed under a mutex is reasonable enough. Reviewed by: blaikie Differential Revision: http://reviews.llvm.org/D4162 llvm-svn: 211054
*	Improve comments for r211040	Louis Gerbarg	2014-06-16	1	-1/+4
\| \| \| \| \| \| \| \|	Added a comment to clarify why r211040 chooses to bail out of fast isel instead of generating a more complicated relocation, and fixed a mislabelled register in the comments of the asan test case. llvm-svn: 211052
*	ARM: implement correct atomic operations on v7M	Tim Northover	2014-06-16	1	-8/+14
\| \| \| \| \| \| \| \| \| \|	ARM v7M has ldrex/strex but not ldrexd/strexd. This means 32-bit operations should work as normal, but 64-bit ones are almost certainly doomed. Patch by Phoebe Buckheister. llvm-svn: 211042
*	Fix illegal relocations in X86FastISel	Louis Gerbarg	2014-06-16	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \|	On x86_64 the lea instruction can only use a 32-bit immediate value. When the code is compiled statically the RIP register is not used, meaning the immediate is all that can be used for the relocation, which is not sufficient in the case of targets more than +/- 2GB away. This patch bails out of fast isel in those cases and reverts to DAG, which does the right thing. Test case included. llvm-svn: 211040
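The +/- 2GB limit comes from the displacement being a sign-extended 32-bit field. The sketch below only illustrates that numeric range; the compiler itself usually cannot see the final address at instruction selection time and has to decide from the code model. Both helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True when `disp` can be encoded as a sign-extended 32-bit displacement.
inline bool fitsInSigned32(std::int64_t disp) {
  return disp >= std::numeric_limits<std::int32_t>::min() &&
         disp <= std::numeric_limits<std::int32_t>::max();
}

// Narrows a displacement that is already known to fit in 32 bits.
inline std::int32_t encodeDisp32(std::int64_t disp) {
  assert(fitsInSigned32(disp) && "displacement is out of the +/- 2GB range");
  return static_cast<std::int32_t>(disp);
}
```

A target exactly 2GB above the reference point (`1 << 31`) already falls outside the encodable range, while one exactly 2GB below it (`-(1 << 31)`) still fits.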
*	LowerSwitch: track bounding range for the condition tree.	Jim Grosbach	2014-06-16	1	-27/+102
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When LowerSwitch transforms a switch instruction into a tree of ifs, it is actually performing a binary search over the case ranges to see whether the current value falls into one case's range of values. So, if we have a program with something like this:
	    switch (a) {
	    case 0: do0(); break;
	    case 1: do1(); break;
	    case 2: do2(); break;
	    default: break;
	    }
	the code produced is something like this:
	    if (a < 1) {
	      if (a == 0) { do0(); }
	    } else {
	      if (a < 2) {
	        if (a == 1) { do1(); }
	      } else {
	        if (a == 2) { do2(); }
	      }
	    }
	This code is inefficient because the check (a == 1) to execute do1() is not needed. Having already established (a >= 1) by failing the first test, the check (a < 2) is enough to infer (a == 1), without spawning an extra basic block to test (a == 1) explicitly.
	The patch addresses this problem by keeping track of already checked bounds in the LowerSwitch algorithm, so that when the time arrives to produce a Leaf Block that checks the equality with the case value / range, the algorithm can decide if that block is really needed depending on the already checked bounds.
	For example, the above with "a = 1" would work like this: the bounds start as LB: NONE, UB: NONE. As (a < 1) is emitted, the bounds for the else path become LB: 1, UB: NONE, because by failing the test (a < 1) we know that "a" cannot be smaller than 1 when we enter the else branch. After emitting the check (a < 2), the bounds in the if branch become LB: 1, UB: 1, because checking that "a" is smaller than 2 makes the upper bound 2 - 1 = 1.
	When it is time to emit the leaf block for "case 1:" we notice that 1 is squeezed exactly between the LB and UB, which means that if we arrive at that block there is no need to emit a block that checks (a == 1). Patch by: Marcello Maggioni <hayarms@gmail.com> llvm-svn: 211038
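The bounds tracking described in this message can be sketched as a small recursive function that counts how many leaf equality checks would still be emitted. This is a simplified model and not the LowerSwitch code itself: it assumes sorted, distinct single-value cases (no case ranges) and a split at the middle element. `Bound` and `countLeafChecks` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// An inclusive bound that may or may not be known yet (NONE in the message).
struct Bound {
  bool known;
  int value;
};

// Counts the leaf equality checks still needed for cases[first, last), given
// the bounds already proven on the path to this node of the if-tree.
inline int countLeafChecks(const std::vector<int> &cases, std::size_t first,
                           std::size_t last, Bound lb, Bound ub) {
  assert(first < last && last <= cases.size());
  if (last - first == 1) {
    const int v = cases[first];
    // The check is redundant only when LB == UB == case value.
    const bool pinned = lb.known && ub.known && lb.value == v && ub.value == v;
    return pinned ? 0 : 1;
  }
  const std::size_t mid = first + (last - first) / 2;
  const int pivot = cases[mid];
  // Branch taken when (a < pivot): the upper bound becomes pivot - 1.
  const int left = countLeafChecks(cases, first, mid, lb, Bound{true, pivot - 1});
  // Else branch: the lower bound becomes pivot.
  const int right = countLeafChecks(cases, mid, last, Bound{true, pivot}, ub);
  return left + right;
}

// Convenience wrapper starting with LB: NONE, UB: NONE.
inline int countLeafChecks(const std::vector<int> &cases) {
  return countLeafChecks(cases, 0, cases.size(), Bound{false, 0}, Bound{false, 0});
}
```

For the cases 0, 1, 2 from the message, only the checks for 0 and 2 remain: the leaf for 1 is reached with LB: 1 and UB: 1, so its equality check is dropped. With sparse cases such as 0, 10, 20 no leaf is pinned by its bounds and all three checks stay.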