bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Make headers self-contained again.	Benjamin Kramer	2016-03-04	2	-0/+2
\| \| \| \|	llvm-svn: 262702
*	AMDGPU/SI: add llvm.amdgcn.image.atomic.* intrinsics	Nikolay Haustov	2016-03-04	4	-32/+169
\| \| \| \| \| \| \| \| \| \| \|	These correspond to IMAGE_ATOMIC_* and are going to be used by Mesa for the GL_ARB_shader_image_load_store extension. Initial change by Nicolai Hähnle Differential Revision: http://reviews.llvm.org/D17401 llvm-svn: 262701
*	Fix a memory leak.	Easwaran Raman	2016-03-04	1	-1/+4
\| \| \| \|	llvm-svn: 262682
*	Fix a use-after-free bug introduced in r262636	Easwaran Raman	2016-03-04	2	-6/+11
\| \| \| \|	llvm-svn: 262679
*	[libfuzzer] arbitrary function adapter.	Mike Aizatsky	2016-03-03	5	-0/+299
\| \| \| \| \| \| \| \| \|	The adapter automates converting a sequence of bytes into arbitrary arguments. Differential Revision: http://reviews.llvm.org/D17829 llvm-svn: 262673
*	[InstCombine] Combine A->B->A BitCast	Guozhi Wei	2016-03-03	2	-0/+104
\| \| \| \| \| \| \| \| \| \|	This patch enhances InstCombine to handle the following case: an A -> B bitcast, followed by a PHI, followed by a B -> A bitcast. llvm-svn: 262670
*	[libFuzzer] when interrupted, call _Exit() instead of exit()	Kostya Serebryany	2016-03-03	1	-1/+1
\| \| \| \|	llvm-svn: 262667
*	[X86][AVX512BW] Fixed 512-bit PSHUFB shuffle mask decode and added combine test.	Simon Pilgrim	2016-03-03	1	-3/+3
\| \| \| \| \| \|	The PSHUFB decoder assumed that its input was only a 128-bit or 256-bit vector. llvm-svn: 262661
*	[RuntimeDyld] Fix '_' stripping in RTDyldMemoryManager::getSymbolAddressInProcess.	Lang Hames	2016-03-03	1	-8/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The RTDyldMemoryManager::getSymbolAddressInProcess method accepts a linker-mangled symbol name, but it calls through to dlsym to do the lookup (via DynamicLibrary::SearchForAddressOfSymbol), and dlsym expects an unmangled symbol name. Historically we've attempted to "demangle" by removing leading '_'s on all platforms, and fallen back to an extra search if that failed. That's broken, as it can cause symbols to resolve incorrectly on platforms that don't do mangling if you query '_foo' and the process also happens to contain a 'foo'. Fix this by demangling conditionally based on the host platform. That's safe here because this function is specifically for symbols in the host process, so the usual cross-process JIT looking concerns don't apply. M unittests/ExecutionEngine/ExecutionEngineTest.cpp M lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp llvm-svn: 262657
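The conditional demangling described in this message could be sketched roughly as follows. This is only an illustration: the helper name `stripHostPrefix` and its boolean parameter are hypothetical and are not LLVM's code. The only assumption is that some hosts prefix C symbol names with '_' while others do not.

```cpp
#include <string>

// Hypothetical helper, not the LLVM implementation: remove one leading '_'
// only when the host's linker-level mangling is known to add one.
std::string stripHostPrefix(const std::string &Name,
                            bool HostPrefixesUnderscore) {
  if (HostPrefixesUnderscore && !Name.empty() && Name[0] == '_')
    return Name.substr(1);
  // On hosts without a prefix, "_foo" must stay "_foo"; stripping it could
  // wrongly resolve to an unrelated symbol named "foo".
  return Name;
}
```

Passing the host convention in as a parameter keeps the sketch testable; real code would presumably decide it at compile time from the host platform.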
*	[ValueTracking] "constant fold" an experimental hidden option	Philip Reames	2016-03-03	1	-7/+0
\| \| \| \|	llvm-svn: 262648
*	[ValueTracking] Remove dead code from an old experiment	Philip Reames	2016-03-03	1	-208/+2
\| \| \| \| \| \| \| \| \| \|	This experiment was originally about trying to use facts implied by dominating conditions to infer more precise known bits. While the compile time was found to be acceptable on several large code bases, we never found sufficiently profitable examples to justify turning on the code by default. Given this, it's time to abandon the experiment. Several folks have commented that they've found this useful for experimentation, but nothing has come of those experiments. Given how easy the patch is to apply, there's no reason to leave the code in tree. For anyone interested in further investigation in this area, I recommend finding the summary email I sent on one of the original review threads. In particular, I now believe the use-list based approach is strictly worse than the dom-tree-walking approach. llvm-svn: 262646
*	[InstCombine] transform bitcasted bitwise logic ops with constants (PR26702)	Sanjay Patel	2016-03-03	1	-7/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Given that we're not actually reducing the instruction count in the included regression tests, I think we would call this a canonicalization step. The motivation comes from the example in PR26702: https://llvm.org/bugs/show_bug.cgi?id=26702 If we hoist the bitwise logic ahead of the bitcast, the previously unoptimizable example of:
	  define <4 x i32> @is_negative(<4 x i32> %x) {
	    %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
	    %not = xor <4 x i32> %lobit, <i32 -1, i32 -1, i32 -1, i32 -1>
	    %bc = bitcast <4 x i32> %not to <2 x i64>
	    %notnot = xor <2 x i64> %bc, <i64 -1, i64 -1>
	    %bc2 = bitcast <2 x i64> %notnot to <4 x i32>
	    ret <4 x i32> %bc2
	  }
	Simplifies to the expected:
	  define <4 x i32> @is_negative(<4 x i32> %x) {
	    %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
	    ret <4 x i32> %lobit
	  }
	Differential Revision: http://reviews.llvm.org/D17583 llvm-svn: 262645
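The equivalence claimed for the `@is_negative` example can be checked with a small C++ model. This is a sketch of the arithmetic only, not of the InstCombine code. The type aliases and function names are made up for illustration, `std::memcpy` stands in for `bitcast`, and the arithmetic shift by 31 is modelled with a comparison to avoid relying on implementation-defined signed shifts.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

using V4i32 = std::array<std::int32_t, 4>;
using V2i64 = std::array<std::int64_t, 2>;
static_assert(sizeof(V4i32) == sizeof(V2i64), "bitcast needs equal sizes");

// Models "ashr %x, 31" per lane: all ones if negative, zero otherwise.
V4i32 isNegativeSimplified(const V4i32 &X) {
  V4i32 LoBit{};
  for (int I = 0; I < 4; ++I)
    LoBit[I] = X[I] < 0 ? -1 : 0;
  return LoBit;
}

// Models the longer form: not, bitcast to <2 x i64>, not again, bitcast back.
V4i32 isNegativeUnoptimized(const V4i32 &X) {
  V4i32 Not = isNegativeSimplified(X);
  for (int I = 0; I < 4; ++I)
    Not[I] = ~Not[I];                           // xor with -1
  V2i64 BC;
  std::memcpy(&BC, &Not, sizeof(BC));           // bitcast <4 x i32> to <2 x i64>
  for (int I = 0; I < 2; ++I)
    BC[I] = ~BC[I];                             // xor with -1 on the wide lanes
  V4i32 BC2;
  std::memcpy(&BC2, &BC, sizeof(BC2));          // bitcast back
  return BC2;
}
```

Because xor with all-ones flips every bit regardless of lane width, the two negations cancel across the bitcast, which is why hoisting the logic ahead of the bitcast lets the pair fold away.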
*	Fix breakage caused by r262636.	Easwaran Raman	2016-03-03	1	-1/+1
\| \| \| \| \| \|	Use LLVM_ATTRIBUTE_UNUSED instead of __attribute__((unused)) llvm-svn: 262643
*	[SCEV] Prove no-overflow via constant ranges	Sanjoy Das	2016-03-03	1	-0/+41
\| \| \| \| \| \| \|	Exploit ScalarEvolution::getRange's newly acquired smartness (since r262438) by using that to infer nsw and nuw when possible. llvm-svn: 262639
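The idea of inferring a no-signed-wrap flag from operand ranges can be illustrated with a simplified C++ sketch. It assumes plain inclusive, non-wrapping 32-bit ranges; the `Range` struct and `addCannotSignedOverflow` are hypothetical names and do not reflect LLVM's ConstantRange or ScalarEvolution interfaces.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical inclusive range [Lo, Hi] of possible 32-bit signed values.
struct Range {
  std::int32_t Lo;
  std::int32_t Hi;
};

// Returns true if A + B can never overflow a signed 32-bit integer for any
// values drawn from the two ranges. The bounds are summed in 64 bits so the
// check itself cannot overflow.
bool addCannotSignedOverflow(Range A, Range B) {
  assert(A.Lo <= A.Hi && B.Lo <= B.Hi && "ranges must be non-empty");
  const std::int64_t Min = std::int64_t(A.Lo) + B.Lo;
  const std::int64_t Max = std::int64_t(A.Hi) + B.Hi;
  return Min >= std::numeric_limits<std::int32_t>::min() &&
         Max <= std::numeric_limits<std::int32_t>::max();
}
```

When the check succeeds, an analysis could mark the addition as nsw; an analogous check on unsigned bounds would justify nuw.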
*	[SCEV] Be less eager about demoting zexts to sexts	Sanjoy Das	2016-03-03	1	-4/+5
\| \| \| \| \| \| \| \| \| \| \| \|	After r262438 we can have provably positive NSW SCEV expressions whose zero extensions cannot be simplified (since r262438 makes SCEV better at computing constant ranges). This means demoting sexts of positive add recurrences eagerly can result in an unsimplified zero extension where we could have had a simplified sign extension. This change fixes the issue by teaching SCEV to demote sext of a positive SCEV expression to a zext only if the sext could not be simplified. llvm-svn: 262638
*	[ConstantRange] Generalize makeGuaranteedNoWrapRegion to work on ranges	Sanjoy Das	2016-03-03	1	-12/+20
\| \| \| \| \| \| \|	This will be used in a later patch to ScalarEvolution. Right now only the unit tests exercise the newly added code. llvm-svn: 262637
*	Infrastructure for PGO enhancements in inliner	Easwaran Raman	2016-03-03	5	-45/+221
\| \| \| \| \| \| \| \| \| \| \| \|	This patch provides the following infrastructure for PGO enhancements in the inliner: (1) enable the use of block-level profile information in the inliner; (2) incrementally update block frequency information during inlining; (3) update the function entry counts of callees when they get inlined into callers. Differential Revision: http://reviews.llvm.org/D16381 llvm-svn: 262636
*	[X86][AVX] Better support for the variable mask form of VPERMILPD/VPERMILPS	Simon Pilgrim	2016-03-03	3	-19/+35
\| \| \| \| \| \| \| \| \| \|	The variable mask forms of VPERMILPD/VPERMILPS were only partially implemented, with much of the work still performed as an intrinsic. This patch properly defines the instructions in terms of X86ISD::VPERMILPV, permitting the opcode to be easily combined as a target shuffle. Differential Revision: http://reviews.llvm.org/D17681 llvm-svn: 262635
*	Use LineLocation instead of CallsiteLocation to index callsite profile.	Dehao Chen	2016-03-03	4	-54/+30
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: With a discriminator, LineLocation can uniquely identify a callsite without needing to specify the callee name. Remove the callee function name from the key, and put it in the value (FunctionSamples). Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17827 llvm-svn: 262634
*	[X86] Tidied up 256-bit -> 2 x 128-bit vector shift extraction.	Simon Pilgrim	2016-03-03	1	-14/+2
\| \| \| \| \| \|	lowerShift was manually splitting BUILD_VECTOR cases when it could just call Extract128BitVector which does this anyway. llvm-svn: 262633
*	[X86] Pulled out repeated code testing for constant vector shift amount. NFCI.	Simon Pilgrim	2016-03-03	1	-8/+6
\| \| \| \|	llvm-svn: 262631
*	MCU target has its own ABI, however X86 interrupt handler calling convention overrides this ABI.	Amjad Aboud	2016-03-03	1	-1/+3
\| \| \| \| \| \| \| \| \| \|	Fixed the ordering to check first for X86 interrupt handler then for MCU target. Differential Revision: http://reviews.llvm.org/D17801 llvm-svn: 262628
*	[X86] Don't assume that shuffle non-mask operands start at #0.	Ahmed Bougacha	2016-03-03	2	-32/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	That's not the case for VPERMV/VPERMV3, which cover all possible combinations (the C intrinsics use a different order; the AVX vs AVX512 intrinsics are different still). Since r246981 ("AVX-512: Lowering for 512-bit vector shuffles."), VPERMV is recognized in getTargetShuffleMask. This breaks assumptions in most callers, as they expect the non-mask operands to start at index 0. VPERMV has the mask as operand #0; VPERMV3 has it in the middle. Instead of the faulty assumption, have getTargetShuffleMask return its operands as well. One alternative we considered was to change the operand order of VPERMV, but we agreed to stick to the instruction order, as there is more AVX512 weirdness to cover (vpermt2/vpermi2 in particular). Differential Revision: http://reviews.llvm.org/D17041 llvm-svn: 262627
*	[LoopUtils, LV] Fix PR26734	Matthew Simpson	2016-03-03	1	-1/+1
\| \| \| \| \| \| \| \|	The vectorization of first-order recurrences (r261346) caused PR26734. When detecting these recurrences, we need to ensure that the previous value is actually defined inside the loop. This patch includes the fix and test case. llvm-svn: 262624
*	[AArch64] fold 'isPositive' vector integer operations (PR26819)	Sanjay Patel	2016-03-03	1	-1/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is one of the cases shown in: https://llvm.org/bugs/show_bug.cgi?id=26819 Shift and negate is what InstCombine prefers to produce (and I tried to make it do more of that in http://reviews.llvm.org/rL262424 ), so we should recognize that pattern as something that might come from autovectorization even if it's unlikely to be produced from C NEON intrinsics. The patch is based on the x86 equivalent: http://reviews.llvm.org/rL262036 Differential Revision: http://reviews.llvm.org/D17834 llvm-svn: 262623
*	AVX512: Combine AND + TESTM instructions.	Igor Breger	2016-03-03	3	-8/+25
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D17844 llvm-svn: 262621
*	[AVR] Add calling convention parser tokens	Dylan McKay	2016-03-03	4	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Adds the 'avr_intrcc' and 'avr_signalcc' IR calling convention tokens to the parser. Reviewers: arsenm Subscribers: dylanmckay, llvm-commits Differential Revision: http://reviews.llvm.org/D16348 llvm-svn: 262600
*	[X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG	Simon Pilgrim	2016-03-03	2	-10/+42
\| \| \| \| \| \| \| \|	Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns. Differential Revision: http://reviews.llvm.org/D17691 llvm-svn: 262599
*	Revert "[ARM] Merging 64-bit divmod lib calls into one"	Renato Golin	2016-03-03	2	-11/+1
\| \| \| \| \| \|	This reverts commit r262507, which broke some ARM buildbots. llvm-svn: 262594
*	[LLVM][AVX512] PSRLWI Change imm8 to int	Michael Zuckerman	2016-03-03	1	-3/+3
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D17753 llvm-svn: 262592
*	[BranchFolding] Change function name related with merging MMOs. NFC	Junmo Park	2016-03-03	1	-7/+5
\| \| \| \| \| \| \| \| \| \| \|	Summary: Removing MMOs is not our preferred behavior any more. Reviewers: mcrosier, reames Differential Revision: http://reviews.llvm.org/D17668 llvm-svn: 262580
*	AMDGPU: Insert two S_NOP instructions for every high level source statement.	Tom Stellard	2016-03-03	4	-0/+110
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Patch by: Konstantin Zhuravlyov Summary: Tools such as debuggers need to pause execution based on user input (e.g. a breakpoint). In order to do this, two S_NOP instructions are inserted for each high level source statement: one before the first ISA instruction of the high level source statement, and one after its last ISA instruction. Further, a debugger may replace S_NOP instructions with S_TRAP instructions based on user input. Reviewers: tstellarAMD, arsenm Subscribers: echristo, dblaikie, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17454 llvm-svn: 262579
*	AMDGPU/SI: Don't try to move scratch wave offset when there are no free SGPRs	Tom Stellard	2016-03-03	1	-3/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When there were no free SGPRs, we were trying to move this value into some of the reserved registers which was causing a segmentation fault. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17590 llvm-svn: 262577
*	[X86] Enable forwarding bool arguments in tail calls (PR26305)	Hans Wennborg	2016-03-03	1	-0/+20
\| \| \| \| \| \| \| \| \|	The code was previously not able to track a boolean argument at a call site back to the formal argument of the caller. Differential Revision: http://reviews.llvm.org/D17786 llvm-svn: 262575
*	[PPCVSXFMAMutate] Temporarily disable this pass	Tim Shen	2016-03-03	1	-2/+8
\| \| \| \|	llvm-svn: 262573
*	[MBP] Rename a confusing variable and add clarifying comments	Philip Reames	2016-03-03	1	-19/+24
\| \| \| \| \| \|	Was discussed as part of http://reviews.llvm.org/D17830 llvm-svn: 262571
*	[MBP] Avoid placing random blocks between loop preheader and header	Philip Reames	2016-03-03	1	-1/+2
\| \| \| \| \| \| \| \| \| \|	If we have a loop with a rarely taken path, we will prune that from the blocks which get added as part of the loop chain. The problem is that we weren't then recognizing the loop chain as schedulable when considering the preheader when forming the function chain. We'd then fall to various non-predecessors before finally scheduling the loop chain (as if the CFG was unnatural). The net result was that there could be lots of garbage between a loop preheader and the loop, even though we could have directly fallen into the loop. It also meant we separated hot code with regions of colder code. The particular reason for the rejection of the loop chain was that we were scanning the predecessors of the header, seeing the backedge, believing that was a globally more important predecessor (true), but forgetting to account for the fact that the backedge predecessor was already part of the existing loop chain (oops!). Differential Revision: http://reviews.llvm.org/D17830 llvm-svn: 262547
*	[X86] Don't give catch objects a displacement of zero	David Majnemer	2016-03-03	5	-25/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Catch objects with a displacement of zero do not initialize a catch object. The displacement is relative to %rsp at the end of the function's prologue for x86_64 targets. If we place an object at the top-of-stack, we will end up with a displacement of zero, resulting in our catch object remaining uninitialized. Address this by creating our catch objects as fixed objects. We will ensure that the UnwindHelp object is created after the catch objects so that no catch object will have a displacement of zero. Differential Revision: http://reviews.llvm.org/D17823 llvm-svn: 262546
*	AMDGPU: Simplify boolean conditional return statements	Matt Arsenault	2016-03-02	7	-44/+19
\| \| \| \| \| \|	Patch by Richard Thomson llvm-svn: 262536
*	[MBP] Remove overly verbose debug output	Philip Reames	2016-03-02	1	-5/+2
\| \| \| \|	llvm-svn: 262531
*	Explode store of arrays in instcombine	Amaury Sechet	2016-03-02	1	-1/+33
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is the last step toward supporting aggregate memory access in instcombine. This explodes stores of arrays into a series of stores, one for each element, allowing them to be optimized. Reviewers: joker.eph, reames, hfinkel, majnemer, mgrang Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17828 llvm-svn: 262530
*	[MBP] Adjust debug output to be more focused and approachable	Philip Reames	2016-03-02	1	-18/+9
\| \| \| \|	llvm-svn: 262522
*	Unpack array of all sizes in InstCombine	Amaury Sechet	2016-03-02	1	-5/+38
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is another step toward improving FCA (first-class aggregate) support. This unpacks a load of an array into a series of loads of the array's elements. Reviewers: chandlerc, joker.eph, majnemer, reames, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15890 llvm-svn: 262521
*	Really fix ASAN leak/etc issues with MemorySSA unittests	Daniel Berlin	2016-03-02	1	-3/+2
\| \| \| \|	llvm-svn: 262519
*	[libFuzzer] add -Werror for libFuzzer build rule	Kostya Serebryany	2016-03-02	1	-1/+1
\| \| \| \|	llvm-svn: 262517
*	Revert "Fix ASAN detected errors in code and test" (it was not meant to be committed yet)	Daniel Berlin	2016-03-02	1	-2/+3
\| \| \| \| \| \| \| \|	This reverts commit 890bbccd600ba1eb050353d06a29650ad0f2eb95. llvm-svn: 262512
*	Fix ASAN detected errors in code and test	Daniel Berlin	2016-03-02	1	-3/+2
\| \| \| \|	llvm-svn: 262511
*	[ARM] Merging 64-bit divmod lib calls into one	Renato Golin	2016-03-02	2	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for values of up to 32 bits. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and was thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code. This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant. This patch fixes PR17193 (and a long-standing FIXME in the tests). llvm-svn: 262507
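As a source-level analogue of the merge described here, the standard `std::lldiv` function returns the quotient and remainder of a 64-bit division from one call. The wrapper name `divmod64` is hypothetical, and whether a given compiler and target lower this to a single runtime call such as __aeabi_ldivmod is not something this sketch demonstrates.

```cpp
#include <cstdlib>

// Hypothetical wrapper: one combined operation yields both results, instead
// of evaluating N / D and N % D as two separate divisions.
std::lldiv_t divmod64(long long N, long long D) {
  return std::lldiv(N, D); // .quot truncates toward zero; .rem = N - quot * D
}
```

A caller would read both fields from the one result, for example `auto R = divmod64(N, D);` followed by uses of `R.quot` and `R.rem`.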
*	Revert "[X86] Elide references to _chkstk for dynamic allocas"	Reid Kleckner	2016-03-02	1	-29/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit r262370. It turns out there is code out there that does sequences of allocas greater than 4K: http://crbug.com/591404 The goal of this change was to improve the code size of inalloca call sequences, but we got tangled up in the mess of dynamic allocas. Instead, we should come back later with a separate MI pass that uses dominance to optimize the full sequence. This should also be able to remove the often unneeded stacksave/stackrestore pairs around the call. llvm-svn: 262505
*	ARM: Introduce conservative load/store optimization mode	Matthias Braun	2016-03-02	1	-0/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Most of the time ARM has the CCR.UNALIGN_TRP bit set to false which means that unaligned loads/stores do not trap and even extensive testing will not catch these bugs. However the multi/double variants are not affected by this bit and will still trap. In effect a more aggressive load/store optimization will break existing (bad) code. These bugs do not necessarily manifest in the broken code where the misaligned pointer is formed but often later in perfectly legal code where it is accessed. This means recompiling system libraries (which have no alignment bugs) with a newer compiler will break existing applications (with alignment bugs) that worked before. So (under protest) I implemented this safe mode which limits the formation of multi/double operations to cases that are not affected by user code (stack operations like spills/reloads) or cases where the normal operations trap anyway (floating point load/stores). It is disabled by default. Differential Revision: http://reviews.llvm.org/D17015 llvm-svn: 262504