bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[ThinLTO] Handle bitcode without function summary sections gracefully	Teresa Johnson	2015-11-21	2	-0/+15
	Summary: Several fixes to the handling of bitcode files without function summary sections so that they are skipped during ThinLTO processing in llvm-lto and the gold plugin when appropriate instead of aborting.
	1. Don't assert when trying to add a FunctionInfo that doesn't have a summary attached.
	2. Skip FunctionInfo structures that don't have attached function summary sections when trying to create the combined function summary.
	3. In both llvm-lto and gold-plugin, check whether a bitcode file has a function summary section before trying to parse the index, and skip the bitcode file if it does not.
	4. Fix hasFunctionSummaryInMemBuffer in BitcodeReader, which had a bug where we returned too early while looking for the summary section.
	Also added llvm-lto and gold-plugin based tests for cases where we don't have function summaries in the bitcode file. I verified that either the first two fixes described above or fixes 1, 3, and 4 are enough to avoid the crashes, but have combined them all here for added robustness.
	Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D14903 llvm-svn: 253796
*	[MachineInstrBuilder] Support for adding a ConstantPoolIndex MO with an ↵	Simon Pilgrim	2015-11-21	1	-0/+20
	additional offset. MachineInstrBuilder::addDisp can already add an immediate or global address MO with an adjusted offset; this patch adds support for constant pool indices as well. All remaining MO types still assert - there are a number of other types that could support adjusted offsets but I have no test cases at this time. Required to fix a regression in D13988 found by Mikael Holmén during stress testing (test case attached). Differential Revision: http://reviews.llvm.org/D14867 llvm-svn: 253795
*	move a single test case to where most other instcombine shuffle bug test ↵	Sanjay Patel	2015-11-21	2	-16/+15
	cases exist llvm-svn: 253784
*	[X86][SSE] Added SSE2 PSUBUS tests	Simon Pilgrim	2015-11-21	1	-43/+96
	llvm-svn: 253783
*	[X86][SSE] Regenerate TRUNC-SEXT tests	Simon Pilgrim	2015-11-21	1	-15/+16
	Tidied up triple and regenerate tests using update_llc_test_checks.py llvm-svn: 253782
*	[X86][SSE] Regenerate MINMAX tests	Simon Pilgrim	2015-11-21	1	-1158/+9712
	Tidied up triple and regenerate tests using update_llc_test_checks.py llvm-svn: 253781
*	[X86][SSE] Regenerate PSUBUS tests	Simon Pilgrim	2015-11-21	1	-196/+331
	Tidied up triple and regenerate tests using update_llc_test_checks.py llvm-svn: 253780
*	[DAGCombiner] Bugfix for lost chain dependency.	Jonas Paulsson	2015-11-21	1	-0/+97
	When MergeConsecutiveStores() combines two loads and two stores into wider loads and stores, the chain users of both of the original loads must be transferred to the new load, because it may be that a chain user only depends on one of the loads. New test case: test/CodeGen/SystemZ/dag-combine-01.ll Reviewed by James Y Knight. Bugzilla: https://llvm.org/bugs/show_bug.cgi?id=25310#c6 llvm-svn: 253779
*	[X86][AVX] Regenerate AVX splat tests	Simon Pilgrim	2015-11-21	1	-20/+55
	Tidied up triple and regenerate tests using update_llc_test_checks.py llvm-svn: 253778
*	[X86][AVX512] Added AVX512 VMOVLHPS/VMOVHLPS shuffle decode comments.	Simon Pilgrim	2015-11-21	1	-28/+8
	llvm-svn: 253777
*	[X86][SSE] Legal XMM Register Class ordering for SSE1	Simon Pilgrim	2015-11-21	1	-0/+32
	It turns out we have a number of places that just grab the first type attached to a register class for various reasons. This is fine unless for some reason that type isn't legal on the current target, such as for SSE1 which doesn't support v16i8/v8i16/v4i32/v2i64 - all of which were included before v4f32 in the class. Given that this is such a rare situation, I've just re-ordered the types and placed the float types first. Fix for PR16133 Differential Revision: http://reviews.llvm.org/D14787 llvm-svn: 253773
*	llvm-link option and test for recent metadata mapping bug	Teresa Johnson	2015-11-21	1	-0/+5
	Summary: Add a -preserve-modules option to llvm-link that simulates LTO clients that don't destroy modules as they are linked. This enables reproduction of a recent bug introduced by a metadata linking change that was only caught when the modules weren't destroyed before writing bitcode (LTO on Windows). See http://llvm.org/viewvc/llvm-project?view=revision&revision=253170 for more details on the original bug and the fix. Confirmed the new test added here reproduces the failure using the new option when I suppress the fix. Reviewers: pcc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14818 llvm-svn: 253740
*	Move free-zext.ll to llvm/test/Transforms/CodeGenPrepare/AArch64/	NAKAMURA Takumi	2015-11-20	1	-0/+0
	llvm-svn: 253730
*	Fix another infinite loop in Reassociate caused by Constant::isZero().	Owen Anderson	2015-11-20	1	-0/+13
	Not all zero vectors are ConstantDataVectors. llvm-svn: 253723
*	[CodeGenPrepare] Create more extloads and fewer ands	Geoff Berry	2015-11-20	2	-0/+125
	Summary: Add "and" instructions immediately after loads that only have their low bits used, assuming that the (and (load x) c) will be matched as an extload and the ands/truncs fed by the extload will be removed by isel. Reviewers: mcrosier, qcolombet, ab Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14584 llvm-svn: 253722
*	[ShrinkWrap] Teach ShrinkWrap to handle targets requiring a register scavenger.	Arnaud A. de Grandmaison	2015-11-20	1	-0/+184
	The included test only checks for a compiler crash for now. Several people are facing this issue, so we first resolve the crash, and will increase shrinkwrap's coverage later in a follow-up patch. llvm-svn: 253718
*	SamplePGO - Tweak RUN command for a test. NFC.	Diego Novillo	2015-11-20	1	-1/+1
	llvm-svn: 253717
*	SamplePGO - Do not count never-executed inlined functions when computing ↵	Diego Novillo	2015-11-20	2	-0/+152
	coverage. If a function was originally inlined but not actually hot at runtime, its samples will not be counted inside the parent function. This throws off the coverage calculation because it expects to find more used records than it should. Fixed by ignoring functions that will not be inlined into the parent. Currently, this is inlined functions with 0 samples. In subsequent patches, I'll change this to mean "cold" functions. llvm-svn: 253716
*	[AArch64]Merge narrow zero stores to a wider store	Jun Bum Lim	2015-11-20	1	-0/+88
	This change merges adjacent zero stores into a wider single store. For example:
	  strh wzr, [x0]
	  strh wzr, [x0, #2]
	becomes
	  str wzr, [x0]
	This will fix PR25410. llvm-svn: 253711
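	A minimal Python sketch of why the merge described above preserves behavior. It models memory as a little-endian byte buffer; the helper names and the buffer contents are illustrative assumptions, not part of the patch.

```python
import struct

def two_halfword_zero_stores(mem: bytearray, base: int) -> None:
    # Models: strh wzr, [x0] ; strh wzr, [x0, #2]
    struct.pack_into("<H", mem, base, 0)
    struct.pack_into("<H", mem, base + 2, 0)

def one_word_zero_store(mem: bytearray, base: int) -> None:
    # Models: str wzr, [x0]
    struct.pack_into("<I", mem, base, 0)

# Hypothetical memory contents; any values would do.
original = bytearray(range(1, 9))
narrow = bytearray(original)
wide = bytearray(original)

two_halfword_zero_stores(narrow, 2)
one_word_zero_store(wide, 2)

# Both sequences leave the same four bytes zeroed and the rest untouched.
same_result = narrow == wide
```

	Because the stored value is zero, byte order does not matter here; the sketch only checks that the same four bytes are written.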
*	Weak non-function symbols were being accessed directly, which is	Eric Christopher	2015-11-20	1	-0/+27
	incorrect, as the chosen representative of the weak symbol may not live with the code in question. Always indirect the access through the TOC instead. Patch by Kyle Butt! llvm-svn: 253708
*	Fix test case label check	Bill Seurer	2015-11-20	1	-4/+4
	Several (but not all) of the labels that are checked for in this test case are checked as strings instead of labels. This can cause an apparent test case failure if they are tested in an appropriately named directory. For example, one of them that fails:
	  define zeroext i32 @test2(i32 %A.u, i32 %B.u) {
	  ; A8: test2
	  ; A8: uxtab r0, r0, r1
	Output that causes it to fail:
	  . . .
	  .file "/home/seurer/llvm/llvm-test2/test/CodeGen/Thumb2/thumb2-uxt_rot.ll"
	  . . .
	  .globl test2
	  .align 1
	  .type test2,%function
	  .code 16 @ @test2
	  .thumb_func
	  test2:
	  .fnstart
	The "A8: test2" matches on the directory name instead of the label. llvm-svn: 253702
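	The failure mode can be sketched in Python with a plain substring search standing in for the check. This is a simplified model, not FileCheck itself, and the pattern with a trailing colon is only one assumed way to anchor on the label; the commit's exact replacement checks are not shown in this log.

```python
# Output lines taken from the commit message above.
ASM_OUTPUT = [
    '.file "/home/seurer/llvm/llvm-test2/test/CodeGen/Thumb2/thumb2-uxt_rot.ll"',
    ".globl test2",
    ".type test2,%function",
    "test2:",
    ".fnstart",
]

def first_match(pattern, lines):
    """Return the index of the first line containing pattern, or None."""
    for index, line in enumerate(lines):
        if pattern in line:
            return index
    return None

# The bare string matches the directory name in the .file line first.
loose_index = first_match("test2", ASM_OUTPUT)

# Including the colon (an assumed fix) skips the path and lands on the label.
label_index = first_match("test2:", ASM_OUTPUT)
```

	Once the loose pattern has consumed the .file line, later checks are matched from the wrong position, which is how a correctly generated file can appear to fail.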
*	Handle ARMv6-J as an alias, instead of fake architecture	Artyom Skrobov	2015-11-20	1	-34/+0
	Summary: This follows D14577 to treat ARMv6-J as an alias for ARMv6, instead of an architecture in its own right. The functional change is that the default CPU when targeting ARMv6-J changes from arm1136j-s to arm1136jf-s, which is currently used as the default CPU for ARMv6; both are, in fact, ARMv6-J CPUs. The J-bit (Jazelle support) is irrelevant to LLVM, and it doesn't affect code generation, attributes, optimizations, or anything else, apart from selecting the default CPU. Reviewers: rengolin, logan, compnerd Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D14755 llvm-svn: 253675
*	SamplePGO - Add line offset and discriminator information to sample reports.	Diego Novillo	2015-11-20	1	-4/+4
	While debugging some sampling coverage problems, I found this useful: When applying samples from a profile, it helps to also know what line offset and discriminator the sample belongs to. This makes it easy to correlate against the input profile. llvm-svn: 253670
*	Fix a pair of issues that caused an infinite loop in reassociate.	Owen Anderson	2015-11-20	1	-0/+20
	Terrifyingly, one of them is a mishandling of floating point vectors in Constant::isZero(). How exactly this issue survived this long is beyond me. llvm-svn: 253655
*	[mips][microMIPS] Implement MUL[_S].PH, MULEQ_S.W.PHL, MULEQ_S.W.PHR, ↵	Hrvoje Varga	2015-11-20	4	-0/+30
	MULEU_S.PH.QBL, MULEU_S.PH.QBR, MULQ_RS.PH, MULQ_RS.W, MULQ_S.PH and MULQ_S.W instructions Differential Revision: http://reviews.llvm.org/D14280 llvm-svn: 253651
*	[WebAssembly] Rename SWITCH to TABLESWITCH to match the current wording in ↵	Dan Gohman	2015-11-20	1	-2/+2
	the spec. llvm-svn: 253642
*	ScalarEvolution: do not set nuw when creating exprs of form <expr> + <all-ones>.	Peter Collingbourne	2015-11-20	1	-0/+49
	The nuw constraint will not be satisfied unless <expr> == 0. This bug has been around since r102234 (in 2010!), but was uncovered by r251052, which introduced more aggressive optimization of nuw scev expressions. Differential Revision: http://reviews.llvm.org/D14850 llvm-svn: 253627
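	The claim that <expr> + <all-ones> wraps for every nonzero <expr> can be checked exhaustively at a small bit width. The sketch below uses 8 bits as an illustrative assumption; the same argument holds at any width.

```python
BITS = 8                      # assumed width, chosen so the search is tiny
MASK = (1 << BITS) - 1
ALL_ONES = MASK               # the all-ones constant at this width

def add_wraps_unsigned(x: int, y: int) -> bool:
    """True if x + y does not fit in BITS bits, i.e. nuw would be violated."""
    return x + y > MASK

non_wrapping = [x for x in range(1 << BITS) if not add_wraps_unsigned(x, ALL_ONES)]
wrapping_count = sum(add_wraps_unsigned(x, ALL_ONES) for x in range(1 << BITS))
```

	Only x == 0 avoids unsigned wrap, so marking such an add nuw is sound only when the expression is known to be zero.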
*	[LTO] Add options to llvm-lto to select output format and dump merged module	Tobias Edler von Koch	2015-11-20	1	-0/+21
	This introduces two new options:
	- "llvm-lto -save-merged-module -o outfile" dumps the LTO Module to outfile.merged.bc prior to CodeGen and after LTO optimizations have been run.
	- "llvm-lto -filetype=asm -o outfile" makes llvm-lto emit assembly instead of object code in outfile.
	Both are intended for use in lit tests. llvm-svn: 253624
*	[WinEH] Disable most forms of demotion	Reid Kleckner	2015-11-19	4	-139/+76
	Now that the register allocator knows about the barriers on funclet entry and exit, testing has shown that this is unnecessary. We still demote PHIs on unsplittable blocks due to the differences between the IR CFG and the Machine CFG. llvm-svn: 253619
*	[X86][SSE4A] Fix issue with EXTRQI shuffles not starting at the correct ↵	Simon Pilgrim	2015-11-19	1	-0/+37
	start index. Found during stress testing. llvm-svn: 253611
*	[InstCombine] add tests to show missing trunc optimizations	Sanjay Patel	2015-11-19	1	-0/+46
	llvm-svn: 253609
*	[InstCombine] add tests to show missing bitcast optimizations	Sanjay Patel	2015-11-19	1	-0/+29
	llvm-svn: 253602
*	Reimplement discriminator assignment algorithm.	Dehao Chen	2015-11-19	2	-4/+76
	Summary: The new algorithm is more efficient: O(n), where n is the number of basic blocks. It is also guaranteed to cover all cases of multiple BBs mapped to the same line. Reviewers: dblaikie, davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14738 llvm-svn: 253594
*	[GlobalOpt] Localize some globals that have non-instruction users	James Molloy	2015-11-19	1	-0/+28
	We currently bail out of global localization if the global has non-instruction users. However, often these can be simple bitcasts or constant-GEPs, which we can easily turn into instructions before localizing. Be a bit more aggressive. llvm-svn: 253584
*	[AArch64]Extend merging narrow loads into a wider load	Jun Bum Lim	2015-11-19	1	-0/+256
	This change extends r251438 to handle more narrow load promotions including byte type, unscaled, and signed. For example, this change will convert:
	  ldursh w1, [x0, #-2]
	  ldurh w2, [x0, #-4]
	into
	  ldur w2, [x0, #-4]
	  asr w1, w2, #16
	  and w2, w2, #0xffff
	llvm-svn: 253577
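	A small Python model of the example above, assuming little-endian memory and 32-bit W registers. The function names and test bytes are illustrative; the point is only that the single wide load plus shift and mask yields the same register values as the two narrow loads.

```python
import struct

MASK32 = 0xFFFFFFFF

def separate_loads(mem: bytes, x0: int):
    # ldursh w1, [x0, #-2]: signed halfword, sign-extended into a 32-bit register
    w1 = struct.unpack_from("<h", mem, x0 - 2)[0] & MASK32
    # ldurh w2, [x0, #-4]: unsigned halfword, zero-extended
    w2 = struct.unpack_from("<H", mem, x0 - 4)[0]
    return w1, w2

def merged_load(mem: bytes, x0: int):
    # ldur w2, [x0, #-4]: one 32-bit load covering both halfwords
    w2 = struct.unpack_from("<I", mem, x0 - 4)[0]
    # asr w1, w2, #16: arithmetic shift keeps the sign of the upper halfword
    signed = w2 - (1 << 32) if w2 & 0x80000000 else w2
    w1 = (signed >> 16) & MASK32
    # and w2, w2, #0xffff: keep only the lower halfword
    w2 &= 0xFFFF
    return w1, w2

# Hypothetical bytes: low halfword 0x1234, high halfword 0xFFFE (that is, -2).
memory = bytes([0x34, 0x12, 0xFE, 0xFF])
narrow_result = separate_loads(memory, 4)
merged_result = merged_load(memory, 4)
```

	The arithmetic shift stands in for the sign extension of ldursh, and the mask stands in for the zero extension of ldurh.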
*	this new test file was accidentally left out of r253573	Sanjay Patel	2015-11-19	1	-0/+56
	llvm-svn: 253574
*	[CGP] despeculate expensive cttz/ctlz intrinsics	Sanjay Patel	2015-11-19	1	-18/+37
	This is another step towards allowing SimplifyCFG to speculate harder, but then have CGP clean things up if the target doesn't like it. Previous patches in this series: http://reviews.llvm.org/D12882 http://reviews.llvm.org/D13297
	D13297 should catch most expensive ops, but speculation of cttz/ctlz requires special handling because of weirdness in the intrinsic definition for handling a zero input (that definition can probably be blamed on x86). For example, if we have the usual speculated-by-select expensive op pattern like this:
	  %tobool = icmp eq i64 %A, 0
	  %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true) ; is_zero_undef == true
	  %cond = select i1 %tobool, i64 64, i64 %0
	  ret i64 %cond
	There's an instcombine that will turn it into:
	  %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 false) ; is_zero_undef == false
	This CGP patch is looking for that case and despeculating it back into:
	  entry:
	    %tobool = icmp eq i64 %A, 0
	    br i1 %tobool, label %cond.end, label %cond.true
	  cond.true:
	    %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true) ; is_zero_undef == true
	    br label %cond.end
	  cond.end:
	    %cond = phi i64 [ %0, %cond.true ], [ 64, %entry ]
	    ret i64 %cond
	This unfortunately may lead to poorer codegen (see the changes in the existing x86 test), but if we increase speculation in SimplifyCFG (the next step in this patch series), then we should avoid those kinds of cases in the first place. The need for this patch was originally mentioned here: http://reviews.llvm.org/D7506 with follow-up here: http://reviews.llvm.org/D7554
	Differential Revision: http://reviews.llvm.org/D14630 llvm-svn: 253573
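	The three IR forms in this message compute the same value, which a short Python model can illustrate. This is a sketch under stated assumptions: undef is modeled as None, and the function names are invented for illustration rather than taken from LLVM.

```python
MASK64 = (1 << 64) - 1

def cttz64(a: int, is_zero_undef: bool):
    """Count trailing zeros of a 64-bit value; None models an undef result."""
    a &= MASK64
    if a == 0:
        return None if is_zero_undef else 64
    return (a & -a).bit_length() - 1   # index of the lowest set bit

def select_form(a: int):
    # The intrinsic is evaluated unconditionally, then a select picks the result.
    speculated = cttz64(a, True)
    return 64 if (a & MASK64) == 0 else speculated

def combined_form(a: int):
    # After instcombine: a single call with is_zero_undef == false.
    return cttz64(a, False)

def despeculated_form(a: int):
    # After the CGP patch: branch around the call when the input is zero.
    if (a & MASK64) == 0:
        return 64
    return cttz64(a, True)

samples = [0, 1, 2, 8, 12, 1 << 63]
results = [(select_form(a), combined_form(a), despeculated_form(a)) for a in samples]
```

	In the despeculated form the intrinsic is never reached with a zero input, which is the property a target with an expensive zero-defined cttz/ctlz wants.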
*	X86: More efficient legalization of wide integer compares	Hans Wennborg	2015-11-19	6	-73/+149
	In particular, this makes the code for 64-bit compares on 32-bit targets much more efficient. Example:
	  define i32 @test_slt(i64 %a, i64 %b) {
	  entry:
	    %cmp = icmp slt i64 %a, %b
	    br i1 %cmp, label %bb1, label %bb2
	  bb1:
	    ret i32 1
	  bb2:
	    ret i32 2
	  }
	Before this patch:
	  test_slt:
	    movl 4(%esp), %eax
	    movl 8(%esp), %ecx
	    cmpl 12(%esp), %eax
	    setae %al
	    cmpl 16(%esp), %ecx
	    setge %cl
	    je .LBB2_2
	    movb %cl, %al
	  .LBB2_2:
	    testb %al, %al
	    jne .LBB2_4
	    movl $1, %eax
	    retl
	  .LBB2_4:
	    movl $2, %eax
	    retl
	After this patch:
	  test_slt:
	    movl 4(%esp), %eax
	    movl 8(%esp), %ecx
	    cmpl 12(%esp), %eax
	    sbbl 16(%esp), %ecx
	    jge .LBB1_2
	    movl $1, %eax
	    retl
	  .LBB1_2:
	    movl $2, %eax
	    retl
	Differential Revision: http://reviews.llvm.org/D14496 llvm-svn: 253572
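	The cmp/sbb sequence in the "after" listing can be modeled in Python to see why a signed 64-bit less-than falls out of the flags. This is a sketch that assumes the usual x86 flag semantics (CF as the borrow from the low-half compare, then SF and OF from the high-half subtract with borrow); the function names are illustrative.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed32(x: int) -> int:
    return x - (1 << 32) if x & 0x80000000 else x

def slt64_cmp_sbb(a: int, b: int) -> bool:
    """Signed a < b on 64-bit values, computed from 32-bit halves."""
    a &= MASK64
    b &= MASK64
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32

    # cmpl b_lo, a_lo: CF is set when the unsigned low-half subtract borrows.
    carry = 1 if a_lo < b_lo else 0

    # sbbl b_hi, a_hi: a_hi - b_hi - CF, with signed overflow tracked in OF.
    exact = to_signed32(a_hi) - to_signed32(b_hi) - carry
    sign_flag = ((exact & MASK32) >> 31) & 1
    overflow_flag = 0 if -(1 << 31) <= exact < (1 << 31) else 1

    # jl is taken when SF != OF; the listing branches on jge, its inverse.
    return sign_flag != overflow_flag

edge_values = [0, 1, -1, 5, 2**31, -2**31, 2**32, -2**32, 2**63 - 1, -2**63]
mismatches = [(a, b) for a in edge_values for b in edge_values
              if slt64_cmp_sbb(a, b) != (a < b)]
```

	The borrow from the low halves carries into the high-half subtraction, so one flag test replaces the two setcc instructions and the extra branch in the "before" listing.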
*	SamplePGO - Sort samples by source location when emitting as text.	Diego Novillo	2015-11-19	2	-2/+2
	When dumping function samples or writing them out as text format, it helps if the samples are emitted sorted by source location. The sorting of the maps is a bit slow, so we only do it on demand. llvm-svn: 253568
*	[mips] Add tests for ROL and ROR macros expansion	Zoran Jovanovic	2015-11-19	3	-0/+356
	Author: obucina llvm-svn: 253567
*	AVX-512: Fixed COPY_TO_REGCLASS for mask registers	Elena Demikhovsky	2015-11-19	3	-13/+13
	Copying one mask register to another under BW should be done with the kmovq instruction, otherwise we can lose some bits. Copying 8 bits under DQ may be done with kmovb. Differential Revision: http://reviews.llvm.org/D14812 llvm-svn: 253563
*	Removing specific target from the generic test	Artyom Skrobov	2015-11-19	1	-2/+2
	llvm-svn: 253562
*	[X86][AVX] Fix lowering of X86ISD::VZEXT_MOVL for 128-bit -> 256-bit extension	Simon Pilgrim	2015-11-19	1	-2/+69
	The lowering patterns for X86ISD::VZEXT_MOVL for 128-bit to 256-bit vectors were just copying the lower xmm instead of actually masking off the first scalar using a blend. Fix for PR25320. Differential Revision: http://reviews.llvm.org/D14151 llvm-svn: 253561
*	Alternative to long nops for X86 CPUs, by Andrey Turetsky	Alexey Bataev	2015-11-19	1	-7/+1
	Make X86AsmBackend generate smarter nops instead of a bunch of 0x90 for code alignment for CPUs which don't support long nop instructions. Differential Revision: http://reviews.llvm.org/D14178 llvm-svn: 253557
*	[FunctionAttrs] Provide a mechanism for adding function attributes from the ↵	James Molloy	2015-11-19	1	-0/+12
	command line This provides a way to force a function to have certain attributes from the command line. This can be useful when debugging or doing workload exploration, where manually editing IR is tedious or not possible (due to build systems etc). The syntax is: -force-attribute=function_name:attribute_name. All function attributes are parsed except alignstack as it requires an argument. llvm-svn: 253550
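	To make the function_name:attribute_name syntax concrete, here is a small Python sketch of splitting one option value. It is not the LLVM implementation; splitting at the first colon and rejecting empty parts are assumptions made for illustration, and parse_force_attribute is a hypothetical helper name.

```python
def parse_force_attribute(value: str):
    """Split 'function_name:attribute_name' into its two parts.

    Assumption: the value is split at the first ':' and both parts
    must be non-empty.
    """
    function_name, separator, attribute_name = value.partition(":")
    if not separator or not function_name or not attribute_name:
        raise ValueError(f"expected function_name:attribute_name, got {value!r}")
    return function_name, attribute_name

# Example value as it might follow -force-attribute= on the command line.
parsed = parse_force_attribute("foo:noinline")
```

	A value with no colon, such as "foo", raises ValueError in this sketch; how the real option reports malformed input is not stated in the commit message.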
*	AVX512: Implemented encoding, intrinsics and DAG lowering for VMOVDDUP ↵	Igor Breger	2015-11-19	7	-15/+247
	instructions. Differential Revision: http://reviews.llvm.org/D14702 llvm-svn: 253548
*	AVX512: Implemented encoding for the vmovss.s and vmovsd.s instructions.	Igor Breger	2015-11-19	1	-0/+96
	Differential Revision: http://reviews.llvm.org/D14771 llvm-svn: 253547
*	AVX512: Implemented encoding for the following instructions.	Igor Breger	2015-11-19	4	-0/+1440
	vmovapd.s, vmovaps.s, vmovdqa32.s, vmovdqa64.s, vmovdqu16.s, vmovdqu32.s, vmovdqu64.s, vmovdqu8.s, vmovupd.s, vmovups.s Differential Revision: http://reviews.llvm.org/D14768 llvm-svn: 253546
*	Pointers in Masked Load, Store, Gather, Scatter intrinsics	Elena Demikhovsky	2015-11-19	3	-0/+177
	The masked intrinsics support all integer and floating point data types. I added the pointer type to this list. Added tests for CodeGen and for Loop Vectorizer. Updated the Language Reference. Differential Revision: http://reviews.llvm.org/D14150 llvm-svn: 253544
*	Revert "Change memcpy/memset/memmove to have dest and source alignments."	Pete Cooper	2015-11-19	266	-1499/+1474
	This reverts commit r253511. This likely broke the bots in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202 http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787 llvm-svn: 253543