bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[X86] PMOVX tests - remove unnecessary mcpu arguments and regenerate	Simon Pilgrim	2015-10-25	1	-38/+110
\| \| \| \|	llvm-svn: 251230
*	[X86] Stack folding tests - just use mtriple - no need for mcpu in these tests	Simon Pilgrim	2015-10-25	7	-7/+7
\| \| \| \|	llvm-svn: 251229
*	[X86] Use correct calling convention for MCU psABI libcalls	Michael Kuperstein	2015-10-25	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \|	When using the MCU psABI, compiler-generated library calls should pass some parameters in-register. However, since inreg marking for x86 is currently done by the front end, it will not be applied to backend-generated calls. This is a workaround for PR3997, which describes a similar issue for -mregparm. Differential Revision: http://reviews.llvm.org/D13977 llvm-svn: 251223
*	[X86][SSE] Use lowerVectorShuffleWithUNPCK instead of custom matches.	Simon Pilgrim	2015-10-24	4	-24/+24
\| \| \| \| \| \|	Most 128-bit and 256-bit shuffles were manually matching UNPCK patterns - use lowerVectorShuffleWithUNPCK to be more thorough. llvm-svn: 251211
*	Removed old FIXME - we do generate movddup for SSE3 and higher	Simon Pilgrim	2015-10-24	1	-2/+1
\| \| \| \|	llvm-svn: 251205
*	[DAGCombiner] Generalize masking of constant rotates.	Simon Pilgrim	2015-10-24	2	-58/+32
\| \| \| \| \| \| \| \|	We don't need a mask of a rotation result to be a constant splat - any constant scalar/vector can be usefully folded. Followup to D13851. llvm-svn: 251197
*	X86ISelLowering: Support tail calls to/from callee pop functions	Hans Wennborg	2015-10-24	1	-10/+159
\| \| \| \| \| \| \| \| \|	This enables tail calls with thiscall, stdcall, vectorcall and fastcall functions. Differential Revision: http://reviews.llvm.org/D13999 llvm-svn: 251190
*	[X86][XOP] Add support for lowering vector rotations	Simon Pilgrim	2015-10-24	2	-302/+126
\| \| \| \| \| \| \| \| \| \|	This patch adds support for lowering to the XOP VPROT / VPROTI vector bit rotation instructions. This has required changes to the DAGCombiner rotation pattern matching to support vector types - so far I've only changed it to support splat vectors, but generalising this further is feasible in the future. Differential Revision: http://reviews.llvm.org/D13851 llvm-svn: 251188
*	[X86] Clean up the tail call eligibility logic	Reid Kleckner	2015-10-23	1	-0/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The logic here isn't straightforward because of our support for TargetOptions::GuaranteedTailCallOpt. Also fix a bug where we were allowing tail calls to cdecl functions from fastcall and vectorcall functions. We were special casing thiscall and stdcall callers rather than checking for any convention that requires clearing stack arguments before returning. Reviewers: hans Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14024 llvm-svn: 251137
*	[ARM] Renaming +t2dsp feature into +dsp, as discussed on llvm-dev	Artyom Skrobov	2015-10-23	3	-4/+4
\| \| \| \|	llvm-svn: 251125
*	[ARM CodeGen] @llvm.debugtrap call may be removed when restoring callee saved registers	Oleg Ranevskyy	2015-10-23	1	-0/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When ARMFrameLowering::emitPopInst generates a "pop" instruction to restore the callee saved registers, it checks if the LR register is among them. If so, the function may decide to remove the basic block's terminator and replace it with a "pop" to the PC register instead of LR. This leads to a problem when the block's terminator is preceded by a "llvm.debugtrap" call. The MI iterator points to the trap in such a case, which is also a terminator. If the function decides to restore LR to PC, it erroneously removes the trap. Reviewers: asl, rengolin Subscribers: aemerson, jfb, rengolin, dschuff, llvm-commits Differential Revision: http://reviews.llvm.org/D13672 llvm-svn: 251123
*	[CodeGen] Mark setjmp/catchret MBBs address-taken	Joseph Tremoulet	2015-10-23	4	-10/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This ensures that BranchFolding (and similar) won't remove these blocks. Also allow AsmPrinter::EmitBasicBlockStart to process MBBs which are address-taken but do not have BBs that are address-taken, since otherwise its call to getAddrLabelSymbolTableToEmit would fail an assertion on such blocks. I audited the other callers of getAddrLabelSymbolTableToEmit (and getAddrLabelSymbol); they all have BBs known to be address-taken except for the call through getAddrLabelSymbol from WinException::create32bitRef; that call is actually now unreachable, so I've removed it and updated the signature of create32bitRef. This fixes PR25168. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: pgavlin, llvm-commits Differential Revision: http://reviews.llvm.org/D13774 llvm-svn: 251113
*	Revert "[AArch64]Merge halfword loads into a 32-bit load"	James Molloy	2015-10-23	1	-49/+0
\| \| \| \| \| \|	This reverts commit r250719. This introduced a codegen fault in SPEC2000.gcc, when compiled for Cortex-A53. llvm-svn: 251108
*	[X86] - Catch extra combine opportunities for redundant imuls.	Zia Ansari	2015-10-22	1	-0/+163
\| \| \| \| \| \| \| \| \| \| \| \|	When we fold "mul ((add x, c1), c2)" -> "add ((mul x, c2), c1*c2)", we bail if (add x, c1) has multiple users, which would result in an extra add instruction. In such cases, this patch adds a check to see if we can eliminate a multiply instruction in exchange for the extra add. I also added the capability of doing the existing optimization with non-splatted vectors (splatted also works). Differential Revision: http://reviews.llvm.org/D13740 llvm-svn: 251028
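The fold described in this commit rests on the distributive law, which also holds under fixed-width wraparound arithmetic. The following is only an illustrative sketch in Python, not the DAG combine itself; the function names and the 32-bit width are assumptions made for the example.

```python
MASK32 = 0xFFFFFFFF  # assumption: model i32 wraparound arithmetic


def mul_of_add(x, c1, c2):
    """Original form: mul (add x, c1), c2."""
    return (((x + c1) & MASK32) * c2) & MASK32


def add_of_mul(x, c1, c2):
    """Folded form: add (mul x, c2), c1*c2, where c1*c2 is a compile-time constant."""
    folded_constant = (c1 * c2) & MASK32
    return (((x * c2) & MASK32) + folded_constant) & MASK32


if __name__ == "__main__":
    for x in (0, 7, 0x7FFFFFFF, 0xFFFFFFFF):
        assert mul_of_add(x, 3, 5) == add_of_mul(x, 3, 5)
    print("both forms agree")
```

When (add x, c1) has other users, the folded form keeps that add alive and introduces a second one, which is the extra add the message weighs against a multiply that can be eliminated.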
*	Fix incorrect target triple in fp16-promote.ll	Pirama Arumuga Nainar	2015-10-22	1	-96/+99
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Hyphens were missing from the triple, causing it to be parsed incorrectly. This patch updates the triple and makes necessary changes to the expected output. Patch is from Vinicius Tinti. Reviewers: ab, tinti Subscribers: srhines, llvm-commits Differential Revision: http://reviews.llvm.org/D13792 llvm-svn: 251020
*	[mips][mips16] Fix typo in FileCheck directive.	Daniel Sanders	2015-10-22	1	-1/+1
\| \| \| \|	llvm-svn: 251019
*	[X86][AVX512] extend vcvtph2ps to support xmm/ymm and sae versions	Asaf Badouh	2015-10-22	2	-0/+76
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D13945 llvm-svn: 251018
*	AVX-512: Fixed a bug in select_cc for i1 type	Elena Demikhovsky	2015-10-22	1	-0/+25
\| \| \| \| \| \| \| \| \| \| \|	Fixed failure: LLVM ERROR: Cannot select: t33: i1 = select_cc t25, Constant:i32<0>, t45, t42, seteq:ch. Added a test. Differential Revision: http://reviews.llvm.org/D13943 llvm-svn: 250996
*	WebAssembly: fix more syntax	JF Bastien	2015-10-22	3	-9/+9
\| \| \| \| \| \| \|	br_if shouldn't start with a dot. div and rem went from prefix u/s to suffix. llvm-svn: 250972
*	Add missing load/store flags to thumb2 instructions.	Pete Cooper	2015-10-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These were the cause of a verifier error when building 7zip with -verify-machineinstrs. Running 'make check' with the verifier triggered the same error on the test here, so I've updated the test to run the verifier on one of its runs instead of adding a new one. While looking at this code, there was a stale comment that these instructions were only used for disassembly. This probably used to be the case, but they are now used in the 'ARM load / store optimization pass' too. This reapplies r242300 which was reverted in r242428 due to bot failures. Ultimately those failures were spurious and completely unrelated to this commit. I reverted this at the time because it was thought to be at fault. llvm-svn: 250969
*	AMDGPU: Fix verifier error in SIFoldOperands	Matt Arsenault	2015-10-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There may be other use operands that also need their kill flags cleared. This happens in a few tests when SIFoldOperands is moved after PeepholeOptimizer. PeepholeOptimizer rewrites cases that look like: %vreg0 = ... %vreg1 = COPY %vreg0 use %vreg1<kill> %vreg2 = COPY %vreg0 use %vreg2<kill> to use the earlier source to %vreg0 = ... use %vreg0 use %vreg0 Currently SIFoldOperands sees the copied registers, so there is only one use. So far I haven't managed to come up with a test that currently has multiple uses of a foldable VGPR -> VGPR copy. llvm-svn: 250960
*	Drop assert that a call with struct return goes to a function with sret attribute.	Joerg Sonnenberger	2015-10-21	1	-0/+9
\| \| \| \| \| \| \|	Clang incorrectly misses it on __muldc3 and friends and the type system doesn't include it properly either. llvm-svn: 250938
*	[WinEH] Add test for llvm.va.start in catchpad	Reid Kleckner	2015-10-21	1	-0/+103
\| \| \| \| \| \| \| \|	It already works, but we should have a test for it. This used to be PR23094 in the old model. llvm-svn: 250936
*	[x86] add test case that shows holes in LEA isel	Sanjay Patel	2015-10-21	1	-0/+125
\| \| \| \|	llvm-svn: 250910
*	[mips][mips16] Re-work the inline assembly stubs to work with IAS. NFC.	Daniel Sanders	2015-10-21	3	-137/+137
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Previously, we were inserting an InlineAsm statement for each line of the inline assembly. This works for GAS but it triggers prologue/epilogue emission when IAS is in use. This caused ".set noreorder; .cpload $25" to be emitted as ".set push; .set reorder; .set noreorder; .set pop; .set push; .set reorder; .cpload $25; .set pop", which led to assembler errors and caused the test to fail. The whitespace-after-comma changes included in this patch are necessary to match the output when IAS is in use. Reviewers: vkalintiris Subscribers: rkotler, llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D13653 llvm-svn: 250895
*	Masked Load/Store optimization for scalar code	Elena Demikhovsky	2015-10-21	1	-0/+37
\| \| \| \| \| \| \| \| \|	When we have to convert masked.load and masked.store to scalar code, we generate a chain of conditional basic blocks. I added an optimization for a constant mask vector. Differential Revision: http://reviews.llvm.org/D13855 llvm-svn: 250893
*	[mips][msa] Remove copy_u.d and move copy_u.w to MSA64.	Daniel Sanders	2015-10-21	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The forwards compatibility strategy employed by MIPS is to consider registers to be infinitely sign-extended. Then on ISAs with a wider register, the results of existing instructions are sign-extended to register width and zero-extended counterparts are added. copy_u.w on MSA32 and copy_u.w on MSA64 violate this strategy and we have therefore corrected the MSA specs to fix this. We still keep track of sign/zero-extension during legalization but we now match copy_s.[wd] where required. No change required to clang since __builtin_msa_copy_u_[wd] will map to copy_s.[wd] where appropriate for the target. Reviewers: vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13472 llvm-svn: 250887
*	Let MachineVerifier be aware of mem-to-mem instructions.	Jonas Paulsson	2015-10-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A mem-to-mem instruction (one that both loads and stores) which stores to an FI cannot pass the verifier, since the verifier thinks it is loading from the FI. For the mem-to-mem instruction, do a looser check in visitMachineOperand() and only check liveness at the reg-slot while analyzing a frame index operand. Needed to make CodeGen/SystemZ/xor-01.ll pass with -verify-machineinstrs, which now runs with this flag. Reviewed by Evan Cheng and Quentin Colombet. llvm-svn: 250885
*	Tail duplication can mix incompatible registers in phi nodes	Krzysztof Parzyszek	2015-10-21	1	-0/+28
\| \| \| \| \| \| \| \| \|	Do not tail duplicate blocks where the successor has a phi node, and the corresponding value in that phi node uses a subregister. http://reviews.llvm.org/D13922 llvm-svn: 250877
*	WebAssembly: support imports	JF Bastien	2015-10-21	1	-0/+21
\| \| \| \| \| \|	C/C++ code can declare an extern function, which will show up as an import in WebAssembly's output. It's expected that the linker will resolve these, and mark unresolved imports as call_import (I have a patch which does this in wasmate). llvm-svn: 250875
*	[Hexagon] Bit-based instruction simplification	Krzysztof Parzyszek	2015-10-20	6	-4/+137
\| \| \| \| \| \| \|	Analyze bit patterns of operands and values of instructions to perform various simplifications, dead/redundant code elimination, etc. llvm-svn: 250868
*	[X86][SSE] Add 256-bit vector bit rotation tests.	Simon Pilgrim	2015-10-20	1	-0/+1200
\| \| \| \|	llvm-svn: 250853
*	[SystemZ] Comment fix in test/CodeGen/SystemZ/fp-cmp-05.ll	Jonas Paulsson	2015-10-20	1	-3/+1
\| \| \| \|	llvm-svn: 250828
*	Adding support for TargetLoweringBase::LibCall	Artyom Skrobov	2015-10-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: TargetLoweringBase::Expand is defined as "Try to expand this to other ops, otherwise use a libcall." For ISD::UDIV and ISD::SDIV, the choice between the two possibilities was defined in a rather convoluted way: - if DIVREM is legal, expand to DIVREM - if DIVREM has a custom lowering, expand to DIVREM - if DIVREM libcall is defined and a remainder from the same division is computed elsewhere, expand to a DIVREM libcall - else, expand to a DIV libcall This had the undesirable effect that if both DIV and DIVREM are implemented as libcalls, then ISD::UDIV and ISD::SDIV are expanded to the heavier DIVREM libcall, even when the remainder isn't used. The new code adds a new LegalizeAction, TargetLoweringBase::LibCall, so that backends can directly control whether they prefer an expansion or a conversion to a libcall. This makes the generic lowering code even more generic, allowing its reuse in a wider range of target-specific configurations. The useful effect is that ARM backend will now generate a call to __aeabi_{i,u}div rather than __aeabi_{i,u}divmod in cases where it doesn't need the remainder. There's no functional change outside the ARM backend. Reviewers: t.p.northover, rengolin Subscribers: t.p.northover, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D13862 llvm-svn: 250826
*	AVX512: Implemented encoding and intrinsics for VPBROADCASTB/W/D/Q instructions.	Igor Breger	2015-10-20	4	-10/+262
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D13884 llvm-svn: 250819
*	[x86] Fix AVX maskload/store intrinsic prototypes.	Andrea Di Biagio	2015-10-20	3	-28/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The mask value type for maskload/maskstore GCC builtins is never a vector of packed floats/doubles. This patch fixes the following issues: 1. The mask argument for builtin_ia32_maskloadpd and builtin_ia32_maskstorepd should be of type llvm_v2i64_ty and not llvm_v2f64_ty. 2. The mask argument for builtin_ia32_maskloadpd256 and builtin_ia32_maskstorepd256 should be of type llvm_v4i64_ty and not llvm_v4f64_ty. 3. The mask argument for builtin_ia32_maskloadps and builtin_ia32_maskstoreps should be of type llvm_v4i32_ty and not llvm_v4f32_ty. 4. The mask argument for builtin_ia32_maskloadps256 and builtin_ia32_maskstoreps256 should be of type llvm_v8i32_ty and not llvm_v8f32_ty. Differential Revision: http://reviews.llvm.org/D13776 llvm-svn: 250817
*	AMDGPU: Stop reserving v[254:255]	Matt Arsenault	2015-10-20	1	-52/+52
\| \| \| \| \| \| \| \| \| \| \|	This wasn't doing anything useful. They weren't explicitly used anywhere, and the RegScavenger ignores reserved registers. This for some reason caused a random scheduling change in the test. Getting the check lines to pass is too frustrating, and there's probably not too much value in checking the vector case's operands N times. llvm-svn: 250794
*	WebAssembly: fix call/return syntax.	JF Bastien	2015-10-20	2	-3/+13
\| \| \| \| \| \|	They are now typeless, unlike other operations. llvm-svn: 250793
*	WebAssembly: fix syntax for br_if.	JF Bastien	2015-10-20	1	-13/+13
\| \| \| \|	llvm-svn: 250777
*	Enhance loop rotation with existence of profile data in MachineBlockPlacement pass.	Cong Hou	2015-10-19	2	-0/+202
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, in the MachineBlockPlacement pass the loop is rotated so that the best exit is the last BB in the loop chain, to maximize the fall-through from the loop to outside. With profile data, we can determine the cost in terms of missed fall-through opportunities when rotating a loop chain and select the best rotation. Basically, there are three kinds of cost to consider for each rotation: 1. The possibly missed fall-through edge (if it exists) from a BB outside the loop to the loop header. 2. The possibly missed fall-through edges (if they exist) from the loop exits to BBs outside the loop. 3. The missed fall-through edge (if it exists) from the last BB to the first BB in the loop chain. Therefore, the cost for a given rotation is the sum of the costs listed above. We select the best rotation with the smallest cost. This is only for PGO mode when we have more precise edge frequencies. Differential revision: http://reviews.llvm.org/D10717 llvm-svn: 250754
*	[AArch64]Merge halfword loads into a 32-bit load	Jun Bum Lim	2015-10-19	1	-0/+49
\| \| \| \| \| \| \| \| \| \| \| \| \|	Convert two halfword loads into a single 32-bit word load with bitfield extract instructions. For example: "ldrh w0, [x2]; ldrh w1, [x2, #2]" becomes "ldr w0, [x2]; ubfx w1, w0, #16, #16; and w0, w0, #ffff". llvm-svn: 250719
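The equivalence behind this transformation can be checked with a small model. The sketch below is illustrative only and assumes a little-endian memory layout; the helper names are hypothetical, and the comments map each step to the instructions quoted in the message.

```python
import struct


def two_halfword_loads(mem, addr):
    """Model of: ldrh w0, [x2] ; ldrh w1, [x2, #2] (zero-extending loads)."""
    (w0,) = struct.unpack_from("<H", mem, addr)
    (w1,) = struct.unpack_from("<H", mem, addr + 2)
    return w0, w1


def merged_word_load(mem, addr):
    """Model of the merged sequence using one 32-bit load."""
    (word,) = struct.unpack_from("<I", mem, addr)  # ldr  w0, [x2]
    w1 = (word >> 16) & 0xFFFF                     # ubfx w1, w0, #16, #16
    w0 = word & 0xFFFF                             # and  w0, w0, #0xffff
    return w0, w1


if __name__ == "__main__":
    memory = bytes([0x34, 0x12, 0x78, 0x56])
    assert two_halfword_loads(memory, 0) == merged_word_load(memory, 0)
    print([hex(v) for v in merged_word_load(memory, 0)])
```

The model says nothing about alignment, volatility, or scheduling, so it does not explain the later revert of this change; it only shows that the two sequences compute the same register values for ordinary memory.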
*	[Hexagon] Delay emission of CFI instructions	Krzysztof Parzyszek	2015-10-19	3	-6/+70
\| \| \| \| \| \| \|	Emit the CFI instructions after all code transformations have been done. This will avoid any interference between CFI instructions and packetization. llvm-svn: 250714
*	Fix mapping of @llvm.arm.ssat/usat intrinsics to ssat/usat instructions	Asiri Rathnayake	2015-10-19	5	-0/+100
\| \| \| \| \| \| \| \| \| \| \| \| \|	The mapping of these two intrinsics in ARMInstrInfo.td had a small omission which led to their operands not being validated/transformed before being lowered into usat and ssat instructions. This can cause incorrect instructions to be emitted. I've also added tests for the remaining two saturating arithmetic intrinsics @llvm.arm.qadd and @llvm.arm.qsub as they are missing codegen tests. llvm-svn: 250697
*	[X86][SSE] Add vector bit rotation tests.	Simon Pilgrim	2015-10-18	1	-0/+1684
\| \| \| \|	llvm-svn: 250656
*	[X86][AVX512DQ] add scalar fpclass	Asaf Badouh	2015-10-18	1	-0/+34
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D13769 llvm-svn: 250650
*	AVX512: Lowering i8/i16 vector CTLZ using the dword LZCNT vector instruction	Igor Breger	2015-10-18	3	-1376/+515
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D13632 llvm-svn: 250649
*	[X86][XOP] Add VPROT rotate by immediate intrinsics tests	Simon Pilgrim	2015-10-17	1	-0/+28
\| \| \| \|	llvm-svn: 250618
*	[X86][FastISel] Teach how to select SSE4A nontemporal stores.	Simon Pilgrim	2015-10-17	1	-12/+53
\| \| \| \| \| \| \| \| \| \|	Add FastISel support for SSE4A scalar float / double non-temporal stores Follow up to D13698 Differential Revision: http://reviews.llvm.org/D13773 llvm-svn: 250610
*	[Hexagon] Reverting test file change.	Colin LeMahieu	2015-10-17	1	-1/+2
\| \| \| \|	llvm-svn: 250601
*	[Hexagon] Adding skeleton of HVX extension instructions.	Colin LeMahieu	2015-10-17	1	-2/+1
\| \| \| \|	llvm-svn: 250600