bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[SystemZ] Support CL(G)T instructions	Ulrich Weigand	2016-11-11	6	-3/+58
\| \| \| \| \| \| \| \|	This adds support for the compare logical and trap (memory) instructions that were added as part of the miscellaneous instruction extensions feature with zEC12. llvm-svn: 286587
*	[SystemZ] Support load-and-zero-rightmost-byte facility	Ulrich Weigand	2016-11-11	6	-3/+49
\| \| \| \| \| \| \| \| \| \|	This adds support for the LZRF/LZRG/LLZRGF instructions that were added on z13, and uses them for code generation where appropriate. SystemZDAGToDAGISel::tryRISBGZero is updated again to prefer LLZRGF over RISBG where both would be possible. llvm-svn: 286586
*	[SystemZ] Use LLGT(R) instructions	Ulrich Weigand	2016-11-11	5	-46/+50
\| \| \| \| \| \| \| \| \| \| \| \| \|	This adds support for the 31-to-64-bit zero extension instructions LLGT and LLGTR and uses them for code generation where appropriate. Since this operation can also be performed via RISBG, we have to update SystemZDAGToDAGISel::tryRISBGZero so that we prefer LLGT over RISBG in case both are possible. The patch includes some simplification to the tryRISBGZero code; this is not intended to cause any (further) functional change in codegen. llvm-svn: 286585
*	[ARM] Add plumbing for GlobalISel	Diana Picus	2016-11-11	13	-4/+407
\| \| \| \| \| \|	Add GlobalISel skeleton, up to the point where we can select a ret void. llvm-svn: 286573
*	AMDGPU: Attempt to fix build failure on x86-64 selfhost build	Yaxun Liu	2016-11-11	1	-2/+0
\| \| \| \| \| \|	Remove redundant include file. llvm-svn: 286552
*	Add a blank line for a test commit.	Sean Fertile	2016-11-11	1	-0/+1
\| \| \| \|	llvm-svn: 286550
*	Revert "[AMDGPU] Allow hoisting of comparisons out of a loop and eliminate ↵	Stanislav Mekhanoshin	2016-11-11	2	-26/+5
\| \| \| \| \| \| \| \|	condition copies" This reverts commit r286171, it breaks piglit test fs-discard-exit-2 llvm-svn: 286530
*	Fix requirements.	Joerg Sonnenberger	2016-11-10	1	-1/+1
\| \| \| \|	llvm-svn: 286527
*	Timer: Remove group-less NamedRegionTimer constructor.	Matthias Braun	2016-11-10	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The NamedRegionTimer initializer without a group name puts the Timer into the "Misc" group and is (nearly) unused. Remove it. The only user of this constructor appears to be the HexagonGenInsert pass, which creates a counter without group to count the complete execution time of that pass, however since every pass gets a counter by the PassManager anyway this should be unnecessary. Also removed the pointless TimerGroup there. Differential Revision: https://reviews.llvm.org/D25582 llvm-svn: 286524
*	[DAG Combiner] Fix the native computation of the Newton series for reciprocals	Evandro Menezes	2016-11-10	8	-26/+31
\| \| \| \| \| \| \| \| \| \| \| \|	The generic infrastructure to compute the Newton series for reciprocal and reciprocal square root was conceived to allow a target to compute the series itself. However, the original code did not properly consider this condition if returned by a target. This patch addresses the issues to allow a target to compute the series on its own. Differential revision: https://reviews.llvm.org/D22975 llvm-svn: 286523
*	AMDGPU: Emit runtime metadata as a note element in .note section	Yaxun Liu	2016-11-10	6	-348/+450
\| \| \| \| \| \| \| \| \| \| \| \|	Currently runtime metadata is emitted as an ELF section with name .AMDGPU.runtime_metadata. However there is a standard way to convey vendor specific information about how to run an ELF binary, which is called a vendor-specific note element (http://www.netbsd.org/docs/kernel/elf-notes.html). This patch lets the AMDGPU backend emit runtime metadata as a note element in the .note section. Differential Revision: https://reviews.llvm.org/D25781 llvm-svn: 286502
*	[Target] Rename X86/ARM Assembly printer to reflect reality.	Davide Italiano	2016-11-10	2	-2/+2
\| \| \| \| \| \| \|	This shows up a lot when profiling LTO testcases with -time-passes, so better have a non-confusing name. llvm-svn: 286488
*	AMDGPU: Add VI i16 support	Tom Stellard	2016-11-10	15	-78/+409
\| \| \| \| \| \| \| \|	Patch By: Wei Ding Differential Revision: https://reviews.llvm.org/D18049 llvm-svn: 286464
*	[ARM] Thumb2 LDR (literal) should accept PC as the destination	Oliver Stannard	2016-11-10	1	-1/+1
\| \| \| \| \| \| \| \| \|	The version of this instruction with the .w suffix already correctly accepts this, but the alias without the .w did not. Differential Revision: https://reviews.llvm.org/D26499 llvm-svn: 286446
*	[AVX-512] Allow legacy cvtpd2dq intrinsics to select EVEX encoded ↵	Craig Topper	2016-11-10	2	-8/+12
\| \| \| \| \| \|	instruction when available. llvm-svn: 286435
*	[AVX-512][X86] Convert avx_cvtt_ps2dq_256 and sse2_cvttps2dq intrinsics to ↵	Craig Topper	2016-11-10	2	-54/+28
\| \| \| \| \| \| \| \|	ISD::FP_TO_SINT in the intrinsics table and delete patterns. While nearby also move CVTDQ2PS patterns into their instructions. This allows these intrinsics to also use EVEX instructions. llvm-svn: 286434
*	[X86] Convert int_x86_avx_cvtt_pd2dq_256 to fp_to_sint using the intrinsics ↵	Craig Topper	2016-11-10	2	-7/+5
\| \| \| \| \| \|	table. Removes extra patterns and allows legacy intrinsic to select EVEX encoded instructions when available. llvm-svn: 286433
*	[X86] Move some custom patterns into the currently empty pattern of their ↵	Craig Topper	2016-11-10	1	-46/+37
\| \| \| \| \| \|	corresponding instructions. NFC llvm-svn: 286432
*	[X86] Remove some patterns still referencing int_x86_sse2_cvttpd2dq that ↵	Craig Topper	2016-11-10	1	-9/+5
\| \| \| \| \| \|	should have been removed in r286344. NFC llvm-svn: 286431
*	Re-apply r286384, "X86: Introduce the "relocImm" ComplexPattern, which ↵	Peter Collingbourne	2016-11-09	4	-52/+35
\| \| \| \| \| \| \| \| \|	represents a relocatable immediate.", with a fix for 32-bit x86. Teach X86InstrInfo::analyzeCompare() not to crash on CMP and SUB instructions that take a global address operand. llvm-svn: 286420
*	GlobalISel: translate invoke and landingpad instructions	Tim Northover	2016-11-09	1	-1/+1
\| \| \| \| \| \| \|	Pretty bare-bones support for exception handling (no weird MSVC stuff, no SjLj etc), but it should get things going. llvm-svn: 286407
*	Revert r286384, "X86: Introduce the "relocImm" ComplexPattern, which ↵	Peter Collingbourne	2016-11-09	3	-31/+52
\| \| \| \| \| \| \| \| \|	represents a relocatable immediate." Suspected to be the cause of a sanitizer-windows bot failure: Assertion failed: isImm() && "Wrong MachineOperand accessor", file C:\b\slave\sanitizer-windows\llvm\include\llvm/CodeGen/MachineOperand.h, line 420 llvm-svn: 286385
*	X86: Introduce the "relocImm" ComplexPattern, which represents a relocatable ↵	Peter Collingbourne	2016-11-09	3	-52/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	immediate. A relocatable immediate is either an immediate operand or an operand that can be relocated by the linker to an immediate, such as a regular symbol in non-PIC code. Start using relocImm for 32-bit and 64-bit MOV instructions, and for operands of type "imm32_su". Remove a number of now-redundant patterns. Differential Revision: https://reviews.llvm.org/D25812 llvm-svn: 286384
*	[Hexagon] Silence "sometimes uninitialized" warning in HexagonCopyToCombine	Krzysztof Parzyszek	2016-11-09	1	-1/+3
\| \| \| \|	llvm-svn: 286383
*	[Hexagon] Separate Hexagon subreg indices for different register classes	Krzysztof Parzyszek	2016-11-09	23	-204/+255
\| \| \| \| \| \| \| \| \| \| \|	For pairs of 32-bit registers: isub_lo, isub_hi. For pairs of vector registers: vsub_lo, vsub_hi. Add generic subreg indices: ps_sub_lo, ps_sub_hi, and a function HexagonRegisterInfo::getHexagonSubRegIndex(RegClass, GenericSubreg) that returns the appropriate subreg index for RegClass. llvm-svn: 286377
*	[Hexagon] Eliminate Insert4 pseudo-instruction, use combines instead	Krzysztof Parzyszek	2016-11-09	3	-48/+2
\| \| \| \|	llvm-svn: 286368
*	[SystemZ] A few fixes in scheduler files.	Jonas Paulsson	2016-11-09	3	-11/+11
\| \| \| \| \|	Review: U Weigand llvm-svn: 286362
*	[MachineScheduler] Comments fixing.	Jonas Paulsson	2016-11-09	1	-1/+1
\| \| \| \| \| \| \| \|	The name/comment of the third argument to the ScheduleDAGMI constructor is RemoveKillFlags and not IsPostRA. Only the comments are changed. Review: A Trick llvm-svn: 286350
*	[AVX-512] Add lowering to cvttpd2udq/cvttps2udq for fptoui v2f64/2f32 to 2i32	Craig Topper	2016-11-09	5	-8/+26
\| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for fptoui to 2i32 from both 2f64 and 2f32, building on Simon's change for the signed version in r284459 and using AVX-512 instructions. If we don't have VLX support we need to use a 512-bit operation for v2f64->v2i32 and extract the result. It also recognises that cvttpd2udq zeroes the upper 64-bits of the xmm result. Differential Revision: https://reviews.llvm.org/D26331 llvm-svn: 286345
*	[X86] Lower AVX512 and SSE intrinsics for CVTTPD2DQ to X86ISD::CVTTPD2DQ.	Craig Topper	2016-11-09	3	-30/+34
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: This allows the SSE intrinsic to use the EVEX instruction when available. It also fixes EVEX to not use a weird (v4i32 (fp_to_sint v2f64)) node and it merges some isel patterns. This also fixes some cases that weren't combining vzmovl with cvttpd2dq to remove extra moves. Reviewers: delena, zvi, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26330 llvm-svn: 286344
*	[AVX-512] Use alignedstore256 in patterns that look for stores of the lower ↵	Craig Topper	2016-11-09	1	-10/+10
\| \| \| \| \| \| \| \|	256-bits of a 512-bit vector to use a 256-bit aligned store. Previously we were only checking for 16 byte alignment instead of 32 byte alignment. Fixes PR30947. llvm-svn: 286342
*	[AVX-512] Make VBMI instruction set enabling imply that the BWI instruction ↵	Craig Topper	2016-11-09	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	set is also enabled. Summary: This is needed to make the v64i8 and v32i16 types legal for the 512-bit VBMI instructions. Fixes PR30912. Reviewers: delena, zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26322 llvm-svn: 286339
*	AArch64DeadRegisterDefinitionsPass: Fix Changed flag	Matthias Braun	2016-11-08	1	-1/+0
\| \| \| \| \| \|	Fix a bug in the calculation of the changed flag introduced in r285488. llvm-svn: 286293
*	[SystemZ] Add missing FP extension instructions	Ulrich Weigand	2016-11-08	4	-18/+42
\| \| \| \| \| \| \| \|	This completes assembler / disassembler support for all BFP instructions provided by the floating-point extensions facility. The instructions added here are not currently used for codegen. llvm-svn: 286285
*	[SystemZ] Add program mask and addressing mode instructions	Ulrich Weigand	2016-11-08	5	-11/+109
\| \| \| \| \| \| \| \| \|	Add several instructions that operate on the program mask or the addressing mode. These are not really needed for code generation under Linux, but are provided for completeness for the assembler/disassembler. llvm-svn: 286284
*	[SystemZ] Model access registers as LLVM registers	Ulrich Weigand	2016-11-08	17	-102/+126
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add the 16 access registers as LLVM registers. This allows removing a lot of special cases in the assembler and disassembler where we were handling access registers; this can all just use the generic register code now. Also add a bunch of instructions to operate on access registers, for assembler/disassembler use only. No change in code generation intended. llvm-svn: 286283
*	[WebAssembly] Convert stackified IMPLICIT_DEF into constant 0.	Dan Gohman	2016-11-08	1	-0/+37
\| \| \| \| \| \| \| \| \| \|	Since IMPLICIT_DEF instructions are omitted in the output, when the output of an IMPLICIT_DEF instruction is stackified, the resulting register lacks an explicit push, leading to a push/pop mismatch. Fix this by converting such IMPLICIT_DEFs into CONST_I32 0 instructions so that they have explicit pushes. llvm-svn: 286274
*	[SystemZ] Always use semantic instruction classes	Ulrich Weigand	2016-11-08	3	-96/+190
\| \| \| \| \| \| \| \| \| \|	Define a couple of additional semantic classes and use them throughout the .td files to make them more consistent and more easily readable. No functional change. llvm-svn: 286268
*	[SystemZ] Refactor InstRR* instruction format patterns	Ulrich Weigand	2016-11-08	3	-227/+260
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This changes the InstRR (and related) patterns to no longer automatically add an "r" at the end of the mnemonic. This makes the .td files more obviously understandable, and also allows using the patterns for those few instructions that do not follow the *r scheme. Also add some more sub-formats of the RRF format class, to match operand names and sequence from the PoP better. No functional change. llvm-svn: 286267
*	[SystemZ] Rename some Inst* instruction format classes	Ulrich Weigand	2016-11-08	2	-96/+96
\| \| \| \| \| \| \| \| \| \| \|	Now that we've added instruction format subclasses like InstRIb, it makes sense to rename the old InstRI to InstRIa. Similar for InstRX, InstRXY, InstRS, InstRSY, and InstSS. No functional change. llvm-svn: 286266
*	[MC][AArch64] Cleanup end-of-line parsing in AArch64 AsmParser.	Nirav Dave	2016-11-08	1	-222/+136
\| \| \| \| \| \| \| \| \| \|	Reviewers: t.p.northover, rengolin Subscribers: llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D26309 llvm-svn: 286265
*	[SystemZ] Refactor branch and conditional instruction patterns	Ulrich Weigand	2016-11-08	6	-615/+978
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rework patterns for branches, call & return instructions, compare-and-branch, compare-and-trap, and conditional move instructions. In particular, simplify creation of patterns for the extended opcodes of instructions that take a CC mask. Also, use semantic instruction classes for all the instructions instead of open-coding them in SystemZInstrInfo.td. Adds a couple of the basic branch instructions (that are unused for codegen) for the assembler/disassembler. llvm-svn: 286263
*	GlobalISel: support selecting fpext/fptrunc instructions on AArch64.	Tim Northover	2016-11-08	1	-0/+54
\| \| \| \|	llvm-svn: 286253
*	Fix PR27500: on MSP430 the branch destination offset is measured in words, ↵	Anton Korobeynikov	2016-11-08	1	-115/+191
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	not bytes. Summary: In addition, the branch instructions will have proper BB destinations, not offsets, like before. Reviewers: asl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23718 llvm-svn: 286252
*	[VectorLegalizer] Expansion of CTLZ using CTPOP when possible	Simon Pilgrim	2016-11-08	1	-1/+4
\| \| \| \| \| \| \| \| \| \|	This patch avoids scalarization of CTLZ by instead expanding to use CTPOP (ref: "Hacker's Delight") when the necessary operations are available. This also adds the necessary cost models for X86 SSE2 targets (the main beneficiary) to ensure vectorization only happens when it's useful. Differential Revision: https://reviews.llvm.org/D25910 llvm-svn: 286233
*	[AArch64] Fix incorrect CSEL node created	Roger Ferrer Ibanez	2016-11-08	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Under -enable-unsafe-fp-math, SELECT_CC lowering in AArch64 transforms floating point comparisons of the form "a == 0.0 ? 0.0 : x" to "a == 0.0 ? a : x". But it incorrectly assumes that 'x' and 'a' have the same type which can lead to a wrong CSEL node that crashes later due to nonsensical copies. Differential Revision: https://reviews.llvm.org/D26394 llvm-svn: 286231
*	GlobalISel: support selecting G_SELECT on AArch64.	Tim Northover	2016-11-08	1	-0/+40
\| \| \| \|	llvm-svn: 286185
*	GlobalISel: constrain PHI registers on AArch64.	Tim Northover	2016-11-08	1	-3/+33
\| \| \| \| \| \| \| \| \| \|	Self-referencing PHI nodes need their destination operands to be constrained because nothing else is likely to do so. For now we just pick a register class naively. Patch mostly by Ahmed again. llvm-svn: 286183
*	[AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition ↵	Stanislav Mekhanoshin	2016-11-07	2	-5/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	copies Codegen prepare sinks comparisons close to a user if we have only one register for conditions. For AMDGPU we have many SGPRs capable of holding vector conditions. Changed BE to report we have many condition registers. That way the IR LICM pass would hoist an invariant comparison out of a loop and codegen prepare will not sink it. With that done a condition is calculated in one block and used in another. Current behavior is to store a workitem's condition in a VGPR using v_cndmask and then restore it with yet another v_cmp instruction from that v_cndmask's result. To mitigate the issue, forward propagation of a v_cmp 64 bit result to a user is implemented. An additional side effect of this is that we may consume fewer VGPRs at the cost of more SGPRs if multiple conditions need to be held, and that is a clear win in most cases. llvm-svn: 286171
*	[AArch64] Transfer memory operands when lowering vector load/store intrinsics	Sanjin Sijaric	2016-11-07	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Some vector loads and stores generated from AArch64 intrinsics alias each other unnecessarily, preventing better scheduling. We just need to transfer memory operands during lowering. Reviewers: mcrosier, t.p.northover, jmolloy Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D26313 llvm-svn: 286168
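
The CTLZ-via-CTPOP expansion described in r286233 in the log above can be illustrated on a single scalar value. What follows is a minimal Python sketch of the bit-smearing approach commonly attributed to "Hacker's Delight", assuming a 32-bit element width. The function names popcount32 and ctlz32 are hypothetical, and this is not the code from the patch, which works on vector DAG nodes in the legalizer.

```python
MASK32 = 0xFFFFFFFF  # assumed element width: 32 bits


def popcount32(x: int) -> int:
    """Count the set bits in the low 32 bits of x (stands in for CTPOP)."""
    return bin(x & MASK32).count("1")


def ctlz32(x: int) -> int:
    """Count leading zeros of a 32-bit value using only shifts, ORs and popcount."""
    x &= MASK32
    # Smear the highest set bit into every lower position.
    x |= x >> 1
    x |= x >> 2
    x |= x >> 4
    x |= x >> 8
    x |= x >> 16
    # The bits still clear are exactly the leading zeros of the input.
    return popcount32(~x & MASK32)
```

The sketch uses only shifts, ORs, a bitwise NOT and a population count, which is why such an expansion is only possible when a target makes those operations available on the vector type.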