path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* [SystemZ] Support CL(G)T instructions | Ulrich Weigand | 2016-11-11 | 6 | -3/+58
  This adds support for the compare logical and trap (memory) instructions that were added as part of the miscellaneous instruction extensions feature with zEC12.
  llvm-svn: 286587
* [SystemZ] Support load-and-zero-rightmost-byte facility | Ulrich Weigand | 2016-11-11 | 6 | -3/+49
  This adds support for the LZRF/LZRG/LLZRGF instructions that were added on z13, and uses them for code generation where appropriate. SystemZDAGToDAGISel::tryRISBGZero is updated again to prefer LLZRGF over RISBG where both would be possible.
  llvm-svn: 286586
* [SystemZ] Use LLGT(R) instructions | Ulrich Weigand | 2016-11-11 | 5 | -46/+50
  This adds support for the 31-to-64-bit zero extension instructions LLGT and LLGTR and uses them for code generation where appropriate. Since this operation can also be performed via RISBG, we have to update SystemZDAGToDAGISel::tryRISBGZero so that we prefer LLGT over RISBG in case both are possible. The patch includes some simplification to the tryRISBGZero code; this is not intended to cause any (further) functional change in codegen.
  llvm-svn: 286585
* [ARM] Add plumbing for GlobalISel | Diana Picus | 2016-11-11 | 13 | -4/+407
  Add GlobalISel skeleton, up to the point where we can select a ret void.
  llvm-svn: 286573
* AMDGPU: Attempt to fix build failure on x86-64 selfhost build | Yaxun Liu | 2016-11-11 | 1 | -2/+0
  Remove redundant include file.
  llvm-svn: 286552
* Add a blank line for a test commit. | Sean Fertile | 2016-11-11 | 1 | -0/+1
  llvm-svn: 286550
* Revert "[AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies" | Stanislav Mekhanoshin | 2016-11-11 | 2 | -26/+5
  This reverts commit r286171, it breaks piglit test fs-discard-exit-2
  llvm-svn: 286530
* Fix requirements. | Joerg Sonnenberger | 2016-11-10 | 1 | -1/+1
  llvm-svn: 286527
* Timer: Remove group-less NamedRegionTimer constructor. | Matthias Braun | 2016-11-10 | 1 | -2/+0
  The NamedRegionTimer initializer without a group name puts the Timer into the "Misc" group and is (nearly) unused. Remove it.
  The only user of this constructor appears to be the HexagonGenInsert pass, which creates a counter without group to count the complete execution time of that pass, however since every pass gets a counter by the PassManager anyway this should be unnecessary. Also removed the pointless TimerGroup there.
  Differential Revision: https://reviews.llvm.org/D25582
  llvm-svn: 286524
* [DAG Combiner] Fix the native computation of the Newton series for reciprocals | Evandro Menezes | 2016-11-10 | 8 | -26/+31
  The generic infrastructure to compute the Newton series for reciprocal and reciprocal square root was conceived to allow a target to compute the series itself. However, the original code did not properly consider this condition if returned by a target. This patch addresses the issues to allow a target to compute the series on its own.
  Differential revision: https://reviews.llvm.org/D22975
  llvm-svn: 286523
* AMDGPU: Emit runtime metadata as a note element in .note section | Yaxun Liu | 2016-11-10 | 6 | -348/+450
  Currently runtime metadata is emitted as an ELF section with name .AMDGPU.runtime_metadata. However, there is a standard way to convey vendor-specific information about how to run an ELF binary, which is called a vendor-specific note element (http://www.netbsd.org/docs/kernel/elf-notes.html). This patch lets the AMDGPU backend emit runtime metadata as a note element in the .note section.
  Differential Revision: https://reviews.llvm.org/D25781
  llvm-svn: 286502
* [Target] Rename X86/ARM Assembly printer to reflect reality. | Davide Italiano | 2016-11-10 | 2 | -2/+2
  This shows up a lot when profiling LTO testcases with -time-passes, so better to have a non-confusing name.
  llvm-svn: 286488
* AMDGPU: Add VI i16 support | Tom Stellard | 2016-11-10 | 15 | -78/+409
  Patch By: Wei Ding
  Differential Revision: https://reviews.llvm.org/D18049
  llvm-svn: 286464
* [ARM] Thumb2 LDR (literal) should accept PC as the destination | Oliver Stannard | 2016-11-10 | 1 | -1/+1
  The version of this instruction with the .w suffix already correctly accepts this, but the alias without the .w did not.
  Differential Revision: https://reviews.llvm.org/D26499
  llvm-svn: 286446
* [AVX-512] Allow legacy cvtpd2dq intrinsics to select EVEX encoded instruction when available. | Craig Topper | 2016-11-10 | 2 | -8/+12
  llvm-svn: 286435
* [AVX-512][X86] Convert avx_cvtt_ps2dq_256 and sse2_cvttps2dq intrinsics to ISD::FP_TO_SINT in the intrinsics table and delete patterns. | Craig Topper | 2016-11-10 | 2 | -54/+28
  While nearby also move CVTDQ2PS patterns into their instructions. This allows these intrinsics to also use EVEX instructions.
  llvm-svn: 286434
* [X86] Convert int_x86_avx_cvtt_pd2dq_256 to fp_to_sint using the intrinsics table. Removes extra patterns and allows legacy intrinsic to select EVEX encoded instructions when available. | Craig Topper | 2016-11-10 | 2 | -7/+5
  llvm-svn: 286433
* [X86] Move some custom patterns into the currently empty pattern of their corresponding instructions. NFC | Craig Topper | 2016-11-10 | 1 | -46/+37
  llvm-svn: 286432
* [X86] Remove some patterns still referencing int_x86_sse2_cvttpd2dq that should have been removed in r286344. NFC | Craig Topper | 2016-11-10 | 1 | -9/+5
  llvm-svn: 286431
* Re-apply r286384, "X86: Introduce the "relocImm" ComplexPattern, which represents a relocatable immediate.", with a fix for 32-bit x86. | Peter Collingbourne | 2016-11-09 | 4 | -52/+35
  Teach X86InstrInfo::analyzeCompare() not to crash on CMP and SUB instructions that take a global address operand.
  llvm-svn: 286420
* GlobalISel: translate invoke and landingpad instructions | Tim Northover | 2016-11-09 | 1 | -1/+1
  Pretty bare-bones support for exception handling (no weird MSVC stuff, no SjLj etc), but it should get things going.
  llvm-svn: 286407
* Revert r286384, "X86: Introduce the "relocImm" ComplexPattern, which represents a relocatable immediate." | Peter Collingbourne | 2016-11-09 | 3 | -31/+52
  Suspected to be the cause of a sanitizer-windows bot failure:
  Assertion failed: isImm() && "Wrong MachineOperand accessor", file C:\b\slave\sanitizer-windows\llvm\include\llvm/CodeGen/MachineOperand.h, line 420
  llvm-svn: 286385
* X86: Introduce the "relocImm" ComplexPattern, which represents a relocatable immediate. | Peter Collingbourne | 2016-11-09 | 3 | -52/+31
  A relocatable immediate is either an immediate operand or an operand that can be relocated by the linker to an immediate, such as a regular symbol in non-PIC code.
  Start using relocImm for 32-bit and 64-bit MOV instructions, and for operands of type "imm32_su". Remove a number of now-redundant patterns.
  Differential Revision: https://reviews.llvm.org/D25812
  llvm-svn: 286384
* [Hexagon] Silence "sometimes uninitialized" warning in HexagonCopyToCombine | Krzysztof Parzyszek | 2016-11-09 | 1 | -1/+3
  llvm-svn: 286383
* [Hexagon] Separate Hexagon subreg indices for different register classes | Krzysztof Parzyszek | 2016-11-09 | 23 | -204/+255
  For pairs of 32-bit registers: isub_lo, isub_hi.
  For pairs of vector registers: vsub_lo, vsub_hi.
  Add generic subreg indices: ps_sub_lo, ps_sub_hi, and a function HexagonRegisterInfo::getHexagonSubRegIndex(RegClass, GenericSubreg) that returns the appropriate subreg index for RegClass.
  llvm-svn: 286377
* [Hexagon] Eliminate Insert4 pseudo-instruction, use combines instead | Krzysztof Parzyszek | 2016-11-09 | 3 | -48/+2
  llvm-svn: 286368
* [SystemZ] A few fixes in scheduler files. | Jonas Paulsson | 2016-11-09 | 3 | -11/+11
  Review: U Weigand
  llvm-svn: 286362
* [MachineScheduler] Comments fixing. | Jonas Paulsson | 2016-11-09 | 1 | -1/+1
  The name/comment of the third argument to the ScheduleDAGMI constructor is RemoveKillFlags and not IsPostRA. Only the comments are changed.
  Review: A Trick
  llvm-svn: 286350
* [AVX-512] Add lowering to cvttpd2udq/cvttps2udq for fptoui v2f64/2f32 to 2i32 | Craig Topper | 2016-11-09 | 5 | -8/+26
  This patch adds support for fptoui to 2i32 from both 2f64 and 2f32, building on Simon's change for the signed version in r284459 and using AVX-512 instructions. If we don't have VLX support we need to use a 512-bit operation for v2f64->v2i32 and extract the result. It also recognises that cvttpd2udq zeroes the upper 64-bits of the xmm result.
  Differential Revision: https://reviews.llvm.org/D26331
  llvm-svn: 286345
* [X86] Lower AVX512 and SSE intrinsics for CVTTPD2DQ to X86ISD::CVTTPD2DQ. | Craig Topper | 2016-11-09 | 3 | -30/+34
  Summary: This allows the SSE intrinsic to use the EVEX instruction when available. It also fixes EVEX to not use a weird (v4i32 (fp_to_sint v2f64)) node and it merges some isel patterns. This also fixes some cases that weren't combining vzmovl with cvttpd2dq to remove extra moves.
  Reviewers: delena, zvi, RKSimon
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D26330
  llvm-svn: 286344
* [AVX-512] Use alignedstore256 in patterns that look for stores of the lower 256-bits of a 512-bit vector to use a 256-bit aligned store. | Craig Topper | 2016-11-09 | 1 | -10/+10
  Previously we were only checking for 16 byte alignment instead of 32 byte alignment. Fixes PR30947.
  llvm-svn: 286342
* [AVX-512] Make VBMI instruction set enabling imply that the BWI instruction set is also enabled. | Craig Topper | 2016-11-09 | 1 | -2/+2
  Summary: This is needed to make the v64i8 and v32i16 types legal for the 512-bit VBMI instructions. Fixes PR30912.
  Reviewers: delena, zvi
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D26322
  llvm-svn: 286339
* AArch64DeadRegisterDefinitionsPass: Fix Changed flag | Matthias Braun | 2016-11-08 | 1 | -1/+0
  Fix a bug in the calculation of the changed flag introduced in r285488.
  llvm-svn: 286293
* [SystemZ] Add missing FP extension instructions | Ulrich Weigand | 2016-11-08 | 4 | -18/+42
  This completes assembler / disassembler support for all BFP instructions provided by the floating-point extensions facility. The instructions added here are not currently used for codegen.
  llvm-svn: 286285
* [SystemZ] Add program mask and addressing mode instructions | Ulrich Weigand | 2016-11-08 | 5 | -11/+109
  Add several instructions that operate on the program mask or the addressing mode. These are not really needed for code generation under Linux, but are provided for completeness for the assembler/disassembler.
  llvm-svn: 286284
* [SystemZ] Model access registers as LLVM registers | Ulrich Weigand | 2016-11-08 | 17 | -102/+126
  Add the 16 access registers as LLVM registers. This allows removing a lot of special cases in the assembler and disassembler where we were handling access registers; this can all just use the generic register code now.
  Also add a bunch of instructions to operate on access registers, for assembler/disassembler use only. No change in code generation intended.
  llvm-svn: 286283
* [WebAssembly] Convert stackified IMPLICIT_DEF into constant 0. | Dan Gohman | 2016-11-08 | 1 | -0/+37
  Since IMPLICIT_DEF instructions are omitted in the output, when the output of an IMPLICIT_DEF instruction is stackified, the resulting register lacks an explicit push, leading to a push/pop mismatch. Fix this by converting such IMPLICIT_DEFs into CONST_I32 0 instructions so that they have explicit pushes.
  llvm-svn: 286274
* [SystemZ] Always use semantic instruction classes | Ulrich Weigand | 2016-11-08 | 3 | -96/+190
  Define a couple of additional semantic classes and use them throughout the .td files to make them more consistent and more easily readable.
  No functional change.
  llvm-svn: 286268
* [SystemZ] Refactor InstRR* instruction format patterns | Ulrich Weigand | 2016-11-08 | 3 | -227/+260
  This changes the InstRR (and related) patterns to no longer automatically add an "r" at the end of the mnemonic. This makes the .td files more obviously understandable, and also allows using the patterns for those few instructions that do not follow the *r scheme.
  Also add some more sub-formats of the RRF format class, to match operand names and sequence from the PoP better.
  No functional change.
  llvm-svn: 286267
* [SystemZ] Rename some Inst* instruction format classes | Ulrich Weigand | 2016-11-08 | 2 | -96/+96
  Now that we've added instruction format subclasses like InstRIb, it makes sense to rename the old InstRI to InstRIa. Similar for InstRX, InstRXY, InstRS, InstRSY, and InstSS.
  No functional change.
  llvm-svn: 286266
* [MC][AArch64] Cleanup end-of-line parsing in AArch64 AsmParser. | Nirav Dave | 2016-11-08 | 1 | -222/+136
  Reviewers: t.p.northover, rengolin
  Subscribers: llvm-commits, aemerson
  Differential Revision: https://reviews.llvm.org/D26309
  llvm-svn: 286265
* [SystemZ] Refactor branch and conditional instruction patterns | Ulrich Weigand | 2016-11-08 | 6 | -615/+978
  Rework patterns for branches, call & return instructions, compare-and-branch, compare-and-trap, and conditional move instructions.
  In particular, simplify creation of patterns for the extended opcodes of instructions that take a CC mask.
  Also, use semantical instruction classes for all the instructions instead of open-coding them in SystemZInstrInfo.td.
  Adds a couple of the basic branch instructions (that are unused for codegen) for the assembler/disassembler.
  llvm-svn: 286263
* GlobalISel: support selecting fpext/fptrunc instructions on AArch64. | Tim Northover | 2016-11-08 | 1 | -0/+54
  llvm-svn: 286253
* Fix PR27500: on MSP430 the branch destination offset is measured in words, not bytes. | Anton Korobeynikov | 2016-11-08 | 1 | -115/+191
  Summary: In addition, the branch instructions will have proper BB destinations, not offsets, like before.
  Reviewers: asl
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D23718
  llvm-svn: 286252
* [VectorLegalizer] Expansion of CTLZ using CTPOP when possible | Simon Pilgrim | 2016-11-08 | 1 | -1/+4
  This patch avoids scalarization of CTLZ by instead expanding to use CTPOP (ref: "Hacker's Delight") when the necessary operations are available.
  This also adds the necessary cost models for X86 SSE2 targets (the main beneficiary) to ensure vectorization only happens when it's useful.
  Differential Revision: https://reviews.llvm.org/D25910
  llvm-svn: 286233
* [AArch64] Fix incorrect CSEL node created | Roger Ferrer Ibanez | 2016-11-08 | 1 | -2/+3
  Under -enable-unsafe-fp-math, SELECT_CC lowering in AArch64 transforms floating point comparisons of the form "a == 0.0 ? 0.0 : x" to "a == 0.0 ? a : x". But it incorrectly assumes that 'x' and 'a' have the same type which can lead to a wrong CSEL node that crashes later due to nonsensical copies.
  Differential Revision: https://reviews.llvm.org/D26394
  llvm-svn: 286231
* GlobalISel: support selecting G_SELECT on AArch64. | Tim Northover | 2016-11-08 | 1 | -0/+40
  llvm-svn: 286185
* GlobalISel: constrain PHI registers on AArch64. | Tim Northover | 2016-11-08 | 1 | -3/+33
  Self-referencing PHI nodes need their destination operands to be constrained because nothing else is likely to do so. For now we just pick a register class naively.
  Patch mostly by Ahmed again.
  llvm-svn: 286183
* [AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies | Stanislav Mekhanoshin | 2016-11-07 | 2 | -5/+26
  Codegen prepare sinks comparisons close to a user if we have only one register for conditions. For AMDGPU we have many SGPRs capable of holding vector conditions. Changed BE to report that we have many condition registers. That way the IR LICM pass would hoist an invariant comparison out of a loop and codegen prepare will not sink it.
  With that done, a condition is calculated in one block and used in another. Current behavior is to store the workitem's condition in a VGPR using v_cndmask and then restore it with yet another v_cmp instruction from that v_cndmask's result. To mitigate the issue, forward propagation of a v_cmp 64-bit result to a user is implemented. An additional side effect of this is that we may consume fewer VGPRs at the cost of more SGPRs if multiple conditions need to be held, and that is a clear win in most cases.
  llvm-svn: 286171
* [AArch64] Transfer memory operands when lowering vector load/store intrinsics | Sanjin Sijaric | 2016-11-07 | 1 | -0/+11
  Summary: Some vector loads and stores generated from AArch64 intrinsics alias each other unnecessarily, preventing better scheduling. We just need to transfer memory operands during lowering.
  Reviewers: mcrosier, t.p.northover, jmolloy
  Subscribers: aemerson, rengolin, llvm-commits
  Differential Revision: https://reviews.llvm.org/D26313
  llvm-svn: 286168