| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
| |
This adds support for the compare logical and trap (memory)
instructions that were added as part of the miscellaneous
instruction extensions feature with zEC12.
llvm-svn: 286587
|
| |
|
|
|
|
|
|
|
|
| |
This adds support for the LZRF/LZRG/LLZRGF instructions that were
added on z13, and uses them for code generation were appropriate.
SystemZDAGToDAGISel::tryRISBGZero is updated again to prefer LLZRGF
over RISBG where both would be possible.
llvm-svn: 286586
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for the 31-to-64-bit zero extension instructions
LLGT and LLGTR and uses them for code generation where appropriate.
Since this operation can also be performed via RISBG, we have to
update SystemZDAGToDAGISel::tryRISBGZero so that we prefer LLGT
over RISBG in case both are possible. The patch includes some
simplification to the tryRISBGZero code; this is not intended
to cause any (further) functional change in codegen.
llvm-svn: 286585
|
| |
|
|
|
|
| |
Add GlobalISel skeleton, up to the point where we can select a ret void.
llvm-svn: 286573
|
| |
|
|
|
|
| |
Remove redundant include file.
llvm-svn: 286552
|
| |
|
|
| |
llvm-svn: 286550
|
| |
|
|
|
|
|
|
| |
condition copies"
This reverts commit r286171, it breaks piglit test fs-discard-exit-2
llvm-svn: 286530
|
| |
|
|
| |
llvm-svn: 286527
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The NamedRegionTimer initializer without a group name puts the Timer
into the "Misc" group and is (nearly) unused. Remove it.
The only user of this constructor appears to be the HexagonGenInsert pass,
which creates a counter without group to count the complete execution
time of that pass, however since every pass gets a counter by the
PassManager anyway this should be unnecessary. Also removed the
pointless TimerGroup there.
Differential Revision: https://reviews.llvm.org/D25582
llvm-svn: 286524
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The generic infrastructure to compute the Newton series for reciprocal and
reciprocal square root was conceived to allow a target to compute the series
itself. However, the original code did not properly consider this condition
if returned by a target. This patch addresses the issues to allow a target
to compute the series on its own.
Differential revision: https://reviews.llvm.org/D22975
llvm-svn: 286523
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Currently runtime metadata is emitted as an ELF section with name .AMDGPU.runtime_metadata.
However there is a standard way to convey vendor specific information about how to run an ELF binary, which is called vendor-specific note element (http://www.netbsd.org/docs/kernel/elf-notes.html).
This patch lets AMDGPU backend emits runtime metadata as a note element in .note section.
Differential Revision: https://reviews.llvm.org/D25781
llvm-svn: 286502
|
| |
|
|
|
|
|
| |
This shows up a lot profiling LTO testcases with -time-passes, so
better have a non confusing name.
llvm-svn: 286488
|
| |
|
|
|
|
|
|
| |
Patch By: Wei Ding
Differential Revision: https://reviews.llvm.org/D18049
llvm-svn: 286464
|
| |
|
|
|
|
|
|
|
| |
The version of this instruction with the .w suffix already correctly accepts
this, but the alias without the .w did not.
Differential Revision: https://reviews.llvm.org/D26499
llvm-svn: 286446
|
| |
|
|
|
|
| |
instruction when available.
llvm-svn: 286435
|
| |
|
|
|
|
|
|
| |
ISD::FP_TO_SINT in the intrinsics table and delete patterns. While nearby also move CVTDQ2PS patterns into their instructions.
This allows these intrinsics to also use EVEX instructons.
llvm-svn: 286434
|
| |
|
|
|
|
| |
table. Removes extra patterns and allows legacy intrinsic to select EVEX encoded instructions when available.
llvm-svn: 286433
|
| |
|
|
|
|
| |
corresponding instructions. NFC
llvm-svn: 286432
|
| |
|
|
|
|
| |
should have been removed in r286344. NFC
llvm-svn: 286431
|
| |
|
|
|
|
|
|
|
| |
represents a relocatable immediate.", with a fix for 32-bit x86.
Teach X86InstrInfo::analyzeCompare() not to crash on CMP and SUB instructions
that take a global address operand.
llvm-svn: 286420
|
| |
|
|
|
|
|
| |
Pretty bare-bones support for exception handling (no weird MSVC stuff, no SjLj
etc), but it should get things going.
llvm-svn: 286407
|
| |
|
|
|
|
|
|
|
| |
represents a relocatable immediate."
Suspected to be the cause of a sanitizer-windows bot failure:
Assertion failed: isImm() && "Wrong MachineOperand accessor", file C:\b\slave\sanitizer-windows\llvm\include\llvm/CodeGen/MachineOperand.h, line 420
llvm-svn: 286385
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
immediate.
A relocatable immediate is either an immediate operand or an operand that
can be relocated by the linker to an immediate, such as a regular symbol
in non-PIC code.
Start using relocImm for 32-bit and 64-bit MOV instructions, and for operands
of type "imm32_su". Remove a number of now-redundant patterns.
Differential Revision: https://reviews.llvm.org/D25812
llvm-svn: 286384
|
| |
|
|
| |
llvm-svn: 286383
|
| |
|
|
|
|
|
|
|
|
|
| |
For pairs of 32-bit registers: isub_lo, isub_hi.
For pairs of vector registers: vsub_lo, vsub_hi.
Add generic subreg indices: ps_sub_lo, ps_sub_hi, and a function
HexagonRegisterInfo::getHexagonSubRegIndex(RegClass, GenericSubreg)
that returns the appropriate subreg index for RegClass.
llvm-svn: 286377
|
| |
|
|
| |
llvm-svn: 286368
|
| |
|
|
|
| |
Review: U Weigand
llvm-svn: 286362
|
| |
|
|
|
|
|
|
| |
The name/comment of the third argument to the ScheduleDAGMI constructor
is RemoveKillFlags and not IsPostRA. Only the comments are changed.
Review: A Trick
llvm-svn: 286350
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for fptoui to 2i32 from both 2f64 and 2f32, building on Simon's change for the signed version in r284459 and using AVX-512 instructions.
If we don't have VLX support we need to use a 512-bit operation for v2f64->v2i32 and extract the result.
It also recognises that cvttpd2udq zeroes the upper 64-bits of the xmm result.
Differential Revision: https://reviews.llvm.org/D26331
llvm-svn: 286345
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: This allows the SSE intrinsic to use the EVEX instruction when available. It also fixes EVEX to not use a weird (v4i32 (fp_to_sint v2f64)) node and it merges some isel patterns. This also fixes some cases that weren't combining vzmovl with cvttpd2dq to remove extra moves.
Reviewers: delena, zvi, RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26330
llvm-svn: 286344
|
| |
|
|
|
|
|
|
| |
256-bits of a 512-bit vector to use a 256-bit aligned store.
Previously we were only checking for 16 byte alignment instead of 32 byte alignment. Fixes PR30947.
llvm-svn: 286342
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
set is also enabled.
Summary:
This is needed to make the v64i8 and v32i16 types legal for the 512-bit VBMI instructions. Fixes PR30912.
Reviewers: delena, zvi
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26322
llvm-svn: 286339
|
| |
|
|
|
|
| |
Fix a bug in the calculation of the changed flag introduced in r285488.
llvm-svn: 286293
|
| |
|
|
|
|
|
|
| |
This completes assembler / disassembler support for all BFP
instructions provided by the floating-point extensions facility.
The instructions added here are not currently used for codegen.
llvm-svn: 286285
|
| |
|
|
|
|
|
|
|
| |
Add several instructions that operate on the program mask
or the addressing mode. These are not really needed for
code generation under Linux, but are provided for completeness
for the assembler/disassembler.
llvm-svn: 286284
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Add the 16 access registers as LLVM registers. This allows removing
a lot of special cases in the assembler and disassembler where we
were handling access registers; this can all just use the generic
register code now.
Also add a bunch of instructions to operate on access registers,
for assembler/disassembler use only. No change in code generation
intended.
llvm-svn: 286283
|
| |
|
|
|
|
|
|
|
|
| |
Since IMPLIFIT_DEF instructions are omitted in the output, when the output
of an IMPLICIT_DEF instruction is stackified, the resulting register lacks
an explicit push, leading to a push/pop mismatch. Fix this by converting
such IMPLICIT_DEFs into CONST_I32 0 instructions so that they have explicit
pushes.
llvm-svn: 286274
|
| |
|
|
|
|
|
|
|
|
| |
Define a couple of additional semantic classes and use them
throughout the .td files to make them more consistent and
more easily readable.
No functional change.
llvm-svn: 286268
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This changes the InstRR (and related) patterns to no longer
automatically add an "r" at the end of the mnemonic. This
makes the .td files more obviously understandable, and also
allows using the patterns for those few instructions that
do not follow the *r scheme.
Also add some more sub-formats of the RRF format class, to
match operand names and sequence from the PoP better.
No functional change.
llvm-svn: 286267
|
| |
|
|
|
|
|
|
|
|
|
| |
Now that we've added instruction format subclasses like
InstRIb, it makes sense to rename the old InstRI to InstRIa.
Similar for InstRX, InstRXY, InstRS, InstRSY, and InstSS.
No functional change.
llvm-svn: 286266
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: t.p.northover, rengolin
Subscribers: llvm-commits, aemerson
Differential Revision: https://reviews.llvm.org/D26309
llvm-svn: 286265
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rework patterns for branches, call & return instructions,
compare-and-branch, compare-and-trap, and conditional move
instructions.
In particular, simplify creation of patterns for the extended
opcodes of instructions that take a CC mask.
Also, use semantical instruction classes for all the instructions
instead of open-coding them in SystemZInstrInfo.td.
Adds a couple of the basic branch instructions (that are unused
for codegen) for the assembler/disassembler.
llvm-svn: 286263
|
| |
|
|
| |
llvm-svn: 286253
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
not bytes.
Summary: In addition, the branch instructions will have proper BB destinations, not offsets, like before.
Reviewers: asl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23718
llvm-svn: 286252
|
| |
|
|
|
|
|
|
|
|
| |
This patch avoids scalarization of CTLZ by instead expanding to use CTPOP (ref: "Hacker's Delight") when the necessary operations are available.
This also adds the necessary cost models for X86 SSE2 targets (the main beneficiary) to ensure vectorization only happens when its useful.
Differential Revision: https://reviews.llvm.org/D25910
llvm-svn: 286233
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Under -enable-unsafe-fp-math, SELECT_CC lowering in AArch64
transforms floating point comparisons of the form "a == 0.0 ? 0.0 : x" to
"a == 0.0 ? a : x". But it incorrectly assumes that 'x' and 'a' have
the same type which can lead to a wrong CSEL node that crashes later
due to nonsensical copies.
Differential Revision: https://reviews.llvm.org/D26394
llvm-svn: 286231
|
| |
|
|
| |
llvm-svn: 286185
|
| |
|
|
|
|
|
|
|
|
| |
Self-referencing PHI nodes need their destination operands to be constrained
because nothing else is likely to do so. For now we just pick a register class
naively.
Patch mostly by Ahmed again.
llvm-svn: 286183
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
copies
Codegen prepare sinks comparisons close to a user is we have only one register
for conditions. For AMDGPU we have many SGPRs capable to hold vector conditions.
Changed BE to report we have many condition registers. That way IR LICM pass
would hoist an invariant comparison out of a loop and codegen prepare will not
sink it.
With that done a condition is calculated in one block and used in another.
Current behavior is to store workitem's condition in a VGPR using v_cndmask
and then restore it with yet another v_cmp instruction from that v_cndmask's
result. To mitigate the issue a forward propagation of a v_cmp 64 bit result
to an user is implemented. Additional side effect of this is that we may
consume less VGPRs in a cost of more SGPRs in case if holding of multiple
conditions is needed, and that is a clear win in most cases.
llvm-svn: 286171
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Some vector loads and stores generated from AArch64 intrinsics alias each other
unnecessarily, preventing better scheduling. We just need to transfer memory
operands during lowering.
Reviewers: mcrosier, t.p.northover, jmolloy
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D26313
llvm-svn: 286168
|