| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Avoid using allocateKernArg / AssignFn. We do not want any
of the type splitting properties of normal calling convention
lowering.
For now at least this exists alongside the IR argument lowering
pass. This is necessary to handle struct padding correctly while
some arguments are still skipped by the IR argument lowering
pass.
llvm-svn: 336373
|
|
|
|
|
|
| |
Normally InstCombine would have simplified these to SRL/AND instructions but we may still see these during SLP vectorization etc.
llvm-svn: 336371
|
|
|
|
|
|
|
|
|
|
|
| |
instructions
Map the following instructions to the proper float128 lib calls:
pow[i], exp[2], log[2|10], sin, cos, fmin, fmax
Differential Revision: https://reviews.llvm.org/D48544
llvm-svn: 336361
|
|
|
|
|
|
|
|
|
|
|
|
| |
Wait states are not properly being inserted after buffer_store for v_interp instructions.
Add VALU to V_INTERP instructions so that the GCNHazardRecognizer can
check and insert the appropriate wait states when needed.
Differential Revision: https://reviews.llvm.org/D48772
Change-Id: Id540c9b074fc69b5c1de6b182276aa089c74aa64
llvm-svn: 336339
|
|
|
|
| |
llvm-svn: 336331
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Similar to PR/25526, fast-regalloc introduces spills at the end of basic
blocks. When this occurs in between an ll and sc, the stores can cause the
atomic sequence to fail.
This patch fixes the issue by introducing more pseudos to represent atomic
operations and moving their lowering to after the expansion of postRA
pseudos.
This version addresses issues with the initial implementation and covers
all atomic operations.
This resolves PR/32020.
Thanks to James Cowgill for reporting the issue!
Patch By: Simon Dardis
Differential Revision: https://reviews.llvm.org/D31287
llvm-svn: 336328
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Resolves:
Unsupported ARM Neon intrinsics in Target-specific DAG combine
function for VLDDUP
https://bugs.llvm.org/show_bug.cgi?id=38031
Related diff: D48439
Differential Revision: https://reviews.llvm.org/D48920
llvm-svn: 336325
|
|
|
|
|
|
|
| |
in TableGen, for which there is already a patch in Phabricator
(D48937) that needs to be committed first.
llvm-svn: 336324
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch also adds instructions for predicated FP square-root and
reciprocal exponent.
The added instructions are:
- FRINTI Round to integral value (current FPCR rounding mode)
- FRINTX Round to integral value (current FPCR rounding mode, signalling inexact)
- FRINTA Round to integral value (to nearest, with ties away from zero)
- FRINTN Round to integral value (to nearest, with ties to even)
- FRINTZ Round to integral value (toward zero)
- FRINTM Round to integral value (toward minus Infinity)
- FRINTP Round to integral value (toward plus Infinity)
- FSQRT Floating-point square root
- FRECPX Floating-point reciprocal exponent
llvm-svn: 336322
|
|
|
|
|
|
|
|
|
| |
We were miscompiling i8 loads, so reject them as unsupported narrow operations
for now.
Differential Revision: https://reviews.llvm.org/D48944
llvm-svn: 336319
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements the following varieties:
- Unpredicated signed max, e.g. smax z0.h, z1.h, #-128
- Unpredicated signed min, e.g. smin z0.h, z1.h, #-128
- Unpredicated unsigned max, e.g. umax z0.h, z1.h, #255
- Unpredicated unsigned min, e.g. umin z0.h, z1.h, #255
- Predicated signed max, e.g. smax z0.h, p0/m, z0.h, z1.h
- Predicated signed min, e.g. smin z0.h, p0/m, z0.h, z1.h
- Predicated signed abd, e.g. sabd z0.h, p0/m, z0.h, z1.h
- Predicated unsigned max, e.g. umax z0.h, p0/m, z0.h, z1.h
- Predicated unsigned min, e.g. umin z0.h, p0/m, z0.h, z1.h
- Predicated unsigned abd, e.g. uabd z0.h, p0/m, z0.h, z1.h
llvm-svn: 336317
|
|
|
|
|
|
|
|
|
|
|
|
| |
Optimize code sequences for integer conversion to fp128 when the integer is a result of:
* float->int
* float->long
* double->int
* double->long
Differential Revision: https://reviews.llvm.org/D48429
llvm-svn: 336316
|
|
|
|
|
|
| |
independent FMA and extractelement/insertelement.
llvm-svn: 336315
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Non-homogenous aggregates are passed in consecutive GPRs, in GPRs and in memory,
or in memory. This patch ensures that float128 members of non-homogenous
aggregates are passed via VSX registers.
This is done via custom lowering a bitcast of a build_pari(i64,i64) to float128
to a new PPCISD node, BUILD_FP128.
Differential Revision: https://reviews.llvm.org/D48308
llvm-svn: 336310
|
|
|
|
|
|
|
|
|
| |
Legalize and emit code for quad-precision floating point operation conversion of
single-precision value to quad-precision.
Differential Revision: https://reviews.llvm.org/D47569
llvm-svn: 336307
|
|
|
|
|
|
|
|
|
|
| |
This patch enable parameter passing and return by value for float128 types.
Passing aggregate/union which contain float128 members will be submitted in
subsequent patches.
Differential Revision: https://reviews.llvm.org/D47552
llvm-svn: 336306
|
|
|
|
|
|
|
|
| |
for the v1i1 mask to have come from a scalar_to_vector from GR8.
We have patterns for SELECTS that top at v1i1 and we have a pattern for (v1i1 (scalar_to_vector GR8)). The patterns being removed here do the same thing as the two other patterns combined so there is no need for them.
llvm-svn: 336305
|
|
|
|
|
|
|
|
|
|
| |
input.
Previously we could only negate the FMADD opcodes. This used to be mostly ok when we lowered FMA intrinsics during lowering. But with the move to llvm.fma from target specific intrinsics, we can combine (fneg (fma)) to (fmsub) earlier. So if we start with (fneg (fma (fneg))) we would get stuck at (fmsub (fneg)).
This patch fixes that so we can also combine things like (fmsub (fneg)).
llvm-svn: 336304
|
|
|
|
|
|
|
|
|
|
| |
in clang.
There's a regression in here due to inability to combine fneg inputs of X86ISD::FMSUB/FNMSUB/FNMADD nodes.
More removals to come, but I wanted to stop and fix the regression that showed up in this first.
llvm-svn: 336303
|
|
|
|
|
|
|
|
|
| |
Legalize and emit code for round & convert float128 to double precision and
single precision.
Differential Revision: https://reviews.llvm.org/D46997
llvm-svn: 336299
|
|
|
|
|
|
|
|
|
| |
CRC and GINV ASE require revision 6, Virtualization requires revision 5.
Print a warning when revision is older than required.
Differential Revision: https://reviews.llvm.org/D48843
llvm-svn: 336296
|
|
|
|
|
|
|
|
|
|
|
| |
We want to run the Machine Scheduler instead of the List Scheduler after RA.
Checked with a performance run on a Power 9 machine with SPEC 2006 and while
some benchmarks improved and others degraded the geomean was slightly improved
with the Machine Scheduler.
Differential Revision: https://reviews.llvm.org/D45265
llvm-svn: 336295
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Support for negative immediates was implemented in
https://reviews.llvm.org/rL298380, however few instruction options were missing.
This change adds negative immediates support and respective tests
for the following:
ADD
ADDS
ADDS.W
AND.W
ANDS
BIC.W
BICS
BICS.W
SUB
SUBS
SUBS.W
Differential Revision: https://reviews.llvm.org/D48649
llvm-svn: 336286
|
|
|
|
|
|
|
|
| |
getOutlininingCandidateInfo -> getOutliningCandidateInfo
Differential Revision: https://reviews.llvm.org/D48867
llvm-svn: 336285
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds both a vector and an immediate form, e.g.
- Vector form:
subr z0.h, p0/m, z0.h, z1.h
subtract active elements of z0 from z1, and store the result in z0.
- Immediate form:
subr z0.h, z0.h, #255
subtract elements of z0, and store the result in z0.
llvm-svn: 336274
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Includes instructions to read the First-Faulting Register (FFR):
- RDFFR (unpredicated)
rdffr p0.b
- RDFFR (predicated)
rdffr p0.b, p0/z
- RDFFRS (predicated, sets condition flags)
rdffr p0.b, p0/z
Includes instructions to set/write the FFR:
- SETFFR (no arguments, sets the FFR to all true)
setffr
- WRFFR (unpredicated)
wrffr p0.b
llvm-svn: 336267
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The variants added are:
- fcvt (FP convert precision)
- scvtf (signed int -> FP)
- ucvtf (unsigned int -> FP)
- fcvtzs (FP -> signed int (round to zero))
- fcvtzu (FP -> unsigned int (round to zero))
For example:
fcvt z0.h, p0/m, z0.s (single- to half-precision FP)
scvtf z0.h, p0/m, z0.s (32-bit int to half-precision FP)
ucvtf z0.h, p0/m, z0.s (32-bit unsigned int to half-precision FP)
fcvtzs z0.s, p0/m, z0.h (half-precision FP to 32-bit int)
fcvtzu z0.s, p0/m, z0.h (half-precision FP to 32-bit unsigned int)
llvm-svn: 336265
|
|
|
|
|
|
|
|
| |
We were only doing this for basic blends, despite shuffle lowering now being good enough to handle more complex blends. This means that the two v8i16 splat shifts are performed in parallel instead of serially as the general shift case.
Reapplied with a fixed (extra null tests) version of rL336113 after reversion in rL336189 - extra test case added at rL336247.
llvm-svn: 336250
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SVE overloads the AArch64 PSTATE condition flags and introduces
a set of condition code aliases for the assembler. The
details are described in section 2.2 of the architecture
reference manual supplement for SVE.
In short:
SVE alias => AArch64 name
--------------------------
NONE => EQ
ANY => NE
NLAST => HS
LAST => LO
FIRST => MI
NFRST => PL
PMORE => HI
PLAST => LS
TCONT => GE
TSTOP => LT
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48869
llvm-svn: 336245
|
|
|
|
|
|
| |
Loads and stores less than 64-bits are already atomic, this adds support for a special case thereof. This needs to be expanded.
llvm-svn: 336236
|
|
|
|
| |
llvm-svn: 336232
|
|
|
|
|
|
| |
Vectorization can create them.
llvm-svn: 336227
|
|
|
|
|
|
| |
pasted. NFC
llvm-svn: 336226
|
|
|
|
| |
llvm-svn: 336223
|
|
|
|
|
|
| |
definitions
llvm-svn: 336222
|
|
|
|
|
|
|
|
| |
This patch adds a new token type specifically for (%dx). We will now always create this token when we parse (%dx). After all operands have been parsed, if the mnemonic is in/out we'll morph this token to a regular register token. Otherwise we keep it as the special DX token which won't match any instructions.
This removes the need for passing Mnemonic through the parsing functions. It also seems closer to gas where when its used on the wrong instruction it just gets diagnosed as an invalid operand rather than a bad memory address.
llvm-svn: 336218
|
|
|
|
|
|
|
|
| |
This might make the error message added in r335668 unneeded, but I'm not sure yet.
The check for RIP is technically unnecessary since RIP is in GR64, but that fact is kind of surprising so be explicit.
llvm-svn: 336217
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The variants added in this patch are:
- Predicated Complex floating point ADD with rotate, e.g.
fcadd z0.h, p0/m, z0.h, z1.h, #90
- Predicated Complex floating point MLA with rotate, e.g.
fcmla z0.h, p0/m, z1.h, z2.h, #180
- Unpredicated Complex floating point MLA with rotate (indexed operand), e.g.
fcmla z0.h, p0/m, z1.h, z2.h[0], #180
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48824
llvm-svn: 336210
|
|
|
|
|
|
|
|
|
|
| |
unselectable stores.
r336120 resulted in falling back to SelectionDAG more often due to the G_STORE
MMOs not matching the vreg size. This fixes that by explicitly any-extending the
value.
llvm-svn: 336209
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unpredicated FP-multiply of SVE vector with a vector-element given by
vector[index], for example:
fmul z0.s, z1.s, z2.s[0]
which performs an unpredicated FP-multiply of all 32-bit elements in
'z1' with the first element from 'z2'.
This patch adds restricted register classes for SVE vectors:
ZPR_3b (only z0..z7 are allowed) - for indexed vector of 16/32-bit elements.
ZPR_4b (only z0..z15 are allowed) - for indexed vector of 64-bit elements.
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48823
llvm-svn: 336205
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch includes support for the following instructions:
ABS z0.h, p0/m, z0.h
NEG z0.h, p0/m, z0.h
(S|U)XTB z0.h, p0/m, z0.h
(S|U)XTB z0.s, p0/m, z0.s
(S|U)XTB z0.d, p0/m, z0.d
(S|U)XTH z0.s, p0/m, z0.s
(S|U)XTH z0.d, p0/m, z0.d
(S|U)XTW z0.d, p0/m, z0.d
llvm-svn: 336204
|
|
|
|
|
|
|
|
|
|
| |
With a view to support parallel operations that have their results
stored to memory, refactor the consecutive access helper out so it
could support stores instructions.
Differential Revision: https://reviews.llvm.org/D48872
llvm-svn: 336195
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the following system registers:
- RAS registers,
- MPAM registers,
- Activitiy monitor registers,
- Trace Extension registers,
- Timing insensitivity of data processing instructions,
- Enhanced Support for Nested Virtualization.
Differential Revision: https://reviews.llvm.org/D48871
llvm-svn: 336193
|
|
|
|
|
|
| |
This reverts commit r336113. It causes crashes.
llvm-svn: 336189
|
|
|
|
|
|
|
|
|
|
| |
The variants added are:
signed Saturating ADD/SUB (immediate) e.g. sqadd z0.h, z0.h, #42
unsigned Saturating ADD/SUB (immediate) e.g. uqadd z0.h, z0.h, #42
signed Saturating ADD/SUB (vectors) e.g. sqadd z0.h, z0.h, z1.h
unsigned Saturating ADD/SUB (vectors) e.g. uqadd z0.h, z0.h, z1.h
llvm-svn: 336186
|
|
|
|
|
|
|
|
|
|
|
| |
Lower more than 4 arguments using stack. This patch targets MIPS32.
It supports only functions with arguments of type i32.
Patch by Petar Avramovic.
Differential Revision: https://reviews.llvm.org/D47934
llvm-svn: 336185
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Contains the following variants:
- Compare with (elements from) other vector
instructions: fcmeq, fcmgt, fcmge, fcmne, fcmuo.
aliases: fcmle, fcmlt.
e.g. fcmle p0.h, p0/z, z0.h, z1.h => fcmge p0.h, p0/z, z1.h, z0.h
- Compare absolute values with (absolute values from) other vector.
instructions: facge, facgt.
aliases: facle, faclt.
e.g. facle p0.h, p0/z, z0.h, z1.h => facge p0.h, p0/z, z1.h, z0.h
- Compare vector elements with #0.0
instructions: fcmeq, fcmgt, fcmge, fcmle, fcmlt, fcmne.
e.g. fcmle p0.h, p0/z, z0.h, #0.0
llvm-svn: 336182
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Add support for atomic store instructions.
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D48839
llvm-svn: 336145
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: efriedma, rogfer01, javed.absar
Reviewed By: efriedma, rogfer01
Subscribers: kristof.beyls, chrib, llvm-commits
Differential Revision: https://reviews.llvm.org/D48846
llvm-svn: 336144
|
|
|
|
|
|
|
|
|
|
|
| |
LLVM doesn't guarantee anything about the high bits of a register holding
an i1 value at the IR level, so don't translate LLVM IR i1 values directly
into WebAssembly conditional branch operands. WebAssembly's conditional
branches do demand all 32 bits be valid.
Fixes PR38019.
llvm-svn: 336138
|