| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 287340
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The addr64-based legalization is incorrect for MUBUF instructions with idxen
set as well as for BUFFER_LOAD/STORE_FORMAT_* instructions. This affects
e.g. shaders that access buffer textures.
Since we never actually need the addr64-legalization in shaders, this patch
takes the easy route and keys off the calling convention. If this ever
affects (non-OpenGL) compute, the type of legalization needs to be chosen
based on some TSFlag.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98664
Reviewers: arsenm, tstellarAMD
Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D26747
llvm-svn: 287339
|
| |
|
|
|
|
| |
Identified by Pedro Giffuni in PR27636.
llvm-svn: 287338
|
| |
|
|
|
|
|
| |
Exploit new instructions by adding patterns to .td file.
https://reviews.llvm.org/D26551
llvm-svn: 287334
|
| |
|
|
|
|
| |
Identified by Pedro Giffuni in PR27636.
llvm-svn: 287333
|
| |
|
|
|
|
| |
Identified by Pedro Giffuni in PR27636.
llvm-svn: 287331
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we see a SETCC whose only users are zero extend operations, we can replace
it with a subtraction. This results in doing all calculations in GPRs and
avoids CR use.
Currently we do this only for ULT, ULE, UGT and UGE condition codes. There are
ways that this can be extended. For example for signed condition codes. In that
case we will be introducing additional sign extend instructions, so more careful
profitability analysis may be required.
Another direction to extend this is for equal, not equal conditions. Also when
users of SETCC are any_ext or sign_ext, we might be able to do something
similar.
llvm-svn: 287329
|
| |
|
|
|
|
|
|
|
|
| |
unmasked versions and selects.
The same thing was done to 32-bit and 64-bit element sizes previously.
This will allow us to support these shuffls in InstCombineCalls along with the other variable shift intrinsics.
llvm-svn: 287312
|
| |
|
|
| |
llvm-svn: 287311
|
| |
|
|
|
|
|
| |
There are still crashes on non-MVT types in other
places.
llvm-svn: 287310
|
| |
|
|
|
|
|
|
|
|
| |
since bpf instruction set was introduced people learned to
read and understand kernel verifier output whereas llvm asm
output stayed obscure and unknown. Convert llvm to emit
assembler text similar to kernel to avoid this discrepancy
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 287300
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This extends FCOPYSIGN support to 512-bit vectors.
I've also added tests to show what the 128-bit and 256-bit cases look like with broadcast loads.
Reviewers: delena, zvi, RKSimon, spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26791
llvm-svn: 287298
|
| |
|
|
|
|
| |
Identified by Pedro Giffuni in PR27636.
llvm-svn: 287248
|
| |
|
|
|
|
| |
Identified by Pedro Giffuni in PR27636.
llvm-svn: 287247
|
| |
|
|
|
|
|
|
| |
This reverts commit r287146.
This breaks few conformance tests.
llvm-svn: 287233
|
| |
|
|
| |
llvm-svn: 287224
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
vXi64 multiplication is lowered into 3 calls of vpmuludq with the upper/lower 32-bit halves.
If any of these halves are zero then we can remove individual calls. Although there was isBuildVectorAllZeros code to do this I don't think it ever worked (maybe just for constant folded cases that don't seem to be tested for any longer).
This requires additional X86ISD support for computeKnownBitsForTargetNode, so far I've just added support for X86ISD::VZEXT (VPMOVZX* - helping the AVX2+ cases).
Partial fix for PR30845
Differential Revision: https://reviews.llvm.org/D26590
llvm-svn: 287223
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Variadic functions can be treated in the same way as normal functions
with respect to the number and types of parameters.
Reviewers: grosbach, olista01, t.p.northover, rengolin
Subscribers: javed.absar, aemerson, llvm-commits
Differential Revision: https://reviews.llvm.org/D26748
llvm-svn: 287219
|
| |
|
|
|
|
|
|
|
|
| |
Register Calling Convention defines a new behavior for v64i1 types.
This type should be saved in GPR.
However for 32 bit machine we need to split the value into 2 GPRs (because each is 32 bit).
Differential Revision: https://reviews.llvm.org/D26181
llvm-svn: 287217
|
| |
|
|
| |
llvm-svn: 287211
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds XRay support in LLVM for AArch64 targets.
This patch is one of a series:
Clang: https://reviews.llvm.org/D26415
compiler-rt: https://reviews.llvm.org/D26413
Author: rSerge
Reviewers: rengolin, dberris
Subscribers: amehsan, aemerson, llvm-commits, iid_iunknown
Differential Revision: https://reviews.llvm.org/D26412
llvm-svn: 287209
|
| |
|
|
|
|
| |
This patch updates a bunch of places where add_dependencies was being explicitly called to add dependencies on intrinsics_gen to instead use the DEPENDS named parameter. This cleanup is needed for a patch I'm working on to add a dependency debugging mode to the build system.
llvm-svn: 287206
|
| |
|
|
| |
llvm-svn: 287203
|
| |
|
|
| |
llvm-svn: 287201
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D26732
llvm-svn: 287199
|
| |
|
|
|
|
|
| |
The '-fpermissive' compiler flag complains if the template
specializations used in the class are used in a different namespace.
llvm-svn: 287176
|
| |
|
|
| |
llvm-svn: 287173
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
We save an inter-register file move this way. If there's any CPU where
the FP logic is slower, we could transform this back to int-logic in
MachineCombiner.
This helps, but doesn't solve, PR6137:
https://llvm.org/bugs/show_bug.cgi?id=6137
The 'andn' test shows that we're missing a pattern match to
recognize the xor with -1 constant as a 'not' op.
llvm-svn: 287171
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
A lot of the pseudo instructions are required because LLVM assumes that
all integers of the same size as the pointer size are legal. This means
that it will not currently expand 16-bit instructions to their 8-bit
variants because it thinks 16-bit types are legal for the operations.
This also adds all of the CodeGen tests that required the pass to run.
Reviewers: arsenm, kparzysz
Subscribers: wdng, mgorny, modocache, llvm-commits
Differential Revision: https://reviews.llvm.org/D26577
llvm-svn: 287162
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
We only ever create TargetConstantPool, TargetJumpTable, TargetExternalSymbol,
TargetGlobalAddress, TargetGlobalTLSAddress, MCSymbol and TargetBlockAddress
nodes as operands of X86ISD::Wrapper nodes, so we can remove one check and
invert the other.
Also update the documentation comment for X86ISD::Wrapper.
Differential Revision: https://reviews.llvm.org/D26731
llvm-svn: 287160
|
| |
|
|
|
|
|
|
|
| |
One half of the shifts obviously needed conditional selection based on whether
the shift amount is more than 32-bits, but leaving the other half as the
natural shift isn't acceptable either: it's undefined behaviour to shift a
32-bit value by more than 31.
llvm-svn: 287149
|
| |
|
|
|
|
|
| |
This fixes a probably unintended divergence from the default
scheduler behavior.
llvm-svn: 287146
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Extend replaceZeroVectorStore to handle more vector type stores,
floating point zero vectors and set alignment more accurately on split
stores.
This is a follow-up change to r286875.
This change fixes PR31038.
Reviewers: MatzeB
Subscribers: mcrosier, aemerson, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D26682
llvm-svn: 287142
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
1. Don't try to copy values to and from the same register class.
2. Replace copies with of registers with immediate values with v_mov/s_mov
instructions.
The main purpose of this change is to make MachineSink do a better job of
determining when it is beneficial to split a critical edge, since the pass
assumes that copies will become move instructions.
This prevents a regression in uniform-cfg.ll if we enable critical edge
splitting for AMDGPU.
Reviewers: arsenm
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: https://reviews.llvm.org/D23408
llvm-svn: 287131
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
some bytes
We can replace "scalar" FP-bitwise-logic with other forms of bitwise-logic instructions.
Scalar SSE/AVX FP-logic instructions only exist in your imagination and/or the bowels of
compilers, but logically equivalent int, float, and double variants of bitwise-logic
instructions are reality in x86, and the float variant may be a shorter instruction
depending on which flavor (SSE or AVX) of vector ISA you have...so just prefer float all
the time.
This is a preliminary step towards solving PR6137:
https://llvm.org/bugs/show_bug.cgi?id=6137
Differential Revision:
https://reviews.llvm.org/D26712
llvm-svn: 287122
|
| |
|
|
|
|
|
|
|
|
|
|
| |
generic IR
Both the (V)CVTDQ2PD (i32 to f64) and (V)CVTUDQ2PD (u32 to f64) conversion instructions are lossless and can be safely represented as generic SINT_TO_FP/UINT_TO_FP calls instead of x86 intrinsics without affecting final codegen.
LLVM counterpart to D26686
Differential Revision: https://reviews.llvm.org/D26736
llvm-svn: 287108
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MipsFastISel uses a a class to represent addresses with a signed member
to represent the offset. MipsFastISel::emitStore, emitLoad and computeAddress
all treated the offset as being positive. In cases where the offset was
actually negative and a frame pointer was used, this would cause the constant
synthesis routine to crash as it would generate an unexpected instruction
sequence when frame indexes are replaced.
Reviewers: vkalintiris
Differential Revision: https://reviews.llvm.org/D26192
llvm-svn: 287099
|
| |
|
|
|
|
|
|
|
|
|
| |
This patch adds the single operand form of the not alias to microMIPS and
MIPS along with additional tests.
This partially resolves PR/30381.
Thanks to Sean Bruno for reporting the issue!
llvm-svn: 287097
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D26128
llvm-svn: 287087
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: These intrinsics have been unused for clang for a while. This patch removes them. We auto upgrade them to extractelements, a scalar operation and then an insertelement. This matches the sequence used by clangs intrinsic file.
Reviewers: zvi, delena, RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26660
llvm-svn: 287083
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D26711
llvm-svn: 287077
|
| |
|
|
|
|
|
|
|
| |
Doing this before register allocation reduces register pressure as we do
not even have to allocate a register for those dead definitions.
Differential Revision: https://reviews.llvm.org/D26111
llvm-svn: 287076
|
| |
|
|
|
|
|
|
|
|
| |
- Select `select` to `v_cndmask_b32`
- Expand `select_cc`
- Refactor patterns
Differential Revision: https://reviews.llvm.org/D26714
llvm-svn: 287074
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For the default, small and medium code model, use the existing
difference from the jump table towards the label. For all other code
models, setup the picbase and use the difference between the picbase and
the block address.
Overall, this results in smaller data tables at the expensive of one or
two more arithmetic operation at the jump site. Given that we only create
jump tables with a lot more than two entries, it is a net win in size.
For larger code models the assumption remains that individual functions
are no larger than 2GB.
Differential Revision: https://reviews.llvm.org/D26336
llvm-svn: 287059
|
| |
|
|
|
|
|
|
|
|
| |
wbinvl.* are vector instruction that do not sue vector registers.
v2: check only M?BUF instructions
Differential Revision: https://reviews.llvm.org/D26633
llvm-svn: 287056
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D26673
llvm-svn: 287036
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D26670
llvm-svn: 287035
|
| |
|
|
| |
llvm-svn: 287027
|
| |
|
|
|
|
|
| |
Also respect the TII hook for these like the generic code does
in case we want a flag later to disable this.
llvm-svn: 287021
|
| |
|
|
|
|
|
|
|
|
|
| |
Lower a = b * C where C = (2^n + 1) * 2^m to
add w0, w0, w0, lsl n
lsl w0, w0, m
Differential Revision: https://reviews.llvm.org/D229245
llvm-svn: 287019
|