| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
Identified by Pedro Giffuni in PR27636.
llvm-svn: 287248
|
| |
|
|
|
|
| |
Identified by Pedro Giffuni in PR27636.
llvm-svn: 287247
|
| |
|
|
|
|
|
|
| |
This reverts commit r287146.
This breaks few conformance tests.
llvm-svn: 287233
|
| |
|
|
| |
llvm-svn: 287224
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
vXi64 multiplication is lowered into 3 calls of vpmuludq with the upper/lower 32-bit halves.
If any of these halves are zero then we can remove individual calls. Although there was isBuildVectorAllZeros code to do this I don't think it ever worked (maybe just for constant folded cases that don't seem to be tested for any longer).
This requires additional X86ISD support for computeKnownBitsForTargetNode, so far I've just added support for X86ISD::VZEXT (VPMOVZX* - helping the AVX2+ cases).
Partial fix for PR30845
Differential Revision: https://reviews.llvm.org/D26590
llvm-svn: 287223
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Variadic functions can be treated in the same way as normal functions
with respect to the number and types of parameters.
Reviewers: grosbach, olista01, t.p.northover, rengolin
Subscribers: javed.absar, aemerson, llvm-commits
Differential Revision: https://reviews.llvm.org/D26748
llvm-svn: 287219
|
| |
|
|
|
|
|
|
|
|
| |
Register Calling Convention defines a new behavior for v64i1 types.
This type should be saved in GPR.
However for 32 bit machine we need to split the value into 2 GPRs (because each is 32 bit).
Differential Revision: https://reviews.llvm.org/D26181
llvm-svn: 287217
|
| |
|
|
| |
llvm-svn: 287211
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds XRay support in LLVM for AArch64 targets.
This patch is one of a series:
Clang: https://reviews.llvm.org/D26415
compiler-rt: https://reviews.llvm.org/D26413
Author: rSerge
Reviewers: rengolin, dberris
Subscribers: amehsan, aemerson, llvm-commits, iid_iunknown
Differential Revision: https://reviews.llvm.org/D26412
llvm-svn: 287209
|
| |
|
|
|
|
| |
This patch updates a bunch of places where add_dependencies was being explicitly called to add dependencies on intrinsics_gen to instead use the DEPENDS named parameter. This cleanup is needed for a patch I'm working on to add a dependency debugging mode to the build system.
llvm-svn: 287206
|
| |
|
|
| |
llvm-svn: 287203
|
| |
|
|
| |
llvm-svn: 287201
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D26732
llvm-svn: 287199
|
| |
|
|
|
|
|
| |
The '-fpermissive' compiler flag complains if the template
specializations used in the class are used in a different namespace.
llvm-svn: 287176
|
| |
|
|
| |
llvm-svn: 287173
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
We save an inter-register file move this way. If there's any CPU where
the FP logic is slower, we could transform this back to int-logic in
MachineCombiner.
This helps, but doesn't solve, PR6137:
https://llvm.org/bugs/show_bug.cgi?id=6137
The 'andn' test shows that we're missing a pattern match to
recognize the xor with -1 constant as a 'not' op.
llvm-svn: 287171
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
A lot of the pseudo instructions are required because LLVM assumes that
all integers of the same size as the pointer size are legal. This means
that it will not currently expand 16-bit instructions to their 8-bit
variants because it thinks 16-bit types are legal for the operations.
This also adds all of the CodeGen tests that required the pass to run.
Reviewers: arsenm, kparzysz
Subscribers: wdng, mgorny, modocache, llvm-commits
Differential Revision: https://reviews.llvm.org/D26577
llvm-svn: 287162
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
We only ever create TargetConstantPool, TargetJumpTable, TargetExternalSymbol,
TargetGlobalAddress, TargetGlobalTLSAddress, MCSymbol and TargetBlockAddress
nodes as operands of X86ISD::Wrapper nodes, so we can remove one check and
invert the other.
Also update the documentation comment for X86ISD::Wrapper.
Differential Revision: https://reviews.llvm.org/D26731
llvm-svn: 287160
|
| |
|
|
|
|
|
|
|
| |
One half of the shifts obviously needed conditional selection based on whether
the shift amount is more than 32-bits, but leaving the other half as the
natural shift isn't acceptable either: it's undefined behaviour to shift a
32-bit value by more than 31.
llvm-svn: 287149
|
| |
|
|
|
|
|
| |
This fixes a probably unintended divergence from the default
scheduler behavior.
llvm-svn: 287146
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Extend replaceZeroVectorStore to handle more vector type stores,
floating point zero vectors and set alignment more accurately on split
stores.
This is a follow-up change to r286875.
This change fixes PR31038.
Reviewers: MatzeB
Subscribers: mcrosier, aemerson, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D26682
llvm-svn: 287142
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
1. Don't try to copy values to and from the same register class.
2. Replace copies with of registers with immediate values with v_mov/s_mov
instructions.
The main purpose of this change is to make MachineSink do a better job of
determining when it is beneficial to split a critical edge, since the pass
assumes that copies will become move instructions.
This prevents a regression in uniform-cfg.ll if we enable critical edge
splitting for AMDGPU.
Reviewers: arsenm
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: https://reviews.llvm.org/D23408
llvm-svn: 287131
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
some bytes
We can replace "scalar" FP-bitwise-logic with other forms of bitwise-logic instructions.
Scalar SSE/AVX FP-logic instructions only exist in your imagination and/or the bowels of
compilers, but logically equivalent int, float, and double variants of bitwise-logic
instructions are reality in x86, and the float variant may be a shorter instruction
depending on which flavor (SSE or AVX) of vector ISA you have...so just prefer float all
the time.
This is a preliminary step towards solving PR6137:
https://llvm.org/bugs/show_bug.cgi?id=6137
Differential Revision:
https://reviews.llvm.org/D26712
llvm-svn: 287122
|
| |
|
|
|
|
|
|
|
|
|
|
| |
generic IR
Both the (V)CVTDQ2PD (i32 to f64) and (V)CVTUDQ2PD (u32 to f64) conversion instructions are lossless and can be safely represented as generic SINT_TO_FP/UINT_TO_FP calls instead of x86 intrinsics without affecting final codegen.
LLVM counterpart to D26686
Differential Revision: https://reviews.llvm.org/D26736
llvm-svn: 287108
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MipsFastISel uses a a class to represent addresses with a signed member
to represent the offset. MipsFastISel::emitStore, emitLoad and computeAddress
all treated the offset as being positive. In cases where the offset was
actually negative and a frame pointer was used, this would cause the constant
synthesis routine to crash as it would generate an unexpected instruction
sequence when frame indexes are replaced.
Reviewers: vkalintiris
Differential Revision: https://reviews.llvm.org/D26192
llvm-svn: 287099
|
| |
|
|
|
|
|
|
|
|
|
| |
This patch adds the single operand form of the not alias to microMIPS and
MIPS along with additional tests.
This partially resolves PR/30381.
Thanks to Sean Bruno for reporting the issue!
llvm-svn: 287097
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D26128
llvm-svn: 287087
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: These intrinsics have been unused for clang for a while. This patch removes them. We auto upgrade them to extractelements, a scalar operation and then an insertelement. This matches the sequence used by clangs intrinsic file.
Reviewers: zvi, delena, RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26660
llvm-svn: 287083
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D26711
llvm-svn: 287077
|
| |
|
|
|
|
|
|
|
| |
Doing this before register allocation reduces register pressure as we do
not even have to allocate a register for those dead definitions.
Differential Revision: https://reviews.llvm.org/D26111
llvm-svn: 287076
|
| |
|
|
|
|
|
|
|
|
| |
- Select `select` to `v_cndmask_b32`
- Expand `select_cc`
- Refactor patterns
Differential Revision: https://reviews.llvm.org/D26714
llvm-svn: 287074
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For the default, small and medium code model, use the existing
difference from the jump table towards the label. For all other code
models, setup the picbase and use the difference between the picbase and
the block address.
Overall, this results in smaller data tables at the expensive of one or
two more arithmetic operation at the jump site. Given that we only create
jump tables with a lot more than two entries, it is a net win in size.
For larger code models the assumption remains that individual functions
are no larger than 2GB.
Differential Revision: https://reviews.llvm.org/D26336
llvm-svn: 287059
|
| |
|
|
|
|
|
|
|
|
| |
wbinvl.* are vector instruction that do not sue vector registers.
v2: check only M?BUF instructions
Differential Revision: https://reviews.llvm.org/D26633
llvm-svn: 287056
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D26673
llvm-svn: 287036
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D26670
llvm-svn: 287035
|
| |
|
|
| |
llvm-svn: 287027
|
| |
|
|
|
|
|
| |
Also respect the TII hook for these like the generic code does
in case we want a flag later to disable this.
llvm-svn: 287021
|
| |
|
|
|
|
|
|
|
|
|
| |
Lower a = b * C where C = (2^n + 1) * 2^m to
add w0, w0, w0, lsl n
lsl w0, w0, m
Differential Revision: https://reviews.llvm.org/D229245
llvm-svn: 287019
|
| |
|
|
|
|
|
| |
Fixes giving up on clustering common addr64 accesses with
constant 0 soffset.
llvm-svn: 287018
|
| |
|
|
| |
llvm-svn: 287015
|
| |
|
|
| |
llvm-svn: 287013
|
| |
|
|
|
|
|
|
|
|
|
| |
The wave barrier represents the discardable barrier. Its main purpose is to
carry convergent attribute, thus preventing illegal CFG optimizations. All lanes
in a wave come to convergence point simultaneously with SIMT, thus no special
instruction is needed in the ISA. The barrier is discarded during code generation.
Differential Revision: https://reviews.llvm.org/D26585
llvm-svn: 287007
|
| |
|
|
| |
llvm-svn: 286993
|
| |
|
|
| |
llvm-svn: 286989
|
| |
|
|
|
|
| |
This silences some warnings that I didn't see with my host compiler.
llvm-svn: 286981
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This patch helps avoids poor legalization of boolean vector results (e.g. 8f32 -> 8i1 -> 8i16) that feed into SINT_TO_FP by inserting an early SIGN_EXTEND and so help improve the truncation logic.
This is not necessary for AVX512 targets where boolean vectors are legal - AVX512 manages to lower ( sint_to_fp vXi1 ) into some form of ( select mask, 1.0f , 0.0f ) in most cases.
Fix for PR13248
Differential Revision: https://reviews.llvm.org/D26583
llvm-svn: 286979
|
| |
|
|
|
|
|
|
|
| |
Move some code inside the proper 'if' block to make sure it is only run once,
when the subtarget is first created. Things can still break if we use different
ARM target machines or if we have functions with different 'target-cpu' or
'target-features', we should fix that too in the future.
llvm-svn: 286974
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This patch implements all the overloads for vec_xl_be and vec_xst_be. On BE,
they behaves exactly the same with vec_xl and vec_xst, therefore they are
simply implemented by defining a matching macro. On LE, they are implemented
by defining new builtins and intrinsics. For int/float/long long/double, it
is just a load (lxvw4x/lxvd2x) or store(stxvw4x/stxvd2x). For char/char/short,
we also need some extra shuffling before or after call the builtins to get the
desired BE order. For int128, simply call vec_xl or vec_xst.
llvm-svn: 286967
|
| |
|
|
|
|
| |
intrinsics. NFC
llvm-svn: 286961
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: This is needed to be able to use this flags in InstrMappings.
Reviewers: tstellarAMD, vpykhtin
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D26666
llvm-svn: 286960
|