| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
| |
Further improve compatibility with the GNU assembler.
Differential Revision: https://reviews.llvm.org/D50217
Patch by Kito Cheng.
llvm-svn: 339255
|
| |
|
|
|
|
|
|
|
|
|
|
| |
sra[w], slt and sltu with immediate
Match the GNU assembler in supporting immediate operands for these
instructions even when the reg-reg mnemonic is used.
Differential Revision: https://reviews.llvm.org/D50046
Patch by Kito Cheng.
llvm-svn: 339252
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D50427
llvm-svn: 339241
|
| |
|
|
|
|
|
|
| |
This adds codegen support for the vmov_n_f16 and vdup_n_f16 variants.
Differential Revision: https://reviews.llvm.org/D50329
llvm-svn: 339238
|
| |
|
|
|
|
|
|
| |
This adds codegen support for the vmul_lane_f16 and vmul_n_f16 variants.
Differential Revision: https://reviews.llvm.org/D50326
llvm-svn: 339232
|
| |
|
|
|
|
| |
This will read out of bounds. Found by asan.
llvm-svn: 339230
|
| |
|
|
|
|
|
|
| |
This adds codegen support for the different vcvt_f16 variants.
Differential Revision: https://reviews.llvm.org/D50393
llvm-svn: 339227
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D50238
llvm-svn: 339221
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixup test to check for GCN prefix
These patterns always zero extend the result even though it might need sign extension.
This has been broken since the addition of i16 support.
It has popped up in the mad_sat(char) test since the min(max()) combination is turned into v_med3, resulting in the following (incorrect) sequence:
v_mad_i16 v2, v10, v9, v11
v_med3_i32 v2, v2, v8, v7
Fixes mad_sat(char) piglit on VI.
Differential Revision: https://reviews.llvm.org/D49836
llvm-svn: 339190
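The failure mode described in this message can be modelled in a few lines of C++. This is only a sketch of the arithmetic, not the AMDGPU patterns themselves: the helper names zext16, sext16 and med3 are hypothetical, and the clamp bounds of a signed char are assumed for the mad_sat(char) case.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helpers: widen a 16-bit result held in the low half of a
// 32-bit register, either by zero extension or by sign extension.
constexpr int32_t zext16(uint16_t v) { return static_cast<int32_t>(v); }
constexpr int32_t sext16(uint16_t v) {
  return static_cast<int32_t>(static_cast<int16_t>(v));
}

// Signed 32-bit median of three, modelling what a med3-style clamp computes.
constexpr int32_t med3(int32_t a, int32_t b, int32_t c) {
  return std::max(std::min(a, b), std::min(std::max(a, b), c));
}
```

With a 16-bit result of 0xFFF6 (-10 as a signed value), the sign-extended input clamps to -10, while the zero-extended input is treated as 65526 and saturates to the upper bound.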
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Add missing SIMD types (v2f64) and binary ops. Also adds
tablegen support for automatically prepending a prefix byte to SIMD
opcodes.
Differential Revision: https://reviews.llvm.org/D50292
Patch by Thomas Lively
llvm-svn: 339186
|
| |
|
|
|
|
|
|
|
| |
Vgather must be in a packet with a store, which contradicts
the no-packets feature. As a consequence, gather/scatter could not be
used with no-packets. Relax this, and allow gather packets as exceptions
to the no-packets requirements.
llvm-svn: 339177
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch extends CFGSort pass to support exception handling. Once it
places a loop header, it does not place blocks that are not dominated by
the loop header until all the loop blocks are sorted. This patch extends
the same algorithm to exception 'catch' part, using the information
calculated by WebAssemblyExceptionInfo class.
Reviewers: dschuff, sunfish
Subscribers: sbc100, jgravelle-google, llvm-commits
Differential Revision: https://reviews.llvm.org/D46500
llvm-svn: 339172
|
| |
|
|
|
|
|
|
| |
nonvolatile_store/nonvolatile_load pattern fragment in TargetSelectionDAG.td
Differential Revision: https://reviews.llvm.org/D50358
llvm-svn: 339156
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D50236
llvm-svn: 339148
|
| |
|
|
|
|
|
| |
Update the comment in nextGroup since the ProcResourceCounters are no longer
always decremented by 1.
llvm-svn: 339140
|
| |
|
|
|
|
|
|
| |
Remove the redundant check against zero when updating ProcResourceCounters in
nextGroup(), as pointed out in https://reviews.llvm.org/D50187.
Review: Ulrich Weigand.
llvm-svn: 339139
|
| |
|
|
|
|
|
|
|
|
|
|
| |
When a potential jump instruction and its target are in the same segment, use
a jump instruction with an immediate field.
In cases where the offset does not fit the immediate field of a bc/j instruction,
the offset is stored into a register, and then a jump register instruction is used.
Differential Revision: https://reviews.llvm.org/D48019
llvm-svn: 339126
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This is necessary to add a VI specific builtin,
__builtin_amdgcn_s_dcache_wb. We already have an
overly specific feature for one of these builtins,
for s_memrealtime. I'm not sure whether it's better
to add more of those, or to get rid of that and merge
it with vi-insts.
Alternatively, maybe this logically goes with scalar-stores?
llvm-svn: 339104
|
| |
|
|
|
|
| |
Src0 doesn't really convey what the operand is. Passthru matches what's used in the documentation for the intrinsic this comes from.
llvm-svn: 339101
|
| |
|
|
|
|
| |
getValue is a more meaningful name for scatter than it is for gather. Split them and use getPassThru for gather.
llvm-svn: 339096
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Wasm does not have direct counterparts to some of LLVM IR's atomicrmw
instructions (min, max, umin, umax, and nand). This enables atomic
expansion using cmpxchg instruction within a loop for those atomicrmw
instructions.
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D49440
llvm-svn: 339084
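The general shape of such an expansion, sketched in C++ with std::atomic rather than in LLVM IR. The helper name atomicMaxViaCmpXchg is hypothetical and the signed max case is picked only as an illustration; the same loop applies to min, umin, umax and nand with a different update expression.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: emulate an atomic signed max with a compare-exchange
// loop. Returns the value that was in memory before the update, which is
// also what an atomicrmw instruction yields.
inline int32_t atomicMaxViaCmpXchg(std::atomic<int32_t> &mem, int32_t operand) {
  int32_t old = mem.load();
  int32_t desired;
  do {
    desired = old > operand ? old : operand;
    // On failure, compare_exchange_weak refreshes `old` with the current
    // memory contents, so the next iteration recomputes `desired`.
  } while (!mem.compare_exchange_weak(old, desired));
  return old;
}
```

The loop retries only when another thread changed the location between the load and the compare-exchange, so the update is applied to a value that was actually observed in memory.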
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The spec only defines a SIMD expression type of V128 and
leaves interpretation of different vector types to the instructions.
Differential Revision: https://reviews.llvm.org/D50367
Patch by Thomas Lively
llvm-svn: 339082
|
| |
|
|
| |
llvm-svn: 339078
|
| |
|
|
| |
llvm-svn: 339077
|
| |
|
|
|
|
|
| |
This usually avoids some re-packing code, and may
help find canonical sources.
llvm-svn: 339072
|
| |
|
|
|
|
| |
This will make more complex combines easier.
llvm-svn: 339070
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Everything should quiet, and I think everything should
flush.
I assume the min3/med3/max3 follow the same rules
as regular min/max for flushing, which should at
least be conservatively correct.
There are still more operations that need to
be handled.
llvm-svn: 339065
|
| |
|
|
|
|
|
|
|
| |
Not sure why this was checking for denormals for f16.
My interpretation of the IEEE standard is conversions
should produce a canonical result, and the ISA manual
says denormals are created when appropriate.
llvm-svn: 339064
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If denormals are enabled, denormals are canonical.
Also fix a few other issues. minnum/maxnum are supposed
to canonicalize. Temporarily improve workaround for the
instruction behavior change in gfx9.
Handle selects and fcopysign.
The tests were also largely broken, since they were
checking for a flush used on some targets after the
store of the result.
llvm-svn: 339061
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Expand isFNEG so that we generate the appropriate F(N)M(ADD|SUB)
instructions in more cases. For example, the following sequence
a = _mm256_broadcast_ss(f)
d = _mm256_fnmadd_ps(a, b, c)
generates an fsub and fma without this patch and an fnma with this
change.
Reviewers: craig.topper
Subscribers: llvm-commits, davidxl, wmi
Differential Revision: https://reviews.llvm.org/D48467
llvm-svn: 339043
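For reference, a scalar model of what an fnmadd computes, -(a * b) + c with a single rounding. This is a sketch using std::fma; the helper name fnmaddScalar is hypothetical and is not part of the patch.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar model of a fused negate-multiply-add: -(a * b) + c.
// Negating one multiplicand before the fused operation gives the same
// result, which is why recognizing the negation allows a single instruction.
inline float fnmaddScalar(float a, float b, float c) {
  return std::fma(-a, b, c);
}
```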
|
| |
|
|
|
|
|
|
|
|
| |
sure the store isn't volatile
If the store is volatile this might be a memory mapped IO access. In that case we shouldn't generate a load that didn't exist in the source.
Differential Revision: https://reviews.llvm.org/D50270
llvm-svn: 339041
|
| |
|
|
|
|
| |
Appears from expansion of some packed cases.
llvm-svn: 339025
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Ensure that NormalizedBuildVector returns a BUILD_VECTOR with operands of the
same type. This fixes an assertion failure in VerifySDNode.
Reviewers: SjoerdMeijer, t.p.northover, javed.absar
Reviewed By: SjoerdMeijer
Subscribers: kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D50202
llvm-svn: 339013
|
| |
|
|
|
|
|
|
|
| |
ld64 supplies its own Thumb bit for Thumb functions, and intentionally zeroes
out that part of any addend in an object file. But it only does that for
symbols marked N_EXT -- i.e. external symbols. So LLVM should avoid setting
that extra bit in other cases.
llvm-svn: 339007
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch improves Inliner to provide causes/reasons for negative inline decisions.
1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message.
2. Several functions that used to return the inlining results as boolean are changed to return InlineResult which carries the cause for a negative decision.
3. Changed remark printing and debug output messages to provide the additional messages and related inline cost.
4. Adjusted tests for changed printing.
Patch by: yrouban (Yevgeny Rouban)
Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00
Reviewed By: tejohnson, xbolva00
Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith
Differential Revision: https://reviews.llvm.org/D49412
llvm-svn: 338969
|
| |
|
|
|
|
| |
Despite the comment removed in this patch, this is beneficial when the RHS of the sub is a register.
llvm-svn: 338930
|
| |
|
|
|
|
|
|
|
|
|
|
| |
and the normal instructions instead
At one point in time acquire implied mayLoad and mayStore as did release. Thus we needed separate pseudos that also carried that property. This appears to no longer be the case. I believe it was changed in 2012 with a comment saying that atomic memory accesses are marked volatile which preserves the ordering.
So from what I can tell we shouldn't need additional pseudos since they don't carry any flags that are different from the normal instructions. The only thing I can think of is that we may consider them as load folding candidates in the peephole pass now where we didn't before. If that's important hopefully there's something in the memory operand we can check to prevent the folding without relying on pseudo instructions.
Differential Revision: https://reviews.llvm.org/D50212
llvm-svn: 338925
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Add a parameter for testing specifically for
sNaNs - at least one instruction pattern on AMDGPU
needs to check specifically for this.
Also handle more cases, and add a target hook
for custom nodes, similar to the hooks for known
bits.
llvm-svn: 338910
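The distinction between any NaN and a signaling NaN can be illustrated on raw IEEE 754 binary32 bit patterns. This is a sketch only: the helper names are hypothetical, and it assumes the common encoding in which the most significant mantissa bit is the quiet bit.

```cpp
#include <cstdint>

// Hypothetical helpers operating on the raw bits of a binary32 value.
// A NaN has an all-ones exponent and a nonzero mantissa.
constexpr bool isNaNBits(uint32_t bits) {
  return (bits & 0x7FFFFFFFu) > 0x7F800000u;
}

// Assumes the quiet bit is mantissa bit 22; a signaling NaN has it clear.
constexpr bool isSignalingNaNBits(uint32_t bits) {
  return isNaNBits(bits) && (bits & 0x00400000u) == 0;
}
```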
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
libdevice in recent CUDA versions relies on __nvvm_reflect() to select
GPU-specific bitcode. This patch addresses the requirement.
Reviewers: jlebar
Subscribers: jholewinski, sanjoy, hiraditya, bixia, llvm-commits
Differential Revision: https://reviews.llvm.org/D50207
llvm-svn: 338908
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
enable better codegen
Clang uses "ctpop & 1" to implement __builtin_parity. If the popcnt instruction isn't supported this generates a large amount of code to calculate the population count. Instead we can bisect the data down to a single byte using xor and then check the parity flag.
Even when popcnt is supported, it's still a good idea to split 64-bit data on 32-bit targets using an xor in front of a single popcnt. Otherwise we get two popcnts and an add before the and.
I've specifically targeted this at the sizes supported by clang builtins, but we could generalize this if we think that's useful.
Differential Revision: https://reviews.llvm.org/D50165
llvm-svn: 338907
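The xor-folding idea can be sketched in portable C++. The helper names parity32 and parity64 are hypothetical; where the message describes checking the parity flag on the final byte, this sketch simply keeps folding down to one bit.

```cpp
#include <cstdint>

// Hypothetical helper: parity of a 32-bit value without a population count.
// XOR-ing the two halves of a value preserves its parity, so the data can be
// bisected repeatedly.
constexpr unsigned parity32(uint32_t x) {
  x ^= x >> 16;
  x ^= x >> 8;  // parity now lives in the low byte
  x ^= x >> 4;
  x ^= x >> 2;
  x ^= x >> 1;
  return x & 1u;
}

// For 64-bit data, one xor of the two 32-bit halves in front of a single
// 32-bit parity computation is enough.
constexpr unsigned parity64(uint64_t x) {
  return parity32(static_cast<uint32_t>(x) ^ static_cast<uint32_t>(x >> 32));
}
```

Both functions return 1 for an odd number of set bits and 0 for an even number, matching "ctpop & 1".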
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D44030
llvm-svn: 338894
|
| |
|
|
|
|
|
|
|
|
|
| |
Some instructions expand to more than one decoder group.
This has been hitherto ignored, but is handled with this patch.
Review: Ulrich Weigand
https://reviews.llvm.org/D50187
llvm-svn: 338849
|
| |
|
|
|
|
|
|
| |
This is addressing PR38404.
Differential Revision: https://reviews.llvm.org/D50186
llvm-svn: 338835
|
| |
|
|
|
|
| |
This is addressing PR38404.
llvm-svn: 338830
|
| |
|
|
|
|
|
|
|
|
| |
the Select method in X86ISelDAGToDAG.cpp instead.
There are a lot of permutations of types here generating a lot of patterns in the isel table. It's more efficient to just ReplaceUses and RemoveDeadNode from the Select function.
The test changes are because we have some shuffle patterns that have a bitcast as their root node. But the behavior is identical to another instruction whose pattern doesn't start with a bitcast. So this isn't a functional change.
llvm-svn: 338824
|
| |
|
|
|
|
|
|
|
|
|
|
| |
instructions.
Move all the patterns to X86InstrVecCompiler.td so we can keep SSE/AVX/AVX512 all in one place.
To save some patterns we'll use an existing DAG combine to convert f128 fand/for/fxor to integer when sse2 is enabled. This allows us to reuse all the existing patterns for v2i64.
I believe this now makes SHA instructions the only case where VEX/EVEX and legacy encoded instructions could be generated simultaneously.
llvm-svn: 338821
|
| |
|
|
|
|
|
|
| |
YMM/ZMM, make sure the producing instruction is VEX/XOP/EVEX encoded.
If the producing instruction is legacy encoded it doesn't implicitly zero the upper bits. This is important for the SHA instructions which don't have a VEX encoded version. We might also be able to hit this with the incomplete f128 support that hasn't been ported to VEX.
llvm-svn: 338812
|
| |
|
|
|
|
| |
I'm assuming the R13 restriction extends to R13D. Guessing this restriction is related to the funny encoding of this register as base always requiring a displacement to be encoded.
llvm-svn: 338806
|
| |
|
|
|
|
|
|
| |
atomic load and atomic store.
This makes them consistent with i8/i32/i64, which still seems to be more aggressive on folding than icc, gcc, or MSVC.
llvm-svn: 338795
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
By not reconstructing the operand list of the SDNode, this change makes
it easier to add the forthcoming new tbuffer and buffer intrinsics.
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D49995
Change-Id: I0cb79ef0801532645d7dd954a6d7355139db7b38
llvm-svn: 338784
|