| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
| |
instructions under AVX512.
This matches what we do for sint_to_fp.
llvm-svn: 332205
|
|
|
|
| |
llvm-svn: 332198
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pass is
a) broken.
b) r600 specific.
Fixing (a) is a bit more non-trivial, but fixing (b)
is easy. Move this pass to being R600 only for now.
This pass does pass all the unit tests, however clang
no longer generates code that looks like the unit test
input, so fixing the pass requires fixing the tests and
the pass as one, and checking it works with clang still.
Patch by Dave Airlie
llvm-svn: 332196
|
|
|
|
|
|
|
| |
This is apparently necessary to stop undef from being
turned into a build_vector of 0s.
llvm-svn: 332195
|
|
|
|
|
|
| |
instructions.
llvm-svn: 332189
|
|
|
|
| |
llvm-svn: 332187
|
|
|
|
|
|
| |
clang has used for a very long time.
llvm-svn: 332186
|
|
|
|
|
|
|
|
| |
comment in the same revision but missed it.
Thanks to Dimitry Andric for catching this!
llvm-svn: 332177
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
As reported in PR37264, in some cases the X86 Domain Reassignment
`runOnMachineFunction()` is called twice. Because it only deletes the
`.second` members of its `InstrConverterBaseMap`, and does not clean up
the map itself, this can lead to double frees and crashes.
Use `DeleteContainerSeconds()` instead, so the `Converters` map can
safely be reinitialized and its members re-deleted for each X86 Domain
Reassignment pass.
Reviewers: guyblank, craig.topper
Reviewed By: craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46425
llvm-svn: 332176
|
|
|
|
| |
llvm-svn: 332173
|
|
|
|
| |
llvm-svn: 332172
|
|
|
|
|
|
|
|
| |
with an older intrinsic and a select.
This is what clang already uses.
llvm-svn: 332170
|
|
|
|
|
|
|
|
|
|
| |
We cannot query this attribute from a subtarget given a machine function.
At this point attribute itself is already unavailable and can only be
obtained through MFI.
Differential Revision: https://reviews.llvm.org/D46781
llvm-svn: 332166
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D46153
llvm-svn: 332154
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We have no logic to promote alloca to vector for an AddrSpaceCast instruction.
Reviewer:
arsenm
Differential Revision:
https://reviews.llvm.org/D45993
llvm-svn: 332147
|
|
|
|
|
|
| |
used by clang.
llvm-svn: 332146
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove a useless SwitchSection which also causes compilation failure
when IR contains comdat.
The SwitchSection is useless because the current section is already
correct text section for the function therefore no need to switch.
It causes compilation failure for comdat because functions with comdat
has specific text section, not the default .text section.
Since HIP uses comdat, this bug caused failures for HIP.
Differential Revision: https://reviews.llvm.org/D46770
llvm-svn: 332137
|
|
|
|
|
|
| |
We still need to handle mmx/xmm moves as 'decode-only' no-pipe instructions
llvm-svn: 332109
|
|
|
|
|
|
|
|
|
|
| |
These directives allow the 'C' (compressed) extension to be enabled/disabled
within a single file.
Differential Revision: https://reviews.llvm.org/D45864
Patch by Kito Cheng
llvm-svn: 332107
|
|
|
|
|
|
| |
Fixes an issue on SLM/Btver2 where we had instructions were being treated as scalar loads/stores
llvm-svn: 332104
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
performPostLD1Combine in AArch64ISelLowering looks for vector
insert_vector_elt of a loaded value which it can optimize into a single
LD1LANE instruction. The code checking for the pattern was not checking
if the lane index was a constant which could cause two problems:
- an assert when lowering the LD1LANE ISD node since it assumes an
constant operand
- an assert in isel if the lane index value depends on the
post-incremented base register
Both of these issues are avoided by simply checking that the lane index
is a constant.
Fixes bug 35822.
Reviewers: t.p.northover, javed.absar
Subscribers: rengolin, kristof.beyls, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D46591
llvm-svn: 332103
|
|
|
|
| |
llvm-svn: 332102
|
|
|
|
|
|
|
|
| |
Reviewers: atanasyan, smaksimovic, abeserminji
Differential Revision: https://reviews.llvm.org/D46392
llvm-svn: 332097
|
|
|
|
|
|
|
|
| |
Confirmed by both Agner and Intel's AOM - the IEC/FPC are not required for pure load/stores (even if its a partial update).
Can't fix WriteStore until all RMW instructions are cleaned up though....
llvm-svn: 332096
|
|
|
|
|
|
| |
Fixes a SNB issue that was missing vlddqu/vmovntdqa ymm instructions
llvm-svn: 332094
|
|
|
|
|
|
| |
Nothing uses this yet but this will allow us to specialize MMX/XMM/YMM/ZMM vector moves.
llvm-svn: 332090
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D45883
llvm-svn: 332082
|
|
|
|
| |
llvm-svn: 332079
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds a wrapper for std::distance() which works with ranges.
As it would be a common case to write `distance(predecessors(BB))`, this
also introduces `pred_size()` and `succ_size()` helpers to make that
easier to write.
Differential Revision: https://reviews.llvm.org/D46668
llvm-svn: 332057
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implements a new table-gen emitter to create tables for
a wasm disassembler, and a dissassembler to use them.
Comes with 2 tests, that tests a few instructions manually. Is also able to
disassemble large .wasm files with objdump reasonably.
Not working so well, to be addressed in followups:
- objdump appears to be passing an incorrect starting point.
- since the disassembler works an instruction at a time, and it is
disassembling stack instruction, it has no idea of pseudo register assignments.
These registers are required for the instruction printing code that follows.
For now, all such registers appear in the output as $0.
Patch by Wouter van Oortmerssen
Differential Revision: https://reviews.llvm.org/D45848
llvm-svn: 332052
|
|
|
|
|
|
|
|
|
|
|
|
| |
from r331958.
Clang's codegen now uses 128-bit masked load/store intrinsics in IR. The backend will widen to 512-bits on AVX512F targets.
So this patch adds patterns to detect codegen's widening and patterns for AVX512VL that don't get widened.
We may be able to drop some of the old patterns, but I leave that for a future patch.
llvm-svn: 332049
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D45881
llvm-svn: 332042
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm, nhaehnle
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, mgorny, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D45994
llvm-svn: 332039
|
|
|
|
|
|
|
| |
This was missing from r331961.
Caught by sanitizer bots.
llvm-svn: 332024
|
|
|
|
|
|
| |
Use instrs lists or merge multiple instregex patterns.
llvm-svn: 332022
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Accessing the members of a large data structures needs a lot of GEPs which
usually have large offsets due to the size of the underlying data structure. If
the offsets are too large to fit into the r+i addressing mode, these GEPs cannot
be sunk to their users' blocks and many extra registers are needed then to carry
the values of these GEPs.
This patch tries to split a large data struct starting from %base like the
following.
Before:
BB0:
%base =
BB1:
%gep0 = gep %base, off0
%gep1 = gep %base, off1
%gep2 = gep %base, off2
BB2:
%load1 = load %gep0
%load2 = load %gep1
%load3 = load %gep2
After:
BB0:
%base =
%new_base = gep %base, off0
BB1:
%new_gep0 = %new_base
%new_gep1 = gep %new_base, off1 - off0
%new_gep2 = gep %new_base, off2 - off0
BB2:
%load1 = load i32, i32* %new_gep0
%load2 = load i32, i32* %new_gep1
%load3 = load i32, i32* %new_gep2
In the above example, the struct is split into two parts. The first part still
starts from %base and the second part starts from %new_base. After the
splitting, %new_gep1 and %new_gep2 have smaller offsets and then can be sunk to
BB2 and folded into their users.
The algorithm to split data structure is simple and very similar to the work of
merging SExts. First, it collects GEPs that have large offsets when iterating
the blocks. Second, it splits the underlying data structures and updates the
collected GEPs to use smaller offsets.
Differential Revision: https://reviews.llvm.org/D42759
llvm-svn: 332015
|
|
|
|
| |
llvm-svn: 332006
|
|
|
|
| |
llvm-svn: 332002
|
|
|
|
|
|
|
|
| |
WriteVecALU/WriteVecLogic/WriteShuffle/WriteVarShuffle/WritePSADBW/WritePHAdd scheduler classes
Split off XMM classes from the default (MMX) classes.
llvm-svn: 331999
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a follow up to the rL330983. The patch teaches ld, sd, and lld
commands accept 32-bit memory offsets by replacing `mem_simm16` operand
to `mem_simmptr`. In fact, these commands should accept 64-bit offsets,
but so large offsets require another command expanding and will be
supported by a separate patch.
Differential Revision: https://reviews.llvm.org/D46629
llvm-svn: 331997
|
|
|
|
|
|
|
|
|
|
| |
This is a follow up to the rL330983. The patch teaches lh and lhu
commands accepts 32-bit memory offsets by replacing `mem_simm16` operand
to `mem_simmptr`.
Differential Revision: https://reviews.llvm.org/D46513
llvm-svn: 331996
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With nnan, there's no need for the masked merge / blend
sequence (that probably costs much more than the min/max
instruction).
Somewhere between clang 5.0 and 6.0, we started producing
these intrinsics for fmax()/fmin() in C source instead of
libcalls or fcmp/select. The backend wasn't prepared for
that, so we regressed perf in those cases.
Note: it's possible that other targets have similar problems
as seen here.
Noticed while investigating PR37403 and related bugs:
https://bugs.llvm.org/show_bug.cgi?id=37403
The IR FMF propagation cases still don't work. There's
a proposal that might fix those cases in D46563.
llvm-svn: 331992
|
|
|
|
|
|
|
|
| |
Reviewers: atanasyan, smaksimovic, abeserminji
Differential Revision: https://reviews.llvm.org/D46390
llvm-svn: 331969
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: craig.topper, RKSimon
Reviewed By: craig.topper, RKSimon
Differential Revision: https://reviews.llvm.org/D46539
llvm-svn: 331961
|
|
|
|
|
|
|
|
|
|
|
|
| |
Const/local/shared address spaces are all < 4GB and we can always use
32-bit pointers to access them. This has substantial performance impact
on kernels that uses shared memory for intermediary results.
The feature is disabled by default.
Differential Revision: https://reviews.llvm.org/D46147
llvm-svn: 331941
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: As per title. SETCCE is deprecated and will eventually be removed.
Reviewers: rogfer01, efriedma, rengolin, javed.absar
Subscribers: kristof.beyls, chrib, llvm-commits
Differential Revision: https://reviews.llvm.org/D46512
llvm-svn: 331929
|
|
|
|
|
|
|
|
|
|
|
|
| |
Author: FarhanaAleen
Reviewed By: rampitec
Subscribers: AMDGPU
Differential Revision: https://reviews.llvm.org/D46604
llvm-svn: 331920
|
|
|
|
|
|
|
|
|
|
| |
If a multiply is truncated, SimplifyDemandedBits
sometimes turns a zero_extend of the inputs into an
any_extend, which makes the known bits computation unhelpful.
Ignore these and compute known bits for the underlying value,
since we insert the correct extend type after.
llvm-svn: 331919
|
|
|
|
| |
llvm-svn: 331918
|
|
|
|
|
|
|
| |
If the variable shift amount has known bits, we can still reduce
the shift.
llvm-svn: 331917
|