| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
| |
Ensure the XOR in the waterfall loop for indirect addressing is considered a terminator.
Differential Revision: https://reviews.llvm.org/D57703
llvm-svn: 353207
|
| |
|
|
|
|
|
|
|
| |
The v2i64 argument is lowered to a bitcast of v4i32 build_vector.
This would then attempt to use the i32-element as the source of the
vector truncate. This really would need to collect 2 elements from the
build_vector to produce the intended truncate.
llvm-svn: 353202
|
| |
|
|
|
|
|
|
|
|
| |
The fewerElementsVectors implementation for load/stores
handles the scalar reduction case just as well, so drop
the redundant code in narrowScalar. This also introduces
support for narrowing irregular size breakdowns for
scalars.
llvm-svn: 353125
|
| |
|
|
|
|
|
|
|
|
| |
Don't handle vector conditions.
I think this can be merged in the future with
fewerElementsVectorSelect, although this becomes slightly tricky with
a vector condition.
llvm-svn: 353122
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Try to use the underlying source registers.
This enables legalization in more cases where some irregular
operations are widened and others narrowed.
This seems to make the test_combines_2 AArch64 test worse, since the
MERGE_VALUES has multiple uses. Since this should be required for
legalization, a hasOneUse check is probably inappropriate (or maybe
should only be used if the merge is legal?).
llvm-svn: 353121
|
| |
|
|
|
|
|
|
| |
A number of of tests were using imm operands, not cimm. Since CSE
relies on the exact ConstantInt* pointer used, and implicit
conversions are generally evil, also enforce the bitsize of the types.
llvm-svn: 353113
|
| |
|
|
|
|
| |
This was hiding bugs from never legalizing the source type.
llvm-svn: 353102
|
| |
|
|
|
|
|
| |
This was pulling the mov used for register indexing on gfx9 out of the
loop.
llvm-svn: 353101
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D57416
llvm-svn: 353083
|
| |
|
|
|
|
|
| |
Also add some more select tests to help show future legalization
changes.
llvm-svn: 353045
|
| |
|
|
|
|
|
|
|
| |
For the scalar case only.
Also move the similar G_MERGE_VALUES handling to a separate function
and cleanup to make them look more similar.
llvm-svn: 352979
|
| |
|
|
|
|
| |
Handle the basic element extract case.
llvm-svn: 352978
|
| |
|
|
| |
llvm-svn: 352976
|
| |
|
|
| |
llvm-svn: 352975
|
| |
|
|
| |
llvm-svn: 352974
|
| |
|
|
| |
llvm-svn: 352973
|
| |
|
|
|
|
|
|
|
| |
Prepare for future patch which affects codegen for calls to preemptible
functions.
Differential Revision: https://reviews.llvm.org/D57605
llvm-svn: 352920
|
| |
|
|
|
|
|
|
|
| |
This cleans up all LoadInst creation in LLVM to explicitly pass the
value type rather than deriving it from the pointer's element-type.
Differential Revision: https://reviews.llvm.org/D57172
llvm-svn: 352911
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Incorrect code was generated when lowering insertelement operations
for vectors with 8 or 16 bit elements. The value being inserted was
not adjusted for the position of the element within the 32 bit word
and so only the low element within each 32 bit word could receive
the intended value.
Fixed by simply replicating the value to each element of a
congruent vector before the mask and or operation used to
update the intended element.
A number of affected LIT tests have been updated appropriately.
before the mask & or into the intended
Reviewers: arsenm, nhaehnle
Reviewed By: arsenm
Subscribers: llvm-commits, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57588
llvm-svn: 352885
|
| |
|
|
| |
llvm-svn: 352720
|
| |
|
|
| |
llvm-svn: 352719
|
| |
|
|
|
|
| |
For AMDGPU the result is always 32-bit for 64-bit inputs.
llvm-svn: 352717
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is meant to be used with clang's __builtin_dynamic_object_size.
When 'true' is passed to this parameter, the intrinsic has the
potential to be folded into instructions that will be evaluated
at run time. When 'false', the objectsize intrinsic behaviour is
unchanged.
rdar://32212419
Differential revision: https://reviews.llvm.org/D56761
llvm-svn: 352664
|
| |
|
|
| |
llvm-svn: 352601
|
| |
|
|
| |
llvm-svn: 352599
|
| |
|
|
| |
llvm-svn: 352594
|
| |
|
|
|
|
| |
Also add some quick hacks to AMDGPU legality for the tests.
llvm-svn: 352591
|
| |
|
|
| |
llvm-svn: 352585
|
| |
|
|
|
|
|
| |
Not sure if the old AArch64 tests should be just
deleted or not.
llvm-svn: 352562
|
| |
|
|
| |
llvm-svn: 352560
|
| |
|
|
|
|
|
|
|
|
| |
This was ignoring the memory size, and producing multiple loads/stores
if the operand size was different from the memory size.
I assume this is the intent of not having an explicit G_ANYEXTLOAD
(although I think that would probably be better).
llvm-svn: 352523
|
| |
|
|
|
|
|
|
|
|
|
| |
I found a really strange WWM issue through a very convoluted shader that
essentially boils down to a bug in SIInstrInfo where canReadVGPR did not
correctly identify that WWM is like a copy and can have a VGPR as its
source.
Differential Revision: https://reviews.llvm.org/D56002
llvm-svn: 352500
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since these pass the pointer in m0 unlike other DS instructions, these
need to worry about whether the address is uniform or not. This
assumes the address is dynamically uniform, and just uses
readfirstlane to get a copy into an SGPR.
I don't know if these have the same 16-bit add for the addressing mode
offset problem on SI or not, but I've just assumed they do.
Also includes some misc. changes to avoid test differences between the
LDS and GDS versions.
llvm-svn: 352422
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Added the intrinsics llvm.amdgcn.interp.p1.f16() and
llvm.amdgcn.interp.p2.f16() and related LIT test.
The p1 intrinsic generates code appropriate for both 16 and 32
bank LDS.
Reviewers: #amdgpu, dstuttard, arsenm, tpr
Reviewed By: #amdgpu, arsenm
Subscribers: jvesely, mgorny, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D46754
llvm-svn: 352357
|
| |
|
|
|
|
|
| |
This is invalid for the same reason as in the narrowScalar handling
for load.
llvm-svn: 352334
|
| |
|
|
|
|
|
| |
I expected this to be automatically verified, but it seems
nothing uses that the type index was declared as a "ptype"
llvm-svn: 352319
|
| |
|
|
| |
llvm-svn: 352300
|
| |
|
|
| |
llvm-svn: 352298
|
| |
|
|
| |
llvm-svn: 352295
|
| |
|
|
| |
llvm-svn: 352294
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If bottom of block BB has only one successor OldTop, in most cases it is profitable to move it before OldTop, except the following case:
-->OldTop<-
| . |
| . |
| . |
---Pred |
| |
BB-----
Move BB before OldTop can't reduce the number of taken branches, this patch detects this case and prevent the moving.
Differential Revision: https://reviews.llvm.org/D57067
llvm-svn: 352236
|
| |
|
|
| |
llvm-svn: 352167
|
| |
|
|
| |
llvm-svn: 352166
|
| |
|
|
| |
llvm-svn: 352165
|
| |
|
|
| |
llvm-svn: 352162
|
| |
|
|
|
|
| |
Also legalize 64-bit compares for AMDGPU
llvm-svn: 352157
|
| |
|
|
| |
llvm-svn: 352155
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
https://reviews.llvm.org/D57178
Now add a hook in TargetPassConfig to query if CSE needs to be
enabled. By default this hook returns false only for O0 opt level but
this can be overridden by the target.
As a consequence of the default of enabled for non O0, a few tests
needed to be updated to not use CSE (by passing in -O0) to the run
line.
reviewed by: arsenm
llvm-svn: 352126
|
| |
|
|
| |
llvm-svn: 352123
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
With XNACK, an smem load whose result is coalesced with an operand (thus
it overwrites its own operand) cannot appear in a clause, because some
other instruction might XNACK and restart the whole clause.
The clause breaker already realized that an smem that overwrites an
operand cannot appear in a clause, and broke the clause. The problem
that this commit fixes is that the SIFormMemoryClauses optimization
formed a bundle with early clobber, which caused the earlier code that
set up the coalesced operand to be removed as dead.
Differential Revision: https://reviews.llvm.org/D57008
Change-Id: I703c4d5b0bf7d6060222bec491f45c18bb3c0016
llvm-svn: 351950
|