| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D71676
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D71622
|
|
|
|
|
| |
Prepare this test for moving tthe denormal setting out of the
subtarget features.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: [AMDGPU] Fix bug introduced in 47a5c36b37f0
Reviewers: foad, arsenm
Reviewed By: arsenm
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69915
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Added estimations for ShuffleVector, some cast and arithmetic instructions
Reviewers: rampitec
Reviewed By: rampitec
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, zzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69629
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Added estimation for zero size insertelement, extractelement
and llvm.fabs operators.
Updated inline/unroll parameters default values.
Reviewers: rampitec, arsenm
Reviewed By: arsenm
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68881
llvm-svn: 375109
|
|
|
|
|
|
|
|
|
|
| |
For some reason multiple places need to do this, and the variant the
loop unroller and inliner use was not handling it.
Also, introduce a new wrapper to be slightly more precise, since on
AMDGPU some addrspacecasts are free, but not no-ops.
llvm-svn: 362436
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are a few different issues, mostly stemming from using
generation based checks for anything instead of subtarget
features. Stop adding flat-address-space as a feature for HSA, as it
should only be a device property. This was incorrectly allowing flat
instructions to select for SI.
Increase the default generation for HSA to avoid the encoding error
when emitting objects. This has some other side effects from various
checks which probably should be separate subtarget features (in the
cost model and for dealing with the DS offset folding issue).
Partial fix for bug 41070. It should probably be an error to try using
amdhsa without flat support.
llvm-svn: 356347
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
AMDGPU would like to have MVTs for v3i32, v3f32, v5i32, v5f32. This
commit does not add them, but makes preparatory changes:
* Fixed assumptions of power-of-2 vector type in kernel arg handling,
and added v5 kernel arg tests and v3/v5 shader arg tests.
* Added v5 tests for cost analysis.
* Added vec3/vec5 arg test cases.
Some of this patch is from Matt Arsenault, also of AMD.
Differential Revision: https://reviews.llvm.org/D58928
Change-Id: I7279d6b4841464d2080eb255ef3c589e268eabcd
llvm-svn: 356342
|
|
|
|
|
|
|
|
| |
This requires corresponding clang change.
Differential Revision: https://reviews.llvm.org/D40955
llvm-svn: 324101
|
|
|
|
|
|
| |
Also refine for f16 and rcp cases.
llvm-svn: 312213
|
|
|
|
|
|
|
| |
VOP3P instructions can encode access to either
half of the register.
llvm-svn: 302730
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.
Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).
llvm-svn: 298444
|
|
|
|
|
|
|
| |
This resolves bug 21148 by preventing promotion to
i64 induction variables.
llvm-svn: 264376
|
|
|
|
| |
llvm-svn: 264374
|
|
|
|
| |
llvm-svn: 264369
|
|
|
|
|
|
|
| |
Ideally this would also happen for fneg, but that
isn't a distinct operation in the IR.
llvm-svn: 264368
|
|
|
|
|
|
| |
We don't want to have a cost to scalarizing operations.
llvm-svn: 264364
|
|
|
|
|
|
| |
The default cost was 0 with the assumption that it is predictable.
llvm-svn: 255796
|
|
The cost for scalarized operations is computed as N * (scalar operation
cost + 1 extractelement + 1 insertelement). This partially fixes
inflating the cost of scalarized operations since every operation is
scalarized and free. I don't think we want any cost asociated with
scalarization, but for now insertelement is still counted. I'm not sure
if we should pretend that insertelement is also free, or add a way
to compute a custom scalarization cost.
llvm-svn: 254438
|