path: root/llvm/test/Analysis/CostModel/AMDGPU
Commit message | Author | Age | Files | Lines
* [AMDGPU] Implemented fma cost analysis | Stanislav Mekhanoshin | 2019-12-18 | 1 | -0/+120
  Differential Revision: https://reviews.llvm.org/D71676
* [AMDGPU] Fixed cost model for packed 16 bit ops | Stanislav Mekhanoshin | 2019-12-17 | 7 | -102/+258
  Differential Revision: https://reviews.llvm.org/D71622
* AMDGPU: Split test functions to avoid dependency on subtarget | Matt Arsenault | 2019-11-19 | 1 | -57/+155
  Prepare this test for moving the denormal setting out of the subtarget features.
* [AMDGPU] Fix bug introduced in 47a5c36b37f0 | dfukalov | 2019-11-07 | 1 | -4/+15
  Summary: [AMDGPU] Fix bug introduced in 47a5c36b37f0
  Reviewers: foad, arsenm
  Reviewed By: arsenm
  Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D69915
* [AMDGPU] Improve code size cost model (part 2) | dfukalov | 2019-11-06 | 10 | -1/+32
  Summary: Added estimations for ShuffleVector, some cast and arithmetic instructions
  Reviewers: rampitec
  Reviewed By: rampitec
  Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, zzheng, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D69629
* [AMDGPU] Improve code size cost model | Daniil Fukalov | 2019-10-17 | 3 | -18/+24
  Summary: Added estimation for zero size insertelement, extractelement and llvm.fabs operators. Updated inline/unroll parameters default values.
  Reviewers: rampitec, arsenm
  Reviewed By: arsenm
  Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D68881
  llvm-svn: 375109
* TTI: Improve default costs for addrspacecast | Matt Arsenault | 2019-06-03 | 1 | -6/+27
  For some reason multiple places need to do this, and the variant the loop unroller and inliner use was not handling it.
  Also, introduce a new wrapper to be slightly more precise, since on AMDGPU some addrspacecasts are free, but not no-ops.
  llvm-svn: 362436
* AMDGPU: Partially fix default device for HSA | Matt Arsenault | 2019-03-17 | 1 | -2/+2
  There are a few different issues, mostly stemming from using generation based checks for anything instead of subtarget features. Stop adding flat-address-space as a feature for HSA, as it should only be a device property. This was incorrectly allowing flat instructions to select for SI.
  Increase the default generation for HSA to avoid the encoding error when emitting objects. This has some other side effects from various checks which probably should be separate subtarget features (in the cost model and for dealing with the DS offset folding issue).
  Partial fix for bug 41070. It should probably be an error to try using amdhsa without flat support.
  llvm-svn: 356347
* [AMDGPU] Prepare for introduction of v3 and v5 MVTs | Tim Renouf | 2019-03-17 | 8 | -7/+105
  AMDGPU would like to have MVTs for v3i32, v3f32, v5i32, v5f32. This commit does not add them, but makes preparatory changes:
    * Fixed assumptions of power-of-2 vector type in kernel arg handling, and added v5 kernel arg tests and v3/v5 shader arg tests.
    * Added v5 tests for cost analysis.
    * Added vec3/vec5 arg test cases.
  Some of this patch is from Matt Arsenault, also of AMD.
  Differential Revision: https://reviews.llvm.org/D58928
  Change-Id: I7279d6b4841464d2080eb255ef3c589e268eabcd
  llvm-svn: 356342
* [AMDGPU] Switch to the new addr space mapping by default | Yaxun Liu | 2018-02-02 | 1 | -24/+24
  This requires corresponding clang change.
  Differential Revision: https://reviews.llvm.org/D40955
  llvm-svn: 324101
* AMDGPU: Don't assert in TTI with fp32 denorms enabled | Matt Arsenault | 2017-08-31 | 1 | -11/+78
  Also refine for f16 and rcp cases.
  llvm-svn: 312213
* AMDGPU: Make some packed shuffles free | Matt Arsenault | 2017-05-10 | 3 | -41/+119
  VOP3P instructions can encode access to either half of the register.
  llvm-svn: 302730
* AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel | Matt Arsenault | 2017-03-21 | 12 | -99/+99
  Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel.
  Converted with
    perl -pi -e 's/define void/define amdgpu_kernel void/'
  on the relevant test directories (and undoing in one place that actually wanted a non-kernel).
  llvm-svn: 298444
* AMDGPU: Cost model for basic integer operations | Matt Arsenault | 2016-03-25 | 4 | -0/+343
  This resolves bug 21148 by preventing promotion to i64 induction variables.
  llvm-svn: 264376
* AMDGPU: Partially implement getArithmeticInstrCost for FP ops | Matt Arsenault | 2016-03-25 | 4 | -0/+358
  llvm-svn: 264374
* TTI: Report 0 cost for free addrspacecasts | Matt Arsenault | 2016-03-25 | 1 | -0/+45
  llvm-svn: 264369
* TTI: Use 0 for cost of fabs if free | Matt Arsenault | 2016-03-25 | 1 | -0/+97
  Ideally this would also happen for fneg, but that isn't a distinct operation in the IR.
  llvm-svn: 264368
* AMDGPU: TTI: Make insertelement free. | Matt Arsenault | 2016-03-25 | 1 | -0/+37
  We don't want to have a cost to scalarizing operations.
  llvm-svn: 264364
* AMDGPU: Override getCFInstrCost | Matt Arsenault | 2015-12-16 | 1 | -0/+45
  The default cost was 0 with the assumption that it is predictable.
  llvm-svn: 255796
* AMDGPU: Report extractelement as free in cost model | Matt Arsenault | 2015-12-01 | 2 | -0/+112
  The cost for scalarized operations is computed as N * (scalar operation cost + 1 extractelement + 1 insertelement). This partially fixes inflating the cost of scalarized operations since every operation is scalarized and free. I don't think we want any cost associated with scalarization, but for now insertelement is still counted. I'm not sure if we should pretend that insertelement is also free, or add a way to compute a custom scalarization cost.
  llvm-svn: 254438
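The scalarization formula quoted in the entry above can be made concrete with a small arithmetic sketch. This is only an illustration of N * (scalar operation cost + 1 extractelement + 1 insertelement). The function name, its parameters, and the example costs are hypothetical and are not taken from LLVM's TargetTransformInfo code.

```python
def scalarized_cost(num_elts, scalar_op_cost, extract_cost=1, insert_cost=1):
    """Cost of a vector op that is split into num_elts scalar ops.

    Hypothetical helper: mirrors the formula from the commit message,
    N * (scalar op cost + extractelement cost + insertelement cost).
    """
    return num_elts * (scalar_op_cost + extract_cost + insert_cost)


# Assumed example: a 4-element vector op whose scalar form costs 1.
generic = scalarized_cost(4, 1)                        # both element ops cost 1
extract_free = scalarized_cost(4, 1, extract_cost=0)   # extractelement reported as free
both_free = scalarized_cost(4, 1, extract_cost=0, insert_cost=0)

print(generic, extract_free, both_free)
```

With extractelement reported as free, the per-element overhead drops from 2 to 1, which is why the message calls this a partial fix: the insertelement term is still counted.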