| author | Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> | 2017-02-01 22:59:50 +0000 |
|---|---|---|
| committer | Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com> | 2017-02-01 22:59:50 +0000 |
| commit | 2b913b1f493107ae7ffb6c11e59094952d595e1e (patch) | |
| tree | ec857df4a9a75225a11a6e275fef0730ffa311ec /llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h | |
| parent | c5eb8e29d05c2887a4097be074edee4a1ba6d493 (diff) | |
[AMDGPU] Account workgroup size in LDS occupancy limits
Functions that map LDS use to occupancy return results for a workgroup
of 64 workitems. The numbers have to be adjusted for bigger workgroups.
For example, a workgroup of size 256 already occupies 4 waves just by
itself. Given that all numbers of LDS use in the compiler are per
workgroup, occupancy shall be multiplied by 4 in this case. Each 64
workitems are still limited by the same number, but 4 subgroups of 64
workitems each can afford 4 times more LDS to get the same occupancy.
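
The scaling described above can be sketched in a few lines of standalone C++. This is only an illustration of the arithmetic, not the code from the patch: the names `wavesPerWorkGroup`, `occupancyFor64` and `occupancyForWorkGroup`, the cap of 10 waves, and the 64 KiB LDS size used in the checks are all assumptions made for the example.

```cpp
#include <cstdint>

// Illustrative assumptions only; none of these values come from the patch.
constexpr unsigned kWavefrontSize = 64; // workitems per wave
constexpr unsigned kMaxWaves = 10;      // assumed upper bound on occupancy

// Number of waves needed to hold one workgroup, rounded up.
constexpr unsigned wavesPerWorkGroup(unsigned WorkGroupSize) {
  return (WorkGroupSize + kWavefrontSize - 1) / kWavefrontSize;
}

// Occupancy as computed for a 64-workitem workgroup: how many workgroups'
// worth of LDS fit into the available LDS, capped at kMaxWaves.
// Returns 0 if a single workgroup's LDS use does not fit at all.
constexpr unsigned occupancyFor64(uint32_t LdsBytes, uint32_t LdsSize) {
  return (LdsBytes == 0 || LdsSize / LdsBytes > kMaxWaves)
             ? kMaxWaves
             : LdsSize / LdsBytes;
}

// Adjusted occupancy: LDS use is per workgroup, so each workgroup that fits
// contributes wavesPerWorkGroup() waves. The result is capped again.
constexpr unsigned occupancyForWorkGroup(uint32_t LdsBytes, uint32_t LdsSize,
                                         unsigned WorkGroupSize) {
  return occupancyFor64(LdsBytes, LdsSize) * wavesPerWorkGroup(WorkGroupSize) >
                 kMaxWaves
             ? kMaxWaves
             : occupancyFor64(LdsBytes, LdsSize) *
                   wavesPerWorkGroup(WorkGroupSize);
}

// With an assumed 64 KiB of LDS and 32 KiB used per workgroup, two workgroups
// fit: 2 waves for 64 workitems, 4 times as many for 256 workitems.
static_assert(occupancyForWorkGroup(32768, 65536, 64) == 2, "64 workitems");
static_assert(occupancyForWorkGroup(32768, 65536, 256) == 8, "256 workitems");
```

In this sketch the workgroup size is passed in directly; in the patch it presumably comes from the `Function` argument added to both subtarget queries.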
In addition, this change initializes the LDS size in the subtarget to a
real value for SI+ targets. This is required since LDS size is a variable
in these calculations.
Differential Revision: https://reviews.llvm.org/D29423
llvm-svn: 293837
Diffstat (limited to 'llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h')
| -rw-r--r-- | llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h | 5 |
1 files changed, 3 insertions, 2 deletions
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h b/llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
index f66ebd6afc2..83eda9bfbb6 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
@@ -274,11 +274,12 @@ public:
 
   /// Return the amount of LDS that can be used that will not restrict the
   /// occupancy lower than WaveCount.
-  unsigned getMaxLocalMemSizeWithWaveCount(unsigned WaveCount) const;
+  unsigned getMaxLocalMemSizeWithWaveCount(unsigned WaveCount,
+                                           const Function &) const;
 
   /// Inverse of getMaxLocalMemWithWaveCount. Return the maximum wavecount if
   /// the given LDS memory size is the only constraint.
-  unsigned getOccupancyWithLocalMemSize(uint32_t Bytes) const;
+  unsigned getOccupancyWithLocalMemSize(uint32_t Bytes, const Function &) const;
 
   bool hasFP16Denormals() const {
     return FP64FP16Denormals;

