summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
diff options
context:
space:
mode:
authorStanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>2017-02-01 22:59:50 +0000
committerStanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>2017-02-01 22:59:50 +0000
commit2b913b1f493107ae7ffb6c11e59094952d595e1e (patch)
treeec857df4a9a75225a11a6e275fef0730ffa311ec /llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
parentc5eb8e29d05c2887a4097be074edee4a1ba6d493 (diff)
downloadbcm5719-llvm-2b913b1f493107ae7ffb6c11e59094952d595e1e.tar.gz
bcm5719-llvm-2b913b1f493107ae7ffb6c11e59094952d595e1e.zip
[AMDGPU] Account workgroup size in LDS occupancy limits
Functions matching LDS use to occupancy return results for a workgroup of 64 workitems. The numbers has to be adjusted for bigger workgroups. For example a workgroup of size 256 already occupies 4 waves just by itself. Given that all numbers of LDS use in the compiler are per workgroup, occupancy shall be multiplied by 4 in this case. Each 64 workitems still limited by the same number, but 4 subrgoups 64 workitems each can afford 4 times more LDS to get the same occupancy. In addition change initializes LDS size in the subtarget to a real value for SI+ targets. This is required since LDS size is a variable in these calculations. Differential Revision: https://reviews.llvm.org/D29423 llvm-svn: 293837
Diffstat (limited to 'llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h')
-rw-r--r--llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h5
1 files changed, 3 insertions, 2 deletions
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h b/llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
index f66ebd6afc2..83eda9bfbb6 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h
@@ -274,11 +274,12 @@ public:
/// Return the amount of LDS that can be used that will not restrict the
/// occupancy lower than WaveCount.
- unsigned getMaxLocalMemSizeWithWaveCount(unsigned WaveCount) const;
+ unsigned getMaxLocalMemSizeWithWaveCount(unsigned WaveCount,
+ const Function &) const;
/// Inverse of getMaxLocalMemWithWaveCount. Return the maximum wavecount if
/// the given LDS memory size is the only constraint.
- unsigned getOccupancyWithLocalMemSize(uint32_t Bytes) const;
+ unsigned getOccupancyWithLocalMemSize(uint32_t Bytes, const Function &) const;
bool hasFP16Denormals() const {
return FP64FP16Denormals;
OpenPOWER on IntegriCloud