| author | Marek Olsak <marek.olsak@amd.com> | 2016-08-05 21:23:29 +0000 |
|---|---|---|
| committer | Marek Olsak <marek.olsak@amd.com> | 2016-08-05 21:23:29 +0000 |
| commit | 355a8642b470a4c1910a8e7c65c546e28622694d (patch) | |
| tree | 7c8fd2ed0531a1593a3966d1749079fd56587e23 /llvm/test | |
| parent | 71d74d4b256709ad9ad3ae6927486e46dfc280c3 (diff) | |
AMDGPU/SI: Increase SGPR limit to 96 on Tonga/Iceland
Summary:
This is the setting of the Vulkan closed source driver.
It decreases the max wave count from 10 to 8.
26010 shaders in 14650 tests
Totals:
VGPRS: 829593 -> 808440 (-2.55 %)
Spilled SGPRs: 81878 -> 42226 (-48.43 %)
Spilled VGPRs: 367 -> 358 (-2.45 %)
Scratch VGPRs: 1764 -> 1748 (-0.91 %) dwords per thread
Code Size: 36677864 -> 35923932 (-2.06 %) bytes
There is a massive decrease in SGPR spilling in general and -7.4% spilled
VGPRs for DiRT Showdown (= SGPRs spilled to scratch?)
Reviewers: arsenm, tstellarAMD, nhaehnle
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: https://reviews.llvm.org/D23034
llvm-svn: 277867
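
The drop from 10 to 8 waves noted in the summary follows from the SGPR file being shared among the waves resident on a SIMD. A minimal sketch of that arithmetic, assuming a budget of 800 SGPRs per SIMD on Tonga/Iceland, a hardware cap of 10 waves, and a previous per-wave limit of 80 SGPRs (none of these figures is stated in the commit message; `max_waves` is a hypothetical helper, not an LLVM function):

```python
# Assumptions for illustration only, not taken from the commit message.
SGPR_FILE_PER_SIMD = 800  # assumed SGPRs available per SIMD on Tonga/Iceland
HW_MAX_WAVES = 10         # assumed hardware cap on waves per SIMD


def max_waves(sgprs_per_wave: int) -> int:
    """Waves that fit on one SIMD when each wave may use sgprs_per_wave SGPRs."""
    return min(HW_MAX_WAVES, SGPR_FILE_PER_SIMD // sgprs_per_wave)


print(max_waves(80))  # assumed previous limit
print(max_waves(96))  # limit introduced by this commit
```

Under these assumptions the two calls give 10 and 8 waves, which matches the trade-off the summary describes: lower occupancy in exchange for far less SGPR spilling.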
Diffstat (limited to 'llvm/test')
| -rw-r--r-- | llvm/test/CodeGen/AMDGPU/elf.ll | 2 |
| -rw-r--r-- | llvm/test/CodeGen/AMDGPU/load-constant-i32.ll | 66 |
2 files changed, 34 insertions, 34 deletions
diff --git a/llvm/test/CodeGen/AMDGPU/elf.ll b/llvm/test/CodeGen/AMDGPU/elf.ll
index c62e57c6eaa..e527b8511fd 100644
--- a/llvm/test/CodeGen/AMDGPU/elf.ll
+++ b/llvm/test/CodeGen/AMDGPU/elf.ll
@@ -21,7 +21,7 @@
 ; CONFIG: .section .AMDGPU.config
 ; CONFIG-NEXT: .long 45096
 ; TYPICAL-NEXT: .long 0
-; TONGA-NEXT: .long 576
+; TONGA-NEXT: .long 704
 ; CONFIG: .p2align 8
 ; CONFIG: test:
 define amdgpu_ps void @test(i32 %p) {
diff --git a/llvm/test/CodeGen/AMDGPU/load-constant-i32.ll b/llvm/test/CodeGen/AMDGPU/load-constant-i32.ll
index 40c29be6054..56b9e3a187c 100644
--- a/llvm/test/CodeGen/AMDGPU/load-constant-i32.ll
+++ b/llvm/test/CodeGen/AMDGPU/load-constant-i32.ll
@@ -277,47 +277,47 @@ define void @constant_zextload_v16i32_to_v16i64(<16 x i64> addrspace(1)* %out, <

 ; FUNC-LABEL: {{^}}constant_sextload_v32i32_to_v32i64:
 ; GCN: s_load_dwordx16
-; GCN: s_load_dwordx16
+; GCN-DAG: s_load_dwordx16

-; GCN-NOHSA: buffer_store_dwordx4
-; GCN-NOHSA: buffer_store_dwordx4
-; GCN-NOHSA: buffer_store_dwordx4
-; GCN-NOHSA: buffer_store_dwordx4
+; GCN-NOHSA-DAG: buffer_store_dwordx4
+; GCN-NOHSA-DAG: buffer_store_dwordx4
+; GCN-NOHSA-DAG: buffer_store_dwordx4
+; GCN-NOHSA-DAG: buffer_store_dwordx4

-; GCN-NOHSA: buffer_store_dwordx4
-; GCN-NOHSA: buffer_store_dwordx4
-; GCN-NOHSA: buffer_store_dwordx4
-; GCN-NOHSA: buffer_store_dwordx4
+; GCN-NOHSA-DAG: buffer_store_dwordx4
+; GCN-NOHSA-DAG: buffer_store_dwordx4
+; GCN-NOHSA-DAG: buffer_store_dwordx4
+; GCN-NOHSA-DAG: buffer_store_dwordx4

-; GCN-NOHSA: buffer_store_dwordx4
-; GCN-NOHSA: buffer_store_dwordx4
-; GCN-NOHSA: buffer_store_dwordx4
-; GCN-NOHSA: buffer_store_dwordx4
+; GCN-NOHSA-DAG: buffer_store_dwordx4
+; GCN-NOHSA-DAG: buffer_store_dwordx4
+; GCN-NOHSA-DAG: buffer_store_dwordx4
+; GCN-NOHSA-DAG: buffer_store_dwordx4

-; GCN-NOHSA: buffer_store_dwordx4
-; GCN-NOHSA: buffer_store_dwordx4
-; GCN-NOHSA: buffer_store_dwordx4
-; GCN-NOHSA: buffer_store_dwordx4
+; GCN-NOHSA-DAG: buffer_store_dwordx4
+; GCN-NOHSA-DAG: buffer_store_dwordx4
+; GCN-NOHSA-DAG: buffer_store_dwordx4
+; GCN-NOHSA-DAG: buffer_store_dwordx4

-; GCN-HSA: flat_store_dwordx4
-; GCN-HSA: flat_store_dwordx4
-; GCN-HSA: flat_store_dwordx4
-; GCN-HSA: flat_store_dwordx4
+; GCN-HSA-DAG: flat_store_dwordx4
+; GCN-HSA-DAG: flat_store_dwordx4
+; GCN-HSA-DAG: flat_store_dwordx4
+; GCN-HSA-DAG: flat_store_dwordx4

-; GCN-HSA: flat_store_dwordx4
-; GCN-HSA: flat_store_dwordx4
-; GCN-HSA: flat_store_dwordx4
-; GCN-HSA: flat_store_dwordx4
+; GCN-HSA-DAG: flat_store_dwordx4
+; GCN-HSA-DAG: flat_store_dwordx4
+; GCN-HSA-DAG: flat_store_dwordx4
+; GCN-HSA-DAG: flat_store_dwordx4

-; GCN-HSA: flat_store_dwordx4
-; GCN-HSA: flat_store_dwordx4
-; GCN-HSA: flat_store_dwordx4
-; GCN-HSA: flat_store_dwordx4
+; GCN-HSA-DAG: flat_store_dwordx4
+; GCN-HSA-DAG: flat_store_dwordx4
+; GCN-HSA-DAG: flat_store_dwordx4
+; GCN-HSA-DAG: flat_store_dwordx4

-; GCN-HSA: flat_store_dwordx4
-; GCN-HSA: flat_store_dwordx4
-; GCN-HSA: flat_store_dwordx4
-; GCN-HSA: flat_store_dwordx4
+; GCN-HSA-DAG: flat_store_dwordx4
+; GCN-HSA-DAG: flat_store_dwordx4
+; GCN-HSA-DAG: flat_store_dwordx4
+; GCN-HSA-DAG: flat_store_dwordx4

 define void @constant_sextload_v32i32_to_v32i64(<32 x i64> addrspace(1)* %out, <32 x i32> addrspace(2)* %in) #0 {
 %ld = load <32 x i32>, <32 x i32> addrspace(2)* %in
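
The `elf.ll` change from `.long 576` to `.long 704` can be read as the new SGPR limit showing up in the emitted config value. A rough sketch of that decoding, assuming the value following `.long 45096` (0xB028) is a program resource word whose bits 6..9 hold the SGPR count as (blocks - 1) with 8 SGPRs per block; this field layout is an assumption made for illustration, and `sgprs_from_config` is a hypothetical helper:

```python
SGPR_FIELD_SHIFT = 6    # assumed bit position of the SGPR block field
SGPR_FIELD_MASK = 0xF   # assumed 4-bit field width
SGPRS_PER_BLOCK = 8     # assumed allocation granule encoded by the field


def sgprs_from_config(value: int) -> int:
    """Decode an SGPR count from a config word under the assumed layout."""
    blocks_minus_one = (value >> SGPR_FIELD_SHIFT) & SGPR_FIELD_MASK
    return (blocks_minus_one + 1) * SGPRS_PER_BLOCK


for word in (576, 704):
    print(word, sgprs_from_config(word))
```

Under this layout 576 decodes to 80 SGPRs and 704 to 96, consistent with the limit named in the commit title.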

