| field | value | date |
|---|---|---|
| author | Marek Olsak <marek.olsak@amd.com> | 2018-01-31 20:18:11 +0000 |
| committer | Marek Olsak <marek.olsak@amd.com> | 2018-01-31 20:18:11 +0000 |
| commit | d4bb329d0ea29bf6882b8f3bee9b944c161980a3 | |
| tree | fd66d564231a8a63a11ac159574fcb730a499eb7 /llvm/test/CodeGen/AMDGPU | |
| parent | 13e4741275a1169c2ff50645cd6f4e964a94169f | |
AMDGPU: Fold inline offset for loads properly in moveToVALU on GFX9
Summary:
This enables load merging into x2, x4, which is driven by inline offsets.
6500 shaders are affected:
Code Size in affected shaders: -15.14 %
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D42078
llvm-svn: 323909
Diffstat (limited to 'llvm/test/CodeGen/AMDGPU')
| -rw-r--r-- | llvm/test/CodeGen/AMDGPU/smrd.ll | 18 |
1 files changed, 3 insertions, 15 deletions
diff --git a/llvm/test/CodeGen/AMDGPU/smrd.ll b/llvm/test/CodeGen/AMDGPU/smrd.ll
index 9fd20fd67b8..420c7b80b8d 100644
--- a/llvm/test/CodeGen/AMDGPU/smrd.ll
+++ b/llvm/test/CodeGen/AMDGPU/smrd.ll
@@ -194,11 +194,7 @@ main_body:
 
 ; GCN-LABEL: {{^}}smrd_vgpr_offset_imm:
 ; GCN-NEXT: %bb.
-
-; SICIVI-NEXT: buffer_load_dword v{{[0-9]}}, v0, s[0:3], 0 offen offset:4095 ;
-
-; GFX9-NEXT: v_add_u32_e32 [[ADD:v[0-9]+]], 0xfff, v0
-; GFX9-NEXT: buffer_load_dword v{{[0-9]}}, [[ADD]], s[0:3], 0 offen ;
+; GCN-NEXT: buffer_load_dword v{{[0-9]}}, v0, s[0:3], 0 offen offset:4095 ;
 define amdgpu_ps float @smrd_vgpr_offset_imm(<4 x i32> inreg %desc, i32 %offset) #0 {
 main_body:
   %off = add i32 %offset, 4095
@@ -244,16 +240,8 @@ main_body:
 
 ; GCN-LABEL: {{^}}smrd_vgpr_merged:
 ; GCN-NEXT: %bb.
-
-; SICIVI-NEXT: buffer_load_dwordx4 v[{{[0-9]}}:{{[0-9]}}], v0, s[0:3], 0 offen offset:4
-; SICIVI-NEXT: buffer_load_dwordx2 v[{{[0-9]}}:{{[0-9]}}], v0, s[0:3], 0 offen offset:28
-
-; GFX9: buffer_load_dword
-; GFX9: buffer_load_dword
-; GFX9: buffer_load_dword
-; GFX9: buffer_load_dword
-; GFX9: buffer_load_dword
-; GFX9: buffer_load_dword
+; GCN-NEXT: buffer_load_dwordx4 v[{{[0-9]}}:{{[0-9]}}], v0, s[0:3], 0 offen offset:4
+; GCN-NEXT: buffer_load_dwordx2 v[{{[0-9]}}:{{[0-9]}}], v0, s[0:3], 0 offen offset:28
 define amdgpu_ps void @smrd_vgpr_merged(<4 x i32> inreg %desc, i32 %a) #0 {
 main_body:
   %a1 = add i32 %a, 4
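
The effect the updated CHECK lines describe can be sketched in a few lines of Python. This is a toy model for illustration only, not LLVM's implementation. The names `MAX_IMM_OFFSET`, `fold_offset`, and `merge_dword_loads` are hypothetical. The 4095 limit is assumed from the `offset:4095` / `0xfff` values in the test, and the six byte offsets in the usage line are inferred from the expected `dwordx4 ... offset:4` and `dwordx2 ... offset:28` output replacing six single-dword loads.

```python
# Toy model: fold a constant into the load's inline offset field, then
# merge dword loads that sit at consecutive byte offsets from one base.

MAX_IMM_OFFSET = 4095  # assumption: 12-bit unsigned inline offset field


def fold_offset(const_offset):
    """Return (inline_offset, needs_separate_add) for a constant byte offset."""
    if 0 <= const_offset <= MAX_IMM_OFFSET:
        return const_offset, False  # fits inline, no extra add needed
    return 0, True  # too large: the toy model keeps a separate add


def merge_dword_loads(offsets):
    """Greedily group 4-byte loads into (start_offset, width) with width 4, 2 or 1."""
    remaining = sorted(offsets)
    merged = []
    i = 0
    while i < len(remaining):
        # Count how many loads follow at exactly +4 bytes each.
        run = 1
        while i + run < len(remaining) and remaining[i + run] == remaining[i] + 4 * run:
            run += 1
        width = 4 if run >= 4 else 2 if run >= 2 else 1
        merged.append((remaining[i], width))
        i += width
    return merged


# Offsets inferred from the smrd_vgpr_merged CHECK lines.
print(merge_dword_loads([4, 8, 12, 16, 28, 32]))  # [(4, 4), (28, 2)]
```

In this model the merge step only sees offsets that were folded inline, which mirrors the commit summary's point that merging into x2 and x4 is driven by inline offsets: when the constant stays in a separate add, as in the old GFX9 output, there is nothing for the merge to compare.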

