| author | Changpeng Fang <changpeng.fang@gmail.com> | 2017-05-23 20:25:41 +0000 |
|---|---|---|
| committer | Changpeng Fang <changpeng.fang@gmail.com> | 2017-05-23 20:25:41 +0000 |
| commit | 1dbace195d29b18dee93bb8f5fd906035e9daf83 (patch) | |
| tree | d63f6bc48aa696ee03d4456f2efd6d42aca7fa0b /llvm/test | |
| parent | 3334cc017eecbd1eb95272cd5c523bdc01d1ed52 (diff) | |
AMDGPU/SI: Move the local memory usage related checking after calling convention checking in PromoteAlloca
Summary:
Promoting an alloca to a vector and promoting an alloca to LDS are two independent ways of handling an alloca and should not affect each other.
As a result, we should not give up promoting to vector if there is not enough LDS. This patch factors out the local memory usage
checks and places them after the calling convention check.
Reviewer:
arsenm
Differential Revision:
http://reviews.llvm.org/D33139
llvm-svn: 303684
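
The ordering described in the summary can be sketched as a small decision function. This is a simplified model for illustration, not the code in the pass: `AllocaInfo`, `Outcome`, and `handleAlloca` are hypothetical names, and the assumption that vector promotion is attempted before the calling convention check is inferred from the summary, not taken from the patch itself.

```cpp
// Hypothetical, simplified model of the check ordering described above.
// None of these names come from the LLVM sources.
enum class Outcome { PromotedToVector, PromotedToLDS, NotPromoted };

struct AllocaInfo {
  bool vectorizable;      // the alloca can be rewritten as vector operations
  bool supportedCallConv; // the calling convention permits LDS promotion
  bool enoughLDS;         // the local memory budget can hold the alloca
};

constexpr Outcome handleAlloca(const AllocaInfo &A) {
  // Vector promotion does not consume LDS, so it is tried first
  // (assumed ordering) and is not blocked by the LDS budget.
  if (A.vectorizable)
    return Outcome::PromotedToVector;

  // The calling convention check comes next.
  if (!A.supportedCallConv)
    return Outcome::NotPromoted;

  // The local memory usage check now runs only on the LDS path.
  if (!A.enoughLDS)
    return Outcome::NotPromoted;

  return Outcome::PromotedToLDS;
}
```

In this model, an alloca that is vectorizable is still promoted to a vector when `enoughLDS` is false, which is the behavior the new test `@vector_read_with_local_arg` checks for.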
Diffstat (limited to 'llvm/test')
| -rw-r--r-- | llvm/test/CodeGen/AMDGPU/vector-alloca.ll | 22 |
1 files changed, 22 insertions, 0 deletions
diff --git a/llvm/test/CodeGen/AMDGPU/vector-alloca.ll b/llvm/test/CodeGen/AMDGPU/vector-alloca.ll
index 03cf725601b..a0aac8c1d9b 100644
--- a/llvm/test/CodeGen/AMDGPU/vector-alloca.ll
+++ b/llvm/test/CodeGen/AMDGPU/vector-alloca.ll
@@ -138,3 +138,25 @@ entry:
   store float %tmp2, float addrspace(1)* %out
   ret void
 }
+
+; The pointer arguments in local address space should not affect promotion to vector.
+
+; OPT-LABEL: @vector_read_with_local_arg(
+; OPT: %0 = extractelement <4 x i32> <i32 0, i32 1, i32 2, i32 3>, i32 %index
+; OPT: store i32 %0, i32 addrspace(1)* %out, align 4
+define amdgpu_kernel void @vector_read_with_local_arg(i32 addrspace(3)* %stopper, i32 addrspace(1)* %out, i32 %index) {
+entry:
+  %tmp = alloca [4 x i32]
+  %x = getelementptr [4 x i32], [4 x i32]* %tmp, i32 0, i32 0
+  %y = getelementptr [4 x i32], [4 x i32]* %tmp, i32 0, i32 1
+  %z = getelementptr [4 x i32], [4 x i32]* %tmp, i32 0, i32 2
+  %w = getelementptr [4 x i32], [4 x i32]* %tmp, i32 0, i32 3
+  store i32 0, i32* %x
+  store i32 1, i32* %y
+  store i32 2, i32* %z
+  store i32 3, i32* %w
+  %tmp1 = getelementptr [4 x i32], [4 x i32]* %tmp, i32 0, i32 %index
+  %tmp2 = load i32, i32* %tmp1
+  store i32 %tmp2, i32 addrspace(1)* %out
+  ret void
+}

