| author | Matt Arsenault <Matthew.Arsenault@amd.com> | 2018-07-31 19:05:14 +0000 |
|---|---|---|
| committer | Matt Arsenault <Matthew.Arsenault@amd.com> | 2018-07-31 19:05:14 +0000 |
| commit | 9ced1e0d80d432034221c3ff36fdf01a8bcf1aca (patch) | |
| tree | 9a3b0fce654b4dfc0ea1272e053e0c8debdc3a0a /llvm/test/CodeGen/AMDGPU/bfi_int.ll | |
| parent | e8c70bc187845a7ee065e133a78eaad159b73ad7 (diff) | |
| download | bcm5719-llvm-9ced1e0d80d432034221c3ff36fdf01a8bcf1aca.tar.gz bcm5719-llvm-9ced1e0d80d432034221c3ff36fdf01a8bcf1aca.zip | |
AMDGPU: Scalarize vector argument types to calls
When lowering calling conventions, prefer to decompose vectors
into their constituent register types. This avoids artificial constraints
imposed to satisfy a wide super-register.
This improves code quality because optimizations no longer need to
deal with the super-register constraint. For example, the immediate
folding code does not handle 4-component reg_sequences, so by
breaking the register down earlier, the existing immediate folding
code is able to work.
This also avoids the need for the shader input processing code
to manually split vector types.
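The argument lowering described above can be sketched with a toy model (the function and register names here are hypothetical illustrations, not the actual AMDGPU calling-convention code):

```python
def lower_vector_arg(num_elems, first_reg=0):
    """Model of scalarizing a vector call argument.

    Instead of binding a <num_elems x i32> vector to one wide
    super-register, assign each element its own 32-bit VGPR, so
    later passes see independent registers.
    """
    return [f"v{first_reg + i}" for i in range(num_elems)]

# A <2 x i32> argument occupies two independent VGPRs:
print(lower_vector_arg(2))          # ['v0', 'v1']
# A <4 x i32> argument starting at v2:
print(lower_vector_arg(4, first_reg=2))
```

Because each element lives in its own register, a pass such as immediate folding can rewrite one lane without reasoning about a 4-component reg_sequence.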
llvm-svn: 338416
Diffstat (limited to 'llvm/test/CodeGen/AMDGPU/bfi_int.ll')
| -rw-r--r-- | llvm/test/CodeGen/AMDGPU/bfi_int.ll | 2 |
1 file changed, 1 insertion(+), 1 deletion(-)
```diff
diff --git a/llvm/test/CodeGen/AMDGPU/bfi_int.ll b/llvm/test/CodeGen/AMDGPU/bfi_int.ll
index 77c5e53481e..66f8a2b111a 100644
--- a/llvm/test/CodeGen/AMDGPU/bfi_int.ll
+++ b/llvm/test/CodeGen/AMDGPU/bfi_int.ll
@@ -54,8 +54,8 @@ entry:
 ; FUNC-LABEL: {{^}}v_bitselect_v2i32_pat1:
 ; GCN: s_waitcnt
-; GCN-NEXT: v_bfi_b32 v1, v3, v1, v5
 ; GCN-NEXT: v_bfi_b32 v0, v2, v0, v4
+; GCN-NEXT: v_bfi_b32 v1, v3, v1, v5
 ; GCN-NEXT: s_setpc_b64
 define <2 x i32> @v_bitselect_v2i32_pat1(<2 x i32> %a, <2 x i32> %b, <2 x i32> %mask) {
 %xor.0 = xor <2 x i32> %a, %mask
```
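For context on the test change: `v_bfi_b32 d, s0, s1, s2` computes the bitfield insert `(s1 & s0) | (s2 & ~s0)`, i.e. a per-bit select steered by `s0`. The updated checks only swap the order of the two per-lane BFIs; the `<2 x i32>` result is unchanged. A small model of these semantics (a sketch, not the compiler's pattern-matching code):

```python
MASK32 = 0xFFFFFFFF  # keep results in the 32-bit lane width

def v_bfi_b32(sel, a, b):
    # Bitfield insert: take bits of a where sel is 1, bits of b where 0.
    return ((a & sel) | (b & ~sel)) & MASK32

def bitselect_v2i32(a, b, mask):
    # One v_bfi_b32 per lane; after this commit the checks expect
    # lane 0 (v0) before lane 1 (v1), but lanes are independent.
    return [v_bfi_b32(mask[i], a[i], b[i]) for i in range(2)]

print(hex(v_bfi_b32(0xFF00FF00, 0x12345678, 0xABCDEF01)))  # 0x12cd5601
```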

