authorMatt Arsenault <Matthew.Arsenault@amd.com>2018-07-31 19:05:14 +0000
committerMatt Arsenault <Matthew.Arsenault@amd.com>2018-07-31 19:05:14 +0000
commit9ced1e0d80d432034221c3ff36fdf01a8bcf1aca (patch)
tree9a3b0fce654b4dfc0ea1272e053e0c8debdc3a0a /llvm/test/CodeGen/AMDGPU/bfi_int.ll
parente8c70bc187845a7ee065e133a78eaad159b73ad7 (diff)
downloadbcm5719-llvm-9ced1e0d80d432034221c3ff36fdf01a8bcf1aca.tar.gz
bcm5719-llvm-9ced1e0d80d432034221c3ff36fdf01a8bcf1aca.zip
AMDGPU: Scalarize vector argument types to calls
When lowering calling conventions, prefer to decompose vectors into their constituent register types. This avoids artificial constraints imposed to satisfy a wide super-register.

This improves code quality because optimizations no longer need to deal with the super-register constraint. For example, the immediate folding code doesn't handle 4-component reg_sequences, so by breaking the register down earlier the existing immediate folding code is able to work.

This also avoids the need for the shader input processing code to manually split vector types.

llvm-svn: 338416
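As an illustrative sketch (not taken from this commit's diff), consider a call that passes a `<2 x i32>` argument. With this change the argument is decomposed into two independent 32-bit values during calling-convention lowering, rather than being glued into one super-register, which is what allows per-register optimizations such as immediate folding to apply:

```llvm
; Hypothetical example: @callee and @caller are illustrative names.
declare void @callee(<2 x i32>)

define void @caller(<2 x i32> %v) {
  call void @callee(<2 x i32> %v)
  ret void
}

; Before: the <2 x i32> argument was kept as a single 64-bit
; super-register built with a REG_SEQUENCE, constraining both
; halves together.
; After: lowering decomposes it into two i32 values (roughly
; v0 and v1 on AMDGPU), so each register can be optimized
; independently.
```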
Diffstat (limited to 'llvm/test/CodeGen/AMDGPU/bfi_int.ll')
 llvm/test/CodeGen/AMDGPU/bfi_int.ll | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/llvm/test/CodeGen/AMDGPU/bfi_int.ll b/llvm/test/CodeGen/AMDGPU/bfi_int.ll
index 77c5e53481e..66f8a2b111a 100644
--- a/llvm/test/CodeGen/AMDGPU/bfi_int.ll
+++ b/llvm/test/CodeGen/AMDGPU/bfi_int.ll
@@ -54,8 +54,8 @@ entry:
; FUNC-LABEL: {{^}}v_bitselect_v2i32_pat1:
; GCN: s_waitcnt
-; GCN-NEXT: v_bfi_b32 v1, v3, v1, v5
; GCN-NEXT: v_bfi_b32 v0, v2, v0, v4
+; GCN-NEXT: v_bfi_b32 v1, v3, v1, v5
; GCN-NEXT: s_setpc_b64
define <2 x i32> @v_bitselect_v2i32_pat1(<2 x i32> %a, <2 x i32> %b, <2 x i32> %mask) {
%xor.0 = xor <2 x i32> %a, %mask