author | Sam Parker <sam.parker@arm.com> | 2019-09-30 08:03:23 +0000 |
---|---|---|
committer | Sam Parker <sam.parker@arm.com> | 2019-09-30 08:03:23 +0000 |
commit | aac03ae06a8a2eaac40c5cbb91b57c5a29e7fdad (patch) | |
tree | 150a7b47df31a237dbbd3607f07f11cbf71f5e2b /llvm/test/CodeGen/Thumb2/LowOverheadLoops/nested.ll | |
parent | 5a2a14db0bc4cfd4f3c8f2fbac7ca9bc93a23699 (diff) | |
[ARM][MVE] Change VCTP operand
The VCTP instruction calculates the predicate mask based upon the
number of elements that still need to be processed. I had inserted the
sub before the vctp intrinsic and supplied its result as the operand,
but this is incorrect: the phi should feed the vctp directly, because
the sub calculates the value for the next iteration.
Differential Revision: https://reviews.llvm.org/D67921
llvm-svn: 373188
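The operand choice described in the commit message can be illustrated with a small Python model. This is only a sketch and not part of the commit: `vctp32` and `masks` are hypothetical helpers, the model assumes 4 lanes as in the `<4 x i1>` test, and it treats the count as a plain signed integer, so hardware handling of out-of-range values is not modelled.

```python
VF = 4  # lanes per vector, matching <4 x i32> in the test (assumption of this model)


def vctp32(n):
    """Hypothetical model of the predicate: lane i is active when i < n."""
    return [lane < n for lane in range(VF)]


def masks(total, feed_phi):
    """Return the per-iteration predicate masks for `total` elements.

    feed_phi=True  : vctp takes the phi value (elements still to process).
    feed_phi=False : vctp takes the sub result, i.e. the count that belongs
                     to the *next* iteration (the behaviour before this fix).
    """
    out = []
    iv = total                           # phi: starts at N
    trip_count = (total + VF - 1) // VF  # ceil(N / VF) iterations
    for _ in range(trip_count):
        rem = iv - VF                    # sub i32 %iv, 4
        out.append(vctp32(iv if feed_phi else rem))
        iv = rem                         # back-edge value of the phi
    return out


def active_lanes(all_masks):
    """Count how many lanes were enabled across all iterations."""
    return sum(sum(m) for m in all_masks)
```

For `total = 6`, `masks(6, True)` enables all four lanes in the first iteration and two in the second, covering all 6 elements. `masks(6, False)` enables only two lanes in the first iteration and none in the second, so in this model 4 of the 6 elements are never processed.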
Diffstat (limited to 'llvm/test/CodeGen/Thumb2/LowOverheadLoops/nested.ll')
-rw-r--r-- | llvm/test/CodeGen/Thumb2/LowOverheadLoops/nested.ll | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/llvm/test/CodeGen/Thumb2/LowOverheadLoops/nested.ll b/llvm/test/CodeGen/Thumb2/LowOverheadLoops/nested.ll
index a5e922858f2..6d5516249e2 100644
--- a/llvm/test/CodeGen/Thumb2/LowOverheadLoops/nested.ll
+++ b/llvm/test/CodeGen/Thumb2/LowOverheadLoops/nested.ll
@@ -69,8 +69,8 @@ for.cond.cleanup: ; preds = %middle.block, %entr
 ; CHECK: phi
 ; CHECK: phi
 ; CHECK: [[IV:%[^ ]+]] = phi i32 [ %N, %for.cond1.preheader.us ], [ [[REM:%[^ ]+]], %vector.body ]
+; CHECK: [[VCTP:%[^ ]+]] = call <4 x i1> @llvm.arm.vctp32(i32 [[IV]])
 ; CHECK: [[REM]] = sub i32 [[IV]], 4
-; CHECK: [[VCTP:%[^ ]+]] = call <4 x i1> @llvm.arm.vctp32(i32 [[REM]])
 ; CHECK: call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* {{.*}}, i32 4, <4 x i1> [[VCTP]], <4 x i32> undef)
 ; CHECK: call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* {{.*}}, i32 4, <4 x i1> [[VCTP]], <4 x i32> undef)
 define void @mat_vec_i32(i32** nocapture readonly %A, i32* nocapture readonly %B, i32* noalias nocapture %C, i32 %N) {