| author | Sanjay Patel <spatel@rotateright.com> | 2017-07-31 18:08:24 +0000 |
|---|---|---|
| committer | Sanjay Patel <spatel@rotateright.com> | 2017-07-31 18:08:24 +0000 |
| commit | fea731a4aa6aabf270fbb9ba6401ca8826c55a9b (patch) | |
| tree | 93d0d7e6532ef8f3b297548da5097f879e878312 /llvm/test/CodeGen/PowerPC/memcmpIR.ll | |
| parent | 70d35e102ef8dbba10e2db84ea2dcbe95bbbfd38 (diff) | |
[CGP] use subtract or subtract-of-cmps for result of memcmp expansion
As noted in the code comment, transforming this in the other direction might require
a separate transform here in CGP, given the DAG's block-at-a-time constraint.
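For reference, here is a minimal standalone sketch of the two forms in plain LLVM IR (the function and value names are illustrative, not taken from the patch); each returns the memcmp-style {-1, 0, 1} result for one compared chunk:

```llvm
; Select form: map (ne, ult) onto {-1, 0, 1} with nested selects.
define i32 @cmp_result_selects(i32 %a, i32 %b) {
  %ne   = icmp ne i32 %a, %b
  %lt   = icmp ult i32 %a, %b
  %sel1 = select i1 %lt, i32 -1, i32 1
  %res  = select i1 %ne, i32 %sel1, i32 0
  ret i32 %res
}

; Subtract-of-cmps form: zext(a > b) - zext(a < b) yields the same result.
define i32 @cmp_result_sub(i32 %a, i32 %b) {
  %gt  = icmp ugt i32 %a, %b
  %lt  = icmp ult i32 %a, %b
  %zgt = zext i1 %gt to i32
  %zlt = zext i1 %lt to i32
  %res = sub i32 %zgt, %zlt
  ret i32 %res
}
```

The unsigned predicates match memcmp semantics, which compares bytes as unsigned values.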
Besides that theoretical motivation, there are two practical motivations for the
subtract-of-cmps form:
1. The codegen for both x86 and PPC is better for this IR (though PPC could be better still).
There is discussion about canonicalizing IR to the select form
(http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html),
so we probably need to add DAG transforms for those patterns anyway, but this improves the
memcmp output without waiting for that step.
2. If we allow vector-sized chunks for the load and compare, x86 is better prepared to convert
that to optimal code when using the subtract-of-cmps form, so this choice avoids another
prerequisite patch if we decide to enable vector chunks.
Differential Revision: https://reviews.llvm.org/D34904
llvm-svn: 309597
Diffstat (limited to 'llvm/test/CodeGen/PowerPC/memcmpIR.ll')
| -rw-r--r-- | llvm/test/CodeGen/PowerPC/memcmpIR.ll | 18 |
1 file changed, 10 insertions(+), 8 deletions(-)
```diff
diff --git a/llvm/test/CodeGen/PowerPC/memcmpIR.ll b/llvm/test/CodeGen/PowerPC/memcmpIR.ll
index 55f48ad19a6..9888519d8b6 100644
--- a/llvm/test/CodeGen/PowerPC/memcmpIR.ll
+++ b/llvm/test/CodeGen/PowerPC/memcmpIR.ll
@@ -59,20 +59,22 @@ define signext i32 @test2(i32* nocapture readonly %buffer1, i32* nocapture reado
 ; CHECK-NEXT: [[LOAD2:%[0-9]+]] = load i32, i32*
 ; CHECK-NEXT: [[BSWAP1:%[0-9]+]] = call i32 @llvm.bswap.i32(i32 [[LOAD1]])
 ; CHECK-NEXT: [[BSWAP2:%[0-9]+]] = call i32 @llvm.bswap.i32(i32 [[LOAD2]])
-; CHECK-NEXT: [[CMP1:%[0-9]+]] = icmp ne i32 [[BSWAP1]], [[BSWAP2]]
+; CHECK-NEXT: [[CMP1:%[0-9]+]] = icmp ugt i32 [[BSWAP1]], [[BSWAP2]]
 ; CHECK-NEXT: [[CMP2:%[0-9]+]] = icmp ult i32 [[BSWAP1]], [[BSWAP2]]
-; CHECK-NEXT: [[SELECT1:%[0-9]+]] = select i1 [[CMP2]], i32 -1, i32 1
-; CHECK-NEXT: [[SELECT2:%[0-9]+]] = select i1 [[CMP1]], i32 [[SELECT1]], i32 0
-; CHECK-NEXT: ret i32 [[SELECT2]]
+; CHECK-NEXT: [[Z1:%[0-9]+]] = zext i1 [[CMP1]] to i32
+; CHECK-NEXT: [[Z2:%[0-9]+]] = zext i1 [[CMP2]] to i32
+; CHECK-NEXT: [[SUB:%[0-9]+]] = sub i32 [[Z1]], [[Z2]]
+; CHECK-NEXT: ret i32 [[SUB]]
 
 ; CHECK-BE-LABEL: @test2(
 ; CHECK-BE: [[LOAD1:%[0-9]+]] = load i32, i32*
 ; CHECK-BE-NEXT: [[LOAD2:%[0-9]+]] = load i32, i32*
-; CHECK-BE-NEXT: [[CMP1:%[0-9]+]] = icmp ne i32 [[LOAD1]], [[LOAD2]]
+; CHECK-BE-NEXT: [[CMP1:%[0-9]+]] = icmp ugt i32 [[LOAD1]], [[LOAD2]]
 ; CHECK-BE-NEXT: [[CMP2:%[0-9]+]] = icmp ult i32 [[LOAD1]], [[LOAD2]]
-; CHECK-BE-NEXT: [[SELECT1:%[0-9]+]] = select i1 [[CMP2]], i32 -1, i32 1
-; CHECK-BE-NEXT: [[SELECT2:%[0-9]+]] = select i1 [[CMP1]], i32 [[SELECT1]], i32 0
-; CHECK-BE-NEXT: ret i32 [[SELECT2]]
+; CHECK-BE-NEXT: [[Z1:%[0-9]+]] = zext i1 [[CMP1]] to i32
+; CHECK-BE-NEXT: [[Z2:%[0-9]+]] = zext i1 [[CMP2]] to i32
+; CHECK-BE-NEXT: [[SUB:%[0-9]+]] = sub i32 [[Z1]], [[Z2]]
+; CHECK-BE-NEXT: ret i32 [[SUB]]
 
 entry:
 %0 = bitcast i32* %buffer1 to i8*
```
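In the little-endian run, the loaded words are byte-swapped before the compare so that the word-sized unsigned comparison matches memcmp's byte order; the CHECK-BE run needs no bswap because big-endian loads already compare in the right order. In both runs, the icmp ne plus nested selects are replaced by icmp ugt and the zext/zext/sub sequence described above.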

