summaryrefslogtreecommitdiffstats
path: root/llvm/test
diff options
context:
space:
mode:
authorSanjay Patel <spatel@rotateright.com>2014-10-09 21:26:35 +0000
committerSanjay Patel <spatel@rotateright.com>2014-10-09 21:26:35 +0000
commit3d497cd778261818eb76b5ad97f7f751cfebfcbb (patch)
treebfe0a173e16fda4bc5c4271a572710d6e44f6c38 /llvm/test
parent6d28da10e5d1ff6567871dd5929ec84b762fe208 (diff)
downloadbcm5719-llvm-3d497cd778261818eb76b5ad97f7f751cfebfcbb.tar.gz
bcm5719-llvm-3d497cd778261818eb76b5ad97f7f751cfebfcbb.zip
Improve sqrt estimate algorithm (fast-math)
This patch changes the fast-math implementation for calculating sqrt(x) from: y = 1 / (1 / sqrt(x)) to: y = x * (1 / sqrt(x)) This has 2 benefits: less code / faster code and one less estimate instruction that may lose precision. The only target that will be affected (until http://reviews.llvm.org/D5658 is approved) is PPC. The difference in codegen for PPC is 2 less flops for a single-precision sqrtf or vector sqrtf and 4 less flops for a double-precision sqrt. We also eliminate a constant load and extra register usage. Differential Revision: http://reviews.llvm.org/D5682 llvm-svn: 219445
Diffstat (limited to 'llvm/test')
-rw-r--r--llvm/test/CodeGen/PowerPC/recipest.ll11
1 files changed, 2 insertions, 9 deletions
diff --git a/llvm/test/CodeGen/PowerPC/recipest.ll b/llvm/test/CodeGen/PowerPC/recipest.ll
index de74c043ece..2f6a3eca488 100644
--- a/llvm/test/CodeGen/PowerPC/recipest.ll
+++ b/llvm/test/CodeGen/PowerPC/recipest.ll
@@ -197,11 +197,7 @@ define double @foo3(double %a) nounwind {
; CHECK-NEXT: fmul
; CHECK-NEXT: fmadd
; CHECK-NEXT: fmul
-; CHECK-NEXT: fre
-; CHECK-NEXT: fnmsub
-; CHECK-NEXT: fmadd
-; CHECK-NEXT: fnmsub
-; CHECK-NEXT: fmadd
+; CHECK-NEXT: fmul
; CHECK: blr
; CHECK-SAFE: @foo3
@@ -220,9 +216,7 @@ define float @goo3(float %a) nounwind {
; CHECK: fmuls
; CHECK-NEXT: fmadds
; CHECK-NEXT: fmuls
-; CHECK-NEXT: fres
-; CHECK-NEXT: fnmsubs
-; CHECK-NEXT: fmadds
+; CHECK-NEXT: fmuls
; CHECK: blr
; CHECK-SAFE: @goo3
@@ -236,7 +230,6 @@ define <4 x float> @hoo3(<4 x float> %a) nounwind {
; CHECK: @hoo3
; CHECK: vrsqrtefp
-; CHECK-DAG: vrefp
; CHECK-DAG: vcmpeqfp
; CHECK-SAFE: @hoo3
OpenPOWER on IntegriCloud