add a note, this is why we're faster at SciMark-MonteCarlo with SSE disabled.

llvm-svn: 63751
author: Chris Lattner <sabre@nondot.org> 2009-02-04 19:08:01 +0000
committer: Chris Lattner <sabre@nondot.org> 2009-02-04 19:08:01 +0000
commit: 553fd7e1ebababeb1c22526580c406eaef213d98 (patch)
tree: 54575731253ce88b9634b1e3aa53402ee87b39e3 /llvm/lib/Target
parent: ded2d7b0211a96f70392e829b399546453dff40e (diff)
1 files changed, 40 insertions, 0 deletions
diff --git a/llvm/lib/Target/X86/README-SSE.txt b/llvm/lib/Target/X86/README-SSE.txt
index bc51b534824..67cad42a354 100644
--- a/llvm/lib/Target/X86/README-SSE.txt
+++ b/llvm/lib/Target/X86/README-SSE.txt
@@ -912,3 +912,43 @@ since we know the stack slot is already zext'd.
 Consider using movlps instead of movsd to implement (scalar_to_vector (loadf64))
 when code size is critical. movlps is slower than movsd on core2 but it's one
 byte shorter.
+
+//===---------------------------------------------------------------------===//
+
+We should use a dynamic-programming-based approach to tell when using FPStack
+operations is cheaper than SSE.  For example, SciMark MonteCarlo contains code
+like this:
+
+double MonteCarlo_num_flops(int Num_samples) {
+    return ((double) Num_samples)* 4.0;
+}
+
+In fpstack mode, this compiles into:
+
+LCPI1_0:					
+	.long	1082130432	## float 4.000000e+00
+_MonteCarlo_num_flops:
+	subl	$4, %esp
+	movl	8(%esp), %eax
+	movl	%eax, (%esp)
+	fildl	(%esp)
+	fmuls	LCPI1_0
+	addl	$4, %esp
+	ret
+
+In SSE mode, it compiles into significantly slower code:
+
+_MonteCarlo_num_flops:
+	subl	$12, %esp
+	cvtsi2sd	16(%esp), %xmm0
+	mulsd	LCPI1_0, %xmm0
+	movsd	%xmm0, (%esp)
+	fldl	(%esp)
+	addl	$12, %esp
+	ret
+
+There are also other cases in scimark where using fpstack is better.  For
+example, it is cheaper to do fld1 than to load from a constant pool, so
+"load, add 1.0, store" is better done in the fp stack, etc.
+
+//===---------------------------------------------------------------------===//
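
The note above does not say what the dynamic programming would look like. As a rough illustration only, here is a minimal sketch that assumes a straight-line chain of FP operations, two register domains, and made-up costs. The names assign_domains, MAX_OPS, cost and cross are hypothetical and are not part of LLVM; a real implementation would work on a DAG and use measured per-target costs.

```c
#include <assert.h>

/* Hypothetical register domains an FP operation can be assigned to. */
enum { X87 = 0, SSE = 1, NDOMAINS = 2 };

#define MAX_OPS 16

/* Viterbi-style DP over a linear chain of n operations.
 *   cost[i][d] : assumed cost of executing operation i in domain d
 *   cross      : assumed cost of moving a value between domains
 *                (on x86-32, a store plus a reload through the stack)
 *   choice[i]  : out-parameter, the domain picked for operation i
 * Returns the minimum total cost of the chain. */
static int assign_domains(int n, const int cost[][NDOMAINS], int cross,
                          int choice[])
{
    int best[MAX_OPS][NDOMAINS]; /* best[i][d]: cheapest way to end op i in d */
    int from[MAX_OPS][NDOMAINS]; /* domain of op i-1 on that cheapest path   */

    assert(n > 0 && n <= MAX_OPS);

    for (int d = 0; d < NDOMAINS; d++) {
        best[0][d] = cost[0][d];
        from[0][d] = d;
    }

    for (int i = 1; i < n; i++) {
        for (int d = 0; d < NDOMAINS; d++) {
            int stay = best[i - 1][d];             /* previous op in same domain  */
            int move = best[i - 1][1 - d] + cross; /* previous op in other domain */
            if (stay <= move) {
                best[i][d] = stay + cost[i][d];
                from[i][d] = d;
            } else {
                best[i][d] = move + cost[i][d];
                from[i][d] = 1 - d;
            }
        }
    }

    /* Pick the cheaper final domain, then walk the back-pointers. */
    int d = best[n - 1][X87] <= best[n - 1][SSE] ? X87 : SSE;
    int total = best[n - 1][d];
    for (int i = n - 1; i >= 0; i--) {
        choice[i] = d;
        d = from[i][d];
    }
    return total;
}
```

To model MonteCarlo_num_flops as three steps (int-to-double convert, multiply, return in ST(0)), one could pass illustrative costs such as {2, 2}, {1, 1}, {0, 1000} with cross = 3, where the large SSE cost on the last step stands for "the 32-bit ABI returns doubles on the x87 stack". With those invented numbers the function returns 3 and assigns every step to X87, matching the observation that the fpstack sequence avoids the movsd/fldl round trip.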