| author | Evan Cheng <evan.cheng@apple.com> | 2006-11-10 22:03:35 +0000 |
|---|---|---|
| committer | Evan Cheng <evan.cheng@apple.com> | 2006-11-10 22:03:35 +0000 |
| commit | 5b725a71a9462bfe7706b4f1f495bc413503d41e (patch) | |
| tree | 8c07d11c423891b9413bf4e81859d2e12abaf2d7 /llvm/lib | |
| parent | 49683ba2369fc5489976f8bfc5b18900bd9dc10f (diff) | |
These are done.
llvm-svn: 31649
Diffstat (limited to 'llvm/lib')
| -rw-r--r-- | llvm/lib/Target/X86/README-SSE.txt | 58 |
| -rw-r--r-- | llvm/lib/Target/X86/README.txt | 11 |
2 files changed, 0 insertions, 69 deletions
diff --git a/llvm/lib/Target/X86/README-SSE.txt b/llvm/lib/Target/X86/README-SSE.txt
index 2de68edad00..5c7722da535 100644
--- a/llvm/lib/Target/X86/README-SSE.txt
+++ b/llvm/lib/Target/X86/README-SSE.txt
@@ -4,30 +4,6 @@
 
 //===---------------------------------------------------------------------===//
 
-There are serious issues folding loads into "scalar sse" intrinsics. For
-example, this:
-
-float minss4( float x, float *y ) {
-  return _mm_cvtss_f32(_mm_min_ss(_mm_set_ss(x),_mm_set_ss(*y)));
-}
-
-compiles to:
-
-_minss4:
-        subl $4, %esp
-        movl 12(%esp), %eax
-***     movss 8(%esp), %xmm0
-***     movss (%eax), %xmm1
-***     minss %xmm1, %xmm0
-        movss %xmm0, (%esp)
-        flds (%esp)
-        addl $4, %esp
-        ret
-
-Each operand of the minss is a load. At least one should be folded!
-
-//===---------------------------------------------------------------------===//
-
 Expand libm rounding functions inline: Significant speedups possible.
 http://gcc.gnu.org/ml/gcc-patches/2006-10/msg00909.html
 
@@ -165,17 +141,6 @@ This will be solved when we go to a dynamic programming based isel.
 
 //===---------------------------------------------------------------------===//
 
-Should generate min/max for stuff like:
-
-void minf(float a, float b, float *X) {
-  *X = a <= b ? a : b;
-}
-
-Make use of floating point min / max instructions. Perhaps introduce ISD::FMIN
-and ISD::FMAX node types?
-
-//===---------------------------------------------------------------------===//
-
 Lower memcpy / memset to a series of SSE 128 bit move instructions when it's
 feasible.
 
@@ -225,29 +190,6 @@ Perhaps use pxor / xorp* to clear a XMM register first?
 
 //===---------------------------------------------------------------------===//
 
-Better codegen for:
-
-void f(float a, float b, vector float * out) { *out = (vector float){ a, 0.0, 0.0, b}; }
-void f(float a, float b, vector float * out) { *out = (vector float){ a, b, 0.0, 0}; }
-
-For the later we generate:
-
-_f:
-        pxor %xmm0, %xmm0
-        movss 8(%esp), %xmm1
-        movaps %xmm0, %xmm2
-        unpcklps %xmm1, %xmm2
-        movss 4(%esp), %xmm1
-        unpcklps %xmm0, %xmm1
-        unpcklps %xmm2, %xmm1
-        movl 12(%esp), %eax
-        movaps %xmm1, (%eax)
-        ret
-
-This seems like it should use shufps, one for each of a & b.
-
-//===---------------------------------------------------------------------===//
-
 How to decide when to use the "floating point version" of logical ops? Here are
 some code fragments:
 
diff --git a/llvm/lib/Target/X86/README.txt b/llvm/lib/Target/X86/README.txt
index 956caff0c4e..834a47c51b9 100644
--- a/llvm/lib/Target/X86/README.txt
+++ b/llvm/lib/Target/X86/README.txt
@@ -232,17 +232,6 @@ which is probably slower, but it's interesting at least :)
 
 //===---------------------------------------------------------------------===//
 
-Should generate min/max for stuff like:
-
-void minf(float a, float b, float *X) {
-  *X = a <= b ? a : b;
-}
-
-Make use of floating point min / max instructions. Perhaps introduce ISD::FMIN
-and ISD::FMAX node types?
-
-//===---------------------------------------------------------------------===//
-
 The first BB of this code:
 
 declare bool %foo()
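For readers who want to try the "min/max" entry removed above, here is a minimal standalone sketch of the select idiom it describes. Only `minf` comes from the README text; `maxf` and `check_minmax` are hypothetical helpers added for illustration. The sketch does not model NaN or signed-zero inputs, which affect whether a compiler may map the idiom onto a hardware min/max instruction.

```c
#include <assert.h>

/* Select idiom quoted in the removed README entry. */
static void minf(float a, float b, float *X) {
  *X = a <= b ? a : b;
}

/* Hypothetical counterpart (not in the README): the matching max idiom. */
static void maxf(float a, float b, float *X) {
  *X = a >= b ? a : b;
}

/* Hypothetical self-check: returns 1 if both idioms behave as expected
   on ordinary (non-NaN) inputs, 0 otherwise. */
static int check_minmax(void) {
  float r;
  minf(1.0f, 2.0f, &r);
  if (r != 1.0f) return 0;
  minf(3.0f, -1.0f, &r);
  if (r != -1.0f) return 0;
  maxf(1.0f, 2.0f, &r);
  if (r != 2.0f) return 0;
  maxf(3.0f, -1.0f, &r);
  if (r != 3.0f) return 0;
  return 1;
}
```

Compiling a file like this with optimization and inspecting the assembly is one way to see whether a given compiler emits a min/max instruction for the idiom; the result depends on the compiler version and floating-point flags.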

