| author | Chandler Carruth <chandlerc@gmail.com> | 2011-01-09 22:36:18 +0000 |
|---|---|---|
| committer | Chandler Carruth <chandlerc@gmail.com> | 2011-01-09 22:36:18 +0000 |
| commit | d011d5317ce2cf9a0b1eaeddb3eae512a2eabfc3 (patch) | |
| tree | e34e51c828c4dfe20e3a1dda4fe75a3ad9fe06c9 /llvm/lib/Target | |
| parent | 96df71ad90b2e59943d0c34acf0cd661dbce050b (diff) | |
Add a note about the inability to model FP -> int conversions which
perform rounding other than truncation in the IR. Common C code for this
really turns into an LLVM intrinsic call that blocks a lot of further
optimizations.
llvm-svn: 123135
Diffstat (limited to 'llvm/lib/Target')
| -rw-r--r-- | llvm/lib/Target/README.txt | 55 |
1 files changed, 55 insertions, 0 deletions
diff --git a/llvm/lib/Target/README.txt b/llvm/lib/Target/README.txt
index a9afffd95ae..bea240c0921 100644
--- a/llvm/lib/Target/README.txt
+++ b/llvm/lib/Target/README.txt
@@ -2273,3 +2273,58 @@ this whenever the floating point type has enough exponent bits to represent
 the largest integer value as < inf.
 
 //===---------------------------------------------------------------------===//
+
+clang -O3 currently compiles this code:
+
+#include <emmintrin.h>
+int f(double x) { return _mm_cvtsd_si32(_mm_set_sd(x)); }
+int g(double x) { return _mm_cvttsd_si32(_mm_set_sd(x)); }
+
+into
+
+define i32 @_Z1fd(double %x) nounwind readnone {
+entry:
+  %vecinit.i = insertelement <2 x double> undef, double %x, i32 0
+  %vecinit1.i = insertelement <2 x double> %vecinit.i, double 0.000000e+00, i32 1
+  %0 = tail call i32 @llvm.x86.sse2.cvtsd2si(<2 x double> %vecinit1.i) nounwind
+  ret i32 %0
+}
+
+define i32 @_Z1gd(double %x) nounwind readnone {
+entry:
+  %conv.i = fptosi double %x to i32
+  ret i32 %conv.i
+}
+
+This difference carries over to the assembly produced, resulting in:
+
+_Z1fd:                                  # @_Z1fd
+# BB#0:                                 # %entry
+	pushq	%rbp
+	movq	%rsp, %rbp
+	xorps	%xmm1, %xmm1
+	movsd	%xmm0, %xmm1
+	cvtsd2sil	%xmm1, %eax
+	popq	%rbp
+	ret
+
+_Z1gd:                                  # @_Z1gd
+# BB#0:                                 # %entry
+	pushq	%rbp
+	movq	%rsp, %rbp
+	cvttsd2si	%xmm0, %eax
+	popq	%rbp
+	ret
+
+The problem is that we can't see through the intrinsic call used for cvtsd2si,
+and fold away the unnecessary manipulation of the function parameter. When
+these functions are inlined, it forms a barrier preventing many further
+optimizations. LLVM IR doesn't have a good way to model the logic of
+'cvtsd2si', its only FP -> int conversion path forces truncation. We should add
+a rounding flag onto fptosi so that it can represent this type of rounding
+naturally in the IR rather than using intrinsics. We might need to use a
+'system_rounding_mode' flag to encode that the semantics of the rounding mode
+can be changed by the program, but ideally we could just say that isn't
+supported, and hard code the rounding.
+
+//===---------------------------------------------------------------------===//
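
The semantic gap the note describes can be illustrated in portable C. The following is a minimal sketch, not part of the commit: it contrasts truncation (what fptosi and cvttsd2si compute) with round-to-nearest-even, which is what cvtsd2si computes when the program leaves the SSE rounding mode at its default. The helper names convert_truncating and convert_nearest_even are hypothetical, and the sketch assumes a finite input whose rounded result fits in an int.

```c
/* Truncating conversion: rounds toward zero, matching fptosi / cvttsd2si. */
static int convert_truncating(double x) {
    return (int)x;
}

/* Round-to-nearest, ties-to-even: a hand-written model of cvtsd2si under the
 * default rounding mode. Hypothetical helper; assumes x is finite and the
 * rounded result fits in an int. */
static int convert_nearest_even(double x) {
    int t = (int)x;               /* integer part, rounded toward zero */
    double frac = x - (double)t;  /* signed fractional part, in (-1, 1) */

    /* Move away from zero when past the halfway point, or exactly at the
     * halfway point with an odd integer part (so the result becomes even). */
    if (frac > 0.5 || (frac == 0.5 && (t & 1)))
        return t + 1;
    if (frac < -0.5 || (frac == -0.5 && (t & 1)))
        return t - 1;
    return t;
}
```

For example, 2.7 converts to 2 under truncation but to 3 under round-to-nearest-even, while the tie 2.5 yields 2 in both cases and 3.5 yields 3 and 4 respectively. A rounding flag on fptosi, as the note proposes, would let the IR express the second behavior directly; this sketch corresponds to the "hard code the rounding" option and does not model a rounding mode changed at run time.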

