shld is a very high latency operation. Instead of emitting it for shifts of - bcm5719-llvm


author	Chris Lattner <sabre@nondot.org>	2004-11-13 20:48:57 +0000
committer	Chris Lattner <sabre@nondot.org>	2004-11-13 20:48:57 +0000
commit	049d33a7175882fcf2fdc56f4465af5629a7e353 (patch)
tree	5094b3b0a14ecc9ad353b4a4270dfb414875c6e7 /llvm/test/Regression/C++Frontend/2003-11-04-ArrayConstructors.cpp
parent	ef6bd92a8c78db2b1edbc444570c9fb2931f4e2f (diff)

shld is a very high latency operation. Instead of emitting it for shifts of
two or three, open code the equivalent operation which is faster on athlon
and P4 (by a substantial margin).

For example, instead of compiling this:

    long long X2(long long Y) { return Y << 2; }

to:

    X3_2:
            movl 4(%esp), %eax
            movl 8(%esp), %edx
            shldl $2, %eax, %edx
            shll $2, %eax
            ret

Compile it to:

    X2:
            movl 4(%esp), %eax
            movl 8(%esp), %ecx
            movl %eax, %edx
            shrl $30, %edx
            leal (%edx,%ecx,4), %edx
            shll $2, %eax
            ret

Likewise, for << 3, compile to:

    X3:
            movl 4(%esp), %eax
            movl 8(%esp), %ecx
            movl %eax, %edx
            shrl $29, %edx
            leal (%edx,%ecx,8), %edx
            shll $3, %eax
            ret

This matches icc, except that icc open codes the shifts as adds on the P4.

llvm-svn: 17707
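The equivalence between the shld sequence and the open-coded one can be checked in portable C. The sketch below is illustrative only and is not part of the commit: the function name `shl64_open_coded` and its parameter layout are hypothetical. It models the 64-bit value as two 32-bit halves and mirrors the shrl/leal/shll steps, assuming a shift amount of 2 or 3 as the commit message describes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the open-coded 64-bit left shift on a 32-bit target.
 * Only n == 2 and n == 3 are modeled, matching the cases in the commit. */
uint64_t shl64_open_coded(uint64_t y, unsigned n) {
    assert(n == 2 || n == 3);

    uint32_t lo = (uint32_t)y;          /* movl 4(%esp), %eax */
    uint32_t hi = (uint32_t)(y >> 32);  /* movl 8(%esp), %ecx */

    /* movl %eax, %edx ; shrl $(32-n), %edx
     * The top n bits of the low half are the bits that cross into the
     * high half. */
    uint32_t carry = lo >> (32 - n);

    /* leal (%edx,%ecx,scale), %edx with scale = 4 or 8.
     * hi * scale has its low n bits clear and carry < scale, so the
     * addition gives the same result as the OR that shld would compute. */
    uint32_t scale = 1u << n;
    uint32_t new_hi = carry + (uint32_t)(hi * scale);

    /* shll $n, %eax */
    uint32_t new_lo = lo << n;

    return ((uint64_t)new_hi << 32) | new_lo;
}
```

For any 64-bit `y`, `shl64_open_coded(y, 2)` should equal `y << 2`, and likewise for a shift of 3. The scale factor in the `leal` addressing mode is what restricts this trick to small shift amounts.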


