path: root/clang/lib/Frontend/CompilerInvocation.cpp
author    Benjamin Kramer <benny.kra@googlemail.com>  2012-06-10 20:35:00 +0000
committer Benjamin Kramer <benny.kra@googlemail.com>  2012-06-10 20:35:00 +0000
commit 8b8a76974f1498a7a1d24292de9ce6f1e869c615 (patch)
tree   c8c4b1edd0c11019e6730efd417d717171ac2c12 /clang/lib/Frontend/CompilerInvocation.cpp
parent 4e9f1a859f6fefa0d813c31e50ba4a7aabc183ca (diff)
InstCombine: Turn (zext A) == (B & (1<<X)-1) into A == (trunc B), narrowing the compare.
This saves a cast, and zext is more expensive than trunc on platforms with subreg support. The pattern occurs in the BSD implementation of memchr(3); see PR12750.

On the synthetic benchmark from that bug, stupid_memchr and bsd_memchr now have the same performance when neither function is inlined:

stupid_memchr: 323.0us
bsd_memchr:    321.0us
memchr:        479.0us

where memchr is the llvm-gcc-compiled bsd_memchr from OS X Lion's libc.

When inlining is enabled, bsd_memchr still regresses down to the llvm-gcc memchr time. I haven't fully understood the issue yet; something is grossly mangling the loop after inlining.

llvm-svn: 158297
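The equivalence the transform relies on can be sketched in C (a hand-written illustration of the rewrite, not the InstCombine code itself; the function names and the 8-bit width are chosen for the example). When the mask (1<<X)-1 keeps exactly the low bits matching A's width, masking B and zero-extending A gives the same comparison as truncating B to A's width:

```c
#include <stdint.h>

/* Wide form: A is zero-extended to 32 bits and compared against
 * B masked down to A's width, i.e. (zext A) == (B & (1<<8)-1). */
static int cmp_zext(uint8_t a, uint32_t b) {
    return (uint32_t)a == (b & ((1u << 8) - 1));
}

/* Narrow form after the transform: B is truncated to A's width
 * instead, i.e. A == (trunc B). No zext of A is needed. */
static int cmp_trunc(uint8_t a, uint32_t b) {
    return a == (uint8_t)b;
}
```

Both functions agree for all inputs, which is why the compare can be narrowed and the zext dropped.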
Diffstat (limited to 'clang/lib/Frontend/CompilerInvocation.cpp')
0 files changed, 0 insertions, 0 deletions