[DAGCombiner] use UADDO to optimize saturated unsigned add - bcm5719-llvm

author	Sanjay Patel <spatel@rotateright.com>	2018-09-24 14:47:15 +0000
committer	Sanjay Patel <spatel@rotateright.com>	2018-09-24 14:47:15 +0000
commit	2c901742cac617fc9cfa16b12d0b494c20268df0 (patch)
tree	57db4521f8708b370da98673dc25f09b669dfd0d /clang/lib/CodeGen/CodeGenModule.cpp
parent	ae2e86fb2f198cc89e3e525b973989e26e4ecd9f (diff)

[DAGCombiner] use UADDO to optimize saturated unsigned add

This is a preliminary step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

If we have an 'add' instruction that sets flags, we can use that to eliminate an explicit compare instruction or some other instruction (cmn) that sets flags for use in the later select.

As shown in the unchanged tests that use 'icmp ugt %x, %a', we're effectively reversing an IR icmp canonicalization that replaces a variable operand with a constant:
https://rise4fun.com/Alive/V1Q

But we're not using 'uaddo' in those cases via DAG transforms. This happens in CGP after D8889 without checking target lowering to see if the op is supported. So AArch already shows 'uaddo' codegen for the i8/i16/i32/i64 test variants with "using_cmp_sum" in the title. That's the pattern that CGP matches as an unsigned saturated add and converts to uaddo without checking target capabilities.

This patch is gated by isOperationLegalOrCustom(ISD::UADDO, VT), so we only see AArch diffs for i32/i64 in the tests with "using_cmp_notval" in the title (unlike x86, which sees improvements for all sizes because all sizes are 'custom'). But the AArch code (like x86) looks better when translated to 'uaddo' in all cases. So someone who is involved with AArch may want to set i8/i16 to 'custom' for UADDO, so that this patch will fire on those tests.

Another possibility given the existing behavior: we could remove the legal-or-custom check altogether because we're assuming that a UADDO sequence is canonical/optimal before we ever reach here. But that seems like a bug to me. If the target doesn't have an add-with-flags op, then it's not likely that we'll get optimal DAG combining using a UADDO node. This is a similar justification for why we don't canonicalize IR to the overflow math intrinsic sibling (llvm.uadd.with.overflow) for UADDO in the first place.

Differential Revision: https://reviews.llvm.org/D51929

llvm-svn: 342886
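For readers who want the patterns at source level, here is a minimal C++ sketch of the three equivalent formulations of an unsigned saturated add that the message discusses. The function names are hypothetical and only echo the "using_cmp_sum" and "using_cmp_notval" test-title fragments; the actual test IR is not reproduced here, and the mapping from these functions to that IR is an assumption. The third variant uses the GCC/Clang builtin __builtin_add_overflow to stand in for an add that also produces an overflow flag, which is roughly what a UADDO node models.

```cpp
#include <cstdint>
#include <limits>

// All names below are illustrative assumptions, not taken from the patch.
constexpr std::uint32_t kMax = std::numeric_limits<std::uint32_t>::max();

// "using_cmp_sum" shape: add first, then compare the wrapped sum with an
// operand. If the sum is smaller than x, the add wrapped around.
std::uint32_t sat_add_using_cmp_sum(std::uint32_t x, std::uint32_t y) {
  std::uint32_t sum = x + y;  // unsigned arithmetic wraps modulo 2^32
  return sum < x ? kMax : sum;
}

// "using_cmp_notval" shape: compare x with the bitwise-not of y.
// x > ~y holds exactly when x + y exceeds the 32-bit maximum.
std::uint32_t sat_add_using_cmp_notval(std::uint32_t x, std::uint32_t y) {
  std::uint32_t not_y = static_cast<std::uint32_t>(~y);
  std::uint32_t sum = x + y;
  return x > not_y ? kMax : sum;
}

// Add-with-overflow shape: one operation yields both the sum and the
// overflow bit, so no separate compare is needed before the select.
std::uint32_t sat_add_using_overflow_flag(std::uint32_t x, std::uint32_t y) {
  std::uint32_t sum = 0;
  bool overflowed = __builtin_add_overflow(x, y, &sum);  // GCC/Clang builtin
  return overflowed ? kMax : sum;
}
```

All three functions return the same value for every input; the difference is only in which form makes it easy for a backend to reuse the flags produced by the add.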

Diffstat (limited to 'clang/lib/CodeGen/CodeGenModule.cpp')

0 files changed, 0 insertions, 0 deletions

