| author | Sanjay Patel <spatel@rotateright.com> | 2018-10-10 13:39:59 +0000 |
|---|---|---|
| committer | Sanjay Patel <spatel@rotateright.com> | 2018-10-10 13:39:59 +0000 |
| commit | 6cca8af2270be8bc5494b44bb8856af591d0385b (patch) | |
| tree | f875071a18f0d27152814e086f606837ff94485f /llvm/lib/Target/X86 | |
| parent | fc51490baf1d6ad5796d8cb8bb0792de13ce8fce (diff) | |
[x86] allow single source horizontal op matching (PR39195)
This is intended to restore horizontal-op codegen to what it looked like before the IR demanded-elements handling improved in rL343727.
As noted in PR39195 (https://bugs.llvm.org/show_bug.cgi?id=39195), horizontal ops can be worse for performance than a shuffle plus a regular binop, so I've added a TODO. Ideally, we'd solve that in a machine instruction pass, but a quicker solution would be to add a 'HasFastHorizontalOp' feature bit and deal with it here in the DAG.
Differential Revision: https://reviews.llvm.org/D52997
llvm-svn: 344141
Diffstat (limited to 'llvm/lib/Target/X86')
| -rw-r--r-- | llvm/lib/Target/X86/X86ISelLowering.cpp | 8 |
1 files changed, 6 insertions, 2 deletions
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 4c18c5a84c2..67f98d8ee72 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -37026,9 +37026,13 @@ static bool isHorizontalBinOp(SDValue &LHS, SDValue &RHS, bool IsCommutative) {
         continue;
 
       // The low half of the 128-bit result must choose from A.
-      // The high half of the 128-bit result must choose from B.
+      // The high half of the 128-bit result must choose from B,
+      // unless B is undef. In that case, we are always choosing from A.
+      // TODO: Using a horizontal op on a single input is likely worse for
+      // performance on many CPUs, so this should be limited here or reversed
+      // in a later pass.
       unsigned NumEltsPer64BitChunk = NumEltsPer128BitChunk / 2;
-      unsigned Src = i >= NumEltsPer64BitChunk;
+      unsigned Src = B.getNode() ? i >= NumEltsPer64BitChunk : 0;
 
       // Check that successive elements are being operated on. If not, this is
       // not a horizontal operation.
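The source-selection rule changed in the hunk above, and the trade-off named in the TODO, can be modeled outside of LLVM. The following is a minimal standalone sketch, not LLVM code: it assumes a single 128-bit chunk of four floats, and `Vec4`, `horizontalAdd`, `expectedSource`, and `shuffleThenAdd` are hypothetical names introduced only for illustration. It shows that a horizontal add fed the same input twice produces the same values as two shuffles followed by a regular add, which is the alternative the commit message says may be faster on many CPUs.

```cpp
#include <array>

using Vec4 = std::array<float, 4>; // one 128-bit chunk of four floats (assumption)

// Models a 128-bit horizontal add: adjacent pairs of A fill the low half of
// the result, adjacent pairs of B fill the high half.
inline Vec4 horizontalAdd(const Vec4 &A, const Vec4 &B) {
  return {A[0] + A[1], A[2] + A[3], B[0] + B[1], B[2] + B[3]};
}

// Hypothetical helper mirroring the patched `Src` computation:
// returns 0 when result element I must read from A, 1 when it must read from B.
// HasB == false stands in for "B is undef" (B.getNode() is null in the patch).
inline unsigned expectedSource(unsigned I, unsigned NumEltsPer128BitChunk,
                               bool HasB) {
  unsigned NumEltsPer64BitChunk = NumEltsPer128BitChunk / 2;
  return (HasB && I >= NumEltsPer64BitChunk) ? 1u : 0u;
}

// The "shuffle + regular binop" alternative for a single input: gather the
// even and odd elements with two shuffles, then do one vertical add.
inline Vec4 shuffleThenAdd(const Vec4 &A) {
  Vec4 Even = {A[0], A[2], A[0], A[2]};
  Vec4 Odd = {A[1], A[3], A[1], A[3]};
  return {Even[0] + Odd[0], Even[1] + Odd[1], Even[2] + Odd[2],
          Even[3] + Odd[3]};
}
```

With two real inputs, the high half of the result must come from B (`expectedSource(2, 4, true)` is 1). With B undef, every element comes from A, so the single-source form `horizontalAdd(A, A)` matches `shuffleThenAdd(A)` element for element.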

