author Charlie Turner <charlie.turner@arm.com> 2015-10-27 17:59:03 +0000
committer Charlie Turner <charlie.turner@arm.com> 2015-10-27 17:59:03 +0000
commit ab3215fa115d336c47bbf2e6387f7b72eb9a9a01 (patch)
tree 012da1ab1c37cb4cff0cd4bc138f3c9e6737e66d /llvm/lib/CodeGen/MachineBasicBlock.cpp
parent 5d40ae3a46a43eb0dd16dec801af5517d2ea9e96 (diff)
[SLP] Be more aggressive about reduction width selection.
Summary:
This change could be way off-piste; I'm looking for any feedback on whether it's an acceptable approach.

It never seems to be a problem to gobble up as many reduction values as can be found, and then to attempt to reduce the resulting tree. Some of the workloads I'm looking at have been aggressively unrolled by hand, and by selecting reduction widths that are not constrained by a vector register size, it becomes possible to vectorize them profitably. My test case shows such an unrolling, which SLP was not vectorizing on either ARM or X86 before this patch, but does vectorize with it.

I measure no significant compile-time impact from this change when combined with D13949 and D14063. There are also no significant performance regressions on ARM/AArch64 in SPEC or LNT.

The more principled approach I thought of was to generate several candidate trees and use the cost model to pick the cheapest one. That seemed like quite a big design change (the algorithms seem very much one-shot), and would likely be costly for compile time. This seemed to do the job at very little cost, but I'm worried I've misunderstood something!

Reviewers: nadav, jmolloy

Subscribers: mssimpso, llvm-commits, aemerson

Differential Revision: http://reviews.llvm.org/D14116

llvm-svn: 251428
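For illustration only, here is a minimal sketch of the kind of hand-unrolled reduction the summary describes. It is not the test case from the patch; the function name `sum8` and the choice of eight 32-bit elements are assumptions made for this example. The single chain of adds has eight leaf values, which is wider than the four 32-bit lanes of one 128-bit vector register.

```cpp
#include <cstdint>

// Hypothetical example (not taken from the patch): a reduction that has
// been unrolled by hand. Eight loads feed one chain of additions, so the
// reduction has eight leaf values, twice the four i32 lanes that fit in
// a single 128-bit vector register.
std::int32_t sum8(const std::int32_t *p) {
  return p[0] + p[1] + p[2] + p[3] + p[4] + p[5] + p[6] + p[7];
}
```

Whether a given compiler version vectorizes this particular function depends on the target and its cost model; the sketch only shows the shape of the input.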
Diffstat (limited to 'llvm/lib/CodeGen/MachineBasicBlock.cpp')
0 files changed, 0 insertions, 0 deletions