| field | value | date |
|---|---|---|
| author | Sanjay Patel <spatel@rotateright.com> | 2019-11-27 13:33:11 -0500 |
| committer | Sanjay Patel <spatel@rotateright.com> | 2019-11-27 14:08:56 -0500 |
| commit | 5c166f1d1969e9c1e5b72aa672add429b9c22b53 (patch) | |
| tree | adf6302c8508cb2d3cf48fcf5e53eab409bfa65f /llvm/test/Transforms/SLPVectorizer/X86/reduction.ll | |
| parent | 5c5e860535d8924a3d6eb950bb8a4945df01e9b7 (diff) | |

[x86] make SLM extract vector element more expensive than default

I'm not sure what the effect of this change will be on all of the affected
tests or a larger benchmark, but it fixes the horizontal add/sub problems
noted here:
https://reviews.llvm.org/D59710?vs=227972&id=228095&whitespace=ignore-most#toc

The costs are based on reciprocal throughput numbers in Agner's tables for
PEXTR*; these appear to be very slow ops on Silvermont.

This is a small step towards the larger motivation discussed in PR43605:
https://bugs.llvm.org/show_bug.cgi?id=43605

Also, it seems likely that insert/extract is the source of perf regressions on
other CPUs (up to 30%) that were cited as part of the reason to revert D59710,
so maybe we'll extend the table-based approach to other subtargets.

Differential Revision: https://reviews.llvm.org/D70607
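
The table-based approach described in the message can be sketched roughly as follows. This is a standalone illustration, not the code from the patch: `ElemType`, `CostEntry`, `SlowExtractCostTable`, `lookupCost`, and `extractElementCost` are hypothetical names, and the cost values are placeholders rather than the numbers derived from Agner's tables. The idea is that a subtarget with slow PEXTR*-style ops consults a per-type table first and falls back to the default cost when no entry matches.

```cpp
#include <cstddef>

// Hypothetical stand-in for the element types a cost model distinguishes.
enum class ElemType { I8, I16, I32, I64 };

// One row of a per-subtarget cost table: element type -> extract cost.
struct CostEntry {
  ElemType Type;
  unsigned Cost;
};

// PLACEHOLDER costs for a subtarget with slow element extraction.
// These are NOT the values used in the commit; they only show the shape.
static const CostEntry SlowExtractCostTable[] = {
    {ElemType::I8, 4},
    {ElemType::I16, 4},
    {ElemType::I32, 4},
};

// Assumed generic cost used when no table entry applies.
static const unsigned DefaultExtractCost = 1;

// Linear search over the table; returns nullptr when the type is not listed.
inline const CostEntry *lookupCost(ElemType Ty) {
  for (const CostEntry &Entry : SlowExtractCostTable)
    if (Entry.Type == Ty)
      return &Entry;
  return nullptr;
}

// Prefer the subtarget table when the target is flagged as slow,
// otherwise (or when the type has no entry) use the default cost.
inline unsigned extractElementCost(bool IsSlowExtractTarget, ElemType Ty) {
  if (IsSlowExtractTarget)
    if (const CostEntry *Entry = lookupCost(Ty))
      return Entry->Cost;
  return DefaultExtractCost;
}
```

With a layout like this, extending the approach to another subtarget would presumably mean adding another table and another predicate, leaving the default path unchanged for every other CPU.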
Diffstat (limited to 'llvm/test/Transforms/SLPVectorizer/X86/reduction.ll')
0 files changed, 0 insertions, 0 deletions

