| author | Sanjay Patel <spatel@rotateright.com> | 2015-03-20 21:19:52 +0000 |
|---|---|---|
| committer | Sanjay Patel <spatel@rotateright.com> | 2015-03-20 21:19:52 +0000 |
| commit | c88f724fedeff64bd333668bdcda9d8d0a50f537 (patch) | |
| tree | ddac9121533abe574c78f50a62dab68193798988 /llvm/lib/CodeGen/MachineDominanceFrontier.cpp | |
| parent | 03ad616143062560de3aa1bfe41cae60d25eb548 (diff) | |
[X86] Prefer blendps over insertps codegen for one special case

With this patch, for this one exact case, we'll generate:

  blendps %xmm0, %xmm1, $1

instead of:

  insertps %xmm0, %xmm1, $0

If there's a memory operand available for load folding and we're
optimizing for size, we'll still generate the insertps.

The detailed performance data motivating this change can be found in D7866;
in summary, blendps has 2-3x the throughput of insertps on widely used chips.

Differential Revision: http://reviews.llvm.org/D8332

llvm-svn: 232850
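
To illustrate why the two instructions are interchangeable in this one case, here is a minimal scalar sketch of their register-form semantics. It is not code from the patch: the names `Vec4`, `model_blendps`, `model_insertps`, and `special_case_matches` are hypothetical, and the immediate decoding follows my reading of the instruction definitions (blendps: bit i selects source lane i; insertps: bits 7:6 pick the source lane, bits 5:4 the destination lane, bits 3:0 a zero mask).

```cpp
#include <array>
#include <cstdint>

// Hypothetical 4-lane float vector standing in for an XMM register.
using Vec4 = std::array<float, 4>;

// Scalar model of BLENDPS: bit i of imm takes lane i from src,
// otherwise the destination lane is kept.
Vec4 model_blendps(Vec4 dst, const Vec4& src, std::uint8_t imm) {
    for (int i = 0; i < 4; ++i)
        if (imm & (1u << i)) dst[i] = src[i];
    return dst;
}

// Scalar model of register-form INSERTPS (assumed decoding):
// imm[7:6] = source lane, imm[5:4] = destination lane, imm[3:0] = zero mask.
Vec4 model_insertps(Vec4 dst, const Vec4& src, std::uint8_t imm) {
    const unsigned src_idx = (imm >> 6) & 3u;
    const unsigned dst_idx = (imm >> 4) & 3u;
    const unsigned zmask = imm & 0xFu;
    dst[dst_idx] = src[src_idx];
    for (int i = 0; i < 4; ++i)
        if (zmask & (1u << i)) dst[i] = 0.0f;
    return dst;
}

// The special case from the commit message: blendps $1 vs. insertps $0.
// Both copy lane 0 of src into lane 0 of dst and leave lanes 1-3 alone.
bool special_case_matches() {
    const Vec4 dst{10.0f, 11.0f, 12.0f, 13.0f};
    const Vec4 src{20.0f, 21.0f, 22.0f, 23.0f};
    return model_blendps(dst, src, 1) == model_insertps(dst, src, 0);
}
```

The model covers only register operands. With a memory operand the two instructions read different widths, which is one plausible reason the load-folding case is treated separately above.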
Diffstat (limited to 'llvm/lib/CodeGen/MachineDominanceFrontier.cpp')
0 files changed, 0 insertions, 0 deletions

