[X86][LLVM]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess (VF16 stride 4). - bcm5719-llvm


author	Michael Zuckerman <Michael.zuckerman@intel.com>	2017-08-07 13:22:39 +0000
committer	Michael Zuckerman <Michael.zuckerman@intel.com>	2017-08-07 13:22:39 +0000
commit	680ac10aa7bc87c52bbd3e110d3cd227b0821044 (patch)
tree	033588904a1f1046c58755c550f50c87537113ad /llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
parent	50805a0b83a2df7d77cebfbea1c175009a883cb4 (diff)
download	bcm5719-llvm-680ac10aa7bc87c52bbd3e110d3cd227b0821044.tar.gz bcm5719-llvm-680ac10aa7bc87c52bbd3e110d3cd227b0821044.zip

[X86][LLVM]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess (VF16 stride 4).

This patch expands the support of lowerInterleavedStore to 16 x i8 with stride 4.
LLVM creates suboptimal shuffle code-gen for AVX2. Overall, this patch is a specific fix for the pattern (Stride=4, VF=16), and we plan to include more patterns in the future.

The goal of the patch is to optimize the following sequence:
At the end of the computation, we have ymm2, ymm0, ymm12 and ymm3, each holding 16 chars:

c0, c1, ..., c16
m0, m1, ..., m16
y0, y1, ..., y16
k0, k1, ..., k16

These need to be transposed/interleaved and stored like so:

c0 m0 y0 k0 c1 m1 y1 k1 c2 m2 y2 k2 c3 m3 y3 k3 ....

Differential Revision: https://reviews.llvm.org/D35829

llvm-svn: 310252
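
For illustration, the target memory layout can be written as a scalar reference in C++. This is only a sketch of the stride-4 interleave described above, not the code the patch adds to X86InterleavedAccess. The names `Lane`, `Interleaved`, `interleave4` and `interleaveMask` are hypothetical, and the 16 elements per vector are indexed from 0 to 15. The second function computes the equivalent shuffle mask over the concatenation of the four inputs (c, m, y, k), which is the kind of pattern an interleaved store presents.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kVF = 16;     // elements per input vector (VF=16)
constexpr std::size_t kStride = 4;  // number of vectors being interleaved

using Lane = std::array<std::uint8_t, kVF>;                   // hypothetical name
using Interleaved = std::array<std::uint8_t, kVF * kStride>;  // hypothetical name

// Scalar reference for the interleave: out[4*i + j] = in[j][i],
// which yields c0 m0 y0 k0 c1 m1 y1 k1 ...
Interleaved interleave4(const Lane& c, const Lane& m,
                        const Lane& y, const Lane& k) {
    const Lane* in[kStride] = {&c, &m, &y, &k};
    Interleaved out{};
    for (std::size_t i = 0; i < kVF; ++i) {
        for (std::size_t j = 0; j < kStride; ++j) {
            out[kStride * i + j] = (*in[j])[i];
        }
    }
    return out;
}

// The same permutation expressed as a shuffle mask over the 64-element
// concatenation c|m|y|k: 0, 16, 32, 48, 1, 17, 33, 49, ...
std::array<unsigned, kVF * kStride> interleaveMask() {
    std::array<unsigned, kVF * kStride> mask{};
    for (std::size_t i = 0; i < kVF; ++i) {
        for (std::size_t j = 0; j < kStride; ++j) {
            mask[kStride * i + j] = static_cast<unsigned>(j * kVF + i);
        }
    }
    return mask;
}
```

A vectorized lowering has to produce the same 64 bytes as `interleave4`, ideally with a short sequence of unpack and permute instructions rather than one generic wide shuffle.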

Diffstat (limited to 'llvm/lib/DebugInfo/DWARF/DWARFContext.cpp')

0 files changed, 0 insertions, 0 deletions

