<feed xmlns='http://www.w3.org/2005/Atom'>
<title>bcm5719-llvm/llvm/test/Transforms/InterleavedAccess, branch meklort-10.0.1</title>
<subtitle>Project Ortega BCM5719 LLVM</subtitle>
<id>https://git.raptorcs.com/git/bcm5719-llvm/atom?h=meklort-10.0.1</id>
<link rel='self' href='https://git.raptorcs.com/git/bcm5719-llvm/atom?h=meklort-10.0.1'/>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/'/>
<updated>2019-12-08T10:37:29+00:00</updated>
<entry>
<title>[ARM] Disable VLD4 under MVE</title>
<updated>2019-12-08T10:37:29+00:00</updated>
<author>
<name>David Green</name>
<email>david.green@arm.com</email>
</author>
<published>2019-12-08T09:58:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=3a6eb5f16054e8c0f41a37542a5fc806016502a0'/>
<id>urn:sha1:3a6eb5f16054e8c0f41a37542a5fc806016502a0</id>
<content type='text'>
Alas, using half the available vector registers in a single instruction
is just too much for the register allocator to handle. The mve-vldst4.ll
test here fails when these instructions are enabled at present. This
patch disables the generation of VLD4 and VST4 by adding a
mve-max-interleave-factor option, which we currently default to 2.

Differential Revision: https://reviews.llvm.org/D71109
</content>
</entry>
<entry>
<title>[ARM] MVE interleaving load and stores.</title>
<updated>2019-11-19T18:37:30+00:00</updated>
<author>
<name>David Green</name>
<email>david.green@arm.com</email>
</author>
<published>2019-11-19T18:37:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=882f23caeae5ad3ec1806eb6ec387e3611649d54'/>
<id>urn:sha1:882f23caeae5ad3ec1806eb6ec387e3611649d54</id>
<content type='text'>
Now that we have the intrinsics, we can add VLD2/4 and VST2/4 lowering
for MVE. This works the same way as Neon, recognising the load/shuffles
combination and converting them into intrinsics in a pre-isel pass,
which just calls getMaxSupportedInterleaveFactor, lowerInterleavedLoad
and lowerInterleavedStore.

The main difference to Neon is that we do not have a VLD3 instruction.
Otherwise most of the code works very similarly, with just some minor
differences in the form of the intrinsics to work around. VLD3 is
disabled by making isLegalInterleavedAccessType return false for those
cases.

We may need some further adjustments in the future; for example, VLD4
takes up half the available registers, so it should perhaps cost more.
This patch should get the basics in, though.

Differential Revision: https://reviews.llvm.org/D69392
</content>
</entry>
<entry>
<title>[ARM] Add and update a lot of VLDn tests. NFC</title>
<updated>2019-11-19T18:37:30+00:00</updated>
<author>
<name>David Green</name>
<email>david.green@arm.com</email>
</author>
<published>2019-11-19T18:17:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=411bfe476b758c09a0c9d4b3176e46f0a70de3bb'/>
<id>urn:sha1:411bfe476b758c09a0c9d4b3176e46f0a70de3bb</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Revert "Temporarily Revert "Add basic loop fusion pass.""</title>
<updated>2019-04-17T04:52:47+00:00</updated>
<author>
<name>Eric Christopher</name>
<email>echristo@gmail.com</email>
</author>
<published>2019-04-17T04:52:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=cee313d288a4faf0355d76fb6e0e927e211d08a5'/>
<id>urn:sha1:cee313d288a4faf0355d76fb6e0e927e211d08a5</id>
<content type='text'>
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting.

llvm-svn: 358552
</content>
</entry>
<entry>
<title>Temporarily Revert "Add basic loop fusion pass."</title>
<updated>2019-04-17T02:12:23+00:00</updated>
<author>
<name>Eric Christopher</name>
<email>echristo@gmail.com</email>
</author>
<published>2019-04-17T02:12:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=a86343512845c9c1fdbac865fea88aa5fce7142a'/>
<id>urn:sha1:a86343512845c9c1fdbac865fea88aa5fce7142a</id>
<content type='text'>
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
</content>
</entry>
<entry>
<title>[InterleavedAccessPass] Don't increase the number of bytes loaded.</title>
<updated>2019-03-28T20:44:50+00:00</updated>
<author>
<name>Eli Friedman</name>
<email>efriedma@quicinc.com</email>
</author>
<published>2019-03-28T20:44:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=96f295e23bed5b717313f41fb71d81e8f1d49090'/>
<id>urn:sha1:96f295e23bed5b717313f41fb71d81e8f1d49090</id>
<content type='text'>
Even if the interleaving transform would otherwise be legal, we shouldn't
introduce an interleaved load that is wider than the original load: it might
have undefined behavior.
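
A minimal sketch of that width check in Python, assuming a de-interleaving
shuffle mask of mask_len lanes taken from a load of num_load_elements
elements. The function and parameter names are hypothetical and this is not
the pass's actual code:

```python
def interleaved_load_is_too_wide(mask_len, factor, num_load_elements):
    """Return True if the interleaved load would read more than the original.

    An interleaved load with the given factor produces `factor` vectors of
    `mask_len` lanes each, so it touches mask_len * factor elements.
    """
    return mask_len * factor > num_load_elements


# A shuffle picking 4 lanes at stride 2 from an 8-element load fits exactly.
print(interleaved_load_is_too_wide(4, 2, 8))   # False
# The same shuffle at stride 4 would need 16 elements, so it is rejected.
print(interleaved_load_is_too_wide(4, 4, 8))   # True
```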

It might be possible to perform some sort of mask-narrowing transform in
some cases (using a narrower interleaved load, then extending the
results using shufflevectors).  But I haven't tried to implement that,
at least for now.

Fixes https://bugs.llvm.org/show_bug.cgi?id=41245 .

Differential Revision: https://reviews.llvm.org/D59954

llvm-svn: 357212
</content>
</entry>
<entry>
<title>[X86][LLVM]Expanding Supports lowerInterleaved{store|load}() in X86InterleavedAccess (VF64 stride 3-4)</title>
<updated>2017-10-02T07:35:25+00:00</updated>
<author>
<name>Michael Zuckerman</name>
<email>Michael.zuckerman@intel.com</email>
</author>
<published>2017-10-02T07:35:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=e4084f6bdbd338fd00c7c888f966f0b762f678af'/>
<id>urn:sha1:e4084f6bdbd338fd00c7c888f966f0b762f678af</id>
<content type='text'>
Continuing to extend interleaved-access support to different VFs in this pass,
this patch adds vf64 stride 3 support for both load and store.
It also adds support for the stride 4 store.

Reviewers:
1. zvi
2. dorit
3. igorb
4. guyblank

Differential Revision: https://reviews.llvm.org/D37687

Change-Id: I3d238efedf217d1768b348d710de1efa2f19d27b
llvm-svn: 314651
</content>
</entry>
<entry>
<title>Adding test for interleaved case, stride 4 vf64 store &lt;NFC&gt;.</title>
<updated>2017-10-01T09:37:38+00:00</updated>
<author>
<name>Michael Zuckerman</name>
<email>Michael.zuckerman@intel.com</email>
</author>
<published>2017-10-01T09:37:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=17468954902a117a935b03e8dd2147fd6d2a8962'/>
<id>urn:sha1:17468954902a117a935b03e8dd2147fd6d2a8962</id>
<content type='text'>
Change-Id: I9ea62aac81b763c83d26613dca6fcd846997a017
llvm-svn: 314621
</content>
</entry>
<entry>
<title>Code refactoring for the interleaved code &lt;NFC&gt;</title>
<updated>2017-09-30T14:55:03+00:00</updated>
<author>
<name>Michael Zuckerman</name>
<email>Michael.zuckerman@intel.com</email>
</author>
<published>2017-09-30T14:55:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=b92b6d424fe143d3985e87708817a66fb2927795'/>
<id>urn:sha1:b92b6d424fe143d3985e87708817a66fb2927795</id>
<content type='text'>
Change-Id: I7831c9febad8e14278a5bc87584a0053dc837be1
llvm-svn: 314596
</content>
</entry>
<entry>
<title>[X86][LLVM]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess (VF{8|16|32} stride 3)</title>
<updated>2017-09-26T18:49:11+00:00</updated>
<author>
<name>Michael Zuckerman</name>
<email>Michael.zuckerman@intel.com</email>
</author>
<published>2017-09-26T18:49:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=645f777e40c367e5a73acfc400677250a4661b32'/>
<id>urn:sha1:645f777e40c367e5a73acfc400677250a4661b32</id>
<content type='text'>
This patch expands the support of lowerInterleavedStore to {8|16|32}x8i stride 3.

LLVM creates suboptimal shuffle code-gen for AVX2. Overall, this patch is a specific fix for the pattern (Stride=3, VF={8|16|32}).
This patch is part two of two patches and covers the store (interleaved) side.

The goal of the patch is to optimize the following sequence:
a0 a1 a2 a3 a4 a5 a6 a7
b0 b1 b2 b3 b4 b5 b6 b7
c0 c1 c2 c3 c4 c5 c6 c7

into
a0 b0 c0 a1 b1 c1 a2 b2
c2 a3 b3 c3 a4 b4 c4 a5
b5 c5 a6 b6 c6 a7 b7 c7
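
As a rough illustration only (not code from this patch), the target layout
above can be sketched in Python. The interleave3 helper and the string
element names are hypothetical:

```python
def interleave3(a, b, c):
    """Interleave three equal-length rows element by element (stride 3)."""
    if not (len(a) == len(b) == len(c)):
        raise ValueError("rows must have the same length")
    out = []
    # Take one element from each row in turn: a0 b0 c0 a1 b1 c1 ...
    for x, y, z in zip(a, b, c):
        out.extend([x, y, z])
    return out


a = ["a%d" % i for i in range(8)]
b = ["b%d" % i for i in range(8)]
c = ["c%d" % i for i in range(8)]
print(" ".join(interleave3(a, b, c)))
```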

Reviewers:
zvi
guyblank
dorit
Ayal

Differential Revision: https://reviews.llvm.org/D37117

Change-Id: I56ced8bcbea809a37654060771911ade20246ccc
llvm-svn: 314234
</content>
</entry>
</feed>
