bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Revert "Temporarily Revert "Add basic loop fusion pass.""	Eric Christopher	2019-04-17	4	-0/+639
	The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552
*	Temporarily Revert "Add basic loop fusion pass."	Eric Christopher	2019-04-17	4	-639/+0
	As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546
*	[X86][LLVM]Expanding Supports lowerInterleaved{store|load}() in ↵	Michael Zuckerman	2017-10-02	2	-10/+99
	X86InterleavedAccess (VF64 stride 3-4). This patch continues to extend interleaved-access support to more VFs: it adds VF64 stride-3 support for both load and store, and stride-4 support for store. Reviewers: zvi, dorit, igorb, guyblank. Differential Revision: https://reviews.llvm.org/D37687 Change-Id: I3d238efedf217d1768b348d710de1efa2f19d27b llvm-svn: 314651
*	Adding test for interleaved, case stride 4 vf64 store <NFC>.	Michael Zuckerman	2017-10-01	1	-0/+15
	Change-Id: I9ea62aac81b763c83d26613dca6fcd846997a017 llvm-svn: 314621
*	Code refactoring for the interleaved code <NFC>	Michael Zuckerman	2017-09-30	1	-14/+14
	Change-Id: I7831c9febad8e14278a5bc87584a0053dc837be1 llvm-svn: 314596
*	[X86][LLVM]Expanding Supports lowerInterleavedStore() in ↵	Michael Zuckerman	2017-09-26	1	-4/+36
	X86InterleavedAccess (VF{8|16|32} stride 3). This patch expands the support of lowerInterleavedStore to {8|16|32}x8i stride 3. LLVM creates suboptimal shuffle code-gen for AVX2. Overall, this patch is a specific fix for the pattern (Stride=3, VF={8|16|32}). It is part two of two patches and covers the store (interleaved) side. The goal is to optimize the following sequence: [a0 a1 a2 a3 a4 a5 a6 a7] [b0 b1 b2 b3 b4 b5 b6 b7] [c0 c1 c2 c3 c4 c5 c6 c7] into [a0 b0 c0 a1 b1 c1 a2 b2 c2 a3 b3 c3 a4 b4 c4 a5 b5 c5 a6 b6 c6 a7 b7 c7]. Reviewers: zvi, guyblank, dorit, Ayal. Differential Revision: https://reviews.llvm.org/D37117 Change-Id: I56ced8bcbea809a37654060771911ade20246ccc llvm-svn: 314234
*	[X86][LLVM]Expanding Supports lowerInterleavedStore() in ↵	Michael Zuckerman	2017-09-25	1	-4/+12
	X86InterleavedAccess (VF8 stride 4). This patch expands the support of lowerInterleavedStore to 8x8i stride 4. LLVM creates suboptimal shuffle code-gen for AVX2. Overall, this patch is a specific fix for the pattern (Stride=4, VF=8), and we plan to include more patterns in the future. The goal is to optimize the following sequence: at the end of the computation, xmm2, xmm0, xmm12 and xmm3 each hold 8 chars: [c0, c1, ..., c7] [m0, m1, ..., m7] [y0, y1, ..., y7] [k0, k1, ..., k7]. These need to be transposed/interleaved and stored like so: c0 m0 y0 k0 c1 m1 y1 k1 c2 m2 y2 k2 c3 m3 y3 k3 .... Reviewers: DavidKreitzer, Farhana, zvi, igorb, guyblank, RKSimon, Ayal. Differential Revision: https://reviews.llvm.org/D36058 Change-Id: I3cc5c2ca5d6318901c192a4428493b99ef424c32 llvm-svn: 314109
*	[Interleaved][Stride 3]Adding test for case the VF=64 target with AVX512.	Michael Zuckerman	2017-09-11	2	-0/+35
	llvm-svn: 312907
*	[X86][LLVM]Expanding Supports lowerInterleavedLoad() in X86InterleavedAccess ↵	Michael Zuckerman	2017-09-07	1	-13/+50
	(VF{8|16|32} stride 3). This patch expands the support of lowerInterleavedLoad to {8|16|32}x8i stride 3. LLVM creates suboptimal shuffle code-gen for AVX2. Overall, this patch is a specific fix for the pattern (Stride=3, VF={8|16|32}), and we plan to add the store (interleaved) side. The goal is to optimize the following sequence: [a0 b0 c0 a1 b1 c1 a2 b2 c2 a3 b3 c3 a4 b4 c4 a5 b5 c5 a6 b6 c6 a7 b7 c7] into [a0 a1 a2 a3 a4 a5 a6 a7] [b0 b1 b2 b3 b4 b5 b6 b7] [c0 c1 c2 c3 c4 c5 c6 c7]. Reviewers: zvi, igor, guyblank, dorit, Ayal. llvm-svn: 312722
*	Update test for testing avx512	Michael Zuckerman	2017-09-04	1	-26/+26
	llvm-svn: 312487
*	Adding base lit test for x86interleaved	Michael Zuckerman	2017-08-24	1	-0/+46
	llvm-svn: 311658
*	[InterLeaved] Adding lit test for future work interleaved load stride 3	Michael Zuckerman	2017-08-21	1	-0/+60
	llvm-svn: 311320
*	[X86][LLVM]Expanding Supports lowerInterleavedStore() in ↵	Michael Zuckerman	2017-08-07	1	-2/+16
	X86InterleavedAccess (VF16 stride 4). This patch expands the support of lowerInterleavedStore to 16x8i stride 4. LLVM creates suboptimal shuffle code-gen for AVX2. Overall, this patch is a specific fix for the pattern (Stride=4, VF=16), and we plan to include more patterns in the future. The goal is to optimize the following sequence: at the end of the computation, ymm2, ymm0, ymm12 and ymm3 each hold 16 chars: [c0, c1, ..., c15] [m0, m1, ..., m15] [y0, y1, ..., y15] [k0, k1, ..., k15]. These need to be transposed/interleaved and stored like so: c0 m0 y0 k0 c1 m1 y1 k1 c2 m2 y2 k2 c3 m3 y3 k3 .... Differential Revision: https://reviews.llvm.org/D35829 llvm-svn: 310252
*	Expanding the test case for vf8 for stride 4 interleaved.	Michael Zuckerman	2017-07-30	1	-0/+15
	llvm-svn: 309511
*	[X86][LLVM]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess.	Michael Zuckerman	2017-07-26	1	-3/+21
	This patch expands the support of lowerInterleavedStore to 32x8i stride 4. LLVM creates suboptimal shuffle code-gen for AVX2. Overall, this patch is a specific fix for the pattern (Stride=4, VF=32), and we plan to include more patterns in the future. To reach our goal of "more patterns", we include two mask creators. The first creates a shuffle mask equivalent to the unpacklo/unpackhi instructions. The second creates a mask equivalent to a concat of two half vectors (high/low). The goal is to optimize the following sequence: at the end of the computation, ymm2, ymm0, ymm12 and ymm3 each hold 32 chars: [c0, c1, ..., c31] [m0, m1, ..., m31] [y0, y1, ..., y31] [k0, k1, ..., k31]. These need to be transposed/interleaved and stored like so: c0 m0 y0 k0 c1 m1 y1 k1 c2 m2 y2 k2 c3 m3 y3 k3 .... Reviewers: dorit, Farhana, RKSimon, guyblank, DavidKreitzer. Differential Revision: https://reviews.llvm.org/D34601 llvm-svn: 309086
*	Adding base test for interleave store VF16 and expand the test for AVX512	Michael Zuckerman	2017-07-24	1	-0/+15
	This patch doesn't modify any non-test file. llvm-svn: 308909
*	X86InterleaveAccess: A fix for bug33826	Farhana Aleen	2017-07-21	1	-0/+17
	Reviewers: DavidKreitzer Differential Revision: https://reviews.llvm.org/D35638 llvm-svn: 308784
*	[X86][LLVM][test]Expanding Supports lowerInterleavedStore() in ↵	Michael Zuckerman	2017-06-26	1	-0/+17
	X86InterleavedAccess test. Adding base test (to trunk) for store stride=4 vf=32. llvm-svn: 306286
*	Supported lowerInterleavedStore() in X86InterleavedAccess.	Farhana Aleen	2017-06-22	1	-9/+64
	Reviewers: RKSimon, DavidKreitzer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32658 llvm-svn: 306068
*	Added tests for X86InterleavedStore.	Evgeny Stupachenko	2017-06-06	1	-1/+60
	Reviewers: RKSimon, DavidKreitzer Differential Revision: https://reviews.llvm.org/D33684 Patch by: Aleen Farhana <Farhana.aleen@gmail.com> llvm-svn: 304834
*	Add a pass to optimize patterns of vectorized interleaved memory accesses for	David L Kreitzer	2016-10-14	2	-0/+107
	X86. The pass optimizes as a unit the entire wide load + shuffles pattern produced by interleaved vectorization. This initial patch optimizes one pattern (64-bit elements interleaved by a factor of 4). Future patches will generalize to additional patterns. Patch by Farhana Aleen. Differential revision: http://reviews.llvm.org/D24681 llvm-svn: 284260
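
The stride-3 and stride-4 commit messages above describe two element orderings: the store side interleaves several equal-length vectors into one wide sequence, and the load side splits a wide sequence back into its component vectors. A scalar C++ sketch of both, plus an unpacklo/unpackhi-style shuffle mask, may make the orderings easier to follow. This is an illustration only, not code from X86InterleavedAccess: the names interleave, deinterleave, and unpackMask are hypothetical, and unpackMask models a single lane over two concatenated input vectors rather than the per-128-bit-lane behavior of the AVX2 instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Store-side ordering (hypothetical helper): given Stride source vectors of
// equal length VF, emit element 0 of each source, then element 1, and so on.
std::vector<uint8_t>
interleave(const std::vector<std::vector<uint8_t>> &Srcs) {
  std::vector<uint8_t> Out;
  if (Srcs.empty())
    return Out;
  const std::size_t VF = Srcs[0].size();
  for (const auto &S : Srcs)
    assert(S.size() == VF && "all sources must have the same length");
  Out.reserve(VF * Srcs.size());
  for (std::size_t I = 0; I < VF; ++I)
    for (const auto &S : Srcs)
      Out.push_back(S[I]);
  return Out;
}

// Load-side ordering (hypothetical helper): split one wide sequence into
// Stride vectors, where vector J receives elements J, J+Stride, J+2*Stride...
std::vector<std::vector<uint8_t>>
deinterleave(const std::vector<uint8_t> &Wide, std::size_t Stride) {
  assert(Stride != 0 && Wide.size() % Stride == 0);
  std::vector<std::vector<uint8_t>> Out(Stride);
  for (auto &V : Out)
    V.reserve(Wide.size() / Stride);
  for (std::size_t I = 0; I < Wide.size(); ++I)
    Out[I % Stride].push_back(Wide[I]);
  return Out;
}

// Shuffle mask in the style of unpacklo/unpackhi (hypothetical helper).
// Indices 0..NumElts-1 select from the first input, NumElts..2*NumElts-1
// from the second. Lo pairs up the low halves, otherwise the high halves.
// Assumption: one lane only; real AVX2 unpacks work per 128-bit lane.
std::vector<int> unpackMask(std::size_t NumElts, bool Lo) {
  assert(NumElts % 2 == 0 && "element count must be even");
  std::vector<int> Mask;
  Mask.reserve(NumElts);
  const std::size_t Base = Lo ? 0 : NumElts / 2;
  for (std::size_t I = 0; I < NumElts / 2; ++I) {
    Mask.push_back(static_cast<int>(Base + I));           // from first input
    Mask.push_back(static_cast<int>(NumElts + Base + I)); // from second input
  }
  return Mask;
}
```

Calling interleave on three 8-element vectors a, b, c yields the a0 b0 c0 a1 b1 c1 ordering from the stride-3 store message, and deinterleave with Stride = 3 reverses it. The pass itself works on vector shuffles rather than scalar loops; the sketch only pins down the index arithmetic.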