bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	AMDGPU: Start selecting flat instruction offsets	Matt Arsenault	2017-06-12	1	-7/+30
	llvm-svn: 305201
*	AMDGPU: Start adding offset fields to flat instructions	Matt Arsenault	2017-06-12	1	-1/+4
	llvm-svn: 305194
*	Sort the remaining #include lines in include/... and lib/....	Chandler Carruth	2017-06-06	1	-3/+3
	I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787
*	Revert "AMDGPU: Fold CI-specific complex SMRD patterns into existing complex patterns"	Marek Olsak	2017-05-24	1	-12/+24
	This reverts commit e065977c4b5f68ab845400b256f6a3822b1325fa. It doesn't work. S_LOAD_DWORD_IMM_ci and friends aren't selected by any of the patterns, so it was putting 32-bit literals into the 8-bit field. llvm-svn: 303754
*	AMDGPU: Fold CI-specific complex SMRD patterns into existing complex patterns	Marek Olsak	2017-05-23	1	-24/+12
	This is just a cleanup. Also, it adds checking that ByteCount is aligned to 4. Reviewers: arsenm, nhaehnle, tstellarAMD Subscribers: kzhuravl, wdng, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D28994 llvm-svn: 303658
*	AMDGPU: Change mubuf soffset register when SP relative	Matt Arsenault	2017-05-17	1	-13/+51
	Check the MachinePointerInfo for whether the access is supposed to be relative to the stack pointer. No tests because this is used in later commits implementing calls. llvm-svn: 303301
*	AMDGPU: Make better use of op_sel with high components	Matt Arsenault	2017-05-17	1	-8/+48
	Handle more general swizzles. llvm-svn: 303296
*	AMDGPU: Try to use op_sel when selecting packed instructions	Matt Arsenault	2017-05-17	1	-1/+29
	Avoids instructions to pack a vector when the source is really a scalar being broadcast. Also be smarter and look for per-component fneg. Doesn't yet handle scalar from upper half of register or other swizzles. llvm-svn: 303291
*	AMDGPU: Remove tfe bit from flat instruction definitions	Matt Arsenault	2017-05-11	1	-5/+3
	We don't use it and it was removed in gfx9, and the encoding bit repurposed. Additionally actually using it requires changing the output register class, which wasn't done anyway. llvm-svn: 302814
*	Generalize the specialized flag-carrying SDNodes by moving flags into SDNode.	Amara Emerson	2017-05-01	1	-2/+2
	This removes BinaryWithFlagsSDNode, and flags are now all passed by value. Differential Revision: https://reviews.llvm.org/D32527 llvm-svn: 301803
*	[AMDGPU] Garbage collect dead code. NFCI.	Davide Italiano	2017-04-26	1	-10/+0
	llvm-svn: 301375
*	AMDGPU: Clean up VOP3NoMods pattern	Matt Arsenault	2017-04-25	1	-23/+12
	There is no need to copy the operands or inspect the sources. Also remove some unnecessary clamp/omod usage. llvm-svn: 301363
*	AMDGPU: Select scratch mubuf offsets when pointer is a constant	Matt Arsenault	2017-04-24	1	-7/+46
	In call sequence setups, there may not be a frame index base and the pointer is a constant offset from the frame pointer / scratch wave offset register. llvm-svn: 301230
*	AMDGPU: Fix invalid copies when copying i1 to phys reg	Matt Arsenault	2017-04-12	1	-1/+1
	Insert a VReg_1 virtual register so the i1 workaround pass can handle it. llvm-svn: 300113
*	[AMDGPU][MC] Fix for Bug 28207 + LIT tests	Dmitry Preobrazhensky	2017-03-27	1	-0/+15
	Enabled clamp and omod for v_cvt_* opcodes which have src0 of an integer type Reviewers: vpykhtin, arsenm Differential Revision: https://reviews.llvm.org/D31327 llvm-svn: 298852
*	[AMDGPU] Get address space mapping by target triple environment	Yaxun Liu	2017-03-27	1	-5/+8
	As we introduced target triple environment amdgiz and amdgizcl, the address space values are no longer enums. We have to decide the value by target triple. The basic idea is to use struct AMDGPUAS to represent address space values. For address space values which do not depend on the target triple, use static const members, so that they don't occupy extra memory space and are equivalent to compile-time constants. Since the struct is lightweight and cheap, it can be created on the fly at the point of usage. Or it can be added as a member to a pass and created at the beginning of the run* function. Differential Revision: https://reviews.llvm.org/D31284 llvm-svn: 298846
*	AMDGPU: Support v2i16/v2f16 packed operations	Matt Arsenault	2017-02-27	1	-2/+67
	llvm-svn: 296396
*	AMDGPU: Generalize matching of v_med3_f32	Matt Arsenault	2017-01-31	1	-0/+20
	I think this is safe as long as no inputs are known to ever be nans. Also add an intrinsic for fmed3 to be able to handle all safe math cases. llvm-svn: 293598
*	AMDGPU: Make i32 uaddo/usubo legal	Matt Arsenault	2017-01-30	1	-0/+17
	llvm-svn: 293514
*	AMDGPU/SI: Move some ISel helpers into utils so they can be shared with GISel	Tom Stellard	2017-01-27	1	-13/+2
	Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D29068 llvm-svn: 293321
*	AMDGPU: Remove modifiers from v_div_scale_*	Matt Arsenault	2017-01-19	1	-8/+2
	They seem to produce nonsense results when used. This should be applied to the release branch. llvm-svn: 292472
*	AMDGPU/R600: Don't use REGISTER_{LOAD,STORE} ISD nodes	Jan Vesely	2017-01-06	1	-0/+4
	This will make the transition to SCRATCH_MEMORY easier. Differential Revision: https://reviews.llvm.org/D24746 llvm-svn: 291279
*	AMDGPU: Select branch on undef to uniform scc branch	Matt Arsenault	2016-12-15	1	-0/+6
	llvm-svn: 289877
*	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	Eugene Zelenko	2016-12-09	1	-13/+30
	llvm-svn: 289282
*	AMDGPU : Add S_SETREG instructions to fix fdiv precision issues.	Tom Stellard	2016-12-07	1	-0/+38
	Patch By: Wei Ding Summary: This patch fixes the fdiv precision issues. Reviewers: b-sumner, cfang, wdng, arsenm Subscribers: kzhuravl, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D26424 llvm-svn: 288879
*	AMDGPU/SI: Add back reverted SGPR spilling code, but disable it	Marek Olsak	2016-11-25	1	-1/+1
	suggested as a better solution by Matt llvm-svn: 287942
*	Revert "AMDGPU: Make m0 unallocatable"	Marek Olsak	2016-11-25	1	-1/+1
	This reverts commit 124ad83dae04514f943902446520c859adee0e96. llvm-svn: 287932
*	AMDGPU: Make m0 unallocatable	Matt Arsenault	2016-11-24	1	-1/+1
	m0 may need to be written for spill code, so we don't want general code uses relying on the value stored in it. This introduces a few code quality regressions where copies from m0 are not coalesced into copies of a copy of m0. llvm-svn: 287841
*	AMDGPU: Remove unnecessary and on conditional branch	Matt Arsenault	2016-11-07	1	-16/+2
	The comment explaining why this was necessary is incorrect in its description of v_cmp's behavior for inactive workitems. llvm-svn: 286134
*	AMDGPU: Handle CopyToReg in getOperandRegClass	Matt Arsenault	2016-11-01	1	-1/+14
	llvm-svn: 285768
*	AMDGPU: Select 64-bit {ADD,SUB}{C,E} nodes	Nicolai Haehnle	2016-10-14	1	-10/+37
	Summary: This will be used for 64-bit MULHU, which is in turn used for the 64-bit divide-by-constant optimization (see D24822). Reviewers: arsenm, tstellarAMD Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25289 llvm-svn: 284224
*	[AMDGPU] Pass optimization level to SelectionDAGISel	Konstantin Zhuravlyov	2016-10-03	1	-6/+6
	llvm-svn: 283133
*	Use StringRef in Pass/PassManager APIs (NFC)	Mehdi Amini	2016-10-01	1	-2/+2
	llvm-svn: 283004
*	AMDGPU: Fix broken FrameIndex handling	Matt Arsenault	2016-09-17	1	-59/+9
	We were trying to avoid using a FrameIndex operand in non-pointer operands in a convoluted way, and would break because of using TargetFrameIndex. The TargetFrameIndex should only be used in the case where it makes sense to fold it as part of the addressing mode, otherwise it requires materialization like a normal constant. This wasn't working reliably and failed in the added testcase, hitting the assert when processing the frame index. The TargetFrameIndex was coming from trying to produce an AssertZext limiting the maximum stack size. I'm not sure this was correct to begin with, because it is apparently possible to have a single workitem dispatch that requires all 4G of private memory. llvm-svn: 281824
*	AMDGPU: Use i64 scalar compare instructions	Matt Arsenault	2016-09-17	1	-12/+27
	VI added eq/ne for i64, so use them. llvm-svn: 281800
*	AMDGPU: Run LoadStoreVectorizer pass by default	Matt Arsenault	2016-09-09	1	-0/+3
	llvm-svn: 281112
*	MachineFunction: Return reference for getFrameInfo(); NFC	Matthias Braun	2016-07-28	1	-2/+2
	getFrameInfo() never returns nullptr so we should use a reference instead of a pointer. llvm-svn: 277017
*	AMDGPU: Remove analyzeImmediate	Matt Arsenault	2016-07-28	1	-5/+12
	This no longer uses the more complicated classification of constants. llvm-svn: 276945
*	AMDGPU: Unify MOVRELSOffset and MOVRELDOffset	Nicolai Haehnle	2016-07-12	1	-30/+6
	Summary: Previously, constant index insertelements would be turned into SI_INDIRECT_DST, which is bound to prevent some optimization opportunities. Worse, it misled the heuristic that decides whether immediates should be lowered to S_MOV_B32 or V_MOV_B32 in a way that resulted in unnecessary v_readfirstlanes. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D22217 llvm-svn: 275160
*	AMDGPU: Improve offset folding for register indexing	Matt Arsenault	2016-07-09	1	-0/+49
	llvm-svn: 274954
*	AMDGPU/SI: Remove address space query functions from AMDGPUDAGToDAGISel	Tom Stellard	2016-07-05	1	-56/+3
	Summary: These have been replaced with TableGen code (except for isConstantLoad, which is still used for R600). The queries were broken for cases where MemOperand was a PseudoSourceValue. Reviewers: arsenm Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21684 llvm-svn: 274561
*	AMDGPU/R600: Add PatFrags for selecting the correct vtx id for loads	Tom Stellard	2016-07-05	1	-5/+0
	This moves the r600 logic out of isGlobalLoad() and into the TableGen files. Differential Revision: http://reviews.llvm.org/D21710 llvm-svn: 274527
*	AMDGPU/SI: Remove hack for selecting < 32-bit loads to MUBUF instructions	Tom Stellard	2016-07-04	1	-4/+0
	Summary: The isGlobalLoad() query was returning true for constant address space loads with memory types smaller than 32 bits, which is wrong. This logic has been replaced with PatFrag in the TableGen files, to provide the same functionality. Reviewers: arsenm Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21696 llvm-svn: 274521
*	AMDGPU: Cleanup subtarget handling.	Matt Arsenault	2016-06-24	1	-1/+1
	Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. llvm-svn: 273652
*	AMDGPU: Fix gcc warnings	Matt Arsenault	2016-06-22	1	-90/+0
	Mostly removing dead code. Apparently gcc's warning for unused functions is better llvm-svn: 273363
*	Delete more dead code.	Rafael Espindola	2016-06-21	1	-32/+0
	Found by gcc 6. llvm-svn: 273322
*	Delete some dead code.	Rafael Espindola	2016-06-21	1	-5/+0
	Found by gcc 6. llvm-svn: 273303
*	Reformat blank lines.	NAKAMURA Takumi	2016-06-20	1	-1/+0
	llvm-svn: 273131
*	Untabify.	NAKAMURA Takumi	2016-06-20	1	-5/+3
	llvm-svn: 273129
*	AMDGPU: Fix MUBUF offset bugs affecting llvm.amdgcn.buffer.* intrinsics	Nicolai Haehnle	2016-06-15	1	-13/+30
	Summary: This fixes two related bugs. First, the generic optimization passes unfortunately generate negative constant offsets but the hardware treats SOffset as an unsigned value. Second, there is a hardware bug on SI and CI, where address clamping in MUBUF instructions does not work correctly when SOffset is larger than the buffer size. This patch works around this bug by never using SOffset. An alternative workaround would be to do the clamping manually when SOffset is too large, but generating the required code sequence during instruction selection would be rather involved, and in any case the resulting code would probably be worse. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96360 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21326 llvm-svn: 272761