bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	AMDGPU: Split MUBUF offset into aligned components	Nicolai Haehnle	2017-10-10	1	-10/+16
	Summary: Atomic buffer operations do not work (and trap on gfx9) when the components are unaligned, even if their sum is aligned. Previously, we generated an offset of 4156 without an SGPR by splitting it as 4095 + 61 (immediate + inline constant). The highest offset for which we can do this correctly is 4156 = 4092 + 64.
	Fixes dEQP-GLES31.functional.ssbo.atomic.*
	Reviewers: arsenm
	Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
	Differential Revision: https://reviews.llvm.org/D37850
	llvm-svn: 315302
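The arithmetic in this message can be sketched as follows. This is only an illustration, not the code from the commit: the function names are hypothetical, and the constants (a 12-bit immediate field with maximum 4095, a largest inline constant of 64, and 4-byte alignment for each component) are assumptions read off the numbers quoted above.

```python
# Sketch only: names and constants are illustrative assumptions, not LLVM's code.
MAX_IMM = 4095     # assumed: largest value of the immediate offset field
MAX_INLINE = 64    # assumed: largest positive integer inline constant
ALIGN = 4          # assumed: alignment required of each component


def naive_split(offset):
    """Old behaviour described in the message: fill the immediate first."""
    imm = min(offset, MAX_IMM)
    return imm, offset - imm


def aligned_split(offset):
    """Return (imm, inline) with both parts aligned, or None if no such split exists."""
    if offset < 0 or offset % ALIGN != 0:
        return None  # this sketch handles only non-negative, aligned totals
    max_imm = MAX_IMM - MAX_IMM % ALIGN  # largest aligned immediate: 4092
    if offset <= max_imm:
        return offset, 0
    rest = offset - max_imm
    return (max_imm, rest) if rest <= MAX_INLINE else None


print(naive_split(4156))    # (4095, 61): sum is aligned, components are not
print(aligned_split(4156))  # (4092, 64): both components aligned
print(aligned_split(4160))  # None: would need 68, beyond the assumed inline range
```

Under these assumptions, 4156 is the largest total that still splits into two aligned parts, which matches the bound stated in the message.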
*	AMDGPU: Start selecting v_mad_mixlo_f16	Matt Arsenault	2017-09-20	1	-0/+9
	Also add some tests that should be able to use v_mad_mixhi_f16, but do not yet. This is trickier because we don't really model the partial update of the register done by 16-bit instructions.
	llvm-svn: 313806
*	AMDGPU: Match load d16 hi instructions	Matt Arsenault	2017-09-20	1	-6/+6
	Also starts selecting global loads for constant address in some cases. Some end up selecting to mubuf still, which requires investigation. We still get sub-optimal regalloc and extra waitcnts inserted due to not really tracking the liveness of the separate register halves.
	llvm-svn: 313716
*	[AMDGPU] Remove unused function. NFCI.	Davide Italiano	2017-09-08	1	-9/+0
	llvm-svn: 312836
*	AMDGPU: Start selecting v_mad_mix_f32	Matt Arsenault	2017-09-07	1	-5/+96
	llvm-svn: 312732
*	AMDGPU: Fix warnings introduced by r310336	Tom Stellard	2017-08-08	1	-4/+2
	llvm-svn: 310337
*	AMDGPU: Move R600 parts of AMDGPUISelDAGToDAG into their own class	Tom Stellard	2017-08-08	1	-112/+177
	Summary: This refactoring is required in order to split the R600 and GCN tablegen files.
	Reviewers: arsenm
	Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye
	Differential Revision: https://reviews.llvm.org/D36286
	llvm-svn: 310336
*	AMDGPU: Add analysis pass for function argument info	Matt Arsenault	2017-08-03	1	-4/+17
	This will allow only adding necessary inputs to callee functions that need special inputs forwarded from the kernel.
	llvm-svn: 309996
*	AMDGPU: Start selecting global instructions	Matt Arsenault	2017-07-29	1	-3/+17
	llvm-svn: 309470
*	[AMDGPU][MC][GFX9] Added support of VOP3 'op_sel' modifier	Dmitry Preobrazhensky	2017-07-21	1	-0/+44
	See bug 33591: https://bugs.llvm.org//show_bug.cgi?id=33591
	Reviewers: vpykhtin, artem.tamazov, SamWot, arsenm
	Differential Revision: https://reviews.llvm.org/D35424
	llvm-svn: 308740
*	AMDGPU: Rename _RTN atomic instructions	Matt Arsenault	2017-07-20	1	-4/+4
	Move the _RTN to the end of the name. It reads better if the other addressing mode components line up with the non-RTN version. It is also more convenient to define saddr variants of FLAT atomics to have the RTN last, and it is good to have a consistent naming scheme.
	llvm-svn: 308674
*	AMDGPU: Start selecting flat instruction offsets	Matt Arsenault	2017-06-12	1	-7/+30
	llvm-svn: 305201
*	AMDGPU: Start adding offset fields to flat instructions	Matt Arsenault	2017-06-12	1	-1/+4
	llvm-svn: 305194
*	Sort the remaining #include lines in include/... and lib/....	Chandler Carruth	2017-06-06	1	-3/+3
	I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days.
	I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch.
	This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files.
	Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again).
	llvm-svn: 304787
*	Revert "AMDGPU: Fold CI-specific complex SMRD patterns into existing complex patterns"	Marek Olsak	2017-05-24	1	-12/+24
	This reverts commit e065977c4b5f68ab845400b256f6a3822b1325fa.
	It doesn't work. S_LOAD_DWORD_IMM_ci and friends aren't selected by any of the patterns, so it was putting 32-bit literals into the 8-bit field.
	llvm-svn: 303754
*	AMDGPU: Fold CI-specific complex SMRD patterns into existing complex patterns	Marek Olsak	2017-05-23	1	-24/+12
	This is just a cleanup. Also, it adds checking that ByteCount is aligned to 4.
	Reviewers: arsenm, nhaehnle, tstellarAMD
	Subscribers: kzhuravl, wdng, yaxunl, tony-tye
	Differential Revision: https://reviews.llvm.org/D28994
	llvm-svn: 303658
*	AMDGPU: Change mubuf soffset register when SP relative	Matt Arsenault	2017-05-17	1	-13/+51
	Check the MachinePointerInfo for whether the access is supposed to be relative to the stack pointer. No tests because this is used in later commits implementing calls.
	llvm-svn: 303301
*	AMDGPU: Make better use of op_sel with high components	Matt Arsenault	2017-05-17	1	-8/+48
	Handle more general swizzles.
	llvm-svn: 303296
*	AMDGPU: Try to use op_sel when selecting packed instructions	Matt Arsenault	2017-05-17	1	-1/+29
	Avoids instructions to pack a vector when the source is really a scalar being broadcast. Also be smarter and look for per-component fneg. Doesn't yet handle scalar from upper half of register or other swizzles.
	llvm-svn: 303291
*	AMDGPU: Remove tfe bit from flat instruction definitions	Matt Arsenault	2017-05-11	1	-5/+3
	We don't use it and it was removed in gfx9, and the encoding bit repurposed. Additionally actually using it requires changing the output register class, which wasn't done anyway.
	llvm-svn: 302814
*	Generalize the specialized flag-carrying SDNodes by moving flags into SDNode.	Amara Emerson	2017-05-01	1	-2/+2
	This removes BinaryWithFlagsSDNode, and flags are now all passed by value.
	Differential Revision: https://reviews.llvm.org/D32527
	llvm-svn: 301803
*	[AMDGPU] Garbage collect dead code. NFCI.	Davide Italiano	2017-04-26	1	-10/+0
	llvm-svn: 301375
*	AMDGPU: Clean up VOP3NoMods pattern	Matt Arsenault	2017-04-25	1	-23/+12
	There is no need to copy the operands or inspect the sources. Also remove some unnecessary clamp/omod usage.
	llvm-svn: 301363
*	AMDGPU: Select scratch mubuf offsets when pointer is a constant	Matt Arsenault	2017-04-24	1	-7/+46
	In call sequence setups, there may not be a frame index base and the pointer is a constant offset from the frame pointer / scratch wave offset register.
	llvm-svn: 301230
*	AMDGPU: Fix invalid copies when copying i1 to phys reg	Matt Arsenault	2017-04-12	1	-1/+1
	Insert a VReg_1 virtual register so the i1 workaround pass can handle it.
	llvm-svn: 300113
*	[AMDGPU][MC] Fix for Bug 28207 + LIT tests	Dmitry Preobrazhensky	2017-03-27	1	-0/+15
	Enabled clamp and omod for v_cvt_* opcodes which have src0 of an integer type
	Reviewers: vpykhtin, arsenm
	Differential Revision: https://reviews.llvm.org/D31327
	llvm-svn: 298852
*	[AMDGPU] Get address space mapping by target triple environment	Yaxun Liu	2017-03-27	1	-5/+8
	As we introduced target triple environment amdgiz and amdgizcl, the address space values are no longer enums. We have to decide the value by target triple.
	The basic idea is to use struct AMDGPUAS to represent address space values. For address space values that do not depend on the target triple, use static const members, so that they don't occupy extra memory space and are equivalent to compile-time constants.
	Since the struct is lightweight and cheap, it can be created on the fly at the point of usage. Or it can be added as a member to a pass and created at the beginning of the run* function.
	Differential Revision: https://reviews.llvm.org/D31284
	llvm-svn: 298846
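The design described in this message can be illustrated with a small sketch. It is written in Python rather than C++, and everything in it is an assumption made for illustration: the class and function names are hypothetical, the numeric address space values are placeholders rather than the values LLVM uses, and the triple parsing is a simplification. Class-level constants stand in for the static const members, and per-instance fields stand in for the triple-dependent values.

```python
from dataclasses import dataclass
from typing import ClassVar


@dataclass(frozen=True)
class AddressSpaces:
    """Illustrative stand-in for the struct described above (not LLVM's AMDGPUAS)."""

    # Triple-independent values: shared class constants, no per-instance storage.
    GLOBAL: ClassVar[int] = 1   # placeholder value
    LOCAL: ClassVar[int] = 3    # placeholder value

    # Triple-dependent values: decided when the object is created.
    private: int
    flat: int


def address_spaces_for(triple: str) -> AddressSpaces:
    """Pick the triple-dependent values from the environment component."""
    parts = triple.split("-")
    env = parts[3] if len(parts) > 3 else ""
    if env in ("amdgiz", "amdgizcl"):
        return AddressSpaces(private=5, flat=0)   # placeholder values
    return AddressSpaces(private=0, flat=4)       # placeholder values


# Cheap to build at the point of use, as the message suggests.
spaces = address_spaces_for("amdgcn-amd-amdhsa-amdgiz")
print(spaces.flat, AddressSpaces.GLOBAL)
```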
*	AMDGPU: Support v2i16/v2f16 packed operations	Matt Arsenault	2017-02-27	1	-2/+67
	llvm-svn: 296396
*	AMDGPU: Generalize matching of v_med3_f32	Matt Arsenault	2017-01-31	1	-0/+20
	I think this is safe as long as no inputs are known to ever be nans. Also add an intrinsic for fmed3 to be able to handle all safe math cases.
	llvm-svn: 293598
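The message does not spell out which pattern is matched. One identity that makes a median-of-three instruction useful, shown here purely as an assumed illustration, is that a clamp `min(max(x, lo), hi)` with `lo <= hi` equals the median of `x`, `lo`, and `hi`. The helper names below are hypothetical, and NaN inputs are deliberately left out, in line with the caveat in the message.

```python
from itertools import product


def med3(a, b, c):
    """Median of three values (the middle element after sorting)."""
    return sorted((a, b, c))[1]


def clamp(x, lo, hi):
    """Clamp x into [lo, hi] using a max followed by a min."""
    return min(max(x, lo), hi)


# Check the identity on a small grid of NaN-free values.
values = [-2.0, -0.5, 0.0, 0.5, 1.0, 3.0]
for x, lo, hi in product(values, repeat=3):
    if lo <= hi:
        assert clamp(x, lo, hi) == med3(x, lo, hi)

print(clamp(5.0, 0.0, 1.0), med3(5.0, 0.0, 1.0))
```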
*	AMDGPU: Make i32 uaddo/usubo legal	Matt Arsenault	2017-01-30	1	-0/+17
	llvm-svn: 293514
*	AMDGPU/SI: Move some ISel helpers into utils so they can be shared with GISel	Tom Stellard	2017-01-27	1	-13/+2
	Reviewers: arsenm
	Reviewed By: arsenm
	Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
	Differential Revision: https://reviews.llvm.org/D29068
	llvm-svn: 293321
*	AMDGPU: Remove modifiers from v_div_scale_*	Matt Arsenault	2017-01-19	1	-8/+2
	They seem to produce nonsense results when used. This should be applied to the release branch.
	llvm-svn: 292472
*	AMDGPU/R600: Don't use REGISTER_{LOAD,STORE} ISD nodes	Jan Vesely	2017-01-06	1	-0/+4
	This will make transition to SCRATCH_MEMORY easier
	Differential Revision: https://reviews.llvm.org/D24746
	llvm-svn: 291279
*	AMDGPU: Select branch on undef to uniform scc branch	Matt Arsenault	2016-12-15	1	-0/+6
	llvm-svn: 289877
*	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	Eugene Zelenko	2016-12-09	1	-13/+30
	llvm-svn: 289282
*	AMDGPU : Add S_SETREG instructions to fix fdiv precision issues.	Tom Stellard	2016-12-07	1	-0/+38
	Patch By: Wei Ding
	Summary: This patch fixes the fdiv precision issues.
	Reviewers: b-sumner, cfang, wdng, arsenm
	Subscribers: kzhuravl, nhaehnle, yaxunl, tony-tye
	Differential Revision: https://reviews.llvm.org/D26424
	llvm-svn: 288879
*	AMDGPU/SI: Add back reverted SGPR spilling code, but disable it	Marek Olsak	2016-11-25	1	-1/+1
	suggested as a better solution by Matt
	llvm-svn: 287942
*	Revert "AMDGPU: Make m0 unallocatable"	Marek Olsak	2016-11-25	1	-1/+1
	This reverts commit 124ad83dae04514f943902446520c859adee0e96.
	llvm-svn: 287932
*	AMDGPU: Make m0 unallocatable	Matt Arsenault	2016-11-24	1	-1/+1
	m0 may need to be written for spill code, so we don't want general code uses relying on the value stored in it. This introduces a few code quality regressions where copies from m0 are not coalesced into copies of a copy of m0.
	llvm-svn: 287841
*	AMDGPU: Remove unnecessary and on conditional branch	Matt Arsenault	2016-11-07	1	-16/+2
	The comment explaining why this was necessary is incorrect in its description of v_cmp's behavior for inactive workitems.
	llvm-svn: 286134
*	AMDGPU: Handle CopyToReg in getOperandRegClass	Matt Arsenault	2016-11-01	1	-1/+14
	llvm-svn: 285768
*	AMDGPU: Select 64-bit {ADD,SUB}{C,E} nodes	Nicolai Haehnle	2016-10-14	1	-10/+37
	Summary: This will be used for 64-bit MULHU, which is in turn used for the 64-bit divide-by-constant optimization (see D24822).
	Reviewers: arsenm, tstellarAMD
	Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye
	Differential Revision: https://reviews.llvm.org/D25289
	llvm-svn: 284224
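As general background on add-with-carry nodes, the sketch below shows how a 64-bit addition can be assembled from two 32-bit additions linked by a carry bit. It illustrates the carry-chain idea only and is not taken from the commit; the helper names are hypothetical, and inputs are assumed to be unsigned values below 2^64.

```python
MASK32 = 0xFFFFFFFF


def add32_carry(a, b, carry_in=0):
    """32-bit add that returns (result, carry_out)."""
    total = a + b + carry_in
    return total & MASK32, total >> 32


def add64(a, b):
    """64-bit wrapping add built from two 32-bit adds."""
    lo, carry = add32_carry(a & MASK32, b & MASK32)       # low half produces a carry
    hi, _ = add32_carry(a >> 32, b >> 32, carry)          # high half consumes it
    return (hi << 32) | lo


print(hex(add64(0xFFFFFFFF, 1)))           # 0x100000000
print(hex(add64(0xFFFFFFFFFFFFFFFF, 1)))   # 0x0 (wraps around)
```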
*	[AMDGPU] Pass optimization level to SelectionDAGISel	Konstantin Zhuravlyov	2016-10-03	1	-6/+6
	llvm-svn: 283133
*	Use StringRef in Pass/PassManager APIs (NFC)	Mehdi Amini	2016-10-01	1	-2/+2
	llvm-svn: 283004
*	AMDGPU: Fix broken FrameIndex handling	Matt Arsenault	2016-09-17	1	-59/+9
	We were trying to avoid using a FrameIndex operand in non-pointer operands in a convoluted way, and would break because of using TargetFrameIndex. The TargetFrameIndex should only be used in the case where it makes sense to fold it as part of the addressing mode, otherwise it requires materialization like a normal constant.
	This wasn't working reliably and failed in the added testcase, hitting the assert when processing the frame index.
	The TargetFrameIndex was coming from trying to produce an AssertZext limiting the maximum stack size. I'm not sure this was correct to begin with, because it is apparently possible to have a single workitem dispatch that requires all 4G of private memory.
	llvm-svn: 281824
*	AMDGPU: Use i64 scalar compare instructions	Matt Arsenault	2016-09-17	1	-12/+27
	VI added eq/ne for i64, so use them.
	llvm-svn: 281800
*	AMDGPU: Run LoadStoreVectorizer pass by default	Matt Arsenault	2016-09-09	1	-0/+3
	llvm-svn: 281112
*	MachineFunction: Return reference for getFrameInfo(); NFC	Matthias Braun	2016-07-28	1	-2/+2
	getFrameInfo() never returns nullptr so we should use a reference instead of a pointer.
	llvm-svn: 277017
*	AMDGPU: Remove analyzeImmediate	Matt Arsenault	2016-07-28	1	-5/+12
	This no longer uses the more complicated classification of constants.
	llvm-svn: 276945
*	AMDGPU: Unify MOVRELSOffset and MOVRELDOffset	Nicolai Haehnle	2016-07-12	1	-30/+6
	Summary: Previously, constant index insertelements would be turned into SI_INDIRECT_DST, which is bound to prevent some optimization opportunities. Worse, it misled the heuristic that decides whether immediates should be lowered to S_MOV_B32 or V_MOV_B32 in a way that resulted in unnecessary v_readfirstlanes.
	Reviewers: arsenm, tstellarAMD
	Subscribers: arsenm, kzhuravl, llvm-commits
	Differential Revision: http://reviews.llvm.org/D22217
	llvm-svn: 275160