bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	AMDGPU/GlobalISel: Select mul24 intrinsics	Matt Arsenault	2019-12-30	1	-2/+10
*	AMDGPU/GlobalISel: Select llvm.amdgcn.fmad.ftz	Matt Arsenault	2019-12-30	1	-1/+5
*	AMDGPU/GlobalISel: Fix mapping and selection of llvm.amdgcn.div.fixup	Matt Arsenault	2019-12-24	1	-1/+5
*	AMDGPU: Select basic interp directly from intrinsics	Matt Arsenault	2019-10-21	1	-12/+0
	llvm-svn: 375457
*	AMDGPU/GlobalISel: Select cvt pk intrinsics	Matt Arsenault	2019-09-10	1	-5/+25
	llvm-svn: 371539
*	AMDGPU/GlobalISel: Select llvm.amdgcn.sffbh	Matt Arsenault	2019-09-10	1	-1/+5
	llvm-svn: 371538
*	AMDGPU/GlobalISel: Select llvm.amdgcn.class	Matt Arsenault	2019-09-09	1	-1/+5
	Also fixes missing SubtargetPredicate on f16 class instructions.
	llvm-svn: 371436
*	AMDGPU/GlobalISel: Select fmed3	Matt Arsenault	2019-09-09	1	-1/+5
	llvm-svn: 371435
*	AMDGPU: Use PatFrags to allow selecting custom nodes or intrinsics	Matt Arsenault	2019-09-09	1	-10/+39
	This enables GlobalISel to handle various intrinsics. The custom node pattern will be ignored, and the intrinsic will work. This will also allow SelectionDAG to directly select the intrinsics, but as they are all custom lowered to the nodes, this ends up leaving dead code in the table. Eventually either GlobalISel should add the equivalent of custom nodes, or intrinsics should be used directly. These each have different tradeoffs. There are a few more to handle, but these are the easy ones. Some others fail for other reasons.
	llvm-svn: 371432
*	AMDGPU: Remove pointless wrapper nodes for init.exec intrinsics	Matt Arsenault	2019-09-09	1	-9/+0
	llvm-svn: 371364
*	AMDGPU: Remove unused custom node definition	Matt Arsenault	2019-09-01	1	-8/+0
	llvm-svn: 370603
*	[AMDGPU] gfx1010 core wave32 changes	Stanislav Mekhanoshin	2019-06-20	1	-4/+4
	Differential Revision: https://reviews.llvm.org/D63204
	llvm-svn: 363934
*	[AMDGPU] gfx1010 AMDGPUSetCCOp definition	Stanislav Mekhanoshin	2019-06-13	1	-1/+1
	It was missing from D63293, and a debug build of tablegen breaks without this part.
	llvm-svn: 363323
*	[AMDGPU] gfx1010 allows VOP3 to have a literal	Stanislav Mekhanoshin	2019-05-02	1	-11/+2
	Differential Revision: https://reviews.llvm.org/D61413
	llvm-svn: 359756
*	[AMDGPU] Support emitting GOT relocations for function calls	Scott Linder	2019-02-04	1	-3/+2
	Differential Revision: https://reviews.llvm.org/D57416
	llvm-svn: 353083
*	[AMDGPU] Add intrinsics for 16 bit interpolation	Tim Corringham	2019-01-28	1	-0/+11
	Summary: Added the intrinsics llvm.amdgcn.interp.p1.f16() and llvm.amdgcn.interp.p2.f16() and related LIT test. The p1 intrinsic generates code appropriate for both 16 and 32 bank LDS.
	Reviewers: #amdgpu, dstuttard, arsenm, tpr
	Reviewed By: #amdgpu, arsenm
	Subscribers: jvesely, mgorny, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
	Differential Revision: https://reviews.llvm.org/D46754
	llvm-svn: 352357
*	Update the file headers across all of the LLVM projects in the monorepo	Chandler Carruth	2019-01-19	1	-4/+3
	to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
	llvm-svn: 351636
*	AMDGPU: Remove PHI loop condition optimization	Nicolai Haehnle	2018-10-31	1	-8/+0
	Summary: The optimization to early break out of loops if all threads are dead was never fully implemented. But the PHI node analyzing is actually causing a number of problems, so remove all the extra code for it.
	(This does actually regress code quality in a few places because it ends up relying more heavily on phi's of i1, which we don't do a great job with. However, since it fixes real bugs in the wild, we should take this change. I have some prototype changes to improve i1 lowering in general -- not just for control flow -- which should help recover the code quality, I just need to make those changes fit for general consumption. -- Nicolai)
	Change-Id: I6fc6c6c8961857ac6009fcfb9f7e5e48dc23fbb1
	Patch-by: Christian König <christian.koenig@amd.com>
	Reviewers: arsenm, rampitec, tpr
	Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, llvm-commits
	Differential Revision: https://reviews.llvm.org/D53359
	llvm-svn: 345718
*	AMDGPU: Add clamp bit to dot intrinsics	Konstantin Zhuravlyov	2018-08-01	1	-2/+3
	Differential Revision: https://reviews.llvm.org/D49874
	llvm-svn: 338470
*	[AMDGPU] [AMDGPU] Support a fdot2 pattern.	Farhana Aleen	2018-07-16	1	-0/+5
	Summary: Optimize fma((float)S0.x, (float)S1.x, fma((float)S0.y, (float)S1.y, z)) -> fdot2((v2f16)S0, (v2f16)S1, (float)z)
	Author: FarhanaAleen
	Reviewed By: rampitec, b-sumner
	Subscribers: AMDGPU
	Differential Revision: https://reviews.llvm.org/D49146
	llvm-svn: 337198
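The rewrite named in this commit message can be illustrated with a scalar model. The following is a minimal C++ sketch and not code from the patch: `Half2`, `fmaChain`, and `fdot2Model` are hypothetical names, the lanes are stored as `float` on the assumption that they hold values exactly representable in half precision, and the model makes no claim about the hardware's intermediate rounding.

```cpp
#include <cmath>

// Hypothetical stand-in for a v2f16 value. Lanes are stored as float and are
// assumed to hold values that are exactly representable in half precision.
struct Half2 {
    float x;
    float y;
};

// The source pattern from the commit message:
// fma((float)S0.x, (float)S1.x, fma((float)S0.y, (float)S1.y, z))
float fmaChain(Half2 s0, Half2 s1, float z) {
    return std::fma(s0.x, s1.x, std::fma(s0.y, s1.y, z));
}

// Illustrative model of the replacement: a two-element dot product plus an
// accumulator. Intermediate rounding on real hardware may differ.
float fdot2Model(Half2 s0, Half2 s1, float z) {
    return s0.x * s1.x + s0.y * s1.y + z;
}
```

For inputs whose products and sums are exactly representable, such as S0 = (1, 2), S1 = (3, 4), z = 0.5, both functions compute 1*3 + 2*4 + 0.5 = 11.5.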
*	[AMDGPU] Convert rcp to rcp_iflag	Stanislav Mekhanoshin	2018-06-27	1	-0/+2
	If the source of an rcp instruction is the result of any conversion from an integer, convert it into an rcp_iflag instruction. No FP exception can ever happen except division by zero if a single precision rcp argument is a representation of an integral number.
	Differential Revision: https://reviews.llvm.org/D48569
	llvm-svn: 335742
*	[AMDGPU] DAG combine to produce V_PERM_B32	Stanislav Mekhanoshin	2018-06-12	1	-0/+2
	Differential Revision: https://reviews.llvm.org/D48099
	llvm-svn: 334559
*	AMDGPU/R600: Remove code for handling AMDGPUISD::CLAMP	Tom Stellard	2018-05-24	1	-2/+0
	Summary: We don't generate AMDGPUISD::CLAMP for R600 now that llvm.AMDGPU.clamp is gone.
	Reviewers: arsenm, nhaehnle
	Reviewed By: arsenm
	Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
	Differential Revision: https://reviews.llvm.org/D47181
	llvm-svn: 333153
*	AMDGPU: Add intrinsics llvm.amdgcn.cvt.{pknorm.i16, pknorm.u16, pk.i16, pk.u16}	Marek Olsak	2018-01-31	1	-0/+8
	Reviewers: arsenm, nhaehnle
	Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye
	Differential Revision: https://reviews.llvm.org/D41663
	llvm-svn: 323908
*	Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ.	Wei Ding	2017-10-12	1	-0/+2
	Differential Revision: http://reviews.llvm.org/D37348
	llvm-svn: 315610
*	AMDGPU: Start adding tail call support	Matt Arsenault	2017-08-11	1	-0/+6
	Handle the sibling call cases.
	llvm-svn: 310753
*	AMDGPU: Initial implementation of calls	Matt Arsenault	2017-08-01	1	-0/+16
	Includes a hack to fix the type selected for the GlobalAddress of the function, which will be fixed by changing the default datalayout to use generic pointers for address space 0.
	llvm-svn: 309732
*	[AMDGPU] simplify add x, *ext (setcc) => addc\|subb x, 0, setcc	Stanislav Mekhanoshin	2017-06-21	1	-0/+10
	This simplification avoids generating v_cndmask_b32 to serialize the condition code between compare and use.
	Differential Revision: https://reviews.llvm.org/D34300
	llvm-svn: 305962
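The identity behind this combine can be checked with a small scalar model. This is a minimal C++ sketch under stated assumptions, not the DAG combine itself: `addc`, `subb`, `addZext`, and `addSext` are hypothetical helper names that model 32-bit wrapping arithmetic, with the compare result passed in as a `bool`.

```cpp
#include <cstdint>

// Model of add-with-carry-in: x + y + carry, wrapping modulo 2^32.
uint32_t addc(uint32_t x, uint32_t y, bool carryIn) {
    return x + y + (carryIn ? 1u : 0u);
}

// Model of subtract-with-borrow-in: x - y - borrow, wrapping modulo 2^32.
uint32_t subb(uint32_t x, uint32_t y, bool borrowIn) {
    return x - y - (borrowIn ? 1u : 0u);
}

// Before the combine: the condition is materialized as 0 or 1 (zext),
// then added as an ordinary operand.
uint32_t addZext(uint32_t x, bool cc) {
    uint32_t materialized = cc ? 1u : 0u;
    return x + materialized;
}

// Before the combine: the condition is materialized as 0 or all ones (sext),
// then added as an ordinary operand.
uint32_t addSext(uint32_t x, bool cc) {
    uint32_t materialized = cc ? 0xFFFFFFFFu : 0u;
    return x + materialized;
}
```

In this model `addZext(x, cc)` equals `addc(x, 0, cc)` and `addSext(x, cc)` equals `subb(x, 0, cc)` for every `x`, which is why the condition never has to be turned into a 0/1 or 0/-1 register value first.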
*	AMDGPU: Start defining a calling convention	Matt Arsenault	2017-05-17	1	-1/+1
	Partially implement the callee side for arguments and return values. byval doesn't work properly, and most likely neither do sret or other on-stack return values.
	llvm-svn: 303308
*	AMDGPU: Add new amdgcn.init.exec intrinsics	Marek Olsak	2017-04-28	1	-0/+9
	v2: More tests, bug fixes, cosmetic changes.
	Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye
	Differential Revision: https://reviews.llvm.org/D31762
	llvm-svn: 301677
*	AMDGPU: Move trap lowering to DAG	Matt Arsenault	2017-04-24	1	-0/+5
	Fixes traps in any block besides the entry block, and fixes depending on a live-in physical register by using a virtual register copy. Also happens to stop emitting a nop in the case debug trap is not supported.
	llvm-svn: 301206
*	AMDGPU: Remove unnecessary ands when f16 is legal	Matt Arsenault	2017-03-31	1	-0/+1
	Add a new node to act as a fancy bitcast from f16 operations to i32 that implicitly zero the high 16-bits of the result. Alternatively could try making v2f16 legal and canonicalizing on build_vectors.
	llvm-svn: 299246
*	AMDGPU: Rename SI_RETURN	Matt Arsenault	2017-03-21	1	-1/+5
	This is used for a specific type of return to a shader part's epilog code. Rename it to avoid confusion with a true call's return.
	llvm-svn: 298452
*	AMDGPU: Cleanup control flow intrinsics	Matt Arsenault	2017-03-17	1	-0/+28
	Move backend internal intrinsics along with the rest of the normal intrinsics, and use the Intrinsic::getDeclaration API instead of manually constructing the type list. It's surprising this was working before. fdiv.fast had the wrong number of parameters. The control flow intrinsic declaration attributes were not being applied, and their types were inconsistent. The actual IR use types did not match the declaration, and were closer to the types used for the patterns. The brcond lowering was changing the types, so introduce new nodes for those.
	llvm-svn: 298119
*	AMDGPU: Fix unnecessary ands when packing f16 vectors	Matt Arsenault	2017-03-15	1	-0/+2
	computeKnownBits didn't handle fp_to_fp16 to report the high bits as 0. ARM maps the generic node to an instruction that does not modify the high bits of the register, so introduce a target node where the high bits are known 0.
	llvm-svn: 297873
*	AMDGPU : Replace FMAD with FMA when denormals are enabled.	Wei Ding	2017-02-24	1	-0/+2
	Differential Revision: http://reviews.llvm.org/D29958
	llvm-svn: 296186
*	AMDGPU: Add cvt.pkrtz intrinsic	Matt Arsenault	2017-02-22	1	-0/+6
	Convert llvm.SI.packf16 test uses
	llvm-svn: 295797
*	AMDGPU: Redefine clamp node as clamp 0.0-1.0	Matt Arsenault	2017-02-21	1	-1/+1
	Change implementation to use max instead of add. min/max/med3 do not flush denormals regardless of the mode, so it is OK to use it whether or not they are enabled. Also allow using clamp with f16, and use knowledge of dx10_clamp.
	llvm-svn: 295788
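As a rough illustration of a 0.0-1.0 clamp built from min and max operations, here is a minimal C++ sketch. It is not the backend's lowering: `clamp01` and `med3` are hypothetical helpers over host `float`, denormal handling is left to the host, and the NaN-to-zero result of `clamp01` is an assumption about how dx10_clamp behaves, not something stated in the commit message.

```cpp
#include <cmath>

// Clamp to [0.0, 1.0] built from max followed by min. std::fmax returns the
// non-NaN operand when exactly one operand is NaN, so a NaN input yields 0.0
// here; that this matches dx10_clamp is an assumption of this sketch.
float clamp01(float x) {
    return std::fmin(std::fmax(x, 0.0f), 1.0f);
}

// Median of three values expressed with min and max only.
// NaN inputs are not modeled by this helper.
float med3(float a, float b, float c) {
    return std::fmax(std::fmin(a, b), std::fmin(std::fmax(a, b), c));
}
```

For any non-NaN `x`, `clamp01(x)` and `med3(x, 0.0f, 1.0f)` agree, so either formulation expresses the same clamp.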
*	AMDGPU: Remove dead node definitions	Matt Arsenault	2017-02-15	1	-10/+0
	llvm-svn: 295247
*	AMDGPU/R600: Serialize vector trunc stores to private AS	Jan Vesely	2017-01-20	1	-0/+3
	Add DUMMY_CHAIN SDNode to denote stores of interest
	Bugzilla: https://llvm.org/bugs/show_bug.cgi?id=28915
	Bugzilla: https://llvm.org/bugs/show_bug.cgi?id=30411
	Differential Revision: https://reviews.llvm.org/D27964
	llvm-svn: 292651
*	AMDGPU: Add replacement export intrinsics	Matt Arsenault	2017-01-17	1	-8/+9
	llvm-svn: 292205
*	AMDGPU/SI: Implement sendmsghalt intrinsic	Jan Vesely	2017-01-04	1	-0/+4
	v2: expose using amdgcn prefix
	Differential Revision: https://reviews.llvm.org/D23511
	llvm-svn: 290977
*	AMDGPU : Add S_SETREG instructions to fix fdiv precision issues.	Tom Stellard	2016-12-07	1	-0/+13
	Patch By: Wei Ding
	Summary: This patch fixes the fdiv precision issues.
	Reviewers: b-sumner, cfang, wdng, arsenm
	Subscribers: kzhuravl, nhaehnle, yaxunl, tony-tye
	Differential Revision: https://reviews.llvm.org/D26424
	llvm-svn: 288879
*	AMDGPU: Refactor exp instructions	Matt Arsenault	2016-12-05	1	-0/+26
	Structure the definitions a bit more like the other classes. The main change here is to split EXP with the done bit set to a separate opcode, so we can set mayLoad = 1 so that it won't be reordered before the other exp stores, since this has the special constraint that if the done bit is set then this should be the last exp in the shader. Previously all exp instructions were inferred to have unmodeled side effects.
	llvm-svn: 288695
*	AMDGPU: Select mulhi 24-bit instructions	Matt Arsenault	2016-08-27	1	-4/+11
	llvm-svn: 279902
*	AMDGPU : Add intrinsics for compare with the full wavefront result	Wei Ding	2016-07-28	1	-0/+5
	Differential Revision: http://reviews.llvm.org/D22482
	llvm-svn: 276998
*	AMDGPU: Add fp legacy instruction intrinsics	Matt Arsenault	2016-07-26	1	-0/+5
	This could use some additional optimization work to use mad/mac legacy.
	llvm-svn: 276764
*	AMDGPU: Only use legal inline immediates with kill pseudo	Matt Arsenault	2016-07-19	1	-0/+5
	Only whether the value is negative or positive matters, so use a constant that doesn't require an instruction to materialize. These should really just emit the write exec directly, but for now stick with the kill pseudo-terminator.
	llvm-svn: 275988
*	AMDGPU: Add intrinsic for s_flbit_i32/v_ffbh_i32	Matt Arsenault	2016-07-18	1	-0/+1
	llvm-svn: 275871
*	AMDGPU: Fix verifier errors in SILowerControlFlow	Matt Arsenault	2016-06-22	1	-1/+4
	The main sin this was committing was using terminator instructions in the middle of the block, and then not updating the block successors / predecessors. Split the blocks up to avoid this and introduce new pseudo instructions for branches taken with exec masking. Also use a pseudo instead of emitting s_endpgm and erasing it in the special case of a non-void return.
	llvm-svn: 273467