bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	AMDGPU: Set hasSideEffects 0 on _term instructions	Matt Arsenault	2019-03-25	1	-0/+3
\| \| \| \| \| \| \| \|	These were defaulting to true, but they are just wrappers around bit operations. This avoids regressions in the exec mask optimization passes in a future commit. llvm-svn: 356952
*	[AMDGPU] Added v5i32 and v5f32 register classes	Tim Renouf	2019-03-22	1	-0/+22
\| \| \| \| \| \| \| \| \| \|	They are not used by anything yet, but a subsequent commit will start using them for image ops that return 5 dwords. Differential Revision: https://reviews.llvm.org/D58903 Change-Id: I63e1904081e39a6d66e4eb96d51df25ad399d271 llvm-svn: 356735
*	[AMDGPU] Support for v3i32/v3f32	Tim Renouf	2019-03-21	1	-0/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Added support for dwordx3 for most load/store types, but not DS, and not intrinsics yet. SI (gfx6) does not have dwordx3 instructions, so they are not enabled there. Some of this patch is from Matt Arsenault, also of AMD. Differential Revision: https://reviews.llvm.org/D58902 Change-Id: I913ef54f1433a7149da8d72f4af54dbb13436bd9 llvm-svn: 356659
*	[AMDGPU] Asm/disasm v_cndmask_b32_e64 with abs/neg source modifiers	Tim Renouf	2019-03-18	1	-13/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit allows v_cndmask_b32_e64 with abs, neg source modifiers on src0, src1 to be assembled and disassembled. This does appear to be allowed, even though they are floating point modifiers and the operand type is b32. To do this, I added src0_modifiers and src1_modifiers to the MachineInstr, which involved fixing up several places in codegen and mir tests. Differential Revision: https://reviews.llvm.org/D59191 Change-Id: I69bf4a8c73ebc65744f6110bb8fc4e937d79fbea llvm-svn: 356398
*	Revert "AMDGPU/NFC: Cleanup subtarget predicates"	Konstantin Zhuravlyov	2019-02-22	1	-2/+2
\| \| \| \| \| \| \|	It breaks one of our downstream merges, so revert it temporarily while investigating failures downstream llvm-svn: 354700
*	AMDGPU/NFC: Cleanup subtarget predicates	Konstantin Zhuravlyov	2019-02-21	1	-2/+2
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D58522 llvm-svn: 354620
*	AMDGPU: Remove GCN features and predicates	Matt Arsenault	2019-02-08	1	-17/+3
\| \| \| \| \| \| \|	These are no longer necessary since the R600 tablegen files are split out now. llvm-svn: 353548
*	[AMDGPU] Support emitting GOT relocations for function calls	Scott Linder	2019-02-04	1	-15/+5
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D57416 llvm-svn: 353083
*	Update the file headers across all of the LLVM projects in the monorepo	Chandler Carruth	2019-01-19	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
*	AMDGPU: Add a fast path for icmp.i1(src, false, NE)	Marek Olsak	2019-01-15	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This allows moving the condition from the intrinsic to the standard ICmp opcode, so that LLVM can do simplifications on it. The icmp.i1 intrinsic is an identity for retrieving the SGPR mask. The mask can also be obtained from the i1 operations and, or, and xor. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52060 llvm-svn: 351150
*	AMDGPU: Add patterns for v4i16/v4f16 -> v4i16/v4f16 bitcasts	Rhys Perry	2018-12-19	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: arsenm, tstellar Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D55058 llvm-svn: 349694
*	[AMDGPU] Restored selection of scalar_to_vector (v2x16)	Stanislav Mekhanoshin	2018-11-19	1	-9/+9
\| \| \| \| \| \| \| \| \|	This works if DAG combiner is enabled, but without combining we cannot select scalar_to_vector of <2 x half> and <2 x i16>. Differential Revision: https://reviews.llvm.org/D54718 llvm-svn: 347259
*	AMDGPU: Fix analyzeBranch failing with pseudoterminators	Matt Arsenault	2018-11-16	1	-1/+2
\| \| \| \| \| \| \| \| \|	If a block had one of the _term instructions used for gluing exec modifying instructions to the end of the block, analyzeBranch would fail, preventing the verifier from catching a broken successor list. llvm-svn: 347027
*	AMDGPU: Additional pattern for i16 median3 matching	Aakanksha Patil	2018-11-14	1	-4/+17
\| \| \| \| \| \| \| \|	min(max(a, b), max(min(a, b), c)) Differential Revision: https://reviews.llvm.org/D54494 llvm-svn: 346886
*	AMDGPU: Adding more median3 patterns	Aakanksha Patil	2018-11-12	1	-3/+4
\| \| \| \| \| \| \| \|	min(max(a, b), max(min(a, b), c)) -> med3 a, b, c Differential Revision: https://reviews.llvm.org/D54331 llvm-svn: 346704
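The identity behind the two median3 commits above can be checked by brute force. The following is a minimal Python sketch, not the TableGen pattern itself; `med3_reference` and `med3_minmax` are hypothetical helper names, and a small range of plain integers stands in for i16 values.

```python
from itertools import product

def med3_reference(a, b, c):
    """Median of three values, taken as the middle element after sorting."""
    return sorted((a, b, c))[1]

def med3_minmax(a, b, c):
    """The min/max expression quoted in the commit message."""
    return min(max(a, b), max(min(a, b), c))

# Exhaustively compare the two forms over a small range of integers.
values = range(-4, 5)
mismatches = [
    (a, b, c)
    for a, b, c in product(values, repeat=3)
    if med3_reference(a, b, c) != med3_minmax(a, b, c)
]
print(len(mismatches))
```

The expression works because `max(min(a, b), c)` clamps `c` from below by the smaller of `a` and `b`, and the outer `min` clamps the result from above by the larger of the two.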
*	AMDGPU: Remove PHI loop condition optimization	Nicolai Haehnle	2018-10-31	1	-16/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The optimization to early break out of loops if all threads are dead was never fully implemented. But the PHI node analyzing is actually causing a number of problems, so remove all the extra code for it. (This does actually regress code quality in a few places because it ends up relying more heavily on phi's of i1, which we don't do a great job with. However, since it fixes real bugs in the wild, we should take this change. I have some prototype changes to improve i1 lowering in general -- not just for control flow -- which should help recover the code quality, I just need to make those changes fit for general consumption. -- Nicolai) Change-Id: I6fc6c6c8961857ac6009fcfb9f7e5e48dc23fbb1 Patch-by: Christian König <christian.koenig@amd.com> Reviewers: arsenm, rampitec, tpr Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D53359 llvm-svn: 345718
*	DAG: Change behavior of fminnum/fmaxnum nodes	Matt Arsenault	2018-10-22	1	-6/+7
\| \| \| \| \| \| \| \| \| \| \|	Introduce new versions that follow the IEEE semantics to help with legalization that may need quieted inputs. There are some regressions from inserting unnecessary canonicalizes when these are matched from fast math fcmp + select which should be fixed in a future commit. llvm-svn: 344914
*	AMDGPU: Add support pattern for SUB of one bit	Changpeng Fang	2018-10-19	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add selection patterns to support one bit Sub. Reviewers: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D52946 llvm-svn: 344815
*	AMDGPU: Add Selection patterns to support add of one bit.	Changpeng Fang	2018-09-25	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We generate s_xor to lower add of i1s in general cases, and s_not to lower add with a one-bit imm of -1 (true). Reviewers: rampitec Differential Revision: https://reviews.llvm.org/D52518 llvm-svn: 343030
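The arithmetic behind the s_xor and s_not choices can be illustrated with ordinary integers. This is a minimal Python sketch of 1-bit addition only, assuming an i1 is modelled as 0 or 1; `add_i1` is a hypothetical helper and nothing here reproduces the actual selection patterns.

```python
def add_i1(a, b):
    """Add two 1-bit values; the result wraps modulo 2."""
    return (a + b) & 1

for a in (0, 1):
    # With both operands variable, 1-bit addition matches XOR.
    for b in (0, 1):
        assert add_i1(a, b) == a ^ b
    # As an i1, the immediate -1 is the single bit 1 (true),
    # so adding it flips the bit, which is a logical NOT.
    assert add_i1(a, -1 & 1) == (~a & 1)
```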
*	[AMDGPU] Divergence driven instruction selection. Part 1.	Alexander Timofeev	2018-09-21	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This change is the first part of the AMDGPU target description change. Its aim is to split the vector and scalar flows effectively at the selection stage. Selection uses predicate functions based on the framework implemented earlier - https://reviews.llvm.org/D35267 Differential revision: https://reviews.llvm.org/D52019 Reviewers: rampitec llvm-svn: 342719
*	[AMDGPU] Add instruction selection for i1 to f16 conversion	Carl Ritson	2018-09-19	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is required for GPUs with 16 bit instructions where f16 is a legal register type and hence int_to_fp i1 to f16 is not lowered by legalizing. Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52018 Change-Id: Ie4c0fd6ced7cf10ad612023c6879724d9ded5851 llvm-svn: 342558
*	AMDGPU: Fix packing undef parts of build_vector	Matt Arsenault	2018-08-12	1	-2/+21
\| \| \| \|	llvm-svn: 339511
*	AMDGPU: Use SPseudoInst helper	Matt Arsenault	2018-08-01	1	-8/+5
\| \| \| \|	llvm-svn: 338631
*	AMDGPU: Reduce code size with fcanonicalize (fneg x)	Matt Arsenault	2018-07-30	1	-0/+10
\| \| \| \| \| \| \| \|	When fcanonicalize is lowered to a mul, we can use -1.0 for free and avoid the cost of the bigger encoding for source modifiers. llvm-svn: 338244
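A small numeric check shows why the negation can be folded into the multiplier. This is a sketch under stated assumptions: it models the lowered fcanonicalize as a multiply by 1.0 (the commit only says "lowered to a mul"), it uses Python's binary64 floats rather than f16/f32, and it skips NaN inputs. The helper names are hypothetical.

```python
import struct

def bits(x):
    """Return the raw IEEE-754 binary64 bit pattern of a Python float."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def canonicalize_as_mul(x):
    """Model of fcanonicalize lowered to a multiply by 1.0 (assumption)."""
    return x * 1.0

def canonicalize_fneg_folded(x):
    """Same operation with the negation folded into the multiplier."""
    return x * -1.0

samples = [0.0, -0.0, 1.5, -2.25, float("inf"), float("-inf"), 5e-324]
for x in samples:
    # canonicalize(fneg x) and x * -1.0 agree bit for bit, signed zeros included.
    assert bits(canonicalize_as_mul(-x)) == bits(canonicalize_fneg_folded(x))
```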
*	AMDGPU: Fix code size for return_to_epilog pseudo	Matt Arsenault	2018-07-27	1	-0/+1
\| \| \| \|	llvm-svn: 338113
*	AMDGPU: Separate R600 and GCN TableGen files	Tom Stellard	2018-06-28	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We now have two sets of generated TableGen files, one for R600 and one for GCN, so each sub-target now has its own tables of instructions, registers, ISel patterns, etc. This should help reduce compile time since each sub-target now only has to consider information that is specific to itself. This will also help prevent the R600 sub-target from slowing down new features for GCN, like disassembler support, GlobalISel, etc. Reviewers: arsenm, nhaehnle, jvesely Reviewed By: arsenm Subscribers: MatzeB, kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46365 llvm-svn: 335942
*	[AMDGPU] Overload llvm.amdgcn.fmad.ftz to support f16	Stanislav Mekhanoshin	2018-06-28	1	-5/+9
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D48677 llvm-svn: 335866
*	AMDGPU: Add implicit def of SCC to kill and indirect pseudos	Nicolai Haehnle	2018-06-21	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Kill instructions sometimes do use SCC in unusual circumstances, when v_cmpx cannot be used due to the operands that are involved. Additionally, even if SCC was never defined by the expansion, kill pseudos could previously occur between an s_cmp and an s_cbranch_scc, which breaks the SCC liveness tracking when the pseudo is expanded to split the basic block. While it would be possible to explicitly mark the SCC as live-in for the successor basic block, it's simpler to just mark the pseudo as using SCC, so that such a sequence is never emitted by instruction selection in the first place. A similar issue affects indirect source/dest pseudos in principle, although I haven't been able to come up with a test case where it actually matters (this affects instruction selection, so a MIR test can't be used). Fixes: dEQP-GLES3.functional.shaders.discard.dynamic_loop_always Change-Id: Ica8d82ecff1a763b892a1112cf1b06c948863a4f Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D47761 llvm-svn: 335223
*	AMDGPU: Fix scalar_to_vector for v4i16/v4f16	Matt Arsenault	2018-06-20	1	-0/+10
\| \| \| \|	llvm-svn: 335161
*	AMDGPU: Make v4i16/v4f16 legal	Matt Arsenault	2018-06-15	1	-0/+41
\| \| \| \| \| \| \|	Some image loads return these, and it's awkward working around them not being legal. llvm-svn: 334835
*	AMDGPU: Use scalar operations for f16 fabs/fneg patterns	Matt Arsenault	2018-06-07	1	-7/+7
\| \| \| \| \| \|	Fixes unnecessary differences between subtargets. llvm-svn: 334184
*	AMDGPU: Custom lower v2f16 fneg/fabs with illegal f16	Matt Arsenault	2018-06-06	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \|	Fixes terrible code on targets without f16 support. The legalization creates a mess that is difficult to recover from. Also should avoid randomly breaking these tests multiple times in sequence in future commits. Some regressions in cases where it happens to be better to pull the source modifier after the conversion. llvm-svn: 334132
*	AMDGPU: Fix v2f16 fneg/fabs pattern	Matt Arsenault	2018-05-22	1	-0/+5
\| \| \| \| \| \| \| \| \| \|	The integer operation conversion for some reason only happens if the source is a bitcast from an integer, which happens to always be the situation when the result is loaded. Add an additional pattern for when the source operation is really an FP operation. llvm-svn: 333019
*	AMDGPU: Make v2i16/v2f16 legal on VI	Matt Arsenault	2018-05-22	1	-5/+15
\| \| \| \| \| \| \| \| \| \| \| \|	This usually results in better code. Fixes using inline asm with short2, and also fixes having a different ABI for function parameters between VI and gfx9. Partially cleans up the mess used for lowering of the d16 operations. Making v4f16 legal will help clean this up more, but this requires additional work. llvm-svn: 332953
*	AMDGPU: Add Vega12 and Vega20	Matt Arsenault	2018-04-30	1	-0/+10
\| \| \| \| \| \| \| \|	Changes by Matt Arsenault and Konstantin Zhuravlyov llvm-svn: 331215
*	AMDGPU: Consolidate SubtargetPredicate definitions	Matt Arsenault	2018-04-26	1	-7/+0
\| \| \| \|	llvm-svn: 330979
*	AMDGPU: Remove deprecated llvm.AMDGPU.kilp intrinsic	Tom Stellard	2018-04-24	1	-5/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is no longer used by mesa since its 18.0.0 release. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D45988 llvm-svn: 330775
*	[AMDGPU][MC][GFX8][GFX9][DISASSEMBLER] Added "_e32" suffix to 32-bit VINTRP opcodes	Dmitry Preobrazhensky	2018-03-16	1	-6/+9
\| \| \| \| \| \| \| \| \| \| \|	See bug 36751: https://bugs.llvm.org/show_bug.cgi?id=36751 Differential Revision: https://reviews.llvm.org/D44529 Reviewers: artem.tamazov, arsenm llvm-svn: 327723
*	AMDGPU/GCN: Promote i16 ctpop	Jan Vesely	2018-03-02	1	-0/+4
\| \| \| \| \| \| \| \| \|	i16 capable ASICs do not support i16 operands for this instruction. Add tablegen pattern to merge chained i16 additions. Differential Revision: https://reviews.llvm.org/D43985 llvm-svn: 326535
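Promoting a 16-bit population count to a 32-bit one is only safe if the upper bits stay clear. The sketch below illustrates that point in Python; it assumes the promotion zero-extends the operand, and `ctpop32`, `ctpop16_promoted`, and `sext16_to_32` are hypothetical helpers rather than anything from the patch.

```python
def ctpop32(x):
    """Population count of a 32-bit value."""
    return bin(x & 0xFFFFFFFF).count("1")

def ctpop16_promoted(x):
    """Count the set bits of a 16-bit value using the 32-bit operation."""
    zext = x & 0xFFFF  # zero-extension keeps the upper 16 bits clear
    return ctpop32(zext)

def sext16_to_32(x):
    """Sign-extend a 16-bit value to 32 bits (shown only for contrast)."""
    x &= 0xFFFF
    return x | 0xFFFF0000 if x & 0x8000 else x

# Zero-extension preserves the count; sign-extension would add 16 bogus bits.
print(ctpop16_promoted(0x8000), ctpop32(sext16_to_32(0x8000)))
```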
*	AMDGPU: Remove tied operand from si_else	Matt Arsenault	2018-02-09	1	-1/+0
\| \| \| \|	llvm-svn: 324751
*	AMDGPU: Select BFI patterns with 64-bit ints	Matt Arsenault	2018-02-07	1	-1/+2
\| \| \| \|	llvm-svn: 324431
*	AMDGPU: Fix missing SCC def from s_xor_b64_term	Matt Arsenault	2018-01-31	1	-0/+1
\| \| \| \|	llvm-svn: 323927
*	AMDGPU: Use gfx9 carry-less add/sub instructions	Matt Arsenault	2017-11-30	1	-4/+8
\| \| \| \|	llvm-svn: 319491
*	[AMDGPU][MC][GFX8][GFX9] Corrected names of integer v_{add/addc/sub/subrev/subb/subbrev}	Dmitry Preobrazhensky	2017-11-20	1	-44/+0
\| \| \| \| \| \| \| \| \| \| \| \|	See bug 34765: https://bugs.llvm.org//show_bug.cgi?id=34765 Reviewers: tamazov, SamWot, arsenm, vpykhtin Differential Revision: https://reviews.llvm.org/D40088 llvm-svn: 318675
*	AMDGPU: Replace i64 add/sub lowering	Matt Arsenault	2017-11-15	1	-2/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use VOP3 add/addc like usual. This has some tradeoffs. Inline immediates fold a little better, but other constants are worse off. SIShrinkInstructions could be made smarter to handle these cases. This allows us to avoid selecting scalar adds where we need to track the carry in scc and replace its users. This makes it easier to use the carryless VALU adds. llvm-svn: 318340
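The add/addc split mentioned above is the usual way to build a 64-bit addition from two 32-bit halves. The following Python sketch shows only the arithmetic, assuming unsigned 32-bit halves and a single carry bit; `add_i64_split` is a hypothetical name and the code says nothing about how the backend allocates registers or tracks the carry.

```python
MASK32 = 0xFFFFFFFF

def add_i64_split(a, b):
    """Add two 64-bit values using two 32-bit adds linked by a carry bit."""
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    lo_sum = a_lo + b_lo
    lo = lo_sum & MASK32
    carry = lo_sum >> 32                 # "add": produces the carry-out (0 or 1)

    hi = (a_hi + b_hi + carry) & MASK32  # "addc": consumes the carry-in
    return (hi << 32) | lo

# The split form agrees with a plain 64-bit wrapping add.
a, b = 0x00000001FFFFFFFF, 0x0000000000000001
assert add_i64_split(a, b) == (a + b) & 0xFFFFFFFFFFFFFFFF
```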
*	AMDGPU: Add new intrinsic llvm.amdgcn.kill(i1)	Marek Olsak	2017-10-24	1	-10/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Kill the thread if operand 0 == false. llvm.amdgcn.wqm.vote can be applied to the operand. Also allow kill in all shader stages. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D38544 llvm-svn: 316427
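The "kill if operand 0 == false" rule can be pictured with per-lane bit masks. This is a simplified Python model, not the intrinsic's implementation: it assumes one bit per lane in both the execution mask and the condition, and `apply_kill` is a hypothetical helper.

```python
def apply_kill(exec_mask, cond_mask):
    """Return the execution mask after kill(cond).

    Bit i of each mask belongs to lane i. A lane stays alive only if it
    was already active and its condition bit is true; lanes whose
    condition is false are removed.
    """
    return exec_mask & cond_mask

# Four active lanes; lanes 1 and 3 have a false condition and are killed.
remaining = apply_kill(0b1111, 0b0101)
print(bin(remaining))
```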
*	AMDGPU: Fix incorrect selection of pseudo-branches	Matt Arsenault	2017-10-10	1	-0/+2
\| \| \| \| \| \|	These should only be used if the machine structurizer is enabled. llvm-svn: 315357
*	AMDGPU: Remove global isGCN predicates	Matt Arsenault	2017-10-03	1	-147/+154
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	These are problematic because they apply to everything, and can easily clobber whatever more specific predicate you are trying to add to a function. Currently instructions use SubtargetPredicate/PredicateControl to apply this to patterns applied to an instruction definition, but not to free standing Pats. Add a wrapper around Pat so the special PredicateControls requirements can be appended to the final predicate list like how Mips does it. llvm-svn: 314742
*	[AMDGPU] Use v_pk_max_f16 for fcanonicalize	Stanislav Mekhanoshin	2017-09-06	1	-5/+10
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D37325 llvm-svn: 312676
*	[AMDGPU] Fixed encoding of v_pk_mul_f16 in fcanonicalize	Stanislav Mekhanoshin	2017-09-06	1	-1/+1
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D37522 llvm-svn: 312660