bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	AMDGPU: Assume call pseudos are convergent	Matt Arsenault	2019-05-21	1	-0/+6
	There should probably be nonconvergent versions, but my guess is it doesn't matter in practice. llvm-svn: 361331
*	AMDGPU: Fix not marking new gfx10 SGPRs as CSRs	Matt Arsenault	2019-05-21	1	-3/+3
	llvm-svn: 361330
*	AMDGPU: Force skip branches over calls	Matt Arsenault	2019-05-20	1	-1/+1
	Unfortunately the way SIInsertSkips works is backwards, and is required for correctness. r338235 added handling of some special cases where skipping is mandatory to avoid side effects if no lanes are active. It conservatively handled asm correctly, but the same logic needs to apply to calls. Usually the call sequence code is larger than the skip threshold, although the way the count is computed is really broken, so I'm not sure if anything was likely to really hit this. llvm-svn: 361202
*	[AMDGPU] Fix std::array initializers to avoid warnings with older tool chains. NFC	Bjorn Pettersson	2019-05-20	1	-2/+2
	A std::array is implemented as a template with an array inside a struct. Older versions of clang, like 3.6, require an extra set of curly braces around std::array initializations to avoid warnings. The C++ language was changed regarding this by CWG 1270, so more modern tool chains do not complain even if one level of braces is left out. llvm-svn: 361171
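The brace issue described in this commit message can be illustrated with a minimal sketch. The function names and values below are hypothetical and are not taken from the commit; the sketch only shows the two spellings, which produce the same object.

```cpp
#include <array>
#include <cassert>

// Hypothetical example, not code from the commit.
// Brace-elided form: one level of braces. Modern tool chains accept this
// quietly; older ones (the commit mentions clang 3.6) may warn about
// missing braces.
std::array<int, 3> elidedBraces() {
  std::array<int, 3> a{1, 2, 3};
  return a;
}

// Fully braced form: the outer braces initialize the std::array struct,
// the inner braces initialize the built-in array it contains.
std::array<int, 3> fullBraces() {
  std::array<int, 3> a{{1, 2, 3}};
  return a;
}
```

Both functions return equal arrays; the second spelling is simply the one that stays warning-free on the older compilers.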
*	R600: Fix unconditional return in loop	Matt Arsenault	2019-05-20	1	-10/+5
	llvm-svn: 361167
*	Use llvm::sort. NFC	Fangrui Song	2019-05-20	1	-1/+1
	llvm-svn: 361134
*	[AMDGPU] gfx1010 Avoid SMEM WAR hazard for some s_waitcnt values	Carl Ritson	2019-05-20	1	-6/+22
	Summary: Avoid introducing hazard mitigation when lgkmcnt is reduced to 0. Clarify code comments to explain assumptions made for this hazard mitigation. Expand and correct test cases to cover variants of s_waitcnt. Reviewers: nhaehnle, rampitec Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62058 llvm-svn: 361124
*	AMDGPU/GlobalISel: Implement s64->s64 [SU]ITOFP	Matt Arsenault	2019-05-17	2	-0/+39
	llvm-svn: 361082
*	GlobalISel: Implement lower for S64->S32 [SU]ITOFP	Matt Arsenault	2019-05-17	1	-0/+1
	This is ported from the custom AMDGPU DAG implementation. I think this is a better default expansion than what the DAG currently uses, at least if the target has CTLZ. This implements the signed version in terms of the unsigned conversion, which is implemented with bit operations. SelectionDAG has several other implementations that should eventually be ported depending on what instructions are legal. llvm-svn: 361081
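The idea in this message (an unsigned 64-bit to 32-bit float conversion built from a leading-zero count and bit operations, with the signed case layered on top) can be sketched in plain C++. This is not the legalizer code from the commit: every function name is hypothetical, the leading-zero loop is a portable stand-in for a CTLZ instruction, and round-to-nearest-even is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Portable stand-in for CTLZ; only defined for nonzero input in this sketch.
int countLeadingZeros64(uint64_t x) {
  assert(x != 0 && "leading-zero count of zero is not handled here");
  int n = 0;
  while ((x & (1ULL << 63)) == 0) {
    x <<= 1;
    ++n;
  }
  return n;
}

// Unsigned 64-bit integer -> IEEE-754 binary32 bit pattern,
// assuming round-to-nearest-even.
uint32_t uitofp64To32Bits(uint64_t x) {
  if (x == 0)
    return 0;
  int lz = countLeadingZeros64(x);
  uint64_t norm = x << lz;                      // leading one now at bit 63
  uint32_t exponent = static_cast<uint32_t>(127 + 63 - lz);
  uint32_t mantissa = static_cast<uint32_t>((norm >> 40) & 0x7FFFFF);
  uint64_t rest = norm & 0xFFFFFFFFFFULL;       // the 40 bits being dropped
  uint32_t bits = (exponent << 23) | mantissa;
  const uint64_t half = 1ULL << 39;
  // Round up above the halfway point, or at it when the mantissa is odd.
  // A mantissa overflow carries into the exponent field, which is correct.
  if (rest > half || (rest == half && (mantissa & 1)))
    ++bits;
  return bits;
}

// Signed version in terms of the unsigned one: take the magnitude with
// (x + s) ^ s, where s is all ones for negative x, then set the sign bit.
uint32_t sitofp64To32Bits(int64_t x) {
  uint64_t s = x < 0 ? ~0ULL : 0ULL;
  uint64_t magnitude = (static_cast<uint64_t>(x) + s) ^ s;
  return uitofp64To32Bits(magnitude) |
         (static_cast<uint32_t>(s) & 0x80000000u);
}

// Reinterpret a bit pattern as a float without undefined behaviour.
float bitsToFloat(uint32_t bits) {
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}
```

The magnitude step is done in unsigned arithmetic so that the most negative input wraps to 2^63 instead of overflowing.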
*	[AMDGPU][MC] Corrected parsing of NAME:VALUE modifiers	Dmitry Preobrazhensky	2019-05-17	1	-33/+17
	See bug 41298: https://bugs.llvm.org/show_bug.cgi?id=41298 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D61009 llvm-svn: 361045
*	[AMDGPU][MC] Enabled labels with s_call_b64 and s_cbranch_i_fork	Dmitry Preobrazhensky	2019-05-17	1	-2/+2
	See https://bugs.llvm.org/show_bug.cgi?id=41888 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D62016 llvm-svn: 361040
*	[AMDGPU][MC] Enabled expressions for most operands which accept integer values	Dmitry Preobrazhensky	2019-05-17	2	-65/+110
	See bug 40873: https://bugs.llvm.org/show_bug.cgi?id=40873 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D60768 llvm-svn: 361031
*	AMDGPU: Fix unused variable warnings in release builds	Matt Arsenault	2019-05-17	1	-12/+9
	llvm-svn: 361030
*	AMDGPU/GlobalISel: Legalize G_FCEIL	Matt Arsenault	2019-05-17	2	-2/+37
	llvm-svn: 361028
*	AMDGPU/GlobalISel: Legalize G_INTRINSIC_TRUNC	Matt Arsenault	2019-05-17	2	-4/+70
	llvm-svn: 361027
*	AMDGPU/GlobalISel: Legalize G_FRINT	Matt Arsenault	2019-05-17	2	-0/+44
	llvm-svn: 361026
*	AMDGPU/GlobalISel: Legalize G_FCOPYSIGN	Matt Arsenault	2019-05-17	1	-0/+4
	llvm-svn: 361025
*	AMDGPU/GlobalISel: RegBankSelect for llvm.amdgcn.s.buffer.load	Matt Arsenault	2019-05-17	1	-0/+44
	llvm-svn: 361023
*	AMDGPU/GlobalISel: Use subreg index instead of extra unmerge	Matt Arsenault	2019-05-17	1	-8/+2
	This saves instructions and extra steps, but I'm not sure about introducing subregister indexes at this point. llvm-svn: 361022
*	AMDGPU/GlobalISel: Use waterfall loop for buffer_load	Matt Arsenault	2019-05-17	2	-36/+302
	This adds support for more complex waterfall loops that need to handle operands > 32-bits, and multiple operands. llvm-svn: 361021
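For readers unfamiliar with the term, a waterfall loop handles an operand that must be uniform across lanes but actually varies per lane: it repeatedly takes the value from the first still-active lane, runs the operation for every lane holding that same value, and retires those lanes. The following is a host-side simulation of that control flow only. It is not the code from this commit; the struct, the function name, and the "double the value" stand-in operation are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: per-lane outputs plus the number of loop trips.
struct WaterfallResult {
  std::vector<uint32_t> out;
  int iterations;
};

// Simulates a waterfall loop over one 32-bit per-lane operand.
WaterfallResult runWaterfall(const std::vector<uint32_t> &laneValues) {
  const std::size_t n = laneValues.size();
  std::vector<bool> active(n, true);
  WaterfallResult result{std::vector<uint32_t>(n, 0), 0};

  for (;;) {
    // "Read first lane": find the first lane that still needs processing.
    std::size_t first = 0;
    while (first < n && !active[first])
      ++first;
    if (first == n)
      break; // every lane has been handled

    const uint32_t uniform = laneValues[first];

    // Run the operation for all lanes whose value matches the uniform
    // value, then retire them from the active set.
    for (std::size_t lane = first; lane < n; ++lane) {
      if (active[lane] && laneValues[lane] == uniform) {
        result.out[lane] = uniform * 2; // stand-in for the real operation
        active[lane] = false;
      }
    }
    ++result.iterations;
  }
  return result;
}
```

The loop runs once per distinct value, so six lanes holding three distinct values take three trips. Handling wider or multiple operands, as the message describes, would presumably mean comparing every part of the key before a lane is considered a match.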
*	[AMDGPU] detect WaW hazards when moving/merging load/store instructions	Rhys Perry	2019-05-17	1	-0/+1
	Summary: In order to combine memory operations efficiently, the load/store optimizer might move some instructions around. It's usually safe to move instructions down past the merged instruction because the pass checks if memory operations can be re-ordered. However, the current logic doesn't handle Write-after-Write hazards. This fixes a reflection issue with Monster Hunter World and DXVK. v2: - rebased on top of master - clean up the test case - handle WaW hazards correctly Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=40130 Original patch by Samuel Pitoiset. Reviewers: tpr, arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: ronlieb, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D61313 llvm-svn: 361008
*	AMDGPU: Introduce TokenFactor for ABI register copies in call sequence	Matt Arsenault	2019-05-16	1	-0/+7
	The call was missing chain dependencies on the pre-call copies. I don't think this was causing any real issues however. llvm-svn: 360906
*	AMDGPU: Assume xnack is enabled by default	Matt Arsenault	2019-05-16	3	-2/+28
	This is the conservatively correct default. It is always safe to assume xnack is enabled, but not the converse. Introduce a feature to blacklist targets where xnack can never be meaningfully enabled. I'm not sure the set of targets this is applied to is 100% correct. llvm-svn: 360903
*	AMDGPU/GlobalISel: Correct regbank for 1-bit and/or/xor	Matt Arsenault	2019-05-16	1	-1/+1
	Bool values should use the scc/vcc regbank since r350611. llvm-svn: 360877
*	[AMDGPU] Increases available SGPR for Calling Convention	Ryan Taylor	2019-05-15	2	-4/+22
	Summary: SGPR in CC can be either hw initialized or set by other chained shaders and so this increases the SGPR count available to CC to 105. Change-Id: I3dfadc750fe4a3e2bd07117a2899fd13f3e2fef3 Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61261 llvm-svn: 360778
*	[AMDGPU] Create a TargetInfo header. NFC	Richard Trieu	2019-05-14	9	-8/+35
	Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360713
*	[AMDGPU][GFX8][GFX9] Corrected predicate of v_*_co_u32 aliases	Dmitry Preobrazhensky	2019-05-14	1	-1/+6
	Reviewers: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D61905 llvm-svn: 360702
*	[AMDGPU] Fixed handling of immediate i1 literals	Stanislav Mekhanoshin	2019-05-14	1	-0/+3
	This bug was exposed by rL360395. Differential Revision: https://reviews.llvm.org/D61812 llvm-svn: 360689
*	[AMDGPU] Fixed +DumpCode	Tim Renouf	2019-05-14	3	-12/+24
	The +DumpCode attribute is a horrible hack in AMDGPU to embed the disassembly of the generated code into the elf file. It is used by LLPC to implement an extension that allows the application to read back the disassembly of the code. Longer term, we should re-implement that by using the LLVM disassembler from the Vulkan driver. Recent LLVM changes broke +DumpCode. With -filetype=asm it crashed, and with -filetype=obj I think it did not include any instructions, only the labels. Fixed with this commit: now it has no effect with -filetype=asm, and works as intended with -filetype=obj. Differential Revision: https://reviews.llvm.org/D60682 Change-Id: I6436d86fe2ea220d74a643a85e64753747c9366b llvm-svn: 360688
*	[AMDGPU] Reorder includes per coding standard. NFC.	Stanislav Mekhanoshin	2019-05-13	1	-1/+1
	llvm-svn: 360609
*	[AMDGPU] Remove now unused V2FP16_ONE constant def. NFC.	Stanislav Mekhanoshin	2019-05-13	1	-1/+0
	llvm-svn: 360608
*	[AMDGPU] Move InstPrinter files to MCTargetDesc. NFC	Richard Trieu	2019-05-11	11	-37/+11
	For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360487
*	[AMDGPU] Pattern for v_xor3_b32	Stanislav Mekhanoshin	2019-05-10	1	-1/+4
	This also allows three op patterns to use increased constant bus limit of GFX10. Differential Revision: https://reviews.llvm.org/D61763 llvm-svn: 360395
*	[AMDGPU] gfx1010 v_interp_* instructions	Stanislav Mekhanoshin	2019-05-09	1	-6/+11
	Differential Revision: https://reviews.llvm.org/D61703 llvm-svn: 360364
*	[AMDGPU] gfx1010 changes for PAL metadata	Stanislav Mekhanoshin	2019-05-09	1	-2/+3
	Differential Revision: https://reviews.llvm.org/D61704 llvm-svn: 360353
*	AMDGPU: Mark scheduler classes as final	Matt Arsenault	2019-05-08	1	-2/+2
	llvm-svn: 360294
*	AMDGPU: Select VOP3 form of add	Matt Arsenault	2019-05-08	1	-2/+2
	The VOP3 form should always be the preferred selection, to be shrunk later. This should only be an optimization issue, but this partially works around a problem from clobbering VCC when SIFixSGPRCopies rewrites an SCC defining operation directly to VCC. Three of the testcases are regressions from failing to fold the immediate in cases it should. These can be avoided by improving the VCC liveness handling in SIFoldOperands. Simply increasing the threshold to computeRegisterLiveness works, although this is common enough that VCC liveness should probably be tracked throughout the pass. The hack of leaving behind an implicit_def instruction to avoid breaking iterators wastes instruction count, which inhibits finding the VCC def in long chains of adds. Doing this however exposes different, worse looking regressions from poor scheduling behavior. This could probably be worked around by forcing the shrink of the addc here, but the scheduler should probably be fixed. The r600 add test needs to be split out because it asserts on the arguments in the new test during the calling convention lowering. llvm-svn: 360293
*	[AMDGPU] gfx1010 exp modifications	Stanislav Mekhanoshin	2019-05-08	3	-2/+17
	Differential Revision: https://reviews.llvm.org/D61701 llvm-svn: 360287
*	AMDGPU: Fix a mis-placed bracket	Changpeng Fang	2019-05-08	1	-1/+1
	Differential Revision: https://reviews.llvm.org/D61430 llvm-svn: 360283
*	[AMDGPU] Reapplied BFE canonicalization from D60462	Simon Pilgrim	2019-05-08	1	-11/+25
	This was committed in rL358887 but reverted in rL360066 due to an x86 regression; really it should have been pre-committed instead of being part of the SimplifyDemandedBits bitcast patch. llvm-svn: 360263
*	R600InstrInfo.cpp - Add getTransSwizzle assert for the swizzle op index. NFCI.	Simon Pilgrim	2019-05-08	1	-0/+1
	Fixes static analyzer undefined value warning. llvm-svn: 360239
*	[SIMode] Fix typo in Status constructor	Simon Pilgrim	2019-05-08	1	-1/+1
	As noted in https://www.viva64.com/en/b/0629/ (Snippet No. 36) and the scan-build CI reports (https://llvm.org/reports/scan-build/report-SIModeRegister.cpp-Status-1-1.html#EndPath), rL348754 introduced a typo in the Status constructor due to argument variable names shadowing the member variable names. Differential Revision: https://reviews.llvm.org/D61595 llvm-svn: 360236
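The class of bug described here, a constructor parameter shadowing a member of the same name, is easy to reproduce in isolation. The two structs below are hypothetical and only illustrate the pattern; they are not the code from SIModeRegister.cpp, and the exact form of the original typo is not reproduced.

```cpp
#include <cassert>

// Hypothetical illustration of parameter/member shadowing.
struct ShadowedStatus {
  unsigned Mask;
  unsigned Mode;

  ShadowedStatus(unsigned Mask, unsigned Mode) : Mask(Mask), Mode(Mode) {
    // Inside the body, the names refer to the parameters, so this
    // statement changes the parameter and leaves the member untouched.
    Mode &= Mask;
  }
};

// Distinct parameter names remove the ambiguity, and the masking can be
// done directly in the member initializer.
struct ClearStatus {
  unsigned Mask;
  unsigned Mode;

  ClearStatus(unsigned NewMask, unsigned NewMode)
      : Mask(NewMask), Mode(NewMode & NewMask) {}
};
```

With a mask of `0x0F` and a mode of `0xFF`, the shadowed version stores `0xFF` in the member while the version with distinct names stores `0x0F`.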
*	[AMDGPU] Check MI bundles for hazards	Austin Kerbow	2019-05-07	2	-11/+62
	Summary: GCNHazardRecognizer fails to identify hazards that are in and around bundles. This patch allows the hazard recognizer to consider bundled instructions in both scheduler and hazard recognizer mode. We ignore “bundledness” for the purpose of detecting hazards and examine the instructions individually. Reviewers: arsenm, msearles, rampitec Reviewed By: rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61564 llvm-svn: 360199
*	AMDGPU: Verify that SOP2/SOPC instructions have at most one immediate operand	Nicolai Haehnle	2019-05-07	1	-0/+18
	Summary: No test case because I don't know of a way to trigger this, but I accidentally caused this to fail while working on a different change. Change-Id: I8015aa447fe27163cc4e4902205a203bd44bf7e3 Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61490 llvm-svn: 360123
*	[AMDGPU] gfx1010 verifier changes	Stanislav Mekhanoshin	2019-05-06	1	-7/+15
	Differential Revision: https://reviews.llvm.org/D61521 llvm-svn: 360095
*	[AMDGPU] gfx1010: prefer V_MUL_LO_U32 over V_MUL_LO_I32	Stanislav Mekhanoshin	2019-05-06	1	-1/+1
	GFX10 deprecates v_mul_lo_i32 instruction, so choose u32 form for all targets. Differential Revision: https://reviews.llvm.org/D61525 llvm-svn: 360094
*	[AMDGPU] gfx1010 memory legalizer	Stanislav Mekhanoshin	2019-05-06	1	-1/+262
	Differential Revision: https://reviews.llvm.org/D61535 llvm-svn: 360087
*	Revert r359392 and r358887	Craig Topper	2019-05-06	1	-25/+11
	Reverts "[X86] Remove (V)MOV64toSDrr/m and (V)MOVDI2SSrr/m. Use 128-bit result MOVD/MOVQ and COPY_TO_REGCLASS instead" Reverts "[TargetLowering][AMDGPU][X86] Improve SimplifyDemandedBits bitcast handling" Eric Christopher and Jorge Gorbe Moya reported some issues with these patches to me off list. Removing the CodeGenOnly instructions has changed how fneg is handled during fast-isel with sse/sse2. We're now emitting fsub -0.0, x instead of moving to the integer domain (in a GPR), xoring the sign bit, and then moving back to xmm. This is because the fast isel table no longer contains an entry for (f32/f64 bitcast (i32/i64)) so the target independent fneg code fails. The use of fsub changes the behavior of nan with respect to -O2 codegen which will always use a pxor. NOTE: We still have a difference with double with -m32 since the move to GPR doesn't work there. I'll file a separate PR for that and add test cases. Since removing the CodeGenOnly instructions was fixing PR41619, I'm reverting r358887 which exposed that PR. Though I wouldn't be surprised if that bug can still be hit independent of that. This should hopefully get Google back to green. I'll work with Simon and other X86 folks to figure out how to move forward again. llvm-svn: 360066
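The two negation strategies contrasted in this message can be sketched at the source level. This is only an illustration with hypothetical function names, not the fast-isel code; it assumes a 32-bit IEEE-754 `float`, and a compiler is free to turn the subtraction into a plain negation, so the sketch makes no claim about the sign of the NaN that the subtraction form returns.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Negate by flipping the sign bit in the integer domain (the "xor" approach).
// This always flips the sign, including for NaN inputs.
float fnegViaXor(float x) {
  uint32_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  bits ^= 0x80000000u;
  std::memcpy(&x, &bits, sizeof x);
  return x;
}

// Negate with a floating-point subtraction from negative zero.
// For ordinary values the result matches the xor form; for NaN inputs the
// result is still a NaN, but its sign is not guaranteed to be flipped.
float fnegViaFsub(float x) { return -0.0f - x; }

// Helper: does the given negation routine flip the sign bit of x?
bool flipsSign(float (*negate)(float), float x) {
  return std::signbit(negate(x)) != std::signbit(x);
}
```

For finite values and zeros the two routines agree, which is why the difference only shows up when NaN sign bits are observed.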
*	Fix compilation warnings when compiling with GCC 7.3	Alexandre Ganea	2019-05-06	1	-0/+6
	Differential Revision: https://reviews.llvm.org/D61046 llvm-svn: 360044
*	[AMDGPU] Fixed asan error after D61536	Stanislav Mekhanoshin	2019-05-04	1	-1/+1
	llvm-svn: 359963