bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	AMDGPU: Refactor Subtarget classes	Tom Stellard	2018-07-11	1	-2/+2
	Summary: This is a follow-up to r335942. - Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget - Rename AMDGPUCommonSubtarget to AMDGPUSubtarget - Merge R600Subtarget::Generation and GCNSubtarget::Generation into AMDGPUSubtarget::Generation. Reviewers: arsenm, jvesely Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D49037 llvm-svn: 336851
*	AMDGPU: Separate R600 and GCN TableGen files	Tom Stellard	2018-06-28	1	-2/+2
	Summary: We now have two sets of generated TableGen files, one for R600 and one for GCN, so each sub-target now has its own tables of instructions, registers, ISel patterns, etc. This should help reduce compile time since each sub-target now only has to consider information that is specific to itself. This will also help prevent the R600 sub-target from slowing down new features for GCN, like disassembler support, GlobalISel, etc. Reviewers: arsenm, nhaehnle, jvesely Reviewed By: arsenm Subscribers: MatzeB, kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46365 llvm-svn: 335942
*	AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers	Tom Stellard	2018-05-22	1	-0/+1
	Summary: MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instruction and register definitions, which are huge, so we only want to include them where needed. This will also make it easier if we want to split the R600 and GCN definitions into separate TableGen-generated files. I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h because it uses some enums from the header to initialize default values for the SIMachineFunction class, so I ended up having to remove includes of SIMachineFunctionInfo.h from headers too. Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46272 llvm-svn: 332930
*	Rename DEBUG macro to LLVM_DEBUG.	Nicola Zaghen	2018-05-14	1	-7/+9
	The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manual change to DOCS, as the regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240
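The effect of the sed expression in this message can be sketched in Python. This is only an illustration: Python's `re` module stands in for GNU sed, and the helper name `rename_debug` and the sample strings are made up for the example.

```python
import re

# Python counterpart of the sed expression s/\bDEBUG\s\?(/LLVM_DEBUG(/g.
# \b requires a word boundary before "DEBUG", so identifiers such as
# LLVM_DEBUG or NDEBUG are left alone ('_' and letters are word characters).
DEBUG_CALL = re.compile(r"\bDEBUG\s?\(")


def rename_debug(source: str) -> str:
    """Rewrite DEBUG( and 'DEBUG (' call sites to LLVM_DEBUG(."""
    return DEBUG_CALL.sub("LLVM_DEBUG(", source)


sample = 'DEBUG(dbgs() << "fold\\n"); LLVM_DEBUG(dbgs()); #ifndef NDEBUG'
print(rename_debug(sample))
```

Because the pattern needs an opening parenthesis, prose mentions of the macro are not rewritten, which is consistent with the message noting that the documentation had to be changed by hand.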
*	AMDGPU: Add Vega12 and Vega20	Matt Arsenault	2018-04-30	1	-5/+13
	Changes by Matt Arsenault and Konstantin Zhuravlyov llvm-svn: 331215
*	[AMDGPU] Truncate packed inline constant	Stanislav Mekhanoshin	2018-04-24	1	-1/+1
	If a packed inline constant is sign extended, it must be truncated after the shift. For example, a constant (0xH0000, 0xHBC00) will be represented as 0xFFFFFFFFBC000000 in the IR because the immediate is sign extended to 64 bits. After the value is shifted right by 16 to use it in a low part with op_sel_hi, it becomes 0xFFFFFFFFBC00 and no longer qualifies as an inline constant. Fixed the error and added verification code. With the verification but without the fix, the bug causes pk_max_f16_literal.ll to fail. Differential Revision: https://reviews.llvm.org/D45987 llvm-svn: 330752
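The arithmetic in this message can be reproduced with plain integers. The sketch below is not the pass's code; the helper `sign_extend_32_to_64` and the variable names are assumptions made for illustration, and only the constants come from the message.

```python
MASK64 = (1 << 64) - 1


def sign_extend_32_to_64(value32: int) -> int:
    """Sign-extend a 32-bit pattern to an unsigned 64-bit pattern."""
    value32 &= 0xFFFFFFFF
    if value32 & 0x80000000:
        return value32 | (MASK64 & ~0xFFFFFFFF)
    return value32


# Packed constant (0xH0000, 0xHBC00): low half 0x0000, high half 0xBC00.
packed = (0xBC00 << 16) | 0x0000
imm64 = sign_extend_32_to_64(packed)   # 0xFFFFFFFFBC000000
shifted = imm64 >> 16                  # 0xFFFFFFFFBC00: sign bits leak in
truncated = shifted & 0xFFFF           # 0xBC00: a 16-bit value again

print(hex(imm64), hex(shifted), hex(truncated))
```

The shifted value carries the sign-extension bits above bit 15, which is why it has to be truncated before it is tested as a 16-bit constant.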
*	[AMDGPU] Use packed literals with zero either lower or hi part	Stanislav Mekhanoshin	2018-04-19	1	-2/+12
	Differential Revision: https://reviews.llvm.org/D45790 llvm-svn: 330365
*	[AMDGPU] Enabled v2.16 literals for VOP3P	Stanislav Mekhanoshin	2018-04-17	1	-0/+19
	Literal encoding needs op_sel_hi to select the low 16 bits in this case. Differential Revision: https://reviews.llvm.org/D45745 llvm-svn: 330230
*	AMDGPU: Fix crash when constant folding with physreg operand	Matt Arsenault	2018-03-10	1	-1/+2
	llvm-svn: 327209
*	AMDGPU: Don't crash when trying to fold implicit operands	Matt Arsenault	2018-02-08	1	-0/+1
	llvm-svn: 324550
*	MachineFunction: Return reference from getFunction(); NFC	Matthias Braun	2017-12-15	1	-1/+1
	The Function can never be nullptr so we can return a reference. llvm-svn: 320884
*	Rename LiveIntervalAnalysis.h to LiveIntervals.h	Matthias Braun	2017-12-13	1	-1/+1
	Headers/Implementation files should be named after the class they declare/define. Also eliminated an `#include "llvm/CodeGen/LiveIntervalAnalysis.h"` in favor of `class LiveIntervals;` llvm-svn: 320546
*	[CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register.	Francis Visoiu Mistrih	2017-12-07	1	-2/+2
	Work towards the unification of MIR and debug output by refactoring the interfaces. For MachineOperand::print, keep a simple version that can be easily called from `dump()`, and a more complex one which will be called from both the MIRPrinter and MachineInstr::print. Add extra checks inside MachineOperand for detached operands (operands with getParent() == nullptr). https://reviews.llvm.org/D40836
	* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g'
	* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g'
	* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g'
	* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/<def>//g'
	* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g'
	* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g'
	* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g'
	* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<def[ ]*,[ ]*dead>/dead \1/g'
	* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ]*,[ ]*dead>/implicit-def dead \1/g'
	* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g'
	* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g'
	* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g'
	* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g'
	llvm-svn: 320022
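To see what these sed rewrites do to a single line of old-style debug output, a few of them can be replayed with Python's `re` module. This is a sketch under assumptions: only a subset of the expressions is included, they are applied in the order listed in the message, and the sample instruction line is invented for the example.

```python
import re

# A subset of the sed expressions from the message, as (pattern, replacement)
# pairs applied in order. The suffix flag moves in front of its operand.
REWRITES = [
    (r"<def>", ""),
    (r"([^ ]+)<kill>", r"killed \1"),
    (r"([^ ]+)<imp-use,kill>", r"implicit killed \1"),
    (r"([^ ]+)<dead>", r"dead \1"),
    (r"([^ ]+)<imp-def>", r"implicit-def \1"),
    (r"([^ ]+)<imp-use>", r"implicit \1"),
    (r"([^ ]+)<undef>", r"undef \1"),
]


def modernize_operands(line: str) -> str:
    """Convert suffix operand flags to the prefix form used by MIR."""
    for pattern, replacement in REWRITES:
        line = re.sub(pattern, replacement, line)
    return line


old = "%vreg0<def> = COPY %vreg1<kill>, %EXEC<imp-use>"
print(modernize_operands(old))
```

The `([^ ]+)` group captures the operand token immediately before the flag, so each rewrite only reorders text within one space-delimited token.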
*	[CodeGen] Print "%vreg0" as "%0" in both MIR and debug output	Francis Visoiu Mistrih	2017-11-30	1	-4/+4
	As part of the unification of the debug format and the MIR format, avoid printing "vreg" for virtual registers (which is one of the current MIR possibilities). Basically:
	* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g"
	* grep -nr '%vreg' . and fix if needed
	* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g"
	* grep -nr 'vreg[0-9]\+' . and fix if needed
	Differential Revision: https://reviews.llvm.org/D40420 llvm-svn: 319427
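The two substitutions in this message are small enough to try on a sample line. The following Python sketch mirrors them with `re.sub`; the function name `strip_vreg` and the sample inputs are illustrative assumptions, not part of the commit.

```python
import re


def strip_vreg(text: str) -> str:
    """Apply both rewrites from the message: %vregN -> %N and ' vregN' -> ' %N'."""
    text = re.sub(r"%vreg([0-9]+)", r"%\1", text)
    text = re.sub(r" vreg([0-9]+)", r" %\1", text)
    return text


print(strip_vreg("%vreg2 = V_AND_B32_e32 %vreg0, %vreg1"))
print(strip_vreg("folding into use of vreg3"))
```

The second pattern requires a leading space, so a bare `vreg3` at the start of a line or after punctuation is not touched, which is presumably what the follow-up grep steps are meant to catch.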
*	[CodeGen] Print register names in lowercase in both MIR and debug output	Francis Visoiu Mistrih	2017-11-28	1	-2/+2
	As part of the unification of the debug format and the MIR format, always print registers as lowercase. * Only debug printing is affected. It now follows MIR. Differential Revision: https://reviews.llvm.org/D40417 llvm-svn: 319187
*	Remove unused variables	Vitaly Buka	2017-10-15	1	-1/+1
	llvm-svn: 315847
*	AMDGPU: Add comment about clamps	Matt Arsenault	2017-10-05	1	-0/+2
	llvm-svn: 314952
*	AMDGPU: Do not fold clamp instructions when sources are different	Matt Arsenault	2017-10-05	1	-0/+1
	Patch by hakzsam (Samuel Pitoiset) llvm-svn: 314951
*	AMDGPU: Start selecting v_mad_mixhi_f16	Matt Arsenault	2017-09-20	1	-0/+1
	llvm-svn: 313814
*	AMDGPU: Fold clamp modifier for packed instructions	Matt Arsenault	2017-08-31	1	-5/+18
	llvm-svn: 312297
*	AMDGPU: Fix crash when folding immediates into multiple uses	Nicolai Haehnle	2017-07-18	1	-0/+1
	Summary: When an immediate is folded by constant folding, we re-scan the entire use list for two reasons: 1. The constant folding may have created a new use of the same reg. 2. The constant folding may have removed an additional use in the list we're currently traversing (e.g., constant folding an S_ADD_I32 c, c). However, this could previously lead to a crash when an unrelated use was added twice into the FoldList. Since we re-scan the whole list anyway, we might as well just clear the FoldList again before we do so. Using a MIR test to show this because real code seems to trigger the issue only in connection with some really subtle control flow structures. Fixes GL45-CTS.shading_language_420pack.binding_images on gfx9. Reviewers: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D35416 llvm-svn: 308314
*	[AMDGPU] Fix -Wimplicit-fallthrough warnings. NFCI.	Simon Pilgrim	2017-07-07	1	-0/+1
	llvm-svn: 307381
*	AMDGPU: Do operand folding in program order	Matt Arsenault	2017-06-20	1	-5/+3
	Previously, it was possible to partially fold use instructions before the defs. After the xor is folded into a copy, the same mov can end up in the fold list twice, so the second attempt fails because it expects to see a register to fold. llvm-svn: 305821
*	AMDGPU: Preserve undef when folding register operands	Matt Arsenault	2017-06-20	1	-0/+2
	If the source was a copy of an undef register, this would produce a read of an undefined register, which is a verifier error. llvm-svn: 305816
*	AMDGPU: Fix crash with undef vreg input operand	Matt Arsenault	2017-06-20	1	-1/+1
	llvm-svn: 305814
*	[AMDGPU] Fix SIFoldOperands crash with clamp	Stanislav Mekhanoshin	2017-06-05	1	-1/+2
	Fixes bug #33302. The pass did not account for the fact that Src1 of a max instruction can be an immediate. Differential Revision: https://reviews.llvm.org/D33884 llvm-svn: 304696
*	[AMDGPU] Preserve operand order in SIFoldOperands	Stanislav Mekhanoshin	2017-06-03	1	-3/+18
	SIFoldOperands can commute operands even if no folding was done. This change preserves the IR if no folding was done. Differential Revision: https://reviews.llvm.org/D33802 llvm-svn: 304625
*	[AMDGPU] Allow SDWA in instructions with immediates and SGPRs	Stanislav Mekhanoshin	2017-05-30	1	-3/+4
	The encoding does not allow SDWA in an instruction with scalar operands, whether literals or SGPRs. It is, however, possible to copy these operands into a VGPR first. Several copies of the value are produced if multiple SDWA conversions were done. To clean up, runs of MachineLICM (to hoist copies out of loops), MachineCSE (to remove duplicate copies) and SIFoldOperands (to replace an SGPR-to-VGPR copy with an immediate copy directly into the VGPR) are added after the SDWA pass. Differential Revision: https://reviews.llvm.org/D33583 llvm-svn: 304219
*	[AMDGPU] SDWA Peephole: improve search for immediates in SDWA patterns	Sam Kolton	2017-03-31	1	-22/+1
	Previously, the compiler often extracted common immediates into a dedicated register, e.g.:
```
%vreg0 = S_MOV_B32 0xff;
%vreg2 = V_AND_B32_e32 %vreg0, %vreg1
%vreg4 = V_AND_B32_e32 %vreg0, %vreg3
```
	Because of this, the SDWA peephole failed to find an SDWA-convertible pattern. E.g. in the previous example this could be converted into 2 SDWA src operands:
```
SDWA src: %vreg2 src_sel:BYTE_0
SDWA src: %vreg4 src_sel:BYTE_0
```
	With this change, the peephole checks whether an operand is either an immediate or a register that is a copy of an immediate. llvm-svn: 299202
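The reason an AND with 0xff can become a `src_sel:BYTE_0` operand is that both select the same bits. The Python model below only checks that arithmetic equivalence; `v_and_b32` and `sdwa_src_sel_byte` are hypothetical helpers written for this example and do not reflect how the peephole itself is implemented.

```python
MASK32 = 0xFFFFFFFF


def v_and_b32(src0: int, src1: int) -> int:
    """Model a 32-bit bitwise AND."""
    return src0 & src1 & MASK32


def sdwa_src_sel_byte(value: int, byte_index: int) -> int:
    """Model selecting one byte of a 32-bit source (BYTE_0 is the lowest)."""
    return (value >> (8 * byte_index)) & 0xFF


for value in (0x00000000, 0x12345678, 0xFFFFFFFF, 0x800000AB):
    # AND with the immediate 0xff keeps exactly the bits BYTE_0 selects.
    assert v_and_b32(0xFF, value) == sdwa_src_sel_byte(value, 0)

print(hex(sdwa_src_sel_byte(0x12345678, 0)))
```

The equivalence only holds when the mask is known to be the constant 0xff, which is why the peephole has to look through the register to the immediate it was copied from.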
*	[AMDGPU] Fold V_CNDMASK with identical source operands	Stanislav Mekhanoshin	2017-03-24	1	-0/+29
	Such instructions sometimes appear after lowering and folding. Differential Revision: https://reviews.llvm.org/D31318 llvm-svn: 298723
*	AMDGPU: Support v2i16/v2f16 packed operations	Matt Arsenault	2017-02-27	1	-6/+8
	llvm-svn: 296396
*	AMDGPU: Fold omod into instructions	Matt Arsenault	2017-02-27	1	-5/+139
	llvm-svn: 296372
*	AMDGPU: Use clamp with f64	Matt Arsenault	2017-02-22	1	-1/+2
	llvm-svn: 295908
*	AMDGPU: Fold FP clamp as modifier bit	Matt Arsenault	2017-02-22	1	-4/+72
	The manual is unclear on the details of this. It's not clear to me if denormals are not allowed with clamp, or if that is only omod. Not allowing denorms for fp16 or fp64 isn't useful so I also question if that is really a restriction. Same with whether this is valid without IEEE mode enabled. llvm-svn: 295905
*	AMDGPU: Fix folding immediates into mac src2	Matt Arsenault	2017-01-11	1	-2/+30
	Legality needs to be checked against the instruction it will be replaced with. llvm-svn: 291711
*	AMDGPU: Constant fold when immediate is materialized	Matt Arsenault	2017-01-10	1	-141/+228
	In future commits these patterns will appear after moveToVALU changes. llvm-svn: 291615
*	AMDGPU: Fix handling of 16-bit immediates	Matt Arsenault	2016-12-10	1	-5/+9
	Since 32-bit instructions with 32-bit input immediate behavior are used to materialize 16-bit constants in 32-bit registers for 16-bit instructions, determining the legality based on the size is incorrect. Change operands to have the size specified in the type. Also adds a workaround for a disassembler bug that produces an immediate MCOperand for an operand that is supposed to be OPERAND_REGISTER. The assembler appears to accept out of bounds immediates and truncates them, but this seems to be an issue for 32-bit already. llvm-svn: 289306
*	AMDGPU : Add S_SETREG instructions to fix fdiv precision issues.	Tom Stellard	2016-12-07	1	-0/+7
	Patch By: Wei Ding Summary: This patch fixes the fdiv precision issues. Reviewers: b-sumner, cfang, wdng, arsenm Subscribers: kzhuravl, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D26424 llvm-svn: 288879
*	AMDGPU: Refactor immediate folding logic	Matt Arsenault	2016-11-29	1	-14/+50
	Change the logic for when to fold immediates to consider the destination operand rather than the source of the materializing mov instruction. No change yet, but this will allow for correctly handling i16/f16 operands. Since 32-bit moves are used to materialize constants for these, the same bit value will not be in the register. llvm-svn: 288184
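The remark about bit values can be made concrete by comparing the encodings of the same number as f16 and f32. This is a standalone sketch using Python's `struct` module; the helper names are assumptions for the example, and it says nothing about which constants the backend actually treats as legal.

```python
import struct


def f16_bits(x: float) -> int:
    """IEEE half-precision bit pattern of x."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def f32_bits(x: float) -> int:
    """IEEE single-precision bit pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


# A 32-bit move that materializes f16 1.0 leaves 0x00003C00 in the register.
materialized = f16_bits(1.0)
# Read as an f32, that register does not hold 1.0 at all.
as_f32 = struct.unpack("<f", struct.pack("<I", materialized))[0]

print(hex(materialized), hex(f32_bits(1.0)), as_f32)
```

Judging an immediate by the 32-bit mov that produced it would therefore interpret 0x3C00 with the wrong type, which is the motivation for looking at the destination operand instead.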
*	AMDGPU: Cleanup immediate folding code	Matt Arsenault	2016-11-23	1	-64/+62
	Move code down to use, reorder to avoid hard to follow immediate folding logic. llvm-svn: 287818
*	AMDGPU: Fix debug printing	Matt Arsenault	2016-11-23	1	-1/+1
	The uint8_t was printed as a char which didn't really work. llvm-svn: 287817
*	[AMDGPU] Add f16 support (VI+)	Konstantin Zhuravlyov	2016-11-13	1	-7/+9
	Differential Revision: https://reviews.llvm.org/D25975 llvm-svn: 286753
*	AMDGPU: Don't fold undef uses or copies with implicit uses	Matt Arsenault	2016-10-06	1	-4/+22
	llvm-svn: 283476
*	AMDGPU: Remove leftover implicit operands when folding immediates	Matt Arsenault	2016-10-06	1	-7/+26
	When constant folding an operation to a copy or an immediate mov, the implicit uses/defs of the old instruction were left behind, e.g. replacing v_or_b32 left the implicit exec use on the new copy. llvm-svn: 283471
*	Use StringRef in Pass/PassManager APIs (NFC)	Mehdi Amini	2016-10-01	1	-3/+1
	llvm-svn: 283004
*	AMDGPU: Support folding FrameIndex operands	Matt Arsenault	2016-09-14	1	-9/+26
	This avoids test regressions in a future commit. llvm-svn: 281491
*	AMDGPU: Improve splitting 64-bit bit ops by constants	Matt Arsenault	2016-09-14	1	-0/+126
	This addresses a TODO to handle operations besides and. This also starts eliminating no-op operations with a constant that can emerge later. llvm-svn: 281488
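Splitting a 64-bit bitwise operation by a constant means working on the two 32-bit halves separately, where a half of the constant may turn its half of the operation into a no-op. The Python sketch below shows the idea for AND only; the function names are invented for the example, and the actual set of operations and simplifications in the commit is not described in the message.

```python
MASK32 = 0xFFFFFFFF


def split64(value: int) -> tuple[int, int]:
    """Return the (low, high) 32-bit halves of a 64-bit value."""
    return value & MASK32, (value >> 32) & MASK32


def and32_with_constant(value: int, constant: int) -> int:
    """32-bit AND that recognizes the trivial constant cases."""
    if constant == 0:
        return 0            # and x, 0  -> 0, no instruction needed
    if constant == MASK32:
        return value        # and x, -1 -> x, a no-op
    return value & constant


def and64_split(value: int, constant: int) -> int:
    """Compute a 64-bit AND as two independent 32-bit halves."""
    v_lo, v_hi = split64(value)
    c_lo, c_hi = split64(constant)
    return and32_with_constant(v_lo, c_lo) | (and32_with_constant(v_hi, c_hi) << 32)


x = 0x123456789ABCDEF0
print(hex(and64_split(x, 0x00000000FFFFFFFF)))
```

With the constant 0x00000000FFFFFFFF, the high half collapses to zero and the low half passes through unchanged, so neither half needs a real AND.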
*	AMDGPU: Don't fold subregister extracts into tied operands	Matt Arsenault	2016-08-15	1	-3/+15
	llvm-svn: 278676
*	CodeGen: Use MachineInstr& in TargetInstrInfo, NFC	Duncan P. N. Exon Smith	2016-06-30	1	-4/+4
	This is mostly a mechanical change to make TargetInstrInfo API take MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator) when the argument is expected to be a valid MachineInstr. This is a general API improvement. Although it would be possible to do this one function at a time, that would demand a quadratic amount of churn since many of these functions call each other. Instead I've done everything as a block and just updated what was necessary. This is mostly mechanical fixes: adding and removing `*` and `&` operators. The only non-mechanical change is to split ARMBaseInstrInfo::getOperandLatencyImpl out from ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a `MachineInstr*` which it updated to the instruction bundle leader; now, the latter calls the former either with the same `MachineInstr&` or the bundle leader. As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753. Note: I updated WebAssembly, Lanai, and AVR (despite being off-by-default) since it turned out to be easy. I couldn't run tests for AVR since llc doesn't link with it turned on. llvm-svn: 274189
*	AMDGPU: Cleanup subtarget handling.	Matt Arsenault	2016-06-24	1	-4/+3
	Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. llvm-svn: 273652