path: root/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
Commit message | Author | Age | Files | Lines
...
* AMDGPU: Scavenge register instead of findUnusedReg | Matt Arsenault | 2019-03-14 | 1 | -1/+1
    llvm-svn: 356149
* AMDGPU/GlobalISel: Implement select for G_EXTRACT | Tom Stellard | 2019-02-28 | 1 | -0/+7
    Reviewers: arsenm
    Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
    Differential Revision: https://reviews.llvm.org/D49714
    llvm-svn: 355156
* [AMDGPU][MC] Added support of lds_direct operand | Dmitry Preobrazhensky | 2019-02-08 | 1 | -0/+3
    See bug 39293: https://bugs.llvm.org/show_bug.cgi?id=39293
    Reviewers: artem.tamazov, rampitec
    Differential Revision: https://reviews.llvm.org/D57889
    llvm-svn: 353524
* Update the file headers across all of the LLVM projects in the monorepo | Chandler Carruth | 2019-01-19 | 1 | -4/+3
    to reflect the new license.
    We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
    Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
    llvm-svn: 351636
* [AMDGPU] Simplify negated condition | Stanislav Mekhanoshin | 2018-12-13 | 1 | -0/+57
    Optimize sequence:
      %sel = V_CNDMASK_B32_e64 0, 1, %cc
      %cmp = V_CMP_NE_U32 1, %1
      $vcc = S_AND_B64 $exec, %cmp
      S_CBRANCH_VCC[N]Z
    =>
      $vcc = S_ANDN2_B64 $exec, %cc
      S_CBRANCH_VCC[N]Z
    It is the negation pattern inserted by DAGCombiner::visitBRCOND() in the rebuildSetCC().
    Differential Revision: https://reviews.llvm.org/D55402
    llvm-svn: 349003
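The rewrite above rests on a per-lane identity: selecting 1 where %cc is set and then testing "not equal to 1" yields the complement of %cc, so ANDing with $exec is the same as an and-not. A minimal sketch of that identity, modeling 64-lane masks as plain integers; the function names are hypothetical, reading `%1` as the `%sel` value is an assumption, and none of this is LLVM code:

```cpp
#include <cstdint>

// Original sequence, one bit per lane.
// V_CNDMASK_B32 0, 1, %cc : each lane gets 1 where cc is set, else 0.
// V_CMP_NE_U32 1, %sel    : sets the lanes whose value is not 1.
// S_AND_B64 $exec, %cmp   : keeps only the active lanes.
inline std::uint64_t vccBefore(std::uint64_t exec, std::uint64_t cc) {
  std::uint64_t cmp = 0;
  for (unsigned lane = 0; lane < 64; ++lane) {
    std::uint32_t sel = ((cc >> lane) & 1u) ? 1u : 0u;
    if (sel != 1u)
      cmp |= std::uint64_t{1} << lane;
  }
  return exec & cmp;
}

// Optimized sequence: S_ANDN2_B64 $exec, %cc computes exec & ~cc directly.
inline std::uint64_t vccAfter(std::uint64_t exec, std::uint64_t cc) {
  return exec & ~cc;
}
```

Both functions should agree for every pair of masks, which is what lets the select and compare be dropped.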
* AMDGPU: Only add implicit super-reg def for first subreg | Matt Arsenault | 2018-11-26 | 1 | -2/+2
    llvm-svn: 347572
* [MI] Change the array of `MachineMemOperand` pointers to be | Chandler Carruth | 2018-08-16 | 1 | -9/+10
    a generically extensible collection of extra info attached to a `MachineInstr`.
    The primary change here is cleaning up the APIs used for setting and manipulating the `MachineMemOperand` pointer arrays so that we can change how they are allocated.
    Then we introduce an extra info object that uses the trailing object pattern to attach some number of MMOs but also other extra info. The design of this is specifically so that this extra info has a fixed necessary cost (the header tracking what extra info is included) and everything else can be tail allocated. This pattern works especially well with a `BumpPtrAllocator` which we use here.
    I've also added the basic scaffolding for putting interesting pointers into this, namely pre- and post-instruction symbols. These aren't used anywhere yet, they're just there to ensure I've actually gotten the data structure types correct. I'll flesh out support for these in a subsequent patch (MIR dumping, parsing, the works).
    Finally, I've included an optimization where we store any single pointer inline in the `MachineInstr` to avoid the allocation overhead. This is expected to be the overwhelmingly most common case and so should avoid any memory usage growth due to slightly less clever / dense allocation when dealing with >1 MMO. This did require several ergonomic improvements to the `PointerSumType` to reasonably support the various usage models.
    This also has a side effect of freeing up 8 bits within the `MachineInstr` which could be repurposed for something else.
    The suggested direction here came largely from Hal Finkel. I hope it was worth it. ;] It does hopefully clear a path for subsequent extensions w/o nearly as much leg work. Lots of thanks to Reid and Justin for careful reviews and ideas about how to do all of this.
    Differential Revision: https://reviews.llvm.org/D50701
    llvm-svn: 339940
* [AMDGPU] Fix VGPR spills where offset doesn't fit in 12 bits | Scott Linder | 2018-07-26 | 1 | -11/+16
    Scale the offset of VGPR spills by the wave size when it cannot fit in the 12-bit offset immediate field and so is added to the soffset SGPR. This accounts for hardware swizzling of scratch memory.
    Differential Revision: https://reviews.llvm.org/D49448
    llvm-svn: 338060
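The rule in this message can be illustrated with a small arithmetic sketch: an offset that fits the 12-bit immediate is used as is, while a larger one is multiplied by the wave size and moved to the value added to soffset. The struct, the function name, the treatment of the immediate as unsigned (maximum 4095), and the `waveSize` parameter are all assumptions for illustration; the actual patch also has to account for the size of the spilled value.

```cpp
#include <cstdint>

// Hypothetical result type: where each part of the spill offset ends up.
struct SpillAddress {
  std::uint32_t immOffset;  // value for the 12-bit immediate field
  std::uint64_t soffsetAdd; // value added to the soffset SGPR
};

// Splits a per-lane byte offset. waveSize (for example 64) is assumed.
inline SpillAddress splitSpillOffset(std::uint64_t offset, unsigned waveSize) {
  constexpr std::uint64_t kMaxImm = (1u << 12) - 1; // assumed unsigned field
  if (offset <= kMaxImm)
    return {static_cast<std::uint32_t>(offset), 0};
  // Too large for the immediate: scale by the wave size, since scratch
  // memory is swizzled across the lanes of a wave.
  return {0, offset * waveSize};
}
```

With a wave size of 64, an offset of 100 stays in the immediate, while 4096 becomes 262144 on the soffset side.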
* AMDGPU: Refactor Subtarget classes | Tom Stellard | 2018-07-11 | 1 | -12/+12
    Summary: This is a follow-up to r335942.
    - Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget
    - Rename AMDGPUCommonSubtarget to AMDGPUSubtarget
    - Merge R600Subtarget::Generation and GCNSubtarget::Generation into AMDGPUSubtarget::Generation.
    Reviewers: arsenm, jvesely
    Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits
    Differential Revision: https://reviews.llvm.org/D49037
    llvm-svn: 336851
* AMDGPU: Separate R600 and GCN TableGen files | Tom Stellard | 2018-06-28 | 1 | -2/+0
    Summary: We now have two sets of generated TableGen files, one for R600 and one for GCN, so each sub-target now has its own tables of instructions, registers, ISel patterns, etc. This should help reduce compile time since each sub-target now only has to consider information that is specific to itself. This will also help prevent the R600 sub-target from slowing down new features for GCN, like disassembler support, GlobalISel, etc.
    Reviewers: arsenm, nhaehnle, jvesely
    Reviewed By: arsenm
    Subscribers: MatzeB, kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits
    Differential Revision: https://reviews.llvm.org/D46365
    llvm-svn: 335942
* AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers | Tom Stellard | 2018-05-22 | 1 | -0/+6
    Summary: MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instruction and register definitions, which are huge so we only want to include them where needed. This will also make it easier if we want to split the R600 and GCN definitions into separate tablegenerated files.
    I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h because it uses some enums from the header to initialize default values for the SIMachineFunction class, so I ended up having to remove includes of SIMachineFunctionInfo.h from headers too.
    Reviewers: arsenm, nhaehnle
    Reviewed By: nhaehnle
    Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits
    Differential Revision: https://reviews.llvm.org/D46272
    llvm-svn: 332930
* AMDGPU/GlobalISel: Implement select() for >32-bit G_STORE | Tom Stellard | 2018-05-11 | 1 | -0/+6
    Reviewers: arsenm, nhaehnle
    Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye
    Differential Revision: https://reviews.llvm.org/D46153
    llvm-svn: 332154
* AMDGPU/GlobalISel: Enable TableGen'd instruction selector | Tom Stellard | 2018-05-10 | 1 | -0/+21
    Reviewers: arsenm, nhaehnle
    Reviewed By: arsenm
    Subscribers: kzhuravl, wdng, mgorny, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
    Differential Revision: https://reviews.llvm.org/D45994
    llvm-svn: 332039
* Remove \brief commands from doxygen comments. | Adrian Prantl | 2018-05-01 | 1 | -2/+2
    We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all.
    Patch produced by
      for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
    Differential Revision: https://reviews.llvm.org/D46290
    llvm-svn: 331272
* AMDGPU: Move a flawed assert when spilling SGPRs | Matt Arsenault | 2018-04-23 | 1 | -0/+4
    It's possible to validly spill the frame offset register in a call sequence to a VGPR. There are definitely issues with SGPR spilling to memory, so move the assert later.
    llvm-svn: 330612
* [AMDGPU] : fix for the crash in SIRegisterInfo when the register class is not found | Alexander Timofeev | 2018-03-01 | 1 | -1/+7
    Differential revision: https://reviews.llvm.org./D43334
    llvm-svn: 326451
* [AMDGPU] added writelane intrinsic | Tim Renouf | 2018-02-28 | 1 | -1/+12
    Summary: For use by LLPC SPV_AMD_shader_ballot extension.
    The v_writelane instruction was already implemented for use by SGPR spilling, but I had to add an extra dummy operand tied to the destination, to represent that all lanes except the selected one keep the old value of the destination register.
    .ll test changes were due to schedule changes caused by that new operand.
    Differential Revision: https://reviews.llvm.org/D42838
    llvm-svn: 326353
* [AMDGPU] Make sure all super regs of reserved regs are marked reserved. | Geoff Berry | 2018-01-24 | 1 | -7/+0
    Summary: Move reserveRegisterTuples into AMDGPURegisterInfo and use it in R600RegisterInfo::getReservedRegs and R600InstrInfo::reserveIndirectRegisters to ensure that all super registers of reserved registers are also marked as reserved.
    Before this change, under certain circumstances, the registers %t1_x and %t1_xyzw would be marked as reserved, but %t1_xy and %t1_xyz would not be, leading to the register allocator sometimes assigning a register to %t1_xy, which is invalid since %t1_x is reserved.
    Reviewers: arsenm, tstellar, MatzeB, qcolombet
    Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, mcrosier, llvm-commits
    Differential Revision: https://reviews.llvm.org/D42448
    llvm-svn: 323356
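The invariant described here, that reserving a register must also reserve every register containing it, can be sketched with a toy model. This is not the LLVM `reserveRegisterTuples` implementation; the string register names, the `SuperRegMap` type, and the function name are hypothetical, and the map is assumed to list all (transitive) super-registers of each register.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy register hierarchy: register name -> every register that contains it.
using SuperRegMap = std::map<std::string, std::vector<std::string>>;

// Marks reg and all of its super-registers as reserved.
inline void reserveWithSuperRegs(std::set<std::string> &reserved,
                                 const SuperRegMap &superRegs,
                                 const std::string &reg) {
  reserved.insert(reg);
  auto it = superRegs.find(reg);
  if (it == superRegs.end())
    return; // no known super-registers
  for (const std::string &super : it->second)
    reserved.insert(super);
}
```

Reserving `t1_x` with the super-registers `t1_xy`, `t1_xyz` and `t1_xyzw` listed in the map leaves all four reserved, which rules out the invalid allocation of `t1_xy` described in the message.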
* [NFC] fix trivial typos in comments | Hiroshi Inoue | 2018-01-22 | 1 | -2/+2
    "the the" -> "the"
    llvm-svn: 323074
* [AMDGPU][MC][GFX8][GFX9] Added XNACK_MASK support | Dmitry Preobrazhensky | 2018-01-10 | 1 | -0/+3
    See bug 35764: https://bugs.llvm.org/show_bug.cgi?id=35764
    Differential Revision: https://reviews.llvm.org/D41614
    Reviewers: vpykhtin, artem.tamazov, arsenm
    llvm-svn: 322189
* MachineFunction: Return reference from getFunction(); NFC | Matthias Braun | 2017-12-15 | 1 | -1/+1
    The Function can never be nullptr so we can return a reference.
    llvm-svn: 320884
* [AMDGPU][MC][GFX9] Corrected encoding of ttmp registers, disabled tba/tma | Dmitry Preobrazhensky | 2017-12-11 | 1 | -0/+2
    See bugs 35494 and 35559:
    https://bugs.llvm.org/show_bug.cgi?id=35494
    https://bugs.llvm.org/show_bug.cgi?id=35559
    Reviewers: vpykhtin, artem.tamazov, arsenm
    Differential Revision: https://reviews.llvm.org/D41007
    llvm-svn: 320375
* AMDGPU: Use carry-less adds in FI elimination | Matt Arsenault | 2017-11-30 | 1 | -8/+2
    llvm-svn: 319501
* [CodeGen] Print "%vreg0" as "%0" in both MIR and debug output | Francis Visoiu Mistrih | 2017-11-30 | 1 | -5/+5
    As part of the unification of the debug format and the MIR format, avoid printing "vreg" for virtual registers (which is one of the current MIR possibilities).
    Basically:
    * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g"
    * grep -nr '%vreg' . and fix if needed
    * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g"
    * grep -nr 'vreg[0-9]\+' . and fix if needed
    Differential Revision: https://reviews.llvm.org/D40420
    llvm-svn: 319427
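The two sed substitutions in this message are plain regular-expression rewrites. A minimal sketch of the same rewrites applied to a single string with `std::regex`; the function name is hypothetical and this is only an illustration of the patterns, not the tooling used for the patch:

```cpp
#include <regex>
#include <string>

// Applies both substitutions from the commit message to one string:
//   s/%vreg([0-9]+)/%\1/g   and   s/ vreg([0-9]+)/ %\1/g
inline std::string stripVRegPrefix(const std::string &text) {
  static const std::regex percentForm("%vreg([0-9]+)");
  static const std::regex bareForm(" vreg([0-9]+)");
  // "$1" is the ECMAScript replacement syntax for the first capture group.
  std::string out = std::regex_replace(text, percentForm, "%$1");
  return std::regex_replace(out, bareForm, " %$1");
}
```

For example, `stripVRegPrefix("%vreg0 = COPY %vreg12")` returns `"%0 = COPY %12"`.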
* AMDGPU: Fix not converting d16 load/stores to offset | Matt Arsenault | 2017-11-13 | 1 | -1/+22
    Fixes missed optimization with new MUBUF instructions.
    llvm-svn: 318106
* [SystemZ] implement shouldCoalesce() | Jonas Paulsson | 2017-09-29 | 1 | -1/+2
    Implement shouldCoalesce() to help regalloc avoid running out of GR128 registers.
    If a COPY involving a subreg of a GR128 is coalesced, the live range of the GR128 virtual register will be extended. If this happens where there are enough phys-reg clobbers present, regalloc will run out of registers (if there is not a single GR128 allocatable register available).
    This patch tries to allow coalescing only when it can prove that this will be safe by checking the (local) interval in question.
    Review: Ulrich Weigand, Quentin Colombet
    https://reviews.llvm.org/D37899
    https://bugs.llvm.org/show_bug.cgi?id=34610
    llvm-svn: 314516
* AMDGPU: Pass special input registers to functions | Matt Arsenault | 2017-08-03 | 1 | -55/+0
    llvm-svn: 309998
* AMDGPU: Initial implementation of calls | Matt Arsenault | 2017-08-01 | 1 | -2/+9
    Includes a hack to fix the type selected for the GlobalAddress of the function, which will be fixed by changing the default datalayout to use generic pointers for 0.
    llvm-svn: 309732
* AMDGPU: Move INDIRECT_BASE_ADDR definition out of common files | Tom Stellard | 2017-07-29 | 1 | -1/+0
    Summary: This is only used by R600.
    Reviewers: arsenm
    Reviewed By: arsenm
    Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye
    Differential Revision: https://reviews.llvm.org/D35926
    llvm-svn: 309476
* AMDGPU: Preserve undef flag in eliminateFrameIndex | Matt Arsenault | 2017-07-21 | 1 | -10/+9
    Fixes verifier errors in some call tests. Not sure why we haven't run into this before.
    Test split into separate patch for once call support is committed.
    llvm-svn: 308774
* Implement LaneBitmask::getNumLanes and LaneBitmask::getHighestLane | Krzysztof Parzyszek | 2017-07-20 | 1 | -2/+1
    This should eliminate most uses of countPopulation and Log2_32 on the lane mask values.
    llvm-svn: 308658
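The two helpers named in this message correspond to a population count and a floor log2 of the mask. A self-contained sketch of that idea on a toy type; `ToyLaneMask` and its members are hypothetical stand-ins and do not reproduce the real `LaneBitmask` class:

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a lane mask: one bit per lane.
struct ToyLaneMask {
  std::uint32_t bits;

  // Number of set lanes (what a population count would return).
  unsigned numLanes() const {
    unsigned count = 0;
    for (std::uint32_t v = bits; v != 0; v &= v - 1) // clear lowest set bit
      ++count;
    return count;
  }

  // Index of the highest set lane (floor of log2); mask must be non-empty.
  unsigned highestLane() const {
    assert(bits != 0 && "highestLane() needs at least one lane");
    unsigned index = 0;
    for (std::uint32_t v = bits; v >>= 1;)
      ++index;
    return index;
  }
};
```

For a mask with bits 1, 2 and 4 set (`0x16`), `numLanes()` returns 3 and `highestLane()` returns 4.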
* AMDGPU: Figure out private memory regs after lowering | Matt Arsenault | 2017-07-18 | 1 | -0/+4
    Introduce pseudo-registers for registers needed for stack access, which are replaced during finalizeLowering. Note these pseudo-registers are currently only used for the used register location, and not for determining their input argument register.
    This is better because it avoids the need to try to predict whether a call will be emitted from the IR, and also detects stack objects introduced by legalization.
    Test changes are from the HasStackObjects check being more accurate since stack objects introduced during legalization are now known.
    llvm-svn: 308325
* AMDGPU: Partially fix implicit.buffer.ptr intrinsic handling | Matt Arsenault | 2017-06-26 | 1 | -6/+5
    This should not be treated as a different version of private_segment_buffer. These are distinct things with different uses and register classes, and requires the function argument info to have more context about the function's type and environment.
    Also add missing test coverage for the intrinsic, and emit an error for HSA. This also uncovers that the intrinsic is broken unless there happen to be stack objects.
    llvm-svn: 306264
* AMDGPU: Fix scratch wave offset relative FI expansion | Matt Arsenault | 2017-06-19 | 1 | -9/+20
    The offset may not be an inline immediate, so this needs to be materialized into a register. The post-RA run of SIShrinkInstructions is able to fold it later if it can.
    llvm-svn: 305761
* AMDGPU: Work around build special casing .inc files | Matt Arsenault | 2017-06-08 | 1 | -1/+2
    It complains because it assumes these were autogenerated files in the source directory.
    llvm-svn: 305005
* AMDGPU: Use correct register names in inline assembly | Matt Arsenault | 2017-06-08 | 1 | -0/+59
    Fixes using physical registers in inline asm from clang.
    llvm-svn: 305004
* Sort the remaining #include lines in include/... and lib/.... | Chandler Carruth | 2017-06-06 | 1 | -1/+1
    I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days.
    I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch.
    This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files.
    Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again).
    llvm-svn: 304787
* AMDGPU: Start defining a calling convention | Matt Arsenault | 2017-05-17 | 1 | -8/+35
    Partially implement callee-side for arguments and return values. byval doesn't work properly, and most likely sret or other on-stack return values don't either.
    llvm-svn: 303308
* AMDGPU: Expand frame indexes to be relative to scratch wave offset | Matt Arsenault | 2017-05-17 | 1 | -6/+71
    In order for an arbitrary callee to access an object in a caller's stack frame, the 32-bit offset used as the private pointer needs to be relative to the kernel's scratch wave offset register.
    Convert to this by finding the difference from the current stack frame and scaling by the wavefront size.
    llvm-svn: 303303
* AMDGPU: Use appropriate soffset for spilling | Matt Arsenault | 2017-05-17 | 1 | -13/+13
    This needs to be the frame offset register, and not the global scratch wave offset register. For kernels, these are the same.
    llvm-svn: 303287
* [AMDGPU] Merge M0 initializations | Stanislav Mekhanoshin | 2017-04-24 | 1 | -0/+3
    Merges equivalent initializations of M0 and hoists them into a common dominator block. Technically the same code can be used with any register, physical or virtual.
    Differential Revision: https://reviews.llvm.org/D32279
    llvm-svn: 301228
* Move size and alignment information of regclass to TargetRegisterInfo | Krzysztof Parzyszek | 2017-04-24 | 1 | -28/+31
    1. RegisterClass::getSize() is split into two functions:
       - TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const;
       - TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const;
    2. RegisterClass::getAlignment() is replaced by:
       - TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const;
    This will allow making those values depend on subtarget features in the future.
    Differential Revision: https://reviews.llvm.org/D31783
    llvm-svn: 301221
* Fix typo | Matt Arsenault | 2017-04-18 | 1 | -1/+1
    llvm-svn: 300597
* [AMDGPU] added SIInstrInfo::getAddNoCarry() helper | Stanislav Mekhanoshin | 2017-04-14 | 1 | -3/+1
    Addressed rest of post submit comments from D31993.
    Differential Revision: https://reviews.llvm.org/D32057
    llvm-svn: 300288
* Revert "Correct register pressure calculation in presence of subregs" | Stanislav Mekhanoshin | 2017-02-24 | 1 | -16/+0
    This reverts commit r296009. It broke one out of tree target and also does not account for all partial lines added or removed when calculating PressureDiff.
    llvm-svn: 296182
* Correct register pressure calculation in presence of subregs | Stanislav Mekhanoshin | 2017-02-23 | 1 | -0/+16
    If a subreg is used in an instruction it counts as a whole superreg for the purpose of register pressure calculation. This patch corrects improper register pressure calculation by examining operand's lane mask.
    Differential Revision: https://reviews.llvm.org/D29835
    llvm-svn: 296009
* AMDGPU: Don't use stack space for SGPR->VGPR spills | Matt Arsenault | 2017-02-21 | 1 | -23/+88
    Before frame offsets are calculated, try to eliminate the frame indexes used by SGPR spills. Then we can delete them after.
    I think for now we can be sure that no other instruction will be re-using the same frame indexes. It should be easy to notice if this assumption ever breaks since everything asserts if it tries to use a dead frame index later.
    The unused emergency stack slot seems to still be left behind, so an additional 4 bytes is still wasted.
    llvm-svn: 295753
* AMDGPU: Merge initial gfx9 support | Matt Arsenault | 2017-02-18 | 1 | -0/+6
    llvm-svn: 295554
* [AMDGPU] Override PSet for M0 | Stanislav Mekhanoshin | 2017-02-10 | 1 | -0/+8
    This change returns empty PSet list for M0 register. Otherwise its PSet as defined by tablegen is SReg_32. This results in incorrect register pressure calculation every time an instruction uses M0. Such uses count as SReg_32 PSet and inadequately increase pressure on SGPRs.
    Differential Revision: https://reviews.llvm.org/D29798
    llvm-svn: 294691
* [AMDGPU] Implement register pressure callbacks | Stanislav Mekhanoshin | 2017-02-08 | 1 | -0/+31
    Implement getRegPressureLimit and getRegPressureSetLimit callbacks in SIRegisterInfo. This makes the standard converge scheduler behave almost the same as GCNScheduler, sometimes slightly better, sometimes a bit worse.
    In general it is also possible to switch GCNScheduler to use these callbacks instead of getMaxWaves(), which also makes GCNScheduler slightly better on some tests and slightly worse on others. A big win is behavior with the converge scheduler.
    Note, these are used not only by scheduling, but in places like MachineLICM.
    Differential Revision: https://reviews.llvm.org/D29700
    llvm-svn: 294518