summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU
Commit message (Collapse)AuthorAgeFilesLines
* AMDGPU/SI: Remove REGISTER_STORE/REGISTER_LOAD code which is now deadTom Stellard2015-12-013-81/+0
| | | | | | | | | | Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15050 llvm-svn: 254427
* AMDGPU: Use the default strings for data emission directivesTom Stellard2015-12-011-7/+0
| | | | | | | | | | | | | | Summary: This makes the assembly output look nicer and there is no reason to have custom strings for these. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D14671 llvm-svn: 254426
* Replace all weight-based interfaces in MBB with probability-based ↵Cong Hou2015-12-011-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | interfaces, and update all uses of old interfaces. (This is the second attempt to submit this patch. The first caused two assertion failures and was reverted. See https://llvm.org/bugs/show_bug.cgi?id=25687) The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes (http://reviews.llvm.org/D13908). 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights (http://reviews.llvm.org/D14361). 3. Use new interfaces in all other passes. 4. Remove old interfaces. This patch is 3+4 above. In this patch, MBB won't provide weight-based interfaces any more, which are totally replaced by probability-based ones. The interface addSuccessor() is redesigned so that the default probability is unknown. We allow unknown probabilities but don't allow using it together with known probabilities in successor list. That is to say, we either have a list of successors with all known probabilities, or all unknown probabilities. In the latter case, we assume each successor has 1/N probability where N is the number of successors. An assertion checks if the user is attempting to add a successor with the disallowed mixed use as stated above. This can help us catch many misuses. All uses of weight-based interfaces are now updated to use probability-based ones. Differential revision: http://reviews.llvm.org/D14973 llvm-svn: 254377
* Revert r254348: "Replace all weight-based interfaces in MBB with ↵Hans Wennborg2015-12-011-2/+4
| | | | | | | | | | probability-based interfaces, and update all uses of old interfaces." and the follow-up r254356: "Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction." Asserts were firing in Chromium builds. See PR25687. llvm-svn: 254366
* Squelch unused variable warning in SIRegisterInfo.cpp.Matt Arsenault2015-12-011-1/+2
| | | | | | Patch by Justin Lebar llvm-svn: 254362
* Replace all weight-based interfaces in MBB with probability-based ↵Cong Hou2015-12-011-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | interfaces, and update all uses of old interfaces. The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes (http://reviews.llvm.org/D13908). 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights (http://reviews.llvm.org/D14361). 3. Use new interfaces in all other passes. 4. Remove old interfaces. This patch is 3+4 above. In this patch, MBB won't provide weight-based interfaces any more, which are totally replaced by probability-based ones. The interface addSuccessor() is redesigned so that the default probability is unknown. We allow unknown probabilities but don't allow using it together with known probabilities in successor list. That is to say, we either have a list of successors with all known probabilities, or all unknown probabilities. In the latter case, we assume each successor has 1/N probability where N is the number of successors. An assertion checks if the user is attempting to add a successor with the disallowed mixed use as stated above. This can help us catch many misuses. All uses of weight-based interfaces are now updated to use probability-based ones. Differential revision: http://reviews.llvm.org/D14973 llvm-svn: 254348
* AMDGPU: Fix unused functionMatt Arsenault2015-11-301-5/+0
| | | | llvm-svn: 254333
* AMDGPU: Error if too many user SGPRs usedMatt Arsenault2015-11-302-0/+8
| | | | llvm-svn: 254332
* AMDGPU: Rework how private buffer passed for HSAMatt Arsenault2015-11-309-102/+594
| | | | | | | | | | | | | | | | If we know we have stack objects, we reserve the registers that the private buffer resource and wave offset are passed and use them directly. If not, reserve the last 5 SGPRs just in case we need to spill. After register allocation, try to pick the next available registers instead of the last SGPRs, and then insert copies from the inputs to the reserved registers in the progloue. This also only selectively enables all of the input registers which are really required instead of always enabling them. llvm-svn: 254331
* AMDGPU: Rename enums to be consistent with HSA code object terminologyMatt Arsenault2015-11-305-50/+49
| | | | llvm-svn: 254330
* AMDGPU: Remove SIPrepareScratchRegsMatt Arsenault2015-11-3012-249/+123
| | | | | | | | | | | | | | | | | | | | | | It does not work because of emergency stack slots. This pass was supposed to eliminate dummy registers for the spill instructions, but the register scavenger can introduce more during PrologEpilogInserter, so some would end up left behind if they were needed. The potential for spilling the scratch resource descriptor and offset register makes doing something like this overly complicated. Reserve registers to use for the resource descriptor and use them directly in eliminateFrameIndex. Also removes creating another scratch resource descriptor when directly selecting scratch MUBUF instructions. The choice of which registers are reserved is temporary. For now it attempts to pick the next available registers after the user and system SGPRs. llvm-svn: 254329
* AMDGPU: Use assert zext for workgroup sizesMatt Arsenault2015-11-302-10/+24
| | | | llvm-svn: 254328
* AMDGPU: Don't reserve SCRATCH_PTR input registerMatt Arsenault2015-11-301-12/+4
| | | | | | This hasn't been doing anything since using relocations was added. llvm-svn: 254304
* AMDGPU: Add llvm.amdgcn.dispatch.ptr intrinsicTom Stellard2015-11-265-1/+28
| | | | | | | | | | | | | | Summary: This returns a pointer to the dispatch packet, which can be used to load information about the kernel dispach. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D14898 llvm-svn: 254116
* AMDGPU/SI: select S_ABS_I32 when possible (v2)Marek Olsak2015-11-253-0/+37
| | | | | | | | | | v2: added more tests, moved the SALU->VALU conversion to a separate function It looks like it's not possible to get subregisters in the S_ABS lowering code, and I don't feel like guessing without testing what the correct code would look like. llvm-svn: 254095
* AMDGPU: Check feature attributes in SIMachineFunctionInfoMatt Arsenault2015-11-252-9/+134
| | | | llvm-svn: 254091
* AMDGPU: Make v2i64/v2f64 legal types.Matt Arsenault2015-11-253-3/+60
| | | | | | | They can be loaded and stored, so count them as legal. This is mostly to fix a number of common cases for load/store merging. llvm-svn: 254086
* Expose isXxxConstant() functions from SelectionDAGNodes.h (NFC)Artyom Skrobov2015-11-251-10/+3
| | | | | | | | | | | | | | Summary: Many target lowerings copy-paste the code to test SDValues for known constants. This code can instead be shared in SelectionDAG.cpp, and reused in the targets. Reviewers: MatzeB, andreadb, tstellarAMD Subscribers: arsenm, jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D14945 llvm-svn: 254085
* AMDGPU: Split LDS vector loadsMatt Arsenault2015-11-242-2/+3
| | | | | | If properly aligned this could allow using ds_read_b64. llvm-svn: 253975
* AMDGPU: Split x8 and x16 vector loads instead of scalarizeMatt Arsenault2015-11-242-1/+15
| | | | | | | | The one regression in the builtin tests is in the read2 test which now (again) has many extra copies, but this should be solved once the pass is replaced with a DAG combine. llvm-svn: 253974
* Revert "Change memcpy/memset/memmove to have dest and source alignments."Pete Cooper2015-11-191-3/+3
| | | | | | | | | | This reverts commit r253511. This likely broke the bots in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202 http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787 llvm-svn: 253543
* Change memcpy/memset/memmove to have dest and source alignments.Pete Cooper2015-11-181-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html These intrinsics currently have an explicit alignment argument which is required to be a constant integer. It represents the alignment of the source and dest, and so must be the minimum of those. This change allows source and dest to each have their own alignments by using the alignment attribute on their arguments. The alignment argument itself is removed. There are a few places in the code for which the code needs to be checked by an expert as to whether using only src/dest alignment is safe. For those places, they currently take the minimum of src/dest alignments which matches the current behaviour. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false) For out of tree owners, I was able to strip alignment from calls using sed by replacing: (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\) with: $1i1 false) and similarly for memmove and memcpy. I then added back in alignment to test cases which needed it. A similar commit will be made to clang which actually has many differences in alignment as now IRBuilder can generate different source/dest alignments on calls. In IRBuilder itself, a new argument was added. Instead of calling: CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false) you now call CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false) There is a temporary class (IntegerAlignment) which takes the source alignment and rejects implicit conversion from bool. This is to prevent isVolatile here from passing its default parameter to the source alignment. Note, changes in future can now be made to codegen. I didn't change anything here, but this change should enable better memcpy code sequences. Reviewed by Hal Finkel. llvm-svn: 253511
* Reduce the size of MCRelaxableFragment.Akira Hatanaka2015-11-141-2/+4
| | | | | | | | | | | | | | | | | | | | | | MCRelaxableFragment previously kept a copy of MCSubtargetInfo and MCInst to enable re-encoding the MCInst later during relaxation. A copy of MCSubtargetInfo (instead of a reference or pointer) was needed because the feature bits could be modified by the parser. This commit replaces the MCSubtargetInfo copy in MCRelaxableFragment with a constant reference to MCSubtargetInfo. The copies of MCSubtargetInfo are kept in MCContext, and the target parsers are now responsible for asking MCContext to provide a copy whenever the feature bits of MCSubtargetInfo have to be toggled. With this patch, I saw a 4% reduction in peak memory usage when I compiled verify-uselistorder.lto.bc using llc. rdar://problem/21736951 Differential Revision: http://reviews.llvm.org/D14346 llvm-svn: 253127
* [MCTargetAsmParser] Move the member varialbes that referenceAkira Hatanaka2015-11-141-9/+7
| | | | | | | | | | MCSubtargetInfo in the subclasses into MCTargetAsmParser and define a member function getSTI. This is done in preparation for making changes to shrink the size of MCRelaxableFragment. (see http://reviews.llvm.org/D14346). llvm-svn: 253124
* AMDGPU: Add stony supportTom Stellard2015-11-131-0/+4
| | | | | | Patch by: Alex Deucher llvm-svn: 253053
* Revert "Remove unnecessary call to getAllocatableRegClass"Tom Stellard2015-11-123-6/+16
| | | | | | | | | | | | | This reverts commit r252565. This also includes the revert of the commit mentioned below in order to avoid breaking tests in AMDGPU: Revert "AMDGPU: Set isAllocatable = 0 on VS_32/VS_64" This reverts commit r252674. llvm-svn: 252956
* AMDGPU: Print more fields in commentsMatt Arsenault2015-11-111-3/+14
| | | | llvm-svn: 252677
* AMDGPU: Remove dead codeMatt Arsenault2015-11-111-33/+2
| | | | llvm-svn: 252675
* AMDGPU: Set isAllocatable = 0 on VS_32/VS_64Matt Arsenault2015-11-113-16/+6
| | | | llvm-svn: 252674
* AMDGPU/SI: Refactor VOP[12C] tablegen definitionsTom Stellard2015-11-062-97/+75
| | | | | | | | | | | | | | Summary: Pass the VOPProfile object all the through to *_m multiclasses. This will allow us to do more simplifications in the future. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13437 llvm-svn: 252339
* AMDGPU: Cleanup includesMatt Arsenault2015-11-062-6/+4
| | | | llvm-svn: 252328
* AMDGPU: Create emergency stack slots during frame loweringMatt Arsenault2015-11-067-14/+89
| | | | | | Test has a bogus verifier error which will be fixed by later commits. llvm-svn: 252327
* AMDGPU: Remove unused scratch resource operandsMatt Arsenault2015-11-062-75/+131
| | | | | | The SGPR spill pseudos don't actually use them. llvm-svn: 252324
* AMDGPU: Add pass to detect used kernel featuresMatt Arsenault2015-11-064-0/+138
| | | | | | | | | | | Mark kernels that use certain features that require user SGPRs to support with kernel attributes. We need to know before instruction selection begins because it impacts the kernel calling convention lowering. For now this only detects the workitem intrinsics. llvm-svn: 252323
* AMDGPU: Fix hardcoded alignment of spill.Matt Arsenault2015-11-062-13/+12
| | | | | | | Instead of forcing 4 alignment when spilled, set register class alignments. llvm-svn: 252322
* AMDGPU: Hack for VS_32 register pressureMatt Arsenault2015-11-062-4/+17
| | | | | | | | | | | | | For some reason VS_32 ends up factoring into the pressure heuristics even though we should never see a virtual register with this class. When SGPRs are reserved for register spilling, this for some reason triggers reg-crit scheduling. Setting isAllocatable = 0 may help with this since that seems to remove it from the default implementation's generated table. llvm-svn: 252321
* AMDGPU/SI: Emit HSA kernels with symbol type STT_AMDGPU_HSA_KERNELTom Stellard2015-11-066-0/+60
| | | | | | | | | | Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13804 llvm-svn: 252291
* AMDGPU: Also track whether SGPRs were spilledMatt Arsenault2015-11-053-2/+20
| | | | llvm-svn: 252145
* AMDGPU: Print number user SGPRsMatt Arsenault2015-11-051-0/+6
| | | | | | | This doesn't quite match how SC prints it, which doesn't put it in a comment. llvm-svn: 252144
* AMDGPU: Disallow s[102:103] on VI in assemblerMatt Arsenault2015-11-051-2/+28
| | | | llvm-svn: 252142
* AMDGPU: Fix assert when legalizing atomic operandsMatt Arsenault2015-11-053-15/+59
| | | | | | | | | | The operand layout is slightly different for the atomic opcodes from the usual MUBUF loads and stores. This should only fix it on SI/CI. VI is still broken because it still emits the addr64 replacement. llvm-svn: 252140
* AMDGPU: Make addr64 atomic operand order consistentMatt Arsenault2015-11-051-2/+2
| | | | | | | vaddr comes before srsrc in every other MUBUF instruction, and is the order it is printed. llvm-svn: 252139
* AMDGPU: Fix typoMatt Arsenault2015-11-051-2/+2
| | | | llvm-svn: 252116
* AMDGPU: Make flat_scratch name consistentMatt Arsenault2015-11-031-3/+3
| | | | | | | The printed name and the parsed assembler names weren't the same. I'm not sure which name SC prints these as, but I think it's this one. llvm-svn: 252010
* AMDGPU: Fix asserts on invalid register rangesMatt Arsenault2015-11-031-5/+13
| | | | | | | | | If the requested SGPR was not actually aligned, it was accepted and rounded down instead of rejected. Also fix an assert if the range is an invalid size. llvm-svn: 252009
* AMDGPU: Fix off by one error in register parsingMatt Arsenault2015-11-031-4/+5
| | | | | | If trying to use one past the end, this would assert. llvm-svn: 252008
* AMDGPU: s[102:103] is unavailable on VIMatt Arsenault2015-11-031-1/+10
| | | | llvm-svn: 252000
* AMDGPU: Define correct number of SGPRsMatt Arsenault2015-11-032-6/+10
| | | | | | | | | There are actually 104 so 2 were missing. More assembler tests with high register number tuples will be included in later patches. llvm-svn: 251999
* AMDGPU: Make findUsedSGPR more readableMatt Arsenault2015-11-031-7/+18
| | | | | | Add more comments etc. llvm-svn: 251996
* AMDGPU: Initialize SIFixSGPRCopies so -print-after worksMatt Arsenault2015-11-033-8/+15
| | | | llvm-svn: 251995
OpenPOWER on IntegriCloud