summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU: Use generic bitreverse intrinsicMatt Arsenault2015-12-141-0/+1
| | | | | | Also fix bug in vector legalization for bitreverse. llvm-svn: 255512
* AMDGPU/SI: Emit constant arrays in the .text sectionTom Stellard2015-12-101-13/+1
| | | | | | | | | | | | | | | Summary: This allows us to remove the END_OF_TEXT_LABEL hack we had been using and simplifies the fixups used to compute the address of constant arrays. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15257 llvm-svn: 255204
* AMDGPU/SI: Add support for sgpr and vgpr inline assembly constraintsTom Stellard2015-12-101-6/+47
| | | | | | | | | | | | Summary: The 's' constraint represents sgprs and the 'v' constraint represents vgprs. Reviewers: arsenm, echristo Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15342 llvm-svn: 255203
* AMDGPU: Implement isNoopAddrSpaceCastMatt Arsenault2015-12-011-0/+11
| | | | llvm-svn: 254468
* AMDGPU/SI: Remove REGISTER_STORE/REGISTER_LOAD code which is now deadTom Stellard2015-12-011-16/+0
| | | | | | | | | | Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15050 llvm-svn: 254427
* AMDGPU: Fix unused functionMatt Arsenault2015-11-301-5/+0
| | | | llvm-svn: 254333
* AMDGPU: Rework how private buffer passed for HSAMatt Arsenault2015-11-301-32/+128
| | | | | | | | | | | | | | | | If we know we have stack objects, we reserve the registers that the private buffer resource and wave offset are passed and use them directly. If not, reserve the last 5 SGPRs just in case we need to spill. After register allocation, try to pick the next available registers instead of the last SGPRs, and then insert copies from the inputs to the reserved registers in the progloue. This also only selectively enables all of the input registers which are really required instead of always enabling them. llvm-svn: 254331
* AMDGPU: Rename enums to be consistent with HSA code object terminologyMatt Arsenault2015-11-301-16/+10
| | | | llvm-svn: 254330
* AMDGPU: Remove SIPrepareScratchRegsMatt Arsenault2015-11-301-11/+6
| | | | | | | | | | | | | | | | | | | | | | It does not work because of emergency stack slots. This pass was supposed to eliminate dummy registers for the spill instructions, but the register scavenger can introduce more during PrologEpilogInserter, so some would end up left behind if they were needed. The potential for spilling the scratch resource descriptor and offset register makes doing something like this overly complicated. Reserve registers to use for the resource descriptor and use them directly in eliminateFrameIndex. Also removes creating another scratch resource descriptor when directly selecting scratch MUBUF instructions. The choice of which registers are reserved is temporary. For now it attempts to pick the next available registers after the user and system SGPRs. llvm-svn: 254329
* AMDGPU: Use assert zext for workgroup sizesMatt Arsenault2015-11-301-10/+21
| | | | llvm-svn: 254328
* AMDGPU: Don't reserve SCRATCH_PTR input registerMatt Arsenault2015-11-301-12/+4
| | | | | | This hasn't been doing anything since using relocations was added. llvm-svn: 254304
* AMDGPU: Add llvm.amdgcn.dispatch.ptr intrinsicTom Stellard2015-11-261-0/+16
| | | | | | | | | | | | | | Summary: This returns a pointer to the dispatch packet, which can be used to load information about the kernel dispach. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D14898 llvm-svn: 254116
* AMDGPU: Make v2i64/v2f64 legal types.Matt Arsenault2015-11-251-1/+43
| | | | | | | They can be loaded and stored, so count them as legal. This is mostly to fix a number of common cases for load/store merging. llvm-svn: 254086
* AMDGPU: Split LDS vector loadsMatt Arsenault2015-11-241-1/+2
| | | | | | If properly aligned this could allow using ds_read_b64. llvm-svn: 253975
* AMDGPU: Split x8 and x16 vector loads instead of scalarizeMatt Arsenault2015-11-241-1/+5
| | | | | | | | The one regression in the builtin tests is in the read2 test which now (again) has many extra copies, but this should be solved once the pass is replaced with a DAG combine. llvm-svn: 253974
* AMDGPU: Error on graphics shaders with HSAMatt Arsenault2015-11-021-0/+8
| | | | | | | | I've found myself pointlessly debugging problems from running graphics tests with an HSA triple a few times, so stop this from happening again. llvm-svn: 251858
* AMDGPU/SI: handle undef for llvm.SI.packf16Marek Olsak2015-10-291-0/+4
| | | | llvm-svn: 251632
* AMDGPU: Simplify VOP3 operand legalization.Matt Arsenault2015-10-211-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | This was checking for a variety of situations that should never happen. This saves a tiny bit of compile time. We should not be selecting instructions with invalid operands in the first place. Most of the time for registers copys are inserted to the correct operand register class. For VOP3, since all operand types are supported and literal constants never are, we just need to verify the constant bus requirements (all immediates should be legal inline ones). The only possibly tricky case to maybe worry about is if when legalizing operands in moveToVALU with s_add_i32 and similar instructions. If the original s_add_i32 had a literal constant and we need to replace it with v_add_i32_e64 we would have an unsupported literal operand. However, I don't think we should worry about that because SIFoldOperands should handle folding literal constant operands into the SALU instructions based on the uses. At SIFoldOperands time, the legality and profitability of operand types is a bit different. llvm-svn: 250951
* AMDGPU: Add MachineInstr overloads for instruction format testsMatt Arsenault2015-10-201-1/+1
| | | | llvm-svn: 250797
* AMDGPU/SI: Remove calling convention assertion from LowerFormalArguments()Tom Stellard2015-10-061-1/+1
| | | | | | | | | | | | | | Summary: We currently ignore the calling convention, so there is no real reason to assert on the calling convention of functions. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13367 llvm-svn: 249468
* AMDGPU/SI: Don't set DATA_FORMAT if ADD_TID_ENABLE is setMarek Olsak2015-09-291-3/+1
| | | | | | | | | | to prevent setting a huge stride, because DATA_FORMAT has a different meaning if ADD_TID_ENABLE is set. This is a candidate for stable llvm 3.7. Tested-and-Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 248858
* AMDGPU: Re-justify workaround and fix worked around problemMatt Arsenault2015-09-251-38/+23
| | | | | | | | | | | | | | | When buffer resource descriptors were built, the upper two components of the descriptor were first composed into a 64-bit register because legalizeOperands assumed all operands had the same register class. Fix that problem, but keep the workaround. I'm not sure anything actually is actually emitting such a REG_SEQUENCE now. If multiple resource descriptors are set up with different base pointers, this is copied with a single s_mov_b64. We probably should fix this better by recognizing a pair of s_mov_b32 later, but for now delete the dead code. llvm-svn: 248585
* Reformat comment lines.NAKAMURA Takumi2015-09-221-1/+1
| | | | llvm-svn: 248262
* Use makeArrayRef or None to avoid unnecessarily mentioning the ArrayRef type ↵Craig Topper2015-09-211-1/+1
| | | | | | extra times. NFC llvm-svn: 248140
* propagate fast-math-flags on DAG nodesSanjay Patel2015-09-161-1/+8
| | | | | | | | | | | | | | | | | | | After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing, so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests: if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is one test case in this patch to prove that point. This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF ( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes. This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the current global settings. Differential Revision: http://reviews.llvm.org/D12095 llvm-svn: 247815
* SelectionDAG: Support Expand of f16 extloadsMatt Arsenault2015-09-091-29/+3
| | | | | | | | | | Currently this hits an assert that extload should always be supported, which assumes integer extloads. This moves a hack out of SI's argument lowering and is covered by existing tests. llvm-svn: 247113
* check for fastness before merging in DAGCombiner::MergeConsecutiveStores() Sanjay Patel2015-09-031-1/+4
| | | | | | | | | | | | | | | | Use and check the 'IsFast' optional parameter to TLI.allowsMemoryAccess() any time we have a merged access candidate. Without this patch, we were generating unaligned 16-byte (SSE) memops for x86 targets where those accesses are slow. This change was mentioned in: http://reviews.llvm.org/D10662 and http://reviews.llvm.org/D10905 and will help solve PR21711. Differential Revision: http://reviews.llvm.org/D12573 llvm-svn: 246771
* Fix some comment typos.Benjamin Kramer2015-08-081-5/+5
| | | | llvm-svn: 244402
* AMDGPU: Assume SMRD access for constant address spaceMatt Arsenault2015-08-071-40/+75
| | | | | | | Since r243294 these are selected to SMRD and moved later if required. llvm-svn: 244354
* De-constify pointers to Type since they can't be modified. NFCCraig Topper2015-08-011-1/+1
| | | | | | This was already done in most places a while ago. This just fixes the ones that crept in over time. llvm-svn: 243842
* AMDGPU: Fix v16i32 to v16i8 truncstoreMatt Arsenault2015-07-311-0/+1
| | | | llvm-svn: 243731
* AMDGPU/SI: Add VI patterns to select FLAT instructions for global memory opsTom Stellard2015-07-201-6/+23
| | | | | | | | | | | | | | Summary: The MUBUF addr64 bit has been removed on VI, so we must use FLAT instructions when the pointer is stored in VGPRs. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11067 llvm-svn: 242673
* AMDPGU/SI: Use AssertZext node to mask high bit for scratch offsetsTom Stellard2015-07-161-2/+28
| | | | | | | | | | | | | | Summary: We can safely assume that the high bit of scratch offsets will never be set, because this would require at least 128 GB of GPU memory. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11225 llvm-svn: 242433
* AMDGPU: Fix chains for memory ops dependent on argument loadsMatt Arsenault2015-07-101-4/+19
| | | | | | | | | | | | | | | Most loads and stores are derived from pointers derived from a kernel argument load inserted during argument lowering. This was just using the EntryToken chain for the argument loads, and any users of these loads were also on the EntryToken chain. Return the chain of the lowered argument load so that dependent loads end up on the correct chain. No test since I'm not aware of any case where this actually broke. llvm-svn: 241960
* AMDGPU: Use requested chain when lowering argumentsMatt Arsenault2015-07-101-1/+1
| | | | | | | No test since I'm not aware of any case where this will end up being a different chain. llvm-svn: 241954
* AMDGPU: Add helper function for implicit parameter offsets.Tom Stellard2015-07-091-2/+2
| | | | | | Patch by: Zoltan Gilian llvm-svn: 241861
* Re-instate the EVT parameter to getScalarShiftAmountTy() for OOT userMehdi Amini2015-07-091-1/+1
| | | | | | | A documentation for this function would be nice by the way. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241807
* Remove getDataLayout() from TargetLoweringMehdi Amini2015-07-091-4/+4
| | | | | | | | | | | | | | | | Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: yaron.keren, rafael, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11042 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241779
* Make isLegalAddressingMode() taking DataLayout as an argumentMehdi Amini2015-07-091-2/+3
| | | | | | | | | | | | | | | | Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11040 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241778
* Make TargetLowering::getShiftAmountTy() taking DataLayout as an argumentMehdi Amini2015-07-091-1/+1
| | | | | | | | | | | | | | | | Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11037 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241776
* Make TargetLowering::getPointerTy() taking DataLayout as an argumentMehdi Amini2015-07-091-5/+7
| | | | | | | | | | | | | | | | Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11028 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241775
* [TargetLowering] StringRefize asm constraint getters.Benjamin Kramer2015-07-051-2/+1
| | | | | | | | There is some functional change here because it changes target code from atoi(3) to StringRef::getAsInteger which has error checking. For valid constraints there should be no difference. llvm-svn: 241411
* AMDGPU/SI: There are no implicit kernel args in the amdhsa ABITom Stellard2015-06-261-1/+2
| | | | | | | | | | Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10706 llvm-svn: 240830
* AMDGPU: Use getAsInteger instead of atoiMatt Arsenault2015-06-231-3/+5
| | | | llvm-svn: 240365
* R600 -> AMDGPU renameTom Stellard2015-06-131-0/+2241
| | | | llvm-svn: 239657
* Revert "AMDGPU: Add core backend files for R600/SI codegen v6"Tom Stellard2012-07-161-195/+0
| | | | | | This reverts commit 4ea70107c5e51230e9e60f0bf58a0f74aa4885ea. llvm-svn: 160303
* AMDGPU: Add core backend files for R600/SI codegen v6Tom Stellard2012-07-161-0/+195
llvm-svn: 160270
OpenPOWER on IntegriCloud