summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.h
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU: Add intrinsic for s_flbit_i32/v_ffbh_i32Matt Arsenault2016-07-181-0/+1
| | | | llvm-svn: 275871
* AMDGPU: Remove dead codeMatt Arsenault2016-07-141-1/+0
| | | | llvm-svn: 275369
* AMDGPU: Expand unaligned accesses earlyMatt Arsenault2016-07-011-1/+1
| | | | | | | | Due to visit order problems, in the case of an unaligned copy the legalized DAG fails to eliminate extra instructions introduced by the expansion of both unaligned parts. llvm-svn: 274397
* AMDGPU: Improve load/store of illegal types.Matt Arsenault2016-07-011-1/+4
| | | | | | | | | | There was a combine before to handle the simple copy case. Split this into handling loads and stores separately. We might want to change how this handles some of the vector extloads, since this can result in large code size increases. llvm-svn: 274394
* AMDGPU: Cleanup subtarget handling.Matt Arsenault2016-06-241-1/+1
| | | | | | | | | Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. llvm-svn: 273652
* AMDGPU: Fix verifier errors in SILowerControlFlowMatt Arsenault2016-06-221-1/+2
| | | | | | | | | | | | | The main sin this was committing was using terminator instructions in the middle of the block, and then not updating the block successors / predecessors. Split the blocks up to avoid this and introduce new pseudo instructions for branches taken with exec masking. Also use a pseudo instead of emitting s_endpgm and erasing it in the special case of a non-void return. llvm-svn: 273467
* AMDGPU: Add implicitarg.ptr intrinsic.Jan Vesely2016-06-211-2/+3
| | | | | | | | Points to the start of implicit arguments (appended after explicit arguments) Differential Revision: http://reviews.llvm.org/D20297 llvm-svn: 273317
* AMDGPU/SI: Refactor fixup handling for constant addrspace variablesTom Stellard2016-06-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | Summary: We now use a standard fixup type applying the pc-relative address of constant address space variables, and we have the GlobalAddress lowering code add the required 4 byte offset to the global address rather than doing it as part of the fixup. This refactoring will make it easier to use the same code for global address space variables and also simplifies the code. Re-commit this after fixing a bug where we were trying to use a reference to a Triple object that had already been destroyed. Reviewers: arsenm, kzhuravl Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21154 llvm-svn: 272705
* Revert "AMDGPU/SI: Refactor fixup handling for constant addrspace variables"Tom Stellard2016-06-141-1/+0
| | | | | | This reverts commit r272675. llvm-svn: 272677
* AMDGPU/SI: Refactor fixup handling for constant addrspace variablesTom Stellard2016-06-141-0/+1
| | | | | | | | | | | | | | | | | | | Summary: We now use a standard fixup type applying the pc-relative address of constant address space variables, and we have the GlobalAddress lowering code add the required 4 byte offset to the global address rather than doing it as part of the fixup. This refactoring will make it easier to use the same code for global address space variables and also simplifies the code. Reviewers: arsenm, kzhuravl Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21154 llvm-svn: 272675
* Pass DebugLoc and SDLoc by const ref.Benjamin Kramer2016-06-121-14/+8
| | | | | | | | This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512
* AMDGPU: Remove custom load/store scalarizationMatt Arsenault2016-04-141-6/+0
| | | | llvm-svn: 266385
* AMDGPU: Add atomic_inc + atomic_dec intrinsicsMatt Arsenault2016-04-121-0/+2
| | | | | | | These are different than atomicrmw add 1 because they have an additional input value to clamp the result. llvm-svn: 266074
* AMDGPU: Implement {BUFFER,FLAT}_ATOMIC_CMPSWAP{,_X2}Tom Stellard2016-04-011-0/+1
| | | | | | | | | | | | | | | | | Summary: Implement BUFFER_ATOMIC_CMPSWAP{,_X2} instructions on all GCN targets, and FLAT_ATOMIC_CMPSWAP{,_X2} on CI+. 32-bit instruction variants tested manually on Kabini and Bonaire. Tests and parts of code provided by Jan Veselý. Patch by: Vedran Miletić Reviewers: arsenm, tstellarAMD, nhaehnle Subscribers: jvesely, scchan, kanarayan, arsenm Differential Revision: http://reviews.llvm.org/D17280 llvm-svn: 265170
* AMDGPU: R600 code splitting cleanupMatt Arsenault2016-03-111-8/+3
| | | | | | | Move a few functions only used by R600 to R600 specific code, fix header macros to stop using R600, mark classes as final. llvm-svn: 263204
* AMDGPU: Move function only used by R600Matt Arsenault2016-03-071-1/+0
| | | | llvm-svn: 262853
* AMDGPU: Rename intrinsic to better match instruction nameMatt Arsenault2016-02-131-1/+1
| | | | | | Also fixes missing f32 test. llvm-svn: 260780
* AMDGPU: Split R600 and SI store loweringMatt Arsenault2016-02-111-1/+0
| | | | | | | These were only sharing some somewhat incorrect logic for when to scalarize or split vectors. llvm-svn: 260490
* AMDGPU: Split R600 and SI load loweringMatt Arsenault2016-02-101-1/+0
| | | | | | | These weren't actually sharing anything in the common LowerLOAD. llvm-svn: 260398
* AMDGPU: Match some med3 patternsMatt Arsenault2016-01-281-0/+3
| | | | llvm-svn: 259089
* AMDGPU: Remove more unused intrinsicsMatt Arsenault2016-01-231-1/+0
| | | | | | Replace tests with lrp with basic IR expansion llvm-svn: 258612
* AMDGPU: Remove abs intrinsicMatt Arsenault2016-01-201-1/+0
| | | | llvm-svn: 258343
* AMDGPU: Reduce 64-bit SRAsMatt Arsenault2016-01-181-0/+2
| | | | llvm-svn: 258096
* AMDGPU: Split 64-bit and of constant upMatt Arsenault2016-01-181-1/+7
| | | | | | | | | | This breaks the tests that were meant for testing 64-bit inline immediates, so move those to shl where they won't be broken up. This should be repeated for the other related bit ops. llvm-svn: 258095
* AMDGPU: Reduce 64-bit lshr by constant to 32-bitMatt Arsenault2016-01-181-0/+1
| | | | | | 64-bit shifts are very slow on some subtargets. llvm-svn: 258090
* AMDGPU/SI: Add support for non-void functionsMarek Olsak2016-01-131-0/+2
| | | | | | | | | | | | | | | | | | | Summary: Return values can be stored in SGPRs (i32) and VGPRs (f32). This will be used by functions which expect some bytecode or other binary to be appended at the end. It allows defining in which registers the return values will be stored. v2: don't do this for compute shaders Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16033 llvm-svn: 257621
* AMDGPU: Implement {{s|u}}int_to_fp i64 -> f32Matt Arsenault2016-01-111-0/+1
| | | | | | | The old lowering for uint_to_fp failed opencl conformance. It might be OK for fast math mode, but I'm not sure. llvm-svn: 257393
* AMDGPU: Pattern match ffbh pattern to instruction.Matt Arsenault2016-01-111-0/+4
| | | | | | | | The hardware instruction's output on 0 is -1 rather than 32. Eliminate a test and select to -1. This removes an extra instruction from the compatability function with HSAIL's firstbit instruction. llvm-svn: 257352
* AMDGPU: Custom lower i64 ctlzMatt Arsenault2016-01-111-0/+2
| | | | llvm-svn: 257348
* AMDGPU: Use generic bitreverse intrinsicMatt Arsenault2015-12-141-1/+0
| | | | | | Also fix bug in vector legalization for bitreverse. llvm-svn: 255512
* DAGCombiner: Combine extract_vector_elt from build_vectorMatt Arsenault2015-10-121-0/+1
| | | | | | | | | | | | | | This basic combine was surprisingly missing. AMDGPU legalizes many operations in terms of 32-bit vector components, so not doing this results in many extra copies and subregister extracts that need to be cleaned up later. InstCombine already does this for the hasOneUse case. The target hook is to fix a handful of tests which break (e.g. ARM/vmov.ll) which turn from a vector materialize repeated immediate instruction to a constant vector load with more scalar copies from it. llvm-svn: 250129
* AMDGPU: Produce error on dynamic_stackallocMatt Arsenault2015-08-261-0/+3
| | | | llvm-svn: 246048
* [AMDGPU] Use the general SMAX/SMIN/UMAX/UMIN pattern matching and remove the ↵Simon Pilgrim2015-08-131-8/+0
| | | | | | | | | | AMDGPU implementation D9746 added general SMAX/SMIN/UMAX/UMIN pattern matching to SelectionDAGBuilder::visitSelect. Differential Revision: http://reviews.llvm.org/D12007 llvm-svn: 244960
* AMDGPU: Fix return type of getImplicitParameterOffset.Matt Arsenault2015-07-281-1/+1
| | | | | | Patch by Zoltan Gilian <zoltan.gilian@gmail.com> llvm-svn: 243459
* AMDGPU: Avoid using 64-bit shift for i64 (shl x, 32)Matt Arsenault2015-07-141-0/+1
| | | | | | | | | | | | | | | | | This can be done only with moves which theoretically will optimize better later. Although this transform increases the instruction count, it should be code size / cycle count neutral in the worst VALU case. It also seems to slightly improve a couple of testcases due to other DAG combines this exposes. This is probably slightly worse for the SALU case, so it might be better to handle this during moveToVALU, although then you lose some simplifications like the load width reducing in the simple testcase. llvm-svn: 242177
* AMDGPU: Add helper function for implicit parameter offsets.Tom Stellard2015-07-091-0/+10
| | | | | | Patch by: Zoltan Gilian llvm-svn: 241861
* Make TargetLowering::getPointerTy() taking DataLayout as an argumentMehdi Amini2015-07-091-1/+1
| | | | | | | | | | | | | | | | Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D11028 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241775
* R600 -> AMDGPU renameTom Stellard2015-06-131-0/+307
| | | | llvm-svn: 239657
* Revert "AMDGPU: Add core backend files for R600/SI codegen v6"Tom Stellard2012-07-161-77/+0
| | | | | | This reverts commit 4ea70107c5e51230e9e60f0bf58a0f74aa4885ea. llvm-svn: 160303
* AMDGPU: Add core backend files for R600/SI codegen v6Tom Stellard2012-07-161-0/+77
llvm-svn: 160270
OpenPOWER on IntegriCloud