path: root/llvm/lib/Target/R600/AMDGPUInstructions.td
Commit message | Author | Age | Files | Lines
* R600 -> AMDGPU rename | Tom Stellard | 2015-06-13 | 1 | -682/+0
  llvm-svn: 239657
* R600/SI: Remove explicit m0 operand from DS instructions | Tom Stellard | 2015-05-12 | 1 | -15/+22
  Instead add m0 as an implicit operand. This helps avoid spills of the m0 register in some cases.
  llvm-svn: 237141
* Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes" | Sergey Dmitrouk | 2015-04-28 | 1 | -1/+1
  [DebugInfo] Add debug locations to constant SD nodes
  This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269).
  Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass.
  Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants.
  This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes.
  Differential Revision: http://reviews.llvm.org/D9084
  llvm-svn: 235989
* Revert "[DebugInfo] Add debug locations to constant SD nodes" | Daniel Jasper | 2015-04-28 | 1 | -1/+1
  This breaks a test: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870
  llvm-svn: 235987
* [DebugInfo] Add debug locations to constant SD nodes | Sergey Dmitrouk | 2015-04-28 | 1 | -1/+1
  This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269).
  Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass.
  Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants.
  This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes.
  Differential Revision: http://reviews.llvm.org/D9084
  llvm-svn: 235977
* Reduce dyn_cast<> to isa<> or cast<> where possible. | Benjamin Kramer | 2015-04-10 | 1 | -2/+2
  No functional change intended.
  llvm-svn: 234586
* R600/SI: Select V_BFE_U32 for and+shift with a non-literal offset | Marek Olsak | 2015-03-24 | 1 | -12/+10
  llvm-svn: 233079
* R600: Use new fmad node. | Matt Arsenault | 2015-02-20 | 1 | -5/+0
  This enables a few useful combines that used to only use fma.
  Also since v_mad_f32 apparently does not support denormals, disable the existing cases that are custom handled if they are requested.
  llvm-svn: 230071
* R600/SI: Don't set isCodeGenOnly = 1 on all instructions | Tom Stellard | 2015-02-18 | 1 | -2/+0
  We only need to set this on pseudo instructions which won't be used by the assembler.
  llvm-svn: 229689
* R600/SI: Extend private extload pattern to include zext loads | Tom Stellard | 2015-02-17 | 1 | -4/+6
  llvm-svn: 229507
* R600/SI: Implement correct f64 fdiv | Matt Arsenault | 2015-02-14 | 1 | -11/+4
  This version passes the OpenCL conformance test.
  llvm-svn: 229239
* MathExtras: Bring Count(Trailing|Leading)Ones and CountPopulation in line with countTrailingZeros | Benjamin Kramer | 2015-02-12 | 1 | -1/+1
  Update all callers.
  llvm-svn: 228930
* R600/SI: Only select cvt_flr/cvt_rpi with no NaNs. | Matt Arsenault | 2015-01-31 | 1 | -2/+4
  These have different behavior from cvt_i32_f32 on NaN.
  llvm-svn: 227693
* R600/SI: Add patterns for v_cvt_{flr|rpi}_i32_f32 | Matt Arsenault | 2015-01-15 | 1 | -0/+17
  llvm-svn: 226230
* R600/SI: Make more unordered comparisons legal | Matt Arsenault | 2014-12-11 | 1 | -1/+1
  This saves a second compare and an and / or by using the unordered comparison instructions.
  llvm-svn: 224066
* R600/SI: Use unordered not equal instructions | Matt Arsenault | 2014-12-11 | 1 | -4/+14
  llvm-svn: 224065
* R600/SI: Start implementing an assembler | Tom Stellard | 2014-11-14 | 1 | -0/+2
  This was done using the Sparc and PowerPC AsmParsers as guides. So far it is very simple and only supports sopp instructions.
  llvm-svn: 221994
* R600/SI: Use REG_SEQUENCE instead of INSERT_SUBREGs | Matt Arsenault | 2014-11-02 | 1 | -4/+5
  llvm-svn: 221118
* R600/SI: Add global atomicrmw xchg | Aaron Watry | 2014-10-17 | 1 | -0/+1
  v2: Add separate offset/no-offset tests
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
  llvm-svn: 220110
* R600/SI: Add global atomicrmw xor | Aaron Watry | 2014-10-17 | 1 | -0/+1
  v2: Add separate offset/no-offset tests
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
  llvm-svn: 220109
* R600/SI: Add global atomicrmw or | Aaron Watry | 2014-10-17 | 1 | -0/+1
  v2: Add separate offset/no-offset tests
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
  llvm-svn: 220108
* R600/SI: Add global atomicrmw min/umin | Aaron Watry | 2014-10-17 | 1 | -0/+2
  v2: Add separate offset/no-offset tests
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
  llvm-svn: 220107
* R600/SI: Add global atomicrmw max/umax | Aaron Watry | 2014-10-17 | 1 | -0/+2
  v2: Add separate offset/no-offset tests
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
  llvm-svn: 220106
* R600/SI: Add global atomicrmw and | Aaron Watry | 2014-10-17 | 1 | -0/+1
  v2: Add separate offset/no-offset tests
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
  llvm-svn: 220105
* R600/SI: Add global atomicrmw sub | Aaron Watry | 2014-10-17 | 1 | -0/+1
  v2: Add separate offset/no-offset tests
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
  llvm-svn: 220104
* R600/SI: Add support for global atomic add | Tom Stellard | 2014-09-25 | 1 | -0/+8
  llvm-svn: 218457
* Revert "R600/SI: Add support for global atomic add" | Tom Stellard | 2014-09-22 | 1 | -8/+0
  This reverts commit r218254.
  The global_atomics.ll test fails with asserts disabled. For some reason, the compiler fails to produce the atomic no return variants.
  llvm-svn: 218257
* R600/SI: Add support for global atomic add | Tom Stellard | 2014-09-22 | 1 | -0/+8
  llvm-svn: 218254
* R600/SI: Add preliminary support for flat address space | Matt Arsenault | 2014-09-15 | 1 | -0/+46
  llvm-svn: 217777
* R600/SI: Use READ2/WRITE2 instructions for 64-bit mem ops with 32-bit alignment | Tom Stellard | 2014-08-22 | 1 | -0/+11
  llvm-svn: 216279
* R600/SI: Fix build warning | Tom Stellard | 2014-08-01 | 1 | -1/+1
  llvm-svn: 214475
* R600/SI: Do abs/neg folding with ComplexPatterns | Tom Stellard | 2014-08-01 | 1 | -0/+8
  Abs/neg folding has moved out of foldOperands and into the instruction selection phase using complex patterns.
  As a consequence of this change, we now prefer to select the 64-bit encoding for most instructions and the modifier operands have been dropped from integer VOP3 instructions.
  llvm-svn: 214467
* R600/SI: Use scratch memory for large private arrays | Tom Stellard | 2014-07-21 | 1 | -0/+26
  llvm-svn: 213551
* R600: Add predicate for UnsafeFPMath | Matt Arsenault | 2014-07-15 | 1 | -0/+1
  llvm-svn: 213088
* R600: Add denormal handling subtarget features. | Matt Arsenault | 2014-07-14 | 1 | -0/+3
  llvm-svn: 213018
* R600: Fix inconsistency in rsq instructions. | Matt Arsenault | 2014-06-24 | 1 | -4/+11
  R600 was using a clamped version of rsq, but SI was not. Add a new rsq_clamped intrinsic and use them consistently.
  It's unclear to me from the documentation what behavior the R600 instructions have, so I assume they have the legacy behavior described by the SI documents. For R600, use RECIPSQRT_IEEE for both llvm.AMDGPU.rsq.legacy and llvm.AMDGPU.rsq. R600 also has RECIPSQRT_FF, which I'm not sure how it fits in here.
  llvm-svn: 211637
* R600/SI: Add intrinsics for various math instructions. | Matt Arsenault | 2014-06-19 | 1 | -0/+10
  These will be used for custom lowering and for library implementations of various math functions, so it's useful to expose these as builtins.
  llvm-svn: 211247
* R600: Remove AMDIL instruction and register definitions | Tom Stellard | 2014-06-13 | 1 | -0/+20
  Most of these are no longer used any more.
  llvm-svn: 210915
* R600: Mostly remove remaining AMDIL intrinsics. | Matt Arsenault | 2014-06-12 | 1 | -1/+1
  Delete all unused ones, and add new AMDGPU named intrinsics for the ones that are. Handle the old AMDIL names for compatibility (although remove their GCCBuiltin names) and add tests since there weren't any for these before.
  llvm-svn: 210827
* R600/SI: Add common 64-bit LDS atomics | Matt Arsenault | 2014-06-11 | 1 | -0/+8
  llvm-svn: 210680
* R600/SI: Add 32-bit LDS atomic cmpxchg | Matt Arsenault | 2014-06-11 | 1 | -0/+9
  llvm-svn: 210678
* R600/SI: Refactor local atomics. | Matt Arsenault | 2014-06-11 | 1 | -7/+17
  Use patterns that will also match the immediate offset to match the normal read / writes.
  llvm-svn: 210673
* R600: Handle fcopysign | Matt Arsenault | 2014-06-10 | 1 | -1/+14
  llvm-svn: 210564
* R600/SI: Fix [s|u]int_to_fp for i1 | Matt Arsenault | 2014-05-31 | 1 | -0/+2
  llvm-svn: 209971
* R600: Expand mul24 for GPUs without it | Matt Arsenault | 2014-05-22 | 1 | -8/+21
  llvm-svn: 209458
* R600: Expand mad24 for GPUs without it | Matt Arsenault | 2014-05-22 | 1 | -0/+10
  llvm-svn: 209457
* R600: Add intrinsics for mad24 | Matt Arsenault | 2014-05-22 | 1 | -0/+11
  llvm-svn: 209456
* R600/SI: Print more immediates in hex format | Matt Arsenault | 2014-04-15 | 1 | -0/+12
  Print in decimal for inline immediates, and hex otherwise. Use hex always for offsets in addressing offsets.
  This approximately matches what the shader compiler does.
  llvm-svn: 206335
* R600: Match 24-bit arithmetic patterns in a Target DAGCombine | Tom Stellard | 2014-04-07 | 1 | -3/+0
  Moving these patterns from TableGen files to PerformDAGCombine() should allow us to generate better code by eliminating unnecessary shifts and extensions earlier.
  This also fixes a bug where the MAD pattern was calling SimplifyDemandedBits with a 24-bit mask on the first operand even when the full pattern wasn't being matched. This occasionally resulted in some instructions being incorrectly deleted from the program.
  v2: - Fix bug with 64-bit mul
  llvm-svn: 205731
* R600: Reorganize tablegen instruction definitions | Tom Stellard | 2014-03-24 | 1 | -0/+3
  Each GPU family now has its own file.
  llvm-svn: 204615