path: root/llvm/lib/Target/R600/SIInstructions.td
Commit message | Author | Age | Files | Lines
...
* R600/SI: Fix definitions for ds_read2 / ds_write2 instructions. | Matt Arsenault | 2014-08-04 | 1 | -2/+2
  These were just wrong, using the wrong register classes and store2 was missing an operand.
  llvm-svn: 214756
* R600/SI: Do abs/neg folding with ComplexPatterns | Tom Stellard | 2014-08-01 | 1 | -452/+512
  Abs/neg folding has moved out of foldOperands and into the instruction selection phase using complex patterns. As a consequence of this change, we now prefer to select the 64-bit encoding for most instructions and the modifier operands have been dropped from integer VOP3 instructions.
  llvm-svn: 214467
* R600/SI: Remove redundant setting of bits on instructions. | Matt Arsenault | 2014-07-30 | 1 | -13/+2
  neverHasSideEffects is deprecated, and hasSideEffects = 0 is already set on the base classes of the basic ALU instruction classes. The base classes also already set mayLoad = 0 and mayStore = 0.
  llvm-svn: 214283
* R600/SI: Use scratch memory for large private arrays | Tom Stellard | 2014-07-21 | 1 | -22/+64
  llvm-svn: 213551
* R600/SI: Remove vaddr operand from BUFFER_LOAD_*_OFFSET instructions | Tom Stellard | 2014-07-21 | 1 | -2/+2
  This operand is never used.
  llvm-svn: 213549
* R600/SI: Store constant initializer data in constant memory | Tom Stellard | 2014-07-21 | 1 | -1/+15
  This implements a solution for constant initializers suggested by Vadim Girlin, where we store the data after the shader code and then use the S_GETPC instruction to compute its address. This saves us the trouble of creating a new buffer for constant data and then having to pass the pointer to the kernel via user SGPRs or the input buffer.
  llvm-svn: 213530
* R600/SI: Add isCFDepth0 Predicate to SALU addc pattern | Tom Stellard | 2014-07-21 | 1 | -10/+16
  llvm-svn: 213529
* R600/SI: Use VALU for i1 XOR | Tom Stellard | 2014-07-21 | 1 | -5/+5
  llvm-svn: 213528
* R600/SI: Use a custom encoding method for simm16 in SOPP branch instructions | Tom Stellard | 2014-07-21 | 1 | -7/+7
  This allows us to explicitly define the type of fixup that is needed, so we can distinguish this from future fixup types.
  llvm-svn: 213527
* R600/SI: Rename SOPP operands to match the encoding fields | Tom Stellard | 2014-07-21 | 1 | -17/+17
  llvm-svn: 213526
* R600/SI: implement range reduction for sin/cos | Matt Arsenault | 2014-07-19 | 1 | -12/+6
  These instructions can only take a limited input range, and return the constant value 1 out of range. We should do range reduction to be able to process arbitrary values. Use a FRACT instruction after normalization to achieve this. Also add a test for constant folding with the lowered code with unsafe-fp-math enabled.
  v2: use DAG lowering instead of intrinsic, adapt test
  v3: calculate constant, fold pattern into instruction definition
  v4: misc style fixes, add sin-fold testcase, cosmetics
  Patch by Grigori Goronzy
  llvm-svn: 213458
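  The range reduction described in this commit can be illustrated with a small Python sketch. It only mirrors the arithmetic, assuming that "normalization" means scaling the input by 1/(2π) and that FRACT computes x - floor(x). The names `fract`, `hw_sin`, and `reduced_sin` are hypothetical stand-ins and do not appear in LLVM or the ISA.

```python
import math

def fract(x: float) -> float:
    """Fractional part x - floor(x), the operation a FRACT instruction performs."""
    return x - math.floor(x)

def hw_sin(turns: float) -> float:
    """Hypothetical stand-in for a sine unit that is only valid on a limited
    input range, here taken to be one full period expressed in turns."""
    if not 0.0 <= turns <= 1.0:
        raise ValueError("input outside the supported range")
    return math.sin(turns * 2.0 * math.pi)

def reduced_sin(x: float) -> float:
    """Sine for arbitrary x: normalize by 1/(2*pi), then reduce with fract."""
    normalized = x * (1.0 / (2.0 * math.pi))
    return hw_sin(fract(normalized))

# Large and negative inputs are folded into one period before evaluation.
for value in (0.5, 100.0, -7.5):
    print(value, reduced_sin(value), math.sin(value))
```

  Because sine is periodic, dropping the integer part of the normalized input leaves the result unchanged up to floating-point error, which is why a single FRACT is enough.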
* CodeGen: extend f16 conversions to permit types > float. | Tim Northover | 2014-07-17 | 1 | -2/+2
  This makes the two intrinsics @llvm.convert.from.f16 and @llvm.convert.to.f16 accept types other than simple "float". This is only strictly needed for the truncate operation, since otherwise double rounding occurs and there's no way to represent the strict IEEE conversion. However, for symmetry we allow larger types in the extend too.
  During legalization, we can expand an "fp16_to_double" operation into two extends for convenience, but abort when the truncate isn't legal. A new libcall is probably needed here.
  Even after this commit, various target tweaks are needed to actually use the extended intrinsics. I've put these into separate commits for clarity, so there are no actual tests of f64 conversion here.
  llvm-svn: 213248
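  The double-rounding problem mentioned in this commit can be reproduced outside LLVM. The sketch below is an illustration only: it uses Python's `struct` module (formats `f` and `e`) to round a double to float and to half, and the helper names `to_f32` and `to_f16` are made up for this example.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (binary64) to binary32 and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_f16(x: float) -> float:
    """Round a Python float (binary64) to binary16 and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# A double slightly above the midpoint between two adjacent half values
# (1.0 and 1.0 + 2**-10).
x = 1.0 + 2.0**-11 + 2.0**-30

direct = to_f16(x)                  # one rounding step: f64 -> f16
double_rounded = to_f16(to_f32(x))  # two rounding steps: f64 -> f32 -> f16

print(direct, double_rounded)
```

  The intermediate float rounding discards the small 2**-30 term and lands exactly on the midpoint, so the second rounding resolves the tie to the even neighbour instead of rounding up. That is the mismatch a direct f64 truncate avoids.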
* R600/SI: Allow using f32 rcp / rsq when denormals not handled. | Matt Arsenault | 2014-07-15 | 1 | -2/+1
  These are precise enough to use for OpenCL unless denormals are handled.
  llvm-svn: 213107
* R600/SI: Implement less wrong f32 fdiv | Matt Arsenault | 2014-07-15 | 1 | -7/+4
  Assuming single precision denormals and accurate sqrt/div are not reported, this passes the OpenCL conformance test.
  llvm-svn: 213089
* R600/SI: Use i32 vectors for resources and samplers | Marek Olsak | 2014-07-11 | 1 | -2/+2
  This affects new intrinsics only. What surprises me is that v32i8 still works.
  llvm-svn: 212831
* R600/SI: add sample and image intrinsics exposing all instruction fields | Marek Olsak | 2014-07-11 | 1 | -41/+118
  We need the intrinsics with offsets, so why not just add them all. The R128 parameter will also be useful for reducing SGPR usage. GL_ARB_image_load_store also adds some image GLSL modifiers like "coherent", so Mesa will probably translate those to slc, glc, etc.
  When LLVM 3.5 is released, I'll switch Mesa to these new intrinsics.
  llvm-svn: 212830
* R600/SI: Add support for llvm.convert.{to|from}.fp16 | Matt Arsenault | 2014-07-10 | 1 | -2/+6
  llvm-svn: 212676
* R600/SI: Use a ComplexPattern for ADDR64 addressing of MUBUF loads | Tom Stellard | 2014-07-02 | 1 | -35/+29
  llvm-svn: 212217
* R600: Promote i64 loads to v2i32 | Tom Stellard | 2014-07-02 | 1 | -6/+1
  llvm-svn: 212216
* R600/SI: Add verifier check for immediates in register operands. | Tom Stellard | 2014-07-02 | 1 | -1/+1
  llvm-svn: 212214
* R600/SI: Use a ComplexPattern for MUBUF stores | Tom Stellard | 2014-06-24 | 1 | -34/+5
  Now that non-leaf ComplexPatterns are allowed we can fold all the MUBUF store patterns into the instruction definition. We will also be able to reuse this new ComplexPattern for MUBUF loads and atomic operations.
  llvm-svn: 211644
* R600: Promote i64 stores to v2i32 | Tom Stellard | 2014-06-24 | 1 | -2/+1
  Now we need only one 64-bit pattern for stores.
  llvm-svn: 211643
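  As a rough illustration of why an i64 store can be handled as a v2i32 store, the sketch below reinterprets the same 64 bits as two 32-bit lanes. It assumes little-endian element order (lane 0 holds the low half). The function names are hypothetical, and this is not how the promotion is implemented in the backend.

```python
import struct
from typing import Tuple

def i64_to_v2i32(value: int) -> Tuple[int, int]:
    """Reinterpret a 64-bit integer as two 32-bit lanes (low lane first)."""
    raw = struct.pack("<Q", value & 0xFFFFFFFFFFFFFFFF)
    lo, hi = struct.unpack("<2I", raw)
    return lo, hi

def v2i32_to_i64(lo: int, hi: int) -> int:
    """Inverse reinterpretation: two 32-bit lanes back to one 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<2I", lo, hi))[0]

lanes = i64_to_v2i32(0x1122334455667788)
print([hex(lane) for lane in lanes])
print(hex(v2i32_to_i64(*lanes)))
```

  Since the bit pattern in memory is identical either way, one store pattern on the vector type covers both cases.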
* R600: Fix inconsistency in rsq instructions. | Matt Arsenault | 2014-06-24 | 1 | -7/+11
  R600 was using a clamped version of rsq, but SI was not. Add a new rsq_clamped intrinsic and use them consistently. It's unclear to me from the documentation what behavior the R600 instructions have, so I assume they have the legacy behavior described by the SI documents. For R600, use RECIPSQRT_IEEE for both llvm.AMDGPU.rsq.legacy and llvm.AMDGPU.rsq. R600 also has RECIPSQRT_FF, which I'm not sure how it fits in here.
  llvm-svn: 211637
* R600/SI: Move pattern to instruction definition | Matt Arsenault | 2014-06-24 | 1 | -6/+1
  llvm-svn: 211614
* R600/SI: Fix div_scale intrinsic. | Matt Arsenault | 2014-06-23 | 1 | -2/+4
  The operand that must match one of the others does matter, so implement selection for it.
  llvm-svn: 211523
* R600/SI: Add patterns for ctpop inside a branch | Tom Stellard | 2014-06-20 | 1 | -12/+38
  llvm-svn: 211378
* R600/SI: Add a pattern for f32 ftrunc | Tom Stellard | 2014-06-20 | 1 | -1/+1
  llvm-svn: 211377
* R600/SI: Add a VALU pattern for i64 xor | Tom Stellard | 2014-06-20 | 1 | -4/+7
  llvm-svn: 211373
* R600/SI: Add intrinsics for various math instructions. | Matt Arsenault | 2014-06-19 | 1 | -8/+29
  These will be used for custom lowering and for library implementations of various math functions, so it's useful to expose these as builtins.
  llvm-svn: 211247
* R600/SI: add gather4 and getlod intrinsics (v3) | Marek Olsak | 2014-06-18 | 1 | -25/+77
  This contains all the previous patches + getlod support on top of it. It doesn't use SDNodes anymore, so it's quite small. It also adds v16i8 to SReg_128, which is used for the sampler descriptor.
  Reviewed-by: Tom Stellard
  llvm-svn: 211228
* R600/SI: Add intrinsics for brev instructions | Matt Arsenault | 2014-06-18 | 1 | -1/+3
  llvm-svn: 211187
* R600/SI: Comparisons set vcc. | Matt Arsenault | 2014-06-18 | 1 | -102/+102
  llvm-svn: 211178
* R600/SI: Match cttz_zero_undef | Matt Arsenault | 2014-06-17 | 1 | -1/+5
  llvm-svn: 211116
* R600/SI: Match ctlz_zero_undef | Matt Arsenault | 2014-06-17 | 1 | -3/+5
  llvm-svn: 211115
* R600: Use LDS and vectors for private memory | Tom Stellard | 2014-06-17 | 1 | -2/+2
  llvm-svn: 211110
* R600/SI: Add a pattern for llvm.AMDGPU.barrier.global | Tom Stellard | 2014-06-17 | 1 | -0/+9
  llvm-svn: 211109
* R600: Remove AMDIL instruction and register definitions | Tom Stellard | 2014-06-13 | 1 | -1/+1
  Most of these are no longer used.
  llvm-svn: 210915
* R600: Mostly remove remaining AMDIL intrinsics. | Matt Arsenault | 2014-06-12 | 1 | -1/+1
  Delete all unused ones, and add new AMDGPU named intrinsics for the ones that are used. Handle the old AMDIL names for compatibility (although remove their GCCBuiltin names) and add tests since there weren't any for these before.
  llvm-svn: 210827
* R600/SI: Use a register set to -1 for data0 on ds_inc*/ds_dec* | Matt Arsenault | 2014-06-12 | 1 | -15/+29
  There is no such thing as a 0-data ds instruction, and the data operand needs to be a vgpr set to something meaningful.
  llvm-svn: 210756
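  To show why -1 is a meaningful value for the data operand, here is a small Python model of the increment and decrement. It assumes the wrap-around semantics as I understand them from the SI ISA manual (increment resets to 0 once the stored value reaches the data operand; decrement reloads the data operand at 0 or when the stored value exceeds it). Treat those semantics, and the function names, as assumptions for illustration.

```python
MASK32 = 0xFFFFFFFF  # -1 viewed as an unsigned 32-bit value

def ds_inc_u32(mem: int, data: int) -> int:
    """Assumed DS_INC_U32 behaviour: wrap to 0 once mem reaches data."""
    return 0 if mem >= data else (mem + 1) & MASK32

def ds_dec_u32(mem: int, data: int) -> int:
    """Assumed DS_DEC_U32 behaviour: reload data at 0 or when mem > data."""
    return data if (mem == 0 or mem > data) else mem - 1

# With data = -1 both operations behave like plain wrapping +1 / -1.
for mem in (0, 1, 41, MASK32):
    print(mem, ds_inc_u32(mem, MASK32), ds_dec_u32(mem, MASK32))

# With data = 0 the increment would always store 0, which is not an atomic add.
print(ds_inc_u32(41, 0))
```

  Under these assumptions, only an all-ones data operand turns the instructions into the ordinary atomic increment and decrement that the patterns want to match.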
* R600/SI: Fix bitcast between v2i32 and f64 | Matt Arsenault | 2014-06-11 | 1 | -0/+2
  This is the same problem fixed in r210664 for more types. The test passes without this fix. For some reason I'm only hitting this when creating selects lowered to v2i32 selects.
  llvm-svn: 210692
* R600/SI: Update place using old subtarget predicate | Matt Arsenault | 2014-06-11 | 1 | -2/+2
  llvm-svn: 210683
* R600/SI: Add common 64-bit LDS atomics | Matt Arsenault | 2014-06-11 | 1 | -13/+31
  llvm-svn: 210680
* R600/SI: Add instruction definitions for 64-bit LDS atomics | Matt Arsenault | 2014-06-11 | 1 | -0/+47
  llvm-svn: 210679
* R600/SI: Add 32-bit LDS atomic cmpxchg | Matt Arsenault | 2014-06-11 | 1 | -0/+15
  llvm-svn: 210678
* R600/SI: Use LDS atomic inc / dec | Matt Arsenault | 2014-06-11 | 1 | -0/+16
  llvm-svn: 210677
* R600/SI: Add other LDS atomic operations | Matt Arsenault | 2014-06-11 | 1 | -3/+12
  llvm-svn: 210676
* R600/SI: Add instruction definitions for more LDS ops | Matt Arsenault | 2014-06-11 | 1 | -0/+42
  llvm-svn: 210675
* R600/SI: Fix backwards names for local atomic instructions. | Matt Arsenault | 2014-06-11 | 1 | -4/+4
  The manual lists them as *_RTN_U32, not *_U32_RTN, which is more consistent with how every other sized instruction is named.
  llvm-svn: 210674
* R600/SI: Refactor local atomics. | Matt Arsenault | 2014-06-11 | 1 | -4/+13
  Use patterns that will also match the immediate offset to match the normal read / writes.
  llvm-svn: 210673
* R600/SI: Use v_cvt_f32_ubyte* instructions | Matt Arsenault | 2014-06-11 | 1 | -4/+12
  This eliminates extra extract instructions when loading an i8 vector to a float vector.
  llvm-svn: 210666
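  The effect of the v_cvt_f32_ubyte* family can be modelled as "select byte N of a 32-bit register and convert it to float" in a single step. The sketch below is only a model of that arithmetic, assuming the i8 vector elements are packed little-endian into one 32-bit word; `cvt_f32_ubyte` is a hypothetical name.

```python
def cvt_f32_ubyte(src: int, byte_index: int) -> float:
    """Model of a v_cvt_f32_ubyteN-style operation: unsigned byte N -> float."""
    if byte_index not in (0, 1, 2, 3):
        raise ValueError("byte_index must be 0..3")
    return float((src >> (8 * byte_index)) & 0xFF)

# Four i8 elements <1, 2, 3, 200> packed into one 32-bit word, element 0 lowest.
packed = 0xC8030201
as_floats = [cvt_f32_ubyte(packed, i) for i in range(4)]
print(as_floats)
```

  Without such an instruction, each element would need a separate shift-and-mask (the extract) followed by an integer-to-float conversion.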