path: root/llvm/lib/Target/R600
Commit message (Collapse)AuthorAgeFilesLines
...
* R600/SI: Custom select 64-bit ADDTom Stellard2014-02-253-30/+48
| | | | llvm-svn: 202194
* Fix unused variableMatt Arsenault2014-02-241-3/+3
| | | | llvm-svn: 202080
* R600/SI - Add new CI arithmetic instructions.Matt Arsenault2014-02-244-2/+72
| | | | | | | Does not yet include larger part required to match v_mad_i64_i32 / v_mad_u64_u32. llvm-svn: 202077
* R600: Make check clearer.Matt Arsenault2014-02-241-1/+1
| | | | | | | The check is clearer as southern islands or later, rather than checking for later than northern islands. llvm-svn: 202076
* Fix DOT4 missing from getTargetOpcodeNameMatt Arsenault2014-02-241-0/+1
| | | | llvm-svn: 202075
* R600/SI: Expand all v8[if]32 operationsTom Stellard2014-02-133-1/+37
| | | | llvm-svn: 201371
* R600/SI: Add a pattern for i32 anyextTom Stellard2014-02-131-2/+5
| | | | | Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 201370
* R600/SI: Completely Disable TypeRewriter on computeTom Stellard2014-02-131-3/+3
| | | | llvm-svn: 201369
* R600/SI: Split global vector loads with more than 4 elementsTom Stellard2014-02-131-3/+5
| | | | llvm-svn: 201368
* R600: Always implement both versions of isTruncateFree and add a sanity check.Benjamin Kramer2014-02-122-5/+12
| | | | llvm-svn: 201222
* R600/SI: Fix assertion on infinite loops.Matt Arsenault2014-02-111-2/+4
| | | | | | | This isn't the most useful case to fix in the real world, but bugpoint runs into this. llvm-svn: 201177
* R600: Implement isTruncateFreeMatt Arsenault2014-02-102-0/+6
| | | | | | | Truncation is just accessing a subregister for any multiple of the register size, so it's free. llvm-svn: 201107
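The rule in this commit message can be sketched as a small standalone predicate. This is only an illustration, not the backend's code: the function name `isTruncateFreeSketch` and the 32-bit register width are assumptions made for the example.

```cpp
#include <cassert>

// Assumed register width in bits (hypothetical constant for this sketch).
constexpr unsigned kRegisterBits = 32;

// Truncation is treated as free when the destination is strictly narrower
// than the source and is a non-zero whole multiple of the register size,
// because the result is then just a subregister of the source.
constexpr bool isTruncateFreeSketch(unsigned srcBits, unsigned dstBits) {
  return dstBits != 0 && dstBits < srcBits && dstBits % kRegisterBits == 0;
}
```

Under these assumptions, truncating 64 bits to 32 bits is free, while truncating 32 bits to 16 bits is not, since 16 bits is not a whole register.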
* R600/SI: Initialize M0 and emit S_WQM_B64 whenever DS instructions are usedTom Stellard2014-02-104-10/+28
| | | | | | | | | | | DS instructions that access local memory can only use addresses that are less than or equal to the value of M0. When M0 is uninitialized, then we experience undefined behavior. This patch also changes the behavior to emit S_WQM_B64 on pixel shaders no matter what kind of DS instruction is used. llvm-svn: 201097
* R600/SI: Only use S_WQM_B64 in pixel shadersTom Stellard2014-02-101-1/+1
| | | | | | | | This doesn't change any functionality, since we only have two shader types (compute and pixel) that use local memory. We're just changing the logic to match the documentation. llvm-svn: 201096
* R600/SI: Add a MUBUF store pattern for Reg+Imm offsetsTom Stellard2014-02-062-1/+11
| | | | llvm-svn: 200935
* R600/SI: Add a MUBUF store pattern for Imm offsetsTom Stellard2014-02-061-0/+5
| | | | llvm-svn: 200934
* R600/SI: Add a MUBUF load pattern for Reg+Imm offsetsTom Stellard2014-02-061-0/+5
| | | | llvm-svn: 200933
* R600/SI: Use immediate offsets for SMRD instructions whenever possibleTom Stellard2014-02-062-10/+13
| | | | | | | | There was a problem with the old pattern, so we were copying some larger immediates into registers when we could have been encoding them in the instruction. llvm-svn: 200932
* Add address space argument to allowsUnalignedMemoryAccess.Matt Arsenault2014-02-052-1/+2
| | | | | | | On R600, some address spaces have more strict alignment requirements than others. llvm-svn: 200887
* R600/SI: Add pattern for zero-extending i1 to i32Michel Danzer2014-02-051-0/+5
| | | | | | | | | Fixes opencl-example if_* tests with radeonsi. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74469 Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200830
* cleanup: scc_iterator consumers should use isAtEndDuncan P. N. Exon Smith2014-02-041-2/+2
| | | | | | | | | | | | | | No functional change. Updated loops from: for (I = scc_begin(), E = scc_end(); I != E; ++I) to: for (I = scc_begin(); !I.isAtEnd(); ++I) for teh win. llvm-svn: 200789
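The loop shape described in this commit can be illustrated with a toy iterator that knows when it is exhausted, so the consumer never needs a separate end iterator. This is a minimal sketch: `GroupIterator` and `countNodes` are hypothetical names for the example and are not LLVM's `scc_iterator` API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical iterator over groups of node ids. It tracks its own position
// against the container, so the loop condition is !I.isAtEnd().
class GroupIterator {
  const std::vector<std::vector<int>> *Groups;
  std::size_t Index = 0;

public:
  explicit GroupIterator(const std::vector<std::vector<int>> &G)
      : Groups(&G) {}

  bool isAtEnd() const { return Index == Groups->size(); }

  GroupIterator &operator++() {
    ++Index;
    return *this;
  }

  const std::vector<int> &operator*() const { return (*Groups)[Index]; }
};

// Sums the sizes of all groups using the isAtEnd() loop form.
std::size_t countNodes(const std::vector<std::vector<int>> &Groups) {
  std::size_t Total = 0;
  for (GroupIterator I(Groups); !I.isAtEnd(); ++I)
    Total += (*I).size();
  return Total;
}
```

The design point is that only one iterator object is constructed and compared against its own state, instead of constructing a begin/end pair.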
* Every target uses .align. Simplify.Rafael Espindola2014-02-041-1/+0
| | | | llvm-svn: 200782
* R600/SI: Expand i1 BR_CCTom Stellard2014-02-041-0/+2
| | | | | | | | | | | This fixes crashes in the OpenCV test suite and also the scrypt kernel in bfgminer. I was unable to come up with a reduced test case for this. https://bugs.freedesktop.org/show_bug.cgi?id=72785 llvm-svn: 200776
* R600/SI: Don't assume copies will be coalesced in SIFixSGPRCopiesTom Stellard2014-02-041-1/+1
| | | | | | | There is no lit test for this, because it would be too big and complicated, but it does fix a crash in the Arithm/Absdiff.* OpenCV test. llvm-svn: 200775
* R600/SI: Custom lower i64 ISD::SELECTTom Stellard2014-02-042-0/+28
| | | | llvm-svn: 200774
* R600: Enable vector fpow.Tom Stellard2014-02-041-0/+1
| | | | | | | | | | | The OpenCL specs say: "The vector versions of the math functions operate component-wise. The description is per-component." Patch by: Jan Vesely Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 200773
* R600/SI: Fix fneg for 0.0Michel Danzer2014-02-041-4/+18
| | | | | | | | | | | | V_ADD_F32 with source modifier does not produce -0.0 for this. Just manipulate the sign bit directly instead. Also add a pattern for (fneg (fabs ...)). Fixes a bunch of bit encoding piglit tests with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200743
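The idea of manipulating the sign bit directly can be sketched on the host in plain C++. This is an illustration only, assuming `float` is an IEEE-754 binary32 value; the helper names `fnegBits` and `fnegFabsBits` are hypothetical and do not correspond to the instruction patterns in the patch.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Sign bit of an IEEE-754 binary32 value (assumed float layout).
constexpr std::uint32_t kSignBit = 0x80000000u;

// fneg: flip the sign bit. This maps +0.0 to -0.0, which plain arithmetic
// such as 0.0f - 0.0f does not do.
float fnegBits(float X) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &X, sizeof Bits);
  Bits ^= kSignBit;
  std::memcpy(&X, &Bits, sizeof X);
  return X;
}

// fneg(fabs(x)): force the sign bit on, regardless of the input sign.
float fnegFabsBits(float X) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &X, sizeof Bits);
  Bits |= kSignBit;
  std::memcpy(&X, &Bits, sizeof X);
  return X;
}
```

With these helpers, `fnegBits(0.0f)` has its sign bit set, and `fnegFabsBits` returns a non-positive value for both positive and negative inputs.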
* Add DEBUG_TYPE to SIAnnotateControlFlowMatt Arsenault2014-02-031-0/+2
| | | | llvm-svn: 200720
* R600/SI: Fix insertelement with dynamic indices.Matt Arsenault2014-02-021-7/+17
| | | | | | | | This didn't work for any integer vectors, and didn't work with some sizes of float vectors. This should now work with all sizes of float and i32 vectors. llvm-svn: 200619
* Remove the last hasRawTextSupport call from R600.Rafael Espindola2014-01-311-2/+1
| | | | | | | | There is nothing wrong with printing the disassembly section when printing text. A hypothetical assembler would then produce a .o just like our direct object emission produces. llvm-svn: 200583
* Replace another use with hasRawTextSupport+EmitRawText with emitRawComment.Rafael Espindola2014-01-311-2/+2
| | | | llvm-svn: 200582
* Use emitRawComment to avoid a call to hasRawTextSupport.Rafael Espindola2014-01-311-3/+1
| | | | llvm-svn: 200581
* Delete MCSubtargetInfo data members from target MCCodeEmitter classesDavid Woodhouse2014-01-282-7/+5
| | | | | | | | The subtarget info is explicitly passed to the EncodeInstruction method and we should use that subtarget info to influence any encoding decisions. llvm-svn: 200350
* Propagate MCSubtargetInfo through TableGen's getBinaryCodeForInstr()David Woodhouse2014-01-283-10/+17
| | | | llvm-svn: 200349
* Explicitly pass MCSubtargetInfo to MCCodeEmitter::EncodeInstruction()David Woodhouse2014-01-283-5/+10
| | | | llvm-svn: 200348
* Change MCStreamer EmitInstruction interface to take subtarget infoDavid Woodhouse2014-01-281-1/+1
| | | | llvm-svn: 200345
* R600/SI: Add pattern for truncating i32 to i1Michel Danzer2014-01-281-0/+5
| | | | | | | Fixes half a dozen piglit tests with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200283
* R600/SI: Add intrinsic for BUFFER_LOAD_DWORD* instructionsMichel Danzer2014-01-273-21/+101
| | | | | Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200196
* R600/SI: Add intrinsic for S_SENDMSG instructionMichel Danzer2014-01-275-2/+54
| | | | | Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200195
* Add back spaces I missed in the conversion to emitRawComments.Rafael Espindola2014-01-271-3/+3
| | | | | | Sorry about that. llvm-svn: 200171
* Use emitRawComment instead of EmitRawText.Rafael Espindola2014-01-271-4/+5
| | | | llvm-svn: 200170
* Pass a MCSubtargetInfo down to the TargetStreamer creation.Rafael Espindola2014-01-261-0/+1
| | | | | | | With this the target streamers will be able to know the target features that are in use. llvm-svn: 200135
* Construct the MCStreamer before constructing the MCTargetStreamer.Rafael Espindola2014-01-261-1/+1
| | | | | | | | | | This has a few advantages: * Only targets that use a MCTargetStreamer have to worry about it. * There is never a MCTargetStreamer without a MCStreamer, so we can use a reference. * A MCTargetStreamer can talk to the MCStreamer in its constructor. llvm-svn: 200129
* Add final and override keywords to TargetTransformInfo's subclasses.Juergen Ributzka2014-01-241-5/+5
| | | | llvm-svn: 200021
* Fix known typosAlp Toker2014-01-2412-14/+14
| | | | | | | Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt. llvm-svn: 200018
* R600: Remove successive JUMP in AnalyzeBranch when AllowModify is trueTom Stellard2014-01-231-1/+7
| | | | | | | | | | | | This fixes a crash in the OpenCV OpenCL test suite. There is no lit test for this, because the test would be very large and could easily be invalidated by changes to the scheduler or other parts of the compiler. Patch by: Vincent Lejeune llvm-svn: 199919
* R600: Disable the BFE patternTom Stellard2014-01-232-1/+10
| | | | | | | | | | This pattern uses an SDNodeXForm, which isn't being emitted for some reason. I can get it to work by attaching the PatLeaf that has the XForm to the argument in the output pattern, but this results in an immediate being used in a register operand, which the backend can't handle yet. llvm-svn: 199918
* R600: Correctly handle vertex fetch clauses that precede ENDIFsTom Stellard2014-01-231-0/+1
| | | | | | | | The control flow finalizer would sometimes use an ALU_POP_AFTER instruction before the vertex fetch clause instead of using a POP instruction after it. llvm-svn: 199917
* R600: Unconditionally unroll loops that contain GEPs with alloca pointersTom Stellard2014-01-231-0/+29
| | | | | | | | | | | | Implement the getUnrollingPreferences() function for AMDGPUTargetTransformInfo so that loops that do address calculations on pointers derived from alloca are unconditionally unrolled. Unrolling these loops makes it more likely that SROA will be able to eliminate the allocas, which is a big win for R600 since memory allocated by alloca (private memory) is really slow. llvm-svn: 199916
* R600: Recommit 199842: Add work-around for the CF stack entry HW bugTom Stellard2014-01-235-7/+63
| | | | | | | | | | | | | | | | | | The unit test is now disabled on non-asserts builds. The CF stack can be corrupted if you use CF_ALU_PUSH_BEFORE, CF_ALU_ELSE_AFTER, CF_ALU_BREAK, or CF_ALU_CONTINUE when the number of sub-entries on the stack is greater than or equal to the stack entry size and sub-entries modulo 4 is either 0 or 3 (on cedar the bug is present when the number of sub-entries modulo 8 is either 7 or 0). We choose to be conservative and always apply the work-around when the number of sub-entries is greater than or equal to the stack entry size, so that we can safely over-allocate the stack when we are unsure of the stack allocation rules. reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 199905
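The two conditions in this commit message, the exact trigger for the hardware bug and the conservative check the patch chooses, can be compared side by side in a small sketch. The function names and the `IsCedar` flag are hypothetical and only restate the rules quoted above; they are not taken from the backend source.

```cpp
#include <cassert>

// Exact trigger as described in the commit message: the stack must hold at
// least one full entry's worth of sub-entries, and the sub-entry count must
// hit a specific residue (mod 4 is 0 or 3; on cedar, mod 8 is 7 or 0).
bool hitsCFStackBug(unsigned SubEntries, unsigned StackEntrySize,
                    bool IsCedar) {
  if (SubEntries < StackEntrySize)
    return false;
  if (IsCedar) {
    unsigned R = SubEntries % 8;
    return R == 7 || R == 0;
  }
  unsigned R = SubEntries % 4;
  return R == 0 || R == 3;
}

// Conservative rule chosen by the patch: ignore the residue and apply the
// work-around whenever the sub-entry count reaches the stack entry size.
bool needsWorkaroundConservative(unsigned SubEntries,
                                 unsigned StackEntrySize) {
  return SubEntries >= StackEntrySize;
}
```

Every input that satisfies `hitsCFStackBug` also satisfies `needsWorkaroundConservative`, which is why the conservative rule stays safe even if the stack is over-allocated.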