path: root/llvm/test/CodeGen/R600
Commit message | Author | Age | Files | Lines
...
* CodeGen/R600/v_cndmask.ll: Relax an expression to unbreak msvcrt.NAKAMURA Takumi2014-03-181-1/+1
| | | | | | | V_CNDMASK_B32_e64 v0, v0, -1.#QNAN0e+00, s[2:3], 0, 0, 0, 0 FIXME: We really need to implement our formatter... llvm-svn: 204118
* Making a guess to fix the test case with r204056 to get the build bot working.Kevin Enderby2014-03-171-1/+1
| | | | llvm-svn: 204073
* R600: Match sign_extend_inreg to BFE instructionsMatt Arsenault2014-03-174-23/+262
| | | | llvm-svn: 204072
* R600/SI: Fix implementation of isInlineConstant() used by the verifierTom Stellard2014-03-171-0/+13
| | | | | | | | The type of the immediates should not matter as long as the encoding is equivalent to the encoding of one of the legal inline constants. Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 204056
* R600/SI: Use correct dest register class for V_READFIRSTLANE_B32Tom Stellard2014-03-171-2/+2
| | | | | | | | | | | | This instruction writes to a 32-bit SGPR. This change required adding the 32-bit VCC_LO and VCC_HI registers, because the full VCC register is 64 bits. This fixes verifier errors on several of the indirect addressing piglit tests. Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 204055
* R600: LDS instructions shouldn't implicitly define OQAPTom Stellard2014-03-131-0/+28
| | | | | | | | | LDS instructions are pseudo instructions which model the OQAP defs and uses within a single instruction. This fixes a hang in the opencv MedianFilter tests. llvm-svn: 203818
* R600: Fix trunc store from i64 to i1Matt Arsenault2014-03-121-0/+30
| | | | llvm-svn: 203695
* R600/SI: Using SGPRs is illegal for instructions that read carry-out from VCCTom Stellard2014-03-071-0/+15
| | | | | Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 203281
* R600/SI: Custom lower i1 storesTom Stellard2014-03-071-3/+20
| | | | | | | | These are sometimes created by the shrink to boolean optimization in the globalopt pass. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 203280
* R600: Fix extloads from i8 / i16 to i64.Matt Arsenault2014-03-061-5/+72
| | | | | | | This appears to only be working for global loads. Private and local break for other reasons. llvm-svn: 203135
* R600/SI: Expand selects on vectors.Matt Arsenault2014-03-061-0/+155
| | | | llvm-svn: 203134
* R600: Add failing control flow tests.Matt Arsenault2014-03-015-0/+319
| | | | | | Simple cases hit a variety of problems at -O0. llvm-svn: 202601
* R600/SI: Expand all v16[if]32 operationsTom Stellard2014-02-281-0/+40
| | | | llvm-svn: 202543
* R600/SI: Optimize SI_KILL for constant operandsMichel Danzer2014-02-271-3/+7
| | | | | | | | If the SI_KILL operand is constant, we can either clear the exec mask if the operand is negative, or do nothing otherwise. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 202337
* R600/SI: Allow SI_KILL for geometry shadersMichel Danzer2014-02-271-0/+18
| | | | | Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 202336
* R600/SI: Custom select 64-bit ADDTom Stellard2014-02-252-6/+27
| | | | llvm-svn: 202194
* R600/SI - Add new CI arithmetic instructions.Matt Arsenault2014-02-243-0/+252
| | | | | | | Does not yet include larger part required to match v_mad_i64_i32 / v_mad_u64_u32. llvm-svn: 202077
* [CodeGenPrepare] Fix the check of the legality of an instruction.Quentin Colombet2014-02-221-4/+7
| | | | | | | | | The API expects an ISD opcode, not an IR opcode. Fixes a regression for R600. Related to <rdar://problem/15519855>. llvm-svn: 201923
* Fix more broken CHECK linesNico Rieck2014-02-161-2/+2
| | | | llvm-svn: 201493
* [CodeGenPrepare][AddressingModeMatcher] Give up on type promotion if theQuentin Colombet2014-02-141-2/+0
| | | | | | | transformation does not bring any immediate benefits and introduces an illegal operation. llvm-svn: 201439
* TargetLowering: n * r where n > 2 should be an illegal addressing modeTom Stellard2014-02-141-0/+18
| | | | llvm-svn: 201433
* R600/SI: Expand all v8[if]32 operationsTom Stellard2014-02-132-16/+58
| | | | llvm-svn: 201371
* R600/SI: Add a pattern for i32 anyextTom Stellard2014-02-131-0/+14
| | | | | Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 201370
* R600/SI: Completely Disable TypeRewriter on computeTom Stellard2014-02-131-0/+9
| | | | llvm-svn: 201369
* R600/SI: Split global vector loads with more than 4 elementsTom Stellard2014-02-131-85/+93
| | | | llvm-svn: 201368
* R600/SI: Add ShaderType attribute to some testsTom Stellard2014-02-134-14/+22
| | | | llvm-svn: 201367
* R600/SI: Fix assertion on infinite loops.Matt Arsenault2014-02-111-0/+17
| | | | | | | This isn't the most useful case to fix in the real world, but bugpoint runs into this. llvm-svn: 201177
* R600/SI: Initialize M0 and emit S_WQM_B64 whenever DS instructions are usedTom Stellard2014-02-101-2/+17
| | | | | | | | | | | DS instructions that access local memory can only use addresses that are less than or equal to the value of M0. When M0 is uninitialized, we experience undefined behavior. This patch also changes the behavior to emit S_WQM_B64 on pixel shaders no matter what kind of DS instruction is used. llvm-svn: 201097
* R600/SI: Add failing test for 3 x i64 vectors.Matt Arsenault2014-02-071-0/+28
| | | | | | | Stores of <4 x i64> do work (although they do expand to 4 stores instead of 2), but 3 x i64 vectors fail to select. llvm-svn: 200989
* R600/SI: Add a MUBUF store pattern for Reg+Imm offsetsTom Stellard2014-02-061-0/+12
| | | | llvm-svn: 200935
* R600/SI: Add a MUBUF store pattern for Imm offsetsTom Stellard2014-02-061-0/+35
| | | | llvm-svn: 200934
* R600/SI: Add a MUBUF load pattern for Reg+Imm offsetsTom Stellard2014-02-061-0/+51
| | | | llvm-svn: 200933
* R600/SI: Use immediate offsets for SMRD instructions whenever possibleTom Stellard2014-02-061-0/+80
| | | | | | | | There was a problem with the old pattern, so we were copying some larger immediates into registers when we could have been encoding them in the instruction. llvm-svn: 200932
* R600/SI: Add pattern for zero-extending i1 to i32Michel Danzer2014-02-051-0/+10
| | | | | | | | | Fixes opencl-example if_* tests with radeonsi. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74469 Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200830
* R600/SI: Custom lower i64 ISD::SELECTTom Stellard2014-02-041-0/+12
| | | | llvm-svn: 200774
* R600: Enable vector fpow.Tom Stellard2014-02-041-4/+25
| | | | | | | | | | | The OpenCL specs say: "The vector versions of the math functions operate component-wise. The description is per-component." Patch by: Jan Vesely Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 200773
* R600/SI: Fix fneg for 0.0Michel Danzer2014-02-043-14/+69
| | | | | | | | | | | | V_ADD_F32 with source modifier does not produce -0.0 for this. Just manipulate the sign bit directly instead. Also add a pattern for (fneg (fabs ...)). Fixes a bunch of bit encoding piglit tests with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200743
* Add some xfailed R600 tests for 64-bit private accesses.Matt Arsenault2014-02-022-0/+67
| | | | llvm-svn: 200620
* R600/SI: Fix insertelement with dynamic indices.Matt Arsenault2014-02-021-11/+169
| | | | | | | | This didn't work for any integer vectors, and didn't work with some sizes of float vectors. This should now work with all sizes of float and i32 vectors. llvm-svn: 200619
* R600/SI: Add pattern for truncating i32 to i1Michel Danzer2014-01-281-0/+10
| | | | | | | Fixes half a dozen piglit tests with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200283
* R600/SI: Add intrinsic for BUFFER_LOAD_DWORD* instructionsMichel Danzer2014-01-271-0/+40
| | | | | Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200196
* R600/SI: Add intrinsic for S_SENDMSG instructionMichel Danzer2014-01-271-0/+21
| | | | | Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200195
* R600: Disable the BFE patternTom Stellard2014-01-231-0/+2
| | | | | | | | | | This pattern uses an SDNodeXForm, which isn't being emitted for some reason. I can get it to work by attaching the PatLeaf that has the XForm to the argument in the output pattern, but this results in an immediate being used in a register operand, which the backend can't handle yet. llvm-svn: 199918
* R600: Correctly handle vertex fetch clauses that precede ENDIFsTom Stellard2014-01-231-0/+29
| | | | | | | | The control flow finalizer would sometimes use an ALU_POP_AFTER instruction before the vertex fetch clause instead of using a POP instruction after it. llvm-svn: 199917
* R600: Unconditionally unroll loops that contain GEPs with alloca pointersTom Stellard2014-01-231-0/+37
| | | | | | | | | | | | Implement the getUnrollingPreferences() function for AMDGPUTargetTransformInfo so that loops that do address calculations on pointers derived from alloca are unconditionally unrolled. Unrolling these loops makes it more likely that SROA will be able to eliminate the allocas, which is a big win for R600 since memory allocated by alloca (private memory) is really slow. llvm-svn: 199916
* R600: Recommit 199842: Add work-around for the CF stack entry HW bugTom Stellard2014-01-231-0/+227
| | | | | | | | | | | | | | | | | | The unit test is now disabled on non-asserts builds. The CF stack can be corrupted if you use CF_ALU_PUSH_BEFORE, CF_ALU_ELSE_AFTER, CF_ALU_BREAK, or CF_ALU_CONTINUE when the number of sub-entries on the stack is greater than or equal to the stack entry size and sub-entries modulo 4 is either 0 or 3 (on cedar the bug is present when the number of sub-entries modulo 8 is either 7 or 0). We choose to be conservative and always apply the work-around when the number of sub-entries is greater than or equal to the stack entry size, so that we can safely over-allocate the stack when we are unsure of the stack allocation rules. reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 199905
* Revert "R600: Add work-around for the CF stack entry HW bug"Tom Stellard2014-01-221-225/+0
| | | | | | | | | This reverts commit 35b8331cad6eb512a2506adbc394201181da94ba. The -debug-only flag for llc doesn't appear to be available in all build configurations. llvm-svn: 199845
* R600: Add work-around for the CF stack entry HW bugTom Stellard2014-01-221-0/+225
| | | | | | | | | | | | | | | | The CF stack can be corrupted if you use CF_ALU_PUSH_BEFORE, CF_ALU_ELSE_AFTER, CF_ALU_BREAK, or CF_ALU_CONTINUE when the number of sub-entries on the stack is greater than or equal to the stack entry size and sub-entries modulo 4 is either 0 or 3 (on cedar the bug is present when the number of sub-entries modulo 8 is either 7 or 0). We choose to be conservative and always apply the work-around when the number of sub-entries is greater than or equal to the stack entry size, so that we can safely over-allocate the stack when we are unsure of the stack allocation rules. reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 199842
* R600: Refactor stack size calculationTom Stellard2014-01-221-1/+1
| | | | | reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 199840
* R600: MOVA is vector onlyTom Stellard2014-01-221-1/+3
| | | | llvm-svn: 199827