path: root/llvm/lib/Target/R600
Commit message Author Age Files Lines
...
* R600/SI: Extend private extload pattern to include zext loadsTom Stellard2015-02-171-4/+6
| | | | llvm-svn: 229507
* Prefer SmallVector::append/insert over push_back loops.Benjamin Kramer2015-02-172-21/+8
| | | | | | Same functionality, but hoists the vector growth out of the loop. llvm-svn: 229500
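The effect of replacing a push_back loop with a single range append can be sketched as follows. This is a minimal illustration that uses std::vector as a stand-in so it compiles without LLVM headers; `appendLoop` and `appendRange` are hypothetical names, not functions from the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: element-by-element copy. The vector may have to
// grow several times while the loop runs.
std::vector<int> appendLoop(std::vector<int> dst, const std::vector<int> &src) {
  for (int v : src)
    dst.push_back(v);
  return dst;
}

// Hypothetical helper: one range insert at the end. The container sees the
// whole range up front, so any growth happens outside a per-element loop.
// llvm::SmallVector::append(begin, end) plays this role in LLVM code.
std::vector<int> appendRange(std::vector<int> dst, const std::vector<int> &src) {
  dst.insert(dst.end(), src.begin(), src.end());
  return dst;
}
```

Both helpers produce the same sequence; only the growth pattern differs.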
* AArch64: Safely handle the incoming sret call argument.Andrew Trick2015-02-162-4/+4
| This adds a safe interface to the machine independent InputArg struct
| for accessing the index of the original (IR-level) argument. When a
| non-native return type is lowered, we generate the hidden
| machine-level sret argument on-the-fly. Before this fix, we were
| representing this argument as OrigArgIndex == 0, which is an outright
| lie. In particular this crashed in the AArch64 backend where we
| actually try to access the type of the original argument.
|
| Now we use a sentinel value for machine arguments that have no
| original argument index. AArch64, ARM, Mips, and PPC now check for this
| case before accessing the original argument.
|
| Fixes <rdar://19792160> Null pointer assertion in AArch64TargetLowering
|
| llvm-svn: 229413
* Removing LLVM_DELETED_FUNCTION, as MSVC 2012 was the last reason for requiring the macro. NFC; LLVM edition.Aaron Ballman2015-02-152-4/+4
| | | | llvm-svn: 229340
* R600/SI: Implement correct f64 fdivMatt Arsenault2015-02-146-25/+79
| | | | | | This version passes the OpenCL conformance test. llvm-svn: 229239
* R600/SI: Use complex operand folding for div_scaleMatt Arsenault2015-02-141-12/+7
| | | | llvm-svn: 229238
* R600/SI: Fix implicit vcc operand to v_div_fmas_*Matt Arsenault2015-02-144-7/+46
| This should allow finally fixing the f64 fdiv implementation.
|
| Test is disabled for VI since there seems to be a problem with one of
| the buffer load instructions on it.
|
| llvm-svn: 229236
* R600/SI: Fix schedule model for v_div_scale_{f32|f64}Matt Arsenault2015-02-141-1/+3
| | | | llvm-svn: 229235
* R600/SI: Really fix size of VReg_1Matt Arsenault2015-02-141-1/+3
| | | | llvm-svn: 229234
* R600/SI: Rename encoding field to match docs for VOP3bMatt Arsenault2015-02-141-2/+2
| | | | llvm-svn: 229233
* R600/SI: Fix not encoding src2 for v_div_scale_{f32|f64}Matt Arsenault2015-02-141-1/+14
| | | | | | This apparently got lost in the VI changes. llvm-svn: 229230
* R600/SI: Fix VOP3b encoding on VIMatt Arsenault2015-02-143-11/+43
| | | | llvm-svn: 229228
* R600/SI: Fix phys reg copies in SIFoldOperandsMatt Arsenault2015-02-141-3/+13
| | | | llvm-svn: 229227
* R600/SI: Fix copies from SGPR to VCCMatt Arsenault2015-02-141-5/+10
| | | | | | | This shows up without optimizations when vcc is required to be used. llvm-svn: 229226
* R600/SI: Add hack to copy from a VGPR to VCCMatt Arsenault2015-02-141-0/+10
| | | | | | This hopefully should be fixed when VReg_1 is removed. llvm-svn: 229225
* R600/SI: Fix size of VReg_1Matt Arsenault2015-02-141-1/+1
| | | | | | | This is really a 32-bit register, if we try to check the size of it, we want 32-bits. llvm-svn: 229223
* R600: Canonicalize access to function attributes, NFCDuncan P. N. Exon Smith2015-02-142-5/+2
| Canonicalize access to function attributes to use the simpler API.
|
| getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind)
|   => getFnAttribute(Kind)
|
| getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind)
|   => hasFnAttribute(Kind)
|
| llvm-svn: 229222
* R600/SI: Refactor SOP1 classesTom Stellard2015-02-131-26/+19
| | | | llvm-svn: 229152
* R600/SI: Lowercase register namesTom Stellard2015-02-131-4/+4
| | | | llvm-svn: 229151
* R600/SI: Remove some unused TableGen classesTom Stellard2015-02-131-19/+0
| | | | llvm-svn: 229150
* R600/SI: Remove handling of fpimmMatt Arsenault2015-02-131-16/+6
| | | | llvm-svn: 229136
* R600/SI: Allow f64 inline immediates in i64 operandsMatt Arsenault2015-02-136-65/+155
| | | | | | | This requires considering the size of the operand when checking immediate legality. llvm-svn: 229135
* [PM] Remove the old 'PassManager.h' header file at the top level ofChandler Carruth2015-02-132-2/+2
| LLVM's include tree and the use of using declarations to hide the
| 'legacy' namespace for the old pass manager.
|
| This undoes the primary modules-hostile change I made to keep
| out-of-tree targets building. I sent an email inquiring about whether
| this would be reasonable to do at this phase and people seemed fine with
| it, so making it a reality. This should allow us to start bootstrapping
| with modules to a certain extent along with making it easier to mix and
| match headers in general.
|
| The updates to any code for users of LLVM are very mechanical. Switch
| from including "llvm/PassManager.h" to "llvm/IR/LegacyPassManager.h".
| Qualify the types which now produce compile errors with "legacy::". The
| most common ones are "PassManager", "PassManagerBase", and
| "FunctionPassManager".
|
| llvm-svn: 229094
* R600/SI: Remove unnecessary check for fpimmMatt Arsenault2015-02-131-1/+1
| | | | llvm-svn: 229034
* MathExtras: Bring Count(Trailing|Leading)Ones and CountPopulation in line with countTrailingZerosBenjamin Kramer2015-02-121-1/+1
| Update all callers.
|
| llvm-svn: 228930
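The relationship between the "count ones" and "count zeros" helpers can be sketched as below. This is a standalone illustration, not the MathExtras code: `ctz32` and `cto32` are hypothetical names, and the sketch only assumes that counting trailing ones is the same as counting trailing zeros of the complement.

```cpp
#include <cassert>
#include <cstdint>

// Count trailing zero bits of a 32-bit value; returns 32 for an input of 0.
unsigned ctz32(uint32_t v) {
  if (v == 0)
    return 32;
  unsigned n = 0;
  while ((v & 1u) == 0) {
    v >>= 1;
    ++n;
  }
  return n;
}

// Count trailing one bits by complementing and counting trailing zeros.
unsigned cto32(uint32_t v) {
  return ctz32(static_cast<uint32_t>(~v));
}
```

For example, `cto32(0x7)` yields 3 because `~0x7` ends in three zero bits.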
* R600/SI: Disable subreg livenessTom Stellard2015-02-111-1/+1
| | | | | | This is temporary while we try to fix a crash in the register coalescer. llvm-svn: 228861
* R600: Split AMDGPUPassConfig into R600PassConfig and GCNPassConfigTom Stellard2015-02-112-66/+96
| | | | llvm-svn: 228850
* R600: Create an R600TargetMachine for pre-gcn GPUsTom Stellard2015-02-112-15/+36
| No functionality change. R600TargetMachine inherits from
| AMDGPUTargetMachine.
|
| llvm-svn: 228849
* R600/SI: Store immediate offsets > 12-bits in soffsetTom Stellard2015-02-111-13/+19
| | | | | | | This will save us from having to extend these offsets to 64-bits and storing them in a pair of vgprs. llvm-svn: 228776
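One way to picture the offset handling described above is the sketch below. It is an assumption-laden illustration, not the patch itself: `MubufOffset` and `splitOffset` are hypothetical names, and the sketch assumes the instruction's immediate offset field holds unsigned 12-bit values (0 to 4095) while soffset can hold a full 32-bit value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pair of operands: a scalar offset register value and the
// immediate offset field of the instruction.
struct MubufOffset {
  uint32_t soffset;
  uint32_t imm;
};

// If the constant fits in 12 bits, keep it in the immediate field and leave
// soffset at 0. Otherwise carry the whole constant in soffset, which avoids
// widening it to 64 bits and placing it in a pair of vgprs.
MubufOffset splitOffset(uint32_t offset) {
  const uint32_t maxImm = (1u << 12) - 1; // 4095
  if (offset <= maxImm)
    return {0u, offset};
  return {offset, 0u};
}
```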
* R600/SI: Add soffset operand to mubuf addr64 instructionTom Stellard2015-02-115-28/+33
| | | | | | We were previously hard-coding soffset to 0. llvm-svn: 228775
* Make helper functions/classes/globals static. NFC.Benjamin Kramer2015-02-061-1/+1
| | | | llvm-svn: 228410
* R600/SI: Don't enable WQM for V_INTERP_* instructions v2Michel Danzer2015-02-061-6/+0
| Doesn't seem necessary anymore. I think this was mostly compensating
| for not enabling WQM for texture sampling instructions.
|
| v2: Add test coverage
|
| Reviewed-by: Tom Stellard <tom@stellard.net>
|
| llvm-svn: 228373
* R600/SI: Also enable WQM for image opcodes which calculate LOD v3Michel Danzer2015-02-066-56/+79
| If whole quad mode isn't enabled for these, the level of detail is
| calculated incorrectly for pixels along diagonal triangle edges, causing
| artifacts.
|
| v2: Use a TSFlag instead of lots of switch cases
| v3: Add test coverage
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88642
| Reviewed-by: Tom Stellard <tom@stellard.net>
|
| llvm-svn: 228372
* R600/SI: Fix bug in TTI loop unrolling preferencesTom Stellard2015-02-051-1/+1
| We should be setting UnrollingPreferences::MaxCount to MAX_UINT instead
| of UnrollingPreferences::Count.
|
| Count is a 'forced unrolling factor', while MaxCount sets an upper
| limit to the unrolling factor.
|
| Setting Count to MAX_UINT was causing the loop in the testcase to be
| unrolled 15 times, when it only had a maximum of 4 iterations.
|
| llvm-svn: 228303
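The difference between a forced factor and an upper limit can be shown with a deliberately simplified model. This is not LLVM's unroller and does not reproduce the factor of 15 mentioned above; `UnrollPrefs` and `chooseUnrollFactor` are hypothetical, and treating a Count of 0 as "not forced" is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>

// Hypothetical stand-in for the two fields named in the commit message.
struct UnrollPrefs {
  unsigned Count;    // forced unrolling factor; 0 means "not forced" here
  unsigned MaxCount; // upper limit on whatever factor is chosen
};

// Toy policy: a forced Count wins outright, ignoring the trip count.
// Otherwise the loop is fully unrolled, clamped to MaxCount.
unsigned chooseUnrollFactor(unsigned tripCount, const UnrollPrefs &prefs) {
  if (prefs.Count != 0)
    return prefs.Count;
  return std::min(tripCount, prefs.MaxCount);
}
```

In this model, setting MaxCount to UINT_MAX leaves a 4-iteration loop with a factor of 4, while any forced Count overrides the trip count entirely.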
* R600/SI: Fix bug from insertion of llvm.SI.end.cf into loop headersTom Stellard2015-02-051-3/+24
| The llvm.SI.end.cf intrinsic is used to mark the end of if-then blocks,
| if-then-else blocks, and loops. It is responsible for updating the
| exec mask to re-enable threads that had been masked during the preceding
| control flow block. For example:
|
|   s_mov_b64 exec, 0x3                 ; Initial exec mask
|   s_mov_b64 s[0:1], exec              ; Saved exec mask
|   v_cmpx_gt_u32 exec, s[2:3], v0, 0   ; llvm.SI.if
|   do_stuff()
|   s_or_b64 exec, exec, s[0:1]         ; llvm.SI.end.cf
|
| The bug fixed by this patch was one where the llvm.SI.end.cf intrinsic
| was being inserted into the header of loops. This would happen when
| an if block terminated in a loop header and we would end up with
| code like this:
|
|   s_mov_b64 exec, 0x3                 ; Initial exec mask
|   s_mov_b64 s[0:1], exec              ; Saved exec mask
|   v_cmpx_gt_u32 exec, s[2:3], v0, 0   ; llvm.SI.if
|   do_stuff()
|
| LOOP:                                 ; Start of loop header
|   s_or_b64 exec, exec, s[0:1]         ; llvm.SI.end.cf <-BUG: The exec mask
|                                       ;   has the same value at the beginning
|                                       ;   of each loop iteration.
|   do_stuff();
|   s_cbranch_execnz LOOP
|
| The fix is to create a new basic block before the loop and insert the
| llvm.SI.end.cf there. This way the exec mask is restored before the
| start of the loop instead of at the beginning of each iteration.
|
| llvm-svn: 228302
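The consequence of restoring the exec mask in the loop header can be reproduced with a toy simulation. This is a sketch under stated assumptions, not compiler code: it models a 2-lane wavefront, `runLoop` is a hypothetical function, and the rule that lane t leaves the loop at the end of iteration t + 1 is invented purely to make the effect visible.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Returns how many loop iterations each of the two lanes executed.
// endCfInHeader == true models the bug (mask restored on every iteration);
// false models the fix (mask restored once, in a block before the loop).
std::array<unsigned, 2> runLoop(bool endCfInHeader) {
  const uint64_t saved = 0x3;  // exec mask saved before the if block
  uint64_t exec = saved & 0x1; // the if condition masked off lane 1
  std::array<unsigned, 2> iterations = {0, 0};

  if (!endCfInHeader)
    exec |= saved; // fixed placement: restore once, before the loop

  for (unsigned k = 1; k <= 16; ++k) { // safety cap on iterations
    if (endCfInHeader)
      exec |= saved; // buggy placement: re-enables lanes that already left

    for (unsigned lane = 0; lane < 2; ++lane)
      if (exec & (1ull << lane))
        ++iterations[lane]; // loop body runs on active lanes only

    // Invented exit rule: lane t leaves at the end of iteration t + 1.
    for (unsigned lane = 0; lane < k && lane < 2; ++lane)
      exec &= ~(1ull << lane);

    if (exec == 0) // s_cbranch_execnz LOOP falls through
      break;
  }
  return iterations;
}
```

With the fixed placement lane 0 runs one iteration and lane 1 runs two. With the buggy placement lane 0 is re-enabled in the header and runs a second iteration it had already exited.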
* R600/SI: Fix i64 truncate to i1Matt Arsenault2015-02-051-0/+6
| | | | llvm-svn: 228273
* R600/SI: Enable subreg liveness by defaultTom Stellard2015-02-041-0/+4
| | | | llvm-svn: 228228
* R600/SI: Expand misaligned 16-bit memory accessesTom Stellard2015-02-041-0/+5
| | | | llvm-svn: 228190
* R600/SI: Make more store operations legalTom Stellard2015-02-042-12/+0
| v2i32, i32, trunc i32 to i16, and trunc i32 to i8 stores are legal for
| all address spaces. We had marked them as custom in order to lower
| them for the private address space, but this is no longer necessary.
|
| This enables lowering of misaligned stores of these types in the
| DAGLegalizer.
|
| llvm-svn: 228189
* R600: Don't promote i64 stores to v2i32 during DAG legalizationTom Stellard2015-02-042-3/+25
| | | | | | | We take care of this during instruction selection now. This fixes a potential infinite loop when lowering misaligned stores. llvm-svn: 228188
* R600/SI: Remove useless patterns in VALU which are already covered by SALUMarek Olsak2015-02-031-45/+16
| Also remove hasPostISelHook=1 from V_LSHL_B32. It's defined by InstSI
| already.
|
| Tested-by: Michel Dänzer <michel.daenzer@amd.com>
|
| llvm-svn: 228039
* R600/SI: Rewrite VOP1InstSI to contain a pseudo and _si opcodeMarek Olsak2015-02-031-7/+23
| What this does is that if you accidentally select these instructions
| on VI, the code generation will fail, because the pseudo -> _vi mapping
| will be undefined.
|
| The idea is to be able to catch possible future bugs easily.
|
| Tested-by: Michel Dänzer <michel.daenzer@amd.com>
|
| llvm-svn: 228038
* R600/SI: Fix B64 VALU shifts on VIMarek Olsak2015-02-033-0/+33
| SI only has standard versions. VI only has REV versions.
|
| Tested-by: Michel Dänzer <michel.daenzer@amd.com>
|
| llvm-svn: 228037
* R600/SI: Don't generate non-existent LSHL, LSHR, ASHR B32 variants on VIMarek Olsak2015-02-033-11/+41
| This can happen when a REV instruction is commuted.
|
| The trick is not to define the _vi versions of instructions, which has
| these consequences:
| - code generation will always fail if a pseudo cannot be lowered
|   (very useful to catch bugs where an unsupported instruction somehow
|   makes it to the printer)
| - ability to query if a pseudo can be lowered, which is done in
|   commuteOpcode to prevent REV from commuting to non-REV on VI
|
| Tested-by: Michel Dänzer <michel.daenzer@amd.com>
|
| llvm-svn: 227990
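The "query before commuting" idea described above can be sketched with a small lookup. Everything here is hypothetical: `Gen`, `canLower`, and this simplified `commuteOpcode` are illustrative stand-ins working on opcode names, and the table lists only the 32-bit left shift as an example of a pseudo that has no non-REV encoding on VI.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

enum class Gen { SI, VI };

// Hypothetical table of (pseudo, generation) pairs that have a real encoding.
// The non-REV shift is intentionally left undefined for VI.
bool canLower(const std::string &pseudo, Gen gen) {
  static const std::set<std::pair<std::string, Gen>> defined = {
      {"V_LSHL_B32", Gen::SI},
      {"V_LSHLREV_B32", Gen::SI},
      {"V_LSHLREV_B32", Gen::VI},
  };
  return defined.count({pseudo, gen}) != 0;
}

// Simplified commute: swap REV and non-REV forms only when the swapped
// form can actually be lowered on the target generation.
std::string commuteOpcode(const std::string &pseudo, Gen gen) {
  std::string swapped;
  if (pseudo == "V_LSHLREV_B32")
    swapped = "V_LSHL_B32";
  else if (pseudo == "V_LSHL_B32")
    swapped = "V_LSHLREV_B32";
  else
    return pseudo; // nothing to commute in this sketch
  return canLower(swapped, gen) ? swapped : pseudo;
}
```

On SI the REV form commutes to the plain form. On VI the lookup fails, so the REV form is kept.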
* R600/SI: Remove VOP2_REV definitions from target-specific instructionsMarek Olsak2015-02-032-32/+22
| The getCommute* functions are only used with pseudos, so this commit
| doesn't change anything.
|
| The issue with missing non-rev versions of shift instructions on VI will
| be fixed separately.
|
| Tested-by: Michel Dänzer <michel.daenzer@amd.com>
|
| llvm-svn: 227989
* R600/SI: Trivial instruction definition corrections for VI (v2)Marek Olsak2015-02-032-12/+24
| - V_MAC_LEGACY_F32 exists on VI, but it's VOP3-only.
|
| - Define CVT_PK opcodes which are different between SI and VI. These
|   are unused. The idea is to define all chip differences.
|
| v2: keep V_MUL_LO_U32
|
| Tested-by: Michel Dänzer <michel.daenzer@amd.com>
|
| llvm-svn: 227988
* R600/SI: Determine target-specific encoding of READLANE and WRITELANE early v2Marek Olsak2015-02-032-2/+12
| These are VOP2 on SI and VOP3 on VI, and their pseudos are neither,
| which can be a problem. In order to make isVOP2 and isVOP3 queries behave
| as expected, the encoding must be determined first.
|
| This doesn't fix any known issue, but better safe than sorry.
|
| v2: add and use getMCOpcodeFromPseudo
|
| Tested-by: Michel Dänzer <michel.daenzer@amd.com>
|
| llvm-svn: 227987
* R600/SI: Fix dependency between instruction writing M0 and S_SENDMSG on VI (v2)Marek Olsak2015-02-031-0/+34
| This fixes a hang when using an empty geometry shader.
|
| v2: - don't add s_nop when followed by s_waitcnt
|     - cosmetic changes
|
| Tested-by: Michel Dänzer <michel.daenzer@amd.com>
|
| llvm-svn: 227986
* R600/SI: 64-bit and larger memory access must be at least 4-byte alignedTom Stellard2015-02-021-4/+4
| This is true for SI only. CI+ supports unaligned memory accesses, but
| this requires driver support, so for now we disallow unaligned accesses
| for all GCN targets.
|
| llvm-svn: 227822
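The rule in the commit title can be written as a small predicate. This is a sketch of the stated rule only, not the backend's actual hook: `wideAccessIsAllowed` is a hypothetical name, and the sketch assumes alignments are powers of two and says nothing about accesses narrower than 64 bits.

```cpp
#include <cassert>

// Accept an access of 64 bits or more only when it is at least 4-byte
// aligned. Narrower accesses are outside the rule and rejected here.
bool wideAccessIsAllowed(unsigned sizeInBits, unsigned alignInBytes) {
  if (sizeInBits < 64)
    return false; // not covered by this sketch
  return alignInBytes >= 4;
}
```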
* [multiversion] Remove the function parameter from the unrollingChandler Carruth2015-02-012-3/+2
| | | | | | preferences interface on TTI now that all of TTI is per-function. llvm-svn: 227741