summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/SIInstrInfo.h
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU/SI: Add SI Machine SchedulerNicolai Haehnle2016-01-131-0/+3
| | | | | | | | | | | | | | | | Summary: It is off by default, but can be used with --misched=si Patch by: Axel Davy Reviewers: arsenm, tstellarAMD, nhaehnle Subscribers: nhaehnle, solenskiner, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D11885 llvm-svn: 257609
* AMDGPU: Fix off-by-one in SIRegisterInfo::eliminateFrameIndexNicolai Haehnle2015-12-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | Summary: The method insertNOPs expected the number of wait states to be passed as parameter, while eliminateFrameIndex passed the immediate argument for the S_NOP, leading to an off-by-one error. Rename the method to make the meaning of its parameter clearer. The number of 4 / 5 wait states (which is what the method has always _tried_ to do according to the comment) is correct according to the hardware docs. I stumbled upon this while trying to track down the cause of https://bugs.freedesktop.org/show_bug.cgi?id=93264. While clearly needed, this patch unfortunately does not fix that bug... Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15542 llvm-svn: 255906
* AMDGPU: Optimize VOP2 operand legalizationMatt Arsenault2015-12-011-0/+17
| | | | | | | | | | | | | | | | | | Don't use commuteInstruction, and don't commute if doing so will not improve legality. Skip the more complex checks for literal operands and constant bus restrictions, which are not a concern for VOP2 instructions because src1 does not accept SGPRs or constants and few implicitly read vcc. This gets called quite a few times and the attempts at commuting are a significant fraction of the time spent in SIFixSGPRCopies, so it's somewhat worthwhile to optimize. With this patch and others leading up to it, this reduces the compile time of SIFixSGPRCopies on some of the LuxMark 2 kernels from ~8ms to ~5ms on my system. llvm-svn: 254452
* AMDGPU/SI: select S_ABS_I32 when possible (v2)Marek Olsak2015-11-251-0/+3
| | | | | | | | | | v2: added more tests, moved the SALU->VALU conversion to a separate function It looks like it's not possible to get subregisters in the S_ABS lowering code, and I don't feel like guessing without testing what the correct code would look like. llvm-svn: 254095
* AMDGPU: Fix assert when legalizing atomic operandsMatt Arsenault2015-11-051-0/+6
| | | | | | | | | | The operand layout is slightly different for the atomic opcodes from the usual MUBUF loads and stores. This should only fix it on SI/CI. VI is still broken because it still emits the addr64 replacement. llvm-svn: 252140
* AMDGPU: Simplify VOP3 operand legalization.Matt Arsenault2015-10-211-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | This was checking for a variety of situations that should never happen. This saves a tiny bit of compile time. We should not be selecting instructions with invalid operands in the first place. Most of the time for registers copys are inserted to the correct operand register class. For VOP3, since all operand types are supported and literal constants never are, we just need to verify the constant bus requirements (all immediates should be legal inline ones). The only possibly tricky case to maybe worry about is if when legalizing operands in moveToVALU with s_add_i32 and similar instructions. If the original s_add_i32 had a literal constant and we need to replace it with v_add_i32_e64 we would have an unsupported literal operand. However, I don't think we should worry about that because SIFoldOperands should handle folding literal constant operands into the SALU instructions based on the uses. At SIFoldOperands time, the legality and profitability of operand types is a bit different. llvm-svn: 250951
* AMDGPU: Add MachineInstr overloads for instruction format testsMatt Arsenault2015-10-201-0/+76
| | | | llvm-svn: 250797
* AMDGPU/SI: Don't set DATA_FORMAT if ADD_TID_ENABLE is setMarek Olsak2015-09-291-1/+1
| | | | | | | | | | to prevent setting a huge stride, because DATA_FORMAT has a different meaning if ADD_TID_ENABLE is set. This is a candidate for stable llvm 3.7. Tested-and-Reviewed-by: Christian König <christian.koenig@amd.com> llvm-svn: 248858
* AMDGPU: Factor switch into separate functionMatt Arsenault2015-09-281-0/+3
| | | | llvm-svn: 248742
* Improved the interface of methods commuting operands, improved X86-FMA3 ↵Andrew Kaylor2015-09-281-2/+6
| | | | | | | | | | mem-folding&coalescing. Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com) Differential Revision: http://reviews.llvm.org/D11370 llvm-svn: 248735
* AMDGPU: Construct new buffer instruction when moving SMRDMatt Arsenault2015-09-251-1/+2
| | | | | | | | | It's easier to understand creating a full instruction than the current situation where sometimes a new instruction is created and sometimes it is awkwardly mutated in place. llvm-svn: 248627
* AMDGPU: Make getNamedOperandIdx declaration readonlyMatt Arsenault2015-09-251-0/+2
| | | | | | This matches how it is defined in the generated implementation. llvm-svn: 248598
* AMDGPU: Add readonly to InstrMapping functionsMatt Arsenault2015-09-241-1/+15
| | | | llvm-svn: 248474
* AMDGPU: Delete dead codeMatt Arsenault2015-08-261-6/+0
| | | | | | | | | | | | | | | | | There is no context where s_mov_b64 is emitted and could potentially be moved to the VALU. It is currently only emitted for materializing immediates, which can't be dependent on vector sources. The immediate splitting is already done when selecting constants. I'm not sure what contexts if any the register splitting would have been used before. Also clean up using s_mov_b64 in place of v_mov_b64_pseudo, although this isn't required and just skips the extra step of eliminating the copy from the SReg_64. llvm-svn: 246080
* AMDGPU: Don't create intermediate SALU instructionsMatt Arsenault2015-08-261-0/+4
| | | | | | | | | | | | When splitting 64-bit operations, create the correct VALU instructions immediately. This was splitting things like s_or_b64 into the two s_or_b32s and then pushing the new instructions onto the worklist. There's no reason we need to do this intermediate step. llvm-svn: 246077
* AMDGPU/SI: Remove EXECRegMatt Arsenault2015-08-051-2/+0
| | | | | | For the same reasons as the other physical registers. llvm-svn: 244062
* AMDGPU/SI: Add implicit register operands in the correct order.Alex Lorenz2015-07-311-2/+0
| | | | | | | | | | | | | | | | | | This commit fixes a bug in the class 'SIInstrInfo' where the implicit register machine operands were added to a machine instruction in an incorrect order - the implicit uses were added before the implicit defs. I found this bug while working on moving the implicit register operand verification code from the MIR parser to the machine verifier. This commit also makes the method 'addImplicitDefUseOperands' in the machine instruction class public so that it can be reused in the 'SIInstrInfo' class. Reviewers: Matt Arsenault Differential Revision: http://reviews.llvm.org/D11689 llvm-svn: 243799
* AMDGPU/SI: Remove isTriviallyReMaterializable() function from SIInstrInfoTom Stellard2015-07-301-3/+0
| | | | | | | | | | | | | | Summary: This function is never called. isReallyTriviallyReMaterializable() is the function that should be implemented instead. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11620 llvm-svn: 243651
* AMDGPU/SI: Select mad patterns to v_mac_f32Tom Stellard2015-07-131-0/+4
| | | | | | | | | The two-address instruction pass will convert these back to v_mad_f32 if necessary. Differential Revision: http://reviews.llvm.org/D11060 llvm-svn: 242038
* AMDGPU: really don't commute REV opcodes if the target variant doesn't existMarek Olsak2015-06-261-1/+1
| | | | | | | | | | | | | | If pseudoToMCOpcode failed, we would return the original opcode, so operands would be swapped, but the instruction would remain the same. It resulted in LSHLREV a, b ---> LSHLREV b, a. This fixes Glamor text rendering and piglit/arb_sample_shading-builtin-gl-sample-mask on VI. This is a candidate for stable branches. v2: the test was simplified by Tom Stellard llvm-svn: 240824
* [TargetInstrInfo] Rename getLdStBaseRegImmOfs and implement for x86.Sanjoy Das2015-06-151-3/+3
| | | | | | | | | | | | | | | | | | | | | | | Summary: TargetInstrInfo::getLdStBaseRegImmOfs to TargetInstrInfo::getMemOpBaseRegImmOfs and implement for x86. The implementation only handles a few easy cases now and will be made more sophisticated in the future. This is NFCI: the only user of `getLdStBaseRegImmOfs` (now `getmemOpBaseRegImmOfs`) is `LoadClusterMotion` and `LoadClusterMotion` is disabled for x86. Reviewers: reames, ab, MatzeB, atrick Reviewed By: MatzeB, atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10199 llvm-svn: 239741
* R600 -> AMDGPU renameTom Stellard2015-06-131-0/+391
| | | | llvm-svn: 239657
* Revert "AMDGPU: Add core backend files for R600/SI codegen v6"Tom Stellard2012-07-161-89/+0
| | | | | | This reverts commit 4ea70107c5e51230e9e60f0bf58a0f74aa4885ea. llvm-svn: 160303
* AMDGPU: Add core backend files for R600/SI codegen v6Tom Stellard2012-07-161-0/+89
llvm-svn: 160270
OpenPOWER on IntegriCloud