summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp
Commit message (Collapse)AuthorAgeFilesLines
* AMDGPU : Fix common dominator of two incoming blocks terminates with uniform ↵Wei Ding2017-04-121-2/+24
| | | | | | | | branch issue. Differential Revision: http://reviews.llvm.org/D31350 llvm-svn: 300142
* AMDGPU: Fix folding reg_sequence into copy to phys regMatt Arsenault2017-04-111-0/+4
| | | | | | | This was producing an illegal reg_sequence defining a physical register with virtual register inputs. llvm-svn: 299997
* [CodeGen] Rename MachineInstrBuilder::addOperand. NFCDiana Picus2017-01-131-2/+3
| | | | | | | | | | | Rename from addOperand to just add, to match the other method that has been added to MachineInstrBuilder for adding more than just 1 operand. See https://reviews.llvm.org/D28057 for the whole discussion. Differential Revision: https://reviews.llvm.org/D28556 llvm-svn: 291891
* AMDGPU/SI: Don't move copies of immediates to the VALUTom Stellard2016-12-061-1/+43
| | | | | | | | | | | | | | | Summary: If we write an immediate to a VGPR and then copy the VGPR to an SGPR, we can replace the copy with a S_MOV_B32 sgpr, imm, rather than moving the copy to the SALU. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D27272 llvm-svn: 288849
* AMDGPU/SI: Avoid moving PHIs to VALU when phi values are defined in scalar ↵Tom Stellard2016-11-291-8/+38
| | | | | | | | | | | | branches Reviewers: arsenm Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23417 llvm-svn: 288095
* AMDGPU/SI: Fix visit order assumption in SIFixSGPRCopiesTom Stellard2016-11-111-24/+44
| | | | | | | | | | | | | | | | | | | | Summary: This pass was assuming that when a PHI instruction defined a register used by another PHI instruction that the defining insstruction would be legalized before the using instruction. This assumption was causing the pass to not legalize some PHI nodes within divergent flow-control. This fixes a bug that was uncovered by r285762. Reviewers: nhaehnle, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D26303 llvm-svn: 286676
* Use StringRef in Pass/PassManager APIs (NFC)Mehdi Amini2016-10-011-3/+1
| | | | llvm-svn: 283004
* Revert "AMDGPU: Remove unused control flow intrinsic"Matt Arsenault2016-07-091-0/+1
| | | | llvm-svn: 274978
* AMDGPU: Remove unused control flow intrinsicMatt Arsenault2016-07-081-1/+0
| | | | llvm-svn: 274939
* AMDGPU: Cleanup subtarget handling.Matt Arsenault2016-06-241-4/+3
| | | | | | | | | Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. llvm-svn: 273652
* AMDGPU: Fix debug name of pass to better matchMatt Arsenault2016-04-211-1/+1
| | | | | | I get this wrong every time I try to debug this. llvm-svn: 267030
* AMDGPU/SI: Fold operands with sub-registersNicolai Haehnle2016-01-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Multi-dword constant loads generated unnecessary moves from SGPRs into VGPRs, increasing the code size and VGPR pressure. These moves are now folded away. Note that this lack of operand folding was not a problem for VMEM loads, because COPY nodes from VReg_Nnn to VGPR32 are eliminated by the register coalescer. Some tests are updated, note that the fsub.ll test explicitly checks that the move is elided. With the IR generated by current Mesa, the changes are obviously relatively minor: 7063 shaders in 3531 tests Totals: SGPRS: 351872 -> 352560 (0.20 %) VGPRS: 199984 -> 200732 (0.37 %) Code Size: 9876968 -> 9881112 (0.04 %) bytes LDS: 91 -> 91 (0.00 %) blocks Scratch: 1779712 -> 1767424 (-0.69 %) bytes per wave Wait states: 295164 -> 295337 (0.06 %) Totals from affected shaders: SGPRS: 65784 -> 66472 (1.05 %) VGPRS: 38064 -> 38812 (1.97 %) Code Size: 1993828 -> 1997972 (0.21 %) bytes LDS: 42 -> 42 (0.00 %) blocks Scratch: 795648 -> 783360 (-1.54 %) bytes per wave Wait states: 54026 -> 54199 (0.32 %) Reviewers: tstellarAMD, arsenm, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15875 llvm-svn: 257074
* AMDGPU: Initialize SIFixSGPRCopies so -print-after worksMatt Arsenault2015-11-031-6/+9
| | | | llvm-svn: 251995
* AMDGPU: Stop assuming vreg for build_vectorMatt Arsenault2015-11-021-1/+4
| | | | | | | | | | | | | This was causing a variety of test failures when v2i64 is added as a legal type. SIFixSGPRCopies should correctly handle the case of vector inputs to a scalar reg_sequence, so this isn't necessary anymore. This was hiding some deficiencies in how reg_sequence is handled later, but this shouldn't be a problem anymore since the register class copy of a reg_sequence is now done before the reg_sequence. llvm-svn: 251860
* AMDGPU: Distribute SGPR->VGPR copies of REG_SEQUENCEMatt Arsenault2015-11-021-23/+89
| | | | | | | Make the REG_SEQUENCE be a VGPR, and do the register class copy first. llvm-svn: 251855
* AMDGPU: Refactor isVGPRToSGPRCopyMatt Arsenault2015-10-131-19/+48
| | | | | | | It should now correctly handle physical registers and make it easier to identify the other direction. llvm-svn: 250132
* AMDGPU: Remove inferRegClassFromUses / inferRegClassFromDefsMatt Arsenault2015-10-071-70/+0
| | | | | | | | | I'm not sure why this would be necessary, and no tests fail with them removed. Looking at the uses is suspect as well because the use reg classes will likely change when the users are moved as a result of moving this instruction. llvm-svn: 249493
* AMDGPU: Remove dead codeMatt Arsenault2015-10-011-7/+4
| | | | | | | There's no point in checking VReg_1 because all uses of it should already have been removed by SILowerI1Copies. llvm-svn: 249081
* AMDGPU: Fix recomputing dominator tree unnecessarilyMatt Arsenault2015-09-251-0/+4
| | | | | | | SIFixSGPRCopies does not modify the CFG, but this was being recomputed before running SIFoldOperands. llvm-svn: 248587
* AMDGPU: Move copy handling under switch like other instructionsMatt Arsenault2015-09-211-5/+10
| | | | llvm-svn: 248172
* AMDGPU: Simplify debug printingMatt Arsenault2015-09-101-3/+1
| | | | llvm-svn: 247345
* R600 -> AMDGPU renameTom Stellard2015-06-131-0/+338
llvm-svn: 239657
OpenPOWER on IntegriCloud