summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* AMDGPU: Cleanup subtarget handling.Matt Arsenault2016-06-241-4/+3
| | | | | | | | | Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. llvm-svn: 273652
* AMDGPU: Fix debug name of pass to better matchMatt Arsenault2016-04-211-1/+1
| | | | | | I get this wrong every time I try to debug this. llvm-svn: 267030
* AMDGPU/SI: Fold operands with sub-registersNicolai Haehnle2016-01-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Multi-dword constant loads generated unnecessary moves from SGPRs into VGPRs, increasing the code size and VGPR pressure. These moves are now folded away. Note that this lack of operand folding was not a problem for VMEM loads, because COPY nodes from VReg_Nnn to VGPR32 are eliminated by the register coalescer. Some tests are updated, note that the fsub.ll test explicitly checks that the move is elided. With the IR generated by current Mesa, the changes are obviously relatively minor: 7063 shaders in 3531 tests Totals: SGPRS: 351872 -> 352560 (0.20 %) VGPRS: 199984 -> 200732 (0.37 %) Code Size: 9876968 -> 9881112 (0.04 %) bytes LDS: 91 -> 91 (0.00 %) blocks Scratch: 1779712 -> 1767424 (-0.69 %) bytes per wave Wait states: 295164 -> 295337 (0.06 %) Totals from affected shaders: SGPRS: 65784 -> 66472 (1.05 %) VGPRS: 38064 -> 38812 (1.97 %) Code Size: 1993828 -> 1997972 (0.21 %) bytes LDS: 42 -> 42 (0.00 %) blocks Scratch: 795648 -> 783360 (-1.54 %) bytes per wave Wait states: 54026 -> 54199 (0.32 %) Reviewers: tstellarAMD, arsenm, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15875 llvm-svn: 257074
* AMDGPU: Initialize SIFixSGPRCopies so -print-after worksMatt Arsenault2015-11-031-6/+9
| | | | llvm-svn: 251995
* AMDGPU: Stop assuming vreg for build_vectorMatt Arsenault2015-11-021-1/+4
| | | | | | | | | | | | | This was causing a variety of test failures when v2i64 is added as a legal type. SIFixSGPRCopies should correctly handle the case of vector inputs to a scalar reg_sequence, so this isn't necessary anymore. This was hiding some deficiencies in how reg_sequence is handled later, but this shouldn't be a problem anymore since the register class copy of a reg_sequence is now done before the reg_sequence. llvm-svn: 251860
* AMDGPU: Distribute SGPR->VGPR copies of REG_SEQUENCEMatt Arsenault2015-11-021-23/+89
| | | | | | | Make the REG_SEQUENCE be a VGPR, and do the register class copy first. llvm-svn: 251855
* AMDGPU: Refactor isVGPRToSGPRCopyMatt Arsenault2015-10-131-19/+48
| | | | | | | It should now correctly handle physical registers and make it easier to identify the other direction. llvm-svn: 250132
* AMDGPU: Remove inferRegClassFromUses / inferRegClassFromDefsMatt Arsenault2015-10-071-70/+0
| | | | | | | | | I'm not sure why this would be necessary, and no tests fail with them removed. Looking at the uses is suspect as well because the use reg classes will likely change when the users are moved as a result of moving this instruction. llvm-svn: 249493
* AMDGPU: Remove dead codeMatt Arsenault2015-10-011-7/+4
| | | | | | | There's no point in checking VReg_1 because all uses of it should already have been removed by SILowerI1Copies. llvm-svn: 249081
* AMDGPU: Fix recomputing dominator tree unnecessarilyMatt Arsenault2015-09-251-0/+4
| | | | | | | SIFixSGPRCopies does not modify the CFG, but this was being recomputed before running SIFoldOperands. llvm-svn: 248587
* AMDGPU: Move copy handling under switch like other instructionsMatt Arsenault2015-09-211-5/+10
| | | | llvm-svn: 248172
* AMDGPU: Simplify debug printingMatt Arsenault2015-09-101-3/+1
| | | | llvm-svn: 247345
* R600 -> AMDGPU renameTom Stellard2015-06-131-0/+338
llvm-svn: 239657
OpenPOWER on IntegriCloud