AMDGPU/SI: Fold operands with sub-registers

Summary:
Multi-dword constant loads generated unnecessary moves from SGPRs into VGPRs,
increasing the code size and VGPR pressure. These moves are now folded away.

Note that this lack of operand folding was not a problem for VMEM loads,
because COPY nodes from VReg_Nnn to VGPR32 are eliminated by the register
coalescer.

Some tests are updated; note that the fsub.ll test explicitly checks that
the move is elided.

With the IR generated by current Mesa, the changes are relatively minor:

7063 shaders in 3531 tests
Totals:
SGPRS: 351872 -> 352560 (0.20 %)
VGPRS: 199984 -> 200732 (0.37 %)
Code Size: 9876968 -> 9881112 (0.04 %) bytes
LDS: 91 -> 91 (0.00 %) blocks
Scratch: 1779712 -> 1767424 (-0.69 %) bytes per wave
Wait states: 295164 -> 295337 (0.06 %)

Totals from affected shaders:
SGPRS: 65784 -> 66472 (1.05 %)
VGPRS: 38064 -> 38812 (1.97 %)
Code Size: 1993828 -> 1997972 (0.21 %) bytes
LDS: 42 -> 42 (0.00 %) blocks
Scratch: 795648 -> 783360 (-1.54 %) bytes per wave
Wait states: 54026 -> 54199 (0.32 %)

Reviewers: tstellarAMD, arsenm, mareko

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15875

llvm-svn: 257074
author: Nicolai Haehnle <nhaehnle@gmail.com> 2016-01-07 17:10:29 +0000
committer: Nicolai Haehnle <nhaehnle@gmail.com> 2016-01-07 17:10:29 +0000
commit: 82fc962c2018b8130f5952a1d1e993e08cbea750 (patch)
tree: 131fa31a59a754c7d81d0a0b62583bcb7541bc80 /llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
parent: 3c05d6d3b5340d83d55ad29cb5509042efcabf2e (diff)
1 files changed, 1 insertions, 4 deletions
diff --git a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
index 02a39307e74..ccbf7c80f2a 100644
--- a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+++ b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
@@ -334,13 +334,10 @@ bool SIFoldOperands::runOnMachineFunction(MachineFunction &MF) {
           !MRI.hasOneUse(MI.getOperand(0).getReg()))
         continue;
 
-      // FIXME: Fold operands with subregs.
       if (OpToFold.isReg() &&
-          (!TargetRegisterInfo::isVirtualRegister(OpToFold.getReg()) ||
-           OpToFold.getSubReg()))
+          !TargetRegisterInfo::isVirtualRegister(OpToFold.getReg()))
         continue;
 
-
       // We need mutate the operands of new mov instructions to add implicit
       // uses of EXEC, but adding them invalidates the use_iterator, so defer
       // this.
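
The effect of the hunk above on the skip condition can be sketched in isolation. This is a simplified model and not the LLVM code: `Operand` is a hypothetical stand-in for `MachineOperand`, and the virtual-register check is modeled as a test of the top bit of the register number, which is an assumption of this sketch. `skippedBefore` and `skippedAfter` are illustrative names for the condition that leads to `continue` before and after the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a machine operand; not the LLVM class.
struct Operand {
  bool IsReg;       // true if the operand is a register, false for e.g. an immediate
  uint32_t Reg;     // register number
  uint32_t SubReg;  // sub-register index, 0 meaning "no sub-register"
};

// Assumption of this sketch: virtual registers have the top bit set.
constexpr bool isVirtualRegister(uint32_t Reg) {
  return (Reg & 0x80000000u) != 0;
}

// Old condition: skip physical registers and any operand with a sub-register.
constexpr bool skippedBefore(const Operand &Op) {
  return Op.IsReg && (!isVirtualRegister(Op.Reg) || Op.SubReg != 0);
}

// New condition: skip only physical registers; sub-register operands
// of virtual registers are now candidates for folding.
constexpr bool skippedAfter(const Operand &Op) {
  return Op.IsReg && !isVirtualRegister(Op.Reg);
}
```

Under this model, a virtual register operand carrying a sub-register index is the only case whose outcome changes: it was skipped before the patch and is considered for folding afterwards. Physical registers are still skipped, and non-register operands are unaffected by this check.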