author | Nicolai Haehnle <nhaehnle@gmail.com> | 2016-10-24 14:56:02 +0000 |
---|---|---|
committer | Nicolai Haehnle <nhaehnle@gmail.com> | 2016-10-24 14:56:02 +0000 |
commit | a785209bc2fb8b779d1aab3498e9641943306b17 (patch) | |
tree | 9ddf73deae332ea179a2291542337c1c44fc3239 /llvm/lib/Target/AMDGPU/SIInstrInfo.cpp | |
parent | b38d3411064bb59641714fde981564cf51217308 (diff) | |
AMDGPU: Fix Two Address problems with v_movreld
Summary:
The v_movreld machine instruction is used with three operands that are
in a sense tied to each other (the explicit VGPR_32 def and the implicit
VGPR_NN def and use). There is no way to express that using the currently
available operand bits, and indeed there are cases where the Two Address
instructions pass does the wrong thing.
This patch introduces a new set of pseudo instructions that are identical
in intended semantics to v_movreld, but have only two tied operands.
Having to add a new set of pseudo instructions is admittedly annoying, but
it's a fairly straightforward and solid approach. The only alternative I
see is to try to teach the Two Address instructions pass about Three Address
instructions, and I'm afraid that's trickier and is going to end up more
fragile.
Note that v_movrels does not suffer from this problem, and so this patch
does not touch it.
This fixes several GL45-CTS.shaders.indexing.* tests.
Reviewers: tstellarAMD, arsenm
Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D25633
llvm-svn: 284980
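
The tying limitation described in the summary can be pictured with a toy model. The sketch below is not LLVM's `MachineOperand`; `ToyOperand`, `tryTie`, `movreldShape`, and `pseudoShape` are hypothetical names introduced only for illustration. It assumes each operand can record at most one tie partner, so a def/use pair is representable but a third operand cannot join an existing pair.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a machine operand; NOT LLVM's MachineOperand.
struct ToyOperand {
  bool isDef;  // true for a definition, false for a use
  int tiedTo;  // index of the single tie partner, or -1 if untied
  explicit ToyOperand(bool def) : isDef(def), tiedTo(-1) {}
};

// Hypothetical helper: tie one def to one use. Returns false when the
// request cannot be recorded because either side already has a partner.
bool tryTie(std::vector<ToyOperand> &ops, int defIdx, int useIdx) {
  ToyOperand &def = ops[defIdx];
  ToyOperand &use = ops[useIdx];
  if (!def.isDef || use.isDef)
    return false;  // a tie always pairs a def with a use
  if (def.tiedTo != -1 || use.tiedTo != -1)
    return false;  // assumption: only one partner per operand
  def.tiedTo = useIdx;
  use.tiedTo = defIdx;
  return true;
}

// Shape of v_movreld as described in the summary:
// [0] explicit VGPR_32 def, [1] implicit vector def, [2] implicit vector use.
std::vector<ToyOperand> movreldShape() {
  return {ToyOperand(true), ToyOperand(true), ToyOperand(false)};
}

// Shape of the new pseudo (immediate offset omitted):
// [0] vector def, [1] vector use, [2] value use. Only [0] and [1] are tied.
std::vector<ToyOperand> pseudoShape() {
  return {ToyOperand(true), ToyOperand(false), ToyOperand(false)};
}
```

In this model, `tryTie(ops, 1, 2)` on the `movreldShape()` operands succeeds, but a follow-up `tryTie(ops, 0, 2)` is rejected because the use already has a partner. On `pseudoShape()` the single `tryTie(ops, 0, 1)` is all that is needed, which is the property the new pseudo instructions appear to rely on.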
Diffstat (limited to 'llvm/lib/Target/AMDGPU/SIInstrInfo.cpp')
-rw-r--r-- | llvm/lib/Target/AMDGPU/SIInstrInfo.cpp | 26 |
1 files changed, 26 insertions, 0 deletions
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index dd9742bfbe2..fffdeb4b5e0 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -909,6 +909,32 @@ bool SIInstrInfo::expandPostRAPseudo(MachineInstr &MI) const {
     MI.eraseFromParent();
     break;
   }
+  case AMDGPU::V_MOVRELD_B32_V1:
+  case AMDGPU::V_MOVRELD_B32_V2:
+  case AMDGPU::V_MOVRELD_B32_V4:
+  case AMDGPU::V_MOVRELD_B32_V8:
+  case AMDGPU::V_MOVRELD_B32_V16: {
+    const MCInstrDesc &MovRelDesc = get(AMDGPU::V_MOVRELD_B32_e32);
+    unsigned VecReg = MI.getOperand(0).getReg();
+    bool IsUndef = MI.getOperand(1).isUndef();
+    unsigned SubReg = AMDGPU::sub0 + MI.getOperand(3).getImm();
+    assert(VecReg == MI.getOperand(1).getReg());
+
+    MachineInstr *MovRel =
+        BuildMI(MBB, MI, DL, MovRelDesc)
+            .addReg(RI.getSubReg(VecReg, SubReg), RegState::Undef)
+            .addOperand(MI.getOperand(2))
+            .addReg(VecReg, RegState::ImplicitDefine)
+            .addReg(VecReg, RegState::Implicit | (IsUndef ? RegState::Undef : 0));
+
+    const int ImpDefIdx =
+        MovRelDesc.getNumOperands() + MovRelDesc.getNumImplicitUses();
+    const int ImpUseIdx = ImpDefIdx + 1;
+    MovRel->tieOperands(ImpDefIdx, ImpUseIdx);
+
+    MI.eraseFromParent();
+    break;
+  }
   case AMDGPU::SI_PC_ADD_REL_OFFSET: {
     MachineFunction &MF = *MBB.getParent();
     unsigned Reg = MI.getOperand(0).getReg();
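
The index arithmetic in the hunk above (`ImpDefIdx`, `ImpUseIdx`) can be checked in isolation. The sketch below is a standalone model, not LLVM code: `ToyDesc`, `TiedPair`, and `appendedImplicitIndices` are hypothetical names. It assumes the expanded instruction lays out its operands as explicit operands first, then the implicit uses declared by the descriptor, then the two implicit `VecReg` operands appended by the patch, and that the descriptor declares no implicit defs (the patch's formula only counts implicit uses).

```cpp
#include <cassert>

// Toy descriptor; the two fields stand in for the values returned by
// MCInstrDesc::getNumOperands() and MCInstrDesc::getNumImplicitUses().
struct ToyDesc {
  int numOperands;      // explicit operands
  int numImplicitUses;  // implicit uses declared by the descriptor
};

// Operand indices of the appended implicit def and implicit use.
struct TiedPair {
  int impDefIdx;
  int impUseIdx;
};

// Mirrors the arithmetic in the patch: the appended implicit def comes
// right after the descriptor's own operands, and the appended implicit
// use follows it immediately.
TiedPair appendedImplicitIndices(const ToyDesc &desc) {
  const int impDefIdx = desc.numOperands + desc.numImplicitUses;
  return {impDefIdx, impDefIdx + 1};
}
```

For an illustrative descriptor with two explicit operands and two implicit uses (assumed values, not read from the AMDGPU tables), `appendedImplicitIndices(ToyDesc{2, 2})` yields indices 4 and 5, which are the positions that would then be passed to `tieOperands`.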