author Tom Stellard <thomas.stellard@amd.com> 2016-04-12 18:40:43 +0000
committer Tom Stellard <thomas.stellard@amd.com> 2016-04-12 18:40:43 +0000
commit ab1d3a9d505ab01658d0a1e7adf06fc7415fc64c
tree 2501150a557b367ed3b6d47dd2ec60fdda54aaad /llvm/lib
parent 3b08238f7878988a8e2737cdab42fcc334c8547f
AMDGPU/SI: Insert wait states required after v_readfirstlane on SI
Summary:
We will be able to handle this case much better once the hazard recognizer
is finished, but this conservative implementation fixes a hang with the
piglit test:
spec/arb_arrays_of_arrays/execution/sampler/fs-nested-struct-arrays-nonconst-nested-arra

Reviewers: arsenm, nhaehnle

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18988

llvm-svn: 266105
Diffstat (limited to 'llvm/lib')
-rw-r--r-- llvm/lib/Target/AMDGPU/SIInsertWaits.cpp 6
1 files changed, 6 insertions, 0 deletions
diff --git a/llvm/lib/Target/AMDGPU/SIInsertWaits.cpp b/llvm/lib/Target/AMDGPU/SIInsertWaits.cpp
index f250782de58..bf0d6a74336 100644
--- a/llvm/lib/Target/AMDGPU/SIInsertWaits.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInsertWaits.cpp
@@ -601,6 +601,12 @@ bool SIInsertWaits::runOnMachineFunction(MachineFunction &MF) {
           insertDPPWaitStates(I);
       }
 
+      // Insert required wait states for SMRD reading an SGPR written by a VALU
+      // instruction.
+      if (ST.getGeneration() <= AMDGPUSubtarget::SOUTHERN_ISLANDS &&
+          I->getOpcode() == AMDGPU::V_READFIRSTLANE_B32)
+        TII->insertWaitStates(MBB, std::next(I), 4);
+
       // Wait for everything before a barrier.
       if (I->getOpcode() == AMDGPU::S_BARRIER)
         Changes |= insertWait(MBB, I, LastIssued);
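
The hunk above pads every V_READFIRSTLANE_B32 with 4 wait states on Southern Islands or older, whether or not an SMRD actually reads the written SGPR afterwards; that is what makes it conservative. The following is a minimal standalone sketch of that behaviour, not LLVM code. `Generation`, `Inst`, `Block`, `padReadFirstLane`, and the placeholder `"WAIT"` entry are all hypothetical stand-ins, and the toy `insertWaitStates` only records the requested count rather than emitting real instructions.

```cpp
#include <iterator>
#include <list>
#include <string>

// Hypothetical stand-ins for the LLVM types; this is not the real API.
enum class Generation { SouthernIslands, SeaIslands, VolcanicIslands };

struct Inst {
  std::string Opcode;
  int WaitStates; // only meaningful for the placeholder "WAIT" entry
};

using Block = std::list<Inst>;

// Toy version of the helper: record `Count` wait states before `Pos`.
void insertWaitStates(Block &MBB, Block::iterator Pos, int Count) {
  MBB.insert(Pos, Inst{"WAIT", Count});
}

// Pad every V_READFIRSTLANE_B32 on SI or older, without checking whether
// a later instruction really reads the SGPR it wrote.
bool padReadFirstLane(Block &MBB, Generation Gen) {
  if (Gen > Generation::SouthernIslands)
    return false;

  bool Changed = false;
  for (auto I = MBB.begin(); I != MBB.end(); ++I) {
    if (I->Opcode == "V_READFIRSTLANE_B32") {
      // std::list insertion does not invalidate I, so the loop can continue.
      insertWaitStates(MBB, std::next(I), 4);
      Changed = true;
    }
  }
  return Changed;
}
```

Running `padReadFirstLane` on a block holding V_READFIRSTLANE_B32 followed by an SMRD load places the placeholder entry between the two on Southern Islands and leaves the block unchanged on newer generations. A hazard recognizer, as the summary anticipates, could instead skip the padding when no dependent SMRD follows within the hazard window.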