path: root/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
Commit message | Author | Age | Files | Lines
* [AMDGPU] Revert "[AMDGPU] Add options for waitcnt pass debugging; add instr count in debug output." | Mark Searles | 2017-12-07 | 1 | -62/+8
  Patch caused a buildbot failure:
  http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/15733/steps/build_Lld/logs/stdio

  lib/Target/AMDGPU/SIInsertWaitcnts.cpp:396:11: error: private field 'InstCnt' is not used [-Werror,-Wunused-private-field]
    int32_t InstCnt = 0;
            ^
  1 error generated.

  This reverts commit 71627f79010aafe74fdcba901bba28dd7caa0869.
  llvm-svn: 320086
* [AMDGPU] Add options for waitcnt pass debugging; add instr count in debug output. | Mark Searles | 2017-12-07 | 1 | -8/+62
  -amdgpu-waitcnt-forcezero={1|0}  Force all waitcnt instrs to be emitted as s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
  -amdgpu-waitcnt-forceexp=<n>     Force emit a s_waitcnt expcnt(0) before the first <n> instrs
  -amdgpu-waitcnt-forcelgkm=<n>    Force emit a s_waitcnt lgkmcnt(0) before the first <n> instrs
  -amdgpu-waitcnt-forcevm=<n>      Force emit a s_waitcnt vmcnt(0) before the first <n> instrs

  Differential Revision: https://reviews.llvm.org/D40091
  llvm-svn: 320084
* AMDGPU: fix missing s_waitcnt | Tim Corringham | 2017-12-04 | 1 | -4/+4
  Summary:
  The pass that inserts s_waitcnt instructions where needed propagated info used to track dependencies for each block by iterating over the predecessor blocks. The iteration was terminated when a predecessor that had not yet been processed was encountered. Any info in blocks later in the list was therefore not processed, leading to the possibility of a required s_waitcnt not being inserted.

  The fix is simply to change the "break" to "continue" for the relevant loops, so that all visited blocks are processed. This is likely what was intended when the code was written.

  There is no test case provided for this fix because:
  1) the only example that reproduces this is large and resistant to being reduced
  2) the change is trivial

  Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
  Differential Revision: https://reviews.llvm.org/D40544
  llvm-svn: 319651
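  Illustration (not part of the commit): a minimal C++ sketch of the break-versus-continue difference described above. `BlockInfo`, `PendingCount`, `mergeWithBreak` and `mergeWithContinue` are hypothetical names invented for this sketch, and the "take the maximum" merge is an assumption chosen for simplicity; the real pass tracks considerably more state per block.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for per-block state; not the pass's real data structure.
struct BlockInfo {
  bool Visited;      // has this block already been processed?
  int PendingCount;  // illustrative "outstanding operations" score
};

// Pre-fix shape: the loop stops at the first unvisited predecessor, so any
// visited predecessors that come later in the list are never merged.
int mergeWithBreak(const std::vector<BlockInfo> &Blocks,
                   const std::vector<std::size_t> &Preds) {
  int Max = 0;
  for (std::size_t P : Preds) {
    assert(P < Blocks.size() && "predecessor index out of range");
    if (!Blocks[P].Visited)
      break;  // abandons the rest of the predecessor list
    Max = std::max(Max, Blocks[P].PendingCount);
  }
  return Max;
}

// Post-fix shape: an unvisited predecessor is skipped, and the loop goes on
// to merge every visited predecessor.
int mergeWithContinue(const std::vector<BlockInfo> &Blocks,
                      const std::vector<std::size_t> &Preds) {
  int Max = 0;
  for (std::size_t P : Preds) {
    assert(P < Blocks.size() && "predecessor index out of range");
    if (!Blocks[P].Visited)
      continue;  // skip only this predecessor
    Max = std::max(Max, Blocks[P].PendingCount);
  }
  return Max;
}
```

  With a predecessor list of the form {visited, unvisited, visited}, the `break` version never looks at the third block, so its pending count is dropped from the merge; the `continue` version picks it up.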
* AMDGPU: Move hazard avoidance out of waitcnt pass. | Matt Arsenault | 2017-11-17 | 1 | -54/+0
  This is mostly moving VMEM clause breaking into the hazard recognizer. Also move another hazard currently handled in the waitcnt pass.

  Also stops breaking clauses unless xnack is enabled.

  llvm-svn: 318557
* Fix warnings discovered by rL317076. [-Wunused-private-field] | NAKAMURA Takumi | 2017-11-01 | 1 | -2/+0
  llvm-svn: 317091
* [AMDGPU] NFC: test commit | Evgeny Mankov | 2017-08-16 | 1 | -10/+10
  llvm-svn: 311019
* [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). | Eugene Zelenko | 2017-08-08 | 1 | -53/+68
  llvm-svn: 310328
* AMDGPU: Partially fix improper reliance on memoperands | Matt Arsenault | 2017-07-21 | 1 | -17/+26
  There are 2 more places doing this, but I'm not sure what they are doing, and they don't make any sense to me.

  llvm-svn: 308770
* AMDGPU: Don't track lgkmcnt for global_/scratch_ instructions | Matt Arsenault | 2017-07-21 | 1 | -4/+7
  llvm-svn: 308766
* [AMDGPU] Fix uninit'ed var (RevisitLoop) | Mark Searles | 2017-06-05 | 1 | -1/+1
  Differential Revision: https://reviews.llvm.org/D33907
  llvm-svn: 304729
* AMDGPU: Make auto waitcnt before barrier a feature | Konstantin Zhuravlyov | 2017-06-02 | 1 | -1/+2
  Differential Revision: https://reviews.llvm.org/D33793
  llvm-svn: 304571
* [AMDGPU] Fix bugs in new waitcnt pass. Add test. | Mark Searles | 2017-05-31 | 1 | -4/+22
  - new waitcnt pass remains off by default; -enable-si-insert-waitcnts=1 to enable it
  - fix handling of PERMUTE ops
  - fix insertion of waitcnt instrs at function begin/end (port of analogous code that was added to old waitcnt pass)
  - add new test

  Differential Revision: https://reviews.llvm.org/D33114
  llvm-svn: 304311
* [AMDGPU] In the new waitcnt insertion pass, use getHeader | Kannan Narayanan | 2017-05-05 | 1 | -5/+5
  instead of getTopBlock to find the loop header.

  Differential Revision: https://reviews.llvm.org/D32831
  llvm-svn: 302290
* Move size and alignment information of regclass to TargetRegisterInfo | Krzysztof Parzyszek | 2017-04-24 | 1 | -2/+2
  1. RegisterClass::getSize() is split into two functions:
     - TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const;
     - TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const;
  2. RegisterClass::getAlignment() is replaced by:
     - TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const;

  This will allow making those values depend on subtarget features in the future.

  Differential Revision: https://reviews.llvm.org/D31783
  llvm-svn: 301221
* [AMDGPU] Add a new pass to insert waitcnts. Leave under an option for testing. | Kannan Narayanan | 2017-04-12 | 1 | -0/+1863
  Based on comments in https://reviews.llvm.org/D31161.

  llvm-svn: 300023