summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/AMDGPU/llvm.amdgcn.update.dpp.ll
Commit message (Collapse)AuthorAgeFilesLines
* [AMDGPU] Support mov dpp with 64 bit operandsStanislav Mekhanoshin2019-10-151-6/+59
| | | | | | | | | | We define mov/update dpp intrinsics as overloaded but do not support i64, which is a practically useful type. Fix the selection and lowering. Differential Revision: https://reviews.llvm.org/D68673 llvm-svn: 374910
* [AMDGPU] gfx1010 dpp16 and dpp8Stanislav Mekhanoshin2019-06-121-16/+27
| | | | | | Differential Revision: https://reviews.llvm.org/D63203 llvm-svn: 363186
* AMDGPU: Select VOP3 form of addMatt Arsenault2019-05-081-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | The VOP3 form should always be the preferred selection, to be shrunk later. This should only be an optimization issue, but this partially works around a problem from clobbering VCC when SIFixSGPRCopies rewrites an SCC defining operation directly to VCC. 3 of the testcases are regressions from failing to fold the immediate in cases it should. These can be avoided by improving the VCC liveness handling in SIFoldOperands. Simply increasing the threshold to computeRegisterLiveness works, although this is common enough that VCC liveness should probably be tracked throughout the pass. The hack of leaving behind an implicit_def instruction to avoid breaking iterator wastes instruction count, which inhibits finding the VCC def in long chains of adds. Doing this however exposes different, worse looking regressions from poor scheduling behavior. This could probably be avoided around by forcing the shrink of the addc here, but the scheduler should probably be fixed. The r600 add test needs to be split out because it asserts on the arguments in the new test during the calling convention lowering. llvm-svn: 360293
* AMDGPU: Add support for cross address space synchronization scopesKonstantin Zhuravlyov2019-03-251-2/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D59517 llvm-svn: 356946
* [AMDGPU]: Turn on the DPP combiner by defaultValery Pykhtin2018-12-051-2/+2
| | | | | | Differential revision: https://reviews.llvm.org/D55314 llvm-svn: 348371
* Relax fast register allocator related test cases; NFCMatthias Braun2018-10-291-1/+1
| | | | | | | | | | | | | - Relex hard coded registers and stack frame sizes - Some test cleanups - Change phi-dbg.ll to match on mir output after phi elimination instead of going through the whole codegen pipeline. This is in preparation for https://reviews.llvm.org/D52010 I'm committing all the test changes upfront that work before and after independently. llvm-svn: 345532
* [AMDGPU] Divergence driven instruction selection. Part 1.Alexander Timofeev2018-09-211-2/+1
| | | | | | | | | | | | | Summary: This change is the first part of the AMDGPU target description change. The aim of it is the effective splitting the vector and scalar flows at the selection stage. Selection uses predicate functions based on the framework implemented earlier - https://reviews.llvm.org/D35267 Differential revision: https://reviews.llvm.org/D52019 Reviewers: rampitec llvm-svn: 342719
* run post-RA hazard recognizer pass lateMark Searles2018-07-161-1/+30
| | | | | | | | | | | | | Memory legalizer, waitcnt, and shrink passes can perturb the instructions, which means that the post-RA hazard recognizer pass should run after them. Otherwise, one of those passes may invalidate the work done by the hazard recognizer. Note that this has adverse side-effect that any consecutive S_NOP 0's, emitted by the hazard recognizer, will not be shrunk into a single S_NOP <N>. This should be addressed in a follow-on patch. Differential Revision: https://reviews.llvm.org/D49288 llvm-svn: 337154
* [AMDGPU] Add llvm.amdgpu.update.dpp intrinsicConnor Abbott2017-08-081-0/+17
Summary: Now that we've made all the necessary backend changes, we can add a new intrinsic which exposes the new capabilities to IR producers. Since llvm.amdgpu.update.dpp is a strict superset of llvm.amdgpu.mov.dpp, we should deprecate the former. We also add tests for all the functionality that was added in previous changes, now that we can access it via an IR construct. Reviewers: tstellar, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D34718 llvm-svn: 310399
OpenPOWER on IntegriCloud