summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/AMDGPU/branch-condition-and.ll
Commit message (Collapse)AuthorAgeFilesLines
* Revert "[AMDGPU] Invert the handling of skip insertion."Nicolai Hähnle2020-02-031-2/+3
| | | | | | | | | This reverts commit 0dc6c249bffac9f23a605ce4e42a84341da3ddbd. The commit is reported to cause a regression in piglit/bin/glsl-vs-loop for Mesa. (cherry picked from commit a80291ce10ba9667352adcc895f9668144f5f616)
* [AMDGPU] Invert the handling of skip insertion.cdevadas2020-01-151-3/+2
| | | | | | | | | | | | | | | The current implementation of skip insertion (SIInsertSkip) makes it a mandatory pass required for correctness. Initially, the idea was to have an optional pass. This patch inserts the s_cbranch_execz upfront during SILowerControlFlow to skip over the sections of code when no lanes are active. Later, SIRemoveShortExecBranches removes the skips for short branches, unless there is a sideeffect and the skip branch is really necessary. This new pass will replace the handling of skip insertion in the existing SIInsertSkip Pass. Differential revision: https://reviews.llvm.org/D68092
* [AMDGPU] Eliminate no effect instructions before s_endpgmStanislav Mekhanoshin2017-08-161-1/+0
| | | | | | Differential Revision: https://reviews.llvm.org/D36585 llvm-svn: 310987
* [AMDGPU] Optimize SI_IF lowering for simple if regionsStanislav Mekhanoshin2017-07-261-1/+0
| | | | | | | | | | | | | Currently SI_IF results in a s_and_saveexec_b64 followed by s_xor_b64. The xor is used to extract only the changed bits. In case of a simple if region where the only use of that value is in the SI_END_CF to restore the old exec mask, we can omit the xor and perform an or of the exec mask with the original exec value saved by the s_and_saveexec_b64. Differential Revision: https://reviews.llvm.org/D35861 llvm-svn: 309185
* [AMDGPU] Turn on the new waitcnt insertion pass. Adjust tests.Mark Searles2017-06-021-2/+1
| | | | | | | | | -enable-si-insert-waitcnts=1 becomes the default -enable-si-insert-waitcnts=0 to use old pass Differential Revision: https://reviews.llvm.org/D33730 llvm-svn: 304551
* AMDGPU: Unify divergent function exits.Matt Arsenault2017-03-241-6/+11
| | | | | | | | | | StructurizeCFG can't handle cases with multiple returns creating regions with multiple exits. Create a copy of UnifyFunctionExitNodes that only unifies exit nodes that skips exit nodes with uniform branch sources. llvm-svn: 298729
* Enable FeatureFlatForGlobal on Volcanic IslandsMatt Arsenault2017-01-241-1/+1
| | | | | | | | | | | This switches to the workaround that HSA defaults to for the mesa path. This should be applied to the 4.0 branch. Patch by Vedran Miletić <vedran@miletic.net> llvm-svn: 292982
* [AMDGPU] Fix multiple vreg definitions in si-lower-control-flowStanislav Mekhanoshin2016-11-221-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D26939 llvm-svn: 287608
* AMDGPU: Fix use-after-free in SIOptimizeExecMaskingNicolai Haehnle2016-10-071-0/+39
Summary: There was a bug with sequences like s_mov_b64 s[0:1], exec s_and_b64 s[2:3]<def>, s[0:1], s[2:3]<kill> ... s_mov_b64_term exec, s[2:3] because s[2:3] was defined and used in the same instruction, ending up with SaveExecInst inside OtherUseInsts. Note that the test case also exposes an unrelated bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98028 Reviewers: tstellarAMD, arsenm Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25306 llvm-svn: 283528
OpenPOWER on IntegriCloud