summaryrefslogtreecommitdiffstats
path: root/llvm/test/Analysis/DivergenceAnalysis
Commit message (Collapse)AuthorAgeFilesLines
* [SDA] Don't stop divergence propagation at the IPD.Jay Foad2019-09-181-0/+111
| | | | | | | | | | | | | | | | | | | Summary: This fixes B42473 and B42706. This patch makes the SDA propagate branch divergence until the end of the RPO traversal. Before, the SyncDependenceAnalysis propagated divergence only until the IPD in rpo order. RPO is incompatible with post dominance in the presence of loops. This made the SDA crash because blocks were missed in the propagation. Reviewers: foad, nhaehnle Reviewed By: foad Subscribers: jvesely, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65274 llvm-svn: 372223
* [AMDGPU] ImmArg and SourceOfDivergence for permlane/dppStanislav Mekhanoshin2019-06-131-0/+40
| | | | | | | | | Added missing ImmArg and SourceOfDivergence to the crosslane intrinsics. Differential Revision: https://reviews.llvm.org/D63216 llvm-svn: 363276
* [SDA] Bug fix: Use IPD outside the loop as divergence boundNicolai Haehnle2019-04-181-0/+37
| | | | | | | | | | | | | | | | | | Summary: The immediate post dominator of the loop header may be part of the divergent loop. Since this /was/ the divergence propagation bound the SDA would not detect joins of divergent paths outside the loop. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: mmasten, arsenm, jvesely, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59042 llvm-svn: 358681
* [AMDGPU] Flag new raw/struct atomic ops as source of divergenceTim Renouf2019-04-171-0/+200
| | | | | | | Differential Revision: https://reviews.llvm.org/D60731 Change-Id: I821d93dec8b9cdd247b8172d92fb5e15340a9e7d llvm-svn: 358579
* [AMDGPU] Add support for 64 bit buffer atomic artihmetic instructionsRyan Taylor2019-03-061-30/+30
| | | | | | | | | | | | | | | | Summary: This adds support for 64 bit buffer atomic arithmetic instructions but does not include cmpswap as that depends on a fix to the way the register pairs are handled Change-Id: Ib207ea65fb69487ccad5066ea647ae8ddfe2ce61 Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58918 llvm-svn: 355520
* [AMDGPU] Add intrinsics for 16 bit interpolationTim Corringham2019-01-281-0/+25
| | | | | | | | | | | | | | | | | | | Summary: Added the intrinsics llvm.amdgcn.interp.p1.f16() and llvm.amdgcn.interp.p2.f16() and related LIT test. The p1 intrinsic generates code appropriate for both 16 and 32 bank LDS. Reviewers: #amdgpu, dstuttard, arsenm, tpr Reviewed By: #amdgpu, arsenm Subscribers: jvesely, mgorny, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46754 llvm-svn: 352357
* [AMDGPU] Add some missing always-uniform values.Neil Henning2019-01-181-1/+19
| | | | | | | | | This commit adds some missing intrinsics into the isAlwaysUniform list for the AMDGPU backend. Differential Revision: https://reviews.llvm.org/D56845 llvm-svn: 351562
* [DA] GPUDivergenceAnalysis for unstructured GPU kernelsNicolai Haehnle2018-11-3019-0/+1215
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is patch #3 of the new DivergenceAnalysis <https://lists.llvm.org/pipermail/llvm-dev/2018-May/123606.html> The GPUDivergenceAnalysis is intended to eventually supersede the existing LegacyDivergenceAnalysis. The existing LegacyDivergenceAnalysis produces incorrect results on unstructured Control-Flow Graphs: <https://bugs.llvm.org/show_bug.cgi?id=37185> This patch adds the option -use-gpu-divergence-analysis to the LegacyDivergenceAnalysis to turn it into a transparent wrapper for the GPUDivergenceAnalysis. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: jholewinski, jvesely, jfb, llvm-commits, alex-t, sameerds, arsenm, nhaehnle Differential Revision: https://reviews.llvm.org/D53493 llvm-svn: 348048
* Move test/Analysis/DivergenceAnalysis/AMDGPU/loads.llNicolai Haehnle2018-08-301-15/+0
| | | | | | | Should fix failures of buildbots that don't build the AMDGPU backend. Change-Id: I01cb84b4b47803b10c5b21ea0353546239860a51 llvm-svn: 341079
* [NFC] Rename the DivergenceAnalysis to LegacyDivergenceAnalysisNicolai Haehnle2018-08-3012-679/+0
| | | | | | | | | | | | | | | | | | | | Summary: This is patch 1 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433). The purpose of this patch is to free up the name DivergenceAnalysis for the new generic implementation. The generic implementation class will be shared by specialized divergence analysis classes. Patch by: Simon Moll Reviewed By: nhaehnle Subscribers: jvesely, jholewinski, arsenm, nhaehnle, mgorny, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D50434 Change-Id: Ie8146b11be2c50d5312f30e11c7a3036a15b48cb llvm-svn: 341071
* [AMDGPU] Consider loads from flat addrspace to be potentially divergentScott Linder2018-08-211-0/+15
| | | | | | | | | In general we can't assume flat loads are uniform, and cases where we can prove they are should be handled through infer-address-spaces. Differential Revision: https://reviews.llvm.org/D50991 llvm-svn: 340343
* DivergenceAnalysis: added debug outputTim Renouf2018-07-131-1/+4
| | | | | | | | | | | | | | | | | | Summary: This commit does two things: 1. modified the existing DivergenceAnalysis::dump() so it dumps the whole function with added DIVERGENT: annotations; 2. added code to do that dump if the appropriate -debug-only option is on. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47700 Change-Id: Id97b605aab1fc6f5a11a20c58a99bbe8c565bf83 llvm-svn: 336998
* AMDGPU: Convert test cases to the dimension-aware intrinsicsNicolai Haehnle2018-06-211-39/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Also explicitly port over some tests in llvm.amdgcn.image.* that were missing. Some tests are removed because they no longer apply (i.e. explicitly testing building an address vector via insertelement). This is in preparation for the eventual removal of the old-style intrinsics. Some additional notes: - constant-address-space-32bit.ll: change some GCN-NEXT to GCN because the instruction schedule was subtly altered - insert_vector_elt.ll: the old test didn't actually test anything, because %tmp1 was not used; remove the load, because it doesn't work (Because of the amdgpu_ps calling convention? In any case, it's orthogonal to what the test claims to be testing.) Change-Id: Idfa99b6512ad139e755e82b8b89548ab08f0afcf Reviewers: arsenm, rampitec Subscribers: MatzeB, qcolombet, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D48018 llvm-svn: 335229
* AMDGPU: Dimension-aware image intrinsicsNicolai Haehnle2018-04-041-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: These new image intrinsics contain the texture type as part of their name and have each component of the address/coordinate as individual parameters. This is a preparatory step for implementing the A16 feature, where coordinates are passed as half-floats or -ints, but the Z compare value and texel offsets are still full dwords, making it difficult or impossible to distinguish between A16 on or off in the old-style intrinsics. Additionally, these intrinsics pass the 'texfailpolicy' and 'cachectrl' as i32 bit fields to reduce operand clutter and allow for future extensibility. v2: - gather4 supports 2darray images - fix a bug with 1D images on SI Change-Id: I099f309e0a394082a5901ea196c3967afb867f04 Reviewers: arsenm, rampitec, b-sumner Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D44939 llvm-svn: 329166
* AMDGPU: Change DivergenceAnalysis for function argumentsMatt Arsenault2017-04-191-1/+26
| | | | | | Stop assuming all functions are kernels. llvm-svn: 300719
* AMDGPU: Add all atomicrmw fields to atomic.inc/decMatt Arsenault2017-03-301-12/+12
| | | | | | Add scope, order, isVolatile llvm-svn: 299122
* AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernelMatt Arsenault2017-03-214-8/+8
| | | | | | | | | | | | Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444
* AMDGPU: Remove SI_fs_constant and SI_fs_interp intrinsicsMatt Arsenault2017-02-161-22/+0
| | | | | | Update test uses with expansion in terms of new intrinsics. llvm-svn: 295269
* AMDGPU: Fix atomic_inc/atomic_dec + ds_swizzle not being divergentMatt Arsenault2017-01-302-0/+43
| | | | llvm-svn: 293504
* Fix some broken CHECK lines.Benjamin Kramer2017-01-221-1/+1
| | | | | | The colon is important. llvm-svn: 292761
* NVPTX: Remove the legacy ptx intrinsicsJustin Bogner2016-07-071-2/+2
| | | | | | | | | | | | - Rename the ptx.read.* intrinsics to nvvm.read.ptx.sreg.* - some but not all of these registers were already accessible via the nvvm name. - Rename ptx.bar.sync nvvm.bar.sync, to match nvvm.bar0. There's a fair amount of code motion here, but it's all very mechanical. llvm-svn: 274769
* DivergenceAnalysis: Fix crash with no return blocksMatt Arsenault2016-05-091-0/+30
| | | | | | The post dominator tree does not have a root node in this case. llvm-svn: 268933
* AMDGPU: llvm.SI.fs.constant is a source of divergenceNicolai Haehnle2016-05-021-0/+22
| | | | | | | | | | | | | | | | Summary: This intrinsic is used to get flat-shaded fragment shader inputs. Those are uniform across a primitive, but a fragment shader wave may process pixels from multiple primitives (as indicated by the prim_mask), and so that's where divergence can arise. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19747 llvm-svn: 268259
* DivergenceAnalysis: Fix crash with unreachable blocksMatt Arsenault2016-04-291-0/+17
| | | | | | | Unreachable blocks may not be in the dominator tree, so don't crash on them. llvm-svn: 268001
* [DivergenceAnalysis] Treat PHI with incoming undef as constantNicolai Haehnle2016-04-141-0/+28
| | | | | | | | | | | | | | | | | | | | | | | Summary: If a PHI has an incoming undef, we can pretend that it is equal to one non-undef, non-self incoming value. This is particularly relevant in combination with the StructurizeCFG pass, which introduces PHI nodes with undefs. Previously, this lead to branch conditions that were uniform before StructurizeCFG to become non-uniform afterwards, which confused the SIAnnotateControlFlow pass. This fixes a crash when Mesa radeonsi compiles a shader from dEQP-GLES3.functional.shaders.switch.switch_in_for_loop_dynamic_vertex Reviewers: arsenm, tstellarAMD, jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19013 llvm-svn: 266347
* AMDGPU: Remove leftover ShaderType attributes in testsMatt Arsenault2016-04-134-31/+29
| | | | llvm-svn: 266155
* AMDGPU: Add a shader calling conventionNicolai Haehnle2016-04-061-1/+1
| | | | | | | | | | | This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders. Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Differential Revision: http://reviews.llvm.org/D18559 llvm-svn: 265589
* AMDGPU/SI: Add llvm.amdgcn.buffer.atomic.* intrinsicsNicolai Haehnle2016-03-181-0/+104
| | | | | | | | | | | | | | | | | | | | Summary: These intrinsics expose the BUFFER_ATOMIC_* instructions and will be used by Mesa to implement atomics with buffer semantics. The intrinsic interface matches that of buffer.load.format and buffer.store.format, except that the GLC bit is not exposed (it is automatically deduced based on whether the return value is used). The change of hasSideEffects is required for TableGen to accept the pattern that matches the intrinsic. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, rivanvx, llvm-commits Differential Revision: http://reviews.llvm.org/D18151 llvm-svn: 263791
* AMDGPU: mark atomic instructions as sources of divergenceNicolai Haehnle2016-03-171-0/+15
| | | | | | | | | | | | | | Summary: As explained by the comment, threads will typically see different values returned by atomic instructions even if the arguments are equal. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18156 llvm-svn: 263719
* AMDGPU: mark llvm.amdgcn.image.atomic.* as a source of divergenceNicolai Haehnle2016-03-141-0/+122
| | | | | | | | | | | | | | Summary: When multiple threads perform an atomic op with the same arguments, they will usually see different return values. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18101 llvm-svn: 263440
* AMDGPU: Fix not handling new workitem intrinsics in DivergenceAnalysisMatt Arsenault2016-02-111-0/+45
| | | | llvm-svn: 260491
* AMDGPU/SI: Fix implemenation of isSourceOfDivergence() for graphics shadersTom Stellard2015-12-192-0/+18
| | | | | | | | | | | | | | | Summary: The analysis of shader inputs was completely wrong. We were passing the wrong index to AttributeSet::hasAttribute() and the logic for which inputs where in SGPRs was wrong too. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15608 llvm-svn: 256082
* [DivergenceAnalysis] fix a bug in computing influence regionsJingyue Wu2015-12-181-1/+22
| | | | | | Fixes PR25864 llvm-svn: 256036
* Divergence analysis for GPU programsJingyue Wu2015-04-102-0/+200
Summary: Some optimizations such as jump threading and loop unswitching can negatively affect performance when applied to divergent branches. The divergence analysis added in this patch conservatively estimates which branches in a GPU program can diverge. This information can then help LLVM to run certain optimizations selectively. Test Plan: test/Analysis/DivergenceAnalysis/NVPTX/diverge.ll Reviewers: resistor, hfinkel, eliben, meheff, jholewinski Subscribers: broune, bjarke.roune, madhur13490, tstellarAMD, dberlin, echristo, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D8576 llvm-svn: 234567
OpenPOWER on IntegriCloud