path: root/llvm/test/Analysis/DivergenceAnalysis
Commit message | Author | Age | Files | Lines
* AMDGPU: Change DivergenceAnalysis for function arguments | Matt Arsenault | 2017-04-19 | 1 | -1/+26
  Stop assuming all functions are kernels.
  llvm-svn: 300719
* AMDGPU: Add all atomicrmw fields to atomic.inc/dec | Matt Arsenault | 2017-03-30 | 1 | -12/+12
  Add scope, order, isVolatile
  llvm-svn: 299122
* AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel | Matt Arsenault | 2017-03-21 | 4 | -8/+8
  Currently the default C calling convention functions are treated
  the same as compute kernels. Make this explicit so the default
  calling convention can be changed to a non-kernel.
  Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
  on the relevant test directories (and undoing in one place that
  actually wanted a non-kernel).
  llvm-svn: 298444
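The conversion described above can be sketched in Python. This is an illustrative equivalent of the perl one-liner, not the script that was actually run; the function names `mark_kernels` and `convert_directory` and the `*.ll` glob are assumptions made for the example.

```python
from pathlib import Path


def mark_kernels(text: str) -> str:
    """Mimic perl's s/define void/define amdgpu_kernel void/ (no /g flag):
    replace only the first match on each line."""
    lines = text.splitlines(keepends=True)
    return "".join(
        line.replace("define void", "define amdgpu_kernel void", 1)
        for line in lines
    )


def convert_directory(test_dir: str) -> None:
    """Rewrite every .ll file under test_dir in place (hypothetical helper;
    the *.ll pattern is an assumption about which files are tests)."""
    for path in Path(test_dir).rglob("*.ll"):
        original = path.read_text()
        updated = mark_kernels(original)
        if updated != original:
            path.write_text(updated)
```

The rewrite is idempotent: a line that already reads `define amdgpu_kernel void` no longer contains the substring `define void`, so a second run leaves it unchanged.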
* AMDGPU: Remove SI_fs_constant and SI_fs_interp intrinsics | Matt Arsenault | 2017-02-16 | 1 | -22/+0
  Update test uses with expansion in terms of new intrinsics.
  llvm-svn: 295269
* AMDGPU: Fix atomic_inc/atomic_dec + ds_swizzle not being divergent | Matt Arsenault | 2017-01-30 | 2 | -0/+43
  llvm-svn: 293504
* Fix some broken CHECK lines. | Benjamin Kramer | 2017-01-22 | 1 | -1/+1
  The colon is important.
  llvm-svn: 292761
* NVPTX: Remove the legacy ptx intrinsics | Justin Bogner | 2016-07-07 | 1 | -2/+2
  - Rename the ptx.read.* intrinsics to nvvm.read.ptx.sreg.*
    - some but not all of these registers were already accessible
      via the nvvm name.
  - Rename ptx.bar.sync to nvvm.bar.sync, to match nvvm.bar0.
  There's a fair amount of code motion here, but it's all very
  mechanical.
  llvm-svn: 274769
* DivergenceAnalysis: Fix crash with no return blocks | Matt Arsenault | 2016-05-09 | 1 | -0/+30
  The post dominator tree does not have a root node in this case.
  llvm-svn: 268933
* AMDGPU: llvm.SI.fs.constant is a source of divergence | Nicolai Haehnle | 2016-05-02 | 1 | -0/+22
  Summary: This intrinsic is used to get flat-shaded fragment shader
  inputs. Those are uniform across a primitive, but a fragment shader
  wave may process pixels from multiple primitives (as indicated by
  the prim_mask), and so that's where divergence can arise.
  Reviewers: arsenm, tstellarAMD
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D19747
  llvm-svn: 268259
* DivergenceAnalysis: Fix crash with unreachable blocks | Matt Arsenault | 2016-04-29 | 1 | -0/+17
  Unreachable blocks may not be in the dominator tree, so don't crash
  on them.
  llvm-svn: 268001
* [DivergenceAnalysis] Treat PHI with incoming undef as constant | Nicolai Haehnle | 2016-04-14 | 1 | -0/+28
  Summary: If a PHI has an incoming undef, we can pretend that it is
  equal to one non-undef, non-self incoming value.
  This is particularly relevant in combination with the StructurizeCFG
  pass, which introduces PHI nodes with undefs. Previously, this caused
  branch conditions that were uniform before StructurizeCFG to become
  non-uniform afterwards, which confused the SIAnnotateControlFlow pass.
  This fixes a crash when Mesa radeonsi compiles a shader from
  dEQP-GLES3.functional.shaders.switch.switch_in_for_loop_dynamic_vertex
  Reviewers: arsenm, tstellarAMD, jingyue
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D19013
  llvm-svn: 266347
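The rule in the summary above can be illustrated with a small Python model. This is a sketch of the idea only, not the LLVM implementation: values are represented as plain strings, `"undef"` stands for an undef operand, and the function name `phi_constant_value` is hypothetical.

```python
from typing import Optional, Sequence

UNDEF = "undef"  # stand-in for an LLVM undef operand (assumption for this sketch)


def phi_constant_value(phi: str, incoming: Sequence[str]) -> Optional[str]:
    """Return the single value the PHI can be treated as, or None.

    Undef operands and self-references are ignored. If every remaining
    incoming value is the same, the PHI behaves like that value and does
    not need to be considered divergent on its own account.
    """
    unique = None
    for value in incoming:
        if value == UNDEF or value == phi:
            continue  # ignore undef and self-references
        if unique is None:
            unique = value
        elif value != unique:
            return None  # two different real inputs: a genuine merge
    return unique


# A PHI of the kind StructurizeCFG introduces: one real input, one undef.
print(phi_constant_value("%p", ["%cond", UNDEF]))  # %cond
```

Under this model a PHI that merges `%cond` with undef is as uniform as `%cond` itself, which is the property the summary says SIAnnotateControlFlow relies on.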
* AMDGPU: Remove leftover ShaderType attributes in tests | Matt Arsenault | 2016-04-13 | 4 | -31/+29
  llvm-svn: 266155
* AMDGPU: Add a shader calling convention | Nicolai Haehnle | 2016-04-06 | 1 | -1/+1
  This makes it possible to distinguish between mesa shaders and other
  kernels even in the presence of compute shaders.
  Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
  Differential Revision: http://reviews.llvm.org/D18559
  llvm-svn: 265589
* AMDGPU/SI: Add llvm.amdgcn.buffer.atomic.* intrinsics | Nicolai Haehnle | 2016-03-18 | 1 | -0/+104
  Summary: These intrinsics expose the BUFFER_ATOMIC_* instructions and
  will be used by Mesa to implement atomics with buffer semantics.
  The intrinsic interface matches that of buffer.load.format and
  buffer.store.format, except that the GLC bit is not exposed (it is
  automatically deduced based on whether the return value is used).
  The change of hasSideEffects is required for TableGen to accept the
  pattern that matches the intrinsic.
  Reviewers: tstellarAMD, arsenm
  Subscribers: arsenm, rivanvx, llvm-commits
  Differential Revision: http://reviews.llvm.org/D18151
  llvm-svn: 263791
* AMDGPU: mark atomic instructions as sources of divergence | Nicolai Haehnle | 2016-03-17 | 1 | -0/+15
  Summary: As explained by the comment, threads will typically see
  different values returned by atomic instructions even if the
  arguments are equal.
  Reviewers: arsenm, tstellarAMD
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D18156
  llvm-svn: 263719
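Why identical arguments still give different results can be seen in a toy Python model. This is a sketch under stated assumptions: the lanes of a wave are simulated sequentially in lane order, whereas real hardware may serialize them in any order, and `atomic_add_wave` is a hypothetical name.

```python
from typing import List


def atomic_add_wave(memory: List[int], addr: int, value: int,
                    num_lanes: int) -> List[int]:
    """Simulate num_lanes lanes each executing an atomic add with the
    same address and the same operand. Returns the old value observed
    by each lane, which is what an atomicrmw instruction yields."""
    observed = []
    for _ in range(num_lanes):
        old = memory[addr]          # read the current value
        memory[addr] = old + value  # write back the sum, as one atomic step
        observed.append(old)        # the lane sees the pre-update value
    return observed


memory = [0]
print(atomic_add_wave(memory, addr=0, value=1, num_lanes=4))  # [0, 1, 2, 3]
print(memory[0])                                              # 4
```

All four lanes passed uniform arguments, yet each received a distinct return value, so anything computed from that return value has to be treated as divergent.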
* AMDGPU: mark llvm.amdgcn.image.atomic.* as a source of divergence | Nicolai Haehnle | 2016-03-14 | 1 | -0/+122
  Summary: When multiple threads perform an atomic op with the same
  arguments, they will usually see different return values.
  Reviewers: arsenm, tstellarAMD
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D18101
  llvm-svn: 263440
* AMDGPU: Fix not handling new workitem intrinsics in DivergenceAnalysis | Matt Arsenault | 2016-02-11 | 1 | -0/+45
  llvm-svn: 260491
* AMDGPU/SI: Fix implementation of isSourceOfDivergence() for graphics shaders | Tom Stellard | 2015-12-19 | 2 | -0/+18
  Summary: The analysis of shader inputs was completely wrong. We were
  passing the wrong index to AttributeSet::hasAttribute() and the logic
  for which inputs were in SGPRs was wrong too.
  Reviewers: arsenm
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D15608
  llvm-svn: 256082
* [DivergenceAnalysis] fix a bug in computing influence regions | Jingyue Wu | 2015-12-18 | 1 | -1/+22
  Fixes PR25864
  llvm-svn: 256036
* Divergence analysis for GPU programs | Jingyue Wu | 2015-04-10 | 2 | -0/+200
  Summary: Some optimizations such as jump threading and loop
  unswitching can negatively affect performance when applied to
  divergent branches. The divergence analysis added in this patch
  conservatively estimates which branches in a GPU program can diverge.
  This information can then help LLVM to run certain optimizations
  selectively.
  Test Plan: test/Analysis/DivergenceAnalysis/NVPTX/diverge.ll
  Reviewers: resistor, hfinkel, eliben, meheff, jholewinski
  Subscribers: broune, bjarke.roune, madhur13490, tstellarAMD, dberlin, echristo, jholewinski, llvm-commits
  Differential Revision: http://reviews.llvm.org/D8576
  llvm-svn: 234567
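The conservative estimate described in the summary can be approximated with a worklist over def-use edges. The Python sketch below covers only the data-dependence half of such an analysis; it is not the algorithm from the patch, which also has to account for values that become divergent through divergent control flow. The graph encoding and the name `divergent_values` are assumptions made for the example.

```python
from collections import deque
from typing import Dict, Iterable, Set


def divergent_values(users: Dict[str, Iterable[str]],
                     sources: Iterable[str]) -> Set[str]:
    """Propagate divergence along def-use edges until a fixed point.

    users maps each value to the values that use it as an operand.
    sources are values known to differ between threads, such as a
    thread-id read or the result of an atomic operation.
    """
    divergent = set(sources)
    worklist = deque(divergent)
    while worklist:
        value = worklist.popleft()
        for user in users.get(value, ()):
            if user not in divergent:  # visit each value at most once
                divergent.add(user)
                worklist.append(user)
    return divergent


# %cmp depends on the thread id, so the branch on %cmp may diverge;
# %mul depends only on the uniform argument %n and stays uniform.
users = {"%tid": ["%cmp"], "%cmp": ["br %cmp"], "%n": ["%mul"]}
result = divergent_values(users, ["%tid"])
print("br %cmp" in result, "%mul" in result)  # True False
```

A pass such as jump threading could consult a result like this and skip branches whose condition is in the divergent set, which is the selective use the summary has in mind.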