path: llvm/test/CodeGen/AMDGPU/global_atomics.ll

Commit history (commit message, author, date, files changed, lines changed):

* AMDGPU: Split flat offsets that don't fit in DAG (Matt Arsenault, 2019-10-20, 1 file changed, -2/+7)

  We handle it this way for some other address spaces. Since r349196, SILoadStoreOptimizer has been trying to do this. This is after SIFoldOperands runs, which can change the addressing patterns. It's simpler to just split this earlier.

  llvm-svn: 375366

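  As a rough illustration (hypothetical IR, not taken from the test; the function name and offset are made up), this concerns global accesses whose constant offset is too large for the instruction's immediate offset field, which the DAG now splits into a base-address add plus a small folded offset:

    define amdgpu_kernel void @store_huge_offset(i32 addrspace(1)* %out) {
      ; 1048576 elements * 4 bytes is a 4 MiB byte offset, far too large to
      ; fold into the flat/global instruction's immediate offset field.
      %gep = getelementptr i32, i32 addrspace(1)* %out, i64 1048576
      store i32 7, i32 addrspace(1)* %gep
      ret void
    }
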
* Allow FP types for atomicrmw xchg (Matt Arsenault, 2019-01-17, 1 file changed, -0/+11)

  llvm-svn: 351427

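  A hypothetical illustration (not from the test file; names are made up): this change lets IR like the following reach the backend directly, where atomicrmw xchg previously required an integer operand:

    define amdgpu_kernel void @xchg_f32(float addrspace(1)* %ptr, float %val) {
      ; Atomically swap a float value; before this change the value would
      ; typically have been bitcast to i32 around an integer xchg.
      %old = atomicrmw xchg float addrspace(1)* %ptr, float %val seq_cst
      ret void
    }
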
* [AMDGPU] Add an AMDGPU-specific atomic optimizer (Neil Henning, 2018-10-08, 1 file changed, -3/+3)

  This commit adds a new IR-level pass to the AMDGPU backend to perform atomic optimizations. It works by:

  - Running through a function and finding atomicrmw add/sub operations, or uses of the atomic buffer intrinsics for add/sub.
  - If all arguments except the value to be added/subtracted are uniform, recording the value to be optimized.
  - Running through the atomic operations we can optimize and, depending on whether the value is uniform or divergent, using wavefront-wide operations (DPP in the divergent case) to calculate the total amount to be atomically added/subtracted.
  - Letting only a single lane of each wavefront perform the atomic operation, reducing the total number of atomic operations in flight.
  - Lastly, recombining the result from the single lane back to each lane of the wavefront and calculating each lane's individual offset into the final result.

  Differential Revision: https://reviews.llvm.org/D51969

  llvm-svn: 343973

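  A hypothetical sketch (not from the test file; names are made up) of the pattern the pass targets: every active lane adds a uniform value to the same counter, so the pass can sum the contributions across the wavefront, let one lane issue a single atomic, and hand each lane its own offset into the returned old value:

    declare i32 @llvm.amdgcn.workitem.id.x()

    define amdgpu_kernel void @uniform_add(i32 addrspace(1)* %out, i32 addrspace(1)* %counter) {
      ; The added value (1) is uniform across the wavefront, so this is a
      ; candidate for the single-lane rewrite described above.
      %old = atomicrmw add i32 addrspace(1)* %counter, i32 1 seq_cst
      ; Each lane stores the old value it observed into its own output slot.
      %tid = call i32 @llvm.amdgcn.workitem.id.x()
      %gep = getelementptr i32, i32 addrspace(1)* %out, i32 %tid
      store i32 %old, i32 addrspace(1)* %gep
      ret void
    }
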
* AMDGPU: Start selecting global instructions (Matt Arsenault, 2017-07-29, 1 file changed, -201/+383)

  llvm-svn: 309470

* AMDGPU: Implement memory model (Konstantin Zhuravlyov, 2017-07-21, 1 file changed, -8/+8)

  llvm-svn: 308781

* AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel (Matt Arsenault, 2017-03-21, 1 file changed, -98/+98)

  Currently, functions with the default C calling convention are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel convention.

  Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undone in the one place that actually wanted a non-kernel).

  llvm-svn: 298444

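  A hypothetical before/after (the function name is made up) of what that substitution did to each test definition:

    ; Before this change the definition read "define void @atomic_add_i32(...)"
    ; and was still treated as a kernel; the kernel calling convention is now
    ; spelled out explicitly.
    define amdgpu_kernel void @atomic_add_i32(i32 addrspace(1)* %p) {
      %r = atomicrmw add i32 addrspace(1)* %p, i32 1 seq_cst
      ret void
    }
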
* Enable FeatureFlatForGlobal on Volcanic Islands (Matt Arsenault, 2017-01-24, 1 file changed, -1/+1)

  This switches the Mesa path to the workaround that HSA already defaults to. This should be applied to the 4.0 branch.

  Patch by Vedran Miletić <vedran@miletic.net>

  llvm-svn: 292982

* AMDGPU: Fix missing and broken check lines in atomic tests (Matt Arsenault, 2016-06-09, 1 file changed, -288/+222)

  llvm-svn: 272343

* AMDGPU/SI: Assembler: Unify parsing/printing of operands (Nikolay Haustov, 2016-04-29, 1 file changed, -52/+52)

  Summary: The goal is for each operand type to have its own parse function while sharing common code for tracking state, since different instruction types share operand types (e.g. glc/glc_flat, etc.). Introduce parseAMDGPUOperand, which can parse any optional operand. DPP and Clamp/OMod have custom handling for now. Sam also suggested a class hierarchy for operand types instead of a table; this can be done in a separate change.

  Remove parseVOP3OptionalOps, parseDS*OptionalOps, parseFlatOptionalOps, parseMubufOptionalOps, parseDPPOptionalOps. Reduce the number of AsmOperand and MatchClass definitions by using a common base class. Rename AsmMatcher/InstPrinter methods accordingly.

  Print the immediate type when printing a parsed immediate operand. Use 'off' if the offset/index register is unused instead of skipping it, to make it more readable (also agreed with SP3). Update tests.

  Reviewers: tstellarAMD, SamWot, artem.tamazov

  Subscribers: qcolombet, arsenm, llvm-commits

  Differential Revision: http://reviews.llvm.org/D19584

  llvm-svn: 268015

* AMDGPU/SI: Implement atomic load/store for i32 and i64 (Jan Vesely, 2016-04-07, 1 file changed, -0/+178)

  Standard load/store instructions with GLC bit set.

  Reviewers: tstellardAMD, arsenm

  Differential Revision: http://reviews.llvm.org/D18760

  llvm-svn: 265709

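  A hypothetical example (names made up, not the actual added tests) of the IR this adds selection support for:

    define amdgpu_kernel void @atomic_copy_i32(i32 addrspace(1)* %in, i32 addrspace(1)* %out) {
      ; Atomic loads/stores on global pointers select to ordinary
      ; load/store instructions with the GLC bit set.
      %v = load atomic i32, i32 addrspace(1)* %in seq_cst, align 4
      store atomic i32 %v, i32 addrspace(1)* %out seq_cst, align 4
      ret void
    }
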
* AMDGPU: Implement {BUFFER,FLAT}_ATOMIC_CMPSWAP{,_X2} (Tom Stellard, 2016-04-01, 1 file changed, -0/+89)

  Summary: Implement BUFFER_ATOMIC_CMPSWAP{,_X2} instructions on all GCN targets, and FLAT_ATOMIC_CMPSWAP{,_X2} on CI+. The 32-bit instruction variants were tested manually on Kabini and Bonaire. Tests and parts of the code provided by Jan Veselý.

  Patch by: Vedran Miletić

  Reviewers: arsenm, tstellarAMD, nhaehnle

  Subscribers: jvesely, scchan, kanarayan, arsenm

  Differential Revision: http://reviews.llvm.org/D17280

  llvm-svn: 265170

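  A hypothetical example (names made up) of the cmpxchg IR these instructions implement; the _X2 variants cover the equivalent i64 case:

    define amdgpu_kernel void @cas_i32(i32 addrspace(1)* %ptr, i32 addrspace(1)* %out, i32 %cmp, i32 %new) {
      ; cmpxchg returns the old value plus a success flag; the old value is
      ; what the atomic compare-swap instruction returns when GLC is set.
      %pair = cmpxchg i32 addrspace(1)* %ptr, i32 %cmp, i32 %new seq_cst seq_cst
      %old = extractvalue { i32, i1 } %pair, 0
      store i32 %old, i32 addrspace(1)* %out
      ret void
    }
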
* AMDGPU: Set flat_scratch from flat_scratch_init reg (Matt Arsenault, 2016-02-12, 1 file changed, -78/+0)

  This was hardcoded to the static private size, which would miss the offset and any additional size once we someday have dynamic sizing. This also stops always initializing flat_scratch even when it is unused.

  In the future we should stop emitting this unless flat instructions are used to access private memory. For example, this will initialize it almost always on VI, because flat is used for global access.

  llvm-svn: 260658

* AMDGPU: Fix printing trailing whitespace for mubuf atomics (Matt Arsenault, 2015-09-24, 1 file changed, -10/+10)

  llvm-svn: 248472

* AMDGPU/SI: Add VI patterns to select FLAT instructions for global memory ops (Tom Stellard, 2015-07-20, 1 file changed, -80/+200)

  Summary: The MUBUF addr64 bit has been removed on VI, so we must use FLAT instructions when the pointer is stored in VGPRs.

  Reviewers: arsenm

  Subscribers: llvm-commits

  Differential Revision: http://reviews.llvm.org/D11067

  llvm-svn: 242673

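  A hypothetical example (names made up) of a global access whose address is divergent and therefore lives in VGPRs; on SI/CI this can use a MUBUF access with the addr64 bit, while on VI a FLAT load/store is selected instead:

    declare i32 @llvm.amdgcn.workitem.id.x()

    define amdgpu_kernel void @copy_divergent(i32 addrspace(1)* %out, i32 addrspace(1)* %in) {
      ; The per-lane index makes the computed pointers divergent (VGPRs).
      %tid = call i32 @llvm.amdgcn.workitem.id.x()
      %gep.in = getelementptr i32, i32 addrspace(1)* %in, i32 %tid
      %gep.out = getelementptr i32, i32 addrspace(1)* %out, i32 %tid
      %v = load i32, i32 addrspace(1)* %gep.in
      store i32 %v, i32 addrspace(1)* %gep.out
      ret void
    }
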
* R600 -> AMDGPU rename (Tom Stellard, 2015-06-13, 1 file changed, -0/+801)

  llvm-svn: 239657