summaryrefslogtreecommitdiffstats
path: root/clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
Commit message (Collapse)AuthorAgeFilesLines
* IR: print value numbers for unnamed function argumentsTim Northover2019-08-031-4/+4
| | | | | | | | | | For consistency with normal instructions and clarity when reading IR, it's best to print the %0, %1, ... names of function arguments in definitions. Also modifies the parser to accept IR in that form for obvious reasons. llvm-svn: 367755
* Revert r326937 "[OpenCL] Remove block invoke function from emitted block ↵Sven van Haastregt2018-10-021-8/+8
| | | | | | | | | | | literal struct" This reverts r326937 as it broke block argument handling in OpenCL. See the discussion on https://reviews.llvm.org/D43783 . The next commit will add a test case that revealed the issue. llvm-svn: 343582
* [OpenCL] Remove block invoke function from emitted block literal structYaxun Liu2018-03-071-8/+8
| | | | | | | | | | | | | | | | | | | | | OpenCL runtime tracks the invoke function emitted for any block expression. Due to restrictions on blocks in OpenCL (v2.0 s6.12.5), it is always possible to know the block invoke function when emitting call of block expression or __enqueue_kernel builtin functions. Since __enqueu_kernel already has an argument for the invoke function, it is redundant to have invoke function member in the llvm block literal structure. This patch removes invoke function from the llvm block literal structure. It also removes the bitcast of block invoke function to the generic block literal type which is useless for OpenCL. This will save some space for the kernel argument, and also eliminate some store instructions. Differential Revision: https://reviews.llvm.org/D43783 llvm-svn: 326937
* [OpenCL] Fix __enqueue_block for block with capturesYaxun Liu2018-02-151-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | The following test case causes issue with codegen of __enqueue_block void (^block)(void) = ^{ callee(id, out); }; enqueue_kernel(queue, 0, ndrange, block); Clang first does codegen for block expression in the first line and deletes its block info. Clang then tries to do codegen for the same block expression again for the second line, and fails because the block info is gone. The fix is to do normal codegen for both lines. Introduce an API to OpenCL runtime to record llvm block invoke function and llvm block literal emitted for each AST block expression, and use the recorded information for generating the wrapper kernel. The EmitBlockLiteral APIs are cleaned up to minimize changes to the normal codegen of blocks. Another minor issue is that some clean up AST expression is generated for block with captures, which can be stripped by IgnoreImplicit. Differential Revision: https://reviews.llvm.org/D43240 llvm-svn: 325264
* [AMDGPU] Switch to the new addr space mapping by defaultYaxun Liu2018-02-021-7/+7
| | | | | | | | This requires corresponding llvm change. Differential Revision: https://reviews.llvm.org/D40956 llvm-svn: 324102
* [AMDGPU] Fix bug in enqueued block codegen due to an extra lineYaxun Liu2017-10-191-0/+9
| | | | llvm-svn: 316165
* [OpenCL] Emit enqueued block as kernelYaxun Liu2017-10-141-0/+36
In OpenCL the kernel function and non-kernel function has different calling conventions. For certain targets they have different argument ABIs. Also kernels have special function attributes and metadata for runtime to launch them. The blocks passed to enqueue_kernel is supposed to be executed as kernels. As such, the block invoke function should be emitted as kernel with proper calling convention and argument ABI. This patch emits enqueued block as kernel. If a block is both called directly and passed to enqueue_kernel, separate functions will be generated. Differential Revision: https://reviews.llvm.org/D38134 llvm-svn: 315804
OpenPOWER on IntegriCloud