summaryrefslogtreecommitdiffstats
path: root/clang/test/CodeGenOpenCL/cl20-device-side-enqueue.cl
Commit message (Collapse)AuthorAgeFilesLines
* IR: print value numbers for unnamed function argumentsTim Northover2019-08-031-2/+2
| | | | | | | | | | For consistency with normal instructions and clarity when reading IR, it's best to print the %0, %1, ... names of function arguments in definitions. Also modifies the parser to accept IR in that form for obvious reasons. llvm-svn: 367755
* LLVM IR: Generate new-style byval-with-Type from ClangTim Northover2019-06-051-3/+3
| | | | | | | | | | | LLVM IR recently added a Type parameter to the byval Attribute, so that when pointers become opaque and no longer have an element type the information will still be present in IR. For now the Type parameter is optional (which is why Clang didn't need this change at the time), but it will become mandatory soon. llvm-svn: 362652
* [OpenCL] Re-fix invalid address space generation for clk_event_t arguments ↵Alexey Sotkin2019-04-111-2/+2
| | | | | | | | | | | | | | | | | | | of enqueue_kernel builtin function Summary: https://reviews.llvm.org/D53809 fixed wrong address space(assert in debug build) generated for event_ret argument. But exactly the same problem exists for event_wait_list argument. This patch should fix both. Reviewers: Anastasia, yaxunl Reviewed By: Anastasia Subscribers: kristina, ebevhan, cfe-commits Differential Revision: https://reviews.llvm.org/D59985 llvm-svn: 358151
* [OpenCL] Simplify LLVM IR generated for OpenCL blocksAndrew Savonichev2019-02-211-8/+26
| | | | | | | | | | | | | | | | | | | Summary: Emit direct call of block invoke functions when possible, i.e. in case the block is not passed as a function argument. Also doing some refactoring of `CodeGenFunction::EmitBlockCallExpr()` Reviewers: Anastasia, yaxunl, svenvh Reviewed By: Anastasia Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D58388 llvm-svn: 354568
* [OpenCL] Change type of block pointer for OpenCLAlexey Bader2019-02-191-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: For some reason OpenCL blocks in LLVM IR are represented as function pointers. These pointers do not point to any real function and never get called. Actually they point to some structure, which in turn contains pointer to the real block invoke function. This patch changes represntation of OpenCL blocks in LLVM IR from function pointers to pointers to `%struct.__block_literal_generic`. Such representation allows to avoid unnecessary bitcasts and simplifies further processing (e.g. translation to SPIR-V ) of the module for targets which do not support function pointers. Patch by: Alexey Sotkin. Reviewers: Anastasia, yaxunl, svenvh Reviewed By: Anastasia Subscribers: alexbatashev, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D58277 llvm-svn: 354337
* [opaque pointer types] Cleanup CGBuilder's Create*GEP.James Y Knight2019-02-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | Some of these functions take some extraneous arguments, e.g. EltSize, Offset, which are computable from the Type and DataLayout. Add some asserts to ensure that the computed values are consistent with the passed-in values, in preparation for eliminating the extraneous arguments. This also asserts that the Type is an Array for the calls named "Array" and a Struct for the calls named "Struct". Then, correct a couple of errors: 1. Using CreateStructGEP on an array type. (this causes the majority of the test differences, as struct GEPs are created with i32 indices, while array GEPs are created with i64 indices) 2. Passing the wrong Offset to CreateStructGEP in TargetInfo.cpp on x86-64 NACL (which uses 32-bit pointers). Differential Revision: https://reviews.llvm.org/D57766 llvm-svn: 353529
* [OpenCL] Fix invalid address space generation for clk_event_tAlexey Sotkin2018-11-141-1/+8
| | | | | | | | | | | | | | | | | | Summary: Addrspace(32) was generated when putting 0 in clk_event_t * event_ret parameter for enqueue_kernel function. Patch by Viktoria Maksimova Reviewers: Anastasia, yaxunl, AlexeySotkin Reviewed By: Anastasia, AlexeySotkin Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D53809 llvm-svn: 346838
* Revert r326937 "[OpenCL] Remove block invoke function from emitted block ↵Sven van Haastregt2018-10-021-37/+46
| | | | | | | | | | | literal struct" This reverts r326937 as it broke block argument handling in OpenCL. See the discussion on https://reviews.llvm.org/D43783 . The next commit will add a test case that revealed the issue. llvm-svn: 343582
* [OpenCL] Restore r338899 (reverted in r338904), fixing stack-use-after-returnScott Linder2018-08-071-66/+111
| | | | | | | | | Always emit alloca in entry block for enqueue_kernel builtin. Ensures the statically sized alloca is not converted to DYNAMIC_STACKALLOC later because it is not in the entry block. llvm-svn: 339150
* Revert "[OpenCL] Always emit alloca in entry block for enqueue_kernel builtin"Vlad Tsyrklevich2018-08-031-111/+66
| | | | | | This reverts commit r338899, it was causing ASan test failures on sanitizer-x86_64-linux-fast. llvm-svn: 338904
* [OpenCL] Always emit alloca in entry block for enqueue_kernel builtinScott Linder2018-08-031-66/+111
| | | | | | | | | Ensures the statically sized alloca is not converted to DYNAMIC_STACKALLOC later because it is not in the entry block. Differential Revision: https://reviews.llvm.org/D50104 llvm-svn: 338899
* [OpenCL] Fix typos in emitted enqueue kernel function namesYaxun Liu2018-05-091-9/+9
| | | | | | | | | | Two typos: vaarg => vararg get_kernel_preferred_work_group_multiple => get_kernel_preferred_work_group_size_multiple Differential Revision: https://reviews.llvm.org/D46601 llvm-svn: 331895
* [OpenCL] Remove block invoke function from emitted block literal structYaxun Liu2018-03-071-46/+37
| | | | | | | | | | | | | | | | | | | | | OpenCL runtime tracks the invoke function emitted for any block expression. Due to restrictions on blocks in OpenCL (v2.0 s6.12.5), it is always possible to know the block invoke function when emitting call of block expression or __enqueue_kernel builtin functions. Since __enqueu_kernel already has an argument for the invoke function, it is redundant to have invoke function member in the llvm block literal structure. This patch removes invoke function from the llvm block literal structure. It also removes the bitcast of block invoke function to the generic block literal type which is useless for OpenCL. This will save some space for the kernel argument, and also eliminate some store instructions. Differential Revision: https://reviews.llvm.org/D43783 llvm-svn: 326937
* [OpenCL] Fix __enqueue_block for block with capturesYaxun Liu2018-02-151-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | The following test case causes issue with codegen of __enqueue_block void (^block)(void) = ^{ callee(id, out); }; enqueue_kernel(queue, 0, ndrange, block); Clang first does codegen for block expression in the first line and deletes its block info. Clang then tries to do codegen for the same block expression again for the second line, and fails because the block info is gone. The fix is to do normal codegen for both lines. Introduce an API to OpenCL runtime to record llvm block invoke function and llvm block literal emitted for each AST block expression, and use the recorded information for generating the wrapper kernel. The EmitBlockLiteral APIs are cleaned up to minimize changes to the normal codegen of blocks. Another minor issue is that some clean up AST expression is generated for block with captures, which can be stripped by IgnoreImplicit. Differential Revision: https://reviews.llvm.org/D43240 llvm-svn: 325264
* [OpenCL] Emit enqueued block as kernelYaxun Liu2017-10-141-26/+162
| | | | | | | | | | | | | | | | | In OpenCL the kernel function and non-kernel function has different calling conventions. For certain targets they have different argument ABIs. Also kernels have special function attributes and metadata for runtime to launch them. The blocks passed to enqueue_kernel is supposed to be executed as kernels. As such, the block invoke function should be emitted as kernel with proper calling convention and argument ABI. This patch emits enqueued block as kernel. If a block is both called directly and passed to enqueue_kernel, separate functions will be generated. Differential Revision: https://reviews.llvm.org/D38134 llvm-svn: 315804
* [OpenCL] Clean up and add missing fields for block structYaxun Liu2017-10-041-26/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently block is translated to a structure equivalent to struct Block { void *isa; int flags; int reserved; void *invoke; void *descriptor; }; Except invoke, which is the pointer to the block invoke function, all other fields are useless for OpenCL, which clutter the IR and also waste memory since the block struct is passed to the block invoke function as argument. On the other hand, the size and alignment of the block struct is not stored in the struct, which causes difficulty to implement __enqueue_kernel as library function, since the library function needs to know the size and alignment of the argument which needs to be passed to the kernel. This patch removes the useless fields from the block struct and adds size and align fields. The equivalent block struct will become struct Block { int size; int align; generic void *invoke; /* custom fields */ }; It also changes the pointer to the invoke function to be a generic pointer since the address space of a function may not be private on certain targets. Differential Revision: https://reviews.llvm.org/D37822 llvm-svn: 314932
* [OpenCL] Do not use vararg in emitted functions for enqueue_kernelYaxun Liu2017-09-031-18/+72
| | | | | | | | | | Not all targets support vararg (e.g. amdgpu). Instead of using vararg in the emitted functions for enqueue_kernel, this patch creates a temporary array of size_t, stores the size arguments in the temporary array and passes it to the emitted functions for enqueue_kernel. Differential Revision: https://reviews.llvm.org/D36678 llvm-svn: 312441
* [OpenCL] Add missing subgroup builtinsJoey Gouly2017-08-011-0/+5
| | | | | | | This adds get_kernel_max_sub_group_size_for_ndrange and get_kernel_sub_group_count_for_ndrange. llvm-svn: 309678
* [OpenCL] Add extension Sema check for subgroup builtinsJoey Gouly2017-07-311-0/+2
| | | | | | Check the subgroup extension is enabled, before doing other Sema checks. llvm-svn: 309567
* [OpenCL] Correct ndrange_t implementationAnastasia Stulova2017-02-161-23/+18
| | | | | | | | | | | | | | | Removed ndrange_t as Clang builtin type and added as a struct type in the OpenCL header. Use type name to do the Sema checking in enqueue_kernel and modify IR generation accordingly. Review: D28058 Patch by Dmitry Borisenkov! llvm-svn: 295311
* [OpenCL] Add missing address spaces in IR generation of blocksAnastasia Stulova2017-01-271-24/+23
| | | | | | | | | | | | | | | | | | | | | | | | Modify ObjC blocks impl wrt address spaces as follows: - keep default private address space for blocks generated as local variables (with captures); - add global address space for global block literals (no captures); - make the block invoke function and enqueue_kernel prototype with the generic AS block pointer parameter to accommodate both private and global AS cases from above; - add block handling into default AS because it's implemented as a special pointer type (BlockPointer) in the frontend and therefore it is used as a pointer everywhere. This is also needed to accommodate both private and global AS blocks for the two cases above. - removes ObjC RT specific symbols (NSConcreteStackBlock and NSConcreteGlobalBlock) in the OpenCL mode. Review: https://reviews.llvm.org/D28814 llvm-svn: 293286
* [OpenCL] Align fake address space map with the SPIR target maps.Egor Churaev2016-12-231-15/+15
| | | | | | | | | | | | | | | Summary: We compile user opencl kernel code with spir triple. But built-ins are written in OpenCL and we compile it with triple x86_64 to be able to use x86 intrinsics. And we need address spaces to match in both cases. So, we change fake address space map in OpenCL for matching with spir. On CPU address spaces are not really important but we'd like to preserve address space information in order to perform optimizations relying on this info like enhanced alias analysis. Reviewers: pekka.jaaskelainen, Anastasia Subscribers: pekka.jaaskelainen, yaxunl, bader, cfe-commits Differential Revision: https://reviews.llvm.org/D28048 llvm-svn: 290436
* Add the alloc_size attribute to clang, attempt 2.George Burgess IV2016-12-221-13/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a recommit of r290149, which was reverted in r290169 due to msan failures. msan was failing because we were calling `isMostDerivedAnUnsizedArray` on an invalid designator, which caused us to read uninitialized memory. To fix this, the logic of the caller of said function was simplified, and we now have a `!Invalid` assert in `isMostDerivedAnUnsizedArray`, so we can catch this particular bug more easily in the future. Fingers crossed that this patch sticks this time. :) Original commit message: This patch does three things: - Gives us the alloc_size attribute in clang, which lets us infer the number of bytes handed back to us by malloc/realloc/calloc/any user functions that act in a similar manner. - Teaches our constexpr evaluator that evaluating some `const` variables is OK sometimes. This is why we have a change in test/SemaCXX/constant-expression-cxx11.cpp and other seemingly unrelated tests. Richard Smith okay'ed this idea some time ago in person. - Uniques some Blocks in CodeGen, which was reviewed separately at D26410. Lack of uniquing only really shows up as a problem when combined with our new eagerness in the face of const. llvm-svn: 290297
* Revert r290149: Add the alloc_size attribute to clang.Chandler Carruth2016-12-201-11/+13
| | | | | | | | | | | | | | | This commit fails MSan when running test/CodeGen/object-size.c in a confusing way. After some discussion with George, it isn't really clear what is going on here. We can make the MSan failure go away by testing for the invalid bit, but *why* things are invalid isn't clear. And yet, other code in the surrounding area is doing precisely this and testing for invalid. George is going to take a closer look at this to better understand the nature of the failure and recommit it, for now backing it out to clean up MSan builds. llvm-svn: 290169
* Add the alloc_size attribute to clang.George Burgess IV2016-12-201-13/+11
| | | | | | | | | | | | | | | | | | | | This patch does three things: - Gives us the alloc_size attribute in clang, which lets us infer the number of bytes handed back to us by malloc/realloc/calloc/any user functions that act in a similar manner. - Teaches our constexpr evaluator that evaluating some `const` variables is OK sometimes. This is why we have a change in test/SemaCXX/constant-expression-cxx11.cpp and other seemingly unrelated tests. Richard Smith okay'ed this idea some time ago in person. - Uniques some Blocks in CodeGen, which was reviewed separately at D26410. Lack of uniquing only really shows up as a problem when combined with our new eagerness in the face of const. Differential Revision: https://reviews.llvm.org/D14274 llvm-svn: 290149
* [OpenCL] Fix for integer parameters of enqueue_kernelAnastasia Stulova2016-11-141-58/+90
| | | | | | | | | | | | | | Make handling integer parameters more flexible: - For the number of events argument allow to pass larger integers than 32 bits as soon as compiler can prove that the range fits in 32 bits. If not, the diagnostic will be given. - Change type of the arguments specifying the sizes of the corresponding block arguments to be size_t. Review: https://reviews.llvm.org/D26509 llvm-svn: 286849
* [OpenCL] Change to clk_event parameter in enqueue_kernel.Anastasia Stulova2016-11-141-11/+18
| | | | | | | | | - Accept NULL pointer as a valid parameter value for clk_event. - Generate clk_event_t arguments of internal __enqueue_kernel_XXX function as pointers in generic address space. Review: https://reviews.llvm.org/D26507 llvm-svn: 286836
* [OpenCL] Fix typo in test that I accidentally introduced in my previous commit.Joey Gouly2016-08-101-1/+1
| | | | llvm-svn: 278235
* [OpenCL] Change block descriptor address space to constant.Joey Gouly2016-08-101-7/+7
| | | | | | | The block descriptor is a GlobalVariable in the LLVM IR, so it shouldn't be in the private address space. llvm-svn: 278234
* [OpenCL] An implementation of device side enqueue (DSE) from OpenCL v2.0 ↵Anastasia Stulova2016-07-051-0/+110
s6.13.17. - Added new Builtins: enqueue_kernel, get_kernel_work_group_size and get_kernel_preferred_work_group_size_multiple. These Builtins use custom check to diagnose parameters of the passed Blocks i. e. variable number of 'local void*' type params, and check different overloads specified in Table 6.31 of OpenCL v2.0. - IR is generated as an internal library call for each OpenCL Builtin, reusing ObjC Block implementation. Review: http://reviews.llvm.org/D20249 llvm-svn: 274540
OpenPOWER on IntegriCloud