summaryrefslogtreecommitdiffstats
path: root/clang/test/CodeGenOpenCL
Commit message (Collapse)AuthorAgeFilesLines
...
* [AMDGPU] Fix size and alignment of size_t and pointer typesYaxun Liu2017-07-053-28/+98
| | | | | | Differential Revision: https://reviews.llvm.org/D34995 llvm-svn: 307121
* [AMDGPU] Fix regressions on mesa/clover with libclc due to address spaceYaxun Liu2017-07-041-0/+2
| | | | | | | | | | Currently AMDGPUTargetInfo does not initialize AddrSpaceMap in constructor, which causes regressions in mesa/clover with libclc. This patch fixes that. Differential Revision: https://reviews.llvm.org/D34987 llvm-svn: 307105
* CodeGen: Fix invalid bitcast for coerced function argumentYaxun Liu2017-06-291-5/+47
| | | | | | | | | | Clang assumes coerced function argument is in address space 0, which is not always true and results in invalid bitcasts. This patch fixes failure in OpenCL conformance test api/get_kernel_arg_info with amdgcn---amdgizcl triple, where non-zero alloca address space is used. Differential Revision: https://reviews.llvm.org/D34777 llvm-svn: 306721
* [OpenCL] Fix OpenCL and SPIR version metadata generation.Alexey Bader2017-06-201-8/+9
| | | | | | | | | | | | | | Summary: OpenCL and SPIR version metadata must be generated once per module instead of once per mangled global value. Reviewers: Anastasia, yaxunl Reviewed By: Anastasia Subscribers: ahatanak, cfe-commits Differential Revision: https://reviews.llvm.org/D34235 llvm-svn: 305796
* [OpenCL] spir_kern by defaul: fix old test casesPekka Jaaskelainen2017-06-016-8/+8
| | | | llvm-svn: 304396
* [OpenCL] Makes kernels use the SPIR_KERNEL CC by default.Pekka Jaaskelainen2017-06-011-0/+65
| | | | | | | | | | | | | | | | Rationale: OpenCL kernels are called via an explicit runtime API with arguments set with clSetKernelArg(), not as normal sub-functions. Return SPIR_KERNEL by default as the kernel calling convention to ensure the fingerprint is fixed such way that each OpenCL argument gets one matching argument in the produced kernel function argument list to enable feasible implementation of clSetKernelArg() with aggregates etc. In case we would use the default C calling conv here, clSetKernelArg() might break depending on the target-specific conventions; different targets might split structs passed as values to multiple function arguments etc. https://reviews.llvm.org/D33639 llvm-svn: 304389
* Fix issue with test that caused bildbot failureJaved Absar2017-05-301-1/+1
| | | | | | | | | | | | These tests did not specify the target. The failure was triggered by change - https://reviews.llvm.org/D33205 http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-full/builds/7314 which sets vector alignment to 8-byte for arm-targets (except for Android). So, fixing the test to make it target specific. llvm-svn: 304210
* [OpenCL] Test on half immediate support.Egor Churaev2017-05-291-0/+17
| | | | | | | | | | | | Reviewers: Anastasia Reviewed By: Anastasia Subscribers: yaxunl, cfe-commits, bader Differential Revision: https://reviews.llvm.org/D33592 llvm-svn: 304134
* IRGen: Add optnone attribute on function during O0Mehdi Amini2017-05-291-25/+25
| | | | | | | | | | | Amongst other, this will help LTO to correctly handle/honor files compiled with O0, helping debugging failures. It also seems in line with how we handle other options, like how -fnoinline adds the appropriate attribute as well. Differential Revision: https://reviews.llvm.org/D28404 llvm-svn: 304127
* Resubmit r303861.Konstantin Zhuravlyov2017-05-261-0/+7
| | | | | | | | [AMDGPU] add __builtin_amdgcn_s_getpc Patch by Tim Corringham llvm-svn: 304033
* Revert "[AMDGPU] add __builtin_amdgcn_s_getpc"Reid Kleckner2017-05-251-7/+0
| | | | | | This reverts commit r303861, the LLVM intrinsic was reverted. llvm-svn: 303908
* [AMDGPU] add __builtin_amdgcn_s_getpcTim Corringham2017-05-251-0/+7
| | | | | | | | | | Summary: Added the builtin corresponding to the s_getpc intrinsic added in llvm D32862 Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D33276 llvm-svn: 303861
* [AMDGPU] Do not require opencl triple environment for OpenCLYaxun Liu2017-05-237-1/+8
| | | | | | | | | | | | A recent change requires opencl triple environment for compiling OpenCL program, which causes regressions in libclc. This patch fixes that. Instead of deducing language based on triple environment, it checks LangOptions. Differential Revision: https://reviews.llvm.org/D33445 llvm-svn: 303644
* CodeGen: Cast alloca to expected address spaceYaxun Liu2017-05-188-8/+68
| | | | | | | | | | | Alloca always returns a pointer in alloca address space, which may be different from the type defined by the language. For example, in C++ the auto variables are in the default address space. Therefore cast alloca to the expected address space when necessary. Differential Revision: https://reviews.llvm.org/D32248 llvm-svn: 303370
* [OpenCL] Emit function-scope variable in constant address space as static ↵Yaxun Liu2017-05-153-27/+31
| | | | | | | | variable Differential Revision: https://reviews.llvm.org/D32977 llvm-svn: 303072
* [OpenCL] Add intel_reqd_sub_group_size attribute supportXiuli Pan2017-05-041-0/+4
| | | | | | | | | | | | | | | | Summary: Add intel_reqd_sub_group_size attribute support as intel extension cl_intel_required_subgroup_size from https://www.khronos.org/registry/OpenCL/extensions/intel/cl_intel_required_subgroup_size.txt Reviewers: Anastasia, bader, hfinkel, pxli168 Reviewed By: Anastasia, bader, pxli168 Subscribers: cfe-commits, yaxunl Differential Revision: https://reviews.llvm.org/D30805 llvm-svn: 302125
* Debug Info: Remove special-casing of indirect function argument handling.Adrian Prantl2017-04-181-2/+1
| | | | | | | | | | | | LLVM has changed the semantics of dbg.declare for describing function arguments. After this patch a dbg.declare always takes the *address* of a variable as the first argument, even if the argument is not an alloca. https://bugs.llvm.org/show_bug.cgi?id=32382 rdar://problem/31205000 llvm-svn: 300523
* CodeGen: Let byval parameter use alloca address spaceYaxun Liu2017-04-171-0/+18
| | | | | | Differential Revision: https://reviews.llvm.org/D32133 llvm-svn: 300487
* CodeGen: Let lifetime intrinsic use alloca address spaceYaxun Liu2017-04-171-0/+15
| | | | | | Differential Revision: https://reviews.llvm.org/D31717 llvm-svn: 300485
* [AMDGPU][GFX9] Set +fp32-denormals for >=gfx900 unless -cl-denorms-are-zero ↵Konstantin Zhuravlyov2017-04-141-0/+13
| | | | | | | | is set Differential Revision: https://reviews.llvm.org/D31482 llvm-svn: 300306
* [OpenCL] Map default address space to alloca address spaceYaxun Liu2017-04-113-14/+31
| | | | | | | | | | | | | | For OpenCL, the private address space qualifier is 0 in AST. Before this change, 0 address space qualifier is always mapped to target address space 0. As now target private address space is specified by alloca address space in data layout, address space qualifier 0 needs to be mapped to alloca addr space specified by the data layout. This change has no impact on targets whose alloca addr space is 0. With contributions from Matt Arsenault, Tony Tye and Wen-Heng (Jack) Chung Differential Revision: https://reviews.llvm.org/D31404 llvm-svn: 299965
* [AMDGPU] Temporarily change constant address space from 4 to 2 for the new ↵Yaxun Liu2017-04-062-1/+3
| | | | | | | | | | address space mapping Change constant address space from 4 to 2 for the new address space mapping in Clang. Differential Revision: https://reviews.llvm.org/D31771 llvm-svn: 299691
* [AMDGPU] Translate reqd_work_group_size into amdgpu_flat_work_group_sizeStanislav Mekhanoshin2017-04-061-0/+12
| | | | | | | | | | | | | These two attributes specify the same info in a different way. AMGPU BE only checks the latter as a target specific attribute as opposed to language specific reqd_work_group_size. This change produces amdgpu_flat_work_group_size out of reqd_work_group_size if specified. Differential Revision: https://reviews.llvm.org/D31728 llvm-svn: 299678
* [OpenCL] Enables passing sampler initializer to function argumentEgor Churaev2017-04-051-0/+4
| | | | | | | | | | | | Reviewers: Anastasia, cfe-commits Reviewed By: Anastasia Subscribers: yaxunl, bader Differential Revision: https://reviews.llvm.org/D31594 llvm-svn: 299524
* Preserve vec3 type.Jin-Gu Kang2017-04-041-0/+24
| | | | | | | | | | | | | | Summary: Preserve vec3 type with CodeGen option. Reviewers: Anastasia, bruno Reviewed By: Anastasia Subscribers: bruno, ahatanak, cfe-commits Differential Revision: https://reviews.llvm.org/D30810 llvm-svn: 299445
* [OpenCL] Do not generate "kernel_arg_type_qual" metadata for non-pointer argsEgor Churaev2017-03-311-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: "kernel_arg_type_qual" metadata should contain const/volatile/restrict tags only for pointer types to match the corresponding requirement of the OpenCL specification. OpenCL 2.0 spec 5.9.3 Kernel Object Queries: CL_KERNEL_ARG_TYPE_VOLATILE is returned if the argument is a pointer and the referenced type is declared with the volatile qualifier. [...] Similarly, CL_KERNEL_ARG_TYPE_CONST is returned if the argument is a pointer and the referenced type is declared with the restrict or const qualifier. [...] CL_KERNEL_ARG_TYPE_RESTRICT will be returned if the pointer type is marked restrict. Reviewers: Anastasia, cfe-commits Reviewed By: Anastasia Subscribers: bader, yaxunl Differential Revision: https://reviews.llvm.org/D31321 llvm-svn: 299192
* [OpenCL] Extended mapping of parcing CodeGen argumentsEgor Churaev2017-03-271-4/+25
| | | | | | | | | | | | | | Summary: Enable cl_mad_enamle and cl_no_signed_zeros options when user turns on cl_unsafe_math_optimizations or cl_fast_relaxed_math options. Reviewers: Anastasia, cfe-commits Reviewed By: Anastasia Subscribers: bader, yaxunl Differential Revision: https://reviews.llvm.org/D31324 llvm-svn: 298838
* [AMDGPU] Switch address space mapping by triple environment amdgizYaxun Liu2017-03-251-0/+9
| | | | | | | | For target environment amdgiz and amdgizcl (giz means Generic Is Zero), AMDGPU will use new address space mapping where generic address space is 0 and private address space is 5. The data layout is also changed correspondingly. Differential Revision: https://reviews.llvm.org/D31210 llvm-svn: 298767
* Fix array sizes where address space is not yet knownKonstantin Zhuravlyov2017-03-211-0/+12
| | | | | | | | | | | | | | | | | | | For variables in generic address spaces, for example: ``` unsigned char V[6442450944]; ... ``` the address space is not yet known when we get into *getConstantArrayType*, it is 0. AMDGCN target's address space 0 has 32 bits pointers, so when we call *getPointerWidth* with 0, the array size is trimmed to 32 bits, which is not right. Differential Revision: https://reviews.llvm.org/D30845 llvm-svn: 298420
* [OpenCL] Added implicit conversion rank for overloading functions with ↵Egor Churaev2017-03-211-0/+46
| | | | | | | | | | | | | | | | vector data type in OpenCL Summary: I added a new rank to ImplicitConversionRank enum to resolve the function overload ambiguity with vector types. Rank of scalar types conversion is lower than vector splat. So, we can choose which function should we call. See test for more details. Reviewers: Anastasia, cfe-commits Reviewed By: Anastasia Subscribers: bader, yaxunl Differential Revision: https://reviews.llvm.org/D30816 llvm-svn: 298366
* AMDGPU: Make 0 the private nullptr valueMatt Arsenault2017-03-132-56/+144
| | | | | | | | | | | | We can't actually pretend that 0 is valid for address space 0. r295877 added a workaround to stop allocating user objects there, so we can use 0 as the invalid pointer. Some of the tests seemed to be using private as the non-0 null test address space, so add copies using local to make sure this is still stressed. llvm-svn: 297659
* [AMDGPU] Add builtin functions readlane ds_permute mov_dppYaxun Liu2017-03-102-0/+36
| | | | | | Differential Revision: https://reviews.llvm.org/D30551 llvm-svn: 297436
* [DebugInfo] Append extended dereferencing mechanism to variables' ↵Konstantin Zhuravlyov2017-03-091-0/+131
| | | | | | | | DIExpression for targets that support more than one address space Differential Revision: https://reviews.llvm.org/D29673 llvm-svn: 297397
* [DebugInfo] Add address space when creating DIDerivedTypesKonstantin Zhuravlyov2017-03-081-0/+125
| | | | | | Differential Revision: https://reviews.llvm.org/D29671 llvm-svn: 297321
* AMDGPU: export s_sendmsg{halt} instrinsicsJan Vesely2017-02-251-0/+28
| | | | | | Differential Revision: https://reviews.llvm.org/D30366 llvm-svn: 296241
* AMDGPU: export l1 cache invalidation intrinsicsJan Vesely2017-02-251-0/+14
| | | | | | Differential Revision: https://reviews.llvm.org/D30360 llvm-svn: 296240
* AMDGPU: export s_waitcnt builtinJan Vesely2017-02-251-0/+7
| | | | | | Differential Revision: https://reviews.llvm.org/D30359 llvm-svn: 296239
* AMDGPU: Add fmed3 half builtinMatt Arsenault2017-02-221-0/+11
| | | | llvm-svn: 295874
* [OpenCL] r600 needs OpenCL kernel calling conventionJan Vesely2017-02-221-0/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D30236 llvm-svn: 295843
* [OpenCL] Correct ndrange_t implementationAnastasia Stulova2017-02-161-23/+18
| | | | | | | | | | | | | | | Removed ndrange_t as Clang builtin type and added as a struct type in the OpenCL header. Use type name to do the Sema checking in enqueue_kernel and modify IR generation accordingly. Review: D28058 Patch by Dmitry Borisenkov! llvm-svn: 295311
* AMDGPU: Add a test checking alignments of emitted globals/allocasMatt Arsenault2017-02-071-0/+522
| | | | | | | Make sure the spec required type alignments are used in preparation for a possible change which may break this. llvm-svn: 294278
* AMDGPU: Add builtin for fmed3 intrinsicMatt Arsenault2017-01-311-0/+7
| | | | llvm-svn: 293600
* [OpenCL] Add missing address spaces in IR generation of blocksAnastasia Stulova2017-01-272-24/+42
| | | | | | | | | | | | | | | | | | | | | | | | Modify ObjC blocks impl wrt address spaces as follows: - keep default private address space for blocks generated as local variables (with captures); - add global address space for global block literals (no captures); - make the block invoke function and enqueue_kernel prototype with the generic AS block pointer parameter to accommodate both private and global AS cases from above; - add block handling into default AS because it's implemented as a special pointer type (BlockPointer) in the frontend and therefore it is used as a pointer everywhere. This is also needed to accommodate both private and global AS blocks for the two cases above. - removes ObjC RT specific symbols (NSConcreteStackBlock and NSConcreteGlobalBlock) in the OpenCL mode. Review: https://reviews.llvm.org/D28814 llvm-svn: 293286
* AMDGPU: Update for changed subtarget feature nameMatt Arsenault2017-01-231-5/+5
| | | | llvm-svn: 292838
* AMDGPU: Add builtin for getreg intrinsicMatt Arsenault2017-01-201-0/+11
| | | | llvm-svn: 292636
* [OpenCL] Align fake address space map with the SPIR target maps.Egor Churaev2016-12-2310-41/+41
| | | | | | | | | | | | | | | Summary: We compile user opencl kernel code with spir triple. But built-ins are written in OpenCL and we compile it with triple x86_64 to be able to use x86 intrinsics. And we need address spaces to match in both cases. So, we change fake address space map in OpenCL for matching with spir. On CPU address spaces are not really important but we'd like to preserve address space information in order to perform optimizations relying on this info like enhanced alias analysis. Reviewers: pekka.jaaskelainen, Anastasia Subscribers: pekka.jaaskelainen, yaxunl, bader, cfe-commits Differential Revision: https://reviews.llvm.org/D28048 llvm-svn: 290436
* Fix problems in "[OpenCL] Enabling the usage of CLK_NULL_QUEUE as compare ↵Egor Churaev2016-12-231-0/+18
| | | | | | | | | | | | | | operand." Summary: Fixed warnings in commit: https://reviews.llvm.org/rL290171 Reviewers: djasper, Anastasia Subscribers: yaxunl, cfe-commits, bader Differential Revision: https://reviews.llvm.org/D27981 llvm-svn: 290431
* Cleanup the handling of noinline function attributes, -fno-inline,Chandler Carruth2016-12-231-20/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | -fno-inline-functions, -O0, and optnone. These were really, really tangled together: - We used the noinline LLVM attribute for -fno-inline - But not for -fno-inline-functions (breaking LTO) - But we did use it for -finline-hint-functions (yay, LTO is happy!) - But we didn't for -O0 (LTO is sad yet again...) - We had weird structuring of CodeGenOpts with both an inlining enumeration and a boolean. They interacted in weird ways and needlessly. - A *lot* of set smashing went on with setting these, and then got worse when we considered optnone and other inlining-effecting attributes. - A bunch of inline affecting attributes were managed in a completely different place from -fno-inline. - Even with -fno-inline we failed to put the LLVM noinline attribute onto many generated function definitions because they didn't show up as AST-level functions. - If you passed -O0 but -finline-functions we would run the normal inliner pass in LLVM despite it being in the O0 pipeline, which really doesn't make much sense. - Lastly, we used things like '-fno-inline' to manipulate the pass pipeline which forced the pass pipeline to be much more parameterizable than it really needs to be. Instead we can *just* use the optimization level to select a pipeline and control the rest via attributes. Sadly, this causes a bunch of churn in tests because we don't run the optimizer in the tests and check the contents of attribute sets. It would be awesome if attribute sets were a bit more FileCheck friendly, but oh well. I think this is a significant improvement and should remove the semantic need to change what inliner pass we run in order to comply with the requested inlining semantics by relying completely on attributes. It also cleans up tho optnone and related handling a bit. One unfortunate aspect of this is that for generating alwaysinline routines like those in OpenMP we end up removing noinline and then adding alwaysinline. I tried a bunch of other approaches, but because we recompute function attributes from scratch and don't have a declaration here I couldn't find anything substantially cleaner than this. Differential Revision: https://reviews.llvm.org/D28053 llvm-svn: 290398
* Add the alloc_size attribute to clang, attempt 2.George Burgess IV2016-12-221-13/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a recommit of r290149, which was reverted in r290169 due to msan failures. msan was failing because we were calling `isMostDerivedAnUnsizedArray` on an invalid designator, which caused us to read uninitialized memory. To fix this, the logic of the caller of said function was simplified, and we now have a `!Invalid` assert in `isMostDerivedAnUnsizedArray`, so we can catch this particular bug more easily in the future. Fingers crossed that this patch sticks this time. :) Original commit message: This patch does three things: - Gives us the alloc_size attribute in clang, which lets us infer the number of bytes handed back to us by malloc/realloc/calloc/any user functions that act in a similar manner. - Teaches our constexpr evaluator that evaluating some `const` variables is OK sometimes. This is why we have a change in test/SemaCXX/constant-expression-cxx11.cpp and other seemingly unrelated tests. Richard Smith okay'ed this idea some time ago in person. - Uniques some Blocks in CodeGen, which was reviewed separately at D26410. Lack of uniquing only really shows up as a problem when combined with our new eagerness in the face of const. llvm-svn: 290297
* Revert "[OpenCL] Enabling the usage of CLK_NULL_QUEUE as compare operand."Daniel Jasper2016-12-201-18/+0
| | | | | | | | | | This reverts commit r290171. It triggers a bunch of warnings, because the new enumerator isn't handled in all switches. We want a warning-free build. Replied on the commit with more details. llvm-svn: 290173
OpenPOWER on IntegriCloud