summaryrefslogtreecommitdiffstats
path: root/clang/test/CodeGenOpenCL
Commit message (Collapse)AuthorAgeFilesLines
...
* [OpenCL] Enabling the usage of CLK_NULL_QUEUE as compare operand.Egor Churaev2016-12-201-0/+18
| | | | | | | | | | | | Summary: Enabling the compression of CLK_NULL_QUEUE to variable of type queue_t. Reviewers: Anastasia Subscribers: cfe-commits, yaxunl, bader Differential Revision: https://reviews.llvm.org/D27569 llvm-svn: 290171
* Revert r290149: Add the alloc_size attribute to clang.Chandler Carruth2016-12-201-11/+13
| | | | | | | | | | | | | | | This commit fails MSan when running test/CodeGen/object-size.c in a confusing way. After some discussion with George, it isn't really clear what is going on here. We can make the MSan failure go away by testing for the invalid bit, but *why* things are invalid isn't clear. And yet, other code in the surrounding area is doing precisely this and testing for invalid. George is going to take a closer look at this to better understand the nature of the failure and recommit it, for now backing it out to clean up MSan builds. llvm-svn: 290169
* Add the alloc_size attribute to clang.George Burgess IV2016-12-201-13/+11
| | | | | | | | | | | | | | | | | | | | This patch does three things: - Gives us the alloc_size attribute in clang, which lets us infer the number of bytes handed back to us by malloc/realloc/calloc/any user functions that act in a similar manner. - Teaches our constexpr evaluator that evaluating some `const` variables is OK sometimes. This is why we have a change in test/SemaCXX/constant-expression-cxx11.cpp and other seemingly unrelated tests. Richard Smith okay'ed this idea some time ago in person. - Uniques some Blocks in CodeGen, which was reviewed separately at D26410. Lack of uniquing only really shows up as a problem when combined with our new eagerness in the face of const. Differential Revision: https://reviews.llvm.org/D14274 llvm-svn: 290149
* Recommit r289979 [OpenCL] Allow disabling types and declarations associated ↵Yaxun Liu2016-12-181-0/+25
| | | | | | | | with extensions Fixed undefined behavior due to cast integer to bool in initializer list. llvm-svn: 290056
* Revert r289979 due to regressionsYaxun Liu2016-12-161-25/+0
| | | | llvm-svn: 289991
* [OpenCL] Allow disabling types and declarations associated with extensionsYaxun Liu2016-12-161-0/+25
| | | | | | | | | | | | | | | | | | Added a map to associate types and declarations with extensions. Refactored existing diagnostic for disabled types associated with extensions and extended it to declarations for generic situation. Fixed some bugs for types associated with extensions. Allow users to use pragma to declare types and functions for supported extensions, e.g. #pragma OPENCL EXTENSION the_new_extension_name : begin // declare types and functions associated with the extension here #pragma OPENCL EXTENSION the_new_extension_name : end Differential Revision: https://reviews.llvm.org/D21698 llvm-svn: 289979
* Re-commit r289252 and r289285, and fix PR31374Yaxun Liu2016-12-151-0/+534
| | | | llvm-svn: 289787
* Revert 289252 (and follow-up 289285), it caused PR31374Nico Weber2016-12-141-534/+0
| | | | llvm-svn: 289713
* Fixing build failure by adding triple option to new test condition.Neil Hickey2016-12-131-1/+1
| | | | | | Adding -triple option to ensure target supports double for fpmath test. llvm-svn: 289552
* Improve handling of floating point literals in OpenCL to only use double ↵Neil Hickey2016-12-131-1/+15
| | | | | | | | | | | | | | precision if the target supports fp64. This change makes sure single-precision floating point types are used if the cl_fp64 extension is not supported by the target. Also removed the check to see whether the OpenCL version is >= 1.2, as this has been incorporated into the extension setting code. Differential Revision: https://reviews.llvm.org/D24235 llvm-svn: 289544
* [OpenCL] Enable unroll hint for OpenCL 1.x.Egor Churaev2016-12-131-0/+1
| | | | | | | | | | | | Summary: Although the feature was introduced only in OpenCL C v2.0 spec., it's useful for OpenCL 1.x too and doesn't require HW support. Reviewers: Anastasia Subscribers: yaxunl, cfe-commits, bader Differential Revision: https://reviews.llvm.org/D27453 llvm-svn: 289535
* Add support for non-zero null pointer for C and OpenCLYaxun Liu2016-12-091-0/+534
| | | | | | | | | | | | | | | | | | In amdgcn target, null pointers in global, constant, and generic address space take value 0 but null pointers in private and local address space take value -1. Currently LLVM assumes all null pointers take value 0, which results in incorrectly translated IR. To workaround this issue, instead of emit null pointers in local and private address space, a null pointer in generic address space is emitted and casted to local and private address space. Tentative definition of global variables with non-zero initializer will have weak linkage instead of common linkage since common linkage requires zero initializer and does not have explicit section to hold the non-zero value. Virtual member functions getNullPointer and performAddrSpaceCast are added to TargetCodeGenInfo which by default returns ConstantPointerNull and emitting addrspacecast instruction. A virtual member function getNullPointerValue is added to TargetInfo which by default returns 0. Each target can override these virtual functions to get target specific null pointer and the null pointer value for specific address space, and perform specific translations for addrspacecast. Wrapper functions getNullPointer is added to CodegenModule and getTargetNullPointerValue is added to ASTContext to facilitate getting the target specific null pointers and their values. This change has no effect on other targets except amdgcn target. Other targets can provide support of non-zero null pointer in a similar way. This change only provides support for non-zero null pointer for C and OpenCL. Supporting for other languages will be added later incrementally. Differential Revision: https://reviews.llvm.org/D26196 llvm-svn: 289252
* [OpenCL] Added a LIT test for ensuring address space mangling is done the ↵Alexey Bader2016-12-071-0/+14
| | | | | | | | | | | | | | same both in OpenCL1.2 and OpenCL2.0. Patch by Egor Churaev (echuraev). Reviewers: Anastasia Subscribers: yaxunl, cfe-commits, bader Differential Revision: https://reviews.llvm.org/D27403 llvm-svn: 288891
* [OpenCL] Fix SPIR version generation.Alexey Bader2016-12-071-9/+12
| | | | | | | | | | | | Patch by Egor Churaev (echuraev). Reviewers: Anastasia Subscribers: bader, yaxunl, cfe-commits Differential Revision: https://reviews.llvm.org/D27300 llvm-svn: 288890
* [OpenCL] Prevent generation of globals in non-constant AS for OpenCL.Anastasia Stulova2016-11-291-1/+20
| | | | | | | | | | Avoid using shortcut for const qualified non-constant address space aggregate variables while generating them on the stack such that the alloca object is used instead of a global variable containing initializer. Review: https://reviews.llvm.org/D27109 llvm-svn: 288163
* [AMDGPU] Change frexp.exp builtin to return i16 for f16 inputKonstantin Zhuravlyov2016-11-182-3/+3
| | | | | | Differential Revision: https://reviews.llvm.org/D26863 llvm-svn: 287390
* [AMDGPU] Add wave barrier builtinStanislav Mekhanoshin2016-11-151-0/+7
| | | | | | | | | | | The wave barrier represents the discardable barrier. Its main purpose is to carry convergent attribute, thus preventing illegal CFG optimizations. All lanes in a wave come to convergence point simultaneously with SIMT, thus no special instruction is needed in the ISA. The barrier is discarded during code generation. Differential Revision: https://reviews.llvm.org/D26584 llvm-svn: 287006
* [OpenCL] Fix for integer parameters of enqueue_kernelAnastasia Stulova2016-11-141-58/+90
| | | | | | | | | | | | | | Make handling integer parameters more flexible: - For the number of events argument allow to pass larger integers than 32 bits as soon as compiler can prove that the range fits in 32 bits. If not, the diagnostic will be given. - Change type of the arguments specifying the sizes of the corresponding block arguments to be size_t. Review: https://reviews.llvm.org/D26509 llvm-svn: 286849
* [OpenCL] Change to clk_event parameter in enqueue_kernel.Anastasia Stulova2016-11-141-11/+18
| | | | | | | | | - Accept NULL pointer as a valid parameter value for clk_event. - Generate clk_event_t arguments of internal __enqueue_kernel_XXX function as pointers in generic address space. Review: https://reviews.llvm.org/D26507 llvm-svn: 286836
* Fix r286819 (accidentally patched multiple times.Pekka Jaaskelainen2016-11-141-18/+0
| | | | llvm-svn: 286821
* [OpenCL] always use SPIR address spaces for kernel_arg_addr_space MDPekka Jaaskelainen2016-11-141-0/+27
| | | | | | | | | | | | | | | It doesn't make sense to use the target's address space ids in this context as this is metadata that should be referring to the "logical" OpenCL address spaces. For flat AS machines like all "CPUs" in general, the logical AS info gets lost as there's only one address space (0). This commit changes the logic such that we always use the SPIR address space ids for the argument metadata. It thus allows implementing the clGetKernelArgInfo() and the other detection needs. https://reviews.llvm.org/D26157 llvm-svn: 286819
* Revert "Improve handling of floating point literals in OpenCL to only use ↵Renato Golin2016-11-141-13/+0
| | | | | | | | double precision if the target supports fp64." This reverts commit r286815, as it broke all ARM and AArch64 bots. llvm-svn: 286818
* Improve handling of floating point literals in OpenCL to only use double ↵Neil Hickey2016-11-141-0/+13
| | | | | | | | | | | | | | precision if the target supports fp64. This change makes sure single-precision floating point types are used if the cl_fp64 extension is not supported by the target. Also removed the check to see whether the OpenCL version is >= 1.2, as this has been incorporated into the extension setting code. Differential Revision: https://reviews.llvm.org/D24235 llvm-svn: 286815
* [AMDGPU] Add f16 builtin functions (VI+)Konstantin Zhuravlyov2016-11-132-64/+71
| | | | | | Differential Revision: https://reviews.llvm.org/D26476 llvm-svn: 286741
* clang/test/CodeGenOpenCL/convergent.cl: Satisfy -Asserts with "opt -instnamer".NAKAMURA Takumi2016-11-011-1/+1
| | | | llvm-svn: 285733
* [OpenCL] Mark group functions as convergent in opencl-c.hYaxun Liu2016-11-011-0/+118
| | | | | | | | | | Certain OpenCL builtin functions are supposed to be executed by all threads in a work group or sub group. Such functions should not be made divergent during transformation. It makes sense to mark them with convergent attribute. The adding of convergent attribute is based on Ettore Speziale's work and the original proposal and patch can be found at https://www.mail-archive.com/cfe-commits@lists.llvm.org/msg22271.html. Differential Revision: https://reviews.llvm.org/D25343 llvm-svn: 285725
* [OpenCL] Setting constant address space for array initializersAlexey Bader2016-10-312-2/+11
| | | | | | | | | | | | | | Summary: Setting constant address space for global constants used for memcpy-initialization of arrays. Patch by Alexey Sotkin. Reviewers: bader, yaxunl, Anastasia Subscribers: cfe-commits, AlexeySotkin Differential Revision: https://reviews.llvm.org/D25305 llvm-svn: 285557
* [OpenCL] Allow partial initializer for array and structYaxun Liu2016-10-111-0/+66
| | | | | | | | | | | | | | Currently Clang allows partial initializer for C99 but not for OpenCL, e.g. float a[16][16] = {1.0f, 2.0f}; is allowed in C99 but not allowed in OpenCL. This patch fixes that. Differential Revision: https://reviews.llvm.org/D25335 llvm-svn: 283891
* [OpenCL] Fix bug in __builtin_astype causing invalid LLVM cast instructionsYaxun Liu2016-10-031-0/+39
| | | | | | | | | | | | __builtin_astype is used to cast OpenCL opaque types to other types, as such, it needs to be able to handle casting from and to pointer types correctly. Current it cannot handle 1) casting between pointers of different addr spaces 2) casting between pointer type and non-pointer types. This patch fixes that. Differential Revision: https://reviews.llvm.org/D25123 llvm-svn: 283114
* [AMDGPU] Expose flat work group size, register and wave control attributesKonstantin Zhuravlyov2016-09-262-48/+166
| | | | | | | | | __attribute__((amdgpu_flat_work_group_size(<min>, <max>))) - request minimum and maximum flat work group size __attribute__((amdgpu_waves_per_eu(<min>[, <max>]))) - request minimum and/or maximum waves per execution unit Differential Revision: https://reviews.llvm.org/D24513 llvm-svn: 282371
* [OpenCL] Augment pipe built-ins with pipe packet size and alignment.Alexey Bader2016-09-232-20/+20
| | | | | | | | | | Reviewers: Anastasia, vpykhtin Subscribers: dmitry, cfe-commits Differential Revision: https://reviews.llvm.org/D23992 llvm-svn: 282252
* Reverting r281714 due to causing an assert when calling builtins that expect ↵Neil Hickey2016-09-191-13/+0
| | | | | | a double, from CL llvm-svn: 281899
* Improve handling of floating point literals in OpenCL to only use double ↵Neil Hickey2016-09-161-0/+13
| | | | | | | | precision if the target supports fp64 https://reviews.llvm.org/D24235 llvm-svn: 281714
* AMDGPU: Fix target options fp32/64-denormalsYaxun Liu2016-09-131-2/+8
| | | | | | | | | | | | | | Fix target options for fp32/64-denormals so that +fp64-denormals is set if fp64 is supported -fp32-denormals if fp32 denormals is not supported, or -cl-denorms-are-zero is set +fp32-denormals if fp32 denormals is supported and -cl-denorms-are-zero is not set If target feature fp32/64-denormals is explicitly set, they will override default options and options deduced from -cl-denorms-are-zero. Differential Revision: https://reviews.llvm.org/D24512 llvm-svn: 281357
* [OpenCL] Fix pipe built-in functions return type.Alexey Bader2016-09-071-0/+16
| | | | | | | | | | | | | | | By default return type of call expressions calling built-in functions is set to bool. Fixes https://llvm.org/bugs/show_bug.cgi?id=30219. Reviewers: Anastasia Subscribers: dmitry, cfe-commits, yaxunl Differential Revision: https://reviews.llvm.org/D24136 llvm-svn: 280800
* [OpenCL] Remove access qualifiers on images in arg info metadata.Alexey Bader2016-09-061-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | Summary: Remove access qualifiers on images in arg info metadata: * kernel_arg_type * kernel_arg_base_type Image access qualifiers are inseparable from type in clang implementation, but OpenCL spec provides a special query to get access qualifier via clGetKernelArgInfo with CL_KERNEL_ARG_ACCESS_QUALIFIER. Besides that OpenCL conformance test_api get_kernel_arg_info expects image types without access qualifier. Patch by Evgeniy Tyurin. Reviewers: bader, yaxunl, Anastasia Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D23915 llvm-svn: 280699
* AMDGPU: Handle structs directly in AMDGPUABIInfoMatt Arsenault2016-08-221-0/+66
| | | | | | | | | | | | | | | | Structs are currently handled as pointer + byval, which makes AMDGPU LLVM backend generate incorrect code when structs are used. This patch changes struct argument to be handled directly and without flattening, which Clover (Mesa 3D Gallium OpenCL state tracker) will be able to handle. Flattening would expand the struct to individual elements and pass each as a separate argument, which Clover can not handle. Furthermore, such expansion does not fit the OpenCL programming model which requires to explicitely specify each argument index, size and memory location. Patch by Vedran Miletić llvm-svn: 279463
* [AMDGPU] add s_incperflevel/s_decperflevel builtinsValery Pykhtin2016-08-192-0/+28
| | | | | | Differential revision: https://reviews.llvm.org/D23668 llvm-svn: 279235
* Re-commit [OpenCL] AMDGCN: Fix size_t typeYaxun Liu2016-08-191-0/+110
| | | | | | There was a premature cast to pointer type in emitPointerArithmetic which caused assertion in tests with assertion enabled. llvm-svn: 279206
* AMDGPU: Add clang builtin for ds_swizzle.Changpeng Fang2016-08-182-0/+11
| | | | | | | | | | | Summary: int __builtin_amdgcn_ds_swizzle (int a, int imm); while imm is a constant. Differential Revision: http://reviews.llvm.org/D23682 llvm-svn: 279165
* Revert [OpenCL] AMDGCN: Fix size_t typeYaxun Liu2016-08-181-110/+0
| | | | | | due to regressions in test/CodeGen/exprs.c on certain platforms. llvm-svn: 279127
* [OpenCL] AMDGCN: Fix size_t typeYaxun Liu2016-08-181-0/+110
| | | | | | | | Pointers of certain GPUs in AMDGCN target in private address space is 32 bit but pointers in other address spaces are 64 bit. size_t type should be defined as 64 bit for these GPUs so that it could hold pointers in all address spaces. Also fixed issues in pointer arithmetic codegen by using pointer specific intptr type. Differential Revision: https://reviews.llvm.org/D23361 llvm-svn: 279121
* [OpenCL] Fix typo in test that I accidentally introduced in my previous commit.Joey Gouly2016-08-101-1/+1
| | | | llvm-svn: 278235
* [OpenCL] Change block descriptor address space to constant.Joey Gouly2016-08-101-7/+7
| | | | | | | The block descriptor is a GlobalVariable in the LLVM IR, so it shouldn't be in the private address space. llvm-svn: 278234
* [OpenCL] Handle -cl-fp32-correctly-rounded-divide-sqrtYaxun Liu2016-08-091-4/+16
| | | | | | | | Let the driver pass the option to frontend. Do not set precision metadata for division instructions when this option is set. Set function attribute "correctly-rounded-divide-sqrt-fp-math" based on this option. Differential Revision: https://reviews.llvm.org/D22940 llvm-svn: 278155
* [OpenCL][AMDGPU] Add support for -cl-denorms-are-zeroYaxun Liu2016-08-091-1/+9
| | | | | | | | | | | | Adjust target features for amdgcn target when -cl-denorms-are-zero is set. Denormal support is controlled by feature strings fp32-denormals fp64-denormals in amdgcn target. If -cl-denorms-are-zero is not set and the command line does not set fp32/64-denormals feature string, +fp32-denormals +fp64-denormals will be on for GPU's supporting them. A new virtual function virtual void TargetInfo::adjustTargetOptions(const CodeGenOptions &CGOpts, TargetOptions &TargetOpts) const is introduced to allow adjusting target option by codegen option. Differential Revision: https://reviews.llvm.org/D22815 llvm-svn: 278151
* AMDGPU : Add Clang builtin intrinsics for compare with the fullWei Ding2016-08-052-0/+75
| | | | | | | | wavefront result. Differential Revision: http://reviews.llvm.org/D22934 llvm-svn: 277824
* [OpenCL] Add the lit test for image size which was omitted by r277647.Yaxun Liu2016-08-041-0/+18
| | | | llvm-svn: 277756
* [OpenCL] Added underscores to the names of 'to_addr' OpenCL built-ins.Alexey Bader2016-08-041-14/+14
| | | | | | | | | | | | | | | Summary: In order to re-define OpenCL built-in functions 'to_{private,local,global}' in OpenCL run-time library LLVM names must be different from the clang built-in function names. Reviewers: yaxunl, Anastasia Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D23120 llvm-svn: 277743
* [OpenCL] Generate opaque type for sampler_t and function call for the ↵Yaxun Liu2016-07-282-9/+71
| | | | | | | | | | | | | | | | initializer Currently Clang use int32 to represent sampler_t, which have been a source of issue for some backends, because in some backends sampler_t cannot be represented by int32. They have to depend on kernel argument metadata and use IPA to find the sampler arguments and global variables and transform them to target specific sampler type. This patch uses opaque pointer type opencl.sampler_t* for sampler_t. For each use of file-scope sampler variable, it generates a function call of __translate_sampler_initializer. For each initialization of function-scope sampler variable, it generates a function call of __translate_sampler_initializer. Each builtin library can implement its own __translate_sampler_initializer(). Since the real sampler type tends to be architecture dependent, allowing it to be initialized by a library function simplifies backend design. A typical implementation of __translate_sampler_initializer could be a table lookup of real sampler literal values. Since its argument is always a literal, the returned pointer is known at compile time and easily optimized to finally become some literal values directly put into image read instructions. This patch is partially based on Alexey Sotkin's work in Khronos Clang (https://github.com/KhronosGroup/SPIR/commit/3d4eec61623502fc306e8c67c9868be2b136e42b). Differential Revision: https://reviews.llvm.org/D21567 llvm-svn: 277024
OpenPOWER on IntegriCloud