path: root/clang/test/CodeGenOpenCL
Commit message | Author | Age | Files | Lines
...
* [AMDGPU] Require at least protected visibility for certain symbols | Scott Linder | 2019-02-12 | 1 | -0/+77
  This allows the global visibility controls to be restrictive while still populating the dynamic symbol table where required.
  Differential Revision: https://reviews.llvm.org/D56871
  llvm-svn: 353870
* [AMDGPU] Split dot-insts feature | Stanislav Mekhanoshin | 2019-02-09 | 2 | -15/+15
  Differential Revision: https://reviews.llvm.org/D57972
  llvm-svn: 353588
* [opaque pointer types] Cleanup CGBuilder's Create*GEP. | James Y Knight | 2019-02-08 | 4 | -39/+39
  Some of these functions take some extraneous arguments, e.g. EltSize, Offset, which are computable from the Type and DataLayout.

  Add some asserts to ensure that the computed values are consistent with the passed-in values, in preparation for eliminating the extraneous arguments. This also asserts that the Type is an Array for the calls named "Array" and a Struct for the calls named "Struct".

  Then, correct a couple of errors:

  1. Using CreateStructGEP on an array type. (this causes the majority of the test differences, as struct GEPs are created with i32 indices, while array GEPs are created with i64 indices)

  2. Passing the wrong Offset to CreateStructGEP in TargetInfo.cpp on x86-64 NACL (which uses 32-bit pointers).

  Differential Revision: https://reviews.llvm.org/D57766
  llvm-svn: 353529
* [OpenCL] Fixed addr space mangling test. | Anastasia Stulova | 2019-01-31 | 1 | -5/+5
  Fixed typo in the FileCheck directive and changed the test to verify output correctly.
  Fixes PR40029!
  llvm-svn: 352760
* Revert "OpenCL: Extend argument promotion rules to vector types" | Matt Arsenault | 2019-01-29 | 1 | -11/+9
  This reverts r348083. This was based on a misreading of the spec for printf specifiers.
  Also revert r343653, as without a subsequent patch, a correctly specified format for a vector will incorrectly warn.
  Fixes bug 40491.
  llvm-svn: 352539
* AMDGPU: Add ds append/consume builtins | Matt Arsenault | 2019-01-28 | 1 | -0/+12
  llvm-svn: 352443
* [AMDGPU] Add interpolation builtins | Tim Corringham | 2019-01-28 | 1 | -0/+34
  Summary: Added builtins for the interpolation intrinsics, and related LIT test.
  Reviewers: arsenm, tpr, dstuttard, #amdgpu
  Reviewed By: arsenm, #amdgpu
  Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, cfe-commits
  Differential Revision: https://reviews.llvm.org/D46871
  llvm-svn: 352358
* [AMDGPU] Separate feature dot-insts | Stanislav Mekhanoshin | 2019-01-10 | 2 | -15/+15
  Differential Revision: https://reviews.llvm.org/D56525
  llvm-svn: 350794
* Fix opencl test broken on windows by r350643. | Erich Keane | 2019-01-08 | 1 | -2/+2
  Windows doesn't allow common with alignment >32 bits, so these tests were broken in windows mode. This patch makes 'common' optional in these cases.
  Change-Id: I4d5fdd07ecdafc3570ef9b09cd816c2e5e4ed15e
  llvm-svn: 350645
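The idea of making 'common' optional in a check can be illustrated with an ordinary regular expression. The sketch below is Python rather than the actual FileCheck directive from the test; the pattern, the helper name `matches_global_definition`, and the sample IR lines are assumptions made up for illustration.

```python
import re

# Hypothetical pattern: the "common " linkage keyword may or may not be present.
GLOBAL_DEF = re.compile(r"^@(\w+) = (common )?global ")


def matches_global_definition(ir_line: str) -> bool:
    """Return True for a global definition with or without 'common' linkage."""
    return GLOBAL_DEF.match(ir_line) is not None


# Both spellings are accepted, so a single check can cover both platforms.
print(matches_global_definition("@x = common global i32 0, align 4"))
print(matches_global_definition("@x = global i32 0, align 4"))
```

A check written this way tolerates the linkage difference while still pinning down the rest of the definition.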
* [OpenCL][CodeGen] Fix replacing memcpy with addrspacecast | Andrew Savonichev | 2018-12-10 | 1 | -3/+12
  Summary:
  If a function argument is byval and RV is located in the default or alloca address space, an optimization of creating an addrspacecast instead of a memcpy is performed. That is not correct for OpenCL, where that can lead to a situation of address space casting from __private * to __global *. See an example below:

```
typedef struct {
  int x;
} MyStruct;

void foo(MyStruct val) {}

kernel void KernelOneMember(__global MyStruct* x) {
  foo (*x);
}
```

  For this code clang generated the following IR:
    ...
    %0 = load %struct.MyStruct addrspace(1)*, %struct.MyStruct addrspace(1)** %x.addr, align 4
    %1 = addrspacecast %struct.MyStruct addrspace(1)* %0 to %struct.MyStruct*
    ...

  So the optimization was disallowed for OpenCL if RV is located in an address space different than that of the argument (0).

  Reviewers: yaxunl, Anastasia
  Reviewed By: Anastasia
  Subscribers: cfe-commits, asavonic
  Differential Revision: https://reviews.llvm.org/D54947
  llvm-svn: 348752
* OpenCL: Extend argument promotion rules to vector types | Matt Arsenault | 2018-12-01 | 1 | -0/+39
  The spec is ambiguous on whether vector types are allowed to be implicitly converted. The only legal context I think this can be used for OpenCL is printf, where it seems necessary.
  llvm-svn: 348083
* Derive builtin return type from its definition | Marco Antognini | 2018-11-27 | 2 | -22/+83
  Summary:
  Prior to this patch, OpenCL code such as the following would attempt to create a BranchInst with a non-bool argument:

      if (enqueue_kernel(get_default_queue(), 0, nd, ^(void){})) /* ... */

  This patch is a follow up on a similar issue with pipe builtin operations. See commit r280800 and https://bugs.llvm.org/show_bug.cgi?id=30219.

  This change, while being conservative on non-builtin functions, should set the type of expressions invoking builtins to the proper type, instead of defaulting to `bool` and requiring manual overrides in Sema::CheckBuiltinFunctionCall.

  In addition to tests for enqueue_kernel, the tests are extended to check other OpenCL builtins.

  Reviewers: Anastasia, spatel, rsmith
  Reviewed By: Anastasia
  Subscribers: kristina, cfe-commits, svenvh
  Differential Revision: https://reviews.llvm.org/D52879
  llvm-svn: 347658
* CGDecl::emitStoresForConstant fix synthesized constant's name | JF Bastien | 2018-11-15 | 4 | -8/+8
  Summary: The name of the synthesized constants for constant initialization was using mangling for statics, which isn't generally correct and (in a yet-uncommitted patch) causes the mangler to assert out because the static ends up trying to mangle function parameters and this makes no sense. Instead, mangle to `"__const." + FunctionName + "." + DeclName`.
  Reviewers: rjmccall
  Subscribers: dexonsmith, cfe-commits
  Differential Revision: https://reviews.llvm.org/D54055
  llvm-svn: 346915
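The naming scheme quoted in this commit message amounts to plain string concatenation. The following is a minimal Python illustration of that scheme only, not the CGDecl code; the function name `synthesized_constant_name`, its parameters, and the sample inputs are hypothetical.

```python
def synthesized_constant_name(function_name: str, decl_name: str) -> str:
    """Build "__const.<function>.<decl>" as described in the commit message."""
    return "__const." + function_name + "." + decl_name


# Hypothetical inputs: a local array `arr` initialized inside function `foo`.
print(synthesized_constant_name("foo", "arr"))  # prints "__const.foo.arr"
```

Because the name is built from the enclosing function and the declaration, it does not go through the mangler for statics at all.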
* [OpenCL] Fix invalid address space generation for clk_event_t | Alexey Sotkin | 2018-11-14 | 1 | -1/+8
  Summary: Addrspace(32) was generated when putting 0 in clk_event_t * event_ret parameter for enqueue_kernel function.
  Patch by Viktoria Maksimova
  Reviewers: Anastasia, yaxunl, AlexeySotkin
  Reviewed By: Anastasia, AlexeySotkin
  Subscribers: cfe-commits
  Differential Revision: https://reviews.llvm.org/D53809
  llvm-svn: 346838
* [OpenCL] Add support of cl_intel_device_side_avc_motion_estimation extension | Andrew Savonichev | 2018-11-08 | 1 | -0/+81
  Summary: Documentation can be found at https://www.khronos.org/registry/OpenCL/extensions/intel/cl_intel_device_side_avc_motion_estimation.txt
  Patch by Kristina Bessonova
  Reviewers: Anastasia, yaxunl, shafik
  Reviewed By: Anastasia
  Subscribers: arphaman, sidorovd, AlexeySotkin, krisb, bader, asavonic, cfe-commits
  Differential Revision: https://reviews.llvm.org/D51484
  llvm-svn: 346392
* Revert r346326 [OpenCL] Add support of cl_intel_device_side_avc_motion_estimation | Andrew Savonichev | 2018-11-07 | 1 | -81/+0
  This patch breaks Index/opencl-types.cl LIT test:

  Script:
  --
  : 'RUN: at line 1'; stage1/bin/c-index-test -test-print-type llvm/tools/clang/test/Index/opencl-types.cl -cl-std=CL2.0 | stage1/bin/FileCheck llvm/tools/clang/test/Index/opencl-types.cl
  --
  Command Output (stderr):
  --
  llvm/tools/clang/test/Index/opencl-types.cl:3:26: warning: unsupported OpenCL extension 'cl_khr_fp16' - ignoring [-Wignored-pragmas]
  llvm/tools/clang/test/Index/opencl-types.cl:4:26: warning: unsupported OpenCL extension 'cl_khr_fp64' - ignoring [-Wignored-pragmas]
  llvm/tools/clang/test/Index/opencl-types.cl:8:9: error: use of type 'double' requires cl_khr_fp64 extension to be enabled
  llvm/tools/clang/test/Index/opencl-types.cl:11:8: error: declaring variable of type 'half' is not allowed
  llvm/tools/clang/test/Index/opencl-types.cl:15:3: error: use of type 'double' requires cl_khr_fp64 extension to be enabled
  llvm/tools/clang/test/Index/opencl-types.cl:16:3: error: use of type 'double4' (vector of 4 'double' values) requires cl_khr_fp64 extension to be enabled
  llvm/tools/clang/test/Index/opencl-types.cl:26:26: warning: unsupported OpenCL extension 'cl_khr_gl_msaa_sharing' - ignoring [-Wignored-pragmas]
  llvm/tools/clang/test/Index/opencl-types.cl:35:44: error: use of type '__read_only image2d_msaa_t' requires cl_khr_gl_msaa_sharing extension to be enabled
  llvm/tools/clang/test/Index/opencl-types.cl:36:49: error: use of type '__read_only image2d_array_msaa_t' requires cl_khr_gl_msaa_sharing extension to be enabled
  llvm/tools/clang/test/Index/opencl-types.cl:37:49: error: use of type '__read_only image2d_msaa_depth_t' requires cl_khr_gl_msaa_sharing extension to be enabled
  llvm/tools/clang/test/Index/opencl-types.cl:38:54: error: use of type '__read_only image2d_array_msaa_depth_t' requires cl_khr_gl_msaa_sharing extension to be enabled

  llvm-svn: 346338
* [OpenCL] Add support of cl_intel_device_side_avc_motion_estimation extension | Andrew Savonichev | 2018-11-07 | 1 | -0/+81
  Summary: Documentation can be found at https://www.khronos.org/registry/OpenCL/extensions/intel/cl_intel_device_side_avc_motion_estimation.txt
  Patch by Kristina Bessonova
  Reviewers: Anastasia, yaxunl, shafik
  Reviewed By: Anastasia
  Subscribers: arphaman, sidorovd, AlexeySotkin, krisb, bader, asavonic, cfe-commits
  Differential Revision: https://reviews.llvm.org/D51484
  llvm-svn: 346326
* [CodeGen] Update min-legal-vector width based on function argument and return types | Craig Topper | 2018-10-24 | 1 | -4/+8
  This is a continuation of my patches to inform the X86 backend about what the largest IR types are in the function so that we can restrict the backend type legalizer to prevent 512-bit vectors on SKX when -mprefer-vector-width=256 is specified if no explicit 512 bit vectors were specified by the user.

  This patch updates the vector width based on the argument and return types of the current function and from the types of any functions it calls. This is intended to make sure the backend type legalizer doesn't disturb any types that are required for ABI.

  Differential Revision: https://reviews.llvm.org/D52441
  llvm-svn: 345168
* AMDGPU: add __builtin_amdgcn_update_dpp | Yaxun Liu | 2018-10-17 | 1 | -1/+8
  Emit llvm.amdgcn.update.dpp for both __builtin_amdgcn_mov_dpp and __builtin_amdgcn_update_dpp. The first argument to llvm.amdgcn.update.dpp will be undef for __builtin_amdgcn_mov_dpp.
  Differential Revision: https://reviews.llvm.org/D52320
  llvm-svn: 344665
* [OpenCL] Add block argument CodeGen test | Sven van Haastregt | 2018-10-02 | 1 | -0/+20
  r326937 ("[OpenCL] Remove block invoke function from emitted block literal struct", 2018-03-07) broke block argument handling. In particular the commit was causing a crash during code generation, see the discussion in https://reviews.llvm.org/D43783 .
  The offending commit has just been reverted; add a test to avoid breaking this again in the future.
  llvm-svn: 343583
* Revert r326937 "[OpenCL] Remove block invoke function from emitted block literal struct" | Sven van Haastregt | 2018-10-02 | 3 | -67/+91
  This reverts r326937 as it broke block argument handling in OpenCL. See the discussion on https://reviews.llvm.org/D43783 .
  The next commit will add a test case that revealed the issue.
  llvm-svn: 343582
* AMDGPU: Add another missing builtin | Matt Arsenault | 2018-08-09 | 1 | -0/+7
  llvm-svn: 339395
* AMDGPU: Fix enabling denormals by default on pre-VI targets | Matt Arsenault | 2018-08-08 | 2 | -9/+29
  Fast FMAF is not a sufficient condition to enable denormals. Before VI, enabling denormals caused F32 instructions to run at F64 speeds.
  llvm-svn: 339278
* [DebugInfo][OpenCL] Address post-commit review for r338299 | Scott Linder | 2018-08-08 | 1 | -8/+10
  NFC refactor of code to generate debug info for OpenCL 2.X blocks.
  Differential Revision: https://reviews.llvm.org/D50099
  llvm-svn: 339265
* Fix one hard coded value I missed in r339185. | Douglas Yung | 2018-08-07 | 1 | -1/+1
  llvm-svn: 339188
* Make test more robust by not checking hard coded debug info values, but instead check the relationships between them. | Douglas Yung | 2018-08-07 | 1 | -3/+5
  llvm-svn: 339185
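Checking a relationship instead of a hard coded value can be sketched as "capture a number once, then require the same number later". The Python below only mimics that idea with regular expressions; it is not the test from this commit, and the helper name `debug_location_is_defined` and the sample IR text are assumptions for illustration.

```python
import re

# Hypothetical IR excerpt: a call tagged with a debug location node.
SAMPLE_IR = """\
  call void @foo(), !dbg !12
!12 = !DILocation(line: 3, scope: !7)
"""


def debug_location_is_defined(ir_text: str) -> bool:
    """Capture the node number used by '!dbg' and require a matching definition."""
    use = re.search(r"!dbg !(\d+)", ir_text)
    if use is None:
        return False
    node_id = use.group(1)  # whatever number the compiler happened to assign
    definition = re.search(r"^!" + re.escape(node_id) + r" = ", ir_text, re.MULTILINE)
    return definition is not None


print(debug_location_is_defined(SAMPLE_IR))
```

The check keeps passing if the metadata nodes are renumbered, because only the link between the use and the definition is tested.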
* [OpenCL] Restore r338899 (reverted in r338904), fixing stack-use-after-return | Scott Linder | 2018-08-07 | 2 | -66/+142
  Always emit alloca in entry block for enqueue_kernel builtin.
  Ensures the statically sized alloca is not converted to DYNAMIC_STACKALLOC later because it is not in the entry block.
  llvm-svn: 339150
* AMDGPU: Add builtin for s_dcache_wb | Matt Arsenault | 2018-08-07 | 2 | -3/+10
  llvm-svn: 339110
* AMDGPU: Add builtin for s_dcache_inv_vol | Matt Arsenault | 2018-08-07 | 2 | -2/+22
  llvm-svn: 339109
* Revert "[OpenCL] Always emit alloca in entry block for enqueue_kernel builtin" | Vlad Tsyrklevich | 2018-08-03 | 2 | -142/+66
  This reverts commit r338899, it was causing ASan test failures on sanitizer-x86_64-linux-fast.
  llvm-svn: 338904
* [OpenCL] Always emit alloca in entry block for enqueue_kernel builtin | Scott Linder | 2018-08-03 | 2 | -66/+142
  Ensures the statically sized alloca is not converted to DYNAMIC_STACKALLOC later because it is not in the entry block.
  Differential Revision: https://reviews.llvm.org/D50104
  llvm-svn: 338899
* AMDGPU: Fix missing declaration of queue ptr builtin | Matt Arsenault | 2018-08-02 | 1 | -0/+7
  llvm-svn: 338754
* Try to make builtin address space declarations not useless | Matt Arsenault | 2018-08-02 | 2 | -22/+78
  The way address space declarations for builtins currently work is nearly useless. The code assumes the address space used for builtins is a confusingly named "target address space" from user code using __attribute__((address_space(N))) that matches the builtin declaration. There's no way to use this to declare a builtin that returns a language specific address space. The terminology used is highly confusing since it has nothing to do with the address space selected by the target to use for a language address space.

  This feature is essentially unused as-is. AMDGPU and NVPTX are the only in-tree targets attempting to use this. The AMDGPU builtins certainly do not behave as intended (i.e. all of the builtins returning pointers can never compile because the numbered address space never matches the expected named address space).

  The NVPTX builtins are missing tests for some, and the others seem to rely on an implicit addrspacecast.

  Change the used address space for builtins based on a target hook to allow using a language address space for a builtin. This allows the same builtin declaration to be used for multiple languages with similarly purposed address spaces (e.g. the same AMDGPU builtin can be used in OpenCL and CUDA even though the constant address spaces are arbitrarily different).

  This breaks the possibility of using arbitrary numbered address spaces alongside the named address spaces for builtins. If this is an issue we probably need to introduce another builtin declaration character to distinguish language address spaces from so-called "target address spaces".

  llvm-svn: 338707
* AMDGPU: Add clamp bit to dot builtins | Konstantin Zhuravlyov | 2018-08-01 | 3 | -21/+76
  Differential Revision: https://reviews.llvm.org/D50011
  llvm-svn: 338471
* [DebugInfo][OpenCL] Generate correct block literal debug info for OpenCL | Scott Linder | 2018-07-30 | 1 | -2/+10
  OpenCL block literal structs have different fields which are now correctly identified in the debug info.
  Differential Revision: https://reviews.llvm.org/D49930
  llvm-svn: 338299
* CodeGen: specify alignment + inbounds for automatic variable initialization | JF Bastien | 2018-07-13 | 1 | -5/+5
  Summary: Automatic variable initialization was generating default-aligned stores (which are deprecated) instead of using the known alignment from the alloca. Further, they didn't specify inbounds.
  Subscribers: dexonsmith, cfe-commits
  Differential Revision: https://reviews.llvm.org/D49209
  llvm-svn: 337041
* [AMDGPU] fixes for lds f32 builtins | Daniil Fukalov | 2018-05-21 | 1 | -6/+6
  1. added restrictions to memory scope, order and volatile parameters
  2. added custom processing for these builtins; this code is currently unused and is needed to switch off the GCCBuiltin link to the builtins (ongoing change to the llvm tree)
  3. builtins renamed as requested
  Differential Revision: https://reviews.llvm.org/D43281
  llvm-svn: 332848
* [OpenCL] make test independent of optimizer | Sanjay Patel | 2018-05-16 | 1 | -7/+8
  There shouldn't be any tests that run the entire optimizer here, but the last test in this file is definitely going to break with a change in LLVM IR canonicalization. Change that part to check the unoptimized IR because that's the real intent of this file.
  llvm-svn: 332473
* [OpenCL] Fix typos in emitted enqueue kernel function names | Yaxun Liu | 2018-05-09 | 1 | -9/+9
  Two typos:
    vaarg => vararg
    get_kernel_preferred_work_group_multiple => get_kernel_preferred_work_group_size_multiple
  Differential Revision: https://reviews.llvm.org/D46601
  llvm-svn: 331895
* [OpenCL] Add constant address space to __func__ in AST. | Anastasia Stulova | 2018-05-09 | 1 | -3/+9
  Added string literal helper function to obtain the type attributed by a constant address space.
  Also fixed predefined __func__ expr to use the helper to construct the string literal correctly.
  Differential Revision: https://reviews.llvm.org/D46049
  llvm-svn: 331877
* Add Microsoft Mangling for OpenCL Half Type | Erich Keane | 2018-05-01 | 1 | -0/+1
  Half-type mangling is accomplished following the method introduced by Erich Keane for mangling _Float16. Updated the half.cl LIT test to cover this particular case.
  Patch By: vbridgers
  Differential Revision: https://reviews.llvm.org/D46131
  llvm-svn: 331263
* AMDGPU: Add Vega12 and Vega20 | Matt Arsenault | 2018-04-30 | 3 | -0/+73
  Changes by
    Matt Arsenault
    Konstantin Zhuravlyov
  llvm-svn: 331216
* [OpenCL] Add separate read_only and write_only pipe IR types | Sven van Haastregt | 2018-04-27 | 3 | -38/+57
  SPIR-V encodes the read_only and write_only access qualifiers of pipes, so separate LLVM IR types are required to target SPIR-V. Other backends may also find this useful.

  These new types are `opencl.pipe_ro_t` and `opencl.pipe_wo_t`, which replace `opencl.pipe_t`.

  This replaces __get_pipe_num_packets(...) and __get_pipe_max_packets(...) which took a read_only pipe with separate versions for read_only and write_only pipes, namely:

   * __get_pipe_num_packets_ro(...)
   * __get_pipe_num_packets_wo(...)
   * __get_pipe_max_packets_ro(...)
   * __get_pipe_max_packets_wo(...)

  These separate versions exist to avoid needing a bitcast to one of the two qualified pipe types.

  Patch by Stuart Brady.
  Differential Revision: https://reviews.llvm.org/D46015
  llvm-svn: 331026
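The mapping from access qualifier to IR type and runtime function name described here can be laid out as a small table. This Python sketch only restates the names given in the commit message; the helper functions `pipe_ir_type` and `pipe_query_function`, and the short query keys `num_packets` and `max_packets`, are hypothetical and not part of clang.

```python
# Names taken from the commit message, keyed by OpenCL access qualifier.
PIPE_IR_TYPES = {
    "read_only": "opencl.pipe_ro_t",
    "write_only": "opencl.pipe_wo_t",
}
PIPE_SUFFIXES = {"read_only": "ro", "write_only": "wo"}


def pipe_ir_type(access: str) -> str:
    """Return the LLVM IR type name used for a pipe with this qualifier."""
    return PIPE_IR_TYPES[access]


def pipe_query_function(query: str, access: str) -> str:
    """Return the qualifier-specific function name for a pipe query."""
    if query not in ("num_packets", "max_packets"):
        raise ValueError("unknown pipe query: " + query)
    return "__get_pipe_" + query + "_" + PIPE_SUFFIXES[access]


print(pipe_ir_type("write_only"))
print(pipe_query_function("num_packets", "read_only"))
```

Selecting the function by qualifier is what removes the need for a bitcast between the two pipe types.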
* Fix some tests that were failing on Windows | Hans Wennborg | 2018-04-20 | 1 | -2/+2
  llvm-svn: 330441
* [OpenCL] Add 'denorms-are-zero' function attribute | Alexey Sotkin | 2018-04-20 | 1 | -13/+19
  Summary: Generate attribute 'denorms-are-zero'='true' if '-cl-denorms-are-zero' compile option was specified and 'denorms-are-zero'='false' otherwise.
  Patch by krisb
  Reviewers: Anastasia, yaxunl
  Reviewed By: yaxunl
  Subscribers: cfe-commits
  Differential Revision: https://reviews.llvm.org/D45808
  llvm-svn: 330404
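The rule in this summary is a two-way choice driven by one compile option. A minimal Python sketch of that rule follows; it is not clang's implementation, and the function name `denorms_are_zero_attribute` and the list-of-strings option format are assumptions.

```python
def denorms_are_zero_attribute(compile_options):
    """Return the (name, value) pair of the function attribute to emit."""
    enabled = "-cl-denorms-are-zero" in compile_options
    return ("denorms-are-zero", "true" if enabled else "false")


# The attribute is always present; only its value depends on the option.
print(denorms_are_zero_attribute(["-cl-std=CL2.0", "-cl-denorms-are-zero"]))
print(denorms_are_zero_attribute(["-cl-std=CL2.0"]))
```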
* Fix typos in clang | Alexander Kornienko | 2018-04-06 | 1 | -2/+2
  Found via codespell -q 3 -I ../clang-whitelist.txt
  Where whitelist consists of:
    archtype cas classs checkk compres definit frome iff inteval ith lod methode nd optin ot pres statics te thru
  Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few files that have dubious fixes reverted.)
  Differential revision: https://reviews.llvm.org/D44188
  llvm-svn: 329399
* AMDGPU: Update datalayout for stack alignment | Matt Arsenault | 2018-03-27 | 1 | -1/+1
  llvm-svn: 328657
* [AMDGPU] Fix codegen for inline assembly | Yaxun Liu | 2018-03-23 | 1 | -0/+8
  Need to override convertConstraint to recognise amdgpu specific register names.
  Differential Revision: https://reviews.llvm.org/D44533
  llvm-svn: 328359
* [AMDGPU] Update OpenCL to use 48 bytes of implicit arguments for AMDGPU (CLANG) | Tony Tye | 2018-03-23 | 1 | -25/+25
  Add two additional implicit arguments for OpenCL for the AMDGPU target using the AMDHSA runtime to support device enqueue.
  Differential Revision: https://reviews.llvm.org/D44696
  llvm-svn: 328350
* [AMDGPU] Remove use of OpenCL triple environment and replace with function attribute for AMDGPU (CLANG) | Tony Tye | 2018-03-23 | 1 | -26/+35
  - Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target.
  - Use a function attribute to communicate to the AMDGPU backend.
  Differential Revision: https://reviews.llvm.org/D43735
  llvm-svn: 328347