summaryrefslogtreecommitdiffstats
path: root/clang/test/CodeGenOpenCL/amdgpu-attrs.cl
Commit message (Collapse)AuthorAgeFilesLines
* AMDGPU: Always emit amdgpu-flat-work-group-sizeMatt Arsenault2019-08-271-14/+21
| | | | | | | The backend default maximum should be the hardware maximum, so the frontend should set the implementation defined default maximum. llvm-svn: 370101
* [AMDGPU] Increased the number of implicit argument bytes for both OpenCL and ↵Christudasan Devadasan2019-07-101-25/+25
| | | | | | | | | | | | | HIP (CLANG). To enable a new implicit kernel argument, increased the number of argument bytes from 48 to 56. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D63756 llvm-svn: 365643
* [AMDGPU] Update OpenCL to use 48 bytes of implicit arguments for AMDGPU (CLANG)Tony Tye2018-03-231-25/+25
| | | | | | | | Add two additional implicit arguments for OpenCL for the AMDGPU target using the AMDHSA runtime to support device enqueue. Differential Revision: https://reviews.llvm.org/D44696 llvm-svn: 328350
* [AMDGPU] Remove use of OpenCL triple environment and replace with function ↵Tony Tye2018-03-231-26/+35
| | | | | | | | | | | attribute for AMDGPU (CLANG) - Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target. - Use a function attribute to communicate to the AMDGPU backend. Differential Revision: https://reviews.llvm.org/D43735 llvm-svn: 328347
* OpenCL: Assume functions are convergentMatt Arsenault2017-10-061-25/+25
| | | | | | | | | This was done for CUDA functions in r261779, and for the same reason this also needs to be done for OpenCL. An arbitrary function could have a barrier() call in it, which in turn requires the calling function to be convergent. llvm-svn: 315094
* IRGen: Add optnone attribute on function during O0Mehdi Amini2017-05-291-25/+25
| | | | | | | | | | | Amongst other, this will help LTO to correctly handle/honor files compiled with O0, helping debugging failures. It also seems in line with how we handle other options, like how -fnoinline adds the appropriate attribute as well. Differential Revision: https://reviews.llvm.org/D28404 llvm-svn: 304127
* [AMDGPU] Translate reqd_work_group_size into amdgpu_flat_work_group_sizeStanislav Mekhanoshin2017-04-061-0/+12
| | | | | | | | | | | | | These two attributes specify the same info in a different way. AMGPU BE only checks the latter as a target specific attribute as opposed to language specific reqd_work_group_size. This change produces amdgpu_flat_work_group_size out of reqd_work_group_size if specified. Differential Revision: https://reviews.llvm.org/D31728 llvm-svn: 299678
* Cleanup the handling of noinline function attributes, -fno-inline,Chandler Carruth2016-12-231-20/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | -fno-inline-functions, -O0, and optnone. These were really, really tangled together: - We used the noinline LLVM attribute for -fno-inline - But not for -fno-inline-functions (breaking LTO) - But we did use it for -finline-hint-functions (yay, LTO is happy!) - But we didn't for -O0 (LTO is sad yet again...) - We had weird structuring of CodeGenOpts with both an inlining enumeration and a boolean. They interacted in weird ways and needlessly. - A *lot* of set smashing went on with setting these, and then got worse when we considered optnone and other inlining-effecting attributes. - A bunch of inline affecting attributes were managed in a completely different place from -fno-inline. - Even with -fno-inline we failed to put the LLVM noinline attribute onto many generated function definitions because they didn't show up as AST-level functions. - If you passed -O0 but -finline-functions we would run the normal inliner pass in LLVM despite it being in the O0 pipeline, which really doesn't make much sense. - Lastly, we used things like '-fno-inline' to manipulate the pass pipeline which forced the pass pipeline to be much more parameterizable than it really needs to be. Instead we can *just* use the optimization level to select a pipeline and control the rest via attributes. Sadly, this causes a bunch of churn in tests because we don't run the optimizer in the tests and check the contents of attribute sets. It would be awesome if attribute sets were a bit more FileCheck friendly, but oh well. I think this is a significant improvement and should remove the semantic need to change what inliner pass we run in order to comply with the requested inlining semantics by relying completely on attributes. It also cleans up tho optnone and related handling a bit. One unfortunate aspect of this is that for generating alwaysinline routines like those in OpenMP we end up removing noinline and then adding alwaysinline. I tried a bunch of other approaches, but because we recompute function attributes from scratch and don't have a declaration here I couldn't find anything substantially cleaner than this. Differential Revision: https://reviews.llvm.org/D28053 llvm-svn: 290398
* [AMDGPU] Expose flat work group size, register and wave control attributesKonstantin Zhuravlyov2016-09-261-0/+166
__attribute__((amdgpu_flat_work_group_size(<min>, <max>))) - request minimum and maximum flat work group size __attribute__((amdgpu_waves_per_eu(<min>[, <max>]))) - request minimum and/or maximum waves per execution unit Differential Revision: https://reviews.llvm.org/D24513 llvm-svn: 282371
OpenPOWER on IntegriCloud