path: root/libclc/r600/lib/SOURCES
Commit message | Author | Age | Files | Lines
* r600: Convert barrier to clc | Jan Vesely | 2018-11-04 | 1 | -1/+1
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewer: Aaron Watry
  llvm-svn: 346078
* r600: Convert get_num_groups to clc | Jan Vesely | 2018-11-04 | 1 | -1/+1
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewer: Aaron Watry
  llvm-svn: 346077
* r600: Convert get_global_size to clc | Jan Vesely | 2018-11-04 | 1 | -1/+1
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewer: Aaron Watry
  llvm-svn: 346076
* r600: Convert get_local_size to clc | Jan Vesely | 2018-11-04 | 1 | -1/+1
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewer: Aaron Watry
  llvm-svn: 346075
* r600/fmin: Flush denormals before calling builtin. | Jan Vesely | 2018-06-07 | 1 | -0/+1
  Same reason as amdgcn.
  Fixes fmin, minmag CTS on turks.
  Reviewer: Tom Stellard <tstellar@redhat.com>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 334228
* r600/fmax: Flush denormals before calling builtin. | Jan Vesely | 2018-06-07 | 1 | -0/+1
  Same reason as amdgcn.
  Fixes fmax, maxmag CTS on turks.
  Reviewer: Tom Stellard <tstellar@redhat.com>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 334227
* amdgcn: Fix return type of get_num_groups | Matt Arsenault | 2016-08-25 | 1 | -0/+1
  llvm-svn: 279723
* amdgcn: Fix return type for get_global_size | Matt Arsenault | 2016-08-24 | 1 | -0/+1
  llvm-svn: 279644
* amdgcn: Fix get_local_size IR return type | Matt Arsenault | 2016-08-20 | 1 | -0/+1
  llvm-svn: 279350
* AMDGPU: Implement get_global_offset builtin | Jan Vesely | 2016-07-22 | 1 | -0/+1
  Also fix get_global_id to consider offset
  No idea how to add this for ptx, so they are stuck with the old get_global_id implementation.
  v2: split to a separate patch
  v3: Switch R600 to use implictarg.ptr
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 276443
* AMDGPU: Use clang intrinsics for workitem builtins | Jan Vesely | 2016-07-22 | 1 | -2/+3
  v2: split into 2 patches
      use clang builtins for other intrinsics as well
  v3: Fix warnings
      Switch r600 to use implictarg.ptr
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 276442
* amdgcn: Use new workitem intrinsics | Matt Arsenault | 2016-02-17 | 1 | -0/+2
  llvm-svn: 261042
* Split sources for amdgcn and r600 | Matt Arsenault | 2016-02-13 | 1 | -25/+0
  Most files remain in a common amdgpu directory.
  Also switches barriers to use convergent, and use llvm.amdgcn.s.barrier.
  This now requires 3.9/trunk to build amdgcn.
  llvm-svn: 260777
* r600: Add image writing builtins. | Tom Stellard | 2015-09-21 | 1 | -0/+4
  Patch by: Zoltan Gilian
  llvm-svn: 248161
* r600: Add image reading builtins. | Tom Stellard | 2015-09-21 | 1 | -0/+4
  Patch by: Zoltan Gilian
  llvm-svn: 248160
* Add image attribute getter builtins | Tom Stellard | 2015-09-21 | 1 | -0/+6
  Added get_image_* OpenCL builtins to the headers.
  Added implementation to the r600 target.
  Patch by: Zoltan Gilian
  llvm-svn: 248159
* R600: Implement accurate double precision sqrt v2 | Tom Stellard | 2015-07-10 | 1 | -0/+1
  v2:
  - Use same implementation for R600 and gcn.
  llvm-svn: 241907
* Implement ldexp for R600/SI | Tom Stellard | 2015-05-06 | 1 | -0/+1
  llvm-svn: 236638
* r600: Use llvm intrinsic to read work dimension information | Jan Vesely | 2014-10-15 | 1 | -0/+1
  v2: Fix function declaration
      Add range metadata to r600 implementation
  v3: change prefix to AMDGPU
  Reviewed-by: Tom Stellard <tom@stellard.net>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 219793
* vload/vstore: Use casts instead of scalarizing everything in CLC version | Aaron Watry | 2014-08-20 | 1 | -1/+0
  This generates bitcode which is indistinguishable from what was hand-written for int32 types in v[load|store]_impl.ll.
  v4: Use vec2+scalar for vec3 load/stores to prevent corruption (per Tom)
  v3: Also remove unused generic/lib/shared/v[load|store]_impl.ll
  v2: (Per Matt Arsenault) Fix alignment issues with vector load stores
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  CC: Matt Arsenault <Matthew.Arsenault@amd.com>
  CC: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 216069
* R600: Set the noduplicate attribute on barrier() intrinsics | Tom Stellard | 2013-10-31 | 1 | -1/+0
  This will prevent LLVM optimization passes from creating illegal uses of the barrier() intrinsic (e.g. calling barrier() from a conditional that is not executed by all threads).
  llvm-svn: 193753
* Implement nextafter() builtin | Tom Stellard | 2013-10-10 | 1 | -0/+1
  There are two implementations of nextafter():
  1. Using clang's __builtin_nextafter. Clang replaces this builtin with a call to nextafter which is part of libm. Therefore, this implementation will only work for targets with an implementation of libm (e.g. most CPU targets).
  2. The other implementation is written in OpenCL C. This function is known internally as __clc_nextafter and can be used by targets that don't have access to libm.
  llvm-svn: 192383
* Add atomic_inc and atomic_add builtins | Aaron Watry | 2013-09-05 | 1 | -0/+1
  Reviewed-by: Aaron Watry <awatry@gmail.com>
  llvm-svn: 190058
* Added get_num_groups | Aaron Watry | 2013-07-24 | 1 | -0/+1
  The get_num_groups function was missing for r600g. I did the same thing as the other workitem functions.
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  Reviewed-by: Aaron Watry <awatry@gmail.com>
  llvm-svn: 187059
* Fix and re-enable R600 vload/vstore assembly | Aaron Watry | 2013-07-16 | 1 | -0/+2
  The assembly optimizations were making unsafe assumptions about which address spaces had which identifiers.
  Also, fix vload/vstore with 64-bit pointers. This was broken previously on Radeon SI.
  This version still only has assembly versions of int/uint 2/4/8/16 for global loads and stores on R600, but it does it in a way that would be very easily extended to private/local/constant and could also be handled easily on other architectures.
  v2:
  1) Leave v[load|store]_impl.ll in generic/lib
  2) Remove vload_if.ll and vstore_if.ll interfaces
  3) Fix address+offset calculations
  4) Remove offset from assembly arg list
  llvm-svn: 186416
* Implement barrier() builtin | Tom Stellard | 2013-07-08 | 1 | -0/+2
  Reviewed and Tested-by: Aaron Watry <awatry@gmail.com>
  llvm-svn: 185837
* R600: Replace cl implementations with LLVM IR implementation | Tom Stellard | 2013-06-26 | 1 | -2/+4
  This allows libclc to be built for R600 with upstream clang and LLVM.
  llvm-svn: 184980
* r600: Add get_global_size() implementation | Tom Stellard | 2013-06-26 | 1 | -0/+1
  llvm-svn: 184977
* r600: Initial support | Tom Stellard | 2013-06-26 | 1 | -0/+1
  This includes a get_global_id() implementation and function stubs for the other workitem and synchronization functions.
  llvm-svn: 184975