path: root/libclc/r600/lib
Commit message | Author | Age | Files | Lines
* r600: Remove empty OVERRIDES file | Jan Vesely | 2018-11-27 | 1 | -0/+0
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewer: Aaron Watry
    llvm-svn: 347666
* r600: Add datalayout to image builtin implementation | Jan Vesely | 2018-11-10 | 3 | -0/+6
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewer: Aaron Watry
    llvm-svn: 346597
* r600: Convert barrier to clc | Jan Vesely | 2018-11-04 | 12 | -35/+10
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewer: Aaron Watry
    llvm-svn: 346078
* r600: Convert get_num_groups to clc | Jan Vesely | 2018-11-04 | 12 | -49/+16
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewer: Aaron Watry
    llvm-svn: 346077
* r600: Convert get_global_size to clc | Jan Vesely | 2018-11-04 | 12 | -49/+16
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewer: Aaron Watry
    llvm-svn: 346076
* r600: Convert get_local_size to clc | Jan Vesely | 2018-11-04 | 12 | -49/+16
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewer: Aaron Watry
    llvm-svn: 346075
* r600/fmin: Flush denormals before calling builtin. | Jan Vesely | 2018-06-07 | 2 | -0/+31
    Same reason as amdgcn.
    Fixes fmin, minmag CTS on turks.
    Reviewer: Tom Stellard <tstellar@redhat.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 334228
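The idea of flushing denormals before the comparison can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the code from this commit: `sketch_flush_denormal` and `sketch_fmin` are hypothetical names, and the sketch flushes unconditionally, whereas a real implementation would presumably do so only on devices without denormal support.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: replace a subnormal float by a zero of the same sign.
 * A float is subnormal when its exponent field is 0 and its mantissa is not. */
static float sketch_flush_denormal(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    if ((bits & 0x7f800000u) == 0)   /* exponent field is zero */
        bits &= 0x80000000u;         /* keep only the sign bit */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Hypothetical fmin: flush both operands first, then compare.
 * If exactly one operand is NaN, the other operand is returned. */
static float sketch_fmin(float a, float b)
{
    a = sketch_flush_denormal(a);
    b = sketch_flush_denormal(b);
    if (a != a)
        return b;
    if (b != b)
        return a;
    return (b < a) ? b : a;
}
```

With this sketch, `sketch_fmin(1.0f, 0x1p-140f)` yields zero because the subnormal operand is flushed before the comparison.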
* r600/fmax: Flush denormals before calling builtin. | Jan Vesely | 2018-06-07 | 2 | -0/+30
    Same reason as amdgcn.
    Fixes fmax, maxmag CTS on turks.
    Reviewer: Tom Stellard <tstellar@redhat.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 334227
* r600: Update datalayout after LLVM r328656 | Jan Vesely | 2018-04-05 | 4 | -4/+4
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 329291
* r600: Fix datalayout after clang r324101 | Jan Vesely | 2018-02-23 | 16 | -4/+109
    r324101 switched around AS numbering
    Acked-by: Aaron Watry <awatry@gmail.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 325864
* r600: Add missing datalayout to .ll files | Jan Vesely | 2017-10-20 | 4 | -0/+8
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Acked-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 316238
* Make image builtins r600/llvm-3.9 only | Jan Vesely | 2017-10-10 | 16 | -0/+356
    The implementation uses r600-specific intrinsics.
    LLVM-4 switched to _ro_t and _rw_t image types.
    Portions of the code can be moved back as more targets/llvm versions add image support.
    Reviewer: Aaron Watry
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 315341
* Let get_work_dim take exactly 0 arguments | Jeroen Ketema | 2017-10-01 | 1 | -1/+1
    Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 314634
* r600: Cleanup barrier implementation. | Jan Vesely | 2017-09-04 | 1 | -26/+5
    We don't have memory fences for r600 so just call group barrier directly
    Make sure that barrier is called even with 0 flags
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 312492
* amdgcn: Fix return type of get_num_groups | Matt Arsenault | 2016-08-25 | 2 | -0/+19
    llvm-svn: 279723
* amdgcn: Fix return type for get_global_size | Matt Arsenault | 2016-08-24 | 2 | -0/+19
    llvm-svn: 279644
* amdgpu: Fix default case value for get_local_size | Matt Arsenault | 2016-08-20 | 1 | -1/+1
    llvm-svn: 279359
* amdgcn: Fix get_local_size IR return type | Matt Arsenault | 2016-08-20 | 2 | -0/+19
    llvm-svn: 279350
* AMDGPU: Implement get_global_offset builtin | Jan Vesely | 2016-07-22 | 2 | -0/+12
    Also fix get_global_id to consider offset
    No idea how to add this for ptx, so they are stuck with the old get_global_id implementation.
    v2: split to a separate patch
    v3: Switch R600 to use implicitarg.ptr
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 276443
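For reference, the OpenCL relationship between the global ID and the global offset can be sketched in C as below. This illustrates the formula only and is not the code from this commit; the struct and function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical per-dimension dispatch state; field names are illustrative. */
struct sketch_work_item_dim {
    size_t group_id;       /* what get_group_id(dim) would return */
    size_t local_size;     /* what get_local_size(dim) would return */
    size_t local_id;       /* what get_local_id(dim) would return */
    size_t global_offset;  /* what get_global_offset(dim) would return */
};

/* Global ID in one dimension, including the global offset:
 * group_id * local_size + local_id + global_offset. */
static size_t sketch_get_global_id(const struct sketch_work_item_dim *d)
{
    return d->group_id * d->local_size + d->local_id + d->global_offset;
}
```

For example, work-item 5 of group 2 with a local size of 64 and a global offset of 1000 gets global ID 1133. Without the offset term the same work-item would report 133.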
* AMDGPU: Use clang intrinsics for workitem builtins | Jan Vesely | 2016-07-22 | 6 | -62/+34
    v2: split into 2 patches
        use clang builtins for other intrinsics as well
    v3: Fix warnings
        Switch r600 to use implicitarg.ptr
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 276442
* R600: Use new barrier intrinsic | Matt Arsenault | 2016-07-18 | 1 | -4/+3
    llvm-svn: 275874
* amdgcn: Use new workitem intrinsics | Matt Arsenault | 2016-02-17 | 3 | -0/+62
    llvm-svn: 261042
* Split sources for amdgcn and r600 | Matt Arsenault | 2016-02-13 | 28 | -649/+11
    Most files remain in a common amdgpu directory.
    Also switches barriers to use convergent, and use llvm.amdgcn.s.barrier.
    This now requires 3.9/trunk to build amdgcn.
    llvm-svn: 260777
* r600: Add image writing builtins. | Tom Stellard | 2015-09-21 | 5 | -0/+83
    Patch by: Zoltan Gilian
    llvm-svn: 248161
* r600: Add image reading builtins. | Tom Stellard | 2015-09-21 | 5 | -0/+110
    Patch by: Zoltan Gilian
    llvm-svn: 248160
* Add image attribute getter builtins | Tom Stellard | 2015-09-21 | 7 | -0/+153
    Added get_image_* OpenCL builtins to the headers.
    Added implementation to the r600 target.
    Patch by: Zoltan Gilian
    llvm-svn: 248159
* R600: Implement accurate double precision sqrt v2 | Tom Stellard | 2015-07-10 | 2 | -0/+60
    v2:
      - Use same implementation for R600 and gcn.
    llvm-svn: 241907
* r600: Use __clc_ldexp on asics that don't implement the instruction | Jan Vesely | 2015-05-06 | 1 | -1/+10
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 236649
* math: Add ldexp implementation | Tom Stellard | 2015-05-06 | 2 | -30/+1
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Tom Stellard:
      - Add denormal handling.
      - Share vectorization code with r600 implementation.
    Patch By: Aaron Watry
    llvm-svn: 236639
* Implement ldexp for R600/SI | Tom Stellard | 2015-05-06 | 3 | -0/+68
    llvm-svn: 236638
* r600: get_work_dim: Update metadata syntax for LLVM 3.6 | Tom Stellard | 2014-12-31 | 1 | -1/+1
    llvm-svn: 225042
* r600: Fix get_work_dim range metadata | Jan Vesely | 2014-10-22 | 1 | -1/+1
    Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 220388
* r600: Use llvm intrinsic to read work dimension information | Jan Vesely | 2014-10-15 | 2 | -0/+9
    v2: Fix function declaration
        Add range metadata to r600 implementation
    v3: change prefix to AMDGPU
    Reviewed-by: Tom Stellard <tom@stellard.net>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 219793
* R600: Map Address spaces for atomic_cmpxchg | Aaron Watry | 2014-09-16 | 1 | -0/+19
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 217925
* R600: Map address spaces for atomic_xchg | Aaron Watry | 2014-09-16 | 1 | -0/+1
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 217924
* R600: Map address spaces for atomic_min | Aaron Watry | 2014-09-16 | 1 | -0/+10
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 217923
* R600: Map address spaces for atomic_xor | Aaron Watry | 2014-09-16 | 1 | -0/+1
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 217922
* R600: Map addr spaces and use atomic_max | Aaron Watry | 2014-09-16 | 1 | -5/+16
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 217921
* R600: Map address spaces for atomic_or | Aaron Watry | 2014-09-16 | 1 | -0/+1
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 217920
* R600: Map atomic_and address spaces | Aaron Watry | 2014-09-16 | 1 | -0/+1
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 217919
* vload/vstore: Use casts instead of scalarizing everything in CLC version | Aaron Watry | 2014-08-20 | 3 | -189/+0
    This generates bitcode which is indistinguishable from what was hand-written for int32 types in v[load|store]_impl.ll.
    v4: Use vec2+scalar for vec3 load/stores to prevent corruption (per Tom)
    v3: Also remove unused generic/lib/shared/v[load|store]_impl.ll
    v2: (Per Matt Arsenault) Fix alignment issues with vector load stores
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    CC: Matt Arsenault <Matthew.Arsenault@amd.com>
    CC: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 216069
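The "vec2+scalar" approach for three-element vectors mentioned in v4 can be illustrated with a small C sketch. It is not the CLC source from this commit. The type and function names are hypothetical, and the stated motivation (a three-element access must touch exactly three elements, never a fourth) is an assumption about what "corruption" refers to.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an OpenCL int3 value. */
typedef struct { int32_t x, y, z; } sketch_int3;

/* Load elements 3*offset .. 3*offset+2 as one two-element copy plus one scalar. */
static sketch_int3 sketch_vload3(size_t offset, const int32_t *p)
{
    const int32_t *base = p + 3 * offset;
    int32_t pair[2];
    sketch_int3 v;
    memcpy(pair, base, sizeof pair);  /* the "vec2" part */
    v.x = pair[0];
    v.y = pair[1];
    v.z = base[2];                    /* the scalar part */
    return v;
}

/* Store exactly three elements; the element after them is left untouched. */
static void sketch_vstore3(sketch_int3 v, size_t offset, int32_t *p)
{
    int32_t *base = p + 3 * offset;
    int32_t pair[2] = { v.x, v.y };
    memcpy(base, pair, sizeof pair);  /* the "vec2" part */
    base[2] = v.z;                    /* the scalar part */
}
```

Storing through a four-element access instead would overwrite `p[3 * offset + 3]`, which belongs to the next vector in the buffer.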
* Move clcmacro.h to avoid cluttering user namespace v2 | Jeroen Ketema | 2014-06-24 | 1 | -0/+1
    v2:
      - use quotes instead of <>
      - add include to r600/lib/math/nextafter.c changed
    Reviewed-by: Tom Stellard <tom@stellard.net>
    Reviewed-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 211576
* R600: Set the noduplicate attribute on barrier() intrinsics | Tom Stellard | 2013-10-31 | 3 | -19/+30
    This will prevent LLVM optimization passes from creating illegal uses of the barrier() intrinsic (e.g. calling barrier() from a conditional that is not executed by all threads).
    llvm-svn: 193753
* Implement nextafter() builtin | Tom Stellard | 2013-10-10 | 2 | -0/+4
    There are two implementations of nextafter():
    1. Using clang's __builtin_nextafter. Clang replaces this builtin with a call to nextafter which is part of libm. Therefore, this implementation will only work for targets with an implementation of libm (e.g. most CPU targets).
    2. The other implementation is written in OpenCL C. This function is known internally as __clc_nextafter and can be used by targets that don't have access to libm.
    llvm-svn: 192383
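To show how nextafter() can be written without libm, here is a minimal single-precision sketch in C that steps the IEEE-754 bit pattern by one. It is not the actual __clc_nextafter source; `sketch_nextafterf` is a hypothetical name, and the sketch assumes 32-bit IEEE-754 floats.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical libm-free nextafter for float: returns the representable
 * value adjacent to x in the direction of y, using integer operations only. */
static float sketch_nextafterf(float x, float y)
{
    uint32_t bits;
    float result;

    if (x != x || y != y)   /* either argument is NaN */
        return x + y;       /* propagate NaN */
    if (x == y)
        return y;           /* also covers +0 versus -0 */

    memcpy(&bits, &x, sizeof bits);
    if (x == 0.0f) {
        /* Step off zero to the smallest subnormal with the sign of y. */
        bits = (y > 0.0f) ? 0x00000001u : 0x80000001u;
    } else if ((x < y) == (x > 0.0f)) {
        bits += 1;          /* moving away from zero: magnitude grows */
    } else {
        bits -= 1;          /* moving toward zero: magnitude shrinks */
    }
    memcpy(&result, &bits, sizeof result);
    return result;
}
```

The bit pattern of a finite float increases monotonically with its magnitude, so adding or subtracting one moves to the neighbouring value, including across the subnormal/normal boundary and from the largest finite value to infinity.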
* Add atomic_sub and atomic_dec builtin functions | Aaron Watry | 2013-09-06 | 1 | -0/+1
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 190201
* Add atomic_inc and atomic_add builtins | Aaron Watry | 2013-09-05 | 2 | -0/+21
    Reviewed-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 190058
* Enable assembly vload3 int/uint constant/global for R600 | Aaron Watry | 2013-08-12 | 1 | -16/+2
    It's supported by the R600 LLVM back-end now, at least for evergreen.
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 188180
* Add vload* for addrspace(2) and use as constant load for R600 | Aaron Watry | 2013-08-12 | 1 | -2/+8
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 188179
* Added get_num_groups | Aaron Watry | 2013-07-24 | 2 | -0/+19
    The get_num_groups function was missing for r600g. I did the same thing as the other workitem functions.
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    Reviewed-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 187059
* Fix and re-enable R600 vload/vstore assembly | Aaron Watry | 2013-07-16 | 3 | -0/+198
    The assembly optimizations were making unsafe assumptions about which address spaces had which identifiers.
    Also, fix vload/vstore with 64-bit pointers. This was broken previously on Radeon SI.
    This version still only has assembly versions of int/uint 2/4/8/16 for global loads and stores on R600, but it does it in a way that would be very easily extended to private/local/constant and could also be handled easily on other architectures.
    v2:
      1) Leave v[load|store]_impl.ll in generic/lib
      2) Remove vload_if.ll and vstore_if.ll interfaces
      3) Fix address+offset calculations
      4) Remove offset from assembly arg list
    llvm-svn: 186416