path: root/libclc/amdgpu
Commit message | Author | Age | Files | Lines
* Remove redundant OVERRIDES file | Jan Vesely | 2018-11-04 | 1 | -2/+0
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-By: Aaron Watry <awatry@gmail.com>
    llvm-svn: 346086
* Add initial support for half precision builtins | Jan Vesely | 2018-05-17 | 2 | -0/+12
    v2: fix fmax implementation
        use consistent checks for __CLC_FP_SIZE
        add missing TODOs
        fix whitespace in definitions.h
    v3: undef ZERO in modf.inc
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
    Reviewed-by: Aaron Watry <awatry@gmail.com>
    Tested-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 332677
* amdgpu/half_recip: Switch implementation to native_recip | Jan Vesely | 2018-02-13 | 2 | -0/+7
    Reviewer: Tom Stellard <tstellar@redhat.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 325061
* amdgpu/half_log2: Switch implementation to native_log2 | Jan Vesely | 2018-02-13 | 2 | -0/+7
    Reviewer: Tom Stellard <tstellar@redhat.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 325060
* amdgpu/half_log10: Switch implementation to native_log10 | Jan Vesely | 2018-02-13 | 2 | -0/+7
    Reviewer: Tom Stellard <tstellar@redhat.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 325059
* amdgpu/half_log: Switch implementation to native_log | Jan Vesely | 2018-02-13 | 2 | -0/+7
    Reviewer: Tom Stellard <tstellar@redhat.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 325058
* amdgpu/half_exp2: Switch implementation to native_exp2 | Jan Vesely | 2018-02-13 | 2 | -0/+7
    Reviewer: Tom Stellard <tstellar@redhat.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 325057
* amdgpu/half_exp10: Switch implementation to native_exp10 | Jan Vesely | 2018-02-13 | 2 | -0/+7
    Reviewer: Tom Stellard <tstellar@redhat.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 325056
* amdgpu/half_exp: Switch implementation to native_exp | Jan Vesely | 2018-02-13 | 2 | -0/+7
    Reviewer: Tom Stellard <tstellar@redhat.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 325055
* amdgpu/half_sqrt: Switch implementation to native_sqrt | Jan Vesely | 2018-02-13 | 2 | -0/+7
    Reviewer: Tom Stellard <tstellar@redhat.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 325054
* amdgpu/half_rsqrt: Switch implementation to native_rsqrt | Jan Vesely | 2018-02-13 | 3 | -0/+18
    Reviewer: Tom Stellard <tstellar@redhat.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 325053
* amdgpu: Add workaround for unimplemented llvm.exp intrinsic | Jan Vesely | 2017-11-10 | 3 | -0/+9
    Reviewer: Jeroen Ketema
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 317935
* math: Implement native_log10 | Jan Vesely | 2017-10-25 | 3 | -0/+9
    Use llvm intrinsic by default
    Provide amdgpu workaround
    v2: drop old amd copyrights
    Reviewer: Aaron Watry
    Reviewed-by: Vedran Miletić <vedran@miletic.net>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 316588
* amdgpu/math: Don't use llvm intrinsic for native_log | Jan Vesely | 2017-10-25 | 3 | -0/+9
    AMDGPU targets don't have an instruction for it, so it'll be expanded to C * log2 anyway.
    v2: use native_log2 instead of the more precise sw implementation
    v3: move to amdgpu
    v4: drop old AMD copyright
    Reviewer: Aaron Watry
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 316587
* Make image builtins r600/llvm-3.9 only | Jan Vesely | 2017-10-10 | 15 | -346/+0
    The implementation uses r600 specific intrinsics
    LLVM-4 switched to _ro_t and _rw_t image types
    Portions of the code can be moved back as more targets/llvm versions add image support
    Reviewer: Aaron Watry
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 315341
* Do not include clc_nextafter header globally | Jan Vesely | 2017-10-08 | 1 | -0/+1
    Drop unused clc/math/clc_nextafter.h header
    Reviewer: Jeroen Ketema
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 315190
* Restore support for llvm-3.9 | Jan Vesely | 2017-09-29 | 1 | -0/+2
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Acked-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 314543
* Rework atomic ops to use clang builtins rather than llvm asm | Jan Vesely | 2017-09-25 | 2 | -66/+0
    reviewer: Aaron Watry
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 314112
* Implement vload_half{,n} and vload(half) | Jan Vesely | 2017-09-08 | 3 | -0/+25
    v2: add vload(half) as well
        make helpers amdgpu specific (NVPTX uses different private AS numbering)
        use clang builtin on clang >= 6
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-by: Tom Stellard <tstellar@redhat.com>
    llvm-svn: 312839
* vstore: Cleanup and add vstore(half) | Jan Vesely | 2017-09-08 | 3 | -0/+37
    Add missing undefs
    Make helpers amdgpu specific (NVPTX uses different numbering for private AS)
    Use clang builtins on clang >= 6
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-by: Tom Stellard <tstellar@redhat.com>
    llvm-svn: 312838
* r600: Cleanup barrier implementation. | Jan Vesely | 2017-09-04 | 2 | -11/+0
    We don't have memory fences for r600 so just call group barrier directly
    Make sure that barrier is called even with 0 flags
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 312492
* Replace nextafter implementation | Matt Arsenault | 2016-09-08 | 1 | -0/+5
    This one passes conformance.
    llvm-svn: 280961
* amdgcn: Fix return type of get_num_groups | Matt Arsenault | 2016-08-25 | 2 | -19/+0
    llvm-svn: 279723
* amdgcn: Fix return type for get_global_size | Matt Arsenault | 2016-08-24 | 2 | -19/+0
    llvm-svn: 279644
* amdgcn: Fix get_local_size IR return type | Matt Arsenault | 2016-08-20 | 2 | -19/+0
    llvm-svn: 279350
* AMDGPU: Use clang intrinsics for workitem builtins | Jan Vesely | 2016-07-22 | 2 | -12/+3
    v2: split into 2 patches
        use clang builtins for other intrinsics as well
    v3: Fix warnings
        Switch r600 to use implicitarg.ptr
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 276442
* amdgpu: Use right builtin for rsq | Matt Arsenault | 2016-07-19 | 1 | -1/+6
    The r600 path has never actually worked since double is not implemented there.
    llvm-svn: 276009
* Replace llvm.AMDGPU.ldexp with llvm.amdgcn.ldexp | Matt Arsenault | 2016-07-18 | 2 | -48/+0
    It didn't really work on r600 to begin with, which should get its own intrinsic.
    llvm-svn: 275813
* amdgcn: Use new workitem intrinsics | Matt Arsenault | 2016-02-17 | 3 | -38/+0
    llvm-svn: 261042
* Split sources for amdgcn and r600 | Matt Arsenault | 2016-02-13 | 27 | -0/+642
    Most files remain in a common amdgpu directory.
    Also switches barriers to use convergent, and use llvm.amdgcn.s.barrier.
    This now requires 3.9/trunk to build amdgcn.
    llvm-svn: 260777