path: root/libclc/generic/lib
Commit message | Author | Age | Files | Lines
...
* half_log2: Implement using log2 | Jan Vesely | 2018-01-18 | 2 | -0/+7
  Passes CTS on carrizo
  v2: Use full precision implementation
  Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 322895
* half_log10: Implement using log10 | Jan Vesely | 2018-01-18 | 2 | -0/+7
  Passes CTS on carrizo
  v2: Use full precision implementation
  Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 322894
* half_log: Implement using log | Jan Vesely | 2018-01-18 | 2 | -0/+7
  Passes CTS on carrizo
  v2: Use full precision implementation
  Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 322893
* half_exp10: Implement using exp10 | Jan Vesely | 2018-01-18 | 2 | -0/+7
  Passes CTS on carrizo
  v2: Use full precision implementation
  Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 322892
* half_exp2: Implement using exp2 | Jan Vesely | 2018-01-18 | 2 | -0/+7
  Passes CTS on carrizo
  v2: Use full precision implementation
  Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 322891
* half_exp: Implement using exp | Jan Vesely | 2018-01-18 | 2 | -0/+7
  Passes CTS on carrizo
  v2: Use full precision implementation
  Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 322890
* half_cos: Implement using cos | Jan Vesely | 2018-01-18 | 2 | -0/+7
  Passes CTS on carrizo
  v2: Use full precision implementation
  Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 322889
* half_sqrt: Cleanup implementation | Jan Vesely | 2018-01-18 | 2 | -49/+2
  Passes CTS on carrizo
  v2: Use full precision implementation
  Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 322888
* half_rsqrt: Cleanup implementation | Jan Vesely | 2018-01-18 | 3 | -49/+11
  Passes CTS on carrizo
  v2: Use full precision implementation
  Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 322887
* rootn: Port from amd_builtins | Jan Vesely | 2018-01-17 | 4 | -0/+390
  Passes piglit on turks and carrizo
  fp64 passes CTS on carrizo
  v2: fix formatting
      check fp32 denormal support at runtime
  Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 322763
* powr: Port from amd_builtins | Jan Vesely | 2018-01-17 | 3 | -0/+400
  Passes piglit on turks and carrizo
  fp64 passes CTS on carrizo
  v2: fix formatting
      check fp32 denormal support at runtime
  Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 322762
* pown: Port from amd_builtins | Jan Vesely | 2018-01-17 | 4 | -7/+387
  Passes piglit on turks and carrizo
  fp64 passes CTS on carrizo
  v2: fix formatting
      check fp32 denormal support at runtime
  Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 322761
* pow: Port from amd_builtins | Jan Vesely | 2018-01-17 | 6 | -0/+1086
  Passes piglit on turks and carrizo
  fp64 passes CTS on carrizo
  v2: fix formatting
      check fp32 denormal support at runtime
  Reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 322760
* math: Implement minmag | Jan Vesely | 2017-11-15 | 3 | -0/+9
  Reviewer: Aaron Watry
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 318265
* math: Implement maxmag | Jan Vesely | 2017-11-15 | 3 | -0/+9
  Reviewer: Aaron Watry
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 318264
* native_powr: Switch implementation to native_exp2 and native_log2 | Jan Vesely | 2017-11-14 | 3 | -0/+11
  v2: don't use assume
      check only for x<0, the other conditions are handled transparently
  v3: don't check inputs at all, nan propagation works as expected
  Reviewer: Jeroen Ketema
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 318204
* native_divide: provide function implementation instead of macro | Jan Vesely | 2017-11-13 | 3 | -0/+9
  Reviewer: Jeroen Ketema
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 318067
* native_recip: provide function implementation instead of macro | Jan Vesely | 2017-11-13 | 3 | -0/+9
  Reviewer: Jeroen Ketema
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 318066
* native_rsqrt: Switch implementation to 1 / native_sqrt | Jan Vesely | 2017-11-13 | 3 | -0/+9
  Reviewer: Jeroen Ketema
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 318065
* native_tan: Switch implementation to use native_sin/native_cos | Jan Vesely | 2017-11-13 | 3 | -0/+9
  Reviewer: Jeroen Ketema
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 318064
* math: Use precomputed constant for log2(10.0) | Jan Vesely | 2017-11-13 | 2 | -3/+3
  exp10 CTS fails with or without this change
  Reviewer: Jeroen Ketema
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 318063
* native_exp10: Switch implementation to llvm intrinsic | Jan Vesely | 2017-11-10 | 3 | -0/+9
  v2: Use native_log2 instead of wrong constant
  Reviewer: Jeroen Ketema
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 317941
* native_sqrt: Switch implementation to llvm intrinsic | Jan Vesely | 2017-11-10 | 2 | -0/+8
  Reviewer: Jeroen Ketema
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 317940
* native_sin: Switch implementation to llvm intrinsic | Jan Vesely | 2017-11-10 | 2 | -0/+8
  Reviewer: Jeroen Ketema
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 317939
* native_cos: Switch implementation to llvm intrinsic | Jan Vesely | 2017-11-10 | 2 | -0/+8
  Reviewer: Jeroen Ketema
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 317938
* native_exp2: Switch implementation to llvm intrinsic | Jan Vesely | 2017-11-10 | 2 | -0/+8
  Reviewer: Jeroen Ketema
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 317937
* native_exp: Switch implementation to llvm intrinsic | Jan Vesely | 2017-11-10 | 2 | -0/+8
  Reviewer: Jeroen Ketema
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 317936
* native_log10: Switch to generic native intrinsic inc file | Jan Vesely | 2017-11-10 | 2 | -8/+2
  Reviewer: Jeroen Ketema
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 317934
* native_log: Switch to generic native intrinsic inc file | Jan Vesely | 2017-11-10 | 2 | -30/+2
  Reviewer: Jeroen Ketema
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 317933
* native_log2: Switch to generic native intrinsic inc file | Jan Vesely | 2017-11-10 | 2 | -8/+19
  v2: Add __CLC_XCONCAT instead of function name redirection
      Use __CLC_XCONCAT for intrinsic functions as well
  Reviewer: Jeroen Ketema
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 317932
* math: Implement native_log10 | Jan Vesely | 2017-10-25 | 3 | -0/+14
  Use llvm intrinsic by default
  Provide amdgpu workaround
  v2: drop old amd copyrights
  Reviewer: Aaron Watry
  Reviewed-by: Vedran Miletić <vedran@miletic.net>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 316588
* shared: Implement aligned vector stores (vstorea_half) | Jan Vesely | 2017-10-22 | 2 | -20/+31
  Float version passes newly posted piglit tests on turks,
  float and double pass on carrizo.
  v2: scalar vstorea_half
  v3: fix typo
  Reviewer: Aaron Watry
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 316291
* shared: Implement aligned vector loads (vloada_half) | Jan Vesely | 2017-10-22 | 2 | -10/+26
  Passes newly posted piglits on turks and carrizo
  v2: add scalar vloada_half
  v3: fix typo
  Reviewer: Aaron Watry
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 316290
* Make image builtins r600/llvm-3.9 only | Jan Vesely | 2017-10-10 | 2 | -10/+0
  The implementation uses r600 specific intrinsics
  LLVM-4 switched to _ro_t and _rw_t image types
  Portions of the code can be moved back as more targets/llvm versions add image support
  Reviewer: Aaron Watry
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 315341
* integer/sub_sat: Use clang builtin instead of llvm asm | Jan Vesely | 2017-10-02 | 4 | -158/+26
  reviewer: Tom Stellard
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 314703
* integer/add_sat: Use clang builtin instead of llvm asm | Jan Vesely | 2017-10-02 | 4 | -148/+28
  reviewer: Tom Stellard
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 314702
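Note on the sub_sat and add_sat entries above: the log does not say which clang builtin is used. The sketch below assumes the generic `__builtin_add_overflow` and `__builtin_sub_overflow` (available in Clang and GCC) and shows one way to build saturating arithmetic for `int` on top of them; the function names are hypothetical.

```c
#include <limits.h>

/* Hypothetical sketch: saturating signed add. A signed add can only
 * overflow when both operands have the same sign, so the sign of y
 * tells us which bound to clamp to. */
int add_sat_sketch(int x, int y) {
    int r;
    if (__builtin_add_overflow(x, y, &r))
        return y > 0 ? INT_MAX : INT_MIN;
    return r;
}

/* Hypothetical sketch: saturating signed subtract. x - y overflows
 * upward only when y is negative, downward only when y is positive. */
int sub_sat_sketch(int x, int y) {
    int r;
    if (__builtin_sub_overflow(x, y, &r))
        return y < 0 ? INT_MAX : INT_MIN;
    return r;
}
```

For example, `add_sat_sketch(INT_MAX, 1)` yields `INT_MAX` and `sub_sat_sketch(INT_MIN, 1)` yields `INT_MIN`, while non-overflowing inputs return the ordinary sum or difference.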
* integer/clz: Use clang builtin instead of llvm asm | Jan Vesely | 2017-10-02 | 4 | -119/+8
  The generated llvm IR is mostly identical.
  char/uchar case is a bit worse.
  reviewer: Tom Stellard
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 314701
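Note on the entry above: a sketch of count-leading-zeros built on `__builtin_clz`, assuming a 32-bit `unsigned int`. The builtin is undefined for a zero argument, while OpenCL `clz` returns the bit width for zero, so zero is handled explicitly. The 8-bit variant widens the input and subtracts the 24 extra leading bits. The function names are hypothetical, and the log does not say whether this extra work is what makes the char/uchar case "a bit worse".

```c
#include <stdint.h>

/* Hypothetical sketch: 32-bit clz with the zero case made explicit,
 * since __builtin_clz(0) is undefined. Assumes unsigned int is 32 bits. */
uint32_t clz32_sketch(uint32_t x) {
    return x ? (uint32_t)__builtin_clz(x) : 32u;
}

/* Hypothetical sketch: 8-bit clz by widening to 32 bits and removing
 * the 24 leading zero bits introduced by the widening. */
uint8_t clz8_sketch(uint8_t x) {
    return (uint8_t)(clz32_sketch(x) - 24u);
}
```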
* Rework atomic ops to use clang builtins rather than llvm asm | Jan Vesely | 2017-09-25 | 11 | -136/+117
  reviewer: Aaron Watry
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 314112
* Implement cl_khr_int64_extended_atomics builtins | Jan Vesely | 2017-09-20 | 6 | -0/+95
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewed-by: Aaron Watry <awatry@gmail.com>
  Tested-by: Aaron Watry <awatry@gmail.com>
  llvm-svn: 313811
* Implement cl_khr_int64_base_atomics builtins | Jan Vesely | 2017-09-20 | 7 | -0/+102
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewed-by: Aaron Watry <awatry@gmail.com>
  Tested-by: Aaron Watry <awatry@gmail.com>
  llvm-svn: 313810
* Implement vload_half{,n} and vload(half) | Jan Vesely | 2017-09-08 | 2 | -0/+72
  v2: add vload(half) as well
      make helpers amdgpu specific (NVPTX uses different private AS numbering)
      use clang builtin on clang >= 6
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewed-by: Tom Stellard <tstellar@redhat.com>
  llvm-svn: 312839
* vstore: Cleanup and add vstore(half) | Jan Vesely | 2017-09-08 | 3 | -46/+33
  Add missing undefs
  Make helpers amdgpu specific (NVPTX uses different numbering for private AS)
  Use clang builtins on clang >= 6
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewed-by: Tom Stellard <tstellar@redhat.com>
  llvm-svn: 312838
* relational: Implement shuffle2 builtin | Aaron Watry | 2017-09-02 | 2 | -1/+161
  This was added in CL 1.1
  Tested with a Radeon HD 7850 (Pitcairn) using the CL CTS via:
  test_conformance/relationals/test_relationals shuffle_built_in_dual_input
  v2: Add half support to shuffle2
      Move shuffle2 to misc/
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 312404
* relational: Implement shuffle builtin | Aaron Watry | 2017-09-02 | 2 | -0/+159
  This was added in CL 1.1
  Tested with a Radeon HD 7850 (Pitcairn) using the CL CTS via:
  test_conformance/relationals/test_relationals shuffle_built_in
  v2: Add half-precision support to shuffle when available.
      Move to misc/ and add section 6.12.12 to clc.h
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 312403
* math: Implement sinh function | Jan Vesely | 2017-02-25 | 2 | -0/+192
  mostly copied from amd_builtins
  llvm-svn: 296233
* math: Add logb builtin | Aaron Watry | 2017-01-18 | 2 | -0/+32
  Ported from the amd-builtins branch.
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
  CC: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 292335
* math: Add expm1 builtin function | Aaron Watry | 2017-01-18 | 4 | -0/+283
  Ported from the amd-builtins branch.
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
  CC: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 292334
* Provide vstore_half helper to work around clc restrictions | Jan Vesely | 2016-09-21 | 4 | -26/+75
  clang won't accept half precision loads and stores without cl_khr_fp16 since r281904
  llvm-svn: 282106
* math: Implement tgamma | Aaron Watry | 2016-09-15 | 2 | -0/+72
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 281566
* math: Implement lgamma | Aaron Watry | 2016-09-15 | 2 | -0/+45
  Just use lgamma_r and ignore the value returned in the second argument
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 281565