summaryrefslogtreecommitdiffstats
path: root/libclc/generic/lib/math
Commit message (Collapse)AuthorAgeFilesLines
...
* math: Implement minmagJan Vesely2017-11-152-0/+8
| | | | | | Reviewer: Aaron Watry Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 318265
* math: Implement maxmagJan Vesely2017-11-152-0/+8
| | | | | | Reviewer: Aaron Watry Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 318264
* native_powr: Switch implementation to native_exp2 and native_log2Jan Vesely2017-11-142-0/+10
| | | | | | | | | | v2: don't use assume check only for x<0, the other conditions are handled transparently v3: don't check inputs at all, nan propagation works as expected Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 318204
* native_divide: provide function implementation instead of macroJan Vesely2017-11-132-0/+8
| | | | | | Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 318067
* native_recip: provide function implementation instead of macroJan Vesely2017-11-132-0/+8
| | | | | | Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 318066
* native_rsqrt: Switch implementation to 1 / native_sqrtJan Vesely2017-11-132-0/+8
| | | | | | Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 318065
* native_tan: Switch implementation to use native_sin/native_cosJan Vesely2017-11-132-0/+8
| | | | | | Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 318064
* math: Use precomputed constant for log2(10.0)Jan Vesely2017-11-132-3/+3
| | | | | | | | exp10 CTS fails with or without this change Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 318063
* native_exp10: Switch implementation to llvm intrinsicJan Vesely2017-11-102-0/+8
| | | | | | | | v2: Use native_log2 instead of wrong constant Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 317941
* native_sqrt: Switch implementation to llvm intrinsicJan Vesely2017-11-101-0/+7
| | | | | | Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 317940
* native_sin: Switch implementation to llvm intrinsicJan Vesely2017-11-101-0/+7
| | | | | | Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 317939
* native_cos: Switch implementation to llvm intrinsicJan Vesely2017-11-101-0/+7
| | | | | | Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 317938
* native_exp2: Switch implementation to llvm intrinsicJan Vesely2017-11-101-0/+7
| | | | | | Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 317937
* native_exp: Switch implementation to llvm intrinsicJan Vesely2017-11-101-0/+7
| | | | | | Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 317936
* native_log10: Switch to generic native intrinsic inc fileJan Vesely2017-11-102-8/+2
| | | | | | Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 317934
* native_log: Switch to generic native intrinsic inc fileJan Vesely2017-11-102-30/+2
| | | | | | Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 317933
* native_log2: Switch to generic native intrinsic inc fileJan Vesely2017-11-102-8/+19
| | | | | | | | | v2: Add __CLC_XCONCAT instead of function name redirection Use __CLC_XCONCAT for intrinsic functions as well Reviewer: Jeroen Ketema Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 317932
* math: Implement native_log10Jan Vesely2017-10-252-0/+13
| | | | | | | | | | | | Use llvm instrinsic by default Provide amdgpu workaround v2: drop old amd copyrights Reviewer: Aaron Watry Reviewed-by: Vedran Miletić <vedran@miletic.net> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 316588
* math: Implement sinh functionJan Vesely2017-02-251-0/+191
| | | | | | mostly copied form amd_builtins llvm-svn: 296233
* math: Add logb builtinAaron Watry2017-01-181-0/+31
| | | | | | | | | Ported from the amd-builtins branch. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> CC: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 292335
* math: Add expm1 builtin functionAaron Watry2017-01-183-0/+282
| | | | | | | | | Ported from the amd-builtins branch. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> CC: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 292334
* math: Implement tgammaAaron Watry2016-09-151-0/+71
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 281566
* math: Implement lgammaAaron Watry2016-09-151-0/+44
| | | | | | | | Just use lgamma_r and ignore the value returned in the second argument Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 281565
* math: Implement lgamma_rAaron Watry2016-09-152-0/+511
| | | | | | | | | Ported from the amd-builtins branch, which is itself based on the Sun Microsystems implementation. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 281564
* Replace nextafter implementationMatt Arsenault2016-09-081-28/+24
| | | | | | This one passes conformance. llvm-svn: 280961
* Implement cbrt builtinTom Stellard2016-07-223-0/+820
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 276497
* Implement cosh builtinTom Stellard2016-07-223-0/+321
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 276496
* math: Use single precision fmax in sp pathJan Vesely2016-05-171-1/+1
| | | | | | | | | | Fixes fdim piglit on Turks v2: use CL fmax instead of __builtin Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom.stellard@amd.com> llvm-svn: 269807
* math: Add erf ported from amd-builtinsJan Vesely2016-05-061-0/+402
| | | | | | | | | | | | The scalar float/double function bodies are a direct copy/paste, aside from the removed (optional) code in float function body that requires subnormals. reviewers: jvesely Patch by: Vedran Miletić <rivanvx@gmail.com> llvm-svn: 268766
* math: Add fdim implementationAaron Watry2016-05-062-0/+81
| | | | | | | | | | | Based on the amd-builtin, but explicitly vectorized for all sizes (not just float4), and includes a vectorized double implementation. Passes piglit (float) tests on pitcairn. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 268708
* math: Fix ilogb(double) return typeAaron Watry2016-02-241-1/+1
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 261714
* math: Add ilogb ported from amd-builtinsAaron Watry2016-02-231-0/+57
| | | | | | | | | | | | | | | | The scalar float/double function bodies are a direct copy/paste with usage of the CLC wrappers to vectorize them. This commit also adds in the FP_ILOGB0 and FP_ILOGBNAN macros which are equal to the results of ilogb(0.0f) and ilogb(float nan) respectively. v2: Add FP_ILOGB0 and FP_ILOGBNAN definitions Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> v1 Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 261639
* math: Fix log2 vectorization on non-fp64 hwJan Vesely2016-02-091-0/+2
| | | | | | reviewer: tstellard Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 260301
* math: Add frexp ported from amd-builtinsAaron Watry2016-02-082-0/+120
| | | | | | | | | | | | | | | | | | The float implementation is almost a direct port from the amd-builtins, but instead of just having a scalar and float4 implementation, it has a scalar and arbitrary width vector implementation. The double scalar is also a direct port from AMD's builtin release. The double vector implementation copies the logic in the float vector implementation using the values from the double scalar version. Both have been tested in piglit using tests sent to that project's mailing list. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 260114
* Implement modf math builtinTom Stellard2016-01-272-0/+69
| | | | | | | | V2: use the reference implementation as suggested by Matt Arsenault Patch By: Pavel Ondračka llvm-svn: 258933
* Implement tanh builtinNiels Ole Salscheider2015-09-291-0/+146
| | | | | | This is a port from the AMD builtin library. llvm-svn: 248780
* Fix double implementation of logTom Stellard2015-07-241-0/+26
| | | | | | We need to use M_LOG2E instead of M_LOG2E_F. llvm-svn: 243132
* Implement accurate log2 functionTom Stellard2015-07-244-0/+469
| | | | | | | | | Use the implementation was ported from the AMD builtin library rather than LLVM Intrinsics. This has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 243131
* Use llvm intrinsics for native_log and native_log2Tom Stellard2015-07-244-0/+114
| | | | llvm-svn: 243130
* Fix implementation of sqrt v2Tom Stellard2015-07-103-0/+110
| | | | | | | | | | | Passing values less than 0 to the llvm.sqrt() intrinsic results in undefined behavior, so we need to check the input and return NaN if is is less than 0. v2: - Fix build failures. llvm-svn: 241906
* Use a more accurate implementation for expTom Stellard2015-05-132-13/+85
| | | | | | | | | | | | Using exp2(x * M_LOG2E_F) does not give us accurate enough results for OpenCL. If you look at the new exp implementation you'll see that it does multiply the input by M_LOG2E_F, but it still uses the original input in part of the calculation. This exp implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237229
* Implement exp2 using OpenCL C rather than using an intrinsicTom Stellard2015-05-135-1/+255
| | | | | | | | | | Not all targets support the intrinsic, so it's better to have a generic implementation which does not use it. This exp2 implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237228
* Implement sin for double typesTom Stellard2015-05-121-7/+16
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237155
* Implement cos for double typesTom Stellard2015-05-125-7/+289
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237154
* Implement atan2pi builtinTom Stellard2015-05-121-0/+221
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237138
* Implement atan2 for doublesTom Stellard2015-05-123-2/+412
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237131
* math: limit half_sqrt to single precisionJan Vesely2015-05-091-4/+2
| | | | | | Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 236941
* Fix ldexp fp64 build errorJan Vesely2015-05-091-1/+1
| | | | | | Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 236939
* Implement half_rsqrt builtin v3Tom Stellard2015-05-082-0/+53
| | | | | | | | | | | | | This is a generic implementation which just calls rsqrt. Targets should override this if they want a faster implementation. v2: - Alphabettize SOURCES v3 (Jan Vesely): Limit to single precision types. llvm-svn: 236915
* Move ldexp soft implementation to a separate fileJan Vesely2015-05-062-100/+131
| | | | | | Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 236648
OpenPOWER on IntegriCloud