path: root/libclc/generic/include/clc/math
Commit message | Author | Age | Files | Lines
...
* math: Add erf ported from amd-builtins | Jan Vesely | 2016-05-06 | 1 | -0/+9
    The scalar float/double function bodies are a direct copy/paste, aside from the removed (optional) code in float function body that requires subnormals.
    reviewers: jvesely
    Patch by: Vedran Miletić <rivanvx@gmail.com>
    llvm-svn: 268766
* math: Add fdim implementation | Aaron Watry | 2016-05-06 | 2 | -0/+3
    Based on the amd-builtin, but explicitly vectorized for all sizes (not just float4), and includes a vectorized double implementation.
    Passes piglit (float) tests on pitcairn.
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 268708
* math: Add ilogb ported from amd-builtins | Aaron Watry | 2016-02-23 | 2 | -0/+6
    The scalar float/double function bodies are a direct copy/paste with usage of the CLC wrappers to vectorize them.
    This commit also adds in the FP_ILOGB0 and FP_ILOGBNAN macros which are equal to the results of ilogb(0.0f) and ilogb(float nan) respectively.
    v2: Add FP_ILOGB0 and FP_ILOGBNAN definitions
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
    v1 Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 261639
* math: Add frexp ported from amd-builtins | Aaron Watry | 2016-02-08 | 3 | -0/+29
    The float implementation is almost a direct port from the amd-builtins, but instead of just having a scalar and float4 implementation, it has a scalar and arbitrary width vector implementation.
    The double scalar is also a direct port from AMD's builtin release.
    The double vector implementation copies the logic in the float vector implementation using the values from the double scalar version.
    Both have been tested in piglit using tests sent to that project's mailing list.
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 260114
* Implement modf math builtin | Tom Stellard | 2016-01-27 | 2 | -0/+49
    V2: use the reference implementation as suggested by Matt Arsenault
    Patch By: Pavel Ondračka
    llvm-svn: 258933
* Implement tanh builtin | Niels Ole Salscheider | 2015-09-29 | 2 | -0/+47
    This is a port from the AMD builtin library.
    llvm-svn: 248780
* Fix double implementation of log | Tom Stellard | 2015-07-24 | 2 | -3/+46
    We need to use M_LOG2E instead of M_LOG2E_F.
    llvm-svn: 243132
* Implement accurate log2 function | Tom Stellard | 2015-07-24 | 3 | -6/+47
    Use the implementation that was ported from the AMD builtin library rather than LLVM intrinsics.
    This has been tested with piglit, OpenCV, and the ocl conformance tests.
    llvm-svn: 243131
* Use llvm intrinsics for native_log and native_log2 | Tom Stellard | 2015-07-24 | 4 | -2/+100
    llvm-svn: 243130
* Fix implementation of sqrt v2 | Tom Stellard | 2015-07-10 | 2 | -6/+4
    Passing values less than 0 to the llvm.sqrt() intrinsic results in undefined behavior, so we need to check the input and return NaN if it is less than 0.
    v2:
    - Fix build failures.
    llvm-svn: 241906
* Implement exp2 using OpenCL C rather than using an intrinsic | Tom Stellard | 2015-05-13 | 2 | -5/+46
    Not all targets support the intrinsic, so it's better to have a generic implementation which does not use it.
    This exp2 implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests.
    llvm-svn: 237228
* Implement atan2pi builtin | Tom Stellard | 2015-05-12 | 2 | -0/+47
    This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests.
    llvm-svn: 237138
* math: limit half_sqrt to single precision | Jan Vesely | 2015-05-09 | 1 | -2/+2
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 236941
* Implement half_rsqrt builtin v3 | Tom Stellard | 2015-05-08 | 2 | -0/+33
    This is a generic implementation which just calls rsqrt. Targets should override this if they want a faster implementation.
    v2:
    - Alphabetize SOURCES
    v3 (Jan Vesely):
    Limit to single precision types.
    llvm-svn: 236915
* Implement sinpi builtin | Jan Vesely | 2015-05-06 | 2 | -0/+4
    Ported from AMD builtin library, passes piglit on Turks.
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 236647
* Implement ldexp for R600/SI | Tom Stellard | 2015-05-06 | 3 | -0/+73
    llvm-svn: 236638
* Implement fract builtin | Tom Stellard | 2015-04-23 | 2 | -0/+49
    This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests.
    llvm-svn: 235620
* Implement atanh builtin | Tom Stellard | 2015-04-07 | 2 | -0/+47
    This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests.
    llvm-svn: 234324
* Implement acosh builtin | Tom Stellard | 2015-04-07 | 2 | -0/+47
    This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests.
    llvm-svn: 234323
* Implement atanpi builtin | Tom Stellard | 2015-04-02 | 2 | -0/+47
    This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests.
    llvm-svn: 233928
* Implement asinpi builtin | Tom Stellard | 2015-04-02 | 2 | -0/+47
    This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests.
    llvm-svn: 233927
* Implement asinh builtin | Tom Stellard | 2015-04-02 | 2 | -0/+47
    This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests.
    llvm-svn: 233926
* Implement acospi builtin | Tom Stellard | 2015-04-02 | 2 | -0/+47
    This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests.
    llvm-svn: 233925
* Implement fmax using __builtin_fmax | Tom Stellard | 2015-03-31 | 1 | -4/+1
    This ensures correct handling of NaN.
    This has been tested with piglit, OpenCV, and the ocl conformance tests.
    llvm-svn: 233713
* Implement fmin using __builtin_fmin | Tom Stellard | 2015-03-31 | 1 | -4/+1
    This ensures correct handling of NaN.
    This has been tested with piglit, OpenCV, and the ocl conformance tests.
    llvm-svn: 233712
* Implement half_sqrt builtin v2 | Tom Stellard | 2015-03-23 | 1 | -0/+31
    This is a generic implementation which just calls sqrt. Targets should override this if they want a faster implementation.
    v2:
    - Alphabetize SOURCES
    llvm-svn: 232965
* math: Implement erfc | Aaron Watry | 2015-03-18 | 1 | -0/+9
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 232674
* Move mix from math to common | Aaron Watry | 2015-03-03 | 2 | -7/+0
    It has been part of the common functions since 1.0
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 231137
* libclc/math: Add cospi | Aaron Watry | 2015-02-26 | 2 | -0/+4
    Ported from the libclc/amd-builtins branch
    v2: Rename sincos_f_piby4 to __libclc__sincosf_piby4
        Add cospi(double) implementation instead of using llvm.cos
    Notes:
    The sincosD_piby4.h file is mostly the same as the builtin implementation released by AMD. The inline attribute declaration is changed, and M_PI is used instead of a constant double. Otherwise, the only difference is that the header explicitly enables the fp64 pragma.
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Jeroen Ketema <j.ketema@imperial.ac.uk>
    CC: Tom Stellard <tom@stellard.net>
    CC: Matt Arsenault <Matthew.Arsenault@amd.com>
    llvm-svn: 230641
* Implement log10 | Jan Vesely | 2015-01-30 | 1 | -0/+9
    v2: Use constant and multiplication instead of division
    v3: Use hex constants
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 227585
* Prune CRLF. | NAKAMURA Takumi | 2014-10-27 | 1 | -1/+1
    llvm-svn: 220678
* Implement log1p builtin | Tom Stellard | 2014-10-07 | 2 | -0/+47
    llvm-svn: 219230
* Implement fmod | Jan Vesely | 2014-10-05 | 2 | -0/+3
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-by: Tom Stellard <tom@stellard.net>
    Reviewed-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 219087
* math: Add tan implementation | Aaron Watry | 2014-09-10 | 2 | -0/+3
    Uses the algorithm:
    tan(x) = sin(x) / sqrt(1-sin^2(x))
    An alternative is:
    tan(x) = sin(x) / cos(x)
    Which produces more verbose bitcode and longer assembly.
    Either way, the generated bitcode seems pretty nasty and a more optimized but still precise-enough solution is welcome.
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 217511
* math: Add asin implementation | Aaron Watry | 2014-09-10 | 2 | -0/+3
    asin(x) = atan2(x, sqrt( 1-x^2 ))
    alternatively:
    asin(x) = PI/2 - acos(x)
    Use the atan2 implementation since it produces slightly shorter bitcode and R600 machine code.
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 217510
* math: Add acos implementation | Aaron Watry | 2014-09-10 | 2 | -0/+3
    Passes the tests that were submitted to the piglit list
    Tested on R600 (Pitcairn)
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 217509
* Fix implementation of copysign | Tom Stellard | 2014-09-03 | 2 | -1/+3
    This was previously implemented with a macro and we were using __builtin_copysign(), which takes double inputs for the float version of copysign().
    Reviewed-and-Tested-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 217045
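The type mismatch described above has a direct analogue in standard C, where copysign operates on double and copysignf on float. The sketch below only illustrates picking the function that matches the operand type; the wrapper names are hypothetical and this is not the libclc fix itself.

```c
#include <math.h>

/* Hypothetical float wrapper: copysignf keeps the computation in single
 * precision instead of promoting both operands to double. */
float copysign_f32(float magnitude, float sign_source) {
    return copysignf(magnitude, sign_source);
}

/* Hypothetical double wrapper using the double-typed function. */
double copysign_f64(double magnitude, double sign_source) {
    return copysign(magnitude, sign_source);
}
```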
* Implement sin builtin for float types | Tom Stellard | 2014-07-23 | 2 | -6/+4
    The double version still uses @llvm.sin.
    llvm-svn: 213762
* Implement cos builtin for float types | Tom Stellard | 2014-07-23 | 2 | -6/+4
    The double version still uses @llvm.cos.
    llvm-svn: 213761
* Implement atan2 builtin | Tom Stellard | 2014-07-23 | 3 | -0/+48
    llvm-svn: 213760
* Implement atan builtin | Tom Stellard | 2014-07-23 | 2 | -0/+47
    llvm-svn: 213759
* Add exp10 | Jeroen Ketema | 2014-06-25 | 2 | -0/+10
    Reviewed-by: Tom Stellard <tom@stellard.net>
    llvm-svn: 211680
* Add pown | Jeroen Ketema | 2014-06-18 | 1 | -0/+24
    Reviewed-by: Tom Stellard <tom@stellard.net>
    llvm-svn: 211211
* math: Implement mix builtin | Aaron Watry | 2014-06-16 | 2 | -0/+7
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Tom Stellard <tom@stellard.net>
    llvm-svn: 211047
* Implementations for exp(float) and exp(double) v2 | Jeroen Ketema | 2014-06-13 | 2 | -2/+11
    Use separate implementations instead of a macro to ensure the constant multiplied with is of higher precision.
    v2: Use the correct formula, spotted by Dan Liew <daniel.liew@imperial.ac.uk>
    Reviewed-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Tom Stellard <tom@stellard.net>
    llvm-svn: 210891
* Add sincos | Tom Stellard | 2014-03-21 | 2 | -0/+10
    Patch by: Jeroen Ketema
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 204478
* Implement trunc builtin. | Tom Stellard | 2013-12-20 | 1 | -0/+9
    OpenCL C lang says that trunc rounds towards zero.
    llvm.trunc.* intrinsic rounds to integer not larger in magnitude.
    These definitions are equivalent.
    Patch by: Jan Vesely
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 197769
* Implement round builtin | Tom Stellard | 2013-11-18 | 1 | -0/+9
    llvm-svn: 195022
* Implement nextafter() builtin | Tom Stellard | 2013-10-10 | 2 | -0/+16
    There are two implementations of nextafter():
    1. Using clang's __builtin_nextafter. Clang replaces this builtin with a call to nextafter which is part of libm. Therefore, this implementation will only work for targets with an implementation of libm (e.g. most CPU targets).
    2. The other implementation is written in OpenCL C. This function is known internally as __clc_nextafter and can be used by targets that don't have access to libm.
    llvm-svn: 192383
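How a nextafter can be written without libm, as the second implementation above does, can be sketched in C by stepping the IEEE 754 bit pattern of a float. This is an illustration under the assumption of 32-bit IEEE 754 floats; `next_after_bits` is a hypothetical name and the body is not taken from __clc_nextafter.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bit-level nextafter for 32-bit IEEE 754 floats. */
float next_after_bits(float x, float y) {
    uint32_t bits;
    float result;

    if (isnan(x) || isnan(y)) return NAN;  /* NaN in, NaN out */
    if (x == y) return y;                  /* also covers +0 versus -0 */

    if (x == 0.0f) {
        /* Step from zero to the smallest subnormal with the sign of y. */
        bits = (y > 0.0f) ? 0x00000001u : 0x80000001u;
    } else {
        memcpy(&bits, &x, sizeof bits);
        /* Moving away from zero increments the magnitude bits, moving
         * toward zero decrements them. The sign bit is untouched. */
        if ((x < y) == (x > 0.0f)) bits++;
        else bits--;
    }
    memcpy(&result, &bits, sizeof result);
    return result;
}
```

Because adjacent floats of the same sign have adjacent bit patterns, a single integer increment or decrement moves to the neighbouring representable value, including across the boundary between the largest finite value and infinity.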
* Implement generic rint() | Tom Stellard | 2013-08-10 | 1 | -0/+6
    llvm-svn: 188130