summaryrefslogtreecommitdiffstats
path: root/libclc/generic/lib/math
Commit message (Collapse)AuthorAgeFilesLines
* math: Fix ilogb(double) return typeAaron Watry2016-02-241-1/+1
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 261714
* math: Add ilogb ported from amd-builtinsAaron Watry2016-02-231-0/+57
| | | | | | | | | | | | | | | | The scalar float/double function bodies are a direct copy/paste with usage of the CLC wrappers to vectorize them. This commit also adds in the FP_ILOGB0 and FP_ILOGBNAN macros which are equal to the results of ilogb(0.0f) and ilogb(float nan) respectively. v2: Add FP_ILOGB0 and FP_ILOGBNAN definitions Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> v1 Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 261639
* math: Fix log2 vectorization on non-fp64 hwJan Vesely2016-02-091-0/+2
| | | | | | reviewer: tstellard Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 260301
* math: Add frexp ported from amd-builtinsAaron Watry2016-02-082-0/+120
| | | | | | | | | | | | | | | | | | The float implementation is almost a direct port from the amd-builtins, but instead of just having a scalar and float4 implementation, it has a scalar and arbitrary width vector implementation. The double scalar is also a direct port from AMD's builtin release. The double vector implementation copies the logic in the float vector implementation using the values from the double scalar version. Both have been tested in piglit using tests sent to that project's mailing list. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 260114
* Implement modf math builtinTom Stellard2016-01-272-0/+69
| | | | | | | | V2: use the reference implementation as suggested by Matt Arsenault Patch By: Pavel Ondračka llvm-svn: 258933
* Implement tanh builtinNiels Ole Salscheider2015-09-291-0/+146
| | | | | | This is a port from the AMD builtin library. llvm-svn: 248780
* Fix double implementation of logTom Stellard2015-07-241-0/+26
| | | | | | We need to use M_LOG2E instead of M_LOG2E_F. llvm-svn: 243132
* Implement accurate log2 functionTom Stellard2015-07-244-0/+469
| | | | | | | | | Use the implementation was ported from the AMD builtin library rather than LLVM Intrinsics. This has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 243131
* Use llvm intrinsics for native_log and native_log2Tom Stellard2015-07-244-0/+114
| | | | llvm-svn: 243130
* Fix implementation of sqrt v2Tom Stellard2015-07-103-0/+110
| | | | | | | | | | | Passing values less than 0 to the llvm.sqrt() intrinsic results in undefined behavior, so we need to check the input and return NaN if is is less than 0. v2: - Fix build failures. llvm-svn: 241906
* Use a more accurate implementation for expTom Stellard2015-05-132-13/+85
| | | | | | | | | | | | Using exp2(x * M_LOG2E_F) does not give us accurate enough results for OpenCL. If you look at the new exp implementation you'll see that it does multiply the input by M_LOG2E_F, but it still uses the original input in part of the calculation. This exp implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237229
* Implement exp2 using OpenCL C rather than using an intrinsicTom Stellard2015-05-135-1/+255
| | | | | | | | | | Not all targets support the intrinsic, so it's better to have a generic implementation which does not use it. This exp2 implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237228
* Implement sin for double typesTom Stellard2015-05-121-7/+16
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237155
* Implement cos for double typesTom Stellard2015-05-125-7/+289
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237154
* Implement atan2pi builtinTom Stellard2015-05-121-0/+221
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237138
* Implement atan2 for doublesTom Stellard2015-05-123-2/+412
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237131
* math: limit half_sqrt to single precisionJan Vesely2015-05-091-4/+2
| | | | | | Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 236941
* Fix ldexp fp64 build errorJan Vesely2015-05-091-1/+1
| | | | | | Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 236939
* Implement half_rsqrt builtin v3Tom Stellard2015-05-082-0/+53
| | | | | | | | | | | | | This is a generic implementation which just calls rsqrt. Targets should override this if they want a faster implementation. v2: - Alphabettize SOURCES v3 (Jan Vesely): Limit to single precision types. llvm-svn: 236915
* Move ldexp soft implementation to a separate fileJan Vesely2015-05-062-100/+131
| | | | | | Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 236648
* Implement sinpi builtinJan Vesely2015-05-061-0/+131
| | | | | | | | Ported from AMD builtin library, passes piglit on Turks. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 236647
* math: Add ldexp implementationTom Stellard2015-05-062-0/+167
| | | | | | | | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Tom Stellard: - Add denormal handling. - Share vectorization code with r600 implementation. Patch By: Aaron Watry llvm-svn: 236639
* Fix compilation warnings without cl_khr_fp64Jan Vesely2015-04-243-6/+32
| | | | | | Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 235762
* Implement fract builtinTom Stellard2015-04-232-0/+79
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 235620
* Implement atanh builtinTom Stellard2015-04-071-0/+113
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 234324
* Implement acosh builtinTom Stellard2015-04-071-0/+127
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 234323
* Implement atanpi builtinTom Stellard2015-04-021-0/+182
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 233928
* Implement asinpi builtinTom Stellard2015-04-021-0/+170
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 233927
* Implement asinh builtinTom Stellard2015-04-023-0/+416
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 233926
* Implement acospi builtinTom Stellard2015-04-021-0/+172
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 233925
* Implement fmax using __builtin_fmaxTom Stellard2015-03-312-4/+27
| | | | | | | | This ensures correct handling of NaNi. This has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 233713
* Implement fmin using __builtin_fminTom Stellard2015-03-312-4/+27
| | | | | | | | This ensures correct handling of NaN. This has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 233712
* Implement half_sqrt builtin v2Tom Stellard2015-03-232-0/+55
| | | | | | | | | | This is a generic implementation which just calls sqrt. Targets should override this if they want a faster implementation. v2: - Alphabetize SOURCES llvm-svn: 232965
* Add __clc_ prefix to functions in sincos_helpers.clTom Stellard2015-03-234-28/+24
| | | | | | | This will help avoid naming conflicts with functions defined in kernels linking with libclc. llvm-svn: 232960
* math: Implement erfcAaron Watry2015-03-181-0/+413
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 232674
* Move mix from math to commonAaron Watry2015-03-032-17/+0
| | | | | | | | It has been part of the common functions since 1.0 Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 231137
* libclc/math: Add cospiAaron Watry2015-02-263-0/+270
| | | | | | | | | | | | | | | | | | | Ported from the libclc/amd-builtins branch v2: Rename sincos_f_piby4 to __libclc__sincosf_piby4 Add cospi(double) implementation instead of using llvm.cos Notes: The sincosD_piby4.h file is mostly the same as the builtin implementation released by AMD. The inline attribute declaration is changed, and M_PI is used instead of a constant double. Otherwise, the only difference is that the header explicitly enables the fp64 pragma. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jeroen Ketema <j.ketema@imperial.ac.uk> CC: Tom Stellard <tom@stellard.net> CC: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 230641
* Implement log10Jan Vesely2015-01-302-0/+21
| | | | | | | | v2: Use constant and multiplication instead of division v3: Use hex constants Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 227585
* Implement log1p builtinTom Stellard2014-10-074-0/+619
| | | | llvm-svn: 219230
* Implement fmodJan Vesely2014-10-051-0/+12
| | | | | | | Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 219087
* math: Add tan implementationAaron Watry2014-09-102-0/+16
| | | | | | | | | | | | | | | | | Uses the algorithm: tan(x) = sin(x) / sqrt(1-sin^2(x)) An alternative is: tan(x) = sin(x) / cos(x) Which produces more verbose bitcode and longer assembly. Either way, the generated bitcode seems pretty nasty and a more optimized but still precise-enough solution is welcome. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 217511
* math: Add asin implementationAaron Watry2014-09-102-0/+11
| | | | | | | | | | | | | | asin(x) = atan2(x, sqrt( 1-x^2 )) alternatively: asin(x) = PI/2 - acos(x) Use the atan2 implementation since it produces slightly shorter bitcode and R600 machine code. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 217510
* math: Add acos implementationAaron Watry2014-09-102-0/+29
| | | | | | | | | | Passes the tests that were submitted to the piglit list Tested on R600 (Pitcairn) Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 217509
* Fix implementation of copysignTom Stellard2014-09-031-0/+12
| | | | | | | | | This was previously implemented with a macro and we were using __builtin_copysign(), which takes double inputs for the float version of copysign(). Reviewed-and-Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 217045
* Implement sin builtin for float typesTom Stellard2014-07-231-0/+70
| | | | | | This double version still uses @llvm.sin. llvm-svn: 213762
* Implement cos builtin for float typesTom Stellard2014-07-233-0/+400
| | | | | | The double version still uses @llvm.cos. llvm-svn: 213761
* Implement atan2 builtinTom Stellard2014-07-231-0/+81
| | | | llvm-svn: 213760
* Implement atan builtinTom Stellard2014-07-232-0/+247
| | | | llvm-svn: 213759
* Add exp10Jeroen Ketema2014-06-252-0/+18
| | | | | Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 211680
* Move clcmacro.h to avoid cluttering user namespace v2Jeroen Ketema2014-06-243-0/+3
| | | | | | | | | v2: - use quotes instead of <> - add include to r600/lib/math/nextafter.c changed Reviewed-by: Tom Stellard <tom@stellard.net> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 211576
OpenPOWER on IntegriCloud