path: root/libclc/generic/include/clc
Commit message | Author | Age | Files | Lines
...
* Implementations for exp(float) and exp(double) v2 | Jeroen Ketema | 2014-06-13 | 2 | -2/+11
  Use separate implementations instead of a macro to ensure that the constant used in the multiplication is of higher precision.
  v2: Use the correct formula, spotted by Dan Liew <daniel.liew@imperial.ac.uk>
  Reviewed-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Tom Stellard <tom@stellard.net>
  llvm-svn: 210891
* Add more log related float constants | Jeroen Ketema | 2014-05-29 | 1 | -0/+3
  llvm-svn: 209850
* Fix _F definitions | Jeroen Ketema | 2014-05-29 | 1 | -2/+2
  The 'f' was missing and, hence, the values were considered to be doubles instead of floats.
  Reviewed by: Tom Stellard
  llvm-svn: 209849
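The effect of a missing 'f' suffix can be reproduced in plain C. This is a minimal sketch, not the libclc header: `HYP_LOG2E_BAD_F` and `HYP_LOG2E_F` are hypothetical names chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical constants for illustration only. */
#define HYP_LOG2E_BAD_F 1.44269504088896340736   /* no suffix: type double */
#define HYP_LOG2E_F     1.44269504088896340736f  /* 'f' suffix: type float */

/* Size of the product of a float and each constant. Without the
   suffix the float operand is converted and the result is a double. */
size_t hyp_size_without_suffix(float x) { return sizeof(x * HYP_LOG2E_BAD_F); }
size_t hyp_size_with_suffix(float x)    { return sizeof(x * HYP_LOG2E_F); }
```

On a device without double support, the first form would pull double arithmetic into what was meant to be a float-only expression.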
* Add definition for M_PI | Jeroen Ketema | 2014-05-29 | 1 | -0/+1
  Reviewed by: Tom Stellard
  llvm-svn: 209848
* Remove clc/gentype.inc | Tom Stellard | 2014-04-30 | 1 | -51/+0
  This file duplicates clc/math/gentype.inc and is not actually being used.
  Patch by: Jeroen Ketema
  llvm-svn: 207684
* Introduce M_LOG2E_F and M_LOG2E | Tom Stellard | 2014-03-28 | 1 | -1/+4
  Patch by: Jeroen Ketema
  llvm-svn: 205055
* Replace tabs by spaces | Tom Stellard | 2014-03-28 | 1 | -19/+19
  Patch by: Jeroen Ketema
  llvm-svn: 205054
* Add definition for M_PI_F v3 | Tom Stellard | 2014-03-24 | 1 | -0/+2
  v2: - Use a hexadecimal constant.
  v3: - Use a hexadecimal constant in floating-point notation.
  llvm-svn: 204666
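A hexadecimal floating-point literal pins down the exact bit pattern of a constant. The sketch below assumes the value is the float nearest to pi; `HYP_PI_F` and `hyp_float_bits` are hypothetical names, and the literal is not copied from the libclc header.

```c
#include <stdint.h>
#include <string.h>

/* Assumed value: the float nearest to pi, written as a hexadecimal
   floating-point literal (C99 syntax). */
#define HYP_PI_F 0x1.921fb6p+1f

/* Return the raw IEEE-754 bit pattern of a float. */
uint32_t hyp_float_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Unlike a decimal literal, this form involves no decimal-to-binary rounding, so every compiler produces the same bits.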
* Add sincos | Tom Stellard | 2014-03-21 | 3 | -0/+11
  Patch by: Jeroen Ketema
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 204478
* Add cross for double3 and double4 | Tom Stellard | 2014-03-21 | 1 | -0/+5
  Patch by: Jeroen Ketema
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 204477
* Add floating-point macro definitions v2 | Tom Stellard | 2013-12-20 | 2 | -0/+27
  v2: - Fix typo.
  Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 197784
* Implement trunc builtin. | Tom Stellard | 2013-12-20 | 2 | -0/+10
  The OpenCL C language specification says that trunc rounds towards zero. The llvm.trunc.* intrinsic rounds to the nearest integer not larger in magnitude. These definitions are equivalent.
  Patch by: Jan Vesely
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 197769
* Fix a C&P error in r195021 (65a950abab3cb8435ccb2646ac4773986c995c81) | Tom Stellard | 2013-11-28 | 1 | -2/+2
  Patch by: Kai Wasserbäch
  Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
  llvm-svn: 195898
* Implement round builtin | Tom Stellard | 2013-11-18 | 2 | -0/+10
  llvm-svn: 195022
* Implement builtins for cl_khr_global_int32_base_atomics extension | Tom Stellard | 2013-11-18 | 5 | -1/+15
  llvm-svn: 195021
* Port pocl's gen_convert.py script to libclc | Tom Stellard | 2013-10-10 | 1 | -2/+9
  This script generates implementations for the entire set of convert_* functions,
  llvm-svn: 192385
* Implement sign() builtin | Tom Stellard | 2013-10-10 | 2 | -0/+8
  llvm-svn: 192384
* Implement nextafter() builtin | Tom Stellard | 2013-10-10 | 4 | -0/+28
  There are two implementations of nextafter():
  1. Using clang's __builtin_nextafter. Clang replaces this builtin with a call to nextafter which is part of libm. Therefore, this implementation will only work for targets with an implementation of libm (e.g. most CPU targets).
  2. The other implementation is written in OpenCL C. This function is known internally as __clc_nextafter and can be used by targets that don't have access to libm.
  llvm-svn: 192383
* Implement isnan() builtin | Tom Stellard | 2013-10-10 | 3 | -0/+28
  llvm-svn: 192382
* Add missing as_{float,double} functions | Tom Stellard | 2013-10-10 | 1 | -0/+15
  llvm-svn: 192381
* Parenthesize arguments for mad_hi | Aaron Watry | 2013-09-09 | 1 | -1/+1
  Thanks to Jordan Rose <jordan_rose@apple.com> for pointing this out.
  llvm-svn: 190310
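Why macro arguments need parentheses can be shown with a small C sketch. The macro and function names below are hypothetical, and the exact form of the libclc macro is an assumption.

```c
#include <stdint.h>

/* High 32 bits of the 64-bit product of two 32-bit values. */
static uint32_t hyp_mul_hi(uint32_t a, uint32_t b) {
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* Unparenthesized arguments: an operator inside 'c' with lower
   precedence than '+' regroups the whole expression. */
#define HYP_MAD_HI_BAD(a, b, c) (hyp_mul_hi(a, b) + c)

/* Parenthesized arguments: each argument is evaluated as a unit. */
#define HYP_MAD_HI(a, b, c) (hyp_mul_hi((a), (b)) + (c))

/* mul_hi(0x10000, 0x10000) is 1, and the third argument is a comparison. */
uint32_t hyp_bad(uint32_t n)  { return HYP_MAD_HI_BAD(0x10000u, 0x10000u, n == 0); }
uint32_t hyp_good(uint32_t n) { return HYP_MAD_HI(0x10000u, 0x10000u, n == 0); }
```

With `n == 0` as the third argument, the unparenthesized macro expands to `(1 + n) == 0` rather than `1 + (n == 0)`.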
* Implement mad_hi built-in | Aaron Watry | 2013-09-06 | 2 | -0/+2
  We already have a working mul_hi, and the spec gives us the implementation as: Returns mul_hi(a,b)+c.
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 190211
* Add atomic_sub and atomic_dec builtin functions | Aaron Watry | 2013-09-06 | 3 | -0/+6
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 190201
* Remove unneeded semi-colons | Aaron Watry | 2013-09-05 | 1 | -6/+6
  Reviewed-By: Aaron Watry <awatry@gmail.com>
  llvm-svn: 190059
* Add atomic_inc and atomic_add builtins | Aaron Watry | 2013-09-05 | 4 | -0/+18
  Reviewed-by: Aaron Watry <awatry@gmail.com>
  llvm-svn: 190058
* Add mul_hi implementation [v2] | Aaron Watry | 2013-08-19 | 3 | -0/+4
  Everything except long/ulong is handled by just casting to the next larger type, doing the math and then shifting/casting the result.
  For 64-bit types, we break the high/low parts of each operand apart, and do a FOIL-based multiplication.
  v2: Discard the stack-overflow implementation due to copyright concerns.
    - The implementation is still FOIL-based, but discards the previous code.
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 188684
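Both strategies described in this message can be sketched in C for the unsigned case. This is an illustrative reconstruction under stated assumptions, not the code that was committed; all `hyp_` names are hypothetical.

```c
#include <stdint.h>

/* 32-bit case: widen to the next larger type, multiply, shift. */
uint32_t hyp_mul_hi_u32(uint32_t a, uint32_t b) {
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* 64-bit case: split each operand into 32-bit halves and combine the
   four partial products (First, Outer, Inner, Last). */
uint64_t hyp_mul_hi_u64(uint64_t a, uint64_t b) {
    const uint64_t mask = 0xffffffffu;
    uint64_t a_lo = a & mask, a_hi = a >> 32;
    uint64_t b_lo = b & mask, b_hi = b >> 32;

    uint64_t first = a_hi * b_hi;  /* contributes at bit 64 and above */
    uint64_t outer = a_hi * b_lo;  /* contributes at bit 32 and above */
    uint64_t inner = a_lo * b_hi;  /* contributes at bit 32 and above */
    uint64_t last  = a_lo * b_lo;  /* contributes at bit 0 and above  */

    /* Carry out of bits 32..63 of the full 128-bit product. */
    uint64_t carry = ((last >> 32) + (outer & mask) + (inner & mask)) >> 32;

    return first + (outer >> 32) + (inner >> 32) + carry;
}
```

The carry term cannot overflow 64 bits because it sums three values that are each below 2^32.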
* Add rhadd builtin | Aaron Watry | 2013-08-15 | 3 | -0/+4
  rhadd = (x+y+1)>>1
  Implemented as: (x>>1) + (y>>1) + ((x&1)|(y&1))
  This prevents us having to do assembly addition and overflow detection
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 188477
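The rhadd identity can be checked exhaustively for 8-bit unsigned values. The sketch below is plain C with hypothetical `hyp_` names, not the libclc source.

```c
#include <stdint.h>

/* Overflow-free form from the commit message. */
uint8_t hyp_rhadd_u8(uint8_t x, uint8_t y) {
    return (uint8_t)((x >> 1) + (y >> 1) + ((x & 1) | (y & 1)));
}

/* Reference: compute (x + y + 1) >> 1 in a wider type. */
uint8_t hyp_rhadd_u8_wide(uint8_t x, uint8_t y) {
    return (uint8_t)(((unsigned)x + y + 1u) >> 1);
}

/* Count the input pairs on which the two forms disagree. */
unsigned hyp_rhadd_mismatches(void) {
    unsigned bad = 0;
    for (unsigned x = 0; x < 256; ++x)
        for (unsigned y = 0; y < 256; ++y)
            if (hyp_rhadd_u8((uint8_t)x, (uint8_t)y) !=
                hyp_rhadd_u8_wide((uint8_t)x, (uint8_t)y))
                ++bad;
    return bad;
}
```

The low-bit term is 1 whenever at least one operand is odd, which is exactly the rounding that `(x + y + 1) >> 1` performs.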
* Add hadd builtin | Aaron Watry | 2013-08-15 | 3 | -0/+4
  (x + y) >> 1 gets changed to: (x>>1) + (y>>1) + (x&y&1)
  Saves us having to do any llvm assembly and overflow checking in the addition.
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 188476
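A short C sketch shows why the rewritten hadd form matters: the naive sum wraps at the type width, while the split form does not. Function names are hypothetical.

```c
#include <stdint.h>

/* Naive form: x + y wraps modulo 2^32 before the shift. */
uint32_t hyp_hadd_naive_u32(uint32_t x, uint32_t y) {
    return (x + y) >> 1;
}

/* Form from the commit message: halve first, then add the carry that
   is lost only when both low bits are set. */
uint32_t hyp_hadd_u32(uint32_t x, uint32_t y) {
    return (x >> 1) + (y >> 1) + (x & y & 1u);
}
```

For two operands of 0xFFFFFFFF the naive form returns 0x7FFFFFFF, whereas the split form returns the expected 0xFFFFFFFF.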
* Add some missing convert_* functions | Tom Stellard | 2013-08-10 | 1 | -30/+1
  llvm-svn: 188131
* Implement generic rint() | Tom Stellard | 2013-08-10 | 2 | -0/+7
  llvm-svn: 188130
* Add missing integer min/max definitions | Aaron Watry | 2013-07-26 | 2 | -0/+18
  Found in CL 1.1 spec section 6.11.3
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 187200
* Implement generic upsample() | Aaron Watry | 2013-07-19 | 2 | -0/+26
  Reduces all vector upsamples down to their scalar components, so probably not the most efficient thing in the world, but it does what the spec says it needs to do.
  Another possible implementation would be to convert/cast everything as unsigned if necessary, upsample the input vectors, create the upsampled value, and then cast back to signed if required.
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
  llvm-svn: 186691
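The scalar operation that each vector component reduces to can be sketched in C for the 8-bit to 16-bit case. This assumes the usual definition of upsample (high part shifted up by the element width, combined with an unsigned low part); the `hyp_` names are hypothetical.

```c
#include <stdint.h>

/* Unsigned: place hi in the upper byte and lo in the lower byte. */
uint16_t hyp_upsample_u8(uint8_t hi, uint8_t lo) {
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}

/* Signed: hi is signed, lo is unsigned. Multiplying by 256 avoids
   left-shifting a negative value, which is undefined in C. */
int16_t hyp_upsample_i8(int8_t hi, uint8_t lo) {
    return (int16_t)((int16_t)hi * 256 + lo);
}

/* A vector upsample applied one component at a time. */
void hyp_upsample_u8x4(const uint8_t hi[4], const uint8_t lo[4],
                       uint16_t out[4]) {
    for (int i = 0; i < 4; ++i)
        out[i] = hyp_upsample_u8(hi[i], lo[i]);
}
```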
* Add integer-gentype.inc: Missing file from r185839 | Tom Stellard | 2013-07-15 | 1 | -0/+39
  llvm-svn: 186326
* Implement mad24() and mul24() builtins | Tom Stellard | 2013-07-08 | 5 | -0/+10
  Reviewed-by: Aaron Watry <awatry@gmail.com>
  llvm-svn: 185839
* Add __CLC_ prefix to all macro definitions in headers | Tom Stellard | 2013-07-08 | 50 | -644/+644
  libclc was defining and undefining GENTYPE and several other macros with common names in its header files. This was preventing applications from defining macros with identical names as command line arguments to the compiler, because the definitions in the header files were masking the macros defined as compiler arguments.
  Reviewed-by: Aaron Watry <awatry@gmail.com>
  llvm-svn: 185838
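The masking problem and its fix can be simulated in a single C file. In this sketch the first `#define` stands in for a `-DGENTYPE=7` compiler argument, and `HYP_CLC_GENTYPE` is a hypothetical stand-in for a prefixed header macro.

```c
/* Stands in for a user passing -DGENTYPE=7 on the command line. */
#define GENTYPE 7

/* Old style (shown only as a comment): a header doing
 *     #define GENTYPE float ... #undef GENTYPE
 * would redefine and then remove the user's GENTYPE. */

/* New style: the header uses a prefixed name for its helper macro
   and leaves the user's GENTYPE untouched. */
#define HYP_CLC_GENTYPE float
HYP_CLC_GENTYPE hyp_twice(HYP_CLC_GENTYPE x) { return x + x; }
#undef HYP_CLC_GENTYPE

/* User code still sees its own definition after the header. */
int hyp_user_value(void) { return GENTYPE; }
```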
* Add bitselect() builtin | Tom Stellard | 2013-07-08 | 2 | -0/+2
  Reviewed-By: Aaron Watry <awatry@gmail.com>
  llvm-svn: 185836
* libclc: Initial vstore implementation | Tom Stellard | 2013-06-26 | 2 | -0/+37
  Assumes that the target supports byte-addressable stores.
  Completely unoptimized.
  Patch by: Aaron Watry
  llvm-svn: 185007
* libclc: Initial vload implementation | Tom Stellard | 2013-06-26 | 2 | -0/+38
  Should work for all targets and data types. Completely unoptimized.
  Patch by: Aaron Watry
  llvm-svn: 185006
* libclc: Implement clz() builtin | Tom Stellard | 2013-06-26 | 3 | -0/+4
  Squashed commit of the following:

  commit a0df0a0e86c55c1bdc0b9c0f5a739e5adef4b056
  Author: Aaron Watry <awatry@gmail.com>
  Date: Mon Apr 15 18:42:04 2013 -0500
    libclc: Rename clz.ll to clz_if.ll to ensure it gets built.
    configure.py treats files that have the same name with the .cl and .ll extensions as overriding each other. E.g. if you have clz.cl and clz.ll both specified to be built in the same SOURCES file, only the first file listed will actually be built.
    Since the contents of clz.ll were an interface that is implemented in clz_impl.ll, rename clz.ll to clz_if.ll to make sure that the interface is built.

  commit 931b62bed05c58f737de625bd415af09571a6a5a
  Author: Aaron Watry <awatry@gmail.com>
  Date: Sat Apr 13 12:32:54 2013 -0500
    libclc: llvm assembly implementation of clz
    Untested... currently crashes in the same manner as add_sat.

  commit 6ef0b7b0b6d2e5584086b4b9a9243743b2e0538f
  Author: Aaron Watry <awatry@gmail.com>
  Date: Sat Mar 23 12:35:27 2013 -0500
    libclc: Add stub clz builtin
    For scalar int/uint, attempt to use the clz llvm builtin. For all others, return 0 until an actual implementation is finished.

  Patch by: Aaron Watry
  llvm-svn: 185004
* libclc: Add clamp(vec, scalar, scalar) and max(vec, scalar) | Tom Stellard | 2013-06-26 | 2 | -0/+8
  For any GENTYPE that isn't scalar, we need to implement a mixed vector/scalar version of clamp/max.
  This depends on the min() patches I sent to the list a few minutes ago.
  Patch by: Aaron Watry
  llvm-svn: 185003
* libclc: Implement the min(vec, scalar) version of the min builtin. | Tom Stellard | 2013-06-26 | 3 | -0/+35
  Checks if the current GENTYPE is scalar, and if not, then defines a separate implementation of the function which casts the second arg to vector before proceeding.
  Patch by: Aaron Watry
  llvm-svn: 185002
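The "cast the second argument to a vector, then proceed" approach can be sketched in plain C, where a struct stands in for an OpenCL vector type. `hyp_int4` and the helper functions are hypothetical and only mimic what an OpenCL C scalar-to-vector cast does.

```c
/* Stand-in for an OpenCL int4. */
typedef struct { int s[4]; } hyp_int4;

hyp_int4 hyp_make4(int a, int b, int c, int d) {
    hyp_int4 v = {{a, b, c, d}};
    return v;
}

/* Equivalent of casting a scalar to a vector: replicate it. */
hyp_int4 hyp_splat4(int x) {
    return hyp_make4(x, x, x, x);
}

/* Component-wise min of two vectors. */
hyp_int4 hyp_min4(hyp_int4 a, hyp_int4 b) {
    hyp_int4 r;
    for (int i = 0; i < 4; ++i)
        r.s[i] = a.s[i] < b.s[i] ? a.s[i] : b.s[i];
    return r;
}

/* min(vec, scalar): widen the scalar, then reuse the vector version. */
hyp_int4 hyp_min4_scalar(hyp_int4 a, int b) {
    return hyp_min4(a, hyp_splat4(b));
}
```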
* libclc: implement initial version of min() | Tom Stellard | 2013-06-26 | 3 | -0/+7
  This doesn't handle the integer cases for min(vector, scalar).
  Patch by: Aaron Watry
  llvm-svn: 185001
* Simplify rotate implementation a bit.. | Tom Stellard | 2013-06-26 | 1 | -0/+16
  Much more understandable/readable as a result, and probably more efficient.
  Patch by: Aaron Watry
  llvm-svn: 184997
* libclc: implement rotate builtin | Tom Stellard | 2013-06-26 | 4 | -0/+15
  This implementation does a lot of bit shifting and masking. Suffice to say, this is somewhat suboptimal... but it does look to produce correct results (after the piglit tests were corrected for sign extension issues).
  Someone who knows LLVM better than I could re-write this more efficiently.
  Patch by: Aaron Watry
  llvm-svn: 184996
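A shift-and-mask left rotate for 32-bit unsigned values can be sketched in C as follows. This is a common idiom shown for illustration, not the code from this commit, and `hyp_rotate_u32` is a hypothetical name.

```c
#include <stdint.h>

/* Rotate x left by n bits. Masking keeps both shift counts in the
   range 0..31, so a count of 0 or 32 never shifts by the full width. */
uint32_t hyp_rotate_u32(uint32_t x, uint32_t n) {
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}
```

Working on unsigned values avoids the sign-extension problems that right-shifting a signed operand would introduce.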
* libclc: Move max builtin to shared/ | Tom Stellard | 2013-06-26 | 6 | -7/+6
  Max(x,y) is available for all integer/floating types.
  Patch by: Aaron Watry
  llvm-svn: 184995
* libclc: Add clamp() builtin for integer/floating point | Tom Stellard | 2013-06-26 | 3 | -0/+9
  Created under a new shared/ directory for functions which are available for both integer and floating point types.
  Patch by: Aaron Watry
  llvm-svn: 184994
* libclc: Add max() builtin function | Tom Stellard | 2013-06-26 | 5 | -0/+8
  Adds this function for both int and floating data types.
  Patch by: Aaron Watry
  llvm-svn: 184992
* Implement ceil() builtin | Tom Stellard | 2013-06-26 | 2 | -0/+7
  llvm-svn: 184988
* Implement fmax() and fmin() builtins | Tom Stellard | 2013-06-26 | 5 | -0/+34
  llvm-svn: 184987
* Remove the static keyword from the _CLC_INLINE macro | Tom Stellard | 2013-06-26 | 1 | -1/+1
  static functions are not allowed in OpenCL C
  llvm-svn: 184986