summaryrefslogtreecommitdiffstats
path: root/libclc
Commit message (Collapse)AuthorAgeFilesLines
...
* Implement fract builtinTom Stellard2015-04-236-0/+130
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 235620
* configure: Add --enable-runtime-subnormal optionTom Stellard2015-04-207-0/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This makes it possible for runtime implementations to disable subnormal handling at runtime. When this flag is enabled, decisions about how to handle subnormals in the library will be controlled by an external variable called __CLC_SUBNORMAL_DISABLE. Function implementations should use these new helpers for querying subnormal support: __clc_fp16_subnormals_supported(); __clc_fp32_subnormals_supported(); __clc_fp64_subnormals_supported(); In order for the library to link correctly with this feature, users will be required to either: 1. Insert this variable into the module (if using the LLVM/Clang C++/C APIs). 2. Pass either subnormal_disable.bc or subnormal_use_default.bc to the linker. These files are distributed with liblclc and installed to $(installdir). e.g.: llvm-link -o kernel-out.bc kernel.bc builtins-nosubnormal.bc subnormal_disable.bc or llvm-link -o kernel-out.bc kernel.bc builtins-nosubnormal.bc subnormal_use_default.bc If you do not supply the --enable-runtime-subnormal then the library behaves the same as it did before this commit. In addition to these changes, the patch adds helper functions that should be used when implementing library functions that need special handling for denormals: __clc_fp16_subnormals_supported(); __clc_fp32_subnormals_supported(); __clc_fp64_subnormals_supported(); llvm-svn: 235329
* Implement atanh builtinTom Stellard2015-04-075-0/+162
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 234324
* Implement acosh builtinTom Stellard2015-04-075-0/+176
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 234323
* Implement atanpi builtinTom Stellard2015-04-025-0/+231
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 233928
* Implement asinpi builtinTom Stellard2015-04-025-0/+219
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 233927
* Implement asinh builtinTom Stellard2015-04-027-0/+466
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 233926
* Implement acospi builtinTom Stellard2015-04-025-0/+221
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 233925
* Implement fmax using __builtin_fmaxTom Stellard2015-03-313-8/+28
| | | | | | | | This ensures correct handling of NaNi. This has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 233713
* Implement fmin using __builtin_fminTom Stellard2015-03-313-8/+28
| | | | | | | | This ensures correct handling of NaN. This has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 233712
* Implement fast_distance builtinTom Stellard2015-03-236-0/+104
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 232978
* Implement fast_length builtinTom Stellard2015-03-235-0/+109
| | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 232977
* Implement half_sqrt builtin v2Tom Stellard2015-03-235-0/+88
| | | | | | | | | | This is a generic implementation which just calls sqrt. Targets should override this if they want a faster implementation. v2: - Alphabetize SOURCES llvm-svn: 232965
* Implement distance builtin v2Tom Stellard2015-03-235-0/+80
| | | | | | | | | | This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. v2: - Remove unnecessary copyright. llvm-svn: 232964
* Fix implementation of length builtin v2Tom Stellard2015-03-232-6/+82
| | | | | | | | v2: - Move common code into a macro - Use the same constant for all vector types. llvm-svn: 232963
* Add __clc_ prefix to functions in sincos_helpers.clTom Stellard2015-03-234-28/+24
| | | | | | | This will help avoid naming conflicts with functions defined in kernels linking with libclc. llvm-svn: 232960
* math: Implement erfcAaron Watry2015-03-184-0/+424
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 232674
* Fix bitselect for float/double types v2Tom Stellard2015-03-055-1/+130
| | | | | | | | | | | | | | We need to reinterpret float/double types as uint/ulong in order to perform the bitwise operations. This has been tested with piglit, OpenCV, and the ocl conformance tests. v2: - Use vector operations rather than splitting vectors into scalar components. Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 231373
* Move mix from math to commonAaron Watry2015-03-037-4/+4
| | | | | | | | It has been part of the common functions since 1.0 Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 231137
* Implement step builtinTom Stellard2015-03-026-0/+132
| | | | | | This has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 230970
* Implement smoothstep builtin v2Tom Stellard2015-03-026-0/+133
| | | | | | | | | This has been tested with piglit, OpenCV, and the ocl conformance tests. v2: - Fix typo in smoothstep.h llvm-svn: 230969
* Implement radians builtin v2Tom Stellard2015-03-025-0/+95
| | | | | | | | | This has been tested with piglit, OpenCV, and the ocl conformance tests. v2: - Move to the common/ directory llvm-svn: 230968
* Implement degrees builtin v2Tom Stellard2015-03-025-0/+95
| | | | | | | | | This has been tested with piglit, OpenCV, and the ocl conformance tests. v2: - Move to the common/ directory llvm-svn: 230967
* libclc/math: Add cospiAaron Watry2015-02-267-0/+276
| | | | | | | | | | | | | | | | | | | Ported from the libclc/amd-builtins branch v2: Rename sincos_f_piby4 to __libclc__sincosf_piby4 Add cospi(double) implementation instead of using llvm.cos Notes: The sincosD_piby4.h file is mostly the same as the builtin implementation released by AMD. The inline attribute declaration is changed, and M_PI is used instead of a constant double. Otherwise, the only difference is that the header explicitly enables the fp64 pragma. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jeroen Ketema <j.ketema@imperial.ac.uk> CC: Tom Stellard <tom@stellard.net> CC: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 230641
* Implement log10Jan Vesely2015-01-305-0/+32
| | | | | | | | v2: Use constant and multiplication instead of division v3: Use hex constants Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 227585
* Use amdgcn triple for SI+ GPUsTom Stellard2015-01-061-4/+7
| | | | llvm-svn: 225296
* r600: get_work_dim: Update metadata syntax for LLVM 3.6Tom Stellard2014-12-311-1/+1
| | | | llvm-svn: 225042
* Require LLVM 3.6 and bump version to 0.1.0Tom Stellard2014-12-312-54/+9
| | | | | | | | Some functions are implemented using hand-written LLVM IR, and LLVM assembly format is allowed to change between versions, so we should require a specific version of LLVM. llvm-svn: 225041
* Remove wrong semi-colonsJeroen Ketema2014-12-192-2/+2
| | | | | | Patch by Alastair Donaldson llvm-svn: 224568
* Don't include <stddef.h>Jeroen Ketema2014-11-181-2/+5
| | | | | | | | | | | | | | Including a standard or system header isn't allowed in OpenCL. The type "size_t" needs to be explicitely defined now. v2: Use __SIZE_TYPE__ instead of unsigned int. v3: Define ptrdiff_t and NULL. Patch-by: Jean-Sébastien Pédron Reviewed-by: Jeroen Ketema Reviewed-by: Jan Vesely llvm-svn: 222235
* Prune CRLF.NAKAMURA Takumi2014-10-271-1/+1
| | | | llvm-svn: 220678
* r600: Fix get_work_dim range metadataJan Vesely2014-10-221-1/+1
| | | | | | Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 220388
* r600: Use llvm intrinsic to read work dimension informationJan Vesely2014-10-153-0/+10
| | | | | | | | | | v2: Fix function declaration Add range metadata to r600 implementation v3: change prefix to AMDGPU Reviewed-by: Tom Stellard <tom@stellard.net> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 219793
* Implement log1p builtinTom Stellard2014-10-078-0/+669
| | | | llvm-svn: 219230
* Implement fmodJan Vesely2014-10-055-0/+17
| | | | | | | Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 219087
* Implement async_work_group_copy builtin v3Tom Stellard2014-10-036-0/+48
| | | | | | | | | | | | | This is a simple implementation which just copies data synchronously. v2: - Use size_t. v3: - Fix possible race condition by splitting the copy among multiple work items. llvm-svn: 219008
* Implement async_work_group_strided_copy builtin v2Tom Stellard2014-10-036-0/+66
| | | | | | | | | This is a simple implementation which just copies data synchronously. v2: - Use size_t. llvm-svn: 219007
* Implement wait_group_events builtin v2Tom Stellard2014-10-034-0/+8
| | | | | | | | | This is a simple default implemetation which just calls barrier(). v2: - Only call barrier() once. llvm-svn: 219006
* Remove more redundant semi-colonsJeroen Ketema2014-09-181-5/+5
| | | | llvm-svn: 218039
* atomic: undef macros that are included from atomic_decl.incAaron Watry2014-09-178-0/+15
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jeroen Ketema <j.ketema@imperial.ac.uk> llvm-svn: 217958
* Remove redundant semi-colonsJeroen Ketema2014-09-171-4/+4
| | | | llvm-svn: 217954
* R600: Map Address spaces for atomic_cmpxchgAaron Watry2014-09-161-0/+19
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217925
* R600: Map address spaces for atomic_xchgAaron Watry2014-09-161-0/+1
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217924
* R600: Map address spaces for atomic_minAaron Watry2014-09-161-0/+10
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217923
* R600: Map address spaces for atomic_xorAaron Watry2014-09-161-0/+1
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217922
* R600: Map addr spaces and use atomic_maxAaron Watry2014-09-161-5/+16
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217921
* R600: Map address spaces for atomic_orAaron Watry2014-09-161-0/+1
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217920
* R600: Map atomic_and address spacesAaron Watry2014-09-161-0/+1
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217919
* atomic: Add generic atom[ic]_cmpxchgAaron Watry2014-09-168-0/+56
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217918
* atomic: Implement generic atom[ic]_xchgAaron Watry2014-09-169-0/+54
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217917
OpenPOWER on IntegriCloud