path: root/libclc
Commit message | Author | Age | Files | Lines
...
* .gitignore: Ignore amdgcn-mesa object directory | Jan Vesely | 2017-02-24 | 1 | -0/+1
    llvm-svn: 296164
* math: Add native_tan as wrapper to tan | Aaron Watry | 2017-02-23 | 2 | -0/+11
    Trivially define native_tan as a redirect to tan. If there are any
    targets with a native implementation, we can deal with it later.
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Matt Arsenault <arsenm2@gmail.com>
    llvm-svn: 295920
* Move BufferPtr into the block where it is being used | Jeroen Ketema | 2017-02-12 | 1 | -3/+3
    The previous location outside the block would crash prepare-builtins
    when the builtins file was accidentally not passed on the command line.
    llvm-svn: 294916
* Add the correct prefixes to the cl_khr_fp64 pragma | Jeroen Ketema | 2017-02-12 | 1 | -1/+1
    llvm-svn: 294915
* math: Add native_rsqrt builtin function | Matt Arsenault | 2017-02-09 | 2 | -0/+2
    Trivial define to rsqrt.
    Patch by Vedran Miletić <vedran@miletic.net>
    llvm-svn: 294608
* math: Add logb builtin | Aaron Watry | 2017-01-18 | 5 | -0/+36
    Ported from the amd-builtins branch.
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
    CC: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 292335
* math: Add expm1 builtin function | Aaron Watry | 2017-01-18 | 6 | -0/+293
    Ported from the amd-builtins branch.
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
    CC: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 292334
* Fix build since r286752. | Tom Stellard | 2016-11-14 | 1 | -1/+2
    llvm-svn: 286839
* Fix build since llvm r286566 and require at least llvm 4.0 | Tom Stellard | 2016-11-11 | 2 | -3/+4
    llvm-svn: 286634
* Provide vstore_half helper to work around clc restrictions | Jan Vesely | 2016-09-21 | 4 | -26/+75
    clang won't accept half precision loads and stores without
    cl_khr_fp16 since r281904
    llvm-svn: 282106
* configure: Add amdgcn-mesa-mesa3d target | Tom Stellard | 2016-09-16 | 1 | -1/+5
    llvm-svn: 281793
* amdgcn-amdhsa: Add get_num_groups implementation | Tom Stellard | 2016-09-16 | 3 | -0/+14
    llvm-svn: 281792
* amdgcn-amdhsa: Add get_global_size() implementation | Tom Stellard | 2016-09-16 | 2 | -0/+40
    llvm-svn: 281791
* math: Implement tgamma | Aaron Watry | 2016-09-15 | 5 | -0/+77
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 281566
* math: Implement lgamma | Aaron Watry | 2016-09-15 | 5 | -0/+49
    Just use lgamma_r and ignore the value returned in the second argument
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 281565
* math: Implement lgamma_r | Aaron Watry | 2016-09-15 | 6 | -0/+518
    Ported from the amd-builtins branch, which is itself based on the
    Sun Microsystems implementation.
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 281564
* Add ADDR_SPACE parameter to _CLC_V_V_VP_VECTORIZE | Aaron Watry | 2016-09-15 | 1 | -12/+27
    This macro is currently unused, but I plan to use it shortly.
    The previous form did casts of pointers without an address space,
    which doesn't work so well for CL 1.x.
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 281563
* Replace nextafter implementation | Matt Arsenault | 2016-09-08 | 2 | -28/+29
    This one passes conformance.
    llvm-svn: 280961
* Avoid ambiguity in calling atom_add functions. | Jan Vesely | 2016-09-07 | 4 | -4/+4
    clang (since r280553) allows pointer casts in function overloads,
    so we need to disambiguate the second argument.
    clang might be smarter about overloads in the future (see
    https://reviews.llvm.org/D24113), but let's be safe in libclc anyway.
    llvm-svn: 280871
* configure.py: Add polaris10 and polaris11 | Niels Ole Salscheider | 2016-08-30 | 1 | -2/+2
    llvm-svn: 280121
* amdgcn: Fix return type of get_num_groups | Matt Arsenault | 2016-08-25 | 5 | -2/+24
    llvm-svn: 279723
* Strip opencl.ocl.version metadata | Matt Arsenault | 2016-08-25 | 1 | -0/+7
    This should be uniqued when linking, but right now it creates a lot
    of metadata spam listing the same version.
    This should also probably be reporting the compiled version of the
    user program, which may differ from the library. Currently the
    library IR files report 1.0 while 1.1/1.2 are the default for user
    programs.
    llvm-svn: 279692
* amdgcn: Also correct get_local_size type for HSA | Matt Arsenault | 2016-08-24 | 1 | -5/+8
    llvm-svn: 279656
* amdgcn: Fix return type for get_global_size | Matt Arsenault | 2016-08-24 | 5 | -2/+24
    llvm-svn: 279644
* amdgpu: Fix default case value for get_local_size | Matt Arsenault | 2016-08-20 | 2 | -2/+2
    llvm-svn: 279359
* amdgcn: Fix get_local_size IR return type | Matt Arsenault | 2016-08-20 | 5 | -5/+27
    llvm-svn: 279350
* amdgcn: Correct return types to be size_t | Matt Arsenault | 2016-08-19 | 3 | -3/+3
    llvm-svn: 279343
* Implement vstore_half{,n} | Jan Vesely | 2016-08-17 | 3 | -19/+68
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 278962
* Make min follow the OCL 1.0 specs | Jan Vesely | 2016-07-25 | 1 | -2/+2
    OpenCL 1.0: "Returns y if y < x, otherwise it returns x. If x *and* y
    are infinite or NaN, the return values are undefined."
    OpenCL 1.1+: "Returns y if y < x, otherwise it returns x. If x *or* y
    are infinite or NaN, the return values are undefined."
    The 1.0 version is stricter so use that one.
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 276704
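The rule quoted in this commit message can be sketched in plain C. This is only an illustration of the "returns y if y < x, otherwise x" wording; the helper name `sketch_min` is hypothetical and the code is not taken from the libclc source. Note how a NaN in only one operand still yields a well-defined result, because the comparison is simply false.

```c
#include <math.h>

/* Hypothetical helper mirroring the quoted spec wording:
 * "Returns y if y < x, otherwise it returns x."
 * Not libclc's actual implementation. */
static float sketch_min(float x, float y) {
    /* Any comparison involving NaN is false, so x is returned
     * whenever either operand is NaN. */
    return (y < x) ? y : x;
}
```

With this formulation, `sketch_min(5.0f, NAN)` returns `5.0f`, while `sketch_min(NAN, 5.0f)` returns the NaN passed as `x`.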
* Implement cbrt builtin | Tom Stellard | 2016-07-22 | 7 | -0/+869
    This implementation was ported from the AMD builtin library
    and has been tested with piglit, OpenCV, and the ocl conformance tests.
    llvm-svn: 276497
* Implement cosh builtin | Tom Stellard | 2016-07-22 | 7 | -0/+370
    This implementation was ported from the AMD builtin library
    and has been tested with piglit, OpenCV, and the ocl conformance tests.
    llvm-svn: 276496
* geometric/floatn.inc: Add vec8 and vec16 types | Tom Stellard | 2016-07-22 | 1 | -0/+16
    llvm-svn: 276495
* AMDGPU: Implement get_global_offset builtin | Jan Vesely | 2016-07-22 | 9 | -1/+33
    Also fix get_global_id to consider offset.
    No idea how to add this for ptx, so they are stuck with the old
    get_global_id implementation.
    v2: split to a separate patch
    v3: Switch R600 to use implicitarg.ptr
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 276443
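A minimal sketch of what "consider offset" means for get_global_id, written in host-side C rather than OpenCL C. It assumes the usual OpenCL relation (group id times local size, plus local id, plus global offset); the struct and function names are hypothetical and do not appear in libclc.

```c
#include <stddef.h>

/* Hypothetical per-dimension launch state; field names are illustrative. */
struct work_item_dim {
    size_t group_id;      /* index of the work-group */
    size_t local_size;    /* work-items per work-group */
    size_t local_id;      /* index inside the work-group */
    size_t global_offset; /* offset passed at kernel enqueue time */
};

/* Old behaviour described in the commit: the offset is ignored. */
static size_t global_id_without_offset(const struct work_item_dim *d) {
    return d->group_id * d->local_size + d->local_id;
}

/* Behaviour after the fix: the global offset is added on top. */
static size_t global_id_with_offset(const struct work_item_dim *d) {
    return global_id_without_offset(d) + d->global_offset;
}
```

For a work-item with group id 2, local size 64, local id 5, and offset 1000, the two functions differ by exactly the offset.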
* AMDGPU: Use clang intrinsics for workitem builtins | Jan Vesely | 2016-07-22 | 14 | -136/+71
    v2: split into 2 patches
        use clang builtins for other intrinsics as well
    v3: Fix warnings
        Switch r600 to use implicitarg.ptr
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 276442
* ptx: Fix builtin names after clang r274770 | Jan Vesely | 2016-07-22 | 5 | -13/+13
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Acked-By: Aaron Watry <awatry@gmail.com>
    llvm-svn: 276423
* amdgpu: Use right builtin for rsq | Matt Arsenault | 2016-07-19 | 1 | -1/+6
    The r600 path has never actually worked since double is not
    implemented there.
    llvm-svn: 276009
* R600: Use new barrier intrinsic | Matt Arsenault | 2016-07-18 | 1 | -4/+3
    llvm-svn: 275874
* Replace llvm.AMDGPU.ldexp with llvm.amdgcn.ldexp | Matt Arsenault | 2016-07-18 | 3 | -3/+3
    It didn't really work on r600 to begin with, which should
    get its own intrinsic.
    llvm-svn: 275813
* configure: Remove device specific defines | Jan Vesely | 2016-06-17 | 1 | -25/+11
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-by: Tom Stellard <tom@stellard.net>
    llvm-svn: 273044
* nvptx: Drop feature defines. | Jan Vesely | 2016-06-17 | 1 | -6/+4
    This is now handled by clang
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-by: Tom Stellard <tom@stellard.net>
    llvm-svn: 273043
* 64 bit integers are legal in full profile without an extension | Jan Vesely | 2016-06-17 | 2 | -6/+12
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-by: Tom Stellard <tom@stellard.net>
    llvm-svn: 273042
* math: Use single precision fmax in sp path | Jan Vesely | 2016-05-17 | 1 | -1/+1
    Fixes fdim piglit on Turks
    v2: use CL fmax instead of __builtin
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-by: Tom Stellard <tom.stellard@amd.com>
    llvm-svn: 269807
* math: Add erf ported from amd-builtins | Jan Vesely | 2016-05-06 | 4 | -0/+413
    The scalar float/double function bodies are a direct copy/paste,
    aside from the removed (optional) code in float function body that
    requires subnormals.
    reviewers: jvesely
    Patch by: Vedran Miletić <rivanvx@gmail.com>
    llvm-svn: 268766
* math: Add fdim implementation | Aaron Watry | 2016-05-06 | 6 | -0/+86
    Based on the amd-builtin, but explicitly vectorized for all sizes
    (not just float4), and includes a vectorized double implementation.
    Passes piglit (float) tests on pitcairn.
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 268708
* prepare-builtins: Remove call to getGlobalContext() | Tom Stellard | 2016-04-15 | 1 | -1/+1
    This function has been removed from LLVM.
    Patch By: Laurent Carlier
    llvm-svn: 266430
* [AMDGPU] Implement get_local_size for amdgcn--amdhsa triple | Konstantin Zhuravlyov | 2016-04-07 | 5 | -1/+41
    Differential Revision: http://reviews.llvm.org/D18284
    llvm-svn: 265713
* Update copyright year to 2016. | Paul Robinson | 2016-03-30 | 1 | -1/+1
    llvm-svn: 264949
* math: Fix ilogb(double) return type | Aaron Watry | 2016-02-24 | 1 | -1/+1
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 261714
* math: Add ilogb ported from amd-builtins | Aaron Watry | 2016-02-23 | 6 | -0/+68
    The scalar float/double function bodies are a direct copy/paste
    with usage of the CLC wrappers to vectorize them.
    This commit also adds in the FP_ILOGB0 and FP_ILOGBNAN macros which
    are equal to the results of ilogb(0.0f) and ilogb(float nan)
    respectively.
    v2: Add FP_ILOGB0 and FP_ILOGBNAN definitions
    Signed-off-by: Aaron Watry <awatry@gmail.com>
    Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
    v1 Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
    llvm-svn: 261639
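To illustrate the relationship between ilogb and the two macros, here is a rough host-side C sketch that extracts the unbiased exponent from an IEEE-754 single-precision value. It is not the ported amd-builtins body: `sketch_ilogbf` is a hypothetical name, and the `FP_ILOGB0` and `FP_ILOGBNAN` used here are the host C library's macros from `<math.h>`, standing in for the CLC definitions the commit adds.

```c
#include <limits.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper; assumes float is IEEE-754 binary32. */
static int sketch_ilogbf(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);            /* well-defined type pun */

    uint32_t exponent = (bits >> 23) & 0xFFu;  /* biased exponent field */
    uint32_t mantissa = bits & 0x7FFFFFu;      /* 23 fraction bits */

    if (exponent == 0xFFu)                     /* NaN or infinity */
        return mantissa ? FP_ILOGBNAN : INT_MAX;

    if (exponent == 0u) {
        if (mantissa == 0u)                    /* positive or negative zero */
            return FP_ILOGB0;
        /* Subnormal: value is mantissa * 2^-149, so find the top set bit. */
        int e = -149;
        while (mantissa > 1u) {
            mantissa >>= 1;
            e++;
        }
        return e;
    }

    return (int)exponent - 127;                /* normal number: remove bias */
}
```

The zero and NaN branches are the two cases the macros exist to name, so callers can compare a result against `FP_ILOGB0` or `FP_ILOGBNAN` instead of hard-coding a sentinel value.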
* Add .gitignore for build directories | Matt Arsenault | 2016-02-17 | 1 | -0/+13
    llvm-svn: 261043