path: root/libclc
Commit message | Author | Age | Files | Lines
* Fix build since r286752. | Tom Stellard | 2016-11-14 | 1 | -1/+2
  llvm-svn: 286839
* Fix build since llvm r286566 and require at least llvm 4.0 | Tom Stellard | 2016-11-11 | 2 | -3/+4
  llvm-svn: 286634
* Provide vstore_half helper to work around clc restrictions | Jan Vesely | 2016-09-21 | 4 | -26/+75
  clang won't accept half precision loads and stores without
  cl_khr_fp16 since r281904
  llvm-svn: 282106
* configure: Add amdgcn-mesa-mesa3d target | Tom Stellard | 2016-09-16 | 1 | -1/+5
  llvm-svn: 281793
* amdgcn-amdhsa: Add get_num_groups implementation | Tom Stellard | 2016-09-16 | 3 | -0/+14
  llvm-svn: 281792
* amdgcn-amdhsa: Add get_global_size() implementation | Tom Stellard | 2016-09-16 | 2 | -0/+40
  llvm-svn: 281791
* math: Implement tgamma | Aaron Watry | 2016-09-15 | 5 | -0/+77
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 281566
* math: Implement lgamma | Aaron Watry | 2016-09-15 | 5 | -0/+49
  Just use lgamma_r and ignore the value returned in the second argument
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 281565
* math: Implement lgamma_r | Aaron Watry | 2016-09-15 | 6 | -0/+518
  Ported from the amd-builtins branch, which is itself based on the
  Sun Microsystems implementation.
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 281564
* Add ADDR_SPACE parameter to _CLC_V_V_VP_VECTORIZE | Aaron Watry | 2016-09-15 | 1 | -12/+27
  This macro is currently unused, but I plan to use it shortly.
  The previous form did casts of pointers without an address space,
  which doesn't work so well for CL 1.x.
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 281563
* Replace nextafter implementation | Matt Arsenault | 2016-09-08 | 2 | -28/+29
  This one passes conformance.
  llvm-svn: 280961
* Avoid ambiguity in calling atom_add functions. | Jan Vesely | 2016-09-07 | 4 | -4/+4
  clang (since r280553) allows pointer casts in function overloads,
  so we need to disambiguate the second argument.
  clang might be smarter about overloads in the future
  (see https://reviews.llvm.org/D24113), but let's be safe in libclc anyway.
  llvm-svn: 280871
* configure.py: Add polaris10 and polaris11 | Niels Ole Salscheider | 2016-08-30 | 1 | -2/+2
  llvm-svn: 280121
* amdgcn: Fix return type of get_num_groups | Matt Arsenault | 2016-08-25 | 5 | -2/+24
  llvm-svn: 279723
* Strip opencl.ocl.version metadata | Matt Arsenault | 2016-08-25 | 1 | -0/+7
  This should be uniqued when linking, but right now it creates a lot
  of metadata spam listing the same version.
  This should also probably be reporting the compiled version of the
  user program, which may differ from the library. Currently the
  library IR files report 1.0 while 1.1/1.2 are the default for user
  programs.
  llvm-svn: 279692
* amdgcn: Also correct get_local_size type for HSA | Matt Arsenault | 2016-08-24 | 1 | -5/+8
  llvm-svn: 279656
* amdgcn: Fix return type for get_global_size | Matt Arsenault | 2016-08-24 | 5 | -2/+24
  llvm-svn: 279644
* amdgpu: Fix default case value for get_local_size | Matt Arsenault | 2016-08-20 | 2 | -2/+2
  llvm-svn: 279359
* amdgcn: Fix get_local_size IR return type | Matt Arsenault | 2016-08-20 | 5 | -5/+27
  llvm-svn: 279350
* amdgcn: Correct return types to be size_t | Matt Arsenault | 2016-08-19 | 3 | -3/+3
  llvm-svn: 279343
* Implement vstore_half{,n} | Jan Vesely | 2016-08-17 | 3 | -19/+68
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 278962
* Make min follow the OCL 1.0 specs | Jan Vesely | 2016-07-25 | 1 | -2/+2
  OpenCL 1.0: "Returns y if y < x, otherwise it returns x. If x *and* y
  are infinite or NaN, the return values are undefined."
  OpenCL 1.1+: "Returns y if y < x, otherwise it returns x. If x *or* y
  are infinite or NaN, the return values are undefined."
  The 1.0 version is stricter so use that one.
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 276704
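The selection rule quoted in this commit message ("Returns y if y < x, otherwise it returns x") can be sketched in plain C. This is only an illustration of the quoted wording, not the libclc source; the name sketch_min is hypothetical.

```c
/* Hypothetical scalar stand-in for the rule quoted from the OpenCL spec:
 * "Returns y if y < x, otherwise it returns x."
 * sketch_min is an illustrative name, not a libclc function. */
float sketch_min(float x, float y)
{
    /* A comparison involving NaN is false, so x is returned in that case. */
    return (y < x) ? y : x;
}
```

For example, sketch_min(2.0f, 1.0f) selects y because 1.0f < 2.0f holds, while equal inputs fall through to x.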
* Implement cbrt builtin | Tom Stellard | 2016-07-22 | 7 | -0/+869
  This implementation was ported from the AMD builtin library
  and has been tested with piglit, OpenCV, and the ocl conformance tests.
  llvm-svn: 276497
* Implement cosh builtin | Tom Stellard | 2016-07-22 | 7 | -0/+370
  This implementation was ported from the AMD builtin library
  and has been tested with piglit, OpenCV, and the ocl conformance tests.
  llvm-svn: 276496
* geometric/floatn.inc: Add vec8 and vec16 types | Tom Stellard | 2016-07-22 | 1 | -0/+16
  llvm-svn: 276495
* AMDGPU: Implement get_global_offset builtin | Jan Vesely | 2016-07-22 | 9 | -1/+33
  Also fix get_global_id to consider offset.
  No idea how to add this for ptx, so they are stuck with the old
  get_global_id implementation.
  v2: split to a separate patch
  v3: Switch R600 to use implicitarg.ptr
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 276443
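The OpenCL execution model defines a work-item's global ID in one dimension as the work-group ID times the work-group size, plus the local ID, plus the global offset. A minimal C sketch of that arithmetic follows, assuming plain integers in place of the real work-item builtins; sketch_global_id and its parameter names are hypothetical and this is not the code from the commit.

```c
#include <stddef.h>

/* Hypothetical one-dimensional helper, not the libclc implementation.
 * Computes: group_id * local_size + local_id + global_offset. */
size_t sketch_global_id(size_t group_id, size_t local_size,
                        size_t local_id, size_t global_offset)
{
    return group_id * local_size + local_id + global_offset;
}
```

With a zero offset this reduces to the offset-free form, which is presumably what the "old get_global_id implementation" mentioned for ptx computes.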
* AMDGPU: Use clang intrinsics for workitem builtins | Jan Vesely | 2016-07-22 | 14 | -136/+71
  v2: split into 2 patches
      use clang builtins for other intrinsics as well
  v3: Fix warnings
      Switch r600 to use implicitarg.ptr
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 276442
* ptx: Fix builtin names after clang r274770 | Jan Vesely | 2016-07-22 | 5 | -13/+13
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Acked-By: Aaron Watry <awatry@gmail.com>
  llvm-svn: 276423
* amdgpu: Use right builtin for rsq | Matt Arsenault | 2016-07-19 | 1 | -1/+6
  The r600 path has never actually worked since double
  is not implemented there.
  llvm-svn: 276009
* R600: Use new barrier intrinsic | Matt Arsenault | 2016-07-18 | 1 | -4/+3
  llvm-svn: 275874
* Replace llvm.AMDGPU.ldexp with llvm.amdgcn.ldexp | Matt Arsenault | 2016-07-18 | 3 | -3/+3
  It didn't really work on r600 to begin with, which
  should get its own intrinsic.
  llvm-svn: 275813
* configure: Remove device specific defines | Jan Vesely | 2016-06-17 | 1 | -25/+11
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewed-by: Tom Stellard <tom@stellard.net>
  llvm-svn: 273044
* nvptx: Drop feature defines. | Jan Vesely | 2016-06-17 | 1 | -6/+4
  This is now handled by clang
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewed-by: Tom Stellard <tom@stellard.net>
  llvm-svn: 273043
* 64 bit integers are legal in full profile without an extension | Jan Vesely | 2016-06-17 | 2 | -6/+12
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewed-by: Tom Stellard <tom@stellard.net>
  llvm-svn: 273042
* math: Use single precision fmax in sp path | Jan Vesely | 2016-05-17 | 1 | -1/+1
  Fixes fdim piglit on Turks
  v2: use CL fmax instead of __builtin
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewed-by: Tom Stellard <tom.stellard@amd.com>
  llvm-svn: 269807
* math: Add erf ported from amd-builtins | Jan Vesely | 2016-05-06 | 4 | -0/+413
  The scalar float/double function bodies are a direct copy/paste,
  aside from the removed (optional) code in float function body that
  requires subnormals.
  reviewers: jvesely
  Patch by: Vedran Miletić <rivanvx@gmail.com>
  llvm-svn: 268766
* math: Add fdim implementation | Aaron Watry | 2016-05-06 | 6 | -0/+86
  Based on the amd-builtin, but explicitly vectorized for all sizes
  (not just float4), and includes a vectorized double implementation.
  Passes piglit (float) tests on pitcairn.
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 268708
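For orientation, fdim is the positive difference: x - y when x > y, and +0 otherwise, with NaN inputs propagating as in C99. The following scalar C sketch shows only those semantics; it is not the vectorized libclc code from this commit, and sketch_fdim is a hypothetical name.

```c
/* Hypothetical scalar sketch of fdim semantics, not the libclc source. */
float sketch_fdim(float x, float y)
{
    /* A value that compares unequal to itself is NaN; propagate it. */
    if (x != x) return x;
    if (y != y) return y;
    /* Positive difference, or +0 when x <= y. */
    return (x > y) ? (x - y) : 0.0f;
}
```

The commit describes applying the operation across all vector widths; a scalar function like this would correspond to a single lane.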
* prepare-builtins: Remove call to getGlobalContext() | Tom Stellard | 2016-04-15 | 1 | -1/+1
  This function has been removed from LLVM.
  Patch By: Laurent Carlier
  llvm-svn: 266430
* [AMDGPU] Implement get_local_size for amdgcn--amdhsa triple | Konstantin Zhuravlyov | 2016-04-07 | 5 | -1/+41
  Differential Revision: http://reviews.llvm.org/D18284
  llvm-svn: 265713
* Update copyright year to 2016. | Paul Robinson | 2016-03-30 | 1 | -1/+1
  llvm-svn: 264949
* math: Fix ilogb(double) return type | Aaron Watry | 2016-02-24 | 1 | -1/+1
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 261714
* math: Add ilogb ported from amd-builtins | Aaron Watry | 2016-02-23 | 6 | -0/+68
  The scalar float/double function bodies are a direct copy/paste
  with usage of the CLC wrappers to vectorize them.
  This commit also adds in the FP_ILOGB0 and FP_ILOGBNAN macros which
  are equal to the results of ilogb(0.0f) and ilogb(float nan)
  respectively.
  v2: Add FP_ILOGB0 and FP_ILOGBNAN definitions
  Signed-off-by: Aaron Watry <awatry@gmail.com>
  Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
  v1 Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 261639
* Add .gitignore for build directories | Matt Arsenault | 2016-02-17 | 1 | -0/+13
  llvm-svn: 261043
* amdgcn: Use new workitem intrinsics | Matt Arsenault | 2016-02-17 | 9 | -38/+124
  llvm-svn: 261042
* Update page to list supported targets | Matt Arsenault | 2016-02-13 | 1 | -2/+2
  llvm-svn: 260778
* Split sources for amdgcn and r600 | Matt Arsenault | 2016-02-13 | 34 | -38/+75
  Most files remain in a common amdgpu directory.
  Also switches barriers to use convergent, and use
  llvm.amdgcn.s.barrier. This now requires 3.9/trunk
  to build amdgcn.
  llvm-svn: 260777
* configure: Remove llvm 3.6 defines | Jan Vesely | 2016-02-09 | 1 | -3/+3
  we require llvm 3.7
  reviewer: tstellard
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 260304
* configure: Remove cl_khr_fp64 for devices that don't support doubles | Jan Vesely | 2016-02-09 | 1 | -5/+5
  Also remove definitions if provided by clang (3.7+)
  This halves the size of builtin.opt.{cedar,barts}.bc
  reviewer: tstellard
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 260303
* configure: Introduce per device defines | Jan Vesely | 2016-02-09 | 1 | -11/+24
  Make cl_khr_fp64 define per-device.
  This patch does not change the generated Makefile (for llvm 3.6, 3.7)
  v2: Make the device defines per LLVM version, 'all' for all versions
  reviewer: tstellard
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 260302
* math: Fix log2 vectorization on non-fp64 hw | Jan Vesely | 2016-02-09 | 1 | -0/+2
  reviewer: tstellard
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 260301