bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	relational: Implement shuffle builtin	Aaron Watry	2017-09-02	4	-0/+209
	This was added in CL 1.1. Tested with a Radeon HD 7850 (Pitcairn) using the CL CTS via: test_conformance/relationals/test_relationals shuffle_built_in. v2: Add half-precision support to shuffle when available. Move to misc/ and add section 6.12.12 to clc.h. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 312403
*	Add halfN types and enable fp16 when generating builtin declarations	Aaron Watry	2017-09-02	2	-0/+12
	Uses the same mechanism to enable fp16 as we use for fp64 when processing clc.h. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 312402
*	amdgcn: Implement {read_,write_,}mem_fence builtin	Jan Vesely	2017-08-16	2	-0/+6
	v2: add more detailed comment about waitcnt instruction. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 311021
*	add __kernel_exec macros	Jan Vesely	2017-07-28	4	-11/+19
	Also consolidate macros into one file, and rename to clcmacros.h. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 309358
*	generic: add missing get_work_dim include	Jan Vesely	2017-06-02	1	-0/+1
	Fixes a few piglit tests since clang r304193. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 304556
*	math: Implement sinh function	Jan Vesely	2017-02-25	5	-0/+240
	Mostly copied from amd_builtins. llvm-svn: 296233
*	math: Add native_tan as wrapper to tan	Aaron Watry	2017-02-23	2	-0/+11
	Trivially define native_tan as a redirect to tan. If there are any targets with a native implementation, we can deal with it later. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <arsenm2@gmail.com> llvm-svn: 295920
*	Add the correct prefixes to the cl_khr_fp64 pragma	Jeroen Ketema	2017-02-12	1	-1/+1
	llvm-svn: 294915
*	math: Add native_rsqrt builtin function	Matt Arsenault	2017-02-09	2	-0/+2
	Trivial define to rsqrt. Patch by Vedran Miletić <vedran@miletic.net> llvm-svn: 294608
*	math: Add logb builtin	Aaron Watry	2017-01-18	5	-0/+36
	Ported from the amd-builtins branch. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> CC: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 292335
*	math: Add expm1 builtin function	Aaron Watry	2017-01-18	6	-0/+293
	Ported from the amd-builtins branch. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> CC: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 292334
*	Provide vstore_half helper to workaround clc restrictions	Jan Vesely	2016-09-21	4	-26/+75
	clang won't accept half precision loads and stores without cl_khr_fp16 since r281904. llvm-svn: 282106
*	math: Implement tgamma	Aaron Watry	2016-09-15	5	-0/+77
	Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 281566
*	math: Implement lgamma	Aaron Watry	2016-09-15	5	-0/+49
	Just use lgamma_r and ignore the value returned in the second argument. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 281565
*	math: Implement lgamma_r	Aaron Watry	2016-09-15	6	-0/+518
	Ported from the amd-builtins branch, which is itself based on the Sun Microsystems implementation. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 281564
*	Add ADDR_SPACE parameter to _CLC_V_V_VP_VECTORIZE	Aaron Watry	2016-09-15	1	-12/+27
	This macro is currently unused, but I plan to use it shortly. The previous form did casts of pointers without an address space, which doesn't work so well for CL 1.x. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 281563
*	Replace nextafter implementation	Matt Arsenault	2016-09-08	1	-28/+24
	This one passes conformance. llvm-svn: 280961
*	Avoid ambiguity in calling atom_add functions.	Jan Vesely	2016-09-07	4	-4/+4
	clang (since r280553) allows pointer casts in function overloads, so we need to disambiguate the second argument. clang might be smarter about overloads in the future (see https://reviews.llvm.org/D24113), but let's be safe in libclc anyway. llvm-svn: 280871
*	Implement vstore_half{,n}	Jan Vesely	2016-08-17	3	-19/+68
	Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 278962
*	Make min follow the OCL 1.0 specs	Jan Vesely	2016-07-25	1	-2/+2
	OpenCL 1.0: "Returns y if y < x, otherwise it returns x. If x and y are infinite or NaN, the return values are undefined." OpenCL 1.1+: "Returns y if y < x, otherwise it returns x. If x or y are infinite or NaN, the return values are undefined." The 1.0 version is stricter so use that one. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 276704
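The comparison rule quoted in the message above ("returns y if y < x, otherwise it returns x") can be sketched in plain C. The names `min_sketch` and `min_sketch_check` are hypothetical, and this is not the libclc source, which the log does not show; it only illustrates the order of the comparison.

```c
#include <assert.h>

/* Hypothetical helper, not the libclc implementation.
 * Follows the quoted wording: return y if y < x, otherwise x. */
static float min_sketch(float x, float y)
{
    return (y < x) ? y : x;
}

/* Small self-check of the rule on ordinary finite inputs. */
static void min_sketch_check(void)
{
    assert(min_sketch(1.0f, 2.0f) == 1.0f);
    assert(min_sketch(2.0f, 1.0f) == 1.0f);
    assert(min_sketch(3.0f, 3.0f) == 3.0f);
}
```

Because any ordered comparison with NaN is false, this form returns x whenever either argument is NaN, which is one permissible outcome given that both spec versions leave those cases undefined.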
*	Implement cbrt builtin	Tom Stellard	2016-07-22	7	-0/+869
	This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 276497
*	Implement cosh builtin	Tom Stellard	2016-07-22	7	-0/+370
	This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 276496
*	geometric/floatn.inc: Add vec8 and vec16 types	Tom Stellard	2016-07-22	1	-0/+16
	llvm-svn: 276495
*	AMDGPU: Implement get_global_offset builtin	Jan Vesely	2016-07-22	3	-1/+3
	Also fix get_global_id to consider offset. No idea how to add this for ptx, so they are stuck with the old get_global_id implementation. v2: split to a separate patch. v3: Switch R600 to use implicitarg.ptr. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 276443
*	64 bit integers are legal in full profile without an extension	Jan Vesely	2016-06-17	1	-5/+12
	Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 273042
*	math: Use single precision fmax in sp path	Jan Vesely	2016-05-17	1	-1/+1
	Fixes fdim piglit on Turks. v2: use CL fmax instead of __builtin. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom.stellard@amd.com> llvm-svn: 269807
*	math: Add erf ported from amd-builtins	Jan Vesely	2016-05-06	4	-0/+413
	The scalar float/double function bodies are a direct copy/paste, aside from the removed (optional) code in the float function body that requires subnormals. reviewers: jvesely. Patch by: Vedran Miletić <rivanvx@gmail.com> llvm-svn: 268766
*	math: Add fdim implementation	Aaron Watry	2016-05-06	6	-0/+86
	Based on the amd-builtin, but explicitly vectorized for all sizes (not just float4), and includes a vectorized double implementation. Passes piglit (float) tests on pitcairn. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 268708
*	math: Fix ilogb(double) return type	Aaron Watry	2016-02-24	1	-1/+1
	Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 261714
*	math: Add ilogb ported from amd-builtins	Aaron Watry	2016-02-23	6	-0/+68
	The scalar float/double function bodies are a direct copy/paste with usage of the CLC wrappers to vectorize them. This commit also adds in the FP_ILOGB0 and FP_ILOGBNAN macros which are equal to the results of ilogb(0.0f) and ilogb(float nan) respectively. v2: Add FP_ILOGB0 and FP_ILOGBNAN definitions. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> v1 Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 261639
*	math: Fix log2 vectorization on non-fp64 hw	Jan Vesely	2016-02-09	1	-0/+2
	reviewer: tstellard. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 260301
*	math: Add frexp ported from amd-builtins	Aaron Watry	2016-02-08	7	-0/+151
	The float implementation is almost a direct port from the amd-builtins, but instead of just having a scalar and float4 implementation, it has a scalar and arbitrary width vector implementation. The double scalar is also a direct port from AMD's builtin release. The double vector implementation copies the logic in the float vector implementation using the values from the double scalar version. Both have been tested in piglit using tests sent to that project's mailing list. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 260114
*	Implement modf math builtin	Tom Stellard	2016-01-27	6	-0/+120
	V2: use the reference implementation as suggested by Matt Arsenault. Patch By: Pavel Ondračka llvm-svn: 258933
*	Add _CLC_V_V_VP_VECTORIZE macro	Tom Stellard	2016-01-27	1	-0/+22
	Patch by: Pavel Ondračka llvm-svn: 258932
*	integer: remove explicit casts from _MIN definitions	Aaron Watry	2015-10-06	1	-3/+3
	The spec says (section 6.12.3, CL version 1.2): "The macro names given in the following list must use the values specified. The values shall all be constant expressions suitable for use in #if preprocessing directives." This commit addresses the second part of that statement. Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> CC: Moritz Pflanzer <moritz.pflanzer14@imperial.ac.uk> CC: Serge Martin <edb+libclc@sigluy.net> llvm-svn: 249445
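The requirement that a limit macro be usable in `#if` can be illustrated with a short C sketch. The macro names here (`SKETCH_INT_MIN`, `SKETCH_INT_MIN_IS_NEGATIVE`) are hypothetical and not copied from libclc, whose definitions this log does not show. The sketch assumes a host with 32-bit `int`.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical macro for illustration; not copied from libclc.
 * No cast appears in the expansion, so it is valid inside #if. */
#define SKETCH_INT_MIN (-2147483647 - 1)

#if SKETCH_INT_MIN < 0
#define SKETCH_INT_MIN_IS_NEGATIVE 1
#else
#define SKETCH_INT_MIN_IS_NEGATIVE 0
#endif

/* A cast such as ((int)(-2147483647 - 1)) would not work in #if:
 * the preprocessor has no types, so `int` is treated as an
 * ordinary identifier and the expression fails to parse. */

/* Self-check against the host's own limit (assumes 32-bit int). */
static void sketch_int_min_check(void)
{
    assert(SKETCH_INT_MIN_IS_NEGATIVE == 1);
    assert(SKETCH_INT_MIN == INT_MIN);
}
```

The cast-free spelling is what lets the same macro serve both as a value in ordinary expressions and as an operand in preprocessor conditionals.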
*	Implement tanh builtin	Niels Ole Salscheider	2015-09-29	5	-0/+195
	This is a port from the AMD builtin library. llvm-svn: 248780
*	Add sampler defines.	Tom Stellard	2015-09-21	1	-0/+18
	Patch by: Zoltan Gilian llvm-svn: 248163
*	Add image attribute defines.	Tom Stellard	2015-09-21	2	-0/+32
	Patch by: Zoltan Gilian llvm-svn: 248162
*	r600: Add image writing builtins.	Tom Stellard	2015-09-21	1	-0/+7
	Patch by: Zoltan Gilian llvm-svn: 248161
*	r600: Add image reading builtins.	Tom Stellard	2015-09-21	1	-0/+13
	Patch by: Zoltan Gilian llvm-svn: 248160
*	Add image attribute getter builtins	Tom Stellard	2015-09-21	4	-0/+30
	Added get_image_* OpenCL builtins to the headers. Added implementation to the r600 target. Patch by: Zoltan Gilian llvm-svn: 248159
*	integer: Update integer limits to comply with spec	Aaron Watry	2015-09-15	1	-6/+6
	The values for the char/short/integer/long minimums were declared with their actual values, not the definitions from the CL spec (v1.1). As a result, (-2147483648) was actually being treated as a long by the compiler, not an int, which caused issues when trying to add/subtract that value from a vector. Update the definitions to use the values declared by the spec, and also add explicit casts for the char/short/int minimums so that the compiler actually treats them as shorts/chars. Without those casts, they actually end up stored as integers, and the compiler may end up storing the INT_MIN as a long. The compiler can sign extend the values if it needs to convert the char->short, short->int, or int->long. v2: Add explicit cast for INT_MIN and fix some typos and wrapping in the commit message. Reported-by: Moritz Pflanzer <moritz.pflanzer14@imperial.ac.uk> CC: Moritz Pflanzer <moritz.pflanzer14@imperial.ac.uk> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Aaron Watry <awatry@gmail.com> llvm-svn: 247661
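The literal-typing problem described in the message above can be observed in plain C. This is a sketch on a host compiler, not OpenCL C and not the libclc header; the function names are hypothetical. It assumes 32-bit `int`, C99 or later literal rules, and that the spec form is the usual `(-2147483647 - 1)` spelling.

```c
#include <assert.h>
#include <stddef.h>

/* 2147483648 does not fit in int, so the literal gets a wider
 * type before the unary minus is applied. */
static size_t size_of_literal_form(void)
{
    return sizeof(-2147483648);
}

/* Both operands fit in int, so the whole expression stays int. */
static size_t size_of_subtract_form(void)
{
    return sizeof(-2147483647 - 1);
}

/* Self-check under the stated assumption of a 32-bit int. */
static void literal_typing_check(void)
{
    assert(size_of_subtract_form() == sizeof(int));
    assert(size_of_literal_form() > sizeof(int));
}
```

The two expressions have the same numeric value; only their types differ, which is what matters when the constant is combined with a vector of a specific element type.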
*	Remove files accidentally not removed in r244310	Jeroen Ketema	2015-08-13	2	-9/+0
	llvm-svn: 244987
*	Fix double implementation of log	Tom Stellard	2015-07-24	4	-3/+73
	We need to use M_LOG2E instead of M_LOG2E_F. llvm-svn: 243132
*	Implement accurate log2 function	Tom Stellard	2015-07-24	8	-6/+517
	Use the implementation ported from the AMD builtin library rather than LLVM intrinsics. This has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 243131
*	Use llvm intrinsics for native_log and native_log2	Tom Stellard	2015-07-24	9	-2/+216
	llvm-svn: 243130
*	Fix implementation of sqrt v2	Tom Stellard	2015-07-10	8	-6/+168
	Passing values less than 0 to the llvm.sqrt() intrinsic results in undefined behavior, so we need to check the input and return NaN if it is less than 0. v2: Fix build failures. llvm-svn: 241906
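The guard described in the message above (check the input, return NaN for negative values) can be sketched in plain C. Both function names are hypothetical and this is not the libclc code. `raw_sqrt_placeholder` is only a stand-in for the raw intrinsic, written as a crude Newton iteration so the sketch has no library dependency; it is not meant to be accurate to the last bit or to handle infinities.

```c
#include <assert.h>
#include <math.h> /* NAN and isnan only; no libm calls are made */

/* Hypothetical stand-in for the raw square-root operation. It is a
 * plain Newton iteration and is only meaningful for finite x >= 0. */
static float raw_sqrt_placeholder(float x)
{
    if (x == 0.0f)
        return 0.0f;
    float r = (x > 1.0f) ? x : 1.0f;
    for (int i = 0; i < 128; ++i)
        r = 0.5f * (r + x / r);
    return r;
}

/* The guard: negative inputs never reach the raw operation and
 * yield NaN instead. */
static float sqrt_guarded(float x)
{
    return (x < 0.0f) ? NAN : raw_sqrt_placeholder(x);
}

/* Self-check: the negative case is NaN, a simple case is close. */
static void sqrt_guarded_check(void)
{
    assert(isnan(sqrt_guarded(-1.0f)));
    assert(sqrt_guarded(4.0f) > 1.999f && sqrt_guarded(4.0f) < 2.001f);
}
```

The point of the sketch is the ordering: the sign test happens before the raw operation is invoked, so the undefined case is never exercised.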
*	Use a more accurate implementation for exp	Tom Stellard	2015-05-13	2	-13/+85
	Using exp2(x * M_LOG2E_F) does not give us accurate enough results for OpenCL. If you look at the new exp implementation you'll see that it does multiply the input by M_LOG2E_F, but it still uses the original input in part of the calculation. This exp implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237229
*	Implement exp2 using OpenCL C rather than using an intrinsic	Tom Stellard	2015-05-13	8	-6/+303
	Not all targets support the intrinsic, so it's better to have a generic implementation which does not use it. This exp2 implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237228
*	Implement sin for double types	Tom Stellard	2015-05-12	1	-7/+16
	This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237155