bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	integer: Update integer limits to comply with spec	Aaron Watry	2015-09-15	1	-6/+6
	The values for the char/short/integer/long minimums were declared with their actual values, not the definitions from the CL spec (v1.1). As a result, (-2147483648) was actually being treated as a long by the compiler, not an int, which caused issues when trying to add/subtract that value from a vector. Update the definitions to use the values declared by the spec, and also add explicit casts for the char/short/int minimums so that the compiler actually treats them as shorts/chars. Without those casts, they actually end up stored as integers, and the compiler may end up storing the INT_MIN as a long. The compiler can sign extend the values if it needs to convert the char->short, short->int, or int->long. v2: Add explicit cast for INT_MIN and fix some typos and wrapping in the commit message. Reported-by: Moritz Pflanzer <moritz.pflanzer14@imperial.ac.uk> CC: Moritz Pflanzer <moritz.pflanzer14@imperial.ac.uk> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Aaron Watry <awatry@gmail.com> llvm-svn: 247661
*	Fix double implementation of log	Tom Stellard	2015-07-24	2	-3/+46
	We need to use M_LOG2E instead of M_LOG2E_F. llvm-svn: 243132
*	Implement accurate log2 function	Tom Stellard	2015-07-24	3	-6/+47
	Use the implementation ported from the AMD builtin library rather than LLVM intrinsics. This has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 243131
*	Use llvm intrinsics for native_log and native_log2	Tom Stellard	2015-07-24	4	-2/+100
	llvm-svn: 243130
*	Fix implementation of sqrt v2	Tom Stellard	2015-07-10	2	-6/+4
	Passing values less than 0 to the llvm.sqrt() intrinsic results in undefined behavior, so we need to check the input and return NaN if it is less than 0. v2: - Fix build failures. llvm-svn: 241906
*	Implement exp2 using OpenCL C rather than using an intrinsic	Tom Stellard	2015-05-13	2	-5/+46
	Not all targets support the intrinsic, so it's better to have a generic implementation which does not use it. This exp2 implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237228
*	Implement atan2pi builtin	Tom Stellard	2015-05-12	3	-0/+48
	This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 237138
*	math: limit half_sqrt to single precision	Jan Vesely	2015-05-09	1	-2/+2
	Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 236941
*	geometric: Limit fast_{distance,length} functions to single precision	Jan Vesely	2015-05-09	2	-0/+4
	Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 236940
*	Implement fast_normalize builtin v4	Tom Stellard	2015-05-09	4	-0/+61
	This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. v2: - Remove f suffix from constant in double implementations. - Consolidate implementations using the .cl/.inc approach. v3: - Use __CLC_FPSIZE instead of __CLC_FP{32,64}. v4 (Jan Vesely): - Limit to single precision. llvm-svn: 236920
*	Implement half_rsqrt builtin v3	Tom Stellard	2015-05-08	3	-0/+34
	This is a generic implementation which just calls rsqrt. Targets should override this if they want a faster implementation. v2: - Alphabetize SOURCES. v3 (Jan Vesely): - Limit to single precision types. llvm-svn: 236915
*	Implement sinpi builtin	Jan Vesely	2015-05-06	3	-0/+5
	Ported from AMD builtin library, passes piglit on Turks. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 236647
*	Implement ldexp for R600/SI	Tom Stellard	2015-05-06	4	-0/+74
	llvm-svn: 236638
*	Implement fract builtin	Tom Stellard	2015-04-23	3	-0/+50
	This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 235620
*	Implement atanh builtin	Tom Stellard	2015-04-07	3	-0/+48
	This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 234324
*	Implement acosh builtin	Tom Stellard	2015-04-07	3	-0/+48
	This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 234323
*	Implement atanpi builtin	Tom Stellard	2015-04-02	3	-0/+48
	This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 233928
*	Implement asinpi builtin	Tom Stellard	2015-04-02	3	-0/+48
	This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 233927
*	Implement asinh builtin	Tom Stellard	2015-04-02	3	-0/+48
	This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 233926
*	Implement acospi builtin	Tom Stellard	2015-04-02	3	-0/+48
	This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 233925
*	Implement fmax using __builtin_fmax	Tom Stellard	2015-03-31	1	-4/+1
	This ensures correct handling of NaN. This has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 233713
*	Implement fmin using __builtin_fmin	Tom Stellard	2015-03-31	1	-4/+1
	This ensures correct handling of NaN. This has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 233712
*	Implement fast_distance builtin	Tom Stellard	2015-03-23	3	-0/+48
	This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 232978
*	Implement fast_length builtin	Tom Stellard	2015-03-23	3	-0/+48
	This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 232977
*	Implement half_sqrt builtin v2	Tom Stellard	2015-03-23	2	-0/+32
	This is a generic implementation which just calls sqrt. Targets should override this if they want a faster implementation. v2: - Alphabetize SOURCES. llvm-svn: 232965
*	Implement distance builtin v2	Tom Stellard	2015-03-23	2	-0/+24
	This implementation was ported from the AMD builtin library and has been tested with piglit, OpenCV, and the ocl conformance tests. v2: - Remove unnecessary copyright. llvm-svn: 232964
*	math: Implement erfc	Aaron Watry	2015-03-18	2	-0/+10
	Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 232674
*	Fix bitselect for float/double types v2	Tom Stellard	2015-03-05	2	-1/+51
	We need to reinterpret float/double types as uint/ulong in order to perform the bitwise operations. This has been tested with piglit, OpenCV, and the ocl conformance tests. v2: - Use vector operations rather than splitting vectors into scalar components. Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 231373
*	Move mix from math to common	Aaron Watry	2015-03-03	4	-3/+3
	It has been part of the common functions since 1.0. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 231137
*	Implement step builtin	Tom Stellard	2015-03-02	3	-0/+54
	This has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 230970
*	Implement smoothstep builtin v2	Tom Stellard	2015-03-02	3	-0/+54
	This has been tested with piglit, OpenCV, and the ocl conformance tests. v2: - Fix typo in smoothstep.h. llvm-svn: 230969
*	Implement radians builtin v2	Tom Stellard	2015-03-02	3	-0/+49
	This has been tested with piglit, OpenCV, and the ocl conformance tests. v2: - Move to the common/ directory. llvm-svn: 230968
*	Implement degrees builtin v2	Tom Stellard	2015-03-02	3	-0/+49
	This has been tested with piglit, OpenCV, and the ocl conformance tests. v2: - Move to the common/ directory. llvm-svn: 230967
*	libclc/math: Add cospi	Aaron Watry	2015-02-26	3	-0/+5
	Ported from the libclc/amd-builtins branch. v2: Rename sincos_f_piby4 to __libclc__sincosf_piby4; add cospi(double) implementation instead of using llvm.cos. Notes: The sincosD_piby4.h file is mostly the same as the builtin implementation released by AMD. The inline attribute declaration is changed, and M_PI is used instead of a constant double. Otherwise, the only difference is that the header explicitly enables the fp64 pragma. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jeroen Ketema <j.ketema@imperial.ac.uk> CC: Tom Stellard <tom@stellard.net> CC: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 230641
*	Implement log10	Jan Vesely	2015-01-30	2	-0/+10
	v2: Use constant and multiplication instead of division. v3: Use hex constants. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 227585
*	Remove wrong semi-colons	Jeroen Ketema	2014-12-19	2	-2/+2
	Patch by Alastair Donaldson. llvm-svn: 224568
*	Don't include <stddef.h>	Jeroen Ketema	2014-11-18	1	-2/+5
	Including a standard or system header isn't allowed in OpenCL. The type "size_t" needs to be explicitly defined now. v2: Use __SIZE_TYPE__ instead of unsigned int. v3: Define ptrdiff_t and NULL. Patch-by: Jean-Sébastien Pédron Reviewed-by: Jeroen Ketema Reviewed-by: Jan Vesely llvm-svn: 222235
*	Prune CRLF.	NAKAMURA Takumi	2014-10-27	1	-1/+1
	llvm-svn: 220678
*	r600: Use llvm intrinsic to read work dimension information	Jan Vesely	2014-10-15	1	-0/+1
	v2: Fix function declaration; add range metadata to r600 implementation. v3: Change prefix to AMDGPU. Reviewed-by: Tom Stellard <tom@stellard.net> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 219793
*	Implement log1p builtin	Tom Stellard	2014-10-07	3	-0/+48
	llvm-svn: 219230
*	Implement fmod	Jan Vesely	2014-10-05	3	-0/+4
	Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 219087
*	Implement async_work_group_copy builtin v3	Tom Stellard	2014-10-03	3	-0/+21
	This is a simple implementation which just copies data synchronously. v2: - Use size_t. v3: - Fix possible race condition by splitting the copy among multiple work items. llvm-svn: 219008
*	Implement async_work_group_strided_copy builtin v2	Tom Stellard	2014-10-03	3	-0/+22
	This is a simple implementation which just copies data synchronously. v2: - Use size_t. llvm-svn: 219007
*	Implement wait_group_events builtin v2	Tom Stellard	2014-10-03	2	-0/+2
	This is a simple default implementation which just calls barrier(). v2: - Only call barrier() once. llvm-svn: 219006
*	Remove more redundant semi-colons	Jeroen Ketema	2014-09-18	1	-5/+5
	llvm-svn: 218039
*	atomic: undef macros that are included from atomic_decl.inc	Aaron Watry	2014-09-17	8	-0/+15
	Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jeroen Ketema <j.ketema@imperial.ac.uk> llvm-svn: 217958
*	Remove redundant semi-colons	Jeroen Ketema	2014-09-17	1	-4/+4
	llvm-svn: 217954
*	atomic: Add generic atom[ic]_cmpxchg	Aaron Watry	2014-09-16	4	-0/+22
	Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217918
*	atomic: Implement generic atom[ic]_xchg	Aaron Watry	2014-09-16	4	-0/+12
	Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217917
*	atomic: Add generic atomic_min implementation	Aaron Watry	2014-09-16	4	-0/+10
	Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217916