| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
This is a port from the AMD builtin library.
llvm-svn: 248780
|
|
|
|
|
|
| |
Patch by: Zoltan Gilian
llvm-svn: 248163
|
|
|
|
|
|
| |
Patch by: Zoltan Gilian
llvm-svn: 248162
|
|
|
|
|
|
| |
Patch by: Zoltan Gilian
llvm-svn: 248161
|
|
|
|
|
|
| |
Patch by: Zoltan Gilian
llvm-svn: 248160
|
|
|
|
|
|
|
|
|
| |
Added get_image_* OpenCL builtins to the headers.
Added implementation to the r600 target.
Patch by: Zoltan Gilian
llvm-svn: 248159
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The values for the char/short/integer/long minimums were declared with
their actual values, not the definitions from the CL spec (v1.1). As
a result, (-2147483648) was actually being treated as a long by the
compiler, not an int, which caused issues when trying to add/subtract
that value from a vector.
Update the definitions to use the values declared by the spec, and also
add explicit casts for the char/short/int minimums so that the compiler
actually treats them as shorts/chars. Without those casts, they
actually end up stored as integers, and the compiler may end up storing
the INT_MIN as a long.
The compiler can sign extend the values if it needs to convert the
char->short, short->int, or int->long
v2: Add explicit cast for INT_MIN and fix some type-o's and wrapping
in the commit message.
Reported-by: Moritz Pflanzer <moritz.pflanzer14@imperial.ac.uk>
CC: Moritz Pflanzer <moritz.pflanzer14@imperial.ac.uk>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 247661
|
|
|
|
|
|
| |
We need to use M_LOG2E instead of M_LOG2E_F.
llvm-svn: 243132
|
|
|
|
|
|
|
|
|
| |
Use the implementation was ported from the AMD builtin library rather
than LLVM Intrinsics.
This has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 243131
|
|
|
|
| |
llvm-svn: 243130
|
|
|
|
|
|
|
|
|
|
|
| |
Passing values less than 0 to the llvm.sqrt() intrinsic results in
undefined behavior, so we need to check the input and return NaN if
is is less than 0.
v2:
- Fix build failures.
llvm-svn: 241906
|
|
|
|
|
|
|
|
|
|
| |
Not all targets support the intrinsic, so it's better to have a
generic implementation which does not use it.
This exp2 implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 237228
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 237138
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 236941
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 236940
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
v2:
- Remove f suffix from constant in double implementations.
- Consolidate implementations using the .cl/.inc approach.
v3:
- Use __CLC_FPSIZE instead of __CLC_FP{32,64}
v4 (Jan Vesely):
- Limit to single precision.
llvm-svn: 236920
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a generic implementation which just calls rsqrt.
Targets should override this if they want a faster implementation.
v2:
- Alphabettize SOURCES
v3 (Jan Vesely):
Limit to single precision types.
llvm-svn: 236915
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 236648
|
|
|
|
|
|
|
|
| |
Ported from AMD builtin library, passes piglit on Turks.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 236647
|
|
|
|
| |
llvm-svn: 236638
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 235620
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes it possible for runtime implementations to disable
subnormal handling at runtime.
When this flag is enabled, decisions about how to handle subnormals
in the library will be controlled by an external variable called
__CLC_SUBNORMAL_DISABLE.
Function implementations should use these new helpers for querying subnormal
support:
__clc_fp16_subnormals_supported();
__clc_fp32_subnormals_supported();
__clc_fp64_subnormals_supported();
In order for the library to link correctly with this feature,
users will be required to either:
1. Insert this variable into the module (if using the LLVM/Clang C++/C APIs).
2. Pass either subnormal_disable.bc or subnormal_use_default.bc to the
linker. These files are distributed with liblclc and installed to
$(installdir). e.g.:
llvm-link -o kernel-out.bc kernel.bc builtins-nosubnormal.bc subnormal_disable.bc
or
llvm-link -o kernel-out.bc kernel.bc builtins-nosubnormal.bc subnormal_use_default.bc
If you do not supply the --enable-runtime-subnormal then the library
behaves the same as it did before this commit.
In addition to these changes, the patch adds helper functions that
should be used when implementing library functions that need
special handling for denormals:
__clc_fp16_subnormals_supported();
__clc_fp32_subnormals_supported();
__clc_fp64_subnormals_supported();
llvm-svn: 235329
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 234324
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 234323
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 233928
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 233927
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 233926
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 233925
|
|
|
|
|
|
|
|
| |
This ensures correct handling of NaNi.
This has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 233713
|
|
|
|
|
|
|
|
| |
This ensures correct handling of NaN.
This has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 233712
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 232978
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 232977
|
|
|
|
|
|
|
|
|
|
| |
This is a generic implementation which just calls sqrt. Targets should
override this if they want a faster implementation.
v2:
- Alphabetize SOURCES
llvm-svn: 232965
|
|
|
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
v2:
- Remove unnecessary copyright.
llvm-svn: 232964
|
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 232674
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to reinterpret float/double types as uint/ulong in order to
perform the bitwise operations.
This has been tested with piglit, OpenCV, and the ocl conformance tests.
v2:
- Use vector operations rather than splitting vectors into scalar
components.
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 231373
|
|
|
|
|
|
|
|
| |
It has been part of the common functions since 1.0
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 231137
|
|
|
|
|
|
| |
This has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 230970
|
|
|
|
|
|
|
|
|
| |
This has been tested with piglit, OpenCV, and the ocl conformance tests.
v2:
- Fix typo in smoothstep.h
llvm-svn: 230969
|
|
|
|
|
|
|
|
|
| |
This has been tested with piglit, OpenCV, and the ocl conformance tests.
v2:
- Move to the common/ directory
llvm-svn: 230968
|
|
|
|
|
|
|
|
|
| |
This has been tested with piglit, OpenCV, and the ocl conformance tests.
v2:
- Move to the common/ directory
llvm-svn: 230967
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ported from the libclc/amd-builtins branch
v2: Rename sincos_f_piby4 to __libclc__sincosf_piby4
Add cospi(double) implementation instead of using llvm.cos
Notes:
The sincosD_piby4.h file is mostly the same as the builtin implementation
released by AMD. The inline attribute declaration is changed, and M_PI is
used instead of a constant double. Otherwise, the only difference is that
the header explicitly enables the fp64 pragma.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jeroen Ketema <j.ketema@imperial.ac.uk>
CC: Tom Stellard <tom@stellard.net>
CC: Matt Arsenault <Matthew.Arsenault@amd.com>
llvm-svn: 230641
|
|
|
|
|
|
|
|
| |
v2: Use constant and multiplication instead of division
v3: Use hex constants
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 227585
|
|
|
|
|
|
| |
Patch by Alastair Donaldson
llvm-svn: 224568
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Including a standard or system header isn't allowed in OpenCL.
The type "size_t" needs to be explicitely defined now.
v2: Use __SIZE_TYPE__ instead of unsigned int.
v3: Define ptrdiff_t and NULL.
Patch-by: Jean-Sébastien Pédron
Reviewed-by: Jeroen Ketema
Reviewed-by: Jan Vesely
llvm-svn: 222235
|
|
|
|
| |
llvm-svn: 220678
|
|
|
|
|
|
|
|
|
|
| |
v2: Fix function declaration
Add range metadata to r600 implementation
v3: change prefix to AMDGPU
Reviewed-by: Tom Stellard <tom@stellard.net>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 219793
|
|
|
|
| |
llvm-svn: 219230
|
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 219087
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a simple implementation which just copies data synchronously.
v2:
- Use size_t.
v3:
- Fix possible race condition by splitting the copy among multiple
work items.
llvm-svn: 219008
|