| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 237154
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 237138
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 237131
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 236941
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 236940
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 236939
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
v2:
- Remove f suffix from constant in double implementations.
- Consolidate implementations using the .cl/.inc approach.
v3:
- Use __CLC_FPSIZE instead of __CLC_FP{32,64}
v4 (Jan Vesely):
- Limit to single precision.
llvm-svn: 236920
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a generic implementation which just calls rsqrt.
Targets should override this if they want a faster implementation.
v2:
- Alphabettize SOURCES
v3 (Jan Vesely):
Limit to single precision types.
llvm-svn: 236915
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 236648
|
|
|
|
|
|
|
|
| |
Ported from AMD builtin library, passes piglit on Turks.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 236647
|
|
|
|
|
|
|
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Tom Stellard:
- Add denormal handling.
- Share vectorization code with r600 implementation.
Patch By: Aaron Watry
llvm-svn: 236639
|
|
|
|
| |
llvm-svn: 236638
|
|
|
|
|
|
|
| |
The new implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 236608
|
|
|
|
|
|
|
|
|
| |
It allows to keep temporary compatibilty with older version.
For exemple, this can be use when change are not to large.
Patch by: EdB
llvm-svn: 236113
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 235762
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 235620
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes it possible for runtime implementations to disable
subnormal handling at runtime.
When this flag is enabled, decisions about how to handle subnormals
in the library will be controlled by an external variable called
__CLC_SUBNORMAL_DISABLE.
Function implementations should use these new helpers for querying subnormal
support:
__clc_fp16_subnormals_supported();
__clc_fp32_subnormals_supported();
__clc_fp64_subnormals_supported();
In order for the library to link correctly with this feature,
users will be required to either:
1. Insert this variable into the module (if using the LLVM/Clang C++/C APIs).
2. Pass either subnormal_disable.bc or subnormal_use_default.bc to the
linker. These files are distributed with liblclc and installed to
$(installdir). e.g.:
llvm-link -o kernel-out.bc kernel.bc builtins-nosubnormal.bc subnormal_disable.bc
or
llvm-link -o kernel-out.bc kernel.bc builtins-nosubnormal.bc subnormal_use_default.bc
If you do not supply the --enable-runtime-subnormal then the library
behaves the same as it did before this commit.
In addition to these changes, the patch adds helper functions that
should be used when implementing library functions that need
special handling for denormals:
__clc_fp16_subnormals_supported();
__clc_fp32_subnormals_supported();
__clc_fp64_subnormals_supported();
llvm-svn: 235329
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 234324
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 234323
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 233928
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 233927
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 233926
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 233925
|
|
|
|
|
|
|
|
| |
This ensures correct handling of NaNi.
This has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 233713
|
|
|
|
|
|
|
|
| |
This ensures correct handling of NaN.
This has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 233712
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 232978
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 232977
|
|
|
|
|
|
|
|
|
|
| |
This is a generic implementation which just calls sqrt. Targets should
override this if they want a faster implementation.
v2:
- Alphabetize SOURCES
llvm-svn: 232965
|
|
|
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
v2:
- Remove unnecessary copyright.
llvm-svn: 232964
|
|
|
|
|
|
|
|
| |
v2:
- Move common code into a macro
- Use the same constant for all vector types.
llvm-svn: 232963
|
|
|
|
|
|
|
| |
This will help avoid naming conflicts with functions defined in
kernels linking with libclc.
llvm-svn: 232960
|
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 232674
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to reinterpret float/double types as uint/ulong in order to
perform the bitwise operations.
This has been tested with piglit, OpenCV, and the ocl conformance tests.
v2:
- Use vector operations rather than splitting vectors into scalar
components.
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 231373
|
|
|
|
|
|
|
|
| |
It has been part of the common functions since 1.0
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 231137
|
|
|
|
|
|
| |
This has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 230970
|
|
|
|
|
|
|
|
|
| |
This has been tested with piglit, OpenCV, and the ocl conformance tests.
v2:
- Fix typo in smoothstep.h
llvm-svn: 230969
|
|
|
|
|
|
|
|
|
| |
This has been tested with piglit, OpenCV, and the ocl conformance tests.
v2:
- Move to the common/ directory
llvm-svn: 230968
|
|
|
|
|
|
|
|
|
| |
This has been tested with piglit, OpenCV, and the ocl conformance tests.
v2:
- Move to the common/ directory
llvm-svn: 230967
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ported from the libclc/amd-builtins branch
v2: Rename sincos_f_piby4 to __libclc__sincosf_piby4
Add cospi(double) implementation instead of using llvm.cos
Notes:
The sincosD_piby4.h file is mostly the same as the builtin implementation
released by AMD. The inline attribute declaration is changed, and M_PI is
used instead of a constant double. Otherwise, the only difference is that
the header explicitly enables the fp64 pragma.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jeroen Ketema <j.ketema@imperial.ac.uk>
CC: Tom Stellard <tom@stellard.net>
CC: Matt Arsenault <Matthew.Arsenault@amd.com>
llvm-svn: 230641
|
|
|
|
|
|
|
|
| |
v2: Use constant and multiplication instead of division
v3: Use hex constants
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 227585
|
|
|
|
|
|
| |
Patch by Alastair Donaldson
llvm-svn: 224568
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Including a standard or system header isn't allowed in OpenCL.
The type "size_t" needs to be explicitely defined now.
v2: Use __SIZE_TYPE__ instead of unsigned int.
v3: Define ptrdiff_t and NULL.
Patch-by: Jean-Sébastien Pédron
Reviewed-by: Jeroen Ketema
Reviewed-by: Jan Vesely
llvm-svn: 222235
|
|
|
|
| |
llvm-svn: 220678
|
|
|
|
|
|
|
|
|
|
| |
v2: Fix function declaration
Add range metadata to r600 implementation
v3: change prefix to AMDGPU
Reviewed-by: Tom Stellard <tom@stellard.net>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 219793
|
|
|
|
| |
llvm-svn: 219230
|
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 219087
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a simple implementation which just copies data synchronously.
v2:
- Use size_t.
v3:
- Fix possible race condition by splitting the copy among multiple
work items.
llvm-svn: 219008
|
|
|
|
|
|
|
|
|
| |
This is a simple implementation which just copies data synchronously.
v2:
- Use size_t.
llvm-svn: 219007
|
|
|
|
|
|
|
|
|
| |
This is a simple default implemetation which just calls barrier().
v2:
- Only call barrier() once.
llvm-svn: 219006
|
|
|
|
| |
llvm-svn: 218039
|