| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 318265
|
|
|
|
|
|
| |
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 318264
|
|
|
|
|
|
|
|
|
|
| |
v2: don't use assume
check only for x<0, the other conditions are handled transparently
v3: don't check inputs at all, nan propagation works as expected
Reviewer: Jeroen Ketema
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 318204
|
|
|
|
|
|
| |
Reviewer: Jeroen Ketema
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 318067
|
|
|
|
|
|
| |
Reviewer: Jeroen Ketema
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 318066
|
|
|
|
|
|
| |
Reviewer: Jeroen Ketema
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 318065
|
|
|
|
|
|
| |
Reviewer: Jeroen Ketema
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 318064
|
|
|
|
|
|
|
|
| |
exp10 CTS fails with or without this change
Reviewer: Jeroen Ketema
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 318063
|
|
|
|
|
|
|
|
| |
v2: Use native_log2 instead of wrong constant
Reviewer: Jeroen Ketema
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 317941
|
|
|
|
|
|
| |
Reviewer: Jeroen Ketema
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 317940
|
|
|
|
|
|
| |
Reviewer: Jeroen Ketema
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 317939
|
|
|
|
|
|
| |
Reviewer: Jeroen Ketema
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 317938
|
|
|
|
|
|
| |
Reviewer: Jeroen Ketema
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 317937
|
|
|
|
|
|
| |
Reviewer: Jeroen Ketema
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 317936
|
|
|
|
|
|
| |
Reviewer: Jeroen Ketema
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 317934
|
|
|
|
|
|
| |
Reviewer: Jeroen Ketema
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 317933
|
|
|
|
|
|
|
|
|
| |
v2: Add __CLC_XCONCAT instead of function name redirection
Use __CLC_XCONCAT for intrinsic functions as well
Reviewer: Jeroen Ketema
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 317932
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use llvm instrinsic by default
Provide amdgpu workaround
v2: drop old amd copyrights
Reviewer: Aaron Watry
Reviewed-by: Vedran Miletić <vedran@miletic.net>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 316588
|
|
|
|
|
|
| |
mostly copied form amd_builtins
llvm-svn: 296233
|
|
|
|
|
|
|
|
|
| |
Ported from the amd-builtins branch.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
CC: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 292335
|
|
|
|
|
|
|
|
|
| |
Ported from the amd-builtins branch.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
CC: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 292334
|
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 281566
|
|
|
|
|
|
|
|
| |
Just use lgamma_r and ignore the value returned in the second argument
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 281565
|
|
|
|
|
|
|
|
|
| |
Ported from the amd-builtins branch, which is itself based on the
Sun Microsystems implementation.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 281564
|
|
|
|
|
|
| |
This one passes conformance.
llvm-svn: 280961
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 276497
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 276496
|
|
|
|
|
|
|
|
|
|
| |
Fixes fdim piglit on Turks
v2: use CL fmax instead of __builtin
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom.stellard@amd.com>
llvm-svn: 269807
|
|
|
|
|
|
|
|
|
|
|
|
| |
The scalar float/double function bodies are a direct copy/paste,
aside from the removed (optional) code in float function body that
requires subnormals.
reviewers: jvesely
Patch by: Vedran Miletić <rivanvx@gmail.com>
llvm-svn: 268766
|
|
|
|
|
|
|
|
|
|
|
| |
Based on the amd-builtin, but explicitly vectorized for all sizes (not just
float4), and includes a vectorized double implementation.
Passes piglit (float) tests on pitcairn.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 268708
|
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 261714
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The scalar float/double function bodies are a direct copy/paste
with usage of the CLC wrappers to vectorize them.
This commit also adds in the FP_ILOGB0 and FP_ILOGBNAN macros which are
equal to the results of ilogb(0.0f) and ilogb(float nan) respectively.
v2: Add FP_ILOGB0 and FP_ILOGBNAN definitions
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
v1 Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 261639
|
|
|
|
|
|
| |
reviewer: tstellard
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 260301
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The float implementation is almost a direct port from the amd-builtins,
but instead of just having a scalar and float4 implementation, it has
a scalar and arbitrary width vector implementation.
The double scalar is also a direct port from AMD's builtin release.
The double vector implementation copies the logic in the float vector
implementation using the values from the double scalar version.
Both have been tested in piglit using tests sent to that project's
mailing list.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 260114
|
|
|
|
|
|
|
|
| |
V2: use the reference implementation as suggested by Matt Arsenault
Patch By: Pavel Ondračka
llvm-svn: 258933
|
|
|
|
|
|
| |
This is a port from the AMD builtin library.
llvm-svn: 248780
|
|
|
|
|
|
| |
We need to use M_LOG2E instead of M_LOG2E_F.
llvm-svn: 243132
|
|
|
|
|
|
|
|
|
| |
Use the implementation was ported from the AMD builtin library rather
than LLVM Intrinsics.
This has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 243131
|
|
|
|
| |
llvm-svn: 243130
|
|
|
|
|
|
|
|
|
|
|
| |
Passing values less than 0 to the llvm.sqrt() intrinsic results in
undefined behavior, so we need to check the input and return NaN if
is is less than 0.
v2:
- Fix build failures.
llvm-svn: 241906
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using exp2(x * M_LOG2E_F) does not give us accurate enough results for
OpenCL. If you look at the new exp implementation you'll see that
it does multiply the input by M_LOG2E_F, but it still uses the original
input in part of the calculation.
This exp implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 237229
|
|
|
|
|
|
|
|
|
|
| |
Not all targets support the intrinsic, so it's better to have a
generic implementation which does not use it.
This exp2 implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 237228
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 237155
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 237154
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 237138
|
|
|
|
|
|
|
| |
This implementation was ported from the AMD builtin library
and has been tested with piglit, OpenCV, and the ocl conformance tests.
llvm-svn: 237131
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 236941
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 236939
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a generic implementation which just calls rsqrt.
Targets should override this if they want a faster implementation.
v2:
- Alphabettize SOURCES
v3 (Jan Vesely):
Limit to single precision types.
llvm-svn: 236915
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 236648
|