| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry
llvm-svn: 347666
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry
llvm-svn: 346597
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry
llvm-svn: 346078
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry
llvm-svn: 346077
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry
llvm-svn: 346076
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewer: Aaron Watry
llvm-svn: 346075
|
|
|
|
|
|
|
|
|
| |
Same reason as amdgcn.
Fixes fmin, minmag CTS on turks.
Reviewer: Tom Stellard <tstellar@redhat.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 334228
|
|
|
|
|
|
|
|
|
| |
Same reason as amdgcn.
Fixes fmax, maxmag CTS on turks.
Reviewer: Tom Stellard <tstellar@redhat.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 334227
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 329291
|
|
|
|
|
|
|
|
| |
r324101 switched around AS numbering
Acked-by: Aaron Watry <awatry@gmail.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 325864
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Acked-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 316238
|
|
|
|
|
|
|
|
|
|
| |
The implementation uses r600 sepcific intrinsics
LLVM-4 switched to _ro_t and _rw_t image types
Portions of the code can be moved back as more targets/llvm versions add image support
Reviewer: Aaron Watry
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 315341
|
|
|
|
|
| |
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 314634
|
|
|
|
|
|
|
|
|
| |
We don't have memory fences for r600 so just call group barrier directly
Make sure that barrier is called even with 0 flags
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 312492
|
|
|
|
| |
llvm-svn: 279723
|
|
|
|
| |
llvm-svn: 279644
|
|
|
|
| |
llvm-svn: 279359
|
|
|
|
| |
llvm-svn: 279350
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also fix get_global_id to consider offset
No idea how to add this for ptx, so they are stuck with the old get_global_id
implementation.
v2: split to a separate patch
v3: Switch R600 to use implictarg.ptr
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 276443
|
|
|
|
|
|
|
|
|
|
|
| |
v2: split into 2 patches
use clang builtins for other intrinsics as well
v3: Fix warnings
Switch r600 to use implictarg.ptr
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 276442
|
|
|
|
| |
llvm-svn: 275874
|
|
|
|
| |
llvm-svn: 261042
|
|
|
|
|
|
|
|
|
|
|
| |
Most files remain in a common amdgpu directory.
Also switches barriers to to use convergent,
and use llvm.amdgcn.s.barrier.
This now requires 3.9/trunk to build amdgcn.
llvm-svn: 260777
|
|
|
|
|
|
| |
Patch by: Zoltan Gilian
llvm-svn: 248161
|
|
|
|
|
|
| |
Patch by: Zoltan Gilian
llvm-svn: 248160
|
|
|
|
|
|
|
|
|
| |
Added get_image_* OpenCL builtins to the headers.
Added implementation to the r600 target.
Patch by: Zoltan Gilian
llvm-svn: 248159
|
|
|
|
|
|
|
| |
v2:
- Use same implementation for R600 and gcn.
llvm-svn: 241907
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 236649
|
|
|
|
|
|
|
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Tom Stellard:
- Add denormal handling.
- Share vectorization code with r600 implementation.
Patch By: Aaron Watry
llvm-svn: 236639
|
|
|
|
| |
llvm-svn: 236638
|
|
|
|
| |
llvm-svn: 225042
|
|
|
|
|
|
| |
Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 220388
|
|
|
|
|
|
|
|
|
|
| |
v2: Fix function declaration
Add range metadata to r600 implementation
v3: change prefix to AMDGPU
Reviewed-by: Tom Stellard <tom@stellard.net>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 219793
|
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217925
|
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217924
|
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217923
|
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217922
|
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217921
|
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217920
|
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217919
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This generates bitcode which is indistinguishable from what was
hand-written for int32 types in v[load|store]_impl.ll.
v4: Use vec2+scalar for vec3 load/stores to prevent corruption (per Tom)
v3: Also remove unused generic/lib/shared/v[load|store]_impl.ll
v2: (Per Matt Arsenault) Fix alignment issues with vector load stores
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: Matt Arsenault <Matthew.Arsenault@amd.com>
CC: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 216069
|
|
|
|
|
|
|
|
|
| |
v2: - use quotes instead of <>
- add include to r600/lib/math/nextafter.c changed
Reviewed-by: Tom Stellard <tom@stellard.net>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 211576
|
|
|
|
|
|
|
|
| |
This will prevent LLVM optimization passes from creating illegal uses
of the barrier() intrinsic (e.g. calling barrier() from a conditional
that is not executed by all threads).
llvm-svn: 193753
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are two implementations of nextafter():
1. Using clang's __builtin_nextafter. Clang replaces this builtin with
a call to nextafter which is part of libm. Therefore, this
implementation will only work for targets with an implementation of
libm (e.g. most CPU targets).
2. The other implementation is written in OpenCL C. This function is
known internally as __clc_nextafter and can be used by targets that
don't have access to libm.
llvm-svn: 192383
|
|
|
|
|
| |
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 190201
|
|
|
|
|
| |
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 190058
|
|
|
|
|
|
|
|
| |
It's supported by the R600 LLVM back-end now, at least for evergreen.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188180
|
|
|
|
|
|
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188179
|
|
|
|
|
|
|
|
|
| |
The get_num_groups function was missing for r600g. I did the same
thing as the other workitem functions.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 187059
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The assembly optimizations were making unsafe assumptions about which address
spaces had which identifiers.
Also, fix vload/vstore with 64-bit pointers. This was broken previously on
Radeon SI.
This version still only has assembly versions of int/uint 2/4/8/16 for global
loads and stores on R600, but it does it in a way that would be very easily
extended to private/local/constant and could also be handled easily on other
architectures.
v2: 1) Leave v[load|store]_impl.ll in generic/lib
2) Remove vload_if.ll and vstore_if.ll interfaces
3) Fix address+offset calculations
3) Remove offset from assembly arg list
llvm-svn: 186416
|