| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
| |
Change implementation to use max instead of add.
min/max/med3 do not flush denormals regardless of the mode,
so it is OK to use it whether or not they are enabled.
Also allow using clamp with f16, and use knowledge
of dx10_clamp.
llvm-svn: 295788
|
| |
|
|
| |
llvm-svn: 295554
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D26010
llvm-svn: 294692
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D28760#fb670e28
llvm-svn: 294449
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D29633
llvm-svn: 294441
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D29318
llvm-svn: 294440
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D29224
llvm-svn: 293318
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This lets you select which sort of spilling you want, either s[0:1] or 64-bit loads from s[0:1].
Patch By: Dave Airlie
Reviewers: nhaehnle, arsenm, tstellarAMD
Reviewed By: arsenm
Subscribers: mareko, llvm-commits, kzhuravl, wdng, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D25428
llvm-svn: 293000
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D27732
llvm-svn: 291245
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
functime metadata V2.0
Summary:
Added pair of directives .hsa_code_object_metadata/.end_hsa_code_object_metadata.
Between them user can put YAML string that would be directly put to the generated note. E.g.:
'''
.hsa_code_object_metadata
{
amd.MDVersion: [ 2, 0 ]
}
.end_hsa_code_object_metadata
'''
Based on D25046
Reviewers: vpykhtin, nhaustov, yaxunl, tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, mgorny, tony-tye
Differential Revision: https://reviews.llvm.org/D27619
llvm-svn: 290097
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D25046
llvm-svn: 289674
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: This frees 2 scalar registers.
Reviewers: tstellarAMD
Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D27150
llvm-svn: 289261
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
There is no point in setting SGPRS=104, because VI allocates SGPRs
in multiples of 16, so 104 -> 112. That enables us to use all 102 SGPRs
for general purposes.
Reviewers: tstellarAMD
Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D27149
llvm-svn: 289260
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D27416
llvm-svn: 288852
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Currently runtime metadata is emitted as an ELF section with name .AMDGPU.runtime_metadata.
However there is a standard way to convey vendor specific information about how to run an ELF binary, which is called vendor-specific note element (http://www.netbsd.org/docs/kernel/elf-notes.html).
This patch lets AMDGPU backend emits runtime metadata as a note element in .note section.
Differential Revision: https://reviews.llvm.org/D25781
llvm-svn: 286502
|
| |
|
|
|
|
| |
This is possible when using inline asm.
llvm-svn: 285447
|
| |
|
|
|
|
|
|
| |
respectively.
Differential Revision: http://reviews.llvm.org/D25789
llvm-svn: 284655
|
| |
|
|
|
|
|
|
| |
This avoids "static initialization order fiasco"
Differential Revision: https://reviews.llvm.org/D25412
llvm-svn: 283702
|
| |
|
|
|
|
| |
Fix bad merge
llvm-svn: 283470
|
| |
|
|
| |
llvm-svn: 283469
|
| |
|
|
|
|
| |
Make the necessary refactorings to make use of PseudoInstExpansion
llvm-svn: 283467
|
| |
|
|
|
|
|
| |
AMDGPU needs to expand unconditional branches in a new
block with an indirect branch.
llvm-svn: 283464
|
| |
|
|
|
|
|
|
| |
These ones need to have the size on the pseudo instruction set for
getInstSizeInBytes to work correctly. These also have a statically
known size.
llvm-svn: 283437
|
| |
|
|
| |
llvm-svn: 283004
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We need to call AsmPrinter::getNameWithPrefix() in order to handle
anonymous GlobalValues (e.g. @0, @1).
Reviewers: arsenm, b-sumner
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D24865
llvm-svn: 282420
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D24835
llvm-svn: 282223
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
mesa3d will use the same kernel calling convention as amdhsa, but it will
handle everything else like the default 'unknown' OS type.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: https://reviews.llvm.org/D22783
llvm-svn: 281779
|
| |
|
|
| |
llvm-svn: 280841
|
| |
|
|
|
|
|
|
| |
OpenCL kernels have hidden kernel arguments for global offset and printf buffer. For consistency, these hidden argument should be included in the runtime metadata. Also updated kernel argument kind metadata.
Differential Revision: https://reviews.llvm.org/D23424
llvm-svn: 280829
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Implemented amdgpu-flat-work-group-size attribute
- Implemented amdgpu-num-active-waves-per-eu attribute
- Implemented amdgpu-num-sgpr attribute
- Implemented amdgpu-num-vgpr attribute
- Dynamic LDS constraints are in a separate patch
Patch by Tom Stellard and Konstantin Zhuravlyov
Differential Revision: https://reviews.llvm.org/D21562
llvm-svn: 280747
|
| |
|
|
|
|
|
|
| |
Add runtime metdata for pointee alignment of pointer type kernel argument. The key is KeyArgPointeeAlign and the value is a 32 bit unsigned integer.
Differential Revision: https://reviews.llvm.org/D24145
llvm-svn: 280399
|
| |
|
|
|
|
| |
Follow up to r278902. I had missed "fall through", with a space.
llvm-svn: 278970
|
| |
|
|
|
|
|
|
|
|
| |
Adding missing tests for OCL type names for half, float, double, char, short, long, and unknown.
Patch by Aaron En Ye Shi.
Differential Revision: https://reviews.llvm.org/D22964
llvm-svn: 277759
|
| |
|
|
|
|
|
| |
getFrameInfo() never returns nullptr so we should use a reference
instead of a pointer.
llvm-svn: 277017
|
| |
|
|
| |
llvm-svn: 276804
|
| |
|
|
|
|
|
|
|
| |
ABIArgOffset is a problem because properly fsetting the
KernArgSize requires that the reserved area before the
real kernel arguments be correctly aligned, which requires
fixing clover.
llvm-svn: 276766
|
| |
|
|
|
|
|
| |
Remove dead code from r600 intrinsic removal.
Remove unset members, rename StackSize to be less ambiguous.
llvm-svn: 276436
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D22526
llvm-svn: 276119
|
| |
|
|
|
|
| |
Attempting to fix lit test failure on ppc.
llvm-svn: 275676
|
| |
|
|
|
|
| |
This reverts commit r275566.
llvm-svn: 275599
|
| |
|
|
|
|
|
|
|
|
| |
Added emitting metadata to elf for runtime.
Runtime requires certain information (metadata) about kernels to be able to execute and query them. Such information is emitted to an elf section as a key-value pair stream.
Differential Revision: https://reviews.llvm.org/D21849
llvm-svn: 275566
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
v2: don't count SGPRs spilled to scratch twice
I think this is sufficient. It doesn't count private memory usage, which
happens often and uses scratch but isn't technically a spill. The private
memory usage can be computed by:
[scratch_per_thread - vgpr_spills - a random multiple of SGPR spills].
The fact SGPR spills add very high numbers to the scratch size make that
computation a guessing game, but I don't have a solution to that.
Reviewers: tstellarAMD
Subscribers: arsenm, kzhuravl
Differential Revision: http://reviews.llvm.org/D22197
llvm-svn: 275288
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: rafael, ruiu, tony-tye, arsenm, kzhuravl
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D21484
llvm-svn: 275268
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the kernel code header
Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue.
Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format:
- offset 0: work group ID x
- offset 4: work group ID y
- offset 8: work group ID z
- offset 16: work item ID x
- offset 20: work item ID y
- offset 24: work item ID z
Set
- amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg
- amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg
- amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled
Differential Revision: http://reviews.llvm.org/D20335
llvm-svn: 273769
|
| |
|
|
|
|
|
|
|
| |
Split AMDGPUSubtarget into amdgcn/r600 specific subclasses.
This removes most of the static_casting of the basic codegen
classes everywhere, and tries to restrict the features
visible on the wrong target.
llvm-svn: 273652
|
| |
|
|
|
|
|
| |
Backends may want to report errors on resources other than
stack size.
llvm-svn: 273177
|
| |
|
|
| |
llvm-svn: 273172
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D20081
llvm-svn: 270594
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Version 2 is now the default. If you want to emit version 1, use
the amdgcn--amdhsa-amdcov1 triple.
Reviewers: arsenm, kzhuravl
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19283
llvm-svn: 268647
|
| |
|
|
| |
llvm-svn: 267922
|