path: root/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
Commit message | Author | Age | Files | Lines
* [AMDGPU] add labels to +DumpCode output | Tim Renouf | 2017-12-08 | 1 | -2/+25
    Summary:
    +DumpCode is a hack to embed disassembly in the ELF file. This commit fixes it to include labels, to make it slightly more useful.

    Reviewers: arsenm, kzhuravl
    Subscribers: nhaehnle, timcorringham, dstuttard, llvm-commits, t-tye, yaxunl, wdng, kzhuravl
    Differential Revision: https://reviews.llvm.org/D40169
    llvm-svn: 320146
* AMDGPU: Add num spilled s/vgprs to metadata | Konstantin Zhuravlyov | 2017-11-28 | 1 | -0/+2
    This was requested by tools.

    Differential Revision: https://reviews.llvm.org/D40321
    llvm-svn: 319192
* Fix a bunch more layering of CodeGen headers that are in Target | David Blaikie | 2017-11-17 | 1 | -1/+1
    All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around).

    llvm-svn: 318490
* AMDGPU: Error on stack size overflow | Matt Arsenault | 2017-11-14 | 1 | -3/+9
    llvm-svn: 318189
* AMDGPU: Fix set but not used warnings related to AMDGPUAS | Konstantin Zhuravlyov | 2017-11-01 | 1 | -1/+1
    Differential Revision: https://reviews.llvm.org/D39499
    llvm-svn: 317114
* AMDGPU: Rename MaxFlatWorkgroupSize to MaxFlatWorkGroupSize for consistency | Konstantin Zhuravlyov | 2017-10-18 | 1 | -1/+1
    Differential Revision: https://reviews.llvm.org/D38957
    llvm-svn: 316097
* AMDGPU: Start generating metadata for MaxFlatWorkGroupSize | Konstantin Zhuravlyov | 2017-10-17 | 1 | -1/+1
    Differential Revision: https://reviews.llvm.org/D38958
    llvm-svn: 316024
* AMDGPU: Don't use TargetStreamer if it has not been initialized | Konstantin Zhuravlyov | 2017-10-14 | 1 | -9/+15
    Fixes the cfe/trunk/test/Misc/backend-resource-limit-diagnostics.cl test after r315808.

    We may hit a few other similar issues, but I want to discuss a good solution offline.

    llvm-svn: 315830
* AMDGPU: Bring HSA metadata on par with the specification | Konstantin Zhuravlyov | 2017-10-14 | 1 | -1/+50
    Differential Revision: https://reviews.llvm.org/D38753
    llvm-svn: 315821
* AMDGPU: Do not emit deprecated notes for code object v3 | Konstantin Zhuravlyov | 2017-10-14 | 1 | -11/+19
    Differential Revision: https://reviews.llvm.org/D38749
    llvm-svn: 315810
* AMDGPU: Add support for isa version note | Konstantin Zhuravlyov | 2017-10-14 | 1 | -6/+16
    - Emit NT_AMD_AMDGPU_ISA
    - Add assembler parsing for isa version directive
    - If isa version directive does not match command line arguments, then return error

    Differential Revision: https://reviews.llvm.org/D38748
    llvm-svn: 315808
* AMDGPU/NFC: Minor clean ups in HSA metadata | Konstantin Zhuravlyov | 2017-10-11 | 1 | -3/+6
    - Use HSA metadata streamer directly from AMDGPUAsmPrinter
    - Make naming consistent with PAL metadata

    Differential Revision: https://reviews.llvm.org/D38746
    llvm-svn: 315526
* AMDGPU/NFC: Minor clean ups in PAL metadata | Konstantin Zhuravlyov | 2017-10-11 | 1 | -39/+42
    - Move PAL metadata definitions to AMDGPUMetadata
    - Make naming consistent with HSA metadata

    Differential Revision: https://reviews.llvm.org/D38745
    llvm-svn: 315523
* AMDGPU/NFC: Rename code object metadata as HSA metadata | Konstantin Zhuravlyov | 2017-10-11 | 1 | -4/+3
    - Rename AMDGPUCodeObjectMetadata to AMDGPUMetadata (PAL metadata will be included in this file in the follow up change)
    - Rename AMDGPUCodeObjectMetadataStreamer to AMDGPUHSAMetadataStreamer
    - Introduce HSAMD namespace
    - Other minor name changes in function and test names

    llvm-svn: 315522
* [AMDGPU] implemented pal metadata | Tim Renouf | 2017-10-03 | 1 | -3/+111
    Summary:
    For the amdpal OS type:

    We write an AMDGPU_PAL_METADATA record in the .note section in the ELF (or as an assembler directive). It contains key=value pairs of 32 bit ints. It is a merge of metadata from codegen of the shaders, and metadata provided by the frontend as _amdgpu_pal_metadata IR metadata. Where both sources have a key=value with the same key, the two values are ORed together.

    This .note record is part of the amdpal ABI and will be documented in docs/AMDGPUUsage.rst in a future commit.

    Eventually the amdpal OS type will stop generating the .AMDGPU.config section once the frontend has safely moved over to using the .note records above instead of .AMDGPU.config.

    Reviewers: arsenm, nhaehnle, dstuttard
    Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, t-tye
    Differential Revision: https://reviews.llvm.org/D37753
    llvm-svn: 314829
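The merge rule in the commit message above (key=value pairs of 32-bit ints, with values ORed when both sources define the same key) can be sketched as follows. This is a minimal illustration only: `PalMetadata`, `mergePalMetadata`, and the keys and values in the usage check are hypothetical names and numbers chosen for the sketch, not the identifiers or register keys used in AMDGPUAsmPrinter.cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for one PAL metadata record: key=value pairs of
// 32-bit integers, as described in the commit message.
using PalMetadata = std::map<uint32_t, uint32_t>;

// Merge frontend-provided metadata into the metadata produced by codegen.
// Where both sources have the same key, the two values are ORed together.
inline PalMetadata mergePalMetadata(PalMetadata Codegen,
                                    const PalMetadata &Frontend) {
  for (const auto &KV : Frontend)
    Codegen[KV.first] |= KV.second; // a missing key starts at 0, so OR copies it
  return Codegen;
}

// Small usage check with made-up keys and values.
inline void palMetadataExample() {
  const PalMetadata Codegen = {{0x10u, 0x1u}, {0x11u, 0x20u}};
  const PalMetadata Frontend = {{0x10u, 0x4u}, {0x12u, 0x7u}};
  const PalMetadata Merged = mergePalMetadata(Codegen, Frontend);
  assert(Merged.at(0x10u) == 0x5u);  // shared key: 0x1 | 0x4
  assert(Merged.at(0x11u) == 0x20u); // codegen-only key is kept
  assert(Merged.at(0x12u) == 0x7u);  // frontend-only key is copied
}
```

ORing rather than overwriting means neither source can clear a bit that the other has set, which suits records made of independent bit fields.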
* [AMDGPU] calling conventions for AMDPAL OS type | Tim Renouf | 2017-09-29 | 1 | -1/+3
    Summary:
    This commit adds comments on how the AMDPAL OS type overloads the existing AMDGPU_ calling conventions used by Mesa, and adds a couple of new ones.

    Reviewers: arsenm, nhaehnle, dstuttard
    Subscribers: mehdi_amini, kzhuravl, wdng, yaxunl, t-tye, llvm-commits
    Differential Revision: https://reviews.llvm.org/D37752
    llvm-svn: 314502
* [AMDGPU] AMDPAL scratch buffer support | Tim Renouf | 2017-09-29 | 1 | -9/+14
    Summary:
    Added support for scratch (including spilling) for OS type amdpal: generates code to set up the scratch descriptor if it is needed.

    With amdpal, the scratch resource descriptor is loaded from offset 0 of the global information table. The low 32 bits of the address of the global information table is passed in s0.

    Added amdgpu-git-ptr-high function attribute to hard-wire the high 32 bits of the address of the global information table. If the function attribute is not specified, or is 0xffffffff, then the backend generates code to use the high 32 bits of pc.

    The documentation for the AMDPAL ABI will be added in a later commit.

    Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye
    Differential Revision: https://reviews.llvm.org/D37483
    llvm-svn: 314501
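The address selection described in the commit message above can be modelled on the host as plain integer arithmetic. This is only a sketch of the rule as stated there: the backend emits GPU instructions rather than calling a function like this, and `globalInfoTableAddress`, `UsePcHigh`, and the sample values are hypothetical names and numbers introduced for illustration. An unspecified attribute is modelled here by passing the same 0xffffffff value.

```cpp
#include <cassert>
#include <cstdint>

// Attribute value that, per the commit message, means "use the high 32 bits
// of pc". An unspecified attribute is treated the same way in this sketch.
constexpr uint32_t UsePcHigh = 0xffffffffu;

// Hypothetical host-side model of forming the 64-bit address of the global
// information table from its low half (passed in s0) and a high half taken
// either from the amdgpu-git-ptr-high attribute or from pc.
inline uint64_t globalInfoTableAddress(uint32_t S0Low, uint32_t GitPtrHigh,
                                       uint64_t Pc) {
  const uint32_t High = (GitPtrHigh == UsePcHigh)
                            ? static_cast<uint32_t>(Pc >> 32)
                            : GitPtrHigh;
  return (static_cast<uint64_t>(High) << 32) | S0Low;
}

// Usage check with made-up values.
inline void gitPtrExample() {
  // Attribute given: its value becomes the high half.
  assert(globalInfoTableAddress(0x1000u, 0x12u, 0xabcd00001234ull) ==
         0x0000001200001000ull);
  // Attribute absent or 0xffffffff: the high half comes from pc.
  assert(globalInfoTableAddress(0x1000u, UsePcHigh, 0xabcd00001234ull) ==
         0x0000abcd00001000ull);
}
```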
* AMDGPU: Fix not accounting for tail call resource usage | Matt Arsenault | 2017-09-05 | 1 | -1/+2
    If the only call in a function is a tail call, the function isn't considered to have a call since it's a type of return.

    llvm-svn: 312561
* AMDGPU: Start adding tail call support | Matt Arsenault | 2017-08-11 | 1 | -2/+4
    Handle the sibling call cases.

    llvm-svn: 310753
* AMDGPU: Fix assert on n inline asm constraint | Matt Arsenault | 2017-08-09 | 1 | -6/+15
    llvm-svn: 310515
* AMDGPU: Restore using MRI to find highest used regs | Matt Arsenault | 2017-08-02 | 1 | -5/+23
    If there are no calls, this is a faster path than searching the entire program for calls. This was supposed to be left in r309781.

    Fixes unused variable warning.

    llvm-svn: 309832
* AMDGPU: Analyze callee resource usage in AsmPrinter | Matt Arsenault | 2017-08-02 | 1 | -11/+145
    llvm-svn: 309781
* AMDGPU: Remove duplicate print outs from .AMDGPU.csdata | Konstantin Zhuravlyov | 2017-07-16 | 1 | -9/+0
    Differential Revision: https://reviews.llvm.org/D35428
    llvm-svn: 308145
* Move Object format code to lib/BinaryFormat. | Zachary Turner | 2017-06-07 | 1 | -1/+1
    This creates a new library called BinaryFormat that has all of the headers from llvm/Support containing structure and layout definitions for various types of binary formats like dwarf, coff, elf, etc as well as the code for identifying a file from its magic.

    Differential Revision: https://reviews.llvm.org/D33843
    llvm-svn: 304864
* Sort the remaining #include lines in include/... and lib/.... | Chandler Carruth | 2017-06-06 | 1 | -5/+5
    I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days.

    I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch.

    This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files.

    Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again).

    llvm-svn: 304787
* AMDGPU: Remove error on call in AsmPrinter | Matt Arsenault | 2017-06-01 | 1 | -29/+26
    Partial revert of r301938 which is making it harder to split patches up.

    llvm-svn: 304418
* AMDGPU/AMDHSA: Set COMPUTE_PGM_RSRC2:LDS_SIZE to 0 | Konstantin Zhuravlyov | 2017-05-05 | 1 | -1/+2
    This field is populated by the CP

    Differential Revision: https://reviews.llvm.org/D32619
    llvm-svn: 302277
* AMDGPU: Refactor AsmPrinter | Matt Arsenault | 2017-05-02 | 1 | -127/+213
    Avoid analyzing functions multiple times. This allows asserting that each function is only analyzed once.

    llvm-svn: 301938
* AMDGPU: Add AMDGPU_HS calling convention | Marek Olsak | 2017-05-02 | 1 | -0/+1
    Reviewers: arsenm, nhaehnle
    Subscribers: mehdi_amini, kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
    Differential Revision: https://reviews.llvm.org/D32644
    llvm-svn: 301930
* AMDGPU: Don't emit amd_kernel_code_t for callable functions | Matt Arsenault | 2017-04-19 | 1 | -1/+14
    This is inserted directly in the text section. The relocation for the function ends up resolving to the beginning of the amd_kernel_code_t header rather than the actual function entry point.

    Also skip some of the comments for initialization that only make sense for kernels.

    llvm-svn: 300736
* AMDGPU: Don't align callable functions to 256 | Matt Arsenault | 2017-04-19 | 1 | -1/+3
    llvm-svn: 300720
* AMDGPU: Make MFI fields private | Matt Arsenault | 2017-04-18 | 1 | -3/+3
    llvm-svn: 300596
* AMDGPU: Use MachineRegisterInfo to find max used register | Matt Arsenault | 2017-04-17 | 1 | -126/+75
    Avoid looping through program to determine register counts. This avoids needing to look at regmask operands.

    Also fixes some counting errors with flat_scr when there are no stack objects.

    llvm-svn: 300482
* AMDGPU: Refactor argument lowering | Matt Arsenault | 2017-04-11 | 1 | -1/+1
    Split into smaller functions and prepare for handling non-entry functions.

    llvm-svn: 299998
* AMDGPU: Rename isKernel | Matt Arsenault | 2017-03-30 | 1 | -1/+1
    What we really want to do is distinguish functions that may be called by other functions, and graphics shaders are not called kernels.

    llvm-svn: 299140
* [AMDGPU] Get address space mapping by target triple environment | Yaxun Liu | 2017-03-27 | 1 | -2/+5
    As we introduced target triple environment amdgiz and amdgizcl, the address space values are no longer enums. We have to decide the value by target triple.

    The basic idea is to use struct AMDGPUAS to represent address space values. For address space values which do not depend on the target triple, use static const members, so that they don't occupy extra memory space and are equivalent to compile time constants.

    Since the struct is lightweight and cheap, it can be created on the fly at the point of usage. Or it can be added as a member to a pass and created at the beginning of the run* function.

    Differential Revision: https://reviews.llvm.org/D31284
    llvm-svn: 298846
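The design described in the commit message above (static const members for values that do not depend on the triple, ordinary members for the ones that do, and a cheap struct built at the point of use) might look roughly like the sketch below. It is not the actual AMDGPUAS definition: `AddressSpaces`, `getAddressSpaces`, the member names, and every numeric value are illustrative placeholders, and the environment is passed as a plain string rather than an LLVM triple.

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of the idea behind struct AMDGPUAS. The numeric
// values below are illustrative placeholders, not taken from the source.
struct AddressSpaces {
  // Independent of the triple: static constants, so no per-object storage.
  static constexpr unsigned Global = 1;
  static constexpr unsigned Local = 3;
  // Dependent on the triple environment: ordinary data members.
  unsigned Flat;
  unsigned Private;
};

// Build the struct on the fly from the environment component of the triple.
inline AddressSpaces getAddressSpaces(const std::string &Environment) {
  const bool Generalized =
      Environment == "amdgiz" || Environment == "amdgizcl";
  AddressSpaces AS;
  AS.Flat = Generalized ? 0u : 4u;
  AS.Private = Generalized ? 5u : 0u;
  return AS;
}

// Usage check: only the two triple-dependent members take up space.
inline void addressSpacesExample() {
  static_assert(sizeof(AddressSpaces) == 2 * sizeof(unsigned),
                "static members add no per-object storage");
  assert(getAddressSpaces("amdgiz").Private == 5u);
  assert(getAddressSpaces("").Private == 0u);
}
```

Because the struct holds only two integers, constructing it wherever it is needed costs little, which matches the commit's remark that it can be created at the point of usage or cached as a pass member.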
* [AMDGPU] Do not emit isa info as code object metadata | Konstantin Zhuravlyov | 2017-03-22 | 1 | -3/+2
    - It was decided to expose this information through other means (rocr)

    Differential Revision: https://reviews.llvm.org/D30970
    llvm-svn: 298560
* [AMDGPU] Emit kernel code properties as code object metadata | Konstantin Zhuravlyov | 2017-03-22 | 1 | -38/+36
    - These are not required for low level runtime

    Differential Revision: https://reviews.llvm.org/D29949
    llvm-svn: 298556
* [AMDGPU] Restructure code object metadata creation | Konstantin Zhuravlyov | 2017-03-22 | 1 | -18/+27
    - Rename runtime metadata -> code object metadata
    - Make metadata not flow
    - Switch enums to use ScalarEnumerationTraits
    - Cleanup and move AMDGPUCodeObjectMetadata.h to AMDGPU/MCTargetDesc
    - Introduce in-memory representation for attributes
    - Code object metadata streamer
    - Create metadata for isa and printf during EmitStartOfAsmFile
    - Create metadata for kernel during EmitFunctionBodyStart
    - Finalize and emit metadata to .note during EmitEndOfAsmFile
    - Other minor improvements/bug fixes

    Differential Revision: https://reviews.llvm.org/D29948
    llvm-svn: 298552
* AMDGPU: Redefine clamp node as clamp 0.0-1.0 | Matt Arsenault | 2017-02-21 | 1 | -1/+1
    Change implementation to use max instead of add. min/max/med3 do not flush denormals regardless of the mode, so it is OK to use it whether or not they are enabled.

    Also allow using clamp with f16, and use knowledge of dx10_clamp.

    llvm-svn: 295788
* AMDGPU: Merge initial gfx9 support | Matt Arsenault | 2017-02-18 | 1 | -0/+4
    llvm-svn: 295554
* AMDGPU: Add trap handler support. | Wei Ding | 2017-02-10 | 1 | -0/+4
    Differential Revision: http://reviews.llvm.org/D26010
    llvm-svn: 294692
* [AMDGPU] Add target information that is required by tools to metadata | Konstantin Zhuravlyov | 2017-02-08 | 1 | -16/+18
    Differential Revision: https://reviews.llvm.org/D28760#fb670e28
    llvm-svn: 294449
* [AMDGPU] Distinguish between S/VGPR allocation and encoding granularities | Konstantin Zhuravlyov | 2017-02-08 | 1 | -4/+4
    Differential Revision: https://reviews.llvm.org/D29633
    llvm-svn: 294441
* [AMDGPU] Move register related queries to subtarget class | Konstantin Zhuravlyov | 2017-02-08 | 1 | -25/+27
    Differential Revision: https://reviews.llvm.org/D29318
    llvm-svn: 294440
* [AMDGPU] Grab MCSubtargetInfo from TargetMachine instead of constructing it | Konstantin Zhuravlyov | 2017-01-27 | 1 | -6/+1
    Differential Revision: https://reviews.llvm.org/D29224
    llvm-svn: 293318
* AMDGPU add support for spilling to a user sgpr pointed buffers | Tom Stellard | 2017-01-25 | 1 | -3/+3
    Summary:
    This lets you select which sort of spilling you want, either s[0:1] or 64-bit loads from s[0:1].

    Patch By: Dave Airlie

    Reviewers: nhaehnle, arsenm, tstellarAMD
    Reviewed By: arsenm
    Subscribers: mareko, llvm-commits, kzhuravl, wdng, yaxunl, tony-tye
    Differential Revision: https://reviews.llvm.org/D25428
    llvm-svn: 293000
* [AMDGPU] Do not emit .AMDGPU.config section for amdhsa | Konstantin Zhuravlyov | 2017-01-06 | 1 | -4/+6
    Differential Revision: https://reviews.llvm.org/D27732
    llvm-svn: 291245
* AMDGPU: [AMDGPU] Assembler: add .hsa_code_object_metadata directive for runtime metadata V2.0 | Sam Kolton | 2016-12-19 | 1 | -1/+1
    Summary:
    Added pair of directives .hsa_code_object_metadata/.end_hsa_code_object_metadata. Between them user can put YAML string that would be directly put to the generated note. E.g.:

    '''
    .hsa_code_object_metadata
        {
            amd.MDVersion: [ 2, 0 ]
        }
    .end_hsa_code_object_metadata
    '''

    Based on D25046

    Reviewers: vpykhtin, nhaustov, yaxunl, tstellarAMD
    Subscribers: arsenm, kzhuravl, wdng, nhaehnle, mgorny, tony-tye
    Differential Revision: https://reviews.llvm.org/D27619
    llvm-svn: 290097
* AMDGPU: Emit runtime metadata version 2 as YAML | Yaxun Liu | 2016-12-14 | 1 | -2/+1
    Differential Revision: https://reviews.llvm.org/D25046
    llvm-svn: 289674