summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/NVPTX/NVPTX.h
Commit message (Collapse)AuthorAgeFilesLines
* [LegacyPassManager] Deprecate the BasicBlockPass/Manager.Alina Sbirlea2019-09-301-1/+1
| | | | | | | | | | | | | | | | | Summary: The BasicBlockManager is potentially broken and should not be used. Replace all uses of the BasicBlockPass with a FunctionBlockPass+loop on blocks. Reviewers: chandlerc Subscribers: jholewinski, sanjoy.google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68234 llvm-svn: 373254
* Include what you use in NVPTX.hDmitri Gribenko2019-06-031-7/+2
| | | | | | | Other files were not relying on these transitive includes, so I'm submitting this change separately. llvm-svn: 362403
* Include what you use in NVPTX.hDmitri Gribenko2019-06-031-1/+0
| | | | | | | I also fixed all other files that were including NVPTX.h and were relying on transitive includes. llvm-svn: 362402
* [NVPTX] Create a TargetInfo header. NFCRichard Trieu2019-05-141-3/+0
| | | | | | | | Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360729
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-191-4/+3
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* [NVPTX] Allow libcalls that are defined in the current module.Justin Lebar2018-12-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch adds a possibility to make library calls on NVPTX. An important thing about library functions - they must be defined within the current module. This basically should guarantee that we produce a valid PTX assembly (without calls to not defined functions). The one who wants to use the libcalls is probably will have to link against compiler-rt or any other implementation. Currently, it's completely impossible to make library calls because of error LLVM ERROR: Cannot select: i32 = ExternalSymbol '...'. But we can lower ExternalSymbol to TargetExternalSymbol and verify if the function definition is available. Also, there was an issue with a DAG during legalisation. When we expand instruction into libcall, the inner call-chain isn't being "integrated" into outer chain. Since the last "data-flow" (call retval load) node is located in call-chain earlier than CALLSEQ_END node, the latter becomes a leaf and therefore a dead node (and is being removed quite fast). Proposed here solution relies on another data-flow pseudo nodes (ProxyReg) which purpose is only to keep CALLSEQ_END at legalisation and instruction selection phases - we remove the pseudo instructions before register scheduling phase. Patch by Denys Zariaiev! Differential Revision: https://reviews.llvm.org/D34708 llvm-svn: 350069
* [NVPTX] Handle __nvvm_reflect("__CUDA_ARCH").Artem Belevich2018-08-031-1/+1
| | | | | | | | | | | | | | Summary: libdevice in recent CUDA versions relies on __nvvm_reflect() to select GPU-specific bitcode. This patch addresses the requirement. Reviewers: jlebar Subscribers: jholewinski, sanjoy, hiraditya, bixia, llvm-commits Differential Revision: https://reviews.llvm.org/D50207 llvm-svn: 338908
* NVPTX: Move InferAddressSpaces to generic codeMatt Arsenault2017-01-311-1/+0
| | | | llvm-svn: 293579
* [NVPTX] Let there be One True Way to set NVVMReflect params.Justin Lebar2017-01-151-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Previously there were three ways to inform the NVVMReflect pass whether you wanted to flush denormals to zero: * An LLVM command-line option * Parameters to the NVVMReflect constructor * Metadata on the module itself. This change removes the first two, leaving only the third. The motivation for this change, aside from simplifying things, is that we want LLVM to be aware of whether it's operating in FTZ mode, so other passes can use this information. Ideally we'd have a target-generic piece of metadata on the module. This change moves us in that direction. Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D28700 llvm-svn: 292068
* [NVPTX] Added support for half-precision floating point.Artem Belevich2017-01-131-1/+2
| | | | | | | | | | | | | | | | Only scalar half-precision operations are supported at the moment. - Adds general support for 'half' type in NVPTX. - fp16 math operations are supported on sm_53+ GPUs only (can be disabled with --nvptx-no-f16-math). - Type conversions to/from fp16 are supported on all GPU variants. - On GPU variants that do not have full fp16 support (or if it's disabled), fp16 operations are promoted to fp32 and results are converted back to fp16 for storage. Differential Revision: https://reviews.llvm.org/D28540 llvm-svn: 291956
* [NVPTX] Remove NVPTXFavorNonGenericAddrSpaces pass.Justin Lebar2016-10-311-1/+0
| | | | | | | | | | | | | | | Summary: This has been replaced by the NVPTXInferAddressSpaces pass. We've had the new one as the default with the old one accessible via a flag for some months now, and we've had no problems. Reviewers: tra Subscribers: llvm-commits, jholewinski, jingyue, mgorny Differential Revision: https://reviews.llvm.org/D26165 llvm-svn: 285642
* Move the global variables representing each Target behind accessor functionMehdi Amini2016-10-091-2/+2
| | | | | | | | This avoids "static initialization order fiasco" Differential Revision: https://reviews.llvm.org/D25412 llvm-svn: 283702
* [NVPTX] Renamed NVPTXLowerKernelArgs -> NVPTXLowerArgs. NFC.Artem Belevich2016-07-201-1/+1
| | | | | | | | After r276153 the pass applies to both kernels and regular functions. Differential Revision: https://reviews.llvm.org/D22583 llvm-svn: 276189
* [NVPTX] Added NVVMIntrRange passArtem Belevich2016-05-261-0/+1
| | | | | | | | | | | | NVVMIntrRange adds !range metadata to calls of NVVM intrinsics that return values within known limited range. This allows LLVM to generate optimal code for indexing arrays based on tid/ctaid which is a frequently used pattern in CUDA code. Differential Revision: http://reviews.llvm.org/D20644 llvm-svn: 270872
* [NVPTX] Make NVVMReflect a function pass.Justin Lebar2016-03-301-2/+2
| | | | | | | | | | | | | | | Summary: Currently it's a module pass. Make it a function pass so that we can move it to PassManagerBuilder's EP_EarlyAsPossible extension point, which only accepts function passes. Reviewers: rnk Subscribers: tra, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D18615 llvm-svn: 264919
* [NVPTX] Avoid temporary std::string and make single-use function local to ↵Benjamin Kramer2016-03-301-2/+0
| | | | | | | | the cpp file. No functionality change intended. llvm-svn: 264861
* [NVPTX] Adds a new address space inference pass.Jingyue Wu2016-03-201-0/+1
| | | | | | | | | | | | | | | | | | | Summary: The old address space inference pass (NVPTXFavorNonGenericAddrSpaces) is unable to convert the address space of a pointer induction variable. This patch adds a new pass called NVPTXInferAddressSpaces that overcomes that limitation using a fixed-point data-flow analysis (see the file header comments for details). The new pass is experimental and not enabled by default. Users can turn it on by setting the -nvptx-use-infer-addrspace flag of llc. Reviewers: jholewinski, tra, jlebar Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D17965 llvm-svn: 263916
* [NVPTX] Remove dead code.Benjamin Kramer2015-10-151-18/+0
| | | | | | I left helpers that look useful for debugging alone. NFC. llvm-svn: 250410
* Add NVPTXPeephole pass to reduce unnecessary address castJingyue Wu2015-06-241-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch first change the register that holds local address for stack frame to %SPL. Then the new NVPTXPeephole pass will try to scan the following pattern %vreg0<def> = LEA_ADDRi64 <fi#0>, 4 %vreg1<def> = cvta_to_local %vreg0 and transform it into %vreg1<def> = LEA_ADDRi64 %VRFrameLocal, 4 Patched by Xuetian Weng Test Plan: test/CodeGen/NVPTX/local-stack-frame.ll Reviewers: jholewinski, jingyue Reviewed By: jingyue Subscribers: eliben, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10549 llvm-svn: 240587
* Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)Alexander Kornienko2015-06-231-5/+5
| | | | | | Apparently, the style needs to be agreed upon first. llvm-svn: 240390
* Fixed/added namespace ending comments using clang-tidy. NFCAlexander Kornienko2015-06-191-5/+5
| | | | | | | | | | | | | The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137
* Add NVPTXLowerAlloca pass to convert alloca'ed memory to local addressJingyue Wu2015-06-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This is done by first adding two additional instructions to convert the alloca returned address to local and convert it back to generic. Then replace all uses of alloca instruction with the converted generic address. Then we can rely NVPTXFavorNonGenericAddrSpace pass to combine the generic addresscast and the corresponding Load, Store, Bitcast, GEP Instruction together. Patched by Xuetian Weng (xweng@google.com). Test Plan: test/CodeGen/NVPTX/lower-alloca.ll Reviewers: jholewinski, jingyue Reviewed By: jingyue Subscribers: meheff, broune, eliben, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10483 llvm-svn: 239964
* [NVPTX] roll forward r239082Jingyue Wu2015-06-041-1/+1
| | | | | | | | | NVPTXISelDAGToDAG translates "addrspacecast to param" to NVPTX::nvvm_ptr_gen_to_param Added an llc test in bug21465. llvm-svn: 239100
* Revert r239082Jingyue Wu2015-06-041-1/+1
| | | | | | llc crashed for NVPTX backend llvm-svn: 239094
* [NVPTX] kernel pointer arguments point to the global address spaceJingyue Wu2015-06-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Summary: With this patch, NVPTXLowerKernelArgs converts a kernel pointer argument to a pointer in the global address space. This change, along with NVPTXFavorNonGenericAddrSpaces, allows the NVPTX backend to emit ld.global.* and st.global.* for accessing kernel pointer arguments. Minor changes: 1. refactor: extract function convertToPointerInAddrSpace 2. fix a bug in the test case in bug21465.ll Test Plan: lower-kernel-ptr-arg.ll Reviewers: eliben, meheff, jholewinski Reviewed By: jholewinski Subscribers: wengxt, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10154 llvm-svn: 239082
* [PM] Remove a bunch of stale TTI creation method declarations. I nukedChandler Carruth2015-02-011-1/+0
| | | | | | | their definitions, but forgot to clean up all the declarations which are in different files. llvm-svn: 227698
* Test commit.Jacques Pienaar2014-12-031-2/+2
| | | | llvm-svn: 223310
* [NVPTX] Add an NVPTX-specific TargetTransformInfoJingyue Wu2014-11-101-0/+1
| | | | | | | | | | | | | | | | | | | | Summary: It currently only implements hasBranchDivergence, and will be extended in later diffs. Split from D6188. Test Plan: make check-all Reviewers: jholewinski Reviewed By: jholewinski Subscribers: llvm-commits, meheff, eliben, jholewinski Differential Revision: http://reviews.llvm.org/D6195 llvm-svn: 221619
* [NVPTX] Add NVPTXLowerStructArgs passJustin Holewinski2014-11-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | This works around the limitation that PTX does not allow .param space loads/stores with arbitrary pointers. If a function has a by-val struct ptr arg, say foo(%struct.x *byval %d), then add the following instructions to the first basic block : %temp = alloca %struct.x, align 8 %tt1 = bitcast %struct.x * %d to i8 * %tt2 = llvm.nvvm.cvt.gen.to.param %tt2 %tempd = bitcast i8 addrspace(101) * to %struct.x addrspace(101) * %tv = load %struct.x addrspace(101) * %tempd store %struct.x %tv, %struct.x * %temp, align 8 The above code allocates some space in the stack and copies the incoming struct from param space to local space. Then replace all occurences of %d by %temp. Fixes PR21465. llvm-svn: 221377
* Canonicalize header guards into a common format.Benjamin Kramer2014-08-131-2/+2
| | | | | | | | | | Add header guards to files that were missing guards. Remove #endif comments as they don't seem common in LLVM (we can easily add them back if we decide they're useful) Changes made by clang-tidy with minor tweaks. llvm-svn: 215558
* [NVPTX] Add preliminary intrinsics and codegen support for textures/surfacesJustin Holewinski2014-04-091-0/+2
| | | | | | This commit adds intrinsics and codegen support for the surface read/write and texture read instructions that take an explicit sampler parameter. Codegen operates on image handles at the PTX level, but falls back to direct replacement of handles with kernel arguments if image handles are not enabled. Note that image handles are explicitly disabled for all target architectures in this change (to be enabled later). llvm-svn: 205907
* Optimize away unnecessary address casts.Eli Bendersky2014-04-031-0/+1
| | | | | | | | | Removes unnecessary casts from non-generic address spaces to the generic address space for certain code patterns. Patch by Jingyue Wu. llvm-svn: 205571
* Fix for PR19099 - NVPTX produces invalid symbol names.Eli Bendersky2014-03-311-0/+1
| | | | | | | | This is a more thorough fix for the issue than r203483. An IR pass will run before NVPTX codegen to make sure there are no invalid symbol names that can't be consumed by the ptxas assembler. llvm-svn: 205212
* [NVPTX] Remove unused prototypesJustin Holewinski2013-07-221-3/+0
| | | | llvm-svn: 186844
* [NVPTX] Clean up comparison/select/convert patterns and factor out PTX ↵Justin Holewinski2013-06-281-0/+47
| | | | | | | | instructions from their patterns Test case is no breakage llvm-svn: 185175
* [NVPTX] Add support for selecting CUDA vs OCL mode based on tripleJustin Holewinski2013-06-211-2/+1
| | | | | | IR for CUDA should use "nvptx[64]-nvidia-cuda", and IR for NV OpenCL should use "nvptx[64]-nvidia-nvcl" llvm-svn: 184579
* [NVPTX] Re-enable support for virtual registers in the final outputJustin Holewinski2013-05-311-0/+2
| | | | | | | | | | | | Now that 3.3 is branched, we are re-enabling virtual registers to help iron out bugs before the next release. Some of the post-RA passes do not play well with virtual registers, so we disable them for now. The needed functionality of the PrologEpilogInserter pass is copied to a new backend-specific NVPTXPrologEpilog pass. The test for this commit is not breaking the existing tests. llvm-svn: 182998
* [NVPTX] Add programmatic interface to NVVMReflect passJustin Holewinski2013-05-201-0/+3
| | | | llvm-svn: 182297
* [NVPTX] Add GenericToNVVM IR converter to better handle idiomatic LLVM IR inputsJustin Holewinski2013-05-201-0/+1
| | | | | | | | | | | | | | | This converter currently only handles global variables in address space 0. For these variables, they are promoted to address space 1 (global memory), and all uses are updated to point to the result of a cvta.global instruction on the new variable. The motivation for this is address space 0 global variables are illegal since we cannot declare variables in the generic address space. Instead, we place the variables in address space 1 and explicitly convert the pointer to address space 0. This is primarily intended to help new users who expect to be able to place global variables in the default address space. llvm-svn: 182254
* [NVPTX] Run clang-format on all NVPTX sources.Justin Holewinski2013-03-301-11/+16
| | | | | | | Hopefully this resolves any outstanding style issues and gives us an automated way of ensuring we conform to the style guidelines. llvm-svn: 178415
* [NVPTX] Disable vector registersJustin Holewinski2013-02-121-1/+0
| | | | | | | | | | | Vectors were being manually scalarized by the backend. Instead, let the target-independent code do all of the work. The manual scalarization was from a time before good target-independent support for scalarization in LLVM. However, this forces us to specially-handle vector loads and stores, which we can turn into PTX instructions that produce/consume multiple operands. llvm-svn: 174968
* Move all of the header files which are involved in modelling the LLVM IRChandler Carruth2013-01-021-2/+2
| | | | | | | | | | | | | | | | | | | | | into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366
* Sort includes for all of the .h files under the 'lib' tree. These wereChandler Carruth2012-12-041-2/+2
| | | | | | | | | | missed in the first pass because the script didn't yet handle include guards. Note that the script is now able to handle all of these headers without manual edits. =] llvm-svn: 169224
* Fix header file include order in NVPTX backend NV_CONTRIBYuan Lin2012-06-051-2/+2
| | | | llvm-svn: 158013
* Fix a Clang warning in the new NVPTX backend:Chandler Carruth2012-05-041-1/+1
| | | | | | | | | | | | | | In file included from ../lib/Target/NVPTX/VectorElementize.cpp:53: ../lib/Target/NVPTX/NVPTX.h:44:3: warning: default label in switch which covers all enumeration values [-Wcovered-switch-default] default: assert(0 && "Unknown condition code"); ^ 1 warning generated. The prevailing pattern in LLVM is to not use a default label, and instead to use llvm_unreachable to denote that the switch in fact covers all return paths from the function. llvm-svn: 156209
* This patch adds a new NVPTX back-end to LLVM which supports code generation ↵Justin Holewinski2012-05-041-0/+137
for NVIDIA PTX 3.0. This back-end will (eventually) replace the current PTX back-end, while maintaining compatibility with it. The new target machines are: nvptx (old ptx32) => 32-bit PTX nvptx64 (old ptx64) => 64-bit PTX The sources are based on the internal NVIDIA NVPTX back-end, and contain more functionality than the current PTX back-end currently provides. NV_CONTRIB llvm-svn: 156196
OpenPOWER on IntegriCloud