summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/NVPTX/CMakeLists.txt
Commit message (Collapse)AuthorAgeFilesLines
* [NVPTX] Move InstPrinter files to MCTargetDesc. NFCRichard Trieu2019-05-111-1/+0
| | | | | | | | | For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360500
* [NVPTX] Allow libcalls that are defined in the current module.Justin Lebar2018-12-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch adds a possibility to make library calls on NVPTX. An important thing about library functions - they must be defined within the current module. This basically should guarantee that we produce a valid PTX assembly (without calls to not defined functions). The one who wants to use the libcalls is probably will have to link against compiler-rt or any other implementation. Currently, it's completely impossible to make library calls because of error LLVM ERROR: Cannot select: i32 = ExternalSymbol '...'. But we can lower ExternalSymbol to TargetExternalSymbol and verify if the function definition is available. Also, there was an issue with a DAG during legalisation. When we expand instruction into libcall, the inner call-chain isn't being "integrated" into outer chain. Since the last "data-flow" (call retval load) node is located in call-chain earlier than CALLSEQ_END node, the latter becomes a leaf and therefore a dead node (and is being removed quite fast). Proposed here solution relies on another data-flow pseudo nodes (ProxyReg) which purpose is only to keep CALLSEQ_END at legalisation and instruction selection phases - we remove the pseudo instructions before register scheduling phase. Patch by Denys Zariaiev! Differential Revision: https://reviews.llvm.org/D34708 llvm-svn: 350069
* Consistently sort add_subdirectory calls in lib/Target/*/CMakeLists.txtNico Weber2018-04-231-1/+1
| | | | llvm-svn: 330584
* Sort targetgen calls in lib/Target/*/CMakeLists.Nico Weber2018-04-041-3/+3
| | | | | | | | | | | Makes it easier to see mistakes such as the one fixed in r329178 and makes the different target CMakeLists more consistent. Also remove some stale-looking comments from the Nios2 target cmakefile. No intended behavior change. llvm-svn: 329181
* NVPTX: Move InferAddressSpaces to generic codeMatt Arsenault2017-01-311-1/+0
| | | | llvm-svn: 293579
* [NVPTX] Remove NVPTXFavorNonGenericAddrSpaces pass.Justin Lebar2016-10-311-1/+0
| | | | | | | | | | | | | | | Summary: This has been replaced by the NVPTXInferAddressSpaces pass. We've had the new one as the default with the old one accessible via a flag for some months now, and we've had no problems. Reviewers: tra Subscribers: llvm-commits, jholewinski, jingyue, mgorny Differential Revision: https://reviews.llvm.org/D26165 llvm-svn: 285642
* [NVPTX] Renamed NVPTXLowerKernelArgs -> NVPTXLowerArgs. NFC.Artem Belevich2016-07-201-1/+1
| | | | | | | | After r276153 the pass applies to both kernels and regular functions. Differential Revision: https://reviews.llvm.org/D22583 llvm-svn: 276189
* [NVPTX] Added NVVMIntrRange passArtem Belevich2016-05-261-0/+1
| | | | | | | | | | | | NVVMIntrRange adds !range metadata to calls of NVVM intrinsics that return values within known limited range. This allows LLVM to generate optimal code for indexing arrays based on tid/ctaid which is a frequently used pattern in CUDA code. Differential Revision: http://reviews.llvm.org/D20644 llvm-svn: 270872
* [NVPTX] Adds a new address space inference pass.Jingyue Wu2016-03-201-0/+1
| | | | | | | | | | | | | | | | | | | Summary: The old address space inference pass (NVPTXFavorNonGenericAddrSpaces) is unable to convert the address space of a pointer induction variable. This patch adds a new pass called NVPTXInferAddressSpaces that overcomes that limitation using a fixed-point data-flow analysis (see the file header comments for details). The new pass is experimental and not enabled by default. Users can turn it on by setting the -nvptx-use-infer-addrspace flag of llc. Reviewers: jholewinski, tra, jlebar Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D17965 llvm-svn: 263916
* Add NVPTXPeephole pass to reduce unnecessary address castJingyue Wu2015-06-241-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch first change the register that holds local address for stack frame to %SPL. Then the new NVPTXPeephole pass will try to scan the following pattern %vreg0<def> = LEA_ADDRi64 <fi#0>, 4 %vreg1<def> = cvta_to_local %vreg0 and transform it into %vreg1<def> = LEA_ADDRi64 %VRFrameLocal, 4 Patched by Xuetian Weng Test Plan: test/CodeGen/NVPTX/local-stack-frame.ll Reviewers: jholewinski, jingyue Reviewed By: jingyue Subscribers: eliben, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10549 llvm-svn: 240587
* Add NVPTXLowerAlloca pass to convert alloca'ed memory to local addressJingyue Wu2015-06-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This is done by first adding two additional instructions to convert the alloca returned address to local and convert it back to generic. Then replace all uses of alloca instruction with the converted generic address. Then we can rely NVPTXFavorNonGenericAddrSpace pass to combine the generic addresscast and the corresponding Load, Store, Bitcast, GEP Instruction together. Patched by Xuetian Weng (xweng@google.com). Test Plan: test/CodeGen/NVPTX/lower-alloca.ll Reviewers: jholewinski, jingyue Reviewed By: jingyue Subscribers: meheff, broune, eliben, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10483 llvm-svn: 239964
* [NVPTX] roll forward r239082Jingyue Wu2015-06-041-1/+1
| | | | | | | | | NVPTXISelDAGToDAG translates "addrspacecast to param" to NVPTX::nvvm_ptr_gen_to_param Added an llc test in bug21465. llvm-svn: 239100
* Revert r239082Jingyue Wu2015-06-041-1/+1
| | | | | | llc crashed for NVPTX backend llvm-svn: 239094
* [NVPTX] kernel pointer arguments point to the global address spaceJingyue Wu2015-06-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Summary: With this patch, NVPTXLowerKernelArgs converts a kernel pointer argument to a pointer in the global address space. This change, along with NVPTXFavorNonGenericAddrSpaces, allows the NVPTX backend to emit ld.global.* and st.global.* for accessing kernel pointer arguments. Minor changes: 1. refactor: extract function convertToPointerInAddrSpace 2. fix a bug in the test case in bug21465.ll Test Plan: lower-kernel-ptr-arg.ll Reviewers: eliben, meheff, jholewinski Reviewed By: jholewinski Subscribers: wengxt, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10154 llvm-svn: 239082
* NVPTX: Remove dead code.Benjamin Kramer2015-03-021-1/+0
| | | | | | | Fun fact: This file was never referenced since the initial checkin of the NVPTX backend. llvm-svn: 230957
* [NVPTX] Add an NVPTX-specific TargetTransformInfoJingyue Wu2014-11-101-12/+13
| | | | | | | | | | | | | | | | | | | | Summary: It currently only implements hasBranchDivergence, and will be extended in later diffs. Split from D6188. Test Plan: make check-all Reviewers: jholewinski Reviewed By: jholewinski Subscribers: llvm-commits, meheff, eliben, jholewinski Differential Revision: http://reviews.llvm.org/D6195 llvm-svn: 221619
* [NVPTX] Add NVPTXLowerStructArgs passJustin Holewinski2014-11-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | This works around the limitation that PTX does not allow .param space loads/stores with arbitrary pointers. If a function has a by-val struct ptr arg, say foo(%struct.x *byval %d), then add the following instructions to the first basic block : %temp = alloca %struct.x, align 8 %tt1 = bitcast %struct.x * %d to i8 * %tt2 = llvm.nvvm.cvt.gen.to.param %tt2 %tempd = bitcast i8 addrspace(101) * to %struct.x addrspace(101) * %tv = load %struct.x addrspace(101) * %tempd store %struct.x %tv, %struct.x * %temp, align 8 The above code allocates some space in the stack and copies the incoming struct from param space to local space. Then replace all occurences of %d by %temp. Fixes PR21465. llvm-svn: 221377
* [NVPTX] Add preliminary intrinsics and codegen support for textures/surfacesJustin Holewinski2014-04-091-0/+2
| | | | | | This commit adds intrinsics and codegen support for the surface read/write and texture read instructions that take an explicit sampler parameter. Codegen operates on image handles at the PTX level, but falls back to direct replacement of handles with kernel arguments if image handles are not enabled. Note that image handles are explicitly disabled for all target architectures in this change (to be enabled later). llvm-svn: 205907
* Optimize away unnecessary address casts.Eli Bendersky2014-04-031-0/+1
| | | | | | | | | Removes unnecessary casts from non-generic address spaces to the generic address space for certain code patterns. Patch by Jingyue Wu. llvm-svn: 205571
* Fix for PR19099 - NVPTX produces invalid symbol names.Eli Bendersky2014-03-311-0/+1
| | | | | | | | This is a more thorough fix for the issue than r203483. An IR pass will run before NVPTX codegen to make sure there are no invalid symbol names that can't be consumed by the ptxas assembler. llvm-svn: 205212
* Removes the NVPTXSplitBBatBar pass.Eli Bendersky2014-03-241-1/+0
| | | | | | | This pass is a historic remnant and actually causes less efficient code to be generated in some cases. llvm-svn: 204620
* [CMake] Let add_public_tablegen_target() provide intrinsics_gen, too.NAKAMURA Takumi2013-11-281-2/+0
| | | | | | | | | | I think, in principle, intrinsics_gen may be added explicitly. That said, it can be added incidentally, since each target already has dependencies to llvm-tblgen. Almost all source files depend on both CommonTaleGen and intrinsics_gen. Explicit add_dependencies() have been pruned under lib/Target. llvm-svn: 195929
* [CMake] Let add_public_tablegen_target responsible to provide dependency to ↵NAKAMURA Takumi2013-11-281-1/+1
| | | | | | | | | CommonTableGen. add_public_tablegen_target adds *CommonTableGen to LLVM_COMMON_DEPENDS. LLVM_COMMON_DEPENDS affects add_llvm_library (and other add_target stuff) within its scope. llvm-svn: 195927
* [NVPTX] Start conversion to MC infrastructureJustin Holewinski2013-08-061-0/+1
| | | | | | | | | This change converts the NVPTX target to use the MC infrastructure instead of directly emitting MachineInstr instances. This brings the target more up-to-date with LLVM TOT, and should fix PR15175 and PR15958 (libNVPTXInstPrinter is empty) as a side-effect. llvm-svn: 187798
* Target/*/CMakeLists.txt: Add the dependency to CommonTableGen explicitly for ↵NAKAMURA Takumi2013-08-061-1/+1
| | | | | | | | | each corresponding CodeGen. Without explicit dependencies, both per-file action and in-CommonTableGen action could run in parallel. It races to emit *.inc files simultaneously. llvm-svn: 187780
* [NVPTX] Re-enable support for virtual registers in the final outputJustin Holewinski2013-05-311-0/+1
| | | | | | | | | | | | Now that 3.3 is branched, we are re-enabling virtual registers to help iron out bugs before the next release. Some of the post-RA passes do not play well with virtual registers, so we disable them for now. The needed functionality of the PrologEpilogInserter pass is copied to a new backend-specific NVPTXPrologEpilog pass. The test for this commit is not breaking the existing tests. llvm-svn: 182998
* [NVPTX] Add GenericToNVVM IR converter to better handle idiomatic LLVM IR inputsJustin Holewinski2013-05-201-0/+1
| | | | | | | | | | | | | | | This converter currently only handles global variables in address space 0. For these variables, they are promoted to address space 1 (global memory), and all uses are updated to point to the result of a cvta.global instruction on the new variable. The motivation for this is address space 0 global variables are illegal since we cannot declare variables in the generic address space. Instead, we place the variables in address space 1 and explicitly convert the pointer to address space 0. This is primarily intended to help new users who expect to be able to place global variables in the default address space. llvm-svn: 182254
* [NVPTX] Add NVVMReflect pass to allow compile-time selection ofJustin Holewinski2013-03-301-0/+1
| | | | | | | | | | | | | | | | specific code paths. This allows us to write code like: if (__nvvm_reflect("FOO")) // Do something else // Do something else and compile into a library, then give "FOO" a value at kernel compile-time so the check becomes a no-op. llvm-svn: 178416
* [NVPTX] Disable vector registersJustin Holewinski2013-02-121-1/+0
| | | | | | | | | | | Vectors were being manually scalarized by the backend. Instead, let the target-independent code do all of the work. The manual scalarization was from a time before good target-independent support for scalarization in LLVM. However, this forces us to specially-handle vector loads and stores, which we can turn into PTX instructions that produce/consume multiple operands. llvm-svn: 174968
* llvm/lib: [CMake] Add explicit dependency to intrinsics_gen.NAKAMURA Takumi2012-06-241-0/+1
| | | | llvm-svn: 159112
* This patch adds a new NVPTX back-end to LLVM which supports code generation ↵Justin Holewinski2012-05-041-0/+33
for NVIDIA PTX 3.0. This back-end will (eventually) replace the current PTX back-end, while maintaining compatibility with it. The new target machines are: nvptx (old ptx32) => 32-bit PTX nvptx64 (old ptx64) => 64-bit PTX The sources are based on the internal NVIDIA NVPTX back-end, and contain more functionality than the current PTX back-end currently provides. NV_CONTRIB llvm-svn: 156196
OpenPOWER on IntegriCloud