path: root/llvm/lib/Target/NVPTX
Commit message (Author, Age, Files, Lines)
...
* [NVPTX] Fix sign/zero-extending ldg/ldu instruction selection (Justin Holewinski, 2016-05-02, 3 files, -48/+77)
  Summary: We don't have sign-/zero-extending ldg/ldu instructions defined, so we need to emulate them with explicit CVTs. We were originally handling the i8 case, but not any other cases.
  Fixes PR26185
  Reviewers: jingyue, jlebar
  Subscribers: jholewinski
  Differential Revision: http://reviews.llvm.org/D19615
  llvm-svn: 268272
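The emulation described in this commit (a plain load followed by an explicit conversion) can be illustrated outside of LLVM. The following is a minimal sketch in plain C++, not the backend's code: `zero_extend` and `sign_extend` are hypothetical helper names, and the raw value stands in for the bits returned by a non-extending load of any width, not only i8.

```cpp
#include <cassert>
#include <cstdint>

// Zero-extension: keep the low `bits` bits, clear everything above them.
inline std::uint64_t zero_extend(std::uint64_t raw, unsigned bits) {
    assert(bits >= 1 && bits < 64);
    return raw & ((std::uint64_t{1} << bits) - 1);
}

// Sign-extension: flip the sign bit, then subtract its weight, so values
// with the sign bit set come out negative.
inline std::int64_t sign_extend(std::uint64_t raw, unsigned bits) {
    assert(bits >= 1 && bits < 64);
    const std::uint64_t sign = std::uint64_t{1} << (bits - 1);
    const std::uint64_t low = zero_extend(raw, bits);
    return static_cast<std::int64_t>(low ^ sign) - static_cast<std::int64_t>(sign);
}
```

For example, `sign_extend(0xFF, 8)` yields -1, while `zero_extend(0xFF, 8)` yields 255; the same helpers cover 16- and 32-bit loads.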
* [CodeGen] Default CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to Expand in TargetLoweringBase. (Craig Topper, 2016-04-28, 1 file, -6/+0)
  This is what the majority of the targets want and removes a bunch of code. Set it to Legal explicitly in the few cases where that's the desired behavior.
  llvm-svn: 267853
* [NVPTX] Run NVVMReflect at the beginning of IR passes. (Justin Lebar, 2016-04-27, 2 files, -0/+10)
  Summary: Currently the NVVMReflect pass is run at the beginning of our backend passes. But really, it should be run as early as possible, as it's simply resolving an "if" statement in code. So copy it into TargetMachine::addEarlyAsPossiblePasses. We still run it at the beginning of the backend passes, since it's needed for correctness when lowering to nvptx.
  (Specifically, NVVMReflect changes each call to the __nvvm_reflect function or llvm.nvvm.reflect intrinsic into an integer constant, based on the pass's configuration. Clearly we miss many optimization opportunities if we perform this transformation at the beginning of codegen.)
  Reviewers: rnk
  Subscribers: tra, llvm-commits, jholewinski
  Differential Revision: http://reviews.llvm.org/D18616
  llvm-svn: 267765
* Add optimization bisect opt-in calls for NVPTX passes (Andrew Kaylor, 2016-04-26, 5 files, -1/+13)
  Differential Revision: http://reviews.llvm.org/D19518
  llvm-svn: 267635
* [NVPTX] Fix some usages of CodeGenOpt::None. (Jingyue Wu, 2016-04-26, 1 file, -5/+9)
  NVPTXLowerKernelArgs is required for correctness, so it should not be guarded by CodeGenOpt::None.
  NVPTXPeephole is optimization only, so it should be skipped when the optimization level is CodeGenOpt::None.
  llvm-svn: 267619
* [CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI. (Ahmed Bougacha, 2016-04-26, 1 file, -3/+3)
  Differential Revision: http://reviews.llvm.org/D17176
  llvm-svn: 267606
* [NVPTX] Set ctlz_zero_undef to Expand so LegalizeDAG will convert it to ctlz. Remove the now unnecessary isel patterns. NFC (Craig Topper, 2016-04-23, 2 files, -10/+3)
  llvm-svn: 267265
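Why converting ctlz_zero_undef to ctlz is always safe can be shown with a small stand-alone sketch. This is plain C++ rather than SelectionDAG code, and `ctlz32` and `ctlz32_zero_undef` are hypothetical names: the first is defined for every input, the second leaves a zero input undefined, so implementing the second through the first loses nothing.

```cpp
#include <cassert>
#include <cstdint>

// Count leading zeros, defined for all inputs (returns 32 for zero).
inline unsigned ctlz32(std::uint32_t x) {
    if (x == 0) return 32;
    unsigned n = 0;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        ++n;
    }
    return n;
}

// The "zero undef" variant only promises a result for non-zero inputs,
// so it can simply be expanded to the fully defined version.
inline unsigned ctlz32_zero_undef(std::uint32_t x) {
    assert(x != 0 && "result is undefined for zero");
    return ctlz32(x);
}
```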
* Disable the PatchableFunction pass for NVPTX & Wasm (Sanjoy Das, 2016-04-19, 1 file, -0/+1)
  PatchableFunction requires AllVRegsAllocated that these targets don't provide.
  llvm-svn: 266720
* [NFC] Header cleanup (Mehdi Amini, 2016-04-18, 6 files, -10/+0)
  Removed some unused headers, replaced some headers with forward class declarations.
  Found using simple scripts like this one:
    clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'
  Patch by Eugene Kosov <claprix@yandex.ru>
  Differential Revision: http://reviews.llvm.org/D19219
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 266595
* [NVPTX] Set NVPTXTTI::getInliningThresholdMultiplier to 5. (Justin Lebar, 2016-04-15, 1 file, -0/+4)
  Summary: Calls on NVPTX are unusually expensive (for one thing, lots of state needs to be saved to memory, which is slow), so make the inliner much more aggressive.
  Reviewers: chandlerc
  Subscribers: jholewinski, llvm-commits, tra
  Differential Revision: http://reviews.llvm.org/D18561
  llvm-svn: 266406
* IR: Introduce ConstantAggregate, NFC (Duncan P. N. Exon Smith, 2016-04-05, 2 files, -4/+2)
  Add a common parent class for ConstantArray, ConstantVector, and ConstantStruct called ConstantAggregate. These are the aggregate subclasses of Constant that take operands.
  This is mainly a cleanup, adding a common `isa` target and removing duplicated code. However, it also simplifies caching which constants point transitively at `GlobalValue` (a possible future direction).
  llvm-svn: 265466
* [NVPTX] Handle ldg created from sign-/zero-extended load (Justin Holewinski, 2016-04-05, 2 files, -4/+91)
  Reviewers: jingyue
  Subscribers: jholewinski
  Differential Revision: http://reviews.llvm.org/D18053
  llvm-svn: 265389
* [NVPTX] Add a truncate DAG node to some calls. (Justin Lebar, 2016-04-01, 1 file, -2/+10)
  Summary: Previously, we were running afoul of the assertion
    EVT(CLI.Ins[i].VT) == InVals[i].getValueType() && "LowerCall emitted a value with the wrong type!"
  in SelectionDAGBuilder.cpp when running the NVPTX/i8-param.ll test. This is because our backend (for some reason) treats small return values as i32, but it wasn't ever truncating the i32 back down to the expected width in the DAG.
  Unclear to me whether this fixes any actual bugs -- in this test, at least, the generated code is unchanged.
  Reviewers: jingyue
  Subscribers: llvm-commits, tra, jholewinski
  Differential Revision: http://reviews.llvm.org/D17872
  llvm-svn: 265091
* [NVPTX] Read __CUDA_FTZ from module flags in NVVMReflect. (Justin Lebar, 2016-04-01, 1 file, -7/+17)
  Summary: Previously the NVVMReflect pass would read its configuration from command-line flags or a static configuration given to the pass at instantiation time.
  This doesn't quite work for clang's use-case. It needs to pass a value for __CUDA_FTZ down on a per-module basis. We use a module flag for this, so the NVVMReflect pass needs to be updated to read said flag.
  Reviewers: tra, rnk
  Subscribers: cfe-commits, jholewinski
  Differential Revision: http://reviews.llvm.org/D18672
  llvm-svn: 265090
* [NVPTX] Annotate some instructions as hasSideEffects = 0. (Justin Lebar, 2016-04-01, 2 files, -146/+171)
  Summary: Tablegen tries to infer this from the selection DAG patterns defined for the instructions, but it can't always.
  An instructive example is CLZr64. CLZr32 is correctly inferred to have no side-effects, but the selection DAG pattern for CLZr64 is slightly more complicated, and in particular the ctlz DAG node is not at the root of the pattern. Thus tablegen can't infer that CLZr64 has no side-effects.
  Reviewers: jholewinski
  Subscribers: jholewinski, tra, llvm-commits
  Differential Revision: http://reviews.llvm.org/D17472
  llvm-svn: 265089
* Change eliminateCallFramePseudoInstr() to return an iterator (Hans Wennborg, 2016-03-31, 2 files, -3/+3)
  This will become necessary in a subsequent change to make this method merge adjacent stack adjustments, i.e. it might erase the previous and/or next instruction.
  It also greatly simplifies the calls to this function from PrologEpilogInserter. Previously, that had a bunch of logic to resume iteration after the call; now it just continues with the returned iterator.
  Note that this changes the behaviour of PEI a little. Previously, it attempted to re-visit the new instruction created by eliminateCallFramePseudoInstr(). That code was added in r36625, but I can't see any reason for it: the new instructions will obviously not be pseudo instructions, they will not have FrameIndex operands, and we have already accounted for the stack adjustment.
  Differential Revision: http://reviews.llvm.org/D18627
  llvm-svn: 265036
* [NVPTX] Make NVVMReflect a function pass. (Justin Lebar, 2016-03-30, 2 files, -102/+69)
  Summary: Currently it's a module pass. Make it a function pass so that we can move it to PassManagerBuilder's EP_EarlyAsPossible extension point, which only accepts function passes.
  Reviewers: rnk
  Subscribers: tra, llvm-commits, jholewinski
  Differential Revision: http://reviews.llvm.org/D18615
  llvm-svn: 264919
* [NVPTX] Avoid temporary std::string and make single-use function local to the cpp file. (Benjamin Kramer, 2016-03-30, 2 files, -5/+4)
  No functionality change intended.
  llvm-svn: 264861
* Introduce MachineFunctionProperties and the AllVRegsAllocated property (Derek Schuff, 2016-03-28, 1 file, -0/+4)
  MachineFunctionProperties represents a set of properties that a MachineFunction can have at particular points in time. Existing examples of this idea are MachineRegisterInfo::isSSA() and MachineRegisterInfo::tracksLiveness() which will eventually be switched to use this mechanism.
  This change introduces the AllVRegsAllocated property; i.e. the property that all virtual registers have been allocated and there are no VReg operands left.
  With this mechanism, passes can declare that they require a particular property to be set, or that they set or clear properties by implementing e.g. MachineFunctionPass::getRequiredProperties(). The MachineFunctionPass base class verifies that the requirements are met, and handles the setting and clearing based on the declarations. Passes can also directly query and update the current properties of the MF if they want to have conditional behavior.
  This change annotates the target-independent post-regalloc passes; future changes will also annotate target-specific ones.
  Reviewers: qcolombet, hfinkel
  Differential Revision: http://reviews.llvm.org/D18421
  llvm-svn: 264593
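The require/set/clear bookkeeping described in this commit can be modelled in a few lines. The sketch below is a stand-alone illustration, not LLVM's class: `PropertySet`, `PassDecl`, and `runPass` are hypothetical names, and the `IsSSA` and `TracksLiveness` entries are included only as the candidates the message mentions for a later switch.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

enum class Property : std::size_t { IsSSA, TracksLiveness, AllVRegsAllocated, Count };

// A set of properties that currently hold for one function.
class PropertySet {
    std::bitset<static_cast<std::size_t>(Property::Count)> bits_;
    static std::size_t idx(Property p) {
        assert(p != Property::Count);
        return static_cast<std::size_t>(p);
    }
public:
    PropertySet& set(Property p) { bits_.set(idx(p)); return *this; }
    bool has(Property p) const { return bits_.test(idx(p)); }
    // True when every property in `required` is present in this set.
    bool containsAll(const PropertySet& required) const {
        return (required.bits_ & ~bits_).none();
    }
    void apply(const PropertySet& toSet, const PropertySet& toClear) {
        bits_ |= toSet.bits_;
        bits_ &= ~toClear.bits_;
    }
};

// What a pass declares about itself.
struct PassDecl {
    PropertySet required, sets, clears;
};

// Checks the requirements, then updates the function's properties.
// Returns false (and changes nothing) if a requirement is not met.
inline bool runPass(PropertySet& fn, const PassDecl& pass) {
    if (!fn.containsAll(pass.required)) return false;
    fn.apply(pass.sets, pass.clears);
    return true;
}
```

In this model a post-regalloc pass declares `AllVRegsAllocated` in `required`, so `runPass` rejects it until a register-allocation pass has put that property into the function's set.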
* IR: Reserve an MDKind for !llvm.loop; NFC (Duncan P. N. Exon Smith, 2016-03-25, 1 file, -1/+2)
  This reserves an MDKind for !llvm.loop, which allows callers to avoid a string-based lookup. I'm not sure why it was missing.
  There should be no functionality change here, just a small compile-time speedup.
  llvm-svn: 264371
* [NVPTX] Adds a new address space inference pass. (Jingyue Wu, 2016-03-20, 5 files, -9/+609)
  Summary: The old address space inference pass (NVPTXFavorNonGenericAddrSpaces) is unable to convert the address space of a pointer induction variable. This patch adds a new pass called NVPTXInferAddressSpaces that overcomes that limitation using a fixed-point data-flow analysis (see the file header comments for details).
  The new pass is experimental and not enabled by default. Users can turn it on by setting the -nvptx-use-infer-addrspace flag of llc.
  Reviewers: jholewinski, tra, jlebar
  Subscribers: jholewinski, llvm-commits
  Differential Revision: http://reviews.llvm.org/D17965
  llvm-svn: 263916
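A toy model shows how a fixed-point data-flow analysis can handle a pointer induction variable. This is an illustrative sketch only, not the algorithm in NVPTXInferAddressSpaces: the lattice, the `Node` graph, and `inferAddressSpaces` are assumptions, and every non-fixed node is assumed to start as `Uninitialized`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class AddrSpace { Uninitialized, Global, Shared, Generic };

// Lattice join: Uninitialized is the identity, disagreement yields Generic.
inline AddrSpace join(AddrSpace a, AddrSpace b) {
    if (a == AddrSpace::Uninitialized) return b;
    if (b == AddrSpace::Uninitialized) return a;
    return a == b ? a : AddrSpace::Generic;
}

struct Node {
    bool fixed;                         // address space known up front
    AddrSpace space;                    // current estimate
    std::vector<std::size_t> operands;  // indices of pointer operands
};

// Re-evaluate every non-fixed node from its operands until nothing changes.
inline void inferAddressSpaces(std::vector<Node>& graph) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (Node& n : graph) {
            if (n.fixed) continue;
            AddrSpace s = AddrSpace::Uninitialized;
            for (std::size_t op : n.operands) {
                assert(op < graph.size());
                s = join(s, graph[op].space);
            }
            if (s != n.space) {
                n.space = s;
                changed = true;
            }
        }
    }
}
```

With node 0 a fixed global pointer, node 1 a phi over nodes 0 and 2, and node 2 an increment of node 1, the cycle between nodes 1 and 2 starts out uninitialized and settles on `Global`, which a single pass over non-cyclic operands could not conclude.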
* [PM] Port GVN to the new pass manager, wire it up, and teach a couple of tests to run GVN in both modes. (Chandler Carruth, 2016-03-11, 1 file, -0/+1)
  This is mostly the boring refactoring just like SROA and other complex transformation passes. There is some trickiness in that GVN's ValueNumber class requires hand holding to get to compile cleanly. I'm open to suggestions about a better pattern there, but I tried several before settling on this. I was trying to balance my desire to sink as much implementation detail into the source file as possible without introducing overly many layers of abstraction.
  Much like with SROA, the design of this system is made somewhat more cumbersome by the need to support both pass managers without duplicating the significant state and logic of the pass. The same compromise is struck here.
  I've also left a FIXME in a doxygen comment as the GVN pass seems to have pretty woeful documentation within it. I'd like to submit this with the FIXME and let those more deeply familiar backfill the information here now that we have a nice place in an interface to put that kind of documentation.
  Differential Revision: http://reviews.llvm.org/D18019
  llvm-svn: 263208
* [NVPTX] Annotate param loads/stores as mayLoad/mayStore. (Justin Lebar, 2016-03-01, 2 files, -56/+68)
  Summary: Tablegen was unable to determine that param loads/stores were actually reading or writing from memory.
  I think this isn't a problem in practice for param stores, because those occur in a block right before we make our call. But param loads don't have to be at the very beginning of a function, so should be annotated as mayLoad so we don't incorrectly optimize them.
  Reviewers: jholewinski
  Subscribers: jholewinski, llvm-commits
  Differential Revision: http://reviews.llvm.org/D17471
  llvm-svn: 262381
* [NVPTX] Remove workaround for tablegen crash in NVPTXInstrInfo.td. (Justin Lebar, 2016-03-01, 1 file, -28/+7)
  Summary: Looks like this was caused by a typo.
  Reviewers: jholewinski
  Subscribers: jholewinski, llvm-commits, tra
  Differential Revision: http://reviews.llvm.org/D17357
  llvm-svn: 262380
* [NVPTX] Use different, convergent MIs for convergent calls. (Justin Lebar, 2016-03-01, 3 files, -49/+56)
  Summary: Calls sometimes need to be convergent. This is already handled at the LLVM IR level, but it also needs to be handled at the MI level.
  Ideally we'd propagate convergence from instructions, down through the selection DAG, and into MIs. But this is Hard, and would affect optimizations in the SDNs -- right now only SDNs with two operands have any flags at all.
  Instead, here's a much simpler hack: Add new opcodes for NVPTX for convergent calls, and generate these when lowering convergent LLVM calls.
  Reviewers: jholewinski
  Subscribers: jholewinski, chandlerc, joker.eph, jhen, tra, llvm-commits
  Differential Revision: http://reviews.llvm.org/D17423
  llvm-svn: 262373
* [NVPTX] Nix hack used to emit '{' and '}' for NVPTX calls. (Justin Lebar, 2016-03-01, 1 file, -9/+3)
  Summary: Tablegen understands backslash as an escape char; that's sufficient.
  Reviewers: jholewinski
  Subscribers: llvm-commits, tra, jholewinski
  Differential Revision: http://reviews.llvm.org/D17432
  llvm-svn: 262372
* [NVPTX] Reformat NVPTXInstrInfo.td, and add additional comments. (Justin Lebar, 2016-03-01, 1 file, -1418/+1400)
  Summary: Also simplify some of the embedded C++ logic. No functional changes.
  Reviewers: jholewinski
  Subscribers: llvm-commits, tra, jholewinski
  Differential Revision: http://reviews.llvm.org/D17354
  llvm-svn: 262371
* CodeGen: TII: Take MachineInstr& in predicate API, NFC (Duncan P. N. Exon Smith, 2016-02-23, 1 file, -3/+3)
  Change TargetInstrInfo API to take `MachineInstr&` instead of `MachineInstr*` in the functions related to predicated instructions (I'll try to come back later and get some of the rest). All of these functions require non-null parameters already, so references are clearer.
  As a bonus, this happens to factor away a host of implicit iterator => pointer conversions.
  No functionality change intended.
  llvm-svn: 261605
* Unbreak non-X86 targets from fallout caused by r261462 (David Majnemer, 2016-02-21, 1 file, -1/+2)
  llvm-svn: 261463
* [NVPTX] Annotate convergent intrinsics as convergent. (Justin Lebar, 2016-02-17, 1 file, -0/+2)
  Summary: Previously the machine instructions for bar.sync &co. were not marked as convergent. This resulted in some MI passes (such as TailDuplication, fixed in an upcoming patch) doing unsafe things to these instructions.
  Reviewers: jingyue
  Subscribers: llvm-commits, tra, jholewinski, hfinkel
  Differential Revision: http://reviews.llvm.org/D17318
  llvm-svn: 261115
* [NVPTX] Annotate call machine instructions as calls. (Justin Lebar, 2016-02-17, 1 file, -0/+2)
  Summary: Otherwise we'll try to do unsafe optimizations on these MIs, such as sinking loads below calls.
  (I suspect that this is not the only bug in the NVPTX instruction tablegen files; I need to comb through them.)
  Reviewers: jholewinski, tra
  Subscribers: jingyue, jhen, llvm-commits
  Differential Revision: http://reviews.llvm.org/D17315
  llvm-svn: 261113
* [NVPTX] emit .file directives for files referenced by subprograms. (Artem Belevich, 2016-02-11, 1 file, -0/+1)
  .. so .loc directives referring to those files work correctly.
  Differential Revision: http://reviews.llvm.org/D17086
  llvm-svn: 260557
* [CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI. (Ahmed Bougacha, 2016-02-09, 1 file, -7/+4)
  llvm-svn: 260316
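The idiom in this title relies on C++ allowing a declaration inside an `if` condition, which is then contextually converted to `bool`. The sketch below uses a hypothetical `Handle` type with an explicit boolean conversion as a stand-in; it makes no claim about how SDValue itself is implemented.

```cpp
#include <cassert>

struct Node { int value; };

// Hypothetical handle type: tests true when it wraps a non-null node.
class Handle {
    const Node* node_ = nullptr;
public:
    Handle() = default;
    explicit Handle(const Node* n) : node_(n) {}
    explicit operator bool() const { return node_ != nullptr; }
    const Node* getNode() const { return node_; }
};

// Hypothetical transform: succeeds only for nodes with a positive value.
inline Handle tryCombine(const Node* n) {
    return (n != nullptr && n->value > 0) ? Handle(n) : Handle();
}

inline int combineOrDefault(const Node* n, int fallback) {
    // The result is declared in the condition, so it is scoped to the `if`
    // and no separate null check on getNode() is needed.
    if (Handle r = tryCombine(n)) {
        assert(r.getNode() != nullptr);
        return r.getNode()->value;
    }
    return fallback;
}
```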
* [NVPTX] Disable performance optimizations when OptLevel==None (Jingyue Wu, 2016-02-04, 1 file, -21/+36)
  Reviewers: jholewinski, tra, eliben
  Subscribers: jholewinski, llvm-commits
  Differential Revision: http://reviews.llvm.org/D16874
  llvm-svn: 259749
* [CUDA] Die if we ask the NVPTX backend to emit a global ctor/dtor. (Justin Lebar, 2016-01-30, 1 file, -0/+17)
  Summary: Previously we'd just silently skip these.
  Reviewers: tra, jholewinski
  Subscribers: llvm-commits, jhen, echristo
  Differential Revision: http://reviews.llvm.org/D16739
  llvm-svn: 259279
* One more batch of self-containing headers. (Benjamin Kramer, 2016-01-27, 1 file, -0/+2)
  llvm-svn: 258974
* Don't put classes in headers into anonymous namespaces. (Benjamin Kramer, 2016-01-27, 1 file, -3/+2)
  You want ODR violations? That's how you get ODR violations.
  llvm-svn: 258973
* Rename TargetSelectionDAGInfo into SelectionDAGTargetInfo and move it to CodeGen/ (Benjamin Kramer, 2016-01-27, 2 files, -4/+4)
  It's a SelectionDAG thing, not a Target thing.
  llvm-svn: 258939
* Remove autoconf support (Chris Bieneman, 2016-01-26, 4 files, -69/+0)
  Summary: This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html
  "I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened."
  - Obi Wan Kenobi
  Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark
  Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits
  Differential Revision: http://reviews.llvm.org/D16471
  llvm-svn: 258861
* [CUDA] Die gracefully when trying to output an LLVM alias. (Justin Lebar, 2016-01-23, 1 file, -0/+5)
  Summary: Previously, we would just output "foo = bar" in the assembly, and then ptxas would choke. Now we die before emitting any invalid code.
  Reviewers: echristo
  Subscribers: jholewinski, llvm-commits, jhen, tra
  Differential Revision: http://reviews.llvm.org/D16490
  llvm-svn: 258638
* [CUDA] Make empty parameter lists in nvptx function decls easier to read. (Justin Lebar, 2016-01-23, 1 file, -0/+5)
  Summary:
  Before: .func (.param .b32 func_retval0) _ZL21__nvvm_reflect_anchorv( ) {
  After: .func (.param .b32 func_retval0) _ZL21__nvvm_reflect_anchorv() {
  Reviewers: bkramer
  Subscribers: llvm-commits, tra, jhen, echristo, jholewinski
  Differential Revision: http://reviews.llvm.org/D16512
  llvm-svn: 258637
* Put space after pointer type in test. NFC. (Manuel Jacob, 2016-01-23, 1 file, -1/+1)
  llvm-svn: 258615
* [NVPTX] expand mul_lohi to mul_lo and mul_hi (Jingyue Wu, 2016-01-22, 1 file, -0/+4)
  Summary: Fixes PR26186.
  Reviewers: grosser, jholewinski
  Subscribers: jholewinski, llvm-commits
  Differential Revision: http://reviews.llvm.org/D16479
  llvm-svn: 258536
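Splitting a combined low/high multiply into two separate operations can be sketched in plain C++. This is an illustration of the arithmetic only, not the backend's lowering code; `mul_lo`, `mul_hi`, `mul_lohi`, and `LoHi` are hypothetical names for the unsigned 32-bit case.

```cpp
#include <cassert>
#include <cstdint>

// Low 32 bits of the 64-bit product.
inline std::uint32_t mul_lo(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint32_t>(static_cast<std::uint64_t>(a) * b);
}

// High 32 bits of the 64-bit product.
inline std::uint32_t mul_hi(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint32_t>((static_cast<std::uint64_t>(a) * b) >> 32);
}

struct LoHi {
    std::uint32_t lo;
    std::uint32_t hi;
};

// The combined operation, expressed as the two independent halves.
inline LoHi mul_lohi(std::uint32_t a, std::uint32_t b) {
    const LoHi r{mul_lo(a, b), mul_hi(a, b)};
    // Sanity check: the halves recombine into the full product.
    assert(((static_cast<std::uint64_t>(r.hi) << 32) | r.lo) ==
           static_cast<std::uint64_t>(a) * b);
    return r;
}
```

For instance, multiplying 0xFFFFFFFF by itself gives a low half of 1 and a high half of 0xFFFFFFFE.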
* GlobalValue: use getValueType() instead of getType()->getPointerElementType(). (Manuel Jacob, 2016-01-16, 2 files, -10/+8)
  Reviewers: mjacob
  Subscribers: jholewinski, arsenm, dsanders, dblaikie
  Patch by Eduard Burtescu.
  Differential Revision: http://reviews.llvm.org/D16260
  llvm-svn: 257999
* [NVPTX] Do not emit .hidden or .protected directives as they are not allowed by PTX. (Artem Belevich, 2016-01-15, 1 file, -0/+3)
  llvm-svn: 257961
* Added support for macro emission in dwarf (supporting DWARF version 4). (Amjad Aboud, 2016-01-07, 2 files, -0/+4)
  Differential Revision: http://reviews.llvm.org/D15495
  llvm-svn: 257060
* Remove extra forward declarations and scrub includes for all in tree InstPrinters. NFC (Craig Topper, 2015-12-25, 1 file, -2/+0)
  llvm-svn: 256427
* Start replacing vector_extract/vector_insert with extractelt/insertelt (Matt Arsenault, 2015-12-11, 1 file, -30/+28)
  These are redundant pairs of nodes defined for INSERT_VECTOR_ELEMENT/EXTRACT_VECTOR_ELEMENT. insertelement/extractelement are slightly closer to the corresponding C++ node name, and have stricter type checking, so prefer them.
  Update targets to only use these nodes where it is trivial to do so. AArch64, ARM, and Mips all have various type errors on simple replacement, so they will need work to fix.
  Example from AArch64:
    def : Pat<(sext_inreg (vector_extract (v16i8 V128:$Rn), VectorIndexB:$idx), i8),
              (i32 (SMOVvi8to32 V128:$Rn, VectorIndexB:$idx))>;
  Which is trying to do sext_inreg i8, i8.
  llvm-svn: 255359
* Stop producing .data.rel sections. (Rafael Espindola, 2015-11-18, 1 file, -2/+1)
  If a section is rw, it is irrelevant if the dynamic linker will write to it or not.
  It looks like llvm implemented this because gcc was doing it. It looks like gcc implemented this in the hope that it would put all the relocated items close together and speed up the dynamic linker.
  There are two problems with this:
    * It doesn't work. Both bfd and gold will map .data.rel to .data and concatenate the input sections in the order they are seen.
    * If we want a feature like that, it can be implemented directly in the linker since it knows where the dynamic relocations are.
  llvm-svn: 253436
* Drop code after unreachable. No functionality change. (Benjamin Kramer, 2015-10-26, 1 file, -2/+0)
  llvm-svn: 251278