summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [CodeGen] Add print and verify pass after each MachineFunctionPass by defaultMatthias Braun2014-12-111-6/+3
| | | | | | | | | | | | | | | | | | | Previously print+verify passes were added in a very unsystematic way, which is annoying when debugging as you miss intermediate steps and allows bugs to stay unnotice when no verification is performed. To make this change practical I added the possibility to explicitely disable verification. I used this option on all places where no verification was performed previously (because alot of places actually don't pass the MachineVerifier). In the long term these problems should be fixed properly and verification enabled after each pass. I'll enable some more verification in subsequent commits. This is the 2nd attempt at this after realizing that PassManager::add() may actually delete the pass. llvm-svn: 224059
* This reverts commit r224043 and r224042.Rafael Espindola2014-12-111-3/+6
| | | | | | check-llvm was failing. llvm-svn: 224045
* [CodeGen] Add print and verify pass after each MachineFunctionPass by defaultMatthias Braun2014-12-111-6/+3
| | | | | | | | | | | | | | | | Previously print+verify passes were added in a very unsystematic way, which is annoying when debugging as you miss intermediate steps and allows bugs to stay unnotice when no verification is performed. To make this change practical I added the possibility to explicitely disable verification. I used this option on all places where no verification was performed previously (because alot of places actually don't pass the MachineVerifier). In the long term these problems should be fixed properly and verification enabled after each pass. I'll enable some more verification in subsequent commits. llvm-svn: 224042
* Add out of line virtual destructors to all LLVMTargetMachine subclassesReid Kleckner2014-11-201-0/+2
| | | | | | | | | | | | | | | | | These recently all grew a unique_ptr<TargetLoweringObjectFile> member in r221878. When anyone calls a virtual method of a class, clang-cl requires all virtual methods to be semantically valid. This includes the implicit virtual destructor, which triggers instantiation of the unique_ptr destructor, which fails because the type being deleted is incomplete. This is just part of the ongoing saga of PR20337, which is affecting Blink as well. Because the MSVC ABI doesn't have key functions, we end up referencing the vtable and implicit destructor on any virtual call through a class. We don't actually end up emitting the dtor, so it'd be good if we could avoid this unneeded type completion work. llvm-svn: 222480
* This patch changes the ownership of TLOF from TargetLoweringBase to ↵Aditya Nandakumar2014-11-131-0/+2
| | | | | | TargetMachine so that different subtargets could share the TLOF effectively llvm-svn: 221878
* [NVPTX] Add an NVPTX-specific TargetTransformInfoJingyue Wu2014-11-101-0/+8
| | | | | | | | | | | | | | | | | | | | Summary: It currently only implements hasBranchDivergence, and will be extended in later diffs. Split from D6188. Test Plan: make check-all Reviewers: jholewinski Reviewed By: jholewinski Subscribers: llvm-commits, meheff, eliben, jholewinski Differential Revision: http://reviews.llvm.org/D6195 llvm-svn: 221619
* [NVPTX] Add NVPTXLowerStructArgs passJustin Holewinski2014-11-051-0/+2
| | | | | | | | | | | | | | | | | | | | | | | This works around the limitation that PTX does not allow .param space loads/stores with arbitrary pointers. If a function has a by-val struct ptr arg, say foo(%struct.x *byval %d), then add the following instructions to the first basic block : %temp = alloca %struct.x, align 8 %tt1 = bitcast %struct.x * %d to i8 * %tt2 = llvm.nvvm.cvt.gen.to.param %tt2 %tempd = bitcast i8 addrspace(101) * to %struct.x addrspace(101) * %tv = load %struct.x addrspace(101) * %tempd store %struct.x %tv, %struct.x * %temp, align 8 The above code allocates some space in the stack and copies the incoming struct from param space to local space. Then replace all occurences of %d by %temp. Fixes PR21465. llvm-svn: 221377
* [NVPTX] Directly control the Machine SSA passes that are invoked for NVPTX.Justin Holewinski2014-06-271-0/+41
| | | | | | | NVPTX is a bit special in the optimizations it requires, so this gives us better control over the backend optimization pipeline. llvm-svn: 211927
* Move NVPTX subtarget dependent variables from the target machineEric Christopher2014-06-271-14/+1
| | | | | | to the subtarget. llvm-svn: 211860
* Remove unnecessary caching of the TargetMachine on NVPTXFrameLowering.Eric Christopher2014-06-271-1/+1
| | | | | | Adjust the constructor accordingly. llvm-svn: 211846
* Remove caching of the target machine in NVPTXInstrInfo andEric Christopher2014-06-271-1/+1
| | | | | | update constructor accordingly. llvm-svn: 211840
* Remove comment that duplicated information in the constructorEric Christopher2014-06-271-6/+6
| | | | | | that it's after. llvm-svn: 211839
* Have TargetSelectionDAGInfo take a DataLayout initializer rather thanEric Christopher2014-06-061-1/+1
| | | | | | a TargetMachine since the only thing it wants is DataLayout. llvm-svn: 210366
* Add an optimization that does CSE in a group of similar GEPs.Eli Bendersky2014-05-011-4/+17
| | | | | | | | | | | | | | This optimization merges the common part of a group of GEPs, so we can compute each pointer address by adding a simple offset to the common part. The optimization is currently only enabled for the NVPTX backend, where it has a large payoff on some benchmarks. Review: http://reviews.llvm.org/D3462 Patch by Jingyue Wu. llvm-svn: 207783
* [C++11] Add 'override' keywords and remove 'virtual'. Additionally add ↵Craig Topper2014-04-291-8/+8
| | | | | | 'final' and leave 'virtual' on some methods that are marked virtual without overriding anything and have no obvious overrides themselves. NVPTX edition llvm-svn: 207505
* [C++] Use 'nullptr'. Target edition.Craig Topper2014-04-251-1/+1
| | | | llvm-svn: 207197
* [C++11] Replace OwningPtr with std::unique_ptr in places where it doesn't ↵Benjamin Kramer2014-04-211-1/+0
| | | | | | | | break the API. No functionality change. llvm-svn: 206740
* [NVPTX] Add preliminary intrinsics and codegen support for textures/surfacesJustin Holewinski2014-04-091-0/+8
| | | | | | This commit adds intrinsics and codegen support for the surface read/write and texture read instructions that take an explicit sampler parameter. Codegen operates on image handles at the PTX level, but falls back to direct replacement of handles with kernel arguments if image handles are not enabled. Note that image handles are explicitly disabled for all target architectures in this change (to be enabled later). llvm-svn: 205907
* Optimize away unnecessary address casts.Eli Bendersky2014-04-031-0/+9
| | | | | | | | | Removes unnecessary casts from non-generic address spaces to the generic address space for certain code patterns. Patch by Jingyue Wu. llvm-svn: 205571
* Fix for PR19099 - NVPTX produces invalid symbol names.Eli Bendersky2014-03-311-0/+3
| | | | | | | | This is a more thorough fix for the issue than r203483. An IR pass will run before NVPTX codegen to make sure there are no invalid symbol names that can't be consumed by the ptxas assembler. llvm-svn: 205212
* Removes the NVPTXSplitBBatBar pass.Eli Bendersky2014-03-241-2/+0
| | | | | | | This pass is a historic remnant and actually causes less efficient code to be generated in some cases. llvm-svn: 204620
* Switch all uses of LLVM_OVERRIDE to just use 'override' directly.Craig Topper2014-03-021-1/+1
| | | | llvm-svn: 202621
* [cleanup] Move the Dominators.h and Verifier.h headers into the IRChandler Carruth2014-01-131-1/+1
| | | | | | | | | | | | | | | | | | directory. These passes are already defined in the IR library, and it doesn't make any sense to have the headers in Analysis. Long term, I think there is going to be a much better way to divide these matters. The dominators code should be fully separated into the abstract graph algorithm and have that put in Support where it becomes obvious that evn Clang's CFGBlock's can use it. Then the verifier can manually construct dominance information from the Support-driven interface while the Analysis library can provide a pass which both caches, reconstructs, and supports a nice update API. But those are very long term, and so I don't want to leave the really confusing structure until that day arrives. llvm-svn: 199082
* [PM] Rename the IR printing pass header to a more generic and correctChandler Carruth2014-01-121-1/+1
| | | | | | | | name to match the source file which I got earlier. Update the include sites. Also modernize the comments in the header to use the more recommended doxygen style. llvm-svn: 199041
* Move the LLVM IR asm writer header files into the IR directory, as theyChandler Carruth2014-01-071-1/+1
| | | | | | | | | | | | | | | | | are part of the core IR library in order to support dumping and other basic functionality. Rename the 'Assembly' include directory to 'AsmParser' to match the library name and the only functionality left their -- printing has been in the core IR library for quite some time. Update all of the #includes to match. All of this started because I wanted to have the layering in good shape before I started adding support for printing LLVM IR using the new pass infrastructure, and commandline support for the new pass infrastructure. llvm-svn: 198688
* The preferred alignment defaults to the abi alignment. Omit if it is the same.Rafael Espindola2013-12-161-2/+2
| | | | llvm-svn: 197400
* Don't duplicate the DataLayout defaults for integer, floats and vectors.Rafael Espindola2013-12-161-3/+1
| | | | llvm-svn: 197398
* On DataLayout, omit the default of p:64:64:64.Rafael Espindola2013-12-161-3/+1
| | | | llvm-svn: 197397
* Refactor NVPTX's computeDataLayout.Rafael Espindola2013-12-141-4/+9
| | | | | | No functionality change. llvm-svn: 197312
* Turn NVPTXSubtarget::getDataLayout into a static function.Rafael Espindola2013-12-141-1/+11
| | | | | | No functionality change. llvm-svn: 197311
* [NVPTX] Blacklist TailDuplicate passJustin Holewinski2013-11-111-0/+1
| | | | | | | | This causes issues with virtual registers. We will likely need to fix TailDuplicate in the future, or introduce a new version that plays nicely with vregs. llvm-svn: 194373
* Assert on duplicate registration. Don't depend on function pointer equality.Rafael Espindola2013-10-161-3/+0
| | | | | | | | | | | | | | | | | | | | Before this patch we would assert when building llvm as multiple shared libraries (cmake's BUILD_SHARED_LIBS). The problem was the line if (T.AsmStreamerCtorFn == Target::createDefaultAsmStreamer) which returns false because of -fvisibility-inlines-hidden. It is easy to fix just this one case, but I decided to try to also make the registration more strict. It looks like the old logic for ignoring followup registration was just a temporary hack that outlived its usefulness. This patch converts the ifs to asserts, fixes the few cases that were registering twice and makes sure all the asserts compare with null. Thanks for Joerg for reporting the problem and reviewing the patch. llvm-svn: 192803
* [NVPTX] Switch from StrongPHIElimination to PHIElimination in ↵Justin Holewinski2013-10-111-2/+22
| | | | | | | | NVPTXTargetMachine, and add some missing optimization passes to addOptimizedRegAlloc Fixes PR17529 llvm-svn: 192445
* NVPTX: Don't even create a regalloc if we're not going to use it.Benjamin Kramer2013-05-311-2/+7
| | | | | | Fixes a leak found by valgrind. llvm-svn: 183031
* [NVPTX] Re-enable support for virtual registers in the final outputJustin Holewinski2013-05-311-0/+27
| | | | | | | | | | | | Now that 3.3 is branched, we are re-enabling virtual registers to help iron out bugs before the next release. Some of the post-RA passes do not play well with virtual registers, so we disable them for now. The needed functionality of the PrologEpilogInserter pass is copied to a new backend-specific NVPTXPrologEpilog pass. The test for this commit is not breaking the existing tests. llvm-svn: 182998
* Move passes from namespace llvm into anonymous namespaces. Sort includes ↵Benjamin Kramer2013-05-231-2/+2
| | | | | | while there. llvm-svn: 182594
* [NVPTX] Add GenericToNVVM IR converter to better handle idiomatic LLVM IR inputsJustin Holewinski2013-05-201-0/+8
| | | | | | | | | | | | | | | This converter currently only handles global variables in address space 0. For these variables, they are promoted to address space 1 (global memory), and all uses are updated to point to the result of a cvta.global instruction on the new variable. The motivation for this is address space 0 global variables are illegal since we cannot declare variables in the generic address space. Instead, we place the variables in address space 1 and explicitly convert the pointer to address space 0. This is primarily intended to help new users who expect to be able to place global variables in the default address space. llvm-svn: 182254
* Remove the MachineMove class.Rafael Espindola2013-05-131-1/+3
| | | | | | | | | | | | It was just a less powerful and more confusing version of MCCFIInstruction. A side effect is that, since MCCFIInstruction uses dwarf register numbers, calls to getDwarfRegNum are pushed out, which should allow further simplifications. I left the MachineModuleInfo::addFrameMove interface unchanged since this patch was already fairly big. llvm-svn: 181680
* [NVPTX] Add NVVMReflect pass to allow compile-time selection ofJustin Holewinski2013-03-301-0/+7
| | | | | | | | | | | | | | | | specific code paths. This allows us to write code like: if (__nvvm_reflect("FOO")) // Do something else // Do something else and compile into a library, then give "FOO" a value at kernel compile-time so the check becomes a no-op. llvm-svn: 178416
* [NVPTX] Run clang-format on all NVPTX sources.Justin Holewinski2013-03-301-38/+21
| | | | | | | Hopefully this resolves any outstanding style issues and gives us an automated way of ensuring we conform to the style guidelines. llvm-svn: 178415
* [NVPTX] Disable vector registersJustin Holewinski2013-02-121-1/+0
| | | | | | | | | | | Vectors were being manually scalarized by the backend. Instead, let the target-independent code do all of the work. The manual scalarization was from a time before good target-independent support for scalarization in LLVM. However, this forces us to specially-handle vector loads and stores, which we can turn into PTX instructions that produce/consume multiple operands. llvm-svn: 174968
* Switch TargetTransformInfo from an immutable analysis pass that requiresChandler Carruth2013-01-071-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | a TargetMachine to construct (and thus isn't always available), to an analysis group that supports layered implementations much like AliasAnalysis does. This is a pretty massive change, with a few parts that I was unable to easily separate (sorry), so I'll walk through it. The first step of this conversion was to make TargetTransformInfo an analysis group, and to sink the nonce implementations in ScalarTargetTransformInfo and VectorTargetTranformInfo into a NoTargetTransformInfo pass. This allows other passes to add a hard requirement on TTI, and assume they will always get at least on implementation. The TargetTransformInfo analysis group leverages the delegation chaining trick that AliasAnalysis uses, where the base class for the analysis group delegates to the previous analysis *pass*, allowing all but tho NoFoo analysis passes to only implement the parts of the interfaces they support. It also introduces a new trick where each pass in the group retains a pointer to the top-most pass that has been initialized. This allows passes to implement one API in terms of another API and benefit when some other pass above them in the stack has more precise results for the second API. The second step of this conversion is to create a pass that implements the TargetTransformInfo analysis using the target-independent abstractions in the code generator. This replaces the ScalarTargetTransformImpl and VectorTargetTransformImpl classes in lib/Target with a single pass in lib/CodeGen called BasicTargetTransformInfo. This class actually provides most of the TTI functionality, basing it upon the TargetLowering abstraction and other information in the target independent code generator. The third step of the conversion adds support to all TargetMachines to register custom analysis passes. This allows building those passes with access to TargetLowering or other target-specific classes, and it also allows each target to customize the set of analysis passes desired in the pass manager. The baseline LLVMTargetMachine implements this interface to add the BasicTTI pass to the pass manager, and all of the tools that want to support target-aware TTI passes call this routine on whatever target machine they end up with to add the appropriate passes. The fourth step of the conversion created target-specific TTI analysis passes for the X86 and ARM backends. These passes contain the custom logic that was previously in their extensions of the ScalarTargetTransformInfo and VectorTargetTransformInfo interfaces. I separated them into their own file, as now all of the interface bits are private and they just expose a function to create the pass itself. Then I extended these target machines to set up a custom set of analysis passes, first adding BasicTTI as a fallback, and then adding their customized TTI implementations. The fourth step required logic that was shared between the target independent layer and the specific targets to move to a different interface, as they no longer derive from each other. As a consequence, a helper functions were added to TargetLowering representing the common logic needed both in the target implementation and the codegen implementation of the TTI pass. While technically this is the only change that could have been committed separately, it would have been a nightmare to extract. The final step of the conversion was just to delete all the old boilerplate. This got rid of the ScalarTargetTransformInfo and VectorTargetTransformInfo classes, all of the support in all of the targets for producing instances of them, and all of the support in the tools for manually constructing a pass based around them. Now that TTI is a relatively normal analysis group, two things become straightforward. First, we can sink it into lib/Analysis which is a more natural layer for it to live. Second, clients of this interface can depend on it *always* being available which will simplify their code and behavior. These (and other) simplifications will follow in subsequent commits, this one is clearly big enough. Finally, I'm very aware that much of the comments and documentation needs to be updated. As soon as I had this working, and plausibly well commented, I wanted to get it committed and in front of the build bots. I'll be doing a few passes over documentation later if it sticks. Commits to update DragonEgg and Clang will be made presently. llvm-svn: 171681
* Move all of the header files which are involved in modelling the LLVM IRChandler Carruth2013-01-021-1/+1
| | | | | | | | | | | | | | | | | | | | | into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. llvm-svn: 171366
* Remove duplicate includes.Roman Divacky2012-12-211-1/+0
| | | | llvm-svn: 170902
* Use the new script to sort the includes of every file under lib.Chandler Carruth2012-12-031-10/+10
| | | | | | | | | | | | | | | | | Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131
* Implement a basic VectorTargetTransformInfo interface to be used by the loop ↵Nadav Rotem2012-10-241-1/+1
| | | | | | and bb vectorizers for modeling the cost of instructions. llvm-svn: 166593
* Reapply the TargerTransformInfo changes, minus the changes to LSR and ↵Nadav Rotem2012-10-181-1/+2
| | | | | | Lowerinvoke. llvm-svn: 166248
* Temporarily revert the TargetTransform changes.Bob Wilson2012-10-181-2/+1
| | | | | | | | | | | The TargetTransform changes are breaking LTO bootstraps of clang. I am working with Nadav to figure out the problem, but I am reverting it for now to get our buildbots working. This reverts svn commits: 165665 165669 165670 165786 165787 165997 and I have also reverted clang svn 165741 llvm-svn: 166168
* Add a new interface to allow IR-level passes to access codegen-specific ↵Nadav Rotem2012-10-101-1/+2
| | | | | | information. llvm-svn: 165665
* Move TargetData to DataLayout.Micah Villmow2012-10-081-2/+2
| | | | llvm-svn: 165402
OpenPOWER on IntegriCloud