summaryrefslogtreecommitdiffstats
path: root/llvm/include/llvm/InitializePasses.h
Commit message (Collapse)AuthorAgeFilesLines
* Add a MachinePostDominator passTom Stellard2012-09-171-0/+1
| | | | | | This is used in the AMDIL and R600 backends. llvm-svn: 164029
* Introduce a new SROA implementation.Chandler Carruth2012-09-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is essentially a ground up re-think of the SROA pass in LLVM. It was initially inspired by a few problems with the existing pass: - It is subject to the bane of my existence in optimizations: arbitrary thresholds. - It is overly conservative about which constructs can be split and promoted. - The vector value replacement aspect is separated from the splitting logic, missing many opportunities where splitting and vector value formation can work together. - The splitting is entirely based around the underlying type of the alloca, despite this type often having little to do with the reality of how that memory is used. This is especially prevelant with unions and base classes where we tail-pack derived members. - When splitting fails (often due to the thresholds), the vector value replacement (again because it is separate) can kick in for preposterous cases where we simply should have split the value. This results in forming i1024 and i2048 integer "bit vectors" that tremendously slow down subsequnet IR optimizations (due to large APInts) and impede the backend's lowering. The new design takes an approach that fundamentally is not susceptible to many of these problems. It is the result of a discusison between myself and Duncan Sands over IRC about how to premptively avoid these types of problems and how to do SROA in a more principled way. Since then, it has evolved and grown, but this remains an important aspect: it fixes real world problems with the SROA process today. First, the transform of SROA actually has little to do with replacement. It has more to do with splitting. The goal is to take an aggregate alloca and form a composition of scalar allocas which can replace it and will be most suitable to the eventual replacement by scalar SSA values. The actual replacement is performed by mem2reg (and in the future SSAUpdater). The splitting is divided into four phases. The first phase is an analysis of the uses of the alloca. This phase recursively walks uses, building up a dense datastructure representing the ranges of the alloca's memory actually used and checking for uses which inhibit any aspects of the transform such as the escape of a pointer. Once we have a mapping of the ranges of the alloca used by individual operations, we compute a partitioning of the used ranges. Some uses are inherently splittable (such as memcpy and memset), while scalar uses are not splittable. The goal is to build a partitioning that has the minimum number of splits while placing each unsplittable use in its own partition. Overlapping unsplittable uses belong to the same partition. This is the target split of the aggregate alloca, and it maximizes the number of scalar accesses which become accesses to their own alloca and candidates for promotion. Third, we re-walk the uses of the alloca and assign each specific memory access to all the partitions touched so that we have dense use-lists for each partition. Finally, we build a new, smaller alloca for each partition and rewrite each use of that partition to use the new alloca. During this phase the pass will also work very hard to transform uses of an alloca into a form suitable for promotion, including forming vector operations, speculating loads throguh PHI nodes and selects, etc. After splitting is complete, each newly refined alloca that is a candidate for promotion to a scalar SSA value is run through mem2reg. There are lots of reasonably detailed comments in the source code about the design and algorithms, and I'm going to be trying to improve them in subsequent commits to ensure this is well documented, as the new pass is in many ways more complex than the old one. Some of this is still a WIP, but the current state is reasonbly stable. It has passed bootstrap, the nightly test suite, and Duncan has run it successfully through the ACATS and DragonEgg test suites. That said, it remains behind a default-off flag until the last few pieces are in place, and full testing can be done. Specific areas I'm looking at next: - Improved comments and some code cleanup from reviews. - SSAUpdater and enabling this pass inside the CGSCC pass manager. - Some datastructure tuning and compile-time measurements. - More aggressive FCA splitting and vector formation. Many thanks to Duncan Sands for the thorough final review, as well as Benjamin Kramer for lots of review during the process of writing this pass, and Daniel Berlin for reviewing the data structures and algorithms and general theory of the pass. Also, several other people on IRC, over lunch tables, etc for lots of feedback and advice. llvm-svn: 163883
* Add a pass that renames everything with metasyntatic names. This works well ↵Alex Rosenberg2012-09-111-0/+1
| | | | | | after using bugpoint to reduce the confusion presented by the original names, which no longer mean what they used to. llvm-svn: 163592
* Add a new optimization pass: Stack Coloring, that merges disjoint static ↵Nadav Rotem2012-09-061-0/+1
| | | | | | | | allocations (allocas). Allocas are known to be disjoint if they are marked by disjoint lifetime markers (@llvm.lifetime.XXX intrinsics). llvm-svn: 163299
* Profile: set branch weight metadata with data generated from profiling.Manman Ren2012-08-281-0/+1
| | | | | | | | | This patch implements ProfileDataLoader which loads profile data generated by -insert-edge-profiling and updates branch weight metadata accordingly. Patch by Alastair Murray. llvm-svn: 162799
* Start scaffolding for a MachineTraceMetrics analysis pass.Jakob Stoklund Olesen2012-07-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | This is still a work in progress. Out-of-order CPUs usually execute instructions from multiple basic blocks simultaneously, so it is necessary to look at longer traces when estimating the performance effects of code transformations. The MachineTraceMetrics analysis will pick a typical trace through a given basic block and provide performance metrics for the trace. Metrics will include: - Instruction count through the trace. - Issue count per functional unit. - Critical path length, and per-instruction 'slack'. These metrics can be used to determine the performance limiting factor when executing the trace, and how it will be affected by a code transformation. Initially, this will be used by the early if-conversion pass. llvm-svn: 160796
* Add an experimental early if-conversion pass, off by default.Jakob Stoklund Olesen2012-07-041-0/+1
| | | | | | | | | | | | | This pass performs if-conversion on SSA form machine code by speculatively executing both sides of the branch and using a cmov instruction to select the result. This can help lower the number of branch mispredictions on architectures like x86 that don't have predicable instructions. The current implementation is very aggressive, and causes regressions on mosts tests. It needs good heuristics that have yet to be implemented. llvm-svn: 159694
* Remove the RenderMachineFunction HTML output pass.Jakob Stoklund Olesen2012-06-201-1/+0
| | | | | | | I don't think anyone has been using this functionality for a while, and it is getting in the way of refactoring now. llvm-svn: 158876
* Sketch a LiveRegMatrix analysis pass.Jakob Stoklund Olesen2012-06-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The LiveRegMatrix represents the live range of assigned virtual registers in a Live interval union per register unit. This is not fundamentally different from the interference tracking in RegAllocBase that both RABasic and RAGreedy use. The important differences are: - LiveRegMatrix tracks interference per register unit instead of per physical register. This makes interference checks cheaper and assignments slightly more expensive. For example, the ARM D7 reigster has 24 aliases, so we would check 24 physregs before assigning to one. With unit-based interference, we check 2 units before assigning to 2 units. - LiveRegMatrix caches regmask interference checks. That is currently duplicated functionality in RABasic and RAGreedy. - LiveRegMatrix is a pass which makes it possible to insert target-dependent passes between register allocation and rewriting. Such passes could tweak the register assignments with interference checking support from LiveRegMatrix. Eventually, RABasic and RAGreedy will be switched to LiveRegMatrix. llvm-svn: 158255
* Reintroduce VirtRegRewriter.Jakob Stoklund Olesen2012-06-081-0/+1
| | | | | | | | | | | | | | | | | | OK, not really. We don't want to reintroduce the old rewriter hacks. This patch extracts virtual register rewriting as a separate pass that runs after the register allocator. This is possible now that CodeGen/Passes.cpp can configure the full optimizing register allocator pipeline. The rewriter pass uses register assignments in VirtRegMap to rewrite virtual registers to physical registers, and it inserts kill flags based on live intervals. These finalization steps are the same for the optimizing register allocators: RABasic, RAGreedy, and PBQP. llvm-svn: 158244
* Add an insertPass API to TargetPassConfig. <rdar://problem/11498613>Bob Wilson2012-05-301-0/+1
| | | | | | | | | | Besides adding the new insertPass function, this patch uses it to enhance the existing -print-machineinstrs so that the MachineInstrs after a specific pass can be printed. Patch by Bin Zeng! llvm-svn: 157655
* add a new pass to instrument loads and stores for run-time bounds checkingNuno Lopes2012-05-221-0/+1
| | | | | | | | move EmitGEPOffset from InstCombine to Transforms/Utils/Local.h (a draft of this) patch reviewed by Andrew, thanks. llvm-svn: 157261
* ThreadSanitizer, a race detector. First LLVM commit.Kostya Serebryany2012-02-131-0/+1
| | | | | | | Clang patch (flags) will follow shortly. The run-time library will also follow, but not immediately. llvm-svn: 150423
* Codegen pass definition cleanup. No functionality.Andrew Trick2012-02-081-0/+7
| | | | | | | | | | | | | Moving toward a uniform style of pass definition to allow easier target configuration. Globally declare Pass ID. Globally declare pass initializer. Use INITIALIZE_PASS consistently. Add a call to the initializer from CodeGen.cpp. Remove redundant "createPass" functions and "getPassName" methods. While cleaning up declarations, cleaned up comments (sorry for large diff). llvm-svn: 150100
* Move pass configuration out of pass constructors: BranchFolderPassAndrew Trick2012-02-081-0/+1
| | | | llvm-svn: 150095
* Make TargetPassConfig an ImmutablePass so CodeGenPasses can query optionsAndrew Trick2012-02-041-0/+1
| | | | llvm-svn: 149752
* Add a basic-block autovectorization pass.Hal Finkel2012-02-011-1/+5
| | | | | | | This is the initial checkin of the basic-block autovectorization pass along with some supporting vectorization infrastructure. Special thanks to everyone who helped review this code over the last several months (especially Tobias Grosser). llvm-svn: 149468
* More bundle related API additions.Evan Cheng2012-01-191-0/+1
| | | | llvm-svn: 148465
* Add a new ObjC ARC optimization pass to eliminate unneededDan Gohman2012-01-171-0/+1
| | | | | | autorelease push+pop pairs. llvm-svn: 148330
* Renamed MachineScheduler to ScheduleTopDownLive.Andrew Trick2012-01-171-1/+1
| | | | | | Responding to code review. llvm-svn: 148290
* Added the MachineSchedulerPass skeleton.Andrew Trick2012-01-131-0/+1
| | | | llvm-svn: 148105
* Added a late machine instruction copy propagation pass. This catchesEvan Cheng2012-01-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | opportunities that only present themselves after late optimizations such as tail duplication .e.g. ## BB#1: movl %eax, %ecx movl %ecx, %eax ret The register allocator also leaves some of them around (due to false dep between copies from phi-elimination, etc.) This required some changes in codegen passes. Post-ra scheduler and the pseudo-instruction expansion passes have been moved after branch folding and tail merging. They were before branch folding before because it did not always update block livein's. That's fixed now. The pass change makes independently since we want to properly schedule instructions after branch folding / tail duplication. rdar://10428165 rdar://10640363 llvm-svn: 147716
* - Add MachineInstrBundle.h and MachineInstrBundle.cpp. This includes a functionEvan Cheng2011-12-141-0/+1
| | | | | | | | | | to finalize MI bundles (i.e. add BUNDLE instruction and computing register def and use lists of the BUNDLE instruction) and a pass to unpack bundles. - Teach more of MachineBasic and MachineInstr methods to be bundle aware. - Switch Thumb2 IT block to MI bundles and delete the hazard recognizer hack to prevent IT blocks from being broken apart. llvm-svn: 146542
* Kill off the LoopSplitter. It's not being used or maintained.Lang Hames2011-12-061-1/+0
| | | | llvm-svn: 145897
* AddressSanitizer, first commit (compiler module only)Kostya Serebryany2011-11-161-0/+1
| | | | llvm-svn: 144758
* Prune more RALinScan. RALinScan was also here!NAKAMURA Takumi2011-11-131-1/+0
| | | | llvm-svn: 144487
* Begin collecting some of the statistics for block placement discussed onChandler Carruth2011-11-021-0/+1
| | | | | | | | | | | | | the mailing list. Suggestions for other statistics to collect would be awesome. =] Currently these are implemented as a separate pass guarded by a separate flag. I'm not thrilled by that, but I wanted to be able to collect the statistics for the old code placement as well as the new in order to have a point of comparison. I'm planning on folding them into the single pass if / when there is only one pass of interest. llvm-svn: 143537
* Implement a block placement pass based on the branch probability andChandler Carruth2011-10-211-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | block frequency analyses. This differs substantially from the existing block-placement pass in LLVM: 1) It operates on the Machine-IR in the CodeGen layer. This exposes much more (and more precise) information and opportunities. Also, the results are more stable due to fewer transforms ocurring after the pass runs. 2) It uses the generalized probability and frequency analyses. These can model static heuristics, code annotation derived heuristics as well as eventual profile loading. By basing the optimization on the analysis interface it can work from any (or a combination) of these inputs. 3) It uses a more aggressive algorithm, both building chains from tho bottom up to maximize benefit, and using an SCC-based walk to layout chains of blocks in a profitable ordering without O(N^2) iterations which the old pass involves. The pass is currently gated behind a flag, and not enabled by default because it still needs to grow some important features. Most notably, it needs to support loop aligning and careful layout of loop structures much as done by hand currently in CodePlacementOpt. Once it supports these, and has sufficient testing and quality tuning, it should replace both of these passes. Thanks to Nick Lewycky and Richard Smith for help authoring & debugging this, and to Jakob, Andy, Eric, Jim, and probably a few others I'm forgetting for reviewing and answering all my questions. Writing a backend pass is *sooo* much better now than it used to be. =D llvm-svn: 142641
* svn mv Target/ARM/ARMGlobalMerge.cpp Transforms/Scalar/GlobalMerge.cppDevang Patel2011-10-171-0/+1
| | | | | | There is no reason to have simple IR level pass in lib/Target. llvm-svn: 142200
* Remove the old tail duplication pass. It is not used and is unable to updateRafael Espindola2011-08-301-1/+0
| | | | | | | ssa, so it has to be run really early in the pipeline. Any replacement should probably use the SSAUpdater. llvm-svn: 138841
* Remove the LowerSetJmp pass. It wasn't used effectively by any of the targets.Bill Wendling2011-08-031-1/+0
| | | | | | This is some of my original LLVM code. *wipes tear* llvm-svn: 136821
* Rename BlockFrequency to BlockFrequencyInfo and MachineBlockFrequency toJakub Staszak2011-07-251-2/+2
| | | | | | MachineBlockFrequencyInfo. llvm-svn: 135937
* Add MachineBlockFrequency analysis.Jakub Staszak2011-07-161-0/+1
| | | | llvm-svn: 135352
* Land the long talked about "type system rewrite" patch. ThisChris Lattner2011-07-091-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | patch brings numerous advantages to LLVM. One way to look at it is through diffstat: 109 files changed, 3005 insertions(+), 5906 deletions(-) Removing almost 3K lines of code is a good thing. Other advantages include: 1. Value::getType() is a simple load that can be CSE'd, not a mutating union-find operation. 2. Types a uniqued and never move once created, defining away PATypeHolder. 3. Structs can be "named" now, and their name is part of the identity that uniques them. This means that the compiler doesn't merge them structurally which makes the IR much less confusing. 4. Now that there is no way to get a cycle in a type graph without a named struct type, "upreferences" go away. 5. Type refinement is completely gone, which should make LTO much MUCH faster in some common cases with C++ code. 6. Types are now generally immutable, so we can use "Type *" instead "const Type *" everywhere. Downsides of this patch are that it removes some functions from the C API, so people using those will have to upgrade to (not yet added) new API. "LLVM 3.0" is the right time to do this. There are still some cleanups pending after this, this patch is large enough as-is. llvm-svn: 134829
* Introduce "expect" intrinsic instructions.Jakub Staszak2011-07-061-0/+1
| | | | llvm-svn: 134516
* Remove the experimental (and unused) pre-ra splitting pass. Greedy regalloc ↵Evan Cheng2011-06-271-1/+0
| | | | | | can split live ranges. llvm-svn: 133962
* There is only one register coalescer. Merge it into the base class andRafael Espindola2011-06-261-2/+1
| | | | | | remove the analysis group. llvm-svn: 133899
* Introduce BlockFrequency analysis for BasicBlocks.Jakub Staszak2011-06-231-0/+1
| | | | llvm-svn: 133766
* Introduce MachineBranchProbabilityInfo class, which has similar API toJakub Staszak2011-06-161-0/+1
| | | | | | | | BranchProbabilityInfo (expect setEdgeWeight which is not available here). Branch Weights are kept in MachineBasicBlocks. To turn off this analysis set -use-mbpi=false. llvm-svn: 133184
* The ARC language-specific optimizer. Credit to Dan Gohman.John McCall2011-06-151-0/+4
| | | | llvm-svn: 133108
* New BranchProbabilityInfo analysis. Patch by Jakub Staszak!Andrew Trick2011-06-041-0/+1
| | | | | | | | | | | BranchProbabilityInfo provides an interface for IR passes to query the likelihood that control follows a CFG edge. This patch provides an initial implementation of static branch predication that will populate BranchProbabilityInfo for branches with no external profile information using very simple heuristics. It currently isn't hooked up to any external profile data, so static prediction does all the work. llvm-svn: 132613
* Rename LineProfiling to GCOVProfiling to more accurately represent what itNick Lewycky2011-04-161-1/+1
| | | | | | | does. Also mostly implement it. Still a work-in-progress, but generates legal output on crafted test cases. llvm-svn: 129630
* Add support for line profiling. Very work-in-progress.Nick Lewycky2011-04-121-0/+1
| | | | | | | | | | Use debug info in the IR to find the directory/file:line:col. Each time that location changes, bump a counter. Unlike the existing profiling system, we don't try to look at argv[], and thusly don't require main() to be present in the IR. This matches GCC's technique where you specify the profiling flag when producing each .o file. The runtime library is minimal, currently just calling printf at program shutdown time. The API is designed to make it possible to emit GCOV data later on. llvm-svn: 129340
* remove the StructRetPromotion pass. It is unused, not maintained andChris Lattner2011-04-111-1/+0
| | | | | | | has some bugs. If this is interesting functionality, it should be reimplemented in the argpromotion pass. llvm-svn: 129314
* remove postdom frontiers, because it is dead. Forward dom frontiers areChris Lattner2011-04-051-1/+0
| | | | | | still used by RegionInfo :( llvm-svn: 128943
* Delete the GEPSplitter experiment.Dan Gohman2011-02-281-1/+0
| | | | llvm-svn: 126671
* Delete the SimplifyHalfPowrLibCalls pass, which was unused, andDan Gohman2011-02-281-1/+0
| | | | | | only existed as the result of a misunderstanding. llvm-svn: 126669
* Delete the LiveValues pass. I won't get get back to the project itDan Gohman2011-02-281-1/+0
| | | | | | was started for in the foreseeable future. llvm-svn: 126668
* introduce a new TargetLibraryInfo pass, which transformations can use toChris Lattner2011-02-181-0/+1
| | | | | | | | query about available library functions. For now this just has memset_pattern16, which exists on darwin, but it can be extended for a bunch of other things in the future. llvm-svn: 125965
* Implementation of path profiling.Andrew Trick2011-01-291-4/+9
| | | | | | | | | | Modified patch by Adam Preuss. This builds on the existing framework for block tracing, edge profiling and optimal edge profiling. See -help-hidden for new flags. For documentation, see the technical report "Implementation of Path Profiling..." in llvm.org/pubs. llvm-svn: 124515
OpenPOWER on IntegriCloud