summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/MachineCSE.cpp
Commit message (Collapse)AuthorAgeFilesLines
* rangify; NFCISanjay Patel2016-01-061-24/+14
| | | | llvm-svn: 256891
* [PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatibleChandler Carruth2015-09-091-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | with the new pass manager, and no longer relying on analysis groups. This builds essentially a ground-up new AA infrastructure stack for LLVM. The core ideas are the same that are used throughout the new pass manager: type erased polymorphism and direct composition. The design is as follows: - FunctionAAResults is a type-erasing alias analysis results aggregation interface to walk a single query across a range of results from different alias analyses. Currently this is function-specific as we always assume that aliasing queries are *within* a function. - AAResultBase is a CRTP utility providing stub implementations of various parts of the alias analysis result concept, notably in several cases in terms of other more general parts of the interface. This can be used to implement only a narrow part of the interface rather than the entire interface. This isn't really ideal, this logic should be hoisted into FunctionAAResults as currently it will cause a significant amount of redundant work, but it faithfully models the behavior of the prior infrastructure. - All the alias analysis passes are ported to be wrapper passes for the legacy PM and new-style analysis passes for the new PM with a shared result object. In some cases (most notably CFL), this is an extremely naive approach that we should revisit when we can specialize for the new pass manager. - BasicAA has been restructured to reflect that it is much more fundamentally a function analysis because it uses dominator trees and loop info that need to be constructed for each function. All of the references to getting alias analysis results have been updated to use the new aggregation interface. All the preservation and other pass management code has been updated accordingly. The way the FunctionAAResultsWrapperPass works is to detect the available alias analyses when run, and add them to the results object. This means that we should be able to continue to respect when various passes are added to the pipeline, for example adding CFL or adding TBAA passes should just cause their results to be available and to get folded into this. The exception to this rule is BasicAA which really needs to be a function pass due to using dominator trees and loop info. As a consequence, the FunctionAAResultsWrapperPass directly depends on BasicAA and always includes it in the aggregation. This has significant implications for preserving analyses. Generally, most passes shouldn't bother preserving FunctionAAResultsWrapperPass because rebuilding the results just updates the set of known AA passes. The exception to this rule are LoopPass instances which need to preserve all the function analyses that the loop pass manager will end up needing. This means preserving both BasicAAWrapperPass and the aggregating FunctionAAResultsWrapperPass. Now, when preserving an alias analysis, you do so by directly preserving that analysis. This is only necessary for non-immutable-pass-provided alias analyses though, and there are only three of interest: BasicAA, GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is preserved when needed because it (like DominatorTree and LoopInfo) is marked as a CFG-only pass. I've expanded GlobalsAA into the preserved set everywhere we previously were preserving all of AliasAnalysis, and I've added SCEVAA in the intersection of that with where we preserve SCEV itself. One significant challenge to all of this is that the CGSCC passes were actually using the alias analysis implementations by taking advantage of a pretty amazing set of loop holes in the old pass manager's analysis management code which allowed analysis groups to slide through in many cases. Moving away from analysis groups makes this problem much more obvious. To fix it, I've leveraged the flexibility the design of the new PM components provides to just directly construct the relevant alias analyses for the relevant functions in the IPO passes that need them. This is a bit hacky, but should go away with the new pass manager, and is already in many ways cleaner than the prior state. Another significant challenge is that various facilities of the old alias analysis infrastructure just don't fit any more. The most significant of these is the alias analysis 'counter' pass. That pass relied on the ability to snoop on AA queries at different points in the analysis group chain. Instead, I'm planning to build printing functionality directly into the aggregation layer. I've not included that in this patch merely to keep it smaller. Note that all of this needs a nearly complete rewrite of the AA documentation. I'm planning to do that, but I'd like to make sure the new design settles, and to flesh out a bit more of what it looks like in the new pass manager first. Differential Revision: http://reviews.llvm.org/D12080 llvm-svn: 247167
* MachineCSE: Add a target query for the LookAheadLimit heurisiticTom Stellard2015-05-091-2/+3
| | | | | | | | | This is used to determine whether or not to CSE physical register defs. Differential Revision: http://reviews.llvm.org/D9472 llvm-svn: 236923
* Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used.Benjamin Kramer2015-03-231-0/+1
| | | | llvm-svn: 232998
* MachineCSE: Clear dead-def flag on CSE.Matthias Braun2015-02-041-2/+9
| | | | | | | | | | | In case CSE reuses a previoulsy unused register the dead-def flag has to be cleared on the def operand, as exposed by the arm64-cse.ll test. This fixes PR22439 and the corresponding rdar://19694987 Differential Revision: http://reviews.llvm.org/D7395 llvm-svn: 228178
* [MachineCSE] Clear kill-flag on registers imp-def'd by the CSE'd instruction.Ahmed Bougacha2014-12-021-0/+31
| | | | | | | | | | | | | | | | | | | Go through implicit defs of CSMI and MI, and clear the kill flags on their uses in all the instructions between CSMI and MI. We might have made some of the kill flags redundant, consider: subs ... %NZCV<imp-def> <- CSMI csinc ... %NZCV<imp-use,kill> <- this kill flag isn't valid anymore subs ... %NZCV<imp-def> <- MI, to be eliminated csinc ... %NZCV<imp-use,kill> Since we eliminated MI, and reused a register imp-def'd by CSMI (here %NZCV), that register, if it was killed before MI, should have that kill flag removed, because it's lifetime was extended. Also, add an exhaustive testcase for the motivating example. Reviewed by: Juergen Ributzka <juergen@apple.com> llvm-svn: 223133
* In Machine CSE pass, the source register of a COPY machine instruction canJiangning Liu2014-08-111-11/+19
| | | | | | | | be propagated to all its users, and this propagation could increase the probability of finding common subexpressions. If the COPY has only one user, the COPY itself can be removed. llvm-svn: 215344
* Have MachineFunction cache a pointer to the subtarget to make lookupsEric Christopher2014-08-051-2/+2
| | | | | | | | | | | shorter/easier and have the DAG use that to do the same lookup. This can be used in the future for TargetMachine based caching lookups from the MachineFunction easily. Update the MIPS subtarget switching machinery to update this pointer at the same time it runs. llvm-svn: 214838
* Remove the TargetMachine forwards for TargetSubtargetInfo basedEric Christopher2014-08-041-2/+3
| | | | | | information and update all callers. No functional change. llvm-svn: 214781
* Add TargetInstrInfo interface isAsCheapAsAMove.Jiangning Liu2014-07-291-1/+1
| | | | llvm-svn: 214158
* [Modules] Remove potential ODR violations by sinking the DEBUG_TYPEChandler Carruth2014-04-221-1/+2
| | | | | | | | | | | | define below all header includes in the lib/CodeGen/... tree. While the current modules implementation doesn't check for this kind of ODR violation yet, it is likely to grow support for it in the future. It also removes one layer of macro pollution across all the included headers. Other sub-trees will follow. llvm-svn: 206837
* Disable each MachineFunctionPass for 'optnone' functions, unless thatPaul Robinson2014-03-311-0/+3
| | | | | | | pass normally runs at optimization level None, or is part of the register allocation pipeline. llvm-svn: 205228
* Switch a number of loops in lib/CodeGen over to range-based for-loops, now thatOwen Anderson2014-03-171-17/+9
| | | | | | the MachineRegisterInfo iterators are compatible with it. llvm-svn: 204075
* Phase 2 of the great MachineRegisterInfo cleanup. This time, we're changingOwen Anderson2014-03-131-17/+17
| | | | | | | | | | operator* on the by-operand iterators to return a MachineOperand& rather than a MachineInstr&. At this point they almost behave like normal iterators! Again, this requires making some existing loops more verbose, but should pave the way for the big range-based for-loop cleanups in the future. llvm-svn: 203865
* [C++11] Add 'override' keyword to virtual methods that override their base ↵Craig Topper2014-03-071-3/+3
| | | | | | class. llvm-svn: 203220
* Replace PROLOG_LABEL with a new CFI_INSTRUCTION.Rafael Espindola2014-03-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | The old system was fairly convoluted: * A temporary label was created. * A single PROLOG_LABEL was created with it. * A few MCCFIInstructions were created with the same label. The semantics were that the cfi instructions were mapped to the PROLOG_LABEL via the temporary label. The output position was that of the PROLOG_LABEL. The temporary label itself was used only for doing the mapping. The new CFI_INSTRUCTION has a 1:1 mapping to MCCFIInstructions and points to one by holding an index into the CFI instructions of this function. I did consider removing MMI.getFrameInstructions completelly and having CFI_INSTRUCTION own a MCCFIInstruction, but MCCFIInstructions have non trivial constructors and destructors and are somewhat big, so the this setup is probably better. The net result is that we don't create temporary labels that are never used. llvm-svn: 203204
* [C++11] Replace llvm::next and llvm::prior with std::next and std::prev.Benjamin Kramer2014-03-021-2/+2
| | | | | | Remove the old functions. llvm-svn: 202636
* Disabled subregister copy coalescing during MachineCSE.Andrew Trick2013-12-171-5/+15
| | | | | | | | | This effectively backs out r197465 but leaves some of the general fixes in place. Not all targets are ready to handle this feature. To enable it, some infrastructure work is needed to better handle register class constraints. llvm-svn: 197514
* Allow MachineCSE to coalesce trivial subregister copies the same way that it ↵Andrew Trick2013-12-171-3/+8
| | | | | | | | | | | | | | | | | | | | coalesces normal copies. Without this, MachineCSE is powerless to handle redundant operations with truncated source operands. This required fixing the 2-addr pass to handle tied subregisters. It isn't clear what combinations of subregisters can legally be tied, but the simple case of truncated source operands is now safely handled: %vreg11<def> = COPY %vreg1:sub_32bit; GR32:%vreg11 GR64:%vreg1 %vreg12<def> = COPY %vreg2:sub_32bit; GR32:%vreg12 GR64:%vreg2 %vreg13<def,tied1> = ADD32rr %vreg11<tied0>, %vreg12<kill>, %EFLAGS<imp-def> Test case: cse-add-with-overflow.ll. This exposed an existing bug in PPCInstrInfo::commuteInstruction. Thanks to Rafael for the test case: PowerPC/crash.ll. llvm-svn: 197465
* Revert "Allow MachineCSE to coalesce trivial subregister copies the same way ↵Rafael Espindola2013-12-161-8/+3
| | | | | | | | | | that it coalesces normal copies." This reverts commit r197414. It broke the ppc64 bootstrap. I will post a testcase in a sec. llvm-svn: 197424
* Allow MachineCSE to coalesce trivial subregister copies the same wayAndrew Trick2013-12-161-3/+8
| | | | | | | | | | | | | | | | | that it coalesces normal copies. Without this, MachineCSE is powerless to handle redundant operations with truncated source operands. This required fixing the 2-addr pass to handle tied subregisters. It isn't clear what combinations of subregisters can legally be tied, but the simple case of truncated source operands is now safely handled: %vreg11<def> = COPY %vreg1:sub_32bit; GR32:%vreg11 GR64:%vreg1 %vreg12<def> = COPY %vreg2:sub_32bit; GR32:%vreg12 GR64:%vreg2 %vreg13<def,tied1> = ADD32rr %vreg11<tied0>, %vreg12<kill>, %EFLAGS<imp-def> llvm-svn: 197414
* whitespaceAndrew Trick2013-12-161-1/+1
| | | | llvm-svn: 197413
* Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector ↵Craig Topper2013-07-141-4/+4
| | | | | | size. llvm-svn: 186274
* Use the new script to sort the includes of every file under lib.Chandler Carruth2012-12-031-5/+5
| | | | | | | | | | | | | | | | | Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131
* CSE: allow PerformTrivialCoalescing to check copies across basic blockManman Ren2012-11-271-2/+0
| | | | | | | | | | | | | | | | boundaries. Given the following case: BB0 %vreg1<def> = SUBrr %vreg0, %vreg7 %vreg2<def> = COPY %vreg7 BB1 %vreg10<def> = SUBrr %vreg0, %vreg2 We should be able to CSE between SUBrr in BB0 and SUBrr in BB1. rdar://12462006 llvm-svn: 168717
* Don't use iterator after being erased.Jakub Staszak2012-11-261-1/+1
| | | | llvm-svn: 168622
* Do not consider a machine instruction that uses and defines the sameUlrich Weigand2012-11-131-16/+44
| | | | | | | | | | physical register as candidate for common subexpression elimination in MachineCSE. This fixes a bug on PowerPC in MultiSource/Applications/oggenc/oggenc caused by MachineCSE invalidly merging two separate DYNALLOC insns. llvm-svn: 167855
* Remove unused BitVectors from getAllocatableSet().Jakob Stoklund Olesen2012-10-161-3/+0
| | | | llvm-svn: 165999
* Switch most getReservedRegs() clients to the MRI equivalent.Jakob Stoklund Olesen2012-10-151-4/+1
| | | | | | | Using the cached bit vector in MRI avoids comstantly allocating and recomputing the reserved register bit vector. llvm-svn: 165983
* MachineCSE: Hoist isConstantPhysReg out of the loop, it checks for overlaps ↵Benjamin Kramer2012-08-111-4/+3
| | | | | | already. llvm-svn: 161729
* PR13578: Teach MachineCSE that instructions that use a constant register can ↵Benjamin Kramer2012-08-111-2/+5
| | | | | | | | be CSE'd safely. This is common e.g. when doing rip-relative addressing on x86_64. llvm-svn: 161728
* X86: enable CSE between CMP and SUBManman Ren2012-08-081-2/+18
| | | | | | | | | | | | | | We perform the following: 1> Use SUB instead of CMP for i8,i16,i32 and i64 in ISel lowering. 2> Modify MachineCSE to correctly handle implicit defs. 3> Convert SUB back to CMP if possible at peephole. Removed pattern matching of (a>b) ? (a-b):0 and like, since they are handled by peephole now. rdar://11873276 llvm-svn: 161462
* MachineCSE: Update the heuristics for isProfitableToCSE.Manman Ren2012-08-071-0/+23
| | | | | | | | | If the result of a common subexpression is used at all uses of the candidate expression, CSE should not increase the live range of the common subexpression. rdar://11393714 and rdar://11819721 llvm-svn: 161396
* Remove tabs.Bill Wendling2012-07-191-1/+1
| | | | llvm-svn: 160475
* Remove ParentMap. You can just ask the domnode for its parent. No functionalityNick Lewycky2012-07-051-11/+8
| | | | | | | | | | change. Move the "Not profitable, avoid CSE!" debug message next to where we fail the check for profitability and use a different message for avoiding CSE due to being in different register classes. llvm-svn: 159729
* Switch some getAliasSet clients to MCRegAliasIterator.Jakob Stoklund Olesen2012-06-011-3/+2
| | | | | | | MCRegAliasIterator can optionally visit the register itself, allowing for simpler code. llvm-svn: 157837
* Use uint16_t to store register overlaps to reduce static data.Craig Topper2012-03-041-1/+1
| | | | llvm-svn: 152001
* Handle regmasks in MachineCSE.Jakob Stoklund Olesen2012-02-281-0/+6
| | | | | | | | Don't attempt to extend physreg live ranges across calls. <rdar://problem/10942095> llvm-svn: 151610
* Re-enable 150652 and 150654 - Make FPSCR non-reserved, and make MachineCSE ↵Lang Hames2012-02-171-3/+9
| | | | | | bail on reserved registers. This *should* be safe as of r150786. llvm-svn: 150769
* Oop - r150653 + r150654 broke one of my test cases. Backing out for now...Lang Hames2012-02-161-9/+3
| | | | llvm-svn: 150655
* MachineCSE shouldn't extend the live ranges of reserved or allocatable ↵Lang Hames2012-02-161-3/+9
| | | | | | registers. llvm-svn: 150653
* Codegen pass definition cleanup. No functionality.Andrew Trick2012-02-081-2/+1
| | | | | | | | | | | | | Moving toward a uniform style of pass definition to allow easier target configuration. Globally declare Pass ID. Globally declare pass initializer. Use INITIALIZE_PASS consistently. Add a call to the initializer from CodeGen.cpp. Remove redundant "createPass" functions and "getPassName" methods. While cleaning up declarations, cleaned up comments (sorry for large diff). llvm-svn: 150100
* whitespaceAndrew Trick2012-02-081-2/+2
| | | | llvm-svn: 150094
* Persuade GCC that there is nothing worth warning about here (there isn't).Duncan Sands2012-02-051-0/+1
| | | | llvm-svn: 149834
* Avoid CSE of instructions which define physical registers across MBBs unlessEvan Cheng2012-01-111-4/+12
| | | | | | the physical registers are not allocatable. llvm-svn: 147902
* Allow machine-cse to look across MBB boundary when cse'ing instructions thatEvan Cheng2012-01-101-15/+54
| | | | | | | | | | define physical registers. It's currently very restrictive, only catching cases where the CE is in an immediate (and only) predecessor. But it catches a surprising large number of cases. rdar://10660865 llvm-svn: 147827
* Add bundle aware API for querying instruction properties and switch the codeEvan Cheng2011-12-071-5/+4
| | | | | | | | | | | | | | generator to it. For non-bundle instructions, these behave exactly the same as the MC layer API. For properties like mayLoad / mayStore, look into the bundle and if any of the bundled instructions has the property it would return true. For properties like isPredicable, only return true if *all* of the bundled instructions have the property. For properties like canFoldAsLoad, isCompare, conservatively return false for bundles. llvm-svn: 146026
* We need to verify that the machine instruction we're using as a replacement forBill Wendling2011-10-121-0/+11
| | | | | | | | | | our current machine instruction defines a register with the same register class as what's being replaced. This showed up in the SPEC 403.gcc benchmark, where it would ICE because a tail call was expecting one register class but was given another. (The machine instruction verifier catches this situation.) <rdar://problem/10270968> llvm-svn: 141830
* - Rename TargetInstrDesc, TargetOperandInfo to MCInstrDesc and MCOperandInfo andEvan Cheng2011-06-281-3/+3
| | | | | | | | sink them into MC layer. - Added MCInstrInfo, which captures the tablegen generated static data. Chang TargetInstrInfo so it's based off MCInstrInfo. llvm-svn: 134021
* Re-revert r130877; it's apparently causing a regression on 197.parser,Eli Friedman2011-05-061-50/+27
| | | | | | possibly related to cbnz formation. llvm-svn: 130977
OpenPOWER on IntegriCloud