summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/RegisterCoalescer.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* TargetRegisterInfo: Introduce PrintLaneMask.Matthias Braun2015-09-251-13/+14
| | | | | | | This makes it more convenient to print lane masks and lead to more uniform printing. llvm-svn: 248624
* TargetRegisterInfo: Add typedef unsigned LaneBitmask and use it where ↵Matthias Braun2015-09-251-44/+45
| | | | | | apropriate; NFC llvm-svn: 248623
* LiveIntervalAnalysis: Factor common code into splitSeparateComponents; NFCMatthias Braun2015-09-221-19/+6
| | | | llvm-svn: 248241
* [PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatibleChandler Carruth2015-09-091-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | with the new pass manager, and no longer relying on analysis groups. This builds essentially a ground-up new AA infrastructure stack for LLVM. The core ideas are the same that are used throughout the new pass manager: type erased polymorphism and direct composition. The design is as follows: - FunctionAAResults is a type-erasing alias analysis results aggregation interface to walk a single query across a range of results from different alias analyses. Currently this is function-specific as we always assume that aliasing queries are *within* a function. - AAResultBase is a CRTP utility providing stub implementations of various parts of the alias analysis result concept, notably in several cases in terms of other more general parts of the interface. This can be used to implement only a narrow part of the interface rather than the entire interface. This isn't really ideal, this logic should be hoisted into FunctionAAResults as currently it will cause a significant amount of redundant work, but it faithfully models the behavior of the prior infrastructure. - All the alias analysis passes are ported to be wrapper passes for the legacy PM and new-style analysis passes for the new PM with a shared result object. In some cases (most notably CFL), this is an extremely naive approach that we should revisit when we can specialize for the new pass manager. - BasicAA has been restructured to reflect that it is much more fundamentally a function analysis because it uses dominator trees and loop info that need to be constructed for each function. All of the references to getting alias analysis results have been updated to use the new aggregation interface. All the preservation and other pass management code has been updated accordingly. The way the FunctionAAResultsWrapperPass works is to detect the available alias analyses when run, and add them to the results object. This means that we should be able to continue to respect when various passes are added to the pipeline, for example adding CFL or adding TBAA passes should just cause their results to be available and to get folded into this. The exception to this rule is BasicAA which really needs to be a function pass due to using dominator trees and loop info. As a consequence, the FunctionAAResultsWrapperPass directly depends on BasicAA and always includes it in the aggregation. This has significant implications for preserving analyses. Generally, most passes shouldn't bother preserving FunctionAAResultsWrapperPass because rebuilding the results just updates the set of known AA passes. The exception to this rule are LoopPass instances which need to preserve all the function analyses that the loop pass manager will end up needing. This means preserving both BasicAAWrapperPass and the aggregating FunctionAAResultsWrapperPass. Now, when preserving an alias analysis, you do so by directly preserving that analysis. This is only necessary for non-immutable-pass-provided alias analyses though, and there are only three of interest: BasicAA, GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is preserved when needed because it (like DominatorTree and LoopInfo) is marked as a CFG-only pass. I've expanded GlobalsAA into the preserved set everywhere we previously were preserving all of AliasAnalysis, and I've added SCEVAA in the intersection of that with where we preserve SCEV itself. One significant challenge to all of this is that the CGSCC passes were actually using the alias analysis implementations by taking advantage of a pretty amazing set of loop holes in the old pass manager's analysis management code which allowed analysis groups to slide through in many cases. Moving away from analysis groups makes this problem much more obvious. To fix it, I've leveraged the flexibility the design of the new PM components provides to just directly construct the relevant alias analyses for the relevant functions in the IPO passes that need them. This is a bit hacky, but should go away with the new pass manager, and is already in many ways cleaner than the prior state. Another significant challenge is that various facilities of the old alias analysis infrastructure just don't fit any more. The most significant of these is the alias analysis 'counter' pass. That pass relied on the ability to snoop on AA queries at different points in the analysis group chain. Instead, I'm planning to build printing functionality directly into the aggregation layer. I've not included that in this patch merely to keep it smaller. Note that all of this needs a nearly complete rewrite of the AA documentation. I'm planning to do that, but I'd like to make sure the new design settles, and to flesh out a bit more of what it looks like in the new pass manager first. Differential Revision: http://reviews.llvm.org/D12080 llvm-svn: 247167
* Fix some comment typos.Benjamin Kramer2015-08-081-3/+3
| | | | llvm-svn: 244402
* [regalloc] Make RegMask clobbers prevent merging vreg's into PhysRegs when ↵Daniel Sanders2015-07-311-0/+8
| | | | | | | | | | | | | | | | | | | | | hoisting def's upwards. Summary: This prevents vreg260 and D7 from being merged in: %vreg260<def> = LDC1 ... JAL <ga:@sin>, <regmask ... list not containing D7 ...> %D7<def> = COPY %vreg260; ... Doing so is not valid because the JAL clobbers the D7. This fixes the almabench regression in the LLVM 3.7.0 release branch. Reviewers: MatzeB Subscribers: MatzeB, qcolombet, hans, llvm-commits Differential Revision: http://reviews.llvm.org/D11649 llvm-svn: 243745
* RegisterCoalescer: Cleanup empty subranges after shrinkToUses()Matthias Braun2015-06-301-0/+1
| | | | | | | | A call to removeEmptySubranges() is necessary after every operation that potentially removes all segments from a subregister range; this case in the register coalescer was missing. llvm-svn: 241027
* Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)Alexander Kornienko2015-06-231-1/+1
| | | | | | Apparently, the style needs to be agreed upon first. llvm-svn: 240390
* Fixed/added namespace ending comments using clang-tidy. NFCAlexander Kornienko2015-06-191-1/+1
| | | | | | | | | | | | | The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137
* TargetRegisterInfo: Make the concept of imprecise lane masks explicitMatthias Braun2015-06-161-1/+2
| | | | | | | | | | | | | | | | | | | | LaneMasks as given by getSubRegIndexLaneMask() have a limited number of of bits, so for targets with more than 31 disjunct subregister there may be cases where: getSubReg(Reg,A) does not overlap getSubReg(Reg,B) but we still have (getSubRegIndexLaneMask(A) & getSubRegIndexLaneMask(B)) != 0. I had hoped to keep this an implementation detail of the tablegen but as my next commit shows we can avoid unnecessary imp-defs operands if we know that the lane masks in use are precise. This is in preparation to http://reviews.llvm.org/D10470. llvm-svn: 239837
* CodeGen: Use mop_iterator instead of MIOperands/ConstMIOperandsMatthias Braun2015-05-291-13/+13
| | | | | | | | | | | | | | | | | | | | | | | | MIOperands/ConstMIOperands are classes iterating over the MachineOperand of a MachineInstr, however MachineInstr::mop_iterator does the same thing. I assume these two iterators exist to have a uniform interface to iterate over the operands of a machine instruction bundle and a single machine instruction. However in practice I find it more confusing to have 2 different iterator classes, so this patch transforms (nearly all) the code to use mop_iterators. The only exception being MIOperands::anlayzePhysReg() and MIOperands::analyzeVirtReg() still needing an equivalent, I leave that as an exercise for the next patch. Differential Revision: http://reviews.llvm.org/D9932 This version is slightly modified from the proposed revision in that it introduces MachineInstr::getOperandNo to avoid the extra counting variable in the few loops that previously used MIOperands::getOperandNo. llvm-svn: 238539
* MachineInstr: Remove unused parameter.Matthias Braun2015-05-191-1/+1
| | | | llvm-svn: 237726
* RegisterCoalescer: Improve a comment.Matthias Braun2015-05-191-6/+5
| | | | | | | Explain the relation of the example to the variables in the code, explain what bad behaviour the code avoids in this case. llvm-svn: 237706
* [RegisterCoalescer] Make sure each live-range has only one component, asQuentin Colombet2015-05-061-4/+30
| | | | | | | | | | | | | demanded by the machine verifier. After shrinking a live-range to its uses, it is possible to create several smaller live-ranges. When this happens, shrinkToUses returns true and we need to split the different components into their own live-ranges. The problem does not reproduce on any in-tree target but Jonas Paulsson <jonas.paulsson@ericsson.com>, who reported the problem, checked that this patch fixes the issue. llvm-svn: 236658
* RegisterCoalescer: hide terminal rule option by defaultMatthias Braun2015-04-281-1/+1
| | | | llvm-svn: 236062
* RegisterCoalescer: implicit phsreg uses are fine when rematerializingMatthias Braun2015-04-241-2/+2
| | | | | | | The target hooks should have already checked them. This change is necessary to enable the remateriailzation on R600. llvm-svn: 235673
* RegisterCoalescer: Avoid unnecessary register class widening for some ↵Matthias Braun2015-04-231-3/+25
| | | | | | | | | | | rematerializations I couldn't provide a testcase as none of the public targets has wide register classes with alot of subregisters and at the same time an instruction which "ReMaterializable" and "AsCheapAsAMove" (could probably be added for R600). llvm-svn: 235668
* [RegisterCoalescer] Fix a potential misuse of direct operand index in theQuentin Colombet2015-03-301-4/+6
| | | | | | | terminal rule. Spot by code inspection. llvm-svn: 233606
* [RegisterCoalescer] Refine the terminal rule to still consider the terminalQuentin Colombet2015-03-271-8/+32
| | | | | | | | | | | | | | | nodes. When a node is terminal it is pushed at the end of the list of the copies to coalesce instead of being completely ignored. In effect, this reduces its priority over non-terminal nodes. Because of that, we do not miss the rematerialization opportunities, nor the copies that can be merged with more complex, than the terminal rule, interference checks. Related to PR22768. llvm-svn: 233395
* [RegisterCoalescer] Add a rule to consider more profitable copies first whenQuentin Colombet2015-03-261-2/+72
| | | | | | | | | | | those are in the same basic block. The previous approach was the topological order of the basic block. By default this rule is disabled. Related to PR22768. llvm-svn: 233241
* RegisterCoalescer: Fix implicit def handling in register coalescerMatthias Braun2015-03-251-1/+23
| | | | | | | | | | | | | | | | If liveranges induced by an IMPLICIT_DEF get completely covered by a proper liverange the IMPLICIT_DEF instructions and its corresponding definitions have to be removed from the live ranges. This has to happen in the subregister live ranges as well (I didn't see this case earlier because in most programs only some subregisters are covered and the IMPLCIT_DEF won't get removed). No testcase, I spent hours trying to create one for one of the public targets, but ultimately failed because I couldn't manage to properly control the placement of COPY and IMPLICIT_DEF instructions from an .ll file. llvm-svn: 233217
* Do not track subregister liveness when it brings no benefitsMatthias Braun2015-03-191-2/+2
| | | | | | | | | | | Some subregisters are only to indicate different access sizes, while not providing any way to actually divide the register up into multiple disjunct parts. Avoid tracking subregister liveness in these cases as it is not beneficial. Differential Revision: http://reviews.llvm.org/D8429 llvm-svn: 232695
* Remove useMachineScheduler and replace it with subtarget optionsEric Christopher2015-03-111-1/+1
| | | | | | | | | | | | | that control, individually, all of the disparate things it was controlling. At the same time move a FIXME in the Hexagon port to a new subtarget function that will enable a user of the machine scheduler to avoid using the source scheduler for pre-RA-scheduling. The FIXME would have this removed, but involves either testcase changes or adding -pre-RA-sched=source to a few testcases. llvm-svn: 231980
* RegisterCoalescer: Gracefully continue if subrange merging fails.Matthias Braun2015-03-041-18/+48
| | | | | | | | | | | There is a known bug where the register coalescer fails to merge subranges when multiple ranges end up in the "overflow" bit 32 of the lanemasks. A proper fix for this is complicated so for now this is a workaround which lets the register coalescer drop the subregister liveness information (we just loose some precision by that) and continue. llvm-svn: 231186
* Rename UpdateRegAllocHint to match style guidelines.Eric Christopher2015-02-241-1/+1
| | | | llvm-svn: 230357
* RegisterCoalescer: Don't rematerialize subregister definitions.Matthias Braun2015-02-161-0/+18
| | | | | | | | | | | We cannot simply rematerialize instructions which only defining a subregister, as the final value also depends on the previous instructions. This fixes test/CodeGen/R600/subreg-coalescer-bug.ll with subreg liveness enabled. llvm-svn: 229444
* RegisterCoalescer: Do not look for regclass of IMPLICIT_DEF.Matthias Braun2015-02-161-6/+7
| | | | | | | | | | | | IMPLICIT_DEF is a generic instruction and has no (fixed) output register class defined. The rematerialization code of the register coalescer should not scan the instruction description for a register class. This fixes a problem showing up in test/CodeGen/R600/subreg-coalescer-crash.ll with subregister liveness enabled. llvm-svn: 229443
* RegisterCoalescer: Improve previous fix for wrong def after.Matthias Braun2015-02-161-3/+2
| | | | | | | | | | | The previous fix in r225503 was needlessly complicated. The problem goes away as well if the arguments to MergeValueNumberInto are supplied in the correct order. This was previously missed because the existing code already had the wrong order but an additional later Merge was hiding the bug for the main liverange VNI. llvm-svn: 229424
* Update a few calls to getSubtarget<> to either be getSubtargetImplEric Christopher2015-01-271-5/+4
| | | | | | | when we didn't need the cast to the base class or the cached version off of the subtarget. llvm-svn: 227176
* MachineRegisterInfo can access TII off of the MachineFunction'sEric Christopher2015-01-271-1/+1
| | | | | | | subtarget and so doesn't need the TargetMachine or to access via getSubtargetImpl. Update all callers. llvm-svn: 227160
* LiveIntervalAnalysis: Factor out code to update liveness on vreg def removalMatthias Braun2015-01-211-32/+18
| | | | | | | | | | | This cleans up code and is more in line with the general philosophy of modifying LiveIntervals through LiveIntervalAnalysis instead of changing them directly. This also fixes a case where SplitEditor::removeBackCopies() would miss the subregister ranges. llvm-svn: 226690
* LiveIntervalAnalysis: Factor out code to update liveness on physreg def removalMatthias Braun2015-01-211-9/+5
| | | | | | | | This cleans up code and is more in line with the general philosophy of modifying LiveIntervals through LiveIntervalAnalysis instead of changing them directly. llvm-svn: 226687
* RegisterCoalescer: Cleanup and improved comment for a subtle detail.Matthias Braun2015-01-171-6/+8
| | | | llvm-svn: 226353
* RegisterCoalescer: Cleanup by factoring out a common expressionMatthias Braun2015-01-171-5/+6
| | | | llvm-svn: 226352
* RegisterCoalescer: Cleanup comment styleMatthias Braun2015-01-171-190/+174
| | | | | | | | | | | - Consistenly put comments above the function declaration, not the definition. To achieve this some duplicate comments got merged and some comment parts describing implementation details got moved into their functions. - Consistently use doxygen comments above functions. - Do not use doxygen comments inside functions. llvm-svn: 226351
* RegisterCoalescer: Drive-by typo + whitespace fixMatthias Braun2015-01-171-2/+2
| | | | llvm-svn: 226350
* Revert "r226086 - Revert "r226071 - [RegisterCoalescer] Remove copies to ↵Hal Finkel2015-01-151-5/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | reserved registers"" Reapply r226071 with fixes. Two fixes: 1. We need to manually remove the old and create the new 'deaf defs' associated with physical register definitions when we move the definition of the physical register from the copy point to the point of the original vreg def. This problem was picked up by the machinstr verifier, and could trigger a verification failure on test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll, so I've turned on the verifier in the tests. 2. When moving the def point of the phys reg up, we need to make sure that it is neither defined nor read in between the two instructions. We don't, however, extend the live ranges of phys reg defs to cover uses, so just checking for live-range overlap between the pair interval and the phys reg aliases won't pick up reads. As a result, we manually iterate over the range and check for reads. A test soon to be committed to the PowerPC backend will test this change. Original commit message: [RegisterCoalescer] Remove copies to reserved registers This allows the RegisterCoalescer to join "non-flipped" range pairs with a physical destination register -- which allows the RegisterCoalescer to remove copies like this: <vreg> = something (maybe a load, for example) ... (things that don't use PHYSREG) PHYSREG = COPY <vreg> (with all of the restrictions normally applied by the RegisterCoalescer: having compatible register classes, etc. ) Previously, the RegisterCoalescer handled only the opposite case (copying *from* a physical register). I don't handle the problem fully here, but try to get the common case where there is only one use of <vreg> (the COPY). An upcoming commit to the PowerPC backend will make this pattern much more common on PPC64/ELF systems. llvm-svn: 226200
* Revert "r226071 - [RegisterCoalescer] Remove copies to reserved registers"Hal Finkel2015-01-151-15/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reverting this while I investigate some bad behavior this is causing. As a possibly-related issue, adding -verify-machineinstrs to one of the test cases now fails because of this change: llc test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll -march=x86-64 -o - -verify-machineinstrs *** Bad machine code: No instruction at def index *** - function: foo - basic block: BB#0 return (0x10007e21f10) [0B;736B) - liverange: [128r,128d:9)[160r,160d:8)[176r,176d:7)[336r,336d:6)[464r,464d:5)[480r,480d:4)[624r,624d:3)[752r,752d:2)[768r,768d:1)[78 4r,784d:0) 0@784r 1@768r 2@752r 3@624r 4@480r 5@464r 6@336r 7@176r 8@160r 9@128r - register: %DS Valno #3 is defined at 624r *** Bad machine code: Live segment doesn't end at a valid instruction *** - function: foo - basic block: BB#0 return (0x10007e21f10) [0B;736B) - liverange: [128r,128d:9)[160r,160d:8)[176r,176d:7)[336r,336d:6)[464r,464d:5)[480r,480d:4)[624r,624d:3)[752r,752d:2)[768r,768d:1)[78 4r,784d:0) 0@784r 1@768r 2@752r 3@624r 4@480r 5@464r 6@336r 7@176r 8@160r 9@128r - register: %DS [624r,624d:3) LLVM ERROR: Found 2 machine code errors. where 624r corresponds exactly to the interval combining change: 624B %RSP<def> = COPY %vreg16; GR64:%vreg16 Considering merging %vreg16 with %RSP RHS = %vreg16 [608r,624r:0) 0@608r updated: 608B %RSP<def> = MOV64rm <fi#3>, 1, %noreg, 0, %noreg; mem:LD8[%saved_stack.1] Success: %vreg16 -> %RSP Result = %RSP llvm-svn: 226086
* [RegisterCoalescer] Remove copies to reserved registersHal Finkel2015-01-151-5/+15
| | | | | | | | | | | | | | | | | | | | | | This allows the RegisterCoalescer to join "non-flipped" range pairs with a physical destination register -- which allows the RegisterCoalescer to remove copies like this: <vreg> = something (maybe a load, for example) ... (things that don't use PHYSREG) PHYSREG = COPY <vreg> (with all of the restrictions normally applied by the RegisterCoalescer: having compatible register classes, etc. ) Previously, the RegisterCoalescer handled only the opposite case (copying *from* a physical register). I don't handle the problem fully here, but try to get the common case where there is only one use of <vreg> (the COPY). An upcoming commit to the PowerPC backend will make this pattern much more common on PPC64/ELF systems. llvm-svn: 226071
* [cleanup] Re-sort all the #include lines in LLVM usingChandler Carruth2015-01-141-1/+1
| | | | | | | | | | | utils/sort_includes.py. I clearly haven't done this in a while, so more changed than usual. This even uncovered a missing include from the InstrProf library that I've added. No functionality changed here, just mechanical cleanup of the include order. llvm-svn: 225974
* RegisterCoalescer: Turn some impossible conditions into assertsMatthias Braun2015-01-121-17/+11
| | | | | | | | This is a fixed version of reverted r225500. It fixes the too early if() continue; of the last patch and adds a comment to the unorthodox loop. llvm-svn: 225652
* Revert r225500, it leads to infinite loops.Joerg Sonnenberger2015-01-101-9/+15
| | | | llvm-svn: 225590
* RegisterCoalescer: Fix removeCopyByCommutingDef with subreg livenessMatthias Braun2015-01-091-1/+3
| | | | | | | | | The code that eliminated additional coalescable copies in removeCopyByCommutingDef() used MergeValueNumberInto() which internally may merge A into B or B into A. In this case A and B had different Def points, so we have to reset ValNo.Def to the intended one after merging. llvm-svn: 225503
* RegisterCoalescer: Some cleanup in removeCopyByCommutingDef(), NFCMatthias Braun2015-01-091-15/+19
| | | | llvm-svn: 225502
* RegisterCoalescer: No need to set kill flags, they are recompute later anywayMatthias Braun2015-01-091-2/+0
| | | | llvm-svn: 225501
* RegisterCoalescer: Turn some impossible conditions into assertsMatthias Braun2015-01-091-15/+9
| | | | llvm-svn: 225500
* RegisterCoalescer: Do not remove IMPLICIT_DEFS if they are required for ↵Matthias Braun2015-01-081-1/+7
| | | | | | | | | | | subranges. The register coalescer used to remove implicit_defs when they are covered by the main range anyway. With subreg liveness tracking we can't do that anymore in places where the IMPLICIT_DEF is required as begin of a subregister liverange. llvm-svn: 225416
* RegisterCoalescer: Fix valuesIdentical() in some subrange merge cases.Matthias Braun2015-01-071-90/+81
| | | | | | | | | | | | | I got confused and assumed SrcIdx/DstIdx of the CoalescerPair is a subregister index in SrcReg/DstReg, but they are actually subregister indices of the coalesced register that get you back to SrcReg/DstReg when applied. Fixed the bug, improved comments and simplified code accordingly. Testcase by Tom Stellard! llvm-svn: 225415
* Silence GCC's -Wparentheses warningDavid Majnemer2014-12-251-1/+1
| | | | | | No functionality change intended. llvm-svn: 224833
* RegisterCoalescer: With subrange liveness there may be no RedefVNI for ↵Matthias Braun2014-12-241-3/+6
| | | | | | unused lanes. llvm-svn: 224805
OpenPOWER on IntegriCloud