summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/RegisterCoalescer.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [RegisterCoalescer] Fix for SubRange join unreachableDavid Stuttard2017-07-061-0/+28
| | | | | | | | | | | | | | | | Summary: During remat, some subranges might end up having invalid segments which caused problems for later coalescing. Added in a check to remove segments that are invalidated as part of the remat. See http://llvm.org/PR33524 Subscribers: MatzeB, qcolombet Differential Revision: https://reviews.llvm.org/D34391 llvm-svn: 307247
* [RegisterCoalescer] Account for instructions deleted by ↵Sameer AbuAsal2017-06-301-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | removePartialredunduncy and in WorkList Summary: removePartialRedundency optimization introduces a state in the RegisterCoalescer where an instruction pointed to in the WorkList is deleted from the MBB and then removed from the ErasedList. This patch updates the ErasedList to be used globally by not erasing erased Instructions from it to solve the problem. The patch also accounts for the case where an Instruction was previously deleted and the same memory was reused by BuildMI to create a new instruction. Reviewers: kparzysz, qcolombet Reviewed By: qcolombet Subscribers: MatzeB, qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D34902 llvm-svn: 306915
* LiveIntervalAnalysis: Fix missing case in pruneSubRegValues()Matthias Braun2017-05-191-1/+7
| | | | | | | | | | pruneSubRegValues() needs to remove subregister ranges starting at instructions that later get removed by eraseInstrs(). It missed to check one case in which eraseInstrs() would remove an instruction. Fixes http://llvm.org/PR32688 llvm-svn: 303396
* Remove spurious cast of nullptr. NFC.Serge Guelton2017-05-111-1/+1
| | | | | | Conversion rules allow automatic casting of nullptr to any pointer type. llvm-svn: 302780
* RegisterCoalescer: Simplify subrange splitting code; NFCMatthias Braun2017-03-031-67/+16
| | | | | | - Use slightly better variable names / compute in a more direct way. llvm-svn: 296905
* RegisterCoalescer: Fix joinReservedPhysReg()Matthias Braun2017-02-071-0/+5
| | | | | | | | | | | joinReservedPhysReg() can only deal with a liverange in a single basic block when copying from a vreg into a physreg. See also rdar://30306405 Differential Revision: https://reviews.llvm.org/D29436 llvm-svn: 294268
* [RegisterCoalescer] Do not call getInstructionIndex with DBG_VALUEBrendon Cahoon2017-02-041-1/+1
| | | | | | | | | | | An assert occurs when calling SlotIndexes::getInstructionIndex with a DBG_VALUE instruction because the function expects an instruction with a slot index. However, there is no slot index for a DBG_VALUE instruction. Differential Revision: https://reviews.llvm.org/D29048 llvm-svn: 294070
* RegisterCoalescer: Cleanup joinReservedPhysReg(); NFCMatthias Braun2017-02-021-9/+24
| | | | | | | | - Factor out a common subexpression - Add some helpful comments - Fix printing of a register in a debug message llvm-svn: 293856
* [RegisterCoalescing] Recommit the patch "Remove partial redundent copy".Quentin Colombet2017-01-281-0/+188
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In r292621, the recommit fixes a bug related with live interval update after the partial redundent copy is moved. This recommit solves an additional bug related to the lack of update of subranges. The original patch is to solve the performance problem described in PR27827. Register coalescing sometimes cannot remove a copy because of interference. But if we can find a reverse copy in one of the predecessor block of the copy, the copy is partially redundent and we may remove the copy partially by moving it to the predecessor block without the reverse copy. Differential Revision: https://reviews.llvm.org/D28585 Re-apply r292621 Revert "Revert rL292621. Caused some internal build bot failures in apple." This reverts commit r292984. Original patch: Wei Mi <wmi@google.com> Subrange fix: Mostly Matthias Braun <matze@braunis.de> llvm-svn: 293353
* Revert rL292621. Caused some internal build bot failures in apple.Wei Mi2017-01-241-171/+0
| | | | llvm-svn: 292984
* [RegisterCoalescing] Recommit the patch "Remove partial redundent copy".Wei Mi2017-01-201-0/+171
| | | | | | | | | | | | | | | The recommit fixes a bug related with live interval update after the partial redundent copy is moved. The original patch is to solve the performance problem described in PR27827. Register coalescing sometimes cannot remove a copy because of interference. But if we can find a reverse copy in one of the predecessor block of the copy, the copy is partially redundent and we may remove the copy partially by moving it to the predecessor block without the reverse copy. Differential Revision: https://reviews.llvm.org/D28585 llvm-svn: 292621
* Revert rL292292 since it causes a SEGV on sanitizer-x86_64-linux-fuzzer ↵Wei Mi2017-01-181-171/+0
| | | | | | build bot. llvm-svn: 292327
* [RegisterCoalescing] Remove partial redundent copy.Wei Mi2017-01-171-0/+171
| | | | | | | | | | | | The patch is to solve the performance problem described in PR27827. Register coalescing sometimes cannot remove a copy because of interference. But if we can find a reverse copy in one of the predecessor block of the copy, the copy is partially redundent and we may remove the copy partially by moving it to the predecessor block without the reverse copy. Differential Revision: https://reviews.llvm.org/D28585 llvm-svn: 292292
* Check for register clobbers when merging a vreg live range with aJames Y Knight2017-01-131-8/+8
| | | | | | | | | | | reserved physreg in RegisterCoalescer. Previously, we only checked for clobbers when merging into a READ of the physreg, but not when merging from a WRITE to the physreg. Differential Revision: https://reviews.llvm.org/D28527 llvm-svn: 291942
* Implement LaneBitmask::any(), use it to replace !none(), NFCIKrzysztof Parzyszek2016-12-161-12/+12
| | | | llvm-svn: 289974
* Extract LaneBitmask into a separate typeKrzysztof Parzyszek2016-12-151-37/+38
| | | | | | | | | | | | Specifically avoid implicit conversions from/to integral types to avoid potential errors when changing the underlying type. For example, a typical initialization of a "full" mask was "LaneMask = ~0u", which would result in a value of 0x00000000FFFFFFFF if the type was extended to uint64_t. Differential Revision: https://reviews.llvm.org/D27454 llvm-svn: 289820
* RegisterCoalscer: Only coalesce complete reserved registers.Matthias Braun2016-12-011-1/+7
| | | | | | | | | | | | The coalescer eliminates copies from reserved registers of the form: %vregX = COPY %rY in the case where %rY is a reserved register. However this turns out to be invalid if only some of the subregisters are reserved (see also https://reviews.llvm.org/D26648). Differential Revision: https://reviews.llvm.org/D26687 llvm-svn: 288428
* RegisterCoalescer: Ignore interferences for constant physregsMatthias Braun2016-11-101-21/+25
| | | | | | | | | | | When copying to/from a constant register interferences can be ignored. Also update the documentation for isConstantPhysReg() to make it more obvious that this transformation is valid. Differential Revision: https://reviews.llvm.org/D26106 llvm-svn: 286503
* Missed a semicolon in r279835Krzysztof Parzyszek2016-08-261-1/+1
| | | | llvm-svn: 279836
* Add some more detailed debugging information in RegisterCoalescerKrzysztof Parzyszek2016-08-261-5/+19
| | | | llvm-svn: 279835
* Create subranges for new intervals resulting from live interval splittingKrzysztof Parzyszek2016-08-241-7/+109
| | | | | | | | | | | | | | | | | | | The register allocator can split a live interval of a register into a set of smaller intervals. After the allocation of registers is complete, the rewriter will modify the IR to replace virtual registers with the corres- ponding physical registers. At this stage, if a register corresponding to a subregister of a virtual register is used, the rewriter will check if that subregister is undefined, and if so, it will add the <undef> flag to the machine operand. The function verifying liveness of the subregis- ter would assume that it is undefined, unless any of the subranges of the live interval proves otherwise. The problem is that the live intervals created during splitting do not have any subranges, even if the original parent interval did. This could result in the <undef> flag placed on a register that is actually defined. Differential Revision: http://reviews.llvm.org/D21189 llvm-svn: 279625
* Reset "undef" flag when coalescing subregister into whole registerKrzysztof Parzyszek2016-08-191-0/+1
| | | | llvm-svn: 279344
* Replace a few more "fall through" comments with LLVM_FALLTHROUGHJustin Bogner2016-08-171-2/+2
| | | | | | Follow up to r278902. I had missed "fall through", with a space. llvm-svn: 278970
* CodeGen: Use MachineInstr& in RegisterCoalescer, NFCDuncan P. N. Exon Smith2016-07-011-38/+37
| | | | | | | Remove a few more implicit iterator to pointer conversions by preferring MachineInstr&. llvm-svn: 274363
* CodeGen: Use MachineInstr& in TargetInstrInfo, NFCDuncan P. N. Exon Smith2016-06-301-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is mostly a mechanical change to make TargetInstrInfo API take MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator) when the argument is expected to be a valid MachineInstr. This is a general API improvement. Although it would be possible to do this one function at a time, that would demand a quadratic amount of churn since many of these functions call each other. Instead I've done everything as a block and just updated what was necessary. This is mostly mechanical fixes: adding and removing `*` and `&` operators. The only non-mechanical change is to split ARMBaseInstrInfo::getOperandLatencyImpl out from ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a `MachineInstr*` which it updated to the instruction bundle leader; now, the latter calls the former either with the same `MachineInstr&` or the bundle leader. As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753. Note: I updated WebAssembly, Lanai, and AVR (despite being off-by-default) since it turned out to be easy. I couldn't run tests for AVR since llc doesn't link with it turned on. llvm-svn: 274189
* Fix typoKrzysztof Parzyszek2016-06-281-1/+1
| | | | llvm-svn: 274051
* CodeGen: Don't iterate over operands after we've erased an MIJustin Bogner2016-03-251-13/+20
| | | | | | | | | | | This fixes a use-after-free introduced 3 years ago, in r182872 ;) The code more or less worked because the memory that CopyMI was pointing to happened to still be valid, but lots of tests would crash if you ran under ASAN with the recycling allocator changes from llvm.org/PR26808 llvm-svn: 264455
* RegisterCoalescer: Remap subregister lanemasks before exchanging operandsMatthias Braun2016-03-051-1/+6
| | | | | | | | | | Rematerializing and merging into a bigger register class at the same time, requires the subregister range lanemasks getting remapped to the new register class. This fixes http://llvm.org/PR26805 llvm-svn: 262768
* RegisterCoalescer: Need to check DstReg+SrcReg for missing undef flagsMatthias Braun2016-03-051-20/+50
| | | | | | | | copy coalescing with enabled subregister liveness can reveal undef uses, previously this was only checked for the SrcReg in updateRegDefsUses() but we need to check DstReg as well. llvm-svn: 262767
* CodeGen: Take MachineInstr& in SlotIndexes and LiveIntervals, NFCDuncan P. N. Exon Smith2016-02-271-26/+26
| | | | | | | | | | | | | | Take MachineInstr by reference instead of by pointer in SlotIndexes and the SlotIndex wrappers in LiveIntervals. The MachineInstrs here are never null, so this cleans up the API a bit. It also incidentally removes a few implicit conversions from MachineInstrBundleIterator to MachineInstr* (see PR26753). At a couple of call sites it was convenient to convert to a range-based for loop over MachineBasicBlock::instr_begin/instr_end, so I added MachineBasicBlock::instrs. llvm-svn: 262115
* RegCoalescer: Making sure re-materialization defines all subrangesMarcello Maggioni2016-02-031-0/+30
| | | | | | | | | | | | | The register coalescer can rematerialize constants that define more of a register than the copy it is going to replace was going to do. This is valid in the case the register was undef before the copy happened. This patch makes sure that all the subranges defined by the new rematerialization instructions have at least a dead def. Review: http://reviews.llvm.org/D16693 llvm-svn: 259614
* [RegisterCoalescer] Better DebugLoc for reMaterializeTrivialDefDavid Majnemer2016-02-021-0/+2
| | | | | | | | | | When rematerializing a computation by replacing the copy, use the copy's location. The location of the copy is more representative of the original program. This partially fixes PR10003. llvm-svn: 259469
* [NFC] Fix whitespace.Chad Rosier2016-01-111-1/+1
| | | | llvm-svn: 257365
* Assume lane masks are always preciseMatthias Braun2015-11-171-40/+17
| | | | | | | | | | | | | | | Allowing imprecise lane masks in case of more than 32 sub register lanes lead to some tricky corner cases, and I need another bugfix for another one. Instead I rather declare lane masks as precise and let tablegen abort if we do not have enough bits. This does not affect any in-tree target, even AMDGPU only needs 16 lanes at the moment. If the 32 lanes turn out to be a problem in the future, then we can easily change the LaneBitmask typedef to uint64_t. Differential Revision: http://reviews.llvm.org/D14557 llvm-svn: 253279
* TableGen: Emit LaneMask for register classes without subregisters as ~0uMatthias Braun2015-11-101-10/+13
| | | | | | | This makes it slightly easier to handle classes with and without subregister uniformly. llvm-svn: 252671
* CodeGen: Avoid more ilist iterator implicit conversions, NFCDuncan P. N. Exon Smith2015-10-091-1/+1
| | | | llvm-svn: 249903
* Improved the interface of methods commuting operands, improved X86-FMA3 ↵Andrew Kaylor2015-09-281-9/+14
| | | | | | | | | | mem-folding&coalescing. Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com) Differential Revision: http://reviews.llvm.org/D11370 llvm-svn: 248735
* TargetRegisterInfo: Introduce PrintLaneMask.Matthias Braun2015-09-251-13/+14
| | | | | | | This makes it more convenient to print lane masks and lead to more uniform printing. llvm-svn: 248624
* TargetRegisterInfo: Add typedef unsigned LaneBitmask and use it where ↵Matthias Braun2015-09-251-44/+45
| | | | | | apropriate; NFC llvm-svn: 248623
* LiveIntervalAnalysis: Factor common code into splitSeparateComponents; NFCMatthias Braun2015-09-221-19/+6
| | | | llvm-svn: 248241
* [PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatibleChandler Carruth2015-09-091-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | with the new pass manager, and no longer relying on analysis groups. This builds essentially a ground-up new AA infrastructure stack for LLVM. The core ideas are the same that are used throughout the new pass manager: type erased polymorphism and direct composition. The design is as follows: - FunctionAAResults is a type-erasing alias analysis results aggregation interface to walk a single query across a range of results from different alias analyses. Currently this is function-specific as we always assume that aliasing queries are *within* a function. - AAResultBase is a CRTP utility providing stub implementations of various parts of the alias analysis result concept, notably in several cases in terms of other more general parts of the interface. This can be used to implement only a narrow part of the interface rather than the entire interface. This isn't really ideal, this logic should be hoisted into FunctionAAResults as currently it will cause a significant amount of redundant work, but it faithfully models the behavior of the prior infrastructure. - All the alias analysis passes are ported to be wrapper passes for the legacy PM and new-style analysis passes for the new PM with a shared result object. In some cases (most notably CFL), this is an extremely naive approach that we should revisit when we can specialize for the new pass manager. - BasicAA has been restructured to reflect that it is much more fundamentally a function analysis because it uses dominator trees and loop info that need to be constructed for each function. All of the references to getting alias analysis results have been updated to use the new aggregation interface. All the preservation and other pass management code has been updated accordingly. The way the FunctionAAResultsWrapperPass works is to detect the available alias analyses when run, and add them to the results object. This means that we should be able to continue to respect when various passes are added to the pipeline, for example adding CFL or adding TBAA passes should just cause their results to be available and to get folded into this. The exception to this rule is BasicAA which really needs to be a function pass due to using dominator trees and loop info. As a consequence, the FunctionAAResultsWrapperPass directly depends on BasicAA and always includes it in the aggregation. This has significant implications for preserving analyses. Generally, most passes shouldn't bother preserving FunctionAAResultsWrapperPass because rebuilding the results just updates the set of known AA passes. The exception to this rule are LoopPass instances which need to preserve all the function analyses that the loop pass manager will end up needing. This means preserving both BasicAAWrapperPass and the aggregating FunctionAAResultsWrapperPass. Now, when preserving an alias analysis, you do so by directly preserving that analysis. This is only necessary for non-immutable-pass-provided alias analyses though, and there are only three of interest: BasicAA, GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is preserved when needed because it (like DominatorTree and LoopInfo) is marked as a CFG-only pass. I've expanded GlobalsAA into the preserved set everywhere we previously were preserving all of AliasAnalysis, and I've added SCEVAA in the intersection of that with where we preserve SCEV itself. One significant challenge to all of this is that the CGSCC passes were actually using the alias analysis implementations by taking advantage of a pretty amazing set of loop holes in the old pass manager's analysis management code which allowed analysis groups to slide through in many cases. Moving away from analysis groups makes this problem much more obvious. To fix it, I've leveraged the flexibility the design of the new PM components provides to just directly construct the relevant alias analyses for the relevant functions in the IPO passes that need them. This is a bit hacky, but should go away with the new pass manager, and is already in many ways cleaner than the prior state. Another significant challenge is that various facilities of the old alias analysis infrastructure just don't fit any more. The most significant of these is the alias analysis 'counter' pass. That pass relied on the ability to snoop on AA queries at different points in the analysis group chain. Instead, I'm planning to build printing functionality directly into the aggregation layer. I've not included that in this patch merely to keep it smaller. Note that all of this needs a nearly complete rewrite of the AA documentation. I'm planning to do that, but I'd like to make sure the new design settles, and to flesh out a bit more of what it looks like in the new pass manager first. Differential Revision: http://reviews.llvm.org/D12080 llvm-svn: 247167
* Fix some comment typos.Benjamin Kramer2015-08-081-3/+3
| | | | llvm-svn: 244402
* [regalloc] Make RegMask clobbers prevent merging vreg's into PhysRegs when ↵Daniel Sanders2015-07-311-0/+8
| | | | | | | | | | | | | | | | | | | | | hoisting def's upwards. Summary: This prevents vreg260 and D7 from being merged in: %vreg260<def> = LDC1 ... JAL <ga:@sin>, <regmask ... list not containing D7 ...> %D7<def> = COPY %vreg260; ... Doing so is not valid because the JAL clobbers the D7. This fixes the almabench regression in the LLVM 3.7.0 release branch. Reviewers: MatzeB Subscribers: MatzeB, qcolombet, hans, llvm-commits Differential Revision: http://reviews.llvm.org/D11649 llvm-svn: 243745
* RegisterCoalescer: Cleanup empty subranges after shrinkToUses()Matthias Braun2015-06-301-0/+1
| | | | | | | | A call to removeEmptySubranges() is necessary after every operation that potentially removes all segments from a subregister range; this case in the register coalescer was missing. llvm-svn: 241027
* Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)Alexander Kornienko2015-06-231-1/+1
| | | | | | Apparently, the style needs to be agreed upon first. llvm-svn: 240390
* Fixed/added namespace ending comments using clang-tidy. NFCAlexander Kornienko2015-06-191-1/+1
| | | | | | | | | | | | | The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137
* TargetRegisterInfo: Make the concept of imprecise lane masks explicitMatthias Braun2015-06-161-1/+2
| | | | | | | | | | | | | | | | | | | | LaneMasks as given by getSubRegIndexLaneMask() have a limited number of of bits, so for targets with more than 31 disjunct subregister there may be cases where: getSubReg(Reg,A) does not overlap getSubReg(Reg,B) but we still have (getSubRegIndexLaneMask(A) & getSubRegIndexLaneMask(B)) != 0. I had hoped to keep this an implementation detail of the tablegen but as my next commit shows we can avoid unnecessary imp-defs operands if we know that the lane masks in use are precise. This is in preparation to http://reviews.llvm.org/D10470. llvm-svn: 239837
* CodeGen: Use mop_iterator instead of MIOperands/ConstMIOperandsMatthias Braun2015-05-291-13/+13
| | | | | | | | | | | | | | | | | | | | | | | | MIOperands/ConstMIOperands are classes iterating over the MachineOperand of a MachineInstr, however MachineInstr::mop_iterator does the same thing. I assume these two iterators exist to have a uniform interface to iterate over the operands of a machine instruction bundle and a single machine instruction. However in practice I find it more confusing to have 2 different iterator classes, so this patch transforms (nearly all) the code to use mop_iterators. The only exception being MIOperands::anlayzePhysReg() and MIOperands::analyzeVirtReg() still needing an equivalent, I leave that as an exercise for the next patch. Differential Revision: http://reviews.llvm.org/D9932 This version is slightly modified from the proposed revision in that it introduces MachineInstr::getOperandNo to avoid the extra counting variable in the few loops that previously used MIOperands::getOperandNo. llvm-svn: 238539
* MachineInstr: Remove unused parameter.Matthias Braun2015-05-191-1/+1
| | | | llvm-svn: 237726
* RegisterCoalescer: Improve a comment.Matthias Braun2015-05-191-6/+5
| | | | | | | Explain the relation of the example to the variables in the code, explain what bad behaviour the code avoids in this case. llvm-svn: 237706
OpenPOWER on IntegriCloud