|  | Commit message (Collapse) | Author | Age | Files | Lines | 
|---|
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | entries as there are basic blocks in the function.  LiveVariables::getVarInfo
creates a VarInfo struct for every register in the function, leading to
quadratic space use.  This patch changes the BitVector to a SparseBitVector,
which doesn't help the worst-case memory use but does reduce the actual use in
very long functions with short-lived variables.
llvm-svn: 72426 | 
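The space trade-off in this message can be illustrated with a small stand-alone sketch. It does not use LLVM's `BitVector` or `SparseBitVector`; `denseSlots` and `sparseSlots` are hypothetical helpers that only count how many per-block entries each representation would hold, assuming every register is alive in `LivePerReg` consecutive blocks.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helpers for illustration only; these are not LLVM APIs.
// Precondition for both: NumBlocks > 0.

// Dense: every register gets one bit per basic block, allocated up front.
std::size_t denseSlots(std::size_t NumRegs, std::size_t NumBlocks,
                       std::size_t LivePerReg) {
  std::size_t Total = 0;
  for (std::size_t R = 0; R < NumRegs; ++R) {
    std::vector<bool> Alive(NumBlocks, false);
    for (std::size_t B = 0; B < LivePerReg; ++B)
      Alive[(R + B) % NumBlocks] = true;
    Total += Alive.size(); // cost is NumBlocks no matter how few bits are set
  }
  return Total;
}

// Sparse: every register stores only the blocks it is actually alive in.
std::size_t sparseSlots(std::size_t NumRegs, std::size_t NumBlocks,
                        std::size_t LivePerReg) {
  std::size_t Total = 0;
  for (std::size_t R = 0; R < NumRegs; ++R) {
    std::set<std::size_t> Alive;
    for (std::size_t B = 0; B < LivePerReg; ++B)
      Alive.insert((R + B) % NumBlocks);
    Total += Alive.size(); // cost tracks the number of set entries
  }
  return Total;
}
```

With 1000 registers, 1000 blocks, and 2 live blocks per register, the dense count is 1000000 and the sparse count is 2000. When every register is alive in every block, the two counts are equal, which matches the note that the worst case is unchanged.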
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | not utilizing registers at all. The fundamental problem is that linearscan's backtracking can end up freeing more than one allocated register. However, reloads and restores might be folded into uses / defs, and the freed registers might not be used at all.
VirtRegMap keeps track of allocations so it knows what's not used. As a horrible hack, the stack coloring can color spill slots with *free* registers. That is, it replaces reloads and spills with copies from and to the free register. It unfolds instructions that load and store the spill slot and replaces them with register-using variants.
Not yet enabled. This is part 1. More coming.
llvm-svn: 70787 | 
| | 
| 
| 
| 
| 
| | two-address update.
llvm-svn: 70245 | 
| | 
| 
| 
| 
| 
| | use is deleted by two-address pass.
llvm-svn: 70213 | 
| | 
| 
| 
| 
| 
| | This fixes a very subtle bug. A vr defined by an implicit_def is allowed to overlap with any register since it doesn't actually modify anything. However, if it's used as a two-address use, its live range can be extended and it can be spilled. The spiller must take care not to emit a reload for the value number that's defined by the implicit_def. This is both a correctness and performance issue.
llvm-svn: 69743 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | allocator spills an interval with multiple uses in the same basic block, it creates a different virtual register for each of the reloads. e.g.
        %reg1498<def> = MOV32rm %reg1024, 1, %reg0, 12, %reg0, Mem:LD(4,4) [sunkaddr39 + 0]
        %reg1506<def> = MOV32rm %reg1024, 1, %reg0, 8, %reg0, Mem:LD(4,4) [sunkaddr42 + 0]
        %reg1486<def> = MOV32rr %reg1506
        %reg1486<def> = XOR32rr %reg1486, %reg1498, %EFLAGS<imp-def,dead>
        %reg1510<def> = MOV32rm %reg1024, 1, %reg0, 4, %reg0, Mem:LD(4,4) [sunkaddr45 + 0]
=>
        %reg1498<def> = MOV32rm %reg2036, 1, %reg0, 12, %reg0, Mem:LD(4,4) [sunkaddr39 + 0]
        %reg1506<def> = MOV32rm %reg2037, 1, %reg0, 8, %reg0, Mem:LD(4,4) [sunkaddr42 + 0]
        %reg1486<def> = MOV32rr %reg1506
        %reg1486<def> = XOR32rr %reg1486, %reg1498, %EFLAGS<imp-def,dead>
        %reg1510<def> = MOV32rm %reg2038, 1, %reg0, 4, %reg0, Mem:LD(4,4) [sunkaddr45 + 0]
From linearscan's point of view, reg2036, reg2037, and reg2038 are separate registers, each "killed" after a single use. The reloaded register is available and is often clobbered right away. e.g. In this case reg1498 is allocated EAX while reg2036 is allocated RAX. This means we end up with multiple reloads from the same stack slot in the same basic block.
Now linearscan recognizes that there are other reloads from the same SS in the same BB, so it will "downgrade" RAX (and its aliases) after reg2036 is allocated until the next reload (reg2037) is done. This greatly increases the likelihood that reloads from the SS are reused.
This speeds up sha1 from OpenSSL by 5.8%. It is also an across-the-board win for SPEC2000 and 2006.
llvm-svn: 69585 | 
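The "downgrade" idea above can be sketched as a selection rule over free registers. This is a loose model only, not the linearscan implementation: `pickFreeReg` and the `Downgraded` map are hypothetical names, and the sketch assumes a downgraded register is still usable when nothing better is free.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical model of the heuristic; the names are illustrative, not LLVM's.
// Downgraded maps a physical register to the number of pending reloads that
// would like to find the stack slot's value still sitting in that register.
std::string pickFreeReg(const std::vector<std::string> &FreeRegs,
                        const std::map<std::string, unsigned> &Downgraded) {
  std::string Best;
  unsigned BestCount = 0;
  for (const std::string &Reg : FreeRegs) {
    auto It = Downgraded.find(Reg);
    unsigned Count = (It == Downgraded.end()) ? 0 : It->second;
    // Keep the first register seen with the lowest downgrade count.
    if (Best.empty() || Count < BestCount) {
      Best = Reg;
      BestCount = Count;
    }
  }
  return Best; // empty string if no register is free
}
```

With `RAX` downgraded and both `RAX` and `RCX` free, the sketch picks `RCX`, leaving the reloaded value in `RAX` for the next reload from the same slot. If `RAX` is the only free register, it is still returned.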
| | 
| 
| 
| 
| 
| | a live interval. This is needed for some upcoming subreg changes.
llvm-svn: 68956 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | register destinations that are tied to source operands.  The
TargetInstrDescr::findTiedToSrcOperand method silently fails for inline
assembly.  The existing MachineInstr::isRegReDefinedByTwoAddr was very
close to doing what is needed, so this revision makes a few changes to
that method and also renames it to isRegTiedToUseOperand (for consistency
with the very similar isRegTiedToDefOperand and because it handles both
two-address instructions and inline assembly with tied registers).
llvm-svn: 68714 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | with SUBREG_TO_REG, teach SimpleRegisterCoalescing to coalesce
SUBREG_TO_REG instructions (which are similar to INSERT_SUBREG
instructions), and teach the DAGCombiner to take advantage of this on
targets which support it. This eliminates many redundant
zero-extension operations on x86-64.
This adds a new TargetLowering hook, isZExtFree. It's similar to
isTruncateFree, except it only applies to actual definitions, and not
no-op truncates which may not zero the high bits.
Also, this adds a new optimization to SimplifyDemandedBits: transform
operations like x+y into (zext (add (trunc x), (trunc y))) on targets
where all the casts are no-ops. In contexts where the high part of the
add is explicitly masked off, this allows the mask operation to be
eliminated. Fix the DAGCombiner to avoid undoing these transformations
to eliminate casts on targets where the casts are no-ops.
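The arithmetic behind the narrowing transform can be checked with a stand-alone sketch, assuming a 64-bit add whose result is masked to its low 32 bits. The function names `addThenMask` and `narrowAdd` are hypothetical; the sketch shows only the identity, not the SimplifyDemandedBits code.

```cpp
#include <cstdint>

// Wide form: 64-bit add followed by an explicit mask of the high part.
std::uint64_t addThenMask(std::uint64_t X, std::uint64_t Y) {
  return (X + Y) & 0xFFFFFFFFull;
}

// Narrow form: zext(add(trunc X, trunc Y)). The 32-bit add wraps modulo
// 2^32, so zero-extending its result makes the mask redundant.
std::uint64_t narrowAdd(std::uint64_t X, std::uint64_t Y) {
  std::uint32_t Lo =
      static_cast<std::uint32_t>(X) + static_cast<std::uint32_t>(Y);
  return static_cast<std::uint64_t>(Lo);
}
```

Both functions return the same value for every input, because the low 32 bits of a sum depend only on the low 32 bits of the operands.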
Also, this adds a new two-address lowering heuristic. Since
two-address lowering runs before coalescing, it helps to be able to
look through copies when deciding whether commuting and/or
three-address conversion are profitable.
Also, fix a bug in LiveInterval::MergeInClobberRanges. It didn't handle
the case that a clobber range extended both before and beyond an
existing live range. In that case, multiple live ranges need to be
added. This was exposed by the new subreg coalescing code.
Remove 2008-05-06-SpillerBug.ll. It was bugpoint-reduced, and the
spiller behavior it was looking for no longer occurs with the new
instruction selection.
llvm-svn: 68576 | 
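The MergeInClobberRanges case described in the message above (a clobber range that starts before and ends after an existing live range) can be sketched with plain half-open ranges. `uncoveredPieces` is a hypothetical helper, not the LiveInterval code; it assumes `Live` is sorted and non-overlapping and returns the parts of the clobber range that no existing range covers.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

using Range = std::pair<unsigned, unsigned>; // half-open [first, second)

// Hypothetical helper: Live must be sorted and pairwise disjoint.
std::vector<Range> uncoveredPieces(const std::vector<Range> &Live,
                                   Range Clobber) {
  std::vector<Range> Pieces;
  unsigned Start = Clobber.first; // start of the part not yet accounted for
  for (const Range &R : Live) {
    if (R.second <= Start)
      continue; // existing range ends before the remaining clobber part
    if (R.first >= Clobber.second)
      break; // existing range begins after the clobber ends
    if (R.first > Start)
      Pieces.emplace_back(Start, R.first); // gap in front of this range
    Start = std::max(Start, R.second);
    if (Start >= Clobber.second)
      break;
  }
  if (Start < Clobber.second)
    Pieces.emplace_back(Start, Clobber.second); // tail after the last range
  return Pieces;
}
```

For an existing range `[10,20)` and a clobber range `[5,25)`, the sketch yields two pieces, `[5,10)` and `[20,25)`, which is the "multiple live ranges need to be added" situation.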
| | 
| 
| 
| | llvm-svn: 67764 | 
| | 
| 
| 
| | llvm-svn: 67544 | 
| | 
| 
| 
| 
| 
| | machine operand TIED_TO constraint. This eliminates the need to pre-allocate registers for these. It also allows the register allocator to eliminate the unneeded copies.
llvm-svn: 67512 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | - Make type declarations match the struct/class keyword of the definition.
 - Move AddSignalHandler into the namespace where it belongs.
 - Correctly call functions from template base.
 - Some other small changes.
With this patch, LLVM and Clang should build properly and with far less noise under VS2008.
llvm-svn: 67347 | 
| | 
| 
| 
| | llvm-svn: 67335 | 
| | 
| 
| 
| 
| 
| | start. Sorry, no small test case possible.
llvm-svn: 66129 | 
| | 
| 
| 
| 
| 
| | interval after its sub-register is coalesced with a virtual register.
llvm-svn: 64082 | 
| | 
| 
| 
| | llvm-svn: 63267 | 
| | 
| 
| 
| 
| 
| | sub-register indices as well.
llvm-svn: 62600 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | any of the physical register's sub-register live intervals overlaps with the virtual register. This is overly conservative. It prevents an extract_subreg from being coalesced away:
v1024 = EDI  // not killed
      =
      = EDI
One possible solution is for the coalescer to examine the sub-register live intervals in the same manner as the physical register. Another possibility is to examine defs and uses (when needed) of sub-registers. Both solutions are too expensive. For now, look for "short virtual intervals" and scan instructions to look for conflicts instead.
This is a small win on x86-64. e.g. It shaves ~80 instructions off 403.gcc.
llvm-svn: 61847 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | an input operand, it effectively extends the live range of the physical register. Currently we do not have a good way to represent this.
172     %ECX<def> = MOV32rr %reg1039<kill>
180     INLINEASM <es:subl $5,$1
        sbbl $3,$0>, 10, %EAX<def>, 14, %ECX<earlyclobber,def>, 9, %EAX<kill>,
36, <fi#0>, 1, %reg0, 0, 9, %ECX<kill>, 36, <fi#1>, 1, %reg0, 0
188     %EAX<def> = MOV32rr %EAX<kill>
196     %ECX<def> = MOV32rr %ECX<kill>
204     %ECX<def> = MOV32rr %ECX<kill>
212     %EAX<def> = MOV32rr %EAX<kill>
220     %EAX<def> = MOV32rr %EAX
228     %reg1039<def> = MOV32rr %ECX<kill>
The early clobber operand ties ECX input to the ECX def.
The live interval of ECX is represented as this:
%reg20,inf = [46,47:1)[174,230:0)  0@174-(230) 1@46-(47)
The right way to represent this is something like
%reg20,inf = [46,47:2)[174,182:1)[181,230:0)  0@174-(182) 1@181-(230) 2@46-(47)
Of course that won't work since that means overlapping live ranges defined by two val#.
The workaround for now is to add a bit to val# which says the val# is redefined by an early clobber def somewhere. This prevents the move at 228 from being optimized away by SimpleRegisterCoalescing::AdjustCopiesBackFrom.
llvm-svn: 61259 | 
| | 
| 
| 
| | llvm-svn: 60683 | 
| | 
| 
| 
| 
| 
| | constpool into a use, the rewrite happens at time of spill (not in VirtRegMap). Later on, if the GlobalBaseReg is spilled, the spiller can see that the use uses GlobalBaseReg and do the right thing.
llvm-svn: 60596 | 
| | 
| 
| 
| | llvm-svn: 60592 | 
| | 
| 
| 
| | llvm-svn: 60586 | 
| | 
| 
| 
| | llvm-svn: 60487 | 
| | 
| 
| 
| 
| 
| 
| | and the LiveInterval.h top-level comment accordingly. This fixes blocks
having spurious live-in registers in boundary cases.
llvm-svn: 60092 | 
| | 
| 
| 
| | llvm-svn: 59841 | 
| | 
| 
| 
| 
| 
| | BitVector, instead of manually testing each bit.
llvm-svn: 59246 | 
| | 
| 
| 
| 
| 
| 
| | coalescing as a separate pass rather than inside of
LiveIntervalAnalysis.
llvm-svn: 59146 | 
| | 
| 
| 
| 
| 
| 
| | - Create and update spill slot live intervals.
- Lots of bug fixes.
llvm-svn: 58367 | 
| | 
| 
| 
| 
| 
| 
| | can give it the same stack slot as the spilled interval if it is folded.
This prevents the fold/unfold code from pointing to the wrong register.
llvm-svn: 58255 | 
| | 
| 
| 
| 
| 
| | re-materializable val# (for now).
llvm-svn: 58068 | 
| | 
| 
| 
| | llvm-svn: 57766 | 
| | 
| 
| 
| | llvm-svn: 57765 | 
| | 
| 
| 
| | llvm-svn: 57259 | 
| | 
| 
| 
| 
| 
| | isReg, etc., from isRegister, etc.
llvm-svn: 57006 | 
| | 
| 
| 
| 
| 
| | amount of time to track down.
llvm-svn: 56889 | 
| | 
| 
| 
| | llvm-svn: 56848 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | "If a re-materializable instruction has a register
operand, the spiller will change the register operand's
spill weight to HUGE_VAL to avoid it being spilled.
However, if the operand is already in the queue ready
to be spilled, avoid re-materializing it".
llvm-svn: 56837 | 
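The rule quoted above can be sketched as a single check. This is a simplified model, not the spiller's code: `Interval`, `SpillQueue`, and `tryPinRematOperand` are hypothetical names, and the queue is modeled as a set of register numbers.

```cpp
#include <cmath>
#include <set>

// Hypothetical stand-in for a live interval: a register and its spill weight.
struct Interval {
  unsigned Reg;
  float Weight;
};

// Returns true if re-materialization may go ahead. In that case the register
// operand's weight is raised to HUGE_VALF so it will not be picked for
// spilling. If the operand is already queued for spilling, the weight is
// left alone and the caller should not re-materialize.
bool tryPinRematOperand(Interval &Operand,
                        const std::set<unsigned> &SpillQueue) {
  if (SpillQueue.count(Operand.Reg) != 0)
    return false;
  Operand.Weight = HUGE_VALF;
  return true;
}
```

The order of the two steps matters in this model: checking the queue first avoids pinning a register whose spill has already been decided.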
| | 
| 
| 
| 
| 
| | change the register operand's spill weight to HUGE_VAL to avoid it being spilled. However, if the operand is already in the queue ready to be spilled, avoid re-materializing it.
llvm-svn: 56835 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | RA problem by expanding the live interval of an
earlyclobber def back one slot.  Remove
overlap-earlyclobber throughout.  Remove 
earlyclobber bits and their handling from
live intervals.
llvm-svn: 56539 | 
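Why moving an earlyclobber def back one slot is enough can be shown with a small model of slot indices. The layout below (four slots per instruction, uses read at `USE`, normal defs written at `DEF`) is an assumption made for illustration, and all names are hypothetical rather than taken from LiveIntervalAnalysis.

```cpp
#include <utility>

// Assumed slot layout: each instruction owns NUM consecutive indices.
enum Slot : unsigned { LOAD = 0, USE = 1, DEF = 2, STORE = 3, NUM = 4 };

using Range = std::pair<unsigned, unsigned>; // half-open [first, second)

unsigned useIndex(unsigned InstrBase) { return InstrBase + USE; }
unsigned defIndex(unsigned InstrBase) { return InstrBase + DEF; }

// Live range of an input defined by the instruction at DefBase and killed
// by the instruction at KillBase: it ends just past the killing use slot.
Range inputRange(unsigned DefBase, unsigned KillBase) {
  return Range(defIndex(DefBase), useIndex(KillBase) + 1);
}

// A normal def starts at the DEF slot. An earlyclobber def starts one slot
// earlier, at the USE slot, so it overlaps inputs read by the instruction.
unsigned outputStart(unsigned InstrBase, bool EarlyClobber) {
  return EarlyClobber ? useIndex(InstrBase) : defIndex(InstrBase);
}

bool overlaps(Range A, Range B) {
  return A.first < B.second && B.first < A.second;
}
```

For an input killed by the instruction at index 8, the input's range is `[6,10)`. A normal output starts at 10 and does not overlap, so both may share a register. An earlyclobber output starts at 9 and does overlap, so the allocator must pick different registers without any special earlyclobber flag on the interval.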
| | 
| 
| 
| 
| 
| 
| 
| | live-in indices correct in the presence of things like EH labels.
llvm-svn: 56410 | 
| | 
| 
| 
| 
| 
| 
| | and redo as linked list walk.  Logic moved into RA.
Per review feedback.
llvm-svn: 56326 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | with an earlyclobber operand elsewhere.  Propagate
this bit and the earlyclobber bit through SDISel.
Change linear-scan RA not to allocate regs in a way 
that conflicts with an earlyclobber.  See also comments.
llvm-svn: 56290 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | basic block, not at the first instruction.  Also, their valnos should have an unknown def.
This has no effect currently, but was causing issues when StrongPHIElimination was enabled.
llvm-svn: 56231 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | isImmediate(), isRegister(), and friends, to avoid confusion
about having two different names with the same meaning. I'm
not attached to the longer names, and would be ok with
changing to the shorter names if others prefer it.
llvm-svn: 56189 | 
| | 
| 
| 
| 
| 
| | remat and splitting.
llvm-svn: 55012 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | appropriate.
With this patch, all of MultiSource/Applications and all of SPEC2000/2006 pass with
the SimpleSpiller and this fast-path enabled.
llvm-svn: 55000 | 
| | 
| 
| 
| | llvm-svn: 54958 | 
| | 
| 
| 
| 
| 
| 
| | 1) Assign stack slots to new temporaries.
  2) Don't insert an interval into the return vector more than once.
llvm-svn: 54956 |