| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
we have invokes, so there is no functionality change here.
llvm-svn: 122990
|
|
|
|
|
|
| |
Nadav Rotem.
llvm-svn: 122983
|
|
|
|
|
|
| |
typed atomics. This will lower exclusively to libcalls at the moment.
llvm-svn: 122979
|
|
|
|
|
|
| |
This fixes PR 8913 crash.
llvm-svn: 122971
|
|
|
|
|
|
|
| |
etc. takes an option OptSize. If OptSize is true, it would return
the inline limit for functions with attribute OptSize.
llvm-svn: 122952
|
|
|
|
| |
llvm-svn: 122949
|
|
|
|
|
|
| |
Simplify RALinScan::DowngradeRegister with TRI::getOverlaps while we are there.
llvm-svn: 122940
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pass precomputes CFG block frequency information that can be used by the
register allocator to find optimal spill code placement.
Given an interference pattern, placeSpills() will compute which basic blocks
should have the current variable enter or exit in a register, and which blocks
prefer the stack.
The algorithm is ready to consume block frequencies from profiling data, but for
now it gets by with the static estimates used for spill weights.
This is a work in progress and still not hooked up to RegAllocGreedy.
llvm-svn: 122938
|
|
|
|
|
|
|
|
| |
up freebsd bootloader. However, this doesn't make much sense for Darwin, whose
-Os is meant to optimize for size only if it doesn't hurt performance.
rdar://8821501
llvm-svn: 122936
|
|
|
|
|
|
|
| |
the original type of the switch statement key.
rdar://8781238
llvm-svn: 122935
|
|
|
|
|
|
|
|
|
| |
r1025 = s/zext r1024, 4
r1026 = extract_subreg r1025, 4
to:
r1026 = copy r1024
llvm-svn: 122925
|
|
|
|
|
|
| |
calculated.
llvm-svn: 122912
|
|
|
|
| |
llvm-svn: 122909
|
|
|
|
| |
llvm-svn: 122849
|
|
|
|
|
|
|
|
|
|
| |
The analysis will be needed by both the greedy register allocator and the
X86FloatingPoint pass. It only needs to be computed once when the CFG doesn't
change.
This pass is very fast, usually showing up as 0.0% wall time.
llvm-svn: 122832
|
|
|
|
|
|
| |
makes getLeader() nonrecursive.
llvm-svn: 122811
|
|
|
|
|
|
| |
spent in StrongPHIElimination on 403.gcc.
llvm-svn: 122803
|
|
|
|
|
|
| |
static constructors.
llvm-svn: 122795
|
|
|
|
|
|
| |
a 28% speedup of MachineCSE time on 403.gcc.
llvm-svn: 122735
|
|
|
|
|
|
| |
so that Dominators.h is *just* domtree. Also prune #includes a bit.
llvm-svn: 122714
|
|
|
|
|
|
|
|
|
|
| |
This allows us to compile:
void test(char *s, int a) {
__builtin_memset(s, a, 15);
}
into 1 mul + 3 stores instead of 3 muls + 3 stores.
llvm-svn: 122710
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
series of shifts and ors.
We could implement a DAGCombine to turn x * 0x0101 back into logic operations
on targets that doesn't support the multiply or it is slow (p4) if someone cares
enough.
Example code:
void test(char *s, int a) {
__builtin_memset(s, a, 4);
}
before:
_test: ## @test
movzbl 8(%esp), %eax
movl %eax, %ecx
shll $8, %ecx
orl %eax, %ecx
movl %ecx, %eax
shll $16, %eax
orl %ecx, %eax
movl 4(%esp), %ecx
movl %eax, 4(%ecx)
movl %eax, (%ecx)
ret
after:
_test: ## @test
movzbl 8(%esp), %eax
imull $16843009, %eax, %eax ## imm = 0x1010101
movl 4(%esp), %ecx
movl %eax, 4(%ecx)
movl %eax, (%ecx)
ret
llvm-svn: 122707
|
|
|
|
|
|
|
| |
with 2-address instructions, for about a 3.5% speedup of StrongPHIElimination on
403.gcc.
llvm-svn: 122635
|
|
|
|
| |
llvm-svn: 122628
|
|
|
|
|
|
|
| |
process those instructions that define phi sources. This is a 47% speedup of
StrongPHIElimination compile time on 403.gcc.
llvm-svn: 122627
|
|
|
|
| |
llvm-svn: 122625
|
|
|
|
| |
llvm-svn: 122617
|
|
|
|
|
|
| |
in the most obvious way.
llvm-svn: 122610
|
|
|
|
|
|
| |
it relies on assumptions that may not be true in the future.
llvm-svn: 122608
|
|
|
|
|
|
|
|
| |
we are only interested in the defs when discovering interferences.
This is a 28% speedup running StrongPHIElimination on 403.gcc.
llvm-svn: 122596
|
|
|
|
|
|
|
| |
in this function, but the compiler was warning that it might be when
doing a release build.
llvm-svn: 122595
|
|
|
|
| |
llvm-svn: 122586
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
when running without the verifier, and I have not yet checked them to see if
the new results are still correct. There are more verifier failures, but they
all seem to be additional occurrences of verifier failures that occur with the
existing PHIElimination pass. There are a few obvious issues with the code:
1) It doesn't properly update the register equivalence classes during copy
insertion, and instead recomputes them before merging live intervals and
renaming registers. I wanted to keep this first patch simple for debugging
purposes, but it shouldn't be very hard to do this.
2) It doesn't mix the renaming and live interval merging with the copy insertion
process, which leads to a lot of virtual register churn. Virtual registers and
live intervals are created, only to later be merged into others. The code should
be smarter and only create a new virtual register if there is no existing
register in the same congruence class.
3) In one place the code uses a DenseMap per basic block, which is unnecessary
heap allocation. There should be an inline storage version of DenseMap.
I did a quick compile-time test of running llc on 403.gcc with and without
StrongPHIElimination. It is slightly slower with StrongPHIElimination, because
the small decrease in the coalescer runtime can't beat the increase in phi
elimination runtime. Perhaps fixing the above performance issues will narrow
the gap.
I also haven't yet run any tests of the quality of the generated code.
llvm-svn: 122582
|
|
|
|
|
|
|
|
|
| |
valno verification. The "Different value live out of predecessor" check is
incorrect in the case of phi-def valnos, so just skip that check for phi-def
valnos and instead check that all of the valnos for predecessors have phi-kill.
Fixes PR8863.
llvm-svn: 122581
|
|
|
|
| |
llvm-svn: 122545
|
|
|
|
|
|
| |
scheduling node may have a NULL DAG node, yuck.
llvm-svn: 122544
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DAG scheduling during isel. Most new functionality is currently
guarded by -enable-sched-cycles and -enable-sched-hazard.
Added InstrItineraryData::IssueWidth field, currently derived from
ARM itineraries, but could be initialized differently on other targets.
Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is
active, and if so how many cycles of state it holds.
Added SchedulingPriorityQueue::HasReadyFilter to allowing gating entry
into the scheduler's available queue.
ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to
get information about it's SUnits, provides RecedeCycle for bottom-up
scheduling, correctly computes scoreboard depth, tracks IssueCount, and
considers potential stall cycles when checking for hazards.
ScheduleDAGRRList now models machine cycles and hazards (under
flags). It tracks MinAvailableCycle, drives the hazard recognizer and
priority queue's ready filter, manages a new PendingQueue, properly
accounts for stall cycles, etc.
llvm-svn: 122541
|
|
|
|
| |
llvm-svn: 122539
|
|
|
|
| |
llvm-svn: 122537
|
|
|
|
| |
llvm-svn: 122509
|
|
|
|
| |
llvm-svn: 122507
|
|
|
|
|
|
| |
and instruction issue.
llvm-svn: 122491
|
|
|
|
|
|
| |
multiple nodes per cycle.
llvm-svn: 122474
|
|
|
|
| |
llvm-svn: 122473
|
|
|
|
|
|
|
|
| |
In the bottom-up selection DAG scheduling, handle two-address
instructions that read/write unspillable registers. Treat
the entire chain of two-address nodes as a single live range.
llvm-svn: 122472
|
|
|
|
|
|
|
| |
new gcc warning that complains on self-assignments and
self-initializations.
llvm-svn: 122458
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
illegal. The latter usually compiles into smaller code.
example code:
unsigned foo(unsigned x, unsigned y) {
if (x != 0) y--;
return y;
}
before:
_foo: ## @foo
cmpl $1, 4(%esp) ## encoding: [0x83,0x7c,0x24,0x04,0x01]
sbbl %eax, %eax ## encoding: [0x19,0xc0]
notl %eax ## encoding: [0xf7,0xd0]
addl 8(%esp), %eax ## encoding: [0x03,0x44,0x24,0x08]
ret ## encoding: [0xc3]
after:
_foo: ## @foo
cmpl $1, 4(%esp) ## encoding: [0x83,0x7c,0x24,0x04,0x01]
movl 8(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x08]
adcl $-1, %eax ## encoding: [0x83,0xd0,0xff]
ret ## encoding: [0xc3]
llvm-svn: 122455
|
|
|
|
|
|
| |
pick the victim with the lowest total spill weight.
llvm-svn: 122445
|
|
|
|
| |
llvm-svn: 122444
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
loads properly. We miscompiled the testcase into:
_test: ## @test
movl $128, (%rdi)
movzbl 1(%rdi), %eax
ret
Now we get a proper:
_test: ## @test
movl $128, (%rdi)
movsbl (%rdi), %eax
movzbl %ah, %eax
ret
This fixes PR8757.
llvm-svn: 122392
|