| Commit message (Collapse) | Author | Age | Files | Lines |
| |
The analysis will be needed by both the greedy register allocator and the
X86FloatingPoint pass. It only needs to be computed once when the CFG doesn't
change.
This pass is very fast, usually showing up as 0.0% wall time.
llvm-svn: 122832
| |
warning is overzealous but gcc is what it is.)
llvm-svn: 122829
| |
llvm-svn: 122794
| |
llvm-svn: 122789
| |
prologue and epilogue if the adjustment is 8. Similarly, use pushl / popl if
the adjustment is 4 in 32-bit mode.
In the epilogue, take care to pop into a caller-saved register that is not live
at the exit (either a return or a tail-call instruction).
rdar://8771137
llvm-svn: 122783
| |
llvm-svn: 122778
| |
This allows us to compile:
  void test(char *s, int a) {
    __builtin_memset(s, a, 15);
  }
into 1 mul + 3 stores instead of 3 muls + 3 stores.
llvm-svn: 122710
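A minimal sketch of the idea behind the single multiply, not the code the
backend actually emits: multiplying the zero-extended byte by
0x0101010101010101 replicates it into every byte of a 64-bit word, and the
narrower stores can reuse a truncation of that word instead of multiplying
again. The names splat_byte and memset15, and the split into one 8-byte store
plus two 4-byte stores (the last one overlapping), are assumptions made for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Replicate the low byte of `value` into all eight bytes of a 64-bit word.
   This is the one multiply. */
static uint64_t splat_byte(int value) {
    return (uint64_t)(uint8_t)value * UINT64_C(0x0101010101010101);
}

/* Hypothetical hand-written equivalent of memset(s, a, 15). */
static void memset15(char *s, int a) {
    assert(s != NULL);
    uint64_t wide = splat_byte(a);     /* computed once */
    uint32_t narrow = (uint32_t)wide;  /* truncation, no second multiply */
    memcpy(s, &wide, 8);               /* bytes 0..7 */
    memcpy(s + 8, &narrow, 4);         /* bytes 8..11 */
    memcpy(s + 11, &narrow, 4);        /* bytes 11..14, rewrites byte 11 */
}
```

Because every byte of the splatted word is identical, the result does not
depend on endianness.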
| |
llvm-svn: 122706
| |
llvm-svn: 122700
| |
header for now for memset/memcpy opportunities. It turns out that loop-rotate
is successfully rotating loops, but *DOESN'T MERGE THE BLOCKS*, turning "for
loops" into two-basic-block loops that loop-idiom was ignoring.
With this fix, we form many *many* more memcpys and memsets than before, including
on the "history" loops in the viterbi benchmark, which look like this:
  for (j=0; j<MAX_history; ++j) {
    history_new[i][j+1] = history[2*i][j];
  }
Transforming these loops into memcpys speeds up the viterbi benchmark from
11.98s to 3.55s on my machine. Woo.
llvm-svn: 122685
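For reference, a sketch of what the transformation amounts to at the source
level, assuming the two arrays do not overlap (which the pass would have to
establish before using memcpy). The element type int, the sizes STATES and
MAX_HISTORY, and both function names are hypothetical; the benchmark's real
declarations are not shown in the message above.

```c
#include <assert.h>
#include <string.h>

enum { STATES = 4, MAX_HISTORY = 8 };  /* hypothetical sizes */

/* The loop form: copy one row, shifted right by one element. */
static void copy_row_loop(int dst[STATES][MAX_HISTORY + 1],
                          int src[2 * STATES][MAX_HISTORY], int i) {
    for (int j = 0; j < MAX_HISTORY; ++j)
        dst[i][j + 1] = src[2 * i][j];
}

/* The same copy expressed as a single memcpy of MAX_HISTORY elements. */
static void copy_row_memcpy(int dst[STATES][MAX_HISTORY + 1],
                            int src[2 * STATES][MAX_HISTORY], int i) {
    assert(i >= 0 && i < STATES);
    memcpy(&dst[i][1], &src[2 * i][0], MAX_HISTORY * sizeof dst[i][0]);
}
```

The loop qualifies because consecutive iterations read and write consecutive
addresses with a stride of exactly one element.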
| |
llvm-svn: 122676
| |
llvm-svn: 122675
| |
llvm-svn: 122667
| |
earlyclobber stuff. This should fix PRs 2313 and 8157.
Unfortunately, no testcase, since it'd be dependent on register
assignments.
llvm-svn: 122663
| |
is the wrong hammer for this nail, and is probably right.
llvm-svn: 122661
| |
numbering, in which it considers (for example) "%a = add i32 %x, %y" and
"%b = add i32 %x, %y" to be equal because the operands are equal and the
result of the instructions only depends on the values of the operands.
This has almost no effect (it removes 4 instructions from gcc-as-one-file),
and perhaps slows down compilation: I measured a 0.4% slowdown on the large
gcc-as-one-file testcase, but it wasn't statistically significant.
llvm-svn: 122654
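A rough sketch of the numbering idea described above, not the pass's actual
implementation (the start of this message is cut off, so which pass it touches
is not assumed here). Two instructions get the same value number when their
opcode and operand numbers match, which is only sound for instructions whose
result depends on nothing but their operands. The struct names, the fixed-size
table, and the linear scan are all illustrative assumptions; a real
implementation would normally hash.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical instruction: result = opcode(lhs, rhs), operands given as
   value numbers (or any stable integer id for the operand values). */
struct Inst {
    const char *opcode;
    int lhs;
    int rhs;
};

enum { MAX_INSTS = 64 };

struct NumberingTable {
    struct Inst seen[MAX_INSTS];
    int count;
};

/* Return the value number of `inst`, reusing an existing number when an
   instruction with the same opcode and operands was already seen. */
static int value_number(struct NumberingTable *t, struct Inst inst) {
    for (int i = 0; i < t->count; ++i) {
        if (strcmp(t->seen[i].opcode, inst.opcode) == 0 &&
            t->seen[i].lhs == inst.lhs && t->seen[i].rhs == inst.rhs)
            return i;  /* same operation on the same operands */
    }
    assert(t->count < MAX_INSTS);
    t->seen[t->count] = inst;
    return t->count++;
}
```

With %x and %y given ids 1 and 2, both "add i32 %x, %y" instructions map to
one number, while a "sub" on the same operands gets a fresh one.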
| |
llvm-svn: 122653
| |
llvm-svn: 122652
| |
is necessary for executing the custom command that runs the
assembler. Fixes PR8877.
llvm-svn: 122649
| |
Fixes PR8861.
llvm-svn: 122641
| |
llvm-svn: 122638
| |
llvm-svn: 122631
| |
llvm-svn: 122626
| |
files in Target/ARM and Target/X86.
llvm-svn: 122623
| |
subxcc defs/uses;
and fixed CustomInserter.
llvm-svn: 122607
| |
llvm-svn: 122603
| |
supports.
llvm-svn: 122577
| |
llvm-svn: 122560
| |
llvm-svn: 122559
| |
DAG scheduling during isel. Most new functionality is currently
guarded by -enable-sched-cycles and -enable-sched-hazard.
Added InstrItineraryData::IssueWidth field, currently derived from
ARM itineraries, but could be initialized differently on other targets.
Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is
active, and if so how many cycles of state it holds.
Added SchedulingPriorityQueue::HasReadyFilter to allow gating entry
into the scheduler's available queue.
ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to
get information about its SUnits, provides RecedeCycle for bottom-up
scheduling, correctly computes scoreboard depth, tracks IssueCount, and
considers potential stall cycles when checking for hazards.
ScheduleDAGRRList now models machine cycles and hazards (under
flags). It tracks MinAvailableCycle, drives the hazard recognizer and
priority queue's ready filter, manages a new PendingQueue, properly
accounts for stall cycles, etc.
llvm-svn: 122541
| |
llvm-svn: 122539
| |
llvm-svn: 122530
| |
llvm-svn: 122528
| |
llvm-svn: 122524
| |
llvm-svn: 122523
| |
If the basic block containing the BCCi64 (or BCCZi64) instruction ends with
an unconditional branch, that branch needs to be deleted before appending
the expansion of the BCCi64 to the end of the block.
llvm-svn: 122521
| |
llvm-svn: 122513
| |
llvm-svn: 122509
| |
doesn't return a pointer to the end of the string.
llvm-svn: 122496
| |
llvm-svn: 122495
| |
new gcc warning that complains on self-assignments and
self-initializations.
llvm-svn: 122458
| |
llvm-svn: 122456
| |
  int test(unsigned long a, unsigned long b) { return -(a < b); }
compiles to
  _test:                       ## @test
    cmpq    %rsi, %rdi         ## encoding: [0x48,0x39,0xf7]
    sbbl    %eax, %eax         ## encoding: [0x19,0xc0]
    ret                        ## encoding: [0xc3]
instead of
  _test:                       ## @test
    xorl    %ecx, %ecx         ## encoding: [0x31,0xc9]
    cmpq    %rsi, %rdi         ## encoding: [0x48,0x39,0xf7]
    movl    $-1, %eax          ## encoding: [0xb8,0xff,0xff,0xff,0xff]
    cmovael %ecx, %eax         ## encoding: [0x0f,0x43,0xc1]
    ret                        ## encoding: [0xc3]
llvm-svn: 122451
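Why the short sequence works, as a small C model rather than anything taken
from the patch: cmpq sets the carry flag exactly when a < b as unsigned values,
and "sbbl %eax, %eax" computes eax - eax - CF, which is 0 when the flag is
clear and -1 (all ones) when it is set. The function names below are
hypothetical.

```c
#include <assert.h>

/* The source-level expression from the message above. */
static int neg_ult(unsigned long a, unsigned long b) {
    return -(a < b);
}

/* The same value spelled as a select, roughly the shape of the older
   xor/mov/cmov sequence. */
static int neg_ult_select(unsigned long a, unsigned long b) {
    return a < b ? -1 : 0;
}

/* Model of `sbb r, r`: r - r - carry, which depends only on the carry. */
static int sbb_self(int carry) {
    assert(carry == 0 || carry == 1);
    return -carry;
}
```

Calling sbb_self(a < b) gives the same result as either function for any pair
of inputs, since the comparison result plays the role of the carry flag.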
| |
llvm-svn: 122398
| |
llvm-svn: 122385
| |
llvm-svn: 122384
| |
llvm-svn: 122381
| |
llvm-svn: 122379
| |
  (add Y, (sete  X, 0)) -> cmp X, 1; adc  0, Y
  (add Y, (setne X, 0)) -> cmp X, 1; sbb -1, Y
  (sub (sete  X, 0), Y) -> cmp X, 1; sbb  0, Y
  (sub (setne X, 0), Y) -> cmp X, 1; adc -1, Y
for
  unsigned foo(unsigned a, unsigned b) {
    if (a == 0) b++;
    return b;
  }
we now get:
  foo:
    cmpl   $1, %edi
    movl   %esi, %eax
    adcl   $0, %eax
    ret
instead of:
  foo:
    testl  %edi, %edi
    sete   %al
    movzbl %al, %eax
    addl   %esi, %eax
    ret
llvm-svn: 122364
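The arithmetic behind these folds can be checked with a small C model; this is
an illustration of the instruction semantics, not code from the patch. After
"cmp X, 1" the carry flag is set exactly when X == 0 (the unsigned subtraction
X - 1 borrows only then), adc adds the carry and sbb subtracts it. The helper
names are hypothetical and 32-bit wrapping arithmetic is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Carry flag after `cmp X, 1`: set only when X == 0. */
static uint32_t carry_of_cmp_1(uint32_t x) {
    return x < 1u;
}

/* adc imm, y: y + imm + carry, wrapping at 32 bits. */
static uint32_t adc(uint32_t imm, uint32_t y, uint32_t cf) {
    assert(cf <= 1u);
    return y + imm + cf;
}

/* sbb imm, y: y - imm - carry, wrapping at 32 bits. */
static uint32_t sbb(uint32_t imm, uint32_t y, uint32_t cf) {
    assert(cf <= 1u);
    return y - imm - cf;
}

/* foo from the message above, computed the way the new sequence does it. */
static uint32_t foo_adc(uint32_t a, uint32_t b) {
    return adc(0, b, carry_of_cmp_1(a));
}
```

So "adc 0, Y" yields Y + (X == 0) and "sbb -1, Y" yields Y + (X != 0), while
"sbb 0, Y" yields Y - (X == 0) and "adc -1, Y" yields Y - (X != 0).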
| |
Type legalization splits up i64 values into pairs of i32 values, which leads
to poor quality code when inserting or extracting i64 vector elements.
If the vector element is loaded or stored, it can be treated as an f64 value
and loaded or stored directly from a VPR register. Use the pre-legalization
DAG combiner to cast those vector elements to f64 types so that the type
legalizer won't mess them up. Radar 8755338.
llvm-svn: 122319
|