| Commit message | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 142822
|
| |
|
|
| |
llvm-svn: 142821
|
| |
|
|
|
|
| |
more of this code gets refactored, a lot of these manual decoding hooks should get smaller and/or go away entirely.
llvm-svn: 142817
|
| |
|
|
|
|
|
| |
physreg dependencies, and upcoming codegen changes will require proper
physreg dependence handling.
llvm-svn: 142816
|
| |
|
|
| |
llvm-svn: 142815
|
| |
|
|
|
|
| |
use of Sched::ILP instead, as Sched::Latency is going away.
llvm-svn: 142813
|
| |
|
|
|
|
| |
as the Latency scheduler is going away.
llvm-svn: 142811
|
| |
|
|
|
|
| |
is going away.
llvm-svn: 142810
|
| |
|
|
|
|
| |
PR11220
llvm-svn: 142801
|
| |
|
|
| |
llvm-svn: 142800
|
| |
|
|
|
|
| |
used it. Fixes an unused variable warning from GCC on release builds.
llvm-svn: 142799
|
| |
|
|
| |
introduce no-return or unreachable heuristics.
The return heuristics from the Ball and Larus paper don't work well in
practice as they pessimize early return paths. The only good hitrate
return heuristics are those for:
- NULL return
- Constant return
- negative integer return
Only the last of these three can possibly require significant code for
the returning block, and even the last is fairly rare and usually also
a constant. As a consequence, even for the cold return paths, there is
little code on that return path, and so little code density to be gained
by sinking it. The places where sinking these blocks is valuable (inner
loops) will already be weighted appropriately as the edge is a loop-exit
branch.
All of this aside, early returns are nearly as common as all three of
these return categories, and should actually be predicted as taken!
Rather than muddy the waters of the static predictions, just remain
silent on returns and let the CFG itself dictate any layout or other
issues.
However, the return heuristic was flagging one very important case:
unreachable. Unfortunately it still gave a 1/4 chance of the
branch-to-unreachable occurring. It also didn't do a rigorous job of
finding those blocks which post-dominate an unreachable block.
This patch builds a more powerful analysis that should flag all branches
to blocks known to then reach unreachable. It also has better worst-case
runtime complexity by not looping through successors for each block. The
previous code would perform an N^2 walk in the event of a single entry
block branching to N successors with a switch where each successor falls
through to the next and they finally fall through to a return.
Test case added for noreturn heuristics. Also doxygen comments improved
along the way.
llvm-svn: 142793
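A minimal sketch of the kind of single-pass analysis described above, not the code from the patch. The `Block` type, `MAX_SUCCS`, and `mark_post_dominated_by_unreachable` are hypothetical names invented for illustration. Assuming blocks are supplied in post-order, a block is flagged when it ends in `unreachable` or when every successor is already flagged, so each block and each edge is examined once:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature CFG; these types are illustrative, not LLVM's. */
enum { MAX_SUCCS = 4 };

typedef struct Block {
    int  num_succs;            /* 0 means the block ends the function  */
    int  succs[MAX_SUCCS];     /* indices of successor blocks          */
    bool ends_in_unreachable;  /* the terminator is `unreachable`      */
} Block;

/*
 * post_order lists block indices so that successors come before their
 * predecessors (back edges aside). A block is flagged when it can only
 * lead to `unreachable`. Successors reached through a back edge have not
 * been visited yet and so count as "not flagged", which is conservative.
 */
static void mark_post_dominated_by_unreachable(const Block *blocks,
                                               const int *post_order,
                                               int n, bool *flagged)
{
    for (int i = 0; i < n; ++i)
        flagged[i] = false;

    for (int i = 0; i < n; ++i) {
        int idx = post_order[i];
        const Block *b = &blocks[idx];
        assert(b->num_succs >= 0 && b->num_succs <= MAX_SUCCS);

        if (b->num_succs == 0) {
            /* A leaf block: flagged only if it ends in `unreachable`. */
            flagged[idx] = b->ends_in_unreachable;
            continue;
        }

        /* An interior block: flagged only if all successors are flagged. */
        bool all_flagged = true;
        for (int s = 0; s < b->num_succs; ++s) {
            if (!flagged[b->succs[s]]) {
                all_flagged = false;
                break;
            }
        }
        flagged[idx] = all_flagged;
    }
}
```

A branch whose target is flagged could then be given a very low taken weight, instead of the 1/4 chance mentioned above.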
|
| |
|
|
| |
Enhance SCEV's brute force loop analysis to handle multiple PHI nodes in the
loop header when computing the trip count.
With this, we now constant evaluate:
struct ListNode { const struct ListNode *next; int i; };
static const struct ListNode node1 = {0, 1};
static const struct ListNode node2 = {&node1, 2};
static const struct ListNode node3 = {&node2, 3};
int test() {
  int sum = 0;
  for (const struct ListNode *n = &node3; n != 0; n = n->next)
    sum += n->i;
  return sum;
}
llvm-svn: 142790
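To illustrate what evaluating a loop with several header PHIs by brute force can look like, here is a small sketch over the example above. It is not SCEV's implementation: `PhiState`, `brute_force_loop`, and the iteration cap `MAX_BRUTE_FORCE_ITERATIONS` are hypothetical names. The idea is to carry one concrete value per PHI (`n` and `sum`), compute every next value from the current values, and stop once the exit condition folds to true:

```c
#include <stdbool.h>
#include <stddef.h>

struct ListNode { const struct ListNode *next; int i; };

static const struct ListNode node1 = {0, 1};
static const struct ListNode node2 = {&node1, 2};
static const struct ListNode node3 = {&node2, 3};

/* One concrete value per loop-header PHI in test(): `n` and `sum`. */
struct PhiState { const struct ListNode *n; int sum; };

/* Hypothetical cap so the evaluation gives up on long or infinite loops. */
enum { MAX_BRUTE_FORCE_ITERATIONS = 100 };

/*
 * Simulates the loop one iteration at a time. Returns true and stores the
 * number of executed iterations if the exit condition becomes true within
 * the cap; returns false if the evaluation gave up.
 */
static bool brute_force_loop(struct PhiState *state, unsigned *iterations)
{
    for (unsigned iter = 0; iter < MAX_BRUTE_FORCE_ITERATIONS; ++iter) {
        if (state->n == NULL) {   /* exit condition `n != 0` is false */
            *iterations = iter;
            return true;
        }
        /* Compute all next PHI values from the current ones, then commit
         * them together, as the PHIs update simultaneously. */
        struct PhiState next;
        next.sum = state->sum + state->n->i;
        next.n   = state->n->next;
        *state = next;
    }
    return false;
}
```

Starting from `{&node3, 0}`, the simulation ends after three iterations with `sum` equal to 6, which is the constant `test()` folds to.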
|
| |
|
|
|
|
|
| |
state. Furthermore, they might not have two operands. This fixes the underlying
issue behind the crashes introduced in r142781.
llvm-svn: 142788
|
| |
|
|
| |
instructions.
This doesn't introduce any optimizations we weren't doing before (except
potentially due to pass ordering issues), but now passes will eliminate them sooner
as part of their own cleanups.
llvm-svn: 142787
|
| |
|
|
|
|
|
| |
Assertion `i_nocapture < OperandTraits<PHINode>::operands(this) && "getOperand() out of range!"' failed.
coming out of indvars.
llvm-svn: 142786
|
| |
|
|
|
|
| |
Rackover!
llvm-svn: 142785
|
| |
|
|
| |
a single class. Previously it was split between two classes, one
internal and one external. The concern seemed to center around exposing
the weights used, but those can remain confined to the implementation
file.
Having a single class to maintain the state and analyses in use will
also simplify several of the enhancements I want to make to our static
heuristics.
llvm-svn: 142783
|
| |
|
|
| |
loop header when computing the trip count.
With this, we now constant evaluate:
struct ListNode { const struct ListNode *next; int i; };
static const struct ListNode node1 = {0, 1};
static const struct ListNode node2 = {&node1, 2};
static const struct ListNode node3 = {&node2, 3};
int test() {
  int sum = 0;
  for (const struct ListNode *n = &node3; n != 0; n = n->next)
    sum += n->i;
  return sum;
}
llvm-svn: 142781
|
| |
|
|
|
|
|
| |
extraneous whitespace. Trying to clean-up this pass as much as I can
before I start making functional changes.
llvm-svn: 142780
|
| |
|
|
| |
llvm-svn: 142779
|
| |
|
|
| |
to bring it under direct test instead of merely indirectly testing it in
the BlockFrequencyInfo pass.
The next step is to start adding tests for the various heuristics
employed, and to start fixing those heuristics once they're under test.
llvm-svn: 142778
|
| |
|
|
|
|
|
| |
to get important constant branch probabilities and use them for finding
the best branch out of a set of possibilities.
llvm-svn: 142762
|
| |
|
|
| |
llvm-svn: 142761
|
| |
|
|
|
|
| |
50% is much more readable than 5.000000e-01.
llvm-svn: 142752
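As a small illustration of the readability point, a probability stored as a numerator/denominator pair can be printed as a percentage rather than in exponent notation. This is a sketch only; `format_probability` is a hypothetical helper, not the function changed in this commit, and the two-decimal format is an assumption:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Writes numerator/denominator into buf as a percentage with two decimals.
 * Returns the number of characters snprintf would have written.
 */
static int format_probability(uint32_t numerator, uint32_t denominator,
                              char *buf, size_t size)
{
    assert(denominator != 0 && numerator <= denominator);
    double percent = 100.0 * (double)numerator / (double)denominator;
    return snprintf(buf, size, "%.2f%%", percent);
}
```

With a numerator of 1 and a denominator of 2 this produces `50.00%`.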
|
| |
|
|
|
|
| |
edge is hot.
llvm-svn: 142751
|
| |
|
|
| |
discussions with Andy. Fundamentally, the previous algorithm is both
counterproductive on several fronts and prioritizing things which
aren't necessarily the most important: static branch prediction.
The new algorithm uses the existing loop CFG structure information to
walk through the CFG itself to layout blocks. It coalesces adjacent
blocks within the loop where the CFG allows based on the most likely
path taken. Finally, it topologically orders the block chains that have
been formed. This allows it to choose a (mostly) topologically valid
ordering which still prioritizes fallthrough within the structural
constraints.
As a final twist in the algorithm, it does violate the CFG when it
discovers a "hot" edge, that is an edge that is more than 4x hotter than
the competing edges in the CFG. These are forcibly merged into
a fallthrough chain.
Future transformations that need to be added are rotation of loop exit
conditions to be fallthrough, and better isolation of cold block chains.
I'm also planning on adding statistics to model how well the algorithm
does at laying out blocks based on the probabilities it receives.
The old tests mostly still pass, and I have some new tests to add, but
the nested loops are still behaving very strangely. This almost seems
like working-as-intended as it rotated the exit branch to be
fallthrough, but I'm not convinced this is actually the best layout. It
is well supported by the probabilities for loops we currently get, but
those are pretty broken for nested loops, so this may change later.
llvm-svn: 142743
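Two ingredients of the description above can be sketched in isolation: the "more than 4x hotter" test, and forming a fallthrough chain by following the most likely successor. This is a heavily simplified illustration, not the pass itself: it ignores loop structure, the topological ordering of chains, and the forced merging of hot edges. `LayoutBlock`, `is_hot_edge`, and `build_chain` are hypothetical names:

```c
#include <stdbool.h>
#include <stdint.h>

enum { MAX_EDGES = 4 };

/* Hypothetical block with weighted successor edges. */
typedef struct {
    int      num_succs;
    int      succs[MAX_EDGES];    /* successor block indices       */
    uint32_t weights[MAX_EDGES];  /* edge weight to each successor */
} LayoutBlock;

/* An edge is "hot" when it is more than 4x hotter than its competitor. */
static bool is_hot_edge(uint32_t weight, uint32_t best_competing_weight)
{
    return (uint64_t)weight > 4u * (uint64_t)best_competing_weight;
}

/*
 * Starting at `start`, repeatedly append the most likely successor that has
 * not been placed yet, so that the likeliest path becomes fallthrough.
 * Returns the chain length; the chain is empty if `start` is already placed.
 */
static int build_chain(const LayoutBlock *blocks, int start,
                       bool *placed, int *chain)
{
    int len = 0;
    int cur = start;
    while (cur >= 0 && !placed[cur]) {
        placed[cur] = true;
        chain[len++] = cur;

        int best = -1;
        uint32_t best_weight = 0;
        for (int s = 0; s < blocks[cur].num_succs; ++s) {
            int succ = blocks[cur].succs[s];
            if (placed[succ])
                continue;
            if (best < 0 || blocks[cur].weights[s] > best_weight) {
                best = succ;
                best_weight = blocks[cur].weights[s];
            }
        }
        cur = best;  /* -1 ends the chain when nothing is left to follow */
    }
    return len;
}
```

For a diamond where the entry branches to block 1 with weight 90 and to block 2 with weight 10, the first chain is 0, 1, 3 and block 2 is left to form its own chain.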
|
| |
|
|
| |
llvm-svn: 142741
|
| |
|
|
|
|
|
| |
element types, even though the element extraction code does. It is surprising
that this bug has been here for so long. Fixes <rdar://problem/10318778>.
llvm-svn: 142740
|
| |
|
|
| |
llvm-svn: 142738
|
| |
|
|
|
|
| |
multiply instructions.
llvm-svn: 142737
|
| |
|
|
|
|
| |
elimination on them too.
llvm-svn: 142735
|
| |
|
|
|
| |
able to constant fold load instructions where the argument is a constant.
Second, we should be able to watch multiple PHI nodes through the loop; this
patch only supports PHIs in loop headers, more can be done here.
With this patch, we now constant evaluate:
static const int arr[] = {1, 2, 3, 4, 5};
int test() {
  int sum = 0;
  for (int i = 0; i < 5; ++i) sum += arr[i];
  return sum;
}
llvm-svn: 142731
|
| |
|
|
|
|
| |
correctness along the way.
llvm-svn: 142726
|
| |
|
|
|
|
|
| |
SHL inserts zeros from the right, thus even when the original
sign_extend_inreg value was only 1 bit wide, we still need to sra.
llvm-svn: 142724
|
| |
|
|
|
|
|
| |
that the set of callee-saved registers is correct for the specific platform.
<rdar://problem/10313708> & ctor_dtor_count & ctor_dtor_count-2
llvm-svn: 142706
|
| |
|
|
| |
llvm-svn: 142704
|
| |
|
|
| |
llvm-svn: 142691
|
| |
|
|
|
|
|
|
| |
The assumption in the back-end is that PHIs are not allowed at the start of the
landing pad block for SjLj exceptions.
<rdar://problem/10313708>
llvm-svn: 142689
|
| |
|
|
| |
llvm-svn: 142687
|
| |
|
|
| |
llvm-svn: 142684
|
| |
|
|
| |
llvm-svn: 142683
|
| |
|
|
| |
llvm-svn: 142682
|
| |
|
|
|
|
| |
This is from the same paper from Ball and Larus as the rest of the currently implemented heuristics.
llvm-svn: 142677
|
| |
|
|
| |
llvm-svn: 142675
|
| |
|
|
| |
llvm-svn: 142673
|
| |
|
|
|
|
| |
expensive helper.
llvm-svn: 142672
|
| |
|
|
|
|
| |
the input and output vectors have different sizes. Patch by Xiaoyi Guo.
llvm-svn: 142671
|
| |
|
|
| |
Next step in the ongoing saga of NEON load/store assembly parsing. Handle
VLD1 instructions that take a two-register register list.
Adjust the instruction definitions to only have the single encoded register
as an operand. The super-register from the pseudo is kept as an implicit def,
so passes which come after pseudo-expansion still know that the instruction
defines the other subregs.
llvm-svn: 142670
|
| |
|
|
|
|
| |
ask for them. This is a divergence from gas' behavior, but it is correct per the documentation and allows us to forge ahead with roundtrip testing.
llvm-svn: 142669
|