| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 179975
|
| |
|
|
|
|
|
|
| |
with multiple users.
We did not terminate the switch case and we executed the search routine twice.
llvm-svn: 179974
|
| |
|
|
|
|
| |
NumPeeps and make sure that Changed is set to true.
llvm-svn: 179968
|
| |
|
|
| |
llvm-svn: 179967
|
| |
|
|
| |
llvm-svn: 179966
|
| |
|
|
| |
llvm-svn: 179965
|
| |
|
|
|
|
| |
large multi-level nested if statement.
llvm-svn: 179964
|
| |
|
|
|
|
|
|
|
|
| |
progress.
This will make it clearer when we are actually resetting a sequence's progress
vs just changing state. This is an important distinction because the former case
clears any pointers that we are tracking while the latter does not.
llvm-svn: 179963
|
| |
|
|
| |
llvm-svn: 179960
|
| |
|
|
| |
This transformation will transform a conditional store with a preceding
unconditional store to the same location:
a[i] = X
may-alias with a[i] load
if (cond)
a[i] = Y
into an unconditional store.
a[i] = X
may-alias with a[i] load
tmp = cond ? Y : X;
a[i] = tmp
We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outweigh the potential case where the branch would be correctly predicted
and the cost of executing the second store would be noticeably reflected in
performance.
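The rewrite can be sketched in Python, modelling memory as a list. This is only an illustration: the function names are hypothetical, and the intervening may-alias load is modelled as a plain read of the same slot.

```python
def conditional_store(a, i, x, y, cond):
    """Original form: unconditional store, then a store guarded by a branch."""
    a[i] = x
    observed = a[i]          # stands in for the may-alias load
    if cond:
        a[i] = y
    return observed


def unconditional_store(a, i, x, y, cond):
    """Transformed form: the branch becomes a select feeding a second store."""
    a[i] = x
    observed = a[i]          # the load still sees X, as before
    tmp = y if cond else x   # tmp = cond ? Y : X
    a[i] = tmp               # always executed
    return observed
```

Both versions leave memory in the same state and the load observes the same value; the second one trades the branch for an extra store.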
hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2% performance improvement on an ARM swift chip.
Other tests in the test-suite+external seem to be mostly uninfluenced in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducibly below 100% (by about .4%). Given that hmmer benefits so much I
believe this to be a fair trade off.
I am going to watch performance numbers across the buildbots and will revert
this if anything unexpected comes up.
llvm-svn: 179957
|
| |
|
|
| |
llvm-svn: 179936
|
| |
|
|
|
|
| |
Avoids a couple of copies and allows more flexibility in the clients.
llvm-svn: 179935
|
| |
|
|
|
|
| |
the more expensive patterns. After this change we will only check basic arithmetic trees that start at cmpinstr.
llvm-svn: 179933
|
| |
|
|
|
|
| |
lists.
llvm-svn: 179932
|
| |
|
|
| |
llvm-svn: 179931
|
| |
|
|
| |
llvm-svn: 179930
|
| |
|
|
| |
llvm-svn: 179929
|
| |
|
|
| |
llvm-svn: 179928
|
| |
|
|
| |
llvm-svn: 179927
|
| |
|
|
|
|
|
|
|
| |
The logic that actually compares the types considers pointers and integers the
same if they are of the same size. This created a strange mismatch between hash
and reality and made the test case for this fail on some platforms (yay,
test cases).
llvm-svn: 179905
|
| |
|
|
|
|
|
|
|
| |
Also make some static functions class functions to avoid having to mention the
class namespace for enums all the time.
No functionality change intended.
llvm-svn: 179886
|
| |
|
|
| |
llvm-svn: 179826
|
| |
|
|
|
|
|
| |
If the return type is a pointer and the call returns an integer, then do the
inttoptr conversions. And vice versa.
llvm-svn: 179817
|
| |
|
|
| |
llvm-svn: 179789
|
| |
|
|
|
|
| |
limitation that extract is promoted over a cast only if the cast has only one use.
llvm-svn: 179786
|
| |
|
|
|
|
| |
it has only 2 uses: one to promote the vector phi in a loop and the other use is an extract operation of one element at a constant location.
llvm-svn: 179783
|
| |
|
|
| |
llvm-svn: 179775
|
| |
|
|
|
|
|
|
|
|
|
|
| |
A min/max operation is represented by a select(cmp(lt/le/gt/ge, X, Y), X, Y)
sequence in LLVM. If we see such a sequence we can treat it just as any other
commutative binary instruction and reduce it.
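A small Python sketch of why the select/compare form can be treated like any other commutative binary operation, assuming "reduce" here means a reduction over many values. The helper names (`select`, `smin`, `smax`, `lanewise_reduce`) are hypothetical and only illustrate the idea.

```python
from functools import reduce


def select(cond, if_true, if_false):
    """Models the select instruction."""
    return if_true if cond else if_false


def smin(x, y):
    # select(cmp(lt, X, Y), X, Y)
    return select(x < y, x, y)


def smax(x, y):
    # select(cmp(gt, X, Y), X, Y)
    return select(x > y, x, y)


def lanewise_reduce(op, values, lanes=4):
    """Reduce each lane independently, then combine the partial results.

    Reordering like this is only valid because op is commutative and
    associative, which holds for min and max.
    """
    partial = [reduce(op, values[lane::lanes])
               for lane in range(lanes) if values[lane::lanes]]
    return reduce(op, partial)
```

For example, `lanewise_reduce(smin, [5, 3, 8, 1, 9, 2, 7])` gives the same answer as a straight left-to-right reduction.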
This appears to help bzip2 by about 1.5% on an imac12,2.
radar://12960601
llvm-svn: 179773
|
| |
|
|
|
|
| |
Fixes PR15748.
llvm-svn: 179757
|
| |
|
|
|
|
| |
It is causing stage2 builds to fail, let's get them running again.
llvm-svn: 179750
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Simplify:
(select (icmp eq (and X, C1), 0), Y, (or Y, C2))
Into:
(or (shl (and X, C1), C3), Y)
Where:
C3 = Log(C2) - Log(C1)
If:
C1 and C2 are both powers of two
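The identity can be checked by brute force. The sketch below is illustrative only: the function names are hypothetical, and the handling of a negative C3 as a right shift is an assumption that the message above does not spell out.

```python
def log2_exact(c):
    """Log of a power of two; rejects anything else."""
    if c <= 0 or c & (c - 1) != 0:
        raise ValueError("expected a power of two")
    return c.bit_length() - 1


def select_form(x, y, c1, c2):
    # (select (icmp eq (and X, C1), 0), Y, (or Y, C2))
    return y if (x & c1) == 0 else (y | c2)


def shift_form(x, y, c1, c2):
    # (or (shl (and X, C1), C3), Y) with C3 = Log(C2) - Log(C1)
    c3 = log2_exact(c2) - log2_exact(c1)
    bit = x & c1
    # Assumption: when C3 is negative the bit is moved with a right shift.
    moved = bit << c3 if c3 >= 0 else bit >> -c3
    return moved | y
```

The reasoning is short: `X & C1` is either 0 or C1, and shifting C1 by C3 yields exactly C2, so the `or` adds C2 to Y precisely when the tested bit is set.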
llvm-svn: 179748
|
| |
|
|
|
| |
outside said for loop in the presence of differing provenance caused by escaping blocks.
This occurs due to an alloca representing a separate ownership from the
original pointer. Thus consider the following pseudo-IR:
objc_retain(%a)
for (...) {
objc_retain(%a)
%block <- %a
F(%block)
objc_release(%block)
}
objc_release(%a)
From the perspective of the optimizer, the %block is a separate
provenance from the original %a. Thus the optimizer pairs up the inner
retain for %a and the outer release from %a, resulting in segfaults.
This is fixed by noting that the signature of a mismatch of
retain/releases inside the for loop is a Use/CanRelease top down with a
None bottom up (since bottom up the Retain-CanRelease-Use-Release
sequence is completed by the inner objc_retain, but top down due to the
differing provenance from the objc_release said sequence is not
completed). In said case in CheckForCFGHazards, we now clear the state
of %a implying that no pairing will occur.
Additionally a test case is included.
rdar://12969722
llvm-svn: 179747
|
| |
|
|
| |
llvm-svn: 179746
|
| |
|
|
|
|
| |
ssa identifier.
llvm-svn: 179729
|
| |
|
|
| |
llvm-svn: 179721
|
| |
|
|
|
|
| |
EnableCheckForCFGHazards, EnableARCOptimizations.
llvm-svn: 179718
|
| |
|
|
| |
llvm-svn: 179717
|
| |
|
|
|
|
| |
Differential Revision: http://llvm-reviews.chandlerc.com/D620
llvm-svn: 179661
|
| |
|
|
|
|
|
|
|
|
|
| |
If a switch instruction has a case for every possible value of its type,
with the same successor, SimplifyCFG would replace it with an icmp ult,
but the computation of the bound overflows in that case, which inverts
the test.
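A minimal Python model of the overflow, assuming the range test has the usual `(x - first_case) <u num_cases` shape. The function names and the explicit `bits` parameter are hypothetical; they exist only to make the wrap-around visible.

```python
def range_check(x, first_case, num_cases, bits):
    """Models the icmp ult with the bound computed in the switch's own type."""
    mask = (1 << bits) - 1
    bound = num_cases & mask                    # 2**bits wraps around to 0
    return ((x - first_case) & mask) < bound    # unsigned less-than


def reference_check(x, first_case, num_cases, bits):
    """Same test, but with the bound kept as an unbounded integer."""
    mask = (1 << bits) - 1
    return ((x - first_case) & mask) < num_cases
```

With an 8-bit condition and 256 cases the bound wraps to 0, so the comparison that should always be true becomes always false.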
Patch by Jed Davis!
llvm-svn: 179587
|
| |
|
|
|
|
|
|
| |
Two return types are not equivalent if one is a pointer and the other is an
integral. This is because we cannot bitcast a pointer to an integral value.
PR15185
llvm-svn: 179569
|
| |
|
|
|
|
| |
vector-gather sequence out of loops.
llvm-svn: 179562
|
| |
|
|
| |
llvm-svn: 179542
|
| |
|
|
|
|
| |
-fslp-vectorize run the slp-vectorizer.
llvm-svn: 179508
|
| |
|
|
| |
llvm-svn: 179505
|
| |
|
|
|
|
| |
instructions.
llvm-svn: 179504
|
| |
|
|
|
| |
One performs: (X == 13 | X == 14) -> X-13 <u 2
The other: (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1
The problem is that there are certain values of C1 and C2 that
trigger both transforms but the first one blocks out the second;
this generates suboptimal code.
Reordering the transforms should be better in every case and
allows us to do interesting stuff like turn:
%shr = lshr i32 %X, 4
%and = and i32 %shr, 15
%add = add i32 %and, -14
%tobool = icmp ne i32 %add, 0
into:
%and = and i32 %X, 240
%tobool = icmp ne i32 %and, 224
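Both folds and the IR example can be checked with a short Python sketch over 32-bit values. The function names are hypothetical. The precondition noted on `mask_form` (C1 and C2 differ in exactly one bit, and that bit is clear in C1) is not stated in the message above; it is what the arithmetic requires.

```python
MASK32 = 0xFFFFFFFF


def range_form(x):
    # (X == 13 | X == 14) -> X-13 <u 2
    return ((x - 13) & MASK32) < 2


def mask_form(a, c1, c2):
    # (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1
    # Valid when C1 ^ C2 is a single bit that is clear in C1.
    return (a & ~(c1 ^ c2) & MASK32) == c1


def before(x):
    shr = x >> 4                   # %shr = lshr i32 %X, 4
    and_ = shr & 15                # %and = and i32 %shr, 15
    add = (and_ - 14) & MASK32     # %add = add i32 %and, -14
    return add != 0                # %tobool = icmp ne i32 %add, 0


def after(x):
    return (x & 240) != 224        # and i32 %X, 240 ; icmp ne i32 %and, 224
```

A pair such as 14 and 15 satisfies both patterns: the constants are adjacent, so the range form applies, and they differ in a single bit, so the mask form applies too.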
llvm-svn: 179493
|
| |
|
|
| |
llvm-svn: 179483
|
| |
|
|
| |
llvm-svn: 179479
|
| |
|
|
|
|
| |
and add the cost of extracting values from the roots of the tree.
llvm-svn: 179475
|
| |
|
|
| |
llvm-svn: 179470
|