| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
| |
the inlined function has multiple returns.
rdar://problem/12415623
llvm-svn: 180793
|
|
|
|
|
|
|
| |
Differences in bitwidth between X and Y could exist even if C1 and C2 have
the same Log2 representation.
llvm-svn: 180779
|
|
|
|
|
|
|
|
|
| |
This fixes the optimization introduced in r179748 and reverted in r179750.
While the optimization was sound, it did not properly respect differences in
bit-width.
llvm-svn: 180777
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This resurrects r179957, but adds code that makes sure we don't touch
atomic/volatile stores:
This transformation will transform a conditional store with a preceeding
uncondtional store to the same location:
a[i] =
may-alias with a[i] load
if (cond)
a[i] = Y
into an unconditional store.
a[i] = X
may-alias with a[i] load
tmp = cond ? Y : X;
a[i] = tmp
We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outway the potential case where the branch would be correctly predicted
and the cost of the executing the second store would be noticably reflected in
performance.
hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2 % performance improvement on a ARM swift chip.
Other tests in the test-suite+external seem to be mostly uninfluenced in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducable below 100% (by about .4%). Given that hmmer benefits so much I
believe this to be a fair trade off.
llvm-svn: 180731
|
|
|
|
|
|
| |
warning.
llvm-svn: 180700
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ObjCARCContract instead of ObjCARCOpts.
Turning retains into retainRV calls disrupts the data flow analysis in
ObjCARCOpts. Thus we move it as late as we can by moving it into
ObjCARCContract.
We leave in the conversion from retainRV -> retain in ObjCARCOpt since
it enables the dataflow analysis.
rdar://10813093
llvm-svn: 180698
|
|
|
|
|
|
| |
optimization.
llvm-svn: 180697
|
|
|
|
| |
llvm-svn: 180696
|
|
|
|
| |
llvm-svn: 180694
|
|
|
|
|
|
| |
ConnectTDBUTraversals so that I can prevent Changed = true from being set. This prevents an infinite loop.
llvm-svn: 180693
|
|
|
|
|
|
|
|
|
|
| |
When Reassociator optimize "(x | C1)" ^ "(X & C2)", it may swap the two
subexpressions, however, it forgot to swap cached constants (of C1 and C2)
accordingly.
rdar://13739160
llvm-svn: 180676
|
|
|
|
|
|
| |
rdar://problem/13056109
llvm-svn: 180618
|
|
|
|
|
|
|
|
|
|
|
| |
Since we can't guarantee that the original dbg.declare instrinsic
is removed by LowerDbgDeclare(), we need to make sure that we are
not inserting the same dbg.value intrinsic over and over.
This removes tons of redundant DIEs when compiling optimized code.
rdar://problem/13056109
llvm-svn: 180615
|
|
|
|
|
|
| |
based on the numbers of reads and writes.
llvm-svn: 180593
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
that were once autoreleaseRV instructions."
This reverts commit r180222.
I think this might tie in with a different problem which will require a
different approach potentially. I am reverting this in the case I need to go
down that second path.
My apologies for the noise. = /.
llvm-svn: 180590
|
|
|
|
|
|
| |
readonly pointers.
llvm-svn: 180570
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
were once autoreleaseRV instructions.
Due to the semantics of ARC, we must be extremely conservative with autorelease
calls inserted by the frontend since ARC gaurantees that said object will be in
the autorelease pool after that point, an optimization invariant that the
optimizer must respect.
On the other hand, we are allowed significantly more flexibility with
autoreleaseRV instructions.
Often times though this flexibility is disrupted by early transformations which
transform objc_autoreleaseRV => objc_autorelease if said instruction is no
longer being used as part of an RV pair (generally due to inlining). Since we
can not tell the difference in between an autorelease put into place by the
frontend and one created through said ``strength reduction'' we can not perform
these optimizations.
The addition of this set gets around said issues by allowing us to differentiate
in between said two cases.
rdar://problem/13697741.
llvm-svn: 180222
|
|
|
|
| |
llvm-svn: 180221
|
|
|
|
|
|
|
|
| |
This makes it easier to read the code.
No functionality change.
llvm-svn: 180197
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch disables memory-instruction vectorization for types that need padding
bytes, e.g., x86_fp80 has 10 bytes store size with 6 bytes padding in darwin on
x86_64. Because the load/store vectorization is performed by the bit casting to
a packed vector, which has incompatible memory layout due to the lack of padding
bytes, the present vectorizer produces inconsistent result for memory
instructions of those types.
This patch checks an equality of the AllocSize of a scalar type and allocated
size for each vector element, to ensure that there is no padding bytes and the
array can be read/written using vector operations.
Patch by Daisuke Takahashi!
Fixes PR15758.
llvm-svn: 180196
|
|
|
|
| |
llvm-svn: 180195
|
|
|
|
|
|
|
|
|
|
| |
debug location. This solves a problem where range of an inlined
subroutine is emitted wrongly.
Patch by Manman Ren.
Fixes rdar://problem/12415623
llvm-svn: 180140
|
|
|
|
|
|
|
|
| |
that the order in which the elements are scalarized is the same as the original order.
This fixes a miscompilation in FreeBSD's regex library.
llvm-svn: 180121
|
|
|
|
|
|
| |
Made the uniform write test's checks a bit stricter.
llvm-svn: 180119
|
|
|
|
|
|
|
|
|
| |
even if erroneously annotated with the parallel loop metadata.
Fixes Bug 15794:
"Loop Vectorizer: Crashes with the use of llvm.loop.parallel metadata"
llvm-svn: 180081
|
|
|
|
|
|
|
| |
or the C++ files themselves. This enables people to use
just a C compiler to interoperate with LLVM.
llvm-svn: 180063
|
|
|
|
|
|
| |
extract(cast) is transformed to cast(extract). It uses the Builder class as before. In addition the result node is added to the Worklist, so all the previous extract users will become the new scalar cast users.
llvm-svn: 180045
|
|
|
|
|
|
|
| |
Also add a check for llvm.used in the verifier and simplify clients now that
they can assume they have a ConstantArray.
llvm-svn: 180019
|
|
|
|
|
|
|
| |
This is an edge case that can happen if we modify a chain of multiple selects.
Update all operands in that case and remove the assert. PR15805.
llvm-svn: 179982
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is the temptation to make this tranform dependent on target information as
it is not going to be beneficial on all (sub)targets. Therefore, we should
probably do this in MI Early-Ifconversion.
This reverts commit r179957. Original commit message:
"SimplifyCFG: If convert single conditional stores
This transformation will transform a conditional store with a preceeding
uncondtional store to the same location:
a[i] =
may-alias with a[i] load
if (cond)
a[i] = Y
into an unconditional store.
a[i] = X
may-alias with a[i] load
tmp = cond ? Y : X;
a[i] = tmp
We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outway the potential case were the branch would be correctly predicted
and the cost of the executing the second store would be noticably reflected in
performance.
hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2 % performance improvement on a ARM swift chip.
Other tests in the test-suite+external seem to be mostly uninfluenced in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducable below 100% (by about .4%). Given that hmmer benefits so much I
believe this to be a fair trade off.
I am going to watch performance numbers across the builtbots and will revert
this if anything unexpected comes up."
llvm-svn: 179980
|
|
|
|
| |
llvm-svn: 179975
|
|
|
|
|
|
|
|
| |
with multiple users.
We did not terminate the switch case and we executed the search routine twice.
llvm-svn: 179974
|
|
|
|
|
|
| |
NumPeeps and make sure that Changed is set to true.
llvm-svn: 179968
|
|
|
|
| |
llvm-svn: 179967
|
|
|
|
| |
llvm-svn: 179966
|
|
|
|
| |
llvm-svn: 179965
|
|
|
|
|
|
| |
large multi-level nested if statement.
llvm-svn: 179964
|
|
|
|
|
|
|
|
|
|
| |
progress.
This will make it clearer when we are actually resetting a sequence's progress
vs just changing state. This is an important distinction because the former case
clears any pointers that we are tracking while the later does not.
llvm-svn: 179963
|
|
|
|
| |
llvm-svn: 179960
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This transformation will transform a conditional store with a preceeding
uncondtional store to the same location:
a[i] =
may-alias with a[i] load
if (cond)
a[i] = Y
into an unconditional store.
a[i] = X
may-alias with a[i] load
tmp = cond ? Y : X;
a[i] = tmp
We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outway the potential case were the branch would be correctly predicted
and the cost of the executing the second store would be noticably reflected in
performance.
hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2 % performance improvement on a ARM swift chip.
Other tests in the test-suite+external seem to be mostly uninfluenced in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducable below 100% (by about .4%). Given that hmmer benefits so much I
believe this to be a fair trade off.
I am going to watch performance numbers across the builtbots and will revert
this if anything unexpected comes up.
llvm-svn: 179957
|
|
|
|
| |
llvm-svn: 179936
|
|
|
|
|
|
| |
Avoids a couple of copies and allows more flexibility in the clients.
llvm-svn: 179935
|
|
|
|
|
|
| |
the more expensive patterns. After this change will only check basic arithmetic trees that start at cmpinstr.
llvm-svn: 179933
|
|
|
|
|
|
| |
lists.
llvm-svn: 179932
|
|
|
|
| |
llvm-svn: 179931
|
|
|
|
| |
llvm-svn: 179930
|
|
|
|
| |
llvm-svn: 179929
|
|
|
|
| |
llvm-svn: 179928
|
|
|
|
| |
llvm-svn: 179927
|
|
|
|
|
|
|
|
|
| |
The logic that actually compares the types considers pointers and integers the
same if they are of the same size. This created a strange mismatch between hash
and reality and made the test case for this fail on some platforms (yay,
test cases).
llvm-svn: 179905
|