| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
BBs might contain non-LCSSA'd values after the LCSSA pass is run if they
are unreachable from the entry block.
Normally, the users of the instruction would be PHIs but the unreachable
BBs have normal users; rewrite their uses to be undef values.
An alternative fix could involve fixing this at LCSSA but that would
require this invariant to hold after subsequent transforms. If a BB
created an unreachable block, they would be in violation of this.
This fixes PR19798.
Differential Revision: http://reviews.llvm.org/D5146
llvm-svn: 216911
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SROA may decide that it needs to insert a bitcast and would set it's
insertion point before a PHI. This will create an invalid module
right quick.
Instead, choose the first insertion point in the basic block that holds
our PHI.
This fixes PR20822.
Differential Revision: http://reviews.llvm.org/D5141
llvm-svn: 216891
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
chain became completely broken here as *all* intrinsic users ended up
being skipped, and the ones that seemed to be singled out were actually
the exact wrong set.
This is a great example of why long else-if chains can be easily
confusing. Switch the entire code to use early exits and early continues
to have simpler (and more importantly, correct) logic here, as well as
fixing the reversed logic for detecting and continuing on lifetime
intrinsics.
I've also significantly cleaned up the test case and added another test
case demonstrating an example where the optimization is not (trivially)
safe to perform.
llvm-svn: 216871
|
| |
|
|
|
|
| |
Aden, review by Hal Finkel.
llvm-svn: 216865
|
| |
|
|
|
|
| |
just letting them be implicitly created.
llvm-svn: 216525
|
| |
|
|
| |
llvm-svn: 216351
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Fixes PR20425.
During slice building, if all of the incoming values of a PHI node are the same, replace the PHI node with the common value. This simplification makes alloca's used by PHI nodes easier to promote.
Test Plan: Added three more tests in phi-and-select.ll
Reviewers: nlewycky, eliben, meheff, chandlerc
Reviewed By: chandlerc
Subscribers: zinovy.nis, hfinkel, baldrick, llvm-commits
Differential Revision: http://reviews.llvm.org/D4659
llvm-svn: 216299
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In this case, we are creating an x86_fp80 slice for a union from C where
the padding bytes may contain real data. An x86_fp80 alloca is 16 bytes,
and that's just fine. We can't, however, use regular loads and stores to
access the slice, because the store size is only 10 bytes / 80 bits.
Instead, use memcpy and memset.
Fixes PR18726.
Reviewed By: chandlerc
Differential Revision: http://reviews.llvm.org/D5012
llvm-svn: 216248
|
| |
|
|
| |
llvm-svn: 216176
|
| |
|
|
|
|
|
|
|
|
|
| |
This does not require -ffast-math, and it gives CSE/GVN more options to
eliminate duplicate expressions in, e.g.:
return ((x + 0.1234 * y) * (x - 0.1234 * y));
Differential Revision: http://reviews.llvm.org/D4904
llvm-svn: 216169
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"sub nsw" and "mul nsw" instructions.
Currently only "add nsw" are widened. This patch eliminates tons of "sext" instructions for 64 bit code (and the corresponding target code) in cases like:
int N = 100;
float **A;
void foo(int x0, int x1)
{
float * A_cur = &A[0][0];
float * A_next = &A[1][0];
for(int x = x0; x < x1; ++x).
{
// Currently only [x+N] case is widened. Others 2 cases lead to sext.
// This patch fixes it, so all 3 cases do not need sext.
const float div = A_cur[x + N] + A_cur[x - N] + A_cur[x * N];
A_next[x] = div;
}
}
...
> clang++ test.cpp -march=core-avx2 -Ofast -fno-unroll-loops -fno-tree-vectorize -S -o -
Differential Revision: http://reviews.llvm.org/D4695
llvm-svn: 216160
|
| |
|
|
|
|
| |
needing to mention the size.
llvm-svn: 216158
|
| |
|
|
|
|
|
|
| |
avoid needing to mention the size."
Getting a weird buildbot failure that I need to investigate.
llvm-svn: 215870
|
| |
|
|
|
|
| |
needing to mention the size.
llvm-svn: 215868
|
| |
|
|
|
|
|
|
|
| |
Replace the old code in GVN and BBVectorize with it. Update SimplifyCFG to use
it.
Patch by Björn Steinbrink!
llvm-svn: 215723
|
| |
|
|
|
|
|
| |
Vector instructions are (still) not supported for either integer or floating
point. Hopefully, that work will be landed shortly.
llvm-svn: 215647
|
| |
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 215573
|
| |
|
|
| |
llvm-svn: 215169
|
| |
|
|
|
|
|
|
| |
this case, the code path dealing with vector promotion was missing the explicit
checks for lifetime intrinsics that were present on the corresponding integer
promotion path.
llvm-svn: 215148
|
| |
|
|
|
|
| |
It broke msan.
llvm-svn: 214989
|
| |
|
|
|
|
| |
Committing http://reviews.llvm.org/D4798 for Robin Morisset (morisset@google.com)
llvm-svn: 214934
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Optimize the following IR:
%1 = tail call noalias i8* @calloc(i64 1, i64 4)
%2 = bitcast i8* %1 to i32*
; This store is dead and should be removed
store i32 0, i32* %2, align 4
Memory returned by calloc is guaranteed to be zero initialized. If the value being stored is the constant zero (and the store is not otherwise observable across threads), we can delete the store. If the store is to an out of bounds address, it is undefined and thus also removable.
Reviewed By: nicholas
Differential Revision: http://reviews.llvm.org/D3942
llvm-svn: 214897
|
| |
|
|
|
|
|
| |
Some configure scripts declare this with the wrong prototype, which can lead
to an assertion failure.
llvm-svn: 214593
|
| |
|
|
|
|
|
|
|
| |
Instead of moving out the data in a ErrorOr<std::unique_ptr<Foo>>, get
a reference to it.
Thanks to David Blaikie for the suggestion.
llvm-svn: 214516
|
| |
|
|
|
|
| |
function as well. No functional changes intended.
llvm-svn: 214325
|
| |
|
|
|
|
|
|
|
|
| |
hint) the loop unroller replaces the llvm.loop.unroll.count metadata with
llvm.loop.unroll.disable metadata to prevent any subsequent unrolling
passes from unrolling more than the hint indicates. This patch fixes
an issue where loop unrolling could be disabled for other loops as well which
share the same llvm.loop metadata.
llvm-svn: 213900
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds scoped noalias metadata. The primary motivations for this
feature are:
1. To preserve noalias function attribute information when inlining
2. To provide the ability to model block-scope C99 restrict pointers
Neither of these two abilities are added here, only the necessary
infrastructure. In fact, there should be no change to existing functionality,
only the addition of new features. The logic that converts noalias function
parameters into this metadata during inlining will come in a follow-up commit.
What is added here is the ability to generally specify noalias memory-access
sets. Regarding the metadata, alias-analysis scopes are defined similar to TBAA
nodes:
!scope0 = metadata !{ metadata !"scope of foo()" }
!scope1 = metadata !{ metadata !"scope 1", metadata !scope0 }
!scope2 = metadata !{ metadata !"scope 2", metadata !scope0 }
!scope3 = metadata !{ metadata !"scope 2.1", metadata !scope2 }
!scope4 = metadata !{ metadata !"scope 2.2", metadata !scope2 }
Loads and stores can be tagged with an alias-analysis scope, and also, with a
noalias tag for a specific scope:
... = load %ptr1, !alias.scope !{ !scope1 }
... = load %ptr2, !alias.scope !{ !scope1, !scope2 }, !noalias !{ !scope1 }
When evaluating an aliasing query, if one of the instructions is associated
with an alias.scope id that is identical to the noalias scope associated with
the other instruction, or is a descendant (in the scope hierarchy) of the
noalias scope associated with the other instruction, then the two memory
accesses are assumed not to alias.
Note that is the first element of the scope metadata is a string, then it can
be combined accross functions and translation units. The string can be replaced
by a self-reference to create globally unqiue scope identifiers.
[Note: This overview is slightly stylized, since the metadata nodes really need
to just be numbers (!0 instead of !scope0), and the scope lists are also global
unnamed metadata.]
Existing noalias metadata in a callee is "cloned" for use by the inlined code.
This is necessary because the aliasing scopes are unique to each call site
(because of possible control dependencies on the aliasing properties). For
example, consider a function: foo(noalias a, noalias b) { *a = *b; } that gets
inlined into bar() { ... if (...) foo(a1, b1); ... if (...) foo(a2, b2); } --
now just because we know that a1 does not alias with b1 at the first call site,
and a2 does not alias with b2 at the second call site, we cannot let inlining
these functons have the metadata imply that a1 does not alias with b2.
llvm-svn: 213864
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to enable the preservation of noalias function parameter information
after inlining, and the representation of block-level __restrict__ pointer
information (etc.), additional kinds of aliasing metadata will be introduced.
This metadata needs to be carried around in AliasAnalysis::Location objects
(and MMOs at the SDAG level), and so we need to generalize the current scheme
(which is hard-coded to just one TBAA MDNode*).
This commit introduces only the necessary refactoring to allow for the
introduction of other aliasing metadata types, but does not actually introduce
any (that will come in a follow-up commit). What it does introduce is a new
AAMDNodes structure to hold all of the aliasing metadata nodes associated with
a particular memory-accessing instruction, and uses that structure instead of
the raw MDNode* in AliasAnalysis::Location, etc.
No functionality change intended.
llvm-svn: 213859
|
| |
|
|
|
|
| |
#pragma clang loop unroll(full).
llvm-svn: 213789
|
| |
|
|
|
|
| |
new form using the string "full".
llvm-svn: 213772
|
| |
|
|
|
|
| |
callUsesLocalStack, sometimes with IsNocapture true and sometimes with IsNocapture false. We accidentally skipped work we needed to do in the IsNocapture=false case if we were called with IsNocapture=true the first time. Fixes PR20405!
llvm-svn: 213726
|
| |
|
|
|
|
|
|
|
| |
iterator ranges."
This reverts commit r213474 (and r213475), which causes a miscompile on
a stage2 LTO build. I'll reply on the list in a moment.
llvm-svn: 213562
|
| |
|
|
|
|
|
|
|
| |
64-bit mode
Prevents hoisting of loads above stores and sinking of stores below loads
in MergedLoadStoreMotion.cpp (rdar://15991737)
llvm-svn: 213497
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ranges.
Summary: This patch introduces two new iterator ranges and updates existing code to use it. No functional change intended.
Test Plan: All tests (make check-all) still pass.
Reviewers: dblaikie
Reviewed By: dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D4481
llvm-svn: 213474
|
| |
|
|
| |
llvm-svn: 213456
|
| |
|
|
| |
llvm-svn: 213448
|
| |
|
|
| |
llvm-svn: 213414
|
| |
|
|
| |
llvm-svn: 213412
|
| |
|
|
|
|
|
|
|
|
|
| |
Merges equivalent loads on both sides of a hammock/diamond
and hoists into into the header.
Merges equivalent stores on both sides of a hammock/diamond
and sinks it to the footer.
Can enable if conversion and tolerate better load misses
and store operand latencies.
llvm-svn: 213396
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Converting outermost zext(a) to sext(a) causes worse code when the
computation of zext(a) could be reused. For example, after converting
... = array[zext(a)]
... = array[zext(a) + 1]
to
... = array[sext(a)]
... = array[zext(a) + 1],
the program computes sext(a), which is actually unnecessary. I added one
test in split-gep-and-gvn.ll to illustrate this scenario.
Also, with r211281 and r211084, we annotate more "nuw" tags to
computation involving CUDA intrinsics such as threadIdx.x. These
annotations help with splitting GEP a lot, rendering the benefit we get
from this reverted optimization only marginal.
Test Plan: make check-all
Reviewers: eliben, meheff
Reviewed By: meheff
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D4542
llvm-svn: 213209
|
| |
|
|
|
|
| |
relevant. Fixes PR18304. Patch by David Wiberg!
llvm-svn: 212970
|
| |
|
|
|
|
|
|
|
|
| |
not properly handle the case where the predecessor block was the entry block to
the function. The only in-tree client of this is JumpThreading, which worked
around the issue in its own code. This patch moves the solution into the helper
so that JumpThreading (and other clients) do not have to replicate the same fix
everywhere.
llvm-svn: 212875
|
| |
|
|
|
|
|
|
|
|
| |
This is the one remaining place I see where passing
isSafeToSpeculativelyExecute a DataLayout pointer might matter (at least for
loads) -- I think I got the others in r212720. Most of the other remaining
callers of isSafeToSpeculativelyExecute only use it for call sites (or
otherwise exclude loads).
llvm-svn: 212730
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
isSafeToSpeculativelyExecute can optionally take a DataLayout pointer. In the
past, this was mainly used to make better decisions regarding divisions known
not to trap, and so was not all that important for users concerned with "cheap"
instructions. However, now it also helps look through bitcasts for
dereferencable loads, and will also be important if/when we add a
dereferencable pointer attribute.
This is some initial work to feed a DataLayout pointer through to callers of
isSafeToSpeculativelyExecute, generally where one was already available.
llvm-svn: 212720
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
isDereferenceablePointer should not give up upon encountering any bitcast. If
we're casting from a pointer to a larger type to a pointer to a small type, we
can continue by examining the bitcast's operand. This missing capability
was noted in a comment in the function.
In order for this to work, isDereferenceablePointer now takes an optional
DataLayout pointer (essentially all callers already had such a pointer
available). Most code uses isDereferenceablePointer though
isSafeToSpeculativelyExecute (which already took an optional DataLayout
pointer), and to enable the LICM test case, LICM needs to actually provide its DL
pointer to isSafeToSpeculativelyExecute (which it was not doing previously).
llvm-svn: 212686
|
| |
|
|
| |
llvm-svn: 212405
|
| |
|
|
|
|
|
|
|
|
|
| |
If both instructions to be replaced are marked invariant the resulting
instruction is invariant.
rdar://13358910
Fix by Erik Eckstein!
llvm-svn: 211801
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[LLVM part]
These patches rename the loop unrolling and loop vectorizer metadata
such that they have a common 'llvm.loop.' prefix. Metadata name
changes:
llvm.vectorizer.* => llvm.loop.vectorizer.*
llvm.loopunroll.* => llvm.loop.unroll.*
This was a suggestion from an earlier review
(http://reviews.llvm.org/D4090) which added the loop unrolling
metadata.
Patch by Mark Heffernan.
llvm-svn: 211710
|
| |
|
|
| |
llvm-svn: 211678
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
when sinking.
Fixes exponential compilation complexity in PR19835, caused by
LICM::sink not handling the following pattern well:
f = op g
e = op f, g
d = op e
c = op d, e
b = op c
a = op b, c
When an instruction with N uses is sunk, each of its operands gets N
new uses (all of them - phi nodes). In the example above, if a had 1
use, c would have 2, e would have 4, and g would have 8.
llvm-svn: 211673
|