|  | Commit message (Collapse) | Author | Age | Files | Lines | 
|---|
| | 
| 
| 
| | llvm-svn: 202621 | 
| | 
| 
| 
| 
| 
| | directly, and remove the macro.
llvm-svn: 202612 | 
| | 
| 
| 
| 
| 
| 
| 
| | of boilerplate.
No intended functionality change.
llvm-svn: 202588 | 
| | 
| 
| 
| | llvm-svn: 202555 | 
| | 
| 
| 
| | llvm-svn: 202391 | 
| | 
| 
| 
| | llvm-svn: 202390 | 
| | 
| 
| 
| 
| 
| | when two unrelated pointers are compared or subtracted). This implementation has both false positives and false negatives and is not tuned for performance. A bug report for a proper implementation will follow.
llvm-svn: 202389 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | We should apply fastcc whenever profitable.  We can expand this list,
but there are lots of conventions with performance implications that we
don't want to change.
Differential Revision: http://llvm-reviews.chandlerc.com/D2705
llvm-svn: 202293 | 
| | 
| 
| 
| 
| 
| 
| 
| | truncated use.
Patch by Michael Zolotukhin!
llvm-svn: 202273 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | address spaces.
This isn't really a correctness issue (the values are truncated) but its
much cleaner.
Patch by Matt Arsenault!
llvm-svn: 202252 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | the default.
Based on the patch by Matt Arsenault, D1764!
I switched one place to use the more direct pointer type to compute the
desired address space, and I reworked the memcpy rewriting section to
reflect significant refactorings that this patch helped inspire.
Thanks to several of the folks who helped review and improve the patch
as well.
llvm-svn: 202247 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | to work independently for the slice side and the other side.
This allows us to only compute the minimum of the two when we actually
rewrite to a memcpy that needs to take the minimum, and preserve higher
alignment for one side or the other when rewriting to loads and stores.
This fix was inspired by seeing the result of some refactoring that
makes addrspace handling better.
llvm-svn: 202242 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | D1764, which in turn set off the other refactorings to make
'getSliceAlign()' a sensible thing.
There are two possible inputs to the required alignment of a memory
transfer intrinsic: the alignment constraints of the source and the
destination. If we are *only* introducing a (potentially new) offset
onto one side of the transfer, we don't need to consider the alignment
constraints of the other side. Use this to simplify the logic feeding
into alignment computation for unsplit transfers.
Also, hoist the clamp of the magical zero alignment for these intrinsics
to the more customary one alignment early. This lets several other
conditions melt away.
No functionality changed. There is a further improvement this exposes
which *will* change functionality, but that's arriving in a separate
patch.
llvm-svn: 202232 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | rewriting logic: don't pass custom offsets for the adjusted pointer to
the new alloca.
We always passed NewBeginOffset here. Sometimes we spelled it
BeginOffset, but only when they were in fact equal. Whats worse, the API
is set up so that you can't reasonably call it with anything else -- it
assumes that you're passing it an offset relative to the *original*
alloca that happens to fall within the new one. That's the whole point
of NewBeginOffset, it's the clamped beginning offset.
No functionality changed.
llvm-svn: 202231 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | alignment of the slice being rewritten, not any arbitrary offset.
Every caller is really just trying to compute the alignment for the
whole slice, never for some arbitrary alignment. They are also just
passing a type when they have one to see if we can skip an explicit
alignment in the IR by using the type's alignment. This makes for a much
simpler interface.
Another refactoring inspired by the addrspace patch for SROA, although
only loosely related.
llvm-svn: 202230 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | consistency with memcpy rewriting, and fix a latent bug in the alignment
management for memset.
The alignment issue is that getAdjustedAllocaPtr is computing the
*relative* offset into the new alloca, but the alignment isn't being set
to the relative offset, it was using the the absolute offset which is
into the old alloca.
I don't think its possible to write a test case that actually reaches
this code where the resulting alignment would be observably different,
but the intent was clearly to use the relative offset within the new
alloca.
llvm-svn: 202229 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | rather than passing them as arguments.
While I generally prefer actual arguments, in this case the readability
loss is substantial. By using members we avoid repeatedly calculating
the offsets, and once we're using members it is useful to ensure that
those names *always* refer to the original-alloca-relative new offset
for a rewritten slice.
No functionality changed. Follow-up refactoring, all toward getting the
address space patch merged.
llvm-svn: 202228 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | slice being rewritten.
We had the same code scattered across most of the visits. Instead,
compute the new offsets and the slice size once when we start to visit
a particular slice, and use the member variables from then on. This
reduces quite a bit of code duplication.
No functionality changed. Refactoring inspired to make it easier to
apply the address space patch to SROA.
llvm-svn: 202227 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | checking in SROA.
The primary change is to just rely on uge for checking that the offset
is within the allocation size. This removes the explicit checks against
isNegative which were terribly error prone (including the reversed logic
that led to PR18615) and prevented us from supporting stack allocations
larger than half the address space.... Ok, so maybe the latter isn't
*common* but it's a silly restriction to have.
Also, we used to try to support a PHI node which loaded from before the
start of the allocation if any of the loaded bytes were within the
allocation. This doesn't make any sense, we have never really supported
loading or storing *before* the allocation starts. The simplified logic
just doesn't care.
We continue to allow loading past the end of the allocation in part to
support cases where there is a PHI and some loads are larger than others
and the larger ones reach past the end of the allocation. We could solve
this a different and more conservative way, but I'm still somewhat
paranoid about this.
llvm-svn: 202224 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | their inputs come from std::stable_sort and they are not total orders.
I'm not a huge fan of this, but the really bad std::stable_sort is right
at the beginning of Reassociate. After we commit to stable-sort based
consistent respect of source order, the downstream sorts shouldn't undo
that unless they have a total order or they are used in an
order-insensitive way. Neither appears to be true for these cases.
I don't have particularly good test cases, but this jumped out by
inspection when looking for output instability in this pass due to
changes in the ordering of std::sort.
llvm-svn: 202196 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | implemented this way a long time ago and due to the overwhelming bugs
that surfaced, moved to a much more relaxed variant. Richard Smith would
like to understand the magnitude of this problem and it seems fairly
harmless to keep some flag-controlled logic to get the extremely strict
behavior here. I'll remove it if it doesn't prove useful.
llvm-svn: 202193 | 
| | 
| 
| 
| 
| 
| 
| | Instead, have a DataLayoutPass that holds one. This will allow parts of LLVM
don't don't handle passes to also use DataLayout.
llvm-svn: 202168 | 
| | 
| 
| 
| | llvm-svn: 202157 | 
| | 
| 
| 
| | llvm-svn: 202155 | 
| | 
| 
| 
| 
| 
| 
| 
| | just "load". This helps avoid pointless de-duping with order-sensitive
numbers as we already have unique names from the original load. It also
makes the resulting IR quite a bit easier to read.
llvm-svn: 202140 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | the pointer adjustment code. This is the primary code path that creates
totally new instructions in SROA and being able to lump them based on
the pointer value's name for which they were created causes
*significantly* fewer name collisions and general noise in the debug
output. This is particularly significant because it is making it much
harder to track down instability in the output of SROA, as name
de-duplication is a totally harmless form of instability that gets in
the way of seeing real problems.
The new fancy naming scheme tries to dig out the root "pre-SROA" name
for pointer values and associate that all the way through the pointer
formation instructions. Digging out the root is important to prevent the
multiple iterative rounds of SROA from just layering too much cruft on
top of cruft here. We already track the layers of SROAs iteration in the
alloca name prefix. We don't need to duplicate it here.
Should have no functionality change, and shouldn't have any really
measurable impact on NDEBUG builds, as most of the complex logic is
debug-only.
llvm-svn: 202139 | 
| | 
| 
| 
| 
| 
| 
| | PHI-pointer builder, just copy the builder and clobber the obvious
fields.
llvm-svn: 202136 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | using OldPtr more heavily. Lots of this code was written before the
rewriter had an OldPtr member setup ahead of time. There are already
asserts in place that should ensure this doesn't change any
functionality.
llvm-svn: 202135 | 
| | 
| 
| 
| | llvm-svn: 202134 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | the break statement, not just think it to yourself....
No idea how this worked at all, much less survived most bots, my
bootstrap, and some bot bootstraps!
The Polly one didn't survive, and this was filed as PR18959. I don't
have a reduced test case and honestly I'm not seeing the need. What we
probably need here are better asserts / debug-build behavior in
SmallPtrSet so that this madness doesn't make it so far.
llvm-svn: 202129 | 
| | 
| 
| 
| | llvm-svn: 202119 | 
| | 
| 
| 
| | llvm-svn: 202107 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | sorting it. This helps uncover latent reliance on the original ordering
which aren't guaranteed to be preserved by std::sort (but often are),
and which are based on the use-def chain orderings which also aren't
(technically) guaranteed.
Only available in C++11 debug builds, and behind a flag to prevent noise
at the moment, but this is generally useful so figured I'd put it in the
tree rather than keeping it out-of-tree.
llvm-svn: 202106 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | the destination operand or source operand of a memmove.
It so happens that it was impossible for SROA to try to rewrite
self-memmove where the operands are *identical*, because either such
a think is volatile (and we don't rewrite) or it is non-volatile, and we
don't even register it as a use of the alloca.
However, making the 'IsDest' test *rely* on this subtle fact is... Very
confusing for the reader. We should use the direct and readily available
test of the Use* which gives us concrete information about which operand
is being rewritten.
No functionality changed, I hope! ;]
llvm-svn: 202103 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | ordering.
The fundamental problem that we're hitting here is that the use-def
chain ordering is *itself* not a stable thing to be relying on in the
rewriting for SROA. Further, we use a non-stable sort over the slices to
arrange them based on the section of the alloca they're operating on.
With a debugging STL implementation (or different implementations in
stage2 and stage3) this can cause stage2 != stage3.
The specific aspect of this problem fixed in this commit deals with the
rewriting and load-speculation around PHIs and Selects. This, like many
other aspects of the use-rewriting in SROA, is really part of the
"strong SSA-formation" that is doen by SROA where it works very hard to
canonicalize loads and stores in *just* the right way to satisfy the
needs of mem2reg[1]. When we have a select (or a PHI) with 2 uses of the
same alloca, we test that loads downstream of the select are
speculatable around it twice. If only one of the operands to the select
needs to be rewritten, then if we get lucky we rewrite that one first
and the select is immediately speculatable. This can cause the order of
operand visitation, and thus the order of slices to be rewritten, to
change an alloca from promotable to non-promotable and vice versa.
The fix is to defer all of the speculation until *after* the rewrite
phase is done. Once we've rewritten everything, we can accurately test
for whether speculation will work (once, instead of twice!) and the
order ceases to matter.
This also happens to simplify the other subtlety of speculation -- we
need to *not* speculate anything unless the result of speculating will
make the alloca fully promotable by mem2reg. I had a previous attempt at
simplifying this, but it was still pretty horrible.
There is actually already a *really* nice test case for this in
basictest.ll, but on multiple STL implementations and inputs, we just
got "lucky". Fortunately, the test case is very small and we can
essentially build it in exactly the opposite way to get reasonable
coverage in both directions even from normal STL implementations.
llvm-svn: 202092 | 
| | 
| 
| 
| 
| 
| | No functionality change. Just reduces the noise of an upcoming patch.
llvm-svn: 202087 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | Vectorize sequential stores of a broadcasted value.
5% on eon.
radar://16124699
llvm-svn: 202067 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | After this I will set the default back to F_None. The advantage is that
before this patch forgetting to set F_Binary would corrupt a file on windows.
Forgetting to set F_Text produces one that cannot be read in notepad, which
is a better failure mode :-)
llvm-svn: 202052 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | During the LTO phase LICM will move loop invariant global variables out of loops
(informed by GlobalModRef). This makes more loops countable presenting
opportunity for the loop vectorizer.
Adding the loop vectorizer improves some TSVC benchmarks and twolf/ref dataset
(5%) on x86-64.
radar://15970632
llvm-svn: 202051 | 
| | 
| 
| 
| 
| 
| | This will make it easier to switch the default to being binary files.
llvm-svn: 202042 | 
| | 
| 
| 
| 
| 
| | internal flags that allowed to override it. The tests pass, but still this change might break asan on some platform not covered by tests. If you see this, please submit a fix with a test.
llvm-svn: 202033 | 
| | 
| 
| 
| | llvm-svn: 201930 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | CodeGenPrepare uses extensively TargetLowering which is part of libLLVMCodeGen.
This is a layer violation which would introduce eventually a dependence on
CodeGen in ScalarOpts.
Move CodeGenPrepare into libLLVMCodeGen to avoid that.
Follow-up of <rdar://problem/15519855>
llvm-svn: 201912 | 
| | 
| 
| 
| | llvm-svn: 201870 | 
| | 
| 
| 
| | llvm-svn: 201833 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | I am really sorry for the noise, but the current state where some parts of the
code use TD (from the old name: TargetData) and other parts use DL makes it
hard to write a patch that changes where those variables come from and how
they are passed along.
llvm-svn: 201827 | 
| | 
| 
| 
| 
| 
| | to a direct call. This is important for the CallGraph iteration. Patch by Björn Steinbrink!
llvm-svn: 201822 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | r201608 made llvm corretly handle private globals with MachO. r201622 fixed
a bug in it and r201624 and r201625 were changes for using private linkage,
assuming that llvm would do the right thing.
They all got reverted because r201608 introduced a crash in LTO. This patch
includes a fix for that. The issue was that TargetLoweringObjectFile now has
to be initialized before we can mangle names of private globals. This is
trivially true during the normal codegen pipeline (the asm printer does it),
but LTO has to do it manually.
llvm-svn: 201700 | 
| | 
| 
| 
| 
| 
| 
| | Since r201608 got reverted, it is not safe to use private linkage in these cases
until it is committed back.
llvm-svn: 201688 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | On x86, shifting a vector by a scalar is significantly cheaper than shifting a
vector by another fully general vector. Unfortunately, because SelectionDAG
operates on just one basic block at a time, the shufflevector instruction that
reveals whether the right-hand side of a shift *is* really a scalar is often
not visible to CodeGen when it's needed.
This adds another handler to CodeGenPrepare, to sink any useful shufflevector
instructions down to the basic block where they're used, predicated on a target
hook (since on other architectures, doing so will often just introduce extra
real work).
rdar://problem/16063505
llvm-svn: 201655 |