| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the switch-to-lookup tables transform landed in SimplifyCFG, it
was pointed out that this could be inappropriate for some targets.
Since there was no way at the time for the pass to know anything about
the target, an awkward reverse-transform was added in CodeGenPrepare
that turned lookup tables back into switches for some targets.
This patch uses the new TargetTransformInfo to determine if a
switch should be transformed, and removes
CodeGenPrepare::ConvertLoadToSwitch.
llvm-svn: 167011
|
| |
|
|
|
|
|
|
|
| |
checks to avoid performing compile-time arithmetic on PPCDoubleDouble.
Now that APFloat supports arithmetic on PPCDoubleDouble, those checks
are no longer needed, and we can treat the type like any other.
llvm-svn: 166958
|
| |
|
|
|
|
|
|
|
| |
wrapper returns a vector of integers when passed a vector of pointers) by having
getIntPtrType itself return a vector of integers in this case. Outside of this
wrapper, I didn't find anywhere in the codebase that was relying on the old
behaviour for vectors of pointers, so give this a whirl through the buildbots.
llvm-svn: 166939
|
| |
|
|
|
|
| |
I don't think this is possible with the current implementation but that may change eventually.
llvm-svn: 166877
|
| |
|
|
|
|
|
|
|
|
|
| |
This turns loops like
for (unsigned i = 0; i != n; ++i)
p[i] = p[i+1];
into memmove, which has a highly optimized implementation in most libcs.
This was really easy with the new DependenceAnalysis :)
llvm-svn: 166875
|
| |
|
|
|
|
|
|
|
|
|
| |
Requires a lot less code and complexity on loop-idiom's side and the more
precise analysis can catch more cases, like the one I included as a test case.
This also fixes the edge-case miscompilation from PR9481.
Compile time performance seems to be slightly worse, but this is mostly due
to an extra LCSSA run scheduled by the PassManager and should be fixed there.
llvm-svn: 166874
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
smaller integer loads and stores.
The high-level motivation is that the frontend sometimes generates
a single whole-alloca integer load or store during ABI lowering of
splittable allocas. We need to be able to break this apart in order to
see the underlying elements and properly promote them to SSA values. The
hope is that this fixes some performance regressions on x86-32 with the
new SROA pass.
Unfortunately, this causes quite a bit of churn in the test cases, and
bloats some IR that comes out. When we see an alloca that consists soley
of bits and bytes being extracted and re-inserted, we now do some
splitting first, before building widened integer "bucket of bits"
representations. These are always well folded by instcombine however, so
this shouldn't actually result in missed opportunities.
If this splitting of all-integer allocas does cause problems (perhaps
due to smaller SSA values going into the RA), we could potentially go to
some extreme measures to only do this integer splitting trick when there
are non-integer component accesses of an alloca, but discovering this is
quite expensive: it adds yet another complete walk of the recursive use
tree of the alloca.
Either way, I will be watching build bots and LNT bots to see what
fallout there is here. If anyone gets x86-32 numbers before & after this
change, I would be very interested.
llvm-svn: 166662
|
| |
|
|
|
|
|
| |
GVN will now generate ptrtoint instructions for vectors of pointers.
Fixes PR14166.
llvm-svn: 166624
|
| |
|
|
| |
llvm-svn: 166607
|
| |
|
|
|
|
| |
command. Bleh, sorry about this!
llvm-svn: 166596
|
| |
|
|
| |
llvm-svn: 166591
|
| |
|
|
|
|
|
|
|
| |
address space.
This checkin also adds in some tests that utilize these paths and updates some of the
clients.
llvm-svn: 166578
|
| |
|
|
|
|
|
| |
every TU where it's implicitly instantiated, even if there's an implicit
instantiation for the same types available in another TU.
llvm-svn: 166470
|
| |
|
|
|
|
| |
bots.
llvm-svn: 166424
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
very small but very important bugfix:
bool shouldExplore(Use *U) {
Value *V = U->get();
if (isa<CallInst>(V) || isa<InvokeInst>(V))
[...]
should have read:
bool shouldExplore(Use *U) {
Value *V = U->getUser();
if (isa<CallInst>(V) || isa<InvokeInst>(V))
Fixes PR14143!
llvm-svn: 166407
|
| |
|
|
|
|
|
|
| |
deciding whether"
It broke selfhosting stage2 in several builders.
llvm-svn: 166406
|
| |
|
|
|
|
| |
calls can be marked tail.
llvm-svn: 166405
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
LoopDependenceAnalysis."
It passes all tests, produces better results than the old code but uses the
wrong pass, LoopDependenceAnalysis, which is old and unmaintained. "Why is it
still in tree?", you might ask. The answer is obviously: "To confuse developers."
Just swapping in the new dependency pass sends the pass manager into an infinte
loop, I'll try to figure out why tomorrow.
llvm-svn: 166399
|
| |
|
|
|
|
|
|
|
|
| |
Requires a lot less code and complexity on loop-idiom's side and the more
precise analysis can catch more cases, like the one I included as a test case.
This also fixes the edge-case miscompilation from PR9481. I'm not entirely
sure that all cases are handled that the old checks handled but LDA will
certainly become smarter in the future.
llvm-svn: 166390
|
| |
|
|
| |
llvm-svn: 166375
|
| |
|
|
| |
llvm-svn: 166340
|
| |
|
|
|
|
|
|
| |
input is zero.
Fixes PR13028.
llvm-svn: 166313
|
| |
|
|
|
|
|
| |
This can invalidate the iterators leading to use after frees and crashes.
Fixes PR12536.
llvm-svn: 166291
|
| |
|
|
|
|
| |
interface.
llvm-svn: 166264
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This patch migrates the strcpy optimizations from the simplify-libcalls pass
into the instcombine library call simplifier. Note also that StrCpyChkOpt
has been updated with a few simplifications that were being done in the
simplify-libcalls version of StrCpyOpt, but not in the migrated implementation
of StrCpyOpt. There is no reason to overload StrCpyOpt with fortified and
regular simplifications in the new model since there is already a dedicated
simplifier for __strcpy_chk.
llvm-svn: 166198
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
operate purely on values. Sink the alloca loading and storing logic into
the rewrite routines that are specific to alloca-integer-rewrite
driving. This is just a refactoring here, but the subsequent step will
be to reuse the insertion and extraction logic when rewriting integer
loads and stores that have been split and decomposed into narrower loads
and stores.
No functionality changed other than different names for instructions.
llvm-svn: 166176
|
| |
|
|
| |
llvm-svn: 166175
|
| |
|
|
|
|
|
|
|
|
|
| |
The TargetTransform changes are breaking LTO bootstraps of clang. I am
working with Nadav to figure out the problem, but I am reverting it for now
to get our buildbots working.
This reverts svn commits: 165665 165669 165670 165786 165787 165997
and I have also reverted clang svn 165741
llvm-svn: 166168
|
| |
|
|
|
|
|
|
|
|
|
| |
a pointer. A very bad idea. Let's not do that. Fixes PR14105.
Note that this wasn't *that* glaring of an oversight. Originally, these
routines were only called on offsets within an alloca, which are
intrinsically positive. But over the evolution of the pass, they ended
up being called for arbitrary offsets, and things went downhill...
llvm-svn: 166095
|
| |
|
|
|
|
|
|
|
|
|
| |
revision makes no sense. We cannot use the address space of the *post
indexed* type to conclude anything about a *pre indexed* pointer type's
size. More importantly, this index can never be over a pointer. We are
indexing over arrays and vectors here.
Of course, I have no test case here. Neither did the original patch. =/
llvm-svn: 166091
|
| |
|
|
| |
llvm-svn: 166053
|
| |
|
|
| |
llvm-svn: 166050
|
| |
|
|
| |
llvm-svn: 166045
|
| |
|
|
|
|
| |
simplify the code a bit. No functionality change.
llvm-svn: 166009
|
| |
|
|
|
|
| |
own class named AttrBuilder. No functionality change.
llvm-svn: 165960
|
| |
|
|
|
|
| |
different pointer sizes on a per address space basis.
llvm-svn: 165941
|
| |
|
|
|
|
|
|
| |
includes extracting ints for copying elsewhere and inserting ints when
copying into the alloca. This should fix the CanSROA assertion coming
out of Clang's regression test suite.
llvm-svn: 165931
|
| |
|
|
|
|
|
| |
and generally clean up the memset handling. It had rotted a bit as the
other rewriting logic got polished more.
llvm-svn: 165930
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cases where we have partial integer loads and stores to an otherwise
promotable alloca to widen[1] those loads and stores to cover the entire
alloca and bitcast them into the appropriate type such that promotion
can proceed.
These partial loads and stores stem from an annoying confluence of ARM's
calling convention and ABI lowering and the FCA pre-splitting which
takes place in SROA. Clang lowers a { double, double } in-register
function argument as a [4 x i32] function argument to ensure it is
placed into integer 32-bit registers (a really unnerving implicit
contract between Clang and the ARM backend I would add). This results in
a FCA load of [4 x i32]* from the { double, double } alloca, and SROA
decomposes this into a sequence of i32 loads and stores. Inlining
proceeds, code gets folded, but at the end of the day, we still have i32
stores to the low and high halves of a double alloca. Widening these to
be i64 operations, and bitcasting them to double prior to loading or
storing allows promotion to proceed for these allocas.
I looked quite a bit changing the IR which Clang produces for this case
to be more friendly, but small changes seem unlikely to help. I think
the best representation we could use currently would be to pass 4 i32
arguments thereby avoiding any FCAs, but that would still require this
fix. It seems like it might eventually be nice to somehow encode the ABI
register selection choices outside of the parameter type system so that
the parameter can be a { double, double }, but the CC register
annotations indicate that this should be passed via 4 integer registers.
This patch does not address the second problem in PR14059, which is the
reverse: when a struct alloca is loaded as a *larger* single integer.
This patch also does not address some of the code quality issues with
the FCA-splitting. Those don't actually impede any optimizations really,
but they're on my list to clean up.
[1]: Pedantic footnote: for those concerned about memory model issues
here, this is safe. For the alloca to be promotable, it cannot escape or
have any use of its address that could allow these loads or stores to be
racing. Thus, widening is always safe.
llvm-svn: 165928
|
| |
|
|
|
|
|
| |
into static helper functions. They're really quite generic and are going
to be needed elsewhere shortly.
llvm-svn: 165927
|
| |
|
|
|
|
| |
This gets rid of some magic numbers.
llvm-svn: 165924
|
| |
|
|
|
|
|
|
|
|
| |
Convert the internal representation of the Attributes class into a pointer to an
opaque object that's uniqued by and stored in the LLVMContext object. The
Attributes class then becomes a thin wrapper around this opaque
object. Eventually, the internal representation will be expanded to include
attributes that represent code generation options, etc.
llvm-svn: 165917
|
| |
|
|
|
|
|
| |
This patch migrates the strcmp and strncmp optimizations from the
simplify-libcalls pass into the instcombine library call simplifier.
llvm-svn: 165915
|
| |
|
|
|
|
|
|
|
| |
Erasing from the beginning or middle of the vector is expensive, remove_if can
do it in linear time even though it's a bit ugly without lambdas.
No functionality change.
llvm-svn: 165903
|
| |
|
|
|
|
| |
it with the equivalent from the builder class.
llvm-svn: 165895
|
| |
|
|
|
|
| |
the equivalent from the builder class.
llvm-svn: 165893
|
| |
|
|
|
|
|
| |
This patch migrates the strchr and strrchr optimizations from the
simplify-libcalls pass into the instcombine library call simplifier.
llvm-svn: 165875
|
| |
|
|
|
|
|
| |
This patch migrates the strcat and strncat optimizations from the
simplify-libcalls pass into the instcombine library call simplifier.
llvm-svn: 165874
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
type coercion code, especially when targetting ARM. Things like [1
x i32] instead of i32 are very common there.
The goal of this logic is to ensure that when we are picking an alloca
type, we look through such wrapper aggregates and across any zero-length
aggregate elements to find the simplest type possible to form a type
partition.
This logic should (generally speaking) rarely fire. It only ends up
kicking in when an alloca is accessed using two different types (for
instance, i32 and float), and the underlying alloca type has wrapper
aggregates around it. I noticed a significant amount of this occurring
looking at stepanov_abstraction generated code for arm, and suspect it
happens elsewhere as well.
Note that this doesn't yet address truly heinous IR productions such as
PR14059 is concerning. Those result in mismatched *sizes* of types in
addition to mismatched access and alloca types.
llvm-svn: 165870
|
| |
|
|
|
|
|
|
|
| |
help the dragonegg builders, and no test case at this point, but this
was one dimly plausible case I spotted by inspection. Hopefully will get
a testcase from those bots soon-ish, and will tidy this up with proper
testing.
llvm-svn: 165869
|