| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
| |
CodeGenPrepare and SimplifyLibCalls
llvm-svn: 233596
|
| |
|
|
|
|
| |
Update lib/Analysis and lib/Transforms to use the new `DebugLoc` API.
llvm-svn: 233587
|
| |
|
|
|
|
| |
have been removed.
llvm-svn: 233419
|
| |
|
|
|
|
|
|
|
|
|
|
| |
of geps of bitcasts
This just didn't need to be here at all, but the assertion I tried to
add wasn't appropriate either - the circumstance isn't impossible, it's
just not important to deal with it here - the gep-rooted version of this
instcombine will handle this case, we don't need to duplicate it for the
case where the gep happens to be used in a bitcast.
llvm-svn: 233404
|
| |
|
|
|
|
|
|
|
|
|
|
| |
We make many redundant calls to isInterestingAlloca in the AddressSanitzier
pass. This is especially inefficient for allocas that have many uses. Let's
cache the results to speed up compilation.
The compile time improvements depend on the input. I did not see much
difference on benchmarks; however, I have a test case where compile time
goes from minutes to under a second.
llvm-svn: 233397
|
| |
|
|
| |
llvm-svn: 233392
|
| |
|
|
|
|
|
|
| |
This re-adds float2int to the tree, after fixing PR23038. It turns
out the argument to APSInt() is true-if-unsigned, rather than
true-if-signed :(. Added testcase and explanatory comment.
llvm-svn: 233370
|
| |
|
|
| |
llvm-svn: 233363
|
| |
|
|
|
|
| |
The assertion here was more expensive then it needed to be. We're only inserting allocas in the entry block, so we only need to consider ones in the entry block.
llvm-svn: 233362
|
| |
|
|
| |
llvm-svn: 233361
|
| |
|
|
|
|
| |
Minor naming, one potentially unsafe cast
llvm-svn: 233359
|
| |
|
|
|
|
| |
All the removed assertions are either implied locally by the assert at the top of the function or properties of the verifier.
llvm-svn: 233358
|
| |
|
|
|
|
|
|
| |
This patch exposes LoopVectorizer's isInductionVariable function as common
a functionality.
http://reviews.llvm.org/D8608
llvm-svn: 233352
|
| |
|
|
|
|
| |
tree, due to PR23038.
llvm-svn: 233350
|
| |
|
|
|
|
|
| |
Anding and comparing with zero can be done in a single instruction on
most archs so this is a bit cheaper.
llvm-svn: 233291
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch enhances SLSR to handle another candidate form &B[i * S]. If
we found two candidates
S1: X = &B[i * S]
S2: Y = &B[i' * S]
and S1 dominates S2, we can replace S2 with
Y = &X[(i' - i) * S]
Test Plan:
slsr-gep.ll
X86/no-slsr.ll: verify that we do not run SLSR on GEPs that already fit into
an addressing mode
Reviewers: eliben, atrick, meheff, hfinkel
Reviewed By: hfinkel
Subscribers: sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D7459
llvm-svn: 233286
|
| |
|
|
|
|
|
| |
Added test Float2Int/float2int-optnone.ll to verify that pass Float2Int
is not run on optnone functions.
llvm-svn: 233183
|
| |
|
|
|
|
|
|
| |
where possible.
Now with a fix for PR23008 and extra regression test.
llvm-svn: 233175
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The changes to InstCombine (& SCEV) do seem a bit silly - it doesn't make
anything obviously better to have the caller access the pointers element
type (the thing I'm trying to remove) than the GEP itself, but it's a
helpful migration step. This will allow me to more obviously lock down
GEP (& Load, etc) API usage, then fix all the code that accesses pointer
element types except the places that need to be removed (most of the
InstCombines) anyway - at which point I'll need to just remove all that
code because it won't be meaningful anymore (there will be no pointer
types, so no bitcasts to combine)
SCEV looks like it'll need some restructuring - we'll have to do a bit
more work for GEP canonicalization, since it'll depend on how it's used
if we can even manage to canonicalize it to a non-ugly GEP. I guess we
can do some fun stuff like voting (do 2 out of 3 load from the GEP with
a certain type that gives a pretty GEP? Does every typed use of the GEP
use either a specific type or a generic type (i8*, etc)?)
llvm-svn: 233131
|
| |
|
|
|
|
|
|
|
|
|
|
| |
...because this is what happens when an instruction
set puts its underwear on after its pants.
This is an extension of r232852, r233100, and 233110:
http://llvm.org/viewvc/llvm-project?view=revision&revision=232852
http://llvm.org/viewvc/llvm-project?view=revision&revision=233100
http://llvm.org/viewvc/llvm-project?view=revision&revision=233110
llvm-svn: 233127
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The changes to InstCombine do seem a bit silly - it doesn't make
anything obviously better to have the caller access the pointers element
type (the thing I'm trying to remove) than the GEP itself, but it's a
helpful migration step. This will allow me to more obviously lock down
GEP (& Load, etc) API usage, then fix all the code that accesses pointer
element types except the places that need to be removed (most of the
InstCombines) anyway - at which point I'll need to just remove all that
code because it won't be meaningful anymore (there will be no pointer
types, so no bitcasts to combine)
llvm-svn: 233126
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch tries to merge duplicate landing pads when they branch to a common shared target.
Given IR that looks like this:
lpad1:
%exn = landingpad {i8*, i32} personality i32 (...)* @__gxx_personality_v0
cleanup
br label %shared_resume
lpad2:
%exn2 = landingpad {i8*, i32} personality i32 (...)* @__gxx_personality_v0
cleanup
br label %shared_resume
shared_resume:
call void @fn()
ret void
}
We can rewrite the users of both landing pad blocks to use one of them. This will generally allow the shared_resume block to be merged with the common landing pad as well.
Without this change, tail duplication would likely kick in - creating N (2 in this case) copies of the shared_resume basic block.
Differential Revision: http://reviews.llvm.org/D8297
llvm-svn: 233125
|
| |
|
|
|
|
|
|
| |
Assertion fires in compiler-rt. Guess it does fire..
This reverts commit r233116.
llvm-svn: 233121
|
| |
|
|
|
|
|
|
| |
Assert that this doesn't fire - I'll remove all of this later, but just
leaving it in for a while in case this is firing & we just don't have
test coverage.
llvm-svn: 233116
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the IR optimizer follow-on patch for D8563: the x86 backend patch
that converts this kind of shuffle back into a vperm2.
This is also a continuation of the transform that started in D8486.
In that patch, Andrea suggested that we could convert vperm2 intrinsics that
use zero masks into a single shuffle.
This is an implementation of that suggestion.
Differential Revision: http://reviews.llvm.org/D8567
llvm-svn: 233110
|
| |
|
|
|
|
|
|
|
|
|
| |
where possible."
This caused PR23008, compiles failing with: "Use still stuck around after Def is
destroyed: %.sroa.speculated"
Also reverting follow-up r233064.
llvm-svn: 233105
|
| |
|
|
|
|
|
|
|
| |
IRCE requires the induction variables it handles to not sign-overflow.
The current scheme of checking if sext({X,+,S}) == {sext(X),+,sext(S)}
fails when SCEV simplifies sext(X) too. After this change we //also//
check no-signed-wrap by looking at the flags set on the SCEVAddRecExpr.
llvm-svn: 233102
|
| |
|
|
|
|
|
| |
IRCE should not try to eliminate range checks that check an induction
variable against a loop-varying length.
llvm-svn: 233101
|
| |
|
|
| |
llvm-svn: 233064
|
| |
|
|
|
|
|
|
|
|
|
|
| |
It is possible to have code that converts from integer to float, performs operations then converts back, and the result is provably the same as if integers were used.
This can come from different sources, but the most obvious is a helper function that uses floats but the arguments given at an inlined callsites are integers.
This pass considers all integers requiring a bitwidth less than or equal to the bitwidth of the mantissa of a floating point type (23 for floats, 52 for doubles) as exactly representable in floating point.
To reduce the risk of harming efficient code, the pass only attempts to perform complete removal of inttofp/fptoint operations, not just move them around.
llvm-svn: 233062
|
| |
|
|
| |
llvm-svn: 232998
|
| |
|
|
| |
llvm-svn: 232995
|
| |
|
|
| |
llvm-svn: 232993
|
| |
|
|
|
|
| |
NFC.
llvm-svn: 232976
|
| |
|
|
|
|
| |
NFC.
llvm-svn: 232944
|
| |
|
|
|
|
| |
bitfield transform.
llvm-svn: 232903
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
strchr("123!", C) != nullptr is a common pattern to check if C is one
of 1, 2, 3 or !. If the largest element of the string is smaller than
the target's register size we can easily create a bitfield and just
do a simple test for set membership.
int foo(char C) { return strchr("123!", C) != nullptr; } now becomes
cmpl $64, %edi ## range check
sbbb %al, %al
movabsq $0xE000200000001, %rcx
btq %rdi, %rcx ## bit test
sbbb %cl, %cl
andb %al, %cl ## and the two conditions
andb $1, %cl
movzbl %cl, %eax ## returning an int
ret
(imho the backend should expand this into a series of branches, but
that's a different story)
The code is currently limited to bit fields that fit in a register, so
usually 64 or 32 bits. Sadly, this misses anything using alpha chars
or {}. This could be fixed by just emitting a i128 bit field, but that
can generate really ugly code so we have to find a better way. To some
degree this is also recreating switch lowering logic, but we can't
simply emit a switch instruction and thus change the CFG within
instcombine.
llvm-svn: 232902
|
| |
|
|
|
|
| |
This is just memchr(x, y, 0) -> nullptr and constant folding.
llvm-svn: 232896
|
| |
|
|
| |
llvm-svn: 232873
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vperm2* intrinsics are just shuffles.
In a few special cases, they're not even shuffles.
Optimizing intrinsics in InstCombine is better than
handling this in the front-end for at least two reasons:
1. Optimizing custom-written SSE intrinsic code at -O0 makes vector coders
really angry (and so I have regrets about some patches from last week).
2. Doing mask conversion logic in header files is hard to write and
subsequently read.
There are a couple of TODOs in this patch to complete this optimization.
Differential Revision: http://reviews.llvm.org/D8486
llvm-svn: 232852
|
| |
|
|
| |
llvm-svn: 232851
|
| |
|
|
|
|
|
| |
After a WIP patch to make `DIDescriptor` accessors more strict, this
started asserting.
llvm-svn: 232832
|
| |
|
|
|
|
|
|
|
| |
Don't use `DebugLoc` accessors if we're pointing at null, which will be
a problem after a WIP patch to make the `DIDescriptor` accessors more
strict. Caught by Frontend/profile-sample-use-loc-tracking.c (in
clang).
llvm-svn: 232792
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove `DebugInfoVerifierLegacyPass` and the `-verify-di` pass.
Instead, call into the `DebugInfoVerifier` from inside
`VerifierLegacyPass::finalizeModule()`. This better matches the logic
in `verifyModule()` (used by the new PassManager), avoids requiring two
separate passes to verify the IR, and makes the API for "add a pass to
verify the IR" simple.
Note: the `-verify-debug-info` flag still works (for now, at least;
eventually it might make sense to just remove it).
llvm-svn: 232772
|
| |
|
|
|
|
|
|
|
|
| |
Each use of the byte array uses a different alias. This makes the
backend less likely to reuse previously computed byte array addresses,
improving the security of the CFI mechanism based on this pass.
Differential Revision: http://reviews.llvm.org/D8455
llvm-svn: 232770
|
| |
|
|
|
|
|
|
|
|
| |
This change also introduces a link-time optimization level of 1. This
optimization level runs only the globaldce pass as well as cleanup passes for
passes that run at -O0, specifically simplifycfg which cleans up lowerbitsets.
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150316/266951.html
llvm-svn: 232769
|
| |
|
|
|
|
|
|
|
| |
`StripDebug` was only used by tools/opt/opt.cpp in
`AddStandardLinkPasses()`, but opt.cpp adds the same pass based on its
command-line flag before it calls `AddStandardLinkPasses()`. Stripping
debug info twice isn't very useful.
llvm-svn: 232765
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we encounter a global with a comdat, rather than iterating over
every global in the module to find globals in the same comdat, store the
members in a multimap. This effectively lowers the complexity to O(N log N),
improving performance significantly for large modules such as might be
encountered during LTO.
It looks like we used to do something like this until r219191.
No functional change.
Differential Revision: http://reviews.llvm.org/D8431
llvm-svn: 232743
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This can only occur (I think) through the back-edge of the loop.
However, folding a GEP into itself means that the value of the previous
iteration needs to be stored in the meantime, thus requiring an
additional register variable to be live, but not actually achieving
anything (the gep still needs to be executed once per loop iteration).
The attached test case is derived from:
typedef unsigned uint32;
typedef unsigned char uint8;
inline uint8 *f(uint32 value, uint8 *target) {
while (value >= 0x80) {
value >>= 7;
++target;
}
++target;
return target;
}
uint8 *g(uint32 b, uint8 *target) {
target = f(b, f(42, target));
return target;
}
What happens is that the GEP stored in incptr2 is folded into itself
through the loop's back-edge and the phi-node stored in loopptr,
effectively incrementing the ptr by "2" in each iteration instead of "1".
In this case, it is actually increasing the number of GEPs required as
the GEP before the loop can't be folded away anymore. For comparison:
With this patch:
define i8* @test4(i32 %value, i8* %buffer) {
entry:
%cmp = icmp ugt i32 %value, 127
br i1 %cmp, label %loop.header, label %exit
loop.header: ; preds = %entry
br label %loop.body
loop.body: ; preds = %loop.body, %loop.header
%buffer.pn = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ]
%newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ]
%loopptr = getelementptr inbounds i8, i8* %buffer.pn, i64 1
%shr = lshr i32 %newval, 7
%cmp2 = icmp ugt i32 %newval, 16383
br i1 %cmp2, label %loop.body, label %loop.exit
loop.exit: ; preds = %loop.body
br label %exit
exit: ; preds = %loop.exit, %entry
%0 = phi i8* [ %loopptr, %loop.exit ], [ %buffer, %entry ]
%incptr3 = getelementptr inbounds i8, i8* %0, i64 2
ret i8* %incptr3
}
Without this patch:
define i8* @test4(i32 %value, i8* %buffer) {
entry:
%incptr = getelementptr inbounds i8, i8* %buffer, i64 1
%cmp = icmp ugt i32 %value, 127
br i1 %cmp, label %loop.header, label %exit
loop.header: ; preds = %entry
br label %loop.body
loop.body: ; preds = %loop.body, %loop.header
%0 = phi i8* [ %buffer, %loop.header ], [ %loopptr, %loop.body ]
%loopptr = phi i8* [ %incptr, %loop.header ], [ %incptr2, %loop.body ]
%newval = phi i32 [ %value, %loop.header ], [ %shr, %loop.body ]
%shr = lshr i32 %newval, 7
%incptr2 = getelementptr inbounds i8, i8* %0, i64 2
%cmp2 = icmp ugt i32 %newval, 16383
br i1 %cmp2, label %loop.body, label %loop.exit
loop.exit: ; preds = %loop.body
br label %exit
exit: ; preds = %loop.exit, %entry
%ptr2 = phi i8* [ %incptr2, %loop.exit ], [ %incptr, %entry ]
%incptr3 = getelementptr inbounds i8, i8* %ptr2, i64 1
ret i8* %incptr3
}
Review: http://reviews.llvm.org/D8245
llvm-svn: 232718
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change splits `makeICmpRegion` into `makeAllowedICmpRegion` and
`makeSatisfyingICmpRegion` with slightly different contracts. The first
one is useful for determining what values some expression //may// take,
given that a certain `icmp` evaluates to true. The second one is useful
for determining what values are guaranteed to //satisfy// a given
`icmp`.
Reviewers: nlewycky
Reviewed By: nlewycky
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8345
llvm-svn: 232575
|