| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
They are always defined in the main executable.
llvm-svn: 181994
|
| |
|
|
|
|
|
|
|
| |
We only want to check this once, not for every conditional block in the loop.
No functionality change (except that we don't perform a check redudantly
anymore).
llvm-svn: 181942
|
| |
|
|
|
|
| |
consistent about their usage of periods.
llvm-svn: 181901
|
| |
|
|
|
|
| |
No functionality change.
llvm-svn: 181862
|
| |
|
|
|
|
|
|
|
|
|
|
| |
InstCombine can be uncooperative to vectorization and sink loads into
conditional blocks. This prevents vectorization.
Undo this optimization if there are unconditional memory accesses to the same
addresses in the loop.
radar://13815763
llvm-svn: 181860
|
| |
|
|
| |
llvm-svn: 181848
|
| |
|
|
|
|
|
|
|
|
|
| |
CXAAtExitFn was set outside a loop and before optimizations where functions
can be deleted. This patch will set CXAAtExitFn inside the loop and after
optimizations.
Seg fault when running LTO because of accesses to a deleted function.
rdar://problem/13838828
llvm-svn: 181838
|
| |
|
|
| |
llvm-svn: 181760
|
| |
|
|
|
|
|
|
|
|
|
|
| |
We used to give up if we saw two integer inductions. After this patch, we base
further induction variables on the chosen one like we do in the reverse
induction and pointer induction case.
Fixes PR15720.
radar://13851975
llvm-svn: 181746
|
| |
|
|
|
|
| |
pointer is known positive.
llvm-svn: 181745
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
if and only if we are both KnownSafeBU/KnownSafeTD rather than just either or.
In the presense of a block being initialized, the frontend will emit the
objc_retain on the original pointer and the release on the pointer loaded from
the alloca. The optimizer will through the provenance analysis realize that the
two are related (albiet different), but since we only require KnownSafe in one
direction, will match the inner retain on the original pointer with the guard
release on the original pointer. This is fixed by ensuring that in the presense
of allocas we only unconditionally remove pointers if both our retain and our
release are KnownSafe (i.e. we are KnownSafe in both directions) since we must
deal with the possibility that the frontend will emit what (to the optimizer)
appears to be unbalanced retain/releases.
An example of the miscompile is:
%A = alloca
retain(%x)
retain(%x) <--- Inner Retain
store %x, %A
%y = load %A
... DO STUFF ...
release(%y)
call void @use(%x)
release(%x) <--- Guarding Release
getting optimized to:
%A = alloca
retain(%x)
store %x, %A
%y = load %A
... DO STUFF ...
release(%y)
call void @use(%x)
rdar://13750319
llvm-svn: 181743
|
| |
|
|
|
|
| |
Suppresses an unused-variable warning in -Asserts builds.
llvm-svn: 181733
|
| |
|
|
|
|
| |
get{TopDown,BottomUp}PtrState will create a new PtrState object if it does not find a PtrState for Arg.
llvm-svn: 181726
|
| |
|
|
|
|
|
|
|
|
|
| |
OptimizeIndividualCalls.
This makes the statistics gathering completely independent of the actual
optimization occuring, preventing any sort of bleeding over from occuring.
Additionally, it simplifies a switch statement in the non-statistic gathering case.
llvm-svn: 181719
|
| |
|
|
|
|
| |
read in asserts.
llvm-svn: 181689
|
| |
|
|
| |
llvm-svn: 181684
|
| |
|
|
|
|
|
|
| |
multiple users.
The external user does not have to be in lane #0. We have to save the lane for each scalar so that we know which vector lane to extract.
llvm-svn: 181674
|
| |
|
|
|
|
|
|
| |
round of vectorization.
Testcase in the next commit.
llvm-svn: 181673
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are two transforms in visitUrem that conflict with each other.
*) One, if a divisor is a power of two, subtracts one from the divisor
and turns it into a bitwise-and.
*) The other unwraps both operands if they are surrounded by zext
instructions.
Flipping the order allows the subtraction to go beneath the sign
extension.
llvm-svn: 181668
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use the widest induction type encountered for the cannonical induction variable.
We used to turn the following loop into an empty loop because we used i8 as
induction variable type and truncated 1024 to 0 as trip count.
int a[1024];
void fail() {
int reverse_induction = 1023;
unsigned char forward_induction = 0;
while ((reverse_induction) >= 0) {
forward_induction++;
a[reverse_induction] = forward_induction;
--reverse_induction;
}
}
radar://13862901
llvm-svn: 181667
|
| |
|
|
|
|
| |
No functionality change intended.
llvm-svn: 181666
|
| |
|
|
|
|
| |
No functionality change intended.
llvm-svn: 181665
|
| |
|
|
|
|
|
| |
Use isKnownToBeAPowerOfTwo in visitUrem so that we may more aggressively
fold away urem instructions.
llvm-svn: 181661
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
For example:
bar() {
int a = A[i];
int b = A[i+1];
B[i] = a;
B[i+1] = b;
foo(a); <--- a is used outside the vectorized expression.
}
llvm-svn: 181648
|
| |
|
|
| |
llvm-svn: 181647
|
| |
|
|
|
|
|
|
|
|
| |
The shift amount may be larger than the type leading to undefined behavior.
Limit the transform to constant shift amounts. While there update the bits to
clear in the result which may enable additional optimizations.
PR15959.
llvm-svn: 181604
|
| |
|
|
|
|
| |
PR15952.
llvm-svn: 181586
|
| |
|
|
| |
llvm-svn: 181551
|
| |
|
|
|
|
|
|
|
|
| |
iteration.
This on step toward non-iterative GVN. My local hack suggests that getting rid
of iteration will speedup GVN by 30%+ on a medium sized input (2k LOC, C++).
I cannot explain why not 2x or more at this moment.
llvm-svn: 181532
|
| |
|
|
|
|
|
| |
When we replace an internal alias with its target, be careful not to
replace the entry in llvm.used (and llvm.compiler_used).
llvm-svn: 181524
|
| |
|
|
|
|
|
| |
That's obviously wrong. Conservatively restrict it to the sign bit, which
matches the original intention of this analysis. Fixes PR15940.
llvm-svn: 181518
|
| |
|
|
|
|
|
|
|
| |
A computable loop exit count does not imply the presence of an induction
variable. Scalar evolution can return a value for an infinite loop.
Fixes PR15926.
llvm-svn: 181495
|
| |
|
|
|
|
|
|
|
|
|
| |
- requires existing debug information to be present
- fixes up file name and line number information in metadata
- emits a "<orig_filename>-debug.ll" succinct IR file (without !dbg metadata
or debug intrinsics) that can be read by a debugger
- initialize pass in opt tool to enable the "-debug-ir" flag
- lit tests to follow
llvm-svn: 181467
|
| |
|
|
|
|
| |
by switching to a ValueMap. Patch by Andrea DiBiagio!
llvm-svn: 181397
|
| |
|
|
|
|
|
| |
The two nested loops were confusing and also conservative in identifying
reduction variables. This patch replaces them by a worklist based approach.
llvm-svn: 181369
|
| |
|
|
|
|
|
|
|
|
| |
We were passing an i32 to ConstantInt::get where an i64 was needed and we must
also pass the sign if we pass negatives numbers. The start index passed to
getConsecutiveVector must also be signed.
Should fix PR15882.
llvm-svn: 181286
|
| |
|
|
| |
llvm-svn: 181249
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Test case by Michele Scandale!
Fixes PR10293: Load not hoisted out of loop with multiple exits.
There are few regressions with this patch, now tracked by
rdar:13817079, and a roughly equal number of improvements. The
regressions are almost certainly back luck because LoopRotate has very
little idea of whether rotation is profitable. Doing better requires a
more comprehensive solution.
This checkin is a quick fix that lacks generality (PR10293 has
a counter-example). But it trivially fixes the case in PR10293 without
interfering with other cases, and it does satify the criteria that
LoopRotate is a loop canonicalization pass that should avoid
heuristics and special cases.
I can think of two approaches that would probably be better in
the long run. Ultimately they may both make sense.
(1) LoopRotate should check that the current header would make a good
loop guard, and that the loop does not already has a sufficient
guard. The artifical SimplifiedLoopLatch check would be unnecessary,
and the design would be more general and canonical. Two difficulties:
- We need a strong guarantee that we won't endlessly rotate, so the
analysis would need to be precise in order to avoid the
SimplifiedLoopLatch precondition.
- Analysis like this are usually based on SCEV, which we don't want to
rely on.
(2) Rotate on-demand in late loop passes. This could even be done by
shoving the loop back on the queue after the optimization that needs
it. This could work well when we find LICM opportunities in
multi-branch loops. This requires some work, and it doesn't really
solve the problem of SCEV wanting a loop guard before the analysis.
llvm-svn: 181230
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
A * (1 - (uitofp i1 C)) -> select C, 0, A
B * (uitofp i1 C) -> select C, B, 0
select C, 0, A + select C, B, 0 -> select C, B, A
These come up in code that has been hand-optimized from a select to a linear blend,
on platforms where that may have mattered. We want to undo such changes
with the following transform:
A*(1 - uitofp i1 C) + B*(uitofp i1 C) -> select C, A, B
llvm-svn: 181216
|
| |
|
|
| |
llvm-svn: 181178
|
| |
|
|
|
|
| |
Thanks Nick Lewycky for pointing this out.
llvm-svn: 181177
|
| |
|
|
|
|
|
| |
We used to disable constant merging not only if a constant is llvm.used, but
also if an alias of a constant is llvm.used. This change fixes that.
llvm-svn: 181175
|
| |
|
|
| |
llvm-svn: 181157
|
| |
|
|
|
|
|
|
|
|
| |
Add support for min/max reductions when "no-nans-float-math" is enabled. This
allows us to assume we have ordered floating point math and treat ordered and
unordered predicates equally.
radar://13723044
llvm-svn: 181144
|
| |
|
|
|
|
|
|
|
| |
No need for setting the operands. The pointers are going to be bound by the
matcher.
radar://13723044
llvm-svn: 181142
|
| |
|
|
|
|
|
|
|
|
| |
We can just use the initial element that feeds the reduction.
max(max(x, y), z) == max(max(x,y), max(x,z))
radar://13723044
llvm-svn: 181141
|
| |
|
|
|
|
|
|
| |
constructor enables
Patch by Robert Wilhelm.
llvm-svn: 181138
|
| |
|
|
| |
llvm-svn: 181082
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
functions. No function change.
This function consists of following steps:
1. Collect dependent memory accesses.
2. Analyze availability.
3. Perform fully redundancy elimination, or
4. Perform PRE, depending on the availability
Step 2, 3 and 4 are now moved to three helper routines.
llvm-svn: 181047
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
values.
By supporting the vectorization of PHINodes with more than two incoming values we can increase the complexity of nested if statements.
We can now vectorize this loop:
int foo(int *A, int *B, int n) {
for (int i=0; i < n; i++) {
int x = 9;
if (A[i] > B[i]) {
if (A[i] > 19) {
x = 3;
} else if (B[i] < 4 ) {
x = 4;
} else {
x = 5;
}
}
A[i] = x;
}
}
llvm-svn: 181037
|