|  | Commit message | Author | Age | Files | Lines |
|---|---|---|---|---|---|
| | 1. Size heuristics changed. We now calculate the number of unswitching
branches only once per loop.
2. Some checks were moved from UnswitchIfProfitable to
processCurrentLoop, since their results do not change during a
processCurrentLoop iteration. This lets us skip some loops at an early stage.
Extended statistics:
- Added total number of instructions analyzed.
llvm-svn: 147935 | 
| | These heuristics are sufficient for enabling IV chains by
default. Performance analysis has been done for i386, x86_64, and
thumbv7. The optimization is rarely important, but can significantly
speed up certain cases by eliminating spill code within the
loop. Unrolled loops are prime candidates for IV chains. In many
cases, the final code could still be improved with more
target-specific optimization following LSR. The goal of this feature is for
LSR to make the best choice of induction variables.
Instruction selection may not completely take advantage of this
feature yet. As a result, there could be cases of slight code size
increase.
Code size can be worse on x86 because it doesn't support postincrement
addressing. In fact, when chains are formed, you may see redundant
address plus stride addition in the addressing mode. GenerateIVChains
tries to compensate for the common cases.
On ARM, code size increase can be mitigated by using postincrement
addressing, but downstream codegen currently misses some opportunities.
llvm-svn: 147826 | 
| | After collecting chains, check if any should be materialized. If so,
hide the chained IV users from the LSR solver. LSR will only solve for
the head of the chain. GenerateIVChains will then materialize the
chained IV users by computing the IV relative to its previous value in
the chain.
In theory, chained IV users could be exposed to LSR's solver. This
would be considerably complicated to implement and I'm not aware of a
case where we need it. In practice it's more important to
intelligently prune the search space of nontrivial loops before
running the solver; otherwise the solver is often forced to prune the
best solutions. Hiding the chained users does this well, so
that LSR is more likely to find the best IV for the chain as a whole.
llvm-svn: 147801 | 
| | This collects a set of IV uses within the loop whose values can be
computed relative to each other in a sequence. Following checkins will
make use of this information.
llvm-svn: 147797 | 
| | llvm-svn: 147785 | 
| | This will be more important as we extend the LSR pass in ways that don't rely on the formula solver. In particular, we need it for constructing IV chains.
llvm-svn: 147724 | 
| | LoopSimplify may not run on some outer loops, e.g. because of indirect
branches. SCEVExpander simply cannot handle outer loops with no preheaders.
Fixes rdar://10655343 SCEVExpander segfault.
llvm-svn: 147718 | 
| | llvm-svn: 147711 | 
| | llvm-svn: 147707 | 
| | llvm-svn: 147291 | 
| | llvm-svn: 147284 | 
| | llvm-svn: 147226 | 
| | llvm-svn: 147176 | 
| | performance regressions (both execution-time and compile-time) on our
nightly testers.
Original commit message:
Fix for bug #11429: Wrong behaviour for switches. Small improvement for code
size heuristics.
llvm-svn: 147131 | 
| | an invalid iterator aren't reproducible.  rdar://10614085.
llvm-svn: 147098 | 
| | into Analysis as a standalone function, since there's no need for
it to be in VMCore. Also, update it to use isKnownNonZero and
other goodies available in Analysis, making it more precise,
enabling more aggressive optimization.
llvm-svn: 146610 | 
| | size heuristics.
llvm-svn: 146578 | 
| | point to ARC-managed pointers sometimes. This fixes rdar://10551239.
llvm-svn: 146577 | 
| | llvm-svn: 146459 | 
| | This should always be done as a matter of principle. I don't have a
case that exposes the problem. I just noticed this recently while
scanning the code and realized I meant to fix it long ago.
llvm-svn: 146438 | 
| | llvm-svn: 146411 | 
| | llvm-svn: 146409 | 
| | detected in the forward-CFG DFS. This prevents the reverse-CFG from
visiting blocks inside loops after blocks that dominate them in the
case where loops have multiple exits.
No testcase, because this fixes a bug which in practice only shows
up in a full optimizer run, due to the use-list order.
This fixes rdar://10422791 and others.
llvm-svn: 146408 | 
| | llvm-svn: 146389 | 
| | llvm-svn: 146385 | 
| | llvm-svn: 146384 | 
| | llvm-svn: 146383 | 
| | llvm-svn: 146380 | 
| | indicates whether the intrinsic has a defined result for a first
argument equal to zero. This will eventually allow these intrinsics to
accurately model the semantics of GCC's __builtin_ctz and __builtin_clz
and the X86 instructions (prior to AVX) which implement them.
This patch merely sets the stage by extending the signature of these
intrinsics and establishing auto-upgrade logic so that the old spelling
still works both in IR and in bitcode. The upgrade logic preserves the
existing (inefficient) semantics. This patch should not change any
behavior. CodeGen isn't updated because it can use the existing
semantics regardless of the flag's value.
Note that this will be followed by API updates to Clang and DragonEgg.
Reviewed by Nick Lewycky!
llvm-svn: 146357 | 
| | Since we're not rewriting IVs in other loops, there's not much reason
to consider their stride when generating formulae.
This should reduce the number of useless formulas considered by LSR.
llvm-svn: 146302 | 
| | llvm-svn: 146277 | 
| | Patch by Brendon Cahoon!
This extends the existing LoopUnroll and LoopUnrollPass. Brendon
measured no regressions in the llvm test suite with -unroll-runtime
enabled. This implementation works by using the existing loop
unrolling code to unroll the loop by a power-of-two (default 8). It
generates an if-then-else sequence of code prior to the loop to
execute the extra iterations before entering the unrolled loop.
llvm-svn: 146245 | 
| | trivially infinite.
llvm-svn: 146197 | 
| | llvm-svn: 145934 | 
| | It's always good to prune early, but formulae that are unsatisfactory
in their own right need to be removed before running any other pruning
heuristics. We easily avoid generating such formulae, but we need them
as an intermediate basis for forming other good formulae.
llvm-svn: 145906 | 
| | llvm-svn: 145866 | 
| | where this would be bad as the backend shouldn't have a problem inlining small
memcpys.
rdar://10510150
llvm-svn: 145865 | 
| | llvm-svn: 145801 | 
| | causing the optimisation to occur
Turns out long long + unsigned long long is unsigned.  Doh!
Fixes http://llvm.org/bugs/show_bug.cgi?id=11455
llvm-svn: 145731 | 
| | Add FIXMEs to places that are non-trivial to fix.
llvm-svn: 145661 | 
| | where it appeared beneficial to pass.
More of rdar://10500969
llvm-svn: 145630 | 
| | InstructionSimplify.cpp.  Other fixups as needed.
Part of rdar://10500969
llvm-svn: 145559 | 
| | explicitly specified alignment.
<rdar://problem/10497732>.
llvm-svn: 145523 | 
| | not be changed inside the uses enumeration loop.
llvm-svn: 145432 | 
| | llvm-svn: 145420 | 
| | This reverts r139450, fixes r139453, and adds much needed comments and a
unit test.
llvm-svn: 145367 | 
| | SCEV should now be used for trip count analysis, not LoopInfo.
llvm-svn: 145262 | 
| | llvm-svn: 145154 | 
| | Suggested in code review by Eli.
That code in InstCombine looks kinda suspicious.
llvm-svn: 145013 | 
| | Add a custom name for fwrite and fputs on x86-32 OSX.  Make SimplifyLibCalls honor the custom
names for fwrite and fputs.
Fixes <rdar://problem/9815881>.
llvm-svn: 144876 |
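
The runtime unrolling entry above (llvm-svn: 146245) describes unrolling a loop by a power of two and running the leftover iterations before the unrolled loop is entered. The following is a rough illustration of that idea in Python, not the pass itself. The function names `sum_rolled` and `sum_runtime_unrolled` are hypothetical, and the prolog is written as a short loop where the commit describes an if-then-else sequence.

```python
def sum_rolled(values):
    """Reference loop: one element per iteration."""
    total = 0
    for v in values:
        total += v
    return total


def sum_runtime_unrolled(values, count=8):
    """Sum `values` with the loop body unrolled `count` times.

    `count` must be a power of two so the leftover iteration count can be
    computed with a mask instead of a division.
    """
    if count <= 0 or count & (count - 1):
        raise ValueError("count must be a power of two")
    n = len(values)
    extra = n & (count - 1)  # equals n % count because count is a power of two

    total = 0
    i = 0
    # Prolog: run the leftover iterations first. The commit describes an
    # if-then-else sequence here; a short loop stands in for it (assumption).
    while i < extra:
        total += values[i]
        i += 1

    # Main loop: the remaining trip count is an exact multiple of `count`,
    # so no exit check is needed between the unrolled copies of the body.
    while i < n:
        for k in range(count):  # stands in for `count` copies of the body
            total += values[i + k]
        i += count
    return total
```

Because the trip count is only known at run time, the mask `n & (count - 1)` decides how many iterations the prolog must absorb; after that, the unrolled loop always runs whole groups of `count` iterations.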