|  | Commit message (Collapse) | Author | Age | Files | Lines | 
|---|
| | 
| 
| 
| 
| 
| 
| 
| | * wrap code blocks in \code ... \endcode;
* refer to parameter names in paragraphs correctly (\arg is not what most
  people want -- it starts a new paragraph).
llvm-svn: 163790 | 
| | 
| 
| 
| | llvm-svn: 163739 | 
| | 
| 
| 
| 
| 
| 
| 
| | "#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)"
No functional change. Update r163344.
llvm-svn: 163679 | 
| | 
| 
| 
| | llvm-svn: 163485 | 
| | 
| 
| 
| | llvm-svn: 163480 | 
| | 
| 
| 
| 
| 
| | No functional change.
llvm-svn: 163344 | 
| | 
| 
| 
| 
| 
| | No functional change.
llvm-svn: 163279 | 
| | 
| 
| 
| 
| 
| 
| 
| | pointers-to-strong-pointers may be in play. These can lead to retains and
releases happening in unstructured ways, foiling the optimizer. This fixes
rdar://12150909.
llvm-svn: 163180 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | - CodeGenPrepare pass for identifying div/rem ops
- Backend specifies the type mapping using addBypassSlowDivType
- Enabled only for Intel Atom with O2 32-bit -> 8-bit
- Replace IDIV with instructions which test its value and use DIVB if the value
is positive and less than 256.
- In the case when the quotient and remainder of a divide are used a DIV
and a REM instruction will be present in the IR. In the non-Atom case
they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
using the quotient and remainder from the first IDIV. However,
due to this optimization CSE is not able to eliminate redundant
IDIV instructions because they are located in different basic blocks.
This is overcome by calculating both the quotient (DIV) and remainder (REM)
in each basic block that is inserted by the optimization and reusing the result
values when a subsequent DIV or REM instruction uses the same operands.
- Test cases check for the presence of the optimization when calculating
either the quotient, remainder, or both.
Patch by Tyler Nowicki!
llvm-svn: 163150 | 
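
The bypass described in this commit can be modelled at the source level. The following is a minimal sketch, not the CodeGenPrepare implementation: `DivRem` and `bypassSlowDiv` are hypothetical names, and the exact operand test (OR the operands, check the upper 24 bits) is an assumption based on the "positive and less than 256" wording above. It computes both quotient and remainder on each path, mirroring how the pass makes the results reusable when CSE cannot see across the inserted basic blocks.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result pair; this type is not part of LLVM.
struct DivRem {
  int32_t Quot;
  int32_t Rem;
};

// Source-level model of the 32-bit -> 8-bit division bypass.
// Assumption: the fast path is taken only when both operands are
// non-negative and fit in 8 bits.
DivRem bypassSlowDiv(int32_t Dividend, int32_t Divisor) {
  assert(Divisor != 0 && "division by zero is undefined");

  // OR the operands and test everything above the low byte. The sign bit
  // is part of the mask, so negative operands take the slow path.
  uint32_t Bits =
      static_cast<uint32_t>(Dividend) | static_cast<uint32_t>(Divisor);
  if ((Bits & ~0xFFu) == 0) {
    // Fast path: one narrow unsigned divide yields both results.
    uint8_t A = static_cast<uint8_t>(Dividend);
    uint8_t B = static_cast<uint8_t>(Divisor);
    return {static_cast<int32_t>(A / B), static_cast<int32_t>(A % B)};
  }

  // Slow path: full-width signed divide. INT32_MIN / -1 still overflows
  // here, exactly as it would without the bypass.
  return {Dividend / Divisor, Dividend % Divisor};
}
```

Returning the quotient and remainder together is the source-level analogue of emitting both DIV and REM in each inserted block, so a later use of either value with the same operands needs no second divide.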
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | Scan the body of the loop and find instructions that may trap.
Use this information when deciding if it is safe to hoist or sink instructions.
Notice that we can optimize the search of instructions that may throw in the case of nested loops.
rdar://11518836
llvm-svn: 163132 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | For example, the ARM target does not have efficient ISel handling for vector
selects with scalar conditions. This patch adds a TLI hook which allows the
different targets to report which selects are supported well and which selects
should be converted to control flow during codegen prepare.
llvm-svn: 163093 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | We update until we hit a fixpoint. This is probably slow but also
slightly simplifies the code. It should also fix the occasional
invalid domtrees observed when building with expensive checking.
I couldn't find a case where this had a measurable slowdown, but
if someone finds a pathological case where it does we may have
to find a cleverer way of updating dominators here.
Thanks to Duncan for the test case.
llvm-svn: 163091 | 
| | 
| 
| 
| | llvm-svn: 163058 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The old PHI updating code in loop-rotate was replaced with SSAUpdater a while
ago; it has no problems with complex PHIs. What had to be fixed is detecting
whether a loop was already rotated and updating dominators when multiple exits
were present.
This change increases overall code size a bit, mostly due to additional loop
unrolling opportunities. Passes test-suite and selfhost with -verify-dom-info.
Fixes PR7447.
Thanks to Andy for the input on the domtree updating code.
llvm-svn: 162912 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This disables malloc-specific optimization when -fno-builtin (or -ffreestanding)
is specified. This has been a problem for a long time but became more severe
with the recent memory builtin improvements.
Since the memory builtin functions are used everywhere, this required passing
TLI in many places. This means that functions that now have an optional TLI
argument, like RecursivelyDeleteTriviallyDeadFunctions, won't remove dead
mallocs anymore if the TLI argument is missing. I've updated most passes to do
the right thing.
Fixes PR13694 and probably others.
llvm-svn: 162841 | 
| | 
| 
| 
| 
| 
| | intended functionality change. Thanks to Ahmed Charles for spotting it.
llvm-svn: 162686 | 
| | 
| 
| 
| 
| 
| 
| 
| | No intended behavior change.  This was introduced in r162023.  With the fixed
algorithm a Release build of ARMInstPrinter.cpp goes from 16s to 10s on a
2011 MBP.
llvm-svn: 162559 | 
| | 
| 
| 
| | llvm-svn: 162383 | 
| | 
| 
| 
| 
| 
| 
| | optimizations are guarded by the -enable-double-float-shrink LLVM option.
Last bit of PR13574.  Patch by Weiming Zhao <weimingz@codeaurora.org>.
llvm-svn: 162368 | 
| | 
| 
| 
| 
| 
| | functional change intended.  Patch by Weiming Zhao <weimingz@codeaurora.org>.
llvm-svn: 162363 | 
| | 
| 
| 
| 
| 
| | WeakVH::operator*).
llvm-svn: 162309 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | This optimization is really just replacing allocas wholesale with
globals, there is no scalarization.
The underlying motivation for this patch is to simplify the SROA pass
and focus it on splitting and promoting allocas.
llvm-svn: 162271 | 
| | 
| 
| 
| | llvm-svn: 162256 | 
| | 
| 
| 
| 
| 
| | to shrink from double to float.
llvm-svn: 162173 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | where some fact like a=b dominates a use in a phi, but doesn't dominate the
basic block itself.
This feature could also be implemented by splitting critical edges, but at least
with the current algorithm reasoning about the dominance directly is faster.
The time for running "opt -O2" in the testcase in pr10584 is 1.003 times slower
and on gcc as a single file it is 1.0007 times faster.
llvm-svn: 162023 | 
| | 
| 
| 
| | llvm-svn: 161990 | 
| | 
| 
| 
| 
| 
| | store to the same offset is treated as completely overwriting.
llvm-svn: 161857 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | and allow some optimizations to turn conditional branches into unconditional.
This commit adds a simple control-flow optimization which merges two consecutive
basic blocks which are connected by a single edge. This allows the codegen to
operate on larger basic blocks.
rdar://11973998
llvm-svn: 161852 | 
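
The merge described in this commit can be illustrated on a toy control-flow graph. This is a minimal sketch under stated assumptions, not the LLVM code: `Block`, `countPreds`, and `mergeStraightLineBlocks` are hypothetical names, blocks are identified by their index in a vector, and block 0 is assumed to be the entry. A block is folded into its predecessor only when the predecessor has exactly one successor and the block has exactly one predecessor.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy basic block: a list of instruction names plus successor indices.
struct Block {
  std::vector<std::string> Insts;
  std::vector<std::size_t> Succs;
  bool Dead = false; // set once the block has been merged away
};

// Count the edges from live blocks into Target.
std::size_t countPreds(const std::vector<Block> &CFG, std::size_t Target) {
  std::size_t N = 0;
  for (const Block &B : CFG) {
    if (B.Dead)
      continue;
    for (std::size_t S : B.Succs)
      if (S == Target)
        ++N;
  }
  return N;
}

// Merge B into A whenever A -> B is the only edge out of A and the only
// edge into B. Returns true if any merge happened.
bool mergeStraightLineBlocks(std::vector<Block> &CFG) {
  bool Changed = false;
  for (std::size_t I = 0; I < CFG.size(); ++I) {
    if (CFG[I].Dead)
      continue;
    while (CFG[I].Succs.size() == 1) {
      std::size_t S = CFG[I].Succs[0];
      assert(S < CFG.size() && "successor index out of range");
      // Skip self-loops, the entry block, and blocks with other predecessors.
      if (S == I || S == 0 || CFG[S].Dead || countPreds(CFG, S) != 1)
        break;
      Block &A = CFG[I];
      Block &B = CFG[S];
      A.Insts.insert(A.Insts.end(), B.Insts.begin(), B.Insts.end());
      A.Succs = B.Succs; // A inherits B's outgoing edges
      B.Insts.clear();
      B.Succs.clear();
      B.Dead = true;
      Changed = true;
    }
  }
  return Changed;
}
```

The inner loop keeps folding successors into the same block, so a chain of single-edge blocks collapses into one larger block in a single pass.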
| | 
| 
| 
| | llvm-svn: 161668 | 
| | 
| 
| 
| 
| 
| | being applied to all accesses to an alloca, not just the ones which read from the GEP.  Thanks to Evan for reducing the test.  rdar://11861001
llvm-svn: 161654 | 
| | 
| 
| 
| 
| 
| 
| | sure we account for that correctly in DeadStoreElimination.  Fixes a regression
from r158919.  PR13547.
llvm-svn: 161468 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | multiple scalar promotions on a single loop. This also has the effect of
preserving the order of stores sunk out of loops, which is aesthetically
pleasing, and it happens to fix the testcase in PR13542, though it doesn't
fix the underlying problem.
llvm-svn: 161459 | 
| | 
| 
| 
| 
| 
| 
| 
| | into predecessor blocks to enable tail call optimization.
rdar://11958338
llvm-svn: 160894 | 
| | 
| 
| 
| 
| 
| | Thanks Eli for noticing.
llvm-svn: 160787 | 
| | 
| 
| 
| 
| 
| | is a temporary measure until my fix for PR13021 is ready.
llvm-svn: 160778 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | creating a call to a library function.
Update all clients to pass the TLI information around.
Previous draft reviewed by Eli.
llvm-svn: 160733 | 
| | 
| 
| 
| | llvm-svn: 160668 | 
| | 
| 
| 
| 
| 
| | rdar://11931823.
llvm-svn: 160637 | 
| | 
| 
| 
| | llvm-svn: 160629 | 
| | 
| 
| 
| | llvm-svn: 160621 | 
| | 
| 
| 
| 
| 
| | moved earlier. This fixes some layering issues.
llvm-svn: 160611 | 
| | 
| 
| 
| 
| 
| 
| 
| | belongs. I dunno why in the world I dropped it in the Scalar folder in the first place.
No functionality change.
llvm-svn: 160587 | 
| | 
| 
| 
| 
| 
| 
| 
| | GetBestDestForJumpOnUndef() assumes there is at least 1 successor, which isn't
true if the block ends in an indirect branch with no successors. Fix this by
bailing out earlier in this case.
llvm-svn: 160546 | 
| | 
| 
| 
| 
| 
| | Minor oversight noticed by inspection. Sorry no unit test.
llvm-svn: 160422 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Fixes PR13371: indvars pass incorrectly substitutes 'undef' values.
I do not like this fix. It's needed until/unless the meaning of undef
changes. It attempts to be complete according to the IR spec, but I
don't have much confidence in the implementation given the difficulty
testing undefined behavior. Worse, this invalidates some of my
hard-fought work on indvars and LSR to optimize pointer induction
variables. It results in benchmark regressions, which I'll track
internally. On x86_64 no LTO I see:
-3% huffbench
-3% 400.perlbench
-8% fhourstones
My only suggestion for recovering is to change the meaning of
undef. If we could trust an arbitrary instruction to produce some
real value that can be manipulated (e.g. incremented) according to
non-undef rules, then this case could be easily handled with SCEV.
llvm-svn: 160421 | 
| | 
| 
| 
| 
| 
| | Speculatively fix crashes by code inspection. Can't reproduce them yet.
llvm-svn: 160344 | 
| | 
| 
| 
| 
| 
| | Some unit tests crashed on a different platform.
llvm-svn: 160341 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | This places limits on CollectSubexprs to constrain the number of
reassociation possibilities. It limits the recursion depth and skips
over chains of nested recurrences outside the current loop.
Fixes PR13361, although the underlying SCEV behavior is still potentially bad.
llvm-svn: 160340 | 
| | 
| 
| 
| | llvm-svn: 160325 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | All SCEV expressions used by LSR formulae must be safe to
expand. i.e. they may not contain UDiv unless we can prove nonzero
denominator.
Fixes PR11356: LSR hoists UDiv.
llvm-svn: 160205 |
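
The safety rule stated in the last commit can be sketched on a toy expression tree. This is a hedged illustration, not the SCEV API: `Expr`, `Kind`, `isKnownNonZero`, and `isSafeToExpand` are hypothetical names defined here, and the only nonzero proof the sketch attempts is a nonzero constant denominator, whereas a real analysis may be able to prove more.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Toy expression kinds; Unknown stands for a value not known at compile time.
enum class Kind { Constant, Unknown, Add, UDiv };

struct Expr;
using ExprPtr = std::shared_ptr<const Expr>;

struct Expr {
  Kind K = Kind::Unknown;
  uint64_t Value = 0; // meaningful only for Kind::Constant
  ExprPtr LHS, RHS;   // meaningful only for Add and UDiv
};

ExprPtr constant(uint64_t V) {
  auto E = std::make_shared<Expr>();
  E->K = Kind::Constant;
  E->Value = V;
  return E;
}

ExprPtr unknown() { return std::make_shared<Expr>(); }

ExprPtr binary(Kind K, ExprPtr L, ExprPtr R) {
  assert(L && R && "binary expressions need two operands");
  auto E = std::make_shared<Expr>();
  E->K = K;
  E->LHS = std::move(L);
  E->RHS = std::move(R);
  return E;
}

// Assumption: only a nonzero constant counts as provably nonzero.
bool isKnownNonZero(const ExprPtr &E) {
  return E->K == Kind::Constant && E->Value != 0;
}

// An expression is safe to expand (and so to hoist) if every UDiv in it
// has a provably nonzero denominator.
bool isSafeToExpand(const ExprPtr &E) {
  switch (E->K) {
  case Kind::Constant:
  case Kind::Unknown:
    return true;
  case Kind::Add:
    return isSafeToExpand(E->LHS) && isSafeToExpand(E->RHS);
  case Kind::UDiv:
    return isKnownNonZero(E->RHS) && isSafeToExpand(E->LHS) &&
           isSafeToExpand(E->RHS);
  }
  return false;
}
```

The check is recursive because an unsafe division nested anywhere inside a formula makes the whole formula unsafe to materialize at a point where the original guard no longer applies.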