|  | Commit message (Collapse) | Author | Age | Files | Lines | 
|---|
| | 
| 
| 
| 
| 
| 
| 
| | An attempt to recommit r346584 after failure on OSX build bot.
Fixed cache key computation in ThinLTOCodeGenerator and added
test case
llvm-svn: 347033 | 
| | 
| 
| 
| 
| 
| 
| | It broke the Windows self-host:
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/1457
llvm-svn: 346823 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | LoopUtils.cpp contains a utility that splits an loop exit block, so that the new block contains only edges coming from the loop. In the case of nested loops, the exit path for the inner loop might also be the back-edge of the outer loop. The new block which is inserted on this path, is now a latch for the outer loop, and it needs to hold the loop metadata for the outer loop. (The test case gives a more concrete view of the situation.)
Patch by Chang Lin (clin1)
Differential Revision: https://reviews.llvm.org/D53876
llvm-svn: 346810 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This patch updates DuplicateInstructionsInSplitBetween to update a DTU
instead of applying updates to the DT directly.
Given that there only are 2 users, also updated them in this patch to
avoid churn.
I slightly moved the code in CallSiteSplitting around to reduce the
places where we have to pass in DTU. If necessary, I could split those
changes in a separate patch.
This fixes missing DT updates when dealing with musttail calls in
CallSiteSplitting, by using DTU->deleteBB.
Reviewers: junbuml, kuhar, NutshellySima, indutny, brzycki
Reviewed By: NutshellySima
llvm-svn: 346769 | 
| | 
| 
| 
| 
| 
| | This reverts commit 10c84a8f35cae4a9fc421648d9608fccda3925f2.
llvm-svn: 346768 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | This patch allows internalising globals if all accesses to them
(from live functions) are from non-volatile load instructions
Differential revision: https://reviews.llvm.org/D49362
llvm-svn: 346584 | 
| | 
| 
| 
| 
| 
| 
| 
| | In SimplifyCFG when given a conditional branch that goes to BB1 and BB2, the hoisted common terminator instruction in the two blocks, caused debug line records associated with subsequent select instructions to become ambiguous. It causes the debugger to display unreachable source lines.
Differential Revision: https://reviews.llvm.org/D53390
llvm-svn: 346481 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | This eliminates the outlining penalty for llvm.trap/unreachable, because
callers no longer have to emit cleanup/ret instructions after calling an
outlined `noreturn` function.
rdar://45523626
llvm-svn: 346421 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The lowering for a call to eh_typeid_for changes when it's moved from
one function to another.
There are several proposals for fixing this issue in llvm.org/PR39545.
Until some solution is in place, do not allow CodeExtractor to extract
calls to eh_typeid_for, as that results in serious miscompilations.
llvm-svn: 346256 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | When CodeExtractor moves instructions to a new function, debug
intrinsics referring to those instructions within the parent function
become invalid.
This results in the same verifier failure which motivated r344545, about
function-local metadata being used in the wrong function.
llvm-svn: 346255 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | Clang's -Wimplicit-fallthrough implementation warns on this. I built
clang with GCC 7.3 in +asserts and -asserts mode, and GCC doesn't warn
on this in either configuration. I think it is unnecessary. I separated
it from the large mechanical patch (https://reviews.llvm.org/D53950) in
case I am wrong and it has to be reverted.
llvm-svn: 345876 | 
| | 
| 
| 
| 
| 
| 
| 
| | This is modeled after C++17 std::empty().
Differential Revision: https://reviews.llvm.org/D53909
llvm-svn: 345679 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | As K has to dominate I, IIUC I's range metadata must be a subset of
K's. After Eli's recent clarification to the LangRef, loading a value
outside of the range is undefined behavior.
Therefore if I's range contains elements outside of K's range and we would load
one such value, K would cause undefined behavior.
In cases like hoisting/sinking, we still want the most generic range
over all code paths to/from the hoist/sink point. As suggested in the
patches related to D47339, I will refactor the handling of those
scenarios and try to decouple it from this function as follow up, once
we switched to a similar handling of metadata in most of
combineMetadata.
I updated some tests checking mostly the merging of metadata to keep the
metadata of to dominating load. The most interesting one is probably test8 in
test/Transforms/JumpThreading/thread-loads.ll. It contained a comment
about the alias metadata preventing us to eliminate the branch, but it
seem like the actual problem currently is that we merge the ranges of
both loads and cannot eliminate the icmp afterwards. With this patch, we
manage to eliminate the icmp, as the range of the first load excludes 8.
Reviewers: efriedma, nlopes, davide
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D51629
llvm-svn: 345456 | 
| | 
| 
| 
| 
| 
| 
| 
| | When SimplifyCFG changes the PHI node into a select instruction, the debug line records becomes ambiguous. It causes the debugger to display unreachable source lines.
Differential Revision: https://reviews.llvm.org/D53287
llvm-svn: 345250 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
Teach LoopRotate to preserve MemorySSA.
Enable tests for correctness, dependency disabled by default.
Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits
Differential Revision: https://reviews.llvm.org/D51718
llvm-svn: 345216 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The current splitting algorithm works in three stages:
  1) Identify cold blocks, then
  2) Use forward/backward propagation to mark hot blocks, then
  3) Grow a SESE region of blocks *outside* of the set of hot blocks and
  start outlining.
While testing this pass on Apple internal frameworks I noticed that some
kinds of control flow (e.g. loops) are never outlined, even though they
unconditionally lead to / follow cold blocks. I noticed two other issues
related to how cold regions are identified:
  - An inconsistency can arise in the internal state of the hotness
  propagation stage, as a block may end up in both the ColdBlocks set
  and the HotBlocks set. Further inconsistencies can arise as these sets
  do not match what's in ProfileSummaryInfo.
  - It isn't necessary to limit outlining to single-exit regions.
This patch teaches the splitting algorithm to identify maximal cold
regions and outline them. A maximal cold region is defined as the set of
blocks post-dominated by a cold sink block, or dominated by that sink
block. This approach can successfully outline loops in the cold path. As
a side benefit, it maintains less internal state than the current
approach.
Due to a limitation in CodeExtractor, blocks within the maximal cold
region which aren't dominated by a single entry point (a so-called "max
ancestor") are filtered out.
Results:
  - X86 (LNT + -Os + externals): 134KB of TEXT were outlined compared to
  47KB pre-patch, or a ~3x improvement. Did not see a performance impact
  across two runs.
  - AArch64 (LNT + -Os + externals + Apple-internal benchmarks): 149KB
  of TEXT were outlined. Ditto re: performance impact.
  - Outlining results improve marginally in the internal frameworks I
  tested.
Follow-ups:
  - Outline more than once per function, outline large single basic
  blocks, & try to remove unconditional branches in outlined functions.
Differential Revision: https://reviews.llvm.org/D53627
llvm-svn: 345209 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
The current default of appending "_"+entry block label to the new
extracted cold function breaks demangling. Change the deliminator from
"_" to "." to enable demangling. Because the header block label will
be empty for release compile code, use "extracted" after the "." when
the label is empty.
Additionally, add a mechanism for the client to pass in an alternate
suffix applied after the ".", and have the hot cold split pass use
"cold."+Count, where the Count is currently 1 but can be used to
uniquely number multiple cold functions split out from the same function
with D53588.
Reviewers: sebpop, hiraditya
Subscribers: llvm-commits, erik.pilkington
Differential Revision: https://reviews.llvm.org/D53534
llvm-svn: 345178 | 
| | 
| 
| 
| 
| 
| | Undo stray change introduced by r344725.
llvm-svn: 344814 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
In several places in the code we use the following pattern:
  if (hasUnaryFloatFn(&TLI, Ty, LibFunc_tan, LibFunc_tanf, LibFunc_tanl)) {
    [...]
    Value *Res = emitUnaryFloatFnCall(X, TLI.getName(LibFunc_tan), B, Attrs);
    [...]
  }
In short, we check if there is a lib-function for a certain type, and then
we _always_ fetch the name of the "double" version of the lib function and
construct a call to the appropriate function, that we just checked exists,
using that "double" name as a basis.
This is of course a problem in cases where the target doesn't support the
"double" version, but e.g. only the "float" version.
In that case TLI.getName(LibFunc_tan) returns "", and
emitUnaryFloatFnCall happily appends an "f" to "", and we erroneously end
up with a call to a function called "f".
To solve this, the above pattern is changed to
  if (hasUnaryFloatFn(&TLI, Ty, LibFunc_tan, LibFunc_tanf, LibFunc_tanl)) {
    [...]
    Value *Res = emitUnaryFloatFnCall(X, &TLI, LibFunc_tan, LibFunc_tanf,
                                      LibFunc_tanl, B, Attrs);
    [...]
  }
I.e instead of first fetching the name of the "double" version and then
letting emitUnaryFloatFnCall() add the final "f" or "l", we let
emitUnaryFloatFnCall() fetch the right name from TLI.
Reviewers: eli.friedman, efriedma
Reviewed By: efriedma
Subscribers: efriedma, bjope, llvm-commits
Differential Revision: https://reviews.llvm.org/D53370
llvm-svn: 344725 | 
| | 
| 
| 
| 
| 
| | a variable's type.
llvm-svn: 344717 | 
| | 
| 
| 
| | llvm-svn: 344716 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Reviewers: efriedma
Reviewed By: efriedma
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D53338
llvm-svn: 344645 | 
| | 
| 
| 
| | llvm-svn: 344592 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
Extend LCSSA so that debug values outside loops are rewritten to
use the PHI nodes that the pass creates.
This fixes PR39019. In that case, we ran LCSSA on a loop that
was later on vectorized, which left us with something like this:
  for.cond.cleanup:
    %add.lcssa = phi i32 [ %add, %for.body ], [ %34, %middle.block ]
    call void @llvm.dbg.value(metadata i32 %add,
    ret i32 %add.lcssa
  for.body:
    %add =
    [...]
    br i1 %exitcond, label %for.cond.cleanup, label %for.body
which later resulted in the debug.value becoming undef when
removing the scalar loop (and the location would have probably
been wrong for the vectorized case otherwise).
As we now may need to query the AvailableVals cache more than
once for a basic block, FindAvailableVals() in SSAUpdaterImpl is
changed so that it updates the cache for blocks that we do not
create a PHI node for, regardless of the block's number of
predecessors. The debug value in the attached IR reproducer
would not be properly rewritten without this.
Debug values residing in blocks where we have not inserted any
PHI nodes are currently left as-is by this patch. I'm not sure
what should be done with those uses.
Reviewers: mattd, aprantl, vsk, probinson
Reviewed By: mattd, aprantl
Subscribers: jmorse, gbedwell, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D53130
llvm-svn: 344589 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Variable updates within the outlined function are invisible to
debuggers. This could be improved by defining a DISubprogram for the
new function. For the moment, simply erase the debug intrinsics instead.
This fixes verifier failures about function-local metadata being used in
the wrong function, seen while testing the hot/cold splitting pass.
rdar://45142482
Differential Revision: https://reviews.llvm.org/D53267
llvm-svn: 344545 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | by `getTerminator()` calls instead be declared as `Instruction`.
This is the biggest remaining chunk of the usage of `getTerminator()`
that insists on the narrow type and so is an easy batch of updates.
Several files saw more extensive updates where this would cascade to
requiring API updates within the file to use `Instruction` instead of
`TerminatorInst`. All of these were trivial in nature (pervasively using
`Instruction` instead just worked).
llvm-svn: 344502 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | This requires updating a number of .cpp files to adapt to the new API.
I've just systematically updated all uses of `TerminatorInst` within
these files te `Instruction` so thta I won't have to touch them again in
the future.
llvm-svn: 344498 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | LLVM APIs. There weren't very many.
We still have the instruction visitor, and APIs with TerminatorInst as
a return type or an output parameter.
llvm-svn: 344494 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary: Fixes PR39177
Reviewers: spatel, jbuening
Reviewed By: jbuening
Subscribers: jbuening, llvm-commits
Differential Revision: https://reviews.llvm.org/D53129
llvm-svn: 344454 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | references to it.
InstCombine keeps a worklist and assumes that optimizations don't
eraseFromParent() the instruction, which SimplifyLibCalls violates. This change
adds a new callback to SimplifyLibCalls to let clients specify their own hander
for erasing actions.
Differential Revision: https://reviews.llvm.org/D52729
llvm-svn: 344251 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | There is a transform that may replace `lshr (x+1), 1` with `lshr x, 1` in case
if it can prove that the result will be the same. However the initial instruction
might have an `exact` flag set, and it now should be dropped unless we prove
that it may hold. Incorrectly set `exact` attribute may then produce poison.
Differential Revision: https://reviews.llvm.org/D53061
Reviewed By: sanjoy
llvm-svn: 344223 | 
| | 
| 
| 
| 
| 
| | Differential Revision: https://reviews.llvm.org/D52792
llvm-svn: 344153 | 
| | 
| 
| 
| 
| 
| 
| 
| | This reverts commit r344120.
It was causing buildbot failures.
llvm-svn: 344135 | 
| | 
| 
| 
| 
| 
| 
| 
| | When SimplifyCFG changes the PHI node into a select instruction, the debug line records becomes ambiguous. It causes the debugger to display unreachable source lines. 
Differential Revision: https://reviews.llvm.org/D52887
llvm-svn: 344120 | 
| | 
| 
| 
| | llvm-svn: 344113 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Remove null check.
Summary:
At some point in the past the recursion in DominatesMergePoint used to pass null for AggressiveInsts as part of the recursion. It no longer does this. So there is no way for AggressiveInsts to be null.
This passes it by reference and removes the null check to make this explicit.
Reviewers: efriedma, reames
Reviewed By: efriedma
Subscribers: xbolva00, llvm-commits
Differential Revision: https://reviews.llvm.org/D52575
llvm-svn: 343828 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | outer while loop to revisit.
Summary:
The llvm::SimplifyCFG function creates a SimplifyCFGOpt object and calls run on it. There were numerous places reached from this run function that called back out llvm::SimplifyCFG which would create another SimplifyCFGOpt object. This is an inefficient use of stack space at minimum. We are also not passing along the LoopHeaders pointer passed into the outer llvm::SimplifyCFG call. So if its not null we lose it on the first recursion and get nullptr from there on.
This patch adds an outer loop around the main BasicBlock simplifying code and adds a flag to the SimplifyCFGOpt class that can be set by to request another iteration. I don't think we can iterate based just on the change flag alone since some of the simplifications delete a basic block entirely leaving nothing to iterate on.
Reviewers: bogner, eli.friedman, reames
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D52760
llvm-svn: 343816 | 
| | 
| 
| 
| 
| 
| | getNumUses is linear in the number of uses. Since we're looking for a specific use count, we can use hasNUses which will stop as soon as it determines there are more than N uses instead of walking all of them.
llvm-svn: 343550 | 
| | 
| 
| 
| 
| 
| | There is no variable in this function named CondBB, but there is one named ThenBB and I believe the comments are all refering to it.
llvm-svn: 343548 | 
| | 
| 
| 
| 
| 
| 
| | There are a few leftovers in rL343163 which span two lines. This commit
changes these llvm::sort(C.begin(), C.end, ...) to llvm::sort(C, ...)
llvm-svn: 343426 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary: The convenience wrapper in STLExtras is available since rL342102.
Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb
Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits
Differential Revision: https://reviews.llvm.org/D52573
llvm-svn: 343163 | 
| | 
| 
| 
| 
| 
| 
| 
| | that follows the peeled iterations.
Differential Revision: https://reviews.llvm.org/D52176
llvm-svn: 343054 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | In this patch, I'm adding an extra check to the Latch's terminator in llvm::UnrollRuntimeLoopRemainder,
similar to how it is already done in the llvm::UnrollLoop.
The compiler would crash if this function is called with a malformed loop.
Patch by Rodrigo Caetano Rocha!
Differential Revision: https://reviews.llvm.org/D51486
llvm-svn: 342958 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
The strcmp->memcmp transform can make the resulting memcmp read
uninitialized data, which MSan doesn't like.
Resolves https://github.com/google/sanitizers/issues/993.
Reviewers: eugenis, xbolva00
Reviewed By: eugenis
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D52272
llvm-svn: 342582 | 
| | 
| 
| 
| 
| 
| 
| | This is still unsafe for long double, we will transform things into tanl
even if tanl is for another type. But that's for someone else to fix.
llvm-svn: 342542 | 
| | 
| 
| 
| 
| 
| 
| 
| | When SimplifyCFG changes the PHI node into a select instruction, the debug information becomes ambiguous. It causes the debugger to display wrong variable value. 
Differential Revision: https://reviews.llvm.org/D51976
llvm-svn: 342527 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | Previously the alignment on the newly created switch table data was not set,
meaning that DataLayout::getPreferredAlignment was free to overalign it to 16
bytes. This causes unnecessary code bloat.
Differential Revision: https://reviews.llvm.org/D51800
llvm-svn: 342039 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
The InductionDescriptor and RecurrenceDescriptor classes basically analyze the IR to identify the respective IVs. So, it is better to have them in the "Analysis" directory instead of the "Transforms" directory.
The rationale for this is to make the Induction and Recurrence descriptor classes available for analysis passes. Currently including them in an analysis pass produces link error (http://lists.llvm.org/pipermail/llvm-dev/2018-July/124456.html).
Induction and Recurrence descriptors are moved from Transforms/Utils/LoopUtils.h|cpp to Analysis/IVDescriptors.h|cpp.
Reviewers: dmgreen, llvm-commits, hfinkel
Reviewed By: dmgreen
Subscribers: mgorny
Differential Revision: https://reviews.llvm.org/D51153
llvm-svn: 342016 | 
| | 
| 
| 
| 
| 
| | Loop's getBlocks returns an ArrayRef.
llvm-svn: 341821 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
Move InductionDescriptor::transform() routine from LoopUtils to its only uses in LoopVectorize.cpp.
Specifically, the function is renamed as InnerLoopVectorizer::emitTransformedIndex().
This is a child to D51153.
Reviewers: dmgreen, llvm-commits
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D51837
llvm-svn: 341776 |