|  | Commit message (Collapse) | Author | Age | Files | Lines | 
|---|
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
Extending findExistingExpansion can use existing value in ExprValueMap.
This patch gives 0.3~0.5% performance improvements on 
benchmarks(test-suite, spec2000, spec2006, commercial benchmark)
   
Reviewers: mzolotukhin, sanjoy, zzheng
Differential Revision: http://reviews.llvm.org/D15559
llvm-svn: 260938 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
Export the CloneDebugInfoMetadata utility, which clones all debug info
associated with a function into the first module. Also use this function
in CloneModule on each function we clone (the CloneFunction entrypoint
already does this).
Without this, cloning a module will lead to DI quality regressions,
especially since r252219 reversed the Function <-> DISubprogram edge
(before we could get lucky and have this edge preserved if the
DISubprogram itself was, e.g. due to location metadata).
This was verified to fix missing debug information in julia and
a unittest to verify the new behavior is included.
Patch by Yichao Yu! Thanks!
Reviewers: loladiro, pcc
Differential Revision: http://reviews.llvm.org/D17165
llvm-svn: 260791 | 
| | 
| 
| 
| | llvm-svn: 260731 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | convergent functions.
Summary:
Performing this optimization duplicates the call to the convergent
function and adds new control-flow dependencies, which is a no-no.
Reviewers: jingyue
Subscribers: broune, hfinkel, tra, resistor, joker.eph, arsenm, llvm-commits, mzolotukhin
Differential Revision: http://reviews.llvm.org/D17128
llvm-svn: 260730 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | MSan adds a constructor to each translation unit that calls
__msan_init, and does nothing else. The idea is to run __msan_init
before any instrumented code. This results in multiple constructors
and multiple .init_array entries in the final binary, one per
translation unit. This is absolutely unnecessary; one would be
enough.
This change moves the constructors to a comdat group in order to drop
the extra ones.
llvm-svn: 260632 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
As discussed on IRC, move the ThinLTOGlobalProcessing code out of
the linker, and into TransformUtils. The name of the class is changed
to FunctionImportGlobalProcessing.
Reviewers: joker.eph, rafael
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D17081
llvm-svn: 260395 | 
| | 
| 
| 
| | llvm-svn: 260389 | 
| | 
| 
| 
| | llvm-svn: 260387 | 
| | 
| 
| 
| | llvm-svn: 260151 | 
| | 
| 
| 
| | llvm-svn: 260130 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | sanitizer issue. The PredicatedScalarEvolution's copy constructor
wasn't copying the Generation value, and was leaving it un-initialized.
Original commit message:
[SCEV][LAA] Add no wrap SCEV predicates and use use them to improve strided pointer detection
Summary:
This change adds no wrap SCEV predicates with:
  - support for runtime checking
  - support for expression rewriting:
      (sext ({x,+,y}) -> {sext(x),+,sext(y)}
      (zext ({x,+,y}) -> {zext(x),+,sext(y)}
Note that we are sign extending the increment of the SCEV, even for
the zext case. This is needed to cover the fairly common case where y would
be a (small) negative integer. In order to do this, this change adds two new
flags: nusw and nssw that are applicable to AddRecExprs and permit the
transformations above.
We also change isStridedPtr in LAA to be able to make use of
these predicates. With this feature we should now always be able to
work around overflow issues in the dependence analysis.
Reviewers: mzolotukhin, sanjoy, anemet
Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel
Differential Revision: http://reviews.llvm.org/D15412
llvm-svn: 260112 | 
| | 
| 
| 
| 
| 
| | sanitizer bots.
llvm-svn: 260087 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | We shouldn't assert when there are no memchecks, since we
can have SCEV checks. There is already an assert covering
the case where there are no SCEV checks or memchecks.
This also changes the LAA pointer wrapping versioning test
to use the loop versioning pass (this was how I managed to
trigger the assert in the loop versioning pass).
llvm-svn: 260086 | 
| | 
| 
| 
| 
| 
| | unittests
llvm-svn: 260015 | 
| | 
| 
| 
| | llvm-svn: 260014 | 
| | 
| 
| 
| | llvm-svn: 260013 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | In r255133 (reapplied r253126) we started to avoid redundant
recomputation of LCSSA after loop-unrolling. This patch moves one step
further in this direction - now we can avoid it for much wider range of
loops, as we start to look at IR and try to figure out if the
transformation actually breaks LCSSA phis or makes it necessary to
insert new ones.
Differential Revision: http://reviews.llvm.org/D16838
llvm-svn: 259869 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | D16251)
Summary:
This is a simpler fix to the problem than the dominator approach in
http://reviews.llvm.org/D16251. It adds only values into the gather() while loop
that have been seen before.
The actual endless loop is in the constant compare gather() routine in
Utils/SimplifyCFG.cpp. The same value ret.0.off0.i is pushed back into the
queue:
%.ret.0.off0.i = or i1 %.ret.0.off0.i, %cmp10.i
Here is what happens at the IR level:
for.cond.i:                                       ; preds = %if.end6.i,
%if.end.i54
%ix.0.i = phi i32 [ 0, %if.end.i54 ], [ %inc.i55, %if.end6.i ]
%ret.0.off0.i = phi i1 [false, %if.end.i54], [%.ret.0.off0.i, %if.end6.i] <<<
%cmp2.i = icmp ult i32 %ix.0.i, %11
br i1 %cmp2.i, label %for.body.i, label %LBJ_TmpSimpleNeedExt.exit
if.end6.i:                                        ; preds = %for.body.i
%cmp10.i = icmp ugt i32 %conv.i, %add9.i
%.ret.0.off0.i = or i1 %ret.0.off0.i, %cmp10.i <<<
When if.end.i54 gets eliminated which removes the definition of ret.0.off0.i.
The result is the expression %.ret.0.off0.i = or i1 %.ret.0.off0.i, %cmp10.i
(Note the first ‘or’ operand is now %.ret.0.off0.i, and *NOT* %ret.0.off0.i).
And
now there is use of .ret.0.off0.i before a definition which triggers the
“endless” loop in gather():
while(!DFT.empty()) {
    V = DFT.pop_back_val();   // V is .ret.0.off0.i
    if (Instruction *I = dyn_cast<Instruction>(V)) {
      // If it is a || (or && depending on isEQ), process the operands.
      if (I->getOpcode() == (isEQ ? Instruction::Or : Instruction::And)) {
        DFT.push_back(I->getOperand(1));  // This is now .ret.0.off0.i also
        DFT.push_back(I->getOperand(0));
        continue; // “endless loop” for .ret.0.off0.i
      }
Reviewers: reames, ahatanak
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16839
llvm-svn: 259730 | 
| | 
| 
| 
| | llvm-svn: 259623 | 
| | 
| 
| 
| | llvm-svn: 259621 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
LoopVersioning is a transform utility that transform passes can use to
run-time disambiguate may-aliasing accesses. I'd like to also expose as
pass to allow it to be unit-tested.
I am planning to add support for non-aliasing annotation in
LoopVersioning and I'd like to be able to write tests directly using
this pass.
(After that feature is done, the pass could also be used to look for
optimization opportunities that are hidden behind incomplete alias
information at compile time.)
The pass drives LoopVersioning in its default way which is to fully
disambiguate may-aliasing accesses no matter how many checks are
required.
Reviewers: hfinkel, ashutosh.nema, sbaranga
Subscribers: zzheng, mssimpso, llvm-commits, sanjoy
Differential Revision: http://reviews.llvm.org/D16612
llvm-svn: 259610 | 
| | 
| 
| 
| | llvm-svn: 259602 | 
| | 
| 
| 
| | llvm-svn: 259599 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | Please see include/llvm/Transforms/Utils/MemorySSA.h for a description
of MemorySSA, and what it does.
Differential Revision: http://reviews.llvm.org/D7864
llvm-svn: 259595 | 
| | 
| 
| 
| 
| 
| | Differential revision: http://reviews.llvm.org/D16793
llvm-svn: 259539 | 
| | 
| 
| 
| 
| 
| 
| | These sets perform linear searching in small mode so it is never a good
idea to use SmallSize/N bigger than 32.
llvm-svn: 259283 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | its aliasee.
Summary: When splitting module with preserving locals, we currently do not handle case of global alias being separated with its aliasee.
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16585
llvm-svn: 259075 | 
| | 
| 
| 
| | llvm-svn: 259010 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This is a fix for:
https://llvm.org/bugs/show_bug.cgi?id=26308
With the switch to using the TTI cost model in:
http://reviews.llvm.org/rL228826
...it became possible to hit a zero-cost cycle of instructions (gep -> phi -> gep...), 
so we need a cap for the recursion in DominatesMergePoint().
A recursion depth parameter was already added for a different reason in:
http://reviews.llvm.org/rL255660
...so we can just set a limit for it.
I pulled "10" out of the air and made it an independent parameter that we can play with.
It might be higher than it needs to be given the currently low default value of 
PHINodeFoldingThreshold (2). That's the starting cost value that we enter the recursion
with, and most instructions have cost set to TCC_Basic (1), so I don't think we're going
to speculate more than 2 instructions with the current parameters.
As noted in the review and the TODO comment, we can do better than just limiting recursion
depth.
Differential Revision: http://reviews.llvm.org/D16637
llvm-svn: 258971 | 
| | 
| 
| 
| | llvm-svn: 258937 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | instruction (PR24818)""
This reverts commit r258903 which reverted r255660.  r258903 was an
accidental commit and should not have been committed.
llvm-svn: 258905 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | SimplifyCFG tries to turn complex branch conditions into a switch.
Some of it's logic attempts to reason about bitwise arithmetic produced
by InstCombine.  InstCombine can turn things like (X == 2) || (X == 3)
into (X & 1) == 2 and so SimplifyCFG tries to detect when this occurs so
that it can produce a switch instruction.
However, the legality checking was not sufficient to determine whether
or not this had occured.  Correctly check this case by requiring that
the right-hand side of the comparison be a power of two.
This fixes PR26323.
llvm-svn: 258904 | 
| | 
| 
| 
| 
| 
| 
| 
| | (PR24818)"
This reverts commit r255660.
llvm-svn: 258903 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html
"I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened."
- Obi Wan Kenobi
Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark
Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D16471
llvm-svn: 258861 | 
| | 
| 
| 
| 
| 
| 
| 
| | other minor fixes.
Differential revision: reviews.llvm.org/D16568
llvm-svn: 258831 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This is a step towards solving PR25892:
https://llvm.org/bugs/show_bug.cgi?id=25892
It won't handle the reported case. As noted by the 'TODO' comments in the patch, 
we need to relax the hasOneUse() constraint and also match patterns that include
memset_chk() and the llvm.memset() intrinsic in addition to memset().
Differential Revision: http://reviews.llvm.org/D16337
llvm-svn: 258816 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | Use existing functionality provided in changeToUnreachable instead of
reinventing it in LoopSimplify.
No functionality change is intended.
llvm-svn: 258663 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | SCCP has code identical to changeToUnreachable's behavior, switch it
over to just call changeToUnreachable.
No functionality change intended.
llvm-svn: 258654 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | InstCombine and SCCP both want to remove dead code in a very particular
way but using identical means to do so.  Share the code between the two.
No functionality change is intended.
llvm-svn: 258653 | 
| | 
| 
| 
| | llvm-svn: 258455 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | Use the helper function added in r258428.
The check should really be hoisted to the caller of all of these
optimize* functions, but that's another step.
llvm-svn: 258446 | 
| | 
| 
| 
| | llvm-svn: 258445 | 
| | 
| 
| 
| | llvm-svn: 258444 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | instruction.
Patch by Yuanrui Zhang.
Differential Revision: http://reviews.llvm.org/D16100
llvm-svn: 258435 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This is similar to the bug/fix:
https://llvm.org/bugs/show_bug.cgi?id=26211
http://reviews.llvm.org/rL258325
The fmin() test case reveals another bug caused by sloppy
code duplication. It will crash without this patch because
fp128 is a valid floating-point type, but we would think
that we had matched a function that used doubles.
The new helper function can be used to replace similar
checks that are used in several other places in this file.
llvm-svn: 258428 | 
| | 
| 
| 
| | llvm-svn: 258416 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | type.  NFC.
Summary:
The previous form, taking opcode and type, is moved to an internal
helper and the new form, taking an instruction, is a wrapper around this
helper.
Although this is a slight cleanup on its own, the main motivation is to
refactor the constant folding API to ease migration to opaque pointers.
This will be follow-up work.
Reviewers: eddyb
Subscribers: dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D16383
llvm-svn: 258391 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The test case will crash without this patch because the subsequent call to
hasUnsafeAlgebra() assumes that the call instruction is an FPMathOperator
(ie, returns an FP type).
This part of the function signature check was omitted for the sqrt() case, 
but seems to be in place for all other transforms.
Before:
http://reviews.llvm.org/rL257400
...we would have needlessly continued execution in optimizeSqrt(), but the
bug was harmless because we'd eventually fail some other check and return
without damage.
This should fix:
https://llvm.org/bugs/show_bug.cgi?id=26211
Differential Revision: http://reviews.llvm.org/D16198
llvm-svn: 258325 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
Funclet EH tables require that a given funclet have only one unwind
destination for exceptional exits.  The verifier will therefore reject
e.g. two cleanuprets with different unwind dests for the same cleanup, or
two invokes exiting the same funclet but to different unwind dests.
Because catchswitch has no 'nounwind' variant, and because IR producers
are not *required* to annotate calls which will not unwind as 'nounwind',
it is legal to nest a call or an "unwind to caller" catchswitch within a
funclet pad that has an unwind destination other than caller; it is
undefined behavior for such a call or catchswitch to unwind.
Normally when inlining an invoke, calls in the inlined sequence are
rewritten to invokes that unwind to the callsite invoke's unwind
destination, and "unwind to caller" catchswitches in the inlined sequence
are rewritten to unwind to the callsite invoke's unwind destination.
However, if such a call or "unwind to caller" catchswitch is located in a
callee funclet that has another exceptional exit with an unwind
destination within the callee, applying the normal transformation would
give that callee funclet multiple unwind destinations for its exceptional
exits.  There would be no way for EH table generation to determine which
is the "true" exit, and the verifier would reject the function
accordingly.
Add logic to the inliner to detect these cases and leave such calls and
"unwind to caller" catchswitches as calls and "unwind to caller"
catchswitches in the inlined sequence.
This fixes PR26147.
Reviewers: rnk, andrew.w.kaylor, majnemer
Subscribers: alexcrichton, llvm-commits
Differential Revision: http://reviews.llvm.org/D16319
llvm-svn: 258273 | 
| | 
| 
| 
| | llvm-svn: 258176 |