| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
| |
PR42924 points out that copying the GlobalsMetadata type during
construction of AddressSanitizer can result in exteremely lengthened
build times for translation units that have many globals. This can be addressed
by just making the GlobalsMD member in AddressSanitizer a reference to
avoid the copy. The GlobalsMetadata type is already passed to the
constructor as a reference anyway.
Differential Revision: https://reviews.llvm.org/D68287
llvm-svn: 373389
|
| |
|
|
|
|
|
|
| |
Identical to it's trunc-less variant, just pretent-to hoist
trunc, and everything else still holds:
https://rise4fun.com/Alive/JRU
llvm-svn: 373364
|
| |
|
|
|
|
| |
https://rise4fun.com/Alive/yR4
llvm-svn: 373363
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements a variation of a well known techniques for JIT compilers - we have an implementation in tree as LoopPredication - but with an interesting twist. This version does not assume the ability to execute a path which wasn't taken in the original program (such as a guard or widenable.condition intrinsic). The benefit is that this works for arbitrary IR from any frontend (including C/C++/Fortran). The tradeoff is that it's restricted to read only loops without implicit exits.
This builds on SCEV, and can thus eliminate the loop varying portion of the any early exit where all exits are understandable by SCEV. A key advantage is that fixing deficiency exposed in SCEV - already found one while writing test cases - will also benefit all of full redundancy elimination (and most other loop transforms).
I haven't seen anything in the literature which quite matches this. Given that, I'm not entirely sure that keeping the name "loop predication" is helpful. Anyone have suggestions for a better name? This is analogous to partial redundancy elimination - since we remove the condition flowing around the backedge - and has some parallels to our existing transforms which try to make conditions invariant in loops.
Factoring wise, I chose to put this in IndVarSimplify since it's a generally applicable to all workloads. I could split this off into it's own pass, but we'd then probably want to add that new pass every place we use IndVars. One solid argument for splitting it off into it's own pass is that this transform is "too good". It breaks a huge number of existing IndVars test cases as they tend to be simple read only loops. At the moment, I've opted it off by default, but if we add this to IndVars and enable, we'll have to update around 20 test files to add side effects or disable this transform.
Near term plan is to fuzz this extensively while off by default, reflect and discuss on the factoring issue mentioned just above, and then enable by default. I also need to give some though to supporting widenable conditions in this framing.
Differential Revision: https://reviews.llvm.org/D67408
llvm-svn: 373351
|
| |
|
|
|
|
| |
Seems to be slower than memcpy + strlen.
llvm-svn: 373335
|
| |
|
|
| |
llvm-svn: 373333
|
| |
|
|
|
|
|
|
|
| |
This patch fixes the build break on Windows hosts.
There must be a better way of accessing the equivalent POSIX math constant
`M_E`.
llvm-svn: 373274
|
| |
|
|
|
|
|
|
|
| |
Expand the simplification of special cases of `log()` to include `log2()`
and `log10()` as well as intrinsics and more types.
Differential revision: https://reviews.llvm.org/D67199
llvm-svn: 373261
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The BasicBlockManager is potentially broken and should not be used.
Replace all uses of the BasicBlockPass with a FunctionBlockPass+loop on
blocks.
Reviewers: chandlerc
Subscribers: jholewinski, sanjoy.google, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68234
llvm-svn: 373254
|
| |
|
|
| |
llvm-svn: 373251
|
| |
|
|
| |
llvm-svn: 373249
|
| |
|
|
|
|
|
|
|
|
|
|
| |
in ELF
With this patch, compiler generated profile variables will have its own COMDAT
name for ELF format, which syncs the behavior with COFF. Tested with clang
PGO bootstrap. This shows a modest reduction in object sizes in ELF format.
Differential Revision: https://reviews.llvm.org/D68041
llvm-svn: 373241
|
| |
|
|
| |
llvm-svn: 373231
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Name: negate if true
%sel = select i1 %cond, i32 -1, i32 1
%r = mul i32 %sel, %x
=>
%m = sub i32 0, %x
%r = select i1 %cond, i32 %m, i32 %x
Name: negate if false
%sel = select i1 %cond, i32 1, i32 -1
%r = mul i32 %sel, %x
=>
%m = sub i32 0, %x
%r = select i1 %cond, i32 %x, i32 %m
https://rise4fun.com/Alive/Nlh
llvm-svn: 373230
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: jholewinski, arsenm, jvesely, nhaehnle, eraman, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D68141
llvm-svn: 373207
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet, jdoerfert
Subscribers: hiraditya, asbirlea, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D68142
llvm-svn: 373195
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MergeFunctions and DCE pass are missing from OCaml/C-api. This patch
adds them.
Differential Revision: https://reviews.llvm.org/D65071
Reviewers: whitequark, hiraditya, deadalnix
Reviewed By: whitequark
Subscribers: llvm-commits
Tags: #llvm
Authored by: kren1
llvm-svn: 373170
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
different BB's (PR43500)
If we happen to have the same div in two basic blocks,
and in one of those we also happen to have the rem part,
we'd match the div-rem pair, but the wrong ones.
So let's drop overly-ambiguous assert.
Fixes https://bugs.llvm.org/show_bug.cgi?id=43500
llvm-svn: 373167
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"SCEVAddRecExpr operand is not loop-invariant!")
Initially SLP vectorizer replaced all going-to-be-vectorized
instructions with Undef values. It may break ScalarEvaluation and may
cause a crash.
Reworked SLP vectorizer so that it does not replace vectorized
instructions by UndefValue anymore. Instead vectorized instructions are
marked for deletion inside if BoUpSLP class and deleted upon class
destruction.
Reviewers: mzolotukhin, mkuper, hfinkel, RKSimon, davide, spatel
Subscribers: RKSimon, Gerolf, anemet, hans, majnemer, llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D29641
llvm-svn: 373166
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This is to facilitate unittests
Reviewers: compnerd, vsk, tejohnson, sebpop, brzycki, SirishP
Reviewed By: tejohnson
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68079
llvm-svn: 373151
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
profile symbol list.
Currently many existing users using profile-sample-accurate want to reduce
code size as much as possible. Their use cases are different from the scenario
profile symbol list tries to handle -- the major motivation of adding profile
symbol list is to get the major memory/code size saving without introduce
performance regression. So to keep the behavior of profile-sample-accurate
unchanged, we think decoupling these two things and using a new flag to
control the handling of profile symbol list may be better.
When profile-sample-accurate and the new flag profile-accurate-for-symsinlist
are both present, since profile-sample-accurate is a user assertion we let it
have a higher precedence.
Differential Revision: https://reviews.llvm.org/D68047
llvm-svn: 373133
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is valid for any `sext` bitwidth pair:
```
Processing /tmp/opt.ll..
----------------------------------------
%signed = sext %y
%r = shl %x, %signed
ret %r
=>
%unsigned = zext %y
%r = shl %x, %unsigned
ret %r
%signed = sext %y
Done: 2016
Optimization is correct!
```
(This isn't so for funnel shifts, there it's illegal for e.g. i6->i7.)
Main motivation is the C++ semantics:
```
int shl(int a, char b) {
return a << b;
}
```
ends as
```
%3 = sext i8 %1 to i32
%4 = shl i32 %0, %3
```
https://godbolt.org/z/0jgqUq
which is, as this shows, too pessimistic.
There is another problem here - we can only do the fold
if sext is one-use. But we can trivially have cases
where several shifts have the same sext shift amount.
This should be resolved, later.
Reviewers: spatel, nikic, RKSimon
Reviewed By: spatel
Subscribers: efriedma, hiraditya, nlopes, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68103
llvm-svn: 373106
|
| |
|
|
|
|
| |
The static analyzer is warning about a potential null dereference, but we should be able to use cast<> directly and if not assert will fire for us.
llvm-svn: 373099
|
| |
|
|
|
|
|
|
| |
analyzer dyn_cast<FunctionSummary> null dereference warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<FunctionSummary> directly and if not assert will fire for us.
llvm-svn: 373097
|
| |
|
|
|
|
|
|
| |
warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<StructType> directly and if not assert will fire for us.
llvm-svn: 373095
|
| |
|
|
| |
llvm-svn: 373081
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67922
llvm-svn: 373054
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can't use short granules with stack instrumentation when targeting older
API levels because the rest of the system won't understand the short granule
tags stored in shadow memory.
Moreover, we need to be able to let old binaries (which won't understand
short granule tags) run on a new system that supports short granule
tags. Such binaries will call the __hwasan_tag_mismatch function when their
outlined checks fail. We can compensate for the binary's lack of support
for short granules by implementing the short granule part of the check in
the __hwasan_tag_mismatch function. Unfortunately we can't do anything about
inline checks, but I don't believe that we can generate these by default on
aarch64, nor did we do so when the ABI was fixed.
A new function, __hwasan_tag_mismatch_v2, is introduced that lets code
targeting the new runtime avoid redoing the short granule check. Because tag
mismatches are rare this isn't important from a performance perspective; the
main benefit is that it introduces a symbol dependency that prevents binaries
targeting the new runtime from running on older (i.e. incompatible) runtimes.
Differential Revision: https://reviews.llvm.org/D68059
llvm-svn: 373035
|
| |
|
|
|
|
|
|
| |
(isLoopInvariant(Operands[i], L) && "SCEVAddRecExpr operand is not loop-invariant!")
This reverts r372626 (git commit 6a278d9073bdc158d31d4f4b15bbe34238f22c18)
llvm-svn: 373019
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch extends the current capabilities in loop fusion to fuse guarded loops
(as defined in https://reviews.llvm.org/D63885). The patch adds the necessary
safety checks to ensure that it safe to fuse the guarded loops (control flow
equivalent, no intervening code, and same guard conditions). It also provides an
alternative method to perform the actual fusion of guarded loops. The mechanics
to fuse guarded loops are slightly different then fusing non-guarded loops, so I
opted to keep them separate methods. I will be cleaning this up in later
patches, and hope to converge on a single method to fuse both guarded and
non-guarded loops, but for now I think the review will be easier to keep them
separate.
Reviewers: jdoerfert, Meinersbur, dmgreen, etiotto, Whitney
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65464
llvm-svn: 373018
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For a runtime loop if we can compute its trip count upperbound:
Don't unroll if:
1. loop is not guaranteed to run either zero or upperbound iterations; and
2. trip count upperbound is less than UnrollMaxUpperBound
Unless user or TTI asked to do so.
If unrolling, limit unroll factor to loop's trip count upperbound.
Differential Revision: https://reviews.llvm.org/D62989
Change-Id: I6083c46a9d98b2e22cd855e60523fdc5a4929c73
llvm-svn: 373017
|
| |
|
|
|
|
|
|
|
|
|
|
| |
index is all zeroes to prevent an infinite loop.
The test case here previously infinite looped. Only one element from the GEP is used so SimplifyDemandedVectorElts would replace the other lanes in each index with undef leading to the first index being <0, undef, undef, undef>. But there's a GEP transform that tries to replace an index into a 0 sized type with a zero index. But the zero index check only works on ConstantInt 0 or ConstantAggregateZero so it would turn the index back to zeroinitializer. Resulting in a loop.
The fix is to use m_Zero() to allow a vector of zeroes and undefs.
Differential Revision: https://reviews.llvm.org/D67977
llvm-svn: 373000
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
FlattenCFG merges two 'if' basicblocks by inserting one basicblock
to another basicblock. The inserted basicblock can have a successor
that contains a PHI node whoes incoming basicblock is the inserted
basicblock. Since the existing code does not handle it, it becomes
a badref.
if (cond1)
statement
if (cond2)
statement
successor - contains PHI node whose predecessor is cond2
-->
if (cond1 || cond2)
statement
(BB for cond2 was deleted)
successor - contains PHI node whose predecessor is cond2 --> bad ref!
Author: Jaebaek Seo
Reviewers: asbirlea, kuhar, tstellar, chandlerc, davide, dexonsmith
Reviewed By: kuhar
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68032
llvm-svn: 372989
|
| |
|
|
|
|
|
|
| |
warnings. NFCI.
The static analyzer is warning about a potential null dereferences, but we should be able to use cast<BranchInst> directly and if not assert will fire for us.
llvm-svn: 372977
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
getFlippedStrictnessPredicateAndConstant
Summary:
Removing an assumption (assert) that the CmpInst already has been
simplified in getFlippedStrictnessPredicateAndConstant. Solution is
to simply bail out instead of hitting the assertion. Instead we
assume that any profitable rewrite will happen in the next iteration
of InstCombine.
The reason why we can't assume that the CmpInst already has been
simplified is that the worklist does not guarantee such an ordering.
Solves https://bugs.llvm.org/show_bug.cgi?id=43376
Reviewers: spatel, lebedev.ri
Reviewed By: lebedev.ri
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68022
llvm-svn: 372972
|
| |
|
|
|
|
|
|
| |
warnings. NFCI.
The static analyzer is warning about a potential null dereferences, but we should be able to use cast<> directly and if not assert will fire for us.
llvm-svn: 372960
|
| |
|
|
|
|
|
|
| |
dereference warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<MemIntrinsic> directly and if not assert will fire for us.
llvm-svn: 372959
|
| |
|
|
|
|
|
|
| |
(PR43251)
https://rise4fun.com/Alive/0j9
llvm-svn: 372930
|
| |
|
|
|
|
|
|
|
| |
For large functions, verifying the whole function after each loop takes
non-linear time.
Differential Revision: https://reviews.llvm.org/D67571
llvm-svn: 372924
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
https://rise4fun.com/Alive/KtL
This also shows that the fold added in D67412 / r372257
was too specific, and the new fold allows those test cases
to be handled more generically, therefore i delete now-dead code.
This is yet again motivated by
D67122 "[UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour"
llvm-svn: 372912
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As @reames pointed out post-commit, rL371518 adds additional rounding
in some cases, when doing constant folding of the multiplication.
This breaks a guarantee llvm.fma makes and must be avoided.
This patch reapplies rL371518, but splits off the simplifications not
requiring rounding from SimplifFMulInst as SimplifyFMAFMul.
Reviewers: spatel, lebedev.ri, reames, scanon
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D67434
llvm-svn: 372899
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently m_Br only takes references to BasicBlock*, which limits its
flexibility. For example, you have to declare a variable, even if you
ignore the result or you have to have additional checks to make sure the
matched BB matches an expected one.
This patch adds m_BasicBlock and m_SpecificBB matchers, which can be
used like the existing matchers for constants or values.
I also had a look at the existing uses and updated a few. IMO it makes
the code a bit more explicit.
Reviewers: spatel, craig.topper, RKSimon, majnemer, lebedev.ri
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D68013
llvm-svn: 372885
|
| |
|
|
|
|
| |
If we generate the gc.relocate, and then later prove two arguments to the statepoint are equivalent, we should canonicalize the gc.relocate to the form we would have produced if this had been known before rewriting.
llvm-svn: 372771
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is again motivated by D67122 sanitizer check enhancement.
That patch seemingly worsens `-fsanitize=pointer-overflow`
overhead from 25% to 50%, which strongly implies missing folds.
For
```
#include <cassert>
char* test(char& base, signed long offset) {
__builtin_assume(offset < 0);
return &base + offset;
}
```
We produce
https://godbolt.org/z/r40U47
and again those two icmp's can be merged:
```
Name: 0
Pre: C != 0
%adjusted = add i8 %base, C
%not_null = icmp ne i8 %adjusted, 0
%no_underflow = icmp ult i8 %adjusted, %base
%r = and i1 %not_null, %no_underflow
=>
%neg_offset = sub i8 0, C
%r = icmp ugt i8 %base, %neg_offset
```
https://rise4fun.com/Alive/ALap
https://rise4fun.com/Alive/slnN
There are 3 other variants of this pattern,
i believe they all will go into InstSimplify.
https://bugs.llvm.org/show_bug.cgi?id=43259
Reviewers: spatel, xbolva00, nikic
Reviewed By: spatel
Subscribers: efriedma, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67849
llvm-svn: 372768
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is again motivated by D67122 sanitizer check enhancement.
That patch seemingly worsens `-fsanitize=pointer-overflow`
overhead from 25% to 50%, which strongly implies missing folds.
This pattern isn't exactly what we get there
(strict vs. non-strict predicate), but this pattern does not
require known-bits analysis, so it is best to handle it first.
```
Name: 0
%adjusted = add i8 %base, %offset
%not_null = icmp ne i8 %adjusted, 0
%no_underflow = icmp ule i8 %adjusted, %base
%r = and i1 %not_null, %no_underflow
=>
%neg_offset = sub i8 0, %offset
%r = icmp ugt i8 %base, %neg_offset
```
https://rise4fun.com/Alive/knp
There are 3 other variants of this pattern,
they all will go into InstSimplify:
https://rise4fun.com/Alive/bIDZ
https://bugs.llvm.org/show_bug.cgi?id=43259
Reviewers: spatel, xbolva00, nikic
Reviewed By: spatel
Subscribers: hiraditya, majnemer, vsk, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67846
llvm-svn: 372767
|
| |
|
|
|
|
|
|
| |
warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<CmpInst> directly and if not assert will fire for us.
llvm-svn: 372732
|
| |
|
|
|
|
|
|
| |
warning. NFCI.
Assert that we've found the DomBlock.
llvm-svn: 372728
|
| |
|
|
|
|
|
|
| |
dereference warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<LandingPadInst> directly and if not assert will fire for us.
llvm-svn: 372727
|
| |
|
|
|
|
|
|
| |
warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<Instruction> directly and if not assert will fire for us.
llvm-svn: 372726
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Simplification.
Induction Variable Simplification pass does not update dbg.value intrinsic.
Before:
%add = add nuw nsw i32 %ArgIndex.06, 1
call void @llvm.dbg.value(metadata i32 %add, metadata !17, metadata !DIExpression())
After:
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
call void @llvm.dbg.value(metadata i64 undef, metadata !17, metadata !DIExpression())
There should be:
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
call void @llvm.dbg.value(metadata i64 %indvars.iv.next, metadata !17, metadata !DIExpression())
Differential Revision: https://reviews.llvm.org/D67770
llvm-svn: 372703
|