| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
| |
dereference warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<MemIntrinsic> directly and if not assert will fire for us.
llvm-svn: 372959
|
|
|
|
|
|
|
|
| |
(PR43251)
https://rise4fun.com/Alive/0j9
llvm-svn: 372930
|
|
|
|
|
|
|
|
|
| |
For large functions, verifying the whole function after each loop takes
non-linear time.
Differential Revision: https://reviews.llvm.org/D67571
llvm-svn: 372924
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
https://rise4fun.com/Alive/KtL
This also shows that the fold added in D67412 / r372257
was too specific, and the new fold allows those test cases
to be handled more generically, therefore i delete now-dead code.
This is yet again motivated by
D67122 "[UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour"
llvm-svn: 372912
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As @reames pointed out post-commit, rL371518 adds additional rounding
in some cases, when doing constant folding of the multiplication.
This breaks a guarantee llvm.fma makes and must be avoided.
This patch reapplies rL371518, but splits off the simplifications not
requiring rounding from SimplifFMulInst as SimplifyFMAFMul.
Reviewers: spatel, lebedev.ri, reames, scanon
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D67434
llvm-svn: 372899
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently m_Br only takes references to BasicBlock*, which limits its
flexibility. For example, you have to declare a variable, even if you
ignore the result or you have to have additional checks to make sure the
matched BB matches an expected one.
This patch adds m_BasicBlock and m_SpecificBB matchers, which can be
used like the existing matchers for constants or values.
I also had a look at the existing uses and updated a few. IMO it makes
the code a bit more explicit.
Reviewers: spatel, craig.topper, RKSimon, majnemer, lebedev.ri
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D68013
llvm-svn: 372885
|
|
|
|
|
|
| |
If we generate the gc.relocate, and then later prove two arguments to the statepoint are equivalent, we should canonicalize the gc.relocate to the form we would have produced if this had been known before rewriting.
llvm-svn: 372771
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is again motivated by D67122 sanitizer check enhancement.
That patch seemingly worsens `-fsanitize=pointer-overflow`
overhead from 25% to 50%, which strongly implies missing folds.
For
```
#include <cassert>
char* test(char& base, signed long offset) {
__builtin_assume(offset < 0);
return &base + offset;
}
```
We produce
https://godbolt.org/z/r40U47
and again those two icmp's can be merged:
```
Name: 0
Pre: C != 0
%adjusted = add i8 %base, C
%not_null = icmp ne i8 %adjusted, 0
%no_underflow = icmp ult i8 %adjusted, %base
%r = and i1 %not_null, %no_underflow
=>
%neg_offset = sub i8 0, C
%r = icmp ugt i8 %base, %neg_offset
```
https://rise4fun.com/Alive/ALap
https://rise4fun.com/Alive/slnN
There are 3 other variants of this pattern,
i believe they all will go into InstSimplify.
https://bugs.llvm.org/show_bug.cgi?id=43259
Reviewers: spatel, xbolva00, nikic
Reviewed By: spatel
Subscribers: efriedma, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67849
llvm-svn: 372768
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is again motivated by D67122 sanitizer check enhancement.
That patch seemingly worsens `-fsanitize=pointer-overflow`
overhead from 25% to 50%, which strongly implies missing folds.
This pattern isn't exactly what we get there
(strict vs. non-strict predicate), but this pattern does not
require known-bits analysis, so it is best to handle it first.
```
Name: 0
%adjusted = add i8 %base, %offset
%not_null = icmp ne i8 %adjusted, 0
%no_underflow = icmp ule i8 %adjusted, %base
%r = and i1 %not_null, %no_underflow
=>
%neg_offset = sub i8 0, %offset
%r = icmp ugt i8 %base, %neg_offset
```
https://rise4fun.com/Alive/knp
There are 3 other variants of this pattern,
they all will go into InstSimplify:
https://rise4fun.com/Alive/bIDZ
https://bugs.llvm.org/show_bug.cgi?id=43259
Reviewers: spatel, xbolva00, nikic
Reviewed By: spatel
Subscribers: hiraditya, majnemer, vsk, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67846
llvm-svn: 372767
|
|
|
|
|
|
|
|
| |
warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<CmpInst> directly and if not assert will fire for us.
llvm-svn: 372732
|
|
|
|
|
|
|
|
| |
warning. NFCI.
Assert that we've found the DomBlock.
llvm-svn: 372728
|
|
|
|
|
|
|
|
| |
dereference warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<LandingPadInst> directly and if not assert will fire for us.
llvm-svn: 372727
|
|
|
|
|
|
|
|
| |
warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<Instruction> directly and if not assert will fire for us.
llvm-svn: 372726
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Simplification.
Induction Variable Simplification pass does not update dbg.value intrinsic.
Before:
%add = add nuw nsw i32 %ArgIndex.06, 1
call void @llvm.dbg.value(metadata i32 %add, metadata !17, metadata !DIExpression())
After:
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
call void @llvm.dbg.value(metadata i64 undef, metadata !17, metadata !DIExpression())
There should be:
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
call void @llvm.dbg.value(metadata i64 %indvars.iv.next, metadata !17, metadata !DIExpression())
Differential Revision: https://reviews.llvm.org/D67770
llvm-svn: 372703
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When vectorisation is forced with a pragma, we optimise for min size, and we
need to emit runtime memory checks, then allow this code growth and don't run
in an assert like we currently do.
This is the result of D65197 and D66803, and was a use-case not really
considered before. If this now happens, we emit an optimisation remark warning
about the code-size expansion, which can be avoided by not forcing
vectorisation or possibly source-code modifications.
Differential Revision: https://reviews.llvm.org/D67764
llvm-svn: 372694
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Fold
or(ashr(subNSW(Y, X), ScalarSizeInBits(Y)-1), X)
into
X s> Y ? -1 : X
https://rise4fun.com/Alive/d8Ab
clamp255 is a common operator in image processing, can be implemented
in a shifty way "(255 - X) >> 31 | X & 255". Fold shift into select
enables more optimization, e.g., vmin generation for ARM target.
Reviewers: lebedev.ri, efriedma, spatel, kparzysz, bcahoon
Reviewed By: lebedev.ri
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67800
llvm-svn: 372678
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Fold
and(ashr(subNSW(Y, X), ScalarSizeInBits(Y)-1), X)
into
X s> Y ? X : 0
https://rise4fun.com/Alive/lFH
Fold shift into select enables more optimization,
e.g., vmax generation for ARM target.
Reviewers: lebedev.ri, efriedma, spatel, kparzysz, bcahoon
Reviewed By: lebedev.ri
Subscribers: xbolva00, andreadb, craig.topper, RKSimon, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67799
llvm-svn: 372676
|
|
|
|
|
|
|
|
|
|
|
| |
When a cold path is outlined, the value tracking in the assumption cache may be
invalidated due to the code motion. We would previously trip an assertion in
subsequent passes (but required the passes to happen in a single run as the
assumption cache is shared across the passes). Invalidating the cache ensures
that we get the correct information when needed with the legacy pass manager as
well.
llvm-svn: 372667
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
is available
In rL372232, we treated names showing up in profile as not cold when
profile-sample-accurate is enabled. This caused 70k size regression in
Chrome/Android. The patch put a guard and only enable the change when
profile symbol list is available, i.e., keep the old behavior when profile
symbol list is not available.
Differential Revision: https://reviews.llvm.org/D67931
llvm-svn: 372665
|
|
|
|
|
|
| |
Extracted from https://reviews.llvm.org/D67849#inline-610377
llvm-svn: 372654
|
|
|
|
|
|
| |
Extracted from https://reviews.llvm.org/D67849#inline-610377
llvm-svn: 372653
|
|
|
|
|
|
| |
"Implementations are free to malloc() a buffer containing either (size + 1) bytes or (strnlen(s, size) + 1) bytes. Applications should not assume that strndup() will allocate (size + 1) bytes when strlen(s) is smaller than size."
llvm-svn: 372647
|
|
|
|
|
|
| |
IntegerType or VectorType)
llvm-svn: 372638
|
|
|
|
| |
llvm-svn: 372637
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Motivation:
- If we can fold it to strdup, we should (strndup does more things than strdup).
- Annotation mechanism. (Works for strdup well).
strdup and strndup are part of C 20 (currently posix fns), so we should optimize them.
Reviewers: efriedma, jdoerfert
Reviewed By: jdoerfert
Subscribers: lebedev.ri, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67679
llvm-svn: 372636
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(PR42563)
Summary:
If we have a pattern `(x & (-1 >> maskNbits)) << shiftNbits`,
we already know (have a fold) that will drop the `& (-1 >> maskNbits)`
mask iff `(shiftNbits-maskNbits) s>= 0` (i.e. `shiftNbits u>= maskNbits`).
So even if `(shiftNbits-maskNbits) s< 0`, we can still
fold, we will just need to apply a **constant** mask afterwards:
```
Name: c, normal+mask
%t0 = lshr i32 -1, C1
%t1 = and i32 %t0, %x
%r = shl i32 %t1, C2
=>
%n0 = shl i32 %x, C2
%n1 = i32 ((-(C2-C1))+32)
%n2 = zext i32 %n1 to i64
%n3 = lshr i64 -1, %n2
%n4 = trunc i64 %n3 to i32
%r = and i32 %n0, %n4
```
https://rise4fun.com/Alive/gslRa
Naturally, old `%masked` will have to be one-use.
This is not valid for pattern f - where "masking" is done via `ashr`.
https://bugs.llvm.org/show_bug.cgi?id=42563
Reviewers: spatel, nikic, xbolva00
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67725
llvm-svn: 372630
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(PR42563)
Summary:
And this is **finally** the interesting part of that fold!
If we have a pattern `(x & (~(-1 << maskNbits))) << shiftNbits`,
we already know (have a fold) that will drop the `& (~(-1 << maskNbits))`
mask iff `(maskNbits+shiftNbits) u>= bitwidth(x)`.
But that is actually ignorant, there's more general fold here:
In this pattern, `(maskNbits+shiftNbits)` actually correlates
with the number of low bits that will remain in the final value.
So even if `(maskNbits+shiftNbits) u< bitwidth(x)`, we can still
fold, we will just need to apply a **constant** mask afterwards:
```
Name: a, normal+mask
%onebit = shl i32 -1, C1
%mask = xor i32 %onebit, -1
%masked = and i32 %mask, %x
%r = shl i32 %masked, C2
=>
%n0 = shl i32 %x, C2
%n1 = add i32 C1, C2
%n2 = zext i32 %n1 to i64
%n3 = shl i64 -1, %n2
%n4 = xor i64 %n3, -1
%n5 = trunc i64 %n4 to i32
%r = and i32 %n0, %n5
```
https://rise4fun.com/Alive/F5R
Naturally, old `%masked` will have to be one-use.
Similar fold exists for patterns c,d,e, will post patch later.
https://bugs.llvm.org/show_bug.cgi?id=42563
Reviewers: spatel, nikic, xbolva00
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67677
llvm-svn: 372629
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"SCEVAddRecExpr operand is not loop-invariant!")
Summary:
Initially SLP vectorizer replaced all going-to-be-vectorized
instructions with Undef values. It may break ScalarEvaluation and may
cause a crash.
Reworked SLP vectorizer so that it does not replace vectorized
instructions by UndefValue anymore. Instead vectorized instructions are
marked for deletion inside if BoUpSLP class and deleted upon class
destruction.
Reviewers: mzolotukhin, mkuper, hfinkel, RKSimon, davide, spatel
Subscribers: RKSimon, Gerolf, anemet, hans, majnemer, llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D29641
llvm-svn: 372626
|
|
|
|
| |
llvm-svn: 372625
|
|
|
|
|
|
| |
Enable flag introduced in rL294998. Security concerns are no longer valid, since function signatures for mentioned libc functions has no nonnull attribute (Clang does not generate them? I see no nonnull attr in LLVM IR for these functions) and since rL372091 we carefully annotate the callsites where we know that size is static, non zero. So let's enable this flag again..
llvm-svn: 372573
|
|
|
|
|
|
| |
Add assertions to make it clear that GenerateIVChain / NarrowSearchSpaceByPickingWinnerRegs should succeed in finding non-null values
llvm-svn: 372518
|
|
|
|
|
|
| |
dereference warning. NFCI.
llvm-svn: 372517
|
|
|
|
|
|
|
|
|
| |
This has the potential to uncover missed analysis/folds as shown in the
min/max code comment/test, but fewer restrictions on icmp folds should
be better in general to solve cases like:
https://bugs.llvm.org/show_bug.cgi?id=43310
llvm-svn: 372510
|
|
|
|
| |
llvm-svn: 372502
|
|
|
|
|
|
|
|
|
|
| |
While Promoting alloca instruction of Vector Type,
Check total size in bits of its slices too.
If they don't match, don't promote the alloca instruction.
Bug : https://bugs.llvm.org/show_bug.cgi?id=42585
llvm-svn: 372480
|
|
|
|
|
|
| |
Testing commit acces. NFC.
llvm-svn: 372479
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch introduces `norecurse` function attribute deduction.
`norecurse` will be deduced if the following conditions hold:
* The size of SCC in which the function belongs equals to 1.
* The function doesn't have self-recursion.
* We have `norecurse` for all call site.
To avoid a large change, SCC is calculated using scc_iterator in InfoCache initialization for now.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67751
llvm-svn: 372475
|
|
|
|
|
|
| |
The static analyzer is warning about potential null dereference, but we can use cast<ConstantInt> directly and if not assert will fire for us.
llvm-svn: 372429
|
|
|
|
|
|
|
|
|
|
|
| |
objc_release calls
This fixes a bug where the presence of debug instructions would cause
ARC optimizer to change the order of retain and release calls.
rdar://problem/55319419
llvm-svn: 372352
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
FlattenCFG may erase unnecessary blocks, which also invalidates iterators to those erased blocks.
Before this patch, `iterativelyFlattenCFG` could try to increment a BB iterator after that BB has been removed and crash.
This patch makes FlattenCFGPass use `WeakVH` to skip over erased blocks.
Reviewers: dblaikie, tstellar, davide, sanjoy, asbirlea, grosser
Reviewed By: asbirlea
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67672
llvm-svn: 372347
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is again motivated by D67122 sanitizer check enhancement.
That patch seemingly worsens `-fsanitize=pointer-overflow`
overhead from 25% to 50%, which strongly implies missing folds.
In this particular case, given
```
char* test(char& base, unsigned long offset) {
return &base - offset;
}
```
it will end up producing something like
https://godbolt.org/z/luGEju
which after optimizations reduces down to roughly
```
declare void @use64(i64)
define i1 @test(i8* dereferenceable(1) %base, i64 %offset) {
%base_int = ptrtoint i8* %base to i64
%adjusted = sub i64 %base_int, %offset
call void @use64(i64 %adjusted)
%not_null = icmp ne i64 %adjusted, 0
%no_underflow = icmp ule i64 %adjusted, %base_int
%no_underflow_and_not_null = and i1 %not_null, %no_underflow
ret i1 %no_underflow_and_not_null
}
```
Without D67122 there was no `%not_null`,
and in this particular case we can "get rid of it", by merging two checks:
Here we are checking: `Base u>= Offset && (Base u- Offset) != 0`, but that is simply `Base u> Offset`
Alive proofs:
https://rise4fun.com/Alive/QOs
The `@llvm.usub.with.overflow` pattern itself is not handled here
because this is the main pattern, that we currently consider canonical.
https://bugs.llvm.org/show_bug.cgi?id=43251
Reviewers: spatel, nikic, xbolva00, majnemer
Reviewed By: xbolva00, majnemer
Subscribers: vsk, majnemer, xbolva00, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67356
llvm-svn: 372341
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the example from:
https://bugs.llvm.org/show_bug.cgi?id=38502
...we hit infinite looping/crashing because we have non-standard IR -
an instruction operand is used before defined.
This and other unusual constructs are allowed in unreachable blocks,
so avoid the problem by using DominatorTree to step around landmines.
Differential Revision: https://reviews.llvm.org/D67766
llvm-svn: 372339
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add an ability to specify the max full unroll count for LoopUnrollPass pass
in pass options.
Reviewers: fhahn, fedor.sergeev
Reviewed By: fedor.sergeev
Subscribers: hiraditya, zzheng, dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D67701
llvm-svn: 372305
|
|
|
|
|
|
|
|
| |
MSAN bot complains that there is use-of-uninitialized-value
of this FreeStores later in IsWorthwhile().
Perhaps FreeStores needs to be stored in a vector?
llvm-svn: 372262
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
I don't have a direct motivational case for this,
but it would be good to have this for completeness/symmetry.
This pattern is basically the motivational pattern from
https://bugs.llvm.org/show_bug.cgi?id=43251
but with different predicate that requires that the offset is non-zero.
The completeness bit comes from the fact that a similar pattern (offset != zero)
will be needed for https://bugs.llvm.org/show_bug.cgi?id=43259,
so it'd seem to be good to not overlook very similar patterns..
Proofs: https://rise4fun.com/Alive/21b
Also, there is something odd with `isKnownNonZero()`, if the non-zero
knowledge was specified as an assumption, it didn't pick it up (PR43267)
With this, i see no other missing folds for
https://bugs.llvm.org/show_bug.cgi?id=43251
Reviewers: spatel, nikic, xbolva00
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67412
llvm-svn: 372257
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instruction count
Summary:
As it can be see in the changed test, while `div` is really costly,
we were speculating it. This does not seem correct.
Also, the old code would run for every single insturuction in BB,
instead of eagerly bailing out as soon as there are too many instructions.
This function still has a problem that `PHINodeFoldingThreshold` is
per-basic-block, while it should be for all the basic blocks.
Reviewers: efriedma, craig.topper, dmgreen, jmolloy
Reviewed By: jmolloy
Subscribers: xbolva00, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67315
llvm-svn: 372255
|
|
|
|
|
|
| |
upcoming patch
llvm-svn: 372245
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
is enabled.
We can save memory and reduce binary size significantly by enabling
ProfileSampleAccurate. However when ProfileSampleAccurate is true,
function without sample will be regarded as cold and this could
potentially cause performance regression.
To minimize the potential negative performance impact, we want to be
a little conservative here saying if a function shows up in the profile,
no matter as outline instance, inline instance or call targets, treat
the function as not being cold. This will handle the cases such as most
callsites of a function are inlined in sampled binary (thus outline copy
don't get any sample) but not inlined in current build (because of source
code drift, imprecise debug information, or the callsites are all cold
individually but not cold accumulatively...), so that the outline function
showing up as cold in sampled binary will actually not be cold after current
build. After the change, such function will be treated as not cold even
profile-sample-accurate is enabled.
At the same time we lower the hot criteria of callsiteIsHot check when
profile-sample-accurate is enabled. callsiteIsHot is used to determined
whether a callsite is hot and qualified for early inlining. When
profile-sample-accurate is enabled, functions without profile will be
regarded as cold and much less inlining will happen in CGSCC inlining pass,
so we can worry less about size increase and be aggressive to allow more
early inlining to happen for warm callsites and it is helpful for performance
overall.
Differential Revision: https://reviews.llvm.org/D67561
llvm-svn: 372232
|
|
|
|
|
|
|
|
| |
...and improve some variable names while here.
https://bugs.llvm.org/show_bug.cgi?id=43347
llvm-svn: 372227
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The PGO counter reading will add cold and inlinehint (hot) attributes
to functions that are very cold or hot. This was using hardcoded
thresholds, instead of the profile summary cutoffs which are used in
other hot/cold detection and are more dynamic and adaptable. Switch
to using the summary-based cold/hot detection.
The hardcoded limits were causing some code that had a medium level of
hotness (per the summary) to be incorrectly marked with a cold
attribute, blocking inlining.
Reviewers: davidxl
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67673
llvm-svn: 372189
|