| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
| |
This reverts commit c4baf7c2f06ff5459c4f5998ce980346e72bff97.
Broke the bots, and should really be in Transforms/Coroutines
instead.
llvm-svn: 343337
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This patch adds bindings to C and Go for addCoroutinePassesToExtensionPoints, which is used to add coroutine passes to the correct locations in PassManagerBuilder.
Reviewers: whitequark, deadalnix
Reviewed By: whitequark
Subscribers: mehdi_amini, modocache, llvm-commits
Differential Revision: https://reviews.llvm.org/D51642
llvm-svn: 343336
|
|
|
|
| |
llvm-svn: 343310
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch turns LoopInterchange into a loop pass. It now only
considers top-level loops and tries to move the innermost loop to the
optimal position within the loop nest. By only looking at top-level
loops, we might miss a few opportunities the function pass would get
(e.g. if we have a loop nest of 3 loops, in the function pass
we might process loops at level 1 and 2 and move the inner most loop to
level 1, and then we process loops at levels 0, 1, 2 and interchange
again, because we now have a different inner loop). But I think it would
be better to handle such cases by picking the best inner loop from the
start and avoid re-visiting the same loops again.
The biggest advantage of it being a function pass is that it interacts
nicely with the other loop passes. Without this patch, there are some
performance regressions on AArch64 with loop interchanging enabled,
where no loops were interchanged, but we missed out on some other loop
optimizations.
It also removes the SimplifyCFG run. We are just changing branches, so
the CFG should not be more complicated, besides the additional 'unique'
preheaders this pass might create.
Reviewers: chandlerc, efriedma, mcrosier, javed.absar, xbolva00
Reviewed By: xbolva00
Differential Revision: https://reviews.llvm.org/D51702
llvm-svn: 343308
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add a dominance check to ensure that the possible devirtualizable
call is actually dominated by the type test/checked load intrinsic being
analyzed. With PGO, after indirect call promotion is performed during
the compile step, followed by inlining, we may have a type test in the
promoted and inlined sequence that allows an indirect call in that
sequence to be devirtualized. That indirect call (inserted by inlining
after promotion) will share the same vtable pointer as the fallback
indirect call that cannot be devirtualized.
Before this patch the code was incorrectly devirtualizing the fallback
indirect call.
See the new test and the example described there for more details.
Reviewers: pcc, vitalybuka
Subscribers: mehdi_amini, Prazek, eraman, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D52514
llvm-svn: 343226
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: The convenience wrapper in STLExtras is available since rL342102.
Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb
Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits
Differential Revision: https://reviews.llvm.org/D52573
llvm-svn: 343163
|
|
|
|
|
|
|
|
|
|
|
|
| |
DeadArgElim pass marks unused function arguments as ‘undef’ without updating
existing dbg.values referring to it. As a consequence the debug info
metadata in the final executable was wrong.
Patch by Djordje Todorovic.
Differential Revision: https://reviews.llvm.org/D51968
llvm-svn: 342871
|
|
|
|
|
|
| |
Differential revision: https://reviews.llvm.org/D52175
llvm-svn: 342836
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
AvailableExternal was not handled in isDiscardableIfUnused when isDiscardableIfUnused
was added in r158476. Till it was handled in r247044. This is a NFC.
Reviewers: pcc, tejohnson
Reviewed By: tejohnson
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D52319
llvm-svn: 342684
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently only the first function in the module is checked to
see if it has remarks enabled. If that first function is a declaration,
remarks will be incorrectly skipped. Change to look for the first
non-empty function.
Reviewers: pcc
Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D51556
llvm-svn: 342477
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This handles terminator instructions and more.
Summary:
I tested this patch by compiling sqlite3.ll (clang -O3 -mllvm -disable-llvm-optzns sqlite3.c.)
opt -called-value-propagation sqlite3.ll -time-passes -f -o out.ll
I get 10+% speedup for the pass. I expect some of the gain come from skipping terminator instructions.
=== BEFORE THE PATCH ===
===-------------------------------------------------------------------------===
... Pass execution timing report ...
===-------------------------------------------------------------------------===
Total Execution Time: 0.5562 seconds (0.5582 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
0.2485 ( 46.4%) 0.0120 ( 57.7%) 0.2605 ( 46.8%) 0.2615 ( 46.8%) Bitcode Writer
0.1607 ( 30.0%) 0.0079 ( 37.7%) 0.1685 ( 30.3%) 0.1693 ( 30.3%) Called Value Propagation
0.1262 ( 23.6%) 0.0009 ( 4.5%) 0.1271 ( 22.9%) 0.1275 ( 22.8%) Module Verifier
0.5353 (100.0%) 0.0209 (100.0%) 0.5562 (100.0%) 0.5582 (100.0%) Total
=== AFTER THE PATCH ===
===-------------------------------------------------------------------------===
... Pass execution timing report ...
===-------------------------------------------------------------------------===
Total Execution Time: 0.5338 seconds (0.5355 wall clock)
---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---
0.2498 ( 48.6%) 0.0118 ( 59.3%) 0.2615 ( 49.0%) 0.2629 ( 49.1%) Bitcode Writer
0.1377 ( 26.8%) 0.0075 ( 37.8%) 0.1452 ( 27.2%) 0.1455 ( 27.2%) Called Value Propagation
0.1264 ( 24.6%) 0.0006 ( 3.0%) 0.1270 ( 23.8%) 0.1271 ( 23.7%) Module Verifier
0.5139 (100.0%) 0.0199 (100.0%) 0.5338 (100.0%) 0.5355 (100.0%) Total
Reviewers: davide, mssimpso
Reviewed By: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D49108
llvm-svn: 342398
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rebase rL341954 since https://bugs.llvm.org/show_bug.cgi?id=38912
has been fixed by rL342055.
Precommit testing performed:
* Overnight runs of csmith comparing the output between programs
compiled with gvn-hoist enabled/disabled.
* Bootstrap builds of clang with UbSan/ASan configurations.
llvm-svn: 342387
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch saves a function offset table which maps function name index to the
offset of its function profile to the start of the binary profile. By using
the function offset table, for those function profiles which will not be used
when compiling a module, the profile reader does't have to read them. For
profile size around 10~20M, it saves ~10% compile time.
Differential Revision: https://reviews.llvm.org/D51863
llvm-svn: 342283
|
|
|
|
|
|
|
|
| |
The test used to fail with an invalid phi node: the two predecessors were outlined
and the SSA representation was left invalid. The patch adds the exit block to the
cold region.
llvm-svn: 342277
|
|
|
|
|
|
|
|
|
|
|
|
| |
remove duplicate entries from isSingleEntrySingleExit: the Entry block is
already added by the loop over the dominance frontier.
Remove the heuristic from isOutlineCandidate that a region is too small when it
only contains a basic block. With this change we now grow regions starting from
a block and we continue adding to the ValidColdRegion. Check the heuristic just
before code generation.
llvm-svn: 342276
|
|
|
|
|
|
|
|
| |
Also fix a problem in forward propagation:
const TerminatorInst *TI = It->getTerminator();
was set outside the while loop that iterates over It.
llvm-svn: 342275
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts rL341954.
The builder `sanitizer-x86_64-linux-bootstrap-ubsan` has been
failing with timeouts at stage2 clang/ubsan:
[3065/3073] Linking CXX executable bin/lld
command timed out: 1200 seconds without output running python
../sanitizer_buildbot/sanitizers/buildbot_selector.py,
attempting to kill
llvm-svn: 342001
|
|
|
|
|
|
|
| |
Rebase rL340922 since https://bugs.llvm.org/show_bug.cgi?id=38807
has been fixed by rL341947.
llvm-svn: 341954
|
|
|
|
|
|
|
|
|
|
| |
The presence of readnone and an access range attribute (argmemonly,
inaccessiblememonly, inaccessiblemem_or_argmemonly) is considered an
error by the verifier. This seems strict but also not wrong. This
patch makes sure function attribute detection will remove all access
range attributes for readnone functions.
llvm-svn: 341927
|
|
|
|
|
|
|
|
|
| |
Before tagging a function with coldcc make sure the target supports cold calling
convention. Without this patch HotColdSplitting pass fails on aarch64 with:
fatal error: error in backend: Unsupported calling convention.
llvm-svn: 341838
|
|
|
|
|
|
|
|
|
|
|
|
| |
Find cold blocks based on profile information (or optionally with static analysis).
Forward propagate profile information to all cold-blocks.
Outline a cold region.
Set calling conv and prof hint for the callsite of the outlined function.
Worked in collaboration with: Sebastian Pop <s.pop@samsung.com>
Differential Revision: https://reviews.llvm.org/D50658
llvm-svn: 341669
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch tries to make sample profile loader independent of profile format
change. It moves compact format related code into FunctionSamples and
SampleProfileReader classes, and sample profile loader only has to interact
with those two classes and will be unaware of profile format changes.
The cleanup also contain some fixes to further remove the difference between
compactbinary format and binary format. After the cleanup using different
formats originated from the same profile will generate the same binaries,
which we verified by compiling two large server benchmarks w/wo thinlto.
Differential Revision: https://reviews.llvm.org/D51643
llvm-svn: 341591
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Control height reduction merges conditional blocks of code and reduces the
number of conditional branches in the hot path based on profiles.
if (hot_cond1) { // Likely true.
do_stg_hot1();
}
if (hot_cond2) { // Likely true.
do_stg_hot2();
}
->
if (hot_cond1 && hot_cond2) { // Hot path.
do_stg_hot1();
do_stg_hot2();
} else { // Cold path.
if (hot_cond1) {
do_stg_hot1();
}
if (hot_cond2) {
do_stg_hot2();
}
}
This speeds up some internal benchmarks up to ~30%.
Reviewers: davidxl
Reviewed By: davidxl
Subscribers: xbolva00, dmgreen, mehdi_amini, llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D50591
llvm-svn: 341386
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Load Hardening.
Wires up the existing pass to work with a proper IR attribute rather
than just a hidden/internal flag. The internal flag continues to work
for now, but I'll likely remove it soon.
Most of the churn here is adding the IR attribute. I talked about this
Kristof Beyls and he seemed at least initially OK with this direction.
The idea of using a full attribute here is that we *do* expect at least
some forms of this for other architectures. There isn't anything
*inherently* x86-specific about this technique, just that we only have
an implementation for x86 at the moment.
While we could potentially expose this as a Clang-level attribute as
well, that seems like a good question to defer for the moment as it
isn't 100% clear whether that or some other programmer interface (or
both?) would be best. We'll defer the programmer interface side of this
for now, but at least get to the point where the feature can be enabled
without relying on implementation details.
This also allows us to do something that was really hard before: we can
enable *just* the indirect call retpolines when using SLH. For x86, we
don't have any other way to mitigate indirect calls. Other architectures
may take a different approach of course, and none of this is surfaced to
user-level flags.
Differential Revision: https://reviews.llvm.org/D51157
llvm-svn: 341363
|
|
|
|
|
|
|
|
|
|
| |
Another sanitizer buildbot failed this time at bootstrap when
compiling SemaTemplateInstantiate.cpp with this assertion:
`dominates(MD, U) && "Memory Def does not dominate it's uses"'.
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/15047
llvm-svn: 340925
|
|
|
|
|
|
|
|
|
| |
Rebase rL338240 since the excessive memory usage observed when using
GVNHoist with UBSan has been fixed by rL340818.
Differential Revision: https://reviews.llvm.org/D49858
llvm-svn: 340922
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Sometimes reading an output *.ll file it is not easy to understand why some callsites are not inlined. We can read output of inline remarks (option --pass-remarks-missed=inline) and try correlating its messages with the callsites.
An easier way proposed by this patch is to add to every callsite processed by Inliner an attribute with the latest message that describes the cause of not inlining this callsite. The attribute is called //inline-remark//. By default this feature is off. It can be switched on by the option //-inline-remark-attribute//.
For example in the provided test the result method //@test1// has two callsites //@bar// and inline remarks report different inlining missed reasons:
remark: <unknown>:0:0: bar not inlined into test1 because too costly to inline (cost=-5, threshold=-6)
remark: <unknown>:0:0: bar not inlined into test1 because it should never be inlined (cost=never): recursive
It is not clear which remark correspond to which callsite. With the inline remark attribute enabled we get the reasons attached to their callsites:
define void @test1() {
call void @bar(i1 true) #0
call void @bar(i1 false) #2
ret void
}
attributes #0 = { "inline-remark"="(cost=-5, threshold=-6)" }
..
attributes #2 = { "inline-remark"="(cost=never): recursive" }
Patch by: yrouban (Yevgeny Rouban)
Reviewers: xbolva00, tejohnson, apilipenko
Reviewed By: xbolva00, tejohnson
Subscribers: eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D50435
llvm-svn: 340834
|
|
|
|
|
|
|
| |
We only use this set for `insert` and `count`, so a hashing container
seems better here.
llvm-svn: 340783
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a bit awkward in a handful of places where we didn't even have
an instruction and now we have to see if we can build one. But on the
whole, this seems like a win and at worst a reasonable cost for removing
`TerminatorInst`.
All of this is part of the removal of `TerminatorInst` from the
`Instruction` type hierarchy.
llvm-svn: 340701
|
|
|
|
| |
llvm-svn: 340619
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Sometimes reading an output *.ll file it is not easy to understand why some callsites are not inlined. We can read output of inline remarks (option --pass-remarks-missed=inline) and try correlating its messages with the callsites.
An easier way proposed by this patch is to add to every callsite processed by Inliner an attribute with the latest message that describes the cause of not inlining this callsite. The attribute is called //inline-remark//. By default this feature is off. It can be switched on by the option //-inline-remark-attribute//.
For example in the provided test the result method //@test1// has two callsites //@bar// and inline remarks report different inlining missed reasons:
remark: <unknown>:0:0: bar not inlined into test1 because too costly to inline (cost=-5, threshold=-6)
remark: <unknown>:0:0: bar not inlined into test1 because it should never be inlined (cost=never): recursive
It is not clear which remark correspond to which callsite. With the inline remark attribute enabled we get the reasons attached to their callsites:
define void @test1() {
call void @bar(i1 true) #0
call void @bar(i1 false) #2
ret void
}
attributes #0 = { "inline-remark"="(cost=-5, threshold=-6)" }
..
attributes #2 = { "inline-remark"="(cost=never): recursive" }
Patch by: yrouban (Yevgeny Rouban)
Reviewers: xbolva00, tejohnson, apilipenko
Reviewed By: xbolva00, tejohnson
Subscribers: eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D50435
llvm-svn: 340618
|
|
|
|
|
|
|
|
|
|
|
|
| |
These changes expand the FunctionAttr logic in order to mark functions as
WriteOnly when appropriate. This is done through an additional bool variable
and extended logic.
Reviewers: hfinkel, jdoerfert
Differential Revision: https://reviews.llvm.org/D48387
llvm-svn: 340537
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instructions.
This version of the patch fixes cleaning up ssa_copy intrinsics, so it does not
crash for instructions in blocks that have been marked unreachable.
This patch updates IPSCCP to use PredicateInfo to propagate
facts to true branches predicated by EQ and to false branches
predicated by NE.
As a follow up, we should be able to extend it to also propagate additional
facts about nonnull.
Reviewers: davide, mssimpso, dberlin, efriedma
Reviewed By: davide, dberlin
Differential Revision: https://reviews.llvm.org/D45330
llvm-svn: 340525
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds the option for the printing of summary information about functions
considered but rejected for importing during the thin link.
Reviewers: davidxl
Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D50881
llvm-svn: 340047
|
|
|
|
|
|
| |
Increment existing NumInlined and NumDeleted stats in InlinerPass::run.
llvm-svn: 339682
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When WPD is performed in a ThinLTO backend, the function may be created
if it isn't already in that module. Module::getOrInsertFunction may
add a bitcast, in which case the returned Constant is not a Function and
doesn't have a name. Invoke stripPointerCasts() on the returned value
where we access its name.
Reviewers: pcc
Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D49959
llvm-svn: 339640
|
|
|
|
|
|
|
| |
My previous change moved some code upwards which caused an assert in debug mode
because the global value didn't necessarily have an initializer. Don't do that.
llvm-svn: 339485
|
|
|
|
|
|
| |
Sanitizers seem unhappy.
llvm-svn: 339480
|
|
|
|
|
|
| |
This makes my upcoming patch much easier to read.
llvm-svn: 339478
|
|
|
|
|
|
| |
This makes the code easier to read and will make an upcoming patch I have easier to review because that patch needed this refactoring to reuse some of the functions.
llvm-svn: 339391
|
|
|
|
|
|
| |
It was always false, which is obviously wrong.
llvm-svn: 339390
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The inalloca parameter has to be the only parameter passed in memory.
Changing the convention to fastcc can break that.
At some point we should teach global opt how to optimize ABI attributes
like inalloca and maybe byval. These attributes are mainly used to match
C ABIs. They are harder for LLVM to optimize and they don't always
generate the best code.
Fixes PR38487
llvm-svn: 339360
|
|
|
|
|
|
|
|
|
|
| |
Summary: DenseMap's operator[] performs an insertion if the entry isn't found. The second phase of ConstantMerge isn't trying to insert anything: it's just looking to see if the first phased performed an insertion. Use find instead, avoiding insertion of every single global initializer in the map of constants. This has the side-effect of making all entries in CMap non-null (because only global declarations would have null initializers, and that would be a bug).
Subscribers: dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D50476
llvm-svn: 339309
|
|
|
|
| |
llvm-svn: 339021
|
|
|
|
| |
llvm-svn: 339020
|
|
|
|
| |
llvm-svn: 339019
|
|
|
|
| |
llvm-svn: 338986
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch improves Inliner to provide causes/reasons for negative inline decisions.
1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message.
2. Several functions that used to return the inlining results as boolean are changed to return InlineResult which carries the cause for negative decision.
3. Changed remark priniting and debug output messages to provide the additional messages and related inline cost.
4. Adjusted tests for changed printing.
Patch by: yrouban (Yevgeny Rouban)
Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00
Reviewed By: tejohnson, xbolva00
Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith
Differential Revision: https://reviews.llvm.org/D49412
llvm-svn: 338969
|
|
|
|
|
|
|
| |
- It's possible for 'Changed' to return as false even if we did
partial inline something. Fixed to accumulate return values
llvm-svn: 338896
|
|
|
|
|
|
|
|
| |
This patch just extract code into a separate function to remove some
duplication between the old and new pass manager pipeline. Due to the
different CGSCC iterators used, not all code duplication was eliminated.
llvm-svn: 338585
|