path: root/llvm/lib/Transforms/IPO
Commit message | Author | Age | Files | Lines
* Merging r344325: | Tom Stellard | 2018-10-26 | 1 | -2/+1
    ------------------------------------------------------------------------
    r344325 | evgeny777 | 2018-10-12 00:24:02 -0700 (Fri, 12 Oct 2018) | 4 lines

    [ThinLTO] Don't import GV which contains blockaddress

    Differential revision: https://reviews.llvm.org/D53139
    ------------------------------------------------------------------------
    llvm-svn: 345401
* Revert "Enrich inline messages", tests fail | David Bolvansky | 2018-08-01 | 2 | -68/+56
    llvm-svn: 338496
* Enrich inline messages | David Bolvansky | 2018-08-01 | 2 | -56/+68
    Summary:
    This patch improves Inliner to provide causes/reasons for negative inline decisions.
    1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message.
    2. Several functions that used to return the inlining results as boolean are changed to return InlineResult, which carries the cause for a negative decision.
    3. Changed remark printing and debug output messages to provide the additional messages and related inline cost.
    4. Adjusted tests for changed printing.

    Patch by: yrouban (Yevgeny Rouban)

    Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00
    Reviewed By: tejohnson, xbolva00
    Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith
    Differential Revision: https://reviews.llvm.org/D49412
    llvm-svn: 338494
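The idea in item 2 above, a result that converts to a boolean but also carries the reason for a negative decision, can be sketched as follows. This is only an illustration: `InlineResultSketch`, `canInline`, and `describe` are hypothetical names, and the reason strings are invented for the example rather than taken from the patch.

```cpp
#include <string>

// Hypothetical stand-in for the InlineResult idea: usable like a bool,
// but a negative result also carries a human-readable reason.
class InlineResultSketch {
  const char *Reason; // nullptr means the decision was positive

  explicit InlineResultSketch(const char *R) : Reason(R) {}

public:
  static InlineResultSketch success() { return InlineResultSketch(nullptr); }
  static InlineResultSketch failure(const char *Why) {
    return InlineResultSketch(Why);
  }

  explicit operator bool() const { return Reason == nullptr; }
  const char *reason() const { return Reason; }
};

// Hypothetical legality check; the two conditions are illustrative only.
InlineResultSketch canInline(bool CalleeIsRecursive, bool CalleeIsNoInline) {
  if (CalleeIsNoInline)
    return InlineResultSketch::failure("noinline function attribute");
  if (CalleeIsRecursive)
    return InlineResultSketch::failure("recursive call");
  return InlineResultSketch::success();
}

// Builds the kind of message a remark or debug line could print.
std::string describe(const InlineResultSketch &R) {
  return R ? std::string("inlined")
           : std::string("not inlined: ") + R.reason();
}
```

Callers that only need a yes/no answer can keep testing the result in an `if`, while remark printing can append `reason()` to its output.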
* Revert Enrich inline messages | David Bolvansky | 2018-07-31 | 2 | -68/+56
    llvm-svn: 338389
* Enrich inline messages | David Bolvansky | 2018-07-31 | 2 | -56/+68
    Summary:
    This patch improves Inliner to provide causes/reasons for negative inline decisions.
    1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message.
    2. Several functions that used to return the inlining results as boolean are changed to return InlineResult, which carries the cause for a negative decision.
    3. Changed remark printing and debug output messages to provide the additional messages and related inline cost.
    4. Adjusted tests for changed printing.

    Patch by: yrouban (Yevgeny Rouban)

    Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00
    Reviewed By: tejohnson, xbolva00
    Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith
    Differential Revision: https://reviews.llvm.org/D49412
    llvm-svn: 338387
* Revert "[GVNHoist] Re-enable GVNHoist by default" | Vlad Tsyrklevich | 2018-07-30 | 1 | -2/+2
    This reverts commit r338240 because it was causing OOMs on the UBSan buildbot when building clang/lib/Sema/SemaChecking.cpp
    llvm-svn: 338297
* Remove trailing space | Fangrui Song | 2018-07-30 | 5 | -21/+21
    sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}
    llvm-svn: 338293
* [GVNHoist] Re-enable GVNHoist by default | Alexandros Lamprineas | 2018-07-30 | 1 | -2/+2
    My initial motivation for this came from https://reviews.llvm.org/D48122, where it was pointed out that my change didn't fit well in SimplifyCFG and therefore using GVNHoist was a better way to go.

    GVNHoist has been disabled for a while as there was a list of bugs related to it.

    I have fixed the following bugs:
    https://bugs.llvm.org/show_bug.cgi?id=37808 -> https://reviews.llvm.org/D48372 (rL337149)
    https://bugs.llvm.org/show_bug.cgi?id=36787 -> https://reviews.llvm.org/D49555 (rL337674)
    https://bugs.llvm.org/show_bug.cgi?id=37445 -> https://reviews.llvm.org/D49425 (rL337680)

    The next two bugs no longer occur, and it's unclear which commit fixed them:
    https://bugs.llvm.org/show_bug.cgi?id=36635
    https://bugs.llvm.org/show_bug.cgi?id=37791

    I investigated this one and it proved to be unrelated to GVNHoist, but a genuine bug in NewGVN:
    https://bugs.llvm.org/show_bug.cgi?id=37660

    To convince myself GVNHoist is in a good state I made a successful bootstrap build of LLVM.
    Merging this change now in order to make it to the LLVM 7.0.0 branch.

    Differential Revision: https://reviews.llvm.org/D49858
    llvm-svn: 338240
* [GlobalOpt] Test array indices inside structs for out-of-bounds accesses | David Green | 2018-07-28 | 1 | -71/+47
    We now, from clang, can turn arrays of
      static short g_data[] = {16, 16, 16, 16, 16, 16, 16, 16, 0, 0, 0, 0, 0, 0, 0, 0};
    into structs of the form
      @g_data = internal global <{ [8 x i16], [8 x i16] }> ...

    GlobalOpt will incorrectly SROA it, not realising that the access to the first element may overflow into the second. This fixes it by checking geps more thoroughly.

    I believe this makes the globalsra-partial.ll test case invalid as the %i value could be out of bounds. I've re-purposed it as a negative test for this case.

    Differential Revision: https://reviews.llvm.org/D49816
    llvm-svn: 338192
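The hazard described above can be modelled with a tiny bounds check. This is a simplified sketch and not the GlobalOpt code: `StructOfArrays` and `isInBoundsAccess` are hypothetical names, and the model assumes a struct whose fields are all arrays, addressed by a constant (field, element) pair.

```cpp
#include <cstddef>
#include <vector>

// Simplified model of a struct whose fields are all arrays, e.g.
// <{ [8 x i16], [8 x i16] }> is represented as {8, 8}.
using StructOfArrays = std::vector<std::size_t>;

// Returns true only if the element index stays inside the array selected
// by Field. An index of 8 into the first [8 x i16] would spill into the
// second field, so splitting the global per field would be wrong.
bool isInBoundsAccess(const StructOfArrays &Ty, std::size_t Field,
                      std::size_t Elem) {
  return Field < Ty.size() && Elem < Ty[Field];
}
```

Under this model, a splitting transform is only safe when every access to the global passes the check; a single failing access means the fields cannot be treated as independent.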
* Revert r337904: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. | Florian Hahn | 2018-07-25 | 1 | -22/+2
    I suspect it is causing the clang-stage2-Rthinlto failures.
    llvm-svn: 337956
* Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. | Florian Hahn | 2018-07-25 | 1 | -2/+22
    r337828 resolves a PredicateInfo issue with unnamed types.

    Original message:
    This patch updates IPSCCP to use PredicateInfo to propagate facts to true branches predicated by EQ and to false branches predicated by NE.

    As a follow up, we should be able to extend it to also propagate additional facts about nonnull.

    Reviewers: davide, mssimpso, dberlin, efriedma
    Reviewed By: davide, dberlin
    llvm-svn: 337904
* [ThinLTO] Ensure the TargetLibraryInfo is constructed early enough | Teresa Johnson | 2018-07-23 | 1 | -0/+2
    Summary:
    Without this change, the WholeProgramDevirt pass, which requires the TargetLibraryInfo, will construct one from the default triple.

    Fixes PR38139.

    Reviewers: pcc
    Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits
    Differential Revision: https://reviews.llvm.org/D49278
    llvm-svn: 337750
* Change the cap on the amount of padding for each vtable to 32-byte (previously it was 128-byte) | Peter Collingbourne | 2018-07-20 | 1 | -4/+6
    We tested different cap values with a recent commit of Chromium. Our results show that the 32-byte cap yields the smallest binary and all the caps yield similar performance. Based on the results, we propose to change the cap value to 32-byte.

    Patch by Zhaomo Yang!

    Differential Revision: https://reviews.llvm.org/D49405
    llvm-svn: 337622
* [ThinLTO] Enable ThinLTO WholeProgramDevirt and LowerTypeTests in new PM | Teresa Johnson | 2018-07-19 | 2 | -4/+3
    Summary:
    Enable these passes for CFI and WPD in ThinLTO and LTO with the new pass manager. Add a couple of tests for both PMs based on the clang tests tools/clang/test/CodeGen/thinlto-distributed-cfi*.ll, but just test through llvm-lto2 and not with distributed ThinLTO.

    Reviewers: pcc
    Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits
    Differential Revision: https://reviews.llvm.org/D49429
    llvm-svn: 337461
* Restore "[ThinLTO] Ensure we always select the same function copy to import" | Teresa Johnson | 2018-07-16 | 1 | -69/+88
    This reverts commit r337081, therefore restoring r337050 (and fix in r337059), with test fix for bot failure described after the original description below.

    In order to always import the same copy of a linkonce function, even when encountering it with different thresholds (a higher one then a lower one), keep track of the summary we decided to import. This ensures that the backend only gets a single definition to import for each GUID, so that it doesn't need to choose one.

    Move the largest threshold the GUID was considered for import into the current module out of the ImportMap (which is part of a larger map maintained across the whole index), and into a new map just maintained for the current module we are computing imports for. This saves some memory since we no longer have the thresholds maintained across the whole index (and throughout the in-process backends when doing a normal non-distributed ThinLTO build), at the cost of some additional information being maintained for each invocation of ComputeImportForModule (the selected summary pointer for each import).

    There is an additional map lookup for each callee being considered for importing, however, this was able to subsume a map lookup in the Worklist iteration that invokes computeImportForFunction. We also are able to avoid calling selectCallee if we already failed to import at the same or higher threshold.

    I compared the run time and peak memory for the SPEC2006 471.omnetpp benchmark (running in-process ThinLTO backends), as well as for a large internal benchmark with a distributed ThinLTO build (so just looking at the thin link time/memory). Across a number of runs with and without this change there was no significant change in the time and memory.

    (I tried a few other variations of the change but they also didn't improve time or peak memory).

    The new commit removes a test that no longer makes sense (Transforms/FunctionImport/hotness_based_import2.ll), as exposed by the reverse-iteration bot. The test depends on the order of processing the summary call edges, and actually depended on the old problematic behavior of selecting more than one summary for a given GUID when encountered with different thresholds. There was no guarantee even before that we would eventually pick the linkonce copy with the hottest call edges, it just happened to work with the test and the old code, and there was no guarantee that we would end up importing the selected version of the copy that had the hottest call edges (since the backend would effectively import only one of the selected copies).

    Reviewers: davidxl
    Subscribers: mehdi_amini, inglorion, llvm-commits
    Differential Revision: https://reviews.llvm.org/D48670
    llvm-svn: 337184
* Revert "[ThinLTO] Ensure we always select the same function copy to import" | Teresa Johnson | 2018-07-14 | 1 | -88/+69
    This reverts commits r337050 and r337059. Caused failure in reverse-iteration bot that needs more investigation.
    llvm-svn: 337081
* [ThinLTO] Ensure we always select the same function copy to import | Teresa Johnson | 2018-07-13 | 1 | -69/+88
    In order to always import the same copy of a linkonce function, even when encountering it with different thresholds (a higher one then a lower one), keep track of the summary we decided to import. This ensures that the backend only gets a single definition to import for each GUID, so that it doesn't need to choose one.

    Move the largest threshold the GUID was considered for import into the current module out of the ImportMap (which is part of a larger map maintained across the whole index), and into a new map just maintained for the current module we are computing imports for. This saves some memory since we no longer have the thresholds maintained across the whole index (and throughout the in-process backends when doing a normal non-distributed ThinLTO build), at the cost of some additional information being maintained for each invocation of ComputeImportForModule (the selected summary pointer for each import).

    There is an additional map lookup for each callee being considered for importing, however, this was able to subsume a map lookup in the Worklist iteration that invokes computeImportForFunction. We also are able to avoid calling selectCallee if we already failed to import at the same or higher threshold.

    I compared the run time and peak memory for the SPEC2006 471.omnetpp benchmark (running in-process ThinLTO backends), as well as for a large internal benchmark with a distributed ThinLTO build (so just looking at the thin link time/memory). Across a number of runs with and without this change there was no significant change in the time and memory.

    (I tried a few other variations of the change but they also didn't improve time or peak memory).

    Reviewers: davidxl
    Subscribers: mehdi_amini, inglorion, llvm-commits
    Differential Revision: https://reviews.llvm.org/D48670
    llvm-svn: 337050
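The bookkeeping described above, a per-module map that remembers the largest threshold tried and the summary that was selected, might look roughly like this. It is a sketch under assumptions: `SummarySketch`, `ImportState`, and `selectForImport` are hypothetical names, and "fits the threshold" is reduced to a plain instruction-count comparison.

```cpp
#include <cstdint>
#include <unordered_map>

using GUID = std::uint64_t;

// Hypothetical, heavily reduced function summary.
struct SummarySketch {
  int ModuleId;
  unsigned InstCount;
};

// Per-GUID state kept only for the module whose imports are being computed.
struct ImportState {
  unsigned LargestThreshold = 0;           // largest threshold tried so far
  const SummarySketch *Selected = nullptr; // null: nothing selected yet
};

// Returns the copy to import for G, or nullptr if none was selected.
const SummarySketch *
selectForImport(std::unordered_map<GUID, ImportState> &PerModule, GUID G,
                unsigned Threshold, const SummarySketch &Candidate) {
  ImportState &S = PerModule[G];
  // A copy was already chosen: always reuse it, whatever the new threshold.
  if (S.Selected)
    return S.Selected;
  // Already failed at the same or a higher threshold: skip the selection.
  if (Threshold <= S.LargestThreshold)
    return nullptr;
  S.LargestThreshold = Threshold;
  if (Candidate.InstCount <= Threshold)
    S.Selected = &Candidate;
  return S.Selected;
}
```

Because the first successful selection is sticky, a later call edge seen with a lower threshold still resolves to the same copy, which is the property the commit message is after.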
* [LowerTypeTests] Limit when icall jumptable entries are emitted | Vlad Tsyrklevich | 2018-07-13 | 1 | -6/+43
    Summary:
    Currently LowerTypeTests emits jumptable entries for all live external and address-taken functions; however, we could limit the number of functions that we emit entries for significantly.

    For Cross-DSO CFI, we continue to emit jumptable entries for all exported definitions. In the non-Cross-DSO CFI case, we only need to emit jumptable entries for live functions that are address-taken in live functions. This ignores exported functions and functions that are only address taken in dead functions. This change uses ThinLTO summary data (now emitted for all modules during ThinLTO builds) to determine address-taken and liveness info.

    The logic for emitting jumptable entries is more conservative in the regular LTO case because we don't have summary data in the case of monolithic LTO builds; however, once summaries are emitted for all LTO builds we can unify the Thin/monolithic LTO logic to only use summaries to determine the liveness of address taking functions.

    This change is a partial fix for PR37474. It reduces the build size for nacl_helper by ~2-3%. The reduction is due to nacl_helper compiling in lots of unused code, and to unused functions that are address taken in dead functions no longer being considered live due to emitted jumptable references. The reduction for chromium is ~0.1-0.2%.

    Reviewers: pcc, eugenis, javed.absar
    Reviewed By: pcc
    Subscribers: aheejin, dexonsmith, dschuff, mehdi_amini, eraman, steven_wu, llvm-commits, kcc
    Differential Revision: https://reviews.llvm.org/D47652
    llvm-svn: 337038
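One possible reading of the rule above, written as a predicate. The struct and function names are hypothetical, the three flags stand in for information that the commit says comes from ThinLTO summaries, and treating the Cross-DSO case as "exported definitions in addition to the live, address-taken set" is an interpretation of the message rather than something it states outright.

```cpp
// Hypothetical per-function facts, assumed to come from summary data.
struct FunctionInfoSketch {
  bool IsExportedDefinition;
  bool IsLive;
  bool AddressTakenInLiveFunction;
};

// Decides whether a jumptable entry is needed under the rule sketched above.
bool needsJumpTableEntry(const FunctionInfoSketch &F, bool CrossDSOCfi) {
  // Cross-DSO CFI: exported definitions always keep their entry.
  if (CrossDSOCfi && F.IsExportedDefinition)
    return true;
  // Otherwise: only live functions whose address is taken in a live function.
  return F.IsLive && F.AddressTakenInLiveFunction;
}
```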
* [ThinLTO] Use std::map to get deterministic imports files | Teresa Johnson | 2018-07-10 | 1 | -5/+9
    Summary:
    I noticed that the .imports files emitted for distributed ThinLTO backends do not have consistent ordering. This is because StringMap iteration order is not guaranteed to be deterministic. Since we already have a std::map with this information, used when emitting the individual index files (ModuleToSummariesForIndex), use it for the imports files as well.

    This issue is likely causing some unnecessary rebuilds of the ThinLTO backends in our distributed build system as the imports files are inputs to those backends.

    Reviewers: pcc, steven_wu, mehdi_amini
    Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits
    Differential Revision: https://reviews.llvm.org/D48783
    llvm-svn: 336721
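The ordering argument above can be shown with the standard library alone. In this sketch `emitImportsFile` is a hypothetical helper and the map's value type is a placeholder; the point is only that `std::map` iterates in key order, so the emitted text does not depend on insertion order.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Placeholder for "module path -> summaries"; only the keys matter here.
using ModuleMapSketch = std::map<std::string, std::vector<std::string>>;

// Writes one module path per line. std::map visits keys in sorted order,
// so two maps with the same keys always produce identical output.
std::string emitImportsFile(const ModuleMapSketch &ModuleToSummaries) {
  std::ostringstream OS;
  for (const auto &Entry : ModuleToSummaries)
    OS << Entry.first << "\n";
  return OS.str();
}
```

A hash-based container gives no such guarantee, which is why files generated from it can differ between otherwise identical builds.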
* llvm: Add support for "-fno-delete-null-pointer-checks" | Manoj Gupta | 2018-07-09 | 1 | -3/+16
    Summary:
    Support for this option is needed for building the Linux kernel. This is a very frequently requested feature by kernel developers.

    More details: https://lkml.org/lkml/2018/4/4/601

    GCC option description for -fdelete-null-pointer-checks:
    Assume that programs cannot safely dereference null pointers, and that no code or data element resides at address zero.

    -fno-delete-null-pointer-checks is the inverse of this, implying that null pointer dereferencing is not undefined.

    This feature is implemented in LLVM IR in this CL as the function attribute "null-pointer-is-valid"="true" in IR (Under review at D47894). The CL updates several passes that assumed null pointer dereferencing is undefined to not optimize when the "null-pointer-is-valid"="true" attribute is present.

    Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv
    Reviewed By: efriedma, george.burgess.iv
    Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits
    Differential Revision: https://reviews.llvm.org/D47895
    llvm-svn: 336613
* [CVP] Handle calls with void return value. No need to create CVPLattice state for it. | Xin Tong | 2018-07-09 | 1 | -0/+10
    Summary:
    Tests: 10
    Metric: compile_time

    Program                            unpatch-result  patch-result  diff
    Bullet/bullet                      32.39           30.54         -5.7%
    SPASS/SPASS                        18.14           17.25         -4.9%
    mafft/pairlocalalign               12.10           11.64         -3.8%
    ClamAV/clamscan                    19.21           19.63          2.2%
    7zip/7zip-benchmark                49.55           48.85         -1.4%
    kimwitu++/kc                       15.68           15.87          1.2%
    lencod/lencod                      21.13           21.34          1.0%
    consumer-typeset/consumer-typeset  13.65           13.62         -0.2%
    tramp3d-v4/tramp3d-v4              29.88           29.92          0.1%
    sqlite3/sqlite3                    18.48           18.46         -0.1%

           unpatch-result  patch-result  diff
    count  10.000000       10.000000     10.000000
    mean   23.022000       22.712400     -0.011671
    std    11.362831       11.094183      0.027338
    min    12.104000       11.640000     -0.057298
    25%    16.299000       16.214000     -0.032282
    50%    18.844000       19.048000     -0.001350
    75%    27.689000       27.774000      0.007752
    max    49.552000       48.852000      0.021861

    I also tested only this pass by concatenating all the code from the llvm/lib/Analysis/ folder and running clang -g followed by opt. I get close to 20% speedup for the pass. I expect a majority of the gain comes from skipping the dbg intrinsics.

    Before patch (opt -time-passes -called-value-propagation):
    ============
    ===-------------------------------------------------------------------------===
                          ... Pass execution timing report ...
    ===-------------------------------------------------------------------------===
      Total Execution Time: 3.8303 seconds (3.8279 wall clock)

       ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
       2.0768 ( 57.3%)   0.0990 ( 48.0%)   2.1757 ( 56.8%)   2.1757 ( 56.8%)  Bitcode Writer
       0.8444 ( 23.3%)   0.0600 ( 29.1%)   0.9044 ( 23.6%)   0.9044 ( 23.6%)  Called Value Propagation
       0.7031 ( 19.4%)   0.0472 ( 22.9%)   0.7502 ( 19.6%)   0.7478 ( 19.5%)  Module Verifier
       3.6242 (100.0%)   0.2062 (100.0%)   3.8303 (100.0%)   3.8279 (100.0%)  Total

    After patch (opt -time-passes -called-value-propagation):
    ============
    ===-------------------------------------------------------------------------===
                          ... Pass execution timing report ...
    ===-------------------------------------------------------------------------===
      Total Execution Time: 3.6605 seconds (3.6579 wall clock)

       ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
       2.0716 ( 59.7%)   0.0990 ( 52.5%)   2.1705 ( 59.3%)   2.1706 ( 59.3%)  Bitcode Writer
       0.7144 ( 20.6%)   0.0300 ( 15.9%)   0.7444 ( 20.3%)   0.7444 ( 20.4%)  Called Value Propagation
       0.6859 ( 19.8%)   0.0596 ( 31.6%)   0.7455 ( 20.4%)   0.7429 ( 20.3%)  Module Verifier
       3.4719 (100.0%)   0.1886 (100.0%)   3.6605 (100.0%)   3.6579 (100.0%)  Total

    Reviewers: davide, mssimpso
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D49078
    llvm-svn: 336551
* [UnrollAndJam] New Unroll and Jam pass | David Green | 2018-07-01 | 1 | -0/+11
    This is a simple implementation of the unroll-and-jam classical loop optimisation.

    The basic idea is that we take an outer loop of the form:

      for i..
        ForeBlocks(i)
        for j..
          SubLoopBlocks(i, j)
        AftBlocks(i)

    Instead of doing normal inner or outer unrolling, we unroll as follows:

      for i... i+=2
        ForeBlocks(i)
        ForeBlocks(i+1)
        for j..
          SubLoopBlocks(i, j)
          SubLoopBlocks(i+1, j)
        AftBlocks(i)
        AftBlocks(i+1)
      Remainder Loop

    So we have unrolled the outer loop, then jammed the two inner loops into one. This can lead to a simpler inner loop if memory accesses can be shared between the now jammed loops.

    To do this we have to prove that this is all safe, both for the memory accesses (using dependence analysis) and that ForeBlocks(i+1) can move before AftBlocks(i) and SubLoopBlocks(i, j).

    Differential Revision: https://reviews.llvm.org/D41953
    llvm-svn: 336062
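To make the Fore/SubLoop/Aft shape concrete, here is the transformation applied by hand to a matrix-vector product. This is a source-level illustration only, not what the pass emits; `matVec` and `matVecUnrollAndJam` are hypothetical names, and the unroll factor of 2 matches the pseudocode above.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Original loop nest: Out[i] = sum over j of A[i][j] * X[j].
std::vector<int> matVec(const Matrix &A, const std::vector<int> &X) {
  std::vector<int> Out(A.size());
  for (std::size_t i = 0; i < A.size(); ++i) {
    int Sum = 0;                              // ForeBlocks(i)
    for (std::size_t j = 0; j < X.size(); ++j)
      Sum += A[i][j] * X[j];                  // SubLoopBlocks(i, j)
    Out[i] = Sum;                             // AftBlocks(i)
  }
  return Out;
}

// Outer loop unrolled by 2, with the two inner loops jammed into one.
std::vector<int> matVecUnrollAndJam(const Matrix &A,
                                    const std::vector<int> &X) {
  std::vector<int> Out(A.size());
  std::size_t i = 0;
  for (; i + 1 < A.size(); i += 2) {
    int Sum0 = 0;                             // ForeBlocks(i)
    int Sum1 = 0;                             // ForeBlocks(i+1)
    for (std::size_t j = 0; j < X.size(); ++j) {
      int Xj = X[j];                          // load shared by both rows
      Sum0 += A[i][j] * Xj;                   // SubLoopBlocks(i, j)
      Sum1 += A[i + 1][j] * Xj;               // SubLoopBlocks(i+1, j)
    }
    Out[i] = Sum0;                            // AftBlocks(i)
    Out[i + 1] = Sum1;                        // AftBlocks(i+1)
  }
  for (; i < A.size(); ++i) {                 // remainder loop
    int Sum = 0;
    for (std::size_t j = 0; j < X.size(); ++j)
      Sum += A[i][j] * X[j];
    Out[i] = Sum;
  }
  return Out;
}
```

The shared load of `X[j]` is the kind of memory-access reuse the commit message mentions; the reordering is safe here because the two rows write to different outputs.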
* [instsimplify] Move the instsimplify pass to use more obvious file names | Chandler Carruth | 2018-06-29 | 1 | -1/+2
    and directory.

    Also cleans up all the associated naming to be consistent and removes the public access to the pass ID which was unused in LLVM.

    Also runs clang-format over parts that changed, which generally cleans up a bunch of formatting.

    This is in preparation for doing some internal cleanups to the pass.

    Differential Revision: https://reviews.llvm.org/D47352
    llvm-svn: 336028
* [ThinLTO] Port InlinerFunctionImportStats handling to new PM | Teresa Johnson | 2018-06-28 | 1 | -0/+18
    Summary:
    The InlinerFunctionImportStats will collect and dump stats regarding how many functions inlined into the module were imported by ThinLTO.

    Reviewers: wmi, dexonsmith
    Subscribers: mehdi_amini, inglorion, llvm-commits, eraman
    Differential Revision: https://reviews.llvm.org/D48729
    llvm-svn: 335914
* [ThinLTO] Print names in function import debug messages when available | Teresa Johnson | 2018-06-27 | 1 | -8/+15
    Summary:
    Rather than just print the GUID, when it is available in the index, print the global name as well in the function import thin link debug messages. Names will be available when the combined index is being built by the same process, e.g. a linker or "llvm-lto2 run".

    Reviewers: davidxl
    Subscribers: mehdi_amini, inglorion, eraman, steven_wu, llvm-commits
    Differential Revision: https://reviews.llvm.org/D48612
    llvm-svn: 335760
* [SampleFDO] Add an option to turn on/off warning about samples unused. | Wei Mi | 2018-06-25 | 1 | -0/+8
    If a function has samples to use but cannot use them because it has no debug information, a warning is currently issued to report the missed opportunity. This warning assumes the binary generating the profile and the binary using the profile are similar enough. That is not always the case. Sometimes even if the binaries are not quite similar, we may still get some benefit by using sampleFDO. In those cases, we may still want to apply sampleFDO but not want to see a lot of such warnings pop up.

    The patch adds an option for the warning.

    Differential Revision: https://reviews.llvm.org/D48510
    llvm-svn: 335484
* Re-land "[LTO] Enable module summary emission by default for regular LTO" | Tobias Edler von Koch | 2018-06-22 | 1 | -1/+5
    Since we are now producing a summary also for regular LTO builds, we need to run the NameAnonGlobals pass in those cases as well (the summary cannot handle anonymous globals).

    See https://reviews.llvm.org/D34156 for details on the original change.

    This reverts commit 6c9ee4a4a438a8059aacc809b2dd57128fccd6b3.
    llvm-svn: 335385
* Revert r335306 (and r335314) - the Call Graph Profile pass. | Chandler Carruth | 2018-06-22 | 1 | -2/+0
    This is the first pass in the main pipeline to use the legacy PM's ability to run function analyses "on demand". Unfortunately, it turns out there are bugs in that somewhat-hacky approach. At the very least, it leaks memory and doesn't support -debug-pass=Structure. Unclear if there are larger issues or not, but this should get the sanitizer bots back to green by fixing the memory leaks.
    llvm-svn: 335320
* [Instrumentation] Add Call Graph Profile pass | Michael J. Spencer | 2018-06-21 | 1 | -0/+2
    This patch adds support for generating a call graph profile from Branch Frequency Info.

    The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions. For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight.

    After scanning all the functions, it generates an appending module flag containing the data. The format looks like:

      !llvm.module.flags = !{!0}

      !0 = !{i32 5, !"CG Profile", !1}
      !1 = !{!2, !3, !4} ; List of edges
      !2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32
      !3 = !{void (i1)* @freq, void ()* @a, i64 11}
      !4 = !{void (i1)* @freq, void ()* @b, i64 20}

    Differential Revision: https://reviews.llvm.org/D48105
    llvm-svn: 335306
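The scan described above can be sketched on a toy module representation. Everything here is hypothetical (`BlockSketch`, `FunctionSketch`, `buildCallGraphProfile`), block counts are supplied directly instead of coming from Branch Frequency Info, and summing the weights of repeated caller/callee pairs is an assumption of this sketch.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy basic block: its profile count and the functions it calls.
struct BlockSketch {
  std::uint64_t Count;
  std::vector<std::string> Callees;
};

// Toy function: a name and its basic blocks.
struct FunctionSketch {
  std::string Name;
  std::vector<BlockSketch> Blocks;
};

using EdgeSketch = std::pair<std::string, std::string>; // (caller, callee)

// For every call in every block, add the block's count to the weight of
// the caller -> callee edge.
std::map<EdgeSketch, std::uint64_t>
buildCallGraphProfile(const std::vector<FunctionSketch> &Module) {
  std::map<EdgeSketch, std::uint64_t> Weights;
  for (const FunctionSketch &F : Module)
    for (const BlockSketch &BB : F.Blocks)
      for (const std::string &Callee : BB.Callees)
        Weights[{F.Name, Callee}] += BB.Count;
  return Weights;
}
```

With a module shaped like the metadata example, the resulting edges would be a to b with weight 32, freq to a with weight 11, and freq to b with weight 20.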
* Revert r335206 "Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions." | Francis Visoiu Mistrih | 2018-06-21 | 1 | -22/+2
    This reverts commit r335206.

    As discussed here: https://reviews.llvm.org/rL333740, a fix will come tomorrow. In the meanwhile, revert this to fix some bots.
    llvm-svn: 335272
* Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. | Florian Hahn | 2018-06-21 | 1 | -2/+22
    r335150 should resolve the issues with the clang-with-thin-lto-ubuntu and clang-with-lto-ubuntu builders.

    Original message:
    This patch updates IPSCCP to use PredicateInfo to propagate facts to true branches predicated by EQ and to false branches predicated by NE.

    As a follow up, we should be able to extend it to also propagate additional facts about nonnull.

    Reviewers: davide, mssimpso, dberlin, efriedma
    Reviewed By: davide, dberlin
    llvm-svn: 335206
* Use SmallPtrSet explicitly for SmallSets with pointer types (NFC). | Florian Hahn | 2018-06-12 | 3 | -12/+10
    Currently SmallSet<PointerTy> inherits from SmallPtrSet<PointerTy>. This patch replaces such types with SmallPtrSet, because IMO it is slightly clearer and allows us to get rid of unnecessarily including SmallSet.h

    Reviewers: dblaikie, craig.topper
    Reviewed By: dblaikie
    Differential Revision: https://reviews.llvm.org/D47836
    llvm-svn: 334492
* [SampleFDO] Add a new compact binary format for sample profile. | Wei Mi | 2018-06-11 | 1 | -3/+10
    The name table occupies a big chunk of the size of a sample profile in the current binary format. In order to reduce its size, the patch changes the sample writer/reader to save/restore the MD5Hash of names in the name table. The sample annotation phase will also use the MD5Hash of a name to query samples accordingly. Experiments show that the compact binary format can generally reduce the size of a sample profile by 2/3 compared with the binary format.

    Differential Revision: https://reviews.llvm.org/D47955
    llvm-svn: 334447
* [ThinLTO] Rename index IsAnalysis flag to HaveGVs (NFC) | Teresa Johnson | 2018-06-06 | 2 | -2/+2
    With the upcoming patch to add summary parsing support, IsAnalysis would be true in contexts where we are not performing module summary analysis. Rename to the more specific and appropriate HaveGVs, which is essentially what this flag is indicating.
    llvm-svn: 334140
* Move Analysis/Utils/Local.h back to Transforms | David Blaikie | 2018-06-04 | 4 | -4/+4
    Review feedback from r328165. Split out just the one function from the file that's used by Analysis. (As chandlerc pointed out, the original change only moved the header and not the implementation anyway - which was fine for the one function that was used (since it's a template/inlined in the header) but not in general)
    llvm-svn: 333954
* In thin and full LTO + CFI, direct function calls may go through jump table | Dmitry Mikulin | 2018-06-04 | 1 | -16/+97
    entries to reach the target. Since these calls don't require type checks, we can short-circuit them to their real targets, except in cases when they can be pre-empted.

    Differential Revision: https://reviews.llvm.org/D46326
    llvm-svn: 333937
* [ThinLTOBitcodeWriter] Emit summaries for regular LTO modules | Vlad Tsyrklevich | 2018-06-01 | 1 | -4/+10
    Summary:
    Emit summaries for bitcode modules that are only destined for the regular LTO portion of the build so they can participate in summary-based dead stripping.

    This change reduces the size of a nacl_helper build with cfi-icall enabled by 7%, removing the majority of the overhead due to enabling cfi-icall. The cfi-icall size increase was caused by compiling in lots of unused code and cfi-icall generating jumptable references to unused symbols that could no longer be removed by -Wl,-gc-sections. Increasing the visibility of summary-based dead stripping prevented jumptable entries being created for unused symbols from the regular LTO portion of the build.

    Reviewers: pcc
    Reviewed By: pcc
    Subscribers: dschuff, mehdi_amini, inglorion, eraman, llvm-commits, kcc
    Differential Revision: https://reviews.llvm.org/D47594
    llvm-svn: 333768
* Revert r333740: [IPSCCP] Use PredicateInfo to propagate facts from cmp. | Florian Hahn | 2018-06-01 | 1 | -22/+2
    This is breaking the clang-with-thin-lto-ubuntu bot.
    llvm-svn: 333745
* Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. | Florian Hahn | 2018-06-01 | 1 | -2/+22
    This patch updates IPSCCP to use PredicateInfo to propagate facts to true branches predicated by EQ and to false branches predicated by NE.

    As a follow up, we should be able to extend it to also propagate additional facts about nonnull.

    Reviewers: davide, mssimpso, dberlin, efriedma
    Reviewed By: davide, dberlin
    Differential Revision: https://reviews.llvm.org/D45330
    llvm-svn: 333740
* Extend the GlobalObject metadata interface | Benjamin Kramer | 2018-05-31 | 2 | -22/+15
    - Make eraseMetadata return whether it changed something
    - Wire getMetadata for a single MDNode efficiently into the attachment map
    - Add hasMetadata, which is less weird than checking getMetadata == nullptr on a multimap.

    Use it to simplify code.
    llvm-svn: 333649
* [LowerTypeTests] Discard extern_weak linkage for definitions | Vlad Tsyrklevich | 2018-05-30 | 1 | -0/+5
    Summary:
    Fix PR37625. It's possible for an extern_weak declaration to be emitted to the merged module when a definition exists in the ThinLTO portion of the build; discard the linkage on the declaration in that case. (otherwise we copy the linkage to the alias to the jumptable and fail)

    Reviewers: pcc
    Reviewed By: pcc
    Subscribers: mehdi_amini, llvm-commits, kcc
    Differential Revision: https://reviews.llvm.org/D47494
    llvm-svn: 333604
* [CalledValuePropagation] Just use a sorted vector instead of a set. | Benjamin Kramer | 2018-05-30 | 1 | -9/+11
    The set properties are never used, so a vector is enough. No functionality change intended.

    While there, add some std::moves to SparseSolver.
    llvm-svn: 333582
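As a general illustration of the sorted-vector technique named in the title (not the code from this commit), a sorted, duplicate-free `std::vector` can stand in for a set when elements are mostly built once and then merged or compared. `makeSortedUnique` and `mergeSorted` are hypothetical helper names.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Turns an arbitrary vector into a sorted vector without duplicates.
std::vector<int> makeSortedUnique(std::vector<int> V) {
  std::sort(V.begin(), V.end());
  V.erase(std::unique(V.begin(), V.end()), V.end());
  return V;
}

// Merges two sorted, duplicate-free vectors; the result keeps both properties.
std::vector<int> mergeSorted(const std::vector<int> &A,
                             const std::vector<int> &B) {
  std::vector<int> Out;
  Out.reserve(A.size() + B.size());
  std::set_union(A.begin(), A.end(), B.begin(), B.end(),
                 std::back_inserter(Out));
  return Out;
}
```

Two such vectors compare equal exactly when they hold the same elements, and the storage is contiguous rather than node-based.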
* [PM/LoopUnswitch] When using the new SimpleLoopUnswitch pass, schedule | Chandler Carruth | 2018-05-30 | 1 | -4/+19
    loop-cleanup passes at the beginning of the loop pass pipeline, and re-enqueue loops after even trivial unswitching.

    This will allow us to much more consistently avoid simplifying code while doing trivial unswitching. I've also added a test case that specifically shows effective iteration using this technique.

    I've unconditionally updated the new PM as that is always using the SimpleLoopUnswitch pass, and I've made the pipeline changes for the old PM conditional on using this new unswitch pass. I added a bunch of comments to the loop pass pipeline in the old PM to make it more clear what is going on when reviewing.

    Hopefully this will unblock doing *partial* unswitching instead of just full unswitching.

    Differential Revision: https://reviews.llvm.org/D47408
    llvm-svn: 333493
* Revert 333358 as it's failing on some builders. | David Green | 2018-05-27 | 1 | -11/+0
    I'm guessing the tests rely on the ARM backend being built.
    llvm-svn: 333359
* [UnrollAndJam] Add a new Unroll and Jam pass | David Green | 2018-05-27 | 1 | -0/+11
    This is a simple implementation of the unroll-and-jam classical loop optimisation.

    The basic idea is that we take an outer loop of the form:

      for i..
        ForeBlocks(i)
        for j..
          SubLoopBlocks(i, j)
        AftBlocks(i)

    Instead of doing normal inner or outer unrolling, we unroll as follows:

      for i... i+=2
        ForeBlocks(i)
        ForeBlocks(i+1)
        for j..
          SubLoopBlocks(i, j)
          SubLoopBlocks(i+1, j)
        AftBlocks(i)
        AftBlocks(i+1)
      Remainder

    So we have unrolled the outer loop, then jammed the two inner loops into one. This can lead to a simpler inner loop if memory accesses can be shared between the now-jammed loops.

    To do this we have to prove that this is all safe, both for the memory accesses (using dependence analysis) and that ForeBlocks(i+1) can move before AftBlocks(i) and SubLoopBlocks(i, j).

    Differential Revision: https://reviews.llvm.org/D41953
    llvm-svn: 333358
* Revert r333268: [IPSCCP] Use PredicateInfo to propagate facts from... | Florian Hahn | 2018-05-25 | 1 | -22/+2
    Reverting this to see if this is causing the failures of the clang-with-thin-lto-ubuntu bot.

    [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions.

    This patch updates IPSCCP to use PredicateInfo to propagate facts to true branches predicated by EQ and to false branches predicated by NE.

    As a follow up, we should be able to extend it to also propagate additional facts about nonnull.

    Reviewers: davide, mssimpso, dberlin, efriedma
    Reviewed By: davide, dberlin
    Differential Revision: https://reviews.llvm.org/D45330
    llvm-svn: 333323
* [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. | Florian Hahn | 2018-05-25 | 1 | -2/+22
    This patch updates IPSCCP to use PredicateInfo to propagate facts to true branches predicated by EQ and to false branches predicated by NE.

    As a follow up, we should be able to extend it to also propagate additional facts about nonnull.

    Reviewers: davide, mssimpso, dberlin, efriedma
    Reviewed By: davide, dberlin
    Differential Revision: https://reviews.llvm.org/D45330
    llvm-svn: 333268
* [Dominators] Add PDT constructor from Function | Jakub Kuderski | 2018-05-23 | 1 | -3/+3
    Summary:
    This patch adds a PDT constructor from Function and lets code that previously used a local class for this use the PostDominatorTree class directly.

    Reviewers: davide, kuhar, grosser, dberlin
    Reviewed By: kuhar
    Author: NutshellySima
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D46709
    llvm-svn: 333102
* Remove DEBUG macro. | Nicola Zaghen | 2018-05-23 | 1 | -2/+2
    Now that the LLVM_DEBUG() macro landed on the various sub-projects the DEBUG macro can be removed. Also change the new uses of DEBUG to LLVM_DEBUG.

    Differential Revision: https://reviews.llvm.org/D46952
    llvm-svn: 333091
* revert r332610, it breaks cfi, see D46326 | Nico Weber | 2018-05-21 | 1 | -82/+11
    llvm-svn: 332838