summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/IPO
Commit message (Collapse)AuthorAgeFilesLines
...
* [IPO][AVR] Create new Functions in the default address space specified in ↵Dylan McKay2018-12-188-17/+32
| | | | | | | | | | | | | | | | | | | | | the data layout This modifies the IPO pass so that it respects any explicit function address space specified in the data layout. In targets with nonzero program address spaces, all functions should, by default, be placed into the default program address space. This is required for Harvard architectures like AVR. Without this, the functions will be marked as residing in data space, and thus not be callable. This has no effect to any in-tree official backends, as none use an explicit program address space in their data layouts. Patch by Tim Neumann. llvm-svn: 349469
* [SampleFDO] handle ProfileSampleAccurate when initializing function entry countWei Mi2018-12-131-4/+18
| | | | | | | | | | | | | | ProfileSampleAccurate is used to indicate the profile has exact match to the code to be optimized. Previously ProfileSampleAccurate is handled in ProfileSummaryInfo::isColdCallSite and ProfileSummaryInfo::isColdBlock. A better solution is to initialize function entry count to 0 when ProfileSampleAccurate is true, so we don't have to handle ProfileSampleAccurate in multiple places. Differential Revision: https://reviews.llvm.org/D55660 llvm-svn: 349088
* [ThinLTO] Compute synthetic function entry countEaswaran Raman2018-12-131-1/+1
| | | | | | | | | | | | | | | | | Summary: This patch computes the synthetic function entry count on the whole program callgraph (based on module summary) and writes the entry counts to the summary. After function importing, this count gets attached to the IR as metadata. Since it adds a new field to the summary, this bumps up the version. Reviewers: tejohnson Subscribers: mehdi_amini, inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D43521 llvm-svn: 349076
* [Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes.Michael Kruse2018-12-121-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, e.g. #pragma clang loop unroll_and_jam(enable) #pragma clang loop distribute(enable) is the same as #pragma clang loop distribute(enable) #pragma clang loop unroll_and_jam(enable) and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also,t the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used. This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance, !0 = !{!0, !1, !2} !1 = !{!"llvm.loop.unroll_and_jam.enable"} !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3} !3 = !{!"llvm.loop.distribute.enable"} defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop. Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account. For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patche introduces a WarnMissedTransformations pass, to warn about orphaned transformations. Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated. To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied. With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling). Reviewed By: hfinkel, dmgreen Differential Revision: https://reviews.llvm.org/D49281 Differential Revision: https://reviews.llvm.org/D55288 llvm-svn: 348944
* [HotColdSplitting] Disable outlining landingpad instructions (PR39917)Vedant Kumar2018-12-111-1/+1
| | | | | | | | | | It's currently not safe to outline landingpad instructions (see llvm.org/PR39917). Like @llvm.eh.typeid.for, the order and content of previous landingpad instructions in a function alters the lowering of subsequent landingpads by renumbering type info ID's. Outlining a landingpad therefore breaks exception handling & unwinding. llvm-svn: 348870
* [DeadArgElim] Fixes for dbg.values using dead arg/return valuesDavid Stenberg2018-12-111-8/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When eliminating a dead argument or return value in a function with local linkage, all uses, including in dbg.value intrinsics, would be replaced with null constants. This would mean that, for example for an integer argument, the debug info would incorrectly express that the value is 0. Instead, replace all uses with undef to indicate that the argument/return value is optimized out. Also, make sure that metadata uses of return values are rewritten even if there are no non-metadata uses of the value. As a bit of historical curiosity, the code that emitted null constants was introduced in the initial check-in of the pass in 2003, before 'undef' values even existed in LLVM. This fixes PR23260. Reviewers: dblaikie, aprantl, vsk, djtodoro Reviewed By: aprantl Subscribers: llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D55513 llvm-svn: 348837
* [HotColdSplitting] Refine definition of unlikelyExecutedVedant Kumar2018-12-071-24/+16
| | | | | | | | | | | | | | | | | | | | | | | The splitting pass uses its 'unlikelyExecuted' predicate to statically decide which blocks are cold. - Do not treat noreturn calls as if they are cold unless they are actually marked cold. This is motivated by functions like exit() and longjmp(), which are not beneficial to outline. - Do not treat inline asm as an outlining barrier. In practice asm("") is frequently used to inhibit basic block merging; enabling outlining in this case results in substantial memory savings. - Treat invokes of cold functions as cold. As a drive-by, remove the 'exceptionHandlingFunctions' predicate, because it's no longer needed. The pass can identify & outline blocks dominated by EH pads, so there's no need to special-case __cxa_begin_catch etc. Differential Revision: https://reviews.llvm.org/D54244 llvm-svn: 348640
* [HotColdSplitting] Outline more than once per functionVedant Kumar2018-12-071-165/+287
| | | | | | | | | | | | | | | | | | Algorithm: Identify maximal cold regions and put them in a worklist. If a candidate region overlaps with another, discard it. While the worklist is full, remove a single-entry sub-region from the worklist and attempt to outline it. By the non-overlap property, this should not invalidate parts of the domtree pertaining to other outlining regions. Testing: LNT results on X86 are clean. With test-suite + externals, llvm outlines 134KB pre-patch, and 352KB post-patch (+ ~2.6x). The file 483.xalancbmk/src/Constants.cpp stands out as an extreme case where llvm outlines over 100 times in some functions (mostly EH paths). There was not a significant performance impact pre vs. post-patch. Differential Revision: https://reviews.llvm.org/D53887 llvm-svn: 348639
* Allow norecurse attribute on functions that have debug infos.Christian Bruel2018-12-051-7/+9
| | | | | | | | | | | | | | Summary: debug intrinsics might be marked norecurse to enable the caller function to be norecurse and optimized if needed. This avoids code gen optimisation differences when -g is used, as in globalOpt.cpp:processInternalGlobal checks. Reviewers: chandlerc, jmolloy, aprantl Reviewed By: aprantl Subscribers: aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D55187 llvm-svn: 348381
* [ThinLTO] Import local variables from the same module as callerTeresa Johnson2018-11-291-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: We can sometimes end up with multiple copies of a local variable that have the same GUID in the index. This happens when there are local variables with the same name that are in different source files having the same name/path at compile time (but compiled into different bitcode objects). In this case make sure we import the copy in the caller's module. This enables importing both of the variables having the same GUID (but which will have different promoted names since the module paths, and therefore the module hashes, will be distinct). Importing the wrong copy is particularly problematic for read only variables, since we must import them as a local copy whenever referenced. Otherwise we get undefs at link time. Note that the llvm-lto.cpp and ThinLTOCodeGenerator changes are needed for testing the distributed index case via clang, which will be sent as a separate clang-side patch shortly. We were previously not doing the dead code/read only computation before computing imports when testing distributed index generation (like it was for testing importing and other ThinLTO mechanisms alone). Reviewers: evgeny777 Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, dang, llvm-commits Differential Revision: https://reviews.llvm.org/D55047 llvm-svn: 347886
* [PartialInliner] Make PHIs free in cost computation.Florian Hahn2018-11-271-10/+9
| | | | | | | | | | | | | | | | InlineCost also treats them as free and the current implementation can cause assertion failures if PHI nodes are moved outside the region from entry BBs to the region. It also updates the code to use the instructionsWithoutDebug iterator. Reviewers: davidxl, davide, vsk, graham-yiu-huawei Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D54748 llvm-svn: 347683
* [MergeFuncs] Generate alias instead of thunk if possibleNikita Popov2018-11-211-14/+73
| | | | | | | | | | | | | | | | | | | The MergeFunctions pass was originally intended to emit aliases instead of thunks where possible (unnamed_addr). However, for a long time this functionality was behind a flag hardcoded to false, bitrotted and was eventually removed in r309313. Originally the functionality was first disabled in r108417 due to lack of support for aliases in Mach-O. I believe that this is no longer the case nowadays, but not really familiar with this area. In the interest of being conservative, this patch reintroduces the aliasing functionality behind a default disabled -mergefunc-use-aliases flag. Differential Revision: https://reviews.llvm.org/D53285 llvm-svn: 347407
* [IR] Add hasNPredecessors, hasNPredecessorsOrMore to BasicBlockVedant Kumar2018-11-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | Add methods to BasicBlock which make it easier to efficiently check whether a block has N (or more) predecessors. This can be more efficient than using pred_size(), which is a linear time operation. We might consider adding similar methods for successors. I haven't done so in this patch because succ_size() is already O(1). With this patch applied, I measured a 0.065% compile-time reduction in user time for running `opt -O3` on the sqlite3 amalgamation (30 trials). The change in mergeStoreIntoSuccessor alone saves 45 million linked list iterations in a stage2 Release build of llc. See llvm.org/PR39702 for a harder but more general way of achieving similar results. Differential Revision: https://reviews.llvm.org/D54686 llvm-svn: 347256
* [ProfileSummary] Standardize methods and fix commentVedant Kumar2018-11-194-6/+6
| | | | | | | | | | | | | | | | | | | | | Every Analysis pass has a get method that returns a reference of the Result of the Analysis, for example, BlockFrequencyInfo &BlockFrequencyInfoWrapperPass::getBFI(). I believe that ProfileSummaryInfo::getPSI() is the only exception to that, as it was returning a pointer. Another change is renaming isHotBB and isColdBB to isHotBlock and isColdBlock, respectively. Most methods use BB as the argument of variable names while methods usually refer to Basic Blocks as Blocks, instead of BB. For example, Function::getEntryBlock, Loop:getExitBlock, etc. I also fixed one of the comments. Patch by Rodrigo Caetano Rocha! Differential Revision: https://reviews.llvm.org/D54669 llvm-svn: 347182
* GlobalDCE: Teach isEmptyFunction() to ignore debug intrinsics.Adrian Prantl2018-11-161-5/+10
| | | | | | | This fixes PR39669. https://bugs.llvm.org/show_bug.cgi?id=39669 llvm-svn: 347065
* [ThinLTO] Internalize readonly globalsEugene Leviant2018-11-161-4/+35
| | | | | | | | An attempt to recommit r346584 after failure on OSX build bot. Fixed cache key computation in ThinLTOCodeGenerator and added test case llvm-svn: 347033
* [LTO] Load sample profile in LTO link step.Xin Tong2018-11-151-0/+6
| | | | | | | | | | | | | | Summary: Load sample profile in LTO link step. ThinLTO calls populateModulePassManager to load the profile Reviewers: tejohnson, davidxl, danielcdh Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D54564 llvm-svn: 346971
* Revert "[ThinLTO] Internalize readonly globals"Steven Wu2018-11-131-40/+5
| | | | | | This reverts commit 10c84a8f35cae4a9fc421648d9608fccda3925f2. llvm-svn: 346768
* [IPSCCP,PM] Preserve PDT in the new pass manager.Florian Hahn2018-11-111-3/+5
| | | | | | | | | | Reviewers: kuhar, chandlerc, NutshellySima, brzycki Reviewed By: NutshellySima, brzycki Differential Revision: https://reviews.llvm.org/D54317 llvm-svn: 346618
* [ThinLTO] Internalize readonly globalsEugene Leviant2018-11-101-5/+40
| | | | | | | | | This patch allows internalising globals if all accesses to them (from live functions) are from non-volatile load instructions Differential revision: https://reviews.llvm.org/D49362 llvm-svn: 346584
* [IPSCCP,PM] Preserve DT in the new pass manager.Florian Hahn2018-11-091-13/+22
| | | | | | | | | | | | | | After D45330, Dominators are required for IPSCCP and can be preserved. This patch preserves DominatorTreeAnalysis in the new pass manager. AFAIK the legacy pass manager cannot preserve function analysis required by a module analysis. Reviewers: davide, dberlin, chandlerc, efriedma, kuhar, NutshellySima Reviewed By: chandlerc, kuhar, NutshellySima Differential Revision: https://reviews.llvm.org/D47259 llvm-svn: 346486
* [LTO] Drop non-prevailing definitions only if linkage is not local or appendingPirama Arumuga Nainar2018-11-081-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This fixes PR 37422 In ELF, non-weak symbols can also be non-prevailing. In this particular PR, the __llvm_profile_* symbols are non-prevailing but weren't getting dropped - causing multiply-defined errors with lld. Also add a test, strong_non_prevailing.ll, to ensure that multiple copies of a strong symbol are dropped. To fix the test regressions exposed by this fix, - do not mark prevailing copies for symbols with 'appending' linkage. There's no one prevailing copy for such symbols. - fix the prevailing version in dead-strip-fulllto.ll - explicitly pass exported symbols to llvm-lto in fumcimport.ll and funcimport_var.ll Reviewers: tejohnson, pcc Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, dang, srhines, llvm-commits Differential Revision: https://reviews.llvm.org/D54125 llvm-svn: 346436
* [MergeFuncs] Improve ordering of equal functionswhitequark2018-11-081-9/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: MergeFunctions currently tries to process strong functions before weak functions, because weak functions can simply call strong functions, while a strong/weak function cannot call a weak function (a backing strong function is needed). This patch additionally tries to process external functions before local functions, because we definitely have to keep the external function, but may be able to drop the local one (and definitely can if it is also unnamed_addr). Unfortunately, this exposes an existing bug in the implementation: The FnTree and FNodesInTree structures can currently go out of sync in the case where two weak functions are merged, because the function in FnTree/FNodesInTree is RAUWed. This leaves it behind in FnTree (this is intended, as it is the strong backing function which should be used for further merges), while it is replaced in FNodesInTree (this is not intended). This is fixed by switching FNodesInTree from using a ValueMap to using a DenseMap of AssertingVH. This exposes another minor issue: Currently FNodesInTree is not cleared after MergeFunctions finishes running. Currently, this is potentially dangerous (e.g. if something else wants to RAUW a function with a non-function), but at the very least it is unnecessary/inefficient. After the change to use AssertingVH it becomes more problematic, because there are certainly passes that remove functions. This issue is fixed by clearing FNodesInTree at the end of the pass. Reviewers: jfb, whitequark Reviewed By: whitequark Subscribers: rkruppe, llvm-commits Differential Revision: https://reviews.llvm.org/D53271 llvm-svn: 346386
* [MergeFuncs] Call removeUsers() prior to unnamed_addr RAUWwhitequark2018-11-081-0/+1
| | | | | | | | | | | | | | | | | Summary: For unnamed_addr functions we RAUW instead of only replacing direct callers. However, functions in which replacements were performed currently are not added back to the worklist, resulting in missed merging opportunities. Fix this by calling removeUsers() prior to RAUW. Reviewers: jfb, whitequark Reviewed By: whitequark Subscribers: rkruppe, llvm-commits Differential Revision: https://reviews.llvm.org/D53262 llvm-svn: 346385
* [ThinLTO] Split NotEligibleToImport into legality and inlinability flagsTeresa Johnson2018-11-061-0/+10
| | | | | | | | | | | | | | | | | | | | Summary: The NotEligibleToImport flag on the GlobalValueSummary was set if it isn't legal to import (e.g. because it references unpromotable locals) and when it can't be inlined (in which case importing is pointless). I split out the inlinable piece into a separate flag on the FunctionSummary (doesn't make sense for aliases or global variables), because in the future we may want to import for reasons other than inlining. Reviewers: davidxl Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D53345 llvm-svn: 346261
* [HotColdSplitting] Use TTI to inform outlining thresholdVedant Kumar2018-11-041-18/+26
| | | | | | | | | | | | | | | Using TargetTransformInfo allows the splitting pass to factor in the code size cost of instructions as it decides whether or not outlining is profitable. This did not regress the overall amount of outlining seen on the handful of internal frameworks I tested. Thanks to Jun Bum Lim for suggesting this! Differential Revision: https://reviews.llvm.org/D53835 llvm-svn: 346108
* ADT/STLExtras: Introduce llvm::empty; NFCMatthias Braun2018-10-311-2/+2
| | | | | | | | This is modeled after C++17 std::empty(). Differential Revision: https://reviews.llvm.org/D53909 llvm-svn: 345679
* [HotColdSplitting] Allow outlining single-block cold regionsVedant Kumar2018-10-291-3/+20
| | | | | | | | | | | | | | | | It can be profitable to outline single-block cold regions because they may be large. Allow outlining single-block regions if they have over some threshold of non-debug, non-terminator instructions. I chose 3 as the threshold after experimenting with several internal frameworks. In practice, reducing the threshold further did not give much improvement, whereas increasing it resulted in substantial regressions. Differential Revision: https://reviews.llvm.org/D53824 llvm-svn: 345524
* [HotColdSplitting] Identify larger cold regions using domtree queriesVedant Kumar2018-10-241-185/+168
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current splitting algorithm works in three stages: 1) Identify cold blocks, then 2) Use forward/backward propagation to mark hot blocks, then 3) Grow a SESE region of blocks *outside* of the set of hot blocks and start outlining. While testing this pass on Apple internal frameworks I noticed that some kinds of control flow (e.g. loops) are never outlined, even though they unconditionally lead to / follow cold blocks. I noticed two other issues related to how cold regions are identified: - An inconsistency can arise in the internal state of the hotness propagation stage, as a block may end up in both the ColdBlocks set and the HotBlocks set. Further inconsistencies can arise as these sets do not match what's in ProfileSummaryInfo. - It isn't necessary to limit outlining to single-exit regions. This patch teaches the splitting algorithm to identify maximal cold regions and outline them. A maximal cold region is defined as the set of blocks post-dominated by a cold sink block, or dominated by that sink block. This approach can successfully outline loops in the cold path. As a side benefit, it maintains less internal state than the current approach. Due to a limitation in CodeExtractor, blocks within the maximal cold region which aren't dominated by a single entry point (a so-called "max ancestor") are filtered out. Results: - X86 (LNT + -Os + externals): 134KB of TEXT were outlined compared to 47KB pre-patch, or a ~3x improvement. Did not see a performance impact across two runs. - AArch64 (LNT + -Os + externals + Apple-internal benchmarks): 149KB of TEXT were outlined. Ditto re: performance impact. - Outlining results improve marginally in the internal frameworks I tested. Follow-ups: - Outline more than once per function, outline large single basic blocks, & try to remove unconditional branches in outlined functions. Differential Revision: https://reviews.llvm.org/D53627 llvm-svn: 345209
* [hot-cold-split] Name split functions with ".cold" suffixTeresa Johnson2018-10-241-7/+9
| | | | | | | | | | | | | | | | | | | | | | | Summary: The current default of appending "_"+entry block label to the new extracted cold function breaks demangling. Change the deliminator from "_" to "." to enable demangling. Because the header block label will be empty for release compile code, use "extracted" after the "." when the label is empty. Additionally, add a mechanism for the client to pass in an alternate suffix applied after the ".", and have the hot cold split pass use "cold."+Count, where the Count is currently 1 but can be used to uniquely number multiple cold functions split out from the same function with D53588. Reviewers: sebpop, hiraditya Subscribers: llvm-commits, erik.pilkington Differential Revision: https://reviews.llvm.org/D53534 llvm-svn: 345178
* [PM] keeping history when original SCC split and then merge into itselfWei Mi2018-10-231-2/+11
| | | | | | | | | | | | | | | | | | | | in the same round of SCC update. In https://reviews.llvm.org/rL309784, inline history is added to prevent infinite inlining across multiple run of inliner and SCC update, but the history will only be kept when new SCC is actually generated during SCC update. We found a case that SCC can be split and then merge into itself in the same round of SCC update, so the same SCC will be pop out from UR.CWorklist and then added back immediately, without any new SCC generated, that is why the existing patch cannot catch the infinite inline case. What the patch does is even if no new SCC is generated, if only the current SCC appears in UR.CWorklist again, then keep the inline history. Differential Revision: https://reviews.llvm.org/D52915 llvm-svn: 345103
* [HotColdSplitting] Attach MinSize to outlined codeVedant Kumar2018-10-231-0/+7
| | | | | | | | | | | | | | | | | | Outlined code is cold by assumption, so it makes sense to optimize it for minimal code size rather than performance. After r344869 moved the splitting pass to the end of the IR pipeline, this does not result in much of a code size reduction. This is probably because a comparatively small number backend transforms make use of the MinSize hint. Running LNT on x86_64, I see that 33/1020 binaries shrink for a total of 919 bytes of TEXT reduction. I didn't measure a significant performance impact. Differential Revision: https://reviews.llvm.org/D53518 llvm-svn: 345072
* [DebugInfo][GlobalOpt] Fix -debugify for globalopt shrinking globals to ↵Jordan Rupprecht2018-10-231-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | booleans. Summary: TryToShrinkGlobalToBoolean, when possible, will split store <value> + load <value> into store <bool> + select <bool ? value : 0>. This preserves DebugLoc during that pass. Fixes PR37959. The test case here is the simplified .ll for: ``` static int foo; int bar() { foo = 5; return foo; } ``` Reviewers: dblaikie, gbedwell, aprantl Reviewed By: dblaikie Subscribers: mehdi_amini, JDevlieghere, dexonsmith, llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D53531 llvm-svn: 345046
* [hot-cold-split] Add opt remark on successTeresa Johnson2018-10-221-0/+8
| | | | | | | | | | | | Summary: Emit optimization remark on successful hot cold split. Reviewers: sebpop, hiraditya Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53512 llvm-svn: 344938
* Schedule Hot Cold Splitting pass after most optimization passesAditya Kumar2018-10-211-3/+3
| | | | | | | | | | | | | | | | Summary: In the new+old pass manager, hot cold splitting was schedule too early. Thanks to Vedant for pointing this out. Reviewers: sebpop, vsk Reviewed By: sebpop, vsk Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D53437 llvm-svn: 344869
* [TI removal] Switch MergeFunctions to directly use Instruction API.Chandler Carruth2018-10-181-1/+1
| | | | llvm-svn: 344714
* [ThinLTO] Add importing stats to thin linkTeresa Johnson2018-10-161-5/+27
| | | | | | | | | | | | | | | Summary: Previously we could only get the number of imported functions and variables from the backend. This adds stats to the thin link where the importing is decided. Reviewers: wmi Subscribers: inglorion, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D53337 llvm-svn: 344658
* [InstCombine] Cleanup libfunc attribute inferringDavid Bolvansky2018-10-161-1/+1
| | | | | | | | | | | | Reviewers: efriedma Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53338 llvm-svn: 344645
* Change a TerminatorInst* to an Instruction* in HotColdSplitting.cpp.Lang Hames2018-10-151-1/+1
| | | | | | | | | | | r344558 added an assignment to a TerminatorInst* from BasicBlock::getTerminatorInst(), but BasicBlock::getTerminatorInst() returns an Instruction* rather than a TerminatorInst* since r344504 so this fails to compile. Changing the variable to an Instruction* should get the bots building again. llvm-svn: 344566
* [hot-cold-split] fix static analysis of cold regionsSebastian Pop2018-10-151-7/+41
| | | | | | | | | | | | | | | | | | | | | | Make the code of blockEndsInUnreachable to match the function blockEndsInUnreachable in CodeGen/BranchFolding.cpp. I also have added a note to make sure the code of this function will not be modified unless the back-end version is also modified. An early return before outlining has been added to avoid outlining the full function body when the first block in the function is marked cold. The static analysis of cold code has been amended to avoid marking the whole function as cold by back-propagation because the back-propagation would mark blocks with return statements as cold. The patch adds debug statements to help discover these problems. Differential Revision: https://reviews.llvm.org/D52904 llvm-svn: 344558
* [TI removal] Make variables declared as `TerminatorInst` and initializedChandler Carruth2018-10-155-6/+6
| | | | | | | | | | | | | by `getTerminator()` calls instead be declared as `Instruction`. This is the biggest remaining chunk of the usage of `getTerminator()` that insists on the narrow type and so is an easy batch of updates. Several files saw more extensive updates where this would cascade to requiring API updates within the file to use `Instruction` instead of `TerminatorInst`. All of these were trivial in nature (pervasively using `Instruction` instead just worked). llvm-svn: 344502
* [InstCombine] Fixed crash with aliased functionsDavid Bolvansky2018-10-131-1/+1
| | | | | | | | | | | | | | Summary: Fixes PR39177 Reviewers: spatel, jbuening Reviewed By: jbuening Subscribers: jbuening, llvm-commits Differential Revision: https://reviews.llvm.org/D53129 llvm-svn: 344454
* [ThinLTO] Don't import GV which contains blockaddressEugene Leviant2018-10-121-2/+1
| | | | | | Differential revision: https://reviews.llvm.org/D53139 llvm-svn: 344325
* Add a flag to remap manglings when reading profile data information.Richard Smith2018-10-101-10/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This can be used to preserve profiling information across codebase changes that have widespread impact on mangled names, but across which most profiling data should still be usable. For example, when switching from libstdc++ to libc++, or from the old libstdc++ ABI to the new ABI, or even from a 32-bit to a 64-bit build. The user can provide a remapping file specifying parts of mangled names that should be treated as equivalent (eg, std::__1 should be treated as equivalent to std::__cxx11), and profile data will be treated as applying to a particular function if its name is equivalent to the name of a function in the profile data under the provided equivalences. See the documentation change for a description of how this is configured. Remapping is supported for both sample-based profiling and instruction profiling. We do not support remapping indirect branch target information, but all other profile data should be remapped appropriately. Support is only added for the new pass manager. If someone wants to also add support for this for the old pass manager, doing so should be straightforward. This is the LLVM side of Clang r344199. Reviewers: davidxl, tejohnson, dlj, erik.pilkington Subscribers: mehdi_amini, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D51249 llvm-svn: 344200
* Replace most users of UnknownSize with LocationSize::unknown(); NFCGeorge Burgess IV2018-10-101-1/+1
| | | | | | | | | | | | Moving away from UnknownSize is part of the effort to migrate us to LocationSizes (e.g. the cleanup promised in D44748). This doesn't entirely remove all of the uses of UnknownSize; some uses require tweaks to assume that UnknownSize isn't just some kind of int. This patch is intended to just be a trivial replacement for all places where LocationSize::unknown() will Just Work. llvm-svn: 344186
* [ThinLTO] Keep non-prevailing (linkonce|weak)_odr symbols liveXin Tong2018-10-081-8/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: If we have a symbol with (linkonce|weak)_odr linkage, we do not want to dead strip it even it is not prevailing. IR level (linkonce|weak)_odr symbol can become non-prevailing when we mix ELF objects and IR objects where the (linkonce|weak)_odr symbol in the ELF object is prevailing and the ones in the IR objects are not. Stripping them will prevent us from doing optimizations with them. By not dead stripping them, We will convert these symbols to available_externally linkage as a result of non-prevailing and eventually dropping them after inlining. I modified cache-prevailing.ll to use linkonce linkage as it is testing whether cache prevailing bit is effective or not, not we should treat linkonce_odr alive or not Reviewers: tejohnson, pcc Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D52893 llvm-svn: 343970
* Improve static analysis of cold basic blocksAditya Kumar2018-10-031-1/+14
| | | | | | | | | Differential Revision: https://reviews.llvm.org/D52704 Reviewers: sebpop, tejohnson, brzycki, SirishP Reviewed By: sebpop llvm-svn: 343663
* Add support for new pass managerAditya Kumar2018-10-031-0/+33
| | | | | | | | | | | Modified the testcases to use both pass managers Use single commandline flag for both pass managers. Differential Revision: https://reviews.llvm.org/D52708 Reviewers: sebpop, tejohnson, brzycki, SirishP Reviewed By: tejohnson, brzycki llvm-svn: 343662
* Temporarily revert "[GVNHoist] Re-enable GVNHoist by default"Eric Christopher2018-10-011-2/+2
| | | | | | | | | This reverts commit r342387 as it's showing significant performance regressions in a number of benchmarks. Followed up with the committer and original thread with an example and will get performance numbers before recommitting. llvm-svn: 343522
* Recommit r343308: [LoopInterchange] Turn into a loop pass.Florian Hahn2018-10-011-4/+2
| | | | llvm-svn: 343450
OpenPOWER on IntegriCloud