path: root/llvm/lib/Transforms/Utils
Commit message | Author | Age | Files | Lines
...
* [SimplifyLibCalls] Fix memchr expansion for constant strings.Eli Friedman2019-01-091-1/+4
| | | | | | | | | | | | The C standard says "The memchr function locates the first occurrence of c (converted to an unsigned char)[...]". The expansion was missing the conversion to unsigned char. Fixes https://bugs.llvm.org/show_bug.cgi?id=39041 . Differential Revision: https://reviews.llvm.org/D55947 llvm-svn: 350775
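The effect of the missing conversion can be sketched in standalone C++. This is not the SimplifyLibCalls code; `foldMemchr` is a hypothetical helper that models folding `memchr` over a known constant string, and it assumes 8-bit bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not an LLVM API): folds memchr(Str, C, N) over a
// known constant string and returns the index of the match, or -1.
long foldMemchr(const std::string &Str, int C, std::size_t N) {
  assert(N <= Str.size() && "length must not exceed the constant string");
  // The C standard searches for C converted to unsigned char, so the value
  // wraps modulo 256 (assuming 8-bit bytes) before any comparison.
  const unsigned char Needle = static_cast<unsigned char>(C);
  for (std::size_t I = 0; I < N; ++I)
    if (static_cast<unsigned char>(Str[I]) == Needle)
      return static_cast<long>(I);
  return -1;
}
```

Without the `static_cast<unsigned char>(C)` step, a search value such as `0x100 + 'b'` would compare unequal to every byte, which is the kind of mismatch the commit describes.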
* [UnrollRuntime] Fix domTree failures in multiexit unrollingAnna Thomas2019-01-081-24/+24
| | | | | | | | | | | | | | | | | | | | Summary: This fixes the IDom for exit blocks and all blocks reachable from the exit blocks, when runtime unrolling under multiexit/exiting case. We initially had a restrictive check that the IDom is only updated when it is the header of the loop. However, we also need to update the IDom to the correct one when the IDom is any block within the original loop. See added test cases (which fail dom tree verification without the patch). Reviewers: reames, mzolotukhin, mkazantsev, hfinkel Reviewed by: brzycki, kuhar Subscribers: zzheng, dmgreen, llvm-commits Differential Revision: https://reviews.llvm.org/D56284 llvm-svn: 350640
* [CallSite removal] Migrate all Alias Analysis APIs to use the newlyChandler Carruth2019-01-071-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | minted `CallBase` class instead of the `CallSite` wrapper. This moves the largest interwoven collection of APIs that traffic in `CallSite`s. While a handful of these could have been migrated with a minorly more shallow migration by converting from a `CallSite` to a `CallBase`, it hardly seemed worth it. Most of the APIs needed to migrate together because of the complex interplay of AA APIs and the fact that converting from a `CallBase` to a `CallSite` isn't free in its current implementation. Out of tree users of these APIs can fairly reliably migrate with some combination of `.getInstruction()` on the `CallSite` instance and casting the resulting pointer. The most generic form will look like `CS` -> `cast_or_null<CallBase>(CS.getInstruction())` but in most cases there is a more elegant migration. Hopefully, this migrates enough APIs for users to fully move from `CallSite` to the base class. All of the in-tree users were easily migrated in that fashion. Thanks for the review from Saleem! Differential Revision: https://reviews.llvm.org/D55641 llvm-svn: 350503
* [ThinLTO] Handle chains of aliasesTeresa Johnson2019-01-043-0/+107
  At -O0, globalopt is not run during the compile step, and we can have a chain of an alias having an immediate aliasee of another alias. The summaries are constructed assuming aliases in a canonical form (flattened chains), and as a result only the base object but no intermediate aliases were preserved. Fix by adding a pass that canonicalizes aliases, which ensures each alias is a direct alias of the base object.

  Reviewers: pcc, davidxl
  Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits
  Differential Revision: https://reviews.llvm.org/D54507
  llvm-svn: 350423
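The flattening step can be illustrated with a toy model in standalone C++. This is a sketch under assumptions, not the LLVM pass: aliases are plain strings in a map, and `canonicalizeAliases` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Toy model, not the LLVM pass: Aliasee maps each alias name to its immediate
// aliasee, which may itself be an alias. Chains are assumed to be acyclic.
void canonicalizeAliases(std::map<std::string, std::string> &Aliasee) {
  for (auto &Entry : Aliasee) {
    std::string Base = Entry.second;
    std::size_t Steps = 0;
    // Walk the chain until we reach a name that is not itself an alias.
    for (auto It = Aliasee.find(Base); It != Aliasee.end();
         It = Aliasee.find(Base)) {
      assert(Steps++ < Aliasee.size() && "alias chain must be acyclic");
      Base = It->second;
    }
    Entry.second = Base; // Point the alias directly at the base object.
  }
}
```

After the call, every alias refers to the base object directly, so a consumer that only follows one level of indirection still sees every intermediate alias.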
* [CodeExtractor] Do not extract unsafe lifetime markersVedant Kumar2019-01-041-10/+91
  Lifetime markers which reference inputs to the extraction region are not safe to extract. Example ('rhs' will be extracted):

  ```
  entry:
    +------------+
    | x = alloca |
    | y = alloca |
    +------------+
      /        \
  lhs:          rhs:
  +-------------------+  +-------------------+
  | lifetime_start(x) |  | lifetime_start(x) |
  | use(x)            |  | lifetime_start(y) |
  | lifetime_end(x)   |  | use(x, y)         |
  | lifetime_start(y) |  | lifetime_end(y)   |
  | use(y)            |  | lifetime_end(x)   |
  | lifetime_end(y)   |  +-------------------+
  +-------------------+
  ```

  Prior to extraction, the stack coloring pass sees that the slots for 'x' and 'y' are in-use at the same time. After extraction, the coloring pass infers that 'x' and 'y' are *not* in-use concurrently, because markers from 'rhs' are no longer available to help decide otherwise.

  This leads to a miscompile, because the stack slots actually are in-use concurrently in the extracted function.

  Fix this by moving lifetime start/end markers for memory regions defined in the calling function around the call to the extracted function.

  Fixes llvm.org/PR39671 (rdar://45939472).

  Differential Revision: https://reviews.llvm.org/D55967
  llvm-svn: 350420
* [UnrollRuntime] Move the DomTree verification under expensive checksAnna Thomas2019-01-031-1/+1
| | | | | | Suggested by Hal as done in r349871. llvm-svn: 350349
* [UnrollRuntime] Add DomTree verification under debug modeAnna Thomas2019-01-031-0/+6
  NFC: This adds the dom tree verification under debug mode at a point just before we start unrolling the loop. This allows us to verify the dom tree at a state where it is much smaller and before the unrolling actually happens. This also implies we do not need to run -verify-dom-info every time to see if the DT is in a valid state when we transform the loop for runtime unrolling.

  llvm-svn: 350334
* [NewPM] Port MsanPhilip Pfaffe2019-01-031-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Keeping msan a function pass requires replacing the module level initialization: That means, don't define a ctor function which calls __msan_init, instead just declare the init function at the first access, and add that to the global ctors list. Changes: - Pull the actual sanitizer and the wrapper pass apart. - Add a newpm msan pass. The function pass inserts calls to runtime library functions, for which it inserts declarations as necessary. - Update tests. Caveats: - There is one test that I dropped, because it specifically tested the definition of the ctor. Reviewers: chandlerc, fedor.sergeev, leonardchan, vitalybuka Subscribers: sdardis, nemanjai, javed.absar, hiraditya, kbarton, bollu, atanasyan, jsji Differential Revision: https://reviews.llvm.org/D55647 llvm-svn: 350305
* [UnrollRuntime] NFC: Add comment and verify LCSSAAnna Thomas2018-12-281-2/+2
| | | | | | | Added -verify-loop-lcssa to test cases. Updated comments in ConnectProlog. llvm-svn: 350131
* [llvm] API for encoding/decoding DWARF discriminators.Mircea Trofin2018-12-213-10/+41
  Summary: Added a pair of APIs for encoding/decoding the 3 components of a DWARF discriminator described in http://lists.llvm.org/pipermail/llvm-dev/2016-October/106532.html: the base discriminator, the duplication factor (useful in profile-guided optimization) and the copy index (used to identify copies of code in cases like loop unrolling).

  The encoding packs 3 unsigned values in 32 bits. This CL addresses 2 issues:
  - communicates overflow back to the user
  - supports encoding all 3 components together. Current APIs assume a sequencing of events. For example, creating a new discriminator based on an existing one by changing the base discriminator was not supported.

  Reviewers: davidxl, danielcdh, wmi, dblaikie
  Reviewed By: dblaikie
  Subscribers: zzheng, dmgreen, aprantl, JDevlieghere, llvm-commits
  Differential Revision: https://reviews.llvm.org/D55681
  llvm-svn: 349973
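The shape of such an encode/decode pair can be sketched in standalone C++. The field widths below are a hypothetical fixed-width layout picked for illustration and do not reproduce the bit layout LLVM uses; `Components`, `encode`, `decode` and `withBase` are likewise illustrative names.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical fixed-width layout chosen for illustration only; it is NOT
// the bit layout LLVM uses for discriminators.
constexpr unsigned BaseBits = 12, DupBits = 12, CopyBits = 8;
static_assert(BaseBits + DupBits + CopyBits == 32, "fields must fill 32 bits");

struct Components {
  uint32_t Base, DupFactor, CopyIndex;
};

// Packs all three components at once. Returns std::nullopt when a component
// does not fit in its field, so overflow is reported back to the caller.
std::optional<uint32_t> encode(const Components &C) {
  if (C.Base >> BaseBits || C.DupFactor >> DupBits || C.CopyIndex >> CopyBits)
    return std::nullopt;
  return C.Base | (C.DupFactor << BaseBits) |
         (C.CopyIndex << (BaseBits + DupBits));
}

// Splits a packed value back into its three components.
Components decode(uint32_t D) {
  return {D & ((1u << BaseBits) - 1), (D >> BaseBits) & ((1u << DupBits) - 1),
          D >> (BaseBits + DupBits)};
}

// Derives a new discriminator from an existing one by replacing only the base.
std::optional<uint32_t> withBase(uint32_t D, uint32_t NewBase) {
  Components C = decode(D);
  C.Base = NewBase;
  return encode(C);
}
```

Encoding all three components in one call is what makes `withBase` possible without assuming any particular order in which the components were originally set.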
* [IR] Add Instruction::isLifetimeStartOrEnd, NFCVedant Kumar2018-12-215-17/+7
| | | | | | | | | | | Instruction::isLifetimeStartOrEnd() checks whether an Instruction is an llvm.lifetime.start or an llvm.lifetime.end intrinsic. This was suggested as a cleanup in D55967. Differential Revision: https://reviews.llvm.org/D56019 llvm-svn: 349964
* [RuntimeUnrolling] NFC: Add TODO and comments in connectPrologAnna Thomas2018-12-211-0/+18
| | | | | | | | Currently, runtime unrolling does not support loops where multiple exiting blocks exit to the latchExit. Added TODO and other code clarifications for ConnectProlog code. llvm-svn: 349944
* [LoopUnroll] Don't verify domtree by default with +Asserts.Eli Friedman2018-12-212-3/+5
| | | | | | | | | | This verification is linear in the size of the function, so it can cause a quadratic compile-time explosion in a function with many loops to unroll. Differential Revision: https://reviews.llvm.org/D54732 llvm-svn: 349871
* Introduce llvm.loop.parallel_accesses and llvm.access.group metadata.Michael Kruse2018-12-204-52/+37
  The current llvm.mem.parallel_loop_access metadata has a problem in that it uses LoopIDs. LoopID unfortunately is not a loop identifier. It is neither unique (there's even a regression test assigning the same LoopID to multiple loops; this can otherwise happen if passes such as LoopVersioning make copies of entire loops) nor persistent (every time a property is removed/added from a LoopID's MDNode, it will also receive a new LoopID; this happens e.g. when calling Loop::setLoopAlreadyUnrolled()). Since most loop transformation passes change the loop attributes (even if it is just to mark that a loop should not be processed again, as llvm.loop.isvectorized does for the versioned and unversioned loop), the parallel access information is lost for any subsequent pass.

  This patch unlinks LoopIDs and parallel accesses. llvm.mem.parallel_loop_access metadata on instructions is replaced by llvm.access.group metadata. llvm.access.group points to a distinct MDNode with no operands (avoiding the need to ever add/remove operands), called an "access group". Alternatively, it can point to a list of access groups. The LoopID then has an attribute llvm.loop.parallel_accesses with all the access groups that are parallel (no dependencies carried by this loop).

  This intentionally avoids any kind of "ID". Loops that are cloned or have their attributes modified retain the llvm.loop.parallel_accesses attribute. Access instructions that are cloned point to the same access group. It is not necessary for each access to have its own "ID" MDNode; memory access instructions with the same behavior can be grouped together.

  The behavior of llvm.mem.parallel_loop_access is not changed by this patch, but should be considered deprecated.

  Differential Revision: https://reviews.llvm.org/D52116
  llvm-svn: 349725
* [Util] Refer to [s|z]exts of args when converting dbg.declares (fix PR35400)Vedant Kumar2018-12-151-27/+0
| | | | | | | | | | | | | | | | | | When converting dbg.declares, if the described value is a [s|z]ext, refer to the ext directly instead of referring to its operand. This fixes a narrowing bug (the debugger got the sign of a variable wrong, see llvm.org/PR35400). The main reason to refer to the ext's operand was that an optimization may remove the ext itself, leading to a dropped variable. Now that InstCombine has been taught to use replaceAllDbgUsesWith (r336451), this is less of a concern. Other passes can/should adopt this API as needed to fix dropped variable bugs. Differential Revision: https://reviews.llvm.org/D51813 llvm-svn: 349214
* [Transforms] Preserve metadata when converting invoke to call.Michael Kruse2018-12-141-0/+1
| | | | | | | | | | | | | | | The `changeToCall` function did not preserve the invoke's metadata. Currently, there is probably no metadata that depends on being applied on a CallInst or InvokeInst. Therefore we can replace the instruction's metadata. This fixes http://llvm.org/PR39994 Suggested-by: Moritz Kreutzer <moritz.kreutzer@siemens.com> Differential Revision: https://reviews.llvm.org/D55666 llvm-svn: 349170
* [ThinLTO] Compute synthetic function entry countEaswaran Raman2018-12-131-2/+17
| | | | | | | | | | | | | | | | | Summary: This patch computes the synthetic function entry count on the whole program callgraph (based on module summary) and writes the entry counts to the summary. After function importing, this count gets attached to the IR as metadata. Since it adds a new field to the summary, this bumps up the version. Reviewers: tejohnson Subscribers: mehdi_amini, inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D43521 llvm-svn: 349076
* [LoopUtils] Use i32 instead of `void`.Davide Italiano2018-12-131-1/+1
| | | | | | | | The actual type of the first argument of the @dbg intrinsic doesn't really matter as we're setting it to `undef`, but the bitcode reader is picky about `void` types. llvm-svn: 349069
* [LoopUtils] Prefer a set over a map. NFCI.Davide Italiano2018-12-131-6/+4
| | | | llvm-svn: 348999
* [LoopDeletion] Update debug values after loop deletion.Davide Italiano2018-12-121-0/+27
  When loops are deleted, we don't keep track of variables modified inside the loops, so the DI will contain the wrong value for these.

  e.g.

    int b() {
      int i;
      for (i = 0; i < 2; i++)
        ;
      patatino();
      return a;

    -> 6 patatino();
       7 return a;
       8 }
       9 int main() { b(); }
    (lldb) frame var i
    (int) i = 0

  Instead, we mark these values as unavailable by inserting a @llvm.dbg.value(undef to make sure we don't end up printing an incorrect value in the debugger. We could consider doing something fancier, e.g. for constants, in the future.

  PR39868. rdar://problem/46418795)

  Differential Revision: https://reviews.llvm.org/D55299
  llvm-svn: 348988
* [Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes.Michael Kruse2018-12-124-34/+286
  When multiple loop transformations are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance,

    #pragma clang loop unroll_and_jam(enable)
    #pragma clang loop distribute(enable)

  is the same as

    #pragma clang loop distribute(enable)
    #pragma clang loop unroll_and_jam(enable)

  and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also, the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used.

  This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance,

    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.unroll_and_jam.enable"}
    !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3}
    !3 = !{!"llvm.loop.distribute.enable"}

  defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop.

  Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account.

  For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patch introduces a WarnMissedTransformations pass, to warn about orphaned transformations.

  Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated.

  To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied.

  With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling).

  Reviewed By: hfinkel, dmgreen
  Differential Revision: https://reviews.llvm.org/D49281
  Differential Revision: https://reviews.llvm.org/D55288
  llvm-svn: 348944
* [Local] Promote an utility that could be used elsewhere. NFCI.Davide Italiano2018-12-101-0/+11
| | | | llvm-svn: 348804
* [CodeExtractor] Store outputs at the first valid insertion pointVedant Kumar2018-12-071-12/+12
  When CodeExtractor outlines values which are used by the original function, it must store those values in some in-out parameter. This store instruction must not be inserted in between a PHI and an EH pad instruction, as that results in invalid IR.

  This fixes the following verifier failure seen while outlining within ObjC methods with live exit values:

    The unwind destination does not have an exception handling instruction!
      %call35 = invoke i8* bitcast (i8* (i8*, i8*, ...)* @objc_msgSend to i8* (i8*, i8*)*)(i8* %exn.adjusted, i8* %1)
        to label %invoke.cont34 unwind label %lpad33, !dbg !4183
    The unwind destination does not have an exception handling instruction!
      invoke void @objc_exception_throw(i8* %call35) #12
        to label %invoke.cont36 unwind label %lpad33, !dbg !4184
    LandingPadInst not the first non-PHI instruction in the block.
      %3 = landingpad { i8*, i32 }
        catch i8* null, !dbg !1411

  rdar://46540815

  llvm-svn: 348562
* [CodeExtractor] Do not mark outlined calls which may resume EH as noreturnVedant Kumar2018-12-051-2/+5
| | | | | | | | | | Treat terminators which resume exception propagation as returning instructions (at least, for the purposes of marking outlined functions `noreturn`). This is to avoid inserting traps after calls to outlined functions which unwind. rdar://46129950 llvm-svn: 348404
* [CodeExtractor] Split PHI nodes with incoming values from outlined region (PR39433)Vedant Kumar2018-12-031-49/+90
  If a PHI node outside the extracted region has multiple incoming values from it, split this PHI into two parts. The first PHI has only the incoming values from the region and is extracted with it (it is placed in a separate basic block that is added to the list of outlined blocks), and the incoming values in the original PHI are replaced by the first PHI. A similar solution is already used in CodeExtractor for PHIs in the entry block (the severSplitPHINodes method). This covers the PR39433 bug.

  Patch by Sergei Kachkov!

  Differential Revision: https://reviews.llvm.org/D55018
  llvm-svn: 348205
* [ValueTracking] add helper function for testing implied condition; NFCISanjay Patel2018-12-021-22/+11
| | | | | | | | We were duplicating code around the existing isImpliedCondition() that checks for a predecessor block/dominating condition, so make that a wrapper call. llvm-svn: 348088
* [Mem2Reg] Fix nondeterministic corner caseJoseph Tremoulet2018-11-301-2/+6
  Summary: When mem2reg inserts phi nodes in blocks with unreachable predecessors, it adds undef operands for those incoming edges. When there are multiple such predecessors, the order is currently based on the address of the BasicBlocks. This change fixes that by using the BBNumbers in the sort/search predicates, as is done elsewhere in mem2reg to ensure determinism. Also adds a testcase with a bunch of unreachable preds, which (nondeterministically) fails without the fix.

  Reviewers: majnemer
  Reviewed By: majnemer
  Subscribers: mgrang, llvm-commits
  Differential Revision: https://reviews.llvm.org/D55077
  llvm-svn: 348024
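The difference between ordering by address and ordering by an assigned number can be sketched in standalone C++. This is a simplified model, not mem2reg itself: `Block` and `sortByNumber` are hypothetical, and the number map stands in for the BBNumbers mentioned in the message.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <vector>

struct Block {
  const char *Name;
};

// Sorts predecessor blocks by a number assigned in function order. Comparing
// the pointers directly would also give a valid order, but that order depends
// on where the allocator placed each block and can change between runs.
void sortByNumber(std::vector<const Block *> &Preds,
                  const std::map<const Block *, unsigned> &BBNumbers) {
  std::sort(Preds.begin(), Preds.end(), [&](const Block *A, const Block *B) {
    return BBNumbers.at(A) < BBNumbers.at(B);
  });
}
```

Any later search over the sorted list has to use the same number-based predicate, otherwise the lookup and the sort would disagree about the order.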
* [DebugInfo] Give inlinable calls DILocs (PR39807)Jeremy Morse2018-11-281-8/+9
| | | | | | | | | | | | | | | | | | | | In PR39807 we incorrectly handle circumstances where calls are common'd from conditional blocks into the parent BB. Calls that can be inlined must always have DebugLocs, however we strip them during commoning, which the IR verifier asserts on. Fix this by using applyMergedLocation: it will perform the same DebugLoc stripping of conditional Locs, but will also generate an unknown location DebugLoc that satisfies the requirement for inlinable calls to always have locations. Some of the prior logic for selecting a DebugLoc is now likely redundant; I'll generate a follow-up to remove it (involves editing more regression tests). Differential Revision: https://reviews.llvm.org/D54997 llvm-svn: 347782
* [ThinLTO] Correct linkonce_any function import linkage. NFC.Xin Tong2018-11-281-5/+6
  Summary: This is an NFC as we do not import non-odr vague linkage when computing the import list for a module.

  Reviewers: tejohnson, pcc
  Subscribers: inglorion, dexonsmith, llvm-commits
  Differential Revision: https://reviews.llvm.org/D54928
  llvm-svn: 347763
* [ICP] Remove incompatible attributes at indirect-call promoted callsites.Xin Tong2018-11-261-2/+27
  Summary: Remove incompatible attributes at indirect-call promoted callsites; not removing them results in at least an IR verification error.

  Reviewers: davidxl, xur, mssimpso
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D54913
  llvm-svn: 347605
* [Transforms] Prefer static and avoid namespaces, NFCReid Kleckner2018-11-191-10/+6
| | | | | | | | | | | | | Put 'static' on three functions in an anonymous namespace as per our coding style. Remove the 'namespace llvm {}' around the .cpp file and explicitly declare the free function 'llvm::optimizeGlobalCtorsList' in 'llvm::'. I prefer this style for free functions because the compiler will error out if the .h and .cpp files don't agree on the function name or prototype. llvm-svn: 347269
* [IR] Add hasNPredecessors, hasNPredecessorsOrMore to BasicBlockVedant Kumar2018-11-192-7/+5
| | | | | | | | | | | | | | | | | | | | | | | Add methods to BasicBlock which make it easier to efficiently check whether a block has N (or more) predecessors. This can be more efficient than using pred_size(), which is a linear time operation. We might consider adding similar methods for successors. I haven't done so in this patch because succ_size() is already O(1). With this patch applied, I measured a 0.065% compile-time reduction in user time for running `opt -O3` on the sqlite3 amalgamation (30 trials). The change in mergeStoreIntoSuccessor alone saves 45 million linked list iterations in a stage2 Release build of llc. See llvm.org/PR39702 for a harder but more general way of achieving similar results. Differential Revision: https://reviews.llvm.org/D54686 llvm-svn: 347256
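The early-exit idea behind these checks can be sketched over any forward range in standalone C++. The helper names below are illustrative and the code is a generic model, not the BasicBlock implementation.

```cpp
#include <cassert>
#include <forward_list>

// Returns true if [Begin, End) has exactly N elements. It advances at most N
// times, instead of walking the whole range as a full size count would.
template <typename It> bool hasNItems(It Begin, It End, unsigned N) {
  for (; N; --N, ++Begin)
    if (Begin == End)
      return false; // Ran out of elements before reaching N.
  return Begin == End;
}

// Returns true if [Begin, End) has at least N elements, advancing at most N
// times.
template <typename It> bool hasNItemsOrMore(It Begin, It End, unsigned N) {
  for (; N; --N, ++Begin)
    if (Begin == End)
      return false; // Ran out of elements before reaching N.
  return true;
}
```

For a block with thousands of predecessors, asking "is there exactly one?" this way touches two list nodes at most, whereas computing the full count first touches all of them.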
* [ThinLTO] Internalize readonly globalsEugene Leviant2018-11-161-2/+19
| | | | | | | | An attempt to recommit r346584 after failure on OSX build bot. Fixed cache key computation in ThinLTOCodeGenerator and added test case llvm-svn: 347033
* Revert r346810 "Preserve loop metadata when splitting exit blocks"Reid Kleckner2018-11-141-32/+0
| | | | | | | It broke the Windows self-host: http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/1457 llvm-svn: 346823
* Preserve loop metadata when splitting exit blocksCraig Topper2018-11-131-0/+32
  LoopUtils.cpp contains a utility that splits a loop exit block, so that the new block contains only edges coming from the loop. In the case of nested loops, the exit path for the inner loop might also be the back-edge of the outer loop. The new block which is inserted on this path is now a latch for the outer loop, and it needs to hold the loop metadata for the outer loop. (The test case gives a more concrete view of the situation.)

  Patch by Chang Lin (clin1)

  Differential Revision: https://reviews.llvm.org/D53876
  llvm-svn: 346810
* [CSP, Cloning] Update DuplicateInstructionsInSplitBetween to use DomTreeUpdater.Florian Hahn2018-11-131-7/+15
| | | | | | | | | | | | | | | | | | | | | This patch updates DuplicateInstructionsInSplitBetween to update a DTU instead of applying updates to the DT directly. Given that there only are 2 users, also updated them in this patch to avoid churn. I slightly moved the code in CallSiteSplitting around to reduce the places where we have to pass in DTU. If necessary, I could split those changes in a separate patch. This fixes missing DT updates when dealing with musttail calls in CallSiteSplitting, by using DTU->deleteBB. Reviewers: junbuml, kuhar, NutshellySima, indutny, brzycki Reviewed By: NutshellySima llvm-svn: 346769
* Revert "[ThinLTO] Internalize readonly globals"Steven Wu2018-11-131-19/+2
| | | | | | This reverts commit 10c84a8f35cae4a9fc421648d9608fccda3925f2. llvm-svn: 346768
* [ThinLTO] Internalize readonly globalsEugene Leviant2018-11-101-2/+19
| | | | | | | | | This patch allows internalising globals if all accesses to them (from live functions) are from non-volatile load instructions Differential revision: https://reviews.llvm.org/D49362 llvm-svn: 346584
* [DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG.Carlos Alberto Enciso2018-11-091-0/+16
  In SimplifyCFG, when given a conditional branch that goes to BB1 and BB2, hoisting the common terminator instruction of the two blocks caused debug line records associated with subsequent select instructions to become ambiguous. This causes the debugger to display unreachable source lines.

  Differential Revision: https://reviews.llvm.org/D53390
  llvm-svn: 346481
* [CodeExtractor] Mark functions noreturn when applicableVedant Kumar2018-11-081-0/+7
| | | | | | | | | | This eliminates the outlining penalty for llvm.trap/unreachable, because callers no longer have to emit cleanup/ret instructions after calling an outlined `noreturn` function. rdar://45523626 llvm-svn: 346421
* [CodeExtractor] Do not extract calls to eh_typeid_for (PR39545)Vedant Kumar2018-11-061-3/+11
| | | | | | | | | | | The lowering for a call to eh_typeid_for changes when it's moved from one function to another. There are several proposals for fixing this issue in llvm.org/PR39545. Until some solution is in place, do not allow CodeExtractor to extract calls to eh_typeid_for, as that results in serious miscompilations. llvm-svn: 346256
* [CodeExtractor] Erase use-without-def debug intrinsics in parent funcVedant Kumar2018-11-061-0/+9
| | | | | | | | | | | When CodeExtractor moves instructions to a new function, debug intrinsics referring to those instructions within the parent function become invalid. This results in the same verifier failure which motivated r344545, about function-local metadata being used in the wrong function. llvm-svn: 346255
* Remove unnecessary fallthrough annotation after unreachableReid Kleckner2018-11-011-2/+0
| | | | | | | | | | Clang's -Wimplicit-fallthrough implementation warns on this. I built clang with GCC 7.3 in +asserts and -asserts mode, and GCC doesn't warn on this in either configuration. I think it is unnecessary. I separated it from the large mechanical patch (https://reviews.llvm.org/D53950) in case I am wrong and it has to be reverted. llvm-svn: 345876
* ADT/STLExtras: Introduce llvm::empty; NFCMatthias Braun2018-10-312-3/+3
| | | | | | | | This is modeled after C++17 std::empty(). Differential Revision: https://reviews.llvm.org/D53909 llvm-svn: 345679
* [Local] Keep K's range if K does not move when combining metadata.Florian Hahn2018-10-271-1/+9
  As K has to dominate I, IIUC I's range metadata must be a subset of K's. After Eli's recent clarification to the LangRef, loading a value outside of the range is undefined behavior. Therefore if I's range contains elements outside of K's range and we would load one such value, K would cause undefined behavior.

  In cases like hoisting/sinking, we still want the most generic range over all code paths to/from the hoist/sink point. As suggested in the patches related to D47339, I will refactor the handling of those scenarios and try to decouple it from this function as a follow-up, once we have switched to a similar handling of metadata in most of combineMetadata.

  I updated some tests that mostly check the merging of metadata to keep the metadata of the dominating load. The most interesting one is probably test8 in test/Transforms/JumpThreading/thread-loads.ll. It contained a comment about the alias metadata preventing us from eliminating the branch, but it seems like the actual problem currently is that we merge the ranges of both loads and cannot eliminate the icmp afterwards. With this patch, we manage to eliminate the icmp, as the range of the first load excludes 8.

  Reviewers: efriedma, nlopes, davide
  Reviewed By: efriedma
  Differential Revision: https://reviews.llvm.org/D51629
  llvm-svn: 345456
* [DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG.Carlos Alberto Enciso2018-10-252-18/+45
  When SimplifyCFG changes the PHI node into a select instruction, the debug line records become ambiguous. This causes the debugger to display unreachable source lines.

  Differential Revision: https://reviews.llvm.org/D53287
  llvm-svn: 345250
* Update MemorySSA in LoopRotate.Alina Sbirlea2018-10-241-9/+51
| | | | | | | | | | | | Summary: Teach LoopRotate to preserve MemorySSA. Enable tests for correctness, dependency disabled by default. Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits Differential Revision: https://reviews.llvm.org/D51718 llvm-svn: 345216
* [HotColdSplitting] Identify larger cold regions using domtree queriesVedant Kumar2018-10-241-16/+24
  The current splitting algorithm works in three stages:

  1) Identify cold blocks, then
  2) Use forward/backward propagation to mark hot blocks, then
  3) Grow a SESE region of blocks *outside* of the set of hot blocks and start outlining.

  While testing this pass on Apple internal frameworks I noticed that some kinds of control flow (e.g. loops) are never outlined, even though they unconditionally lead to / follow cold blocks. I noticed two other issues related to how cold regions are identified:

  - An inconsistency can arise in the internal state of the hotness propagation stage, as a block may end up in both the ColdBlocks set and the HotBlocks set. Further inconsistencies can arise as these sets do not match what's in ProfileSummaryInfo.
  - It isn't necessary to limit outlining to single-exit regions.

  This patch teaches the splitting algorithm to identify maximal cold regions and outline them. A maximal cold region is defined as the set of blocks post-dominated by a cold sink block, or dominated by that sink block. This approach can successfully outline loops in the cold path. As a side benefit, it maintains less internal state than the current approach.

  Due to a limitation in CodeExtractor, blocks within the maximal cold region which aren't dominated by a single entry point (a so-called "max ancestor") are filtered out.

  Results:

  - X86 (LNT + -Os + externals): 134KB of TEXT were outlined compared to 47KB pre-patch, or a ~3x improvement. Did not see a performance impact across two runs.
  - AArch64 (LNT + -Os + externals + Apple-internal benchmarks): 149KB of TEXT were outlined. Ditto re: performance impact.
  - Outlining results improve marginally in the internal frameworks I tested.

  Follow-ups:

  - Outline more than once per function, outline large single basic blocks, & try to remove unconditional branches in outlined functions.

  Differential Revision: https://reviews.llvm.org/D53627
  llvm-svn: 345209
* [hot-cold-split] Name split functions with ".cold" suffixTeresa Johnson2018-10-241-5/+11
  Summary: The current default of appending "_"+entry block label to the new extracted cold function breaks demangling. Change the delimiter from "_" to "." to enable demangling. Because the header block label will be empty for release compile code, use "extracted" after the "." when the label is empty.

  Additionally, add a mechanism for the client to pass in an alternate suffix applied after the ".", and have the hot cold split pass use "cold."+Count, where the Count is currently 1 but can be used to uniquely number multiple cold functions split out from the same function with D53588.

  Reviewers: sebpop, hiraditya
  Subscribers: llvm-commits, erik.pilkington
  Differential Revision: https://reviews.llvm.org/D53534
  llvm-svn: 345178
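The naming rule described in this message can be written out as a small standalone C++ function. This is a sketch of the rule as stated, not the CodeExtractor source; `extractedName` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper that mirrors the naming rule in the commit message:
// the parent name, a "." delimiter, then a suffix chosen by priority.
std::string extractedName(const std::string &Parent,
                          const std::string &EntryLabel,
                          const std::string &Suffix = "") {
  if (!Suffix.empty())
    return Parent + "." + Suffix; // A client-provided suffix takes priority.
  if (EntryLabel.empty())
    return Parent + ".extracted"; // Labels may be empty in release builds.
  return Parent + "." + EntryLabel;
}
```

A caller that wants numbered cold functions could pass `"cold." + std::to_string(Count)` as the suffix, which yields names such as `foo.cold.1`.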
* [NFC][InstCombine] Undo stray changeEvandro Menezes2018-10-191-2/+2
| | | | | | Undo stray change introduced by r344725. llvm-svn: 344814