summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Utils
Commit message (Collapse)AuthorAgeFilesLines
...
* [SCEV] Use depth limit instead of local cache for SExt and ZExtMax Kazantsev2017-06-301-5/+5
| | | | | | | | | | | | | | | | In rL300494 there was an attempt to deal with excessive compile time on invocations of getSign/ZeroExtExpr using local caching. This approach only helps if we request the same SCEV multiple times throughout recursion. But in the bug PR33431 we see a case where we request different values all the time, so caching does not help and the size of the cache grows enormously. In this patch we remove the local cache for this methods and add the recursion depth limit instead, as we do for arithmetics. This gives us a guarantee that the invocation sequence is limited and reasonably short. Differential Revision: https://reviews.llvm.org/D34273 llvm-svn: 306785
* Remove useless header. NFCXin Tong2017-06-291-1/+0
| | | | llvm-svn: 306712
* PredicateInfo: Use OrderedInstructions instead of our homemadeDaniel Berlin2017-06-291-51/+27
| | | | | | version. llvm-svn: 306703
* Remove unneeded else from OrderedInstructions::dominates.Daniel Berlin2017-06-291-2/+1
| | | | llvm-svn: 306701
* Add zero-length check to memcpy/memset load store loop expansionTeresa Johnson2017-06-281-5/+12
| | | | | | | | | | | | | | | | Summary: I was testing using this expansion logic in other cases besides NVPTX, and found some runtime failures due to the lack of a check for a zero length memcpy/memset before the loop. There is already such a check in the memmove expansion code though. Reviewers: hfinkel Subscribers: jholewinski, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D34707 llvm-svn: 306541
* Inlining: Don't re-map simplified cloned instructions.Kyle Butt2017-06-281-4/+5
| | | | | | | | | | | | | When simplifying an instruction that has been re-mapped, it should never simplify to an instruction in the original function. In the edge case where we are inlining a function into itself, the existing code led to incorrect behavior. Replace the incorrect code with an assert verifying that we never expect simplification to produce an instruction in the old function, unless the functions are the same. Differential Revision: https://reviews.llvm.org/D33850 llvm-svn: 306495
* [CodeExtractor] Prevent extraction of block involving blockaddressSerge Guelton2017-06-271-0/+27
| | | | | | | | | BlockAddress are only valid within their function context, which does not interact well with CodeExtractor. Detect this case and prevent it. Differential Revision: https://reviews.llvm.org/D33839 llvm-svn: 306448
* [LoopUnrollRuntime] Use SCEV exit count for calculating trip count. NFCIAnna Thomas2017-06-271-1/+5
| | | | | | | Instead of getBackEdgeTakenCount, use getExitCount on the latch exiting block (which is proven to be the only exiting block in the loop to be unrolled). llvm-svn: 306410
* [InstCombine] Factor the logic for propagating !nonnull and !rangeChandler Carruth2017-06-261-5/+49
| | | | | | | | | | | | | | | | metadata out of InstCombine and into helpers. NFC, this just exposes the logic used by InstCombine when propagating metadata from one load instruction to another. The plan is to use this in SROA to address PR32902. If anyone has better ideas about how to factor this or name variables, I'm all ears, but this seemed like a pretty good start and lets us make progress on the PR. This is based on a patch by Ariel Ben-Yehuda (D34285). llvm-svn: 306267
* [LoopSimplify] Re-instate r306081 with a bug fix w.r.t. indirectbr.Chandler Carruth2017-06-252-62/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was reverted in r306252, but I already had the bug fixed and was just trying to form a test case. The original commit factored the logic for forming dedicated exits inside of LoopSimplify into a helper that could be used elsewhere and with an approach that required fewer intermediate data structures. See that commit for full details including the change to the statistic, etc. The code looked fine to me and my reviewers, but in fact didn't handle indirectbr correctly -- it left the 'InLoopPredecessors' vector dirty. If you have code that looks *just* right, you can end up leaking these predecessors into a subsequent rewrite, and crash deep down when trying to update PHI nodes for predecessors that don't exist. I've added an assert that makes the bug much more obvious, and then changed the code to reliably clear the vector so we don't get this bug again in some other form as the code changes. I've also added a test case that *does* manage to catch this while also giving some nice positive coverage in the face of indirectbr. The real code that found this came out of what I think is CPython's interpreter loop, but any code with really "creative" interpreter loops mixing indirectbr and other exit paths could manage to tickle the bug. I was hard to reduce the original test case because in addition to having a particular pattern of IR, the whole thing depends on the order of the predecessors which is in turn depends on use list order. The test case added here was designed so that in multiple different predecessor orderings it should always end up going down the same path and tripping the same bug. I hope. At least, it tripped it for me without manipulating the use list order which is better than anything bugpoint could do... llvm-svn: 306257
* Revert "[LoopSimplify] Factor the logic to form dedicated exits into a utility."Daniel Jasper2017-06-252-83/+62
| | | | | | | This leads to a segfault. Chandler already has a test case and should be able to recommit with a fix soon. llvm-svn: 306252
* [Analysis][Transforms] Use commutable matchers instead of m_CombineOr in a ↵Craig Topper2017-06-241-2/+1
| | | | | | few places. NFC llvm-svn: 306204
* [RuntimeLoopUnrolling] Rename exit block and move assert earlier. NFCAnna Thomas2017-06-231-23/+23
| | | | | | | The single exit block allowed in runtime unrolling is guaranteed to be the Latch's successor, so rename it as LatchExitBlock. llvm-svn: 306105
* [LoopSimplify] Factor the logic to form dedicated exits into a utility.Chandler Carruth2017-06-232-62/+83
| | | | | | | | | | | | | | | | | | | | | | I want to use the same logic as LoopSimplify to form dedicated exits in another pass (SimpleLoopUnswitch) so I wanted to factor it out here. I also noticed that there is a pretty significantly more efficient way to implement this than the way the code in LoopSimplify worked. We don't need to actually retain the set of unique exit blocks, we can just rewrite them as we find them and use only a set to deduplicate. This did require changing one part of LoopSimplify to not re-use the unique set of exits, but it only used it to check that there was a single unique exit. That part of the code is about to walk the exiting blocks anyways, so it seemed better to rewrite it to use those exiting blocks to compute this property on-demand. I also had to ditch a statistic, but it doesn't seem terribly valuable. Differential Revision: https://reviews.llvm.org/D34049 llvm-svn: 306081
* Add argmononly attribute to strlen and wcslen, i.e. they only read memory ↵Xin Tong2017-06-181-0/+1
| | | | | | | | | | | | | | | | (string) passed to them. Summary: This allows strlen to be moved out of the loop in case its argument is not modified in the loop in LICM. Reviewers: hfinkel, davide, sanjoy, dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34323 llvm-svn: 305641
* [ScalarEvolution] Apply Depth limit to getMulExprMax Kazantsev2017-06-151-7/+7
| | | | | | | | | | | | | | | | | | | | | This is a fix for PR33292 that shows a case of extremely long compilation of a single .c file with clang, with most time spent within SCEV. We have a mechanism of limiting recursion depth for getAddExpr to avoid long analysis in SCEV. However, there are calls from getAddExpr to getMulExpr and back that do not propagate the info about depth. As result of this, a chain getAddExpr -> ... .> getAddExpr -> getMulExpr -> getAddExpr -> ... -> getAddExpr can be extremely long, with every segment of getAddExpr's being up to max depth long. This leads either to long compilation or crash by stack overflow. We face this situation while analyzing big SCEVs in the test of PR33292. This patch applies the same limit on max expression depth for getAddExpr and getMulExpr. Differential Revision: https://reviews.llvm.org/D33984 llvm-svn: 305463
* PredicateInfo: Don't insert conditional info when a conditional branch jumps ↵Daniel Berlin2017-06-141-0/+3
| | | | | | to the same target regardless of condition llvm-svn: 305416
* [PartialInlining] Support shrinkwrap life_range markersXinliang David Li2017-06-111-16/+203
| | | | | | Differential Revision: http://reviews.llvm.org/D33847 llvm-svn: 305170
* [InstSimplify] Don't constant fold or DCE calls that are marked nobuiltinAndrew Kaylor2017-06-091-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D33737 llvm-svn: 305132
* [SimplifyLibCalls] fix formatting; NFCSanjay Patel2017-06-091-1/+1
| | | | llvm-svn: 305081
* Sort the remaining #include lines in include/... and lib/....Chandler Carruth2017-06-0626-35/+33
| | | | | | | | | | | | | | | | | | | | | | | | | I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787
* Add a dominanance check interface that uses caching for instructions within ↵Xin Tong2017-06-062-0/+34
| | | | | | | | | | | | | | | | | | | | | | | | same basic block. Summary: This problem stems from the fact that instructions are allocated using new in LLVM, i.e. there is no relationship that can be derived by just looking at the pointer value. This interface dispatches to appropriate dominance check given 2 instructions, i.e. in case the instructions are in the same basic block, ordered basicblock (with instruction numbering and caching) are used. Otherwise, dominator tree is used. This is a preparation patch for https://reviews.llvm.org/D32720 Reviewers: dberlin, hfinkel, davide Subscribers: davide, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D33380 llvm-svn: 304764
* Reapply "[Cloning] Take another pass at properly cloning debug info"Keno Fischer2017-06-011-28/+43
| | | | | | | | This was rL304226, reverted in 304228 due to a clang assertion failure on the build bots. That problem should have been addressed by clang commit rL304470. llvm-svn: 304488
* [PredicateInfo] Fix non-determinism in codegen uncovered by reverse ↵Mandeep Singh Grang2017-06-011-1/+34
| | | | | | | | | | | | | | | | | | | iterating SmallPtrSet Summary: Sort OpsToRename before iterating to make iteration order deterministic. Thanks to Daniel Berlin for the sorting logic. Reviewers: dberlin, RKSimon, efriedma, davide Reviewed By: dberlin, davide Subscribers: sanjoy, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D33265 llvm-svn: 304447
* [PPC] Inline expansion of memcmpZaara Syeda2017-05-311-14/+0
| | | | | | | | | | | | | | | This patch does an inline expansion of memcmp. It changes the memcmp library call into an inline expansion when the size is known at compile time and is under a target specified threshold. This expansion is implemented in CodeGenPrepare and expands into straight line code. The target specifies a maximum load size and the expansion works by using this size to load the two sources, compare, and exit early if a difference is found. It also has a special case when the memcmp result is used in a compare to zero equality. Differential Revision: https://reviews.llvm.org/D28637 llvm-svn: 304313
* [PartialInlining] Shrinkwrap allocas with live range contained in outline ↵Xinliang David Li2017-05-301-7/+76
| | | | | | | | region. Differential Revision: http://reviews.llvm.org/D33618 llvm-svn: 304245
* Revert "[Cloning] Take another pass at properly cloning debug info"Keno Fischer2017-05-301-43/+28
| | | | | | At least one build bot is complaining. Will investigate after lunch. llvm-svn: 304228
* [Cloning] Take another pass at properly cloning debug infoKeno Fischer2017-05-301-28/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In rL302576, DISubprograms gained the constraint that a !dbg attachments to functions must have a 1:1 mapping to DISubprograms. As part of that change, the function cloning support was adjusted to attempt to enforce this invariant during cloning. However, there were several problems with the implementation. Part of these were fixed in rL304079. However, there was a more fundamental problem with these changes, namely that it bypasses the matadata value map, causing the cloned metadata to be a mix of metadata pointing to the new suprogram (where manual code was added to fix those up) and the old suprogram (where this was not the case). This mismatch could cause a number of different assertion failures in the DWARF emitter. Some of these are given at https://github.com/JuliaLang/julia/issues/22069, but some others have been observed as well. Attempt to rectify this by partially reverting the manual DI metadata fixup, and instead using the standard value map approach. To retain the desired semantics of not duplicating the compilation unit and inlined subprograms, explicitly freeze these in the value map. Reviewers: dblaikie, aprantl, GorNishanov, echristo Reviewed By: aprantl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33655 llvm-svn: 304226
* Cloning: Fix debug info cloningGor Nishanov2017-05-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: I believe https://reviews.llvm.org/rL302576 introduced two bugs: 1) it produces duplicate distinct variables for every: dbg.value describing the same variable. To fix the problme I switched form getDistinct() to get() in DebugLoc.cpp: auto reparentVar = [&](DILocalVariable *Var) { return DILocalVariable::getDistinct( 2) It passes NewFunction plain name as a linkagename parameter to Subprogram constructor. Breaks assert in: || DeclLinkageName.empty()) || LinkageName == DeclLinkageName) && "decl has a linkage name and it is different"' failed. #9 0x00007f5010261b75 llvm::DwarfUnit::applySubprogramDefinitionAttributes(llvm::DISubprogram const*, llvm::DIE&) /home/gor/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp:1173:3 # (Edit: reproducer added) Here how https://reviews.llvm.org/rL302576 broke coroutine debug info. Coroutine body of the original function is split into several parts by cloning and removing unneeded code. All parts describe the original function and variables present in the original function. For a simple case, prior to Split, original function has these two blocks: ``` PostSpill: ; preds = %AllocaSpillBB call void @llvm.dbg.value(metadata i32 %x, i64 0, metadata !14, metadata !15), !dbg !13 store i32 %x, i32* %x.addr, align 4 ... and sw.epilog: ; preds = %sw.bb %x.addr.reload.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i32 0, i32 4, !dbg !20 %4 = load i32, i32* %x.addr.reload.addr, align 4, !dbg !20 call void @llvm.dbg.value(metadata i32 %4, i64 0, metadata !14, metadata !15), !dbg !13 !14 = !DILocalVariable(name: "x", arg: 1, scope: !6, file: !7, line: 55, type: !11) ``` Note that in two blocks different expression represent the same original user variable X. Before rL302576, for every cloned function there was exactly one cloned DILocalVariable(name: "x" as in: ``` define i8* @f(i32 %x) #0 !dbg !6 { ... !6 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped, ... !14 = !DILocalVariable(name: "x", arg: 1, scope: !6, file: !7, line: 55, type: !11) define internal fastcc void @f.resume(%f.Frame* %FramePtr) #0 !dbg !25 { ... !25 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped, isOptimized: false, unit: !0, variables: !2) !28 = !DILocalVariable(name: "x", arg: 1, scope: !25, file: !7, line: 55, type: !11) ``` After rL302576, for every cloned function there were as many DILocalVariable(name: "x" as there were "call void @llvm.dbg.value" for that variable. This was causing asserts in VerifyDebugInfo and AssemblyPrinter. Example: ``` !27 = distinct !DISubprogram(name: "f", linkageName: "f.resume", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, !29 = distinct !DILocalVariable(name: "x", arg: 1, scope: !27, file: !7, line: 55, type: !11) !39 = distinct !DILocalVariable(name: "x", arg: 1, scope: !27, file: !7, line: 55, type: !11) !41 = distinct !DILocalVariable(name: "x", arg: 1, scope: !27, file: !7, line: 55, type: !11) ``` Second problem: Prior to rL302576, all clones were described by DISubprogram referring to original function. ``` define i8* @f(i32 %x) #0 !dbg !6 { ... !6 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped, define internal fastcc void @f.resume(%f.Frame* %FramePtr) #0 !dbg !25 { ... !25 = distinct !DISubprogram(name: "f", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, flags: DIFlagPrototyped, ``` After rL302576, DISubprogram for clones is of two minds, plain name refers to the original name, linkageName refers to plain name of the clone. ``` !27 = distinct !DISubprogram(name: "f", linkageName: "f.resume", scope: !7, file: !7, line: 55, type: !8, isLocal: false, isDefinition: true, scopeLine: 55, ``` I think the assumption in AsmPrinter is that both name and linkageName should refer to the same entity. It asserts here when they are not: ``` || DeclLinkageName.empty()) || LinkageName == DeclLinkageName) && "decl has a linkage name and it is different"' failed. #9 0x00007f5010261b75 llvm::DwarfUnit::applySubprogramDefinitionAttributes(llvm::DISubprogram const*, llvm::DIE&) /home/gor/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp:1173:3 ``` After this fix, behavior (with respect to coroutines) reverts to exactly as it was before and therefore making them debuggable again, or even more importantly, compilable, with "-g" Reviewers: dblaikie, echristo, aprantl Reviewed By: dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33614 llvm-svn: 304079
* [GVNSink] GVNSink passJames Molloy2017-05-252-47/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch provides an initial prototype for a pass that sinks instructions based on GVN information, similar to GVNHoist. It is not yet ready for commiting but I've uploaded it to gather some initial thoughts. This pass attempts to sink instructions into successors, reducing static instruction count and enabling if-conversion. We use a variant of global value numbering to decide what can be sunk. Consider: [ %a1 = add i32 %b, 1 ] [ %c1 = add i32 %d, 1 ] [ %a2 = xor i32 %a1, 1 ] [ %c2 = xor i32 %c1, 1 ] \ / [ %e = phi i32 %a2, %c2 ] [ add i32 %e, 4 ] GVN would number %a1 and %c1 differently because they compute different results - the VN of an instruction is a function of its opcode and the transitive closure of its operands. This is the key property for hoisting and CSE. What we want when sinking however is for a numbering that is a function of the *uses* of an instruction, which allows us to answer the question "if I replace %a1 with %c1, will it contribute in an equivalent way to all successive instructions?". The (new) PostValueTable class in GVN provides this mapping. This pass has some shown really impressive improvements especially for codesize already on internal benchmarks, so I have high hopes it can replace all the sinking logic in SimplifyCFG. Differential revision: https://reviews.llvm.org/D24805 llvm-svn: 303850
* [ValueTracking] Convert most of the calls to computeKnownBits to use the ↵Craig Topper2017-05-243-9/+4
| | | | | | | | | | version that returns the KnownBits object. This continues the changes started when computeSignBit was replaced with this new version of computeKnowBits. Differential Revision: https://reviews.llvm.org/D33431 llvm-svn: 303773
* [IR] Switch AttributeList to use an array for O(1) accessReid Kleckner2017-05-231-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Before this change, AttributeLists stored a pair of index and AttributeSet. This is memory efficient if most arguments do not have attributes. However, it requires doing a search over the pairs to test an argument or function attribute. Profiling shows that this loop was 0.76% of the time in 'opt -O2' of sqlite3.c, because LLVM constantly tests values for nullability. This was worth about 2.5% of mid-level optimization cycles on the sqlite3 amalgamation. Here are the full perf results: https://reviews.llvm.org/P7995 Here are just the before and after cycle counts: ``` $ perf stat -r 5 ./opt_before -O2 sqlite3.bc -o /dev/null 13,274,181,184 cycles # 3.047 GHz ( +- 0.28% ) $ perf stat -r 5 ./opt_after -O2 sqlite3.bc -o /dev/null 12,906,927,263 cycles # 3.043 GHz ( +- 0.51% ) ``` This patch *does not* change the indices used to query attributes, as requested by reviewers. Tracking whether an index is usable for array indexing is a huge pain that affects many of the internal APIs, so it would be good to come back later and do a cleanup to remove this internal adjustment. Reviewers: pete, chandlerc Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D32819 llvm-svn: 303654
* [JumpThreading] Safely replace uses of conditionAnna Thomas2017-05-231-0/+17
| | | | | | | | | | | | | | | | | | | | | | This patch builds over https://reviews.llvm.org/rL303349 and replaces the use of the condition only if it is safe to do so. We should not blindly RAUW the condition if experimental.guard or assume is a use of that condition. This is because LVI may have used the guard/assume to identify the value of the condition, and RUAWing will fold the guard/assume and uses before the guards/assumes. Reviewers: sanjoy, reames, trentxintong, mkazantsev Reviewed by: sanjoy, reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33257 llvm-svn: 303633
* Fix update VP metadata after inlining for instrumentation PGOTeresa Johnson2017-05-221-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: With instrumentation profiling, when updating the VP metadata after an inline, VP metadata on the inlined copy was inadvertantly having all counts zeroed out. This was causing indirect calls from code inlined during the call step to be marked as cold in the ThinLTO summaries and not imported. The CallerBFI needs to be passed down so that the CallSiteCount can be computed from the profile summary info. With Sample PGO this was working since the count is extracted from the branch weight metadata on the call being inlined (even before we stopped looking at metadata for non-sample PGO in r302844 this largely wasn't working for instrumentation PGO since only promoted indirect calls would be getting inlined and have the metadata). Added an instrumentation PGO test and renamed the sample PGO test. Reviewers: danielcdh, eraman Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D33389 llvm-svn: 303574
* [SimplifyCFG] Prevent a few APInt copies on method calls that return const ↵Craig Topper2017-05-221-2/+2
| | | | | | reference. NFCI llvm-svn: 303523
* Revert "Add pthread_self function prototype and make it speculatable."Xin Tong2017-05-211-12/+0
| | | | | | | | This reverts commit 143d7445b5dfa2f6d6c45bdbe0433d9fc531be21. Build breaking llvm-svn: 303496
* Add pthread_self function prototype and make it speculatable.Xin Tong2017-05-201-0/+12
| | | | | | | | | | | | | | Summary: This allows pthread_self to be pulled out of a loop by LICM. Reviewers: hfinkel, arsenm, davide Reviewed By: davide Subscribers: davide, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D32782 llvm-svn: 303495
* SimplifyLibCalls: Optimize wcslenMatthias Braun2017-05-191-28/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | Refactor the strlen optimization code to work for both strlen and wcslen. This especially helps with programs in the wild where people pass L"string"s to const std::wstring& function parameters and the wstring constructor gets inlined. This also fixes a lingerind API problem/bug in getConstantStringInfo() where zeroinitializers would always give you an empty string (without a length) back regardless of the actual length of the initializer which did not work well in the TrimAtNul==false causing the PR mentioned below. Note that the fixed getConstantStringInfo() needed fixes to SelectionDAG memcpy lowering and may lead to some cases for out-of-bounds zeroinitializer accesses not getting optimized anymore. So some code with UB may produce out of bound memory reads now instead of just producing zeros. The refactoring "accidentally" fixes http://llvm.org/PR32124 Differential Revision: https://reviews.llvm.org/D32839 llvm-svn: 303461
* [IR] De-virtualize ~Value to save a vptrReid Kleckner2017-05-182-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Implements PR889 Removing the virtual table pointer from Value saves 1% of RSS when doing LTO of llc on Linux. The impact on time was positive, but too noisy to conclusively say that performance improved. Here is a link to the spreadsheet with the original data: https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing This change makes it invalid to directly delete a Value, User, or Instruction pointer. Instead, such code can be rewritten to a null check and a call Value::deleteValue(). Value objects tend to have their lifetimes managed through iplist, so for the most part, this isn't a big deal. However, there are some places where LLVM deletes values, and those places had to be migrated to deleteValue. I have also created llvm::unique_value, which has a custom deleter, so it can be used in place of std::unique_ptr<Value>. I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which derives from User outside of lib/IR. Code in IR cannot include MemorySSA headers or call the MemoryAccess object destructors without introducing a circular dependency, so we need some level of indirection. Unfortunately, no class derived from User may have any virtual methods, because adding a virtual method would break User::getHungOffOperands(), which assumes that it can find the use list immediately prior to the User object. I've added a static_assert to the appropriate OperandTraits templates to help people avoid this trap. Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv Reviewed By: chandlerc Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D31261 llvm-svn: 303362
* PR32288: Describe a bool parameter's DWARF location with a simple registerDavid Blaikie2017-05-151-28/+23
| | | | | | | | | | | There's no need (& a bit incorrect) to mask off the high bits of the register reference when describing a simple bool value. Reviewers: aprantl Differential Revision: https://reviews.llvm.org/D31062 llvm-svn: 303117
* [KnownBits] Add bit counting methods to KnownBits struct and use them where ↵Craig Topper2017-05-122-3/+3
| | | | | | | | | | | | possible This patch adds min/max population count, leading/trailing zero/one bit counting methods. The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give. Differential Revision: https://reviews.llvm.org/D32931 llvm-svn: 302925
* Remove spurious cast of nullptr. NFC.Serge Guelton2017-05-111-1/+1
| | | | | | Conversion rules allow automatic casting of nullptr to any pointer type. llvm-svn: 302780
* Add a late IR expansion pass for the experimental reduction intrinsics.Amara Emerson2017-05-101-5/+4
| | | | | | | | | This pass uses a new target hook to decide whether or not to expand a particular intrinsic to the shuffevector sequence. Differential Revision: https://reviews.llvm.org/D32245 llvm-svn: 302631
* [ProfileSummary] Make getProfileCount a non-static member function.Easwaran Raman2017-05-091-8/+10
| | | | | | | | | | This change is required because the notion of count is different for sample profiling and getProfileCount will need to determine the underlying profile type. Differential revision: https://reviews.llvm.org/D33012 llvm-svn: 302597
* [GVN] Fix a crash on encountering non-integral pointersKeno Fischer2017-05-091-0/+9
| | | | | | | | | | | | | | | | | | Summary: This fixes the immediate crash caused by introducing an incorrect inttoptr before attempting the conversion. There may still be a legality check missing somewhere earlier for non-integral pointers, but this change seems necessary in any case. Reviewers: sanjoy, dberlin Reviewed By: dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32623 llvm-svn: 302587
* Make it illegal for two Functions to point to the same DISubprogramAdrian Prantl2017-05-092-44/+31
| | | | | | | | | | | | | | | | | | | As recently discussed on llvm-dev [1], this patch makes it illegal for two Functions to point to the same DISubprogram and updates FunctionCloner to also clone the debug info of a function to conform to the new requirement. To simplify the implementation it also factors out the creation of inlineAt locations from the Inliner into a general-purpose utility in DILocation. [1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html <rdar://problem/31926379> Differential Revision: https://reviews.llvm.org/D32975 This reapplies r302469 with a fix for a bot failure (reparentDebugInfo now checks for the case the orig and new function are identical). llvm-svn: 302576
* NFC: refactor replaceDominatedUsesWithPiotr Padlewski2017-05-091-27/+26
| | | | | | | | | | | | | | | Summary: Since I will post patch with some changes to replaceDominatedUsesWith, it would be good to avoid duplicating code again. Reviewers: davide, dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32798 llvm-svn: 302575
* Suppress all uses of LLVM_END_WITH_NULL. NFC.Serge Guelton2017-05-094-16/+14
| | | | | | | | | Use variadic templates instead of relying on <cstdarg> + sentinel. This enforces better type checking and makes code more readable. Differential Revision: https://reviews.llvm.org/D32541 llvm-svn: 302571
* Revert r302469 "Make it illegal for two Functions to point to the same ↵Hans Wennborg2017-05-092-31/+44
| | | | | | | | | | | | | | | | | | | | | | | | DISubprogram" This caused PR32977. Original commit message: > Make it illegal for two Functions to point to the same DISubprogram > > As recently discussed on llvm-dev [1], this patch makes it illegal for > two Functions to point to the same DISubprogram and updates > FunctionCloner to also clone the debug info of a function to conform > to the new requirement. To simplify the implementation it also factors > out the creation of inlineAt locations from the Inliner into a > general-purpose utility in DILocation. > > [1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html > <rdar://problem/31926379> > > Differential Revision: https://reviews.llvm.org/D32975 llvm-svn: 302533
* Introduce experimental generic intrinsics for horizontal vector reductions.Amara Emerson2017-05-091-0/+202
| | | | | | | | | | | | | | - This change allows targets to opt-in to using them instead of the log2 shufflevector algorithm. - The SLP and Loop vectorizers have the common code to do shuffle reductions factored out into LoopUtils, and now have a unified interface for generating reductions regardless of the preference of the target. LoopUtils now uses TTI to determine what kind of reductions the target wants to handle. - For CodeGen, basic legalization support is added. Differential Revision: https://reviews.llvm.org/D30086 llvm-svn: 302514
OpenPOWER on IntegriCloud