summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Utils
Commit message (Collapse)AuthorAgeFilesLines
* Fix unused variable warning on EXPENSIVE_CHECKS release builds. NFCI.Simon Pilgrim2017-07-131-1/+1
| | | | llvm-svn: 307929
* [RuntimeUnrolling] Update DomTree correctly when exit blocks have successorsAnna Thomas2017-07-131-2/+28
| | | | | | | | | | | | | | | | Summary: When we runtime unroll with multiple exit blocks, we also need to update the immediate dominators of the immediate successors of the exit blocks. Reviewers: reames, mkuper, mzolotukhin, apilipenko Reviewed by: mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35304 llvm-svn: 307909
* [LoopUnrollRuntime] NFC: Refactored safety checks of unrolling multi-exit loopAnna Thomas2017-07-121-47/+58
| | | | | | | | | | | Refactored the code and separated out a function `canSafelyUnrollMultiExitLoop` to reduce redundant checks and make it easier to add profitability heuristics later. Added tests to runtime unrolling to make sure that unrolling for multi-exit loops is not done unless the option -unroll-runtime-multi-exit is true. llvm-svn: 307843
* Enhance synchscope representationKonstantin Zhuravlyov2017-07-111-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | OpenCL 2.0 introduces the notion of memory scopes in atomic operations to global and local memory. These scopes restrict how synchronization is achieved, which can result in improved performance. This change extends existing notion of synchronization scopes in LLVM to support arbitrary scopes expressed as target-specific strings, in addition to the already defined scopes (single thread, system). The LLVM IR and MIR syntax for expressing synchronization scopes has changed to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this replaces *singlethread* keyword), or a target-specific name. As before, if the scope is not specified, it defaults to CrossThread/System scope. Implementation details: - Mapping from synchronization scope name/string to synchronization scope id is stored in LLVM context; - CrossThread/System and SingleThread scopes are pre-defined to efficiently check for known scopes without comparing strings; - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in the bitcode. Differential Revision: https://reviews.llvm.org/D21723 llvm-svn: 307722
* [LoopUnrollRuntime] NFC: Add some debugging trace messages for why loop ↵Anna Thomas2017-07-111-8/+30
| | | | | | wasn't unrolled. llvm-svn: 307705
* [LoopUnrollRuntime] Avoid multi-exit nested loop with epilog generationAnna Thomas2017-07-111-2/+10
| | | | | | | | | | The loop structure for the outer loop does not contain the epilog preheader when we try to unroll inner loop with multiple exits and epilog code is generated. For now, we just bail out in such cases. Added a test case that shows the problem. Without this bailout, we would trip on assert saying LCSSA form is incorrect for outer loop. llvm-svn: 307676
* [ConstantHoisting] Remove dupliate logic in constant hoistingLeo Li2017-07-101-0/+3
| | | | | | | | | | | | | | | | | | | | | Summary: As metioned in https://reviews.llvm.org/D34576, checkings in `collectConstantCandidates` can be replaced by using `llvm::canReplaceOperandWithVariable`. The only special case is that `collectConstantCandidates` return false for all `IntrinsicInst` but it is safe for us to collect constant candidates from `IntrinsicInst`. Reviewers: pirama, efriedma, srhines Reviewed By: efriedma Subscribers: llvm-commits, javed.absar Differential Revision: https://reviews.llvm.org/D34921 llvm-svn: 307587
* [LoopUnrollRuntime] Remove strict assert about VMap requirementAnna Thomas2017-07-101-4/+3
| | | | | | | | | | | | | When unrolling under multiple exits which is under off-by-default option, the assert that checks for VMap entry in loop exit values is too strong. (assert if VMap entry did not exist, the value should be a constant). However, values derived from constants or from values outside loop, does not have a VMap entry too. Removed the assert and added a testcase showcasing the property for non-constant values. llvm-svn: 307542
* [IR] Make use of ↵Craig Topper2017-07-091-8/+7
| | | | | | Type::isPtrOrPtrVectorTy/isIntOrIntVectorTy/isFPOrFPVectorTy to shorten code. NFC llvm-svn: 307491
* Re-enable "[IndVars] Canonicalize comparisons between non-negative values ↵Max Kazantsev2017-07-081-0/+11
| | | | | | | | | | | | | | and indvars" The patch was reverted due to a bug. The bug was that if the IV is the 2nd operand of the icmp instruction, then the "Pred" variable gets swapped and differs from the instruction's predicate. In this patch we use the original predicate to do the transformation. Also added a test case that exercises this situation. Differentian Revision: https://reviews.llvm.org/D35107 llvm-svn: 307477
* [LoopUnrollRuntime] Support multiple exit blocks unrolling when prolog ↵Anna Thomas2017-07-071-2/+0
| | | | | | | | | | | | | | | remainder generated With the NFC refactoring in rL307417 (git SHA 987dd01), all the logic is in place to support multiple exit/exiting blocks when prolog remainder is generated. This patch removed the assert that multiple exit blocks unrolling is only supported when epilog remainder is generated. Also, added test runs and checks with PROLOG prefix in runtime-loop-multiple-exits.ll test cases. llvm-svn: 307435
* [Local] Update the comment for removeUnreachableBlocks.Davide Italiano2017-07-071-2/+3
| | | | | | | It referenced a wrong function name, and didn't mention what the second argument did. This should be slightly more accurate now. llvm-svn: 307425
* [cloning] Do not duplicate types when cloning functionsGor Nishanov2017-07-071-3/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is an addon to the change rl304488 cloning fixes. (Originally rl304226 reverted rl304228 and reapplied rl304488 https://reviews.llvm.org/D33655) rl304488 works great when DILocalVariables that comes from the inlined function has a 'unique-ed' type, but, in the case when the variable type is distinct we will create a second DILocalVariable in the scope of the original function that was inlined. Consider cloning of the following function: ``` define private void @f() !dbg !5 { %1 = alloca i32, !dbg !11 call void @llvm.dbg.declare(metadata i32* %1, metadata !14, metadata !12), !dbg !18 ret void, !dbg !18 } !14 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !17) ; came from an inlined function !15 = distinct !DISubprogram(name: "inlined", linkageName: "inlined", scope: null, file: !6, line: 8, type: !7, isLocal: true, isDefinition: true, scopeLine: 9, isOptimized: false, unit: !0, variables: !16) !16 = !{!14} !17 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32) ``` Without this fix, when function 'f' is cloned, we will create another DILocalVariable for "inlined", due to its type being distinct. ``` define private void @f.1() !dbg !23 { %1 = alloca i32, !dbg !26 call void @llvm.dbg.declare(metadata i32* %1, metadata !28, metadata !12), !dbg !30 ret void, !dbg !30 } !14 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !17) !15 = distinct !DISubprogram(name: "inlined", linkageName: "inlined", scope: null, file: !6, line: 8, type: !7, isLocal: true, isDefinition: true, scopeLine: 9, isOptimized: false, unit: !0, variables: !16) !16 = !{!14} !17 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32) ; !28 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !29) ; OOPS second DILocalVariable !29 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32) ``` Now we have two DILocalVariable for "inlined" within the same scope. This result in assert in AsmPrinter/DwarfDebug.h:131: void llvm::DbgVariable::addMMIEntry(const llvm::DbgVariable &): Assertion `V.Var == Var && "conflicting variable"' failed. (Full example: See: https://bugs.llvm.org/show_bug.cgi?id=33492) In this change we prevent duplication of types so that when a metadata for DILocalVariable is cloned it will get uniqued to the same metadate node as an original variable. Reviewers: loladiro, dblaikie, aprantl, echristo Reviewed By: loladiro Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D35106 llvm-svn: 307418
* [LoopUnrollRuntime] NFC: use the precomputed loop exit in ConnectPrologAnna Thomas2017-07-071-11/+11
| | | | | | | | | Minor refactoring to use the preexisting loop exit that's already calculated. We do not need to recompute the loop exit in ConnectProlog. Apart from avoiding redundant computation, this is required for supporting multiple loop exits when Prolog remainder loops are generated. llvm-svn: 307417
* Extend memcpy expansion in Transform/Utils to handle wider operand types.Sean Fertile2017-07-071-9/+279
| | | | | | | | | | | Adds loop expansions for known-size and unknown-sized memcpy calls, allowing the target to provide the operand types through TTI callbacks. The default values for the TTI callbacks use int8 operand types and matches the existing behaviour if they aren't overridden by the target. Differential revision: https://reviews.llvm.org/D32536 llvm-svn: 307346
* Modify constraints in `llvm::canReplaceOperandWithVariable`Leo Li2017-07-061-2/+8
| | | | | | | | | | | | | | | | | Summary: `Instruction::Switch`: only first operand can be set to a non-constant value. `Instruction::InsertValue` both the first and the second operand can be set to a non-constant value. `Instruction::Alloca` return true for non-static allocation. Reviewers: efriedma Reviewed By: efriedma Subscribers: srhines, pirama, llvm-commits Differential Revision: https://reviews.llvm.org/D34905 llvm-svn: 307294
* [Constants] If we already have a ConstantInt*, prefer to use ↵Craig Topper2017-07-063-4/+4
| | | | | | | | isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne. llvm-svn: 307292
* [LoopUnrollRuntime] Bailout when multiple exiting blocks to the unique latch ↵Anna Thomas2017-07-061-4/+5
| | | | | | | | | | | | | | exit block Currently, we do not support multiple exiting blocks to the latch exit block. However, this bailout wasn't triggered when we had a unique exit block (which is the latch exit), with multiple exiting blocks to that unique exit. Moved the bailout so that it's triggered in both cases and added testcase. llvm-svn: 307291
* [SimplifyCFG] Move a portion of an if statement that should already be ↵Craig Topper2017-07-061-2/+2
| | | | | | | | | | | | | | | | implied to an assert Summary: In this code we got to Dom by following the predecessor link of BB. So it stands to reason that BB should also show up as a successor of Dom's terminator right? There isn't a way to have the CFG connect in only one direction is there? Reviewers: jmolloy, davide, mcrosier Reviewed By: mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D35025 llvm-svn: 307276
* Revert "Revert "Revert "[IndVars] Canonicalize comparisons between ↵Max Kazantsev2017-07-061-4/+0
| | | | | | | | | non-negative values and indvars""" It appears that the problem is still there. Needs more analysis to understand why SaturatedMultiply test fails. llvm-svn: 307249
* Revert "Revert "[IndVars] Canonicalize comparisons between non-negative ↵Max Kazantsev2017-07-061-0/+4
| | | | | | | | | | | | | values and indvars"" It seems that the patch was reverted by mistake. Clang testing showed failure of the MathExtras.SaturatingMultiply test, however I was unable to reproduce the issue on the fresh code base and was able to confirm that the transformation introduced by the change does not happen in the said test. This gives a strong confidence that the actual reason of the failure of the initial patch was somewhere else, and that problem now seems to be fixed. Re-submitting the change to confirm that. llvm-svn: 307244
* [IndVarSimplify] Add AShr exact flags using induction variables ranges.David Green2017-07-051-2/+34
| | | | | | | | | | This adds exact flags to AShr/LShr flags where we can statically prove it is valid using the range of induction variables. This allows further optimisations to remove extra loads. Differential Revision: https://reviews.llvm.org/D34207 llvm-svn: 307157
* Revert "[IndVars] Canonicalize comparisons between non-negative values and ↵Max Kazantsev2017-07-051-4/+0
| | | | | | | | | | | indvars" This patch seems to cause failures of test MathExtras.SaturatingMultiply on multiple buildbots. Reverting until the reason of that is clarified. Differential Revision: https://reviews.llvm.org/rL307126 llvm-svn: 307135
* [IndVars] Canonicalize comparisons between non-negative values and indvarsMax Kazantsev2017-07-051-0/+4
| | | | | | | | | | | | | | | | | -If there is a IndVar which is known to be non-negative, and there is a value which is also non-negative, then signed and unsigned comparisons between them produce the same result. Both of those can be seen in the same loop. To allow other optimizations to simplify them, we turn all instructions like %c = icmp slt i32 %iv, %b to %c = icmp ult i32 %iv, %b if both %iv and %b are known to be non-negative. Differential Revision: https://reviews.llvm.org/D34979 llvm-svn: 307126
* [CodeExtractor] Remove unneded and commented out debugging stmts.Davide Italiano2017-07-021-6/+0
| | | | llvm-svn: 306966
* [Cloner] Re-map simplfied cloned instructions.Davide Italiano2017-07-011-5/+4
| | | | | | | | | | | | | | | This commit pretty much rolls back the logic added in r306495 as in the testcase provided we simplify an `icmp` looking through a PHI that hasn't been mapped yet. I think instsimplify shouldn't do threading over select/phis or just looking through phis in general, but this is what we have now. Also, add a test to prevent this from happening in case somebody wants to modify this code again. Briefly discussed with Kyle Butt (thanks Kyle!). llvm-svn: 306938
* Recommit "r306541 - Add zero-length check to memcpy/memset load store loop ↵Teresa Johnson2017-07-011-5/+14
| | | | | | | | | | expansion"" With fix for use-after-free errors. We can't add the new branch and remove the old one until we are done with the Builder constructed for the block. llvm-svn: 306937
* [LV] Sink casts to unravel first order recurrenceAyal Zaks2017-06-301-3/+16
| | | | | | | | | | | Check if a single cast is preventing handling a first-order-recurrence Phi, because the scheduling constraints it imposes on the first-order-recurrence shuffle are infeasible; but they can be made feasible by moving the cast downwards. Record such casts and move them when vectorizing the loop. Differential Revision: https://reviews.llvm.org/D33058 llvm-svn: 306884
* [SimplifyCFG] Update the name of switch generated lookup table.Sumanth Gundapaneni2017-06-301-4/+6
| | | | | | | | | | This patch appends the name of the function to the switch generated lookup table. This will ease the visual debugging in identifying the function the table is generated from. Differential Revision: https://reviews.llvm.org/D34817 llvm-svn: 306867
* [RuntimeUnrolling] Add logic for loops with multiple exit blocksAnna Thomas2017-06-301-23/+101
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Runtime unrolling is done for loops with a single exit block and a single exiting block (and this exiting block should be the latch block). This patch adds logic to support unrolling in the presence of multiple exit blocks (which also means multiple exiting blocks). Currently this is under an off-by-default option and is supported when epilog code is generated. Support in presence of prolog code will be in a future patch (we just need to add more tests, and update comments). This patch is essentially an implementation patch. I have not added any heuristic (in terms of branches added or code size) to decide when this should be enabled. Reviewers: mkuper, sanjoy, reames, evstupac Reviewed by: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33001 llvm-svn: 306846
* Revert "r306541 - Add zero-length check to memcpy/memset load store loop ↵Daniel Jasper2017-06-301-12/+5
| | | | | | | | | expansion" Segfaults in non-optimized builds. I'll get a stack trace and a reproducer to Teresa. llvm-svn: 306793
* [SCEV] Use depth limit instead of local cache for SExt and ZExtMax Kazantsev2017-06-301-5/+5
| | | | | | | | | | | | | | | | In rL300494 there was an attempt to deal with excessive compile time on invocations of getSign/ZeroExtExpr using local caching. This approach only helps if we request the same SCEV multiple times throughout recursion. But in the bug PR33431 we see a case where we request different values all the time, so caching does not help and the size of the cache grows enormously. In this patch we remove the local cache for this methods and add the recursion depth limit instead, as we do for arithmetics. This gives us a guarantee that the invocation sequence is limited and reasonably short. Differential Revision: https://reviews.llvm.org/D34273 llvm-svn: 306785
* Remove useless header. NFCXin Tong2017-06-291-1/+0
| | | | llvm-svn: 306712
* PredicateInfo: Use OrderedInstructions instead of our homemadeDaniel Berlin2017-06-291-51/+27
| | | | | | version. llvm-svn: 306703
* Remove unneeded else from OrderedInstructions::dominates.Daniel Berlin2017-06-291-2/+1
| | | | llvm-svn: 306701
* Add zero-length check to memcpy/memset load store loop expansionTeresa Johnson2017-06-281-5/+12
| | | | | | | | | | | | | | | | Summary: I was testing using this expansion logic in other cases besides NVPTX, and found some runtime failures due to the lack of a check for a zero length memcpy/memset before the loop. There is already such a check in the memmove expansion code though. Reviewers: hfinkel Subscribers: jholewinski, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D34707 llvm-svn: 306541
* Inlining: Don't re-map simplified cloned instructions.Kyle Butt2017-06-281-4/+5
| | | | | | | | | | | | | When simplifying an instruction that has been re-mapped, it should never simplify to an instruction in the original function. In the edge case where we are inlining a function into itself, the existing code led to incorrect behavior. Replace the incorrect code with an assert verifying that we never expect simplification to produce an instruction in the old function, unless the functions are the same. Differential Revision: https://reviews.llvm.org/D33850 llvm-svn: 306495
* [CodeExtractor] Prevent extraction of block involving blockaddressSerge Guelton2017-06-271-0/+27
| | | | | | | | | BlockAddress are only valid within their function context, which does not interact well with CodeExtractor. Detect this case and prevent it. Differential Revision: https://reviews.llvm.org/D33839 llvm-svn: 306448
* [LoopUnrollRuntime] Use SCEV exit count for calculating trip count. NFCIAnna Thomas2017-06-271-1/+5
| | | | | | | Instead of getBackEdgeTakenCount, use getExitCount on the latch exiting block (which is proven to be the only exiting block in the loop to be unrolled). llvm-svn: 306410
* [InstCombine] Factor the logic for propagating !nonnull and !rangeChandler Carruth2017-06-261-5/+49
| | | | | | | | | | | | | | | | metadata out of InstCombine and into helpers. NFC, this just exposes the logic used by InstCombine when propagating metadata from one load instruction to another. The plan is to use this in SROA to address PR32902. If anyone has better ideas about how to factor this or name variables, I'm all ears, but this seemed like a pretty good start and lets us make progress on the PR. This is based on a patch by Ariel Ben-Yehuda (D34285). llvm-svn: 306267
* [LoopSimplify] Re-instate r306081 with a bug fix w.r.t. indirectbr.Chandler Carruth2017-06-252-62/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was reverted in r306252, but I already had the bug fixed and was just trying to form a test case. The original commit factored the logic for forming dedicated exits inside of LoopSimplify into a helper that could be used elsewhere and with an approach that required fewer intermediate data structures. See that commit for full details including the change to the statistic, etc. The code looked fine to me and my reviewers, but in fact didn't handle indirectbr correctly -- it left the 'InLoopPredecessors' vector dirty. If you have code that looks *just* right, you can end up leaking these predecessors into a subsequent rewrite, and crash deep down when trying to update PHI nodes for predecessors that don't exist. I've added an assert that makes the bug much more obvious, and then changed the code to reliably clear the vector so we don't get this bug again in some other form as the code changes. I've also added a test case that *does* manage to catch this while also giving some nice positive coverage in the face of indirectbr. The real code that found this came out of what I think is CPython's interpreter loop, but any code with really "creative" interpreter loops mixing indirectbr and other exit paths could manage to tickle the bug. I was hard to reduce the original test case because in addition to having a particular pattern of IR, the whole thing depends on the order of the predecessors which is in turn depends on use list order. The test case added here was designed so that in multiple different predecessor orderings it should always end up going down the same path and tripping the same bug. I hope. At least, it tripped it for me without manipulating the use list order which is better than anything bugpoint could do... llvm-svn: 306257
* Revert "[LoopSimplify] Factor the logic to form dedicated exits into a utility."Daniel Jasper2017-06-252-83/+62
| | | | | | | This leads to a segfault. Chandler already has a test case and should be able to recommit with a fix soon. llvm-svn: 306252
* [Analysis][Transforms] Use commutable matchers instead of m_CombineOr in a ↵Craig Topper2017-06-241-2/+1
| | | | | | few places. NFC llvm-svn: 306204
* [RuntimeLoopUnrolling] Rename exit block and move assert earlier. NFCAnna Thomas2017-06-231-23/+23
| | | | | | | The single exit block allowed in runtime unrolling is guaranteed to be the Latch's successor, so rename it as LatchExitBlock. llvm-svn: 306105
* [LoopSimplify] Factor the logic to form dedicated exits into a utility.Chandler Carruth2017-06-232-62/+83
| | | | | | | | | | | | | | | | | | | | | | I want to use the same logic as LoopSimplify to form dedicated exits in another pass (SimpleLoopUnswitch) so I wanted to factor it out here. I also noticed that there is a pretty significantly more efficient way to implement this than the way the code in LoopSimplify worked. We don't need to actually retain the set of unique exit blocks, we can just rewrite them as we find them and use only a set to deduplicate. This did require changing one part of LoopSimplify to not re-use the unique set of exits, but it only used it to check that there was a single unique exit. That part of the code is about to walk the exiting blocks anyways, so it seemed better to rewrite it to use those exiting blocks to compute this property on-demand. I also had to ditch a statistic, but it doesn't seem terribly valuable. Differential Revision: https://reviews.llvm.org/D34049 llvm-svn: 306081
* Add argmononly attribute to strlen and wcslen, i.e. they only read memory ↵Xin Tong2017-06-181-0/+1
| | | | | | | | | | | | | | | | (string) passed to them. Summary: This allows strlen to be moved out of the loop in case its argument is not modified in the loop in LICM. Reviewers: hfinkel, davide, sanjoy, dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34323 llvm-svn: 305641
* [ScalarEvolution] Apply Depth limit to getMulExprMax Kazantsev2017-06-151-7/+7
| | | | | | | | | | | | | | | | | | | | | This is a fix for PR33292 that shows a case of extremely long compilation of a single .c file with clang, with most time spent within SCEV. We have a mechanism of limiting recursion depth for getAddExpr to avoid long analysis in SCEV. However, there are calls from getAddExpr to getMulExpr and back that do not propagate the info about depth. As result of this, a chain getAddExpr -> ... .> getAddExpr -> getMulExpr -> getAddExpr -> ... -> getAddExpr can be extremely long, with every segment of getAddExpr's being up to max depth long. This leads either to long compilation or crash by stack overflow. We face this situation while analyzing big SCEVs in the test of PR33292. This patch applies the same limit on max expression depth for getAddExpr and getMulExpr. Differential Revision: https://reviews.llvm.org/D33984 llvm-svn: 305463
* PredicateInfo: Don't insert conditional info when a conditional branch jumps ↵Daniel Berlin2017-06-141-0/+3
| | | | | | to the same target regardless of condition llvm-svn: 305416
* [PartialInlining] Support shrinkwrap life_range markersXinliang David Li2017-06-111-16/+203
| | | | | | Differential Revision: http://reviews.llvm.org/D33847 llvm-svn: 305170
* [InstSimplify] Don't constant fold or DCE calls that are marked nobuiltinAndrew Kaylor2017-06-091-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D33737 llvm-svn: 305132
OpenPOWER on IntegriCloud