summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [ICP] Remove incompatible attributes at indirect-call promoted callsites.Xin Tong2018-11-261-2/+27
| | | | | | | | | | | | | | Summary: Removing ncompatible attributes at indirect-call promoted callsites, not removing it results in at least a IR verification error. Reviewers: davidxl, xur, mssimpso Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54913 llvm-svn: 347605
* [InstCombine] add helper function to reduce code duplication; NFCSanjay Patel2018-11-261-24/+19
| | | | llvm-svn: 347604
* [IPSCCP] Use input operand instead of OriginalOp for ssa_copy.Florian Hahn2018-11-251-5/+5
| | | | | | | | | | | OriginalOp of a Predicate refers to the original IR value, before renaming. While solving in IPSCCP, we have to use the operand of the ssa_copy instead, to avoid missing updates for nested conditions on the same IR value. Fixes PR39772. llvm-svn: 347524
* [InstCombine] Determine demanded and known bits for funnel shiftsNikita Popov2018-11-241-0/+24
| | | | | | | | | | | | | | Support funnel shifts in InstCombine demanded bits simplification. If the shift amount is constant, we can determine both the demanded bits of the operands, as well as the known bits of the result. If one of the operands has no demanded bits, it will be replaced by undef and the funnel shift will be simplified into a simple shift due to the simplifications added in D54778. Differential Revision: https://reviews.llvm.org/D54869 llvm-svn: 347515
* [InstCombine] Simplify funnel shift with zero/undef operand to shiftNikita Popov2018-11-231-0/+23
| | | | | | | | | | | | | | | | | | | | The following simplifications are implemented: * `fshl(X, 0, C) -> shl X, C%BW` * `fshl(X, undef, C) -> shl X, C%BW` (assuming undef = 0) * `fshl(0, X, C) -> lshr X, BW-C%BW` * `fshl(undef, X, C) -> lshr X, BW-C%BW` (assuming undef = 0) * `fshr(X, 0, C) -> shl X, (BW-C%BW)` * `fshr(X, undef, C) -> shl X, BW-C%BW` (assuming undef = 0) * `fshr(0, X, C) -> lshr X, C%BW` * `fshr(undef, X, C) -> lshr, X, C%BW` (assuming undef = 0) The simplification is only performed if the shift amount C is constant, because we can explicitly compute C%BW and BW-C%BW in this case. Differential Revision: https://reviews.llvm.org/D54778 llvm-svn: 347505
* Disable LoopSimplifyCFG terminator folding by defaultMax Kazantsev2018-11-231-0/+6
| | | | llvm-svn: 347486
* [LoopSimplifyCFG] Don't delete LCSSA PhisMax Kazantsev2018-11-231-1/+4
| | | | | | | | | | | | When removing edges, we also update Phi inputs and may end up removing a Phi if it has only one input. We should not do it for edges that leave the current loop because these Phis are LCSSA Phis and need to be preserved. Thanks @dmgreen for finding this! Differential Revision: https://reviews.llvm.org/D54841 llvm-svn: 347484
* [NFC] Assert that all blocks staying in loop are liveMax Kazantsev2018-11-221-0/+2
| | | | llvm-svn: 347458
* [NFC] Ensure deterministic order of dead exit blocksMax Kazantsev2018-11-221-6/+11
| | | | llvm-svn: 347457
* [NFC] Simplify code by using standard exit blocks collectionMax Kazantsev2018-11-221-10/+8
| | | | llvm-svn: 347454
* [PM] correcting return value for new-pass-manager version of ScalarizerFedor Sergeev2018-11-211-2/+2
| | | | | | Obvious mistake missed during D54695 review. llvm-svn: 347432
* [MergeFuncs] Generate alias instead of thunk if possibleNikita Popov2018-11-211-14/+73
| | | | | | | | | | | | | | | | | | | The MergeFunctions pass was originally intended to emit aliases instead of thunks where possible (unnamed_addr). However, for a long time this functionality was behind a flag hardcoded to false, bitrotted and was eventually removed in r309313. Originally the functionality was first disabled in r108417 due to lack of support for aliases in Mach-O. I believe that this is no longer the case nowadays, but not really familiar with this area. In the interest of being conservative, this patch reintroduces the aliasing functionality behind a default disabled -mergefunc-use-aliases flag. Differential Revision: https://reviews.llvm.org/D53285 llvm-svn: 347407
* [PM] Port Scalarizer to the new pass manager.Mikael Holmen2018-11-212-55/+72
| | | | | | | | | | | | | | Patch by: markus (Markus Lavin) Reviewers: chandlerc, fedor.sergeev Reviewed By: fedor.sergeev Subscribers: llvm-commits, Ka-Ka, bjope Differential Revision: https://reviews.llvm.org/D54695 llvm-svn: 347392
* [LoopSink] Add preheader to alias setGuozhi Wei2018-11-201-0/+1
| | | | | | | | | | This patch fixes PR39695. The original LoopSink only considers memory alias in loop body. But PR39695 shows that instructions following sink candidate in preheader should also be checked. This is a conservative patch, it simply adds whole preheader block to alias set. It may lose some optimization opportunity, but I think that is very rare because: 1 in the most common case st/ld to the same address, the load should already be optimized away. 2 usually preheader is not very large. Differential Revision: https://reviews.llvm.org/D54659 llvm-svn: 347325
* Recommit "[LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches ↵Max Kazantsev2018-11-201-0/+315
| | | | | | | | | | | and switches" The initial version of patch lacked Phi nodes updates in destinations of removed edges. This version contains this update and tests on this situation. Differential Revision: https://reviews.llvm.org/D54021 llvm-svn: 347289
* [Transforms] Prefer static and avoid namespaces, NFCReid Kleckner2018-11-191-10/+6
| | | | | | | | | | | | | Put 'static' on three functions in an anonymous namespace as per our coding style. Remove the 'namespace llvm {}' around the .cpp file and explicitly declare the free function 'llvm::optimizeGlobalCtorsList' in 'llvm::'. I prefer this style for free functions because the compiler will error out if the .h and .cpp files don't agree on the function name or prototype. llvm-svn: 347269
* Revert "[LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches ↵Benjamin Kramer2018-11-191-313/+0
| | | | | | | | and switches" This reverts commits r347183 & r347184. Crashes while building libxml. llvm-svn: 347260
* [InstCombine] Set debug loc on `mergeStoreIntoSuccessor` phiVedant Kumar2018-11-191-2/+6
| | | | | | | | | Assigning a merged debug location to the `mergeStoreIntoSuccessor` phi improves backtrace quality. Fixes llvm.org/PR38083. llvm-svn: 347257
* [IR] Add hasNPredecessors, hasNPredecessorsOrMore to BasicBlockVedant Kumar2018-11-195-10/+8
| | | | | | | | | | | | | | | | | | | | | | | Add methods to BasicBlock which make it easier to efficiently check whether a block has N (or more) predecessors. This can be more efficient than using pred_size(), which is a linear time operation. We might consider adding similar methods for successors. I haven't done so in this patch because succ_size() is already O(1). With this patch applied, I measured a 0.065% compile-time reduction in user time for running `opt -O3` on the sqlite3 amalgamation (30 trials). The change in mergeStoreIntoSuccessor alone saves 45 million linked list iterations in a stage2 Release build of llc. See llvm.org/PR39702 for a harder but more general way of achieving similar results. Differential Revision: https://reviews.llvm.org/D54686 llvm-svn: 347256
* Revert "[LICM] Make LICM able to hoist phis"Benjamin Kramer2018-11-191-302/+15
| | | | | | This reverts commit r347190. llvm-svn: 347225
* [LV] Avoid vectorizing unsafe dependencies in uniform addressAnna Thomas2018-11-191-4/+3
| | | | | | | | | | | | | | | | | | | Summary: Currently, when vectorizing stores to uniform addresses, the only instance we prevent vectorization is if there are multiple stores to the same uniform address causing an unsafe dependency. This patch teaches LAA to avoid vectorizing loops that have an unsafe cross-iteration dependency between a load and a store to the same uniform address. Fixes PR39653. Reviewers: Ayal, efriedma Subscribers: rkruppe, llvm-commits Differential Revision: https://reviews.llvm.org/D54538 llvm-svn: 347220
* [LICM] Make LICM able to hoist phisJohn Brawn2018-11-191-15/+302
| | | | | | | | | | | | | | | The general approach taken is to make note of loop invariant branches, then when we see something conditional on that branch, such as a phi, we create a copy of the branch and (empty versions of) its successors and hoist using that. This has no impact by itself that I've been able to see, as LICM typically doesn't see such phis as they will have been converted into selects by the time LICM is run, but once we start doing phi-to-select conversion later it will be important. Differential Revision: https://reviews.llvm.org/D52827 llvm-svn: 347190
* [LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switchesMax Kazantsev2018-11-191-0/+313
| | | | | | | | | | | | | | | | This patch introduces infrastructure and the simplest case for constant-folding of branch and switch instructions within loop into unconditional branches. It is useful as a cleanup for such passes as loop unswitching that sometimes produce such branches. Only the simplest case supported in this patch: after the folding, no block should become dead or stop being part of the loop. Support for more sophisticated cases will go separately in follow-up patches. Differential Revision: https://reviews.llvm.org/D54021 Reviewed By: anna llvm-svn: 347183
* [ProfileSummary] Standardize methods and fix commentVedant Kumar2018-11-196-8/+8
| | | | | | | | | | | | | | | | | | | | | Every Analysis pass has a get method that returns a reference of the Result of the Analysis, for example, BlockFrequencyInfo &BlockFrequencyInfoWrapperPass::getBFI(). I believe that ProfileSummaryInfo::getPSI() is the only exception to that, as it was returning a pointer. Another change is renaming isHotBB and isColdBB to isHotBlock and isColdBlock, respectively. Most methods use BB as the argument of variable names while methods usually refer to Basic Blocks as Blocks, instead of BB. For example, Function::getEntryBlock, Loop:getExitBlock, etc. I also fixed one of the comments. Patch by Rodrigo Caetano Rocha! Differential Revision: https://reviews.llvm.org/D54669 llvm-svn: 347182
* [CorrelatedValuePropagation] Preserve debug locations (PR38178)Vedant Kumar2018-11-181-15/+16
| | | | | | | | | Fix all of the missing debug location errors in CVP found by debugify. This includes the missing-location-after-udiv-truncation case described in llvm.org/PR38178. llvm-svn: 347147
* Use llvm::copy. NFCFangrui Song2018-11-173-6/+5
| | | | llvm-svn: 347126
* [SimpleLoopUnswitch] adding cost multiplier to cap exponential unswitch withFedor Sergeev2018-11-161-2/+116
| | | | | | | | | | | | | | | | | | | | | We need to control exponential behavior of loop-unswitch so we do not get run-away compilation. Suggested solution is to introduce a multiplier for an unswitch cost that makes cost prohibitive as soon as there are too many candidates and too many sibling loops (meaning we have already started duplicating loops by unswitching). It does solve the currently known problem with compile-time degradation (PR 39544). Tests are built on top of a recently implemented CHECK-COUNT-<num> FileCheck directives. Reviewed By: chandlerc, mkazantsev Differential Revision: https://reviews.llvm.org/D54223 llvm-svn: 347097
* GlobalDCE: Teach isEmptyFunction() to ignore debug intrinsics.Adrian Prantl2018-11-161-5/+10
| | | | | | | This fixes PR39669. https://bugs.llvm.org/show_bug.cgi?id=39669 llvm-svn: 347065
* [ThinLTO] Internalize readonly globalsEugene Leviant2018-11-162-6/+54
| | | | | | | | An attempt to recommit r346584 after failure on OSX build bot. Fixed cache key computation in ThinLTOCodeGenerator and added test case llvm-svn: 347033
* [LTO] Load sample profile in LTO link step.Xin Tong2018-11-151-0/+6
| | | | | | | | | | | | | | Summary: Load sample profile in LTO link step. ThinLTO calls populateModulePassManager to load the profile Reviewers: tejohnson, davidxl, danielcdh Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D54564 llvm-svn: 346971
* [InstCombine] fix rotate narrowing bug for non-pow-2 typesSanjay Patel2018-11-151-2/+7
| | | | llvm-svn: 346968
* [InstCombine] Remove a couple of asserts based on incorrect assumptionsMandeep Singh Grang2018-11-141-11/+4
| | | | | | | | | | | | | | | | Summary: These asserts are based on the assumption that the order of true/false operands in a select and those in the compare would always be the same. This fixes PR39595. Reviewers: craig.topper, spatel, dmgreen Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54359 llvm-svn: 346874
* [InstCombine] fix formatting for matchBSwap(); NFCSanjay Patel2018-11-142-7/+9
| | | | | | | We should have a similar function for matching rotate and/or funnel shift, so tidy up the related existing call. llvm-svn: 346871
* [VPlan, SLP] Use SmallPtrSet for Candidates.Florian Hahn2018-11-142-27/+26
| | | | | | This slightly improves the candidate handling in getBest(). llvm-svn: 346870
* [VPlan] Remove LLVM_DEBUG from VPlanSlp::dumpBundle.Florian Hahn2018-11-141-4/+4
| | | | | | The caller should take care of only calling it with debug enabled. llvm-svn: 346860
* [VPlan] Update ifdef.Florian Hahn2018-11-141-2/+2
| | | | llvm-svn: 346858
* [VPlan, SLP] Add simple SLP analysis on top of VPlan.Florian Hahn2018-11-145-1/+623
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds an initial implementation of the look-ahead SLP tree construction described in 'Look-Ahead SLP: Auto-vectorization in the Presence of Commutative Operations, CGO 2018 by Vasileios Porpodas, Rodrigo C. O. Rocha, Luís F. W. Góes'. It returns an SLP tree represented as VPInstructions, with combined instructions represented as a single, wider VPInstruction. This initial version does not support instructions with multiple different users (either inside or outside the SLP tree) or non-instruction operands; it won't generate any shuffles or insertelement instructions. It also just adds the analysis that builds an SLP tree rooted in a set of stores. It does not include any cost modeling or memory legality checks. The plan is to integrate it with VPlan based cost modeling, once available and to only apply it to operations that can be widened. A follow-up patch will add a support for replacing instructions in a VPlan with their SLP counter parts. Reviewers: Ayal, mssimpso, rengolin, mkuper, hfinkel, hsaito, dcaballe, vporpo, RKSimon, ABataev Reviewed By: rengolin Differential Revision: https://reviews.llvm.org/D4949 llvm-svn: 346857
* Recommit r346483: [CallSiteSplitting] Only record conditions up to the ↵Florian Hahn2018-11-141-13/+26
| | | | | | | | | IDom(call site). The underlying problem causing the expensive-check failure was fixed in rL346769. llvm-svn: 346843
* Revert r346810 "Preserve loop metadata when splitting exit blocks"Reid Kleckner2018-11-141-32/+0
| | | | | | | It broke the Windows self-host: http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/1457 llvm-svn: 346823
* [InstCombine] fold funnel shift amount based on demanded bitsSanjay Patel2018-11-131-0/+14
| | | | | | | | | | | | | | The shift amount of a funnel shift is modulo the scalar bitwidth: http://llvm.org/docs/LangRef.html#llvm-fshl-intrinsic ...so we can use demanded bits analysis on that operand to simplify it when we have a power-of-2 bitwidth. This is another step towards canonicalizing {shift/shift/or} to the intrinsics in IR. Differential Revision: https://reviews.llvm.org/D54478 llvm-svn: 346814
* Preserve loop metadata when splitting exit blocksCraig Topper2018-11-131-0/+32
| | | | | | | | | | LoopUtils.cpp contains a utility that splits an loop exit block, so that the new block contains only edges coming from the loop. In the case of nested loops, the exit path for the inner loop might also be the back-edge of the outer loop. The new block which is inserted on this path, is now a latch for the outer loop, and it needs to hold the loop metadata for the outer loop. (The test case gives a more concrete view of the situation.) Patch by Chang Lin (clin1) Differential Revision: https://reviews.llvm.org/D53876 llvm-svn: 346810
* [InstCombine] canonicalize rotate patterns with cmp/selectSanjay Patel2018-11-131-0/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The cmp+branch variant of this pattern is shown in: https://bugs.llvm.org/show_bug.cgi?id=34924 ...and as discussed there, we probably can't transform that without a rotate intrinsic. We do have that now via funnel shift, but we're not quite ready to canonicalize IR to that form yet. The case with 'select' should already be transformed though, so that's this patch. The sequence with negation followed by masking is what we use in the backend and partly in clang (though that part should be updated). https://rise4fun.com/Alive/TplC %cmp = icmp eq i32 %shamt, 0 %sub = sub i32 32, %shamt %shr = lshr i32 %x, %shamt %shl = shl i32 %x, %sub %or = or i32 %shr, %shl %r = select i1 %cmp, i32 %x, i32 %or => %neg = sub i32 0, %shamt %masked = and i32 %shamt, 31 %maskedneg = and i32 %neg, 31 %shl2 = lshr i32 %x, %masked %shr2 = shl i32 %x, %maskedneg %r = or i32 %shl2, %shr2 llvm-svn: 346807
* [CSP, Cloning] Update DuplicateInstructionsInSplitBetween to use DomTreeUpdater.Florian Hahn2018-11-133-39/+48
| | | | | | | | | | | | | | | | | | | | | This patch updates DuplicateInstructionsInSplitBetween to update a DTU instead of applying updates to the DT directly. Given that there only are 2 users, also updated them in this patch to avoid churn. I slightly moved the code in CallSiteSplitting around to reduce the places where we have to pass in DTU. If necessary, I could split those changes in a separate patch. This fixes missing DT updates when dealing with musttail calls in CallSiteSplitting, by using DTU->deleteBB. Reviewers: junbuml, kuhar, NutshellySima, indutny, brzycki Reviewed By: NutshellySima llvm-svn: 346769
* Revert "[ThinLTO] Internalize readonly globals"Steven Wu2018-11-132-59/+7
| | | | | | This reverts commit 10c84a8f35cae4a9fc421648d9608fccda3925f2. llvm-svn: 346768
* [VPlan] VPlan version of InterleavedAccessInfo.Florian Hahn2018-11-134-9/+102
| | | | | | | | | | | | | | | | | This patch turns InterleaveGroup into a template with the instruction type being a template parameter. It also adds a VPInterleavedAccessInfo class, which only contains a mapping from VPInstructions to their respective InterleaveGroup. As we do not have access to scalar evolution in VPlan, we can re-use convert InterleavedAccessInfo to VPInterleavedAccess info. Reviewers: Ayal, mssimpso, hfinkel, dcaballe, rengolin, mkuper, hsaito Reviewed By: rengolin Differential Revision: https://reviews.llvm.org/D49489 llvm-svn: 346758
* Introduce DebugCounter into ConstProp passZhizhou Yang2018-11-131-26/+43
| | | | | | | | | | | | | | | | | | | Summary: This patch introduces DebugCounter into ConstProp pass at per-transformation level. It will provide an option to skip first n or stop after n transformations for the whole ConstProp pass. This will make debug easier for the pass, also providing chance to do transformation level bisecting. Reviewers: davide, fhahn Reviewed By: fhahn Subscribers: llozano, george.burgess.iv, llvm-commits Differential Revision: https://reviews.llvm.org/D50094 llvm-svn: 346720
* [InstCombine] narrow width of rotate patterns, part 3Sanjay Patel2018-11-121-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a longer variant for the pattern handled in rL346713 This one includes zexts. Eventually, we should canonicalize all rotate patterns to the funnel shift intrinsics, but we need a bit more infrastructure to make sure the vectorizers handle those intrinsics as well as the shift+logic ops. https://rise4fun.com/Alive/FMn Name: narrow rotateright %neg = sub i8 0, %shamt %rshamt = and i8 %shamt, 7 %rshamtconv = zext i8 %rshamt to i32 %lshamt = and i8 %neg, 7 %lshamtconv = zext i8 %lshamt to i32 %conv = zext i8 %x to i32 %shr = lshr i32 %conv, %rshamtconv %shl = shl i32 %conv, %lshamtconv %or = or i32 %shl, %shr %r = trunc i32 %or to i8 => %maskedShAmt2 = and i8 %shamt, 7 %negShAmt2 = sub i8 0, %shamt %maskedNegShAmt2 = and i8 %negShAmt2, 7 %shl2 = lshr i8 %x, %maskedShAmt2 %shr2 = shl i8 %x, %maskedNegShAmt2 %r = or i8 %shl2, %shr2 llvm-svn: 346716
* [InstCombine] narrow width of rotate patterns, part 2 (PR39624)Sanjay Patel2018-11-121-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The sub-pattern for the shift amount in a rotate can take on several different forms, and there's apparently no way to canonicalize those without seeing the entire rotate sequence. This is the form noted in: https://bugs.llvm.org/show_bug.cgi?id=39624 https://rise4fun.com/Alive/qnT %zx = zext i8 %x to i32 %maskedShAmt = and i32 %shAmt, 7 %shl = shl i32 %zx, %maskedShAmt %negShAmt = sub i32 0, %shAmt %maskedNegShAmt = and i32 %negShAmt, 7 %shr = lshr i32 %zx, %maskedNegShAmt %rot = or i32 %shl, %shr %r = trunc i32 %rot to i8 => %truncShAmt = trunc i32 %shAmt to i8 %maskedShAmt2 = and i8 %truncShAmt, 7 %shl2 = shl i8 %x, %maskedShAmt2 %negShAmt2 = sub i8 0, %truncShAmt %maskedNegShAmt2 = and i8 %negShAmt2, 7 %shr2 = lshr i8 %x, %maskedNegShAmt2 %r = or i8 %shl2, %shr2 llvm-svn: 346713
* [InstCombine] refactor code for matching shift amount of a rotate; NFCSanjay Patel2018-11-121-12/+17
| | | | | | | | As shown in existing test cases and with: https://bugs.llvm.org/show_bug.cgi?id=39624 ...we're missing at least 2 more patterns for rotate narrowing. llvm-svn: 346711
* [GC][InstCombine] Fix a potential iteration issuePhilip Reames2018-11-121-1/+4
| | | | | | Noticed via inspection. Appears to be largely innocious in practice, but slight code change could have resulted in either visit order dependent missed optimizations or infinite loops. May be a minor compile time problem today. llvm-svn: 346698
OpenPOWER on IntegriCloud