summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [LICM & MemorySSA] Make limit flags pass tuning options.Alina Sbirlea2019-04-193-40/+55
| | | | | | | | | | | | | | Summary: Make the flags in LICM + MemorySSA tuning options in the old and new pass managers. Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60490 llvm-svn: 358772
* [NewPassManager] Adding pass tuning options: loop vectorize.Alina Sbirlea2019-04-192-5/+14
| | | | | | | | | | | | | | | | Summary: Trying to add the plumbing necessary to add tuning options to the new pass manager. Testing with the flags for loop vectorize. Reviewers: chandlerc Subscribers: sanjoy, mehdi_amini, jlebar, steven_wu, dexonsmith, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59723 llvm-svn: 358763
* [MergeFunc] Delete unused FunctionNode::release()Fangrui Song2019-04-191-2/+0
| | | | llvm-svn: 358742
* [MergeFunc] removeUsers: call remove() only on direct usersFangrui Song2019-04-191-21/+3
| | | | | | | | | | | | | | | removeUsers uses a work list to collect indirect users and call remove() on those functions. However it has a bug (`if (!Visited.insert(UU).second)`). Actually, we don't have to collect indirect users. After the merge of F and G, G's callers will be considered (added to Deferred). If G's callers can be merged, G's callers' callers will be considered. Update the test unnamed-addr-reprocessing.ll to make it clear we can still merge indirect callers. llvm-svn: 358741
* [CallSite removal] Move the legacy PM, call graph, and some inlinerChandler Carruth2019-04-195-17/+19
| | | | | | | | | | | | code to `CallBase`. This patch focuses on the legacy PM, call graph, and some of inliner and legacy passes interacting with those APIs from `CallSite` to the new `CallBase` class. No interesting changes. Differential Revision: https://reviews.llvm.org/D60412 llvm-svn: 358739
* [MergeFunc] Use less_first() as the comparator of Schwartzian transformFangrui Song2019-04-191-6/+1
| | | | llvm-svn: 358738
* MergeFunc: preserve COMDAT information when creating a thunkSaleem Abdulrasool2019-04-191-0/+1
| | | | | | | | | | We would previously drop the COMDAT on the thunk we generated when replacing a function body with the forwarding thunk. This would result in a function that may have been multiply emitted and multiply merged to be emitted with the same name without the COMDAT. This is a hard error with PE/COFF where the COMDAT is used for the deduplication of Value Witness functions for Swift. llvm-svn: 358728
* [LoopUnroll] Move list of params into a struct [NFCI].Alina Sbirlea2019-04-183-66/+70
| | | | | | | | | | | | | | Summary: Cleanup suggested in review of r358304. Reviewers: sanjoy, efriedma Subscribers: jlebar, zzheng, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60638 llvm-svn: 358723
* [GuardWidening] Wire up a NPM version of the LoopGuardWidening passPhilip Reames2019-04-181-0/+25
| | | | llvm-svn: 358704
* [BlockExtractor] Extend the file format to support the grouping of basic blocksQuentin Colombet2019-04-181-28/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | Prior to this patch, each basic block listed in the extrack-blocks-file would be extracted to a different function. This patch adds the support for comma separated list of basic blocks to form group. When the region formed by a group is not extractable, e.g., not single entry, all the blocks of that group are left untouched. Let us see this new format in action (comments are not part of the file format): ;; funcName bbName[,bbName...] foo bb1 ;; Extract bb1 in its own function foo bb2,bb3 ;; Extract bb2,bb3 in their own function bar bb1,bb4 ;; Extract bb1,bb4 in their own function bar bb2 ;; Extract bb2 in its own function Assuming all regions are extractable, this will create one function and thus one call per region. Differential Revision: https://reviews.llvm.org/D60746 llvm-svn: 358701
* [LoopPred] Fix a blatantly obvious bug in r358684Philip Reames2019-04-181-1/+1
| | | | | | The bug is that I didn't check whether the operand of the invariant_loads were themselves invariant. I don't know how this got missed in the patch and review. I even had an unreduced test case locally, and I remember handling this case, but I must have lost it in one of the rebases. Oops. llvm-svn: 358688
* [LoopPredication] Allow predication of loop invariant computations (within ↵Philip Reames2019-04-181-15/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the loop) The purpose of this patch is to eliminate a pass ordering dependence between LoopPredication and LICM. To understand the purpose, consider the following snippet of code inside some loop 'L' with IV 'i' A = _a.length; guard (i < A) a = _a[i] B = _b.length; guard (i < B); b = _b[i]; ... Z = _z.length; guard (i < Z) z = _z[i] accum += a + b + ... + z; Today, we need LICM to hoist the length loads, LoopPredication to make the guards loop invariant, and TrivialUnswitch to eliminate the loop invariant guard to establish must execute for the next length load. Today, if we can't prove speculation safety, we'd have to iterate these three passes 26 times to reduce this example down to the minimal form. Using the fact that the array lengths are known to be invariant, we can short circuit this iteration. By forming the loop invariant form of all the guards at once, we remove the need for LoopPredication from the iterative cycle. At the moment, we'd still have to iterate LICM and TrivialUnswitch; we'll leave that part for later. As a secondary benefit, this allows LoopPred to expose peeling oppurtunities in a much more obvious manner. See the udiv test changes as an example. If the udiv was not hoistable (i.e. we couldn't prove speculation safety) this would be an example where peeling becomes obviously profitable whereas it wasn't before. A couple of subtleties in the implementation: - SCEV's isSafeToExpand guarantees speculation safety (i.e. let's us expand at a new point). It is not a precondition for expansion if we know the SCEV corresponds to a Value which dominates the requested expansion point. - SCEV's isLoopInvariant returns true for expressions which compute the same value across all iterations executed, regardless of where the original Value is located. (i.e. it can be in the loop) This implies we have a speculation burden to prove before expanding them outside loops. - invariant_loads and AA->pointsToConstantMemory are two cases that SCEV currently does not handle, but meets the SCEV definition of invariance. I plan to sink this part into SCEV once this has baked for a bit. Differential Revision: https://reviews.llvm.org/D60093 llvm-svn: 358684
* Elaborate why we have an option on by default for enabling chr.Eric Christopher2019-04-181-0/+2
| | | | llvm-svn: 358641
* Fix bad compare function over FusionCandidate.Richard Trieu2019-04-181-6/+8
| | | | | | | Reverse the checking of the domiance order so that when a self compare happens, it returns false. This makes compare function have strict weak ordering. llvm-svn: 358636
* Fix formatting. NFCAkira Hatanaka2019-04-171-90/+88
| | | | llvm-svn: 358623
* Test commit by Denis BakhvalovDenis Bakhvalov2019-04-171-1/+1
| | | | | Change-Id: I4d85123a157d957434902fb14ba50926b2d56212 llvm-svn: 358619
* Add basic loop fusion pass.Kit Barton2019-04-173-0/+1217
| | | | | | | | | | | | | | | | | | | | This patch adds a basic loop fusion pass. It will fuse loops that conform to the following 4 conditions: 1. Adjacent (no code between them) 2. Control flow equivalent (if one loop executes, the other loop executes) 3. Identical bounds (both loops iterate the same number of iterations) 4. No negative distance dependencies between the loop bodies. The pass does not make any changes to the IR to create opportunities for fusion. Instead, it checks if the necessary conditions are met and if so it fuses two loops together. The pass has not been added to the pass pipeline yet, and thus is not enabled by default. It can be run stand alone using the -loop-fusion option. Differential Revision: https://reviews.llvm.org/D55851 llvm-svn: 358607
* [InstCombine] Factor out unreachable inst idiom creation [NFC]Philip Reames2019-04-173-13/+15
| | | | | | | | In InstCombine, we use an idiom of "store i1 true, i1 undef" to indicate we've found a path which we've proven unreachable. We can't actually insert the unreachable instruction since that would require changing the CFG. We leave that to simplifycfg later. This just factors out that idiom creation so we don't duplicate the same mostly undocument idiom creation in multiple places. llvm-svn: 358600
* [LoopUnroll] Allow unrolling if the unrolled size does not exceed loop size.Florian Hahn2019-04-171-2/+13
| | | | | | | | | | | | | | | | | | | | Summary: In the following cases, unrolling can be beneficial, even when optimizing for code size: 1) very low trip counts 2) potential to constant fold most instructions after fully unrolling. We can unroll in those cases, by setting the unrolling threshold to the loop size. This might highlight some cost modeling issues and fixing them will have a positive impact in general. Reviewers: vsk, efriedma, dmgreen, paquette Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D60265 llvm-svn: 358586
* [CVP] processOverflowIntrinsic(): don't crash if constant-holding happenedRoman Lebedev2019-04-171-4/+7
| | | | | | | As reported by Mikael Holmén in post-commit review in https://reviews.llvm.org/D60791#1469765 llvm-svn: 358559
* Revert "Add basic loop fusion pass." Per request.Eric Christopher2019-04-173-1214/+0
| | | | | | This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358553
* Revert "Temporarily Revert "Add basic loop fusion pass.""Eric Christopher2019-04-173-0/+1214
| | | | | | | | The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552
* Remove the run-slp-after-loop-vectorization option.Eric Christopher2019-04-171-12/+3
| | | | | | | It's been on by default for 4 years and cleans up the pass hierarchy. llvm-svn: 358548
* Temporarily Revert "Add basic loop fusion pass."Eric Christopher2019-04-173-1214/+0
| | | | | | | | As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546
* Add basic loop fusion pass.Kit Barton2019-04-173-0/+1214
| | | | | | | | | | | | | | | | | | | This patch adds a basic loop fusion pass. It will fuse loops that conform to the following 4 conditions: 1. Adjacent (no code between them) 2. Control flow equivalent (if one loop executes, the other loop executes) 3. Identical bounds (both loops iterate the same number of iterations) 4. No negative distance dependencies between the loop bodies. The pass does not make any changes to the IR to create opportunities for fusion. Instead, it checks if the necessary conditions are met and if so it fuses two loops together. The pass has not been added to the pass pipeline yet, and thus is not enabled by default. It can be run stand alone using the -loop-fusion option. Phabricator: https://reviews.llvm.org/D55851 llvm-svn: 358543
* Fix a typo in comments. [NFC]Ali Tamur2019-04-161-1/+1
| | | | llvm-svn: 358531
* [EarlyCSE] detect equivalence of selects with inverse conditions and ↵Sanjay Patel2019-04-161-2/+59
| | | | | | | | | | | | | | | | | | commuted operands (PR41101) This is 1 of the problems discussed in the post-commit thread for: rL355741 / http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190311/635516.html and filed as: https://bugs.llvm.org/show_bug.cgi?id=41101 Instcombine tries to canonicalize some of these cases (and there's room for improvement there independently of this patch), but it can't always do that because of extra uses. So we need to recognize these commuted operand patterns here in EarlyCSE. This is similar to how we detect commuted compares and commuted min/max/abs. Differential Revision: https://reviews.llvm.org/D60723 llvm-svn: 358523
* [CVP] Simplify umulo and smulo that cannot overflowNikita Popov2019-04-161-6/+1
| | | | | | | | | | | If a umul.with.overflow or smul.with.overflow operation cannot overflow, simplify it to a simple mul nuw / mul nsw. After the refactoring in D60668 this is just a matter of removing an explicit check against multiplications. Differential Revision: https://reviews.llvm.org/D60791 llvm-svn: 358521
* [SLP] Refactoring of the operand reordering code.Simon Pilgrim2019-04-161-171/+463
| | | | | | | | | | | | | | This is a refactoring patch which should have all the functionality of the current code. Its goal is twofold: i. Cleanup and simplify the reordering code, and ii. Generalize reordering so that it will work for an arbitrary number of operands, not just 2. This is the second patch in a series of patches that will enable operand reordering across chains of operations. An example of this was presented in EuroLLVM'18 https://www.youtube.com/watch?v=gIEn34LvyNo . Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D59973 llvm-svn: 358519
* [InstCombine] Prune fshl/fshr with masked operandsNikita Popov2019-04-161-0/+4
| | | | | | | | | | | | | | | | If a constant shift amount is used, then only some of the LHS/RHS operand bits are demanded and we may be able to simplify based on that. InstCombineSimplifyDemanded already had the necessary support for that, we just weren't calling it with fshl/fshr as root. In particular, this allows us to relax some masked funnel shifts into simple shifts, as shown in the tests. Patch by Shawn Landden. Differential Revision: https://reviews.llvm.org/D60660 llvm-svn: 358515
* [IR] Add WithOverflowInst classNikita Popov2019-04-164-161/+68
| | | | | | | | | | | | | | This adds a WithOverflowInst class with a few helper methods to get the underlying binop, signedness and nowrap type and makes use of it where sensible. There will be two more uses in D60650/D60656. The refactorings are all NFC, though I left some TODOs where things could be improved. In particular we have two places where add/sub are handled but mul isn't. Differential Revision: https://reviews.llvm.org/D60668 llvm-svn: 358512
* Re-commit r357452: SimplifyCFG SinkCommonCodeFromPredecessors: Also sink ↵Hans Wennborg2019-04-161-14/+15
| | | | | | | | | | | | | | | | | | | function calls without used results (PR41259) The original commit caused false positives from AddressSanitizer's use-after-scope checks, which have now been fixed in r358478. > The code was previously checking that candidates for sinking had exactly > one use or were a store instruction (which can't have uses). This meant > we could sink call instructions only if they had a use. > > That limitation seemed a bit arbitrary, so this patch changes it to > "instruction has zero or one use" which seems more natural and removes > the need to special-case stores. > > Differential revision: https://reviews.llvm.org/D59936 llvm-svn: 358483
* Asan use-after-scope: don't poison allocas if there were untraced lifetime ↵Hans Wennborg2019-04-161-1/+14
| | | | | | | | | | | | | | intrinsics in the function (PR41481) If there are any intrinsics that cannot be traced back to an alloca, we might have missed the start of a variable's scope, leading to false error reports if the variable is poisoned at function entry. Instead, if there are some intrinsics that can't be traced, fail safe and don't poison the variables in that function. Differential revision: https://reviews.llvm.org/D60686 llvm-svn: 358478
* [CodeExtractor] Add a few debug lines to understand why a region is not ↵Quentin Colombet2019-04-161-3/+8
| | | | | | | | | | | | | | | | | | extracted The CodeExtractor is not smart enough to compute which basic block is the entry of a region. Instead it relies on the order of the list of basic blocks that is handed to it and assumes that the entry is the first block in the list. Without the additional debug information, it is hard to understand why a valid region does not get extracted, because we would miss that the order of in the list just doesn't match what the CodeExtractor wants. NFC llvm-svn: 358471
* [LSR] Rewrite misses some fixup locations if it splits critical edgeQuentin Colombet2019-04-151-1/+42
| | | | | | | | | | | | | | | If LSR split critical edge during rewriting phi operands and phi node has other pending fixup operands, we need to update those pending fixups. Otherwise formulae will not be implemented completely and some instructions will not be eliminated. llvm.org/PR41445 Differential Revision: https://reviews.llvm.org/D60645 Patch by: Denis Bakhvalov <denis.bakhvalov@intel.com> llvm-svn: 358457
* [LoopPred] Stop passing around builders [NFC]Philip Reames2019-04-151-31/+49
| | | | | | | | | | | | This is a preparatory patch for D60093. This patch itself is NFC, but while preparing this I noticed and committed a small hoisting change in rL358419. The basic structure of the new scheme is that we pass around the guard ("the using instruction"), and select an optimal insert point by examining operands at each construction point. This seems conceptually a bit cleaner to start with as it isolates the knowledge about insertion safety at the actual insertion point. Note that the non-hoisting path is not actually used at the moment. That's not exercised until D60093 is rebased on this one. Differential Revision: https://reviews.llvm.org/D60718 llvm-svn: 358434
* [DEBUGINFO] Prevent Instcombine from dropping debuginfo when removing zextsWolfgang Pieb2019-04-151-5/+4
| | | | | | | | | | | Zexts can be treated like no-op casts when it comes to assessing whether their removal affects debug info. Reviewer: aprantl Differential Revision: https://reviews.llvm.org/D60641 llvm-svn: 358431
* [PGO] Profile guided code size optimization.Hiroshi Yamauchi2019-04-1511-38/+169
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Enable some of the existing size optimizations for cold code under PGO. A ~5% code size saving in big internal app under PGO. The way it gets BFI/PSI is discussed in the RFC thread http://lists.llvm.org/pipermail/llvm-dev/2019-March/130894.html Note it doesn't currently touch loop passes. Reviewers: davidxl, eraman Reviewed By: eraman Subscribers: mgorny, javed.absar, smeenai, mehdi_amini, eraman, zzheng, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59514 llvm-svn: 358422
* [LoopPred] Hoist and of predicated checks where legalPhilip Reames2019-04-151-2/+16
| | | | | | If we have multiple range checks which can be predicated, hoist the and of the results outside the loop. This minorly cleans up the resulting IR, but the main motivation is as a building block for D60093. llvm-svn: 358419
* [InstCombine] canonicalize fdiv after fmul if reassociation is allowedSanjay Patel2019-04-151-0/+8
| | | | | | | | (X / Y) * Z --> (X * Z) / Y This can allow other optimizations/reassociations as shown in the test diffs. llvm-svn: 358404
* [Transforms][ASan] Move findAllocaForValue() to Utils/Local.cpp. NFCAlexander Potapenko2019-04-152-39/+41
| | | | | | | | | | | | | | | | Summary: Factor out findAllocaForValue() from ASan so that we can use it in MSan to handle lifetime intrinsics. Reviewers: eugenis, pcc Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60615 llvm-svn: 358380
* [Mem2Reg] Delete unused PointerAllocaValuesFangrui Song2019-04-141-5/+0
| | | | | | It is unused after AliasSetTracker support was removed. llvm-svn: 358352
* [Mem2Reg] Simplify and micro optimizeFangrui Song2019-04-141-13/+9
| | | | | | | | * Rearrange continu/break * BBNumbers.lookup(A) -> BBNumbers.find(A)->second BBNumbers has been computed, thus we can assume the value exists in the predicate. llvm-svn: 358351
* [Mem2Reg] Don't call LBI.deleteValue on AllocInst/DbgVariableIntrinsicFangrui Song2019-04-141-6/+1
| | | | | | Only StoreInst/LoadInst are assigned numbers. Other types of instructions are not in LBI. llvm-svn: 358350
* [Mem2Reg] Simplify rewriteSingleStoreAllocaFangrui Song2019-04-141-5/+2
| | | | llvm-svn: 358349
* [InstCombine] Remove redundant/bogus mul_with_overflow combinesNikita Popov2019-04-131-8/+0
| | | | | | | | | | | | As pointed out in D60518 folding mulo(%x, undef) to {undef, undef} isn't correct. As a correct version of this already exists in InstructionSimplify (https://github.com/llvm-mirror/llvm/blob/bd8056ef326e075cc500f3f0cfcd1193bc200594/lib/Analysis/InstructionSimplify.cpp#L4750-L4757) this is just dead code though. Drop it together with the mul(%x, 0) -> {0, false} fold that is also already handled by InstSimplify. Differential Revision: https://reviews.llvm.org/D60649 llvm-svn: 358339
* [Mem2Reg] Delete unused AllocaPointerValFangrui Song2019-04-131-4/+0
| | | | | | It is no longer used after the AliasSetTracker updating logic was removed. llvm-svn: 358334
* [InstCombine] Canonicalize (-X srem Y) to -(X srem Y).Chen Zheng2019-04-131-0/+5
| | | | | | Differential Revision: https://reviews.llvm.org/D60647 llvm-svn: 358328
* [SCEV] Add option to forget everything in SCEV.Alina Sbirlea2019-04-125-36/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Create a method to forget everything in SCEV. Add a cl::opt and PassManagerBuilder option to use this in LoopUnroll. Motivation: Certain Halide applications spend a very long time compiling in forgetLoop, and prefer to forget everything and rebuild SCEV from scratch. Sample difference in compile time reduction: 21.04 to 14.78 using current ToT release build. Testcase showcasing this cannot be opensourced and is fairly large. The option disabled by default, but it may be desirable to enable by default. Evidence in favor (two difference runs on different days/ToT state): File Before (s) After (s) clang-9.bc 7267.91 6639.14 llvm-as.bc 194.12 194.12 llvm-dis.bc 62.50 62.50 opt.bc 1855.85 1857.53 File Before (s) After (s) clang-9.bc 8588.70 7812.83 llvm-as.bc 196.20 194.78 llvm-dis.bc 61.55 61.97 opt.bc 1739.78 1886.26 Reviewers: sanjoy Subscribers: mehdi_amini, jlebar, zzheng, javed.absar, dmgreen, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60144 llvm-svn: 358304
* [InstCombine] Fix a nasty miscompile introduced w/masked.gather demanded eltsPhilip Reames2019-04-121-1/+5
| | | | | | | | This fixes a miscompile which was introduced in r356510 (https://reviews.llvm.org/D57372). The problem is that the original patch removed pointer operands where the load results we're demanded, but without considering the legality of the load itself. If the masked.gather had active, but undemanded, lanes, then we could end up creating a load which loaded from an undef address. The result could be a segfault, or, in theory, an arbitrary read from a random memory location into an used register. llvm-svn: 358299
OpenPOWER on IntegriCloud