summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Scalar
Commit message (Collapse)AuthorAgeFilesLines
...
* [RS4GC] Use "deopt" operand bundlesSanjoy Das2015-10-161-36/+175
| | | | | | | | | | | | | | | | | | | | | | Summary: This is a step towards using operand bundles to carry deopt state till RewriteStatepointsForGC. The change adds a flag to RewriteStatepointsForGC that teaches it to pick up deopt state from a `"deopt"` operand bundle attached to the `call` or `invoke` it is wrapping. The command line flag added, `-rs4gc-use-deopt-bundles`, will only exist for a short while. Once we are able to pipe deopt bundle state through the full optimization pipeline without problems, we will "constant fold" `-rs4gc-use-deopt-bundles` to `true`. Reviewers: swaroop.sridhar, reames Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D13372 llvm-svn: 250489
* [IndVars] Rename getExtend; NFCSanjoy Das2015-10-161-17/+17
| | | | | | | Rename `IndVarSimplify::getExtend` to `IndVarSimplify::createExtendInst` to make it obvious that it creates `llvm::Instruction` s. llvm-svn: 250484
* [IndVars] Have `cloneArithmeticIVUser` guess betterSanjoy Das2015-10-161-12/+74
| | | | | | | | | | | | | | | | | | | | | Summary: `cloneArithmeticIVUser` currently trips over expression like `add %iv, -1` when `%iv` is being zero extended -- it tries to construct the widened use as `add %iv.zext, zext(-1)` and (correctly) fails to prove equivalence to `zext(add %iv, -1)` (here the SCEV for `%iv` is `{1,+,1}`). This change teaches `IndVars` to try sign extending the non-IV operand if that makes the newly constructed IV use equivalent to the widened narrow IV use. Reviewers: atrick, hfinkel, reames Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13717 llvm-svn: 250483
* [IndVars] Extract out a few local variables; NFCSanjoy Das2015-10-161-24/+32
| | | | llvm-svn: 250482
* [IndVars] Split `WidenIV::cloneIVUser`; NFCSanjoy Das2015-10-161-25/+71
| | | | | | | | | | | | | | | Summary: This NFC splitting is intended to make a later diff easier to follow. It just tail duplicates `cloneIVUser` into `cloneArithmeticIVUser` and `cloneBitwiseIVUser`. Reviewers: atrick, hfinkel, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13716 llvm-svn: 250481
* [ScalarOpts] Remove dead code.Benjamin Kramer2015-10-155-53/+9
| | | | | | Does not touch debug dumpers. NFC. llvm-svn: 250417
* Recommit r250345, it was reverted in r250366 to investigate a bot failure.Manman Ren2015-10-151-5/+116
| | | | | | Our internal bot is still red after r250366. llvm-svn: 250415
* Temporarily revert r250345 to sort out bot failure.Manman Ren2015-10-151-116/+5
| | | | | | | | | | | | | | With r250345 and r250343, we start to observe the following failure when bootstrap clang with lto and pgo: PHI node entries do not match predecessors! %.sroa.029.3.i = phi %"class.llvm::SDNode.13298"* [ null, %30953 ], [ null, %31017 ], [ null, %30998 ], [ null, %_ZN4llvm8dyn_castINS_14ConstantSDNodeENS_7SDValueEEENS_10cast_rettyIT_T0_E8ret_typeERS5_.exit.i.1804 ], [ null, %30975 ], [ null, %30991 ], [ null, %_ZNK4llvm3EVT13getScalarTypeEv.exit.i.1812 ], [ %..sroa.029.0.i, %_ZN4llvm11SmallVectorIiLj8EED1Ev.exit.i.1826 ], !dbg !451895 label %30998 label %_ZNK4llvm3EVTeqES0_.exit19.thread.i LLVM ERROR: Broken function found, compilation aborted! I will re-commit this if the bot does not recover. llvm-svn: 250366
* Update the branch weight metadata in JumpThreading pass.Cong Hou2015-10-141-5/+116
| | | | | | | | | | Currently in JumpThreading pass, the branch weight metadata is not updated after CFG modification. Consider the jump threading on PredBB, BB, and SuccBB. After jump threading, the weight on BB->SuccBB should be adjusted as some of it is contributed by the edge PredBB->BB, which doesn't exist anymore. This patch tries to update the edge weight in metadata on BB->SuccBB by scaling it by 1 - Freq(PredBB->BB) / Freq(BB->SuccBB). This is the third attempt to submit this patch, while the first two led to failures in some FDO tests. After investigation, it is the edge weight normalization that caused those failures. In this patch the edge weight normalization is fixed so that there is no zero weight in the output and the sum of all weights can fit in 32-bit integer. Several unit tests are added. Differential revision: http://reviews.llvm.org/D10979 llvm-svn: 250345
* [LoopUnswitch] Correct misleading comments.Chen Li2015-10-141-2/+1
| | | | | | | | | | Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13738 llvm-svn: 250317
* Revert r250204 and r250240 due to bot failure. We failed to build PGO-ed clang.Manman Ren2015-10-141-118/+5
| | | | llvm-svn: 250264
* Typo.Chad Rosier2015-10-131-1/+1
| | | | llvm-svn: 250224
* Scalar: Remove remaining ilist iterator implicit conversionsDuncan P. N. Exon Smith2015-10-1327-240/+240
| | | | | | | | | | | | | | | | | | | Remove remaining `ilist_iterator` implicit conversions from LLVMScalarOpts. This change exposed some scary behaviour in lib/Transforms/Scalar/SCCP.cpp around line 1770. This patch changes a call from `Function::begin()` to `&Function::front()`, since the return was immediately being passed into another function that takes a `Function*`. `Function::front()` started to assert, since the function was empty. Note that `Function::end()` does not point at a legal `Function*` -- it points at an `ilist_half_node` -- so the other function was getting garbage before. (I added the missing check for `Function::isDeclaration()`.) Otherwise, no functionality change intended. llvm-svn: 250211
* Update the branch weight metadata in JumpThreading pass.Cong Hou2015-10-131-5/+118
| | | | | | | | Currently in JumpThreading pass, the branch weight metadata is not updated after CFG modification. Consider the jump threading on PredBB, BB, and SuccBB. After jump threading, the weight on BB->SuccBB should be adjusted as some of it is contributed by the edge PredBB->BB, which doesn't exist anymore. This patch tries to update the edge weight in metadata on BB->SuccBB by scaling it by 1 - Freq(PredBB->BB) / Freq(BB->SuccBB). Differential revision: http://reviews.llvm.org/D10979 llvm-svn: 250204
* Scalar: Remove some implicit ilist iterator conversions, NFCDuncan P. N. Exon Smith2015-10-1310-61/+62
| | | | | | | Remove some of the implicit ilist iterator conversions in LLVMScalarOpts. More to go. llvm-svn: 250197
* [IndVars] NFC Cleanup.Sanjoy Das2015-10-131-66/+62
| | | | | | | | - Rename methods according to the LLVM Coding Style - Merge adjacent anonymous namespace block - Use `auto` in two places llvm-svn: 250152
* Revert 250089 due to bot failure. It failed when building clang itself with PGO.Manman Ren2015-10-131-118/+5
| | | | llvm-svn: 250145
* Update the branch weight metadata in JumpThreading pass.Cong Hou2015-10-121-5/+118
| | | | | | | | In JumpThreading pass, the branch weight metadata is not updated after CFG modification. Consider the jump threading on PredBB, BB, and SuccBB. After jump threading, the weight on BB->SuccBB should be adjusted as some of it is contributed by the edge PredBB->BB, which doesn't exist anymore. This patch tries to update the edge weight in metadata on BB->SuccBB by scaling it by 1 - Freq(PredBB->BB) / Freq(BB->SuccBB). Differential revision: http://reviews.llvm.org/D10979 llvm-svn: 250089
* [IndVars] Use `auto`; NFCSanjoy Das2015-10-101-6/+4
| | | | llvm-svn: 249944
* Generalize convergent check to handle invokes as well as calls.Owen Anderson2015-10-091-4/+4
| | | | llvm-svn: 249892
* Teach LoopUnswitch not to perform non-trivial unswitching on loops ↵Owen Anderson2015-10-091-0/+14
| | | | | | | | | | | containing convergent operations. Doing so could cause the post-unswitching convergent ops to be control-dependent on the unswitch condition where they were not before. This check could be refined to allow unswitching where the convergent operation was already control-dependent on the unswitch condition. llvm-svn: 249874
* Refine the definition of convergent to only disallow the addition of new ↵Owen Anderson2015-10-091-1/+2
| | | | | | | | | | control dependencies. This covers the common case of operations that cannot be sunk. Operations that cannot be hoisted should already be handled properly via the safe-to-speculate rules and mechanisms. llvm-svn: 249865
* [MemCpyOpt] Fix wrong merging adjacent nontemporal stores into memset calls.Andrea Di Biagio2015-10-091-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pass MemCpyOpt doesn't check if a store instruction is nontemporal. As a consequence, adjacent nontemporal stores are always merged into a memset call. Example: ;;; define void @foo(<4 x float>* nocapture %p) { entry: store <4 x float> zeroinitializer, <4 x float>* %p, align 16, !nontemporal !0 %p1 = getelementptr inbounds <4 x float>, <4 x float>* %dst, i64 1 store <4 x float> zeroinitializer, <4 x float>* %p1, align 16, !nontemporal !0 ret void } !0 = !{i32 1} ;;; In this example, the two nontemporal stores are combined to a memset of zero which does not preserve the nontemporal hint. Later on the backend (tested on a x86-64 corei7) expands that memset call into a sequence of two normal 16-byte aligned vector stores. opt -memcpyopt example.ll -S -o - | llc -mcpu=corei7 -o - Before: xorps %xmm0, %xmm0 movaps %xmm0, 16(%rdi) movaps %xmm0, (%rdi) With this patch, we no longer merge nontemporal stores into calls to memset. In this example, llc correctly expands the two stores into two movntps: xorps %xmm0, %xmm0 movntps %xmm0, 16(%rdi) movntps %xmm0, (%rdi) In theory, we could extend the usage of !nontemporal metadata to memcpy/memset calls. However a change like that would only have the effect of forcing the backend to expand !nontemporal memsets back to sequences of store instructions. A memset library call would not have exactly the same semantic of a builtin !nontemporal memset call. So, SelectionDAG will have to conservatively expand it back to a sequence of !nontemporal stores (effectively undoing the merging). Differential Revision: http://reviews.llvm.org/D13519 llvm-svn: 249820
* [EarlyCSE] Address post commit review for r249523.Arnaud A. de Grandmaison2015-10-091-10/+10
| | | | llvm-svn: 249814
* [RS4GC] Refactoring to make a later change easier, NFCISanjoy Das2015-10-081-19/+22
| | | | | | | | | | | | | | Summary: These non-semantic changes will help make a later change adding support for deopt operand bundles more streamlined. Reviewers: reames, swaroop.sridhar Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13491 llvm-svn: 249779
* [PlaceSafeopints] Extract out `callsGCLeafFunction`, NFCSanjoy Das2015-10-081-28/+1
| | | | | | | | | | | | | Summary: This will be used in a later change to RewriteStatepointsForGC. Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13490 llvm-svn: 249777
* [RS4GC] Don't copy ADT's unneccessarily, NFCISanjoy Das2015-10-081-3/+3
| | | | | | | | | | | | Summary: Use `const auto &` instead of `auto` in `makeStatepointExplicit`. Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13454 llvm-svn: 249776
* [RS4GC] Use AssertingVH for RematerializedValueMapTy, NFCISanjoy Das2015-10-071-1/+2
| | | | | | | | | | Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13489 llvm-svn: 249620
* [EarlyCSE] Fix handling of target memory intrinsics for CSE'ing loads.Arnaud A. de Grandmaison2015-10-071-14/+23
| | | | | | | | | | | | | | | | | Summary: Some target intrinsics can access multiple elements, using the pointer as a base address (e.g. AArch64 ld4). When trying to CSE such instructions, it must be checked the available value comes from a compatible instruction because the pointer is not enough to discriminate whether the value is correct. Reviewers: ssijaric Subscribers: mcrosier, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D13475 llvm-svn: 249523
* [RS4GC] Remove an unnecessary assert & related variablesSanjoy Das2015-10-071-5/+0
| | | | | | | I don't think this assert adds much value, and removing it and related variables avoids an "unused variable" warning in release builds. llvm-svn: 249511
* [RS4GC] Cosmetic cleanup, NFCSanjoy Das2015-10-071-245/+211
| | | | | | | | | | | | | | | | | | | | Summary: A series of cosmetic cleanup changes to RewriteStatepointsForGC: - Rename variables to LLVM style - Remove some redundant asserts - Remove an unsued `Pass *` parameter - Remove unnecessary variables - Use C++11 idioms where applicable - Pass CallSite by value, not reference Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D13370 llvm-svn: 249508
* Fix Clang-tidy modernize-use-nullptr warnings in source directories and ↵Hans Wennborg2015-10-061-3/+3
| | | | | | | | | | generated files; other minor cleanups. Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D13321 llvm-svn: 249482
* [IndVars] Don't break dominance in `eliminateIdentitySCEV`Sanjoy Das2015-10-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | Summary: After r249211, `getSCEV(X) == getSCEV(Y)` does not guarantee that X and Y are related in the dominator tree, even if X is an operand to Y (I've included a toy example in comments, and a real example as a test case). This commit changes `SimplifyIndVar` to require a `DominatorTree`. I don't think this is a problem because `ScalarEvolution` requires it anyway. Fixes PR25051. Depends on D13459. Reviewers: atrick, hfinkel Subscribers: joker.eph, llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D13460 llvm-svn: 249471
* [EarlyCSE] Constify ParseMemoryInst methods (NFC).Arnaud A. de Grandmaison2015-10-061-9/+9
| | | | llvm-svn: 249400
* inariant.group handling in GVNPiotr Padlewski2015-10-022-13/+12
| | | | | | | | | | | | The most important part required to make clang devirtualization works ( ͡°͜ʖ ͡°). The code is able to find non local dependencies, but unfortunatelly because the caller can only handle local dependencies, I had to add some restrictions to look for dependencies only in the same BB. http://reviews.llvm.org/D12992 llvm-svn: 249196
* [NaryReassociate] SeenExprs records WeakVHJingyue Wu2015-10-011-6/+12
| | | | | | | | | | | | | | | | Summary: The instructions SeenExprs records may be deleted during rewriting. FindClosestMatchingDominator should ignore these deleted instructions. Fixes PR24301. Reviewers: grosser Subscribers: grosser, llvm-commits Differential Revision: http://reviews.llvm.org/D13315 llvm-svn: 248983
* DeadCodeElimination: rewrite to be fasterFiona Glaser2015-09-301-31/+45
| | | | | | | | | | Same strategy as simplifyInstructionsInBlock. ~1/3 less time on my test suite. This pass doesn't have many in-tree users, but getting rid of an O(N^2) worst case and making it cleaner should at least make it a viable alternative to ADCE, since it's now consistently somewhat faster. llvm-svn: 248927
* [LoopUnswitch] Add block frequency analysis to recognize hot/cold regionsChen Li2015-09-291-0/+48
| | | | | | | | | | | | Summary: This patch adds block frequency analysis to LoopUnswitch pass to recognize hot/cold regions. For cold regions the pass only performs trivial unswitches since they do not increase code size, and for hot regions everything works as before. This helps to minimize code growth in cold regions and be more aggressive in hot regions. Currently the default cold regions are blocks with frequencies below 20% of function entry frequency, and it can be adjusted via -loop-unswitch-cold-block-frequency flag. The entire feature is controlled via -loop-unswitch-with-block-frequency flag and it is off by default. Reviewers: broune, silvas, dnovillo, reames Subscribers: davidxl, llvm-commits Differential Revision: http://reviews.llvm.org/D11605 llvm-svn: 248777
* [LoopReroll] Ignore debug intrinsicsWeiming Zhao2015-09-281-1/+20
| | | | | | | | | Originally, debug intrinsics and annotation intrinsics may prevent the loop to be rerolled, now they are ignored. Differential Revision: http://reviews.llvm.org/D13150 llvm-svn: 248718
* ADCE: Fix typo in file comment. NFCJustin Bogner2015-09-251-1/+1
| | | | llvm-svn: 248613
* Swap loop invariant GEP with loop variant GEP to allow more LICM.Lawrence Hu2015-09-231-8/+133
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes the order of GEPs generated by Splitting GEPs pass, specially when one of the GEPs has constant and the base is loop invariant, then we will generate the GEP with constant first when beneficial, to expose more cases for LICM. If originally Splitting GEP generate the following: do.body.i: %idxprom.i = sext i32 %shr.i to i64 %2 = bitcast %typeD* %s to i8* %3 = shl i64 %idxprom.i, 2 %uglygep = getelementptr i8, i8* %2, i64 %3 %uglygep7 = getelementptr i8, i8* %uglygep, i64 1032 ... Now it genereates: do.body.i: %idxprom.i = sext i32 %shr.i to i64 %2 = bitcast %typeD* %s to i8* %3 = shl i64 %idxprom.i, 2 %uglygep = getelementptr i8, i8* %2, i64 1032 %uglygep7 = getelementptr i8, i8* %uglygep, i64 %3 ... For no-loop cases, the original way of generating GEPs seems to expose more CSE cases, so we don't change the logic for no-loop cases, and only limit our change to the specific case we are interested in. llvm-svn: 248420
* [DeadStoreElimination] Remove dead zero store to calloc initialized memoryIgor Laevsky2015-09-231-33/+58
| | | | | | | | This change allows dead store elimination to remove zero and null stores into memory freshly allocated with calloc-like function. Differential Revision: http://reviews.llvm.org/D13021 llvm-svn: 248374
* [SCEV] Introduce ScalarEvolution::getOne and getZero.Sanjoy Das2015-09-235-7/+6
| | | | | | | | | | | | | | | | | | Summary: It is fairly common to call SE->getConstant(Ty, 0) or SE->getConstant(Ty, 1); this change makes such uses a little bit briefer. I've refactored the call sites I could find easily to use getZero / getOne. Reviewers: hfinkel, majnemer, reames Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D12947 llvm-svn: 248362
* [Unroll] Do not crash trying to propagate a value to vector load.Michael Zolotukhin2015-09-221-0/+6
| | | | llvm-svn: 248333
* [Unroll] Follow-up for r247769: fix a bug in UnrolledInstAnalyzer::visitLoad.Michael Zolotukhin2015-09-221-1/+1
| | | | | | | | Apart from checking that GlobalVariable is a constant, we should check that it's not a weak constant, in which case we can't propagate its value. llvm-svn: 248327
* Prune trailing whitespaces.NAKAMURA Takumi2015-09-221-3/+3
| | | | llvm-svn: 248265
* Untabify.NAKAMURA Takumi2015-09-221-3/+3
| | | | llvm-svn: 248264
* Reformat blank lines.NAKAMURA Takumi2015-09-221-2/+2
| | | | llvm-svn: 248263
* Reformat comment lines.NAKAMURA Takumi2015-09-221-3/+3
| | | | llvm-svn: 248262
* Reformat.NAKAMURA Takumi2015-09-221-4/+1
| | | | llvm-svn: 248261
OpenPOWER on IntegriCloud