summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [RuntimeLoopUnroller] Add assert that we dont unroll non-rotated loopsAnna Thomas2017-05-031-0/+7
| | | | | | | | | | | | | | | | | | Summary: Cloning basic blocks in the loop for runtime loop unroller depends on loop being in rotated form (i.e. loop latch target is the exit block). Assert that this is true, so that callers of runtime loop unroller pass in canonical loops. The single caller of this function has that check recently added: https://reviews.llvm.org/rL301239 Reviewers: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32801 llvm-svn: 302058
* [Loop Deletion] Delete loops that are never executedAnna Thomas2017-05-031-14/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently, loop deletion deletes loop where the only values that are used outside the loop are loop-invariant. This patch adds logic to delete loops where the loop is proven to be never executed (i.e. the only predecessor of the loop preheader has a constant conditional branch as terminator, and the preheader is not the taken target). This will remove loops that become dead after loop-unswitching generates constant conditional branches. The next steps are: 1. moving the loop deletion implementation to LoopUtils. 2. Add logic in loop-simplifyCFG which will support changing conditional constant branches to unconditional branches. If loops become unreachable in this process, they can be removed using `deleteDeadLoop` function. Reviewers: chandlerc, efriedma, sanjoy, reames Reviewed by: sanjoy Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D32494 llvm-svn: 302015
* Replace hardcoded intrinsic list with speculatable attribute.Matt Arsenault2017-05-031-1/+7
| | | | | | No change in which intrinsics should be speculated. llvm-svn: 301995
* Re-land r301697 "[IR] Make add/remove Attributes use AttrBuilder instead of ↵Reid Kleckner2017-05-024-26/+11
| | | | | | | | | | AttributeList" This time, I fixed, built, and tested clang. This reverts r301712. llvm-svn: 301981
* [NewGVN] Fix typo and format comment. NFCI.Davide Italiano2017-05-021-7/+4
| | | | llvm-svn: 301974
* [PartialInlining] Add more early filteringXinliang David Li2017-05-021-0/+10
| | | | | | | This is a follow up to the previous inline cost patch for quicker filtering. llvm-svn: 301959
* SpeculativeExecution: Stop using whitelist for costsMatt Arsenault2017-05-021-42/+1
| | | | | | | Just let TTI's cost do this instead of arbitrarily restricting this. llvm-svn: 301950
* [InstCombine] don't use DeMorgan's Law on integer constants (2nd try)Sanjay Patel2017-05-021-18/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was originally checked in here: https://reviews.llvm.org/rL301923 And reverted here: https://reviews.llvm.org/rL301924 Because there's a clang test that would fail after this. I fixed/removed the offending CHECK lines in: https://reviews.llvm.org/rL301928 So let's try this again. Original commit message: This is the fold that causes the infinite loop in BoringSSL (https://github.com/google/boringssl/blob/master/crypto/cipher/e_rc2.c) when we fix instcombine demanded bits to prefer 'not' ops as in https://reviews.llvm.org/D32255. There are 2 or 3 problems with dyn_castNotVal, and I don't think we can reinstate https://reviews.llvm.org/D32255 until dyn_castNotVal is completely eliminated. 1. As shown here, it transforms 'not' into random xor. This transform is harmful to SCEV and codegen because 'not' can often be folded while random xor cannot. 2. It does not transform vector constants. This is actually a good thing, but if you don't believe the above argument, then we shouldn't have excluded vectors. 3. It tries to avoid transforming not(not(X)). That's nice, but it doesn't match the greedy nature of instcombine. If we DeMorganize a pattern that has an extra 'not' in it: ~(~(~X) & Y) --> (~X | ~Y) That's just another case of DeMorgan, so we should trust that we'll fold that pattern too: (~X | ~ Y) --> ~(X & Y) Differential Revision: https://reviews.llvm.org/D32665 llvm-svn: 301929
* revert r301923 : [InstCombine] don't use DeMorgan's Law on integer constantsSanjay Patel2017-05-021-21/+18
| | | | | | There's a clang test that is wrongly using -O1 and failing after this commit. llvm-svn: 301924
* [InstCombine] don't use DeMorgan's Law on integer constantsSanjay Patel2017-05-021-18/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | This is the fold that causes the infinite loop in BoringSSL (https://github.com/google/boringssl/blob/master/crypto/cipher/e_rc2.c) when we fix instcombine demanded bits to prefer 'not' ops as in D32255. There are 2 or 3 problems with dyn_castNotVal, and I don't think we can reinstate D32255 until dyn_castNotVal is completely eliminated. 1. As shown here, it transforms 'not' into random xor. This transform is harmful to SCEV and codegen because 'not' can often be folded while random xor cannot. 2. It does not transform vector constants. This is actually a good thing, but if you don't believe the above argument, then we shouldn't have excluded vectors. 3. It tries to avoid transforming not(not(X)). That's nice, but it doesn't match the greedy nature of instcombine. If we DeMorganize a pattern that has an extra 'not' in it: ~(~(~X) & Y) --> (~X | ~Y) That's just another case of DeMorgan, so we should trust that we'll fold that pattern too: (~X | ~ Y) --> ~(X & Y) Differential Revision: https://reviews.llvm.org/D32665 llvm-svn: 301923
* [PartialInlining] Hook up inline cost analysisXinliang David Li2017-05-021-11/+98
| | | | | | Differential Revision: http://reviews.llvm.org/D32666 llvm-svn: 301894
* Empty Space. NFCXin Tong2017-05-011-1/+1
| | | | llvm-svn: 301878
* [NewGVN] Don't derive incorrect implications.Davide Italiano2017-05-011-4/+4
| | | | | | | | | | | | | | In the testcase attached, we believe %tmp1 implies %tmp4. where: br i1 %tmp1, label %bb2, label %bb7 br i1 %tmp4, label %bb5, label %bb7 because Wwhile looking at PredicateInfo stuffs we end up calling isImpliedTrueByMatchingCmp() with the arguments backwards. Differential Revision: https://reviews.llvm.org/D32718 llvm-svn: 301849
* [InstCombine] check one-use before applying DeMorgan nor/nand foldsSanjay Patel2017-05-011-10/+14
| | | | | | | | | | | | | | | If we have ~(~X & Y), it only makes sense to transform it to (X | ~Y) when we do not need the intermediate (~X & Y) value. In that case, we would need an extra instruction to generate ~Y + 'or' (as shown in the test changes). It's ok if we have multiple uses of ~X or Y, however. In those cases, we may not reduce the instruction count or critical path, but we might improve throughput because we can generate ~X and ~Y in parallel. Whether that actually makes perf sense or not for a target is something we can't answer in IR. Differential Revision: https://reviews.llvm.org/D32703 llvm-svn: 301848
* IPO: Add missing build dep.Peter Collingbourne2017-05-011-1/+1
| | | | llvm-svn: 301835
* Object: Remove ModuleSummaryIndexObjectFile class.Peter Collingbourne2017-05-011-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D32195 llvm-svn: 301832
* Take indirect branch into account as well when folding.Xin Tong2017-05-011-6/+10
| | | | | | | | We may not be able to rewrite indirect branch target, but we also want to take it into account when folding, i.e. if it and all its successor's predecessors go to the same destination, we can fold, i.e. no need to thread. llvm-svn: 301816
* Rename WeakVH to WeakTrackingVH; NFCSanjoy Das2017-05-0117-109/+99
| | | | | | This relands r301424. llvm-svn: 301812
* [JumpThread] Add some assertions for expected ConstantInt/BlockAddressXin Tong2017-05-011-2/+5
| | | | llvm-svn: 301808
* [JumpThread] Do RAUW in case Cond folds to a constant in the CFGXin Tong2017-05-011-8/+24
| | | | | | | | | | | | | | Summary: [JumpThread] Do RAUW in case Cond folds to a constant in the CFG Reviewers: sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32407 llvm-svn: 301804
* Rename isKnownNotFullPoison to programUndefinedIfPoison; NFCSanjoy Das2017-04-301-1/+1
| | | | | | | | | | | | | | | | | Summary: programUndefinedIfPoison makes more sense, given what the function does; and I'm about to add a function with a name similar to isKnownNotFullPoison (so do the rename to avoid confusion). Reviewers: broune, majnemer, bjarke.roune Reviewed By: broune Subscribers: mcrosier, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D30444 llvm-svn: 301776
* [KnownBits] Add methods for determining if the known bits represent a ↵Craig Topper2017-04-291-4/+4
| | | | | | | | | | | | | | | | negative/nonnegative number and add methods for changing the negative/nonnegative state Summary: This patch adds isNegative, isNonNegative for querying whether the sign bit is known. It also adds makeNegative and makeNonNegative for controlling the sign bit. Reviewers: RKSimon, spatel, davide Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32651 llvm-svn: 301747
* [ObjCARC] Do not move a release between a call and aAkira Hatanaka2017-04-292-16/+33
| | | | | | | | | | | | | | | | | | | retainAutoreleasedReturnValue that retains the returned value. This commit fixes a bug in ARC optimizer where it moves a release between a call and a retainAutoreleasedReturnValue, causing the returned object to be released before the retainAutoreleasedReturnValue can retain it. This commit accomplishes that by doing a lookahead and checking whether the call prevents the release from moving upwards. In the long term, we should treat the region between the retainAutoreleasedReturnValue and the call as a critical section and disallow moving anything there (possibly using operand bundles). rdar://problem/20449878 llvm-svn: 301724
* [LoopUnswitch] Make DEBUG output more readable (part 2).Davide Italiano2017-04-291-1/+1
| | | | | | | | | I fixed my miscompile in r301722 and I hope I don't have to take a look at this code again now that Chandler has a new LoopUnswitch pass, but maybe this could be of use for somebody else in the meanwhile. llvm-svn: 301723
* [LoopUnswitch] Don't remove instructions with side effects.Davide Italiano2017-04-291-1/+2
| | | | | | | | This fixes PR32818. Differential Revision: https://reviews.llvm.org/D32664 llvm-svn: 301722
* Revert r301697 "[IR] Make add/remove Attributes use AttrBuilder instead of ↵Hans Wennborg2017-04-284-11/+26
| | | | | | | | | | | | | | | | | | AttributeList" This broke the Clang build. (Clang-side patch missing?) Original commit message: > [IR] Make add/remove Attributes use AttrBuilder instead of > AttributeList > > This change cleans up call sites and avoids creating temporary > AttributeList objects. > > NFC llvm-svn: 301712
* InferAddressSpaces: Search constant expressions for addrspacecastsMatt Arsenault2017-04-281-12/+48
| | | | | | | These are pretty common when using local memory, and the 64-bit generic addressing is much more expensive to compute. llvm-svn: 301711
* InferAddressSpaces: Avoid looking up deleted valuesMatt Arsenault2017-04-281-3/+10
| | | | | | | | | | | While looking at pure addressing expressions, it's possible for the value to appear later in Postorder. I haven't been able to come up with a testcase where this exhibits an actual issue, but if you insert a dump before the value map lookup, a few testcases crash. llvm-svn: 301705
* InferAddressSpaces: Infer from just addrspacecastsMatt Arsenault2017-04-281-0/+12
| | | | | | | | Eliminates some more cases where some subset of the addressing computation remains flat. Some cases with addrspacecasts in nested constant expressions are still left behind however. llvm-svn: 301704
* LoopRotate: Fix use after scope bugDaniel Berlin2017-04-281-3/+4
| | | | llvm-svn: 301702
* [IR] Make add/remove Attributes use AttrBuilder instead of AttributeListReid Kleckner2017-04-284-26/+11
| | | | | | | | | This change cleans up call sites and avoids creating temporary AttributeList objects. NFC llvm-svn: 301697
* [LoopUnswitch] Make DEBUG output more readable.Davide Italiano2017-04-281-1/+1
| | | | | | | | | While debugging a miscompile I realized loopunswitch doesn't put newlines when printing the instruction being replacement. Ending up with a single line with many instruction replaced isn't the best for readability and/or mental sanity. llvm-svn: 301692
* Make getParamAlignment use argument numbersReid Kleckner2017-04-283-4/+4
| | | | | | | | | | | | | | | | | | The method is called "get *Param* Alignment", and is only used for return values exactly once, so it should take argument indices, not attribute indices. Avoids confusing code like: IsSwiftError = CS->paramHasAttr(ArgIdx, Attribute::SwiftError); Alignment = CS->getParamAlignment(ArgIdx + 1); Add getRetAlignment to handle the one case in Value.cpp that wants the return value alignment. This is a potentially breaking change for out-of-tree backends that do their own call lowering. llvm-svn: 301682
* Kill off the old SimplifyInstruction API by converting remaining users.Daniel Berlin2017-04-2813-52/+37
| | | | llvm-svn: 301673
* [IPO/MergeFunctions] This function is used only under DEBUG().Davide Italiano2017-04-281-0/+4
| | | | llvm-svn: 301672
* [RS4GC] Simplify attribute handling code NFCReid Kleckner2017-04-281-47/+25
| | | | | | | Avoids use of AttributeList::getNumSlots, making it easier to change the underlying implementation. llvm-svn: 301671
* Use Argument::hasAttribute and AttributeList::ReturnIndex moreReid Kleckner2017-04-282-19/+15
| | | | | | | | | | | This eliminates many extra 'Idx' induction variables in loops over arguments in CodeGen/ and Target/. It also reduces the number of places where we assume that ReturnIndex is 0 and that we should add one to argument numbers to get the corresponding attribute list index. NFC llvm-svn: 301666
* Clean up DIExpression::prependDIExpr a little. (NFC)Adrian Prantl2017-04-282-9/+6
| | | | llvm-svn: 301662
* [APInt] Add clearSignBit method. Use it and setSignBit in a few places. NFCICraig Topper2017-04-282-3/+3
| | | | llvm-svn: 301656
* Memory intrinsic value profile optimization: Avoid divide by 0Teresa Johnson2017-04-281-0/+4
| | | | | | | | | | | | | | Summary: Skip memops if the total value profiled count is 0, we can't correctly scale up the counts and there is no point anyway. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32624 llvm-svn: 301645
* [DebugInfo][X86] Improve X86 Optimize LEAs handling of debug values.Andrew Ng2017-04-281-47/+6
| | | | | | | | | | | | | | | This is a follow up to the fix in r298360 to improve the handling of debug values when redundant LEAs are removed. The fix in r298360 effectively discarded the debug values. This patch now attempts to preserve the debug values by using the DWARF DW_OP_stack_value operation via prependDIExpr. Moved functions appendOffset and prependDIExpr from Local.cpp to DebugInfoMetadata.cpp and made them available as static member functions of DIExpression. Differential Revision: https://reviews.llvm.org/D31604 llvm-svn: 301630
* [EarlyCSE] Mark the condition of assume intrinsic as trueMax Kazantsev2017-04-281-3/+9
| | | | | | | | | | | | | | EarlyCSE should not just ignore assumes. It should use the fact that its condition is true for all dominated instructions. Reviewers: sanjoy, reames, apilipenko, anna, skatkov Reviewed By: reames, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32482 llvm-svn: 301625
* [EarlyCSE] Remove guards with conditions known to be trueMax Kazantsev2017-04-281-3/+18
| | | | | | | | | | | | | | | | | If a condition is calculated only once, and there are multiple guards on this condition, we should be able to remove all guards dominated by the first of them. This patch allows EarlyCSE to try to find the condition of a guard among the known values, and if it is true, remove the guard. Otherwise we keep the guard and mark its condition as 'true' for future consideration. Reviewers: sanjoy, reames, apilipenko, skatkov, anna, dberlin Reviewed By: reames, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32476 llvm-svn: 301623
* [APInt] Use inplace shift methods where possible. NFCICraig Topper2017-04-282-2/+2
| | | | llvm-svn: 301612
* [SROA] Fix nondeterminism exposed by Simon's r299221.Davide Italiano2017-04-271-14/+12
| | | | | | | Use a SmallSetSetVector instead of a SmallPtrSet as iterating over the latter is not stable ('<' relies on addresses). llvm-svn: 301599
* [InstCombine] fix matcher to bind to specific operand (PR32830)Sanjay Patel2017-04-271-1/+1
| | | | | | | Matching any random value would be very wrong: https://bugs.llvm.org/show_bug.cgi?id=32830 llvm-svn: 301594
* [asan] Fix dead stripping of globals on Linux.Evgeniy Stepanov2017-04-273-39/+138
| | | | | | | | | | | | | | | | | | | | | | Use a combination of !associated, comdat, @llvm.compiler.used and custom sections to allow dead stripping of globals and their asan metadata. Sometimes. Currently this works on LLD, which supports SHF_LINK_ORDER with sh_link pointing to the associated section. This also works on BFD, which seems to treat comdats as all-or-nothing with respect to linker GC. There is a weird quirk where the "first" global in each link is never GC-ed because of the section symbols. At this moment it does not work on Gold (as in the globals are never stripped). This is a second re-land of r298158. This time, this feature is limited to -fdata-sections builds. llvm-svn: 301587
* [asan] Put ctor/dtor in comdat.Evgeniy Stepanov2017-04-271-9/+48
| | | | | | | | | | | | | | | | | | | | When possible, put ASan ctor/dtor in comdat. The only reason not to is global registration, which can be TU-specific. This is not the case when there are no instrumented globals. This is also limited to ELF targets, because MachO does not have comdat, and COFF linkers may GC comdat constructors. The benefit of this is a lot less __asan_init() calls: one per DSO instead of one per TU. It's also necessary for the upcoming gc-sections-for-globals change on Linux, where multiple references to section start symbols trigger quadratic behaviour in gold linker. This is a second re-land of r298756. This time with a flag to disable the whole thing to avoid a bug in the gold linker: https://sourceware.org/bugzilla/show_bug.cgi?id=19002 llvm-svn: 301586
* [PM/LoopUnswitch] Introduce a new, simpler loop unswitch pass.Chandler Carruth2017-04-274-1/+639
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, this pass only focuses on *trivial* loop unswitching. At that reduced problem it remains significantly better than the current loop unswitch: - Old pass is worse than cubic complexity. New pass is (I think) linear. - New pass is much simpler in its design by focusing on full unswitching. (See below for details on this). - New pass doesn't carry state for thresholds between pass iterations. - New pass doesn't carry state for correctness (both miscompile and infloop) between pass iterations. - New pass produces substantially better code after unswitching. - New pass can handle more trivial unswitch cases. - New pass doesn't recompute the dominator tree for the entire function and instead incrementally updates it. I've ported all of the trivial unswitching test cases from the old pass to the new one to make sure that major functionality isn't lost in the process. For several of the test cases I've worked to improve the precision and rigor of the CHECKs, but for many I've just updated them to handle the new IR produced. My initial motivation was the fact that the old pass carried state in very unreliable ways between pass iterations, and these mechansims were incompatible with the new pass manager. However, I discovered many more improvements to make along the way. This pass makes two very significant assumptions that enable most of these improvements: 1) Focus on *full* unswitching -- that is, completely removing whatever control flow construct is being unswitched from the loop. In the case of trivial unswitching, this means removing the trivial (exiting) edge. In non-trivial unswitching, this means removing the branch or switch itself. This is in opposition to *partial* unswitching where some part of the unswitched control flow remains in the loop. Partial unswitching only really applies to switches and to folded branches. These are very similar to full unrolling and partial unrolling. The full form is an effective canonicalization, the partial form needs a complex cost model, cannot be iterated, isn't canonicalizing, and should be a separate pass that runs very late (much like unrolling). 2) Leverage LLVM's Loop machinery to the fullest. The original unswitch dates from a time when a great deal of LLVM's loop infrastructure was missing, ineffective, and/or unreliable. As a consequence, a lot of complexity was added which we no longer need. With these two overarching principles, I think we can build a fast and effective unswitcher that fits in well in the new PM and in the canonicalization pipeline. Some of the remaining functionality around partial unswitching may not be relevant today (not many test cases or benchmarks I can find) but if they are I'd like to add support for them as a separate layer that runs very late in the pipeline. Purely to make reviewing and introducing this code more manageable, I've split this into first a trivial-unswitch-only pass and in the next patch I'll add support for full non-trivial unswitching against a *fixed* threshold, exactly like full unrolling. I even plan to re-use the unrolling thresholds, as these are incredibly similar cost tradeoffs: we're cloning a loop body in order to end up with simplified control flow. We should only do that when the total growth is reasonably small. One of the biggest changes with this pass compared to the previous one is that previously, each individual trivial exiting edge from a switch was unswitched separately as a branch. Now, we unswitch the entire switch at once, with cases going to the various destinations. This lets us unswitch multiple exiting edges in a single operation and also avoids numerous extremely bad behaviors, where we would introduce 1000s of branches to test for thousands of possible values, all of which would take the exact same exit path bypassing the loop. Now we will use a switch with 1000s of cases that can be efficiently lowered into a jumptable. This avoids relying on somehow forming a switch out of the branches or getting horrible code if that fails for any reason. Another significant change is that this pass actively updates the CFG based on unswitching. For trivial unswitching, this is actually very easy because of the definition of loop simplified form. Doing this makes the code coming out of loop unswitch dramatically more friendly. We still should run loop-simplifycfg (at the least) after this to clean up, but it will have to do a lot less work. Finally, this pass makes much fewer attempts to simplify instructions based on the unswitch. Something like loop-instsimplify, instcombine, or GVN can be used to do increasingly powerful simplifications based on the now dominating predicate. The old simplifications are things that something like loop-instsimplify should get today or a very, very basic loop-instcombine could get. Keeping that logic separate is a big simplifying technique. Most of the code in this pass that isn't in the old one has to do with achieving specific goals: - Updating the dominator tree as we go - Unswitching all cases in a switch in a single step. I think it is still shorter than just the trivial unswitching code in the old pass despite having this functionality. Differential Revision: https://reviews.llvm.org/D32409 llvm-svn: 301576
* [GlobalOpt] Correctly update metadata when localizing a global.Eli Friedman2017-04-271-1/+3
| | | | | | | | | Just calling dropAllReferences leaves pointers to the ConstantExpr behind, so we would eventually crash with a null pointer dereference. Differential Revision: https://reviews.llvm.org/D32551 llvm-svn: 301575
OpenPOWER on IntegriCloud