summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
...
* [InstCombine] canonicalize add/sub with boolSanjay Patel2019-02-241-0/+6
| | | | | | | | | | | | | | | | | | | | | | | add A, sext(B) --> sub A, zext(B) We have to choose 1 of these forms, so I'm opting for the zext because that's easier for value tracking. The backend should be prepared for this change after: D57401 rL353433 This is also a preliminary step towards reducing the amount of bit hackery that we do in IR to optimize icmp/select. That should be waiting to happen at a later optimization stage. The seeming regression in the fuzzer test was discussed in: D58359 We were only managing that fold in instcombine by luck, and other passes should be able to deal with that better anyway. llvm-svn: 354748
* BreakCriticalEdges: Update PostDominatorTreeMatt Arsenault2019-02-221-4/+13
| | | | llvm-svn: 354673
* [LowerSwitch][AMDGPU] Do not handle impossible valuesRoman Tereshin2019-02-221-68/+137
| | | | | | | | | | | | This patch adds LazyValueInfo to LowerSwitch to compute the range of the value being switched over and reduce the size of the tree LowerSwitch builds to lower a switch. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D58096 llvm-svn: 354670
* [DTU] Refine the interface and logic of applyUpdatesChijun Sima2019-02-225-34/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch separates two semantics of `applyUpdates`: 1. User provides an accurate CFG diff and the dominator tree is updated according to the difference of `the number of edge insertions` and `the number of edge deletions` to infer the status of an edge before and after the update. 2. User provides a sequence of hints. Updates mentioned in this sequence might never happened and even duplicated. Logic changes: Previously, removing invalid updates is considered a side-effect of deduplication and is not guaranteed to be reliable. To handle the second semantic, `applyUpdates` does validity checking before deduplication, which can cause updates that have already been applied to be submitted again. Then, different calls to `applyUpdates` might cause unintended consequences, for example, ``` DTU(Lazy) and Edge A->B exists. 1. DTU.applyUpdates({{Delete, A, B}, {Insert, A, B}}) // User expects these 2 updates result in a no-op, but {Insert, A, B} is queued 2. Remove A->B 3. DTU.applyUpdates({{Delete, A, B}}) // DTU cancels this update with {Insert, A, B} mentioned above together (Unintended) ``` But by restricting the precondition that updates of an edge need to be strictly ordered as how CFG changes were made, we can infer the initial status of this edge to resolve this issue. Interface changes: The second semantic of `applyUpdates` is separated to `applyUpdatesPermissive`. These changes enable DTU(Lazy) to use the first semantic if needed, which is quite useful in `transforms/utils`. Reviewers: kuhar, brzycki, dmgreen, grosser Reviewed By: brzycki Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58170 llvm-svn: 354669
* [MemorySSA & LoopPassManager] Resolve PR40038.Alina Sbirlea2019-02-221-3/+2
| | | | | | | | | | | | | | The correct edge being deleted is not to the unswitched exit block, but to the original block before it was split. That's the key in the map, not the value. The insert is correct. The new edge is to the .split block. The splitting turns OriginalBB into: OriginalBB -> OriginalBB.split. Assuming the orignal CFG edge: ParentBB->OriginalBB, we must now delete ParentBB->OriginalBB, not ParentBB->OriginalBB.split. llvm-svn: 354656
* [DTU] Deprecate insertEdge*/deleteEdge*Chijun Sima2019-02-227-16/+21
| | | | | | | | | | | | | | | | Summary: This patch converts all existing `insertEdge*/deleteEdge*` to `applyUpdates` and marks `insertEdge*/deleteEdge*` as deprecated. Reviewers: kuhar, brzycki Reviewed By: kuhar, brzycki Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58443 llvm-svn: 354652
* [MemorySSA & LoopPassManager] Update MemorySSA in formDedicatedExitBlocks.Alina Sbirlea2019-02-214-10/+20
| | | | | | | MemorySSA is now updated when forming dedicated exit blocks. Resolves PR40037. llvm-svn: 354623
* [LoopSimplifyCFG] Update MemorySSA after r353911.Alina Sbirlea2019-02-211-10/+17
| | | | | | | | | | | | | | | | | Summary: MemorySSA is not properly updated in LoopSimplifyCFG after recent changes. Use SplitBlock utility to resolve that and clear all updates once handleDeadExits is finished. All updates that follow are removal of edges which are safe to handle via the removeEdge() API. Also, deleting dead blocks is done correctly as is, i.e. delete from MemorySSA before updating the CFG and DT. Reviewers: mkazantsev, rtereshin Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58524 llvm-svn: 354613
* [EarlyCSE] Cleanup deadcode. [NFCI]Alina Sbirlea2019-02-211-5/+1
| | | | | | | | | | | | | | Summary: Cleanup nop assignments. Reviewers: george.burgess.iv, davide Subscribers: sanjoy, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58308 llvm-svn: 354612
* [InferAddressSpaces] Fix fallthrough errorJoey Gouly2019-02-211-0/+1
| | | | llvm-svn: 354580
* [InferAddressSpaces] Fix crash on select of non-ptr operandsJoey Gouly2019-02-211-2/+5
| | | | | | | | | Check the operands of a select are pointers, to determine if it is an address expression or not. https://reviews.llvm.org/D58226 llvm-svn: 354576
* [LoopSimplifyCFG] Add missing MSSA edge deletionMax Kazantsev2019-02-211-0/+2
| | | | | | | | When we create fictive switch in preheader, we should take care about MSSA and delete edge between old preheader and header. llvm-svn: 354547
* [Inliner] Pass nullptr for the ORE param of getInlineCost if RemarkEnabledWei Mi2019-02-212-3/+10
| | | | | | | | | | | | | | | | | | | is false. Right now for inliner and partial inliner, we always pass the address of a valid ORE object to getInlineCost even if RemarkEnabled is false because of no -Rpass is specified. Since ComputeFullInlineCost will be set to true if ORE is non-null in getInlineCost, this introduces the problem that in getInlineCost we cannot return early even if we already know the cost is definitely higher than the threshold. It is a general problem for compile time. This patch fixes that by pass nullptr as the ORE argument if RemarkEnabled is false. Differential Revision: https://reviews.llvm.org/D58399 llvm-svn: 354542
* [GVN] Small tweaks to comments, style, and missed vector handlingPhilip Reames2019-02-201-5/+5
| | | | | | Noticed these while doing a final sweep of the code to make sure I hadn't missed anything in my last couple of patches. The (minor) missed optimization was noticed because of the stylistic fix to avoid an overly specific cast. llvm-svn: 354412
* [GVN] Fix last crasher w/non-integral pointersPhilip Reames2019-02-201-2/+18
| | | | | | | | | | Same case as for memset and memcpy, but this time for clobbering stores and loads. We still can't allow coercion to or from non-integrals, regardless of the transform. Now that I'm done the whole little sequence, it seems apparent that we'd entirely missed reasoning about clobbers in the original GVN support for non-integral pointers. My appologies, I thought we'd upstreamed all of this, but it turns out we were still carrying a downstream hack which hid all of these issues. My chanks to Cherry Zhang for helping debug. llvm-svn: 354407
* [GVN] Fix a crash bug w/non-integral pointers and memtransfersPhilip Reames2019-02-191-0/+5
| | | | | | Problem is very similiar to the one fixed for memsets in r354399, we try to coerce a value to non-integral type, and then crash while try to do so. Since we shouldn't be doing such coercions to start with, easy fix. From inspection, I see two other cases which look to be similiar and will follow up with most test cases and fixes if confirmed. llvm-svn: 354403
* [GVN] Fix a non-integral pointer bug w/vector typesPhilip Reames2019-02-191-2/+2
| | | | | | GVN generally doesn't forward structs or array types, but it *will* forward vector types to non-vectors and vice versa. As demonstrated in tests, we need to inhibit the same set of transforms for vector of non-integral pointers as for non-integral pointers themselves. llvm-svn: 354401
* [GVN] Fix a crash bug around non-integral pointersPhilip Reames2019-02-191-3/+16
| | | | | | If we encountered a location where we tried to forward the value of a memset to a load of a non-integral pointer, we crashed. Such a forward is not legal in general, but we can forward null pointers. Test for both cases are included. llvm-svn: 354399
* [InstCombine] reduce even more unsigned saturated add with 'not' opSanjay Patel2019-02-191-5/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 Name: uaddsat, -1 fval %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ugt i32 %notx, %y %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a Name: uaddsat, -1 fval + ult %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ult i32 %y, %notx %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/nTp llvm-svn: 354393
* [InstCombine] rearrange saturated add folds; NFCSanjay Patel2019-02-191-12/+14
| | | | | | | | | | This is no-functional-change-intended, but that was also true when it was part of rL354276, and I managed to lose 2 predicates for the fold with constant...causing much bot distress. So this time I'm adding a couple of negative tests to avoid that. llvm-svn: 354384
* [NFC] API for signaling that the current loop is being deletedMax Kazantsev2019-02-191-9/+30
| | | | | | | We are planning to be able to delete the current loop in LoopSimplifyCFG in the future. Add API to notify the loop pass manager that it happened. llvm-svn: 354314
* [NFC] Store loop header in a local to keep it available after the loop is ↵Max Kazantsev2019-02-191-11/+9
| | | | | | deleted llvm-svn: 354313
* Revert "[InstCombine] reduce even more unsigned saturated add with 'not' op"Sanjay Patel2019-02-181-28/+17
| | | | | | | This reverts commit 079b610c29b4a428b3ae7b64dbac0378facf6632. Bots are failing after this change on a stage 2 compile of clang. llvm-svn: 354277
* [InstCombine] reduce even more unsigned saturated add with 'not' opSanjay Patel2019-02-181-17/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 Name: uaddsat, -1 fval %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ugt i32 %notx, %y %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a Name: uaddsat, -1 fval + ult %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ult i32 %y, %notx %r = select i1 %c, i32 %a, i32 -1 => %a = add i32 %x, %y %c2 = icmp ugt i32 %y, %a %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/nTp llvm-svn: 354276
* [NFC] Teach getInnermostLoopFor walk up the loop treesMax Kazantsev2019-02-171-6/+10
| | | | | | | This should be NFC in current use case of this method, but it will help to use it for solving more compex tasks in follow-up patches. llvm-svn: 354227
* [InstCombine] reduce more unsigned saturated add with 'not' opSanjay Patel2019-02-171-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 Name: not op %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ult i32 %notx, %y %r = select i1 %c, i32 -1, i32 %a => %a = add i32 %x, %y %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a Name: not op ugt %notx = xor i32 %x, -1 %a = add i32 %x, %y %c = icmp ugt i32 %y, %notx %r = select i1 %c, i32 -1, i32 %a => %a = add i32 %x, %y %c2 = icmp ult i32 %a, %y %r = select i1 %c2, i32 -1, i32 %a https://rise4fun.com/Alive/niom (The matching here is still incomplete.) llvm-svn: 354224
* [InstCombine] reduce unsigned saturated add with 'not' opSanjay Patel2019-02-171-11/+28
| | | | | | | | | | | | | We want to use the sum in the icmp to allow matching with m_UAddWithOverflow and eliminate the 'not'. This is discussed in D51929 and is another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 (The matching here is incomplete. Trying to take minimal steps to make sure we don't induce infinite looping from existing canonicalizations of the 'select'.) llvm-svn: 354221
* [NFC] Fix name and clarifying comment for factored-out functionMax Kazantsev2019-02-171-4/+5
| | | | llvm-svn: 354220
* [NFC] Factor out a function for future reuseMax Kazantsev2019-02-171-8/+15
| | | | llvm-svn: 354218
* [EarlyCSE & MSSA] Cap the clobbering calls in EarlyCSE.Alina Sbirlea2019-02-151-2/+13
| | | | | | | | | | | | | | | | | | | Summary: Unlimitted number of calls to getClobberingAccess can lead to high compile times in pathological cases. Limitting getClobberingAccess to a fairly high number. Can be adjusted based on users/need. Note: this is the only user of MemorySSA currently enabled by default. The same handling exists in LICM (disabled atm). As MemorySSA gains more users, this logic of capping will need to move inside MemorySSA. Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D58248 llvm-svn: 354182
* [InstCombine] Address a couple stylistic issues pointed out by reviewer [NFC]Philip Reames2019-02-151-6/+6
| | | | | | Better addressing comments from https://reviews.llvm.org/D58290. llvm-svn: 354171
* [InstCombine] Convert atomicrmws to xchg or store where legalPhilip Reames2019-02-151-15/+59
| | | | | | | | | | Implement two more transforms of atomicrmw: 1) We can convert an atomicrmw which produces a known value in memory into an xchg instead. 2) We can convert an atomicrmw xchg w/o users into a store for some orderings. Differential Revision: https://reviews.llvm.org/D58290 llvm-svn: 354170
* [CodeExtractor] Do not lift lifetime.end markers for region inputsVedant Kumar2019-02-151-13/+7
| | | | | | | | | | | | | | | | | | | | | | | If a lifetime.end marker occurs along one path through the extraction region, but not another, then it's still incorrect to lift the marker, because there is some path through the extracted function which would ordinarily not reach the marker. If the call to the extracted function is in a loop, unrolling can cause inputs to the function to become optimized out as undef after the first iteration. To prevent incorrect stack slot merging in the calling function, it should be sufficient to lift lifetime.start markers for region inputs. I've tested this theory out by doing a stage2 check-all with randomized splitting enabled. This is a follow-up to r353973, and there's additional context for this change in https://reviews.llvm.org/D57834. rdar://47896986 Differential Revision: https://reviews.llvm.org/D58253 llvm-svn: 354159
* [HotColdSplit] Schedule splitting late to fix perf regressionVedant Kumar2019-02-151-5/+10
| | | | | | | | | | | | | | | With or without PGO data applied, splitting early in the pipeline (either before the inliner or shortly after it) regresses performance across SPEC variants. The cause appears to be that splitting hides context for subsequent optimizations. Schedule splitting late again, in effect reversing r352080, which scheduled the splitting pass early for code size benefits (documented in https://reviews.llvm.org/D57082). Differential Revision: https://reviews.llvm.org/D58258 llvm-svn: 354158
* [InstCombine] fix crash while trying to narrow a binop of shuffles (PR40734)Sanjay Patel2019-02-151-1/+2
| | | | | | https://bugs.llvm.org/show_bug.cgi?id=40734 llvm-svn: 354144
* [MergeICmps] Make base ordering really deterministic.Clement Courbet2019-02-151-87/+107
| | | | | | | | | | | | | | | | | | Summary: The idea is that we now manipulate bases through a `unsigned BaseID` based on order of appearance in the comparison chain rather than through the `Value*`. Fixes 40714. Reviewers: gchatelet Subscribers: mgrang, jfb, jdoerfert, llvm-commits, hans Tags: #llvm Differential Revision: https://reviews.llvm.org/D58274 llvm-svn: 354131
* [MergeICmps][NFC] Improve doc.Clement Courbet2019-02-151-5/+28
| | | | llvm-svn: 354128
* [NFCI] Factor out block removal from stack of nested loopsMax Kazantsev2019-02-151-6/+14
| | | | llvm-svn: 354124
* Fix "field 'DFS' will be initialized after field 'DTU'" warning. NFCI.Simon Pilgrim2019-02-151-1/+1
| | | | llvm-svn: 354123
* [NFC] Promote DFS to field for further useMax Kazantsev2019-02-151-2/+2
| | | | llvm-svn: 354118
* [NFC] Tweak SplitBlockAndInsertIfThen to use existing ThenBlockMax Kazantsev2019-02-151-8/+16
| | | | llvm-svn: 354107
* [ThinLTO] Detect partially split modules during the thin linkTeresa Johnson2019-02-142-18/+16
| | | | | | | | | | | | | | | | | | | | | | | | Summary: The changes to disable LTO unit splitting by default (r350949) and detect inconsistently split LTO units (r350948) are causing some crashes when the inconsistency is detected in multiple threads simultaneously. Fix that by having the code always look for the inconsistently split LTO units during the thin link, by checking for the presence of type tests recorded in the summaries. Modify test added in r350948 to remove single threading required to fix a bot failure due to this issue (and some debugging options added in the process of diagnosing it). Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57561 llvm-svn: 354062
* [InstCombine] Add todos for possible atomicrmw transformsPhilip Reames2019-02-141-0/+6
| | | | llvm-svn: 354059
* Canonicalize all integer "idempotent" atomicrmw opsPhilip Reames2019-02-141-5/+13
| | | | | | | | | | For "idempotent" atomicrmw instructions which we can't simply turn into load, canonicalize the operation and constant. This reduces the matching needed elsewhere in the optimizer, but doesn't directly impact codegen. For any architecture where OR/Zero is not a good default choice, you can extend the AtomicExpand lowerIdempotentRMWIntoFencedLoad mechanism. I reviewed X86 to make sure this works well, haven't audited other backends. Differential Revision: https://reviews.llvm.org/D58244 llvm-svn: 354058
* Teach instcombine about remaining "idempotent" atomicrmw typesPhilip Reames2019-02-141-30/+60
| | | | | | | | | | Expand on Quentin's r353471 patch which converts some atomicrmws into loads. Handle remaining operation types, and fix a slight bug. Atomic loads are required to have alignment. Since this was within the InstCombine fixed point, somewhere else in InstCombine was adding alignment before the verifier saw it, but still, we should fix. Terminology wise, I'm using the "idempotent" naming that is used for the same operations in AtomicExpand and X86ISelLoweringInfo. Once this lands, I'll add similar tests for AtomicExpand, and move the pattern match function to a common location. In the review, there was seemingly consensus that "idempotent" was slightly incorrect for this context. Once we setle on a better name, I'll update all uses at once. Differential Revision: https://reviews.llvm.org/D58242 llvm-svn: 354046
* Refine ArgPromotion metadata handlingTeresa Johnson2019-02-141-0/+2
| | | | | | | | | | | | | | | | | | | | Summary: In r353537 we now copy all metadata to the new function, with the old being removed when the old function is eliminated. In some cases the old function is dropped to a declaration (seems to only occur with the old PM). Go ahead and clear all metadata from the old function to handle that case, since verification will complain otherwise. This is consistent with what was being done for debug metadata before r353537. Reviewers: davidxl, uabelho Subscribers: jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58215 llvm-svn: 354032
* [LoopUnrollPeel] Add case where we should forget the peeled loop from SCEV.Florian Hahn2019-02-141-10/+8
| | | | | | | | | | | | | | | | | | | | The test case requires the peeled loop to be forgotten after peeling, even though it does not have a parent. When called via the unroller, SE->forgetTopmostLoop is also called, so the test case would also pass without any SCEV invalidation, but peelLoop is exposed as utility function. Also, in the test case, simplifyLoop will make changes, removing the loop from SCEV, but it is better to not rely on this behavior. Reviewers: sanjoy, mkazantsev Reviewed By: mkazantsev Tags: #llvm Differential Revision: https://reviews.llvm.org/D58192 llvm-svn: 354031
* [Instrumentation][NFC] Fix warning.Clement Courbet2019-02-141-1/+1
| | | | | | lib/Transforms/Instrumentation/AddressSanitizer.cpp:1173:29: warning: extra ‘;’ [-Wpedantic] llvm-svn: 354024
* [NFC] Refactor LICM code for better readabilityMax Kazantsev2019-02-141-6/+11
| | | | llvm-svn: 354013
* [NewPM] Second attempt at porting ASanLeonard Chan2019-02-132-184/+261
| | | | | | | | | | | | | | | | | | | | This is the second attempt to port ASan to new PM after D52739. This takes the initialization requried by ASan from the Module by moving it into a separate class with it's own analysis that the new PM ASan can use. Changes: - Split AddressSanitizer into 2 passes: 1 for the instrumentation on the function, and 1 for the pass itself which creates an instance of the first during it's run. The same is done for AddressSanitizerModule. - Add new PM AddressSanitizer and AddressSanitizerModule. - Add legacy and new PM analyses for reading data needed to initialize ASan with. - Removed DominatorTree dependency from ASan since it was unused. - Move GlobalsMetadata and ShadowMapping out of anonymous namespace since the new PM analysis holds these 2 classes and will need to expose them. Differential Revision: https://reviews.llvm.org/D56470 llvm-svn: 353985
OpenPOWER on IntegriCloud