summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
...
* [StatepointsForGC] Rematerialize in the presence of PHIsAnna Thomas2016-08-291-0/+35
| | | | | | | | | | | | | | | | Summary: While walking the use chain for identifying rematerializable values in RS4GC, add the case where the current value and base value are the same PHI nodes. This will aid rematerialization of geps and casts instead of relocating. Reviewers: sanjoy, reames, igor Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23920 llvm-svn: 279975
* [Coroutines] Part 9: Add cleanup subfunction.Gor Nishanov2016-08-297-65/+140
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: [Coroutines] Part 9: Add cleanup subfunction. This patch completes coroutine heap allocation elision. Now, the heap elision example from docs\Coroutines.rst compiles and produces expected result (see test/Transform/Coroutines/ex3.ll) Intrinsic Changes: * coro.free gets a token parameter tying it to coro.id to allow reliably discovering all coro.frees associated with a particular coroutine. * coro.id gets an extra parameter that points back to a coroutine function. This allows to check whether a coro.id describes the enclosing function or it belongs to a different function that was later inlined. CoroSplit now creates three subfunctions: # f$resume - resume logic # f$destroy - cleanup logic, followed by a deallocation code # f$cleanup - just the cleanup code CoroElide pass during devirtualization replaces coro.destroy with either f$destroy or f$cleanup depending whether heap elision is performed or not. Other fixes, improvements: * Fixed buglet in Shape::buildFrame that was not creating coro.save properly if coroutine has more than one suspend point. * Switched to using variable width suspend index field (no longer limited to 32 bit index field can be as little as i1 or as large as i<whatever-size_t-is>) Reviewers: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23844 llvm-svn: 279971
* [InstCombine] use m_APInt to allow icmp (and X, Y), C folds for splat ↵Sanjay Patel2016-08-281-32/+34
| | | | | | constant vectors llvm-svn: 279937
* GVN-hoist: invalidate MD cache (PR29144)Sebastian Pop2016-08-271-0/+2
| | | | | | | | | Without invalidating the entries in the MD cache we would try to access instructions that were removed in previous iterations of hoisting. Differential Revision: https://reviews.llvm.org/D23927 llvm-svn: 279907
* [Inliner] Report when inlining fails because callee's def is unavailableAdam Nemet2016-08-261-2/+11
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is obviously an interesting case because it may motivate code restructuring or LTO. Reporting this requires instantiation of ORE in the loop where the call sites are first gathered. I've checked compile-time overhead *with* -Rpass-with-hotness and the worst slow-down was 6% in mcf and quickly tailing off. As before without -Rpass-with-hotness there is no overhead. Because this could be a pretty noisy diagnostics, it is currently qualified as 'verbose'. As of this patch, 'verbose' diagnostics are only emitted with -Rpass-with-hotness, i.e. when the output is expected to be filtered. Reviewers: eraman, chandlerc, davidxl, hfinkel Subscribers: tejohnson, Prazek, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D23415 llvm-svn: 279860
* [InstCombine] add helper function for icmp (and (sh X, Y), C2), C1 ; NFCSanjay Patel2016-08-262-45/+64
| | | | | | | | Like other recent changes near here, the goal is to allow vector types for all of these folds. Splitting things up makes it easier to incrementally enhance the code and easier to read. llvm-svn: 279851
* [InstCombine] clean up foldICmpAndConstConst(); NFCSanjay Patel2016-08-261-172/+166
| | | | | | | | 1. Early exit to reduce indent 2. Fix comments and variable names to match 3. Reformat comments / clang-format code llvm-svn: 279837
* [InstCombine] add helper function for folding of icmp (and X, C2), C; NFCSanjay Patel2016-08-262-6/+21
| | | | llvm-svn: 279834
* limit the number of instructions per block examined by dead store eliminationBob Haarman2016-08-261-2/+10
| | | | | | | | | | | | Summary: Dead store elimination gets very expensive when large numbers of instructions need to be analyzed. This patch limits the number of instructions analyzed per store to the value of the memdep-block-scan-limit parameter (which defaults to 100). This resulted in no observed difference in performance of the generated code, and no change in the statistics for the dead store elimination pass, but improved compilation time on some files by more than an order of magnitude. Reviewers: dexonsmith, bruno, george.burgess.iv, dberlin, reames, davidxl Subscribers: davide, chandlerc, dberlin, davidxl, eraman, tejohnson, mbodart, llvm-commits Differential Revision: https://reviews.llvm.org/D15537 llvm-svn: 279833
* [InstCombine] rename variables in foldICmpAndConstant(); NFCSanjay Patel2016-08-261-54/+55
| | | | llvm-svn: 279831
* test commitBob Haarman2016-08-261-1/+0
| | | | llvm-svn: 279830
* [LoopUnroll] Use OptimizationRemarkEmitter directly not via the analysis passAdam Nemet2016-08-262-5/+4
| | | | | | | | | | | | | | | | We can't mark ORE (a function pass) preserved as required by the loop passes because that is how we ensure that the required passes like LazyBFI are all available any time ORE is used. See the new comments in the patch. Instead we use it directly just like the inliner does in D22694. As expected there is some additional overhead after removing the caching provided by analysis passes. The worst case, I measured was LNT/CINT2006_ref/401.bzip2 which regresses by 12%. As before, this only affects -Rpass-with-hotness and not default compilation. llvm-svn: 279829
* [InstCombine] rename variables in foldICmpDivConstant(); NFCSanjay Patel2016-08-261-29/+28
| | | | | | | | | | | Removing the redundant 'CmpRHSV' local variable exposes a bug in the caller foldICmpShrConstant() - it was sending in the div constant instead of the cmp constant. But I have not been able to expose this in a regression test yet - the affected folds all appear to be handled before we ever reach this code. I'll keep trying to find a case as I make changes to allow vector folds in both functions. llvm-svn: 279828
* [MemCpy] Add comments for r279769Tim Shen2016-08-251-1/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D23846 llvm-svn: 279778
* [MemCpy] Check for alias in performMemCpyToMemSetOptzn, instead of the ↵Tim Shen2016-08-251-1/+3
| | | | | | | | | | | | | | | identity of two operands Summary: This fixes pr29105. The reason is that lifetime marks creates new aliasing pointers the original ones, but before this patch aliases were not checked in performMemCpyToMemSetOptzn. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23846 llvm-svn: 279769
* [UNROLL] Postpone ScalarEvolution::forgetLoop after TripCountSC is expandedWei Mi2016-08-251-5/+6
| | | | | | | | | | | | | | when unroll runtime iteration loop. In llvm::UnrollRuntimeLoopRemainder, if the loop to be unrolled is the inner loop inside a loop nest, the scalar evolution needs to be dropped for its parent loop which is done by ScalarEvolution::forgetLoop. However, we can postpone forgetLoop to the end of UnrollRuntimeLoopRemainder so TripCountSC expansion can still reuse existing value. Differential Revision: https://reviews.llvm.org/D23572 llvm-svn: 279748
* GVN-hoist: fix hoistingFromAllPaths for loops (PR29034)Sebastian Pop2016-08-251-31/+46
| | | | | | | | | | | | | | | | It is invalid to hoist stores or loads if they are not executed on all paths from the hoisting point to the exit of the function. In the testcase, there are paths in the loop that do not execute the stores or the loads, and so hoisting them within the loop is unsafe. The problem is that the current implementation of hoistingFromAllPaths is incomplete: it walks all blocks dominated by the hoisting point, and does not return false when the loop contains a path on which the hoisted ld/st is not executed. Differential Revision: https://reviews.llvm.org/D23843 llvm-svn: 279732
* [Profile] Propagate branch metadata properly in instcombineXinliang David Li2016-08-251-11/+15
| | | | | | Differential Revision: http://reviews.llvm.org/D23590 llvm-svn: 279693
* [InstCombine] move foldICmpDivConstConst() contents to ↵Sanjay Patel2016-08-242-167/+158
| | | | | | | | | foldICmpDivConstant(); NFCI There was no logic in foldICmpDivConstant, so no need for a separate function. The code is directly copy/pasted, so further cleanups to follow. llvm-svn: 279685
* [InstCombine] use m_APInt to allow icmp eq/ne (shr X, C2), C folds for splat ↵Sanjay Patel2016-08-241-16/+19
| | | | | | constant vectors llvm-svn: 279677
* [LV] Unify vector and scalar mapsMatthew Simpson2016-08-241-207/+284
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch unifies the data structures we use for mapping instructions from the original loop to their corresponding instructions in the new loop. Previously, we maintained two distinct maps for this purpose: WidenMap and ScalarIVMap. WidenMap maintained the vector values each instruction from the old loop was represented with, and ScalarIVMap maintained the scalar values each scalarized induction variable was represented with. With this patch, all values created for the new loop are maintained in VectorLoopValueMap. The change allows for several simplifications. Previously, when an instruction was scalarized, we had to insert the scalar values into vectors in order to maintain the mapping in WidenMap. Then, if a user of the scalarized value was also scalar, we had to extract the scalar values from the temporary vector we created. We now aovid these unnecessary scalar-to-vector-to-scalar conversions. If a scalarized value is used by a scalar instruction, the scalar value is used directly. However, if the scalarized value is needed by a vector instruction, we generate the needed insertelement instructions on-demand. A common idiom in several locations in the code (including the scalarization code), is to first get the vector values an instruction from the original loop maps to, and then extract a particular scalar value. This patch adds getScalarValue for this purpose along side getVectorValue as an interface into VectorLoopValueMap. These functions work together to return the requested values if they're available or to produce them if they're not. The mapping has also be made less permissive. Entries can be added to VectorLoopValue map with the new initVector and initScalar functions. getVectorValue has been modified to return a constant reference to the mapped entries. There's no real functional change with this patch; however, in some cases we will generate slightly different code. For example, instead of an insertelement sequence following the definition of an instruction, it will now precede the first use of that instruction. This can be seen in the test case changes. Differential Revision: https://reviews.llvm.org/D23169 llvm-svn: 279649
* [SCCP] Don't delete side-effecting instructionsSanjoy Das2016-08-241-17/+6
| | | | | | | | I'm not sure if the `!isa<CallInst>(Inst) && !isa<TerminatorInst>(Inst))` bit is correct either, but this fixes the case we know is broken. llvm-svn: 279647
* [InstCombine] add assert and explanatory comment for fold removed in ↵Sanjay Patel2016-08-241-0/+7
| | | | | | | | | | | | | | | | r279568; NFC I deleted a fold from InstCombine at: https://reviews.llvm.org/rL279568 because it (like any InstCombine to a constant?) should always happen in InstSimplify, however, it's not obvious what the assumptions are in the remaining code. Add a comment and assert to make it clearer. Differential Revision: https://reviews.llvm.org/D23819 llvm-svn: 279626
* [Loop Vectorizer] Support predication of div/remGil Rapaport2016-08-241-73/+234
| | | | | | | | | | div/rem instructions in basic blocks that require predication currently prevent vectorization. This patch extends the existing mechanism for predicating stores to handle other instructions and leverages it to predicate divs and rems. Differential Revision: https://reviews.llvm.org/D22918 llvm-svn: 279620
* [PM] Introduce basic update capabilities to the new PM's CGSCC passChandler Carruth2016-08-241-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | manager, including both plumbing and logic to handle function pass updates. There are three fundamentally tied changes here: 1) Plumbing *some* mechanism for updating the CGSCC pass manager as the CG changes while passes are running. 2) Changing the CGSCC pass manager infrastructure to have support for the underlying graph to mutate mid-pass run. 3) Actually updating the CG after function passes run. I can separate them if necessary, but I think its really useful to have them together as the needs of #3 drove #2, and that in turn drove #1. The plumbing technique is to extend the "run" method signature with extra arguments. We provide the call graph that intrinsically is available as it is the basis of the pass manager's IR units, and an output parameter that records the results of updating the call graph during an SCC passes's run. Note that "...UpdateResult" isn't a *great* name here... suggestions very welcome. I tried a pretty frustrating number of different data structures and such for the innards of the update result. Every other one failed for one reason or another. Sometimes I just couldn't keep the layers of complexity right in my head. The thing that really worked was to just directly provide access to the underlying structures used to walk the call graph so that their updates could be informed by the *particular* nature of the change to the graph. The technique for how to make the pass management infrastructure cope with mutating graphs was also something that took a really, really large number of iterations to get to a place where I was happy. Here are some of the considerations that drove the design: - We operate at three levels within the infrastructure: RefSCC, SCC, and Node. In each case, we are working bottom up and so we want to continue to iterate on the "lowest" node as the graph changes. Look at how we iterate over nodes in an SCC running function passes as those function passes mutate the CG. We continue to iterate on the "lowest" SCC, which is the one that continues to contain the function just processed. - The call graph structure re-uses SCCs (and RefSCCs) during mutation events for the *highest* entry in the resulting new subgraph, not the lowest. This means that it is necessary to continually update the current SCC or RefSCC as it shifts. This is really surprising and subtle, and took a long time for me to work out. I actually tried changing the call graph to provide the opposite behavior, and it breaks *EVERYTHING*. The graph update algorithms are really deeply tied to this particualr pattern. - When SCCs or RefSCCs are split apart and refined and we continually re-pin our processing to the bottom one in the subgraph, we need to enqueue the newly formed SCCs and RefSCCs for subsequent processing. Queuing them presents a few challenges: 1) SCCs and RefSCCs use wildly different iteration strategies at a high level. We end up needing to converge them on worklist approaches that can be extended in order to be able to handle the mutations. 2) The order of the enqueuing need to remain bottom-up post-order so that we don't get surprising order of visitation for things like the inliner. 3) We need the worklists to have set semantics so we don't duplicate things endlessly. We don't need a *persistent* set though because we always keep processing the bottom node!!!! This is super, super surprising to me and took a long time to convince myself this is correct, but I'm pretty sure it is... Once we sink down to the bottom node, we can't re-split out the same node in any way, and the postorder of the current queue is fixed and unchanging. 4) We need to make sure that the "current" SCC or RefSCC actually gets enqueued here such that we re-visit it because we continue processing a *new*, *bottom* SCC/RefSCC. - We also need the ability to *skip* SCCs and RefSCCs that get merged into a larger component. We even need the ability to skip *nodes* from an SCC that are no longer part of that SCC. This led to the design you see in the patch which uses SetVector-based worklists. The RefSCC worklist is always empty until an update occurs and is just used to handle those RefSCCs created by updates as the others don't even exist yet and are formed on-demand during the bottom-up walk. The SCC worklist is pre-populated from the RefSCC, and we push new SCCs onto it and blacklist existing SCCs on it to get the desired processing. We then *directly* update these when updating the call graph as I was never able to find a satisfactory abstraction around the update strategy. Finally, we need to compute the updates for function passes. This is mostly used as an initial customer of all the update mechanisms to drive their design to at least cover some real set of use cases. There are a bunch of interesting things that came out of doing this: - It is really nice to do this a function at a time because that function is likely hot in the cache. This means we want even the function pass adaptor to support online updates to the call graph! - To update the call graph after arbitrary function pass mutations is quite hard. We have to build a fairly comprehensive set of data structures and then process them. Fortunately, some of this code is related to the code for building the cal graph in the first place. Unfortunately, very little of it makes any sense to share because the nature of what we're doing is so very different. I've factored out the one part that made sense at least. - We need to transfer these updates into the various structures for the CGSCC pass manager. Once those were more sanely worked out, this became relatively easier. But some of those needs necessitated changes to the LazyCallGraph interface to make it significantly easier to extract the changed SCCs from an update operation. - We also need to update the CGSCC analysis manager as the shape of the graph changes. When an SCC is merged away we need to clear analyses associated with it from the analysis manager which we didn't have support for in the analysis manager infrsatructure. New SCCs are easy! But then we have the case that the original SCC has its shape changed but remains in the call graph. There we need to *invalidate* the analyses associated with it. - We also need to invalidate analyses after we *finish* processing an SCC. But the analyses we need to invalidate here are *only those for the newly updated SCC*!!! Because we only continue processing the bottom SCC, if we split SCCs apart the original one gets invalidated once when its shape changes and is not processed farther so its analyses will be correct. It is the bottom SCC which continues being processed and needs to have the "normal" invalidation done based on the preserved analyses set. All of this is mostly background and context for the changes here. Many thanks to all the reviewers who helped here. Especially Sanjoy who caught several interesting bugs in the graph algorithms, David, Sean, and others who all helped with feedback. Differential Revision: http://reviews.llvm.org/D21464 llvm-svn: 279618
* [Coroutines] Fix unused var warning in release buildGor Nishanov2016-08-241-2/+2
| | | | llvm-svn: 279610
* [Coroutines] Part 8: Coroutine Frame Building algorithmGor Nishanov2016-08-241-7/+545
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch adds coroutine frame building algorithm. Now, simple coroutines such as ex0.ll and ex1.ll (first examples from docs\Coroutines.rst can be compiled). Documentation and overview is here: http://llvm.org/docs/Coroutines.html. Upstreaming sequence (rough plan) 1.Add documentation. (https://reviews.llvm.org/D22603) 2.Add coroutine intrinsics. (https://reviews.llvm.org/D22659) ... 7. Split coroutine into subfunctions. (https://reviews.llvm.org/D23461) 8. Coroutine Frame Building algorithm <= we are here 9. Add f.cleanup subfunction. 10+. The rest of the logic Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D23586 llvm-svn: 279609
* [ADCE] Add control dependence computationDavid Callahan2016-08-241-21/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is part of a serious of patches to evolve ADCE.cpp to support removing of unnecessary control flow. This patch adds the ability to compute control dependences using the iterated dominance frontier. We extend the liveness propagation to alternate between data and control dependences until convergences. Modify the pass manager intergation to compute the post-dominator tree needed for iterator dominance frontier. We still force all terminators live for now until we add code to handlinge removing control flow in a later patch. No changes to effective behavior with this patch Previous patches: D23225 [ADCE] Modify data structures to support removing control flow D23065 [ADCE] Refactor anticipating new functionality (NFC) D23102 [ADCE] Refactoring for new functionality (NFC) Reviewers: nadav, majnemer, mehdi_amini Subscribers: twoh, freik, llvm-commits Differential Revision: https://reviews.llvm.org/D23559 llvm-svn: 279594
* [LoopUnroll] By default disable unrolling when optimizing for size.Michael Zolotukhin2016-08-231-0/+4
| | | | | | | | | | | | | | | | | | | | | Summary: In clang commit r268509 we started to invoke loop-unroll pass from the driver even under -Os. However, we happen to not initialize optsize thresholds properly, which si fixed with this change. r268509 led to some big compile time regressions, because we started to unroll some loops that we didn't unroll before. With this change I hope to recover most of the regressions. We still are slightly slower than before, because we do some checks here and there in loop-unrolling before we bail out, but at least the slowdown is not that huge now. Reviewers: hfinkel, chandlerc Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D23388 llvm-svn: 279585
* [InstCombine] use local variables for repeated values; NFCISanjay Patel2016-08-231-12/+9
| | | | llvm-svn: 279578
* [InstCombine] move foldICmpShrConstConst() contents to foldICmpShrConst(); NFCISanjay Patel2016-08-232-77/+65
| | | | | | | There will only be 3 lines of code in foldICmpShrConst() when the cleanup is done, so it doesn't make much sense to have a separate function for a single fold. llvm-svn: 279575
* [InstCombine] remove icmp shr folds that are already handled by InstSimplifySanjay Patel2016-08-231-17/+3
| | | | | | | | AFAICT, these already worked in all cases for scalar types, and I enhanced the code to work for vector types in: https://reviews.llvm.org/rL279543 llvm-svn: 279568
* [SLP] Avoid signed integer overflowMatthew Simpson2016-08-231-9/+35
| | | | | | | | | | | | | | | | | | | The test case included with r279125 exposed an existing signed integer overflow. Since getTreeCost can return INT_MAX, we can't sum this cost together with other costs, such as getReductionCost. This patch removes the possibility of assigning a cost of INT_MAX. Since we were previously using INT_MAX as an indicator for "should not vectorize", we now explicitly check this condition with "isTreeTinyAndNotFullyVectorizable" before computing a cost. This patch adds a run-line to the test case used for r279125 that ensures we don't vectorize. Previously, this line would vectorize the test case by chance due to undefined behavior in the cost calculation. Differential Revision: https://reviews.llvm.org/D23723 llvm-svn: 279562
* Possible fix of test failures on win bots Xinliang David Li2016-08-231-3/+3
| | | | llvm-svn: 279542
* [Profile] refactor meta data copying/swapping codeXinliang David Li2016-08-231-37/+8
| | | | | | Differential Revision: http://reviews.llvm.org/D23619 llvm-svn: 279523
* GVNHoist: Use the pass version of MemorySSA and preserve it.Daniel Berlin2016-08-231-9/+12
| | | | | | | | | | | | Summary: GVNHoist: Use the pass version of MemorySSA and preserve it. Reviewers: sebpop, george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23782 llvm-svn: 279504
* [MemorySSA] Remove unused field. NFC.George Burgess IV2016-08-221-6/+1
| | | | | | | | Given that we're not currently using blocker info, and whether or not we will end up using it it is unclear, don't waste 8 (or 4) bytes of memory per path node. llvm-svn: 279493
* [InstCombine] change param type from Instruction to BinaryOperator for icmp ↵Sanjay Patel2016-08-222-97/+109
| | | | | | | | helpers; NFCI This saves some casting in the helper functions and eases some further refactoring. llvm-svn: 279478
* [GraphTraits] Replace all NodeType usage with NodeRefTim Shen2016-08-221-9/+4
| | | | | | | | This should finish the GraphTraits migration. Differential Revision: http://reviews.llvm.org/D23730 llvm-svn: 279475
* [InstCombine] use m_APInt to allow icmp (shr exact X, Y), 0 folds for splat ↵Sanjay Patel2016-08-221-14/+13
| | | | | | constant vectors llvm-svn: 279472
* MSSA: Factor out phi node placementDaniel Berlin2016-08-221-17/+22
| | | | llvm-svn: 279462
* MSSA: Only rename accesses whose defining access is nullptrDaniel Berlin2016-08-221-14/+6
| | | | llvm-svn: 279461
* [SimplifyCFG] Rewrite SinkThenElseCodeToEndJames Molloy2016-08-221-150/+236
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [Recommitting now an unrelated assertion in SROA is sorted out] The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return *b += 3; else return *b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. Round 4. This time we should handle all instructions correctly, and not replace any operands that need to be constant with variables. This was really hard to determine safely, so the helper function should be put into the Instruction API. I'll do that as a followup. llvm-svn: 279460
* [SROA] Remove incorrect assertionJames Molloy2016-08-221-3/+0
| | | | | | | | | Confirmed with aprantl, this assertion is incorrect - code can get here (for example 80-bit FP types) and if it does it's benign. This is exposed by a completely unrelated patch of mine, so stop the compiler falling over. Original differential: http://reviews.llvm.org/D16187 aprantl's advice to remove assertion: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160815/382129.html llvm-svn: 279454
* [InstCombine] Allow sinking from unique predecessor with multiple edgesJun Bum Lim2016-08-221-1/+1
| | | | | | | | | | | | Summary: We can allow sinking if the single user block has only one unique predecessor, regardless of the number of edges. Note that a switch statement with multiple cases can have the same destination. Reviewers: mcrosier, majnemer, spatel, reames Subscribers: reames, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D23722 llvm-svn: 279448
* Revert "[SimplifyCFG] Rewrite SinkThenElseCodeToEnd"James Molloy2016-08-221-236/+150
| | | | | | This reverts commit r279443. It caused buildbot failures. llvm-svn: 279447
* [SimplifyCFG] Rewrite SinkThenElseCodeToEndJames Molloy2016-08-221-150/+236
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new version has several advantages: 1) IMSHO it's more readable and neater 2) It handles loads and stores properly 3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch. With this change we can now finally sink load-modify-store idioms such as: if (a) return *b += 3; else return *b += 4; => %z = load i32, i32* %y %.sink = select i1 %a, i32 5, i32 7 %b = add i32 %z, %.sink store i32 %b, i32* %y ret i32 %b When this works for switches it'll be even more powerful. Round 4. This time we should handle all instructions correctly, and not replace any operands that need to be constant with variables. This was really hard to determine safely, so the helper function should be put into the Instruction API. I'll do that as a followup. llvm-svn: 279443
* Revert -r278269 [IndVarSimplify] Eliminate zext of a signed IV when the IV ↵Artur Pilipenko2016-08-221-7/+2
| | | | | | | | | | is known to be non-negative This change needs to be reverted in order to revert -r278267 which cause performance regression on MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt from LNT and some other bechmarks. See comments on https://reviews.llvm.org/D18777 for details. llvm-svn: 279432
* [asan] Use 1 byte aligned stores to poison shadow memoryVitaly Buka2016-08-221-2/+2
| | | | | | | | | | | | Summary: r279379 introduced crash on arm 32bit bot. I suspect this is alignment issue. Reviewers: eugenis Subscribers: llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D23762 llvm-svn: 279413
* [InstCombine] use m_APInt to allow icmp (shl X, Y), C folds for splat ↵Sanjay Patel2016-08-211-13/+10
| | | | | | | | | | | constant vectors, part 4 This concludes the fixes for icmp+shl in this series: https://reviews.llvm.org/rL279339 https://reviews.llvm.org/rL279398 https://reviews.llvm.org/rL279399 llvm-svn: 279401
OpenPOWER on IntegriCloud