summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
...
* [ModRefInfo] Make enum ModRefInfo an enum class [NFC].Alina Sbirlea2017-12-073-7/+7
| | | | | | | | | | | | | | | Summary: Make enum ModRefInfo an enum class. Changes to ModRefInfo values should be done using inline wrappers. This should prevent future bit-wise opearations from being added, which can be more error-prone. Reviewers: sanjoy, dberlin, hfinkel, george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40933 llvm-svn: 320107
* [PGO] detect infinite loop and form MST properlyXinliang David Li2017-12-072-38/+32
| | | | | | Differential Revision: http://reviews.llvm.org/D40873 llvm-svn: 320104
* [InstCombine] Don't crash on out of bounds index in the insertelementIgor Laevsky2017-12-071-0/+4
| | | | | | Differential Revision: https://reviews.llvm.org/D40390 llvm-svn: 320049
* [LV] Interleaved access vectorization: fix computing new alias infoAdam Nemet2017-12-061-2/+16
| | | | | | | | | | | As a new access is generated spanning across multiple fields, we need to propagate alias info from all the fields to form the most generic alias info. rdar://35602528 Differential Revision: https://reviews.llvm.org/D40617 llvm-svn: 319979
* [InstCombine] canonicalize constant-minus-boolean to select-of-constantsSanjay Patel2017-12-061-1/+6
| | | | | | | | | | | | | | | | | | | | | | | This restores the half of: https://reviews.llvm.org/rL75531 that was reverted at: https://reviews.llvm.org/rL159230 For the x86 case mentioned there, we now produce: leal 1(%rdi), %eax subl %esi, %eax We have target hooks to invert this in DAGCombiner (and x86 is enabled) with: https://reviews.llvm.org/rL296977 https://reviews.llvm.org/rL311731 AArch64 and possibly other targets would probably benefit from enabling those hooks too. See PR30327: https://bugs.llvm.org/show_bug.cgi?id=30327#c2 Differential Revision: https://reviews.llvm.org/D40612 llvm-svn: 319964
* [PGO] Make indirect call promotion a utilityMatthew Simpson2017-12-064-316/+343
| | | | | | | | | | | | | | | | This patch factors out the main code transformation utilities in the pgo-driven indirect call promotion pass and places them in Transforms/Utils. The change is intended to be a non-functional change, letting non-pgo-driven passes share a common implementation with the existing pgo-driven pass. The common utilities are used to conditionally promote indirect call sites to direct call sites. They perform the underlying transformation, and do not consider profile information. The pgo-specific details (e.g., the computation of branch weight metadata) have been left in the indirect call promotion pass. Differential Revision: https://reviews.llvm.org/D40658 llvm-svn: 319963
* [ModRefInfo] Do not use ModRefInfo result in if conditions as this makesAlina Sbirlea2017-12-061-1/+2
| | | | | | | assumptions about the values in the enum. Replace with wrapper returning bool [NFC]. llvm-svn: 319949
* [InlineFunction] Only replace call if there are VarArgs to forward.Florian Hahn2017-12-061-1/+2
| | | | | | | | | | | | | | | | Summary: There is no need to replace the original call instruction if no VarArgs need to be forwarded. Reviewers: davide, rnk, majnemer, efriedma Reviewed By: efriedma Subscribers: eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D40412 llvm-svn: 319947
* [LoopUtils] simplify createTargetReduction(); NFCISanjay Patel2017-12-061-41/+25
| | | | llvm-svn: 319946
* [LoopUtils] fix variable name to match FMF vocabulary; NFCSanjay Patel2017-12-061-4/+4
| | | | llvm-svn: 319928
* Revert r319482 and r319483 "[memcpyopt] Teach memcpyopt to optimize across ↵Hans Wennborg2017-12-061-28/+3
| | | | | | | | | | | | | | | | | | | | | | | | basic blocks" This caused PR35519. > [memcpyopt] Teach memcpyopt to optimize across basic blocks > > This teaches memcpyopt to make a non-local memdep query when a local query > indicates that the dependency is non-local. This notably allows it to > eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%. > > Fixes PR28958. > > Differential Revision: https://reviews.llvm.org/D38374 > > [memcpyopt] Commit file missed in r319482. > > This change was meant to be included with r319482 but was accidentally > omitted. llvm-svn: 319873
* Revert r319794: [PGO] detect infinite loop and form MST properly: memory ↵Xinliang David Li2017-12-052-57/+15
| | | | | | leak problem llvm-svn: 319841
* Modify ModRefInfo values using static inline method abstractions [NFC].Alina Sbirlea2017-12-057-29/+26
| | | | | | | | | | | | | | | | | Summary: The aim is to make ModRefInfo checks and changes more intuitive and less error prone using inline methods that abstract the bit operations. Ideally ModRefInfo would become an enum class, but that change will require a wider set of changes into FunctionModRefBehavior. Reviewers: sanjoy, george.burgess.iv, dberlin, hfinkel Subscribers: nlopes, llvm-commits Differential Revision: https://reviews.llvm.org/D40749 llvm-svn: 319821
* [CVP] Remove some {s|u}sub.with.overflow checks.Joel Galenson2017-12-051-8/+17
| | | | | | | | This uses ConstantRange::makeGuaranteedNoWrapRegion's newly-added handling for subtraction to allow CVP to remove some subtraction overflow checks. Differential Revision: https://reviews.llvm.org/D40039 llvm-svn: 319807
* Test commit.Joel Galenson2017-12-051-1/+1
| | | | | | I removed a space at the end of a comment. NFC. llvm-svn: 319803
* [PGO] detect infinite loop and form MST properlyXinliang David Li2017-12-052-15/+57
| | | | | | Differential Revision: http://reviews.llvm.org/D40702 llvm-svn: 319794
* Bail out of a SimplifyCFG switch table opt at undef values.Mikael Holmen2017-12-051-1/+1
| | | | | | | | | | | | | | | | | | | Summary: A true or false result is expected from a comparison, but it seems the possibility of undef was overlooked, which could lead to a failed assert. This is fixed by this patch by bailing out if we encounter undef. The bug is old and the assert has been there since the end of 2014, so it seems this is unusual enough to forego optimization. Patch by JesperAntonsson. Reviewers: spatel, eeckstein, hans Reviewed By: hans Subscribers: uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D40639 llvm-svn: 319768
* [msan] Add a fixme note for a minor deficiency.Evgeniy Stepanov2017-12-041-0/+2
| | | | llvm-svn: 319708
* Move splitIndirectCriticalEdges() to BasicBlockUtils.h.Hiroshi Yamauchi2017-12-041-0/+139
| | | | | | | | | | | | | | | | Summary: Move splitIndirectCriticalEdges() from CodeGenPrepare to BasicBlockUtils.h so that it can be called from other places. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40750 llvm-svn: 319689
* [BypassSlowDivision] Improve our handling of divisions by constantsSanjoy Das2017-12-041-7/+13
| | | | | | | | | | | | | | | | | | | (This reapplies r314253. r314253 was reverted on r314482 because of a correctness regression on P100, but that regression was identified to be something else.) Summary: Don't bail out on constant divisors for divisions that can be narrowed without introducing control flow . This gives us a 32 bit multiply instead of an emulated 64 bit multiply in the generated PTX assembly. Reviewers: jlebar Subscribers: jholewinski, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38265 llvm-svn: 319677
* [Loop Predication] Teach LP about reverse loopsAnna Thomas2017-12-041-58/+135
| | | | | | | | | | | | | | | | | | | | | | | Summary: Currently, we only support predication for forward loops with step of 1. This patch enables loop predication for reverse or countdownLoops, which satisfy the following conditions: 1. The step of the IV is -1. 2. The loop has a singe latch as B(X) = X <pred> latchLimit with pred as s> or u> 3. The IV of the guard is the decrement IV of the latch condition (Guard is: G(X) = X-1 u< guardLimit). This patch was downstream for a while and is the last series of patches that's from our LP implementation downstream. Reviewers: apilipenko, mkazantsev, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40353 llvm-svn: 319659
* [IndVars] Fix a bug introduced in r317012Philip Reames2017-12-011-3/+13
| | | | | | | | Turns out we can have comparisons which are indirect users of the induction variable that we can make invariant. In this case, there is no loop invariant value contributing and we'd fail an assert. The test case was found by a java fuzzer and reduced. It's a real cornercase. You have to have a static loop which we've already proven only executes once, but haven't broken the backedge on, and an inner phi whose result can be constant folded by SCEV using exit count reasoning but not proven by isKnownPredicate. To my knowledge, only the fuzzer has hit this case. llvm-svn: 319583
* Revert r319531 "[SLPVectorizer] Failure to beneficially vectorize 'copyable' ↵Hans Wennborg2017-12-011-343/+143
| | | | | | | | | | | | | | | | | | | | | | | | | | | | elements in integer binary ops." It causes builds to fail with "Instruction does not dominate all uses" (PR35497). > Patch tries to improve vectorization of the following code: > > void add1(int * __restrict dst, const int * __restrict src) { > *dst++ = *src++; > *dst++ = *src++ + 1; > *dst++ = *src++ + 2; > *dst++ = *src++ + 3; > } > Allows to vectorize even if the very first operation is not a binary add, but just a load. > > Fixed issues related to previous commit. > > Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev > > Reviewed By: ABataev, RKSimon > > Subscribers: llvm-commits, RKSimon > > Differential Revision: https://reviews.llvm.org/D28907 llvm-svn: 319550
* Revert r319537: Bail out of a SimplifyCFG switch table opt at undef values.Mikael Holmen2017-12-011-1/+1
| | | | | | Broke build bots so reverting. llvm-svn: 319539
* Bail out of a SimplifyCFG switch table opt at undef values.Mikael Holmen2017-12-011-1/+1
| | | | | | | | | | | | | | | | | | | Summary: A true or false result is expected from a comparison, but it seems the possibility of undef was overlooked, which could lead to a failed assert. This is fixed by this patch by bailing out if we encounter undef. The bug is old and the assert has been there since the end of 2014, so it seems this is unusual enough to forego optimization. Patch by: JesperAntonsson Reviewers: spatel, eeckstein, hans Reviewed By: hans Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40639 llvm-svn: 319537
* [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in ↵Dinar Temirbulatov2017-12-011-143/+343
| | | | | | | | | | | | | | | | | | | | | | | | | | | integer binary ops. Patch tries to improve vectorization of the following code: void add1(int * __restrict dst, const int * __restrict src) { *dst++ = *src++; *dst++ = *src++ + 1; *dst++ = *src++ + 2; *dst++ = *src++ + 3; } Allows to vectorize even if the very first operation is not a binary add, but just a load. Fixed issues related to previous commit. Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev Reviewed By: ABataev, RKSimon Subscribers: llvm-commits, RKSimon Differential Revision: https://reviews.llvm.org/D28907 llvm-svn: 319531
* Recommit rL319407: [SROA] enable splitting for non-whole-alloca loads and storesHiroshi Inoue2017-12-011-18/+50
| | | | | | Recommiting once reverted patch rL319407 after adding a check for bit vector size to avoid failures in some build bots. llvm-svn: 319522
* Mark all library options as hidden.Zachary Turner2017-12-012-4/+6
| | | | | | | | | | | | | | | | | These command line options are not intended for public use, and often don't even make sense in the context of a particular tool anyway. About 90% of them are already hidden, but when people add new options they forget to hide them, so if you were to make a brand new tool today, link against one of LLVM's libraries, and run tool -help you would get a bunch of junk that doesn't make sense for the tool you're writing. This patch hides these options. The real solution is to not have libraries defining command line options, but that's a much larger effort and not something I'm prepared to take on. Differential Revision: https://reviews.llvm.org/D40674 llvm-svn: 319505
* ThinLTOBitcodeWriter: Try harder to discard unused references to the merged ↵Peter Collingbourne2017-11-301-3/+11
| | | | | | | | | | | | | | | | | | | module. If the thin module has no references to an internal global in the merged module, we need to make sure to preserve that property if the global is a member of a comdat group, as otherwise promotion can end up adding global symbols to the comdat, which is not allowed. This situation can arise if the external global in the thin module has dead constant users, which would cause use_empty() to return false and would cause us to try to promote it. To prevent this from happening, discard the dead constant users before asking whether a global is empty. Differential Revision: https://reviews.llvm.org/D40593 llvm-svn: 319494
* [memcpyopt] Teach memcpyopt to optimize across basic blocksDan Gohman2017-11-301-3/+28
| | | | | | | | | | | | This teaches memcpyopt to make a non-local memdep query when a local query indicates that the dependency is non-local. This notably allows it to eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%. Fixes PR28958. Differential Revision: https://reviews.llvm.org/D38374 llvm-svn: 319482
* [PGO] Skip counter promotion for infinite loopsXinliang David Li2017-11-301-0/+3
| | | | | | Differential Revision: http://reviews.llvm.org/D40662 llvm-svn: 319462
* Revert rL319407: [SROA] enable splitting for non-whole-alloca loads and stores Hiroshi Inoue2017-11-301-21/+10
| | | | | | This reverts commit rL319407 due to failures in some buildbot. llvm-svn: 319410
* [SROA] enable splitting for non-whole-alloca loads and storesHiroshi Inoue2017-11-301-10/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, SROA splits loads and stores only when they are accessing the whole alloca. This patch relaxes this limitation to allow splitting a load/store if all other loads and stores to the alloca are disjoint to or fully included in the current load/store. If there is no other load or store that crosses the boundary of the current load/store, the current splitting implementation works as is. The whole-alloca loads and stores meet this new condition and so they are still splittable. Here is a simplified motivating example. struct record { long long a; int b; int c; }; int func(struct record r) { for (int i = 0; i < r.c; i++) r.b++; return r.b; } When updating r.b (or r.c as well), LLVM generates redundant instructions on some platforms (such as x86_64, ppc64); here, r.b and r.c are packed into one 64-bit GPR when the struct is passed as a method argument. With this patch, the above example is compiled into only few instructions without loop. Without the patch, unnecessary loop-carried dependency is introduced by SROA and the loop cannot be eliminated by the later optimizers. Differential Revision: https://reviews.llvm.org/D32998 llvm-svn: 319407
* - Removed unused lamba (IsReturnBlock) causing build bots to fail for r319398Graham Yiu2017-11-301-7/+1
| | | | | | - Added lit testcases that were supposed to be part of r319398 llvm-svn: 319399
* With PGO information, we can do more aggressive outlining of cold regions in ↵Graham Yiu2017-11-301-88/+579
| | | | | | | | | | | | the inline candidate function. This contrasts with the scheme of keeping only the 'early return' portion of the inline candidate and outlining the rest of the function as a single function call. Support for outlining multiple regions of each function is added, as well as some basic heuristics to determine which regions are good to outline. Outline candidates limited to regions that are single-entry & single-exit. We also avoid outlining regions that produce live-exit variables, which may inhibit some forms of code motion (like commoning). Fallback to the regular partial inlining scheme is retained when either i) no regions are identified for outlining in the function, or ii) the outlined function could not be inlined in any of its callers. Differential Revision: https://reviews.llvm.org/D38190 llvm-svn: 319398
* LowerTypeTests: Deduplicate code. NFC.Peter Collingbourne2017-11-301-30/+17
| | | | llvm-svn: 319390
* LowerTypeTests: Remove unnecessary cast. NFC.Peter Collingbourne2017-11-301-1/+1
| | | | llvm-svn: 319387
* Demote this opt remark to DEBUG.Adam Nemet2017-11-281-4/+1
| | | | | | | | | | | | | | | | | | From a random opt-stat output: Top 10 remarks: tailcallelim/tailcall 53% inline/AlwaysInline 13% gvn/LoadClobbered 13% inline/Inlined 8% inline/TooCostly 2% inline/NoDefinition 2% licm/LoadWithLoopInvariantAddressInvalidated 2% licm/Hoisted 1% asm-printer/InstructionCount 1% prologepilog/StackSize 1% llvm-svn: 319235
* SROA: Don't create variable fragments that are outside of the variable.Adrian Prantl2017-11-281-0/+9
| | | | | | | | | | | An alloca may be larger than a variable that is described to be stored there. Don't create a dbg.value for fragments that are outside of the variable. This fixes PR35447. https://bugs.llvm.org/show_bug.cgi?id=35447 llvm-svn: 319230
* EntryExitInstrumenter: set DebugLocs on the inserted call instructions (PR35412)Hans Wennborg2017-11-281-5/+20
| | | | | | | Apparently the verifier requires that inlineable calls in a function with debug info have debug locations. llvm-svn: 319199
* Use getStoreSize() in various places instead of 'BitSize >> 3'.Jonas Paulsson2017-11-281-11/+4
| | | | | | | | | | | | | | | | | | This is needed for cases when the memory access is not as big as the width of the data type. For instance, storing i1 (1 bit) would be done in a byte (8 bits). Using 'BitSize >> 3' (or '/ 8') would e.g. give the memory access of an i1 a size of 0, which for instance makes alias analysis return NoAlias even when it shouldn't. There are no tests as this was done as a follow-up to the bugfix for the case where this was discovered (r318824). This handles more similar cases. Review: Björn Petterson https://reviews.llvm.org/D40339 llvm-svn: 319173
* Add a new pass to speculate around PHI nodes with constant (integer) ↵Chandler Carruth2017-11-282-0/+812
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | operands when profitable. The core idea is to (re-)introduce some redundancies where their cost is hidden by the cost of materializing immediates for constant operands of PHI nodes. When the cost of the redundancies is covered by this, avoiding materializing the immediate has numerous benefits: 1) Less register pressure 2) Potential for further folding / combining 3) Potential for more efficient instructions due to immediate operand As a motivating example, consider the remarkably different cost on x86 of a SHL instruction with an immediate operand versus a register operand. This pattern turns up surprisingly frequently, but is somewhat rarely obvious as a significant performance problem. The pass is entirely target independent, but it does rely on the target cost model in TTI to decide when to speculate things around the PHI node. I've included x86-focused tests, but any target that sets up its immediate cost model should benefit from this pass. There is probably more that can be done in this space, but the pass as-is is enough to get some important performance on our internal benchmarks, and should be generally performance neutral, but help with more extensive benchmarking is always welcome. One awkward part is that this pass has to be scheduled after *everything* that can eliminate these kinds of redundancies. This includes SimplifyCFG, GVN, etc. I'm open to suggestions about better places to put this. We could in theory make it part of the codegen pass pipeline, but there doesn't really seem to be a good reason for that -- it isn't "lowering" in any sense and only relies on pretty standard cost model based TTI queries, so it seems to fit well with the "optimization" pipeline model. Still, further thoughts on the pipeline position are welcome. I've also only implemented this in the new pass manager. If folks are very interested, I can try to add it to the old PM as well, but I didn't really see much point (my use case is already switched over to the new PM). I've tested this pretty heavily without issue. A wide range of benchmarks internally show no change outside the noise, and I don't see any significant changes in SPEC either. However, the size class computation in tcmalloc is substantially improved by this, which turns into a 2% to 4% win on the hottest path through tcmalloc for us, so there are definitely important cases where this is going to make a substantial difference. Differential revision: https://reviews.llvm.org/D37467 llvm-svn: 319164
* [TailRecursionElimination] Skip debug intrinsics.Florian Hahn2017-11-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | Summary: I think we do not need to analyze debug intrinsics here, as they should not impact codegen. This has 2 benefits: 1) slightly less work to do and 2) avoiding generating optimization remarks for converting calls to debug intrinsics to tail calls, which are not really helpful for users. Based on work by Sander de Smalen. Reviewers: davide, trentxintong, aprantl Reviewed By: aprantl Subscribers: llvm-commits, JDevlieghere Tags: #debug-info Differential Revision: https://reviews.llvm.org/D40440 llvm-svn: 319158
* [GVN] Prevent ScalarPRE from hoisting across instructions that don't pass ↵Max Kazantsev2017-11-281-0/+14
| | | | | | | | | | | | control flow to successors This is to address a problem similar to those in D37460 for Scalar PRE. We should not PRE across an instruction that may not pass execution to its successor unless it is safe to speculatively execute it. Differential Revision: https://reviews.llvm.org/D38619 llvm-svn: 319147
* This reverts commit r319096 and r319097.Rafael Espindola2017-11-283-165/+34
| | | | | | | | | Revert "[SROA] Propagate !range metadata when moving loads." Revert "[Mem2Reg] Clang-format unformatted parts of this file. NFCI." Davide says they broke a bot. llvm-svn: 319131
* SROA: Avoid creating a fragment expression that covers the entire variable.Adrian Prantl2017-11-281-4/+9
| | | | | | | | Fixes PR35416. https://bugs.llvm.org/show_bug.cgi?id=35416 llvm-svn: 319126
* [Mem2Reg] Clang-format unformatted parts of this file. NFCI.Davide Italiano2017-11-271-28/+23
| | | | llvm-svn: 319097
* [SROA] Propagate !range metadata when moving loads.Davide Italiano2017-11-273-32/+168
| | | | | | | | | | | | | This tries to propagate !range metadata to a pre-existing load when a load is optimized out. This is done instead of adding an assume because converting loads to and from assumes creates a lot of IR. Patch by Ariel Ben-Yehuda. Differential Revision: https://reviews.llvm.org/D37216 llvm-svn: 319096
* [PartiallyInlineLibCalls][x86] add TTI hook to allow sqrt inlining to depend ↵Sanjay Patel2017-11-271-5/+10
| | | | | | | | | | | on arg rather than result This should fix PR31455: https://bugs.llvm.org/show_bug.cgi?id=31455 Differential Revision: https://reviews.llvm.org/D28314 llvm-svn: 319094
* Inliner: Don't mark notail calls with the 'tail' attributeArnold Schwaighofer2017-11-271-1/+2
| | | | | | | | | | | | enum TailCallKind { TCK_None = 0, TCK_Tail = 1, TCK_MustTail = 2, TCK_NoTail = 3 }; TCK_NoTail is greater than TCK_Tail so taking the min does not do the correct thing. rdar://35639547 llvm-svn: 319075
OpenPOWER on IntegriCloud