summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [EarlyCSE] Simplify guard intrinsicsSanjoy Das2016-04-291-0/+23
| | | | | | | | | | | | | | | | | | Summary: This change teaches EarlyCSE some basic properties of guard intrinsics: - Guard intrinsics read all memory, but don't write to any memory - After a guard has executed, the condition it was guarding on can be assumed to be true - Guard intrinsics on a constant `true` are no-ops Reviewers: reames, hfinkel Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19578 llvm-svn: 268120
* Revert r268107 -- debug build failureXinliang David Li2016-04-291-78/+70
| | | | llvm-svn: 268116
* [InstCombine][SSE] PSHUFB to shuffle combine to use general aggregate ↵Simon Pilgrim2016-04-291-17/+23
| | | | | | | | elements. NFCI. Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements. llvm-svn: 268115
* [inliner]: Refactor inline deferring logic into its own method /NFCXinliang David Li2016-04-291-70/+78
| | | | | | | | The implemented heuristic has a large body of code which better sits in its own function for better readability. It also allows adding more heuristics easier in the future. llvm-svn: 268107
* [InstCombine] Determine the result of a select based on a dominating condition.Chad Rosier2016-04-291-0/+18
| | | | | | Differential Revision: http://reviews.llvm.org/D19550 llvm-svn: 268104
* [InstCombine] clean up; NFCSanjay Patel2016-04-291-1/+1
| | | | llvm-svn: 268099
* [MemorySSA] Fix bugs in walker; refactor unittests a bit.George Burgess IV2016-04-291-8/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes two somewhat related bugs in MemorySSA's caching walker. These bugs were found because D19695 brought up the problem that we'd have defs cached to themselves, which is incorrect. The bugs this fixes are: - We would sometimes skip the nearest clobber of a MemoryAccess, because we would query our cache for a given potential clobber before checking if the potential clobber is the clobber we're looking for. The cache entry for the potential clobber would point to the nearest clobber *of the potential clobber*, so if that was a cache hit, we'd ignore the potential clobber entirely. - There are times (sometimes in DFS, sometimes in the getClobbering... functions) where we would insert cache entries that say a def clobbers itself. There's a bit of common code between the fixes for the bugs, so they aren't split out into multiple commits. This patch also adds a few unit tests, and refactors existing tests a bit to reduce the duplication of setup code. llvm-svn: 268087
* Do not read callee name when matching IR to profile as it is not used.Dehao Chen2016-04-291-8/+3
| | | | | | | | | | | | Summary: Callee name is not used to identify a callsite now, so do not read it during annotation. Reviewers: davidxl, dnovillo Subscribers: dnovillo, danielcdh, llvm-commits Differential Revision: http://reviews.llvm.org/D19704 llvm-svn: 268069
* [InstCombine] add helper function for ICmp with constant canonicalization; NFCISanjay Patel2016-04-291-24/+38
| | | | | | | As suggested in http://reviews.llvm.org/D17859 , we should enhance this to support vectors. llvm-svn: 268059
* Unify XDEBUG and EXPENSIVE_CHECKS (into the latter), and add an option to ↵Filipe Cabecinhas2016-04-293-3/+3
| | | | | | | | | | | | | | | | | | | the cmake build to enable them. Summary: Historically, we had a switch in the Makefiles for turning on "expensive checks". This has never been ported to the cmake build, but the (dead-ish) code is still around. This will also make it easier to turn it on in buildbots. Reviewers: chandlerc Subscribers: jyknight, mzolotukhin, RKSimon, gberry, llvm-commits Differential Revision: http://reviews.llvm.org/D19723 llvm-svn: 268050
* [GlobalOpt] Propagate operand bundlesDavid Majnemer2016-04-291-5/+10
| | | | | | | We neglected to transfer operand bundles for some transforms. These were found via inspection, I'll try to come up with some test cases. llvm-svn: 268011
* [InstCombine] Propagate operand bundlesDavid Majnemer2016-04-292-3/+9
| | | | | | | We neglected to transfer operand bundles for some transforms. These were found via inspection, I'll try to come up with some test cases. llvm-svn: 268010
* [DeadArgumentElimination] Propagate operand bundles to promoted call sitesDavid Majnemer2016-04-291-4/+10
| | | | | | | We neglected to transfer operand bundles when performing argument promotion. llvm-svn: 268008
* [LoopDist] Also emit optimization remark on success (-Rpass=)Adam Nemet2016-04-291-0/+3
| | | | | | | The option -Rpass=loop-distribute now reports the loops that were distributed. llvm-svn: 268006
* [LoopDist] Pass 'Function' to main class. NFCAdam Nemet2016-04-291-6/+8
| | | | | | Next patch will add another use for 'Function' inside the class. llvm-svn: 268005
* [SLPVectorizer] Add operand bundles to vectorized functionsDavid Majnemer2016-04-291-2/+16
| | | | | | | SLPVectorizing a call site should result in further propagation of its bundles. llvm-svn: 268004
* [LoopVectorize] Add operand bundles to vectorized functionsDavid Majnemer2016-04-291-5/+7
| | | | | | | Also, do not crash when calculating a cost model for loop-invariant token values. llvm-svn: 268003
* [ArgumentPromotion] Propagate operand bundles to promoted call sitesDavid Majnemer2016-04-291-2/+5
| | | | | | | | | We neglected to transfer operand bundles when performing argument promotion. This fixes PR27568. llvm-svn: 267986
* [PR25281] Remove AAResultsWrapper from preserved analyses of loop vectorizer.Michael Zolotukhin2016-04-291-1/+0
| | | | | | | We don't preserve AAResults, because, for one, we don't preserve SCEV-AA. That fixes PR25281. llvm-svn: 267980
* Fix build by casting to the proper int type.Ivan Krasin2016-04-291-1/+1
| | | | | | | | Reviewers: eugenis Differential Revision: http://reviews.llvm.org/D19706 llvm-svn: 267974
* [LoopVectorize] Keep hints from original loop on the vector loopHal Finkel2016-04-291-0/+5
| | | | | | | | | | | | | | | | We need to keep loop hints from the original loop on the new vector loop. Failure to do this meant that, for example: void foo(int *b) { #pragma clang loop unroll(disable) for (int i = 0; i < 16; ++i) b[i] = 1; } this loop would be unrolled. Why? Because we'd vectorize it, thus dropping the hints that unrolling should be disabled, and then we'd unroll it. llvm-svn: 267970
* [msan] Handle vector compare x86 intrinsics.Evgeniy Stepanov2016-04-291-0/+69
| | | | | | This handles SSE and SSE2 cmp_* and comiXX_* intrinsics. llvm-svn: 267966
* [LoopDist] Emit optimization remarks (-Rpass*)Adam Nemet2016-04-281-0/+25
| | | | | | | | | | | | | | | | I closely followed the precedents set by the vectorizer: * With -Rpass-missed, the loop is reported with further details pointing to -Rpass--analysis. * -Rpass-analysis reports the details why distribution has failed. * Regardless of -Rpass*, when distribution fails for a loop where distribution was forced with the pragma, a warning is produced according to -Wpass-failed. In this case the analysis info is also printed even without -Rpass-analysis. llvm-svn: 267952
* [LoopDist] Improve debug messagesAdam Nemet2016-04-281-6/+6
| | | | | | | | | | | The next patch will start using these for -Rpass-analysis so they won't be internal-only anymore. Move the 'Skipping; ' prefix that some of the message are using into the 'fail' function. We don't want to include this prefix in the -Rpass-analysis report. llvm-svn: 267951
* [LoopDist] Add helper to print debug message when distribution fails. NFCAdam Nemet2016-04-281-23/+20
| | | | | | This will form the basis to emit optimization remarks (-Rpass*). llvm-svn: 267950
* [Inliner] Preserve llvm.mem.parallel_loop_access metadataHal Finkel2016-04-281-0/+31
| | | | | | | | | | | | | | | | | | | | | | | When inlining a call site with llvm.mem.parallel_loop_access metadata, this metadata needs to be propagated to all cloned memory-accessing instructions. Otherwise, inlining parts of the loop body will invalidate the annotation. With this functionality, we now vectorize the following as expected: void Body(int *res, int *c, int *d, int *p, int i) { res[i] = (p[i] == 0) ? res[i] : res[i] + d[i]; } void Test(int *res, int *c, int *d, int *p, int n) { int i; #pragma clang loop vectorize(assume_safety) for (i = 0; i < 1600; i++) { Body(res, c, d, p, i); } } llvm-svn: 267949
* [PGO] Fix incorrect Twine usage in emitting optimization remarks.Rong Xu2016-04-281-9/+8
| | | | | | | Should not store Twine objects to local variables. This is fixed the test failures with r267815 in VS2015 X64 build. llvm-svn: 267908
* Minor format change and fixing typos in the comments. NFC.Rong Xu2016-04-281-10/+7
| | | | llvm-svn: 267905
* [SLPVectorizer] Extend SLP Vectorizer to deal with aggregates.Arch D. Robison2016-04-281-37/+151
| | | | | | | | The refactoring portion part was done as r267748. http://reviews.llvm.org/D14185 llvm-svn: 267899
* [GVN] Minor code cleanup. NFC.Chad Rosier2016-04-281-65/+60
| | | | | | | Differential Revision: http://reviews.llvm.org/D18828 Patch by Aditya Kumar! llvm-svn: 267898
* [EarlyCSE] Change LoadValue field Value *Data to Instruction *Inst. NFC.Geoff Berry2016-04-281-9/+9
| | | | | | Made in preparation for adding MemorySSA support to EarlyCSE. llvm-svn: 267893
* [EarlyCSE] Sort includes. NFC.Geoff Berry2016-04-281-1/+1
| | | | | | | | | | Reviewers: mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19617 llvm-svn: 267890
* [InstCombine] Remove trailing whitespace. NFC.Ahmed Bougacha2016-04-281-1/+1
| | | | | | r267873. llvm-svn: 267887
* [InstCombine][SSE] Add MOVMSK support to SimplifyDemandedUseBitsSimon Pilgrim2016-04-281-0/+22
| | | | | | | | | | The MOVMSK instructions copies a vector elements' sign bits to the low bits of a scalar register and zeros the high bits. This patch adds MOVMSK support to SimplifyDemandedUseBits so that its aware that the upper bits are known to be zero. It also removes the call to MOVMSK if none of the lower bits are actually required and just returns zero. Differential Revision: http://reviews.llvm.org/D19614 llvm-svn: 267873
* [PGO] Promote indirect calls to conditional direct calls with value-profileRong Xu2016-04-274-1/+705
| | | | | | | | | | This patch implements the transformation that promotes indirect calls to conditional direct calls when the indirect-call value profile meta-data is available. Differential Revision: http://reviews.llvm.org/D17864 llvm-svn: 267815
* [SimplifyCFG] propagate branch metadata when creating selectSanjay Patel2016-04-271-2/+2
| | | | | | | | There's no existing test for this path, and I don't know how to expose it in a regression test, but I'm assuming there's some reason this path exists. llvm-svn: 267813
* [PGO] Prohibit address recording if the function is both internal and COMDATRong Xu2016-04-271-0/+5
| | | | | | Differential Revision: http://reviews.llvm.org/D19515 llvm-svn: 267792
* [LIR] Set attributes on memset_pattern16.Ahmed Bougacha2016-04-271-0/+2
| | | | | | | | | "inferattrs" will deduce the attribute, but it will be too late for many optimizations. Set it ourselves when creating the call. Differential Revision: http://reviews.llvm.org/D17598 llvm-svn: 267762
* [LIR] Reuse variable. NFCI.Ahmed Bougacha2016-04-271-1/+1
| | | | llvm-svn: 267761
* [InferAttrs] Mark memset_pattern16 params nocapture.Ahmed Bougacha2016-04-271-0/+2
| | | | | | Differential Revision: http://reviews.llvm.org/D19471 llvm-svn: 267760
* [TLI] Unify LibFunc attribute inference. NFCI.Ahmed Bougacha2016-04-272-795/+723
| | | | | | | | | | | | | Now the pass is just a tiny wrapper around the util. This lets us reuse the logic elsewhere (done here for BuildLibCalls) instead of duplicating it. The next step is to have something like getOrInsertLibFunc that also sets the attributes. Differential Revision: http://reviews.llvm.org/D19470 llvm-svn: 267759
* [TLI] Unify LibFunc signature checking. NFCI.Ahmed Bougacha2016-04-273-615/+23
| | | | | | | | | I tried to be as close as possible to the strongest check that existed before; cleaning these up properly is left for future work. Differential Revision: http://reviews.llvm.org/D19469 llvm-svn: 267758
* [LV] Reallow positive-stride interleaved load groups with gapsMatthew Simpson2016-04-271-9/+47
| | | | | | | | | | | | | | We previously disallowed interleaved load groups that may cause us to speculatively access memory out-of-bounds (r261331). We did this by ensuring each load group had an access corresponding to the first and last member. Instead of bailing out for these interleaved groups, this patch enables us to peel off the last vector iteration, ensuring that we execute at least one iteration of the scalar remainder loop. This solution was proposed in the review of the previous patch. Differential Revision: http://reviews.llvm.org/D19487 llvm-svn: 267751
* [SLPVectorizer] Refactor where MinVecRegSize and MaxVecRegSize live.Arch D. Robison2016-04-271-20/+28
| | | | | | | | | This is the first of two commits for extending SLP Vectorizer to deal with aggregates. This commit merely refactors existing logic. http://reviews.llvm.org/D14185 llvm-svn: 267748
* [TTI] Add hook for vector extract with extensionMatthew Simpson2016-04-271-3/+4
| | | | | | | | | | | | | | | This change adds a new hook for estimating the cost of vector extracts followed by zero- and sign-extensions. The motivating example for this change is the SMOV and UMOV instructions on AArch64. These instructions move data from vector to general purpose registers while performing the corresponding extension (sign-extend for SMOV and zero-extend for UMOV) at the same time. For these operations, TargetTransformInfo can assume the extensions are free and only report the cost of the vector extract. The SLP vectorizer has been updated to make use of the new hook. Differential Revision: http://reviews.llvm.org/D18523 llvm-svn: 267725
* [ThinLTO] Refine fix to avoid renaming of uses in inline assembly.Teresa Johnson2016-04-271-8/+16
| | | | | | | | | | | | | | | | | Summary: Refine the workaround from r266877 that attempts to prevent renaming of locals in inline assembly, so that in addition to looking for a llvm.used local value, that there is at least one inline assembly call in the module. Otherwise, debug functions added to the llvm.used can block importing/exporting unnecessarily. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D19573 llvm-svn: 267717
* isSafeToLoadUnconditionally support queries without a contextArtur Pilipenko2016-04-274-9/+17
| | | | | | | | | | This is required to use this function from isSafeToSpeculativelyExecute Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D16231 llvm-svn: 267692
* [LoopDist] Add llvm.loop.distribute.enable loop metadataAdam Nemet2016-04-272-12/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: D19403 adds a new pragma for loop distribution. This change adds support for the corresponding metadata that the pragma is translated to by the FE. As part of this I had to rethink the flag -enable-loop-distribute. My goal was to be backward compatible with the existing behavior: A1. pass is off by default from the optimization pipeline unless -enable-loop-distribute is specified A2. pass is on when invoked directly from opt (e.g. for unit-testing) The new pragma/metadata overrides these defaults so the new behavior is: B1. A1 + enable distribution for individual loop with the pragma/metadata B2. A2 + disable distribution for individual loop with the pragma/metadata The default value whether the pass is on or off comes from the initiator of the pass. From the PassManagerBuilder the default is off, from opt it's on. I moved -enable-loop-distribute under the pass. If the flag is specified it overrides the default from above. Then the pragma/metadata can further modifies this per loop. As a side-effect, we can now also use -enable-loop-distribute=0 from opt to emulate the default from the optimization pipeline. So to be precise this is the new behavior: C1. pass is off by default from the optimization pipeline unless -enable-loop-distribute or the pragma/metadata enables it C2. pass is on when invoked directly from opt unless -enable-loop-distribute=0 or the pragma/metadata disables it Reviewers: hfinkel Subscribers: joker.eph, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D19431 llvm-svn: 267672
* [Cloning] cloneLoopWithPreheader(): add assert to ensure no sub-loopsVaivaswatha Nagaraj2016-04-271-0/+2
| | | | | | | | | | | | | | Summary: cloneLoopWithPreheader() does not update LoopInfo for sub-loop of the original loop being cloned. Add assert to ensure no sub-loops for loop being cloned. Reviewers: anemet, ashutosh.nema, hfinkel Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D15922 llvm-svn: 267671
* The patch fixes PR27392.Evgeny Stupachenko2016-04-271-9/+10
| | | | | | | | | | | | | | | | | Summary: It is incorrect to compare TripCount (which is BECount + 1) with extraiters (or Count) to check if we should enter unrolled loop or not, because TripCount can potentially overflow (when BECount is max unsigned integer). While comparing BECount with (Count - 1) is overflow safe and therefore correct. Reviewer: hfinkel Differential Revision: http://reviews.llvm.org/D19256 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 267662
OpenPOWER on IntegriCloud