summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [PGO] Fix incorrect Twine usage in emitting optimization remarks.Rong Xu2016-04-281-9/+8
| | | | | | | Should not store Twine objects to local variables. This is fixed the test failures with r267815 in VS2015 X64 build. llvm-svn: 267908
* Minor format change and fixing typos in the comments. NFC.Rong Xu2016-04-281-10/+7
| | | | llvm-svn: 267905
* [SLPVectorizer] Extend SLP Vectorizer to deal with aggregates.Arch D. Robison2016-04-281-37/+151
| | | | | | | | The refactoring portion part was done as r267748. http://reviews.llvm.org/D14185 llvm-svn: 267899
* [GVN] Minor code cleanup. NFC.Chad Rosier2016-04-281-65/+60
| | | | | | | Differential Revision: http://reviews.llvm.org/D18828 Patch by Aditya Kumar! llvm-svn: 267898
* [EarlyCSE] Change LoadValue field Value *Data to Instruction *Inst. NFC.Geoff Berry2016-04-281-9/+9
| | | | | | Made in preparation for adding MemorySSA support to EarlyCSE. llvm-svn: 267893
* [EarlyCSE] Sort includes. NFC.Geoff Berry2016-04-281-1/+1
| | | | | | | | | | Reviewers: mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19617 llvm-svn: 267890
* [InstCombine] Remove trailing whitespace. NFC.Ahmed Bougacha2016-04-281-1/+1
| | | | | | r267873. llvm-svn: 267887
* [InstCombine][SSE] Add MOVMSK support to SimplifyDemandedUseBitsSimon Pilgrim2016-04-281-0/+22
| | | | | | | | | | The MOVMSK instructions copies a vector elements' sign bits to the low bits of a scalar register and zeros the high bits. This patch adds MOVMSK support to SimplifyDemandedUseBits so that its aware that the upper bits are known to be zero. It also removes the call to MOVMSK if none of the lower bits are actually required and just returns zero. Differential Revision: http://reviews.llvm.org/D19614 llvm-svn: 267873
* [PGO] Promote indirect calls to conditional direct calls with value-profileRong Xu2016-04-274-1/+705
| | | | | | | | | | This patch implements the transformation that promotes indirect calls to conditional direct calls when the indirect-call value profile meta-data is available. Differential Revision: http://reviews.llvm.org/D17864 llvm-svn: 267815
* [SimplifyCFG] propagate branch metadata when creating selectSanjay Patel2016-04-271-2/+2
| | | | | | | | There's no existing test for this path, and I don't know how to expose it in a regression test, but I'm assuming there's some reason this path exists. llvm-svn: 267813
* [PGO] Prohibit address recording if the function is both internal and COMDATRong Xu2016-04-271-0/+5
| | | | | | Differential Revision: http://reviews.llvm.org/D19515 llvm-svn: 267792
* [LIR] Set attributes on memset_pattern16.Ahmed Bougacha2016-04-271-0/+2
| | | | | | | | | "inferattrs" will deduce the attribute, but it will be too late for many optimizations. Set it ourselves when creating the call. Differential Revision: http://reviews.llvm.org/D17598 llvm-svn: 267762
* [LIR] Reuse variable. NFCI.Ahmed Bougacha2016-04-271-1/+1
| | | | llvm-svn: 267761
* [InferAttrs] Mark memset_pattern16 params nocapture.Ahmed Bougacha2016-04-271-0/+2
| | | | | | Differential Revision: http://reviews.llvm.org/D19471 llvm-svn: 267760
* [TLI] Unify LibFunc attribute inference. NFCI.Ahmed Bougacha2016-04-272-795/+723
| | | | | | | | | | | | | Now the pass is just a tiny wrapper around the util. This lets us reuse the logic elsewhere (done here for BuildLibCalls) instead of duplicating it. The next step is to have something like getOrInsertLibFunc that also sets the attributes. Differential Revision: http://reviews.llvm.org/D19470 llvm-svn: 267759
* [TLI] Unify LibFunc signature checking. NFCI.Ahmed Bougacha2016-04-273-615/+23
| | | | | | | | | I tried to be as close as possible to the strongest check that existed before; cleaning these up properly is left for future work. Differential Revision: http://reviews.llvm.org/D19469 llvm-svn: 267758
* [LV] Reallow positive-stride interleaved load groups with gapsMatthew Simpson2016-04-271-9/+47
| | | | | | | | | | | | | | We previously disallowed interleaved load groups that may cause us to speculatively access memory out-of-bounds (r261331). We did this by ensuring each load group had an access corresponding to the first and last member. Instead of bailing out for these interleaved groups, this patch enables us to peel off the last vector iteration, ensuring that we execute at least one iteration of the scalar remainder loop. This solution was proposed in the review of the previous patch. Differential Revision: http://reviews.llvm.org/D19487 llvm-svn: 267751
* [SLPVectorizer] Refactor where MinVecRegSize and MaxVecRegSize live.Arch D. Robison2016-04-271-20/+28
| | | | | | | | | This is the first of two commits for extending SLP Vectorizer to deal with aggregates. This commit merely refactors existing logic. http://reviews.llvm.org/D14185 llvm-svn: 267748
* [TTI] Add hook for vector extract with extensionMatthew Simpson2016-04-271-3/+4
| | | | | | | | | | | | | | | This change adds a new hook for estimating the cost of vector extracts followed by zero- and sign-extensions. The motivating example for this change is the SMOV and UMOV instructions on AArch64. These instructions move data from vector to general purpose registers while performing the corresponding extension (sign-extend for SMOV and zero-extend for UMOV) at the same time. For these operations, TargetTransformInfo can assume the extensions are free and only report the cost of the vector extract. The SLP vectorizer has been updated to make use of the new hook. Differential Revision: http://reviews.llvm.org/D18523 llvm-svn: 267725
* [ThinLTO] Refine fix to avoid renaming of uses in inline assembly.Teresa Johnson2016-04-271-8/+16
| | | | | | | | | | | | | | | | | Summary: Refine the workaround from r266877 that attempts to prevent renaming of locals in inline assembly, so that in addition to looking for a llvm.used local value, that there is at least one inline assembly call in the module. Otherwise, debug functions added to the llvm.used can block importing/exporting unnecessarily. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D19573 llvm-svn: 267717
* isSafeToLoadUnconditionally support queries without a contextArtur Pilipenko2016-04-274-9/+17
| | | | | | | | | | This is required to use this function from isSafeToSpeculativelyExecute Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D16231 llvm-svn: 267692
* [LoopDist] Add llvm.loop.distribute.enable loop metadataAdam Nemet2016-04-272-12/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: D19403 adds a new pragma for loop distribution. This change adds support for the corresponding metadata that the pragma is translated to by the FE. As part of this I had to rethink the flag -enable-loop-distribute. My goal was to be backward compatible with the existing behavior: A1. pass is off by default from the optimization pipeline unless -enable-loop-distribute is specified A2. pass is on when invoked directly from opt (e.g. for unit-testing) The new pragma/metadata overrides these defaults so the new behavior is: B1. A1 + enable distribution for individual loop with the pragma/metadata B2. A2 + disable distribution for individual loop with the pragma/metadata The default value whether the pass is on or off comes from the initiator of the pass. From the PassManagerBuilder the default is off, from opt it's on. I moved -enable-loop-distribute under the pass. If the flag is specified it overrides the default from above. Then the pragma/metadata can further modifies this per loop. As a side-effect, we can now also use -enable-loop-distribute=0 from opt to emulate the default from the optimization pipeline. So to be precise this is the new behavior: C1. pass is off by default from the optimization pipeline unless -enable-loop-distribute or the pragma/metadata enables it C2. pass is on when invoked directly from opt unless -enable-loop-distribute=0 or the pragma/metadata disables it Reviewers: hfinkel Subscribers: joker.eph, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D19431 llvm-svn: 267672
* [Cloning] cloneLoopWithPreheader(): add assert to ensure no sub-loopsVaivaswatha Nagaraj2016-04-271-0/+2
| | | | | | | | | | | | | | Summary: cloneLoopWithPreheader() does not update LoopInfo for sub-loop of the original loop being cloned. Add assert to ensure no sub-loops for loop being cloned. Reviewers: anemet, ashutosh.nema, hfinkel Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D15922 llvm-svn: 267671
* The patch fixes PR27392.Evgeny Stupachenko2016-04-271-9/+10
| | | | | | | | | | | | | | | | | Summary: It is incorrect to compare TripCount (which is BECount + 1) with extraiters (or Count) to check if we should enter unrolled loop or not, because TripCount can potentially overflow (when BECount is max unsigned integer). While comparing BECount with (Count - 1) is overflow safe and therefore correct. Reviewer: hfinkel Differential Revision: http://reviews.llvm.org/D19256 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 267662
* Fix typo in comment; NFCSanjoy Das2016-04-271-1/+1
| | | | llvm-svn: 267653
* ThinLTO: do not promote GlobalVariable that have a specific section.Mehdi Amini2016-04-272-4/+75
| | | | | | | Differential Revision: http://reviews.llvm.org/D18298 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267646
* SLSR: Use UnknownAddressSpace instead of 0 for pure arithmetic.Matt Arsenault2016-04-271-1/+3
| | | | | | | | | In the case where isLegalAddressingMode is used for cases not related to addressing modes, such as pure adds and muls, it should not be using address space 0. LSR already passes -1 as the address space in these cases. llvm-svn: 267645
* [LoopDist] Split main class. NFCAdam Nemet2016-04-271-86/+96
| | | | | | | | | | This splits out the per-loop functionality from the Pass class. With this the fact whether the loop is forced-distribute with the new metadata/pragma can be cached in the per-loop class rather than passed around. llvm-svn: 267643
* PM: Port Reassociate to the new pass managerJustin Bogner2016-04-262-140/+102
| | | | llvm-svn: 267631
* Reassociate: Convert another functor into a lambda. NFCJustin Bogner2016-04-261-15/+13
| | | | | | Also move the explanatory comment with it. llvm-svn: 267628
* [SimplifyCFG] propagate branch metadata when creating selectSanjay Patel2016-04-261-1/+1
| | | | llvm-svn: 267624
* [LowerExpectIntrinsic] make default likely/unlikely ratio biggerSanjay Patel2016-04-261-6/+18
| | | | | | | | | | We need the default ratio to be sufficiently large that it triggers transforms based on block frequency info (BFI) and plays well with the recently introduced BranchProbability used by CGP. Differential Revision: http://reviews.llvm.org/D19435 llvm-svn: 267615
* Reassociate: Simplify using lambdas. NFCJustin Bogner2016-04-261-17/+7
| | | | llvm-svn: 267614
* Revert "[SimplifyLibCalls] sprintf doesn't copy null bytes"David Majnemer2016-04-261-4/+3
| | | | | | | | | | The destination buffer that sprintf uses is restrict qualified, we do not need to worry about derived pointers referenced via format specifiers. This reverts commit r267580. llvm-svn: 267605
* Masked Store in Loop Vectorizer - bugfixElena Demikhovsky2016-04-261-13/+9
| | | | | | | | Fixed a bug in loop vectorization with conditional store. Differential Revision: http://reviews.llvm.org/D19532 llvm-svn: 267597
* PM: Port Internalize to the new pass managerJustin Bogner2016-04-262-99/+74
| | | | llvm-svn: 267596
* [SimplifyLibCalls] sprintf doesn't copy null bytesDavid Majnemer2016-04-261-3/+4
| | | | | | | | | | sprintf doesn't read or copy the terminating null byte from it's string operands. sprintf will append it's own after processing all of the format specifiers. This fixes PR27526. llvm-svn: 267580
* Tune basic block annotation algorithm.Dehao Chen2016-04-261-9/+12
| | | | | | | | | | | | | | | Summary: Instead of using maximum IR weight as the basic block weight, this patch uses the voting algorithm to find the most likely weight for the basic block. This can effectively avoid the cases when some IRs are annotated incorrectly due to code motion of the profiled binary. This patch also updates propagate.ll unittest to include discriminator in the input file so that it is testing something meaningful. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19301 llvm-svn: 267519
* [SimplifyCFG] Preserve !llvm.mem.parallel_loop_access when mergingHal Finkel2016-04-262-1/+3
| | | | | | | | When SimplifyCFG merges identical instructions from both sides of a diamond, it can preserve !llvm.mem.parallel_loop_access (as it does with most of the other metadata). There's no real data or control dependency change in this case. llvm-svn: 267515
* [LoopVectorize] Don't consider conditional-load dereferenceability for ↵Hal Finkel2016-04-261-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | marked parallel loops I really thought we were doing this already, but we were not. Given this input: void Test(int *res, int *c, int *d, int *p) { for (int i = 0; i < 16; i++) res[i] = (p[i] == 0) ? res[i] : res[i] + d[i]; } we did not vectorize the loop. Even with "assume_safety" the check that we don't if-convert conditionally-executed loads (to protect against data-dependent deferenceability) was not elided. One subtlety: As implemented, it will still prefer to use a masked-load instrinsic (given target support) over the speculated load. The choice here seems architecture specific; the best option depends on how expensive the masked load is compared to a regular load. Ideally, using the masked load still reduces unnecessary memory traffic, and so should be preferred. If we'd rather do it the other way, flipping the order of the checks is easy. The LangRef is updated to make explicit that llvm.mem.parallel_loop_access also implies that if conversion is okay. Differential Revision: http://reviews.llvm.org/D19512 llvm-svn: 267514
* [SROA] Don't falsely report that changes have occuredDavid Majnemer2016-04-261-6/+10
| | | | | | | | | We would report that the function changed despite creating no new allocas or performing any promotion. This fixes PR27316. llvm-svn: 267507
* PM: Port GlobalOpt to the new pass managerJustin Bogner2016-04-262-39/+58
| | | | llvm-svn: 267499
* PM: Convert the logic for GlobalOpt into static functions. NFCJustin Bogner2016-04-261-66/+71
| | | | | | | | | | Pass all of the state we need around as arguments, so that these functions are easier to reuse. There is one part of this that is unusual: we pass around a functor to look up a DomTree for a function. This will be a necessary abstraction when we try to use this code in both the legacy and the new pass manager. llvm-svn: 267498
* Optimize store of "bitcast" from vector to aggregate.Arch D. Robison2016-04-251-0/+60
| | | | | | | | | | | This patch is what was the "instcombine" portion of D14185, with an additional test added (see julia_pseudovec in test/Transforms/InstCombine/insert-val-extract-elem.ll). The patch causes instcombine to replace sequences of extractelement-insertvalue-store that act essentially like a bitcast followed by a store. Differential review: http://reviews.llvm.org/D14260 llvm-svn: 267482
* [ThinLTO] Introduce typedef for commonly-used map type (NFC)Teresa Johnson2016-04-251-7/+4
| | | | | | | | Add a typedef for the std::map<GlobalValue::GUID, GlobalValueSummary *> map that is passed around to identify summaries for values defined in a particular module. This shortens up declarations in a variety of places. llvm-svn: 267471
* Cleanup redundant expression in InstCombineAndOrXor.Etienne Bergeron2016-04-251-2/+0
| | | | | | | | | | | | | | | Summary: The expression is redundant on both side of operator |. detected by : http://reviews.llvm.org/D19451 Reviewers: rnk, majnemer Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D19459 llvm-svn: 267458
* [ValueTracking] Improve isImpliedCondition when the dominating cond is false.Chad Rosier2016-04-252-5/+10
| | | | llvm-svn: 267430
* Test commit: modified comment. NFCAnna Thomas2016-04-251-1/+1
| | | | llvm-svn: 267406
* [GlobalOpt] Allow constant globals to be SRA'dJames Molloy2016-04-251-5/+9
| | | | | | | | The current logic assumes that any constant global will never be SRA'd. I presume this is because normally constant globals can be pushed into their uses and deleted. However, that sometimes can't happen (which is where you really want SRA, so the elements that can be eliminated, are!). There seems to be no reason why we can't SRA constants too, so let's do it. llvm-svn: 267393
* Run GlobalOpt before emitting the bitcode for ThinLTOMehdi Amini2016-04-251-0/+2
| | | | | | | | This is motivated by reducing the size of the IR and thus reduce compile time. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267385
OpenPOWER on IntegriCloud