summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* Tidy up whitespace with clang-format prior to making significantChandler Carruth2014-05-071-45/+41
| | | | | | changes. llvm-svn: 208229
* [InstCombine] Add optimization of redundant insertvalue instructions.Michael Zolotukhin2014-05-072-0/+37
| | | | | | rdar://problem/11861387 llvm-svn: 208214
* [msan] Fix -fsanitize=memory -fno-integrated-as.Evgeniy Stepanov2014-05-071-1/+1
| | | | llvm-svn: 208211
* MergeFunctions Pass, introduced total ordering among values.Stepan Dyatkovskiy2014-05-071-41/+96
| | | | | | | | | | | | | | | | | | | This is a third patch of patch series that improves MergeFunctions performance time from O(N*N) to O(N*log(N)). This patch description: Being comparing functions we need to compare values we meet at left and right sides. Its easy to sort things out for external values. It just should be the same value at left and right. But for local values (those were introduced inside function body) we have to ensure they were introduced at exactly the same place, and plays the same role. In short, patch introduces values serial numbering and comparison routine. The last one compares two values by their serial numbers. llvm-svn: 208189
* [BUG][REFACTOR]Zinovy Nis2014-05-071-23/+22
| | | | | | | | | 1) Fix for printing debug locations for absolute paths. 2) Location printing is moved into public method DebugLoc::print() to avoid re-inventing the wheel. Differential Revision: http://reviews.llvm.org/D3513 llvm-svn: 208177
* Second patch of patch series that improves MergeFunctions performance time ↵Stepan Dyatkovskiy2014-05-071-4/+278
| | | | | | | | | | | | | | | | | from O(N*N) to O(N*log(N)). The idea is to introduce total ordering among functions set. It allows to build binary tree and perform function look-up procedure in O(log(N)) time. This patch description: Introduced total ordering among constants implemented in cmpConstants method. Method performs lexicographical comparison between constants represented as hypothetical numbers of next format: <bitcastability-trait><raw-bit-contents> Please, read cmpConstants declaration comments for more details. llvm-svn: 208173
* Fix ASan init function detection after clang r208128.Nico Weber2014-05-061-3/+24
| | | | llvm-svn: 208141
* Re-commit r208025, reverted in r208030, with a fix for a conformance issueRichard Smith2014-05-063-10/+9
| | | | | | which GCC detects and Clang does not! llvm-svn: 208033
* Revert r208025, which made buildbots unhappy for unknown reasons.Richard Smith2014-05-063-9/+10
| | | | llvm-svn: 208030
* Add llvm::function_ref (and a couple of uses of it), representing a ↵Richard Smith2014-05-063-10/+9
| | | | | | type-erased reference to a callable object. llvm-svn: 208025
* Detabify.Nick Lewycky2014-05-061-2/+2
| | | | llvm-svn: 208019
* Improve 'tail' call marking in TRE. A bootstrap of clang goes from 375k ↵Nick Lewycky2014-05-051-73/+241
| | | | | | | | | | calls marked tail in the IR to 470k, however this improvement does not carry into an improvement of the call/jmp ratio on x86. The most common pattern is a tail call + br to a block with nothing but a 'ret'. The number of tail call to loop conversions remains the same (1618 by my count). The new algorithm does a local scan over the use-def chains to identify local "alloca-derived" values, as well as points where the alloca could escape. Then, a visit over the CFG marks blocks as being before or after the allocas have escaped, and annotates the calls accordingly. llvm-svn: 208017
* Reapply: Add slp vectorization to LTO passes. The bug it exposed has been ↵Yi Jiang2014-05-051-0/+3
| | | | | | fixed by r207983. <radar://16641956> llvm-svn: 208013
* Always set alignment of vectorized LD/ST in SLP-Vectorizer. ↵Yi Jiang2014-05-051-0/+4
| | | | | | <rdar://problem/16812145> llvm-svn: 207983
* LTO: -internalize sets visibility to defaultDuncan P. N. Exon Smith2014-05-051-0/+3
| | | | | | | | | Visibility is meaningless when the linkage is local. Change `-internalize` to reset the visibility to `default`. <rdar://problem/16141113> llvm-svn: 207979
* [ASan/Win] Fix issue 305 -- don't instrument .CRT initializer/terminator ↵Timur Iskhodzhanov2014-05-051-4/+14
| | | | | | | | | callbacks See https://code.google.com/p/address-sanitizer/issues/detail?id=305 Reviewed at http://reviews.llvm.org/D3607 llvm-svn: 207968
* LoopUnroll: If we're doing partial unrolling, use the PartialThreshold to ↵Benjamin Kramer2014-05-041-3/+6
| | | | | | | | | | | limit unrolling. Otherwise we use the same threshold as for complete unrolling, which is way too high. This made us unroll any loop smaller than 150 instructions by 8 times, but only if someone specified -march=core2 or better, which happens to be the default on darwin. llvm-svn: 207940
* SLPVectorizer: Bring back the insertelement patch (r205965) with fixesArnold Schwaighofer2014-05-041-30/+71
| | | | | | | | | | | | | | | | When can't assume a vectorized tree is rooted in an instruction. The IRBuilder could have constant folded it. When we rebuild the build_vector (the series of InsertElement instructions) use the last original InsertElement instruction. The vectorized tree root is guaranteed to be before it. Also, we can't assume that the n-th InsertElement inserts the n-th element into a vector. This reverts r207746 which reverted the revert of the revert of r205018 or so. Fixes the test case in PR19621. llvm-svn: 207939
* SLPVectorizer: Lazily allocate the map for block numbering.Benjamin Kramer2014-05-032-27/+26
| | | | | | | | There is no point in creating it if we're not going to vectorize anything. Creating the map is expensive as it creates large values. No functionality change. llvm-svn: 207916
* Vectorize intrinsic math function calls in SLPVectorizer.Karthik Bhat2014-05-032-143/+22
| | | | | | | This patch adds support to recognize and vectorize intrinsic math functions in SLPVectorizer. Review: http://reviews.llvm.org/D3560 and http://reviews.llvm.org/D3559 llvm-svn: 207901
* Clean up constructor logic and member access for LoopVectorizeHints.Eric Christopher2014-05-021-34/+39
| | | | | | | | | There are public functions that mutate various members as well as another private member already, so make all the members private to avoid the discontinuity and add accessors for the values. Should be no functional change. llvm-svn: 207868
* Teach GlobalDCE how to remove empty global_ctor entries.Nico Weber2014-05-024-158/+204
| | | | | | | | | | | | | | | | | This moves most of GlobalOpt's constructor optimization code out of GlobalOpt into Transforms/Utils/CDtorUtils.{h,cpp}. The public interface is a single function OptimizeGlobalCtorsList() that takes a predicate returning which constructors to remove. GlobalOpt calls this with a function that statically evaluates all constructors, just like it did before. This part of the change is behavior-preserving. Also add a call to this from GlobalDCE with a filter that removes global constructors that contain a "ret" instruction and nothing else – this fixes PR19590. llvm-svn: 207856
* [GVN] Pass the phi-translated address of a load instead of the untranslatedAkira Hatanaka2014-05-021-2/+1
| | | | | | | | | address to AnalyzeLoadFromClobberingLoad. This fixes a bug in load-PRE where PRE is applied to a load that is not partially redundant. <rdar://problem/16638765>. llvm-svn: 207853
* Fold strlen(expr ? "str1" : "str2") to x ? len1 : len2. This fires about 330 ↵Nick Lewycky2014-05-021-0/+15
| | | | | | times in a bootstrap of clang. llvm-svn: 207828
* Update and sort CMakeLists.Benjamin Kramer2014-05-011-5/+6
| | | | llvm-svn: 207785
* Add an optimization that does CSE in a group of similar GEPs.Eli Bendersky2014-05-012-0/+584
| | | | | | | | | | | | | | This optimization merges the common part of a group of GEPs, so we can compute each pointer address by adding a simple offset to the common part. The optimization is currently only enabled for the NVPTX backend, where it has a large payoff on some benchmarks. Review: http://reviews.llvm.org/D3462 Patch by Jingyue Wu. llvm-svn: 207783
* Revert r205965, which essentially reverts r205018 for the second time.Chandler Carruth2014-05-011-65/+30
| | | | | | | | | | | | | | | | =[ Turns out that this was the root cause of PR19621. We found a crasher only recently (likely due to improvements elsewhere in the SLP vectorizer) but the reduced test case failed all the way back to here. I've confirmed that reverting this patch both fixes the reduced test case in PR19621 and the actual source file that led to it, so it seems to really be rooted here. I've replied to the commit thread with discussion of my (feeble) attempts to debug this. Didn't make it very far, so reverting now that we have a good test case so that things can get back to healthy while the debugging carries on. llvm-svn: 207746
* Patch for function cloning to inline all blocks whose address is takenGerolf Hoflehner2014-04-301-34/+106
| | | | | | | | | | | | | Not all address taken blocks get inlined. The reason is that a blocks new address is known only when it is cloned. But e.g. a branch instruction in a different block could need that address earlier while it gets cloned. The solution is to collect the set of all blocks that can potentially get inlined and compute a new block address up front. Then clone and cleanup. rdar://16427209 llvm-svn: 207713
* Revert r207571 - Add slp vectorization to LTO passesYi Jiang2014-04-301-3/+0
| | | | llvm-svn: 207693
* [IPO/MergeFunctions] changes so it doesn't try to bitcast a struct return ↵Carlo Kok2014-04-301-1/+16
| | | | | | type but instead recreates it with insert/extract value. llvm-svn: 207679
* Add a <tuple> include to more files that aren't getting it transitively on MSVC.Benjamin Kramer2014-04-302-0/+2
| | | | llvm-svn: 207617
* ConstantHoisting.cpp: Add <tuple> for std::tie, since r207593 removed ↵NAKAMURA Takumi2014-04-301-0/+1
| | | | | | FileSystem.h, it includes <tuple>. llvm-svn: 207614
* Tidy up.Jim Grosbach2014-04-291-2/+2
| | | | llvm-svn: 207585
* Spelling.Jim Grosbach2014-04-291-1/+1
| | | | llvm-svn: 207584
* Also handle ConstantAggregateZero when optimizing vpermilvar*.Rafael Espindola2014-04-291-20/+22
| | | | llvm-svn: 207582
* Remove tabs.Rafael Espindola2014-04-291-4/+4
| | | | | | Sorry, new machine and I forgot to change the editor setting. llvm-svn: 207578
* Two fixes to the vpermilvar optimization.Rafael Espindola2014-04-291-1/+24
| | | | | | | | The instcomine logic to handle vpermilvar's pd and 256 variants was incorrect. The _256 variants have indexes into the individual 128 bit lanes and in all cases it also has to mask out unused bits. llvm-svn: 207577
* Fix vectorization remarks.Diego Novillo2014-04-291-6/+13
| | | | | | | | | This patch changes the vectorization remarks to also inform when vectorization is possible but not beneficial. Added tests to exercise some loop remarks. llvm-svn: 207574
* Continue slp vectorization even the BB already has vectorized store ↵Yi Jiang2014-04-291-1/+1
| | | | | | radar://16641956 llvm-svn: 207572
* Add slp vectorization to LTO passesYi Jiang2014-04-291-0/+3
| | | | llvm-svn: 207571
* Reapply r207271 without the testcaseAdam Nemet2014-04-291-9/+12
| | | | | | PR19608 was filed to find a suitable testcase. llvm-svn: 207569
* Add optimization remarks to the loop unroller and vectorizer.Diego Novillo2014-04-292-0/+20
| | | | | | | | | | | | | | | Summary: This calls emitOptimizationRemark from the loop unroller and vectorizer at the point where they make a positive transformation. For the vectorizer, it reports vectorization and interleave factors. For the loop unroller, it reports all the different supported types of unrolling. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3456 llvm-svn: 207528
* [BUG] Fix -Wunused-variable warning in Release mode. Thnx to Kostya ↵Zinovy Nis2014-04-291-2/+3
| | | | | | Serebryany for pointing. llvm-svn: 207516
* fix -Wunused-variable warning in Release modeKostya Serebryany2014-04-291-0/+1
| | | | llvm-svn: 207514
* [OPENMP][LV][D3423] Respect Hints.Force meta-data for loops in LoopVectorizerZinovy Nis2014-04-291-27/+57
| | | | llvm-svn: 207512
* Fix a typo in commentMichael Zolotukhin2014-04-291-1/+1
| | | | llvm-svn: 207499
* Revert r207271 for now. This commit introduced a test case that ranChandler Carruth2014-04-281-12/+9
| | | | | | | | clang directly from the LLVM test suite! That doesn't work. I've followed up on the review thread to try and get a viable solution sorted out, but trying to get the tree clean here. llvm-svn: 207462
* InstCombine: don't drop 'inalloca' in PromoteCastOfAllocation (PR19569)Hans Wennborg2014-04-281-0/+1
| | | | llvm-svn: 207426
* Fix rampant quadratic behavior in UpdatePHINodes. The operation ofChandler Carruth2014-04-281-23/+40
| | | | | | | | | | | | | | | | | | | | | | | mapping from a basic block to an incoming value, either for removal or just lookup, is linear in the number of predecessors, and we were doing this for every entry in the 'Preds' list which is in many cases almost all of them! Unfortunately, the fixes are quite ugly. PHI nodes just don't make this operation easy. The efficient way to fix this is to have a clever 'remove_if' operation on PHI nodes that lets us do a single pass over all the incoming values of the original PHI node, extracting the ones we care about. Then we could quickly construct the new phi node from this list. This would remove the remaining underlying quadratic movement of unrelated incoming values and the need for silly backwards looping to "minimize" how often we hit the quadratic case. This is the last obvious fix for PR19499. It shaves another 20% off the compile time for me, and while UpdatePHINodes remains in the profile, most of the time is now stemming from the well known inefficiencies of LVI and jump threading. llvm-svn: 207409
* [C++] Use 'nullptr'.Craig Topper2014-04-2814-44/+44
| | | | llvm-svn: 207394
OpenPOWER on IntegriCloud