summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* Vectorize intrinsic math function calls in SLPVectorizer.Karthik Bhat2014-05-032-143/+22
| | | | | | | This patch adds support to recognize and vectorize intrinsic math functions in SLPVectorizer. Review: http://reviews.llvm.org/D3560 and http://reviews.llvm.org/D3559 llvm-svn: 207901
* Clean up constructor logic and member access for LoopVectorizeHints.Eric Christopher2014-05-021-34/+39
| | | | | | | | | There are public functions that mutate various members as well as another private member already, so make all the members private to avoid the discontinuity and add accessors for the values. Should be no functional change. llvm-svn: 207868
* Teach GlobalDCE how to remove empty global_ctor entries.Nico Weber2014-05-024-158/+204
| | | | | | | | | | | | | | | | | This moves most of GlobalOpt's constructor optimization code out of GlobalOpt into Transforms/Utils/CDtorUtils.{h,cpp}. The public interface is a single function OptimizeGlobalCtorsList() that takes a predicate returning which constructors to remove. GlobalOpt calls this with a function that statically evaluates all constructors, just like it did before. This part of the change is behavior-preserving. Also add a call to this from GlobalDCE with a filter that removes global constructors that contain a "ret" instruction and nothing else – this fixes PR19590. llvm-svn: 207856
* [GVN] Pass the phi-translated address of a load instead of the untranslatedAkira Hatanaka2014-05-021-2/+1
| | | | | | | | | address to AnalyzeLoadFromClobberingLoad. This fixes a bug in load-PRE where PRE is applied to a load that is not partially redundant. <rdar://problem/16638765>. llvm-svn: 207853
* Fold strlen(expr ? "str1" : "str2") to x ? len1 : len2. This fires about 330 ↵Nick Lewycky2014-05-021-0/+15
| | | | | | times in a bootstrap of clang. llvm-svn: 207828
* Update and sort CMakeLists.Benjamin Kramer2014-05-011-5/+6
| | | | llvm-svn: 207785
* Add an optimization that does CSE in a group of similar GEPs.Eli Bendersky2014-05-012-0/+584
| | | | | | | | | | | | | | This optimization merges the common part of a group of GEPs, so we can compute each pointer address by adding a simple offset to the common part. The optimization is currently only enabled for the NVPTX backend, where it has a large payoff on some benchmarks. Review: http://reviews.llvm.org/D3462 Patch by Jingyue Wu. llvm-svn: 207783
* Revert r205965, which essentially reverts r205018 for the second time.Chandler Carruth2014-05-011-65/+30
| | | | | | | | | | | | | | | | =[ Turns out that this was the root cause of PR19621. We found a crasher only recently (likely due to improvements elsewhere in the SLP vectorizer) but the reduced test case failed all the way back to here. I've confirmed that reverting this patch both fixes the reduced test case in PR19621 and the actual source file that led to it, so it seems to really be rooted here. I've replied to the commit thread with discussion of my (feeble) attempts to debug this. Didn't make it very far, so reverting now that we have a good test case so that things can get back to healthy while the debugging carries on. llvm-svn: 207746
* Patch for function cloning to inline all blocks whose address is takenGerolf Hoflehner2014-04-301-34/+106
| | | | | | | | | | | | | Not all address taken blocks get inlined. The reason is that a blocks new address is known only when it is cloned. But e.g. a branch instruction in a different block could need that address earlier while it gets cloned. The solution is to collect the set of all blocks that can potentially get inlined and compute a new block address up front. Then clone and cleanup. rdar://16427209 llvm-svn: 207713
* Revert r207571 - Add slp vectorization to LTO passesYi Jiang2014-04-301-3/+0
| | | | llvm-svn: 207693
* [IPO/MergeFunctions] changes so it doesn't try to bitcast a struct return ↵Carlo Kok2014-04-301-1/+16
| | | | | | type but instead recreates it with insert/extract value. llvm-svn: 207679
* Add a <tuple> include to more files that aren't getting it transitively on MSVC.Benjamin Kramer2014-04-302-0/+2
| | | | llvm-svn: 207617
* ConstantHoisting.cpp: Add <tuple> for std::tie, since r207593 removed ↵NAKAMURA Takumi2014-04-301-0/+1
| | | | | | FileSystem.h, it includes <tuple>. llvm-svn: 207614
* Tidy up.Jim Grosbach2014-04-291-2/+2
| | | | llvm-svn: 207585
* Spelling.Jim Grosbach2014-04-291-1/+1
| | | | llvm-svn: 207584
* Also handle ConstantAggregateZero when optimizing vpermilvar*.Rafael Espindola2014-04-291-20/+22
| | | | llvm-svn: 207582
* Remove tabs.Rafael Espindola2014-04-291-4/+4
| | | | | | Sorry, new machine and I forgot to change the editor setting. llvm-svn: 207578
* Two fixes to the vpermilvar optimization.Rafael Espindola2014-04-291-1/+24
| | | | | | | | The instcomine logic to handle vpermilvar's pd and 256 variants was incorrect. The _256 variants have indexes into the individual 128 bit lanes and in all cases it also has to mask out unused bits. llvm-svn: 207577
* Fix vectorization remarks.Diego Novillo2014-04-291-6/+13
| | | | | | | | | This patch changes the vectorization remarks to also inform when vectorization is possible but not beneficial. Added tests to exercise some loop remarks. llvm-svn: 207574
* Continue slp vectorization even the BB already has vectorized store ↵Yi Jiang2014-04-291-1/+1
| | | | | | radar://16641956 llvm-svn: 207572
* Add slp vectorization to LTO passesYi Jiang2014-04-291-0/+3
| | | | llvm-svn: 207571
* Reapply r207271 without the testcaseAdam Nemet2014-04-291-9/+12
| | | | | | PR19608 was filed to find a suitable testcase. llvm-svn: 207569
* Add optimization remarks to the loop unroller and vectorizer.Diego Novillo2014-04-292-0/+20
| | | | | | | | | | | | | | | Summary: This calls emitOptimizationRemark from the loop unroller and vectorizer at the point where they make a positive transformation. For the vectorizer, it reports vectorization and interleave factors. For the loop unroller, it reports all the different supported types of unrolling. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3456 llvm-svn: 207528
* [BUG] Fix -Wunused-variable warning in Release mode. Thnx to Kostya ↵Zinovy Nis2014-04-291-2/+3
| | | | | | Serebryany for pointing. llvm-svn: 207516
* fix -Wunused-variable warning in Release modeKostya Serebryany2014-04-291-0/+1
| | | | llvm-svn: 207514
* [OPENMP][LV][D3423] Respect Hints.Force meta-data for loops in LoopVectorizerZinovy Nis2014-04-291-27/+57
| | | | llvm-svn: 207512
* Fix a typo in commentMichael Zolotukhin2014-04-291-1/+1
| | | | llvm-svn: 207499
* Revert r207271 for now. This commit introduced a test case that ranChandler Carruth2014-04-281-12/+9
| | | | | | | | clang directly from the LLVM test suite! That doesn't work. I've followed up on the review thread to try and get a viable solution sorted out, but trying to get the tree clean here. llvm-svn: 207462
* InstCombine: don't drop 'inalloca' in PromoteCastOfAllocation (PR19569)Hans Wennborg2014-04-281-0/+1
| | | | llvm-svn: 207426
* Fix rampant quadratic behavior in UpdatePHINodes. The operation ofChandler Carruth2014-04-281-23/+40
| | | | | | | | | | | | | | | | | | | | | | | mapping from a basic block to an incoming value, either for removal or just lookup, is linear in the number of predecessors, and we were doing this for every entry in the 'Preds' list which is in many cases almost all of them! Unfortunately, the fixes are quite ugly. PHI nodes just don't make this operation easy. The efficient way to fix this is to have a clever 'remove_if' operation on PHI nodes that lets us do a single pass over all the incoming values of the original PHI node, extracting the ones we care about. Then we could quickly construct the new phi node from this list. This would remove the remaining underlying quadratic movement of unrelated incoming values and the need for silly backwards looping to "minimize" how often we hit the quadratic case. This is the last obvious fix for PR19499. It shaves another 20% off the compile time for me, and while UpdatePHINodes remains in the profile, most of the time is now stemming from the well known inefficiencies of LVI and jump threading. llvm-svn: 207409
* [C++] Use 'nullptr'.Craig Topper2014-04-2814-44/+44
| | | | llvm-svn: 207394
* RecursivelyDeleteTriviallyDeadInstructions() could removeGerolf Hoflehner2014-04-262-2/+18
| | | | | | | | | | | more than 1 instruction. The caller need to be aware of this and adjust instruction iterators accordingly. rdar://16679376 Repaired r207302. llvm-svn: 207309
* Restore CloneFunction.cpp which got accidentlyGerolf Hoflehner2014-04-261-92/+33
| | | | | | overwritten by previous backout of r207303 llvm-svn: 207308
* Revert commit r207302 since build failuresGerolf Hoflehner2014-04-263-51/+94
| | | | | | have been reported. llvm-svn: 207303
* RecursivelyDeleteTriviallyDeadInstructions() could removeGerolf Hoflehner2014-04-262-2/+18
| | | | | | | | | more than 1 instruction. The caller need to be aware of this and adjust instruction iterators accordingly. rdar://16679376 llvm-svn: 207302
* [InstCombine][X86] Teach how to fold calls to SSE2/AVX2 packed logical shiftAndrea Di Biagio2014-04-261-9/+41
| | | | | | | | | | right intrinsics. A packed logical shift right with a shift count bigger than or equal to the element size always produces a zero vector. In all other cases, it can be safely replaced by a 'lshr' instruction. llvm-svn: 207299
* Unbreak the gdb buildbot by not lowering dbg.declare intrinsics for arrays.Adrian Prantl2014-04-251-1/+7
| | | | llvm-svn: 207284
* [LoopStrengthReduce] Don't trim formula that uses a subset of required registersAdam Nemet2014-04-251-9/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Consider this use from the new testcase: LSR Use: Kind=ICmpZero, Offsets={0}, widest fixup type: i32 reg({1000,+,-1}<nw><%for.body>) -3003 + reg({3,+,3}<nw><%for.body>) -1001 + reg({1,+,1}<nuw><nsw><%for.body>) -1000 + reg({0,+,1}<nw><%for.body>) -3000 + reg({0,+,3}<nuw><%for.body>) reg({-1000,+,1}<nw><%for.body>) reg({-3000,+,3}<nsw><%for.body>) This is the last use we consider for a solution in SolveRecurse, so CurRegs is a large set. (CurRegs is the set of registers that are needed by the previously visited uses in the in-progress solution.) ReqRegs is { {3,+,3}<nw><%for.body>, {1,+,1}<nuw><nsw><%for.body> } This is the intersection of the regs used by any of the formulas for the current use and CurRegs. Now, the code requires a formula to contain *all* these regs (the comment is simply wrong), otherwise the formula is immediately disqualified. Obviously, no formula for this use contains two regs so they will all get disqualified. The fix modifies the check to allow the formula in this case. The idea is that neither of these formulae is introducing any new registers which is the point of this early pruning as far as I understand. In terms of set arithmetic, we now allow formulas whose used regs are a subset of the required regs not just the other way around. There are few more loops in the test-suite that are now successfully LSRed. I have benchmarked those and found very minimal change. Fixes <rdar://problem/13965777> llvm-svn: 207271
* This reapplies r207235 with an additional bugfixes caught by the msanAdrian Prantl2014-04-251-8/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | buildbot - do not insert debug intrinsics before phi nodes. Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting an MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into *indirect* DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca were they are (see comment) Other - regenerated/updated instcombine.ll testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207269
* SCC: Change clients to use const, NFCDuncan P. N. Exon Smith2014-04-252-8/+7
| | | | | | | | | | It's fishy to be changing the `std::vector<>` owned by the iterator, and no one actual does it, so I'm going to remove the ability in a subsequent commit. First, update the users. <rdar://problem/14292693> llvm-svn: 207252
* Revert "This reapplies r207130 with an additional testcase+and a missing ↵Adrian Prantl2014-04-251-14/+8
| | | | | | | | check for" This reverts commit 207235 to investigate msan buildbot breakage. llvm-svn: 207250
* [inline cold threshold] Command line argument for inline threshold willManman Ren2014-04-251-1/+6
| | | | | | | | | | | override the default cold threshold. When we use command line argument to set the inline threshold, the default cold threshold will not be used. This is in line with how we use OptSizeThreshold. When we want a higher threshold for all functions, we do not have to set both inline threshold and cold threshold. llvm-svn: 207245
* Reapply r207135 without modifications.Adrian Prantl2014-04-251-17/+3
| | | | | | | | | | Debug info: Let dbg.values inserted by LowerDbgDeclare inherit the location of the dbg.value. This gets rid of tons of redundant variable DIEs in subscopes. rdar://problem/14874886, rdar://problem/16679936 llvm-svn: 207236
* This reapplies r207130 with an additional testcase+and a missing check forAdrian Prantl2014-04-251-8/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | AllocaInst that was missing in one location. Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting an MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into *indirect* DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca were they are (see comment) Other - regenerated/updated instcombine.ll testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207235
* [C++] Use 'nullptr'. Transforms edition.Craig Topper2014-04-25100-1600/+1628
| | | | llvm-svn: 207196
* Allow vectorization of bit intrinsics in BB Vectorizer.Karthik Bhat2014-04-251-8/+21
| | | | | | This patch adds support for vectorization of bit intrinsics such as bswap,ctpop,ctlz,cttz. llvm-svn: 207174
* Revert "This reapplies r207130 with an additional testcase+and a missing ↵Adrian Prantl2014-04-251-14/+8
| | | | | | | | check for" Typo in testcase. llvm-svn: 207166
* This reapplies r207130 with an additional testcase+and a missing check forAdrian Prantl2014-04-251-8/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | AllocaInst that was missing in one location. Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this. This patch fixes this by Local - emitting dbg.value intrinsics for allocas that are passed by reference - dropping all dbg.declares (they are now fully lowered to dbg.values) SelectionDAG - renamed constructors for SDDbgValue for better readability. - fix UserValue::match() to handle indirect values correctly - not inserting an MMI table entries for dbg.values that describe allocas. - lowering dbg.values that describe allocas into *indirect* DBG_VALUEs. CodeGenPrepare - leaving dbg.values for an alloca were they are (see comment) Other - regenerated/updated instcombine.ll testcase and included source rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207165
* Revert "Debug info for optimized code: Support variables that are on the ↵Adrian Prantl2014-04-251-14/+8
| | | | | | | | stack and" This reverts commit 207130 for buildbot breakage. llvm-svn: 207162
* Revert "Debug info: Let dbg.values inserted by LowerDbgDeclare inherit the ↵Adrian Prantl2014-04-241-3/+17
| | | | | | | | location" This reverts commit 207130 for buildbot breakage. llvm-svn: 207159
OpenPOWER on IntegriCloud