summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Scalar
Commit message (Collapse)AuthorAgeFilesLines
...
* Hoist the insertVector helper to be a static helper.Chandler Carruth2012-12-171-49/+62
| | | | | | | | | | | | | | | This will allow its use inside of memcpy rewriting as well. This routine is more complex than extractVector, and some of its uses are not 100% where I want them to be so there is still some work to do here. While this can technically change the output in some cases, it shouldn't be a change that matters -- IE, it can leave some dead code lying around that prior versions did not, etc. Yet another step in the refactorings leading up to the solution to the last component of PR14478. llvm-svn: 170328
* Lift the extractVector helper all the way out to a static helper function.Chandler Carruth2012-12-171-30/+32
| | | | | | | | | | | | | | | | | The method helpers all implicitly act upon the alloca, and what we really want is a fully generic helper. Doing memcpy rewrites is more special than all other rewrites because we are at times rewriting instructions which touch pointers *other* than the alloca. As a consequence all of the helpers needed by memcpy rewriting of sub-vector copies will need to be generalized fully. Note that all of these helpers ({insert,extract}{Integer,Vector}) are woefully uncommented. I'm going to go back through and document them once I get the factoring correct. No functionality changed. llvm-svn: 170325
* Factor the vector load rewriting into a more generic form.Chandler Carruth2012-12-171-16/+27
| | | | | | | | | This makes it suitable for use in rewriting memcpy in the presence of subvector memcpy intrinsics. No functionality changed. llvm-svn: 170324
* Fix the first part of PR14478: memset now works.Chandler Carruth2012-12-171-34/+68
| | | | | | | | | | | | | | | | | | | PR14478 highlights a serious problem in SROA that simply wasn't being exercised due to a lack of vector input code mixed with C-library function calls. Part of SROA was written carefully to handle subvector accesses via memset and memcpy, but the rewriter never grew support for this. Fixing it required refactoring the subvector access code in other parts of SROA so it could be shared, and then fixing the splat formation logic and using subvector insertion (this patch). The PR isn't quite fixed yet, as memcpy is still broken in the same way. I'm starting on that series of patches now. Hopefully this will be enough to bring the bullet benchmark back to life with the bb-vectorizer enabled, but that may require fixing memcpy as well. llvm-svn: 170301
* Extract the logic for inserting a subvector into a vector alloca.Chandler Carruth2012-12-171-38/+50
| | | | | | | No functionality changed. Another step of refactoring toward solving PR14487. llvm-svn: 170300
* Lift the integer splat computation into a helper function.Chandler Carruth2012-12-171-11/+28
| | | | | | | | No functionality changed. Refactoring leading up to the fix for PR14478 which requires some significant changes to the memset and memcpy rewriting. llvm-svn: 170299
* Relax an overly aggressive assert to fix PR14572.Chandler Carruth2012-12-151-1/+1
| | | | | | The alloca width is based on the alloc size, not the type size. llvm-svn: 170270
* Revert EVT->MVT changes, r169836-169851, due to buildbot failures.Patrik Hagglund2012-12-111-1/+1
| | | | llvm-svn: 169854
* Change TargetLowering::getLoadExtAction to take an MVT, instead of EVT.Patrik Hagglund2012-12-111-1/+1
| | | | llvm-svn: 169840
* Add a new visitor for walking the uses of a pointer value.Chandler Carruth2012-12-101-219/+159
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This visitor provides infrastructure for recursively traversing the use-graph of a pointer-producing instruction like an alloca or a malloc. It maintains a worklist of uses to visit, so it can handle very deep recursions. It automatically looks through instructions which simply translate one pointer to another (bitcasts and GEPs). It tracks the offset relative to the original pointer as long as that offset remains constant and exposes it during the visit as an APInt offset. Finally, it performs conservative escape analysis. However, currently it has some limitations that should be addressed going forward: 1) It doesn't handle vectors of pointers. 2) It doesn't provide a cheaper visitor when the constant offset tracking isn't needed. 3) It doesn't support non-instruction pointer values. The current functionality is exactly what is required to implement the SROA pointer-use visitors in terms of this one, rather than in terms of their own ad-hoc base visitor, which was always very poorly specified. SROA has been converted to use this, and the code there deleted which this utility now provides. Technically speaking, using this new visitor allows SROA to handle a few more cases than it previously did. It is now more aggressive in ignoring chains of instructions which look like they would defeat SROA, but in fact do not because they never result in a read or write of memory. While this is "neat", it shouldn't be interesting for real programs as any such chains should have been removed by others passes long before we get to SROA. As a consequence, I've not added any tests for these features -- it shouldn't be part of SROA's contract to perform such heroics. The goal is to extend the functionality of this visitor going forward, and re-use it from passes like ASan that can benefit from doing a detailed walk of the uses of a pointer. Thanks to Ben Kramer for the code review rounds and lots of help reviewing and debugging this patch. llvm-svn: 169728
* Fix PR14548: SROA was crashing on a mixture of i1 and i8 loads and stores.Chandler Carruth2012-12-101-2/+2
| | | | | | | | | | | | | | | | | | | When SROA was evaluating a mixture of i1 and i8 loads and stores, in just a particular case, it would tickle a latent bug where we compared bits to bytes rather than bits to bits. As a consequence of the latent bug, we would allow integers through which were not byte-size multiples, a situation the later rewriting code was never intended to handle. In release builds this could trigger all manner of oddities, but the reported issue in PR14548 was forming invalid bitcast instructions. The only downside of this fix is that it makes it more clear that SROA in its current form is not capable of handling mixed i1 and i8 loads and stores. Sometimes with the previous code this would work by luck, but usually it would crash, so I'm not terribly worried. I'll watch the LNT numbers just to be sure. llvm-svn: 169719
* Switch SROA to pop Uses off the back of its visitors' queues.Chandler Carruth2012-12-091-10/+8
| | | | | | | | This will more closely match the behavior of the new PtrUseVisitor that I am adding. Hopefully this will not change the actual behavior in any way, but by making the processing order more similar help in debugging. llvm-svn: 169697
* - Re-enable population count loop idiom recognization Shuxin Yang2012-12-091-19/+516
| | | | | | | - fix a bug which cause sigfault. - add two testing cases which was causing crash llvm-svn: 169687
* Revert the patches adding a popcount loop idiom recognition pass.Chandler Carruth2012-12-081-513/+19
| | | | | | | | | | | | | | There are still bugs in this pass, as well as other issues that are being worked on, but the bugs are crashers that occur pretty easily in the wild. Test cases have been sent to the original commit's review thread. This reverts the commits: r169671: Fix a logic error. r169604: Move the popcnt tests to an X86 subdirectory. r168931: Initial commit adding the pass. llvm-svn: 169683
* Fix an inadvertent typo error.Shuxin Yang2012-12-081-1/+1
| | | | llvm-svn: 169671
* s/AttrListPtr/AttributeSet/g to better label what this class is going to be ↵Bill Wendling2012-12-071-17/+17
| | | | | | in the near future. llvm-svn: 169651
* Set the 'MadeChange' variable if we are deleting blocks.Bill Wendling2012-12-061-0/+1
| | | | llvm-svn: 169455
* Add 'using' declarations to suppress -Woverloaded-virtual warnings.Matt Beaumont-Gay2012-12-041-0/+2
| | | | llvm-svn: 169214
* Teach the jump threading optimization to stop scanning the basic block when ↵Nadav Rotem2012-12-031-5/+10
| | | | | | calculating the cost after passing the threshold. llvm-svn: 169135
* Use the new script to sort the includes of every file under lib.Chandler Carruth2012-12-0332-211/+211
| | | | | | | | | | | | | | | | | Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131
* Remove some buggy and apparantly unnecessary code from SROA.Chandler Carruth2012-12-031-25/+6
| | | | | | | | | | | | | | | | | | | | | | | | | The partitioning logic attempted to handle uses of an alloca with an offset starting before the alloca so long as the use had some overlap with the alloca itself. However, there was a bug where we tested '(uint64_t)Offset >= AllocSize' without first checking whether 'Offset' was positive. As a consequence, essentially every negative offset (that is, starting *before* the alloca does) would be thrown out, even if it was overlapping. The subsequent code to throw out negative offsets which were actually non-overlapping was essentially dead. The code to *handle* overlapping negative offsets was actually dead! I've just removed all of this, and taught SROA to discard any uses which start prior to the alloca from the beginning. It has the lovely property of simplifying the code. =] All the tests still pass, and in fact no new tests are needed as this is already covered by our testsuite. Fixing the code so that negative offsets work the way the comments indicate they were supposed to work causes regressions. That's how I found this. Anyways, this is all progress in the correct direction -- tightening up SROA to be maximally aggressive. Some day, I really hope to turn out-of-bounds accesses to an alloca into 'unreachable'. llvm-svn: 169120
* SROA: Avoid struct and array types early to avoid creating an overly large ↵Benjamin Kramer2012-12-011-0/+3
| | | | | | | | | | integer type. Fixes PR14465. Differential Revision: http://llvm-reviews.chandlerc.com/D148 llvm-svn: 169084
* Replace r168930 with a more reasonable patch.Bill Wendling2012-11-301-75/+0
| | | | | | | | | | | The original patch removed a bunch of code that the SjLjEHPrepare pass placed into the entry block if all of the landing pads were removed during the CodeGenPrepare class. The more natural way of doing things is to run the CGP *before* we run the SjLjEHPrepare pass. Make it so! llvm-svn: 169044
* Move library call simplification statistic to instcombineMeador Inge2012-11-301-2/+0
| | | | | | | | | The simplify-libcalls pass maintained a statistic to count the number of library calls that have been simplified. Now that library call simplification is being carried out in instcombine the statistic should be moved to there. llvm-svn: 168975
* Move the InstVisitor utility into VMCore where it belongs. It heavilyChandler Carruth2012-11-302-2/+2
| | | | | | | | | | | | depends on the IR infrastructure, there is no sense in it being off in Support land. This is in preparation to start working to expand InstVisitor into more special-purpose visitors that are still generic and can be re-used across different passes. The expansion will go into the Analylis tree though as nothing in VMCore needs it. llvm-svn: 168972
* rdar://12100355 (part 1)Shuxin Yang2012-11-291-19/+513
| | | | | | | | | | | | | | This revision attempts to recognize following population-count pattern: while(a) { c++; ... ; a &= a - 1; ... }, where <c> and <a>could be used multiple times in the loop body. TODO: On X8664 and ARM, __buildin_ctpop() are not expanded to a efficent instruction sequence, which need to be improved in the following commits. Reviewed by Nadav, really appreciate! llvm-svn: 168931
* Handle the situation where CodeGenPrepare removes a reference to a BB that hasBill Wendling2012-11-291-0/+75
| | | | | | | | | | | | | the last invoke instruction in the function. This also removes the last landing pad in an function. This is fine, but with SjLj EH code, we've already placed a bunch of code in the 'entry' block, which expects the landing pad to stick around. When we get to the situation where CGP has removed the last landing pad, go ahead and nuke the SjLj instructions from the 'entry' block. <rdar://problem/12721258> llvm-svn: 168930
* instcombine: Migrate puts optimizationsMeador Inge2012-11-291-39/+0
| | | | | | | | | | | | This patch migrates the puts optimizations from the simplify-libcalls pass into the instcombine library call simplifier. All the simplifiers from simplify-libcalls have now been migrated to instcombine. Yay! Just a few other bits to migrate (prototype attribute inference and a few statistics) and simplify-libcalls can finally be put to rest. llvm-svn: 168925
* instcombine: Migrate fputs optimizationsMeador Inge2012-11-291-27/+0
| | | | | | | This patch migrates the fputs optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168893
* instcombine: Migrate fwrite optimizationsMeador Inge2012-11-291-38/+1
| | | | | | | This patch migrates the fwrite optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168892
* instcombine: Migrate fprintf optimizationsMeador Inge2012-11-291-93/+1
| | | | | | | This patch migrates the fprintf optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168891
* When we delete a dead basic block, see if any of its successors are dead andBill Wendling2012-11-281-3/+13
| | | | | | delete those as well. llvm-svn: 168829
* instcombine: Migrate sprintf optimizationsMeador Inge2012-11-271-98/+0
| | | | | | | This patch migrates the sprintf optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168677
* instcombine: Migrate printf optimizationsMeador Inge2012-11-261-89/+1
| | | | | | | This patch migrates the printf optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168604
* instcombine: Migrate toascii optimizationsMeador Inge2012-11-261-26/+0
| | | | | | | This patch migrates the toascii optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168580
* instcombine: Migrate isascii optimizationsMeador Inge2012-11-261-20/+0
| | | | | | | This patch migrates the isascii optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168579
* instcombine: Migrate isdigit optimizationsMeador Inge2012-11-261-21/+1
| | | | | | | This patch migrates the isdigit optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168578
* instcombine: Migrate *abs optimizationsMeador Inge2012-11-261-25/+1
| | | | | | | This patch migrates the *abs optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168574
* instcombine: Migrate ffs* optimizationsMeador Inge2012-11-251-41/+1
| | | | | | | This patch migrates the ffs* optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168571
* CodeGenPrepare: Move ret duplication out of the instruction iteration loop.Benjamin Kramer2012-11-231-6/+8
| | | | | | | It can delete the block, and the loop continues on free'd memory. No change in output. Found by valgrind. llvm-svn: 168525
* PR14055: Implement support for sub-vector operations in SROA.Chandler Carruth2012-11-211-21/+75
| | | | | | | | | | Now if we can transform an alloca into a single vector value, but it has subvector, non-element accesses, we form the appropriate shufflevectors to allow SROA to proceed. This fixes PR14055 which pointed out a very common pattern that SROA couldn't handle -- mixed vec3 and vec4 operations on a single alloca. llvm-svn: 168418
* Use LLVM_ENABLE_DUMP for the variables used in printing as well as theChandler Carruth2012-11-201-2/+2
| | | | | | | | | printing functions themselves. Part of PR14324 (which should have just been a patch to the list, but hey...) llvm-svn: 168362
* Fix PR14132 and handle OOB loads speculated throuh PHI nodes.Chandler Carruth2012-11-201-0/+21
| | | | | | | | | | | | The issue is that we may end up with newly OOB loads when speculating a load into the predecessors of a PHI node, and this confuses the new integer splitting logic in some cases, triggering an assertion failure. In fact, the branch in question must be dead code as it loads from a too-narrow alloca. Add code to handle this gracefully and leave the requisite FIXMEs for both optimizing more aggressively and doing more to aid sanitizing invalid code which triggers these patterns. llvm-svn: 168361
* Add a comment to associate a FIXME with a PR where it is matters.Chandler Carruth2012-11-201-1/+2
| | | | llvm-svn: 168347
* Rework the rewriting of loads and stores for vector and integer allocasChandler Carruth2012-11-201-168/+118
| | | | | | | | | | | | | | | | | | | | | | | | | | to properly handle the combinations of these with split integer loads and stores. This essentially replaces Evan's r168227 by refactoring the code in a different way, and trynig to mirror that refactoring in both the load and store sides of the rewriting. Generally speaking there was some really problematic duplicated code here that led to poorly founded assumptions and then subtle bugs. Now much of the code actually flows through and follows a more consistent style and logical path. There is still a tiny bit of duplication on the store side of things, but it is much less bad. This also changes the logic to never re-use a load or store instruction as that was simply too error prone in practice. I've added a few tests (one a reduction of the one in Evan's original patch, which happened to be the same as the report in PR14349). I'm going to look at adding a few more tests for things I found and fixed in passing (such as the volatile tests in the vectorizable predicate). This patch has survived bootstrap, and modulo one bugfix survived Duncan's test suite, but let me know if anything else explodes. llvm-svn: 168346
* Remove the last bit of constant folding from LinearizeExprTree (most of it wasDuncan Sands2012-11-181-11/+0
| | | | | | removed in commit 168035, but I missed this bit). llvm-svn: 168292
* Fix PR14060, an infinite loop in reassociate. The problem was that one of theDuncan Sands2012-11-181-6/+24
| | | | | | | | | | operands of the expression being written was wrongly thought to be reusable as an inner node of the expression resulting in it turning up as both an inner node *and* a leaf, creating a cycle in the def-use graph. This would have caused the verifier to blow up if things had gotten that far, however it managed to provoke an infinite loop first. llvm-svn: 168291
* Teach SROA rewriteVectorizedStoreInst to handle cases when the loaded value ↵Evan Cheng2012-11-171-33/+42
| | | | | | is narrower than the stored value. rdar://12713675 llvm-svn: 168227
* Fix a crash observed by Shuxin Yang. The issue here is that LinearizeExprTree,Duncan Sands2012-11-151-54/+21
| | | | | | | | | | | | | | the utility for extracting a chain of operations from the IR, thought that it might as well combine any constants it came across (rather than just returning them along with everything else). On the other hand, the factorization code would like to see the individual constants (this is quite reasonable: it is much easier to pull a factor of 3 out of 2*3 than it is to pull it out of 6; you may think 6/3 isn't so hard, but due to overflow it's not as easy to undo multiplications of constants as it may at first appear). This patch therefore makes LinearizeExprTree stupider: it now leaves optimizing to the optimization part of reassociate, and sticks to just analysing the IR. llvm-svn: 168035
* instcombine: Migrate math library call simplificationsMeador Inge2012-11-131-248/+1
| | | | | | | | | | | | | | | | | This patch migrates the math library call simplifications from the simplify-libcalls pass into the instcombine library call simplifier. I have typically migrated just one simplifier at a time, but the math simplifiers are interdependent because: 1. CosOpt, PowOpt, and Exp2Opt all depend on UnaryDoubleFPOpt. 2. CosOpt, PowOpt, Exp2Opt, and UnaryDoubleFPOpt all depend on the option -enable-double-float-shrink. These two factors made migrating each of these simplifiers individually more of a pain than it would be worth. So, I migrated them all together. llvm-svn: 167815
OpenPOWER on IntegriCloud