path: root/llvm/lib/Transforms/Scalar
Commit message | Author | Age | Files | Lines
* [SeparateConstOffsetFromGEP] inbounds zext => sext for better splitting | Jingyue Wu | 2014-06-08 | 1 | -1/+57
  For each array index that is in the form of zext(a), convert it to sext(a) if we can prove zext(a) <= max signed value of typeof(a). The conversion helps to split zext(x + y) into sext(x) + sext(y).
  Reviewed in http://reviews.llvm.org/D4060
  llvm-svn: 210444
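The condition in this commit message can be checked with a small model. The sketch below is not LLVM code: it models an iN value as a Python integer, and the helper names (`zext`, `sext`, `can_replace_zext_with_sext`) are made up for illustration. It checks exhaustively over i8 that zero extension and sign extension agree whenever the unsigned value does not exceed the signed maximum of the narrow type.

```python
def zext(value, from_bits):
    """Zero-extend: read the low from_bits bits as an unsigned integer."""
    return value & ((1 << from_bits) - 1)


def sext(value, from_bits):
    """Sign-extend: read the low from_bits bits as a two's-complement integer."""
    value &= (1 << from_bits) - 1
    sign_bit = 1 << (from_bits - 1)
    return value - (1 << from_bits) if value & sign_bit else value


def can_replace_zext_with_sext(value, from_bits):
    """Hypothetical predicate: zext(value) <= max signed value of the narrow type."""
    return zext(value, from_bits) <= (1 << (from_bits - 1)) - 1


BITS = 8
# For every i8 bit pattern that satisfies the predicate, both extensions agree.
agree = all(
    zext(a, BITS) == sext(a, BITS)
    for a in range(1 << BITS)
    if can_replace_zext_with_sext(a, BITS)
)
print(agree)
```

Patterns with the sign bit set, such as `0x80`, fail the predicate, and for those the two extensions differ.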
* [SeparateConstOffsetFromGEP] Fix an illegitimate optimization on zext | Jingyue Wu | 2014-06-08 | 1 | -2/+2
  zext(a + b) != zext(a) + zext(b) even if a + b >= 0 && b >= 0.
  e.g., a = i4 0b1111, b = i4 0b0001
    zext (a + b) to i8 = zext 0b0000 to i8 = 0b00000000
    (zext a to i8) + (zext b to i8) = 0b00001111 + 0b00000001 = 0b00010000
  llvm-svn: 210439
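The counterexample in this commit message can be replayed with ordinary integer arithmetic. This is a minimal sketch, not LLVM code; `trunc` and `zext` are illustrative helpers that model fixed-width values as Python integers.

```python
def trunc(value, bits):
    """Keep only the low `bits` bits, modelling wrap-around at that width."""
    return value & ((1 << bits) - 1)


def zext(value, from_bits):
    """Zero-extend: the unsigned reading of the narrow bit pattern."""
    return trunc(value, from_bits)


# i4 operands from the commit message; as signed i4 values, a is -1 and b is 1,
# so a + b >= 0 and b >= 0 both hold.
a, b = 0b1111, 0b0001

# zext (a + b) to i8: the add wraps in i4 before the extension.
lhs = zext(trunc(a + b, 4), 4)
# (zext a to i8) + (zext b to i8): the add happens in i8 and does not wrap.
rhs = trunc(zext(a, 4) + zext(b, 4), 8)

print(f"{lhs:#010b} {rhs:#010b}")  # prints "0b00000000 0b00010000"
```

The two sides differ because the carry out of bit 3 is discarded when the add happens in the narrow type but kept when it happens in the wide type.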
* Refactor canonicalizing array indices to a helper function | Jingyue Wu | 2014-06-08 | 1 | -32/+51
  No functionality changes.
  llvm-svn: 210438
* Fixed several correctness issues in SeparateConstOffsetFromGEP | Jingyue Wu | 2014-06-05 | 1 | -204/+338
  Most issues are on mishandling s/zext.
  Fixes:
  1. When rebuilding new indices, s/zext should be distributed to sub-expressions. e.g., sext(a +nsw (b +nsw 5)) = sext(a) + sext(b) + 5 but not sext(a + b) + 5. This also affects the logic of recursively looking for a constant offset: we need to include s/zext into the context of the searching.
  2. Function find should return the bitwidth of the constant offset instead of always sign-extending it to i64.
  3. Stop shortcutting zext'ed GEP indices. LLVM conceptually sign-extends GEP indices to pointer-size before computing the address. Therefore, gep base, zext(a + b) != gep base, a + b
  Improvements:
  1. Add an optimization for splitting sext(a + b): if a + b is proven non-negative (e.g., used as an index of an inbounds GEP) and one of a, b is non-negative, sext(a + b) = sext(a) + sext(b)
  2. Function Distributable checks whether both sext and zext can be distributed to operands of a binary operator. This helps us split zext(sext(a + b)) to zext(sext(a)) + zext(sext(b)) when a + b overflows in neither the signed nor the unsigned sense.
  Refactoring: Merge some common logic of handling add/sub/or in find.
  Testing: Add many tests in split-gep.ll and split-gep-and-gvn.ll to verify the changes we made.
  llvm-svn: 210291
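The first improvement listed in this commit message rests on an arithmetic fact that is easy to check by brute force. The sketch below is an illustrative model over i4, not the pass's implementation; `as_signed`, `wrapping_add` and `split_is_safe` are assumed names. It treats sext to a wider type as taking the signed value, and adds in the wide type without wrap-around.

```python
BITS = 4
MASK = (1 << BITS) - 1


def as_signed(value):
    """Signed reading of an i4 bit pattern; this is what sext preserves."""
    value &= MASK
    return value - (1 << BITS) if value >> (BITS - 1) else value


def wrapping_add(a, b):
    """Addition in i4 with wrap-around."""
    return (a + b) & MASK


def split_is_safe(a, b):
    """Precondition from the commit message: the (wrapped) sum is non-negative
    and at least one operand is non-negative."""
    sum_non_negative = as_signed(wrapping_add(a, b)) >= 0
    one_operand_non_negative = as_signed(a) >= 0 or as_signed(b) >= 0
    return sum_non_negative and one_operand_non_negative


# Collect every pair that meets the precondition but breaks
# sext(a + b) == sext(a) + sext(b).
violations = [
    (a, b)
    for a in range(1 << BITS)
    for b in range(1 << BITS)
    if split_is_safe(a, b)
    and as_signed(wrapping_add(a, b)) != as_signed(a) + as_signed(b)
]
print(violations)  # prints []
```

Without the precondition the split can fail: for a = b = 7 the i4 sum wraps to -2, while sext(a) + sext(b) is 14, and `split_is_safe(7, 7)` is false for exactly that reason.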
* [Reassociate] Similar to "X + -X" -> "0", added code to handle "X + ~X" -> "-1". | Benjamin Kramer | 2014-05-31 | 1 | -8/+23
  Handle "X + ~X" -> "-1" in the function Value *Reassociate::OptimizeAdd(Instruction *I, SmallVectorImpl<ValueEntry> &Ops);
  This patch implements:
  TODO: We could handle "X + ~X" -> "-1" if we wanted, since "-X = ~X+1".
  Patch by Rahul Jain!
  Differential Revision: http://reviews.llvm.org/D3835
  llvm-svn: 209973
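Both identities quoted in this commit message hold for every fixed-width two's-complement value, which a brute-force check makes concrete. The sketch below models N-bit integers with masking in Python; the function names are illustrative and unrelated to the Reassociate code.

```python
def add_not_is_all_ones(bits):
    """Check X + ~X == -1 (the all-ones pattern) for every value of the width."""
    mask = (1 << bits) - 1  # also the bit pattern of -1 at this width
    return all(((x + (~x & mask)) & mask) == mask for x in range(1 << bits))


def neg_is_not_plus_one(bits):
    """Check -X == ~X + 1 for every value of the width."""
    mask = (1 << bits) - 1
    return all(((-x) & mask) == (((~x & mask) + 1) & mask) for x in range(1 << bits))


print(add_not_is_all_ones(8), neg_is_not_plus_one(8))  # prints "True True"
```

The first identity follows from the second: X + ~X = X + (-X - 1) = -1.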
* Add LoadCombine pass. | Michael J. Spencer | 2014-05-29 | 3 | -0/+270
  This pass is disabled by default. Use -combine-loads to enable in -O[1-3]
  Differential revision: http://reviews.llvm.org/D3580
  llvm-svn: 209791
* Distribute sext/zext to the operands of and/or/xor | Jingyue Wu | 2014-05-27 | 1 | -13/+29
  This is an enhancement to SeparateConstOffsetFromGEP. With this patch, we can extract a constant offset from "s/zext and/or/xor A, B".
  Added a new test @ext_or to verify this enhancement.
  Refactoring the code, I also extracted some common logic to function Distributable.
  llvm-svn: 209670
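Bitwise operators act on each bit position independently, which is why an extension can be pushed through and/or/xor without the overflow conditions that add needs. The following sketch checks this exhaustively for i4 to i8; it is a model with assumed helper names, not the pass's `Distributable` function.

```python
import operator

NARROW_BITS, WIDE_BITS = 4, 8
NARROW_MASK = (1 << NARROW_BITS) - 1
WIDE_MASK = (1 << WIDE_BITS) - 1


def zext(value):
    """i4 -> i8 zero extension: the upper bits become 0."""
    return value & NARROW_MASK


def sext(value):
    """i4 -> i8 sign extension: the upper bits copy the i4 sign bit."""
    value &= NARROW_MASK
    if value >> (NARROW_BITS - 1):
        value |= WIDE_MASK & ~NARROW_MASK
    return value


def distributes(ext, op):
    """True if ext(a op b) == ext(a) op ext(b) for every pair of i4 values."""
    values = range(1 << NARROW_BITS)
    return all(ext(op(a, b)) == op(ext(a), ext(b)) for a in values for b in values)


results = {
    (ext.__name__, op.__name__): distributes(ext, op)
    for ext in (zext, sext)
    for op in (operator.and_, operator.or_, operator.xor)
}
print(all(results.values()))  # prints True
```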
* Make the LoopRotate pass's maximum header size configurable both ↵ | Owen Anderson | 2014-05-26 | 1 | -4/+14
  programmatically and via the command line, mirroring similar functionality in LoopUnroll. In situations where clients used custom unrolling thresholds, their intent could previously be foiled by LoopRotate having a hardcoded threshold.
  llvm-svn: 209617
* Add the extracted constant offset using GEP | Jingyue Wu | 2014-05-23 | 1 | -26/+50
  Fixed a TODO in r207783. Add the extracted constant offset using GEP instead of ugly ptrtoint+add+inttoptr. Using GEP simplifies future optimizations and makes IR easier to understand.
  Updated all affected tests, and added a new test in split-gep.ll to cover a corner case where emitting uglygep is necessary.
  llvm-svn: 209537
* Add support for missed and analysis optimization remarks. | Diego Novillo | 2014-05-22 | 1 | -8/+8
  Summary: This adds two new diagnostics: -pass-remarks-missed and -pass-remarks-analysis. They take the same values as -pass-remarks but are intended to be triggered in different contexts.
  -pass-remarks-missed is used by LLVMContext::emitOptimizationRemarkMissed, which passes call when they tried to apply a transformation but couldn't.
  -pass-remarks-analysis is used by LLVMContext::emitOptimizationRemarkAnalysis, which passes call when they want to inform the user about analysis results.
  The patch also:
  1- Adds support in the inliner for the two new remarks and a test case.
  2- Moves emitOptimizationRemark* functions to the llvm namespace.
  3- Adds an LLVMContext argument instead of making them member functions of LLVMContext.
  Reviewers: qcolombet
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D3682
  llvm-svn: 209442
* [LSR] Canonicalize reg1 + ... + regN into reg1 + ... + 1*regN. | Quentin Colombet | 2014-05-20 | 1 | -183/+375
  This commit introduces a canonical representation for the formulae. Basically, as soon as a formula has more than one base register, the scaled register field is used for one of them. The register put into the scaled register is preferably a loop variant.
  The commit refactors how the formulae are built in order to produce such representation. This yields a more accurate, but still perfectible, cost model.
  <rdar://problem/16731508>
  llvm-svn: 209230
* Use range for | Matt Arsenault | 2014-05-19 | 1 | -4/+1
  llvm-svn: 209147
* Revert "Implement global merge optimization for global variables." | Rafael Espindola | 2014-05-16 | 2 | -76/+10
  This reverts commit r208934.
  The patch depends on aliases to GEPs with non zero offsets. That is not supported and fairly broken.
  The good news is that GlobalAlias is being redesigned and will have support for offsets, so this patch should be a nice match for it.
  llvm-svn: 208978
* Implement global merge optimization for global variables. | Jiangning Liu | 2014-05-15 | 2 | -10/+76
  This commit implements two command line switches -global-merge-on-external and -global-merge-aligned, and both of them are false by default, so this optimization is disabled by default for all targets.
  For ARM64, some back-end behaviors need to be tuned to get this optimization further enabled.
  llvm-svn: 208934
* Fix typos | Alp Toker | 2014-05-15 | 1 | -2/+2
  llvm-svn: 208839
* Rename ComputeMaskedBits to computeKnownBits. "Masked" has been | Jay Foad | 2014-05-14 | 1 | -1/+1
  inappropriate since it lost its Mask parameter in r154011.
  llvm-svn: 208811
* GVN: Fix non-determinism in map iteration. | Benjamin Kramer | 2014-05-13 | 1 | -4/+7
  Iterating over a DenseMap is non-deterministic and results in unpredictable IR output.
  Based on a patch by Daniel Reynaud!
  llvm-svn: 208728
* GVN: rangify a couple of loops. | Benjamin Kramer | 2014-05-13 | 1 | -13/+9
  No functionality change.
  llvm-svn: 208727
* Improve wording to make it sound more like a change than an analysis. | Nick Lewycky | 2014-05-08 | 1 | -2/+3
  llvm-svn: 208370
* Simplify and fix incorrect comment. No functionality change. | Richard Smith | 2014-05-08 | 1 | -22/+15
  llvm-svn: 208272
* Detabify. | Nick Lewycky | 2014-05-06 | 1 | -2/+2
  llvm-svn: 208019
* Improve 'tail' call marking in TRE. A bootstrap of clang goes from 375k ↵ | Nick Lewycky | 2014-05-05 | 1 | -73/+241
  calls marked tail in the IR to 470k, however this improvement does not carry into an improvement of the call/jmp ratio on x86. The most common pattern is a tail call + br to a block with nothing but a 'ret'.
  The number of tail call to loop conversions remains the same (1618 by my count).
  The new algorithm does a local scan over the use-def chains to identify local "alloca-derived" values, as well as points where the alloca could escape. Then, a visit over the CFG marks blocks as being before or after the allocas have escaped, and annotates the calls accordingly.
  llvm-svn: 208017
* LoopUnroll: If we're doing partial unrolling, use the PartialThreshold to ↵ | Benjamin Kramer | 2014-05-04 | 1 | -3/+6
  limit unrolling.
  Otherwise we use the same threshold as for complete unrolling, which is way too high. This made us unroll any loop smaller than 150 instructions by 8 times, but only if someone specified -march=core2 or better, which happens to be the default on darwin.
  llvm-svn: 207940
* [GVN] Pass the phi-translated address of a load instead of the untranslated | Akira Hatanaka | 2014-05-02 | 1 | -2/+1
  address to AnalyzeLoadFromClobberingLoad. This fixes a bug in load-PRE where PRE is applied to a load that is not partially redundant.
  <rdar://problem/16638765>.
  llvm-svn: 207853
* Update and sort CMakeLists. | Benjamin Kramer | 2014-05-01 | 1 | -5/+6
  llvm-svn: 207785
* Add an optimization that does CSE in a group of similar GEPs. | Eli Bendersky | 2014-05-01 | 2 | -0/+584
  This optimization merges the common part of a group of GEPs, so we can compute each pointer address by adding a simple offset to the common part.
  The optimization is currently only enabled for the NVPTX backend, where it has a large payoff on some benchmarks.
  Review: http://reviews.llvm.org/D3462
  Patch by Jingyue Wu.
  llvm-svn: 207783
* ConstantHoisting.cpp: Add <tuple> for std::tie, since r207593 removed ↵ | NAKAMURA Takumi | 2014-04-30 | 1 | -0/+1
  FileSystem.h, it includes <tuple>.
  llvm-svn: 207614
* Tidy up. | Jim Grosbach | 2014-04-29 | 1 | -2/+2
  llvm-svn: 207585
* Spelling. | Jim Grosbach | 2014-04-29 | 1 | -1/+1
  llvm-svn: 207584
* Reapply r207271 without the testcase | Adam Nemet | 2014-04-29 | 1 | -9/+12
  PR19608 was filed to find a suitable testcase.
  llvm-svn: 207569
* Revert r207271 for now. This commit introduced a test case that ran | Chandler Carruth | 2014-04-28 | 1 | -12/+9
  clang directly from the LLVM test suite! That doesn't work. I've followed up on the review thread to try and get a viable solution sorted out, but trying to get the tree clean here.
  llvm-svn: 207462
* [C++] Use 'nullptr'. | Craig Topper | 2014-04-28 | 3 | -4/+4
  llvm-svn: 207394
* RecursivelyDeleteTriviallyDeadInstructions() could remove | Gerolf Hoflehner | 2014-04-26 | 1 | -1/+9
  more than 1 instruction. The caller needs to be aware of this and adjust instruction iterators accordingly.
  rdar://16679376
  Repaired r207302.
  llvm-svn: 207309
* Revert commit r207302 since build failures | Gerolf Hoflehner | 2014-04-26 | 1 | -9/+1
  have been reported.
  llvm-svn: 207303
* RecursivelyDeleteTriviallyDeadInstructions() could remove | Gerolf Hoflehner | 2014-04-26 | 1 | -1/+9
  more than 1 instruction. The caller needs to be aware of this and adjust instruction iterators accordingly.
  rdar://16679376
  llvm-svn: 207302
* [LoopStrengthReduce] Don't trim formula that uses a subset of required registers | Adam Nemet | 2014-04-25 | 1 | -9/+12
  Consider this use from the new testcase:
    LSR Use: Kind=ICmpZero, Offsets={0}, widest fixup type: i32
      reg({1000,+,-1}<nw><%for.body>)
      -3003 + reg({3,+,3}<nw><%for.body>)
      -1001 + reg({1,+,1}<nuw><nsw><%for.body>)
      -1000 + reg({0,+,1}<nw><%for.body>)
      -3000 + reg({0,+,3}<nuw><%for.body>)
      reg({-1000,+,1}<nw><%for.body>)
      reg({-3000,+,3}<nsw><%for.body>)
  This is the last use we consider for a solution in SolveRecurse, so CurRegs is a large set. (CurRegs is the set of registers that are needed by the previously visited uses in the in-progress solution.)
  ReqRegs is { {3,+,3}<nw><%for.body>, {1,+,1}<nuw><nsw><%for.body> }
  This is the intersection of the regs used by any of the formulas for the current use and CurRegs.
  Now, the code requires a formula to contain *all* these regs (the comment is simply wrong), otherwise the formula is immediately disqualified. Obviously, no formula for this use contains two regs so they will all get disqualified.
  The fix modifies the check to allow the formula in this case. The idea is that neither of these formulae is introducing any new registers which is the point of this early pruning as far as I understand.
  In terms of set arithmetic, we now allow formulas whose used regs are a subset of the required regs not just the other way around.
  There are a few more loops in the test-suite that are now successfully LSRed. I have benchmarked those and found very minimal change.
  Fixes <rdar://problem/13965777>
  llvm-svn: 207271
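The set relation described in this commit message can be written out directly. The sketch below is an illustration of that description only, not the code in LoopStrengthReduce: the function names are hypothetical, and registers are represented as plain strings taken from the dump in the message.

```python
def passes_old_check(formula_regs, required_regs):
    """Behaviour described as the old one: the formula must contain all required regs."""
    return set(required_regs) <= set(formula_regs)


def passes_new_check(formula_regs, required_regs):
    """Behaviour described as the fix: also accept a formula whose regs are a
    subset of the required regs, since it introduces no new register."""
    formula_regs, required_regs = set(formula_regs), set(required_regs)
    return required_regs <= formula_regs or formula_regs <= required_regs


# ReqRegs from the commit message, written as strings.
req_regs = {"{3,+,3}", "{1,+,1}"}

# "-3003 + reg({3,+,3})" uses one required register and nothing else.
uses_required_only = {"{3,+,3}"}
# "-1000 + reg({0,+,1})" would introduce a register outside ReqRegs.
uses_new_register = {"{0,+,1}"}

print(passes_old_check(uses_required_only, req_regs))  # prints False
print(passes_new_check(uses_required_only, req_regs))  # prints True
print(passes_new_check(uses_new_register, req_regs))   # prints False
```

Under the old predicate every single-register formula for this use is rejected, because none can contain both required registers, which matches the situation the message describes.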
* SCC: Change clients to use const, NFC | Duncan P. N. Exon Smith | 2014-04-25 | 1 | -1/+1
  It's fishy to be changing the `std::vector<>` owned by the iterator, and no one actually does it, so I'm going to remove the ability in a subsequent commit. First, update the users.
  <rdar://problem/14292693>
  llvm-svn: 207252
* [C++] Use 'nullptr'. Transforms edition. | Craig Topper | 2014-04-25 | 28 | -413/+422
  llvm-svn: 207196
* Remove more default address space argument usage. | Matt Arsenault | 2014-04-23 | 1 | -1/+2
  These places are inconsequential in practice.
  llvm-svn: 207021
* [Constant Hoisting] Materialize the constant before the cloned cast instruction. | Juergen Ributzka | 2014-04-22 | 1 | -2/+11
  In the case where the constant comes from a cloned cast instruction, the materialization code has to go before the cloned cast instruction.
  This commit fixes the method that finds the materialization insertion point by making it aware of this case.
  This fixes <rdar://problem/15532441>
  llvm-svn: 206913
* [Constant Hoisting] Print the instructions in the correct order for ↵ | Juergen Ributzka | 2014-04-22 | 1 | -2/+2
  debugging. No functional change.
  llvm-svn: 206912
* [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE | Chandler Carruth | 2014-04-22 | 35 | -36/+70
  definition below all of the header #include lines, lib/Transforms/... edition.
  This one is tricky for two reasons. We again have a couple of passes that define something else before the includes as well. I've sunk their name macros with the DEBUG_TYPE.
  Also, InstCombine contains headers that need DEBUG_TYPE, so now those headers #define and #undef DEBUG_TYPE around their code, leaving them well formed modular headers. Fixing these headers was a large motivation for all of these changes, as "leaky" macros of this form are hard on the modules implementation.
  llvm-svn: 206844
* Fix PR7272 in -tailcallelim instead of the inliner | Reid Kleckner | 2014-04-21 | 1 | -0/+9
  The -tailcallelim pass should be checking if byval or inalloca args can be captured before marking calls as tail calls. This was the real root cause of PR7272.
  With a better fix in place, revert the inliner change from r105255. The test case it introduced still passes and has been moved to test/Transforms/Inline/byval-tail-call.ll.
  Reviewers: chandlerc
  Differential Revision: http://reviews.llvm.org/D3403
  llvm-svn: 206789
* Remove some empty statements | Alp Toker | 2014-04-19 | 1 | -1/+1
  Cleanup only.
  llvm-svn: 206710
* remove some dead code | Nuno Lopes | 2014-04-17 | 1 | -21/+0
  lib/Analysis/IPA/InlineCost.cpp         | 18 ------------------
  lib/Analysis/RegionPass.cpp             |  1 -
  lib/Analysis/TypeBasedAliasAnalysis.cpp |  1 -
  lib/Transforms/Scalar/LoopUnswitch.cpp  | 21 ---------------------
  lib/Transforms/Utils/LCSSA.cpp          |  2 --
  lib/Transforms/Utils/LoopSimplify.cpp   |  6 ------
  utils/TableGen/AsmWriterEmitter.cpp     | 13 -------------
  utils/TableGen/DFAPacketizerEmitter.cpp |  7 -------
  utils/TableGen/IntrinsicEmitter.cpp     |  2 --
  9 files changed, 71 deletions(-)
  llvm-svn: 206506
* verify-di: Implement DebugInfoVerifier | Duncan P. N. Exon Smith | 2014-04-15 | 1 | -0/+1
  Implement DebugInfoVerifier, which steals verification relying on DebugInfoFinder from Verifier.
  - Adds LegacyDebugInfoVerifierPassPass, a ModulePass which wraps DebugInfoVerifier. Uses -verify-di command-line flag.
  - Change verifyModule() to invoke DebugInfoVerifier as well as Verifier.
  - Add a call to createDebugInfoVerifierPass() wherever there was a call to createVerifierPass().
  This implementation as a module pass should sidestep efficiency issues, allowing us to turn debug info verification back on.
  <rdar://problem/15500563>
  llvm-svn: 206300
* D3348 - [BUG] "Rotate Loop" pass kills "llvm.vectorizer.enable" metadata | Alexey Bataev | 2014-04-15 | 1 | -0/+9
  llvm-svn: 206266
* Implement depth_first and inverse_depth_first range factory functions. | David Blaikie | 2014-04-11 | 1 | -7/+3
  Also updated as many loops as I could find using df_begin/idf_begin - strangely I found no uses of idf_begin. Is that just used out of tree?
  Also a few places couldn't use df_begin because either they used the member functions of the depth first iterators or had specific ordering constraints (I added a comment in the latter case).
  Based on a patch by Jim Grosbach. (Jim - you just had iterator_range<T> where you needed iterator_range<idf_iterator<T>>)
  llvm-svn: 206016
* Fix some doc and comment typos | Alp Toker | 2014-04-09 | 1 | -1/+1
  llvm-svn: 205899
* Revert "[Constant Hoisting] Lazily compute the idom and cache the result." | Juergen Ributzka | 2014-04-03 | 1 | -43/+4
  This code is no longer useful, because we only compute and use the IDom once. There is no benefit in caching it anymore.
  llvm-svn: 205498