path: root/llvm/lib/Transforms/Scalar
Commit message  Author  Age  Files  Lines
* Revert "Implement global merge optimization for global variables."Rafael Espindola2014-05-162-76/+10
| | | | | | | | | | | | This reverts commit r208934. The patch depends on aliases to GEPs with non zero offsets. That is not supported and fairly broken. The good news is that GlobalAlias is being redesigned and will have support for offsets, so this patch should be a nice match for it. llvm-svn: 208978
* Implement global merge optimization for global variables.Jiangning Liu2014-05-152-10/+76
| | | | | | | | | | | This commit implements two command line switches -global-merge-on-external and -global-merge-aligned, and both of them are false by default, so this optimization is disabled by default for all targets. For ARM64, some back-end behaviors need to be tuned to get this optimization further enabled. llvm-svn: 208934
* Fix typosAlp Toker2014-05-151-2/+2
| | | | llvm-svn: 208839
* Rename ComputeMaskedBits to computeKnownBits. "Masked" has beenJay Foad2014-05-141-1/+1
| | | | | | inappropriate since it lost its Mask parameter in r154011. llvm-svn: 208811
* GVN: Fix non-determinism in map iteration.Benjamin Kramer2014-05-131-4/+7
| | | | | | | | | Iterating over a DenseMap is non-deterministic and results in unpredictable IR output. Based on a patch by Daniel Reynaud! llvm-svn: 208728
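The commit message does not show the fix itself. As a minimal sketch of one common way to make output independent of hash-map iteration order, the entries can be copied out and sorted by key before they are used. Here std::unordered_map stands in for llvm::DenseMap, and the helper name sortedEntries is hypothetical; this is not the code from the GVN patch.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: returns the map's entries sorted by key, so anything
// derived from them no longer depends on hash-table iteration order.
// std::unordered_map is only a stand-in for llvm::DenseMap.
std::vector<std::pair<int, std::string>>
sortedEntries(const std::unordered_map<int, std::string> &Map) {
  std::vector<std::pair<int, std::string>> Entries(Map.begin(), Map.end());
  auto ByKey = [](const std::pair<int, std::string> &A,
                  const std::pair<int, std::string> &B) {
    return A.first < B.first;
  };
  std::sort(Entries.begin(), Entries.end(), ByKey);
  // Postcondition: the result is ordered by key.
  assert(std::is_sorted(Entries.begin(), Entries.end(), ByKey));
  return Entries;
}
```

Sorting costs O(n log n) per traversal; an insertion-ordered container is another option when insertion order is itself deterministic.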
* GVN: rangify a couple of loops.Benjamin Kramer2014-05-131-13/+9
| | | | | | No functionality change. llvm-svn: 208727
* Improve wording to make it sounds more like a change than an analysis.Nick Lewycky2014-05-081-2/+3
| | | | llvm-svn: 208370
* Simplify and fix incorrect comment. No functionality change.Richard Smith2014-05-081-22/+15
| | | | llvm-svn: 208272
* Detabify.Nick Lewycky2014-05-061-2/+2
| | | | llvm-svn: 208019
* Improve 'tail' call marking in TRE. A bootstrap of clang goes from 375k ↵Nick Lewycky2014-05-051-73/+241
| | | | | | | | | | calls marked tail in the IR to 470k, however this improvement does not carry into an improvement of the call/jmp ratio on x86. The most common pattern is a tail call + br to a block with nothing but a 'ret'. The number of tail call to loop conversions remains the same (1618 by my count). The new algorithm does a local scan over the use-def chains to identify local "alloca-derived" values, as well as points where the alloca could escape. Then, a visit over the CFG marks blocks as being before or after the allocas have escaped, and annotates the calls accordingly. llvm-svn: 208017
* LoopUnroll: If we're doing partial unrolling, use the PartialThreshold to ↵Benjamin Kramer2014-05-041-3/+6
| | | | | | | | | | | limit unrolling. Otherwise we use the same threshold as for complete unrolling, which is way too high. This made us unroll any loop smaller than 150 instructions by 8 times, but only if someone specified -march=core2 or better, which happens to be the default on darwin. llvm-svn: 207940
* [GVN] Pass the phi-translated address of a load instead of the untranslatedAkira Hatanaka2014-05-021-2/+1
| | | | | | | | | address to AnalyzeLoadFromClobberingLoad. This fixes a bug in load-PRE where PRE is applied to a load that is not partially redundant. <rdar://problem/16638765>. llvm-svn: 207853
* Update and sort CMakeLists.Benjamin Kramer2014-05-011-5/+6
| | | | llvm-svn: 207785
* Add an optimization that does CSE in a group of similar GEPs.Eli Bendersky2014-05-012-0/+584
| | | | | | | | | | | | | | This optimization merges the common part of a group of GEPs, so we can compute each pointer address by adding a simple offset to the common part. The optimization is currently only enabled for the NVPTX backend, where it has a large payoff on some benchmarks. Review: http://reviews.llvm.org/D3462 Patch by Jingyue Wu. llvm-svn: 207783
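To illustrate the idea of "common part plus simple offset", here is a small arithmetic sketch for accesses of the form a[I + CI][J + CJ] into a row-major array. The struct and function names are hypothetical, and the pass itself works on LLVM IR rather than on plain integers; this only shows why the split is valid.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: the constant parts of the indices in a[I + CI][J + CJ].
struct ConstIndices {
  std::int64_t CI;
  std::int64_t CJ;
};

// Byte offset of &a[I][J]: the variable part shared by the whole group.
std::int64_t commonPart(std::int64_t I, std::int64_t J, std::int64_t Cols,
                        std::int64_t ElemSize) {
  assert(Cols > 0 && ElemSize > 0);
  return (I * Cols + J) * ElemSize;
}

// Constant byte offset that distinguishes one access from the common part.
std::int64_t constantOffset(ConstIndices C, std::int64_t Cols,
                            std::int64_t ElemSize) {
  return (C.CI * Cols + C.CJ) * ElemSize;
}

// Byte offset computed directly, without sharing anything between accesses.
std::int64_t directOffset(std::int64_t I, std::int64_t J, ConstIndices C,
                          std::int64_t Cols, std::int64_t ElemSize) {
  return ((I + C.CI) * Cols + (J + C.CJ)) * ElemSize;
}
```

For any access in the group, commonPart(...) + constantOffset(...) equals directOffset(...), so the common part needs to be computed only once.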
* ConstantHoisting.cpp: Add <tuple> for std::tie, since r207593 removed ↵NAKAMURA Takumi2014-04-301-0/+1
| | | | | | FileSystem.h, it includes <tuple>. llvm-svn: 207614
* Tidy up.Jim Grosbach2014-04-291-2/+2
| | | | llvm-svn: 207585
* Spelling.Jim Grosbach2014-04-291-1/+1
| | | | llvm-svn: 207584
* Reapply r207271 without the testcaseAdam Nemet2014-04-291-9/+12
| | | | | | PR19608 was filed to find a suitable testcase. llvm-svn: 207569
* Revert r207271 for now. This commit introduced a test case that ranChandler Carruth2014-04-281-12/+9
| | | | | | | | clang directly from the LLVM test suite! That doesn't work. I've followed up on the review thread to try and get a viable solution sorted out, but trying to get the tree clean here. llvm-svn: 207462
* [C++] Use 'nullptr'.Craig Topper2014-04-283-4/+4
| | | | llvm-svn: 207394
* RecursivelyDeleteTriviallyDeadInstructions() could removeGerolf Hoflehner2014-04-261-1/+9
| | | | | | | | | | | more than 1 instruction. The caller needs to be aware of this and adjust instruction iterators accordingly. rdar://16679376 Repaired r207302. llvm-svn: 207309
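The hazard described above can be sketched with a std::list in place of a basic block. The helper eraseGroup is a hypothetical stand-in for a routine that removes several elements at once; the caller advances its iterator past every element that is about to disappear before calling it. This is an illustration under those assumptions, not the code from the commit.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <set>

// Hypothetical stand-in for a deletion routine that may remove more than one
// element: erases every value in Group from L.
void eraseGroup(std::list<int> &L, const std::set<int> &Group) {
  L.remove_if([&](int V) { return Group.count(V) != 0; });
}

// Walks L; on reaching Root, erases Root's whole group. Returns the number of
// elements visited without being erased at that moment.
int visitAndErase(std::list<int> &L, int Root, const std::set<int> &Group) {
  assert(Group.count(Root) != 0 && "Root must belong to its own group");
  int Visited = 0;
  for (auto It = L.begin(); It != L.end();) {
    if (*It != Root) {
      ++Visited;
      ++It;
      continue;
    }
    // Saving only std::next(It) is not enough: that element may be erased
    // too. Skip ahead to the first element that survives the deletion.
    auto Next = std::next(It);
    while (Next != L.end() && Group.count(*Next) != 0)
      ++Next;
    eraseGroup(L, Group);
    It = Next; // still valid: list erasure only invalidates erased elements
  }
  return Visited;
}
```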
* Revert commit r207302 since build failuresGerolf Hoflehner2014-04-261-9/+1
| | | | | | have been reported. llvm-svn: 207303
* RecursivelyDeleteTriviallyDeadInstructions() could removeGerolf Hoflehner2014-04-261-1/+9
| | | | | | | | | more than 1 instruction. The caller needs to be aware of this and adjust instruction iterators accordingly. rdar://16679376 llvm-svn: 207302
* [LoopStrengthReduce] Don't trim formula that uses a subset of required registersAdam Nemet2014-04-251-9/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Consider this use from the new testcase: LSR Use: Kind=ICmpZero, Offsets={0}, widest fixup type: i32 reg({1000,+,-1}<nw><%for.body>) -3003 + reg({3,+,3}<nw><%for.body>) -1001 + reg({1,+,1}<nuw><nsw><%for.body>) -1000 + reg({0,+,1}<nw><%for.body>) -3000 + reg({0,+,3}<nuw><%for.body>) reg({-1000,+,1}<nw><%for.body>) reg({-3000,+,3}<nsw><%for.body>) This is the last use we consider for a solution in SolveRecurse, so CurRegs is a large set. (CurRegs is the set of registers that are needed by the previously visited uses in the in-progress solution.) ReqRegs is { {3,+,3}<nw><%for.body>, {1,+,1}<nuw><nsw><%for.body> } This is the intersection of the regs used by any of the formulas for the current use and CurRegs. Now, the code requires a formula to contain *all* these regs (the comment is simply wrong), otherwise the formula is immediately disqualified. Obviously, no formula for this use contains two regs so they will all get disqualified. The fix modifies the check to allow the formula in this case. The idea is that neither of these formulae is introducing any new registers which is the point of this early pruning as far as I understand. In terms of set arithmetic, we now allow formulas whose used regs are a subset of the required regs, not just the other way around. There are a few more loops in the test-suite that are now successfully LSRed. I have benchmarked those and found very minimal change. Fixes <rdar://problem/13965777> llvm-svn: 207271
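The set-arithmetic description above can be modelled with std::set. This is a sketch of the stated rule only: the names RegSet, keptByOldCheck and keptByNewCheck are hypothetical, registers are represented as plain strings, and the actual LSR code may express the check differently.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>

using RegSet = std::set<std::string>;

// True if every element of Sub is also in Super. std::includes needs sorted
// ranges, which std::set iteration provides.
bool isSubset(const RegSet &Sub, const RegSet &Super) {
  return std::includes(Super.begin(), Super.end(), Sub.begin(), Sub.end());
}

// Old rule as described: the formula must contain *all* required registers.
bool keptByOldCheck(const RegSet &FormulaRegs, const RegSet &ReqRegs) {
  return isSubset(ReqRegs, FormulaRegs);
}

// Relaxed rule as described: also keep a formula whose registers are a subset
// of the required ones, since it introduces no new register.
bool keptByNewCheck(const RegSet &FormulaRegs, const RegSet &ReqRegs) {
  return isSubset(ReqRegs, FormulaRegs) || isSubset(FormulaRegs, ReqRegs);
}
```

With ReqRegs holding two registers and a formula that uses only one of them, the old rule rejects the formula and the relaxed rule keeps it, which matches the situation in the testcase.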
* SCC: Change clients to use const, NFCDuncan P. N. Exon Smith2014-04-251-1/+1
| | | | | | | | It's fishy to be changing the `std::vector<>` owned by the iterator, and no one actually does it, so I'm going to remove the ability in a subsequent commit. First, update the users. <rdar://problem/14292693> llvm-svn: 207252
* [C++] Use 'nullptr'. Transforms edition.Craig Topper2014-04-2528-413/+422
| | | | llvm-svn: 207196
* Remove more default address space argument usage.Matt Arsenault2014-04-231-1/+2
| | | | | | These places are inconsequential in practice. llvm-svn: 207021
* [Constant Hoisting] Materialize the constant before the cloned cast instruction.Juergen Ributzka2014-04-221-2/+11
| | | | | | | | | | | | In the case where the constant comes from a cloned cast instruction, the materialization code has to go before the cloned cast instruction. This commit fixes the method that finds the materialization insertion point by making it aware of this case. This fixes <rdar://problem/15532441> llvm-svn: 206913
* [Constant Hoisting] Print the instructions in the correct order for ↵Juergen Ributzka2014-04-221-2/+2
| | | | | | debugging. No functional change. llvm-svn: 206912
* [Modules] Fix potential ODR violations by sinking the DEBUG_TYPEChandler Carruth2014-04-2235-36/+70
| | | | | | | | | | | | | | | | | definition below all of the header #include lines, lib/Transforms/... edition. This one is tricky for two reasons. We again have a couple of passes that define something else before the includes as well. I've sunk their name macros with the DEBUG_TYPE. Also, InstCombine contains headers that need DEBUG_TYPE, so now those headers #define and #undef DEBUG_TYPE around their code, leaving them well formed modular headers. Fixing these headers was a large motivation for all of these changes, as "leaky" macros of this form are hard on the modules implementation. llvm-svn: 206844
* Fix PR7272 in -tailcallelim instead of the inlinerReid Kleckner2014-04-211-0/+9
| | | | | | | | | | | | | | | | The -tailcallelim pass should be checking if byval or inalloca args can be captured before marking calls as tail calls. This was the real root cause of PR7272. With a better fix in place, revert the inliner change from r105255. The test case it introduced still passes and has been moved to test/Transforms/Inline/byval-tail-call.ll. Reviewers: chandlerc Differential Revision: http://reviews.llvm.org/D3403 llvm-svn: 206789
* Remove some empty statementsAlp Toker2014-04-191-1/+1
| | | | | | Cleanup only. llvm-svn: 206710
* remove some dead codeNuno Lopes2014-04-171-21/+0
| | | | | | | | | | | | | | | lib/Analysis/IPA/InlineCost.cpp | 18 ------------------ lib/Analysis/RegionPass.cpp | 1 - lib/Analysis/TypeBasedAliasAnalysis.cpp | 1 - lib/Transforms/Scalar/LoopUnswitch.cpp | 21 --------------------- lib/Transforms/Utils/LCSSA.cpp | 2 -- lib/Transforms/Utils/LoopSimplify.cpp | 6 ------ utils/TableGen/AsmWriterEmitter.cpp | 13 ------------- utils/TableGen/DFAPacketizerEmitter.cpp | 7 ------- utils/TableGen/IntrinsicEmitter.cpp | 2 -- 9 files changed, 71 deletions(-) llvm-svn: 206506
* verify-di: Implement DebugInfoVerifierDuncan P. N. Exon Smith2014-04-151-0/+1
| | | | | | | | | | | | | | | | | | | | | Implement DebugInfoVerifier, which steals verification relying on DebugInfoFinder from Verifier. - Adds LegacyDebugInfoVerifierPassPass, a ModulePass which wraps DebugInfoVerifier. Uses -verify-di command-line flag. - Change verifyModule() to invoke DebugInfoVerifier as well as Verifier. - Add a call to createDebugInfoVerifierPass() wherever there was a call to createVerifierPass(). This implementation as a module pass should sidestep efficiency issues, allowing us to turn debug info verification back on. <rdar://problem/15500563> llvm-svn: 206300
* D3348 - [BUG] "Rotate Loop" pass kills "llvm.vectorizer.enable" metadataAlexey Bataev2014-04-151-0/+9
| | | | llvm-svn: 206266
* Implement depth_first and inverse_depth_first range factory functions.David Blaikie2014-04-111-7/+3
| | | | | | | | | | | | | | Also updated as many loops as I could find using df_begin/idf_begin - strangely I found no uses of idf_begin. Is that just used out of tree? Also a few places couldn't use df_begin because either they used the member functions of the depth first iterators or had specific ordering constraints (I added a comment in the latter case). Based on a patch by Jim Grosbach. (Jim - you just had iterator_range<T> where you needed iterator_range<idf_iterator<T>>) llvm-svn: 206016
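The benefit of a range factory is that callers can write a range-based for loop instead of pairing begin and end calls by hand. The sketch below is an eager simplification under stated assumptions: the function depthFirst and the Graph alias are hypothetical, and it returns a vector of node ids in preorder, whereas LLVM's depth_first returns a lazy iterator_range.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Adjacency list; node ids are indices into the outer vector.
using Graph = std::vector<std::vector<int>>;

// Hypothetical eager range factory: returns the nodes reachable from Root in
// depth-first preorder, ready for use in a range-based for loop.
std::vector<int> depthFirst(const Graph &G, int Root) {
  assert(Root >= 0 && static_cast<std::size_t>(Root) < G.size());
  std::vector<int> Order;
  std::vector<bool> Seen(G.size(), false);
  std::vector<int> Stack{Root};
  while (!Stack.empty()) {
    int N = Stack.back();
    Stack.pop_back();
    if (Seen[N])
      continue;
    Seen[N] = true;
    Order.push_back(N);
    // Push successors in reverse so the first successor is visited first.
    for (auto It = G[N].rbegin(); It != G[N].rend(); ++It)
      if (!Seen[*It])
        Stack.push_back(*It);
  }
  return Order;
}
```

A caller can then write `for (int N : depthFirst(G, 0)) { ... }` rather than managing a pair of iterators.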
* Fix some doc and comment typosAlp Toker2014-04-091-1/+1
| | | | llvm-svn: 205899
* Revert "[Constant Hoisting] Lazily compute the idom and cache the result."Juergen Ributzka2014-04-031-43/+4
| | | | | | | This code is no longer useful, because we only compute and use the IDom once. There is no benefit in caching it anymore. llvm-svn: 205498
* Add some additional fields to TTI::UnrollingPreferencesHal Finkel2014-04-011-4/+13
| | | | | | | | | | | | | | | | | | | In preparation for an upcoming commit implementing unrolling preferences for x86, this adds additional fields to the UnrollingPreferences structure: - PartialThreshold and PartialOptSizeThreshold - Like Threshold and OptSizeThreshold, but used when not fully unrolling. These are necessary because we need different thresholds for full unrolling from those used when partially unrolling (the full unrolling thresholds are generally going to be larger). - MaxCount - A cap on the unrolling factor when partially unrolling. This can be used by a target to prevent the unrolled loop from exceeding some resource limit independent of the loop size (such as number of branches). There should be no functionality change for any in-tree targets. llvm-svn: 205347
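As a rough sketch of how the new fields could interact, the struct below mirrors only the field names given in the message. The field types, the helper functions pickThreshold and partialCount, and the way the count is derived from the loop size are all assumptions for illustration; the real unroller takes more factors into account.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical mirror of the fields named in the commit message.
struct UnrollPrefs {
  unsigned Threshold;               // full unrolling
  unsigned OptSizeThreshold;        // full unrolling, optimizing for size
  unsigned PartialThreshold;        // partial unrolling
  unsigned PartialOptSizeThreshold; // partial unrolling, optimizing for size
  unsigned MaxCount;                // cap on the partial unrolling factor
};

// Selects the size threshold that applies to the requested kind of unrolling.
unsigned pickThreshold(const UnrollPrefs &P, bool FullUnroll, bool OptSize) {
  if (FullUnroll)
    return OptSize ? P.OptSizeThreshold : P.Threshold;
  return OptSize ? P.PartialOptSizeThreshold : P.PartialThreshold;
}

// Largest factor that keeps LoopSize * Count within the partial threshold,
// then capped by MaxCount. A result of 1 means "do not unroll".
unsigned partialCount(const UnrollPrefs &P, unsigned LoopSize, bool OptSize) {
  assert(LoopSize > 0);
  unsigned Limit = pickThreshold(P, /*FullUnroll=*/false, OptSize);
  unsigned Count = std::max(1u, Limit / LoopSize);
  return std::min(Count, std::max(1u, P.MaxCount));
}
```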
* Move partial/runtime unrolling late in the pipelineHal Finkel2014-03-311-0/+4
| | | | | | | | | | | | | | | | The generic (concatenation) loop unroller is currently placed early in the standard optimization pipeline. This is a good place to perform full unrolling, but not the right place to perform partial/runtime unrolling. However, most targets don't enable partial/runtime unrolling, so this never mattered. However, even some x86 cores benefit from partial/runtime unrolling of very small loops, and follow-up commits will enable this. First, we need to move partial/runtime unrolling late in the optimization pipeline (importantly, this is after SLP and loop vectorization, as vectorization can drastically change the size of a loop), while keeping the full unrolling where it is now. This change does just that. llvm-svn: 205264
* Revert "GVN: merge overflow intrinsics with non-overflow instructions."Erik Verbruggen2014-03-281-124/+58
| | | | | | | | | This reverts commit r203553, and follow-up commits r203558 and r203574. I will follow this up on the mailing list to do it in a way that won't cause subtle PRE bugs. llvm-svn: 205009
* Treat lifetime.start'd memory like we treat freshly alloca'd memory. Patch ↵Nick Lewycky2014-03-261-4/+16
| | | | | | by Björn Steinbrink! llvm-svn: 204876
* [Constant Hoisting] Make the constant candidate map local to the ↵Juergen Ributzka2014-03-251-11/+14
| | | | | | collectConstantCandidates method. llvm-svn: 204758
* remove a bunch of unused private methodsNuno Lopes2014-03-232-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | found with a smarter version of -Wunused-member-function that I'm playing with. Apologies in advance if I removed someone's WIP code. include/llvm/CodeGen/MachineSSAUpdater.h | 1 include/llvm/IR/DebugInfo.h | 3 lib/CodeGen/MachineSSAUpdater.cpp | 10 -- lib/CodeGen/PostRASchedulerList.cpp | 1 lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | 10 -- lib/IR/DebugInfo.cpp | 12 -- lib/MC/MCAsmStreamer.cpp | 2 lib/Support/YAMLParser.cpp | 39 --------- lib/TableGen/TGParser.cpp | 16 --- lib/TableGen/TGParser.h | 1 lib/Target/AArch64/AArch64TargetTransformInfo.cpp | 9 -- lib/Target/ARM/ARMCodeEmitter.cpp | 12 -- lib/Target/ARM/ARMFastISel.cpp | 84 -------------------- lib/Target/Mips/MipsCodeEmitter.cpp | 11 -- lib/Target/Mips/MipsConstantIslandPass.cpp | 12 -- lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp | 21 ----- lib/Target/NVPTX/NVPTXISelDAGToDAG.h | 2 lib/Target/PowerPC/PPCFastISel.cpp | 1 lib/Transforms/Instrumentation/AddressSanitizer.cpp | 2 lib/Transforms/Instrumentation/BoundsChecking.cpp | 2 lib/Transforms/Instrumentation/MemorySanitizer.cpp | 1 lib/Transforms/Scalar/LoopIdiomRecognize.cpp | 8 - lib/Transforms/Scalar/SCCP.cpp | 1 utils/TableGen/CodeEmitterGen.cpp | 2 24 files changed, 2 insertions(+), 261 deletions(-) llvm-svn: 204560
* [Constant Hoisting] Erase dead cast instructions.Juergen Ributzka2014-03-221-1/+1
| | | | | | | The cleanup code that removes dead cast instructions only removed them from the basic block, but didn't delete them. This fix erases them now too. llvm-svn: 204538
* [Constant Hoisting] Fix multiple entries for the same basic block in PHI nodes.Juergen Ributzka2014-03-221-3/+36
| | | | | | | | | | | | | | | | | | | | A PHI node usually has only one value/basic block pair per incoming basic block. In the case of a switch statement it is possible that a following PHI node may have more than one such pair per incoming basic block. E.g.: %0 = phi i64 [ 123456, %case2 ], [ 654321, %Entry ], [ 654321, %Entry ] This is valid and the verifier doesn't complain, because both values are the same. Constant hoisting materializes the constant for each operand separately and the value is still the same, but the variable names have changed. As a result the verifier can no longer recognize that they are the same value and complains. This fix adds special update code for PHI nodes in constant hoisting to prevent this corner case. This fixes <rdar://problem/16394449> llvm-svn: 204537
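The message does not spell out what the "special update code" does. One way to avoid the problem, sketched here as an assumption, is to materialize at most once per incoming block and reuse that result for repeated entries. The type Incoming, the function materializePerBlock and the generated names are hypothetical, and strings stand in for IR values and blocks.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// One PHI operand: (incoming constant, name of the incoming block).
using Incoming = std::pair<long, std::string>;

// Returns, for each PHI operand, the name of the materialized value to use.
// Repeated entries for the same incoming block share one materialization, so
// they keep referring to the same value.
std::vector<std::string> materializePerBlock(const std::vector<Incoming> &Phi) {
  std::map<std::string, std::string> ByBlock; // block name -> value name
  std::vector<std::string> Names;
  int Counter = 0;
  for (const Incoming &In : Phi) {
    auto It = ByBlock.find(In.second);
    if (It == ByBlock.end())
      It = ByBlock.emplace(In.second, "const" + std::to_string(Counter++)).first;
    Names.push_back(It->second);
  }
  return Names;
}
```

For the PHI in the example above, the two %Entry operands receive the same name while %case2 gets its own.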
* Sink: Don't sink static allocas from the entry blockTom Stellard2014-03-211-0/+7
| | | | | | | CodeGen treats allocas outside the entry block as dynamically sized stack objects. llvm-svn: 204473
* [Constant Hoisting] Make the constant materialization cost operand dependentJuergen Ributzka2014-03-211-7/+7
| | | | | | | | | Extend the target hook to also take the operand index into account when calculating the cost of the constant materialization. Related to <rdar://problem/16381500> llvm-svn: 204435
* [Constant Hoisting] Lazily compute the idom and cache the result.Juergen Ributzka2014-03-211-4/+43
| | | | | | Related to <rdar://problem/16381500> llvm-svn: 204434
* [Constant Hoisting] Change the algorithm to only track constants for ↵Juergen Ributzka2014-03-211-239/+321
| | | | | | | | | | | | | | | | | | | | | | | | instructions. Originally the algorithm would search for expensive constants and track their users, which could be instructions and constant expressions. This change only tracks the constants for instructions, but constant expressions are indirectly covered too. If an operand is a constant expression, then we look through the expression to find any expensive constants. The algorithm now keeps track of the instruction and the operand index where the constant is used. This allows more precise hoisting of constant materialization code for PHI instructions, because we only hoist to the basic block of the incoming operand. Before we had to find the idom of all PHI operands and hoist the materialization code there. This also makes updating of instructions easier. Before we had to keep track of the original constant, find it in the instructions, and then replace it. Now we can just simply update the operand. Related to <rdar://problem/16381500> llvm-svn: 204433