path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
* [SROA] Adjust to new clang-format style.Chandler Carruth2014-02-251-2/+2
| | | | llvm-svn: 202134
* [SROA] Fix a *glaring* bug in r202091: you have to actually *write*Chandler Carruth2014-02-251-0/+2
| | | | | | | | | | | | | | the break statement, not just think it to yourself.... No idea how this worked at all, much less survived most bots, my bootstrap, and some bot bootstraps! The Polly one didn't survive, and this was filed as PR18959. I don't have a reduced test case and honestly I'm not seeing the need. What we probably need here are better asserts / debug-build behavior in SmallPtrSet so that this madness doesn't make it so far. llvm-svn: 202129
* Silence GCC warningAlexey Samsonov2014-02-251-1/+1
| | | | llvm-svn: 202119
* Fix typosAlp Toker2014-02-251-1/+1
| | | | llvm-svn: 202107
* [SROA] Add a debugging tool which shuffles the slices sequence prior toChandler Carruth2014-02-251-0/+19
| | | | | | | | | | | | | sorting it. This helps uncover latent reliance on the original ordering, which isn't guaranteed to be preserved by std::sort (but often is), and which is based on the use-def chain orderings, which also aren't (technically) guaranteed. Only available in C++11 debug builds, and behind a flag to prevent noise at the moment, but this is generally useful so figured I'd put it in the tree rather than keeping it out-of-tree. llvm-svn: 202106
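The shuffle-then-sort idea can be sketched outside of LLVM. This is a minimal illustration, not the SROA code: `Slice`, `sliceLess`, `shuffleThenSort` and `sameKeys` are hypothetical names, and the seed parameter is an assumption made so the run is reproducible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for a slice: only Begin/End take part in the ordering.
struct Slice {
  unsigned Begin;
  unsigned End;
  int Id; // identity of the use; deliberately NOT part of the comparison
};

// Strict weak ordering on the byte range only.
inline bool sliceLess(const Slice &A, const Slice &B) {
  if (A.Begin != B.Begin)
    return A.Begin < B.Begin;
  return A.End < B.End;
}

// Shuffle with a caller-supplied seed, then sort with the non-stable std::sort.
// Slices that compare equal may come out in any relative order, so any later
// code that depends on that relative order is exposed by varying the seed.
inline std::vector<Slice> shuffleThenSort(std::vector<Slice> Slices,
                                          unsigned Seed) {
  std::mt19937 Rng(Seed);
  std::shuffle(Slices.begin(), Slices.end(), Rng);
  std::sort(Slices.begin(), Slices.end(), sliceLess);
  return Slices;
}

// True if both sequences agree on every field the comparison looks at.
inline bool sameKeys(const std::vector<Slice> &A, const std::vector<Slice> &B) {
  if (A.size() != B.size())
    return false;
  for (std::size_t I = 0; I < A.size(); ++I)
    if (A[I].Begin != B[I].Begin || A[I].End != B[I].End)
      return false;
  return true;
}
```

Whatever the seed, the sorted keys are identical; only the `Id` order among equal slices may differ, and that is exactly the part downstream code must not rely on.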
* [SROA] Use a more direct way of determining whether we are processingChandler Carruth2014-02-251-2/+3
| | | | | | | | | | | | | | | | | | the destination operand or source operand of a memmove. It so happens that it was impossible for SROA to try to rewrite self-memmove where the operands are *identical*, because either such a thing is volatile (and we don't rewrite) or it is non-volatile, and we don't even register it as a use of the alloca. However, making the 'IsDest' test *rely* on this subtle fact is... Very confusing for the reader. We should use the direct and readily available test of the Use* which gives us concrete information about which operand is being rewritten. No functionality changed, I hope! ;] llvm-svn: 202103
* [SROA] Fix another instability in SROA with respect to the sliceChandler Carruth2014-02-251-66/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ordering. The fundamental problem that we're hitting here is that the use-def chain ordering is *itself* not a stable thing to be relying on in the rewriting for SROA. Further, we use a non-stable sort over the slices to arrange them based on the section of the alloca they're operating on. With a debugging STL implementation (or different implementations in stage2 and stage3) this can cause stage2 != stage3. The specific aspect of this problem fixed in this commit deals with the rewriting and load-speculation around PHIs and Selects. This, like many other aspects of the use-rewriting in SROA, is really part of the "strong SSA-formation" that is done by SROA where it works very hard to canonicalize loads and stores in *just* the right way to satisfy the needs of mem2reg[1]. When we have a select (or a PHI) with 2 uses of the same alloca, we test that loads downstream of the select are speculatable around it twice. If only one of the operands to the select needs to be rewritten, then if we get lucky we rewrite that one first and the select is immediately speculatable. This can cause the order of operand visitation, and thus the order of slices to be rewritten, to change an alloca from promotable to non-promotable and vice versa. The fix is to defer all of the speculation until *after* the rewrite phase is done. Once we've rewritten everything, we can accurately test for whether speculation will work (once, instead of twice!) and the order ceases to matter. This also happens to simplify the other subtlety of speculation -- we need to *not* speculate anything unless the result of speculating will make the alloca fully promotable by mem2reg. I had a previous attempt at simplifying this, but it was still pretty horrible. There is actually already a *really* nice test case for this in basictest.ll, but on multiple STL implementations and inputs, we just got "lucky".
Fortunately, the test case is very small and we can essentially build it in exactly the opposite way to get reasonable coverage in both directions even from normal STL implementations. llvm-svn: 202092
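A toy model may help show why deferring the check removes the order dependence. Everything here is hypothetical (`ToySelect`, `eagerCheck`, `deferredCheck`); it models only the "one operand still needs rewriting" situation described above, not SROA's real data structures.

```cpp
#include <cassert>
#include <vector>

// Toy select with two operands. An operand that is still pending a rewrite
// blocks speculation of loads around the select.
struct ToySelect {
  bool OperandPending[2];
};

inline bool isSpeculatable(const ToySelect &S) {
  return !S.OperandPending[0] && !S.OperandPending[1];
}

// Eager variant: test right after rewriting the first operand we happen to
// visit. The answer depends on the visitation order.
inline bool eagerCheck(ToySelect S, const std::vector<int> &Order) {
  assert(!Order.empty());
  S.OperandPending[Order.front()] = false;
  return isSpeculatable(S);
}

// Deferred variant: rewrite every operand first, then test once.
// The answer no longer depends on the visitation order.
inline bool deferredCheck(ToySelect S, const std::vector<int> &Order) {
  for (int Idx : Order)
    S.OperandPending[Idx] = false;
  return isSpeculatable(S);
}
```

With only operand 0 pending, the eager check succeeds for the order {0, 1} and fails for {1, 0}, while the deferred check gives the same answer for both.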
* Make some DataLayout pointers const.Rafael Espindola2014-02-2417-47/+49
| | | | | | No functionality change. Just reduces the noise of an upcoming patch. llvm-svn: 202087
* SLPVectorizer: Try vectorizing 'splat' storesArnold Schwaighofer2014-02-241-3/+7
| | | | | | | | | Vectorize sequential stores of a broadcasted value. 5% on eon. radar://16124699 llvm-svn: 202067
* Replace the F_Binary flag with a F_Text one.Rafael Espindola2014-02-242-2/+2
| | | | | | | | | After this I will set the default back to F_None. The advantage is that before this patch forgetting to set F_Binary would corrupt a file on windows. Forgetting to set F_Text produces one that cannot be read in notepad, which is a better failure mode :-) llvm-svn: 202052
* LTO: Add the loop vectorizer to the LTO pipeline.Arnold Schwaighofer2014-02-241-0/+3
| | | | | | | | | | | | | During the LTO phase LICM will move loop invariant global variables out of loops (informed by GlobalModRef). This makes more loops countable presenting opportunity for the loop vectorizer. Adding the loop vectorizer improves some TSVC benchmarks and twolf/ref dataset (5%) on x86-64. radar://15970632 llvm-svn: 202051
* Don't make F_None the default.Rafael Espindola2014-02-241-1/+1
| | | | | | This will make it easier to switch the default to being binary files. llvm-svn: 202042
* [asan] simplify the code that computes the shadow offset; get rid of two internal flags that allowed overriding it.Kostya Serebryany2014-02-241-46/+26
| | | | | | The tests pass, but still this change might break asan on some platform not covered by tests. If you see this, please submit a fix with a test. llvm-svn: 202033
* Include <cctype> for isdigit().Logan Chien2014-02-221-0/+1
| | | | llvm-svn: 201930
* [CodeGenPrepare] Move CodeGenPrepare into lib/CodeGen.Quentin Colombet2014-02-223-2916/+0
| | | | | | | | | | | | CodeGenPrepare makes extensive use of TargetLowering, which is part of libLLVMCodeGen. This is a layering violation that would eventually introduce a dependence on CodeGen in ScalarOpts. Move CodeGenPrepare into libLLVMCodeGen to avoid that. Follow-up of <rdar://problem/15519855> llvm-svn: 201912
* Rename a few more DataLayout variables from TD to DL.Rafael Espindola2014-02-211-5/+5
| | | | llvm-svn: 201870
* Rename a few more DataLayout variables.Rafael Espindola2014-02-216-25/+25
| | | | llvm-svn: 201833
* Rename many DataLayout variables from TD to DL.Rafael Espindola2014-02-2135-740/+740
| | | | | | | | | I am really sorry for the noise, but the current state where some parts of the code use TD (from the old name: TargetData) and other parts use DL makes it hard to write a patch that changes where those variables come from and how they are passed along. llvm-svn: 201827
* Make sure that value handle users see the transformation of an indirect call to a direct call.Nick Lewycky2014-02-201-0/+2
| | | | | | This is important for the CallGraph iteration. Patch by Björn Steinbrink! llvm-svn: 201822
* Add back r201608, r201622, r201624 and r201625Rafael Espindola2014-02-191-11/+5
| | | | | | | | | | | | | | r201608 made llvm correctly handle private globals with MachO. r201622 fixed a bug in it and r201624 and r201625 were changes for using private linkage, assuming that llvm would do the right thing. They all got reverted because r201608 introduced a crash in LTO. This patch includes a fix for that. The issue was that TargetLoweringObjectFile now has to be initialized before we can mangle names of private globals. This is trivially true during the normal codegen pipeline (the asm printer does it), but LTO has to do it manually. llvm-svn: 201700
* This reverts commit r201625 and r201624.Rafael Espindola2014-02-191-5/+11
| | | | | | | Since r201608 got reverted, it is not safe to use private linkage in these cases until it is committed back. llvm-svn: 201688
* X86 CodeGenPrep: sink shufflevectors before shiftsTim Northover2014-02-191-0/+72
| | | | | | | | | | | | | | | | | On x86, shifting a vector by a scalar is significantly cheaper than shifting a vector by another fully general vector. Unfortunately, because SelectionDAG operates on just one basic block at a time, the shufflevector instruction that reveals whether the right-hand side of a shift *is* really a scalar is often not visible to CodeGen when it's needed. This adds another handler to CodeGenPrepare, to sink any useful shufflevector instructions down to the basic block where they're used, predicated on a target hook (since on other architectures, doing so will often just introduce extra real work). rdar://problem/16063505 llvm-svn: 201655
* Now that llvm always does the right thing with private, use it.Rafael Espindola2014-02-191-11/+5
| | | | llvm-svn: 201625
* Rename some member variables from TD to DL.Rafael Espindola2014-02-181-9/+9
| | | | | | TargetData was renamed DataLayout back in r165242. llvm-svn: 201581
* GlobalMerge: move "-global-merge" option to the pass itself.Tim Northover2014-02-181-0/+8
| | | | | | | It's rather odd to have the flag enabling and disabling this pass only affect a single target. llvm-svn: 201559
* fix for null VectorizedValue assertion in the SLP Vectorizer (in function vectorizeTree()).Gerolf Hoflehner2014-02-171-2/+4
| | | | | | radar://16064178 llvm-svn: 201501
* fixed typo in comment as my test commitGerolf Hoflehner2014-02-161-1/+1
| | | | llvm-svn: 201486
* [CodeGenPrepare][AddressingModeMatcher] Give up on type promotion if theQuentin Colombet2014-02-141-3/+33
| | | | | | | transformation does not bring any immediate benefits and introduces an illegal operation. llvm-svn: 201439
* Trivial cleanup: reuse existing variable.Rafael Espindola2014-02-141-2/+1
| | | | | | | | Extracted while trying to understand http://llvm-reviews.chandlerc.com/D1764. Patch by Matt Arsenault. llvm-svn: 201425
* Do more addrspacecast transforms that happen for bitcast.Matt Arsenault2014-02-141-6/+12
| | | | | | Makes addrspacecast (gep) do addrspacecast (gep) instead. llvm-svn: 201376
* InstCombine: Replace custom constant folding code with ConstantExpr.Benjamin Kramer2014-02-131-26/+11
| | | | llvm-svn: 201352
* Reduce code duplication resulting from the ConstantVector/ConstantDataVector split.Benjamin Kramer2014-02-132-22/+9
| | | | | | | | No intended functionality change. llvm-svn: 201344
* GlobalOpt: Aliases don't have sections, don't copy them when replacingReid Kleckner2014-02-131-1/+2
| | | | | | | | | | | | | | | | | | | | | As defined in LangRef, aliases do not have sections. However, LLVM's GlobalAlias class inherits from GlobalValue, which means we can read and set its section. We should probably ban that as a separate change, since it doesn't make much sense for an alias to have a section that differs from its aliasee. Fixes PR18757, where the section was being lost on the global in code from Clang like:
    extern "C" {
    __attribute__((used, section("CUSTOM"))) static int in_custom_section;
    }
Reviewers: rafael.espindola
Differential Revision: http://llvm-reviews.chandlerc.com/D2758
llvm-svn: 201286
* Remove a very old instcombine where we would turn sequences of selects intoOwen Anderson2014-02-121-25/+0
| | | | | | | | | | | | | logical operations on the i1's driving them. This is a bad idea for every target I can think of (confirmed with micro tests on all of: x86-64, ARM, AArch64, Mips, and PowerPC) because it forces the i1 to be materialized into a general purpose register, whereas consuming it directly into a select generally allows it to exist only transiently in a predicate or flags register. Chandler ran a set of performance tests with this change, and reported no measurable change on x86-64. llvm-svn: 201275
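For readers unfamiliar with this kind of rewrite, here is one pattern of the general shape, written as plain C++ rather than IR. It is an assumption that the removed combine covered exactly this pattern; the function names are hypothetical. The two forms compute the same value, and the difference the commit cares about is only where the condition has to live on the target.

```cpp
#include <cassert>

// Nested form: two selects, each consuming its own condition directly.
inline int nestedSelect(bool C0, bool C1, int A, int B) {
  return C0 ? (C1 ? A : B) : B;
}

// Merged form: the two i1 conditions are combined with a logical AND first,
// which forces the combined condition to be materialized as a value.
inline int mergedSelect(bool C0, bool C1, int A, int B) {
  bool Both = C0 & C1; // non-short-circuit, like an 'and' on i1
  return Both ? A : B;
}

// Exhaustively compare both forms over all four condition combinations.
inline bool selectFormsAgree(int A, int B) {
  for (int C0 = 0; C0 < 2; ++C0)
    for (int C1 = 0; C1 < 2; ++C1)
      if (nestedSelect(C0 != 0, C1 != 0, A, B) !=
          mergedSelect(C0 != 0, C1 != 0, A, B))
        return false;
  return true;
}
```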
* [Vectorizer] Add a new 'OperandValueKind' in TargetTransformInfo calledAndrea Di Biagio2014-02-123-7/+81
| | | | | | | | | | | | | | | | | | | | | | | | | | | 'OK_NonUniformConstValue' to identify operands which are constants but not constant splats. The cost model now allows returning 'OK_NonUniformConstValue' for non splat operands that are instances of ConstantVector or ConstantDataVector. With this change, targets are now able to compute different costs for instructions with non-uniform constant operands. For example, on X86 the cost of a vector shift may vary depending on whether the second operand is a uniform or non-uniform constant. This patch applies the following changes:
- The cost model computation now takes into account non-uniform constants;
- The cost of vector shift instructions has been improved in X86TargetTransformInfo analysis pass;
- BBVectorize, SLPVectorizer and LoopVectorize now know how to distinguish between non-uniform and uniform constant operands.
Added a new test to verify that the output of opt '-cost-model -analyze' is valid in the following configurations: SSE2, SSE4.1, AVX, AVX2. llvm-svn: 201272
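The uniform versus non-uniform distinction can be illustrated with a small classifier. This is a sketch under stated assumptions: `ToyOperandKind` and `classifyConstantLanes` are hypothetical names, and a plain vector of integers stands in for the lanes of a constant vector operand. It is not the TargetTransformInfo interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical two-way classification of a constant vector operand.
enum class ToyOperandKind { UniformConstant, NonUniformConstant };

// A constant vector is "uniform" (a splat) when every lane holds the same
// value; otherwise it is a non-uniform constant.
inline ToyOperandKind
classifyConstantLanes(const std::vector<std::int64_t> &Lanes) {
  assert(!Lanes.empty() && "expected at least one lane");
  const std::int64_t First = Lanes.front();
  const bool AllEqual =
      std::all_of(Lanes.begin(), Lanes.end(),
                  [First](std::int64_t V) { return V == First; });
  return AllEqual ? ToyOperandKind::UniformConstant
                  : ToyOperandKind::NonUniformConstant;
}
```

A cost model could then branch on the returned kind, for example to price a vector shift by a splat differently from a shift by distinct per-lane amounts.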
* InstCombine: Teach icmp merging about the equivalence of bit tests and UGE/ULT with a power of 2.Benjamin Kramer2014-02-111-23/+38
| | | | | | | | | This happens in bitfield code. While there reorganize the existing code a bit. llvm-svn: 201176
* [LPM] Switch LICM to actively use LCSSA in addition to preserving it.Chandler Carruth2014-02-111-152/+90
| | | | | | | | | | | | | | | | | | | | | | | Fixes PR18753 and PR18782. This is necessary for LICM to preserve LCSSA correctly and efficiently. There is still some active discussion about whether we should be using LCSSA, but we can't just immediately stop using it and we *need* LICM to preserve it while we are using it. We can restore the old SSAUpdater driven code if and when there is a serious effort to remove the reliance on LCSSA from all of the loop passes. However, this also serves as a great example of why LCSSA is very nice to have. This change significantly simplifies the process of sinking instructions for LICM, and makes it quite a bit less expensive. It wouldn't even be as complex as it is except that I had to start the process of removing the big recursive LCSSA formation hammer in order to switch even this much of the re-forming code to asserting that LCSSA was preserved. I'll fully remove that next just to tidy things up until the LCSSA debate settles one way or the other. llvm-svn: 201148
* [CodeGenPrepare] Undo changes that happened for the profitability check.Quentin Colombet2014-02-111-0/+7
| | | | | | | | | | | | | | | | | The addressing mode matcher checks at some point the profitability of folding an instruction into the addressing mode. When the instruction to be folded has several uses, it checks that the instruction can be folded in each use. To do so, it creates a new matcher for each use and checks if the instruction is in the list of the matched instructions of this new matcher. The new matchers may promote some instructions and this has to be undone to keep the state of the original matcher consistent. A test case will follow. <rdar://problem/16020230> llvm-svn: 201121
* [LPM] A terribly simple fix to a terribly complex bug: PR18773.Chandler Carruth2014-02-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The crux of the issue is that LCSSA doesn't preserve stateful alias analyses. Before r200067, LICM didn't cause LCSSA to run in the LTO pass manager, where LICM runs essentially without any of the other loop passes. As a consequence the globalmodref-aa pass run before that loop pass manager was able to survive the loop pass manager and be used by DSE to eliminate stores in the function called from the loop body in Adobe-C++/loop_unroll (and similar patterns in other benchmarks). When LICM was taught to preserve LCSSA it had to require it as well. This caused it to be run in the loop pass manager and because it did not preserve AA, the stateful AA was lost. Most of LLVM's AA isn't stateful and so this didn't manifest in most cases. Also, in most cases LCSSA was already running, and so there was no interesting change. The real kicker is that LCSSA by its definition (injecting PHI nodes only) trivially preserves AA! All we need to do is mark it, and then everything goes back to working as intended. It probably was blocking some other weird cases of stateful AA but the only one I have is a 1000-line IR test case from loop_unroll, so I don't really have a good test case here. Hopefully this fixes the regressions on performance that have been seen since that revision. llvm-svn: 201104
* Make succ_iterator a real random access iterator and clean up a couple of users.Benjamin Kramer2014-02-101-2/+1
| | | | llvm-svn: 201088
* [asan] support for FreeBSD, LLVM part. patch by Viktor KutuzovKostya Serebryany2014-02-101-2/+7
| | | | llvm-svn: 201067
* LoopVectorizer: Keep track of conditional store basic blocksArnold Schwaighofer2014-02-081-0/+4
| | | | | | | | | | | | | Before conditional store vectorization/unrolling we had only one vectorized/unrolled basic block. After adding support for conditional store vectorization this will not only be one block but multiple basic blocks. The last block would have the back-edge. I updated the code to use a vector of basic blocks instead of a single basic block and fixed the users to use the last entry in this vector. But, I forgot to add the basic blocks to this vector! Fixes PR18724. llvm-svn: 201028
* [Constant Hoisting] Fix insertion point for constant materialization.Juergen Ributzka2014-02-081-18/+21
| | | | | | | | The bitcast instruction during constant materialization was not placed correctly in the presence of phi nodes. This commit fixes the insertion point to be in the idom instead. This fixes PR18768 llvm-svn: 201009
* [Constant Hoisting] Don't update the use list while traversing it - DOH!Juergen Ributzka2014-02-081-5/+16
| | | | | | This fix first traverses the whole use list of the constant expression and keeps track of the instructions that need to be updated. It then performs the fixup afterwards. llvm-svn: 201008
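The collect-then-fix-up pattern looks roughly like this in isolation. `ToyValue` and `rewriteAllUsers` are hypothetical; the assumption being modeled is that rewriting a user removes it from the value's use list, which is what makes mutating during traversal unsafe.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy value with a use list of user ids.
struct ToyValue {
  std::vector<int> UseList;
};

// Phase 1 snapshots the users; phase 2 rewrites them. Because phase 2 walks
// the snapshot, it can erase from UseList without invalidating its iteration.
inline std::vector<int> rewriteAllUsers(ToyValue &V) {
  const std::vector<int> Pending(V.UseList.begin(), V.UseList.end());
  std::vector<int> Rewritten;
  for (int User : Pending) {
    // "Rewriting" a user drops it from the live use list.
    V.UseList.erase(std::remove(V.UseList.begin(), V.UseList.end(), User),
                    V.UseList.end());
    Rewritten.push_back(User);
  }
  return Rewritten;
}
```

Erasing inside a loop that iterates `V.UseList` directly would invalidate the loop's iterators; iterating the snapshot avoids that at the cost of one extra copy.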
* [CodeGenPrepare] Move away sign extensions that get in the way of addressingQuentin Colombet2014-02-061-14/+808
| | | | | | | | | | | | | | | | | | | | | | | | | | mode. Basically the idea is to transform code like this:
    %idx = add nsw i32 %a, 1
    %sextidx = sext i32 %idx to i64
    %gep = gep i8* %myArray, i64 %sextidx
    load i8* %gep
Into:
    %sexta = sext i32 %a to i64
    %idx = add nsw i64 %sexta, 1
    %gep = gep i8* %myArray, i64 %idx
    load i8* %gep
That way the computation can be folded into the addressing mode. This transformation is done as part of the addressing mode matcher. If the matching fails (not profitable, addressing mode not legal, etc.), the matcher will revert the related promotions. <rdar://problem/15519855> llvm-svn: 200947
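The arithmetic fact behind moving the sign extension is that, when the 32-bit add cannot overflow (the `nsw` flag), extending after the add and extending before it give the same 64-bit result. A small C++ sketch with hypothetical function names, where the precondition is enforced by an assertion:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Original shape: add in 32 bits, then sign-extend the sum to 64 bits.
// Precondition mirrors 'nsw': the 32-bit add must not overflow.
inline std::int64_t extendAfterAdd(std::int32_t A) {
  assert(A < std::numeric_limits<std::int32_t>::max() && "add would overflow");
  const std::int32_t Idx = A + 1;
  return static_cast<std::int64_t>(Idx);
}

// Transformed shape: sign-extend first, then add in 64 bits.
inline std::int64_t extendBeforeAdd(std::int32_t A) {
  return static_cast<std::int64_t>(A) + 1;
}
```

Without the no-overflow guarantee the two shapes differ: the 32-bit add would wrap (or be undefined in C++), while the 64-bit add would not.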
* A memcpy out of a fresh alloca is a no-op, delete it. Patch by Patrick Walton!Nick Lewycky2014-02-061-1/+11
| | | | llvm-svn: 200907
* Set default of inlinecold-threshold to 225.Manman Ren2014-02-061-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | 225 is the default value of inline-threshold. This change will make sure we have the same inlining behavior as prior to r200886. As Chandler points out, even though we don't have code in our testing suite that uses cold attribute, there are larger applications that do use cold attribute. r200886 + this commit intend to keep the same behavior as prior to r200886. We can later on tune the inlinecold-threshold. The main purpose of r200886 is to help performance of instrumentation based PGO before we actually hook up inliner with analysis passes such as BPI and BFI. For instrumentation based PGO, we try to increase inlining of hot functions and reduce inlining of cold functions by setting inlinecold-threshold. Another option suggested by Chandler is to use a boolean flag that controls if we should use OptSizeThreshold for cold functions. The default value of the boolean flag should not change the current behavior. But it gives us less freedom in controlling inlining of cold functions. llvm-svn: 200898
* Disable most IR-level transform passes on functions marked 'optnone'.Paul Robinson2014-02-0629-0/+89
| | | | | | | | | Ideally only those transform passes that run at -O0 remain enabled; in reality we get as close as we reasonably can. Passes are responsible for disabling themselves; it's not the job of the pass manager to do it for them. llvm-svn: 200892
* Inliner uses a smaller inline threshold for callees with cold attribute.Manman Ren2014-02-051-0/+11
| | | | | | | | Added command line option inlinecold-threshold to set threshold for inlining functions with cold attribute. Listen to the cold attribute when it would decrease the inline threshold. llvm-svn: 200886
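The rule "listen to the cold attribute when it would decrease the inline threshold" amounts to taking the smaller of two values for cold callees. The sketch below is illustrative only: `effectiveThreshold` is a hypothetical helper, not the inliner's code, and the cold threshold of 75 used in the note is a made-up number.

```cpp
#include <cassert>

// Returns the threshold to use for a call site. The cold threshold applies
// only to cold callees, and only when it is lower than the current threshold,
// so the cold attribute can never make inlining more aggressive.
inline int effectiveThreshold(int CurrentThreshold, bool CalleeIsCold,
                              int ColdThreshold) {
  if (CalleeIsCold && ColdThreshold < CurrentThreshold)
    return ColdThreshold;
  return CurrentThreshold;
}
```

With a current threshold of 225, a cold callee and a cold threshold of 75, the result is 75; with the cold threshold also set to 225 nothing changes, which is the behavior-preserving default described in r200898.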
* SimplifyLibCalls: Push TLI through the exp2->ldexp transform.Benjamin Kramer2014-02-041-29/+29
| | | | | | For the odd case of platforms with exp2 available but not ldexp. llvm-svn: 200795
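The identity behind the exp2 to ldexp rewrite is that 2^N for an integer N equals `ldexp(1.0, N)`. The sketch below also models the availability check in a simplified way; `ToyLibInfo` and `powerOfTwo` are hypothetical and only stand in for the idea of consulting library information before emitting a call.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a "which library functions exist" query.
struct ToyLibInfo {
  bool HasLdexp;
};

// Computes 2^N. Uses ldexp when the platform is known to provide it, and
// otherwise keeps the original exp2 call.
inline double powerOfTwo(int N, const ToyLibInfo &TLI) {
  if (TLI.HasLdexp)
    return std::ldexp(1.0, N); // scales 1.0 by 2^N exactly
  return std::exp2(static_cast<double>(N));
}
```

The point of threading the library info through is the fallback branch: on a platform with exp2 but no ldexp, the rewrite must not happen.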