summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* Trivial test commit.Raul E. Silvera2014-03-051-0/+1
| | | | llvm-svn: 202924
* Allow constant folding of fma and fmuladdMatt Arsenault2014-03-051-0/+39
| | | | llvm-svn: 202914
* Pass to emit DWARF path discriminators.Diego Novillo2014-03-033-0/+212
| | | | | | | | | | | | | | | | DWARF discriminators are used to distinguish multiple control flow paths on the same source location. When this happens, instructions across basic block boundaries will share the same debug location. This pass detects this situation and creates a new lexical scope to one of the two instructions. This lexical scope is a child scope of the original and contains a new discriminator value. This discriminator is then picked up from MCObjectStreamer::EmitDwarfLocDirective to be written on the object file. This fixes http://llvm.org/bugs/show_bug.cgi?id=18270. llvm-svn: 202752
* Add a debug info code generation level to the compile unit metadataEric Christopher2014-02-272-2/+2
| | | | | | | | | | and update everything accordingly. This can be used to conditionalize the amount of output in the backend based on the amount of debug requested/metadata emission scheme by a front end (e.g. clang). Paired with a commit to clang. llvm-svn: 202332
* Fix broken FileCheck prefixesNico Rieck2014-02-262-8/+8
| | | | llvm-svn: 202308
* GlobalOpt: Apply fastcc to internal x86_thiscallcc functionsReid Kleckner2014-02-261-0/+46
| | | | | | | | | | We should apply fastcc whenever profitable. We can expand this list, but there are lots of conventions with performance implications that we don't want to change. Differential Revision: http://llvm-reviews.chandlerc.com/D2705 llvm-svn: 202293
* Fix broken FileCheck prefixNico Rieck2014-02-261-4/+4
| | | | llvm-svn: 202291
* Fix PR18165: LSR must avoid scaling factors that exceed the limit on ↵Andrew Trick2014-02-261-0/+88
| | | | | | | | truncated use. Patch by Michael Zolotukhin! llvm-svn: 202273
* [SROA] Use the correct index integer size in GEPs through non-defaultChandler Carruth2014-02-261-1/+1
| | | | | | | | | | | address spaces. This isn't really a correctness issue (the values are truncated) but its much cleaner. Patch by Matt Arsenault! llvm-svn: 202252
* [SROA] Teach SROA how to handle pointers from address spaces other thanChandler Carruth2014-02-263-1/+135
| | | | | | | | | | | | | | | the default. Based on the patch by Matt Arsenault, D1764! I switched one place to use the more direct pointer type to compute the desired address space, and I reworked the memcpy rewriting section to reflect significant refactorings that this patch helped inspire. Thanks to several of the folks who helped review and improve the patch as well. llvm-svn: 202247
* [SROA] Split the alignment computation complete for the memcpy rewritingChandler Carruth2014-02-261-0/+16
| | | | | | | | | | | | | to work independently for the slice side and the other side. This allows us to only compute the minimum of the two when we actually rewrite to a memcpy that needs to take the minimum, and preserve higher alignment for one side or the other when rewriting to loads and stores. This fix was inspired by seeing the result of some refactoring that makes addrspace handling better. llvm-svn: 202242
* [SROA] Fix PR18615 with some long overdue simplifications to the boundsChandler Carruth2014-02-261-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | checking in SROA. The primary change is to just rely on uge for checking that the offset is within the allocation size. This removes the explicit checks against isNegative which were terribly error prone (including the reversed logic that led to PR18615) and prevented us from supporting stack allocations larger than half the address space.... Ok, so maybe the latter isn't *common* but it's a silly restriction to have. Also, we used to try to support a PHI node which loaded from before the start of the allocation if any of the loaded bytes were within the allocation. This doesn't make any sense, we have never really supported loading or storing *before* the allocation starts. The simplified logic just doesn't care. We continue to allow loading past the end of the allocation in part to support cases where there is a PHI and some loads are larger than others and the larger ones reach past the end of the allocation. We could solve this a different and more conservative way, but I'm still somewhat paranoid about this. llvm-svn: 202224
* [SROA] Fix another instability in SROA with respect to the sliceChandler Carruth2014-02-251-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ordering. The fundamental problem that we're hitting here is that the use-def chain ordering is *itself* not a stable thing to be relying on in the rewriting for SROA. Further, we use a non-stable sort over the slices to arrange them based on the section of the alloca they're operating on. With a debugging STL implementation (or different implementations in stage2 and stage3) this can cause stage2 != stage3. The specific aspect of this problem fixed in this commit deals with the rewriting and load-speculation around PHIs and Selects. This, like many other aspects of the use-rewriting in SROA, is really part of the "strong SSA-formation" that is doen by SROA where it works very hard to canonicalize loads and stores in *just* the right way to satisfy the needs of mem2reg[1]. When we have a select (or a PHI) with 2 uses of the same alloca, we test that loads downstream of the select are speculatable around it twice. If only one of the operands to the select needs to be rewritten, then if we get lucky we rewrite that one first and the select is immediately speculatable. This can cause the order of operand visitation, and thus the order of slices to be rewritten, to change an alloca from promotable to non-promotable and vice versa. The fix is to defer all of the speculation until *after* the rewrite phase is done. Once we've rewritten everything, we can accurately test for whether speculation will work (once, instead of twice!) and the order ceases to matter. This also happens to simplify the other subtlety of speculation -- we need to *not* speculate anything unless the result of speculating will make the alloca fully promotable by mem2reg. I had a previous attempt at simplifying this, but it was still pretty horrible. There is actually already a *really* nice test case for this in basictest.ll, but on multiple STL implementations and inputs, we just got "lucky". Fortunately, the test case is very small and we can essentially build it in exactly the opposite way to get reasonable coverage in both directions even from normal STL implementations. llvm-svn: 202092
* SLPVectorizer: Try vectorizing 'splat' storesArnold Schwaighofer2014-02-241-0/+15
| | | | | | | | | Vectorize sequential stores of a broadcasted value. 5% on eon. radar://16124699 llvm-svn: 202067
* Make sure that value handle users see the transformation of an indirect call ↵Nick Lewycky2014-02-201-0/+23
| | | | | | to a direct call. This is important for the CallGraph iteration. Patch by Björn Steinbrink! llvm-svn: 201822
* X86: move test requiring X86TargetLowering info into its own directoryTim Northover2014-02-192-0/+4
| | | | | | | If LLVM is built without X86 as a supported target then the test would mysteriously fail. llvm-svn: 201668
* Try addding datalayout in case that's what Hexagon doesn't like.Tim Northover2014-02-191-2/+5
| | | | | | | Just a wild stab in the dark really, but in the absence of any ability to reproduce the problem... llvm-svn: 201658
* X86 CodeGenPrep: sink shufflevectors before shiftsTim Northover2014-02-191-0/+102
| | | | | | | | | | | | | | | | | On x86, shifting a vector by a scalar is significantly cheaper than shifting a vector by another fully general vector. Unfortunately, because SelectionDAG operates on just one basic block at a time, the shufflevector instruction that reveals whether the right-hand side of a shift *is* really a scalar is often not visible to CodeGen when it's needed. This adds another handler to CodeGenPrepare, to sink any useful shufflevector instructions down to the basic block where they're used, predicated on a target hook (since on other architectures, doing so will often just introduce extra real work). rdar://problem/16063505 llvm-svn: 201655
* fix for null VectorizedValue assertion in the SLP Vectorizer (in function ↵Gerolf Hoflehner2014-02-171-0/+65
| | | | | | vectorizeTree()). radar://16064178 llvm-svn: 201501
* SCEVExpander: Try hard not to create derived induction variables in other loopsArnold Schwaighofer2014-02-161-0/+50
| | | | | | | | | | | | | | | | | | | During LSR of one loop we can run into a situation where we have to expand the start of a recurrence of a loop induction variable in this loop. This start value is a value derived of the induction variable of a preceeding loop. SCEV has cannonicalized this value to a different recurrence than the recurrence of the preceeding loop's induction variable (the type and/or step direction) has changed). When we come to instantiate this SCEV we created a second induction variable in this preceeding loop. This patch tries to base such derived induction variables of the preceeding loop's induction variable. This helps twolf on arm and seems to help scimark2 on x86. Reapply with a fix for the case of a value derived from a pointer. radar://15970709 llvm-svn: 201496
* Fix broken CHECK linesNico Rieck2014-02-161-1/+1
| | | | llvm-svn: 201479
* Revert "SCEVExpander: Try hard not to create derived induction variables in ↵Arnold Schwaighofer2014-02-151-50/+0
| | | | | | | | other loops" This reverts commit r201465. It broke an arm bot. llvm-svn: 201466
* SCEVExpander: Try hard not to create derived induction variables in other loopsArnold Schwaighofer2014-02-151-0/+50
| | | | | | | | | | | | | | | | | During LSR of one loop we can run into a situation where we have to expand the start of a recurrence of a loop induction variable in this loop. This start value is a value derived of the induction variable of a preceeding loop. SCEV has cannonicalized this value to a different recurrence than the recurrence of the preceeding loop's induction variable (the type and/or step direction) has changed). When we come to instantiate this SCEV we created a second induction variable in this preceeding loop. This patch tries to base such derived induction variables of the preceeding loop's induction variable. This helps twolf on arm and seems to help scimark2 on x86. radar://15970709 llvm-svn: 201465
* Do more addrspacecast transforms that happen for bitcast.Matt Arsenault2014-02-142-2/+70
| | | | | | Makes addrspacecast (gep) do addrspacecast (gep) instead. llvm-svn: 201376
* Re-commit: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove ↵Daniel Sanders2014-02-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | hasRawTextSupport() call Summary: AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output. The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as. All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler. Changes since review (and last commit attempt): - Fixed test failures that were missed due to configuration of local build. (fixes crash.ll and a couple others). - Fixed tests that happened to pass because the local build was on X86 (should fix 2007-12-17-InvokeAsm.ll) - mature-mc-support.ll's should no longer require all targets to be compiled. (should fix ARM and PPC buildbots) - Object output (-filetype=obj and similar) now forces the integrated assembler to be enabled regardless of default setting or -no-integrated-as. (should fix SystemZ buildbots) Reviewers: rafael Reviewed By: rafael CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2686 llvm-svn: 201333
* GlobalOpt: Aliases don't have sections, don't copy them when replacingReid Kleckner2014-02-132-4/+9
| | | | | | | | | | | | | | | | | | | | | As defined in LangRef, aliases do not have sections. However, LLVM's GlobalAlias class inherits from GlobalValue, which means we can read and set its section. We should probably ban that as a separate change, since it doesn't make much sense for an alias to have a section that differs from its aliasee. Fixes PR18757, where the section was being lost on the global in code from Clang like: extern "C" { __attribute__((used, section("CUSTOM"))) static int in_custom_section; } Reviewers: rafael.espindola Differential Revision: http://llvm-reviews.chandlerc.com/D2758 llvm-svn: 201286
* Remove a very old instcombine where we would turn sequences of selects intoOwen Anderson2014-02-121-0/+24
| | | | | | | | | | | | | logical operations on the i1's driving them. This is a bad idea for every target I can think of (confirmed with micro tests on all of: x86-64, ARM, AArch64, Mips, and PowerPC) because it forces the i1 to be materialized into a general purpose register, whereas consuming it directly into a select generally allows it to exist only transiently in a predicate or flags register. Chandler ran a set of performance tests with this change, and reported no measurable change on x86-64. llvm-svn: 201275
* Revert r201237+r201238: Demote EmitRawText call in ↵Daniel Sanders2014-02-121-1/+1
| | | | | | | | AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call It introduced multiple test failures in the buildbots. llvm-svn: 201241
* Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove ↵Daniel Sanders2014-02-121-1/+1
| | | | | | | | | | | | | | | | | | | | | hasRawTextSupport() call Summary: AsmPrinter::EmitInlineAsm() will no longer use the EmitRawText() call for targets with mature MC support. Such targets will always parse the inline assembly (even when emitting assembly). Targets without mature MC support continue to use EmitRawText() for assembly output. The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been replaced with MCAsmInfo::UseIntegratedAs which when true, causes the integrated assembler to parse inline assembly (even when emitting assembly output). UseIntegratedAs is set to true for targets that consider any failure to parse valid assembly to be a bug. Target specific subclasses generally enable the integrated assembler in their constructor. The default value can be overridden with -no-integrated-as. All tests that rely on inline assembly supporting invalid assembly (for example, those that use mnemonics such as 'foo' or 'hello world') have been updated to disable the integrated assembler. Reviewers: rafael Reviewed By: rafael CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2686 llvm-svn: 201237
* InstCombine: Teach icmp merging about the equivalence of bit tests and ↵Benjamin Kramer2014-02-111-0/+38
| | | | | | | | | UGE/ULT with a power of 2. This happens in bitfield code. While there reorganize the existing code a bit. llvm-svn: 201176
* [LPM] Switch LICM to actively use LCSSA in addition to preserving it.Chandler Carruth2014-02-112-17/+82
| | | | | | | | | | | | | | | | | | | | | | | Fixes PR18753 and PR18782. This is necessary for LICM to preserve LCSSA correctly and efficiently. There is still some active discussion about whether we should be using LCSSA, but we can't just immediately stop using it and we *need* LICM to preserve it while we are using it. We can restore the old SSAUpdater driven code if and when there is a serious effort to remove the reliance on LCSSA from all of the loop passes. However, this also serves as a great example of why LCSSA is very nice to have. This change significantly simplifies the process of sinking instructions for LICM, and makes it quite a bit less expensive. It wouldn't even be as complex as it is except that I had to start the process of removing the big recursive LCSSA formation hammer in order to switch even this much of the re-forming code to asserting that LCSSA was preserved. I'll fully remove that next just to tidy things up until the LCSSA debate settles one way or the other. llvm-svn: 201148
* LoopVectorizer: Keep track of conditional store basic blocksArnold Schwaighofer2014-02-081-0/+40
| | | | | | | | | | | | | Before conditional store vectorization/unrolling we had only one vectorized/unrolled basic block. After adding support for conditional store vectorization this will not only be one block but multiple basic blocks. The last block would have the back-edge. I updated the code to use a vector of basic blocks instead of a single basic block and fixed the users to use the last entry in this vector. But, I forgot to add the basic blocks to this vector! Fixes PR18724. llvm-svn: 201028
* [Constant Hoisting] Fix insertion point for constant materialization.Juergen Ributzka2014-02-081-0/+22
| | | | | | | | | | The bitcast instruction during constant materialization was not placed correcly in the presence of phi nodes. This commit fixes the insertion point to be in the idom instead. This fixes PR18768 llvm-svn: 201009
* A memcpy out of an fresh alloca is a no-op, delete it. Patch by Patrick Walton!Nick Lewycky2014-02-061-0/+25
| | | | llvm-svn: 200907
* Set default of inlinecold-threshold to 225.Manman Ren2014-02-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | 225 is the default value of inline-threshold. This change will make sure we have the same inlining behavior as prior to r200886. As Chandler points out, even though we don't have code in our testing suite that uses cold attribute, there are larger applications that do use cold attribute. r200886 + this commit intend to keep the same behavior as prior to r200886. We can later on tune the inlinecold-threshold. The main purpose of r200886 is to help performance of instrumentation based PGO before we actually hook up inliner with analysis passes such as BPI and BFI. For instrumentation based PGO, we try to increase inlining of hot functions and reduce inlining of cold functions by setting inlinecold-threshold. Another option suggested by Chandler is to use a boolean flag that controls if we should use OptSizeThreshold for cold functions. The default value of the boolean flag should not change the current behavior. But it gives us less freedom in controlling inlining of cold functions. llvm-svn: 200898
* Inliner uses a smaller inline threshold for callees with cold attribute.Manman Ren2014-02-051-0/+88
| | | | | | | | Added command line option inlinecold-threshold to set threshold for inlining functions with cold attribute. Listen to the cold attribute when it would decrease the inline threshold. llvm-svn: 200886
* SimplifyLibCalls: Push TLI through the exp2->ldexp transform.Benjamin Kramer2014-02-041-0/+24
| | | | | | For the odd case of platforms with exp2 available but not ldexp. llvm-svn: 200795
* OS X: the correct function is __sincospif_stret, not __sincospi_stretfTim Northover2014-02-041-4/+4
| | | | | | rdar://problem/13729466 llvm-svn: 200771
* Add strchr(p, 0) -> p + strlen(p) to SimplifyLibCallsKai Nacke2014-02-041-0/+13
| | | | | | | | Add the missing transformation strchr(p, 0) -> p + strlen(p) to SimplifyLibCalls and remove the ToDo comment. Reviewer: Duncan P.N. Exan Smith llvm-svn: 200736
* inalloca: Don't remove dead arguments in the presence of inalloca argsReid Kleckner2014-02-031-0/+16
| | | | | | | | | | | | It disturbs the layout of the parameters in memory and registers, leading to problems in the backend. The plan for optimizing internal inalloca functions going forward is to essentially SROA the argument memory and demote any captured arguments (things that aren't trivially written by a load or store) to an indirect pointer to a static alloca. llvm-svn: 200717
* Lower llvm.expect intrinsic correctly for i1Duncan P. N. Exon Smith2014-02-021-0/+29
| | | | | | | | | | LowerExpectIntrinsic previously only understood the idiom of an expect intrinsic followed by a comparison with zero. For llvm.expect.i1, the comparison would be stripped by the early-cse pass. Patch by Daniel Micay. llvm-svn: 200664
* LoopVectorizer: Enable unrolling of conditional stores and the load/storeArnold Schwaighofer2014-02-021-0/+3
| | | | | | | | | unrolling heuristic per default Benchmarking on x86_64 (thanks Chandler!) and ARM has shown those options speed up some benchmarks while not causing any interesting regressions. llvm-svn: 200621
* ARMTTI: We don't have 16 allocatable scalar registersArnold Schwaighofer2014-02-011-0/+36
| | | | | | | This caused an regression on libquantum after enabling the new loop vectorizer unroll heuristics. llvm-svn: 200616
* [LPM] Apply a really big hammer to fix PR18688 by recursively reformingChandler Carruth2014-02-011-0/+76
| | | | | | | | | | | | | | | | | | | | LCSSA when we promote to SSA registers inside of LICM. Currently, this is actually necessary. The promotion logic in LICM uses SSAUpdater which doesn't understand how to place LCSSA PHI nodes. Teaching it to do so would be a very significant undertaking. It may be worthwhile and I've left a FIXME about this in the code as well as starting a thread on llvmdev to try to figure out the right long-term solution. For now, the PR needs to be fixed. Short of using the promition SSAUpdater to place both the LCSSA PHI nodes and the promoted PHI nodes, I don't see a cleaner or cheaper way of achieving this. Fortunately, LCSSA is relatively lazy and sparse -- it should only update instructions which need it. We can also skip the recursive variant when we don't promote to SSA values. llvm-svn: 200612
* [inliner] Skip debug intrinsics even earlier in computing the inlineChandler Carruth2014-02-011-0/+55
| | | | | | | | | | | | | | | | | | | | | | | cost so that they don't impact the vector bonus. Fundamentally, counting unsimplified instructions is just *wrong*; it will continue to introduce instability as things which do not generate code bizarrely impact inlining. For example, sufficiently nested inlined functions could turn off the vector bonus with lifetime markers just like the debug intrinsics do. =/ This is a short-term tactical fix. Long term, I think we need to remove the vector bonus entirely. That's a separate patch and discussion though. The patch to fix this provided by Dario Domizioli. I've added some comments about the planned direction and used a heavily pruned form of debug info intrinsics for the test case. While this debug info doesn't work or "do" anything useful, it lets us easily test all manner of interference easily, and I suspect this will not be the last time we want to craft a pattern where debug info interferes with the inliner in a problematic way. llvm-svn: 200609
* Revert "[SLPV] Recognize vectorizable intrinsics during SLP vectorization ..."Reid Kleckner2014-02-011-75/+0
| | | | | | | This reverts commit r200576. It broke 32-bit self-host builds by vectorizing two calls to @llvm.bswap.i64, which we then fail to expand. llvm-svn: 200602
* [SLPV] Recognize vectorizable intrinsics during SLP vectorization andChandler Carruth2014-01-311-0/+75
| | | | | | | | | | transform accordingly. Based on similar code from Loop vectorization. Subsequent commits will include vectorization of function calls to vector intrinsics and form function calls to vector library calls. Patch by Raul Silvera! (Much delayed due to my not running dcommit) llvm-svn: 200576
* [vectorizer] Tweak the way we do small loop runtime unrolling in theChandler Carruth2014-01-311-15/+42
| | | | | | | | | | | | | | loop vectorizer to not do so when runtime pointer checks are needed and share code with the new (not yet enabled) load/store saturation runtime unrolling. Also ensure that we only consider the runtime checks when the loop hasn't already been vectorized. If it has, the runtime check cost has already been paid. I've fleshed out a test case to cover the scalar unrolling as well as the vector unrolling and comment clearly why we are or aren't following the pattern. llvm-svn: 200530
* Allow speculating llvm.sqrt, fma and fmuladdMatt Arsenault2014-01-311-0/+58
| | | | | | | | This doesn't set errno, so this should be OK. Also update the documentation to explicitly state that errno are not set. llvm-svn: 200501
* LoopVectorizer: Add a test case for unrolling of small loops that need a runtimeArnold Schwaighofer2014-01-291-0/+23
| | | | | | check. llvm-svn: 200408
OpenPOWER on IntegriCloud