summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* Teach isKnownNonNull that a nonnull return is not null. Add a test for this ↵Nick Lewycky2014-05-201-0/+17
| | | | | | case as well as the case of a nonnull attribute (already handled but not tested). llvm-svn: 209193
* [ConstantHoisting][X86] Change the cost model to never hoist constants for ↵Juergen Ributzka2014-05-191-0/+9
| | | | | | | | | | | | | | | types larger than i128. Currently the X86 backend doesn't support types larger than i128 very well. For example an i192 multiply will assert in codegen when the 2nd argument is a constant and the constant got hoisted. This fix changes the cost model to never hoist constants for types larger than i128. Once the codegen issues have been resolved, the cost model can be updated to allow also larger types. This is related to <rdar://problem/16954938> llvm-svn: 209162
* Check the alwaysinline attribute on the call as well as on the caller.Peter Collingbourne2014-05-191-0/+11
| | | | | | Differential Revision: http://reviews.llvm.org/D3815 llvm-svn: 209150
* Flip on vectorization of bswap intrinsics.Benjamin Kramer2014-05-191-0/+44
| | | | | | | | | The cost model conservatively assumes that it will always get scalarized and that's about as good as we can get with the generic TTI; reasoning whether a shuffle with an efficient lowering is available is hard. We can override that conservative estimate for some targets in the future. llvm-svn: 209125
* Added inst-combine for 'MIN(MIN(A, 97), 23)' and 'MAX(MAX(A, 23), 97)'Dinesh Dwivedi2014-05-191-0/+52
| | | | | | | | | | | This removes TODO added in r208849 [http://reviews.llvm.org/D3629] MIN(MIN(A, 97), 23) -> MIN(A, 23) MAX(MAX(A, 23), 97) -> MAX(A, 97) Differential Revision: http://reviews.llvm.org/D3785 llvm-svn: 209110
* Revert r209049 and r209065, "Add support for combining GEPs across PHI nodes"NAKAMURA Takumi2014-05-171-56/+0
| | | | | | It broke clang selfhosting even after r209065. llvm-svn: 209067
* Add support for combining GEPs across PHI nodesLouis Gerbarg2014-05-161-0/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently LLVM will generally merge GEPs. This allows backends to use more complex addressing modes. In some cases this is not happening because there is PHI inbetween the two GEPs: GEP1--\ |-->PHI1-->GEP3 GEP2--/ This patch checks to see if GEP1 and GEP2 are similiar enough that they can be cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123): GEP1--\ --\ --\ |-->PHI1-->GEP3 ==> |-->PHI2->GEP12->GEP3 == > |-->PHI2->GEP123 GEP2--/ --/ --/ This also breaks certain use chains that are preventing GEP->GEP merges that the the existing instcombine would merge otherwise. Tests included. rdar://15547484 llvm-svn: 209049
* Add comdat key field to llvm.global_ctors and llvm.global_dtorsReid Kleckner2014-05-161-2/+17
| | | | | | | | | | | | | | This allows us to put dynamic initializers for weak data into the same comdat group as the data being initialized. This is necessary for MSVC ABI compatibility. Once we have comdats for guard variables, we can use the combination to help GlobalOpt fire more often for weak data with guarded initialization on other platforms. Reviewers: nlewycky Differential Revision: http://reviews.llvm.org/D3499 llvm-svn: 209015
* Fix most of PR10367.Rafael Espindola2014-05-165-21/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes the design of GlobalAlias so that it doesn't take a ConstantExpr anymore. It now points directly to a GlobalObject, but its type is independent of the aliasee type. To avoid changing all alias related tests in this patches, I kept the common syntax @foo = alias i32* @bar to mean the same as now. The cases that used to use cast now use the more general syntax @foo = alias i16, i32* @bar. Note that GlobalAlias now behaves a bit more like GlobalVariable. We know that its type is always a pointer, so we omit the '*'. For the bitcode, a nice surprise is that we were writing both identical types already, so the format change is minimal. Auto upgrade is handled by looking through the casts and no new fields are needed for now. New bitcode will simply have different types for Alias and Aliasee. One last interesting point in the patch is that replaceAllUsesWith becomes smart enough to avoid putting a ConstantExpr in the aliasee. This seems better than checking and updating every caller. A followup patch will delete getAliasedGlobal now that it is redundant. Another patch will add support for an explicit offset. llvm-svn: 209007
* InstSimplify: Improve handling of ashr/lshrDavid Majnemer2014-05-161-0/+40
| | | | | | | | | | | | | | Summary: Analyze the range of values produced by ashr/lshr cst, %V when it is being used in an icmp. Reviewers: nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3774 llvm-svn: 209000
* InstSimplify: Optimize using dividend in sdivDavid Majnemer2014-05-161-0/+9
| | | | | | | | | | | | | | | Summary: The dividend in an sdiv tells us the largest and smallest possible results. Use this fact to optimize comparisons against an sdiv with a constant dividend. Reviewers: nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3795 llvm-svn: 208999
* Revert "Implement global merge optimization for global variables."Rafael Espindola2014-05-166-88/+39
| | | | | | | | | | | | This reverts commit r208934. The patch depends on aliases to GEPs with non zero offsets. That is not supported and fairly broken. The good news is that GlobalAlias is being redesigned and will have support for offsets, so this patch should be a nice match for it. llvm-svn: 208978
* Implement global merge optimization for global variables.Jiangning Liu2014-05-156-39/+88
| | | | | | | | | | | This commit implements two command line switches -global-merge-on-external and -global-merge-aligned, and both of them are false by default, so this optimization is disabled by default for all targets. For ARM64, some back-end behaviors need to be tuned to get this optimization further enabled. llvm-svn: 208934
* Don't insert lifetime.end markers between a musttail call and retReid Kleckner2014-05-151-0/+36
| | | | | | | | | | | | | The allocas going out of scope are immediately killed by the return instruction. This is a resend of r208912, which was committed accidentally. Reviewers: chandlerc Differential Revision: http://reviews.llvm.org/D3792 llvm-svn: 208920
* Revert "Don't insert lifetime.end markers between a musttail call and ret"Reid Kleckner2014-05-151-36/+0
| | | | | | | | This reverts commit r208912. It was committed accidentally without review. llvm-svn: 208914
* Don't insert lifetime.end markers between a musttail call and retReid Kleckner2014-05-151-0/+36
| | | | | | | | | | | The allocas going out of scope are immediately killed by the return instruction. Reviewers: chandlerc Differential Revision: http://reviews.llvm.org/D3630 llvm-svn: 208912
* Teach the inliner how to preserve musttail invariantsReid Kleckner2014-05-151-9/+140
| | | | | | | | | | | | | | | | | | | | The interesting case is what happens when you inline a musttail call through a musttail call site. In this case, we can't break perfect forwarding or allow any stack growth. Instead of merging control flow from the inlined return instruction after a musttail call into the body of the caller, leave the inlined return instruction in the caller so that the musttail call stays in the tail position. More work is required in http://reviews.llvm.org/D3630 to handle the case where the inlined function has dynamic allocas or byval arguments. Reviewers: chandlerc Differential Revision: http://reviews.llvm.org/D3491 llvm-svn: 208910
* Teach the constant folder to look through bitcast constant expressionsChandler Carruth2014-05-151-0/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | much more effectively when trying to constant fold a load of a constant. Previously, we only handled bitcasts by trying to find a totally generic byte representation of the constant and use that. Now, we look through the bitcast to see what constant we might fold the load into, and then try to form a constant expression cast of the found value that would be equivalent to loading the value. You might wonder why on earth this actually matters. Well, turns out that the Itanium ABI causes us to create a single array for a vtable where the first elements are virtual base offsets, followed by the virtual function pointers. Because the array is homogenous the element type is consistently i8* and we inttoptr the virtual base offsets into the initial elements. Then constructors bitcast these pointers to i64 pointers prior to loading them. Boom, no more constant folding of virtual base offsets. This is the first fix to LLVM to address the *insane* performance Eric Niebler discovered with Clang on his range comprehensions[1]. There is more to come though, this doesn't *really* fix the problem fully. [1]: http://ericniebler.com/2014/04/27/range-comprehensions/ llvm-svn: 208856
* Reverting r208848, reason: build failure: ↵Dinesh Dwivedi2014-05-151-54/+0
| | | | | | sanitizer-x86_64-linux-bootstrap/builds/3399 llvm-svn: 208852
* Added instcombine for 'MIN(MIN(A, 27), 93)' and 'MAX(MAX(A, 93), 27)'Dinesh Dwivedi2014-05-151-0/+48
| | | | | | | | | MIN(MIN(A, 23), 97) -> MIN(A, 23) MAX(MAX(A, 97), 23) -> MAX(A, 97) Differential Revision: http://reviews.llvm.org/D3629 llvm-svn: 208849
* Added inst combine transforms for single bit tests from Chris's noteDinesh Dwivedi2014-05-151-0/+54
| | | | | | | | | | | | | | | if ((x & C) == 0) x |= C becomes x |= C if ((x & C) != 0) x ^= C becomes x &= ~C if ((x & C) == 0) x ^= C becomes x |= C if ((x & C) != 0) x &= ~C becomes x &= ~C if ((x & C) == 0) x &= ~C becomes nothing Z3 Verifications code for above transform http://rise4fun.com/Z3/Pmsh Differential Revision: http://reviews.llvm.org/D3717 llvm-svn: 208848
* InstCombine: Optimize -x s< cstDavid Majnemer2014-05-151-0/+9
| | | | | | | | | | | | | | Summary: This gets rid of a sub instruction by moving the negation to the constant when valid. Reviewers: nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3773 llvm-svn: 208827
* InstSimplify: Optimize signed icmp of -(zext V)David Majnemer2014-05-141-0/+60
| | | | | | | | | | | | | | | | Summary: We know that -(zext V) will always be <= zero, simplify signed icmps that have these. Uncovered using http://www.cs.utah.edu/~regehr/souper/ Reviewers: nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3754 llvm-svn: 208809
* Fix the case when reordering shuffle and binop produces a constant.Serge Pavlov2014-05-141-0/+11
| | | | | | This resolves PR19737. llvm-svn: 208762
* Optimize integral reciprocal (udiv 1, x and sdiv 1, x) to not use division. ↵Nick Lewycky2014-05-141-0/+19
| | | | | | This fires exactly once in a clang bootstrap, but covers a few different results from http://www.cs.utah.edu/~regehr/souper/ llvm-svn: 208750
* Fix type of shuffle resulted from shuffle merge.Serge Pavlov2014-05-131-0/+8
| | | | | | This fix resolves PR19730. llvm-svn: 208666
* Convert test to FileCheck.Rafael Espindola2014-05-131-1/+8
| | | | llvm-svn: 208658
* Convert test to FileCheck.Rafael Espindola2014-05-131-2/+12
| | | | llvm-svn: 208644
* [Test] Trim unnecessary .c and .cpp from config.suffix in lit.local.cfgAdam Nemet2014-05-122-2/+2
| | | | | | | | | | | | | | Tested by comparing make check VERBOSE=1 before and after to make sure no tests are missed. (VERBOSE=1 prints the list of tests.) Only one test :( remains where .cpp is required: tools/llvm-cov/range_based_for.cpp:// RUN: llvm-cov range_based_for.cpp | FileCheck %s --check-prefix=STDOUT The topic was discussed in this thread: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140428/214905.html llvm-svn: 208621
* Fix type of shuffle obtained from reordering with binary operationSerge Pavlov2014-05-121-0/+11
| | | | | | | | In transformation: BinOp(shuffle(v1,undef), shuffle(v2,undef)) -> shuffle(BinOp(v1, v2),undef) type of the undef argument must be same as type of BinOp. llvm-svn: 208531
* Fix reordering of shuffles and binary operationsSerge Pavlov2014-05-121-0/+12
| | | | | | | | | | | | Do not apply transformation: BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2)) if operands v1 and v2 are of different size. This change fixes PR19717, which was caused by r208488. llvm-svn: 208518
* Reorder shuffle and binary operation.Serge Pavlov2014-05-112-12/+125
| | | | | | | | | | | | | This patch enables transformations: BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2)) BinOp(shuffle(v1), const1) -> shuffle(BinOp, const2) They allow to eliminate extra shuffles in some cases. Differential Revision: http://reviews.llvm.org/D3525 llvm-svn: 208488
* SLPVectorizer: When sorting by domination for CSE don't assert on ↵Benjamin Kramer2014-05-091-0/+30
| | | | | | | | | | | | unreachable code. There is no total ordering if the CFG is disconnected. We don't care if we catch all CSE opportunities in dead code either so just exclude ignore them in the assert. PR19646 llvm-svn: 208461
* Add ExtractValue instruction to SimplifyCFG's ComputeSpeculationCostLouis Gerbarg2014-05-091-0/+22
| | | | | | | | | | | | | Since ExtractValue is not included in ComputeSpeculationCost CFGs containing ExtractValueInsts cannot be simplified. In particular this interacts with InstCombineCompare's tendency to insert add.with.overflow intrinsics for certain idiomatic math operations, preventing optimization. This patch adds ExtractValue to the ComputeSpeculationCost. Test case included rdar://14853450 llvm-svn: 208434
* [InstCombine] Some cleanup in optimization of redundant insertvalue ↵Michael Zolotukhin2014-05-081-0/+11
| | | | | | | | instructions. And one more test added. llvm-svn: 208355
* Revert test commit. Removed blank line.Dario Domizioli2014-05-081-1/+0
| | | | llvm-svn: 208308
* Test commit. Added blank line.Dario Domizioli2014-05-081-0/+1
| | | | llvm-svn: 208298
* Move late partial-unrolling thresholds into the processor definitionsHal Finkel2014-05-082-12/+12
| | | | | | | | | | | | | | | | | | | | | | The old method used by X86TTI to determine partial-unrolling thresholds was messy (because it worked by testing target features), and also would not correctly identify the target CPU if certain target features were disabled. After some discussions on IRC with Chandler et al., it was decided that the processor scheduling models were the right containers for this information (because it is often tied to special uop dispatch-buffer sizes). This does represent a small functionality change: - For generic x86-64 (which uses the SB model and, thus, will get some unrolling). - For AMD cores (because they still currently use the SB scheduling model) - For Haswell (based on benchmarking by Louis Gerbarg, it was decided to bump the default threshold to 50; we're working on a test case for this). Otherwise, nothing has changed for any other targets. The logic, however, has been moved into BasicTTI, so other targets may now also opt-in to this functionality simply by setting LoopMicroOpBufferSize in their processor model definitions. llvm-svn: 208289
* IR: Don't allow non-default visibility on local linkageDuncan P. N. Exon Smith2014-05-075-20/+20
| | | | | | | | | | | | | | | | | Visibilities of `hidden` and `protected` are meaningless for symbols with local linkage. - Change the assembler to reject non-default visibility on symbols with local linkage. - Change the bitcode reader to auto-upgrade `hidden` and `protected` to `default` when the linkage is local. - Update LangRef. <rdar://problem/16141113> llvm-svn: 208263
* [InstCombine] Add optimization of redundant insertvalue instructions.Michael Zolotukhin2014-05-071-0/+25
| | | | | | rdar://problem/11861387 llvm-svn: 208214
* Improve 'tail' call marking in TRE. A bootstrap of clang goes from 375k ↵Nick Lewycky2014-05-051-0/+23
| | | | | | | | | | calls marked tail in the IR to 470k, however this improvement does not carry into an improvement of the call/jmp ratio on x86. The most common pattern is a tail call + br to a block with nothing but a 'ret'. The number of tail call to loop conversions remains the same (1618 by my count). The new algorithm does a local scan over the use-def chains to identify local "alloca-derived" values, as well as points where the alloca could escape. Then, a visit over the CFG marks blocks as being before or after the allocas have escaped, and annotates the calls accordingly. llvm-svn: 208017
* Move test from r207969 to another folder and rename it.Michael Zolotukhin2014-05-051-25/+0
| | | | llvm-svn: 207984
* Always set alignment of vectorized LD/ST in SLP-Vectorizer. ↵Yi Jiang2014-05-051-0/+27
| | | | | | <rdar://problem/16812145> llvm-svn: 207983
* LTO: -internalize sets visibility to defaultDuncan P. N. Exon Smith2014-05-051-0/+25
| | | | | | | | | Visibility is meaningless when the linkage is local. Change `-internalize` to reset the visibility to `default`. <rdar://problem/16141113> llvm-svn: 207979
* Fix test from r207966 and add a comment there.Michael Zolotukhin2014-05-051-2/+4
| | | | llvm-svn: 207969
* Add regression test for r207692.Michael Zolotukhin2014-05-051-0/+23
| | | | llvm-svn: 207966
* LoopUnroll: If we're doing partial unrolling, use the PartialThreshold to ↵Benjamin Kramer2014-05-041-0/+47
| | | | | | | | | | | limit unrolling. Otherwise we use the same threshold as for complete unrolling, which is way too high. This made us unroll any loop smaller than 150 instructions by 8 times, but only if someone specified -march=core2 or better, which happens to be the default on darwin. llvm-svn: 207940
* SLPVectorizer: Bring back the insertelement patch (r205965) with fixesArnold Schwaighofer2014-05-042-0/+140
| | | | | | | | | | | | | | | | When can't assume a vectorized tree is rooted in an instruction. The IRBuilder could have constant folded it. When we rebuild the build_vector (the series of InsertElement instructions) use the last original InsertElement instruction. The vectorized tree root is guaranteed to be before it. Also, we can't assume that the n-th InsertElement inserts the n-th element into a vector. This reverts r207746 which reverted the revert of the revert of r205018 or so. Fixes the test case in PR19621. llvm-svn: 207939
* Vectorize intrinsic math function calls in SLPVectorizer.Karthik Bhat2014-05-031-0/+128
| | | | | | | This patch adds support to recognize and vectorize intrinsic math functions in SLPVectorizer. Review: http://reviews.llvm.org/D3560 and http://reviews.llvm.org/D3559 llvm-svn: 207901
* [LSR] Add llc testcase for r207271/r207569.Adam Nemet2014-05-021-0/+70
| | | | | | | See PR19608 for the details but to summarize it was easy to modify the .ll file to get the desired def-use ordering. llvm-svn: 207887
OpenPOWER on IntegriCloud