path: root/llvm/test/Transforms
Commit message (Author, Date; files changed, lines -/+)
* AArch64/ARM64: move ARM64 into AArch64's place (Tim Northover, 2014-05-24; 15 files, -20/+14)
    This commit starts with a "git mv ARM64 AArch64" and continues out from
    there, renaming the C++ classes, intrinsics, and other target-local
    objects for consistency. "ARM64" test directories are also moved, and
    tests that began their life in ARM64 use an arm64 triple, while those
    from AArch64 use an aarch64 triple. Both should be equivalent, though.
    This finishes the AArch64 merge, and everyone should feel free to
    continue committing as normal now.
    llvm-svn: 209577
* AArch64/ARM64: remove AArch64 from tree prior to renaming ARM64 (Tim Northover, 2014-05-24; 1 file, -1/+1)
    I'm doing this in two phases for a better "git blame" record. This
    commit removes the previous AArch64 backend and redirects all
    functionality to ARM64. It also deduplicates test lines and removes
    orphaned AArch64 tests. The next step will be "git mv ARM64 AArch64"
    and rewiring most of the tests. Hopefully LLVM is still functional,
    though it would be even better if no-one ever had to care because the
    rename happens straight afterwards.
    llvm-svn: 209576
* Implement sext(C1 + C2*X) --> sext(C1) + sext(C2*X) and sext{C1,+,C2} --> sext(C1) + sext{0,+,C2} transformations in Scalar Evolution (Michael Zolotukhin, 2014-05-24; 1 file, -0/+175)
    This helps the SLP vectorizer recognize consecutive loads and stores.
    <rdar://problem/14860614>
    llvm-svn: 209568
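    A hypothetical sketch (function and value names invented, not from the
    commit) of the kind of IR this helps: both store addresses are built
    from sign-extended 32-bit indices that differ by a constant, and the
    rewritten SCEVs let the SLP vectorizer prove the accesses are
    consecutive:

        define void @two_stores(float* %a, i32 %i) {
          ; %p1 is exactly one float past %p0; the new sext rules let
          ; SCEV see that through the sign extensions
          %i1 = add nsw i32 %i, 1
          %idx0 = sext i32 %i to i64
          %idx1 = sext i32 %i1 to i64
          %p0 = getelementptr inbounds float* %a, i64 %idx0
          %p1 = getelementptr inbounds float* %a, i64 %idx1
          store float 1.0, float* %p0
          store float 2.0, float* %p1
          ret void
        }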
* Fix broken FileCheck prefixes (Nico Rieck, 2014-05-23; 1 file, -1/+1)
    llvm-svn: 209538
* Add the extracted constant offset using GEP (Jingyue Wu, 2014-05-23; 2 files, -13/+30)
    Fixed a TODO in r207783. Add the extracted constant offset using GEP
    instead of ugly ptrtoint+add+inttoptr. Using GEP simplifies future
    optimizations and makes IR easier to understand. Updated all affected
    tests, and added a new test in split-gep.ll to cover a corner case
    where emitting uglygep is necessary.
    llvm-svn: 209537
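    A hypothetical before/after sketch (names invented) of the two ways to
    re-add an extracted constant offset of 32 bytes:

        ; before: ugly ptrtoint+add+inttoptr
        define float* @uglygep(float* %base) {
          %i = ptrtoint float* %base to i64
          %sum = add i64 %i, 32
          %p = inttoptr i64 %sum to float*
          ret float* %p
        }

        ; after: the same offset as a GEP (8 x float = 32 bytes), which
        ; later passes can reason about
        define float* @cleangep(float* %base) {
          %p = getelementptr float* %base, i64 8
          ret float* %p
        }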
* ScalarEvolution: Fix handling of AddRecs in isKnownPredicate (Justin Bogner, 2014-05-23; 1 file, -0/+30)
    ScalarEvolution::isKnownPredicate() can wrongly reduce a comparison
    when both the LHS and RHS are SCEVAddRecExprs. This checks that both
    LHS and RHS are guarded in the case when both are SCEVAddRecExprs.
    The test case is against indvars because I could not find a way to
    directly test SCEV.
    Patch by Sanjay Patel!
    llvm-svn: 209487
* Add support for missed and analysis optimization remarks (Diego Novillo, 2014-05-22; 1 file, -0/+60)
    Summary:
    This adds two new diagnostics: -pass-remarks-missed and
    -pass-remarks-analysis. They take the same values as -pass-remarks but
    are intended to be triggered in different contexts.

    -pass-remarks-missed is used by
    LLVMContext::emitOptimizationRemarkMissed, which passes call when they
    tried to apply a transformation but could not.

    -pass-remarks-analysis is used by
    LLVMContext::emitOptimizationRemarkAnalysis, which passes call when
    they want to inform the user about analysis results.

    The patch also:
    1. Adds support in the inliner for the two new remarks and a test case.
    2. Moves the emitOptimizationRemark* functions to the llvm namespace.
    3. Adds an LLVMContext argument instead of making them member
       functions of LLVMContext.

    Reviewers: qcolombet
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D3682
    llvm-svn: 209442
* Similar to bitcast, treat addrspacecast as a foldable operand (Eli Bendersky, 2014-05-22; 1 file, -0/+37)
    Added a test, sink-addrspacecast.ll, to verify this change.
    Patch by Jingyue Wu.
    llvm-svn: 209343
* Teach isKnownNonNull that a nonnull return is not null (Nick Lewycky, 2014-05-20; 1 file, -0/+17)
    Add a test for this case as well as the case of a nonnull attribute
    (already handled but not tested).
    llvm-svn: 209193
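    A minimal sketch (hypothetical function names) of the pattern this
    lets us fold:

        declare nonnull i8* @get()

        define i1 @is_null() {
          ; the nonnull return attribute on @get proves %p != null,
          ; so the icmp folds to false
          %p = call nonnull i8* @get()
          %c = icmp eq i8* %p, null
          ret i1 %c
        }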
* [ConstantHoisting][X86] Change the cost model to never hoist constants for types larger than i128 (Juergen Ributzka, 2014-05-19; 1 file, -0/+9)
    Currently the X86 backend doesn't support types larger than i128 very
    well. For example, an i192 multiply will assert in codegen when the
    second argument is a constant and the constant got hoisted. This fix
    changes the cost model to never hoist constants for types larger than
    i128. Once the codegen issues have been resolved, the cost model can
    be updated to allow larger types as well.
    This is related to <rdar://problem/16954938>.
    llvm-svn: 209162
* Check the alwaysinline attribute on the call as well as on the caller (Peter Collingbourne, 2014-05-19; 1 file, -0/+11)
    Differential Revision: http://reviews.llvm.org/D3815
    llvm-svn: 209150
* Flip on vectorization of bswap intrinsics (Benjamin Kramer, 2014-05-19; 1 file, -0/+44)
    The cost model conservatively assumes that it will always get
    scalarized, and that's about as good as we can get with the generic
    TTI; reasoning about whether a shuffle with an efficient lowering is
    available is hard. We can override that conservative estimate for some
    targets in the future.
    llvm-svn: 209125
* Added instcombine for 'MIN(MIN(A, 97), 23)' and 'MAX(MAX(A, 23), 97)' (Dinesh Dwivedi, 2014-05-19; 1 file, -0/+52)
    This removes a TODO added in r208849 [http://reviews.llvm.org/D3629]:
        MIN(MIN(A, 97), 23) -> MIN(A, 23)
        MAX(MAX(A, 23), 97) -> MAX(A, 97)
    Differential Revision: http://reviews.llvm.org/D3785
    llvm-svn: 209110
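    In IR these MIN/MAX patterns appear as icmp+select pairs; a
    hypothetical sketch of the folded form:

        define i32 @nested_min(i32 %a) {
          %c1 = icmp slt i32 %a, 97
          %min1 = select i1 %c1, i32 %a, i32 97    ; MIN(A, 97)
          %c2 = icmp slt i32 %min1, 23
          %min2 = select i1 %c2, i32 %min1, i32 23 ; MIN(MIN(A, 97), 23)
          ret i32 %min2                            ; folds to MIN(A, 23)
        }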
* Revert r209049 and r209065, "Add support for combining GEPs across PHI nodes" (NAKAMURA Takumi, 2014-05-17; 1 file, -56/+0)
    It broke clang selfhosting even after r209065.
    llvm-svn: 209067
* Add support for combining GEPs across PHI nodes (Louis Gerbarg, 2014-05-16; 1 file, -0/+56)
    Currently LLVM will generally merge GEPs. This allows backends to use
    more complex addressing modes. In some cases this is not happening
    because there is a PHI between the two GEPs:

        GEP1--\
              |-->PHI1-->GEP3
        GEP2--/

    This patch checks to see if GEP1 and GEP2 are similar enough that they
    can be cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123):

        GEP1--\                    --\                         --\
              |-->PHI1-->GEP3  ==>    |-->PHI2->GEP12->GEP3 ==>   |-->PHI2->GEP123
        GEP2--/                    --/                         --/

    This also breaks certain use chains that were preventing GEP->GEP
    merges that the existing instcombine would otherwise perform. Tests
    included.
    rdar://15547484
    llvm-svn: 209049
* Add comdat key field to llvm.global_ctors and llvm.global_dtors (Reid Kleckner, 2014-05-16; 1 file, -2/+17)
    This allows us to put dynamic initializers for weak data into the same
    comdat group as the data being initialized. This is necessary for MSVC
    ABI compatibility. Once we have comdats for guard variables, we can
    use the combination to help GlobalOpt fire more often for weak data
    with guarded initialization on other platforms.
    Reviewers: nlewycky
    Differential Revision: http://reviews.llvm.org/D3499
    llvm-svn: 209015
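    A hypothetical sketch (global and function names invented) of the
    extended llvm.global_ctors format, where the new trailing field
    associates the initializer with the data it initializes:

        @v = weak_odr global i32 0

        define internal void @init_v() {
          store i32 42, i32* @v
          ret void
        }

        ; the third (i8*) field is the comdat key; null means "no key"
        @llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }]
            [{ i32, void ()*, i8* } { i32 65535, void ()* @init_v,
                                      i8* bitcast (i32* @v to i8*) }]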
* Fix most of PR10367 (Rafael Espindola, 2014-05-16; 5 files, -21/+21)
    This patch changes the design of GlobalAlias so that it doesn't take a
    ConstantExpr anymore. It now points directly to a GlobalObject, but
    its type is independent of the aliasee type.

    To avoid changing all alias-related tests in this patch, I kept the
    common syntax

        @foo = alias i32* @bar

    to mean the same as now. The cases that used to use a cast now use the
    more general syntax

        @foo = alias i16, i32* @bar

    Note that GlobalAlias now behaves a bit more like GlobalVariable. We
    know that its type is always a pointer, so we omit the '*'.

    For the bitcode, a nice surprise is that we were writing both
    identical types already, so the format change is minimal. Auto-upgrade
    is handled by looking through the casts; no new fields are needed for
    now. New bitcode will simply have different types for Alias and
    Aliasee.

    One last interesting point in the patch is that replaceAllUsesWith
    becomes smart enough to avoid putting a ConstantExpr in the aliasee.
    This seems better than checking and updating every caller.

    A followup patch will delete getAliasedGlobal now that it is
    redundant. Another patch will add support for an explicit offset.
    llvm-svn: 209007
* InstSimplify: Improve handling of ashr/lshr (David Majnemer, 2014-05-16; 1 file, -0/+40)
    Summary: Analyze the range of values produced by 'ashr/lshr cst, %V'
    when it is being used in an icmp.
    Reviewers: nicholas
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D3774
    llvm-svn: 209000
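    A minimal hypothetical example: 'lshr 15, %v' can only produce values
    in [0, 15], so the comparison below simplifies to false:

        define i1 @shr_range(i32 %v) {
          %s = lshr i32 15, %v
          %c = icmp ugt i32 %s, 15  ; never true
          ret i1 %c
        }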
* InstSimplify: Optimize using dividend in sdiv (David Majnemer, 2014-05-16; 1 file, -0/+9)
    Summary: The dividend in an sdiv tells us the largest and smallest
    possible results. Use this fact to optimize comparisons against an
    sdiv with a constant dividend.
    Reviewers: nicholas
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D3795
    llvm-svn: 208999
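    A minimal hypothetical example: with a dividend of 11, the result of
    the sdiv lies in [-11, 11], so the comparison simplifies to false:

        define i1 @div_range(i32 %x) {
          %d = sdiv i32 11, %x
          %c = icmp sgt i32 %d, 11  ; never true
          ret i1 %c
        }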
* Revert "Implement global merge optimization for global variables."Rafael Espindola2014-05-166-88/+39
| | | | | | | | | | | | This reverts commit r208934. The patch depends on aliases to GEPs with non zero offsets. That is not supported and fairly broken. The good news is that GlobalAlias is being redesigned and will have support for offsets, so this patch should be a nice match for it. llvm-svn: 208978
* Implement global merge optimization for global variables (Jiangning Liu, 2014-05-15; 6 files, -39/+88)
    This commit implements two command line switches,
    -global-merge-on-external and -global-merge-aligned. Both default to
    false, so this optimization is disabled by default for all targets.
    For ARM64, some back-end behaviors need to be tuned to get this
    optimization further enabled.
    llvm-svn: 208934
* Don't insert lifetime.end markers between a musttail call and ret (Reid Kleckner, 2014-05-15; 1 file, -0/+36)
    The allocas going out of scope are immediately killed by the return
    instruction.
    This is a resend of r208912, which was committed accidentally.
    Reviewers: chandlerc
    Differential Revision: http://reviews.llvm.org/D3792
    llvm-svn: 208920
* Revert "Don't insert lifetime.end markers between a musttail call and ret"Reid Kleckner2014-05-151-36/+0
| | | | | | | | This reverts commit r208912. It was committed accidentally without review. llvm-svn: 208914
* Don't insert lifetime.end markers between a musttail call and ret (Reid Kleckner, 2014-05-15; 1 file, -0/+36)
    The allocas going out of scope are immediately killed by the return
    instruction.
    Reviewers: chandlerc
    Differential Revision: http://reviews.llvm.org/D3630
    llvm-svn: 208912
* Teach the inliner how to preserve musttail invariants (Reid Kleckner, 2014-05-15; 1 file, -9/+140)
    The interesting case is what happens when you inline a musttail call
    through a musttail call site. In this case, we can't break perfect
    forwarding or allow any stack growth.

    Instead of merging control flow from the inlined return instruction
    after a musttail call into the body of the caller, leave the inlined
    return instruction in the caller so that the musttail call stays in
    the tail position.

    More work is required in http://reviews.llvm.org/D3630 to handle the
    case where the inlined function has dynamic allocas or byval
    arguments.

    Reviewers: chandlerc
    Differential Revision: http://reviews.llvm.org/D3491
    llvm-svn: 208910
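    A hypothetical sketch (function names invented) of the interesting
    case, a musttail call inlined through a musttail call site:

        define i32 @inner(i32 %x) {
          %r = musttail call i32 @sink(i32 %x)
          ret i32 %r
        }

        define i32 @outer(i32 %x) {
          ; when @inner is inlined here, the inlined 'musttail call' to
          ; @sink must stay immediately before a ret, so the inlined
          ; return is kept instead of being merged into new control flow
          %r = musttail call i32 @inner(i32 %x)
          ret i32 %r
        }

        declare i32 @sink(i32)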
* Teach the constant folder to look through bitcast constant expressions when constant folding a load of a constant (Chandler Carruth, 2014-05-15; 1 file, -0/+34)
    Previously, we only handled bitcasts by trying to find a totally
    generic byte representation of the constant and use that. Now, we look
    through the bitcast to see what constant we might fold the load into,
    and then try to form a constant expression cast of the found value
    that would be equivalent to loading the value.

    You might wonder why on earth this actually matters. Well, it turns
    out that the Itanium ABI causes us to create a single array for a
    vtable where the first elements are virtual base offsets, followed by
    the virtual function pointers. Because the array is homogeneous, the
    element type is consistently i8* and we inttoptr the virtual base
    offsets into the initial elements. Then constructors bitcast these
    pointers to i64 pointers prior to loading them. Boom, no more constant
    folding of virtual base offsets.

    This is the first fix to LLVM to address the *insane* performance Eric
    Niebler discovered with Clang on his range comprehensions [1]. There
    is more to come, though; this doesn't *really* fix the problem fully.
    [1]: http://ericniebler.com/2014/04/27/range-comprehensions/
    llvm-svn: 208856
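    A simplified, hypothetical version of the vtable pattern described
    above (real vtables are larger; datalayout omitted):

        ; a homogeneous i8* array whose first element is an inttoptr'd
        ; virtual base offset
        @vtable = constant [2 x i8*] [i8* inttoptr (i64 16 to i8*), i8* null]

        define i64 @vbase_offset() {
          ; with this change, the load through the bitcast constant
          ; expression can be folded directly to i64 16
          %p = bitcast [2 x i8*]* @vtable to i64*
          %v = load i64* %p
          ret i64 %v
        }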
* Reverting r208848, reason: build failure: sanitizer-x86_64-linux-bootstrap/builds/3399 (Dinesh Dwivedi, 2014-05-15; 1 file, -54/+0)
    llvm-svn: 208852
* Added instcombine for 'MIN(MIN(A, 27), 93)' and 'MAX(MAX(A, 93), 27)' (Dinesh Dwivedi, 2014-05-15; 1 file, -0/+48)
        MIN(MIN(A, 23), 97) -> MIN(A, 23)
        MAX(MAX(A, 97), 23) -> MAX(A, 97)
    Differential Revision: http://reviews.llvm.org/D3629
    llvm-svn: 208849
* Added instcombine transforms for single bit tests from Chris's note (Dinesh Dwivedi, 2014-05-15; 1 file, -0/+54)
        if ((x & C) == 0) x |= C     becomes  x |= C
        if ((x & C) != 0) x ^= C     becomes  x &= ~C
        if ((x & C) == 0) x ^= C     becomes  x |= C
        if ((x & C) != 0) x &= ~C    becomes  x &= ~C
        if ((x & C) == 0) x &= ~C    becomes  nothing
    Z3 verification code for the above transforms:
    http://rise4fun.com/Z3/Pmsh
    Differential Revision: http://reviews.llvm.org/D3717
    llvm-svn: 208848
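    A hypothetical IR sketch of the first row with C = 8: the select
    always produces x with bit 3 set, so it simplifies to a plain or:

        define i32 @set_if_clear(i32 %x) {
          ; if ((x & 8) == 0) x |= 8
          %and = and i32 %x, 8
          %cmp = icmp eq i32 %and, 0
          %or = or i32 %x, 8
          %r = select i1 %cmp, i32 %or, i32 %x
          ret i32 %r  ; becomes: or i32 %x, 8
        }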
* InstCombine: Optimize -x s< cst (David Majnemer, 2014-05-15; 1 file, -0/+9)
    Summary: This gets rid of a sub instruction by moving the negation to
    the constant when valid.
    Reviewers: nicholas
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D3773
    llvm-svn: 208827
* InstSimplify: Optimize signed icmp of -(zext V) (David Majnemer, 2014-05-14; 1 file, -0/+60)
    Summary: We know that -(zext V) will always be <= zero; simplify
    signed icmps that have these.
    Uncovered using http://www.cs.utah.edu/~regehr/souper/
    Reviewers: nicholas
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D3754
    llvm-svn: 208809
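    A minimal hypothetical example: the negated zext lies in [-255, 0]
    here, so the signed comparison simplifies to false:

        define i1 @neg_zext(i8 %v) {
          %z = zext i8 %v to i32
          %n = sub i32 0, %z
          %c = icmp sgt i32 %n, 0  ; never true
          ret i1 %c
        }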
* Fix the case when reordering shuffle and binop produces a constant (Serge Pavlov, 2014-05-14; 1 file, -0/+11)
    This resolves PR19737.
    llvm-svn: 208762
* Optimize integral reciprocal (udiv 1, x and sdiv 1, x) to not use division (Nick Lewycky, 2014-05-14; 1 file, -0/+19)
    This fires exactly once in a clang bootstrap, but covers a few
    different results from http://www.cs.utah.edu/~regehr/souper/
    llvm-svn: 208750
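    A hypothetical sketch of the unsigned case: 'udiv 1, %x' is 1 only
    when %x is 1, so it can be rewritten without a division:

        define i32 @recip_u(i32 %x) {
          %r = udiv i32 1, %x
          ret i32 %r
          ; can become:
          ;   %c = icmp eq i32 %x, 1
          ;   %r = zext i1 %c to i32
        }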
* Fix type of shuffle resulting from shuffle merge (Serge Pavlov, 2014-05-13; 1 file, -0/+8)
    This fix resolves PR19730.
    llvm-svn: 208666
* Convert test to FileCheck (Rafael Espindola, 2014-05-13; 1 file, -1/+8)
    llvm-svn: 208658
* Convert test to FileCheck (Rafael Espindola, 2014-05-13; 1 file, -2/+12)
    llvm-svn: 208644
* [Test] Trim unnecessary .c and .cpp from config.suffix in lit.local.cfg (Adam Nemet, 2014-05-12; 2 files, -2/+2)
    Tested by comparing 'make check VERBOSE=1' before and after to make
    sure no tests are missed. (VERBOSE=1 prints the list of tests.)
    Only one test :( remains where .cpp is required:
        tools/llvm-cov/range_based_for.cpp:
        // RUN: llvm-cov range_based_for.cpp | FileCheck %s --check-prefix=STDOUT
    The topic was discussed in this thread:
    http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140428/214905.html
    llvm-svn: 208621
* Fix type of shuffle obtained from reordering with binary operation (Serge Pavlov, 2014-05-12; 1 file, -0/+11)
    In the transformation
        BinOp(shuffle(v1, undef), shuffle(v2, undef))
            -> shuffle(BinOp(v1, v2), undef)
    the type of the undef argument must be the same as the type of the
    BinOp.
    llvm-svn: 208531
* Fix reordering of shuffles and binary operations (Serge Pavlov, 2014-05-12; 1 file, -0/+12)
    Do not apply the transformation
        BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2))
    if operands v1 and v2 are of different size. This change fixes
    PR19717, which was caused by r208488.
    llvm-svn: 208518
* Reorder shuffle and binary operation (Serge Pavlov, 2014-05-11; 2 files, -12/+125)
    This patch enables the transformations
        BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2))
        BinOp(shuffle(v1), const1)      -> shuffle(BinOp(v1, const2))
    which eliminate extra shuffles in some cases.
    Differential Revision: http://reviews.llvm.org/D3525
    llvm-svn: 208488
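    A hypothetical sketch of the first transformation: both operands are
    permuted by the same mask, so the add can be done first and shuffled
    once:

        define <4 x i32> @reorder(<4 x i32> %v1, <4 x i32> %v2) {
          %s1 = shufflevector <4 x i32> %v1, <4 x i32> undef,
                              <4 x i32> <i32 3, i32 2, i32 1, i32 0>
          %s2 = shufflevector <4 x i32> %v2, <4 x i32> undef,
                              <4 x i32> <i32 3, i32 2, i32 1, i32 0>
          %r = add <4 x i32> %s1, %s2
          ret <4 x i32> %r
          ; becomes: %a = add <4 x i32> %v1, %v2, followed by a single
          ; shufflevector of %a with the same mask
        }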
* SLPVectorizer: When sorting by domination for CSE, don't assert on unreachable code (Benjamin Kramer, 2014-05-09; 1 file, -0/+30)
    There is no total ordering if the CFG is disconnected. We don't care
    whether we catch all CSE opportunities in dead code either, so just
    ignore unreachable blocks in the assert.
    PR19646
    llvm-svn: 208461
* Add ExtractValue instruction to SimplifyCFG's ComputeSpeculationCost (Louis Gerbarg, 2014-05-09; 1 file, -0/+22)
    Since ExtractValue is not included in ComputeSpeculationCost, CFGs
    containing ExtractValueInsts cannot be simplified. In particular, this
    interacts with InstCombineCompare's tendency to insert
    add.with.overflow intrinsics for certain idiomatic math operations,
    preventing optimization. This patch adds ExtractValue to
    ComputeSpeculationCost. Test case included.
    rdar://14853450
    llvm-svn: 208434
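    A hypothetical sketch (names invented) of the kind of CFG involved:
    flattening the conditional block requires ComputeSpeculationCost to be
    able to price the extractvalue:

        declare { i32, i1 } @llvm.sadd.with.overflow.i32(i32, i32)

        define i32 @maybe_add(i32 %a, i32 %b, i1 %cond) {
        entry:
          br i1 %cond, label %then, label %end
        then:
          %s = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
          %v = extractvalue { i32, i1 } %s, 0
          br label %end
        end:
          %r = phi i32 [ %v, %then ], [ 0, %entry ]
          ret i32 %r
        }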
* [InstCombine] Some cleanup in optimization of redundant insertvalue instructions (Michael Zolotukhin, 2014-05-08; 1 file, -0/+11)
    And one more test added.
    llvm-svn: 208355
* Revert test commit. Removed blank line. (Dario Domizioli, 2014-05-08; 1 file, -1/+0)
    llvm-svn: 208308
* Test commit. Added blank line. (Dario Domizioli, 2014-05-08; 1 file, -0/+1)
    llvm-svn: 208298
* Move late partial-unrolling thresholds into the processor definitions (Hal Finkel, 2014-05-08; 2 files, -12/+12)
    The old method used by X86TTI to determine partial-unrolling
    thresholds was messy (because it worked by testing target features),
    and also would not correctly identify the target CPU if certain target
    features were disabled. After some discussions on IRC with Chandler et
    al., it was decided that the processor scheduling models were the
    right containers for this information (because it is often tied to
    special uop dispatch-buffer sizes).

    This does represent a small functionality change:
    - For generic x86-64 (which uses the SB model and, thus, will get some
      unrolling).
    - For AMD cores (because they still currently use the SB scheduling
      model).
    - For Haswell (based on benchmarking by Louis Gerbarg, it was decided
      to bump the default threshold to 50; we're working on a test case
      for this).
    Otherwise, nothing has changed for any other targets. The logic,
    however, has been moved into BasicTTI, so other targets may now also
    opt in to this functionality simply by setting LoopMicroOpBufferSize
    in their processor model definitions.
    llvm-svn: 208289
* IR: Don't allow non-default visibility on local linkage (Duncan P. N. Exon Smith, 2014-05-07; 5 files, -20/+20)
    Visibilities of 'hidden' and 'protected' are meaningless for symbols
    with local linkage.
    - Change the assembler to reject non-default visibility on symbols
      with local linkage.
    - Change the bitcode reader to auto-upgrade 'hidden' and 'protected'
      to 'default' when the linkage is local.
    - Update LangRef.
    <rdar://problem/16141113>
    llvm-svn: 208263
* [InstCombine] Add optimization of redundant insertvalue instructions (Michael Zolotukhin, 2014-05-07; 1 file, -0/+25)
    rdar://problem/11861387
    llvm-svn: 208214
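    A minimal hypothetical example of a redundant insertvalue: the first
    insert is dead because the second writes the same index:

        define { i32, i32 } @redundant({ i32, i32 } %agg, i32 %a, i32 %b) {
          %t = insertvalue { i32, i32 } %agg, i32 %a, 0
          %r = insertvalue { i32, i32 } %t, i32 %b, 0
          ret { i32, i32 } %r  ; simplifies to a single insertvalue of %b
        }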
* Improve 'tail' call marking in TRE (Nick Lewycky, 2014-05-05; 1 file, -0/+23)
    A bootstrap of clang goes from 375k calls marked tail in the IR to
    470k; however, this improvement does not carry over into an
    improvement of the call/jmp ratio on x86. The most common pattern is a
    tail call + br to a block with nothing but a 'ret'. The number of tail
    call to loop conversions remains the same (1618 by my count).

    The new algorithm does a local scan over the use-def chains to
    identify local "alloca-derived" values, as well as points where the
    alloca could escape. Then, a visit over the CFG marks blocks as being
    before or after the allocas have escaped, and annotates the calls
    accordingly.
    llvm-svn: 208017
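    A minimal hypothetical illustration of the alloca-derived analysis:
    %local never escapes into the call, so the call can be marked 'tail':

        define i32 @caller() {
          %local = alloca i32
          store i32 1, i32* %local
          %v = load i32* %local
          ; @g is never given a pointer derived from %local, so this
          ; call cannot read or write the caller's frame
          %r = tail call i32 @g(i32 %v)
          ret i32 %r
        }

        declare i32 @g(i32)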
* Move test from r207969 to another folder and rename it (Michael Zolotukhin, 2014-05-05; 1 file, -25/+0)
    llvm-svn: 207984