summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/InstCombine
Commit message (Collapse)AuthorAgeFilesLines
* Revert 209903 and 210040.Rafael Espindola2014-06-071-16/+0
| | | | | | | | | | | | The messages were "PR19753: Optimize comparisons with "ashr exact" of a constanst." "Added support to optimize comparisons with "lshr exact" of a constant." They were not correctly handling signed/unsigned operation differences, causing pr19958. llvm-svn: 210393
* InstCombine: Canonicalize addrspacecast between different element typesJingyue Wu2014-06-063-4/+96
| | | | | | | | | | | | | | | | addrspacecast X addrspace(M)* to Y addrspace(N)* --> bitcast X addrspace(M)* to Y addrspace(M)* addrspacecast Y addrspace(M)* to Y addrspace(N)* Updat all affected tests and add several new tests in addrspacecast.ll. This patch is based on http://reviews.llvm.org/D2186 (authored by Matt Arsenault) with fixes and more tests. llvm-svn: 210375
* Added select flavour for ABS and NEG(ABS)Dinesh Dwivedi2014-06-061-0/+481
| | | | | | | | | | | | | | | | This patch can identify ABS(X) ==> (X >s 0) ? X : -X and (X >s -1) ? X : -X ABS(X) ==> (X <s 0) ? -X : X and (X <s 1) ? -X : X NABS(X) ==> (X >s 0) ? -X : X and (X >s -1) ? -X : X NABS(X) ==> (X <s 0) ? X : -X and (X <s 1) ? X : -X and can transform ABS(ABS(X)) -> ABS(X) NABS(NABS(X)) -> NABS(X) Differential Revision: http://reviews.llvm.org/D3658 llvm-svn: 210312
* Allow aliases to be unnamed_addr.Rafael Espindola2014-06-061-1/+1
| | | | | | | | | | | | | | | | | | Alias with unnamed_addr were in a strange state. It is stored in GlobalValue, the language reference talks about "unnamed_addr aliases" but the verifier was rejecting them. It seems natural to allow unnamed_addr in aliases: * It is a property of how it is accessed, not of the data itself. * It is perfectly possible to write code that depends on the address of an alias. This patch then makes unname_addr legal for aliases. One side effect is that the syntax changes for a corner case: In globals, unnamed_addr is now printed before the address space. llvm-svn: 210302
* Add a testcase where there is an overflow when combining two constants.Rafael Espindola2014-06-051-0/+10
| | | | | | I noticed that a proposed optimization would have prevented this. llvm-svn: 210287
* InstCombine: Improvement to check if signed addition overflows.Rafael Espindola2014-06-041-0/+58
| | | | | | | | | | | | | | | | | | This patch implements two things: 1. If we know one number is positive and another is negative, we return true as signed addition of two opposite signed numbers will never overflow. 2. Implemented TODO : If one of the operands only has one non-zero bit, and if the other operand has a known-zero bit in a more significant place than it (not including the sign bit) the ripple may go up to and fill the zero, but won't change the sign. e.x - (x & ~4) + 1 We make sure that we are ignoring 0 at MSB. Patch by Suyog Sarda. llvm-svn: 210186
* Allow alias to point to an arbitrary ConstantExpr.Rafael Espindola2014-06-031-12/+12
| | | | | | | | | | | | | | | | | | | | | This patch changes GlobalAlias to point to an arbitrary ConstantExpr and it is up to MC (or the system assembler) to decide if that expression is valid or not. This reduces our ability to diagnose invalid uses and how early we can spot them, but it also lets us do things like @test5 = alias inttoptr(i32 sub (i32 ptrtoint (i32* @test2 to i32), i32 ptrtoint (i32* @bar to i32)) to i32*) An important implication of this patch is that the notion of aliased global doesn't exist any more. The alias has to encode the information needed to access it in its metadata (linkage, visibility, type, etc). Another consequence to notice is that getSection has to return a "const char *". It could return a NullTerminatedStringRef if there was such a thing, but when that was proposed the decision was to just uses "const char*" for that. llvm-svn: 210062
* Add back commit r210029.Rafael Espindola2014-06-026-12/+12
| | | | | | | | The code was actually correct. Sorry for the confusion. I have expanded the comment saying why the analysis is valid to avoid me misunderstaning it again in the future. llvm-svn: 210052
* Convert test to FileCheck.Rafael Espindola2014-06-021-4/+6
| | | | llvm-svn: 210049
* Revert "Add the nsw flag when we detect that an add will not signed overflow."Rafael Espindola2014-06-026-12/+12
| | | | | | | | | This reverts commit r210029. It was not correctly handling cases where LHS and RHS had multiple but different sign bits. llvm-svn: 210048
* Added support to optimize comparisons with "lshr exact" of a constant.Rafael Espindola2014-06-021-0/+8
| | | | | | Patch by Rahul Jain. llvm-svn: 210040
* Add the nsw flag when we detect that an add will not signed overflow.Rafael Espindola2014-06-026-12/+12
| | | | | | | We already had a function for checking this, we were just using it only in specialized cases. llvm-svn: 210029
* Added inst combine tarnsform for (1 << X) & C pattrens where C is (some ↵Dinesh Dwivedi2014-06-021-0/+17
| | | | | | | | | | | | PowerOf2 - 1) This patch can handles following cases from http://nondot.org/sabre/LLVMNotes/InstCombine.txt "((1 << X) & 7) == 0" ==> "X > 2" "((1 << X) & 7) != 0" ==> "X < 3". Differential Revision: http://reviews.llvm.org/D3678 llvm-svn: 210007
* Added inst combine transforms for single bit tests from Chris's noteDinesh Dwivedi2014-06-021-0/+105
| | | | | | | | | | | | if ((x & C) == 0) x |= C becomes x |= C if ((x & C) != 0) x ^= C becomes x &= ~C if ((x & C) == 0) x ^= C becomes x |= C if ((x & C) != 0) x &= ~C becomes x &= ~C if ((x & C) == 0) x &= ~C becomes nothing Differential Revision: http://reviews.llvm.org/D3777 llvm-svn: 210006
* PR19753: Optimize comparisons with "ashr exact" of a constanst.Rafael Espindola2014-05-301-0/+8
| | | | | | Patch by suyog sarda. llvm-svn: 209903
* Add support for combining GEPs across PHI nodesLouis Gerbarg2014-05-291-0/+56
| | | | | | | | | | | | | | | | | | | | | | | | Currently LLVM will generally merge GEPs. This allows backends to use more complex addressing modes. In some cases this is not happening because there is PHI inbetween the two GEPs: GEP1--\ |-->PHI1-->GEP3 GEP2--/ This patch checks to see if GEP1 and GEP2 are similiar enough that they can be cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123): GEP1--\ --\ --\ |-->PHI1-->GEP3 ==> |-->PHI2->GEP12->GEP3 == > |-->PHI2->GEP123 GEP2--/ --/ --/ This also breaks certain use chains that are preventing GEP->GEP merges that the the existing instcombine would merge otherwise. Tests included. llvm-svn: 209843
* Revert "Revert "Revert "InstCombine: Improvement to check if signed addition ↵Rafael Espindola2014-05-291-56/+0
| | | | | | | | | | overflows.""" This reverts commit r209776. It was miscompiling llvm::SelectionDAGISel::MorphNode. llvm-svn: 209817
* Revert "Revert "InstCombine: Improvement to check if signed addition ↵Rafael Espindola2014-05-281-0/+56
| | | | | | | | overflows."" This reverts commit r209762, bringing back r209746. It was not responsible for the libc++ build failure llvm-svn: 209776
* Revert "Add support for combining GEPs across PHI nodes"Rafael Espindola2014-05-281-56/+0
| | | | | | | | This reverts commit r209755. it was the real cause of the libc++ build failure. llvm-svn: 209775
* Revert "InstCombine: Improvement to check if signed addition overflows."Rafael Espindola2014-05-281-56/+0
| | | | | | | | | This reverts commit r209746. It looks it is causing a crash while building libcxx. I am trying to get a reduced testcase. llvm-svn: 209762
* Add support for combining GEPs across PHI nodesLouis Gerbarg2014-05-281-0/+56
| | | | | | | | | | | | | | | | | | | | | | | | Currently LLVM will generally merge GEPs. This allows backends to use more complex addressing modes. In some cases this is not happening because there is PHI inbetween the two GEPs: GEP1--\ |-->PHI1-->GEP3 GEP2--/ This patch checks to see if GEP1 and GEP2 are similiar enough that they can be cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123): GEP1--\ --\ --\ |-->PHI1-->GEP3 ==> |-->PHI2->GEP12->GEP3 == > |-->PHI2->GEP123 GEP2--/ --/ --/ This also breaks certain use chains that are preventing GEP->GEP merges that the the existing instcombine would merge otherwise. Tests included. llvm-svn: 209755
* InstCombine: Improvement to check if signed addition overflows.Rafael Espindola2014-05-281-0/+56
| | | | | | | | | | | | | | | | | | This patch implements two things: 1. If we know one number is positive and another is negative, we return true as signed addition of two opposite signed numbers will never overflow. 2. Implemented TODO : If one of the operands only has one non-zero bit, and if the other operand has a known-zero bit in a more significant place than it (not including the sign bit) the ripple may go up to and fill the zero, but won't change the sign. e.x - (x & ~4) + 1 We make sure that we are ignoring 0 at MSB. Patch by Suyog Sarda. llvm-svn: 209746
* Post-commit fixes for r209643Filipe Cabecinhas2014-05-271-7/+6
| | | | | | | | | | Detected by Daniel Jasper, Ilia Filippov, and Andrea Di Biagio Fixed the argument order to select (the mask semantics to blendv* are the inverse of select) and fixed the tests Added parenthesis to the assert condition Ran clang-format llvm-svn: 209667
* Convert some X86 blendv* intrinsics into IR.Filipe Cabecinhas2014-05-271-0/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Implemented an InstCombine transformation that takes a blendv* intrinsic call and translates it into an IR select, if the mask is constant. This will eventually get lowered into blends with immediates if possible, or pblendvb (with an option to further optimize if we can transform the pblendvb into a blend+immediate instruction, depending on the selector). It will also enable optimizations by the IR passes, which give up on sight of the intrinsic. Both the transformation and the lowering of its result to asm got shiny new tests. The transformation is a bit convoluted because of blendvp[sd]'s definition: Its mask is a floating point value! This forces us to convert it and get the highest bit. I suppose this happened because the mask has type __m128 in Intel's intrinsic and v4sf (for blendps) in gcc's builtin. I will send an email to llvm-dev to discuss if we want to change this or not. Reviewers: grosbach, delena, nadav Differential Revision: http://reviews.llvm.org/D3859 llvm-svn: 209643
* AArch64/ARM64: move ARM64 into AArch64's placeTim Northover2014-05-241-10/+10
| | | | | | | | | | | | | | | This commit starts with a "git mv ARM64 AArch64" and continues out from there, renaming the C++ classes, intrinsics, and other target-local objects for consistency. "ARM64" test directories are also moved, and tests that began their life in ARM64 use an arm64 triple, those from AArch64 use an aarch64 triple. Both should be equivalent though. This finishes the AArch64 merge, and everyone should feel free to continue committing as normal now. llvm-svn: 209577
* Added inst-combine for 'MIN(MIN(A, 97), 23)' and 'MAX(MAX(A, 23), 97)'Dinesh Dwivedi2014-05-191-0/+52
| | | | | | | | | | | This removes TODO added in r208849 [http://reviews.llvm.org/D3629] MIN(MIN(A, 97), 23) -> MIN(A, 23) MAX(MAX(A, 23), 97) -> MAX(A, 97) Differential Revision: http://reviews.llvm.org/D3785 llvm-svn: 209110
* Revert r209049 and r209065, "Add support for combining GEPs across PHI nodes"NAKAMURA Takumi2014-05-171-56/+0
| | | | | | It broke clang selfhosting even after r209065. llvm-svn: 209067
* Add support for combining GEPs across PHI nodesLouis Gerbarg2014-05-161-0/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently LLVM will generally merge GEPs. This allows backends to use more complex addressing modes. In some cases this is not happening because there is PHI inbetween the two GEPs: GEP1--\ |-->PHI1-->GEP3 GEP2--/ This patch checks to see if GEP1 and GEP2 are similiar enough that they can be cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123): GEP1--\ --\ --\ |-->PHI1-->GEP3 ==> |-->PHI2->GEP12->GEP3 == > |-->PHI2->GEP123 GEP2--/ --/ --/ This also breaks certain use chains that are preventing GEP->GEP merges that the the existing instcombine would merge otherwise. Tests included. rdar://15547484 llvm-svn: 209049
* Fix most of PR10367.Rafael Espindola2014-05-161-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes the design of GlobalAlias so that it doesn't take a ConstantExpr anymore. It now points directly to a GlobalObject, but its type is independent of the aliasee type. To avoid changing all alias related tests in this patches, I kept the common syntax @foo = alias i32* @bar to mean the same as now. The cases that used to use cast now use the more general syntax @foo = alias i16, i32* @bar. Note that GlobalAlias now behaves a bit more like GlobalVariable. We know that its type is always a pointer, so we omit the '*'. For the bitcode, a nice surprise is that we were writing both identical types already, so the format change is minimal. Auto upgrade is handled by looking through the casts and no new fields are needed for now. New bitcode will simply have different types for Alias and Aliasee. One last interesting point in the patch is that replaceAllUsesWith becomes smart enough to avoid putting a ConstantExpr in the aliasee. This seems better than checking and updating every caller. A followup patch will delete getAliasedGlobal now that it is redundant. Another patch will add support for an explicit offset. llvm-svn: 209007
* Reverting r208848, reason: build failure: ↵Dinesh Dwivedi2014-05-151-54/+0
| | | | | | sanitizer-x86_64-linux-bootstrap/builds/3399 llvm-svn: 208852
* Added instcombine for 'MIN(MIN(A, 27), 93)' and 'MAX(MAX(A, 93), 27)'Dinesh Dwivedi2014-05-151-0/+48
| | | | | | | | | MIN(MIN(A, 23), 97) -> MIN(A, 23) MAX(MAX(A, 97), 23) -> MAX(A, 97) Differential Revision: http://reviews.llvm.org/D3629 llvm-svn: 208849
* Added inst combine transforms for single bit tests from Chris's noteDinesh Dwivedi2014-05-151-0/+54
| | | | | | | | | | | | | | | if ((x & C) == 0) x |= C becomes x |= C if ((x & C) != 0) x ^= C becomes x &= ~C if ((x & C) == 0) x ^= C becomes x |= C if ((x & C) != 0) x &= ~C becomes x &= ~C if ((x & C) == 0) x &= ~C becomes nothing Z3 Verifications code for above transform http://rise4fun.com/Z3/Pmsh Differential Revision: http://reviews.llvm.org/D3717 llvm-svn: 208848
* InstCombine: Optimize -x s< cstDavid Majnemer2014-05-151-0/+9
| | | | | | | | | | | | | | Summary: This gets rid of a sub instruction by moving the negation to the constant when valid. Reviewers: nicholas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3773 llvm-svn: 208827
* Fix the case when reordering shuffle and binop produces a constant.Serge Pavlov2014-05-141-0/+11
| | | | | | This resolves PR19737. llvm-svn: 208762
* Optimize integral reciprocal (udiv 1, x and sdiv 1, x) to not use division. ↵Nick Lewycky2014-05-141-0/+19
| | | | | | This fires exactly once in a clang bootstrap, but covers a few different results from http://www.cs.utah.edu/~regehr/souper/ llvm-svn: 208750
* Fix type of shuffle resulted from shuffle merge.Serge Pavlov2014-05-131-0/+8
| | | | | | This fix resolves PR19730. llvm-svn: 208666
* Fix type of shuffle obtained from reordering with binary operationSerge Pavlov2014-05-121-0/+11
| | | | | | | | In transformation: BinOp(shuffle(v1,undef), shuffle(v2,undef)) -> shuffle(BinOp(v1, v2),undef) type of the undef argument must be same as type of BinOp. llvm-svn: 208531
* Fix reordering of shuffles and binary operationsSerge Pavlov2014-05-121-0/+12
| | | | | | | | | | | | Do not apply transformation: BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2)) if operands v1 and v2 are of different size. This change fixes PR19717, which was caused by r208488. llvm-svn: 208518
* Reorder shuffle and binary operation.Serge Pavlov2014-05-111-1/+119
| | | | | | | | | | | | | This patch enables transformations: BinOp(shuffle(v1), shuffle(v2)) -> shuffle(BinOp(v1, v2)) BinOp(shuffle(v1), const1) -> shuffle(BinOp, const2) They allow to eliminate extra shuffles in some cases. Differential Revision: http://reviews.llvm.org/D3525 llvm-svn: 208488
* [InstCombine] Some cleanup in optimization of redundant insertvalue ↵Michael Zolotukhin2014-05-081-0/+11
| | | | | | | | instructions. And one more test added. llvm-svn: 208355
* [InstCombine] Add optimization of redundant insertvalue instructions.Michael Zolotukhin2014-05-071-0/+25
| | | | | | rdar://problem/11861387 llvm-svn: 208214
* Fold strlen(expr ? "str1" : "str2") to x ? len1 : len2. This fires about 330 ↵Nick Lewycky2014-05-021-0/+12
| | | | | | times in a bootstrap of clang. llvm-svn: 207828
* IR: Conservatively verify inalloca argumentsDavid Majnemer2014-04-301-1/+1
| | | | | | | | | | | | Summary: Try to spot obvious mismatches with inalloca use. Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3572 llvm-svn: 207676
* Also handle ConstantAggregateZero when optimizing vpermilvar*.Rafael Espindola2014-04-291-0/+28
| | | | llvm-svn: 207582
* Two fixes to the vpermilvar optimization.Rafael Espindola2014-04-291-4/+4
| | | | | | | | The instcomine logic to handle vpermilvar's pd and 256 variants was incorrect. The _256 variants have indexes into the individual 128 bit lanes and in all cases it also has to mask out unused bits. llvm-svn: 207577
* InstCombine: don't drop 'inalloca' in PromoteCastOfAllocation (PR19569)Hans Wennborg2014-04-281-0/+21
| | | | llvm-svn: 207426
* [InstCombine][X86] Teach how to fold calls to SSE2/AVX2 packed logical shiftAndrea Di Biagio2014-04-261-2/+103
| | | | | | | | | | right intrinsics. A packed logical shift right with a shift count bigger than or equal to the element size always produces a zero vector. In all other cases, it can be safely replaced by a 'lshr' instruction. llvm-svn: 207299
* [InstCombine][x86] Constant fold psll intrinsics.Michael J. Spencer2014-04-241-0/+110
| | | | | | | | | | | | This excludes avx512 as I don't have hardware to verify. It excludes _dq variants because they are represented in the IR as <{2,4} x i64> when it's actually a byte shift of the entire i{128,265}. This also excludes _dq_bs as they aren't at all supported by the backend. There are also no corresponding instructions in the ISA. I have no idea why they exist... llvm-svn: 207058
* Optimize some special cases for SSE4a insertqiFilipe Cabecinhas2014-04-241-0/+97
| | | | | | | | | | | | | | | | | | | | Summary: Since the upper 64 bits of the destination register are undefined when performing this operation, we can substitute it and let the optimizer figure out that only a copy is needed. Also added range merging, if an instruction copies a range that can be merged with a previous copied range. Added test cases for both optimizations. Reviewers: grosbach, nadav CC: llvm-commits Differential Revision: http://reviews.llvm.org/D3357 llvm-svn: 207055
* Handle addrspacecast when looking at memcpys from globalsMatt Arsenault2014-04-241-4/+63
| | | | llvm-svn: 207054
OpenPOWER on IntegriCloud