summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [SeparateConstOffsetFromGEP] inbounds zext => sext for better splittingJingyue Wu2014-06-082-3/+71
| | | | | | | | | | For each array index that is in the form of zext(a), convert it to sext(a) if we can prove zext(a) <= max signed value of typeof(a). The conversion helps to split zext(x + y) into sext(x) + sext(y). Reviewed in http://reviews.llvm.org/D4060 llvm-svn: 210444
* [SeparateConstOffsetFromGEP] make two tests more strictJingyue Wu2014-06-081-4/+4
| | | | | | | inbounds are not necessary in these two tests. zext(a +nuw b) = zext(a) + zext(b) should hold with or without inbounds. llvm-svn: 210437
* Revert 209903 and 210040.Rafael Espindola2014-06-071-16/+0
| | | | | | | | | | | | The messages were "PR19753: Optimize comparisons with "ashr exact" of a constanst." "Added support to optimize comparisons with "lshr exact" of a constant." They were not correctly handling signed/unsigned operation differences, causing pr19958. llvm-svn: 210393
* InstCombine: Canonicalize addrspacecast between different element typesJingyue Wu2014-06-063-4/+96
| | | | | | | | | | | | | | | | addrspacecast X addrspace(M)* to Y addrspace(N)* --> bitcast X addrspace(M)* to Y addrspace(M)* addrspacecast Y addrspace(M)* to Y addrspace(N)* Updat all affected tests and add several new tests in addrspacecast.ll. This patch is based on http://reviews.llvm.org/D2186 (authored by Matt Arsenault) with fixes and more tests. llvm-svn: 210375
* Fix typo in a test from r210342.Michael Zolotukhin2014-06-061-1/+1
| | | | llvm-svn: 210343
* [SLP] Enable vectorization of GEP expressions.Michael Zolotukhin2014-06-061-0/+41
| | | | | | | | The use cases look like the following: x->a = y->a + 10 x->b = y->b + 12 llvm-svn: 210342
* Added select flavour for ABS and NEG(ABS)Dinesh Dwivedi2014-06-061-0/+481
| | | | | | | | | | | | | | | | This patch can identify ABS(X) ==> (X >s 0) ? X : -X and (X >s -1) ? X : -X ABS(X) ==> (X <s 0) ? -X : X and (X <s 1) ? -X : X NABS(X) ==> (X >s 0) ? -X : X and (X >s -1) ? -X : X NABS(X) ==> (X <s 0) ? X : -X and (X <s 1) ? X : -X and can transform ABS(ABS(X)) -> ABS(X) NABS(NABS(X)) -> NABS(X) Differential Revision: http://reviews.llvm.org/D3658 llvm-svn: 210312
* Fix PR19657 (scalar loads not combined into vector load)Karthik Bhat2014-06-061-0/+73
| | | | | | | | If we have common uses on separate paths in the tree; process the one with greater common depth first. This makes sure that we do not assume we need to extract a load when it is actually going to be part of a vectorized tree. Review: http://reviews.llvm.org/D3800 llvm-svn: 210310
* Allow aliases to be unnamed_addr.Rafael Espindola2014-06-061-1/+1
| | | | | | | | | | | | | | | | | | Alias with unnamed_addr were in a strange state. It is stored in GlobalValue, the language reference talks about "unnamed_addr aliases" but the verifier was rejecting them. It seems natural to allow unnamed_addr in aliases: * It is a property of how it is accessed, not of the data itself. * It is perfectly possible to write code that depends on the address of an alias. This patch then makes unname_addr legal for aliases. One side effect is that the syntax changes for a corner case: In globals, unnamed_addr is now printed before the address space. llvm-svn: 210302
* Fixed several correctness issues in SeparateConstOffsetFromGEPJingyue Wu2014-06-052-55/+193
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most issues are on mishandling s/zext. Fixes: 1. When rebuilding new indices, s/zext should be distributed to sub-expressions. e.g., sext(a +nsw (b +nsw 5)) = sext(a) + sext(b) + 5 but not sext(a + b) + 5. This also affects the logic of recursively looking for a constant offset, we need to include s/zext into the context of the searching. 2. Function find should return the bitwidth of the constant offset instead of always sign-extending it to i64. 3. Stop shortcutting zext'ed GEP indices. LLVM conceptually sign-extends GEP indices to pointer-size before computing the address. Therefore, gep base, zext(a + b) != gep base, a + b Improvements: 1. Add an optimization for splitting sext(a + b): if a + b is proven non-negative (e.g., used as an index of an inbound GEP) and one of a, b is non-negative, sext(a + b) = sext(a) + sext(b) 2. Function Distributable checks whether both sext and zext can be distributed to operands of a binary operator. This helps us split zext(sext(a + b)) to zext(sext(a) + zext(sext(b)) when a + b does not signed or unsigned overflow. Refactoring: Merge some common logic of handling add/sub/or in find. Testing: Add many tests in split-gep.ll and split-gep-and-gvn.ll to verify the changes we made. llvm-svn: 210291
* Add a testcase where there is an overflow when combining two constants.Rafael Espindola2014-06-051-0/+10
| | | | | | I noticed that a proposed optimization would have prevented this. llvm-svn: 210287
* Fix coverage for files with global constructors again. Adds a testcase to ↵Nick Lewycky2014-06-051-0/+58
| | | | | | the commit from r206671, as requested by David Blaikie. llvm-svn: 210239
* Use AArch64 instead of now removed ARM64 in test configsAlexey Samsonov2014-06-051-1/+1
| | | | llvm-svn: 210229
* InstCombine: Improvement to check if signed addition overflows.Rafael Espindola2014-06-041-0/+58
| | | | | | | | | | | | | | | | | | This patch implements two things: 1. If we know one number is positive and another is negative, we return true as signed addition of two opposite signed numbers will never overflow. 2. Implemented TODO : If one of the operands only has one non-zero bit, and if the other operand has a known-zero bit in a more significant place than it (not including the sign bit) the ripple may go up to and fill the zero, but won't change the sign. e.x - (x & ~4) + 1 We make sure that we are ignoring 0 at MSB. Patch by Suyog Sarda. llvm-svn: 210186
* Ignore line numbers on debug intrinsics. Add an assert to ensure that we ↵Nick Lewycky2014-06-031-0/+143
| | | | | | aren't emitting line number zero, the .gcno format uses this to indicate that the next field is a filename. llvm-svn: 210068
* Allow alias to point to an arbitrary ConstantExpr.Rafael Espindola2014-06-034-19/+19
| | | | | | | | | | | | | | | | | | | | | This patch changes GlobalAlias to point to an arbitrary ConstantExpr and it is up to MC (or the system assembler) to decide if that expression is valid or not. This reduces our ability to diagnose invalid uses and how early we can spot them, but it also lets us do things like @test5 = alias inttoptr(i32 sub (i32 ptrtoint (i32* @test2 to i32), i32 ptrtoint (i32* @bar to i32)) to i32*) An important implication of this patch is that the notion of aliased global doesn't exist any more. The alias has to encode the information needed to access it in its metadata (linkage, visibility, type, etc). Another consequence to notice is that getSection has to return a "const char *". It could return a NullTerminatedStringRef if there was such a thing, but when that was proposed the decision was to just uses "const char*" for that. llvm-svn: 210062
* Add back commit r210029.Rafael Espindola2014-06-026-12/+12
| | | | | | | | The code was actually correct. Sorry for the confusion. I have expanded the comment saying why the analysis is valid to avoid me misunderstaning it again in the future. llvm-svn: 210052
* Convert test to FileCheck.Rafael Espindola2014-06-021-4/+6
| | | | llvm-svn: 210049
* Revert "Add the nsw flag when we detect that an add will not signed overflow."Rafael Espindola2014-06-026-12/+12
| | | | | | | | | This reverts commit r210029. It was not correctly handling cases where LHS and RHS had multiple but different sign bits. llvm-svn: 210048
* Added support to optimize comparisons with "lshr exact" of a constant.Rafael Espindola2014-06-021-0/+8
| | | | | | Patch by Rahul Jain. llvm-svn: 210040
* Add the nsw flag when we detect that an add will not signed overflow.Rafael Espindola2014-06-026-12/+12
| | | | | | | We already had a function for checking this, we were just using it only in specialized cases. llvm-svn: 210029
* Added inst combine tarnsform for (1 << X) & C pattrens where C is (some ↵Dinesh Dwivedi2014-06-021-0/+17
| | | | | | | | | | | | PowerOf2 - 1) This patch can handles following cases from http://nondot.org/sabre/LLVMNotes/InstCombine.txt "((1 << X) & 7) == 0" ==> "X > 2" "((1 << X) & 7) != 0" ==> "X < 3". Differential Revision: http://reviews.llvm.org/D3678 llvm-svn: 210007
* Added inst combine transforms for single bit tests from Chris's noteDinesh Dwivedi2014-06-021-0/+105
| | | | | | | | | | | | if ((x & C) == 0) x |= C becomes x |= C if ((x & C) != 0) x ^= C becomes x &= ~C if ((x & C) == 0) x ^= C becomes x |= C if ((x & C) != 0) x &= ~C becomes x &= ~C if ((x & C) == 0) x &= ~C becomes nothing Differential Revision: http://reviews.llvm.org/D3777 llvm-svn: 210006
* [Reassociate] Similar to "X + -X" -> "0", added code to handle "X + ~X" -> "-1".Benjamin Kramer2014-05-311-0/+12
| | | | | | | | | | | | Handle "X + ~X" -> "-1" in the function Value *Reassociate::OptimizeAdd(Instruction *I, SmallVectorImpl<ValueEntry> &Ops); This patch implements: TODO: We could handle "X + ~X" -> "-1" if we wanted, since "-X = ~X+1". Patch by Rahul Jain! Differential Revision: http://reviews.llvm.org/D3835 llvm-svn: 209973
* Make bitcast, extractelement, and insertelement considered cheap for ↵Matt Arsenault2014-05-301-0/+60
| | | | | | | | | | speculation. This helps more branches into selects. On R600, vectors are cheap and anything that helps remove branches is very good. llvm-svn: 209914
* PR19753: Optimize comparisons with "ashr exact" of a constanst.Rafael Espindola2014-05-301-0/+8
| | | | | | Patch by suyog sarda. llvm-svn: 209903
* ARM & AArch64: make use of common cmpxchg idioms after expansionTim Northover2014-05-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | The C and C++ semantics for compare_exchange require it to return a bool indicating success. This gets mapped to LLVM IR which follows each cmpxchg with an icmp of the value loaded against the desired value. When lowered to ldxr/stxr loops, this extra comparison is redundant: its results are implicit in the control-flow of the function. This commit makes two changes: it replaces that icmp with appropriate PHI nodes, and then makes sure earlyCSE is called after expansion to actually make use of the opportunities revealed. I've also added -{arm,aarch64}-enable-atomic-tidy options, so that existing fragile tests aren't perturbed too much by the change. Many of them either rely on undef/unreachable too pervasively to be restored to something well-defined (particularly while making sure they test the same obscure assert from many years ago), or depend on a particular CFG shape, which is disrupted by SimplifyCFG. rdar://problem/16227836 llvm-svn: 209883
* Allow vectorization of intrinsics such as powi,cttz and ctlz in Loop and SLP ↵Karthik Bhat2014-05-302-0/+369
| | | | | | | | | | Vectorizer. This patch adds support to vectorize intrinsics such as powi, cttz and ctlz in Vectorizer. These intrinsics are different from other intrinsics as second argument to these function must be same in order to vectorize them and it should be represented as a scalar. Review: http://reviews.llvm.org/D3851#inline-32769 and http://reviews.llvm.org/D3937#inline-32857 llvm-svn: 209873
* When analyzing params/args for readnone/readonly, don't forget to consider ↵Nick Lewycky2014-05-302-1/+15
| | | | | | that a pointer argument may be passed through a callsite to the return, and that we may need to analyze it. Fixes a bug reported on llvm-dev: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-May/073098.html llvm-svn: 209870
* LoopVectorizer: Add a check that the backedge taken count + 1 does not overflowArnold Schwaighofer2014-05-292-0/+28
| | | | | | | | | | | The loop vectorizer instantiates be-taken-count + 1 as the loop iteration count. If this expression overflows the generated code was invalid. In case of overflow the code now jumps to the scalar loop. Fixes PR17288. llvm-svn: 209854
* Add support for combining GEPs across PHI nodesLouis Gerbarg2014-05-291-0/+56
| | | | | | | | | | | | | | | | | | | | | | | | Currently LLVM will generally merge GEPs. This allows backends to use more complex addressing modes. In some cases this is not happening because there is PHI inbetween the two GEPs: GEP1--\ |-->PHI1-->GEP3 GEP2--/ This patch checks to see if GEP1 and GEP2 are similiar enough that they can be cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123): GEP1--\ --\ --\ |-->PHI1-->GEP3 ==> |-->PHI2->GEP12->GEP3 == > |-->PHI2->GEP123 GEP2--/ --/ --/ This also breaks certain use chains that are preventing GEP->GEP merges that the the existing instcombine would merge otherwise. Tests included. llvm-svn: 209843
* Revert "Revert "Revert "InstCombine: Improvement to check if signed addition ↵Rafael Espindola2014-05-291-56/+0
| | | | | | | | | | overflows.""" This reverts commit r209776. It was miscompiling llvm::SelectionDAGISel::MorphNode. llvm-svn: 209817
* LCSSA should be performed on the outermost affected loop while unrolling loop.Dinesh Dwivedi2014-05-291-0/+43
| | | | | | | | | | During loop-unroll, loop exits from the current loop may end up in in different outer loop. This requires to re-form LCSSA recursively for one level down from the outer most loop where loop exits are landed during unroll. This fixes PR18861. Differential Revision: http://reviews.llvm.org/D2976 llvm-svn: 209796
* Add LoadCombine pass.Michael J. Spencer2014-05-291-0/+190
| | | | | | | | This pass is disabled by default. Use -combine-loads to enable in -O[1-3] Differential revision: http://reviews.llvm.org/D3580 llvm-svn: 209791
* Revert "Revert "InstCombine: Improvement to check if signed addition ↵Rafael Espindola2014-05-281-0/+56
| | | | | | | | overflows."" This reverts commit r209762, bringing back r209746. It was not responsible for the libc++ build failure llvm-svn: 209776
* Revert "Add support for combining GEPs across PHI nodes"Rafael Espindola2014-05-281-56/+0
| | | | | | | | This reverts commit r209755. it was the real cause of the libc++ build failure. llvm-svn: 209775
* Revert "InstCombine: Improvement to check if signed addition overflows."Rafael Espindola2014-05-281-56/+0
| | | | | | | | | This reverts commit r209746. It looks it is causing a crash while building libcxx. I am trying to get a reduced testcase. llvm-svn: 209762
* Add support for combining GEPs across PHI nodesLouis Gerbarg2014-05-281-0/+56
| | | | | | | | | | | | | | | | | | | | | | | | Currently LLVM will generally merge GEPs. This allows backends to use more complex addressing modes. In some cases this is not happening because there is PHI inbetween the two GEPs: GEP1--\ |-->PHI1-->GEP3 GEP2--/ This patch checks to see if GEP1 and GEP2 are similiar enough that they can be cloned (GEP12) in GEP3's BB, allowing GEP->GEP merging (GEP123): GEP1--\ --\ --\ |-->PHI1-->GEP3 ==> |-->PHI2->GEP12->GEP3 == > |-->PHI2->GEP123 GEP2--/ --/ --/ This also breaks certain use chains that are preventing GEP->GEP merges that the the existing instcombine would merge otherwise. Tests included. llvm-svn: 209755
* InstCombine: Improvement to check if signed addition overflows.Rafael Espindola2014-05-281-0/+56
| | | | | | | | | | | | | | | | | | This patch implements two things: 1. If we know one number is positive and another is negative, we return true as signed addition of two opposite signed numbers will never overflow. 2. Implemented TODO : If one of the operands only has one non-zero bit, and if the other operand has a known-zero bit in a more significant place than it (not including the sign bit) the ripple may go up to and fill the zero, but won't change the sign. e.x - (x & ~4) + 1 We make sure that we are ignoring 0 at MSB. Patch by Suyog Sarda. llvm-svn: 209746
* No need for those tests to go thru llvm-as and/or llvm-dis.Arnaud A. de Grandmaison2014-05-273-3/+3
| | | | | | opt can handle them by itself. llvm-svn: 209689
* Fixed a test in r209670Jingyue Wu2014-05-271-2/+1
| | | | | | The test was outdated with r209537. llvm-svn: 209671
* Distribute sext/zext to the operands of and/or/xorJingyue Wu2014-05-271-0/+19
| | | | | | | | | | | | This is an enhancement to SeparateConstOffsetFromGEP. With this patch, we can extract a constant offset from "s/zext and/or/xor A, B". Added a new test @ext_or to verify this enhancement. Refactoring the code, I also extracted some common logic to function Distributable. llvm-svn: 209670
* Post-commit fixes for r209643Filipe Cabecinhas2014-05-271-7/+6
| | | | | | | | | | Detected by Daniel Jasper, Ilia Filippov, and Andrea Di Biagio Fixed the argument order to select (the mask semantics to blendv* are the inverse of select) and fixed the tests Added parenthesis to the assert condition Ran clang-format llvm-svn: 209667
* Convert some X86 blendv* intrinsics into IR.Filipe Cabecinhas2014-05-271-0/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Implemented an InstCombine transformation that takes a blendv* intrinsic call and translates it into an IR select, if the mask is constant. This will eventually get lowered into blends with immediates if possible, or pblendvb (with an option to further optimize if we can transform the pblendvb into a blend+immediate instruction, depending on the selector). It will also enable optimizations by the IR passes, which give up on sight of the intrinsic. Both the transformation and the lowering of its result to asm got shiny new tests. The transformation is a bit convoluted because of blendvp[sd]'s definition: Its mask is a floating point value! This forces us to convert it and get the highest bit. I suppose this happened because the mask has type __m128 in Intel's intrinsic and v4sf (for blendps) in gcc's builtin. I will send an email to llvm-dev to discuss if we want to change this or not. Reviewers: grosbach, delena, nadav Differential Revision: http://reviews.llvm.org/D3859 llvm-svn: 209643
* AArch64/ARM64: move ARM64 into AArch64's placeTim Northover2014-05-2415-20/+14
| | | | | | | | | | | | | | | This commit starts with a "git mv ARM64 AArch64" and continues out from there, renaming the C++ classes, intrinsics, and other target-local objects for consistency. "ARM64" test directories are also moved, and tests that began their life in ARM64 use an arm64 triple, those from AArch64 use an aarch64 triple. Both should be equivalent though. This finishes the AArch64 merge, and everyone should feel free to continue committing as normal now. llvm-svn: 209577
* AArch64/ARM64: remove AArch64 from tree prior to renaming ARM64.Tim Northover2014-05-241-1/+1
| | | | | | | | | | | | | | | | I'm doing this in two phases for a better "git blame" record. This commit removes the previous AArch64 backend and redirects all functionality to ARM64. It also deduplicates test-lines and removes orphaned AArch64 tests. The next step will be "git mv ARM64 AArch64" and rewire most of the tests. Hopefully LLVM is still functional, though it would be even better if no-one ever had to care because the rename happens straight afterwards. llvm-svn: 209576
* Implement sext(C1 + C2*X) --> sext(C1) + sext(C2*X) andMichael Zolotukhin2014-05-241-0/+175
| | | | | | | | | | | sext{C1,+,C2} --> sext(C1) + sext{0,+,C2} transformation in Scalar Evolution. That helps SLP-vectorizer to recognize consecutive loads/stores. <rdar://problem/14860614> llvm-svn: 209568
* Fix broken FileCheck prefixesNico Rieck2014-05-231-1/+1
| | | | llvm-svn: 209538
* Add the extracted constant offset using GEPJingyue Wu2014-05-232-13/+30
| | | | | | | | | | | | | Fixed a TODO in r207783. Add the extracted constant offset using GEP instead of ugly ptrtoint+add+inttoptr. Using GEP simplifies future optimizations and makes IR easier to understand. Updated all affected tests, and added a new test in split-gep.ll to cover a corner case where emitting uglygep is necessary. llvm-svn: 209537
* ScalarEvolution: Fix handling of AddRecs in isKnownPredicateJustin Bogner2014-05-231-0/+30
| | | | | | | | | | | | | ScalarEvolution::isKnownPredicate() can wrongly reduce a comparison when both the LHS and RHS are SCEVAddRecExprs. This checks that both LHS and RHS are guarded in the case when both are SCEVAddRecExprs. The test case is against indvars because I could not find a way to directly test SCEV. Patch by Sanjay Patel! llvm-svn: 209487
OpenPOWER on IntegriCloud