path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
...
* [SROA] Fold a PHI node if all its incoming values are the sameJingyue Wu2014-08-221-41/+41
| | | | | | | | | | | | | | | | | | | Summary: Fixes PR20425. During slice building, if all of the incoming values of a PHI node are the same, replace the PHI node with the common value. This simplification makes alloca's used by PHI nodes easier to promote. Test Plan: Added three more tests in phi-and-select.ll Reviewers: nlewycky, eliben, meheff, chandlerc Reviewed By: chandlerc Subscribers: zinovy.nis, hfinkel, baldrick, llvm-commits Differential Revision: http://reviews.llvm.org/D4659 llvm-svn: 216299
* InstCombine: Don't unconditionally preserve 'nuw' when shrinking constantsDavid Majnemer2014-08-221-6/+12
| | | | | | | | | | | | Consider: %add = add nuw i32 %a, -16777216 %and = and i32 %add, 255 Regardless of whether or not we demand the sign bit of %add, we cannot replace -16777216 with 2130706432 without also removing 'nuw' from the instruction. llvm-svn: 216273
* InstCombine: sub nsw %x, C -> add nsw %x, -C if C isn't INT_MINDavid Majnemer2014-08-221-1/+4
| | | | | | We can preserve nsw during this transform if -C won't overflow. llvm-svn: 216269
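The INT_MIN exception can be illustrated with a small model of the two forms. This is only a sketch: the helper names (`subIsNsw`, `addIsNsw`, `wrappingNegate`) are hypothetical and are not LLVM APIs. Each helper evaluates the operation in 64 bits to decide whether the 32-bit `nsw` flag would be valid.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

constexpr int32_t kIntMin = std::numeric_limits<int32_t>::min();
constexpr int32_t kIntMax = std::numeric_limits<int32_t>::max();

// True when the exact mathematical value fits in a signed 32-bit integer.
constexpr bool fitsInt32(int64_t v) { return v >= kIntMin && v <= kIntMax; }

// Models "sub nsw x, c": the flag is valid only if x - c does not wrap.
constexpr bool subIsNsw(int32_t x, int32_t c) {
  return fitsInt32(int64_t{x} - int64_t{c});
}

// Models "add nsw x, negC" for an already-negated constant.
constexpr bool addIsNsw(int32_t x, int32_t negC) {
  return fitsInt32(int64_t{x} + int64_t{negC});
}

// Negation as i32 arithmetic performs it: INT_MIN maps to itself.
constexpr int32_t wrappingNegate(int32_t c) {
  return c == kIntMin ? kIntMin : -c;
}

// (hypothetical) spot checks for an ordinary constant and for INT_MIN.
inline void checkSubToAdd() {
  // Ordinary constant: both forms agree on whether nsw is valid.
  assert(subIsNsw(10, 3) == addIsNsw(10, wrappingNegate(3)));
  // C == INT_MIN: the sub does not wrap, but the rewritten add does.
  assert(subIsNsw(-1, kIntMin));
  assert(!addIsNsw(-1, wrappingNegate(kIntMin)));
}
```

With `C == INT_MIN`, `-1 - INT_MIN` is `INT_MAX`, which is representable, while `-1 + INT_MIN` is not, so `nsw` could not be carried over.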
* InstCombine: Don't unconditionally preserve 'nsw' when shrinking constantsDavid Majnemer2014-08-221-0/+8
| | | | | | | | | | | | | | Consider: %add = add nsw i32 %a, -16777216 %and = and i32 %add, 255 Regardless of whether or not we demand the sign bit of %add, we cannot replace -16777216 with 2130706432 without also removing 'nsw' from the instruction. This fixes PR20377. llvm-svn: 216261
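The hazard in this commit can be reproduced with plain integer arithmetic. The sketch below is illustrative only; `lowByteOfSum` and `addIsNsw` are hypothetical helpers, not LLVM functions. It shows that the two constants give the same low byte while differing in whether the addition wraps as a signed value.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

constexpr int32_t kOriginal = -16777216;  // 0xFF000000
constexpr int32_t kShrunk = 2130706432;   // 0x7F000000, differs only in bit 31

// The value of "and (add a, c), 255", computed with wrapping arithmetic.
constexpr uint32_t lowByteOfSum(uint32_t a, int32_t c) {
  return (a + static_cast<uint32_t>(c)) & 255u;
}

// True when "add nsw a, c" would be free of signed wrap.
constexpr bool addIsNsw(int32_t a, int32_t c) {
  return int64_t{a} + int64_t{c} >= std::numeric_limits<int32_t>::min() &&
         int64_t{a} + int64_t{c} <= std::numeric_limits<int32_t>::max();
}

// (hypothetical) spot check with a = 16777216.
inline void checkShrinkHazard() {
  const int32_t a = 16777216;
  // The demanded low byte is the same for both constants...
  assert(lowByteOfSum(static_cast<uint32_t>(a), kOriginal) ==
         lowByteOfSum(static_cast<uint32_t>(a), kShrunk));
  // ...but only the original constant keeps the addition free of signed wrap.
  assert(addIsNsw(a, kOriginal));
  assert(!addIsNsw(a, kShrunk));
}
```

For `a = 16777216` the original sum is 0, while the sum with the shrunk constant is 2147483648, which is outside the signed 32-bit range.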
* fix: SLPVectorizer crashes for unreachable blocks containing not schedulable ↵Erik Eckstein2014-08-221-0/+8
| | | | | | | | | | | | instructions. In unreachable blocks it's legal to have instructions like "%x = op %x". Such instructions are not schedulable. Therefore the SLPVectorizer has to check for unreachable blocks and ignore them. Fixes bug 20646. llvm-svn: 216256
* [dfsan] Fix non-determinism bug in non-zero label check annotator.Peter Collingbourne2014-08-221-10/+8
| | | | | | | We now use a std::vector instead of a DenseSet to store the list of label checks so that we can iterate over it deterministically. llvm-svn: 216255
* SROA: Handle a case of store size being smaller than allocation sizeReid Kleckner2014-08-221-4/+6
| | | | | | | | | | | | | | | | In this case, we are creating an x86_fp80 slice for a union from C where the padding bytes may contain real data. An x86_fp80 alloca is 16 bytes, and that's just fine. We can't, however, use regular loads and stores to access the slice, because the store size is only 10 bytes / 80 bits. Instead, use memcpy and memset. Fixes PR18726. Reviewed By: chandlerc Differential Revision: http://reviews.llvm.org/D5012 llvm-svn: 216248
* Use DILexicalBlockFile, rather than DILexicalBlock, to track discriminator ↵David Blaikie2014-08-211-4/+2
| | | | | | | | | | | | | | | changes to ensure discriminator changes don't introduce new DWARF DW_TAG_lexical_blocks. Somewhat unnoticed in the original implementation of discriminators, but it could cause instructions to end up in new, small, DW_TAG_lexical_blocks due to the use of DILexicalBlock to track discriminator changes. Instead, use DILexicalBlockFile which we already use to track file changes without introducing new scopes, so it works well to track discriminator changes in the same way. llvm-svn: 216239
* Move some logic to populateLTOPassManager.Rafael Espindola2014-08-211-5/+36
| | | | | | | This will avoid code duplication in the next commit which calls it directly from the gold plugin. llvm-svn: 216211
* Respect LibraryInfo in populateLTOPassManager and use it. NFC.Rafael Espindola2014-08-211-0/+4
| | | | llvm-svn: 216203
* Handle inlining in populateLTOPassManager like in populateModulePassManager.Rafael Espindola2014-08-211-5/+13
| | | | | | No functionality change. llvm-svn: 216178
* [CLNUP] Remove return after llvm_unreachable. Thanks to Hal Finkel for pointing this out.Zinovy Nis2014-08-211-1/+0
| | | | llvm-svn: 216176
* Move DisableGVNLoadPRE from populateLTOPassManager to PassManagerBuilder.Rafael Espindola2014-08-211-6/+6
| | | | llvm-svn: 216174
* Reassociate x + -0.1234 * y into x - 0.1234 * yErik Verbruggen2014-08-211-2/+49
| | | | | | | | | | | This does not require -ffast-math, and it gives CSE/GVN more options to eliminate duplicate expressions in, e.g.: return ((x + 0.1234 * y) * (x - 0.1234 * y)); Differential Revision: http://reviews.llvm.org/D4904 llvm-svn: 216169
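A rough way to see why this rewrite needs no fast-math flags: under IEEE 754 round-to-nearest, negating the constant and adding gives the same result as subtracting the positive product. The sketch below assumes IEEE doubles and finite inputs; the function names are hypothetical.

```cpp
#include <cassert>

constexpr double kCoeff = 0.1234;

// The form before reassociation: x + (-0.1234 * y).
inline double addNegatedProduct(double x, double y) {
  return x + (-kCoeff) * y;
}

// The form after reassociation: x - (0.1234 * y).
inline double subtractProduct(double x, double y) {
  return x - kCoeff * y;
}

// (hypothetical) spot check on a few finite inputs.
inline void checkReassociation() {
  assert(addNegatedProduct(1.5, 2.0) == subtractProduct(1.5, 2.0));
  assert(addNegatedProduct(-3.25, 7.0) == subtractProduct(-3.25, 7.0));
}
```

Once both factors share the subexpression `0.1234 * y`, a later CSE/GVN pass has a common term to reuse, which is the benefit the commit message describes.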
* [INDVARS] Extend using of widening of induction variables for the cases of ↵Zinovy Nis2014-08-211-4/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | "sub nsw" and "mul nsw" instructions. Currently only "add nsw" are widened. This patch eliminates tons of "sext" instructions for 64 bit code (and the corresponding target code) in cases like: int N = 100; float **A; void foo(int x0, int x1) { float * A_cur = &A[0][0]; float * A_next = &A[1][0]; for(int x = x0; x < x1; ++x) { // Currently only the [x+N] case is widened. The other 2 cases lead to sext. // This patch fixes it, so all 3 cases do not need sext. const float div = A_cur[x + N] + A_cur[x - N] + A_cur[x * N]; A_next[x] = div; } } ... > clang++ test.cpp -march=core-avx2 -Ofast -fno-unroll-loops -fno-tree-vectorize -S -o - Differential Revision: http://reviews.llvm.org/D4695 llvm-svn: 216160
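The reason widening is legal for `nsw` operations can be sketched in ordinary C++: when the 32-bit operation does not overflow, sign-extending its result equals performing the operation on sign-extended operands. The helper names below are hypothetical and the sketch only models the arithmetic, not the IndVarSimplify pass itself.

```cpp
#include <cassert>
#include <cstdint>

// Narrow form: compute in 32 bits, then sign-extend (the "sext" in the IR).
// Callers must pass values for which the 32-bit operation does not overflow.
constexpr int64_t subNarrow(int32_t x, int32_t n) {
  return static_cast<int64_t>(x - n);
}
constexpr int64_t mulNarrow(int32_t x, int32_t n) {
  return static_cast<int64_t>(x * n);
}

// Wide form: extend the operands once, then compute in 64 bits.
constexpr int64_t subWide(int32_t x, int32_t n) {
  return static_cast<int64_t>(x) - static_cast<int64_t>(n);
}
constexpr int64_t mulWide(int32_t x, int32_t n) {
  return static_cast<int64_t>(x) * static_cast<int64_t>(n);
}

// (hypothetical) spot check with N = 100 as in the commit's example.
inline void checkWidening() {
  assert(subNarrow(7, 100) == subWide(7, 100));
  assert(mulNarrow(7, 100) == mulWide(7, 100));
}
```

Because the wide forms agree with the narrow ones whenever no signed wrap occurs, the induction variable can be kept in 64 bits and the per-iteration sign extensions become unnecessary.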
* Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid ↵Craig Topper2014-08-2122-75/+73
| | | | | | needing to mention the size. llvm-svn: 216158
* InstCombine: Fold ((A | B) & C1) ^ (B & C2) -> (A & C1) ^ B if C1^C2=-1David Majnemer2014-08-212-0/+46
| | | | | | Adapted from a patch by Richard Smith, test-case written by me. llvm-svn: 216157
* [LoopVectorizer] Limit unroll factor in the presence of nested reductions.James Molloy2014-08-201-0/+17
| | | | | | If we have a scalar reduction, we can increase the critical path length if the loop we're unrolling is inside another loop. Limit the unroll factor to 2 by default, so the critical path only gets increased by one reduction operation. llvm-svn: 216140
* New InstCombine pattern: (icmp ult/ule (A + C1), C3) | (icmp ult/ule (A + ↵Yi Jiang2014-08-201-0/+55
| | | | | | C2), C3) to (icmp ult/ule ((A & ~(C1 ^ C2)) + max(C1, C2)), C3) under certain conditions llvm-svn: 216135
* InstCombine: Annotate sub with nuw when we prove it's safeDavid Majnemer2014-08-202-0/+19
| | | | | | | We can prove that a 'sub' can be a 'sub nuw' if the left-hand side is negative and the right-hand side is non-negative. llvm-svn: 216045
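The claim can be checked exhaustively on 8-bit values, where "negative" means the top bit is set. This is a sketch with hypothetical helper names; it models the reasoning rather than the InstCombine code.

```cpp
#include <cassert>
#include <cstdint>

// Treats the byte as a signed i8: negative means the top bit is set.
constexpr bool isNegativeI8(uint8_t v) { return (v & 0x80u) != 0; }

// "sub nuw a, b" is valid exactly when a >= b as unsigned values.
constexpr bool subIsNuw(uint8_t a, uint8_t b) { return a >= b; }

// Checks every i8 pair with a negative LHS and a non-negative RHS.
inline bool nuwHoldsForAllI8Pairs() {
  for (unsigned a = 0; a < 256; ++a) {
    for (unsigned b = 0; b < 256; ++b) {
      const uint8_t lhs = static_cast<uint8_t>(a);
      const uint8_t rhs = static_cast<uint8_t>(b);
      if (isNegativeI8(lhs) && !isNegativeI8(rhs) && !subIsNuw(lhs, rhs))
        return false;
    }
  }
  return true;
}

// (hypothetical) convenience wrapper around the exhaustive check.
inline void checkSubNuw() { assert(nuwHoldsForAllI8Pairs()); }
```

A negative left-hand side is at least 128 when read as unsigned, and a non-negative right-hand side is at most 127, so the unsigned subtraction can never borrow.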
* [dfsan] Treat vararg custom functions like unimplemented functions.Peter Collingbourne2014-08-201-1/+1
| | | | | | | | | Because declarations of these functions can appear in places like autoconf checks, they have to be handled somehow, even though we do not support vararg custom functions. We do so by printing a warning and calling the uninstrumented function, as we do for unimplemented functions. llvm-svn: 216042
* InstCombine: Annotate sub with nsw when we prove it's safeDavid Majnemer2014-08-192-1/+40
| | | | | | | | We can prove that a 'sub' can be a 'sub nsw' under certain conditions: - The sign bits of the operands are the same. - Both operands have more than 1 sign bit. The subtraction cannot be a signed overflow in either case. llvm-svn: 216037
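Both conditions can be checked by brute force on 8-bit values. In the sketch below, "more than 1 sign bit" is modeled as the top two bits agreeing, which for i8 means the value lies in [-64, 63]; that modeling and all helper names are assumptions made for illustration.

```cpp
#include <cassert>

// True when the exact difference is representable as a signed 8-bit value.
constexpr bool fitsI8(int v) { return v >= -128 && v <= 127; }

// Condition 1: both operands have the same sign bit.
constexpr bool sameSign(int a, int b) { return (a < 0) == (b < 0); }

// Condition 2 (per operand): at least two sign bits, i.e. -64 <= v <= 63.
constexpr bool hasTwoSignBits(int v) { return v >= -64 && v <= 63; }

// Checks that either condition rules out signed overflow for every i8 pair.
inline bool nswHoldsForAllI8Pairs() {
  for (int a = -128; a <= 127; ++a) {
    for (int b = -128; b <= 127; ++b) {
      const bool provable =
          sameSign(a, b) || (hasTwoSignBits(a) && hasTwoSignBits(b));
      if (provable && !fitsI8(a - b))
        return false;
    }
  }
  return true;
}

// (hypothetical) convenience wrapper around the exhaustive check.
inline void checkSubNsw() { assert(nswHoldsForAllI8Pairs()); }
```

Pairs that satisfy neither condition, such as 127 and -128, are exactly the ones whose difference can leave the signed range.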
* Revert "Small refactor on VectorizerHint for deduplication"Renato Golin2014-08-191-147/+91
| | | | | | This reverts commit r215994 because MSVC 2012 can't cope with its C++11 goodness. llvm-svn: 215999
* Small refactor on VectorizerHint for deduplicationRenato Golin2014-08-191-91/+147
| | | | | | | | | | | | | | | Previously, the hint mechanism relied on clean up passes to remove redundant metadata, which still showed up if running opt at low levels of optimization. That also has shown that multiple nodes of the same type, but with different values could still coexist, even if temporary, and cause confusion if the next pass got the wrong value. This patch makes sure that, if metadata already exists in a loop, the hint mechanism will never append a new node, but always replace the existing one. It also enhances the algorithm to cope with more metadata types in the future by just adding a new type, not a lot of code. llvm-svn: 215994
* InstCombine: ((A & ~B) ^ (~A & B)) to A ^ BMayur Pandey2014-08-191-0/+10
| | | | | | | | | | | | | Proof using CVC3 follows: $ cat t.cvc A, B : BITVECTOR(32); QUERY BVXOR((A & ~B),(~A & B)) = BVXOR(A,B); $ cvc3 t.cvc Valid. Differential Revision: http://reviews.llvm.org/D4898 llvm-svn: 215974
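As a complement to the CVC3 proof, the identity can also be checked by brute force. Since it is a purely bitwise identity, iterating over all pairs of 8-bit patterns already covers every combination of input bits. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// The pattern matched by the transform: (A & ~B) ^ (~A & B).
constexpr uint32_t andNotXorForm(uint32_t a, uint32_t b) {
  return (a & ~b) ^ (~a & b);
}

// The simplified result: A ^ B.
constexpr uint32_t plainXor(uint32_t a, uint32_t b) { return a ^ b; }

// Compares both forms over all pairs of 8-bit input patterns.
inline bool andNotXorFoldIsValid() {
  for (uint32_t a = 0; a < 256; ++a)
    for (uint32_t b = 0; b < 256; ++b)
      if (andNotXorForm(a, b) != plainXor(a, b))
        return false;
  return true;
}

// (hypothetical) convenience wrapper around the exhaustive check.
inline void checkAndNotXorFold() { assert(andNotXorFoldIsValid()); }
```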
* Const-correct and prevent a copy of a SmallPtrSet.Craig Topper2014-08-191-2/+2
| | | | llvm-svn: 215973
* test commit (spelling correction)Mayur Pandey2014-08-191-1/+1
| | | | llvm-svn: 215970
* Revert "Replace SmallPtrSet with SmallPtrSetImpl in function arguments to ↵Craig Topper2014-08-1822-69/+69
| | | | | | | | avoid needing to mention the size." Getting a weird buildbot failure that I need to investigate. llvm-svn: 215870
* Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid ↵Craig Topper2014-08-1722-69/+69
| | | | | | needing to mention the size. llvm-svn: 215868
* Remove an InstCombine that transformed patterns like (x * uitofp i1 y) to ↵Owen Anderson2014-08-171-30/+0
| | | | | | | | | | | (select y, x, 0.0) when the multiply has fast math flags set. While this might seem like an obvious canonicalization, there is one subtle problem with it. The result of the original expression is undef when x is NaN (remember, fast math flags), but the result of the select is always defined when x is NaN. This means that the new expression is strictly more defined than the original one. One unfortunate consequence of this is that the transform is not reversible! It's always legal to increase the defined-ness of an expression, but it's not legal to reduce it. Thus, targets that prefer the original form of the expression cannot reverse the transform to recover it. Another way to think of it is that the transform has lost source-level information (the fast math flags), which is undesirable. llvm-svn: 215825
* InstCombine: Fix a potential bug in 0 - (X sdiv C) -> (X sdiv -C)David Majnemer2014-08-161-1/+1
| | | | | | | | | | | | | | | While *most* (X sdiv 1) operations will get caught by InstSimplify, it is still possible for a sdiv to appear in the worklist which hasn't been simplified yet. This means that it is possible for 0 - (X sdiv 1) to get transformed into (X sdiv -1); dividing by -1 can make the transform produce undef values instead of the proper result. Sorry for the lack of testcase, it's a bit problematic because it relies on the exact order of operations in the worklist. llvm-svn: 215818
* InstCombine: Combine mul with div.David Majnemer2014-08-161-2/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can combine a mul with a div if one of the operands is a multiple of the other: %mul = mul nsw nuw %a, C1 %ret = udiv %mul, C2 => %ret = mul nsw %a, (C1 / C2) This can expose further optimization opportunities if we end up multiplying or dividing by a power of 2. Consider this small example: define i32 @f(i32 %a) { %mul = mul nuw i32 %a, 14 %div = udiv exact i32 %mul, 7 ret i32 %div } which gets CodeGen'd to: imull $14, %edi, %eax imulq $613566757, %rax, %rcx shrq $32, %rcx subl %ecx, %eax shrl %eax addl %ecx, %eax shrl $2, %eax retq We can now transform this into: define i32 @f(i32 %a) { %shl = shl nuw i32 %a, 1 ret i32 %shl } which gets CodeGen'd to: leal (%rdi,%rdi), %eax retq This fixes PR20681. llvm-svn: 215815
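The role of the `nuw` flag in this fold can be seen with a small arithmetic model of the commit's example (multiply by 14, divide by 7). The helpers are hypothetical; they only mirror the arithmetic and are not the InstCombine implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True when "mul nuw a, c" is valid, i.e. the product fits in 32 bits.
constexpr bool mulIsNuw(uint32_t a, uint32_t c) {
  return uint64_t{a} * uint64_t{c} <= std::numeric_limits<uint32_t>::max();
}

// The original sequence: wrapping 32-bit multiply by 14, then udiv by 7.
constexpr uint32_t mulThenDiv(uint32_t a) { return (a * 14u) / 7u; }

// The folded sequence: a single shift left by one.
constexpr uint32_t shiftOnly(uint32_t a) { return a << 1; }

// (hypothetical) spot checks with and without unsigned wrap.
inline void checkMulDivFold() {
  // No unsigned wrap: both sequences agree.
  assert(mulIsNuw(306783378u, 14u));
  assert(mulThenDiv(306783378u) == shiftOnly(306783378u));
  // With wrap the sequences differ, so the fold relies on nuw.
  assert(!mulIsNuw(0x20000000u, 14u));
  assert(mulThenDiv(0x20000000u) != shiftOnly(0x20000000u));
}
```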
* Introduce a helper to combine instruction metadata.Rafael Espindola2014-08-154-75/+70
| | | | | | | | | Replace the old code in GVN and BBVectorize with it. Update SimplifyCFG to use it. Patch by Björn Steinbrink! llvm-svn: 215723
* Copy noalias metadata from call sites to inlined instructionsHal Finkel2014-08-141-4/+28
| | | | | | | | | | | | | | When a call site with noalias metadata is inlined, that metadata can be propagated directly to the inlined instructions (only those that might access memory because it is not useful on the others). Prior to inlining, the noalias metadata could express that a call would not alias with some other memory access, which implies that no instruction within that called function would alias. By propagating the metadata to the inlined instructions, we preserve that knowledge. This should complete the enhancements requested in PR20500. llvm-svn: 215676
* Add noalias metadata for general calls (not just memory intrinsics) during ↵Hal Finkel2014-08-141-7/+18
| | | | | | | | | | | | | | | | inlining When preserving noalias function parameter attributes by adding noalias metadata in the inliner, we should do this for general function calls (not just memory intrinsics). The logic is very similar to what already existed (except that we want to add this metadata even for functions taking no relevant parameters). This metadata can be used by ModRef queries in the caller after inlining. This addresses the first part of PR20500. Adding noalias metadata during inlining is still turned off by default. llvm-svn: 215657
* [Reassociation] Add support for reassociation with unsafe algebra.Chad Rosier2014-08-141-81/+228
| | | | | | | Vector instructions are (still) not supported for either integer or floating point. Hopefully, that work will be landed shortly. llvm-svn: 215647
* InstCombine: ((A | ~B) ^ (~A | B)) to A ^ BDavid Majnemer2014-08-141-0/+10
| | | | | | | | | | | | | | | Proof using CVC3 follows: $ cat t.cvc A, B : BITVECTOR(32); QUERY BVXOR((A | ~B),(~A |B)) = BVXOR(A,B); $ cvc3 t.cvc Valid. Patch by Mayur Pandey! Differential Revision: http://reviews.llvm.org/D4883 llvm-svn: 215621
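The same identity can be confirmed without a solver by enumerating input bit patterns; because the expression is bitwise, all pairs of 8-bit values cover every case. The names below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// The pattern matched by the transform: (A | ~B) ^ (~A | B).
constexpr uint32_t orNotXorForm(uint32_t a, uint32_t b) {
  return (a | ~b) ^ (~a | b);
}

// Compares the pattern against A ^ B over all pairs of 8-bit patterns.
inline bool orNotXorFoldIsValid() {
  for (uint32_t a = 0; a < 256; ++a)
    for (uint32_t b = 0; b < 256; ++b)
      if (orNotXorForm(a, b) != (a ^ b))
        return false;
  return true;
}

// (hypothetical) convenience wrapper around the exhaustive check.
inline void checkOrNotXorFold() { assert(orNotXorFoldIsValid()); }
```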
* Added InstCombine Transform for ((B | C) & A) | B -> B | (A & C)David Majnemer2014-08-141-0/+4
| | | | | | | | | | | | Transform ((B | C) & A) | B --> B | (A & C) Z3 Link: http://rise4fun.com/Z3/hP6p Patch by Sonam Kumari! Differential Revision: http://reviews.llvm.org/D4865 llvm-svn: 215619
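A brute-force check of this three-operand identity is cheap, since each output bit depends only on the matching bit of A, B, and C. The sketch below uses hypothetical function names and enumerates 4-bit patterns for each operand.

```cpp
#include <cassert>
#include <cstdint>

// The pattern matched by the transform: ((B | C) & A) | B.
constexpr uint32_t beforeFold(uint32_t a, uint32_t b, uint32_t c) {
  return ((b | c) & a) | b;
}

// The simplified result: B | (A & C).
constexpr uint32_t afterFold(uint32_t a, uint32_t b, uint32_t c) {
  return b | (a & c);
}

// Compares both forms for every combination of 4-bit operands.
inline bool orAndFoldIsValid() {
  for (uint32_t a = 0; a < 16; ++a)
    for (uint32_t b = 0; b < 16; ++b)
      for (uint32_t c = 0; c < 16; ++c)
        if (beforeFold(a, b, c) != afterFold(a, b, c))
          return false;
  return true;
}

// (hypothetical) convenience wrapper around the exhaustive check.
inline void checkOrAndFold() { assert(orAndFoldIsValid()); }
```

The folded form uses one fewer operation because `(B & A) | B` collapses to `B`.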
* utils: Fix segfault in flattencfgJan Vesely2014-08-131-4/+5
| | | | | | | | | | | | | | v2: continue iterating through the rest of the bb use for loop v3: initialize FlattenCFG pass in ScalarOps add test v4: split off initializing flattencfg to a separate patch add comment Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 215574
* Initialize FlattenCFG passJan Vesely2014-08-131-0/+1
| | | | | Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 215573
* Canonicalize header guards into a common format.Benjamin Kramer2014-08-138-22/+22
| | | | | | | | | | Add header guards to files that were missing guards. Remove #endif comments as they don't seem common in LLVM (we can easily add them back if we decide they're useful) Changes made by clang-tidy with minor tweaks. llvm-svn: 215558
* [optnone] Make the optnone attribute effective at suppressing functionChandler Carruth2014-08-131-7/+13
| | | | | | | | | | | | | attribute and function argument attribute synthesizing and propagating. As with the other uses of this attribute, the goal remains a best-effort (no guarantees) attempt to not optimize the function or assume things about the function when optimizing. This is particularly useful for compiler testing, bisecting miscompiles, triaging things, etc. I was hitting specific issues using optnone to isolate test code from a test driver for my fuzz testing, and this is one step of fixing that. llvm-svn: 215538
* Revert r215415 which causes MSan to crash on a great deal of C++ code.Chandler Carruth2014-08-131-10/+0
| | | | | | I've followed up on the original commit as well. llvm-svn: 215532
* InstCombine: Combine (xor (or %a, %b) (xor %a, %b)) to (and %a, %b)Karthik Bhat2014-08-131-0/+12
| | | | | | | | | | | | | Correctness proof of the transform using CVC3- $ cat t.cvc A, B : BITVECTOR(32); QUERY BVXOR(A | B, BVXOR(A,B) ) = A & B; $ cvc3 t.cvc Valid. llvm-svn: 215524
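The CVC3 query above equates the xor-of-or-and-xor pattern with `A & B`. A quick exhaustive check over 8-bit patterns, sketched below with hypothetical names, agrees with that query.

```cpp
#include <cassert>
#include <cstdint>

// The pattern from the query: (A | B) ^ (A ^ B).
constexpr uint32_t xorOfOrAndXor(uint32_t a, uint32_t b) {
  return (a | b) ^ (a ^ b);
}

// Compares the pattern against A & B over all pairs of 8-bit patterns.
inline bool xorOrFoldIsValid() {
  for (uint32_t a = 0; a < 256; ++a)
    for (uint32_t b = 0; b < 256; ++b)
      if (xorOfOrAndXor(a, b) != (a & b))
        return false;
  return true;
}

// (hypothetical) convenience wrapper around the exhaustive check.
inline void checkXorOrFold() { assert(xorOrFoldIsValid()); }
```

Intuitively, `A | B` and `A ^ B` differ only in the bit positions where both inputs are 1, which is exactly `A & B`.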
* Allow bitcast + struct GEP transform to work with addrspacecastMatt Arsenault2014-08-121-3/+20
| | | | llvm-svn: 215467
* msan: Handle musttail callsReid Kleckner2014-08-121-0/+10
| | | | | | | | | | | | | | | | First, avoid calling setTailCall(false) on musttail calls. The function prototypes should be "congruent", so the shadow layout should be exactly the same. Second, avoid inserting instrumentation after a musttail call to propagate the return value shadow. We don't need to propagate the result of a tail call, it should already be in the right place. Reviewed By: eugenis Differential Revision: http://reviews.llvm.org/D4331 llvm-svn: 215415
* Move helper for getting a terminating musttail call to BasicBlockReid Kleckner2014-08-121-30/+5
| | | | | | | | | | | No functional change. To be used in future commits that need to look for such instructions. Reviewed By: rafael Differential Revision: http://reviews.llvm.org/D4504 llvm-svn: 215413
* InstCombine: Combine (add (and %a, %b) (or %a, %b)) to (add %a, %b)David Majnemer2014-08-111-1/+23
| | | | | | | | | | | | | | What follows below is a correctness proof of the transform using CVC3. $ < t.cvc A, B : BITVECTOR(32); QUERY BVPLUS(32, A & B, A | B) = BVPLUS(32, A, B); $ cvc3 < t.cvc Valid. llvm-svn: 215400
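Unlike the purely bitwise folds, this identity involves carries, so the sketch below checks it exhaustively in wrapping 8-bit arithmetic and then spot-checks a 32-bit case that wraps. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// The pattern matched by the transform, in wrapping 32-bit arithmetic.
constexpr uint32_t addOfAndOr(uint32_t a, uint32_t b) {
  return (a & b) + (a | b);
}

// Checks (A & B) + (A | B) == A + B for all i8 pairs, modulo 2^8.
inline bool addAndOrFoldHoldsForI8() {
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 0; b < 256; ++b)
      if (static_cast<uint8_t>((a & b) + (a | b)) !=
          static_cast<uint8_t>(a + b))
        return false;
  return true;
}

// (hypothetical) exhaustive i8 check plus one wrapping 32-bit case.
inline void checkAddAndOrFold() {
  assert(addAndOrFoldHoldsForI8());
  assert(addOfAndOr(0xFFFFFFFFu, 1u) == 0xFFFFFFFFu + 1u);
}
```

The identity holds because `A & B` counts the shared bits once and `A | B` counts every set bit once, so together they contribute each operand's bits exactly as `A + B` does.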
* [LoopVectorizer] Enable support for floating-point subtraction reductionsJames Molloy2014-08-081-1/+2
| | | | llvm-svn: 215200
* GlobalOpt: Optimize in the face of insertvalue/extractvalueDavid Majnemer2014-08-081-0/+11
| | | | | | | | | | GlobalOpt didn't know how to simulate InsertValueInst or ExtractValueInst. Optimizing these is pretty straightforward. N.B. This came up when looking at clang's IRGen for MS ABI member pointers; they are represented as aggregates. llvm-svn: 215184