summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/InstCombine
Commit message (Collapse)AuthorAgeFilesLines
* Avoid combining GEPs that might overflow at runtime.Stuart Hastings2011-05-141-1/+3
| | | | | | | | rdar://problem/9267970 Patch by Julien Lerouge! llvm-svn: 131339
* PR9838: Fix transform introduced in r127064 to not trigger when only one ↵Eli Friedman2011-05-051-1/+1
| | | | | | side of the icmp is an exact shift. llvm-svn: 130954
* Remove unused variable.Duncan Sands2011-05-021-1/+1
| | | | llvm-svn: 130705
* Move some rem transforms out of instcombine and into instsimplify.Duncan Sands2011-05-021-42/+19
| | | | | | | This automagically provides a transform noticed by my super-optimizer as occurring quite often: "rem x, (select cond, x, 1)" -> 0. llvm-svn: 130694
* InstCombine: Turn (zext A) udiv (zext B) into (zext (A udiv B)). Same for ↵Benjamin Kramer2011-04-301-1/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | urem or constant B. This obviously helps a lot if the division would be turned into a libcall (think i64 udiv on i386), but div is also one of the few remaining instructions on modern CPUs that become more expensive when the bitwidth gets bigger. This also helps register pressure on i386 when dividing chars, divb needs two 8-bit parts of a 16 bit register as input where divl uses two registers. int foo(unsigned char a) { return a/10; } int bar(unsigned char a, unsigned char b) { return a/b; } compiles into (x86_64) _foo: imull $205, %edi, %eax shrl $11, %eax ret _bar: movzbl %dil, %eax divb %sil, %al movzbl %al, %eax ret llvm-svn: 130615
* Use SimplifyDemandedBits on div instructions.Benjamin Kramer2011-04-301-0/+4
| | | | | | This folds away silly stuff like (a&255)/1000 -> 0. llvm-svn: 130614
* Balance parentheses.Benjamin Kramer2011-04-291-1/+1
| | | | llvm-svn: 130489
* InstCombine: turn (C1 << A) << C2) into (C1 << C2) << A)Benjamin Kramer2011-04-291-1/+8
| | | | | | Fixes PR9809. llvm-svn: 130485
* We require threse bits to be zero, too.Benjamin Kramer2011-04-281-2/+2
| | | | | | | This shouldn't happen in practice because the icmp would be a constant. Add a check so we don't miscompile code if something goes wrong. llvm-svn: 130446
* Fix a comment.Benjamin Kramer2011-04-281-1/+1
| | | | llvm-svn: 130428
* InstCombine: Merge "(trunc x) == C1 & (and x, CA) == C2" into a single and+icmp.Benjamin Kramer2011-04-281-0/+36
| | | | | | This happens when GVN widens loads. Part of PR6627. llvm-svn: 130405
* Stop trying to have instcombine preserve LCSSA form: this was notDuncan Sands2011-04-273-6/+0
| | | | | | | | | | effective in avoiding recomputation of LCSSA form; the widespread use of instsimplify (which looks through phi nodes) means it was not preserving LCSSA form anyway; and instcombine is no longer scheduled in the middle of the loop passes so this doesn't matter anymore. llvm-svn: 130301
* Transform: "icmp eq (trunc (lshr(X, cst1)), cst" to "icmp (and X, mask), cst"Chris Lattner2011-04-261-0/+25
| | | | | | | | | | when X has multiple uses. This is useful for exposing secondary optimizations, but the X86 backend isn't ready for this when X has a single use. For example, this can disable load folding. This is inching towards resolving PR6627. llvm-svn: 130238
* some random cleanups, no functionality change.Chris Lattner2011-04-261-5/+5
| | | | llvm-svn: 130237
* Rename a misleadingly-named variable.Frits van Bommel2011-04-161-5/+5
| | | | llvm-svn: 129644
* Fix bug when checking phi operands in InstCombiner::visitPHINode(),Jay Foad2011-04-161-1/+1
| | | | | | found by code inspection. llvm-svn: 129641
* Fix a ton of comment typos found by codespell. Patch byChris Lattner2011-04-153-3/+3
| | | | | | Luis Felipe Strano Moraes! llvm-svn: 129558
* Add an instcombine for constructs like a | -(b != c); a select is moreEli Friedman2011-04-141-1/+8
| | | | | | | canonical, and generally leads to better code. Found while looking at an article about saturating arithmetic. llvm-svn: 129545
* Reapply r129401 with patch for clang.Bill Wendling2011-04-131-5/+1
| | | | llvm-svn: 129419
* Revert r129401 for now. Clang is using the old way of doing things.Bill Wendling2011-04-121-1/+5
| | | | llvm-svn: 129403
* Remove the unaligned load intrinsics in favor of using native unaligned loads.Bill Wendling2011-04-121-5/+1
| | | | | | | | | Now that we have a first-class way to represent unaligned loads, the unaligned load intrinsics are superfluous. First part of <rdar://problem/8460511>. llvm-svn: 129401
* Don't include Operator.h from InstrTypes.h.Jay Foad2011-04-111-0/+1
| | | | llvm-svn: 129271
* InstCombine optimizes gep(bitcast(x)) even when the bitcasts casts away addressNadav Rotem2011-04-051-8/+11
| | | | | | | space info. We crash with an assert in this case. This change checks that the address space of the bitcasted pointer is the same as the gep ptr. llvm-svn: 128884
* While SimplifyDemandedBits constant folds this, we can't rely on it here.Benjamin Kramer2011-04-021-2/+7
| | | | | | | | | | It's possible to craft an input that hits the recursion limits in a way that SimplifyDemandedBits doesn't simplify the icmp but ComputeMaskedBits can infer which bits are zero. No test case as it depends on too many other things. Fixes PR9609. llvm-svn: 128777
* Fix comment.Benjamin Kramer2011-04-011-2/+2
| | | | llvm-svn: 128745
* Tweaks to the icmp+sext-to-shifts optimization to address Frits' comments:Benjamin Kramer2011-04-011-6/+6
| | | | | | | | | | - Localize the check if an icmp has one use to a place where we know we're introducing something that's likely more expensive than a sext from i1. - Add an assert to make sure a case that would lead to a miscompilation is folded away earlier. - Fix a typo. llvm-svn: 128744
* Fix build.Benjamin Kramer2011-04-011-1/+2
| | | | llvm-svn: 128733
* InstCombine: Turn icmp + sext into bitwise/integer ops when the input has ↵Benjamin Kramer2011-04-011-0/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | only one unknown bit. int test1(unsigned x) { return (x&8) ? 0 : -1; } int test3(unsigned x) { return (x&8) ? -1 : 0; } before (x86_64): _test1: andl $8, %edi cmpl $1, %edi sbbl %eax, %eax ret _test3: andl $8, %edi cmpl $1, %edi sbbl %eax, %eax notl %eax ret after: _test1: shrl $3, %edi andl $1, %edi leal -1(%rdi), %eax ret _test3: shll $28, %edi movl %edi, %eax sarl $31, %eax ret llvm-svn: 128732
* InstCombine: Move (sext icmp) transforms into their own method. No intended ↵Benjamin Kramer2011-04-012-37/+43
| | | | | | functionality change. llvm-svn: 128731
* Instcombile optimization: extractelement(cast) -> cast(extractelement)Nadav Rotem2011-03-311-1/+9
| | | | llvm-svn: 128683
* InstCombine: APFloat can't perform arithmetic on PPC double doubles, don't ↵Benjamin Kramer2011-03-311-2/+4
| | | | | | | | even try. Thanks Eli! llvm-svn: 128676
* InstCombine: Fix transform to use the swapped predicate.Benjamin Kramer2011-03-311-2/+2
| | | | | | Thanks Frits! llvm-svn: 128628
* InstCombine: fold fcmp (fneg x), (fneg y) -> fcmp x, yBenjamin Kramer2011-03-311-0/+5
| | | | llvm-svn: 128627
* InstCombine: fold fcmp pred (fneg x), C -> fcmp swap(pred) x, -CBenjamin Kramer2011-03-311-0/+8
| | | | llvm-svn: 128626
* InstCombine: Shrink "fcmp (fpext x), C" to "fcmp x, C" if C can be ↵Benjamin Kramer2011-03-311-0/+34
| | | | | | | | losslessly converted to the type of x. Fixes PR9592. llvm-svn: 128625
* InstCombine: fold fcmp (fpext x), (fpext y) -> fcmp x, y.Benjamin Kramer2011-03-311-0/+7
| | | | llvm-svn: 128624
* InstCombine: If the divisor of an fdiv has an exact inverse, turn it into an ↵Benjamin Kramer2011-03-301-0/+12
| | | | | | | | fmul. Fixes PR9587. llvm-svn: 128546
* Remove PHINode::reserveOperandSpace(). Instead, add a parameter toJay Foad2011-03-304-16/+10
| | | | | | PHINode::Create() giving the (known or expected) number of operands. llvm-svn: 128537
* (Almost) always call reserveOperandSpace() on newly created PHINodes.Jay Foad2011-03-302-0/+2
| | | | llvm-svn: 128535
* InstCombine: Add a few missing combines for ANDs and ORs of sign bit tests.Benjamin Kramer2011-03-291-0/+24
| | | | | | | | On x86 we now compile "if (a < 0 && b < 0)" into testl %edi, %esi js IF.THEN llvm-svn: 128496
* Remove tabs I accidentally added.Nick Lewycky2011-03-281-15/+15
| | | | llvm-svn: 128413
* Make more use of PHINode::getNumIncomingValues().Jay Foad2011-03-282-5/+5
| | | | llvm-svn: 128406
* Add some debug output when -instcombine uses RAUW. This can make debug ↵Frits van Bommel2011-03-271-1/+4
| | | | | | output for those cases much clearer since without this it only showed that the original instruction was removed, not what it was replaced with. llvm-svn: 128399
* Teach the transformation that moves binary operators around selects to preserveNick Lewycky2011-03-271-8/+22
| | | | | | the subclass optional data. llvm-svn: 128388
* Use APInt's umul_ov instead of rolling our own overflow detection.Benjamin Kramer2011-03-271-5/+6
| | | | llvm-svn: 128380
* Add a small missed optimization: turn X == C ? X : Y into X == C ? C : Y. ThisNick Lewycky2011-03-271-0/+13
| | | | | | | | | | removes one use of X which helps it pass the many hasOneUse() checks. In my analysis, this turns up very often where X = A >>exact B and that can't be simplified unless X has one use (except by increasing the lifetime of A which is generally a performance loss). llvm-svn: 128373
* Try to not lose variable's debug info during instcombine.Devang Patel2011-03-171-0/+4
| | | | | | | This is done by lowering dbg.declare intrinsic into dbg.value intrinsic. Radar 9143931. llvm-svn: 127834
* If we don't know how long a string is we can't fold an _chk version to theEric Christopher2011-03-151-3/+7
| | | | | | | | normal version. Fixes rdar://9123638 llvm-svn: 127636
* This case is solved by Scalar Replacement of Aggregates (DT) andJin-Gu Kang2011-03-141-25/+3
| | | | | | Early CSE pass so this patch reverts it to original source code. llvm-svn: 127574
* Add comment as following:Jin-Gu Kang2011-03-131-0/+12
| | | | | | | | | | | | | | | | | load and store reference same memory location, the memory location is represented by getelementptr with two uses (load and store) and the getelementptr's base is alloca with single use. At this point, instructions from alloca to store can be removed. (this pattern is generated when bitfield is accessed.) For example, %u = alloca %struct.test, align 4 ; [#uses=1] %0 = getelementptr inbounds %struct.test* %u, i32 0, i32 0;[#uses=2] %1 = load i8* %0, align 4 ; [#uses=1] %2 = and i8 %1, -16 ; [#uses=1] %3 = or i8 %2, 5 ; [#uses=1] store i8 %3, i8* %0, align 4 llvm-svn: 127565
OpenPOWER on IntegriCloud