summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/InstCombine
Commit message (Collapse)AuthorAgeFilesLines
* Fix all the remaining lost-fast-math-flags bugs I've been able to find. The ↵Owen Anderson2014-01-203-10/+49
| | | | | | | | most important of these are cases in the generic logic for combining BinaryOperators. This logic hadn't been updated to handle FastMathFlags, and it took me a while to detect it because it doesn't show up in a simple search for CreateFAdd. llvm-svn: 199629
* InstCombine: Modernize a bunch of cast combines.Benjamin Kramer2014-01-191-44/+23
| | | | | | Also make them vector-aware. llvm-svn: 199608
* InstCombine: Hoist 3 copies of AddOne/SubOne into a header.Benjamin Kramer2014-01-194-30/+9
| | | | llvm-svn: 199605
* InstCombine: Replace a hand-rolled version of isKnownToBeAPowerOfTwo with ↵Benjamin Kramer2014-01-191-21/+4
| | | | | | the real thing. llvm-svn: 199604
* InstCombine: Teach most integer add/sub/mul/div combines how to deal with ↵Benjamin Kramer2014-01-192-76/+82
| | | | | | vectors. llvm-svn: 199602
* InstCombine: Refactor fmul/fdiv combines to handle vectors.Benjamin Kramer2014-01-192-65/+78
| | | | llvm-svn: 199598
* Don't refuse to transform constexpr(call(arg, ...)) to call(constexpr(arg), ↵Nick Lewycky2014-01-181-3/+4
| | | | | | ...)) just because the function has multiple return values even if their return types are the same. Patch by Eduard Burtescu! llvm-svn: 199564
* InstCombine: Make the (fmul X, -1.0) -> (fsub -0.0, X) transform handle ↵Benjamin Kramer2014-01-181-6/+4
| | | | | | | | vectors too. PR18532. llvm-svn: 199553
* Fix more instances of dropped fast math flags when optimizing FADD ↵Owen Anderson2014-01-183-7/+33
| | | | | | instructions. All found by inspection (aka grep). llvm-svn: 199528
* Fix two cases where we could lose fast math flags when optimizing FADD ↵Owen Anderson2014-01-161-4/+10
| | | | | | expressions. llvm-svn: 199427
* Fix an instance where we would drop fast math flags when performing an fdiv ↵Owen Anderson2014-01-161-1/+3
| | | | | | to reciprocal multiply transformation. llvm-svn: 199425
* Fix a bug in InstCombine where we failed to preserve fast math flags when ↵Owen Anderson2014-01-161-2/+5
| | | | | | optimizing an FMUL expression. llvm-svn: 199424
* Teach InstCombine that (fmul X, -1.0) can be simplified to (fneg X), which ↵Owen Anderson2014-01-161-0/+10
| | | | | | LLVM expresses as (fsub -0.0, X). llvm-svn: 199420
* Do pointer cast simplifications on addrspacecastMatt Arsenault2014-01-141-1/+1
| | | | llvm-svn: 199254
* Remove a check for an illegal condition.Matt Arsenault2014-01-141-5/+0
| | | | | | Bitcasts can't be between address spaces anymore. llvm-svn: 199253
* Fix a bug about generating undef operand when optimising shuffle vector and ↵Hao Liu2014-01-081-2/+3
| | | | | | insert element in instruction combine. llvm-svn: 198730
* Stay classy (and legal) LLVM. Remove links to 3rd party SMT solver whose ↵Kay Tiong Khoo2013-12-191-4/+2
| | | | | | links may not be permanent. llvm-svn: 197713
* Improved fix for PR17827 (instcombine of shift/and/compare).Kay Tiong Khoo2013-12-191-22/+32
| | | | | | | | | This change fixes the case of arithmetic shift right - do not attempt to fold that case. This change also relaxes the conditions when attempting to fold the logical shift right and shift left cases. No additional IR-level test cases included at this time. See http://llvm.org/bugs/show_bug.cgi?id=17827 for proofs that these are correct transformations. llvm-svn: 197705
* Fix assert with copy from global through addrspacecastMatt Arsenault2013-12-071-3/+3
| | | | llvm-svn: 196638
* Don't use isNullValue to evaluate ConstantExprDuncan P. N. Exon Smith2013-12-061-1/+4
| | | | | | | | ConstantExpr can evaluate to false even when isNullValue gives false. Fixes PR18143. llvm-svn: 196611
* Use local variable for repeated use rather than 'get' method. No functional ↵Kay Tiong Khoo2013-12-021-4/+3
| | | | | | change intended. llvm-svn: 196164
* Move variables to where they are used and give them better names. No ↵Kay Tiong Khoo2013-12-021-6/+8
| | | | | | functional change intended. llvm-svn: 196163
* Rename variables to be consistent (CST -> Cst). No functional change intended.Kay Tiong Khoo2013-12-021-30/+30
| | | | llvm-svn: 196161
* Conservative fix for PR17827 - don't optimize a shift + and + compare ↵Kay Tiong Khoo2013-12-021-4/+12
| | | | | | sequence where the shift is logical unless the comparison is unsigned llvm-svn: 196129
* Rein in overzealous InstCombine of fptrunc(OP(fpextend, fpextend)).Stephen Canon2013-11-281-26/+82
| | | | llvm-svn: 195934
* Apply the InstCombine fptrunc sqrt optimization to llvm.sqrtHal Finkel2013-11-161-6/+11
| | | | | | | | | | | | | | | | | | | InstCombine, in visitFPTrunc, applies the following optimization to sqrt calls: (fptrunc (sqrt (fpext x))) -> (sqrtf x) but does not apply the same optimization to llvm.sqrt. This is a problem because, to enable vectorization, Clang generates llvm.sqrt instead of sqrt in fast-math mode, and because this optimization is being applied to sqrt and not applied to llvm.sqrt, sometimes the fast-math code is slower. This change makes InstCombine apply this optimization to llvm.sqrt as well. This fixes the specific problem in PR17758, although the same underlying issue (optimizations applied to libcalls are not applied to intrinsics) exists for other optimizations in SimplifyLibCalls. llvm-svn: 194935
* InstCombine: fold (A >> C) == (B >> C) --> (A^B) < (1 << C) for constant Cs.Benjamin Kramer2013-11-161-0/+18
| | | | | | This is common in bitfield code. llvm-svn: 194925
* Add instcombine visitor for addrspacecastMatt Arsenault2013-11-152-0/+5
| | | | llvm-svn: 194786
* Update the docs to match the function name.Nadav Rotem2013-11-131-1/+1
| | | | llvm-svn: 194537
* Fold (iszero(A&K1) | iszero(A&K2)) -> (A&(K1|K2)) != (K1|K2) if we know ↵Nadav Rotem2013-11-121-3/+50
| | | | | | that K1 and K2 are 'one-hot' (only one bit is on). llvm-svn: 194525
* Scalarize select vector arguments when extracted.Matt Arsenault2013-11-041-0/+32
| | | | | | | | When the elements are extracted from a select on vectors or a vector select, do the select on the extracted scalars from the input if there is only one use. llvm-svn: 194013
* Remove x86_sse42_crc32_64_8 intrinsic. It has no functional difference from ↵Craig Topper2013-10-151-1/+0
| | | | | | x86_sse42_crc32_32_8 and was not mapped to a clang builtin. I'm not even sure why this form of the instruction is even called out explicitly in the docs. Also add AutoUpgrade support to convert it into the other intrinsic with appropriate trunc and zext. llvm-svn: 192672
* Pull fptrunc's upwards through selects when one of the select's selectands ↵Owen Anderson2013-10-031-0/+13
| | | | | | was a constant. This has a number of benefits, including producing small immediates (easier to materialize, smaller constant pools) as well as being more likely to allow the fptrunc to fuse with a preceding instruction (truncating selects are unusual). llvm-svn: 191929
* Make gep i8* X, -(ptrtoint Y) transform work with address spacesMatt Arsenault2013-10-031-8/+10
| | | | llvm-svn: 191920
* Use right address space size in InstCombineComparesMatt Arsenault2013-09-301-3/+6
| | | | | | | The test's output doesn't change, but this ensures this is actually hit with a different address space. llvm-svn: 191701
* Constant fold ptrtoint + compare with address spacesMatt Arsenault2013-09-301-1/+1
| | | | llvm-svn: 191699
* InstCombine: Replace manual fast math flag copying with the new IRBuilder ↵Benjamin Kramer2013-09-301-22/+20
| | | | | | | | | RAII helper. Defines away the issue where cast<Instruction> would fail because constant folding happened. Also slightly cleaner. llvm-svn: 191674
* Fix a bug in InstCombine where it attempted to cast a Value* to an Instruction*Joey Gouly2013-09-301-2/+2
| | | | | | | | | when it was actually a Constant*. There are quite a few other casts to Instruction that might have the same problem, but this is the only one I have a test case for. llvm-svn: 191668
* Use type helper functionsMatt Arsenault2013-09-272-3/+2
| | | | llvm-svn: 191574
* InstCombine: Only foldSelectICmpAndOr for integer typesJustin Bogner2013-09-271-1/+1
| | | | | | | | | | Currently foldSelectICmpAndOr asserts if the "or" involves a vector containing several of the same power of two. We can easily avoid this by only performing the fold on integer types, like foldSelectICmpAnd does. Fixes <rdar://problem/15012516> llvm-svn: 191552
* Push analysis passes to InstSimplify when they're around anyways.Benjamin Kramer2013-09-241-1/+1
| | | | llvm-svn: 191309
* InstCombine: Remove unused argument. No functionality change.Benjamin Kramer2013-09-202-12/+6
| | | | llvm-svn: 191112
* InstCombine: Canonicalize (gep i8* X, -(ptrtoint Y)) to (sub (ptrtoint X), ↵Benjamin Kramer2013-09-201-0/+14
| | | | | | | | | | (ptrtoint Y)) The GEP pattern is what SCEV expander emits for "ugly geps". The latter is what you get for pointer subtraction in C code. The rest of instcombine already knows how to deal with that so just canonicalize on that. llvm-svn: 191090
* [Fast-math] Disable "(C1/X)*C2 => (C1*C2)/X" if C1/X has multiple uses.Shuxin Yang2013-09-191-3/+6
| | | | | | | | | | | | | | | | | | If "C1/X" were having multiple uses, the only benefit of this transformation is to potentially shorten critical path. But it is at the cost of instroducing additional div. The additional div may or may not incur cost depending on how div is implemented. If it is implemented using Newton–Raphson iteration, it dosen't seem to incur any cost (FIXME). However, if the div blocks the entire pipeline, that sounds to be pretty expensive. Let CodeGen to take care this transformation. This patch sees 6% on a benchmark. rdar://15032743 llvm-svn: 191037
* InstCombine: Don't allow turning vector-of-pointer loads into vector-of-integer.Benjamin Kramer2013-09-191-1/+2
| | | | | | The code below can't handle any pointers. PR17293. llvm-svn: 191036
* Revert the load slicing done in r190870.Quentin Colombet2013-09-171-285/+0
| | | | | | | To avoid regressions with bitfield optimizations, this slicing should take place later, like ISel time. llvm-svn: 190891
* Cleanup handling of constant function casts.Matt Arsenault2013-09-171-24/+8
| | | | | | | | | | Some of this code is no longer necessary since int<->ptr casts are no longer occur as of r187444. This also fixes handling vectors of pointers, and adds a bunch of new testcases for vectors and address spaces. llvm-svn: 190885
* [InstCombiner] Slice a big load in two loads when the elements are next to eachQuentin Colombet2013-09-171-0/+285
| | | | | | | | | | | | | | | | | | | | | | | | | | | other in memory. The motivation was to get rid of truncate and shift right instructions that get in the way of paired load or floating point load. E.g., Consider the following example: struct Complex { float real; float imm; }; When accessing a complex, llvm was generating a 64-bits load and the imm field was obtained by a trunc(lshr) sequence, resulting in poor code generation, at least for x86. The idea is to declare that two load instructions is the canonical form for loading two arithmetic type, which are next to each other in memory. Two scalar loads at a constant offset from each other are pretty easy to detect for the sorts of passes that like to mess with loads. <rdar://problem/14477220> llvm-svn: 190870
* Get rid of unused isPodLike definitions.Eli Friedman2013-09-111-2/+0
| | | | llvm-svn: 190461
* [InstCombiner] Expose opportunities to merge subtract and comparison.Quentin Colombet2013-09-091-1/+46
| | | | | | | | | | | | | | | | | | | | | Several architectures use the same instruction to perform both a comparison and a subtract. The instruction selection framework does not allow to consider different basic blocks to expose such fusion opportunities. Therefore, these instructions are “merged” by CSE at MI IR level. To increase the likelihood of CSE to apply in such situation, we reorder the operands of the comparison, when they have the same complexity, so that they matches the order of the most frequent subtract. E.g., icmp A, B ... sub B, A <rdar://problem/14514580> llvm-svn: 190352
OpenPOWER on IntegriCloud