summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/InstCombine
Commit message (Collapse)AuthorAgeFilesLines
...
* Simplify a vpermil* with constant mask.Rafael Espindola2014-04-211-0/+30
| | | | | | | | With a constant mask a vpermil* is just a shufflevector. This patch implements that simplification. This allows us to produce denser code. It should also allow more folding down the line. llvm-svn: 206801
* Revert "Revert r206045, "Fix shift by constants for vector.""Matt Arsenault2014-04-142-0/+145
| | | | | | Fix cases where the Value itself is used, and not the constant value. llvm-svn: 206214
* Whitespace.NAKAMURA Takumi2014-04-141-8/+8
| | | | llvm-svn: 206154
* Revert r206045, "Fix shift by constants for vector."NAKAMURA Takumi2014-04-141-72/+8
| | | | | | It broke some builders, at least, i686. llvm-svn: 206153
* Recognize test for overflow in integer multiplication.Serge Pavlov2014-04-131-0/+164
| | | | | | | | | | | | | | | | | | If multiplication involves zero-extended arguments and the result is compared as in the patterns: %mul32 = trunc i64 %mul64 to i32 %zext = zext i32 %mul32 to i64 %overflow = icmp ne i64 %mul64, %zext or %overflow = icmp ugt i64 %mul64 , 0xffffffff then the multiplication may be replaced by call to umul.with.overflow. This change fixes PR4917 and PR4918. Differential Revision: http://llvm-reviews.chandlerc.com/D2814 llvm-svn: 206137
* Fix shift by constants for vector.Matt Arsenault2014-04-111-8/+72
| | | | | | ashr <N x iM>, <N x iM> M -> undef llvm-svn: 206045
* Fix PR19270 - type mismatch caused by invalid optimization.Eli Bendersky2014-04-031-1/+16
| | | | | | Patch by Jingyue Wu. llvm-svn: 205547
* ARM64: initial backend importTim Northover2014-03-292-3/+67
| | | | | | | | | | | | This adds a second implementation of the AArch64 architecture to LLVM, accessible in parallel via the "arm64" triple. The plan over the coming weeks & months is to merge the two into a single backend, during which time thorough code review should naturally occur. Everything will be easier with the target in-tree though, hence this commit. llvm-svn: 205090
* Revert "InstCombine: merge constants in both operands of icmp."Erik Verbruggen2014-03-281-63/+0
| | | | | | | | | This reverts commit r204912, and follow-up commit r204948. This introduced a performance regression, and the fix is not completely clear yet. llvm-svn: 205010
* InstCombine: Don't combine constants on unsigned icmpsReid Kleckner2014-03-271-0/+10
| | | | | | | | | Fixes a miscompile introduced in r204912. It would miscompile code like (unsigned)(a + -49) <= 5U. The transform would turn this into (unsigned)a < 55U, which would return true for values in [0, 49], when it should not. llvm-svn: 204948
* InstCombine: merge constants in both operands of icmp.Erik Verbruggen2014-03-271-0/+53
| | | | | | | | | | Transform: icmp X+Cst2, Cst into: icmp X, Cst-Cst2 when Cst-Cst2 does not overflow, and the add has nsw. llvm-svn: 204912
* [InstCombine] Don't fold bitcast into store if it would need addrspacecastRichard Osborne2014-03-251-2/+16
| | | | | | | | | | | | | | | | | | Summary: Previously the code didn't check if the before and after types for the store were pointers to different address spaces. This resulted in instcombine using a bitcast to convert between pointers to different address spaces, causing an assertion due to the invalid cast. It is not be appropriate to use addrspacecast this case because it is not guaranteed to be a no-op cast. Instead bail out and do not do the transformation. CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D3117 llvm-svn: 204733
* Allow constant folding of ceil function whenever feasibleKarthik Bhat2014-03-241-0/+56
| | | | llvm-svn: 204583
* Fix a bug in InstCombine where we would incorrectly attempt to construct aOwen Anderson2014-03-131-0/+12
| | | | | | | bitcast between pointers of two different address spaces if they happened to have the same pointer size. llvm-svn: 203862
* Reject alias to undefined symbols in the verifier.Rafael Espindola2014-03-122-2/+6
| | | | | | | | | | | | | | | On ELF and COFF an alias is just another name for a position in the file. There is no way to refer to a position in another file, so an alias to undefined is meaningless. MachO currently doesn't support aliases. The spec has a N_INDR, which when implemented will have a different set of restrictions. Adding support for it shouldn't be harder than any other IR extension. For now, having the IR represent what is actually possible with current tools makes it easier to fix the design of GlobalAlias. llvm-svn: 203705
* Revert r203488 and r203520.Evan Cheng2014-03-121-23/+0
| | | | llvm-svn: 203687
* For functions with ARM target specific calling convention, when simplify-libcallEvan Cheng2014-03-101-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | optimize a call to a llvm intrinsic to something that invovles a call to a C library call, make sure it sets the right calling convention on the call. e.g. extern double pow(double, double); double t(double x) { return pow(10, x); } Compiles to something like this for AAPCS-VFP: define arm_aapcs_vfpcc double @t(double %x) #0 { entry: %0 = call double @llvm.pow.f64(double 1.000000e+01, double %x) ret double %0 } declare double @llvm.pow.f64(double, double) #1 Simplify libcall (part of instcombine) will turn the above into: define arm_aapcs_vfpcc double @t(double %x) #0 { entry: %__exp10 = call double @__exp10(double %x) #1 ret double %__exp10 } declare double @__exp10(double) The pre-instcombine code works because calls to LLVM builtins are special. Instruction selection will chose the right calling convention for the call. However, the code after instcombine is wrong. The call to __exp10 will use the C calling convention. I can think of 3 options to fix this. 1. Make "C" calling convention just work since the target should know what CC is being used. This doesn't work because each function can use different CC with the "pcs" attribute. 2. Have Clang add the right CC keyword on the calls to LLVM builtin. This will work but it doesn't match the LLVM IR specification which states these are "Standard C Library Intrinsics". 3. Fix simplify libcall so the resulting calls to the C routines will have the proper CC keyword. e.g. %__exp10 = call arm_aapcs_vfpcc double @__exp10(double %x) #1 This works and is the solution I implemented here. Both solutions #2 and #3 would work. After carefully considering the pros and cons, I decided to implement #3 for the following reasons. 1. It doesn't change the "spec" of the intrinsics. 2. It's a self-contained fix. There are a couple of potential downsides. 1. There could be other places in the optimizer that is broken in the same way that's not addressed by this. 2. There could be other calling conventions that need to be propagated by simplify-libcall that's not handled. But for now, this is the fix that I'm most comfortable with. llvm-svn: 203488
* InstCombine: form shuffles from wider range of insert/extractelementsTim Northover2014-03-071-0/+37
| | | | | | | | | | | | Sequences of insertelement/extractelements are sometimes used to build vectorsr; this code tries to put them back together into shuffles, but could only produce a completely uniform shuffle types (<N x T> from two <N x T> sources). This should allow shuffles with different numbers of elements on the input and output sides as well. llvm-svn: 203229
* Allow constant folding of round function whenever feasibleKarthik Bhat2014-03-071-0/+90
| | | | llvm-svn: 203198
* Allow constant folding of copysignKarthik Bhat2014-03-061-0/+49
| | | | llvm-svn: 203076
* Change math intrinsic attributes from readonly to readnone. TheseRaul E. Silvera2014-03-061-2/+2
| | | | | | | | | | are operations that do not access memory but may be sensitive to floating-point environment changes. LLVM does not attempt to model FP environment changes, so this was unnecessarily conservative and was getting on the way of some optimizations, in particular SLP vectorization. llvm-svn: 203037
* ConstantFolding: Also fold the vector overloads of our math intrinsics.Benjamin Kramer2014-03-051-0/+8
| | | | llvm-svn: 202997
* Allow constant folding of fma and fmuladdMatt Arsenault2014-03-051-0/+39
| | | | llvm-svn: 202914
* Fix broken FileCheck prefixesNico Rieck2014-02-262-8/+8
| | | | llvm-svn: 202308
* Fix broken FileCheck prefixNico Rieck2014-02-261-4/+4
| | | | llvm-svn: 202291
* Make sure that value handle users see the transformation of an indirect call ↵Nick Lewycky2014-02-201-0/+23
| | | | | | to a direct call. This is important for the CallGraph iteration. Patch by Björn Steinbrink! llvm-svn: 201822
* Do more addrspacecast transforms that happen for bitcast.Matt Arsenault2014-02-142-2/+70
| | | | | | Makes addrspacecast (gep) do addrspacecast (gep) instead. llvm-svn: 201376
* Remove a very old instcombine where we would turn sequences of selects intoOwen Anderson2014-02-121-0/+24
| | | | | | | | | | | | | logical operations on the i1's driving them. This is a bad idea for every target I can think of (confirmed with micro tests on all of: x86-64, ARM, AArch64, Mips, and PowerPC) because it forces the i1 to be materialized into a general purpose register, whereas consuming it directly into a select generally allows it to exist only transiently in a predicate or flags register. Chandler ran a set of performance tests with this change, and reported no measurable change on x86-64. llvm-svn: 201275
* InstCombine: Teach icmp merging about the equivalence of bit tests and ↵Benjamin Kramer2014-02-111-0/+38
| | | | | | | | | UGE/ULT with a power of 2. This happens in bitfield code. While there reorganize the existing code a bit. llvm-svn: 201176
* SimplifyLibCalls: Push TLI through the exp2->ldexp transform.Benjamin Kramer2014-02-041-0/+24
| | | | | | For the odd case of platforms with exp2 available but not ldexp. llvm-svn: 200795
* OS X: the correct function is __sincospif_stret, not __sincospi_stretfTim Northover2014-02-041-4/+4
| | | | | | rdar://problem/13729466 llvm-svn: 200771
* Add strchr(p, 0) -> p + strlen(p) to SimplifyLibCallsKai Nacke2014-02-041-0/+13
| | | | | | | | Add the missing transformation strchr(p, 0) -> p + strlen(p) to SimplifyLibCalls and remove the ToDo comment. Reviewer: Duncan P.N. Exan Smith llvm-svn: 200736
* Update optimization passes to handle inalloca argumentsReid Kleckner2014-01-281-0/+22
| | | | | | | | | | | | | | | Summary: I searched Transforms/ and Analysis/ for 'ByVal' and updated those call sites to check for inalloca if appropriate. I added tests for any change that would allow an optimization to fire on inalloca. Reviewers: nlewycky Differential Revision: http://llvm-reviews.chandlerc.com/D2449 llvm-svn: 200281
* InstCombine: Don't try to use aggregate elements of ConstantExprs.Benjamin Kramer2014-01-241-0/+8
| | | | | | PR18600. llvm-svn: 200028
* Add CHECK-LABELsMatt Arsenault2014-01-221-0/+7
| | | | llvm-svn: 199846
* Fix all the remaining lost-fast-math-flags bugs I've been able to find. The ↵Owen Anderson2014-01-203-0/+38
| | | | | | | | most important of these are cases in the generic logic for combining BinaryOperators. This logic hadn't been updated to handle FastMathFlags, and it took me a while to detect it because it doesn't show up in a simple search for CreateFAdd. llvm-svn: 199629
* InstCombine: Modernize a bunch of cast combines.Benjamin Kramer2014-01-192-0/+57
| | | | | | Also make them vector-aware. llvm-svn: 199608
* InstCombine: Replace a hand-rolled version of isKnownToBeAPowerOfTwo with ↵Benjamin Kramer2014-01-191-2/+2
| | | | | | the real thing. llvm-svn: 199604
* InstCombine: Teach most integer add/sub/mul/div combines how to deal with ↵Benjamin Kramer2014-01-196-2/+138
| | | | | | vectors. llvm-svn: 199602
* InstCombine: Refactor fmul/fdiv combines to handle vectors.Benjamin Kramer2014-01-192-0/+40
| | | | llvm-svn: 199598
* Don't refuse to transform constexpr(call(arg, ...)) to call(constexpr(arg), ↵Nick Lewycky2014-01-181-0/+12
| | | | | | ...)) just because the function has multiple return values even if their return types are the same. Patch by Eduard Burtescu! llvm-svn: 199564
* InstCombine: Make the (fmul X, -1.0) -> (fsub -0.0, X) transform handle ↵Benjamin Kramer2014-01-181-0/+9
| | | | | | | | vectors too. PR18532. llvm-svn: 199553
* Fix more instances of dropped fast math flags when optimizing FADD ↵Owen Anderson2014-01-182-0/+25
| | | | | | instructions. All found by inspection (aka grep). llvm-svn: 199528
* Fix two cases where we could lose fast math flags when optimizing FADD ↵Owen Anderson2014-01-161-0/+20
| | | | | | expressions. llvm-svn: 199427
* Fix an instance where we would drop fast math flags when performing an fdiv ↵Owen Anderson2014-01-161-0/+8
| | | | | | to reciprocal multiply transformation. llvm-svn: 199425
* Fix a bug in InstCombine where we failed to preserve fast math flags when ↵Owen Anderson2014-01-161-3/+2
| | | | | | optimizing an FMUL expression. llvm-svn: 199424
* Teach InstCombine that (fmul X, -1.0) can be simplified to (fneg X), which ↵Owen Anderson2014-01-161-0/+12
| | | | | | LLVM expresses as (fsub -0.0, X). llvm-svn: 199420
* Do pointer cast simplifications on addrspacecastMatt Arsenault2014-01-141-0/+9
| | | | llvm-svn: 199254
* Fix broken CHECK lines.Benjamin Kramer2014-01-111-1/+1
| | | | llvm-svn: 199016
* Fix a bug about generating undef operand when optimising shuffle vector and ↵Hao Liu2014-01-081-0/+17
| | | | | | insert element in instruction combine. llvm-svn: 198730
OpenPOWER on IntegriCloud