summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/InstCombine/fast-math.ll
Commit message (Collapse)AuthorAgeFilesLines
* [IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperatorCameron McInally2019-10-141-1/+1
| | | | | | | | Reapply r374240 with fix for Ocaml test, namely Bindings/OCaml/core.ml. Differential Revision: https://reviews.llvm.org/D61675 llvm-svn: 374782
* Revert "[IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator"Dmitri Gribenko2019-10-101-1/+1
| | | | | | | This reverts commit r374240. It broke OCaml tests: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19014 llvm-svn: 374354
* [IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperatorCameron McInally2019-10-091-1/+1
| | | | | | | | Also update Clang to call Builder.CreateFNeg(...) for UnaryMinus. Differential Revision: https://reviews.llvm.org/D61675 llvm-svn: 374240
* [InstCombine] canonicalize fmin/fmax to LLVM intrinsics minnum/maxnumSanjay Patel2019-06-291-32/+21
| | | | | | | | | | | | | | This transform came up in D62414, but we should deal with it first. We have LLVM intrinsics that correspond exactly to libm calls (unlike most libm calls, these libm calls never set errno). This holds without any fast-math-flags, so we should always canonicalize to those intrinsics directly for better optimization. Currently, we convert to fcmp+select only when we have FMF (nnan) because fcmp+select does not preserve the semantics of the call in the general case. Differential Revision: https://reviews.llvm.org/D63214 llvm-svn: 364714
* [InstCombine] add tests for fmin/fmax libcalls; NFCSanjay Patel2019-06-121-0/+18
| | | | llvm-svn: 363175
* [IR] allow fast-math-flags on select of FP valuesSanjay Patel2019-05-221-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | This is a minimal start to correcting a problem most directly discussed in PR38086: https://bugs.llvm.org/show_bug.cgi?id=38086 We have been hacking around a limitation for FP select patterns by using the fast-math-flags on the condition of the select rather than the select itself. This patch just allows FMF to appear with the 'select' opcode. No changes are needed to "FPMathOperator" because it already includes select-of-FP because that definition is based on the (return) value type. Once we have this ability, we can start correcting and adding IR transforms to use the FMF on a 'select' instruction. The instcombine and vectorizer test diffs only show that the IRBuilder change is behaving as expected by applying an FMF guard value to 'select'. For reference: rL241901 - allowed FMF with fcmp rL255555 - allowed FMF with FP calls Differential Revision: https://reviews.llvm.org/D61917 llvm-svn: 361401
* Revert "Temporarily Revert "Add basic loop fusion pass.""Eric Christopher2019-04-171-0/+931
| | | | | | | | The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552
* Temporarily Revert "Add basic loop fusion pass."Eric Christopher2019-04-171-931/+0
| | | | | | | | As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546
* [InstCombine] canonicalize fdiv after fmul if reassociation is allowedSanjay Patel2019-04-151-2/+2
| | | | | | | | (X / Y) * Z --> (X * Z) / Y This can allow other optimizations/reassociations as shown in the test diffs. llvm-svn: 358404
* [InstCombine] move/add tests for fadd/fsub factorization; NFCSanjay Patel2018-08-121-373/+0
| | | | llvm-svn: 339518
* [InstCombine] add tests for fsub factorization; NFCSanjay Patel2018-08-101-10/+84
| | | | | | | | The tests show that; 1. The fold doesn't fire for vectors, but it should. 2. The fold fires regardless of uses, but it shouldn't. llvm-svn: 339470
* [InstCombine] allow fsub+fmul FMF folds for vectorsSanjay Patel2018-08-091-8/+6
| | | | llvm-svn: 339368
* [InstCombine] add vector tests for fsub+fmul; NFCSanjay Patel2018-08-091-15/+44
| | | | llvm-svn: 339361
* [InstCombine] fold fadd+fsub with common operandSanjay Patel2018-08-081-10/+8
| | | | | | | This is a sibling to the simplify from: https://reviews.llvm.org/rL339174 llvm-svn: 339267
* [InstCombine] fold fsub+fsub with common operandSanjay Patel2018-08-081-4/+3
| | | | | | | This is a sibling to the simplify from: rL339171 llvm-svn: 339266
* [InstCombine] add tests for fsub folds; NFCSanjay Patel2018-08-081-68/+137
| | | | | | | | | | | | | | The scalar cases are handled in instcombine's internal reassociation pass for FP ops, but it misses the vector types. These patterns are similar to what was handled in InstSimplify in: https://reviews.llvm.org/rL339171 https://reviews.llvm.org/rL339174 https://reviews.llvm.org/rL339176 ...but we can't use instsimplify on these because we require negation of the original operand. llvm-svn: 339263
* [InstCombine] simplify fneg+fadd folds; NFCSanjay Patel2018-04-161-24/+0
| | | | | | | | Two cleanups: 1. As noted in D45453, we had tests that don't need FMF that were misplaced in the 'fast-math.ll' test file. 2. This removes the final uses of dyn_castFNegVal, so that can be deleted. We use 'match' now. llvm-svn: 330126
* [InstCombine] Enable Add/Sub simplifications with only 'reassoc' FMFWarren Ristow2018-04-141-22/+423
| | | | | | | | These simplifications were previously enabled only with isFast(), but that is more restrictive than required. Since r317488, FMF has 'reassoc' to control these cases at a finer level. llvm-svn: 330089
* [InstCombine] distribute fmul over fadd/fsubSanjay Patel2018-03-261-4/+5
| | | | | | | | | | This replaces a large chunk of code that was looking for compound patterns that include these sub-patterns. Existing tests ensure that all of the previous examples are still folded as expected. We still need to loosen the FMF check. llvm-svn: 328502
* [PatternMatch] allow undef elements when matching vector FP +0.0Sanjay Patel2018-03-251-1/+1
| | | | | | | | | | | | | This continues the FP constant pattern matching improvements from: https://reviews.llvm.org/rL327627 https://reviews.llvm.org/rL327339 https://reviews.llvm.org/rL327307 Several integer constant matchers also have this ability. I'm separating matching of integer/pointer null from FP positive zero and renaming/commenting to make the functionality clearer. llvm-svn: 328461
* [InstSimplify, InstCombine] add/update tests with FP +0.0 vector with undef; NFCSanjay Patel2018-03-251-0/+9
| | | | llvm-svn: 328455
* [InstCombine] move/add tests for fmul distribution; NFCSanjay Patel2018-03-211-79/+0
| | | | | | | | | There are at least 3 problems: 1. We're distributing across large patterns, but fail to do that for the minimal patterns. 2. We're not checking uses, so we may create more instructions than we eliminate. 3. We should be able to do these transforms with less than full 'fast' fmuls. llvm-svn: 328152
* [InstCombine] add test to show fmul transform creates extra fdiv; NFCSanjay Patel2018-03-121-88/+0
| | | | | | Also, move fmul reassociation tests to the same file as other fmul transforms. llvm-svn: 327342
* [InstCombine] move/add tests for fmul reassociation; NFCSanjay Patel2018-03-011-12/+0
| | | | | | | This transform may be out-of-scope for instcombine, but this is only documenting the current behavior. llvm-svn: 326442
* [InstCombine] fold fdiv with non-splat divisor to fmul: X/C --> X * (1/C)Sanjay Patel2018-02-201-1/+6
| | | | llvm-svn: 325590
* [InstCombine] move fdiv tests; NFCSanjay Patel2018-02-191-33/+0
| | | | | | Also, use vector constants just to prove that already works. llvm-svn: 325530
* [InstCombine] test fdiv folds better; NFCSanjay Patel2018-02-151-24/+0
| | | | | | | We had redundant tests, but no tests for extra uses or vectors. 'fast' is an overly conservative requirement for these folds. llvm-svn: 325262
* [InstCombine] regenerate checks; NFCSanjay Patel2018-02-141-227/+352
| | | | llvm-svn: 325144
* [InstCombine] Make sure we preserve fast math flags when folding fp ↵Craig Topper2017-04-101-0/+23
| | | | | | | | | | | | | | | | instructions into phi nodes Summary: I noticed in the select folding code that we copied fast math flags, but did not do the same for the similar handling in phi nodes. This patch fixes that to do the same thing as select Reviewers: spatel, davide, majnemer, hfinkel Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31690 llvm-svn: 299838
* InstSimplify: Eliminate fabs on known positiveMatt Arsenault2017-01-111-5/+3
| | | | llvm-svn: 291624
* SimplifyLibCalls: Remove incorrect optimization of fabsMatt Arsenault2017-01-071-2/+4
| | | | | | | | fabs(x * x) is not generally safe to assume x is positive if x is a NaN. This is also less general than it could be, so this will be replaced with a transformation on the intrinsic. llvm-svn: 291359
* [LibCallSimplifier] don't allow sqrt transform unless all ops are unsafeSanjay Patel2016-01-111-0/+15
| | | | | | | Fix the FIXME added with: http://reviews.llvm.org/rL257400 llvm-svn: 257404
* [LibCallSimplifier] use instruction-level fast-math-flags to transform sqrt ↵Sanjay Patel2016-01-111-30/+24
| | | | | | | | | | | | | | | | | | | | | | | calls This is a continuation of adding FMF to call instructions: http://reviews.llvm.org/rL255555 The intent of the patch is to preserve the current behavior of the transform except that we use the sqrt instruction's 'fast' attribute as a trigger rather than the function-level attribute. But this raises a bug noted by the new FIXME comment. In order to do this transform: sqrt((x * x) * y) ---> fabs(x) * sqrt(y) ...we need all of the sqrt, the first fmul, and the second fmul to be 'fast'. If any of those ops is strict, we should bail out. Differential Revision: http://reviews.llvm.org/D15937 llvm-svn: 257400
* [LibCallSimplfier] use instruction-level fast-math-flags for fmin/fmax ↵Sanjay Patel2016-01-051-17/+16
| | | | | | transforms llvm-svn: 256871
* add fast-math-flags to 'call' instructions (PR21290)Sanjay Patel2015-12-141-17/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds optional fast-math-flags (the same that apply to fmul/fadd/fsub/fdiv/frem/fcmp) to call instructions in IR. Follow-up patches would use these flags in LibCallSimplifier, add support to clang, and extend FMF to the DAG for calls. Motivating example: %y = fmul fast float %x, %x %z = tail call float @sqrtf(float %y) We'd like to be able to optimize sqrt(x*x) into fabs(x). We do this today using a function-wide attribute for unsafe-math, but we really want to trigger on the instructions themselves: %z = tail call fast float @sqrtf(float %y) because in an LTO build it's possible that calls with fast semantics have been inlined into a function with non-fast semantics. The code changes and tests are based on the recent commits that added "notail": http://reviews.llvm.org/rL252368 and added FMF to fcmp: http://reviews.llvm.org/rL241901 Differential Revision: http://reviews.llvm.org/D14707 llvm-svn: 255555
* transform fmin/fmax calls when possible (PR24314)Sanjay Patel2015-08-161-0/+107
| | | | | | | | | | | | | | | If we can ignore NaNs, fmin/fmax libcalls can become compare and select (this is what we turn std::min / std::max into). This IR should then be optimized in the backend to whatever is best for any given target. Eg, x86 can use minss/maxss instructions. This should solve PR24314: https://llvm.org/bugs/show_bug.cgi?id=24314 Differential Revision: http://reviews.llvm.org/D11866 llvm-svn: 245187
* [InstCombine] Fix an assertion when fmul has a ConstantExpr operandMichael Kuperstein2015-03-051-0/+8
| | | | | | | | | isNormalFp and isFiniteNonZeroFp should not assume vector operands can not be constant expressions. Patch by Pawel Jurek <pawel.jurek@intel.com> Differential Revision: http://reviews.llvm.org/D8053 llvm-svn: 231359
* InstCombine: fsub nsz 0, X ==> fsub nsz -0.0, XSanjay Patel2014-12-311-0/+8
| | | | | | | | | | | | | Some day the backend may handle instruction-level fast math flags and make this transform unnecessary, but it's still better practice to use the canonical representation of fneg when possible (use a -0.0). This is a partial fix for PR20870 ( http://llvm.org/bugs/show_bug.cgi?id=20870 ). See also http://reviews.llvm.org/D6723. Differential Revision: http://reviews.llvm.org/D6731 llvm-svn: 225050
* use -0.0 when creating an fneg instructionSanjay Patel2014-12-191-1/+1
| | | | | | | | | | | | | | | | | | | Backends recognize (-0.0 - X) as the canonical form for fneg and produce better code. Eg, ppc64 with 0.0: lis r2, ha16(LCPI0_0) lfs f0, lo16(LCPI0_0)(r2) fsubs f1, f0, f1 blr vs. -0.0: fneg f1, f1 blr Differential Revision: http://reviews.llvm.org/D6723 llvm-svn: 224583
* fold: sqrt(x * x * y) -> fabs(x) * sqrt(y)Sanjay Patel2014-10-161-0/+170
| | | | | | | | | | | | | | | | | | | | | | | | | If a square root call has an FP multiplication argument that can be reassociated, then we can hoist a repeated factor out of the square root call and into a fabs(). In the simplest case, this: y = sqrt(x * x); becomes this: y = fabs(x); This patch relies on an earlier optimization in instcombine or reassociate to put the multiplication tree into a canonical form, so we don't have to search over every permutation of the multiplication tree. Because there are no IR-level FastMathFlags for intrinsics (PR21290), we have to use function-level attributes to do this optimization. This needs to be fixed for both the intrinsics and in the backend. Differential Revision: http://reviews.llvm.org/D5787 llvm-svn: 219944
* InstCombine: Refactor fmul/fdiv combines to handle vectors.Benjamin Kramer2014-01-191-0/+17
| | | | llvm-svn: 199598
* Fix more instances of dropped fast math flags when optimizing FADD ↵Owen Anderson2014-01-181-0/+16
| | | | | | instructions. All found by inspection (aka grep). llvm-svn: 199528
* Fix two cases where we could lose fast math flags when optimizing FADD ↵Owen Anderson2014-01-161-0/+20
| | | | | | expressions. llvm-svn: 199427
* [Fast-math] Disable "(C1/X)*C2 => (C1*C2)/X" if C1/X has multiple uses.Shuxin Yang2013-09-191-0/+12
| | | | | | | | | | | | | | | | | | If "C1/X" were having multiple uses, the only benefit of this transformation is to potentially shorten critical path. But it is at the cost of instroducing additional div. The additional div may or may not incur cost depending on how div is implemented. If it is implemented using Newton–Raphson iteration, it dosen't seem to incur any cost (FIXME). However, if the div blocks the entire pipeline, that sounds to be pretty expensive. Let CodeGen to take care this transformation. This patch sees 6% on a benchmark. rdar://15032743 llvm-svn: 191037
* Update Transforms tests to use CHECK-LABEL for easier debugging. No ↵Stephen Lin2013-07-141-36/+36
| | | | | | | | | | | | | | | | | | | | | | functionality change. This update was done with the following bash script: find test/Transforms -name "*.ll" | \ while read NAME; do echo "$NAME" if ! grep -q "^; *RUN: *llc" $NAME; then TEMP=`mktemp -t temp` cp $NAME $TEMP sed -n "s/^define [^@]*@\([A-Za-z0-9_]*\)(.*$/\1/p" < $NAME | \ while read FUNC; do sed -i '' "s/;\(.*\)\([A-Za-z0-9_]*\):\( *\)@$FUNC\([( ]*\)\$/;\1\2-LABEL:\3@$FUNC(/g" $TEMP done mv $TEMP $NAME fi done llvm-svn: 186268
* Fix a bug in fast-math fadd/fsub simplification. Shuxin Yang2013-03-251-0/+10
| | | | | | | | | | | The problem is that the code mistakenly took for granted that following constructor is able to create an APFloat from a *SIGNED* integer: APFloat::APFloat(const fltSemantics &ourSemantics, integerPart value) rdar://13486998 llvm-svn: 177906
* Perform factorization as a last resort of unsafe fadd/fsub simplification.Shuxin Yang2013-03-141-0/+105
| | | | | | | | | | | | | | | Rules include: 1)1 x*y +/- x*z => x*(y +/- z) (the order of operands dosen't matter) 2) y/x +/- z/x => (y +/- z)/x The transformation is disabled if the new add/sub expr "y +/- z" is a denormal/naz/inifinity. rdar://12911472 llvm-svn: 177088
* Fix a bug in instcombine for fmul in fast math mode.Quentin Colombet2013-02-281-0/+11
| | | | | | | | | | | | | | | The instcombine recognized pattern looks like: a = b * c d = a +/- Cst or a = b * c d = Cst +/- a When creating the new operands for fadd or fsub instruction following the related fmul, the first operand was created with the second original operand (M0 was created with C1) and the second with the first (M1 with Opnd0). The fix consists in creating the new operands with the appropriate original operand, i.e., M0 with Opnd0 and M1 with C1. llvm-svn: 176300
* Preserve fast-math flags after reassociation and commutation. Update test casesMichael Ilseman2013-02-071-4/+4
| | | | llvm-svn: 174571
* whitespaceMichael Ilseman2013-02-071-12/+12
| | | | llvm-svn: 174569
OpenPOWER on IntegriCloud