summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/InstCombine
Commit message (Collapse)AuthorAgeFilesLines
* InstCombine: Avoid introducing poison values when lowering llvm.amdgcn.[us]bfeTom Stellard2018-11-081-23/+7
| | | | | | | | | | | | | | | | | | | Summary: When the 3rd argument to these intrinsics is zero, lowering them to shift instructions produces poison values, since we end up with shift amounts equal to the number of bits in the shifted value. This means we can only lower these intrinsics if we can prove that the 3rd argument is not zero. Reviewers: arsenm Reviewed By: arsenm Subscribers: bnieuwenhuizen, jvesely, wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D53739 llvm-svn: 346422
* [InstCombine] propagate FMF for fcmp+fabs foldsSanjay Patel2018-11-071-12/+12
| | | | | | | By morphing the instruction rather than deleting and creating a new one, we retain fast-math-flags and potentially other metadata (profile info?). llvm-svn: 346331
* [InstCombine] peek through fabs() when checking isnan()Sanjay Patel2018-11-071-4/+2
| | | | | | | | | That should be the end of the missing cases for this fold. See earlier patches in this series: rL346321 rL346324 llvm-svn: 346327
* [InstCombine] add tests for isnan(fabs(X)); NFCSanjay Patel2018-11-071-0/+22
| | | | llvm-svn: 346325
* [InstCombine] add folds for fcmp Pred fabs(X), 0.0Sanjay Patel2018-11-071-4/+2
| | | | | | | | Similar to rL346321, we had folds for the ordered versions of these compares already, so add the unordered siblings for completeness. llvm-svn: 346324
* [InstCombine] add tests for more fcmp+fabs preds; NFCSanjay Patel2018-11-071-0/+22
| | | | llvm-svn: 346323
* [InstCombine] add fold for fabs(X) u< 0.0Sanjay Patel2018-11-071-2/+1
| | | | | | | | | | | | | | The sibling fold for 'oge' --> 'ord' was already here, but this half was missing. The result of fabs() must be positive or nan, so asking if the result is negative or nan is the same as asking if the result is nan. This is another step towards fixing: https://bugs.llvm.org/show_bug.cgi?id=39475 llvm-svn: 346321
* [InstCombine] add test for fcmp+fabs; NFCSanjay Patel2018-11-071-0/+20
| | | | llvm-svn: 346320
* [InstCombine] add FMF to fcmp to show failure to propagate; NFCSanjay Patel2018-11-071-7/+7
| | | | llvm-svn: 346317
* [InstCombine] do not shrink switch conditions to illegal types (PR29009)Sanjay Patel2018-11-071-7/+61
| | | | | | | | | | | | | | | | | This patch makes shrinking switch conditions less aggressive which was introduced by: rL274233 Note that we have 2 new bugs to track potential follow-ups that might have solved PR29009 in different ways: https://bugs.llvm.org/show_bug.cgi?id=39569 https://bugs.llvm.org/show_bug.cgi?id=39578 Patch by: @dendibakh (Denis Bakhvalov) Differential Revision: https://reviews.llvm.org/D54115 llvm-svn: 346315
* [InstCombine] allow vector types for fcmp+fpext foldSanjay Patel2018-11-061-2/+1
| | | | llvm-svn: 346245
* [InstCombine] add vector test for fcmp+fpext; NFCSanjay Patel2018-11-061-8/+19
| | | | llvm-svn: 346243
* [InstCombine] propagate fast-math-flags when folding fcmp+fpext, part 2Sanjay Patel2018-11-061-1/+1
| | | | llvm-svn: 346242
* [InstCombine] propagate fast-math-flags when folding fcmp+fpextSanjay Patel2018-11-061-1/+1
| | | | llvm-svn: 346240
* [InstCombine] adjust tests to show dropping FMF; NFCSanjay Patel2018-11-061-2/+2
| | | | llvm-svn: 346239
* [InstCombine] propagate fast-math-flags when folding fcmp+fneg, part 2Sanjay Patel2018-11-061-2/+2
| | | | llvm-svn: 346238
* [InstCombine] adjust tests to show dropping FMF; NFCSanjay Patel2018-11-061-4/+4
| | | | | | Also, remove some stale FIXME comments ( rL346234 ). llvm-svn: 346236
* [InstCombine] propagate fast-math-flags when folding fcmp+fnegSanjay Patel2018-11-061-2/+2
| | | | | | | | | | This is another part of solving PR39475: https://bugs.llvm.org/show_bug.cgi?id=39475 This might be enough to fix that particular issue, but as noted with the FIXME, we're still dropping FMF on other folds around here. llvm-svn: 346234
* [InstCombine] add tests for FMF propagation failure; NFCSanjay Patel2018-11-061-0/+24
| | | | llvm-svn: 346232
* [InstCombine] Ensure nested shifts are in range (OSS-Fuzz #9880)Simon Pilgrim2018-11-061-0/+19
| | | | llvm-svn: 346225
* [InstCombine] add/adjust tests for fcmp+select substitution; NFCSanjay Patel2018-11-051-28/+91
| | | | | | | | There was no coverage for at least 2 out of the 4 patterns because of fcmp canonicalization. The tests and code should be moved to InstSimplify in a follow-up because this doesn't create any new values. llvm-svn: 346150
* [InstCombine] canonicalize -0.0 to +0.0 in fcmpSanjay Patel2018-11-054-25/+25
| | | | | | | | | | | | | | | | | As stated in IEEE-754 and discussed in: https://bugs.llvm.org/show_bug.cgi?id=38086 ...the sign of zero does not affect any FP compare predicate. Known regressions were fixed with: rL346097 (D54001) rL346143 The transform will help reduce pattern-matching complexity to solve: https://bugs.llvm.org/show_bug.cgi?id=39475 ...as well as improve CSE and codegen (a zero constant is almost always easier to produce than 0x80..00). llvm-svn: 346147
* [InstCombine] loosen FP 0.0 constraint for fcmp+select substitutionSanjay Patel2018-11-051-21/+14
| | | | | | | | | | | | | | It looks like we correctly removed edge cases with 0.0 from D50714, but we were a bit conservative because getBinOpIdentity() doesn't distinguish between +0.0 and -0.0 and 'nsz' is effectively always true for fcmp (see discussion in: https://bugs.llvm.org/show_bug.cgi?id=38086 Without this change, we would get regressions by canonicalizing to +0.0 in all fcmp, and that's a step towards solving: https://bugs.llvm.org/show_bug.cgi?id=39475 llvm-svn: 346143
* [InstCombine] adjust tests for select with FP identity op; NFCSanjay Patel2018-11-051-30/+32
| | | | | | These are mislabeled as negative tests. llvm-svn: 346142
* [InstCombine] add/adjust tests for select with fsub identity op; NFCSanjay Patel2018-11-051-4/+22
| | | | llvm-svn: 346138
* [InstCombine] add tests for select with FP identity op; NFCSanjay Patel2018-11-051-0/+64
| | | | llvm-svn: 346136
* [ValueTracking] determine sign of 0.0 from select when matching min/max FPSanjay Patel2018-11-041-23/+15
| | | | | | | | | | | | | | | | | | | In PR39475: https://bugs.llvm.org/show_bug.cgi?id=39475 ..we may fail to recognize/simplify fabs() in some cases because we do not canonicalize fcmp with a -0.0 operand. Adding that canonicalization can cause regressions on min/max FP tests, so that's this patch: for the purpose of determining whether something is min/max, let the value returned by the select determine how we treat a 0.0 operand in the fcmp. This patch doesn't actually change the -0.0 to +0.0. It just changes the analysis, so we don't fail to recognize equivalent min/max patterns that only differ in the signbit of 0.0. Differential Revision: https://reviews.llvm.org/D54001 llvm-svn: 346097
* [ValueTracking] peek through 2-input shuffles in ComputeNumSignBitsSanjay Patel2018-11-031-5/+3
| | | | | | | | | | | This patch gives the IR ComputeNumSignBits the same functionality as the DAG version (the code is derived from the existing code). This an extension of the single input shuffle analysis added with D53659. Differential Revision: https://reviews.llvm.org/D53987 llvm-svn: 346071
* [InstCombine] add test for ComputeNumSignBits on 2-input shuffle; NFCSanjay Patel2018-11-011-0/+21
| | | | llvm-svn: 345852
* [InstCombine] add tests for fmin/fmax pattern matching failure; NFCSanjay Patel2018-10-311-2/+44
| | | | llvm-svn: 345771
* [InstCombine] regenerate test checks; NFCSanjay Patel2018-10-311-32/+32
| | | | llvm-svn: 345757
* [InstCombine] add tests for fcmp with -0.0; NFCSanjay Patel2018-10-311-0/+57
| | | | | | From IEEE754: "Comparisons shall ignore the sign of zero (so +0 = −0)." llvm-svn: 345752
* [InstCombine] Combine nested min/max intrinsics with constantsVolkan Keles2018-10-314-48/+24
| | | | | | | | | | | | Reviewers: arsenm, spatel Reviewed By: spatel Subscribers: lebedev.ri, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D53774 llvm-svn: 345751
* [InstCombine] refactor fabs+fcmp fold; NFCSanjay Patel2018-10-311-160/+60
| | | | | | | Also, remove/replace/minimize/enhance the tests for this fold. The code drops FMF, so it needs more tests and at least 1 fix. llvm-svn: 345734
* [InstCombine] Teach the move free before null test opti how to deal with ↵Quentin Colombet2018-10-301-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | noop casts InstCombine features an optimization that essentially replaces: if (a) free(a) into: free(a) Right now, this optimization is gated by the minsize attribute and therefore we only perform it if we can prove that we are going to be able to eliminate the branch and the destination block. However when casts are involved the optimization would fail to apply, because the optimization was not smart enough to realize that it is possible to also move the casts away from the destination block and that is harmless to the performance since they are just noops. E.g., foo(int *a) if (a) free((char*)a) Wouldn't be optimized by instcombine, because - We would refuse to hoist the `bitcast i32* %a to i8` in the source block - We would fail to see that `bitcast i32* %a to i8` and %a are the same value. This patch fixes both these problems: - It teaches the pattern matching of the comparison how to look through casts. - It checks that whether the additional instruction in the destination block can be hoisted and are harmless performance-wise. - It hoists all the code of the destination block in the source block. Differential Revision: D53356 llvm-svn: 345644
* [InstCombine] Add preliminary tests for nested min/max combines. NFCVolkan Keles2018-10-304-2/+242
| | | | | | | | | | | | | | Summary: As requested in D53774. Reviewers: spatel Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53875 llvm-svn: 345616
* [InstCombine] try to turn shuffle into insertelementSanjay Patel2018-10-301-20/+19
| | | | | | | | | | | | | | | | | | | | | | shuffle (insert ?, Scalar, IndexC), V1, Mask --> insert V1, Scalar, IndexC' The motivating case is at least a couple of steps away: I noticed that SLPVectorizer does not analyze shuffles as well as sequences of insert/extract in PR34724: https://bugs.llvm.org/show_bug.cgi?id=34724 ...so SLP may fail to vectorize when source code has shuffles to start with or instcombine has converted insert/extract to shuffles. Independent of that, an insertelement is always a simpler op for IR analysis vs. a shuffle, so we should transform to insert when possible. I don't think there's any codegen concern here - if a target can't insert a scalar directly to some fixed element in a vector (x86?), then this should get expanded to the insert+shuffle that we started with. Differential Revision: https://reviews.llvm.org/D53507 llvm-svn: 345607
* [Local] Keep K's range if K does not move when combining metadata.Florian Hahn2018-10-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As K has to dominate I, IIUC I's range metadata must be a subset of K's. After Eli's recent clarification to the LangRef, loading a value outside of the range is undefined behavior. Therefore if I's range contains elements outside of K's range and we would load one such value, K would cause undefined behavior. In cases like hoisting/sinking, we still want the most generic range over all code paths to/from the hoist/sink point. As suggested in the patches related to D47339, I will refactor the handling of those scenarios and try to decouple it from this function as follow up, once we switched to a similar handling of metadata in most of combineMetadata. I updated some tests checking mostly the merging of metadata to keep the metadata of to dominating load. The most interesting one is probably test8 in test/Transforms/JumpThreading/thread-loads.ll. It contained a comment about the alias metadata preventing us to eliminate the branch, but it seem like the actual problem currently is that we merge the ranges of both loads and cannot eliminate the icmp afterwards. With this patch, we manage to eliminate the icmp, as the range of the first load excludes 8. Reviewers: efriedma, nlopes, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D51629 llvm-svn: 345456
* [ValueTracking] peek through shuffles in ComputeNumSignBits (PR37549)Sanjay Patel2018-10-262-9/+22
| | | | | | | | | | | | | | | | | The motivating case is from PR37549: https://bugs.llvm.org/show_bug.cgi?id=37549 The analysis improvement allows us to form a vector 'select' out of bitwise logic (the use of ComputeNumSignBits was added at rL345149). The smaller test shows another InstCombine improvement - we use ComputeNumSignBits to add 'nsw' to shift-left. But the negative test shows an example where we must not add 'nsw' - when the shuffle mask contains undef elements. Differential Revision: https://reviews.llvm.org/D53659 llvm-svn: 345429
* [FPEnv] Last BinaryOperator::isFNeg(...) to m_FNeg(...) changesCameron McInally2018-10-251-2/+2
| | | | | | | | | Replacing BinaryOperator::isFNeg(...) to avoid regressions when we separate FNeg from the FSub IR instruction. Differential Revision: https://reviews.llvm.org/D53650 llvm-svn: 345295
* Add -instcombine-code-sinking optionGabor Buella2018-10-251-0/+19
| | | | | | | | | | Reviewers: craig.topper, andrew.w.kaylor, efriedma Reviewed By: craig.topper, andrew.w.kaylor, efriedma Differential Revision: https://reviews.llvm.org/D52709 llvm-svn: 345248
* [InstCombine] add test for fptrunc with vector with undef elt; NFCSanjay Patel2018-10-241-2/+13
| | | | | | This should be fixed with D53650. llvm-svn: 345206
* [InstCombine] add test for ComputeNumSignBits with shuffle; NFCSanjay Patel2018-10-241-25/+58
| | | | llvm-svn: 345162
* [InstCombine] add test for select with shuffled condition (PR37549); NFCSanjay Patel2018-10-241-0/+35
| | | | llvm-svn: 345156
* [InstCombine] try harder to form select from logic ops (2nd try)Sanjay Patel2018-10-242-24/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The original patch was committed here: rL344609 ...and reverted: rL344612 ...because it did not properly check/test data types before calling ComputeNumSignBits(). The tests that caused bot failures for the previous commit are over-reaching front-end tests that run the entire -O optimizer pipeline: Clang :: CodeGen/builtins-systemz-zvector.c Clang :: CodeGen/builtins-systemz-zvector2.c I've added a negative test here to ensure coverage for that case. The new early exit check also tests the type of the 'B' parameter, so we don't waste time on matching if either value is unsuitable. Original commit message: This is part of solving PR37549: https://bugs.llvm.org/show_bug.cgi?id=37549 The patterns shown here are a special case of something that we already convert to select. Using ComputeNumSignBits() catches that case (but not the more complicated motivating patterns yet). The backend has hooks/logic to convert back to logic ops if that's better for the target. llvm-svn: 345149
* [InstCombine] use 'match' to handle vectors and simplify codeSanjay Patel2018-10-231-3/+2
| | | | | | | This is another step towards completely removing the fake binop queries for not/neg/fneg. llvm-svn: 345036
* [InstCombine] swap select profile metadata when swapping select opsSanjay Patel2018-10-231-5/+5
| | | | llvm-svn: 345034
* [InstCombine] add/move tests for select with inverted condition; NFCSanjay Patel2018-10-232-10/+39
| | | | | | | The transform is broken in 2 ways - it doesn't correct metadata (or even drop it), and it doesn't work with vectors with undef elements. llvm-svn: 345033
* [InstCombine] add tests for shuffle+insert folds; NFCSanjay Patel2018-10-221-0/+123
| | | | llvm-svn: 344908
* [InstCombine] add test for possible shuffle fold; NFCSanjay Patel2018-10-201-31/+51
| | | | llvm-svn: 344860
OpenPOWER on IntegriCloud