summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
Commit message (Collapse)AuthorAgeFilesLines
* InstCombine: Clean up some trailing whitespace. NFCJustin Bogner2016-08-051-3/+3
| | | | llvm-svn: 277793
* InstCombine: Replace some never-null pointers with references. NFCJustin Bogner2016-08-051-1/+1
| | | | llvm-svn: 277792
* [InstCombine] Refactor optimization of zext(or(icmp, icmp)) to enable more ↵Tobias Grosser2016-08-031-4/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | aggressive cast-folding Summary: InstCombine unfolds expressions of the form `zext(or(icmp, icmp))` to `or(zext(icmp), zext(icmp))` such that in a later iteration of InstCombine the exposed `zext(icmp)` instructions can be optimized. We now combine this unfolding and the subsequent `zext(icmp)` optimization to be performed together. Since the unfolding doesn't happen separately anymore, we also again enable the folding of `logic(cast(icmp), cast(icmp))` expressions to `cast(logic(icmp, icmp))` which had been disabled due to its interference with the unfolding transformation. Tested via `make check` and `lnt`. Background ========== For a better understanding on how it came to this change we subsequently summarize its history. In commit r275989 we've already tried to enable the folding of `logic(cast(icmp), cast(icmp))` to `cast(logic(icmp, icmp))` which had to be reverted in r276106 because it could lead to an endless loop in InstCombine (also see http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160718/374347.html). The root of this problem is that in `visitZExt()` in InstCombineCasts.cpp there also exists a reverse of the above folding transformation, that unfolds `zext(or(icmp, icmp))` to `or(zext(icmp), zext(icmp))` in order to expose `zext(icmp)` operations which would then possibly be eliminated by subsequent iterations of InstCombine. However, before these `zext(icmp)` would be eliminated the folding from r275989 could kick in and cause InstCombine to endlessly switch back and forth between the folding and the unfolding transformation. This is the reason why we now combine the `zext`-unfolding and the elimination of the exposed `zext(icmp)` to happen at one go because this enables us to still allow the cast-folding in `logic(cast(icmp), cast(icmp))` without entering an endless loop again. Details on the submitted changes ================================ - In `visitZExt()` we combine the unfolding and optimization of `zext` instructions. - In `transformZExtICmp()` we have to use `Builder->CreateIntCast()` instead of `CastInst::CreateIntegerCast()` to make sure that the new `CastInst` is inserted in a `BasicBlock`. The new calls to `transformZExtICmp()` that we introduce in `visitZExt()` would otherwise cause according assertions to be triggered (in our case this happend, for example, with lnt for the MultiSource/Applications/sqlite3 and SingleSource/Regression/C++/EH/recursive-throw tests). The subsequent usage of `replaceInstUsesWith()` is necessary to ensure that the new `CastInst` replaces the `ZExtInst` accordingly. - In InstCombineAndOrXor.cpp we again allow the folding of casts on `icmp` instructions. - The instruction order in the optimized IR for the zext-or-icmp.ll test case is different with the introduced changes. - The test cases in zext.ll have been adopted from the reverted commits r275989 and r276105. Reviewers: grosser, majnemer, spatel Subscribers: eli.friedman, majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D22864 Contributed-by: Matthias Reisinger <d412vv1n@gmail.com> llvm-svn: 277635
* [ConstnatFolding] Teach the folder how to fold ConstantVectorDavid Majnemer2016-07-291-3/+2
| | | | | | | | | | | A ConstantVector can have ConstantExpr operands and vice versa. However, the folder had no ability to fold ConstantVectors which, in some cases, was an optimization barrier. Instead, rephrase the folder in terms of Constants instead of ConstantExprs and teach callers how to deal with failure. llvm-svn: 277099
* [InstCombine] Handle failures from ConstantFoldConstantExpressionDavid Majnemer2016-07-281-1/+2
| | | | | | | | ConstantFoldConstantExpression returns null when folding fails. This fixes PR28745. llvm-svn: 276952
* [InstCombine] LogicOpc (zext X), C --> zext (LogicOpc X, C) (PR28476)Sanjay Patel2016-07-211-8/+0
| | | | | | | | | | | | | | | | The benefits of this change include: 1. Remove DeMorgan-matching code that was added specifically to work-around the missing transform in http://reviews.llvm.org/rL248634. 2. Makes the DeMorgan transform work for vectors too. 3. Fix PR28476: https://llvm.org/bugs/show_bug.cgi?id=28476 Extending this transform to other casts and other associative operators may be useful too. See https://reviews.llvm.org/D22421 for a prerequisite for doing that though. Differential Revision: https://reviews.llvm.org/D22271 llvm-svn: 276221
* [InstCombine] Minor cleanup of cast simplification code [NFC]Tobias Grosser2016-07-191-46/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch cleans up parts of InstCombine to raise its compliance with the LLVM coding standards and to increase its readability. The changes and according rationale are summarized in the following: - Rename `ShouldOptimizeCast()` to `shouldOptimizeCast()` since functions should start with a lower case letter. - Move `shouldOptimizeCast()` from InstCombineCasts.cpp to InstCombineAndOrXor.cpp since it's only used there. - Simplify interface of `shouldOptimizeCast()`. - Minor code style adaptions in `shouldOptimizeCast()`. - Remove the documentation on the function definition of `shouldOptimizeCast()` since it just repeats the documentation on its declaration. Also enhance the documentation on its declaration with more information describing its intended use and make it doxygen-compliant. - Change a comment in `foldCastedBitwiseLogic()` from `fold (logic (cast A), (cast B)) -> (cast (logic A, B))` to `fold logic(cast(A), cast(B)) -> cast(logic(A, B))` since the surrounding comments use this format. - Remove comment `Only do this if the casts both really cause code to be generated.` in `foldCastedBitwiseLogic()` since it just repeats parts of the documentation of `shouldOptimizeCast()` and does not help to improve readability. - Simplify the interface of `isEliminableCastPair()`. - Removed the documentation on the function definition of `isEliminableCastPair()` which only contained obvious statements about its implementation. Instead added more general doxygen-compliant documentation to its declaration. - Renamed parameter `DoXform` of `transformZExtIcmp()` to `DoTransform` to make its intention clearer. - Moved documentation of `transformZExtIcmp()` from its definition to its declaration and made it doxygen-compliant. Reviewers: vtjnash, grosser Subscribers: majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D22449 Contributed-by: Matthias Reisinger llvm-svn: 275964
* Revert "InstCombine rule to fold truncs whose value is available"Anna Thomas2016-07-081-23/+1
| | | | | | | This reverts commit r274853. Caused failure in ppcBE build llvm-svn: 274943
* InstCombine rule to fold truncs whose value is availableAnna Thomas2016-07-081-1/+23
| | | | | | | | | | | | | We can fold truncs whose operand feeds from a load, if the trunc value is available through a prior load/store. This change is from: http://reviews.llvm.org/D21246, which folded the trunc but missed the bitcast or ptrtoint/inttoptr required in the RAUW call, when the load type didnt match the prior load/store type. Differential Revision: http://reviews.llvm.org/D21791 llvm-svn: 274853
* Revert "[InstCombine] Avoid combining the bitcast of a var that is used as ↵Eric Christopher2016-06-291-113/+0
| | | | | | | | | | | | both address and result of load instructions" Revert "[InstCombine] Combine A->B->A BitCast" as this appears to cause PR27996 and as discussed in http://reviews.llvm.org/D20847 This reverts commits r270135 and r263734. llvm-svn: 274094
* Revert "InstCombine rule to fold trunc when value available"Reid Kleckner2016-06-241-20/+1
| | | | | | | | | | | | This reverts commit r273608. Broke building code with sanitizers, where apparently these kinds of loads, casts, and truncations are common: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/24502 http://crbug.com/623099 llvm-svn: 273703
* InstCombine rule to fold trunc when value availableAnna Thomas2016-06-231-1/+20
| | | | | | | | | | | | | | Summary: This instcombine rule folds away trunc operations that have value available from a prior load or store. This kind of code can be generated as a result of GVN widening the load or from source code as well. Reviewers: reames, majnemer, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D21246 llvm-svn: 273608
* Revert "Revert "Revert "InstCombine: Reduce trunc (shl x, K) width."""Matt Arsenault2016-06-171-22/+5
| | | | | | | This seems to be causing an infinite loop / crash in instcombine on some bots. llvm-svn: 273069
* Revert "Revert "InstCombine: Reduce trunc (shl x, K) width.""Matt Arsenault2016-06-171-5/+22
| | | | | | | Reapply r272987. Condition should be in terms of the destination type, and the flags should not be copied. llvm-svn: 273045
* Revert "InstCombine: Reduce trunc (shl x, K) width."Matt Arsenault2016-06-171-24/+5
| | | | | | | | This reverts commit r272987. This might be causing crashes on some bots. llvm-svn: 272990
* InstCombine: Reduce trunc (shl x, K) width.Matt Arsenault2016-06-171-5/+24
| | | | llvm-svn: 272987
* [InstCombine] Fix assertion when bitcast is converted to gepGerolf Hoflehner2016-05-231-0/+7
| | | | | | | | | | When an aggregate contains an opaque type its size cannot be determined. This triggers an "Invalid GetElementPtrInst indices for type" assert in function checkGEPType. The fix suppresses the conversion in this case. http://reviews.llvm.org/D20319 llvm-svn: 270479
* [InstCombine] Avoid combining the bitcast of a var that is used as both ↵Guozhi Wei2016-05-191-0/+7
| | | | | | | | | | address and result of load instructions This patch fixes https://llvm.org/bugs/show_bug.cgi?id=27703. If there is a sequence of one or more load instructions, each loaded value is used as address of later load instruction, bitcast is necessary to change the value type, don't optimize it. llvm-svn: 270135
* [InstCombine] Propagate operand bundlesDavid Majnemer2016-04-291-1/+4
| | | | | | | We neglected to transfer operand bundles for some transforms. These were found via inspection, I'll try to come up with some test cases. llvm-svn: 268010
* [InstCombine] Combine A->B->A BitCastGuozhi Wei2016-03-171-0/+106
| | | | | | | | | | This patch enhances InstCombine to handle following case: A -> B bitcast PHI B -> A bitcast llvm-svn: 263734
* Revert "[InstCombine] Combine A->B->A BitCast"Junmo Park2016-03-081-103/+0
| | | | | | This reverts commit r262670 due to compile failure. llvm-svn: 262916
* [InstCombine] Combine A->B->A BitCastGuozhi Wei2016-03-031-0/+103
| | | | | | | | | | This patch enhances InstCombine to handle following case: A -> B bitcast PHI B -> A bitcast llvm-svn: 262670
* revert r262424 because there's a *clang test* for AArch64 that checks -O3 ↵Sanjay Patel2016-03-021-17/+5
| | | | | | | | asm output that is broken by this change llvm-svn: 262440
* [InstCombine] convert 'isPositive' and 'isNegative' vector comparisons to ↵Sanjay Patel2016-03-011-5/+17
| | | | | | | | | | | | | | | | | | shifts (PR26701) As noted in the code comment, I don't think we can do the same transform that we do for *scalar* integers comparisons to *vector* integers comparisons because it might pessimize the general case. Exhibit A for an incomplete integer comparison ISA remains x86 SSE/AVX: it only has EQ and GT for integer vectors. But we should now recognize all the variants of this construct and produce the optimal code for the cases shown in: https://llvm.org/bugs/show_bug.cgi?id=26701 llvm-svn: 262424
* function names start with a lowercase letter; NFCSanjay Patel2016-02-011-17/+17
| | | | llvm-svn: 259425
* fix formatting; NFCSanjay Patel2015-12-301-8/+8
| | | | llvm-svn: 256645
* [InstCombine] Adding "\n" to debug output. NFC.Weiming Zhao2015-12-171-2/+2
| | | | | | | | | | | | | | | Summary: [InstCombine] Adding '\n' to debug output. NFC. Patch by Zhaoshi Zheng <zhaoshiz@codeaurora.org> Reviewers: apazos, majnemer, weimingz Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15403 llvm-svn: 255920
* getParent() ^ 3 == getModule() ; NFCISanjay Patel2015-12-141-3/+2
| | | | llvm-svn: 255511
* [InstCombine] fold trunc ([lshr] (bitcast vector) ) --> extractelement (PR25543)Sanjay Patel2015-12-141-55/+47
| | | | | | | | | | | | | | | | | | | | | | | This is a fix for PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543 The idea is to take the existing fold of: bitcast ( trunc ( lshr ( bitcast X))) --> extractelement (bitcast X) ( http://reviews.llvm.org/rL112232 ) And break it into less specific transforms so we'll catch more cases such as the example in the bug report: bitcast ( trunc ( lshr ( bitcast X))) --> bitcast ( extractelement (bitcast X)) --> extractelement (bitcast X) Enabling patches for this change: http://reviews.llvm.org/rL255399 (combine bitcasts) http://reviews.llvm.org/rL255433 (canonicalize extractelement(bitcast X)) Differential Revision: http://reviews.llvm.org/D15392 llvm-svn: 255504
* [InstCombine] canonicalize (bitcast (extractelement X)) --> ↵Sanjay Patel2015-12-121-28/+17
| | | | | | | | | | | | | | | | | | | | | (extractelement(bitcast X)) This change was discussed in D15392. It allows us to remove the fold that was added in: http://reviews.llvm.org/r255261 ...and it will allow us to generalize this fold: http://reviews.llvm.org/rL112232 while preserving the order of bitcast + extract that it produces and testing shows is better handled by the backend. Note that the existing check for "isVectorTy()" wasn't strong enough in general and specifically because: x86_mmx. It's not a vector, but it's not vectorizable either. So here we check VectorType::isValidElementType() directly before proceeding with the transform. llvm-svn: 255433
* [InstCombine] fold bitcasts around an extractelement (3rd try)Sanjay Patel2015-12-101-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a redo of r255137 (reverted at r255227) which was a redo of r255124 (reverted at r255126) with a fixed check for a scalar source type and an added test for the failure that caused the revert. Original commit message: Example: bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float ---> extractelement <2 x float> %X, i32 1 This is part of fixing PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543 The next step will be to generalize this fold: trunc ( lshr ( bitcast X) ) -> extractelement (X) Ie, I'm hoping to replace the existing transform of: bitcast ( trunc ( lshr ( bitcast X))) added by: http://reviews.llvm.org/rL112232 with 2 less specific transforms to catch the case in the bug report. Differential Revision: http://reviews.llvm.org/D14879 llvm-svn: 255261
* Revert r255137.Akira Hatanaka2015-12-101-39/+0
| | | | | | This commit broke apple's internal bot. llvm-svn: 255227
* [InstCombine] fold bitcasts around an extractelement (2nd try)Sanjay Patel2015-12-091-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a redo of r255124 (reverted at r255126) with an added check for a scalar destination type and an added test for the failure seen in Clang's test/CodeGen/vector.c. The extra test shows a different missing optimization. Original commit message: Example: bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float ---> extractelement <2 x float> %X, i32 1 This is part of fixing PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543 The next step will be to generalize this fold: trunc ( lshr ( bitcast X) ) -> extractelement (X) Ie, I'm hoping to replace the existing transform of: bitcast ( trunc ( lshr ( bitcast X))) added by: http://reviews.llvm.org/rL112232 with 2 less specific transforms to catch the case in the bug report. Differential Revision: http://reviews.llvm.org/D14879 llvm-svn: 255137
* Revert "[InstCombine] fold bitcasts around an extractelement"Mehdi Amini2015-12-091-37/+0
| | | | | | | | | This reverts commit r255124. Broke http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/4193/steps/test/logs/stdio From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255126
* [InstCombine] fold bitcasts around an extractelementSanjay Patel2015-12-091-0/+37
| | | | | | | | | | | | | | | | | | | | | | | | Example: bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float ---> extractelement <2 x float> %X, i32 1 This is part of fixing PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543 The next step will be to generalize this fold: trunc ( lshr ( bitcast X) ) -> extractelement (X) Ie, I'm hoping to replace the existing transform of: bitcast ( trunc ( lshr ( bitcast X))) added by: http://reviews.llvm.org/rL112232 with 2 less specific transforms to catch the case in the bug report. Differential Revision: http://reviews.llvm.org/D14879 llvm-svn: 255124
* fix typo; NFCSanjay Patel2015-11-211-1/+1
| | | | llvm-svn: 253785
* [InstCombine] refactor optimizeIntToFloatBitCast() ; NFCISanjay Patel2015-11-181-38/+29
| | | | | | | | | | | | | | | The logic for handling the pattern without a shift is identical to the logic for handling the pattern with a shift if you set the shift amount to zero for the former. This should make it easier to see that we probably don't even need optimizeIntToFloatBitCast(). If we call something like foldVecTruncToExtElt() from visitTrunc(), we'll solve PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543 llvm-svn: 253403
* fix typos; NFCSanjay Patel2015-11-171-2/+2
| | | | llvm-svn: 253359
* use local variables; NFCISanjay Patel2015-11-171-7/+7
| | | | llvm-svn: 253356
* InstCombine: Remove ilist iterator implicit conversions, NFCDuncan P. N. Exon Smith2015-10-131-1/+1
| | | | | | | Stop relying on implicit conversions of ilist iterators in LLVMInstCombine. No functionality change intended. llvm-svn: 250183
* There is a trunc(lshr (zext A), Cst) optimization in InstCombineCasts thatJakub Kuderski2015-09-101-0/+20
| | | | | | | | | | | removes cast by performing the lshr on smaller types. However, currently there is no trunc(lshr (sext A), Cst) variant. This patch add such optimization by transforming trunc(lshr (sext A), Cst) to ashr A, Cst. Differential Revision: http://reviews.llvm.org/D12520 llvm-svn: 247271
* Revert trunc(lshr (sext A), Cst) to ashr A, CstDavid Majnemer2015-09-091-20/+0
| | | | | | This reverts commit r246997, it introduced a regression (PR24763). llvm-svn: 247180
* function names start with a lower case letter; NFCSanjay Patel2015-09-091-54/+54
| | | | llvm-svn: 247150
* don't repeat function names in comments; NFCSanjay Patel2015-09-091-35/+32
| | | | llvm-svn: 247148
* There is a trunc(lshr (zext A), Cst) optimization in InstCombineCasts thatJakub Kuderski2015-09-081-0/+20
| | | | | | | | | | | removes cast by performing the lshr on smaller types. However, currently there is no trunc(lshr (sext A), Cst) variant. This patch add such optimization by transforming trunc(lshr (sext A), Cst) to ashr A, Cst. Differential Revision: http://reviews.llvm.org/D12520 llvm-svn: 246997
* Add support for floating-point minnum and maxnumJames Molloy2015-08-111-2/+8
| | | | | | | | | | | | | | | | | The select pattern recognition in ValueTracking (as used by InstCombine and SelectionDAGBuilder) only knew about integer patterns. This teaches it about minimum and maximum operations. matchSelectPattern() has been extended to return a struct containing the existing Flavor and a new enum defining the pattern's behavior when given one NaN operand. C minnum() is defined to return the non-NaN operand in this case, but the idiomatic C "a < b ? a : b" would return the NaN operand. ARM and AArch64 at least have different instructions for these different cases. llvm-svn: 244580
* Reapply r237539 with a fix for the Chromium build.James Molloy2015-05-201-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure if we're truncating a constant that would then be sign extended that the sign extension of the truncated constant is the same as the original constant. > Canonicalize min/max expressions correctly. > > This patch introduces a canonical form for min/max idioms where one operand > is extended or truncated. This often happens when the other operand is a > constant. For example: > > %1 = icmp slt i32 %a, i32 0 > %2 = sext i32 %a to i64 > %3 = select i1 %1, i64 %2, i64 0 > > Would now be canonicalized into: > > %1 = icmp slt i32 %a, i32 0 > %2 = select i1 %1, i32 %a, i32 0 > %3 = sext i32 %2 to i64 > > This builds upon a patch posted by David Majenemer > (https://www.marc.info/?l=llvm-commits&m=143008038714141&w=2). That pass > passively stopped instcombine from ruining canonical patterns. This > patch additionally actively makes instcombine canonicalize too. > > Canonicalization of expressions involving a change in type from int->fp > or fp->int are not yet implemented. llvm-svn: 237821
* Revert r237539: "Reapply r237520 with another fix for infinite looping"Hans Wennborg2015-05-191-9/+0
| | | | | | This caused PR23583. llvm-svn: 237739
* Reapply r237520 with another fix for infinite loopingJames Molloy2015-05-171-0/+9
| | | | | | | | | SimplifyDemandedBits was "simplifying" a constant by removing just sign bits. This caused a canonicalization race between different parts of instcombine. Fix and regression test added - third time lucky? llvm-svn: 237539
* Revert commits r237521 and r237520.James Molloy2015-05-161-9/+0
| | | | | | | | The AArch64 LNT bot is unhappy - I've found that the problem is in SimpliftDemandedBits, but that's going to require another code review so reverting in the meantime. llvm-svn: 237528
OpenPOWER on IntegriCloud