summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/InstCombine
Commit message (Collapse)AuthorAgeFilesLines
...
* [Local] Add a utility to insert replacement dbg.values, NFCVedant Kumar2018-06-201-11/+6
| | | | | | | | | | | | | | | | | | | | | The purpose of this utility is to make it easier for optimizations to insert replacement dbg.values for instructions they are deleting. This is useful in situations where salvageDebugInfo is inapplicable, say, because the new dbg.value cannot refer to an operand of the dying value. The utility is called insertReplacementDbgValues. It assumes that the instruction 'From' is going to be deleted, and inserts replacement dbg.values for each debug user of 'From'. The newly-inserted dbg.values refer to 'To' instead of 'From'. Each replacement dbg.value has the same location and variable as the debug user it replaces, has a DIExpression determined by the result of 'RewriteExpr' applied to an old debug user of 'From', and is placed before 'InsertBefore'. This should simplify future patches, like D48331. llvm-svn: 335144
* [InstCombine] ignore debuginfo when removing redundant assumes (PR37726)Sanjay Patel2018-06-201-3/+5
| | | | | | | | | | This is similar to: rL335083 Fixes:: https://bugs.llvm.org/show_bug.cgi?id=37726 llvm-svn: 335121
* [NFC][SCEV] Add tests related to bit masking (PR37793)Roman Lebedev2018-06-201-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Related to https://bugs.llvm.org/show_bug.cgi?id=37793, https://reviews.llvm.org/D46760#1127287 We'd like to do this canonicalization https://rise4fun.com/Alive/Gmc But it is currently restricted by rL155136 / rL155362, which says: ``` // This is a constant shift of a constant shift. Be careful about hiding // shl instructions behind bit masks. They are used to represent multiplies // by a constant, and it is important that simple arithmetic expressions // are still recognizable by scalar evolution. // // The transforms applied to shl are very similar to the transforms applied // to mul by constant. We can be more aggressive about optimizing right // shifts. // // Combinations of right and left shifts will still be optimized in // DAGCombine where scalar evolution no longer applies. ``` I think these tests show that for *constants*, SCEV has no issues with that canonicalization. Reviewers: mkazantsev, spatel, efriedma, sanjoy Reviewed By: mkazantsev Subscribers: sanjoy, javed.absar, llvm-commits, stoklund, bixia Differential Revision: https://reviews.llvm.org/D48229 llvm-svn: 335101
* [IR] Introduce helpers to skip debug instructions (NFC)Vedant Kumar2018-06-191-11/+3
| | | | | | | | | | | | | | | | | | | | | | | This patch introduces two helpers to make it easier to ignore debug intrinsics: - Instruction::getNextNonDebugInstruction() This is just like Instruction::getNextNode(), except that it skips debug info. - skipDebugInfo(BasicBlock::iterator) A free function which advances a BasicBlock iterator past any debug info. This is a no-op when the iterator already points to a non-debug instruction. Part of: llvm.org/PR37728 Related to: https://reviews.llvm.org/D47874 Differential Revision: https://reviews.llvm.org/D48305 llvm-svn: 335083
* [InstCombine] Replacing X86-specific rounding intrinsics with generic floor-ceilMikhail Dvoretckii2018-06-191-2/+128
| | | | | | | | | | | | This patch replaces calls to X86-specific intrinsics with floor-ceil semantics with calls to target-independent @llvm.floor.* and @llvm.ceil.* intrinsics. This doesn't affect the resulting machine code, as those intrinsics are lowered to the same instructions, but exposes these specific rounding cases to generic optimizations. Differential Revision: https://reviews.llvm.org/D48067 llvm-svn: 335039
* [X86] Lowering sqrt intrinsics to native IRTomasz Krupa2018-06-151-2/+0
| | | | | | | | | | | | | | Summary: Complementary patch to lowering sqrt intrinsics in Clang. Reviewers: craig.topper, spatel, RKSimon, DavidKreitzer, uriel.k Reviewed By: craig.topper Subscribers: tkrupa, mike.dvoretsky, llvm-commits Differential Revision: https://reviews.llvm.org/D41599 llvm-svn: 334849
* [InstCombine] Avoid iteration/mutation conflictJoseph Tremoulet2018-06-151-1/+2
| | | | | | | | | | | | | | | | | | Summary: When iterating users of a multiply in processUMulZExtIdiom, the call to setOperand in the truncation case may replace the use being visited; make sure the iterator has been advanced before doing that replacement. Reviewers: majnemer, davide Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D48192 llvm-svn: 334844
* [InstCombine] Recommit: Fold (x << y) >> y -> x & (-1 >> y)Roman Lebedev2018-06-151-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | Summary: We already do it for splat constants, but not just values. Also, undef cases are mostly non-functional. The original commit was reverted because it broke tests for amdgpu backend, which i didn't check. Now, the backed was updated to recognize these new patterns, so we are good. https://bugs.llvm.org/show_bug.cgi?id=37603 https://rise4fun.com/Alive/cplX Reviewers: spatel, craig.topper, mareko, bogner, rampitec, nhaehnle, arsenm Reviewed By: spatel, rampitec, nhaehnle Subscribers: wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D47980 llvm-svn: 334818
* Revert rL334371 / D47980: "[InstCombine] Fold (x << y) >> y -> x & (-1 >> y)"Roman Lebedev2018-06-101-9/+0
| | | | | | | test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll broke, and i did not notice because i did not build that backend. llvm-svn: 334373
* [InstCombine] Fold (x >> y) << y -> x & (-1 << y)Roman Lebedev2018-06-101-1/+10
| | | | | | | | | | | | | | | | | | | | | | | Summary: We already do it for matching splat constants, but not just values. Further improvements for non-matching splat constants, as noted in https://reviews.llvm.org/D46760#1123713 will be needed, but i'd prefer to do that as a follow-up. https://bugs.llvm.org/show_bug.cgi?id=37603 https://rise4fun.com/Alive/cplX https://rise4fun.com/Alive/0HF Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47981 llvm-svn: 334372
* [InstCombine] Fold (x << y) >> y -> x & (-1 >> y)Roman Lebedev2018-06-101-0/+9
| | | | | | | | | | | | | | | | | | | Summary: We already do it for splat constants, but not just values. Also, undef cases are mostly non-functional. https://bugs.llvm.org/show_bug.cgi?id=37603 https://rise4fun.com/Alive/cplX Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47980 llvm-svn: 334371
* [X86] Remove masking from the 512-bit masked floating point add/sub/mul/div ↵Craig Topper2018-06-101-50/+17
| | | | | | intrinsics. Use a select in IR instead. llvm-svn: 334358
* [InstCombine] Skip dbg.value(s) when looking at stack{save,restore}.Davide Italiano2018-06-081-1/+8
| | | | | | Fixes PR37713. llvm-svn: 334317
* [InstCombine] fold another shifty abs pattern to cmp+sel (PR36036)Sanjay Patel2018-06-062-1/+22
| | | | | | | | | | | | | | | | | | | | | | The bug report: https://bugs.llvm.org/show_bug.cgi?id=36036 ...requests a DAG change for this, but an IR canonicalization probably handles most cases. If we still want to match this pattern in the backend, there's a proposal for that too: D47831 Alive proofs including nsw/nuw cases that were first noted in: D46988 https://rise4fun.com/Alive/Kmp This patch is largely copied from the existing code that was initially added with: D40984 ...but I didn't see much gain from trying to share code. llvm-svn: 334137
* [InstCombine] PR37603: low bit mask canonicalizationRoman Lebedev2018-06-061-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is [[ https://bugs.llvm.org/show_bug.cgi?id=37603 | PR37603 ]]. https://godbolt.org/g/VCMNpS https://rise4fun.com/Alive/idM When doing bit manipulations, it is quite common to calculate some bit mask, and apply it to some value via `and`. The typical C code looks like: ``` int mask_signed_add(int nbits) { return (1 << nbits) - 1; } ``` which is translated into (with `-O3`) ``` define dso_local i32 @mask_signed_add(int)(i32) local_unnamed_addr #0 { %2 = shl i32 1, %0 %3 = add nsw i32 %2, -1 ret i32 %3 } ``` But there is a second, less readable variant: ``` int mask_signed_xor(int nbits) { return ~(-(1 << nbits)); } ``` which is translated into (with `-O3`) ``` define dso_local i32 @mask_signed_xor(int)(i32) local_unnamed_addr #0 { %2 = shl i32 -1, %0 %3 = xor i32 %2, -1 ret i32 %3 } ``` Since we created such a mask, it is quite likely that we will use it in `and` next. And then we may get rid of `not` op by folding into `andn`. But now that i have actually looked: https://godbolt.org/g/VTUDmU _some_ backend changes will be needed too. We clearly loose `bzhi` recognition. Reviewers: spatel, craig.topper, RKSimon Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47428 llvm-svn: 334127
* InstCombine: ignore debug instructions during fence combineTim Northover2018-06-061-1/+5
| | | | | | | | | | We should never get different CodeGen based on whether the code is being compiled in debug mode so we must skip over @llvm.dbg.value (and similar) calls. Should fix at least the worst part of PR37690. llvm-svn: 334090
* [InstCombine] Correct the cmp operand type used when canonicalizing abs/nabsJohn Brawn2018-06-051-1/+1
| | | | | | | | | | When adjusting a cmp in order to canonicalize an abs/nabs select pattern we need to use the type of the existing operand when creating a new operand not the type of a select operand, as the two may be different. This fixes PR37686. llvm-svn: 334019
* [InstCombine] refine UB-handling in shuffle-binop transformSanjay Patel2018-06-041-14/+14
| | | | | | | | | | | | | | | | | | As noted in rL333782, we can be both better for optimization and safer with this transform: BinOp (shuffle V1, Mask), C --> shuffle (BinOp V1, NewC), Mask The only potentially unsafe-to-speculate binops are integer div/rem. All other binops are always safe (although I don't see a way to assert that in code here). For opcodes like shifts that can produce poison, it can't matter here because we know the lanes with undef are dropped by the subsequent shuffle. Differential Revision: https://reviews.llvm.org/D47686 llvm-svn: 333962
* Move Analysis/Utils/Local.h back to TransformsDavid Blaikie2018-06-046-6/+6
| | | | | | | | | | Review feedback from r328165. Split out just the one function from the file that's used by Analysis. (As chandlerc pointed out, the original change only moved the header and not the implementation anyway - which was fine for the one function that was used (since it's a template/inlined in the header) but not in general) llvm-svn: 333954
* [InstCombine] Fix div handlingSerguei Katkov2018-06-041-2/+2
| | | | | | | | | | | | | When we optimize select basing on fact that div by 0 is undef we should not traverse the instruction which are not guaranteed to transfer execution to next instruction. Guard intrinsic is an example. Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47576 llvm-svn: 333864
* [InstCombine] improve sub with bool foldsSanjay Patel2018-06-031-13/+14
| | | | | | | | There's a patchwork of existing transforms trying to handle these cases, but as seen in the changed test, we weren't catching them all. llvm-svn: 333845
* [InstCombine] call simplify before trying vector foldsSanjay Patel2018-06-026-76/+58
| | | | | | | | | | | | | | | | | | | | As noted in the review thread for rL333782, we could have made a bug harder to hit if we were simplifying instructions before trying other folds. The shuffle transform in question isn't ever a simplification; it's just a canonicalization. So I've renamed that to make that clearer. This is NFCI at this point, but I've regenerated the test file to show the cosmetic value naming difference of using instcombine's RAUW vs. the builder. Possible follow-ups: 1. Move reassociation folds after simplifies too. 2. Refactor common code; we shouldn't have so much repetition. llvm-svn: 333820
* [InstCombine] fix vector shuffle transform to replace undef elements (PR37648)Sanjay Patel2018-06-011-0/+16
| | | | | | | | | | | | | | This bug: https://bugs.llvm.org/show_bug.cgi?id=37648 ...was created with the enhancement to this transform with rL332479. The urem test shows the disaster potential: any undef divisor lane makes the whole op undef. The test diffs show that vector demanded elements turns some of the potential, but not all, unused binop operands back into undef already. llvm-svn: 333782
* [InstCombine] narrow select to match condition operands' sizeSanjay Patel2018-05-311-8/+11
| | | | | | | | | | | | | This is the planned enhancement to D47163 / rL333611. We want to match cmp/select sizes because that will be recognized as min/max more easily and lead to better codegen (especially for vector types). As mentioned in D47163, this improves some of the tests that would also be folded by D46380, so we may want to adjust that patch to match the new patterns where the extend op occurs after the select. llvm-svn: 333689
* [InstCombine, ARM] Convert vld1 to llvm loadAlexandros Lamprineas2018-05-311-1/+30
| | | | | | | | | | Convert a vector load intrinsic into an llvm load instruction. This is beneficial when the underlying object being addressed comes from a constant, since we get constant-folding for free. Differential Revision: https://reviews.llvm.org/D46273 llvm-svn: 333643
* Revert rL333106 / D46814: [InstCombine] Fold unfolded masked merge pattern ↵Roman Lebedev2018-05-311-36/+0
| | | | | | | | | | | | | with variable mask! In post-commit review, Eric Christopher notes that many new MSan warnings are being observed with this patch. The probable reason is: if 'y' is undef here and we could evaluate it twice and get different results. We can't increase the number of uses of a value. llvm-svn: 333631
* [InstCombine] don't change the size of a select if it would mismatch its ↵Sanjay Patel2018-05-311-4/+10
| | | | | | | | | | | | | | | | | | | | | condition operands' sizes Don't always: cast (select (cmp x, y), z, C) --> select (cmp x, y), (cast z), C' This is something that came up as far back as D26556, and I lost track of it. I suspect that this transform is part of the underlying problem that is inspiring some of the recent proposals that seek to match larger patterns that include a cast op. Even if that's not true, this transform causes problems for codegen (particularly with vector types). A transform to actively match the size of cmp and select operand sizes should follow. This patch just removes the harmful canonicalization in the other direction. Differential Revision: https://reviews.llvm.org/D47163 llvm-svn: 333611
* [InstCombine] don't negate constant expression with fsub (PR37605)Sanjay Patel2018-05-301-1/+3
| | | | | | | X + (-C) would be transformed back into X - C, so infinite loop: https://bugs.llvm.org/show_bug.cgi?id=37605 llvm-svn: 333610
* [InstCombine, ARM, AArch64] Convert table lookup to shuffle vectorAlexandros Lamprineas2018-05-301-0/+46
| | | | | | | | | | | Turning a table lookup intrinsic into a shuffle vector instruction can be beneficial. If the mask used for the lookup is the constant vector {7,6,5,4,3,2,1,0}, then the back-end generates byte reverse instructions instead. Differential Revision: https://reviews.llvm.org/D46133 llvm-svn: 333550
* [InstCombine] Combine XOR and AES instructions on ARM/ARM64.Chad Rosier2018-05-241-0/+17
| | | | | | | | | | | The ARM/ARM64 AESE and AESD instructions have a builtin XOR as the first step in the instruction. Therefore, if the AES key is zero and the AES data was previously XORed, it can be combined into a single instruction. Differential Revision: https://reviews.llvm.org/D47239 Patch by Michael Brase! llvm-svn: 333193
* [InstCombine] Fold unfolded masked merge pattern with variable mask!Roman Lebedev2018-05-231-0/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Finally fixes [[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]]. Now that the backend is all done, we can finally fold it! The canonical unfolded masked merge pattern is ```(x & m) | (y & ~m)``` There is a second, equivalent variant: ```(x | ~m) & (y | m)``` Only one of them (the or-of-and's i think) is canonical. And if the mask is not a constant, we should fold it to: ```((x ^ y) & M) ^ y``` https://rise4fun.com/Alive/ndQw Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: nicholas, RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D46814 llvm-svn: 333106
* [InstCombine] Negate ABS/NABS patterns by swapping the select operands to ↵Craig Topper2018-05-231-0/+16
| | | | | | | | remove the negation Differential Revision: https://reviews.llvm.org/D47236 llvm-svn: 333101
* [AMDGPU] Optimze old value of v_mov_b32_dppStanislav Mekhanoshin2018-05-221-0/+17
| | | | | | | | | | We can eliminate old value if bound_ctrl = 1 and row_mask = bank_mask = 0xf. This is alternative implementation working with the intrinsic in InstCombine. Original review for past-ISel optimization: D46570. Differential Revision: https://reviews.llvm.org/D46596 llvm-svn: 332956
* [InstCombine] remove fptrunc (select) code; NFCISanjay Patel2018-05-211-17/+0
| | | | | | | This pattern is handled within commonCastTransforms(), so the code here is dead AFAICT. llvm-svn: 332887
* [InstCombine] Fix PR37526: MinMax patterns produce an infinite loop.Alexey Bataev2018-05-211-3/+8
| | | | | | | | | | | | | | | | | Summary: This patch fixes PR37526 by simplifying the newly generated LoadInst instructions. If the pointer address is a bitcast from the pointer to the NewType, we can just remove this extra bitcast instead of creating the new one. This fixes the PR37526 + may speed up the whole compilation process. Reviewers: spatel, RKSimon, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47144 llvm-svn: 332855
* [X86] Remove mask arguments from permvar builtins/intrinsics. Use a select ↵Craig Topper2018-05-201-20/+12
| | | | | | | | in IR instead. Someday maybe we'll use selects for all intrinsics. llvm-svn: 332824
* [InstCombine] choose 1 form of abs and nabs as canonicalSanjay Patel2018-05-201-0/+51
| | | | | | | | | | | | | | | We already do this for min/max (see the blob above the diff), so we should do the same for abs/nabs. A sign-bit check (<s 0) is used as a predicate for other IR transforms and it's likely the best for codegen. This might solve the motivating cases for D47037 and D47041, but I think those patches still make sense. We can't guarantee this canonicalization if the icmp has more than one use. Differential Revision: https://reviews.llvm.org/D47076 llvm-svn: 332819
* [InstCombine] Qualify a select pattern based transform to restrct to only ↵Craig Topper2018-05-181-1/+1
| | | | | | min/max and ignore abs/nabs. llvm-svn: 332770
* [WebAssembly] Add Wasm personality and isScopedEHPersonality()Heejin Ahn2018-05-171-0/+1
| | | | | | | | | | | | | | | | | | | | | Summary: - Add wasm personality function - Re-categorize the existing `isFuncletEHPersonality()` function into two different functions: `isFuncletEHPersonality()` and `isScopedEHPersonality(). This becomes necessary as wasm EH uses scoped EH instructions (catchswitch, catchpad/ret, and cleanuppad/ret) but not outlined funclets. - Changed some callsites of `isFuncletEHPersonality()` to `isScopedEHPersonality()` if they are related to scoped EH IR-level stuff. Reviewers: majnemer, dschuff, rnk Subscribers: jfb, sbc100, jgravelle-google, eraman, JDevlieghere, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D45559 llvm-svn: 332667
* Add a limit for phi folding instcombineXinliang David Li2018-05-171-1/+9
| | | | | | Differential Revision: http://reviews.llvm.org/D47023 llvm-svn: 332653
* [InstCombine] Propagate the nsw/nuw flags from the add in the 'shifty' abs ↵Craig Topper2018-05-171-1/+5
| | | | | | | | | | | | pattern to the sub in the select version. According to alive this is valid. I'm hoping to use this to make an assumption that the sign bit is zero after this sequence. The only way it wouldn't be is if the input was INT__MIN, but by preserving the flags we can make doing this to INT_MIN UB. The nuw flags is weird because it creates such a contradiction that the original number would have to be positive meaning we could remove the select entirely, but we don't get that far. Differential Revision: https://reviews.llvm.org/D46988 llvm-svn: 332623
* [InstCombine] allow more binop (shuffle X), C transformsSanjay Patel2018-05-161-2/+4
| | | | | | | | | The canonicalization was restricted to shuffle masks with a 1-to-1 mapping to the constant vector, but that disqualifies the common splat pattern. This is part of solving PR37463: https://bugs.llvm.org/show_bug.cgi?id=37463 llvm-svn: 332479
* [InstCombine] fix binop (shuffle X), C --> shuffle (binop X, C') to check usesSanjay Patel2018-05-151-1/+1
| | | | llvm-svn: 332407
* [InstCombine] clean up code for binop-shuffle transforms; NFCISanjay Patel2018-05-151-39/+34
| | | | llvm-svn: 332399
* [InstCombine] fix binop-of-shuffles to check usesSanjay Patel2018-05-151-12/+10
| | | | llvm-svn: 332375
* [InstCombine] fix crash due to ignored addrspacecastKeno Fischer2018-05-141-2/+3
| | | | | | | | | | | | | | | | | | Summary: Part of the InstCombine code for simplifying GEPs looks through addrspacecasts. However, this was done by updating a variable also used by the next transformation, for marking GEPs as inbounds. This led to replacing a GEP with a similar instruction in a different addrspace, which caused an assertion failure in RAUW. This caused julia issue https://github.com/JuliaLang/julia/issues/27055 Patch by Jeff Bezanson <jeff@juliacomputing.com> Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D46722 llvm-svn: 332302
* Rename DEBUG macro to LLVM_DEBUG.Nicola Zaghen2018-05-147-42/+50
| | | | | | | | | | | | | | | | The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240
* [X86] Extend instcombine folds for pclmuldq intrinsics to the 256 and 512 ↵Craig Topper2018-05-131-9/+12
| | | | | | bit version. llvm-svn: 332202
* [X86] Remove and autoupgrade masked vpermd/vpermps intrinsics.Craig Topper2018-05-131-2/+0
| | | | llvm-svn: 332198
* [X86] Remove and autoupgrade a bunch of FMA instrinsics that are no longer ↵Craig Topper2018-05-112-12/+0
| | | | | | used by clang. llvm-svn: 332146
OpenPOWER on IntegriCloud