path: root/llvm/lib/Transforms/InstCombine
Commit log (newest first). Each entry lists the commit subject (author, date; files changed, -removed/+added lines) followed by the commit message body.
...
* [InstCombine] Fold x u> x & C to x u> C (Roman Lebedev, 2018-07-14; 1 file, -0/+5)
    https://bugs.llvm.org/show_bug.cgi?id=38123
    https://rise4fun.com/Alive/JvS
    This pattern is not commutative. But InstSimplify will already have taken care of the 'commutative' variant.
    llvm-svn: 337100
* [InstCombine] Fold x & (-1 >> y) u< x to x u> (-1 >> y) (Roman Lebedev, 2018-07-14; 1 file, -0/+5)
    https://bugs.llvm.org/show_bug.cgi?id=38123
    https://rise4fun.com/Alive/ocb
    This pattern is not commutative. But InstSimplify will already have taken care of the 'commutative' variant.
    llvm-svn: 337098
* [InstCombine] Fold x & (-1 >> y) u>= x to x u<= (-1 >> y) (Roman Lebedev, 2018-07-14; 1 file, -0/+5)
    https://bugs.llvm.org/show_bug.cgi?id=38123
    https://rise4fun.com/Alive/azI
    This pattern is not commutative. But InstSimplify will already have taken care of the 'commutative' variant.
    llvm-svn: 337096
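As a side illustration (not from the commits): the u< and u>= folds above rely on (-1 >> y) being a mask of low set bits, so "x has bits above the mask" is the same as "x is larger than the mask". A minimal exhaustive 8-bit check in C++, with m standing in for the (-1 >> y) mask:

    // Sketch: verify (x & m) u< x  <=>  x u> m   and   (x & m) u>= x  <=>  x u<= m
    // for every low-bit mask m and every 8-bit x.
    #include <cassert>
    #include <cstdint>

    int main() {
      for (unsigned y = 0; y < 8; ++y) {
        uint8_t m = uint8_t(0xFFu >> y);        // the (-1 >> y) low-bit mask
        for (unsigned xi = 0; xi < 256; ++xi) {
          uint8_t x = uint8_t(xi);
          assert(((x & m) < x)  == (x > m));    // the u< fold
          assert(((x & m) >= x) == (x <= m));   // the u>= fold
        }
      }
      return 0;
    }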
* [InstCombine] return when SimplifyAssociativeOrCommutative makes a change (Sanjay Patel, 2018-07-13; 3 files, -12/+28)
    This bug was created by rL335258 because we used to always call instsimplify after trying the associative folds. After that change it became possible for subsequent folds to encounter unsimplified code (and potentially assert because of it).
    Instead of carrying changed state through instcombine, we can just return immediately. This allows instsimplify to run, so we can continue assuming that easy folds have already occurred.
    llvm-svn: 336965
* Simplify recursive launder.invariant.group and strip (Piotr Padlewski, 2018-07-12; 1 file, -1/+39)
    Summary: This patch is crucial for proving equality of laundered/stripped pointers, e.g.:
        bool foo(A *a) { return a == std::launder(a); }
    Clang with -fstrict-vtable-pointers will emit something like:
        define dso_local zeroext i1 @_Z3fooP1A(%struct.A* %a) {
        entry:
          %c = bitcast %struct.A* %a to i8*
          %call = tail call i8* @llvm.launder.invariant.group.p0i8(i8* %c)
          %0 = bitcast %struct.A* %a to i8*
          %1 = tail call i8* @llvm.strip.invariant.group.p0i8(i8* %0)
          %2 = tail call i8* @llvm.strip.invariant.group.p0i8(i8* %call)
          %cmp = icmp eq i8* %1, %2
          ret i1 %cmp
        }
    Because %2 can be replaced with @llvm.strip.invariant.group(%0), and %2 and %1 will produce the same value (strip is readnone), we can replace the compare with true.
    Reviewers: rsmith, hfinkel, majnemer, amharc, kuhar
    Subscribers: llvm-commits, hiraditya
    Differential Revision: https://reviews.llvm.org/D47423
    llvm-svn: 336963
* [InstCombine] Fold x & (-1 >> y) != x to x u> (-1 >> y) (Roman Lebedev, 2018-07-12; 1 file, -0/+4)
    Summary: A complementary fold to D49179.
    https://bugs.llvm.org/show_bug.cgi?id=38123
    https://rise4fun.com/Alive/Rny
    Caveat: one more thing in `test/Transforms/InstCombine/icmp-logical.ll` breaks.
    Reviewers: spatel, craig.topper
    Reviewed By: spatel
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D49205
    llvm-svn: 336911
* [X86] Remove and autoupgrade the scalar fma intrinsics with masking. (Craig Topper, 2018-07-12; 2 files, -47/+0)
    This converts them to what clang is now using for codegen. Unfortunately, there seem to be a few kinks to work out still. I'll try to address those with follow-up patches.
    llvm-svn: 336871
* [InstCombine] Fold x & (-1 >> y) == x to x u<= (-1 >> y) (Roman Lebedev, 2018-07-11; 1 file, -0/+34)
    Summary:
    https://bugs.llvm.org/show_bug.cgi?id=38123
    This pattern will be produced by the Implicit Integer Truncation sanitizer (https://reviews.llvm.org/D48958, https://bugs.llvm.org/show_bug.cgi?id=21530) in the unsigned case, therefore it is probably a good idea to improve it.
    https://rise4fun.com/Alive/Rny
    ^ there are more opportunities for folds; i will follow up with them afterwards.
    Caveat: this somehow exposes missed opportunities in `test/Transforms/InstCombine/icmp-logical.ll`. It seems the problem is in `foldLogOpOfMaskedICmps()` in `InstCombineAndOrXor.cpp`. But i'm not quite sure what is wrong, because it calls `getMaskedTypeForICmpPair()`, which calls `decomposeBitTestICmp()`, which should already work for these cases... As @spatel notes in https://reviews.llvm.org/D49179#1158760, that code is a rather complex mess, so we'll let it slide.
    Reviewers: spatel, craig.topper
    Reviewed By: spatel
    Subscribers: yamauchi, majnemer, t.p.northover, llvm-commits
    Differential Revision: https://reviews.llvm.org/D49179
    llvm-svn: 336834
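As an aside (not from the commit): the sanitizer motivation is essentially a "does the value survive truncation?" check, and the mask form and the folded compare form always agree. A small illustrative C++ check with made-up helper names:

    #include <cassert>
    #include <cstdint>

    // Pre-fold form the sanitizer would emit: mask to 8 bits and compare back.
    bool fits_in_u8_mask_form(uint32_t x)    { return (x & 0xFFu) == x; }
    // Folded form: a single unsigned compare against the mask.
    bool fits_in_u8_compare_form(uint32_t x) { return x <= 0xFFu; }

    int main() {
      for (uint32_t x = 0; x < 0x20000u; ++x)
        assert(fits_in_u8_mask_form(x) == fits_in_u8_compare_form(x));
      return 0;
    }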
* [InstCombine] allow flag propagation when using safe constant (Sanjay Patel, 2018-07-10; 1 file, -2/+3)
    This corresponds with the code for the single binop pattern added in rL336684.
    llvm-svn: 336696
* [InstCombine] safely allow non-commutative binop identity constant folds (Sanjay Patel, 2018-07-10; 1 file, -8/+11)
    This was originally intended with D48893, but as discussed there, we have to make the folds safe from producing extra poison. This should give the single binop folds the same capabilities as the existing folds for 2-binops+shuffle.
    LLVM binary opcode review: there are a total of 18 binops. There are 7 commutative binops (add, mul, and, or, xor, fadd, fmul) which we already fold. We're able to fold 6 more opcodes with this patch (shl, lshr, ashr, fdiv, udiv, sdiv). There are no folds for srem/urem/frem AFAIK. We don't bother with sub/fsub with constant operand 1 because those are canonicalized to add/fadd. 7 + 6 + 3 + 2 = 18.
    llvm-svn: 336684
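Side note (illustrative C++, not LLVM source): for the six newly handled non-commutative opcodes the identity constant has to sit on the right-hand side. A quick sanity check of those right identities:

    #include <cassert>
    #include <cstdint>

    int main() {
      for (int i = 0; i < 256; ++i) {
        uint8_t u = uint8_t(i);
        int8_t  s = int8_t(i);
        assert(uint8_t(u << 0) == u);            // shl  x, 0   == x
        assert(uint8_t(u >> 0) == u);            // lshr x, 0   == x
        assert(int8_t(s >> 0)  == s);            // ashr x, 0   == x (arithmetic shift on typical targets)
        assert(uint8_t(u / 1)  == u);            // udiv x, 1   == x
        assert(int8_t(s / 1)   == s);            // sdiv x, 1   == x
        assert(double(i) / 1.0 == double(i));    // fdiv x, 1.0 == x
      }
      return 0;
    }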
* [InstCombine] drop poison flags when shuffle mask undef propagates to constant (Sanjay Patel, 2018-07-10; 1 file, -0/+5)
    llvm-svn: 336679
* [InstCombine] allow more shuffle-binop folds with safe constants (Sanjay Patel, 2018-07-10; 1 file, -10/+13)
    The case with 2 variables is more complicated than the case where we eliminate the shuffle entirely, because a shuffle with an undef mask element creates an undef result. I'm not aware of any current analysis/transform that recognizes undef propagating to a div/rem/shift, but we have to guard against the possibility.
    llvm-svn: 336668
* [InstCombine] allow more shuffle folds using safe constants (Sanjay Patel, 2018-07-09; 3 files, -25/+50)
    getSafeVectorConstantForBinop() was calling getBinOpIdentity() assuming that the constant we wanted was operand 1 (RHS). That's wrong, but I don't think we could expose a bug or even a suboptimal fold from that because the callers have other guards for any binop that would have been affected.
    llvm-svn: 336617
* llvm: Add support for "-fno-delete-null-pointer-checks" (Manoj Gupta, 2018-07-09; 2 files, -7/+14)
    Summary: Support for this option is needed for building the Linux kernel and is a very frequently requested feature by kernel developers. More details: https://lkml.org/lkml/2018/4/4/601
    GCC option description for -fdelete-null-pointer-checks: assume that programs cannot safely dereference null pointers, and that no code or data element resides at address zero. -fno-delete-null-pointer-checks is the inverse of this, implying that null pointer dereferencing is not undefined.
    This feature is implemented in this CL as the function attribute "null-pointer-is-valid"="true" in LLVM IR (under review at D47894). The CL updates several passes that assumed null pointer dereferencing is undefined to not optimize when the "null-pointer-is-valid"="true" attribute is present.
    Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv
    Reviewed By: efriedma, george.burgess.iv
    Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits
    Differential Revision: https://reviews.llvm.org/D47895
    llvm-svn: 336613
* [InstCombine] avoid extra poison when moving shift above shuffle (Sanjay Patel, 2018-07-09; 1 file, -8/+5)
    As discussed in D49047 / D48987, shift-by-undef produces poison, so we can't use undef vector elements in that case.
    Note that we need to extend this for poison-generating flags, and there's a proposal to create poison from FMF in D47963.
    llvm-svn: 336562
* [InstCombine] generalize safe vector constant utility (Sanjay Patel, 2018-07-09; 3 files, -30/+23)
    This is almost NFC, but there could be some case where the original code had undefs in the constants (rather than just the shuffle mask), and we'll use safe constants rather than undefs now.
    The FIXME noted in foldShuffledBinop() is already visible in existing tests, so correcting that is the next step.
    llvm-svn: 336558
* [InstCombine] fix shuffle-of-binops transform to avoid poison/undef (Sanjay Patel, 2018-07-09; 1 file, -21/+52)
    As noted in D48987, there are many different ways for this transform to go wrong. In particular, the poison potential for shifts means we have to be more careful with those ops. I added tests to make that behavior visible for all of the different cases that I could find.
    This is a partial fix. To make this review easier, I did not make changes for the single binop pattern (handled in foldSelectShuffleWith1Binop()). I also left out some potential optimizations noted with TODO comments. I'll follow up once we're confident that things are correct here.
    The goal is to correct all marked FIXME tests to either avoid the shuffle transform or do it safely.
    Note that distinguishing when the shuffle mask contains undefs and using getBinOpIdentity() allows for some improvements to div/rem patterns, so there are wins along with the missed opportunities and fixes.
    Differential Revision: https://reviews.llvm.org/D49047
    llvm-svn: 336546
* Use Type::isIntOrPtrTy where possible, NFC (Vedant Kumar, 2018-07-06; 1 file, -1/+1)
    It's a bit neater to write T.isIntOrPtrTy() over `T.isIntegerTy() || T.isPointerTy()`.
    I used Python's re.sub with this regex to update users:
        r'([\w.\->()]+)isIntegerTy\(\)\s*\|\|\s*\1isPointerTy\(\)'
    llvm-svn: 336462
* [Local] replaceAllDbgUsesWith: Update debug values before RAUW (Vedant Kumar, 2018-07-06; 1 file, -15/+7)
    The replaceAllDbgUsesWith utility helps passes preserve debug info when replacing one value with another.
    This improves upon the existing insertReplacementDbgValues API by:
    - Updating debug intrinsics in-place, while preventing use-before-def of the replacement value.
    - Falling back to salvageDebugInfo when a replacement can't be made.
    - Moving the responsibility for rewriting llvm.dbg.* DIExpressions into common utility code.
    Along with the API change, this teaches replaceAllDbgUsesWith how to create DIExpressions for three basic integer and pointer conversions:
    - The no-op conversion. Applies when the values have the same width, or have bit-for-bit compatible pointer representations.
    - Truncation. Applies when the new value is wider than the old one.
    - Zero/sign extension. Applies when the new value is narrower than the old one.
    Testing:
    - check-llvm, check-clang, a stage2 `-g -O3` build of clang, regression/unit testing.
    - This resolves a number of mis-sized dbg.value diagnostics from Debugify.
    Differential Revision: https://reviews.llvm.org/D48676
    llvm-svn: 336451
* Revert "[InstCombine] Delay foldICmpUsingKnownBits until simple transforms ↵Max Kazantsev2018-07-061-7/+3
| | | | | | are done" llvm-svn: 336410
* Fix asserts in AMDGCN fmed3 folding by handling more cases of NaN (Matt Arsenault, 2018-07-05; 1 file, -7/+18)
    Better NaN handling for AMDGCN fmed3. All operands are checked for NaN now. The checks were moved before the canonicalization to provide a better mapping from fclamp. Changed the behaviour of fmed3(x,y,NaN) to return max(x,y) instead of min(x,y) in light of this. Updated tests as a result and added some new cases to cover the fix.
    Patch by Alan Baker
    llvm-svn: 336375
* [X86] Remove X86 specific scalar FMA intrinsics and upgrade to target independent FMA and extractelement/insertelement. (Craig Topper, 2018-07-05; 2 files, -4/+0)
    llvm-svn: 336315
* [InstCombine] allow narrowing of min/max/abs (Sanjay Patel, 2018-07-04; 1 file, -14/+14)
    We have bailout hacks based on min/max in various places in instcombine that shouldn't be necessary. The affected test was added for D48930, which is a consequence of the improvement in D48584 (https://reviews.llvm.org/rL336172).
    I'm assuming the visitTrunc bailout in this patch was added specifically to avoid a change from SimplifyDemandedBits, so I'm just moving that below the EvaluateInDifferentType optimization. A narrow min/max is still a min/max.
    llvm-svn: 336293
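The closing point ("a narrow min/max is still a min/max") can be checked directly: taking the max on sign-extended values and truncating the result matches the max computed in the narrow type. Illustrative C++ only, not from the patch:

    #include <algorithm>
    #include <cassert>
    #include <cstdint>

    int main() {
      for (int a = -128; a <= 127; ++a)
        for (int b = -128; b <= 127; ++b) {
          int wide = std::max(a, b);                                 // max in the wide type
          assert(int8_t(wide) == std::max(int8_t(a), int8_t(b)));   // equals max in the narrow type
        }
      return 0;
    }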
* [DebugInfo][InstCombine] Preserve DI after combining zext (Anastasis Grammenos, 2018-07-04; 1 file, -0/+11)
    When zext is EvaluatedInDifferentType, InstCombine drops the dbg.value intrinsic. This patch tries to preserve said DI, by inserting the zext's old DI in the resulting instruction. (Only for integer type for now)
    Differential Revision: https://reviews.llvm.org/D48331
    llvm-svn: 336254
* [InstCombine] fold shuffle-with-binop and common value (Sanjay Patel, 2018-07-03; 1 file, -0/+47)
    This is the last significant change suggested in PR37806: https://bugs.llvm.org/show_bug.cgi?id=37806#c5
    ...though there are several follow-ups noted in the code comments in this patch to complete this transform.
    It's possible that a binop feeding a select-shuffle has been eliminated by earlier transforms (or the code was just written like this in the 1st place), so we'll fail to match the patterns that have 2 binops from: D48401, D48678, D48662, D48485.
    In that case, we can try to materialize identity constants for the remaining binop to fill in the "ghost" lanes of the vector (where we just want to pass through the original values of the source operand).
    I added comments to ConstantExpr::getBinOpIdentity() to show planned follow-ups. For now, we only handle the 5 commutative integer binops (add/mul/and/or/xor).
    Differential Revision: https://reviews.llvm.org/D48830
    llvm-svn: 336196
* [InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done (Max Kazantsev, 2018-07-03; 1 file, -3/+7)
    This patch changes the order of transforms in InstCombineCompares so that transforms based on ranges, which produce complex bit arithmetic, are not performed before simpler things (like folding with constants) are done. See PR37636 for the motivating example.
    Differential Revision: https://reviews.llvm.org/D48584
    Reviewed By: spatel, lebedev.ri
    llvm-svn: 336172
* [InstCombine] reverse canonicalization of add --> or to allow more shuffle folding (Sanjay Patel, 2018-07-02; 1 file, -12/+55)
    This extends D48485 to allow another pair of binops (add/or) to be combined either with or without a leading shuffle:
        or X, C --> add X, C (when X and C have no common bits set)
    Here, we need value tracking to determine that the 'or' can be reversed into an 'add', and we've added general infrastructure to allow extending to other opcodes or moving to where other passes could use that functionality.
    Differential Revision: https://reviews.llvm.org/D48662
    llvm-svn: 336128
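The value-tracking fact behind reversing 'or' into 'add' is simply that the two agree whenever the operands share no set bits. A brute-force C++ check (illustration only):

    #include <cassert>

    int main() {
      for (unsigned x = 0; x < 256; ++x)
        for (unsigned c = 0; c < 256; ++c)
          if ((x & c) == 0)                 // no common bits set
            assert((x | c) == (x + c));     // or and add produce the same value
      return 0;
    }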
* [InstCombine] enhance shuffle-of-binops to allow different variable ops (PR37806) (Sanjay Patel, 2018-06-29; 1 file, -12/+38)
    This was discussed in D48401 as another improvement for: https://bugs.llvm.org/show_bug.cgi?id=37806
    If we have 2 different variable values, then we shuffle (select) those lanes, shuffle (select) the constants, and then perform the binop. This eliminates a binop.
    The new shuffle uses the same shuffle mask as the existing shuffle, so there's no danger of creating a difficult shuffle.
    All of the earlier constraints still apply, but we also check for extra uses to avoid creating more instructions than we'll remove.
    Additionally, we're disallowing the fold for div/rem because that could expose a UB hole.
    Differential Revision: https://reviews.llvm.org/D48678
    llvm-svn: 335974
* [InstCombine] fix opcode check in shuffle fold (Sanjay Patel, 2018-06-28; 1 file, -1/+1)
    There's no way to expose this difference currently, but we should use the updated variable because the original opcodes can go stale if we transform into something new.
    llvm-svn: 335920
* [InstCombine] allow shl+mul combos with shuffle (select) fold (PR37806) (Sanjay Patel, 2018-06-28; 1 file, -5/+29)
    This is an enhancement to D48401 that was discussed in: https://bugs.llvm.org/show_bug.cgi?id=37806
    We can convert a shift-left-by-constant into a multiply (we canonicalize IR in the other direction because that's generally better of course). This allows us to remove the shuffle as we do in the regular opcodes-are-the-same cases.
    This requires a small hack to make sure we don't introduce any extra poison: https://rise4fun.com/Alive/ZGv
    Other examples of opcodes where this would work are add+sub and fadd+fsub, but we already canonicalize those subs into adds, so there's nothing to do for those cases AFAICT. There are planned enhancements for opcode transforms such as or -> add.
    Note that there's a different fold needed if we've already managed to simplify away a binop as seen in the test based on PR37806, but we manage to get that one case here because this fold is positioned above the demanded elements fold currently.
    Differential Revision: https://reviews.llvm.org/D48485
    llvm-svn: 335888
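The reason shl and mul can be treated as the same opcode in this fold is that a left shift by a constant is a multiply by a power of two. A minimal C++ check (illustrative only, not from the patch):

    #include <cassert>
    #include <cstdint>

    int main() {
      for (uint32_t x = 0; x < 1024; ++x)
        for (unsigned c = 0; c < 32; ++c)
          assert((x << c) == x * (uint32_t(1) << c));   // shl x, c  ==  mul x, 2^c (modulo 2^32)
      return 0;
    }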
* [X86] Rename the autoupgraded packed fp compare and fpclass intrinsics that don't take a mask as input to exclude '.mask.' from their name. (Craig Topper, 2018-06-27; 1 file, -6/+6)
    I think the intrinsics named 'avx512.mask.' should refer to the previous behavior of taking a mask argument in the intrinsic, instead of using a 'select' or 'and' instruction in IR to accomplish the masking. This is more consistent with the goal that eventually we will have no intrinsics that have masking built in. When we reach that goal, we should have no intrinsics named "avx512.mask".
    llvm-svn: 335744
* [InstCombine] Avoid creating mis-sized dbg.values in commonCastTransforms() (Vedant Kumar, 2018-06-27; 1 file, -2/+5)
    This prevents InstCombine from creating mis-sized dbg.values when replacing a sequence of casts with a simpler cast. For example, in:
        (fptrunc (floor (fpext X))) -> (floorf X)
    We no longer emit dbg.value(X) (with a 32-bit float operand) to describe (fpext X) (which is a 64-bit float).
    This was diagnosed by the debugify check added in r335682.
    llvm-svn: 335696
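The cast-shrinking fold mentioned above depends on flooring in double and truncating back being the same as flooring directly in float. A small C++ spot check (illustrative only; an exhaustive walk over float bit patterns also passes):

    #include <cassert>
    #include <cmath>

    int main() {
      const float samples[] = {0.0f, -0.0f, 0.5f, -2.75f, 123456.78f, 1e30f, -1.5e-38f};
      for (float x : samples)
        assert(float(std::floor(double(x))) == std::floor(x));   // (fptrunc (floor (fpext x))) == floorf(x)
      return 0;
    }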
* [Local] Add a convenient insertReplacementDbgValues overload, NFC (Vedant Kumar, 2018-06-26; 1 file, -5/+1)
    Add an overload for the common case where the replacement dbg.values have the same DIExpressions as the originals.
    llvm-svn: 335643
* [InstCombine] simplify code for urem fold; NFCI (Sanjay Patel, 2018-06-26; 1 file, -5/+2)
    llvm-svn: 335623
* [InstCombine] fold urem with sext bool divisor (Sanjay Patel, 2018-06-26; 1 file, -2/+13)
    Similar to other patches in this series:
    https://reviews.llvm.org/rL335512
    https://reviews.llvm.org/rL335527
    https://reviews.llvm.org/rL335597
    https://reviews.llvm.org/rL335616
    ...this is filling a gap in analysis that is exposed by an unrelated select-of-constants transform. I didn't see a way to unify the sext cases because each div/rem opcode results in a different fold.
    Note that in this case, the backend might want to convert the select into math:
        Name: sext urem
        %e = sext i1 %x to i32
        %r = urem i32 %y, %e
        =>
        %c = icmp eq i32 %y, -1
        %z = zext i1 %c to i32
        %r = add i32 %z, %y
    llvm-svn: 335622
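A note on the snippet above (illustration only, not from the patch): the only well-defined case of urem by a sign-extended bool is a 'true' divisor, i.e. all-ones, and in that case the icmp+zext+add form computes the same remainder. An 8-bit C++ check of that case:

    #include <cassert>
    #include <cstdint>

    int main() {
      const uint8_t all_ones = uint8_t(-1);                     // sext of 'true'
      for (unsigned yi = 0; yi < 256; ++yi) {
        uint8_t y   = uint8_t(yi);
        uint8_t rem = uint8_t(y % all_ones);                    // original urem form
        uint8_t alt = uint8_t(y + (y == all_ones ? 1 : 0));     // icmp-eq-(-1), zext, add form
        assert(rem == alt);                                     // (a divisor of 0, i.e. 'false', would be UB)
      }
      return 0;
    }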
* [InstCombine] fold udiv with sext bool divisor (Sanjay Patel, 2018-06-26; 1 file, -1/+7)
    Note: I didn't add a hasOneUse() check because the existing, related fold doesn't have that check. I suspect that the improved analysis and codegen make these some of the rare canonicalization cases where we allow an increase in instructions.
    llvm-svn: 335597
* [InstCombine] (A + 1) + (B ^ -1) --> A - B (Gil Rapaport, 2018-06-26; 1 file, -0/+5)
    Turn canonicalized subtraction back into (-1 - B) and combine it with (A + 1) into (A - B). This is similar to the folding already done for (B ^ -1) + Const into (-1 + Const) - B.
    Differential Revision: https://reviews.llvm.org/D48535
    llvm-svn: 335579
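The algebra behind the fold: B ^ -1 is ~B, which equals -B - 1 in two's complement, so (A + 1) + (B ^ -1) collapses to A - B. An exhaustive unsigned 8-bit check (illustrative C++ only; unsigned keeps the wrap-around well defined):

    #include <cassert>
    #include <cstdint>

    int main() {
      for (unsigned a = 0; a < 256; ++a)
        for (unsigned b = 0; b < 256; ++b) {
          uint8_t lhs = uint8_t(uint8_t(a + 1) + uint8_t(b ^ 0xFFu));   // (A + 1) + (B ^ -1)
          uint8_t rhs = uint8_t(a - b);                                 // A - B
          assert(lhs == rhs);
        }
      return 0;
    }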
* [InstCombine] cleanup udiv folds; NFCI (Sanjay Patel, 2018-06-25; 1 file, -30/+20)
    This removes a "UDivFoldAction" in favor of a simple constant matcher. In theory, the existing code could do more matching, but I don't see any evidence or need for it. I've left a TODO about using ValueTracking in case we see any regressions.
    llvm-svn: 335545
* [InstCombine] fold sdiv with sext bool divisor (Sanjay Patel, 2018-06-25; 1 file, -4/+7)
    llvm-svn: 335527
* Use APInt[] bit access to avoid "32-bit shift implicitly converted to 64 bits" MSVC warning (again). NFCI. (Simon Pilgrim, 2018-06-25; 1 file, -1/+1)
    llvm-svn: 335457
* Use APInt[] bit access to avoid "32-bit shift implicitly converted to 64 bits" MSVC warning. NFCI. (Simon Pilgrim, 2018-06-25; 1 file, -1/+1)
    llvm-svn: 335454
* [InstCombine] rearrange shuffle-of-binops logic; NFC (Sanjay Patel, 2018-06-22; 1 file, -17/+12)
    The commutative matcher makes things more complicated here, and I'm planning an enhancement where this form is more readable.
    llvm-svn: 335343
* [InstCombine] fix shuffle-of-binops bug (Sanjay Patel, 2018-06-21; 1 file, -2/+8)
    With non-commutative binops, we could be using the same variable value as operand 0 in 1 binop and operand 1 in the other, so we have to check for that possibility and bail out.
    llvm-svn: 335312
* [InstCombine] fold vector select of binops with constant ops to 1 binop (PR37806) (Sanjay Patel, 2018-06-21; 1 file, -0/+51)
    This is the simplest case from PR37806: https://bugs.llvm.org/show_bug.cgi?id=37806
    If we have a common variable operand used in a pair of binops with vector constants that are vector selected together, then we can constant shuffle the constant vectors to eliminate the shuffle instruction.
    This has some tricky parts that are hopefully addressed in the tests and their respective comments:
    1. If the shuffle mask contains an undef element, then that lane of the result is undef (http://llvm.org/docs/LangRef.html#shufflevector-instruction). Therefore, we can replace the constant in that lane with an undef value except for div/rem. With div/rem, an undef in the divisor would cause the whole op to be undef. So I'm using the same hack as in D47686 - replace the undefs with '1'.
    2. Intersect the wrapping and FMF of the original binops for the new binop. There should be no extra poison or fast-math potential in the new binop that wasn't possible in the original code.
    3. Disregard other uses. Given that we're eliminating uses (shortening the dependency chain), I think that's always the right IR canonicalization. But I purposely chose the udiv test to demonstrate the scenario where both intermediate values have other uses because that seems likely worse for codegen with an expensive math op. This seems like a very rare possibility to me, so I don't think it requires a backend patch first.
    Differential Revision: https://reviews.llvm.org/D48401
    llvm-svn: 335283
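A lane-wise C++ model of the transform (illustrative only; the vectors, constants, and mask below are invented for the example): selecting between (X op C1) and (X op C2) per lane is the same as selecting between C1 and C2 first and applying a single op.

    #include <array>
    #include <cassert>
    #include <cstdint>

    int main() {
      std::array<uint32_t, 4> X    = {10, 20, 30, 40};
      std::array<uint32_t, 4> C1   = {1, 2, 3, 4};
      std::array<uint32_t, 4> C2   = {100, 200, 300, 400};
      std::array<bool, 4>     mask = {true, false, true, false};   // true: take the lane from the first binop

      for (int i = 0; i < 4; ++i) {
        uint32_t before = mask[i] ? (X[i] + C1[i]) : (X[i] + C2[i]);   // two binops + shuffle (select)
        uint32_t after  = X[i] + (mask[i] ? C1[i] : C2[i]);            // shuffled constants + one binop
        assert(before == after);
      }
      return 0;
    }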
* [InstCombine] use constant pattern matchers with icmp+sext (Sanjay Patel, 2018-06-21; 1 file, -14/+11)
    The previous code worked with vectors, but it failed when the vector constants contained undef elements. The matchers handle those cases.
    llvm-svn: 335262
* [InstCombine] simplify binops before trying other folds (Sanjay Patel, 2018-06-21; 4 files, -52/+64)
    This is outwardly NFC from what I can tell, but it should be more efficient to simplify first (despite the name, SimplifyAssociativeOrCommutative does not actually simplify as InstSimplify does - it creates/morphs instructions).
    This should make it easier to refactor duplicated code that runs for all binops.
    llvm-svn: 335258
* [InstCombine] make div/rem vector constant utility function; NFCI (Sanjay Patel, 2018-06-21; 2 files, -13/+24)
    This was originally in D48401 and will be used there.
    llvm-svn: 335242
* AMDGPU: Remove old-style image intrinsics (Nicolai Haehnle, 2018-06-21; 1 file, -51/+1)
    Summary: This also removes the need for atomic pseudo instructions, since we select the correct encoding directly in SITargetLowering::lowerImage for dimension-aware image intrinsics.
    Mesa uses dimension-aware image intrinsics since commit a9a7993441.
    Change-Id: I7473d20009476a4ed6d919cae4e6dca9ff42e77a
    Reviewers: arsenm, rampitec, mareko, tpr, b-sumner
    Subscribers: kzhuravl, wdng, yaxunl, dstuttard, t-tye, llvm-commits
    Differential Revision: https://reviews.llvm.org/D48167
    llvm-svn: 335231
* InstCombine/AMDGPU: Add dimension-aware image intrinsics to SimplifyDemanded (Nicolai Haehnle, 2018-06-21; 4 files, -71/+141)
    Summary: Use the expanded features of the TableGen generic tables to avoid manually adding the combinatorially exploded set of intrinsics. The getAMDGPUImageDimIntrinsic lookup function is early-out, i.e. non-AMDGPU intrinsics will never look at the underlying table.
    Use a generic approach for getting the new intrinsic overload to keep the code simple, and make the image dmask handling more generic:
    - handle non-sampler image loads
    - handle the case where the set of demanded elements is not a prefix
    There is some overlap between this code and an optimization that happens in the backend during code generation. They currently complement each other:
    - only the codegen optimization can generate vec3 loads
    - only the InstCombine optimization can handle D16
    The InstCombine optimization also likely covers more cases since the codegen optimization is fairly ad-hoc. Ideally, we'll remove the optimization in codegen once the infrastructure for vec3 is in place (which will probably take a long time).
    Modify the test cases to use dimension-aware intrinsics. This makes it easier to see that the test coverage for the new intrinsics is equivalent, and the old style intrinsics will be removed in a follow-up commit anyway.
    Change-Id: I4b91ea661413d13004956fe4ef7d13d41b8ce3ad
    Reviewers: arsenm, rampitec, majnemer
    Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, llvm-commits
    Differential Revision: https://reviews.llvm.org/D48165
    llvm-svn: 335230
* [IR] add/use isIntDivRem convenience function (Sanjay Patel, 2018-06-20; 1 file, -3/+1)
    There are more existing potential users of this, but I've limited this patch to the first couple that I found to minimize typo risk.
    llvm-svn: 335157