summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/InstCombine
Commit message (Collapse)AuthorAgeFilesLines
* Create instruction classes for identifying any atomicity of memory ↵Daniel Neilson2017-10-302-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | intrinsic. (NFC) Summary: For reference, see: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116589.html This patch fleshes out the instruction class hierarchy with respect to atomic and non-atomic memory intrinsics. With this change, the relevant part of the class hierarchy becomes: IntrinsicInst -> MemIntrinsicBase (methods-only class) -> MemIntrinsic (non-atomic intrinsics) -> MemSetInst -> MemTransferInst -> MemCpyInst -> MemMoveInst -> AtomicMemIntrinsic (atomic intrinsics) -> AtomicMemSetInst -> AtomicMemTransferInst -> AtomicMemCpyInst -> AtomicMemMoveInst -> AnyMemIntrinsic (both atomicities) -> AnyMemSetInst -> AnyMemTransferInst -> AnyMemCpyInst -> AnyMemMoveInst This involves some class renaming: ElementUnorderedAtomicMemCpyInst -> AtomicMemCpyInst ElementUnorderedAtomicMemMoveInst -> AtomicMemMoveInst ElementUnorderedAtomicMemSetInst -> AtomicMemSetInst A script for doing this renaming in downstream trees is included below. An example of where the Any* classes should be used in LLVM is when reasoning about the effects of an instruction (ex: aliasing). --- Script for renaming AtomicMem* classes: PREFIXES="[<,([:space:]]" CLASSES="MemIntrinsic|MemTransferInst|MemSetInst|MemMoveInst|MemCpyInst" SUFFIXES="[;)>,[:space:]]" REGEX="(${PREFIXES})ElementUnorderedAtomic(${CLASSES})(${SUFFIXES})" REGEX2="visitElementUnorderedAtomic(${CLASSES})" FILES=$( grep -E "(${REGEX}|${REGEX2})" -r . | tr ':' ' ' | awk '{print $1}' | sort | uniq ) SED_SCRIPT="s~${REGEX}~\1Atomic\2\3~g" SED_SCRIPT2="s~${REGEX2}~visitAtomic\1~g" for f in $FILES; do echo "Processing: $f" sed -i ".bak" -E "${SED_SCRIPT};${SED_SCRIPT2};${EA_SED_SCRIPT};${EA_SED_SCRIPT2}" $f done Reviewers: sanjoy, deadalnix, apilipenko, anna, skatkov, mkazantsev Reviewed By: sanjoy Subscribers: hfinkel, jholewinski, arsenm, sdardis, nhaehnle, JDevlieghere, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38419 llvm-svn: 316950
* [Transforms] Fix some Clang-tidy modernize and Include What You Use ↵Eugene Zelenko2017-10-247-121/+250
| | | | | | warnings; other minor fixes (NFC). llvm-svn: 316503
* AMDGPU: Add new intrinsic llvm.amdgcn.kill(i1)Marek Olsak2017-10-241-0/+8
| | | | | | | | | | | | | | | | Summary: Kill the thread if operand 0 == false. llvm.amdgcn.wqm.vote can be applied to the operand. Also allow kill in all shader stages. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D38544 llvm-svn: 316427
* AMDGPU: Add llvm.amdgcn.wqm.vote intrinsicMarek Olsak2017-10-241-0/+7
| | | | | | | | | | Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D38543 llvm-svn: 316426
* [InstCombine] don't unnecessarily generate a constant; NFCISanjay Patel2017-10-161-3/+2
| | | | llvm-svn: 315910
* Move folding of icmp with zero after checking for min/max idioms.Nikolai Bozhenov2017-10-162-11/+23
| | | | | | | | | | | | | | | | | | | | | Summary: The following transformation for cmp instruction: icmp smin(x, PositiveValue), 0 -> icmp x, 0 should only be done after checking for min/max to prevent infinite looping caused by a reverse canonicalization. That is why this transformation was moved to place after the mentioned check. Reviewers: spatel, efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38934 Patch by: Artur Gainullin <artur.gainullin@intel.com> llvm-svn: 315895
* revert r314984: revert r314698 - [InstCombine] remove one-use restriction ↵Sanjay Patel2017-10-151-6/+6
| | | | | | | | | for icmp (shr exact X, C1), C2 --> icmp X, (C2<<C1) Recommitting r314698. The bug exposed by this change should be fixed with: https://reviews.llvm.org/rL315579 llvm-svn: 315857
* [InstCombine] use m_Neg() to reduce code; NFCISanjay Patel2017-10-131-13/+9
| | | | llvm-svn: 315762
* [InstCombine] move code to remove repeated constant check; NFCISanjay Patel2017-10-131-8/+7
| | | | | | Also, consolidate tests for this fold in one place. llvm-svn: 315745
* [InstCombine] recycle adds for better efficiencySanjay Patel2017-10-131-26/+21
| | | | | | Also, clean up unnecessary matcher capture variable initializations. llvm-svn: 315743
* [InstCombine] use local var to reduce code duplication; NFCISanjay Patel2017-10-131-16/+15
| | | | llvm-svn: 315728
* [InstCombine] add hasOneUse check to add-zext-add fold to prevent increasing ↵Sanjay Patel2017-10-131-4/+2
| | | | | | instructions llvm-svn: 315718
* [InstCombine] use AddOne helper to reduce code; NFCSanjay Patel2017-10-131-6/+3
| | | | llvm-svn: 315709
* [InstCombine] rearrange code to remove repeated constant check; NFCISanjay Patel2017-10-132-7/+7
| | | | llvm-svn: 315703
* [InstCombine] allow zext(bool) + C --> select bool, C+1, C for vector typesSanjay Patel2017-10-131-10/+15
| | | | | | | The backend should be prepared for this transform after: https://reviews.llvm.org/rL311731 llvm-svn: 315701
* Renable r314928Xinliang David Li2017-10-102-0/+239
| | | | | | | | | | | Eliminate inttype phi with inttoptr/ptrtoint. This version fixed a bug in finding the matching phi -- the order of the incoming blocks may be different (triggered in self build on Windows). A new test case is added. llvm-svn: 315272
* Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.*Adam Nemet2017-10-091-1/+1
| | | | | | | Sync it up with the name of the class actually defined here. This has been bothering me for a while... llvm-svn: 315249
* [InstCombine] fix formatting; NFCSanjay Patel2017-10-091-9/+7
| | | | llvm-svn: 315223
* [InstCombine] use correct type when propagating constant condition in ↵Sanjay Patel2017-10-061-2/+3
| | | | | | simplifyDivRemOfSelectWithZeroOp (PR34856) llvm-svn: 315130
* [InstCombine] rename SimplifyDivRemOfSelect to be clearer, add comments, ↵Sanjay Patel2017-10-062-20/+20
| | | | | | | | | simplify code; NFCI There's at least one bug here - this code can fail with vector types (PR34856). It's also being called for FREM; I'm still trying to understand how that is valid. llvm-svn: 315127
* Revert "Roll forward r314928"Reid Kleckner2017-10-062-238/+0
| | | | | | | | | | | | | | | | | | | | This appears to be miscompiling Clang, as shown on two Windows bootstrap bots: http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/7611 http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/6870 Nothing else is in the blame list. Both emit errors on this valid code in the Windows ucrt headers: C:\...\ucrt\malloc.h:95:32: error: invalid operands to binary expression ('char *' and 'int') _Ptr = (char*)_Ptr + _ALLOCA_S_MARKER_SIZE; ~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~ I am attempting to reproduce this now. This reverts r315044 llvm-svn: 315108
* Roll forward r314928Xinliang David Li2017-10-062-0/+238
| | | | | | | Fixed ThinLTO bootstrap failure : track new bitcast per incomingVal. Added new tests. llvm-svn: 315044
* [InstCombine] improve folds for icmp gt/lt (shr X, C1), C2Sanjay Patel2017-10-051-37/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can always eliminate the shift in: icmp gt/lt (shr X, C1), C2 --> icmp gt/lt X, C' This patch was supposed to just be an efficiency improvement because we were doing this 3-step process to fold: IC: Visiting: %c = icmp ugt i4 %s, 1 IC: ADD: %s = lshr i4 %x, 1 IC: ADD: %1 = udiv i4 %x, 2 IC: Old = %c = icmp ugt i4 %1, 1 New = <badref> = icmp uge i4 %x, 4 IC: ADD: %c = icmp uge i4 %x, 4 IC: ERASE %2 = icmp ugt i4 %1, 1 IC: Visiting: %c = icmp uge i4 %x, 4 IC: Old = %c = icmp uge i4 %x, 4 New = <badref> = icmp ugt i4 %x, 3 IC: ADD: %c = icmp ugt i4 %x, 3 IC: ERASE %2 = icmp uge i4 %x, 4 IC: Visiting: %c = icmp ugt i4 %x, 3 IC: DCE: %1 = udiv i4 %x, 2 IC: ERASE %1 = udiv i4 %x, 2 IC: DCE: %s = lshr i4 %x, 1 IC: ERASE %s = lshr i4 %x, 1 IC: Visiting: ret i1 %c When we could go directly to canonical icmp form: IC: Visiting: %c = icmp ugt i4 %s, 1 IC: Old = %c = icmp ugt i4 %s, 1 New = <badref> = icmp ugt i4 %x, 3 IC: ADD: %c = icmp ugt i4 %x, 3 IC: ERASE %1 = icmp ugt i4 %s, 1 IC: ADD: %s = lshr i4 %x, 1 IC: DCE: %s = lshr i4 %x, 1 IC: ERASE %s = lshr i4 %x, 1 IC: Visiting: %c = icmp ugt i4 %x, 3 ...but then I noticed that the folds were incomplete too: https://godbolt.org/g/aB2hLE Here are attempts to prove the logic with Alive: https://rise4fun.com/Alive/92o Name: lshr_ult Pre: ((C2 << C1) u>> C1) == C2 %sh = lshr i8 %x, C1 %r = icmp ult i8 %sh, C2 => %r = icmp ult i8 %x, (C2 << C1) Name: ashr_slt Pre: ((C2 << C1) >> C1) == C2 %sh = ashr i8 %x, C1 %r = icmp slt i8 %sh, C2 => %r = icmp slt i8 %x, (C2 << C1) Name: lshr_ugt Pre: (((C2+1) << C1) u>> C1) == (C2+1) %sh = lshr i8 %x, C1 %r = icmp ugt i8 %sh, C2 => %r = icmp ugt i8 %x, ((C2+1) << C1) - 1 Name: ashr_sgt Pre: (C2 != 127) && ((C2+1) << C1 != -128) && (((C2+1) << C1) >> C1) == (C2+1) %sh = ashr i8 %x, C1 %r = icmp sgt i8 %sh, C2 => %r = icmp sgt i8 %x, ((C2+1) << C1) - 1 Name: ashr_exact_sgt Pre: ((C2 << C1) >> C1) == C2 %sh = ashr exact i8 %x, C1 %r = icmp sgt i8 %sh, C2 => %r = icmp sgt i8 %x, (C2 << C1) Name: ashr_exact_slt Pre: ((C2 << C1) >> C1) == C2 %sh = ashr exact i8 %x, C1 %r = icmp slt i8 %sh, C2 => %r = icmp slt i8 %x, (C2 << C1) Name: lshr_exact_ugt Pre: ((C2 << C1) u>> C1) == C2 %sh = lshr exact i8 %x, C1 %r = icmp ugt i8 %sh, C2 => %r = icmp ugt i8 %x, (C2 << C1) Name: lshr_exact_ult Pre: ((C2 << C1) u>> C1) == C2 %sh = lshr exact i8 %x, C1 %r = icmp ult i8 %sh, C2 => %r = icmp ult i8 %x, (C2 << C1) We did something similar for 'shl' in D28406. Differential Revision: https://reviews.llvm.org/D38514 llvm-svn: 315021
* revert r314698 - [InstCombine] remove one-use restriction for icmp (shr ↵Sanjay Patel2017-10-051-6/+6
| | | | | | | | | | | exact X, C1), C2 --> icmp X, (C2<<C1) There is a bot failure that appears to be related to this change: http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost-neon/builds/2117 ...so reverting to confirm that and attempting to keep the bot green while investigating. llvm-svn: 314984
* [InstCombine] Fix a vector splat handling bug in transformZExtICmp.Craig Topper2017-10-051-3/+1
| | | | | | | | | | We were using an i1 type and then zero extending to a vector. Instead just create the 0/1 directly as a ConstantInt with the correct type. No need to ask ConstantExpr to zero extend for us. This bug is a bit tricky to hit because it requires us to visit a zext of an icmp that would normally be simplified to true/false, but that icmp hasnt' been visited yet. In the test case this zext and icmp were created by visiting a udiv and due to worklist ordering we got to the zext first. Fixes PR34841. llvm-svn: 314971
* Revert r314928 to investigate thinLTO bootstrap failureXinliang David Li2017-10-052-234/+0
| | | | llvm-svn: 314961
* [InstCombine] Improve support for ashr in foldICmpAndShiftCraig Topper2017-10-041-9/+12
| | | | | | | | We can support ashr similar to lshr, if we know that none of the shifted in bits are used. In that case SimplifyDemandedBits would normally convert it to lshr. But that conversion doesn't happen if the shift has additional users. Differential Revision: https://reviews.llvm.org/D38521 llvm-svn: 314945
* Recommit r314561 after fixing msan build failureXinliang David Li2017-10-042-0/+234
| | | | | | | (trial 2) Incoming val defined by terminator instruction which also requires bitcasts can not be handled. llvm-svn: 314928
* [InstCombine] Use isSignBitCheck to simplify an if statement. Directly ↵Craig Topper2017-10-031-12/+8
| | | | | | | | create new sign bit compares instead of manipulating the constant. NFCI Since we no longer had the direct constant compares, manipulating the constant seemeded less clear. llvm-svn: 314830
* [InstCombine] Change a bunch of methods to take APInts by reference instead ↵Craig Topper2017-10-032-134/+134
| | | | | | | | of pointer. This allows us to remove a bunch of dereferences and only have a few dereferences at the call sites. llvm-svn: 314762
* [InstCombine] Replace an equality compare of two APInt pointers with a ↵Craig Topper2017-10-031-1/+1
| | | | | | | | compare of the APInts themselves. Apparently this works by virtue of the fact that the pointers are pointers to the APInts stored inside of the ConstantInt objects. But I really don't think we should be relying on that. llvm-svn: 314761
* [InstCombine] remove one-use restriction for icmp (shr exact X, C1), C2 --> ↵Sanjay Patel2017-10-021-6/+6
| | | | | | icmp X, (C2<<C1) llvm-svn: 314698
* Update getMergedLocation to check the instruction type and merge properly.Dehao Chen2017-10-023-16/+16
| | | | | | | | | | | | | | Summary: If the merged instruction is call instruction, we need to set the scope to the closes common scope between 2 locations, otherwise it will cause trouble when the call is getting inlined. Reviewers: dblaikie, aprantl Reviewed By: dblaikie, aprantl Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D37877 llvm-svn: 314694
* [InstCombine] Use APInt for all the math in foldICmpDivConstantCraig Topper2017-10-011-95/+46
| | | | | | | | | | | | | | Summary: This currently uses ConstantExpr to do its math, but as noted in a TODO it can all be done directly on APInt. Reviewers: spatel, majnemer Reviewed By: majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38440 llvm-svn: 314640
* Revert r314579: "Recommi r314561 after fixing over-debug assertion".Daniel Jasper2017-10-012-225/+0
| | | | | | | | | | | | And follow-up r314585. Leads to segfaults. I'll forward reproduction instructions to the patch author. Also, for a recommit, still add the original patch description. Otherwise, it becomes really tedious to find out what a patch actually does. The fact that it is a recommit with a fix is somewhat secondary. llvm-svn: 314622
* Fix buildbot failure -- tighten type check for matching phiXinliang David Li2017-09-301-1/+1
| | | | llvm-svn: 314585
* Recommi r314561 after fixing over-debug assertionXinliang David Li2017-09-302-0/+225
| | | | llvm-svn: 314579
* Revert 314561 due to debug build assertion failureXinliang David Li2017-09-292-223/+0
| | | | llvm-svn: 314563
* Eliminate PHI (int typed) which has only one use by intptrXinliang David Li2017-09-292-0/+223
| | | | | | | | | | This patch will eliminate redundant intptr/ptrtoint that pessimizes analyses such as SCEV, AA and will make optimization passes such as auto-vectorization more powerful. Differential revision: http://reviews.llvm.org/D37832 llvm-svn: 314561
* Revert r314017 '[InstCombine] Simplify check for RHS being a splat constant ↵Craig Topper2017-09-271-16/+22
| | | | | | | | in foldICmpUsingKnownBits by just checking Op1Min==Op1Max rather than going through m_APInt.' This reverts r314017 and similar code added in later commits. It seems to not work for pointer compares and is causing a bot failure for the last several days. llvm-svn: 314360
* [InstCombine] Gating select arithmetic optimization.Chad Rosier2017-09-271-2/+3
| | | | | | | | | | | | These changes faciliate positive behavior for arithmetic based select expressions that match its translation criteria, keeping code size gated to neutral or improved scenarios. Patch by Michael Berg <michael_c_berg@apple.com>! Differential Revision: https://reviews.llvm.org/D38263 llvm-svn: 314320
* [InstCombine] Remove one use restriction on the shift for calls to ↵Craig Topper2017-09-261-3/+3
| | | | | | | | | | | | foldICmpAndShift. If this transformation succeeds, we're going to remove our dependency on the shift by rewriting the and. So it doesn't matter how many uses the shift has. This distributes the one use check to other transforms in foldICmpAndConstConst that do need it. Differential Revision: https://reviews.llvm.org/D38206 llvm-svn: 314233
* [InstCombine] Move an optimization from foldICmpAndConstConst to ↵Craig Topper2017-09-251-16/+10
| | | | | | | | | | foldICmpUsingKnownBits All this optimization cares about is knowing how many low bits of LHS is known to be zero and whether that means that the result is 0 or greater than the RHS constant. It doesn't matter where the zeros in the low bits came from. So we don't need to specifically look for an AND. Instead we can use known bits. Differential Revision: https://reviews.llvm.org/D38195 llvm-svn: 314153
* [InstCombine] remove extract-of-select vector transform (2nd try)Sanjay Patel2017-09-251-33/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 1st attempt at this: https://reviews.llvm.org/rL314117 was reverted at: https://reviews.llvm.org/rL314118 because of bot fails for clang tests that were checking optimized IR. That should be fixed with: https://reviews.llvm.org/rL314144 ...so try again. Original commit message: The transform to convert an extract-of-a-select-of-vectors was added at: https://reviews.llvm.org/rL194013 And a question about the validity of this transform was raised in the review: https://reviews.llvm.org/D1539: ...but not answered AFAICT> Most of the motivating cases in that patch are now handled by other combines. These are the tests that were added with the original commit, but they are not regressing even after we remove the transform in this patch. The diffs we see after removing this transform cause us to avoid increasing the instruction count, so we don't want to do those transforms as canonicalizations. The motivation for not turning a vector-select-of-vectors into a scalar operation is shown in PR33301: https://bugs.llvm.org/show_bug.cgi?id=33301 ...in those cases, we'll get vector ops with this patch rather than the vector/scalar mix that we currently see. Differential Revision: https://reviews.llvm.org/D38006 llvm-svn: 314147
* revert r314117 because there are bogus clang tests that depend on the optimizerSanjay Patel2017-09-251-0/+33
| | | | llvm-svn: 314118
* [InstCombine] remove extract-of-select vector transformSanjay Patel2017-09-251-33/+0
| | | | | | | | | | | | | | | | | | | | | | | The transform to convert an extract-of-a-select-of-vectors was added at: rL194013 And a question about the validity of this transform was raised in the review: https://reviews.llvm.org/D1539: ...but not answered AFAICT> Most of the motivating cases in that patch are now handled by other combines. These are the tests that were added with the original commit, but they are not regressing even after we remove the transform in this patch. The diffs we see after removing this transform cause us to avoid increasing the instruction count, so we don't want to do those transforms as canonicalizations. The motivation for not turning a vector-select-of-vectors into a scalar operation is shown in PR33301: https://bugs.llvm.org/show_bug.cgi?id=33301 ...in those cases, we'll get vector ops with this patch rather than the vector/scalar mix that we currently see. Differential Revision: https://reviews.llvm.org/D38006 llvm-svn: 314117
* [InstCombine] Teach foldICmpUsingKnownBits to simplify SLE/SGE/ULE/UGE to ↵Craig Topper2017-09-221-0/+8
| | | | | | | | equality comparisons when the min/max ranges intersect in a single value. This is the inverse of what we do for SGT/SLT/UGT/ULT. llvm-svn: 314032
* [InstCombine] Add constant splat handling to one of the ICMP_SLT/SGT cases ↵Craig Topper2017-09-221-6/+5
| | | | | | in foldICmpUsingKnownBits. llvm-svn: 314025
* [InstCombine] Move the call to isSignBitCheck into getDemandedBitsLHSMask ↵Craig Topper2017-09-221-15/+8
| | | | | | | | instead of calling it outside and passing its result through a flag. NFCI The result of the isSignBitCheck isn't used anywhere else and this allows us to share the m_APInt call in the likely case that it isn't a sign bit check. llvm-svn: 314018
* [InstCombine] Simplify check for RHS being a splat constant in ↵Craig Topper2017-09-221-8/+6
| | | | | | foldICmpUsingKnownBits by just checking Op1Min==Op1Max rather than going through m_APInt. llvm-svn: 314017
OpenPOWER on IntegriCloud