summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis/VectorUtils.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [Scalarizer] Add scalarizer support for smul.fix.satBjorn Pettersson2019-06-241-4/+6
| | | | | | | | | | | | | | | | | | | | | | | Summary: Handle smul.fix.sat in the scalarizer. This is done by adding smul.fix.sat to the set of "isTriviallyVectorizable" intrinsics. The addition of smul.fix.sat in isTriviallyVectorizable and hasVectorInstrinsicScalarOpd can also be seen as a preparation to be able to use hasVectorInstrinsicScalarOpd in ConstantFolding. Reviewers: rengolin, RKSimon, dblaikie Reviewed By: rengolin Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63704 llvm-svn: 364177
* [Analysis] add isSplatValue() for vectors in IRSanjay Patel2019-06-111-0/+39
| | | | | | | | | | | | | | We have the related getSplatValue() already in IR (see code just above the proposed addition). But sometimes we only need to know that the value is a splat rather than capture the splatted scalar value. Also, we have an isSplatValue() function already in SDAG. Motivation - recent bugs that would potentially benefit from improved splat analysis in IR: https://bugs.llvm.org/show_bug.cgi?id=37428 https://bugs.llvm.org/show_bug.cgi?id=42174 Differential Revision: https://reviews.llvm.org/D63138 llvm-svn: 363106
* [Analysis] simplify code for getSplatValue(); NFCSanjay Patel2019-06-071-20/+11
| | | | | | | | | AFAIK, this is only currently called by TTI, but it could be used from instcombine or CGP to help solve problems like: https://bugs.llvm.org/show_bug.cgi?id=37428 https://bugs.llvm.org/show_bug.cgi?id=42174 llvm-svn: 362810
* Consolidate existing utilities for interpreting vector predicate maskes [NFC]Philip Reames2019-04-251-0/+46
| | | | llvm-svn: 359163
* [Vectorizer] Add vectorization support for fixed smul/umul intrinsicsSimon Pilgrim2019-02-251-0/+5
| | | | | | | | This requires a couple of tweaks to existing vectorization functions as they were assuming that only the second call argument (ctlz/cttz/powi) could ever be the 'always scalar' argument, but for smul.fix + umul.fix its the third argument. Differential Revision: https://reviews.llvm.org/D58616 llvm-svn: 354790
* [NFC] fix trivial typos in commentsHiroshi Inoue2019-02-051-1/+1
| | | | llvm-svn: 353147
* Move saturated arithmetic intrinsics to other integer intrinsics. NFCI.Simon Pilgrim2019-01-231-4/+4
| | | | | | They were in the floating point group. llvm-svn: 351953
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-191-4/+3
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* [SLPVectorizer] Flag ADD/SUB SSAT/USAT intrinsics trivially vectorizable ↵Simon Pilgrim2019-01-031-0/+4
| | | | | | | | (PR40123) Enables SLP vectorization for the SSE2 PADDS/PADDUS/PSUBS/PSUBUS style intrinsics llvm-svn: 350300
* [NFC] Fix trailing comma after function.Clement Courbet2018-12-201-1/+1
| | | | | | lib/Analysis/VectorUtils.cpp:482:2: warning: extra ‘;’ [-Wpedantic] llvm-svn: 349732
* Introduce llvm.loop.parallel_accesses and llvm.access.group metadata.Michael Kruse2018-12-201-4/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current llvm.mem.parallel_loop_access metadata has a problem in that it uses LoopIDs. LoopID unfortunately is not loop identifier. It is neither unique (there's even a regression test assigning the some LoopID to multiple loops; can otherwise happen if passes such as LoopVersioning make copies of entire loops) nor persistent (every time a property is removed/added from a LoopID's MDNode, it will also receive a new LoopID; this happens e.g. when calling Loop::setLoopAlreadyUnrolled()). Since most loop transformation passes change the loop attributes (even if it just to mark that a loop should not be processed again as llvm.loop.isvectorized does, for the versioned and unversioned loop), the parallel access information is lost for any subsequent pass. This patch unlinks LoopIDs and parallel accesses. llvm.mem.parallel_loop_access metadata on instruction is replaced by llvm.access.group metadata. llvm.access.group points to a distinct MDNode with no operands (avoiding the problem to ever need to add/remove operands), called "access group". Alternatively, it can point to a list of access groups. The LoopID then has an attribute llvm.loop.parallel_accesses with all the access groups that are parallel (no dependencies carries by this loop). This intentionally avoid any kind of "ID". Loops that are clones/have their attributes modifies retain the llvm.loop.parallel_accesses attribute. Access instructions that a cloned point to the same access group. It is not necessary for each access to have it's own "ID" MDNode, but those memory access instructions with the same behavior can be grouped together. The behavior of llvm.mem.parallel_loop_access is not changed by this patch, but should be considered deprecated. Differential Revision: https://reviews.llvm.org/D52116 llvm-svn: 349725
* [VectorUtils] Use namespace for InterleaveGroup template specialization.Florian Hahn2018-11-131-4/+6
| | | | llvm-svn: 346759
* [VPlan] VPlan version of InterleavedAccessInfo.Florian Hahn2018-11-131-10/+24
| | | | | | | | | | | | | | | | | This patch turns InterleaveGroup into a template with the instruction type being a template parameter. It also adds a VPInterleavedAccessInfo class, which only contains a mapping from VPInstructions to their respective InterleaveGroup. As we do not have access to scalar evolution in VPlan, we can re-use convert InterleavedAccessInfo to VPInterleavedAccess info. Reviewers: Ayal, mssimpso, hfinkel, dcaballe, rengolin, mkuper, hsaito Reviewed By: rengolin Differential Revision: https://reviews.llvm.org/D49489 llvm-svn: 346758
* [VectorUtils] add funnel-shifts to the list of vectorizable intrinsicsSanjay Patel2018-11-121-0/+2
| | | | | | | | | | | | | | | | This just identifies the intrinsics as candidates for vectorization. It does not mean we will attempt to vectorize under normal conditions (the test file is forcing vectorization). The cost model must be fixed to show that the transform is profitable in general. Allowing vectorization with these intrinsics is required to avoid potential regressions from canonicalizing to the intrinsics from generic IR: https://bugs.llvm.org/show_bug.cgi?id=37417 llvm-svn: 346661
* [VectorUtils] reorder list of vectorizable intrinsics; NFCSanjay Patel2018-11-121-10/+9
| | | | | | | We need to add funnel-shifts to this list, so clean up the random order before it gets worse. llvm-svn: 346660
* [LV] Support vectorization of interleave-groups that require an epilog underDorit Nuzman2018-10-311-2/+22
| | | | | | | | | | | | | | | | | | | | | | optsize using masked wide loads Under Opt for Size, the vectorizer does not vectorize interleave-groups that have gaps at the end of the group (such as a loop that reads only the even elements: a[2*i]) because that implies that we'll require a scalar epilogue (which is not allowed under Opt for Size). This patch extends the support for masked-interleave-groups (introduced by D53011 for conditional accesses) to also cover the case of gaps in a group of loads; Targets that enable the masked-interleave-group feature don't have to invalidate interleave-groups of loads with gaps; they could now use masked wide-loads and shuffles (if that's what the cost model selects). Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53668 llvm-svn: 345705
* [IAI,LV] Avoid creating a scalar epilogue due to gaps in interleave-groups when Dorit Nuzman2018-10-221-0/+24
| | | | | | | | | | | | | | | | | | | | | | | optimizing for size LV is careful to respect -Os and not to create a scalar epilog in all cases (runtime tests, trip-counts that require a remainder loop) except for peeling due to gaps in interleave-groups. This patch fixes that; -Os will now have us invalidate such interleave-groups and vectorize without an epilog. The patch also removes a related FIXME comment that is now obsolete, and was also inaccurate: "FIXME: return None if loop requiresScalarEpilog(<MaxVF>), or look for a smaller MaxVF that does not require a scalar epilog." (requiresScalarEpilog() has nothing to do with VF). Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53420 llvm-svn: 344883
* [LoopVectorize] Loop vectorization for minimum and maximumThomas Lively2018-10-191-0/+2
| | | | | | | | | | | | Summary: Depends on D52766. Reviewers: aheejin, dschuff Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52767 llvm-svn: 344816
* recommit 344472 after fixing build failure on ARM and PPC.Dorit Nuzman2018-10-141-9/+20
| | | | llvm-svn: 344475
* revert 344472 due to failures.Dorit Nuzman2018-10-141-20/+9
| | | | llvm-svn: 344473
* [IAI,LV] Add support for vectorizing predicated strided accesses using maskedDorit Nuzman2018-10-141-9/+20
| | | | | | | | | | | | | | | | | | | | | | | interleave-group The vectorizer currently does not attempt to create interleave-groups that contain predicated loads/stores; predicated strided accesses can currently be vectorized only using masked gather/scatter or scalarization. This patch makes predicated loads/stores candidates for forming interleave-groups during the Loop-Vectorizer's analysis, and adds the proper support for masked-interleave- groups to the Loop-Vectorizer's planning and transformation stages. The patch also extends the TTI API to allow querying the cost of masked interleave groups (which each target can control); Targets that support masked vector loads/ stores may choose to enable this feature and allow vectorizing predicated strided loads/stores using masked wide loads/stores and shuffles. Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53011 llvm-svn: 344472
* [IAI,LV] Avoid creating interleave-groups for predicated accesseDorit Nuzman2018-10-071-1/+3
| | | | | | | | | | | | | | | | | | | | | | This patch fixes PR39099. When strided loads are predicated, each of them will form an interleaved-group (with gaps). However, subsequent stages of vectorization (planning and transformation) assume that if a load is part of an Interleave-Group it is not predicated, resulting in wrong code - unmasked wide loads are created. The Interleaving Analysis does take care not to have conditional interleave groups of size > 1, but until we extend the planning and transformation stages to support masked-interleave-groups we should also avoid having them for size == 1. Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D52682 llvm-svn: 343931
* [Analysis] add comment to generalize finding a scalar op from vector; NFCSanjay Patel2018-09-241-3/+4
| | | | llvm-svn: 342906
* Fix vectorization of canonicalizeMatt Arsenault2018-09-171-0/+1
| | | | llvm-svn: 342390
* [LV] Move InterleaveGroup and InterleavedAccessInfo to VectorUtils.h (NFC)Florian Hahn2018-09-121-0/+327
| | | | | | | | | | | | | Move the 2 classes out of LoopVectorize.cpp to make it easier to re-use them for VPlan outside LoopVectorize.cpp Reviewers: Ayal, mssimpso, rengolin, dcaballe, mkuper, hsaito, hfinkel, xbolva00 Reviewed By: rengolin, xbolva00 Differential Revision: https://reviews.llvm.org/D49488 llvm-svn: 342027
* Remove \brief commands from doxygen comments.Adrian Prantl2018-05-011-9/+9
| | | | | | | | | | | | | | | | We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272
* Fixed spelling mistake in comments of LLVM Analysis passesVedant Kumar2018-02-281-1/+1
| | | | | | | | Patch by Reshabh Sharma! Differential Revision: https://reviews.llvm.org/D43861 llvm-svn: 326352
* Add an @llvm.sideeffect intrinsicDan Gohman2017-11-081-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements Chandler's idea [0] for supporting languages that require support for infinite loops with side effects, such as Rust, providing part of a solution to bug 965 [1]. Specifically, it adds an `llvm.sideeffect()` intrinsic, which has no actual effect, but which appears to optimization passes to have obscure side effects, such that they don't optimize away loops containing it. It also teaches several optimization passes to ignore this intrinsic, so that it doesn't significantly impact optimization in most cases. As discussed on llvm-dev [2], this patch is the first of two major parts. The second part, to change LLVM's semantics to have defined behavior on infinite loops by default, with a function attribute for opting into potential-undefined-behavior, will be implemented and posted for review in a separate patch. [0] http://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html [1] https://bugs.llvm.org/show_bug.cgi?id=965 [2] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118632.html Differential Revision: https://reviews.llvm.org/D38336 llvm-svn: 317729
* [Constants] If we already have a ConstantInt*, prefer to use ↵Craig Topper2017-07-061-1/+1
| | | | | | | | isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne. llvm-svn: 307292
* Sort the remaining #include lines in include/... and lib/....Chandler Carruth2017-06-061-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787
* Introduce experimental generic intrinsics for horizontal vector reductions.Amara Emerson2017-05-091-0/+1
| | | | | | | | | | | | | | - This change allows targets to opt-in to using them instead of the log2 shufflevector algorithm. - The SLP and Loop vectorizers have the common code to do shuffle reductions factored out into LoopUtils, and now have a unified interface for generating reductions regardless of the preference of the target. LoopUtils now uses TTI to determine what kind of reductions the target wants to handle. - For CodeGen, basic legalization support is added. Differential Revision: https://reviews.llvm.org/D30086 llvm-svn: 302514
* [LV] Move interleaved access helper functions to VectorUtils (NFC)Matthew Simpson2017-02-011-0/+85
| | | | | | | | | | | | This patch moves some helper functions related to interleaved access vectorization out of LoopVectorize.cpp and into VectorUtils.cpp. We would like to use these functions in a follow-on patch that improves interleaved load and store lowering in (ARM/AArch64)ISelLowering.cpp. One of the functions was already duplicated there and has been removed. Differential Revision: https://reviews.llvm.org/D29398 llvm-svn: 293788
* IR: Change the gep_type_iterator API to avoid always exposing the "current" ↵Peter Collingbourne2016-12-021-2/+2
| | | | | | | | | | | | | type. Instead, expose whether the current type is an array or a struct, if an array what the upper bound is, and if a struct the struct type itself. This is in preparation for a later change which will make PointerType derive from Type rather than SequentialType. Differential Revision: https://reviews.llvm.org/D26594 llvm-svn: 288458
* Add handling of !invariant.load to PropagateMetadata.Justin Lebar2016-09-111-6/+6
| | | | | | | | | | | | | | Summary: This will let e.g. the load/store vectorizer propagate this metadata appropriately. Reviewers: arsenm Subscribers: tra, jholewinski, hfinkel, mzolotukhin Differential Revision: https://reviews.llvm.org/D23479 llvm-svn: 281153
* SLPVectorizer: Move propagateMetadata to VectorUtilsMatt Arsenault2016-06-301-0/+41
| | | | | | | | This will be re-used by the LoadStoreVectorizer. Fix handling of range metadata and testcase by Justin Lebar. llvm-svn: 274281
* [Analysis] Enabled BITREVERSE as a vectorizable intrinsicSimon Pilgrim2016-06-041-0/+1
| | | | | | Allows XOP to vectorize BITREVERSE - other targets will follow as their costmodels improve. llvm-svn: 271803
* Revert "[VectorUtils] Query number of sign bits to allow more truncations"James Molloy2016-05-101-14/+4
| | | | | | | | This was a fairly simple patch but on closer inspection was seriously flawed and caused PR27690. This reverts commit r268921. llvm-svn: 269051
* [VectorUtils] Query number of sign bits to allow more truncationsJames Molloy2016-05-091-4/+14
| | | | | | When deciding if a vector calculation can be done in a smaller bitwidth, use sign bit information from ValueTracking to add more information and allow more truncations. llvm-svn: 268921
* [ValueTracking, VectorUtils] Refactor getIntrinsicIDForCallDavid Majnemer2016-04-191-145/+8
| | | | | | | | | | | | | The functionality contained within getIntrinsicIDForCall is two-fold: it checks if a CallInst's callee is a vectorizable intrinsic. If it isn't an intrinsic, it attempts to map the call's target to a suitable intrinsic. Move the mapping functionality into getIntrinsicForCallSite and rename getIntrinsicIDForCall to getVectorIntrinsicIDForCall while reimplementing it in terms of getIntrinsicForCallSite. llvm-svn: 266801
* [InstCombine] We folded an fcmp to an i1 instead of a vector of i1David Majnemer2016-04-131-2/+2
| | | | | | | | | Remove an ad-hoc transform in InstCombine and replace it with more general machinery (ValueTracking, InstructionSimplify and VectorUtils). This fixes PR27332. llvm-svn: 266175
* [SLPVectorizer] Vectorizing the libm sqrt to llvm's sqrt intrinsic requires nnanDavid Majnemer2016-04-061-1/+3
| | | | | | | | | | | | | | To quote the langref "Unlike sqrt in libm, however, llvm.sqrt has undefined behavior for negative numbers other than -0.0 (which allows for better optimization, because there is no need to worry about errno being set). llvm.sqrt(-0.0) is defined to return -0.0 like IEEE sqrt." This means that it's unsafe to replace sqrt with llvm.sqrt unless the call is annotated with nnan. Thanks to Hal Finkel for pointing this out! llvm-svn: 265521
* [SLPVectorizer] Vectorize libcalls of sqrtDavid Majnemer2016-04-061-0/+4
| | | | | | | We didn't realize that we could transform the libcall into a vectorized intrinsic. llvm-svn: 265493
* [VectorUtils] Don't try and truncate PHIs to a smaller bitwidthJames Molloy2016-03-301-0/+15
| | | | | | | | | | We already try not to truncate PHIs in computeMinimalBitwidths. LoopVectorize can't handle it and we really don't need to, because both induction and reduction PHIs are truncated by other means. However, we weren't bailing out in all the places we should have, and we ended up by returning a PHI to be truncated, which has caused PR27018. This fixes PR17018. llvm-svn: 264852
* [opaque pointer types] [NFC] GEP: replace get(Pointer)ElementType uses with ↵Eduard Burtescu2016-01-191-2/+1
| | | | | | | | | | | | | | | | | | get{Source,Result}ElementType. Summary: GEPOperator: provide getResultElementType alongside getSourceElementType. This is made possible by adding a result element type field to GetElementPtrConstantExpr, which GetElementPtrInst already has. GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType. Reviewers: mjacob, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16275 llvm-svn: 258145
* [NFC] Remove one dead PointerType::getElementType() call.Manuel Jacob2016-01-171-2/+0
| | | | | | | | | | | | Reviewers: dblaikie, mjacob Subscribers: llvm-commits, dblaikie Patch by Eduard Burtescu. Differential Revision: http://reviews.llvm.org/D16274 llvm-svn: 258022
* [SCEV] Add and use SCEVConstant::getAPInt; NFCISanjoy Das2015-12-171-2/+1
| | | | llvm-svn: 255921
* Fixed a failure in getSpaltValue()Elena Demikhovsky2015-12-011-1/+2
| | | | llvm-svn: 254409
* Fixed a failure in cost calculation for vector GEPElena Demikhovsky2015-12-011-3/+4
| | | | | | | | | Cost calculation for vector GEP failed with due to invalid cast to GEP index operand. The bug is fixed, added a test. http://reviews.llvm.org/D14976 llvm-svn: 254408
* [LoopVectorize] Use MapVector rather than DenseMap for MinBWs.Charlie Turner2015-11-261-3/+3
| | | | | | | | | | | | | | | | | The order in which instructions are truncated in truncateToMinimalBitwidths effects code generation. Switch to a map with a determinisic order, since the iteration order over a DenseMap is not defined. This code is not hot, so the difference in container performance isn't interesting. Many thanks to David Blaikie for making me aware of MapVector! Fixes PR25490. Differential Revision: http://reviews.llvm.org/D14981 llvm-svn: 254179
* [LoopVectorize] Address post-commit feedback on r250032James Molloy2015-11-091-19/+19
| | | | | | | | | | Implemented as many of Michael's suggestions as were possible: * clang-format the added code while it is still fresh. * tried to change Value* to Instruction* in many places in computeMinimumValueSizes - unfortunately there are several places where Constants need to be handled so this wasn't possible. * Reduce the pass list on loop-vectorization-factors.ll. * Fix a bug where we were querying MinBWs for I->getOperand(0) but using MinBWs[I]. llvm-svn: 252469
OpenPOWER on IntegriCloud