summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis/VectorUtils.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Revert "[VectorUtils] Introduce the Vector Function Database (VFDatabase)."Francesco Petrogalli2019-12-131-1/+0
| | | | | | | | | This reverts commit 0be81968a283fd4161cb9ac9748d5ed200926292. The VFDatabase needs some rework to be able to handle vectorization and subsequent scalarization of intrinsics in out-of-tree versions of the compiler. For more details, see the discussion in https://reviews.llvm.org/D67572.
* [VectorUtils] Introduce the Vector Function Database (VFDatabase).Francesco Petrogalli2019-12-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduced the VFDatabase, the framework proposed in http://lists.llvm.org/pipermail/llvm-dev/2019-June/133484.html. [*] In this patch the VFDatabase is used to bridge the TargetLibraryInfo (TLI) calls that were previously used to query for the availability of vector counterparts of scalar functions. The VFISAKind field `ISA` of VFShape have been moved into into VFInfo, under the assumption that different vector ISAs may provide the same vector signature. At the moment, the vectorizer accepts any of the available ISAs as long as the signature provided by the VFDatabase matches the one expected in the vectorization process. For example, when targeting AVX or AVX2, which both have 256-bit registers, the IR signature of the two vector functions associated to the two ISAs is the same. The `getVectorizedFunction` method at the moment returns the first available match. We will need to add more heuristics to the search system to decide which of the available version (TLI, AVX, AVX2, ...) the system should prefer, when multiple versions with the same VFShape are present. Some of the code in this patch is based on the work done by Sumedh Arani in https://reviews.llvm.org/D66025. [*] Notice that in the proposal the VFDatabase was called SVFS. The name VFDatabase is more in line with LLVM recommendations for naming classes and variables. Differential Revision: https://reviews.llvm.org/D67572
* [VectorUtils] API for VFShape, update VFInfo.Francesco Petrogalli2019-12-041-0/+44
| | | | | | | | | | | | | | | | | | | | | | Summary: This patch introduces an API to build and modify vector shapes. The validity of a VFShape can be checked with the `hasValidParameterList` method, which is also run in an assertion each time a VFShape is modified. The field VFISAKind has been moved to VFInfo under the assumption that different ISAs can map to the same VFShape (as it can be in the case of vector extensions with the same registers size, for example AVX and AVX2). Reviewers: sdesmalen, jdoerfert, simoll, hsaito Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70513
* [SVFS] Inject TLI Mappings in VFABI attribute.Francesco Petrogalli2019-11-151-0/+3
| | | | | | | | | | | | | | This patch introduces a function pass to inject the scalar-to-vector mappings stored in the TargetLIbraryInfo (TLI) into the Vector Function ABI (VFABI) variants attribute. The test is testing the injection for three vector libraries supported by the TLI (Accelerate, SVML, MASSV). The pass does not change any of the analysis associated to the function. Differential Revision: https://reviews.llvm.org/D70107
* Add missing includes needed to prune LLVMContext.h include, NFCReid Kleckner2019-11-141-0/+1
| | | | | These are a pre-requisite to removing #include "llvm/Support/Options.h" from LLVMContext.h: https://reviews.llvm.org/D70280
* [VFABI] Read/Write functions for the VFABI attribute.Francesco Petrogalli2019-11-121-0/+19
| | | | | | | | | | | | | | The attribute is stored at the `FunctionIndex` attribute set, with the name "vector-function-abi-variant". The get/set methods of the attribute have assertion to verify that: 1. Each name in the attribute is a valid VFABI mangled name. 2. Each name in the attribute correspond to a function declared in the module. Differential Revision: https://reviews.llvm.org/D69976
* [Alignment][NFC] Make VectorUtils uas llvm::AlignGuillaume Chatelet2019-10-101-6/+6
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, rogfer01, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68784 llvm-svn: 374330
* InterleavedAccessInfo - Don't dereference a dyn_cast result. NFCI.Simon Pilgrim2019-09-171-1/+1
| | | | llvm-svn: 372117
* [Intrinsic] Add the llvm.umul.fix.sat intrinsicBjorn Pettersson2019-09-071-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Add an intrinsic that takes 2 unsigned integers with the scale of them provided as the third argument and performs fixed point multiplication on them. The result is saturated and clamped between the largest and smallest representable values of the first 2 operands. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Patch by: leonardchan, bjope Reviewers: RKSimon, craig.topper, bevinh, leonardchan, lebedev.ri, spatel Reviewed By: leonardchan Subscribers: ychen, wuzish, nemanjai, MaskRay, jsji, jdoerfert, Ka-Ka, hiraditya, rjmccall, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57836 llvm-svn: 371308
* [LV] Avoid building interleaved group in presence of WAW dependencyHideki Saito2019-08-021-0/+4
| | | | | | | | | | | | Reviewers: hsaito, Ayal, fhahn, anna, mkazantsev Reviewed By: hsaito Patch by evrevnov, thanks! Differential Revision: https://reviews.llvm.org/D63981 llvm-svn: 367654
* [Scalarizer] Add scalarizer support for smul.fix.satBjorn Pettersson2019-06-241-4/+6
| | | | | | | | | | | | | | | | | | | | | | | Summary: Handle smul.fix.sat in the scalarizer. This is done by adding smul.fix.sat to the set of "isTriviallyVectorizable" intrinsics. The addition of smul.fix.sat in isTriviallyVectorizable and hasVectorInstrinsicScalarOpd can also be seen as a preparation to be able to use hasVectorInstrinsicScalarOpd in ConstantFolding. Reviewers: rengolin, RKSimon, dblaikie Reviewed By: rengolin Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63704 llvm-svn: 364177
* [Analysis] add isSplatValue() for vectors in IRSanjay Patel2019-06-111-0/+39
| | | | | | | | | | | | | | We have the related getSplatValue() already in IR (see code just above the proposed addition). But sometimes we only need to know that the value is a splat rather than capture the splatted scalar value. Also, we have an isSplatValue() function already in SDAG. Motivation - recent bugs that would potentially benefit from improved splat analysis in IR: https://bugs.llvm.org/show_bug.cgi?id=37428 https://bugs.llvm.org/show_bug.cgi?id=42174 Differential Revision: https://reviews.llvm.org/D63138 llvm-svn: 363106
* [Analysis] simplify code for getSplatValue(); NFCSanjay Patel2019-06-071-20/+11
| | | | | | | | | AFAIK, this is only currently called by TTI, but it could be used from instcombine or CGP to help solve problems like: https://bugs.llvm.org/show_bug.cgi?id=37428 https://bugs.llvm.org/show_bug.cgi?id=42174 llvm-svn: 362810
* Consolidate existing utilities for interpreting vector predicate maskes [NFC]Philip Reames2019-04-251-0/+46
| | | | llvm-svn: 359163
* [Vectorizer] Add vectorization support for fixed smul/umul intrinsicsSimon Pilgrim2019-02-251-0/+5
| | | | | | | | This requires a couple of tweaks to existing vectorization functions as they were assuming that only the second call argument (ctlz/cttz/powi) could ever be the 'always scalar' argument, but for smul.fix + umul.fix its the third argument. Differential Revision: https://reviews.llvm.org/D58616 llvm-svn: 354790
* [NFC] fix trivial typos in commentsHiroshi Inoue2019-02-051-1/+1
| | | | llvm-svn: 353147
* Move saturated arithmetic intrinsics to other integer intrinsics. NFCI.Simon Pilgrim2019-01-231-4/+4
| | | | | | They were in the floating point group. llvm-svn: 351953
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-191-4/+3
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* [SLPVectorizer] Flag ADD/SUB SSAT/USAT intrinsics trivially vectorizable ↵Simon Pilgrim2019-01-031-0/+4
| | | | | | | | (PR40123) Enables SLP vectorization for the SSE2 PADDS/PADDUS/PSUBS/PSUBUS style intrinsics llvm-svn: 350300
* [NFC] Fix trailing comma after function.Clement Courbet2018-12-201-1/+1
| | | | | | lib/Analysis/VectorUtils.cpp:482:2: warning: extra ‘;’ [-Wpedantic] llvm-svn: 349732
* Introduce llvm.loop.parallel_accesses and llvm.access.group metadata.Michael Kruse2018-12-201-4/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current llvm.mem.parallel_loop_access metadata has a problem in that it uses LoopIDs. LoopID unfortunately is not loop identifier. It is neither unique (there's even a regression test assigning the some LoopID to multiple loops; can otherwise happen if passes such as LoopVersioning make copies of entire loops) nor persistent (every time a property is removed/added from a LoopID's MDNode, it will also receive a new LoopID; this happens e.g. when calling Loop::setLoopAlreadyUnrolled()). Since most loop transformation passes change the loop attributes (even if it just to mark that a loop should not be processed again as llvm.loop.isvectorized does, for the versioned and unversioned loop), the parallel access information is lost for any subsequent pass. This patch unlinks LoopIDs and parallel accesses. llvm.mem.parallel_loop_access metadata on instruction is replaced by llvm.access.group metadata. llvm.access.group points to a distinct MDNode with no operands (avoiding the problem to ever need to add/remove operands), called "access group". Alternatively, it can point to a list of access groups. The LoopID then has an attribute llvm.loop.parallel_accesses with all the access groups that are parallel (no dependencies carries by this loop). This intentionally avoid any kind of "ID". Loops that are clones/have their attributes modifies retain the llvm.loop.parallel_accesses attribute. Access instructions that a cloned point to the same access group. It is not necessary for each access to have it's own "ID" MDNode, but those memory access instructions with the same behavior can be grouped together. The behavior of llvm.mem.parallel_loop_access is not changed by this patch, but should be considered deprecated. Differential Revision: https://reviews.llvm.org/D52116 llvm-svn: 349725
* [VectorUtils] Use namespace for InterleaveGroup template specialization.Florian Hahn2018-11-131-4/+6
| | | | llvm-svn: 346759
* [VPlan] VPlan version of InterleavedAccessInfo.Florian Hahn2018-11-131-10/+24
| | | | | | | | | | | | | | | | | This patch turns InterleaveGroup into a template with the instruction type being a template parameter. It also adds a VPInterleavedAccessInfo class, which only contains a mapping from VPInstructions to their respective InterleaveGroup. As we do not have access to scalar evolution in VPlan, we can re-use convert InterleavedAccessInfo to VPInterleavedAccess info. Reviewers: Ayal, mssimpso, hfinkel, dcaballe, rengolin, mkuper, hsaito Reviewed By: rengolin Differential Revision: https://reviews.llvm.org/D49489 llvm-svn: 346758
* [VectorUtils] add funnel-shifts to the list of vectorizable intrinsicsSanjay Patel2018-11-121-0/+2
| | | | | | | | | | | | | | | | This just identifies the intrinsics as candidates for vectorization. It does not mean we will attempt to vectorize under normal conditions (the test file is forcing vectorization). The cost model must be fixed to show that the transform is profitable in general. Allowing vectorization with these intrinsics is required to avoid potential regressions from canonicalizing to the intrinsics from generic IR: https://bugs.llvm.org/show_bug.cgi?id=37417 llvm-svn: 346661
* [VectorUtils] reorder list of vectorizable intrinsics; NFCSanjay Patel2018-11-121-10/+9
| | | | | | | We need to add funnel-shifts to this list, so clean up the random order before it gets worse. llvm-svn: 346660
* [LV] Support vectorization of interleave-groups that require an epilog underDorit Nuzman2018-10-311-2/+22
| | | | | | | | | | | | | | | | | | | | | | optsize using masked wide loads Under Opt for Size, the vectorizer does not vectorize interleave-groups that have gaps at the end of the group (such as a loop that reads only the even elements: a[2*i]) because that implies that we'll require a scalar epilogue (which is not allowed under Opt for Size). This patch extends the support for masked-interleave-groups (introduced by D53011 for conditional accesses) to also cover the case of gaps in a group of loads; Targets that enable the masked-interleave-group feature don't have to invalidate interleave-groups of loads with gaps; they could now use masked wide-loads and shuffles (if that's what the cost model selects). Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53668 llvm-svn: 345705
* [IAI,LV] Avoid creating a scalar epilogue due to gaps in interleave-groups when Dorit Nuzman2018-10-221-0/+24
| | | | | | | | | | | | | | | | | | | | | | | optimizing for size LV is careful to respect -Os and not to create a scalar epilog in all cases (runtime tests, trip-counts that require a remainder loop) except for peeling due to gaps in interleave-groups. This patch fixes that; -Os will now have us invalidate such interleave-groups and vectorize without an epilog. The patch also removes a related FIXME comment that is now obsolete, and was also inaccurate: "FIXME: return None if loop requiresScalarEpilog(<MaxVF>), or look for a smaller MaxVF that does not require a scalar epilog." (requiresScalarEpilog() has nothing to do with VF). Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53420 llvm-svn: 344883
* [LoopVectorize] Loop vectorization for minimum and maximumThomas Lively2018-10-191-0/+2
| | | | | | | | | | | | Summary: Depends on D52766. Reviewers: aheejin, dschuff Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52767 llvm-svn: 344816
* recommit 344472 after fixing build failure on ARM and PPC.Dorit Nuzman2018-10-141-9/+20
| | | | llvm-svn: 344475
* revert 344472 due to failures.Dorit Nuzman2018-10-141-20/+9
| | | | llvm-svn: 344473
* [IAI,LV] Add support for vectorizing predicated strided accesses using maskedDorit Nuzman2018-10-141-9/+20
| | | | | | | | | | | | | | | | | | | | | | | interleave-group The vectorizer currently does not attempt to create interleave-groups that contain predicated loads/stores; predicated strided accesses can currently be vectorized only using masked gather/scatter or scalarization. This patch makes predicated loads/stores candidates for forming interleave-groups during the Loop-Vectorizer's analysis, and adds the proper support for masked-interleave- groups to the Loop-Vectorizer's planning and transformation stages. The patch also extends the TTI API to allow querying the cost of masked interleave groups (which each target can control); Targets that support masked vector loads/ stores may choose to enable this feature and allow vectorizing predicated strided loads/stores using masked wide loads/stores and shuffles. Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D53011 llvm-svn: 344472
* [IAI,LV] Avoid creating interleave-groups for predicated accesseDorit Nuzman2018-10-071-1/+3
| | | | | | | | | | | | | | | | | | | | | | This patch fixes PR39099. When strided loads are predicated, each of them will form an interleaved-group (with gaps). However, subsequent stages of vectorization (planning and transformation) assume that if a load is part of an Interleave-Group it is not predicated, resulting in wrong code - unmasked wide loads are created. The Interleaving Analysis does take care not to have conditional interleave groups of size > 1, but until we extend the planning and transformation stages to support masked-interleave-groups we should also avoid having them for size == 1. Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D52682 llvm-svn: 343931
* [Analysis] add comment to generalize finding a scalar op from vector; NFCSanjay Patel2018-09-241-3/+4
| | | | llvm-svn: 342906
* Fix vectorization of canonicalizeMatt Arsenault2018-09-171-0/+1
| | | | llvm-svn: 342390
* [LV] Move InterleaveGroup and InterleavedAccessInfo to VectorUtils.h (NFC)Florian Hahn2018-09-121-0/+327
| | | | | | | | | | | | | Move the 2 classes out of LoopVectorize.cpp to make it easier to re-use them for VPlan outside LoopVectorize.cpp Reviewers: Ayal, mssimpso, rengolin, dcaballe, mkuper, hsaito, hfinkel, xbolva00 Reviewed By: rengolin, xbolva00 Differential Revision: https://reviews.llvm.org/D49488 llvm-svn: 342027
* Remove \brief commands from doxygen comments.Adrian Prantl2018-05-011-9/+9
| | | | | | | | | | | | | | | | We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272
* Fixed spelling mistake in comments of LLVM Analysis passesVedant Kumar2018-02-281-1/+1
| | | | | | | | Patch by Reshabh Sharma! Differential Revision: https://reviews.llvm.org/D43861 llvm-svn: 326352
* Add an @llvm.sideeffect intrinsicDan Gohman2017-11-081-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements Chandler's idea [0] for supporting languages that require support for infinite loops with side effects, such as Rust, providing part of a solution to bug 965 [1]. Specifically, it adds an `llvm.sideeffect()` intrinsic, which has no actual effect, but which appears to optimization passes to have obscure side effects, such that they don't optimize away loops containing it. It also teaches several optimization passes to ignore this intrinsic, so that it doesn't significantly impact optimization in most cases. As discussed on llvm-dev [2], this patch is the first of two major parts. The second part, to change LLVM's semantics to have defined behavior on infinite loops by default, with a function attribute for opting into potential-undefined-behavior, will be implemented and posted for review in a separate patch. [0] http://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html [1] https://bugs.llvm.org/show_bug.cgi?id=965 [2] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118632.html Differential Revision: https://reviews.llvm.org/D38336 llvm-svn: 317729
* [Constants] If we already have a ConstantInt*, prefer to use ↵Craig Topper2017-07-061-1/+1
| | | | | | | | isZero/isOne/isMinusOne instead of isNullValue/isOneValue/isAllOnesValue inherited from Constant. NFCI Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne. llvm-svn: 307292
* Sort the remaining #include lines in include/... and lib/....Chandler Carruth2017-06-061-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787
* Introduce experimental generic intrinsics for horizontal vector reductions.Amara Emerson2017-05-091-0/+1
| | | | | | | | | | | | | | - This change allows targets to opt-in to using them instead of the log2 shufflevector algorithm. - The SLP and Loop vectorizers have the common code to do shuffle reductions factored out into LoopUtils, and now have a unified interface for generating reductions regardless of the preference of the target. LoopUtils now uses TTI to determine what kind of reductions the target wants to handle. - For CodeGen, basic legalization support is added. Differential Revision: https://reviews.llvm.org/D30086 llvm-svn: 302514
* [LV] Move interleaved access helper functions to VectorUtils (NFC)Matthew Simpson2017-02-011-0/+85
| | | | | | | | | | | | This patch moves some helper functions related to interleaved access vectorization out of LoopVectorize.cpp and into VectorUtils.cpp. We would like to use these functions in a follow-on patch that improves interleaved load and store lowering in (ARM/AArch64)ISelLowering.cpp. One of the functions was already duplicated there and has been removed. Differential Revision: https://reviews.llvm.org/D29398 llvm-svn: 293788
* IR: Change the gep_type_iterator API to avoid always exposing the "current" ↵Peter Collingbourne2016-12-021-2/+2
| | | | | | | | | | | | | type. Instead, expose whether the current type is an array or a struct, if an array what the upper bound is, and if a struct the struct type itself. This is in preparation for a later change which will make PointerType derive from Type rather than SequentialType. Differential Revision: https://reviews.llvm.org/D26594 llvm-svn: 288458
* Add handling of !invariant.load to PropagateMetadata.Justin Lebar2016-09-111-6/+6
| | | | | | | | | | | | | | Summary: This will let e.g. the load/store vectorizer propagate this metadata appropriately. Reviewers: arsenm Subscribers: tra, jholewinski, hfinkel, mzolotukhin Differential Revision: https://reviews.llvm.org/D23479 llvm-svn: 281153
* SLPVectorizer: Move propagateMetadata to VectorUtilsMatt Arsenault2016-06-301-0/+41
| | | | | | | | This will be re-used by the LoadStoreVectorizer. Fix handling of range metadata and testcase by Justin Lebar. llvm-svn: 274281
* [Analysis] Enabled BITREVERSE as a vectorizable intrinsicSimon Pilgrim2016-06-041-0/+1
| | | | | | Allows XOP to vectorize BITREVERSE - other targets will follow as their costmodels improve. llvm-svn: 271803
* Revert "[VectorUtils] Query number of sign bits to allow more truncations"James Molloy2016-05-101-14/+4
| | | | | | | | This was a fairly simple patch but on closer inspection was seriously flawed and caused PR27690. This reverts commit r268921. llvm-svn: 269051
* [VectorUtils] Query number of sign bits to allow more truncationsJames Molloy2016-05-091-4/+14
| | | | | | When deciding if a vector calculation can be done in a smaller bitwidth, use sign bit information from ValueTracking to add more information and allow more truncations. llvm-svn: 268921
* [ValueTracking, VectorUtils] Refactor getIntrinsicIDForCallDavid Majnemer2016-04-191-145/+8
| | | | | | | | | | | | | The functionality contained within getIntrinsicIDForCall is two-fold: it checks if a CallInst's callee is a vectorizable intrinsic. If it isn't an intrinsic, it attempts to map the call's target to a suitable intrinsic. Move the mapping functionality into getIntrinsicForCallSite and rename getIntrinsicIDForCall to getVectorIntrinsicIDForCall while reimplementing it in terms of getIntrinsicForCallSite. llvm-svn: 266801
* [InstCombine] We folded an fcmp to an i1 instead of a vector of i1David Majnemer2016-04-131-2/+2
| | | | | | | | | Remove an ad-hoc transform in InstCombine and replace it with more general machinery (ValueTracking, InstructionSimplify and VectorUtils). This fixes PR27332. llvm-svn: 266175
OpenPOWER on IntegriCloud