summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Analysis/VectorUtils.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Sort the remaining #include lines in include/... and lib/....Chandler Carruth2017-06-061-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787
* Introduce experimental generic intrinsics for horizontal vector reductions.Amara Emerson2017-05-091-0/+1
| | | | | | | | | | | | | | - This change allows targets to opt-in to using them instead of the log2 shufflevector algorithm. - The SLP and Loop vectorizers have the common code to do shuffle reductions factored out into LoopUtils, and now have a unified interface for generating reductions regardless of the preference of the target. LoopUtils now uses TTI to determine what kind of reductions the target wants to handle. - For CodeGen, basic legalization support is added. Differential Revision: https://reviews.llvm.org/D30086 llvm-svn: 302514
* [LV] Move interleaved access helper functions to VectorUtils (NFC)Matthew Simpson2017-02-011-0/+85
| | | | | | | | | | | | This patch moves some helper functions related to interleaved access vectorization out of LoopVectorize.cpp and into VectorUtils.cpp. We would like to use these functions in a follow-on patch that improves interleaved load and store lowering in (ARM/AArch64)ISelLowering.cpp. One of the functions was already duplicated there and has been removed. Differential Revision: https://reviews.llvm.org/D29398 llvm-svn: 293788
* IR: Change the gep_type_iterator API to avoid always exposing the "current" ↵Peter Collingbourne2016-12-021-2/+2
| | | | | | | | | | | | | type. Instead, expose whether the current type is an array or a struct, if an array what the upper bound is, and if a struct the struct type itself. This is in preparation for a later change which will make PointerType derive from Type rather than SequentialType. Differential Revision: https://reviews.llvm.org/D26594 llvm-svn: 288458
* Add handling of !invariant.load to PropagateMetadata.Justin Lebar2016-09-111-6/+6
| | | | | | | | | | | | | | Summary: This will let e.g. the load/store vectorizer propagate this metadata appropriately. Reviewers: arsenm Subscribers: tra, jholewinski, hfinkel, mzolotukhin Differential Revision: https://reviews.llvm.org/D23479 llvm-svn: 281153
* SLPVectorizer: Move propagateMetadata to VectorUtilsMatt Arsenault2016-06-301-0/+41
| | | | | | | | This will be re-used by the LoadStoreVectorizer. Fix handling of range metadata and testcase by Justin Lebar. llvm-svn: 274281
* [Analysis] Enabled BITREVERSE as a vectorizable intrinsicSimon Pilgrim2016-06-041-0/+1
| | | | | | Allows XOP to vectorize BITREVERSE - other targets will follow as their costmodels improve. llvm-svn: 271803
* Revert "[VectorUtils] Query number of sign bits to allow more truncations"James Molloy2016-05-101-14/+4
| | | | | | | | This was a fairly simple patch but on closer inspection was seriously flawed and caused PR27690. This reverts commit r268921. llvm-svn: 269051
* [VectorUtils] Query number of sign bits to allow more truncationsJames Molloy2016-05-091-4/+14
| | | | | | When deciding if a vector calculation can be done in a smaller bitwidth, use sign bit information from ValueTracking to add more information and allow more truncations. llvm-svn: 268921
* [ValueTracking, VectorUtils] Refactor getIntrinsicIDForCallDavid Majnemer2016-04-191-145/+8
| | | | | | | | | | | | | The functionality contained within getIntrinsicIDForCall is two-fold: it checks if a CallInst's callee is a vectorizable intrinsic. If it isn't an intrinsic, it attempts to map the call's target to a suitable intrinsic. Move the mapping functionality into getIntrinsicForCallSite and rename getIntrinsicIDForCall to getVectorIntrinsicIDForCall while reimplementing it in terms of getIntrinsicForCallSite. llvm-svn: 266801
* [InstCombine] We folded an fcmp to an i1 instead of a vector of i1David Majnemer2016-04-131-2/+2
| | | | | | | | | Remove an ad-hoc transform in InstCombine and replace it with more general machinery (ValueTracking, InstructionSimplify and VectorUtils). This fixes PR27332. llvm-svn: 266175
* [SLPVectorizer] Vectorizing the libm sqrt to llvm's sqrt intrinsic requires nnanDavid Majnemer2016-04-061-1/+3
| | | | | | | | | | | | | | To quote the langref "Unlike sqrt in libm, however, llvm.sqrt has undefined behavior for negative numbers other than -0.0 (which allows for better optimization, because there is no need to worry about errno being set). llvm.sqrt(-0.0) is defined to return -0.0 like IEEE sqrt." This means that it's unsafe to replace sqrt with llvm.sqrt unless the call is annotated with nnan. Thanks to Hal Finkel for pointing this out! llvm-svn: 265521
* [SLPVectorizer] Vectorize libcalls of sqrtDavid Majnemer2016-04-061-0/+4
| | | | | | | We didn't realize that we could transform the libcall into a vectorized intrinsic. llvm-svn: 265493
* [VectorUtils] Don't try and truncate PHIs to a smaller bitwidthJames Molloy2016-03-301-0/+15
| | | | | | | | | | We already try not to truncate PHIs in computeMinimalBitwidths. LoopVectorize can't handle it and we really don't need to, because both induction and reduction PHIs are truncated by other means. However, we weren't bailing out in all the places we should have, and we ended up by returning a PHI to be truncated, which has caused PR27018. This fixes PR17018. llvm-svn: 264852
* [opaque pointer types] [NFC] GEP: replace get(Pointer)ElementType uses with ↵Eduard Burtescu2016-01-191-2/+1
| | | | | | | | | | | | | | | | | | get{Source,Result}ElementType. Summary: GEPOperator: provide getResultElementType alongside getSourceElementType. This is made possible by adding a result element type field to GetElementPtrConstantExpr, which GetElementPtrInst already has. GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType. Reviewers: mjacob, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16275 llvm-svn: 258145
* [NFC] Remove one dead PointerType::getElementType() call.Manuel Jacob2016-01-171-2/+0
| | | | | | | | | | | | Reviewers: dblaikie, mjacob Subscribers: llvm-commits, dblaikie Patch by Eduard Burtescu. Differential Revision: http://reviews.llvm.org/D16274 llvm-svn: 258022
* [SCEV] Add and use SCEVConstant::getAPInt; NFCISanjoy Das2015-12-171-2/+1
| | | | llvm-svn: 255921
* Fixed a failure in getSpaltValue()Elena Demikhovsky2015-12-011-1/+2
| | | | llvm-svn: 254409
* Fixed a failure in cost calculation for vector GEPElena Demikhovsky2015-12-011-3/+4
| | | | | | | | | Cost calculation for vector GEP failed with due to invalid cast to GEP index operand. The bug is fixed, added a test. http://reviews.llvm.org/D14976 llvm-svn: 254408
* [LoopVectorize] Use MapVector rather than DenseMap for MinBWs.Charlie Turner2015-11-261-3/+3
| | | | | | | | | | | | | | | | | The order in which instructions are truncated in truncateToMinimalBitwidths effects code generation. Switch to a map with a determinisic order, since the iteration order over a DenseMap is not defined. This code is not hot, so the difference in container performance isn't interesting. Many thanks to David Blaikie for making me aware of MapVector! Fixes PR25490. Differential Revision: http://reviews.llvm.org/D14981 llvm-svn: 254179
* [LoopVectorize] Address post-commit feedback on r250032James Molloy2015-11-091-19/+19
| | | | | | | | | | Implemented as many of Michael's suggestions as were possible: * clang-format the added code while it is still fresh. * tried to change Value* to Instruction* in many places in computeMinimumValueSizes - unfortunately there are several places where Constants need to be handled so this wasn't possible. * Reduce the pass list on loop-vectorization-factors.ll. * Fix a bug where we were querying MinBWs for I->getOperand(0) but using MinBWs[I]. llvm-svn: 252469
* [LoopVectorize] Shrink integer operations into the smallest type possibleJames Molloy2015-10-121-0/+130
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | C semantics force sub-int-sized values (e.g. i8, i16) to be promoted to int type (e.g. i32) whenever arithmetic is performed on them. For targets with native i8 or i16 operations, usually InstCombine can shrink the arithmetic type down again. However InstCombine refuses to create illegal types, so for targets without i8 or i16 registers, the lengthening and shrinking remains. Most SIMD ISAs (e.g. NEON) however support vectors of i8 or i16 even when their scalar equivalents do not, so during vectorization it is important to remove these lengthens and truncates when deciding the profitability of vectorization. The algorithm this uses starts at truncs and icmps, trawling their use-def chains until they terminate or instructions outside the loop are found (or unsafe instructions like inttoptr casts are found). If the use-def chains starting from different root instructions (truncs/icmps) meet, they are unioned. The demanded bits of each node in the graph are ORed together to form an overall mask of the demanded bits in the entire graph. The minimum bitwidth that graph can be truncated to is the bitwidth minus the number of leading zeroes in the overall mask. The intention is that this algorithm should "first do no harm", so it will never insert extra cast instructions. This is why the use-def graphs are unioned, so that subgraphs with different minimum bitwidths do not need casts inserted between them. This algorithm works hard to reduce compile time impact. DemandedBits are only queried if there are extends of illegal types and if a truncate to an illegal type is seen. In the general case, this results in a simple linear scan of the instructions in the loop. No non-noise compile time impact was seen on a clang bootstrap build. llvm-svn: 250032
* NFC: Code style in VectorUtils.cppElena Demikhovsky2015-08-301-10/+12
| | | | | | Differential Revision: http://reviews.llvm.org/D12478 llvm-svn: 246381
* Revert "Revert "New interface function is added to VectorUtils Value ↵Renato Golin2015-08-301-0/+26
| | | | | | | | | *getSplatValue(Value *Val);"" This reverts commit r246379. It seems that the commit was not the culprit, and the bot will be investigated for instability. llvm-svn: 246380
* Revert "New interface function is added to VectorUtils Value ↵Renato Golin2015-08-301-26/+0
| | | | | | | | | | *getSplatValue(Value *Val);" This reverts commit r246371, as it cause a rather obscure bug in AArch64 test-suite paq8p (time outs, seg-faults). I'll investigate it before reapplying. llvm-svn: 246379
* New interface function is added to VectorUtilsElena Demikhovsky2015-08-301-0/+26
| | | | | | | | | | | | | Value *getSplatValue(Value *Val); It complements the CreateVectorSplat(), which creates 2 instructions - insertelement and shuffle with all-zero mask. The new function recognizes the pattern - insertelement+shuffle and returns the splat value (or nullptr). It also returns a splat value form ConstantDataVector, for completeness. Differential Revision: http://reviews.llvm.org/D11124 llvm-svn: 246371
* [InstSimplify] Don't assume getAggregateElement will succeedDavid Majnemer2015-08-181-4/+4
| | | | | | | It isn't always possible to get a value from getAggregateElement. This fixes PR24488. llvm-svn: 245365
* [VectorUtils] Replace 'llvm::' qualification with 'using llvm'David Majnemer2015-08-181-18/+15
| | | | | | | No funcitonal change is intended, this just makes the file look more like the rest of LLVM. llvm-svn: 245364
* De-constify pointers to Type since they can't be modified. NFCCraig Topper2015-08-011-1/+1
| | | | | | This was already done in most places a while ago. This just fixes the ones that crept in over time. llvm-svn: 243842
* [InstSimplify] Teach InstSimplify how to simplify extractelementDavid Majnemer2015-07-131-0/+52
| | | | llvm-svn: 242008
* Move getStrideFromPointer and friends from LoopVectorize to VectorUtilsHal Finkel2015-07-111-0/+146
| | | | | | | | | | | | | | | | The following functions are moved from the LoopVectorizer to VectorUtils: - getGEPInductionOperand - stripGetElementPtr - getUniqueCastUse - getStrideFromPointer These used to be static functions in LoopVectorize, but will also be used by the upcoming loop versioning LICM transformation. Patch by Ashutosh Nema! llvm-svn: 241980
* Move VectorUtils from Transforms to Analysis to correct layering violationDavid Blaikie2015-06-261-0/+213
llvm-svn: 240804
OpenPOWER on IntegriCloud