path: root/llvm/lib/Transforms/Vectorize
Commit message | Author | Age | Files | Lines
* Remove redundant variable. | Michael Zolotukhin | 2014-12-09 | 1 | -4/+2
  Tested by adding assert(LoopVectorPreHeader == VecPreheader) on LLVM test suite and SPECs.
  llvm-svn: 223847
* IR: Split Metadata from Value | Duncan P. N. Exon Smith | 2014-12-09 | 1 | -11/+11
  Split `Metadata` away from the `Value` class hierarchy, as part of PR21532. Assembly and bitcode changes are in the wings, but this is the bulk of the change for the IR C++ API.

  I have a follow-up patch prepared for `clang`. If this breaks other sub-projects, I apologize in advance :(. Help me compile it on Darwin and I'll try to fix it. FWIW, the errors should be easy to fix, so it may be simpler to just fix it yourself.

  This breaks the build for all metadata-related code that's out-of-tree. Rest assured the transition is mechanical and the compiler should catch almost all of the problems.

  Here's a quick guide for updating your code:

  - `Metadata` is the root of a class hierarchy with three main classes: `MDNode`, `MDString`, and `ValueAsMetadata`. It is distinct from the `Value` class hierarchy. It is typeless -- i.e., instances do *not* have a `Type`.

  - `MDNode`'s operands are all `Metadata *` (instead of `Value *`).

  - `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively. If you're referring solely to resolved `MDNode`s -- post graph construction -- just use `MDNode*`.

  - `MDNode` (and the rest of `Metadata`) have only limited support for `replaceAllUsesWith()`. As long as an `MDNode` is pointing at a forward declaration -- the result of `MDNode::getTemporary()` -- it maintains a side map of its uses and can RAUW itself. Once the forward declarations are fully resolved, RAUW support is dropped on the ground. This means that uniquing collisions on changing operands cause nodes to become "distinct". (This already happened fairly commonly, whenever an operand went to null.)

    If you're constructing complex (non self-reference) `MDNode` cycles, you need to call `MDNode::resolveCycles()` on each node (or on a top-level node that somehow references all of the nodes). Also, don't do that. Metadata cycles (and the RAUW machinery needed to construct them) are expensive.

  - An `MDNode` can only refer to a `Constant` through a bridge called `ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`). As a side effect, accessing an operand of an `MDNode` that is known to be, e.g., `ConstantInt`, takes three steps: first, cast from `Metadata` to `ConstantAsMetadata`; second, extract the `Constant`; third, cast down to `ConstantInt`.

    The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have metadata schema owners transition away from using `Constant`s when the type isn't important (and they don't care about referring to `GlobalValue`s).

    In the meantime, I've added transitional API to the `mdconst` namespace that matches semantics with the old code, in order to avoid adding the error-prone three-step equivalent to every call site. If your old code was:

        MDNode *N = foo();
        bar(isa             <ConstantInt>(N->getOperand(0)));
        baz(cast            <ConstantInt>(N->getOperand(1)));
        bak(cast_or_null    <ConstantInt>(N->getOperand(2)));
        bat(dyn_cast        <ConstantInt>(N->getOperand(3)));
        bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4)));

    you can trivially match its semantics with:

        MDNode *N = foo();
        bar(mdconst::hasa               <ConstantInt>(N->getOperand(0)));
        baz(mdconst::extract            <ConstantInt>(N->getOperand(1)));
        bak(mdconst::extract_or_null    <ConstantInt>(N->getOperand(2)));
        bat(mdconst::dyn_extract        <ConstantInt>(N->getOperand(3)));
        bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4)));

    and when you transition your metadata schema to `MDInt`:

        MDNode *N = foo();
        bar(isa             <MDInt>(N->getOperand(0)));
        baz(cast            <MDInt>(N->getOperand(1)));
        bak(cast_or_null    <MDInt>(N->getOperand(2)));
        bat(dyn_cast        <MDInt>(N->getOperand(3)));
        bay(dyn_cast_or_null<MDInt>(N->getOperand(4)));

  - A `CallInst` -- specifically, intrinsic instructions -- can refer to metadata through a bridge called `MetadataAsValue`. This is a subclass of `Value` where `getType()->isMetadataTy()`.

    `MetadataAsValue` is the *only* class that can legally refer to a `LocalAsMetadata`, which is a bridged form of non-`Constant` values like `Argument` and `Instruction`. It can also refer to any other `Metadata` subclass.

  (I'll break all your testcases in a follow-up commit, when I propagate this change to assembly.)
  llvm-svn: 223802
* LoopVectorize: Remove unnecessary RAUW | Duncan P. N. Exon Smith | 2014-12-03 | 1 | -2/+0
  Remove an unnecessary `MDNode::replaceAllUsesWith()`. In the preceding line, `TheLoop->setLoopID()` visits all backedges and sets the new loop ID. This sufficiently updates the loop metadata.
  Metadata RAUW is going away as part of PR21532.
  llvm-svn: 223210
* PR21302. Vectorize only bottom-tested loops. | Michael Zolotukhin | 2014-12-02 | 1 | -0/+9
  rdar://problem/18886083
  llvm-svn: 223171
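As a rough illustration of the terminology only (this is not the vectorizer's legality check), a bottom-tested loop evaluates its exit condition at the end of each iteration rather than in the header. The two functions below are hypothetical and compute the same sum; the second is the rotated, bottom-tested shape.

```cpp
#include <cstddef>

// Top-tested: the exit condition is checked before each iteration.
int sumTopTested(const int *A, std::size_t N) {
  int Sum = 0;
  for (std::size_t I = 0; I < N; ++I)
    Sum += A[I];
  return Sum;
}

// Bottom-tested (rotated): a guard skips the loop when N == 0, and the
// exit condition is checked at the end of each iteration.
int sumBottomTested(const int *A, std::size_t N) {
  int Sum = 0;
  if (N == 0)
    return Sum;
  std::size_t I = 0;
  do {
    Sum += A[I];
    ++I;
  } while (I < N);
  return Sum;
}
```

Both forms return the same result for any input; only the position of the exit test differs.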
* Revert "Masked Vector Load and Store Intrinsics." | Duncan P. N. Exon Smith | 2014-11-28 | 1 | -83/+15
  This reverts commit r222632 (and follow-up r222636), which caused a host of LNT failures on an internal bot. I'll respond to the commit on the list with a reproduction of one of the failures.
  Conflicts: lib/Target/X86/X86TargetTransformInfo.cpp
  llvm-svn: 222936
* Masked Vector Load and Store Intrinsics. | Elena Demikhovsky | 2014-11-23 | 1 | -15/+83
  Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores.
  Added SDNodes for masked operations and lowering patterns for X86 code generator.
  Examples:
    <16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask)
    declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask)
  Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch.
  http://reviews.llvm.org/D6191
  llvm-svn: 222632
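The behaviour these intrinsics describe can be modelled lane by lane in scalar code. The sketch below is an illustration under assumptions, not the intrinsics' implementation: `maskedLoad` and `maskedStore` are hypothetical helper names, and alignment is ignored.

```cpp
#include <array>
#include <cstddef>

// Scalar model of a masked load: lanes whose mask bit is set read memory;
// all other lanes take the pass-through value and never touch memory.
template <typename T, std::size_t N>
std::array<T, N> maskedLoad(const T *Addr, const std::array<T, N> &PassThru,
                            const std::array<bool, N> &Mask) {
  std::array<T, N> Result{};
  for (std::size_t I = 0; I < N; ++I)
    Result[I] = Mask[I] ? Addr[I] : PassThru[I];
  return Result;
}

// Scalar model of a masked store: only lanes with a set mask bit are written.
template <typename T, std::size_t N>
void maskedStore(T *Addr, const std::array<T, N> &Value,
                 const std::array<bool, N> &Mask) {
  for (std::size_t I = 0; I < N; ++I)
    if (Mask[I])
      Addr[I] = Value[I];
}
```

The point of the mask is that disabled lanes perform no memory access at all, which is what makes a loop with a conditional load or store safe to vectorize.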
* Vectorize a reduction chain feeding into a 'return' statement. | Suyog Sarda | 2014-11-19 | 1 | -0/+15
  e.g. return (a[0]+b[0]) + (a[1]+b[1])
  Differential Revision: http://reviews.llvm.org/D6227
  llvm-svn: 222364
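To make the pattern concrete, here is a sketch of the scalar form from the message next to a conceptual two-lane equivalent. The function names are hypothetical, and the second function only imitates in scalar code what a 2-wide add followed by a horizontal reduction would compute.

```cpp
// Scalar form from the commit message: a reduction chain feeding a return.
int sumPairsScalar(const int *A, const int *B) {
  return (A[0] + B[0]) + (A[1] + B[1]);
}

// Conceptual vectorized form: one 2-wide add, then a horizontal reduction.
int sumPairsVectorLike(const int *A, const int *B) {
  int V[2];
  for (int I = 0; I < 2; ++I)
    V[I] = A[I] + B[I]; // the lanes of the vector add
  return V[0] + V[1];   // the horizontal reduction feeding the return
}
```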
* Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool> | David Blaikie | 2014-11-19 | 2 | -6/+7
  This is to be consistent with StringSet and ultimately with the standard library's associative container insert function.
  This led to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions...
  llvm-svn: 222334
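A minimal sketch of the idea, using the standard library rather than LLVM's ADT classes: `TinySetVector` is a hypothetical, simplified stand-in, and it relies only on `std::set::insert` returning `pair<iterator, bool>`.

```cpp
#include <set>
#include <vector>

// Hypothetical, simplified set-vector: unique elements in insertion order.
template <typename T> class TinySetVector {
  std::set<T> Set;
  std::vector<T> Vec;

public:
  // Relies on the set's insert returning pair<iterator, bool>; the bool
  // says whether the element was newly inserted.
  bool insert(const T &X) {
    bool Inserted = Set.insert(X).second;
    if (Inserted)
      Vec.push_back(X);
    return Inserted;
  }

  const std::vector<T> &elements() const { return Vec; }
};
```

Because the wrapper only reads `.second`, any underlying set whose `insert` follows the standard associative-container convention could be substituted.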
* IR: Make MDString::getName() private | Duncan P. N. Exon Smith | 2014-11-13 | 1 | -1/+1
  Hide the fact that `MDString`'s string is stored in `Value::Name` -- that's going to change soon. Update the only in-tree client that was using it instead of `Value::getString()`.
  Part of PR21532.
  llvm-svn: 221951
* Revert "IR: MDNode => Value" | Duncan P. N. Exon Smith | 2014-11-11 | 2 | -4/+4
  Instead, we're going to separate metadata from the Value hierarchy. See PR21532.
  This reverts commit r221375.
  This reverts commit r221373.
  This reverts commit r221359.
  This reverts commit r221167.
  This reverts commit r221027.
  This reverts commit r221024.
  This reverts commit r221023.
  This reverts commit r220995.
  This reverts commit r220994.
  llvm-svn: 221711
* LoopVectorize: Don't assume pointees are sized | David Majnemer | 2014-11-07 | 1 | -1/+7
  A pointer's pointee might not be sized: the pointee could be a function. Report this as IK_NoInduction when calculating isInductionVariable.
  This fixes PR21508.
  llvm-svn: 221501
* IR: MDNode => Value: Instruction::getAllMetadataOtherThanDebugLoc() | Duncan P. N. Exon Smith | 2014-11-03 | 2 | -3/+3
  Change `Instruction::getAllMetadataOtherThanDebugLoc()` from a vector of `MDNode` to one of `Value`. Part of PR21433.
  llvm-svn: 221167
* IR: MDNode => Value: Instruction::getMetadata() | Duncan P. N. Exon Smith | 2014-11-01 | 1 | -1/+1
  Change `Instruction::getMetadata()` to return `Value` as part of PR21433.
  Update most callers to use `Instruction::getMDNode()`, which wraps the result in a `cast_or_null<MDNode>`.
  llvm-svn: 221024
* Correctly update dom-tree after loop vectorizer. | Michael Zolotukhin | 2014-10-31 | 1 | -1/+1
  llvm-svn: 221009
* Reformat partially, where I touched for whitespace changes. | NAKAMURA Takumi | 2014-10-28 | 1 | -2/+5
  llvm-svn: 220773
* Untabify and whitespace cleanups. | NAKAMURA Takumi | 2014-10-28 | 1 | -5/+5
  llvm-svn: 220771
* LoopVectorize: Simplify code. No functionality change. | Benjamin Kramer | 2014-10-22 | 1 | -19/+7
  llvm-svn: 220405
* Add minnum / maxnum intrinsics | Matt Arsenault | 2014-10-21 | 1 | -0/+2
  These are named following the IEEE-754 names for these functions, rather than the libm fmin / fmax, to avoid possible ambiguities. Some languages may implement something resembling fmin / fmax that returns NaN if either operand is NaN, in order to propagate errors. These intrinsics implement the IEEE-754 semantics of returning the other operand if either is a NaN, treating NaN as missing data.
  llvm-svn: 220341
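The two behaviours contrasted above can be sketched in plain C++. This is an illustration only: `minNum` and `minPropagateNaN` are hypothetical names, and the sketch does not address how signed zeros are ordered.

```cpp
#include <cmath>

// minnum-style semantics: if exactly one operand is NaN, return the other.
double minNum(double A, double B) {
  if (std::isnan(A))
    return B;
  if (std::isnan(B))
    return A;
  return A < B ? A : B;
}

// Error-propagating alternative: any NaN operand makes the result NaN.
double minPropagateNaN(double A, double B) {
  if (std::isnan(A))
    return A;
  if (std::isnan(B))
    return B;
  return A < B ? A : B;
}
```

The two functions agree whenever neither operand is NaN, so the naming question only matters for NaN inputs.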
* [SLPVectorize] Basic ephemeral-value awareness | Hal Finkel | 2014-10-15 | 1 | -3/+30
  The SLP vectorizer should not vectorize ephemeral values. These are used to express information to the optimizer, and vectorizing them does not lead to faster code (because the ephemeral values are dropped prior to code generation, vectorized or not), and obscures the information the instructions are attempting to communicate (the logic that interprets the arguments to @llvm.assume generically does not understand vectorized conditions).
  Also, uses by ephemeral values are free (because they, and the necessary extractelement instructions, will be dropped prior to code generation).
  llvm-svn: 219816
* No need to cache this unused variable. | Eric Christopher | 2014-10-14 | 1 | -3/+1
  Patch by Ehsan Akhgari.
  llvm-svn: 219749
* [LoopVectorize] Ignore @llvm.assume for cost estimates and legality | Hal Finkel | 2014-10-14 | 1 | -3/+32
  A few minor changes to prevent @llvm.assume from interfering with loop vectorization. First, treat @llvm.assume like the lifetime intrinsics, which are scalarized (but don't otherwise interfere with the legality checking). Second, ignore the cost of ephemeral instructions in the loop (these will go away anyway during CodeGen).
  Alignment assumptions and other uses of @llvm.assume can often end up inside of loops that should be vectorized (this is not uncommon for assumptions generated by __attribute__((align_value(n))), for example).
  llvm-svn: 219741
* [SCEV] Fix one more caller blindly passing the latch to SCEV's getSmallConstantTripCount even when it isn't the exiting block. | Chandler Carruth | 2014-10-11 | 1 | -2/+1
  I missed this in my first audit, very sorry. This was found in LNT and elsewhere. I don't have a test case, but it was completely obvious from inspection that this was the problem. I'll see if I can reduce a test case, but I'm not really hopeful, and the value seems quite low.
  llvm-svn: 219562
* [SCEV] Add some asserts to the recently improved trip count computation routines and fix all of the bugs they expose. | Chandler Carruth | 2014-10-11 | 1 | -3/+2
  I hit a test case that crashed even without these asserts due to passing a non-exiting latch to the ExitingBlock parameter of the trip count computation machinery. However, when I add the nice asserts, it turns out we have plenty of coverage of these bugs, they just didn't manifest in crashers.
  The core problem seems to stem from an assumption that the latch *is* the exiting block. While this is often true, and somewhat the "normal" way to think about loops, it isn't necessarily true. The correct way to call the trip count routines in a *generic* fashion (that is, without a particular exit in mind) is to just use the loop's single exiting block if it has one. The trip count can't be computed generically unless it does. This works great for the loop vectorizer. The loop unroller actually *wants* to select the latch when it has to choose between multiple exits because for unrolling it is the latch trips that matter. But if this is the desire, it needs to explicitly guard for non-exiting latches and check for the generic trip count in that case.
  I've added the asserts, and added convenience APIs for querying the trip count generically that check for a single exit block. I've kept the APIs consistent between computing trip count and trip multiples.
  Thanks to Mark for the help debugging and tracking down the *right* fix here!
  llvm-svn: 219550
* Rename getMaximumUnrollFactor -> getMaxInterleaveFactor; also rename option names controlling this variable. | Sanjay Patel | 2014-09-10 | 1 | -39/+39
  "Unroll" is not the appropriate name for this variable. Clang already uses the term "interleave" in pragmas and metadata for this.
  Differential Revision: http://reviews.llvm.org/D5066
  llvm-svn: 217528
* Preserve IR flags (nsw, nuw, exact, fast-math) in SLP vectorizer (PR20802). | Sanjay Patel | 2014-09-03 | 1 | -5/+30
  The SLP vectorizer should propagate IR-level optimization hints/flags (nsw, nuw, exact, fast-math) when converting scalar instructions into vectors. But this isn't a simple copy - we need to take the intersection (the logical 'and') of the sets of flags on the scalars.
  The solution is further complicated because we can have non-uniform (non-SIMD) vector ops after:
    http://reviews.llvm.org/D4015
    http://llvm.org/viewvc/llvm-project?view=revision&revision=211339
  The vast majority of changed files are existing tests that were not propagating IR flags, but I've also added a new test file for focused testing of IR flag possibilities.
  Differential Revision: http://reviews.llvm.org/D5172
  llvm-svn: 217051
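The intersection rule described above can be sketched with a plain bitmask. This is an assumption-laden illustration: the `IRFlag` enum and `intersectFlags` are hypothetical and do not reflect how LLVM stores these flags.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical bitmask of per-instruction flags (not LLVM's representation).
enum IRFlag : std::uint8_t { NSW = 1, NUW = 2, Exact = 4, Fast = 8 };

// A vector op may only keep a flag if every scalar it replaces had it,
// i.e. the bitwise 'and' over all of the scalar flag sets.
std::uint8_t intersectFlags(const std::vector<std::uint8_t> &ScalarFlags) {
  if (ScalarFlags.empty())
    return 0;
  std::uint8_t Result = ScalarFlags.front();
  for (std::uint8_t F : ScalarFlags)
    Result &= F;
  return Result;
}
```

For example, combining scalars flagged `NSW | NUW`, `NSW`, and `NSW | Exact` leaves only `NSW`, since that is the one flag all three share.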
* Change name of copyFlags() to copyIRFlags(). Add convenience method for logical 'and' of all flags. NFC. | Sanjay Patel | 2014-09-03 | 1 | -1/+1
  Adding 'IR' to the names in an attempt to be less ambiguous about the flags we're dealing with here.
  The 'and' method is needed by the SLPVectorizer (PR20802) and possibly other passes.
  llvm-svn: 217004
* Generate extract for in-tree uses if the use is scalar operand in vectorized instruction. | Yi Jiang | 2014-09-02 | 1 | -18/+69
  radar://18144665
  llvm-svn: 216946
* Add a convenience method to copy wrapping, exact, and fast-math flags (NFC). | Sanjay Patel | 2014-09-01 | 1 | -13/+3
  The loop vectorizer preserves wrapping, exact, and fast-math properties of scalar instructions. This patch adds a convenience method to make that operation easier because we need to do this in the loop vectorizer, SLP vectorizer, and possibly other places.
  Although this is a 'no functional change' patch, I've added a testcase to verify that the exact flag is preserved by the loop vectorizer. The wrapping and fast-math flags are already checked in existing testcases.
  Differential Revision: http://reviews.llvm.org/D5138
  llvm-svn: 216886
* Small refactor on VectorizerHint for deduplication | Renato Golin | 2014-09-01 | 1 | -93/+160
  Previously, the hint mechanism relied on clean up passes to remove redundant metadata, which still showed up if running opt at low levels of optimization. That also has shown that multiple nodes of the same type, but with different values, could still coexist, even if temporarily, and cause confusion if the next pass got the wrong value.
  This patch makes sure that, if metadata already exists in a loop, the hint mechanism will never append a new node, but always replace the existing one. It also enhances the algorithm to cope with more metadata types in the future by just adding a new type, not a lot of code.
  Re-applying again due to MSVC 2013 being minimum requirement, and this patch having C++11 that MSVC 2012 didn't support.
  Fixes PR20655.
  llvm-svn: 216870
* Fix: SLPVectorizer tried to move an instruction which was replaced by a vector instruction. | Erik Eckstein | 2014-08-28 | 1 | -4/+0
  For a detailed description of the problem see the comment in the test file. The problematic moveBefore() calls are not required anymore because the new scheduling algorithm ensures a correct ordering anyway.
  llvm-svn: 216656
* [SLP] Re-enable vectorization of GEP expressions (re-apply r210342 with a fix). | Michael Zolotukhin | 2014-08-27 | 1 | -0/+101
  llvm-svn: 216549
* Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created. | Craig Topper | 2014-08-27 | 1 | -6/+4
  llvm-svn: 216525
* Revert r210342 and r210343, add test case for the crasher. | Joerg Sonnenberger | 2014-08-26 | 1 | -91/+0
  PR 20642.
  llvm-svn: 216475
* fix typos in comments | Sanjay Patel | 2014-08-26 | 1 | -4/+4
  llvm-svn: 216424
* Allow vectorization of division by uniform power of 2. | Karthik Bhat | 2014-08-25 | 2 | -8/+31
  This patch adds support to recognize division by uniform power of 2 and modifies the cost table to vectorize division by uniform power of 2 whenever possible. Updates the cost model for the Loop and SLP Vectorizers. The cost table is currently only updated for the X86 backend.
  Thanks to Hal, Andrea, Sanjay for the review. (http://reviews.llvm.org/D4971)
  llvm-svn: 216371
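For context on why this case deserves a lower cost: a division where every lane shares the same power-of-two divisor can be done with shifts instead of a divide. The sketch below shows one common scalar formulation for signed values; it is not taken from the patch, `sdivPow2` is a hypothetical name, and it assumes `>>` on a negative value is an arithmetic shift (guaranteed from C++20, and what mainstream compilers do in earlier modes).

```cpp
#include <cstdint>

// Signed division by 2^K (0 < K < 31) without a divide instruction.
// Assumes >> on a negative int32_t is an arithmetic shift.
std::int32_t sdivPow2(std::int32_t X, unsigned K) {
  // Bias negative dividends by 2^K - 1 so the shift rounds toward zero,
  // matching the rounding of C++ integer division.
  std::int32_t Bias = (X >> 31) & ((std::int32_t(1) << K) - 1);
  return (X + Bias) >> K;
}
```

Since the same shift amount applies to every element, the sequence maps directly onto vector shift and add operations.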
* fix: SLPVectorizer crashes for unreachable blocks containing not schedulable instructions. | Erik Eckstein | 2014-08-22 | 1 | -0/+8
  In unreachable blocks it's legal to have instructions like "%x = op %x". Such instructions are not schedulable. Therefore the SLPVectorizer has to check for unreachable blocks and ignore them.
  Fixes bug 20646.
  llvm-svn: 216256
* Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. | Craig Topper | 2014-08-21 | 1 | -5/+5
  llvm-svn: 216158
* [LoopVectorizer] Limit unroll factor in the presence of nested reductions. | James Molloy | 2014-08-20 | 1 | -0/+17
  If we have a scalar reduction, we can increase the critical path length if the loop we're unrolling is inside another loop. Limit the unroll factor to 2 by default, so the critical path only gets increased by one reduction operation.
  llvm-svn: 216140
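The trade-off can be seen in a hand-interleaved reduction. This is a sketch under assumptions, not the vectorizer's output: `sumInterleaved2` is a hypothetical function showing a factor of 2 on an integer sum.

```cpp
#include <cstddef>

// Reduction interleaved by 2: two independent partial sums shorten the
// dependency chain inside the loop, at the cost of one extra add afterwards
// to combine them.
int sumInterleaved2(const int *A, std::size_t N) {
  int S0 = 0, S1 = 0;
  std::size_t I = 0;
  for (; I + 1 < N; I += 2) {
    S0 += A[I];
    S1 += A[I + 1];
  }
  if (I < N) // odd element left over
    S0 += A[I];
  return S0 + S1; // the extra combining operation
}
```

When such a loop sits inside another loop, the combining step runs once per outer iteration, and a larger factor would need more combining operations; with a factor of 2 there is exactly one.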
* Revert "Small refactor on VectorizerHint for deduplication" | Renato Golin | 2014-08-19 | 1 | -147/+91
  This reverts commit r215994 because MSVC 2012 can't cope with its C++11 goodness.
  llvm-svn: 215999
* Small refactor on VectorizerHint for deduplication | Renato Golin | 2014-08-19 | 1 | -91/+147
  Previously, the hint mechanism relied on clean up passes to remove redundant metadata, which still showed up if running opt at low levels of optimization. That also has shown that multiple nodes of the same type, but with different values, could still coexist, even if temporarily, and cause confusion if the next pass got the wrong value.
  This patch makes sure that, if metadata already exists in a loop, the hint mechanism will never append a new node, but always replace the existing one. It also enhances the algorithm to cope with more metadata types in the future by just adding a new type, not a lot of code.
  llvm-svn: 215994
* Revert "Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size." | Craig Topper | 2014-08-18 | 1 | -5/+5
  Getting a weird buildbot failure that I need to investigate.
  llvm-svn: 215870
* Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. | Craig Topper | 2014-08-17 | 1 | -5/+5
  llvm-svn: 215868
* Introduce a helper to combine instruction metadata. | Rafael Espindola | 2014-08-15 | 1 | -32/+7
  Replace the old code in GVN and BBVectorize with it. Update SimplifyCFG to use it.
  Patch by Björn Steinbrink!
  llvm-svn: 215723
* [LoopVectorizer] Enable support for floating-point subtraction reductions | James Molloy | 2014-08-08 | 1 | -1/+2
  llvm-svn: 215200
* SLPVectorizer: Use the type of the value loaded/stored to get the ABI alignment | Arnold Schwaighofer | 2014-08-07 | 1 | -2/+3
  We were using the pointer type which is incorrect.
  llvm-svn: 215162
* Teach the SLP Vectorizer that keeping some values live over a callsite can have a cost. | James Molloy | 2014-08-05 | 1 | -0/+68
  Some types, such as 128-bit vector types on AArch64, don't have any callee-saved registers. So if a value needs to stay live over a callsite, it must be spilled and refilled. This cost is now taken into account.
  llvm-svn: 214859
* fix bug 20513 - Crash in SLP Vectorizer | Erik Eckstein | 2014-08-02 | 1 | -10/+14
  llvm-svn: 214638
* Add diagnostics to the vectorizer cost model. | Tyler Nowicki | 2014-08-02 | 1 | -16/+30
  When the cost model determines vectorization is not possible/profitable these remarks print an analysis of that decision.
  Note that in selectVectorizationFactor() we can assume that OptForSize and ForceVectorization are mutually exclusive.
  Reviewed by Arnold Schwaighofer
  llvm-svn: 214599
* SLPVectorizer: fix build problem in Release configuration | Erik Eckstein | 2014-08-01 | 1 | -1/+5
  llvm-svn: 214496
* SLPVectorizer: improved scheduling algorithm. | Erik Eckstein | 2014-08-01 | 1 | -249/+693
  llvm-svn: 214494