author Ayal Zaks <ayal.zaks@intel.com> 2018-10-18 15:03:15 +0000
committer Ayal Zaks <ayal.zaks@intel.com> 2018-10-18 15:03:15 +0000
commit b0b5312e677ccbe568ffe4ea8247c4384d30b000
tree 8842218ea6576623d45b67fcf7125a660841b436 /llvm/include
parent a1e6e65b9fd68490db04530a45c9333cf69b6213
[LV] Fold tail by masking to vectorize loops of arbitrary trip count under opt for size

When optimizing for size, a loop is vectorized only if the resulting vector loop completely replaces the original scalar loop. This holds if no runtime guards are needed, if the original trip-count TC does not overflow, and if TC is a known constant that is a multiple of the VF. The last two TC-related conditions can be overcome by
1. rounding the trip-count of the vector loop up from TC to a multiple of VF;
2. masking the vector body under a newly introduced "if (i <= TC-1)" condition.

The patch allows loops with arbitrary trip counts to be vectorized under -Os, subject to the existing cost model considerations. It also applies to loops with small trip counts (under -O2) which are currently handled as if under -Os.

The patch does not handle loops with reductions, live-outs, or w/o a primary induction variable, and disallows interleave groups.

(Third, final and main part of -)

Differential Revision: https://reviews.llvm.org/D50480

llvm-svn: 344743
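
For readers unfamiliar with the technique, the two steps in the commit message (rounding the trip count up to a multiple of VF, then guarding each lane with "i <= TC-1") can be sketched in plain C++. This is only an illustration under stated assumptions, not the code the vectorizer emits: the function names `addScalar` and `addTailFolded`, the fixed `VF` of 4, and the inner per-lane loop standing in for a masked vector operation are all hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical vectorization factor, chosen only for illustration.
constexpr std::size_t VF = 4;

// Scalar reference loop: a[i] += b[i] for every i in [0, TC).
void addScalar(std::vector<int> &a, const std::vector<int> &b) {
  for (std::size_t i = 0; i < a.size(); ++i)
    a[i] += b[i];
}

// Same computation with the tail folded into the "vector" loop by masking.
// The inner lane loop models one masked vector operation of width VF.
void addTailFolded(std::vector<int> &a, const std::vector<int> &b) {
  const std::size_t TC = a.size();
  // Step 1: round the trip count up from TC to a multiple of VF.
  // For TC == 0 this yields 0, so the loop body (and TC - 1) is never reached.
  const std::size_t RoundedTC = (TC + VF - 1) / VF * VF;
  for (std::size_t i = 0; i < RoundedTC; i += VF) {
    for (std::size_t lane = 0; lane < VF; ++lane) {
      // Step 2: mask each lane under the condition "i <= TC-1".
      const bool Mask = (i + lane) <= TC - 1;
      if (Mask)
        a[i + lane] += b[i + lane];
    }
  }
}
```

Because inactive lanes are masked off, no scalar remainder loop is needed, which is what lets the vector loop completely replace the original loop when optimizing for size. Calling `addTailFolded` on arrays of length 7 runs two iterations of width 4, with the last lane of the second iteration masked off.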
Diffstat (limited to 'llvm/include')
-rw-r--r-- llvm/include/llvm/Analysis/VectorUtils.h 21
-rw-r--r-- llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h 4
2 files changed, 19 insertions, 6 deletions
diff --git a/llvm/include/llvm/Analysis/VectorUtils.h b/llvm/include/llvm/Analysis/VectorUtils.h
index 2ac49f67662..937a52fb968 100644
--- a/llvm/include/llvm/Analysis/VectorUtils.h
+++ b/llvm/include/llvm/Analysis/VectorUtils.h
@@ -345,20 +345,29 @@ public:
const LoopAccessInfo *LAI)
: PSE(PSE), TheLoop(L), DT(DT), LI(LI), LAI(LAI) {}
- ~InterleavedAccessInfo() {
+ ~InterleavedAccessInfo() { reset(); }
+
+ /// Analyze the interleaved accesses and collect them in interleave
+ /// groups. Substitute symbolic strides using \p Strides.
+ /// Consider also predicated loads/stores in the analysis if
+ /// \p EnableMaskedInterleavedGroup is true.
+ void analyzeInterleaving(bool EnableMaskedInterleavedGroup);
+
+ /// Invalidate groups, e.g., in case all blocks in loop will be predicated
+ /// contrary to original assumption. Although we currently prevent group
+ /// formation for predicated accesses, we may be able to relax this limitation
+ /// in the future once we handle more complicated blocks.
+ void reset() {
SmallPtrSet<InterleaveGroup *, 4> DelSet;
// Avoid releasing a pointer twice.
for (auto &I : InterleaveGroupMap)
DelSet.insert(I.second);
for (auto *Ptr : DelSet)
delete Ptr;
+ InterleaveGroupMap.clear();
+ RequiresScalarEpilogue = false;
}
- /// Analyze the interleaved accesses and collect them in interleave
- /// groups. Substitute symbolic strides using \p Strides.
- /// Consider also predicated loads/stores in the analysis if
- /// \p EnableMaskedInterleavedGroup is true.
- void analyzeInterleaving(bool EnableMaskedInterleavedGroup);
/// Check if \p Instr belongs to any interleave group.
bool isInterleaved(Instruction *Instr) const {
diff --git a/llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h b/llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
index 2a6242099b2..ceb660daa28 100644
--- a/llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
+++ b/llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
@@ -241,6 +241,10 @@ public:
/// If false, good old LV code.
bool canVectorize(bool UseVPlanNativePath);
+ /// Return true if we can vectorize this loop while folding its tail by
+ /// masking.
+ bool canFoldTailByMasking();
+
/// Returns the primary induction variable.
PHINode *getPrimaryInduction() { return PrimaryInduction; }
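
The `reset()` body in the VectorUtils.h hunk relies on a small idiom: several map entries may point at the same group, so the pointers are first collected into a set and each is deleted once. A minimal standalone sketch of that idiom follows. It uses standard containers instead of LLVM's `SmallPtrSet`, and `Group`, `GroupInfo`, `addGroup`, `size`, and `LiveGroups` are hypothetical names introduced only for this example.

```cpp
#include <cstddef>
#include <initializer_list>
#include <map>
#include <set>

// Hypothetical counter of live groups, used only to observe deletions.
int LiveGroups = 0;

// Hypothetical stand-in for an interleave group.
struct Group {
  Group() { ++LiveGroups; }
  ~Group() { --LiveGroups; }
};

class GroupInfo {
public:
  GroupInfo() = default;
  GroupInfo(const GroupInfo &) = delete;
  GroupInfo &operator=(const GroupInfo &) = delete;
  ~GroupInfo() { reset(); }

  // Map every member to one shared group.
  // Assumes none of the members is already mapped.
  void addGroup(std::initializer_list<int> Members) {
    Group *G = new Group();
    for (int M : Members)
      GroupMap[M] = G;
  }

  // Delete each distinct group exactly once, then return to the initial state.
  void reset() {
    std::set<Group *> DelSet;
    // Avoid releasing a pointer twice: several keys share one group.
    for (auto &Entry : GroupMap)
      DelSet.insert(Entry.second);
    for (Group *G : DelSet)
      delete G;
    GroupMap.clear();
    RequiresScalarEpilogue = false;
  }

  std::size_t size() const { return GroupMap.size(); }

private:
  std::map<int, Group *> GroupMap;
  bool RequiresScalarEpilogue = false;
};
```

Clearing the map after the deletions is what makes `reset()` safe to call more than once and from the destructor, since no dangling pointers remain to be collected on a second pass.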