Introduce minimalistic cost model for auto parallelization

Instead of parallelizing every outermost parallel loop, we now use a very minimalistic cost model. Specifically, we assume innermost loops are not worth parallelizing and all non-innermost loops are.

When parallelizing all loops in LNT, we saw several slowdowns and timeouts because we parallelized innermost loops that execute only a couple of iterations (the iteration count is not known statically). With this basic heuristic enabled, LNT no longer shows any timeouts, while several interesting loops are still parallelized.

There are many ways to obtain an improved heuristic. Building such an improved heuristic from a starting point of minimal slowdown and zero code size increase seems to be the best approach, as it allows us to track progress on LNT.

llvm-svn: 222096
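
The decision logic described in this message can be sketched in isolation as follows. This is a minimal, self-contained illustration and not the Polly code itself: the `LoopFacts` and `ParallelOptions` structs are hypothetical stand-ins for the per-node queries on `IslAstInfo` and for the `-polly-parallel` / `-polly-parallel-force` options that appear in the diff.

```cpp
// Hypothetical summary of what the AST queries would report for one loop.
struct LoopFacts {
  bool IsInnermost;         // loop contains no further loops
  bool IsOutermostParallel; // outermost loop that carries no dependences
  bool IsReductionParallel; // parallel only if reductions are handled
};

// Hypothetical stand-in for the two command-line options.
struct ParallelOptions {
  bool Parallel;      // mirrors -polly-parallel
  bool ParallelForce; // mirrors -polly-parallel-force
};

// Sketch of the minimal cost model: never parallelize an innermost loop
// unless the cost model is explicitly overridden.
bool isExecutedInParallel(const LoopFacts &L, const ParallelOptions &O) {
  if (!O.Parallel)
    return false;

  // Cost model: innermost loops are assumed not to be worth parallelizing.
  if (!O.ParallelForce && L.IsInnermost)
    return false;

  return L.IsOutermostParallel && !L.IsReductionParallel;
}
```

Under these assumptions, a loop that is both outermost parallel and innermost (a single, non-nested loop) is skipped by default and is only parallelized when the force option is set.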
author: Tobias Grosser <tobias@grosser.es> 2014-11-16 14:24:53 +0000
committer: Tobias Grosser <tobias@grosser.es> 2014-11-16 14:24:53 +0000
commit: bf34f1d2b25f1c9a5e0904bdd8145e730268a498 (patch)
tree: 84599bf52438d4a1159c853d6863b8d19352fb24 /polly/lib/CodeGen
parent: 670bdb5a64291362d585bb4e744a48094f7d4695 (diff)
download: bcm5719-llvm-bf34f1d2b25f1c9a5e0904bdd8145e730268a498.tar.gz bcm5719-llvm-bf34f1d2b25f1c9a5e0904bdd8145e730268a498.zip
1 files changed, 22 insertions, 2 deletions
diff --git a/polly/lib/CodeGen/IslAst.cpp b/polly/lib/CodeGen/IslAst.cpp
index 3d442fb2122..b5fb851c853 100644
--- a/polly/lib/CodeGen/IslAst.cpp
+++ b/polly/lib/CodeGen/IslAst.cpp
@@ -47,6 +47,11 @@ static cl::opt<bool>
                   cl::desc("Generate thread parallel code (isl codegen only)"),
                   cl::init(false), cl::ZeroOrMore, cl::cat(PollyCategory));
 
+static cl::opt<bool> PollyParallelForce(
+    "polly-parallel-force",
+    cl::desc("Force generation of thread parallel code ignoring any cost model"),
+    cl::init(false), cl::ZeroOrMore, cl::cat(PollyCategory));
+
 static cl::opt<bool> UseContext("polly-ast-use-context",
                                 cl::desc("Use context"), cl::Hidden,
                                 cl::init(false), cl::ZeroOrMore,
@@ -454,8 +459,23 @@ bool IslAstInfo::isReductionParallel(__isl_keep isl_ast_node *Node) {
 }
 
 bool IslAstInfo::isExecutedInParallel(__isl_keep isl_ast_node *Node) {
-  return PollyParallel && isOutermostParallel(Node) &&
-         !isReductionParallel(Node);
+
+  if (!PollyParallel)
+    return false;
+
+  // Do not parallelize innermost loops.
+  //
+  // Parallelizing innermost loops is often not profitable, especially if
+  // they have a low number of iterations.
+  //
+  // TODO: Decide this based on the number of loop iterations that will be
+  //       executed. This can possibly require run-time checks, which again
+  //       raises the question of both run-time check overhead and code size
+  //       costs.
+  if (!PollyParallelForce && isInnermost(Node))
+    return false;
+
+  return isOutermostParallel(Node) && !isReductionParallel(Node);
 }
 
 isl_union_map *IslAstInfo::getSchedule(__isl_keep isl_ast_node *Node) {