[Analyzer] Widening loops which do not exit

Summary: Dear All, We have been looking at the following problem, where any code after the constant bound loop is not analyzed because of the limit on how many times the same block is visited, as described in bugzillas #7638 and #23438. This problem is of interest to us because we have identified significant bugs that the checkers are not locating. We have been discussing a solution involving ranges as a longer term project, but I would like to propose a patch to improve the current implementation. Example issue: ``` for (int i = 0; i < 1000; ++i) {...something...} int *p = 0; *p = 0xDEADBEEF; ``` The proposal is to go through the first and last iterations of the loop. The patch creates an exploded node for the approximate last iteration of constant bound loops, before the max loop limit / block visit limit is reached. It does this by identifying the variable in the loop condition and finding the value which is “one away” from the loop being false. For example, if the condition is (x < 10), then an exploded node is created where the value of x is 9. Evaluating the loop body with x = 9 will then result in the analysis continuing after the loop, providing x is incremented. The patch passes all the tests, with some modifications to coverage.c, in order to make the ‘function_which_gives_up’ continue to give up, since the changes allowed the analysis to progress past the loop. This patch does introduce possible false positives, as a result of not knowing the state of variables which might be modified in the loop. I believe that, as a user, I would rather have false positives after loops than do no analysis at all. I understand this may not be the common opinion and am interested in hearing your views. There are also issues regarding break statements, which are not considered. A more advanced implementation of this approach might be able to consider other conditions in the loop, which would allow paths leading to breaks to be analyzed. Lastly, I have performed a study on large code bases and I think there is little benefit in having “max-loop” default to 4 with the patch. For variable bound loops this tends to result in duplicated analysis after the loop, and it makes little difference to any constant bound loop which will do more than a few iterations. It might be beneficial to lower the default to 2, especially for the shallow analysis setting. Please let me know your opinions on this approach to processing constant bound loops and the patch itself. Regards, Sean Eveson SN Systems - Sony Computer Entertainment Group Reviewers: jordan_rose, krememek, xazax.hun, zaks.anna, dcoughlin Subscribers: krememek, xazax.hun, cfe-commits Differential Revision: http://reviews.llvm.org/D12358 llvm-svn: 251621
author: Sean Eveson <eveson.sean@gmail.com> 2015-10-29 10:04:41 +0000
committer: Sean Eveson <eveson.sean@gmail.com> 2015-10-29 10:04:41 +0000
commit: 83390e45b3c50a95ea9df0441f447e683f529b93 (patch)
tree: edbcbc7c85ae9dd1d813f133b2b41352cc7033ba /clang/lib
parent: 72640f1c9f90cb8ae9c85e6225543f77d2102361 (diff)
download: bcm5719-llvm-83390e45b3c50a95ea9df0441f447e683f529b93.tar.gz
bcm5719-llvm-83390e45b3c50a95ea9df0441f447e683f529b93.zip
4 files changed, 94 insertions, 1 deletions
diff --git a/clang/lib/StaticAnalyzer/Core/AnalyzerOptions.cpp b/clang/lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
index bf8369ea35f..54c668cd2d6 100644
--- a/clang/lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
+++ b/clang/lib/StaticAnalyzer/Core/AnalyzerOptions.cpp
@@ -338,3 +338,9 @@ bool AnalyzerOptions::shouldInlineLambdas() {
     InlineLambdas = getBooleanOption("inline-lambdas", /*Default=*/true);
   return InlineLambdas.getValue();
 }
+
+bool AnalyzerOptions::shouldWidenLoops() {
+  if (!WidenLoops.hasValue())
+    WidenLoops = getBooleanOption("widen-loops", /*Default=*/false);
+  return WidenLoops.getValue();
+}
diff --git a/clang/lib/StaticAnalyzer/Core/CMakeLists.txt b/clang/lib/StaticAnalyzer/Core/CMakeLists.txt
index 4499323de93..aaffb0b82ce 100644
--- a/clang/lib/StaticAnalyzer/Core/CMakeLists.txt
+++ b/clang/lib/StaticAnalyzer/Core/CMakeLists.txt
@@ -28,6 +28,7 @@ add_clang_library(clangStaticAnalyzerCore
   ExprEngineObjC.cpp
   FunctionSummary.cpp
   HTMLDiagnostics.cpp
+  LoopWidening.cpp
   MemRegion.cpp
   PathDiagnostic.cpp
   PlistDiagnostics.cpp
diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
index f6129a963f8..b6aed88c600 100644
--- a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
@@ -26,6 +26,7 @@
 #include "clang/StaticAnalyzer/Core/CheckerManager.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/AnalysisManager.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/LoopWidening.h"
 #include "llvm/ADT/ImmutableList.h"
 #include "llvm/ADT/Statistic.h"
 #include "llvm/Support/raw_ostream.h"
@@ -1395,8 +1396,25 @@ void ExprEngine::processCFGBlockEntrance(const BlockEdge &L,
                                          ExplodedNode *Pred) {
   PrettyStackTraceLocationContext CrashInfo(Pred->getLocationContext());
 
+  // If this block is terminated by a loop and it has already been visited the
+  // maximum number of times, widen the loop.
+  unsigned int BlockCount = nodeBuilder.getContext().blockCount();
+  if (BlockCount == AMgr.options.maxBlockVisitOnPath - 1 &&
+      AMgr.options.shouldWidenLoops()) {
+    const Stmt *Term = nodeBuilder.getContext().getBlock()->getTerminator();
+    if (!(Term &&
+          (isa<ForStmt>(Term) || isa<WhileStmt>(Term) || isa<DoStmt>(Term))))
+      return;
+    // Widen.
+    const LocationContext *LCtx = Pred->getLocationContext();
+    ProgramStateRef WidenedState =
+        getWidenedLoopState(Pred->getState(), LCtx, BlockCount, Term);
+    nodeBuilder.generateNode(WidenedState, Pred);
+    return;
+  }
+
   // FIXME: Refactor this into a checker.
-  if (nodeBuilder.getContext().blockCount() >= AMgr.options.maxBlockVisitOnPath) {
+  if (BlockCount >= AMgr.options.maxBlockVisitOnPath) {
     static SimpleProgramPointTag tag(TagProviderName, "Block count exceeded");
     const ExplodedNode *Sink =
                    nodeBuilder.generateSink(Pred->getState(), Pred, &tag);
diff --git a/clang/lib/StaticAnalyzer/Core/LoopWidening.cpp b/clang/lib/StaticAnalyzer/Core/LoopWidening.cpp
new file mode 100644
index 00000000000..1fc871f60be
--- /dev/null
+++ b/clang/lib/StaticAnalyzer/Core/LoopWidening.cpp
@@ -0,0 +1,68 @@
+//===--- LoopWidening.cpp - Instruction class definition --------*- C++ -*-===//
+//
+//                     The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+///
+/// This file contains functions which are used to widen loops. A loop may be
+/// widened to approximate the exit state(s), without analyzing every
+/// iteration. The widening is done by invalidating anything which might be
+/// modified by the body of the loop.
+///
+//===----------------------------------------------------------------------===//
+
+#include "clang/StaticAnalyzer/Core/PathSensitive/LoopWidening.h"
+
+using namespace clang;
+using namespace ento;
+
+/// Return the loops condition Stmt or NULL if LoopStmt is not a loop
+static const Expr *getLoopCondition(const Stmt *LoopStmt) {
+  switch (LoopStmt->getStmtClass()) {
+  default:
+    return NULL;
+  case Stmt::ForStmtClass:
+    return cast<ForStmt>(LoopStmt)->getCond();
+  case Stmt::WhileStmtClass:
+    return cast<WhileStmt>(LoopStmt)->getCond();
+  case Stmt::DoStmtClass:
+    return cast<DoStmt>(LoopStmt)->getCond();
+  }
+}
+
+namespace clang {
+namespace ento {
+
+ProgramStateRef getWidenedLoopState(ProgramStateRef PrevState,
+                                    const LocationContext *LCtx,
+                                    unsigned BlockCount, const Stmt *LoopStmt) {
+
+  assert(isa<ForStmt>(LoopStmt) || isa<WhileStmt>(LoopStmt) ||
+         isa<DoStmt>(LoopStmt));
+
+  // Invalidate values in the current state.
+  // TODO Make this more conservative by only invalidating values that might
+  //      be modified by the body of the loop.
+  // TODO Nested loops are currently widened as a result of the invalidation
+  //      being so inprecise. When the invalidation is improved, the handling
+  //      of nested loops will also need to be improved.
+  const StackFrameContext *STC = LCtx->getCurrentStackFrame();
+  MemRegionManager &MRMgr = PrevState->getStateManager().getRegionManager();
+  const MemRegion *Regions[] = {MRMgr.getStackLocalsRegion(STC),
+                                MRMgr.getStackArgumentsRegion(STC),
+                                MRMgr.getGlobalsRegion()};
+  RegionAndSymbolInvalidationTraits ITraits;
+  for (auto *Region : Regions) {
+    ITraits.setTrait(Region,
+                     RegionAndSymbolInvalidationTraits::TK_EntireMemSpace);
+  }
+  return PrevState->invalidateRegions(Regions, getLoopCondition(LoopStmt),
+                                      BlockCount, LCtx, true, nullptr, nullptr,
+                                      &ITraits);
+}
+
+} // end namespace ento
+} // end namespace clang
author	Sean Eveson <eveson.sean@gmail.com>	2015-10-29 10:04:41 +0000
committer	Sean Eveson <eveson.sean@gmail.com>	2015-10-29 10:04:41 +0000
commit	83390e45b3c50a95ea9df0441f447e683f529b93 (patch)
tree	edbcbc7c85ae9dd1d813f133b2b41352cc7033ba /clang/lib
parent	72640f1c9f90cb8ae9c85e6225543f77d2102361 (diff)
download	bcm5719-llvm-83390e45b3c50a95ea9df0441f447e683f529b93.tar.gz bcm5719-llvm-83390e45b3c50a95ea9df0441f447e683f529b93.zip