[clang][ubsan] Implicit Conversion Sanitizer - integer sign change - clang part

This is the second half of Implicit Integer Conversion Sanitizer. It completes the first half, and finally makes the sanitizer fully functional! Only the bitfield handling is missing. Summary: C and C++ are interesting languages. They are statically typed, but weakly. The implicit conversions are allowed. This is nice, allows to write code while balancing between getting drowned in everything being convertible, and nothing being convertible. As usual, this comes with a price: ``` void consume(unsigned int val); void test(int val) { consume(val); // The 'val' is `signed int`, but `consume()` takes `unsigned int`. // If val is negative, then consume() will be operating on a large // unsigned value, and you may or may not have a bug. // But yes, sometimes this is intentional. // Making the conversion explicit silences the sanitizer. consume((unsigned int)val); } ``` Yes, there is a `-Wsign-conversion`` diagnostic group, but first, it is kinda noisy, since it warns on everything (unlike sanitizers, warning on an actual issues), and second, likely there are cases where it does **not** warn. The actual detection is pretty easy. We just need to check each of the values whether it is negative, and equality-compare the results of those comparisons. The unsigned value is obviously non-negative. Zero is non-negative too. https://godbolt.org/g/w93oj2 We do not have to emit the check *always*, there are obvious situations where we can avoid emitting it, since it would **always** get optimized-out. But i do think the tautological IR (`icmp ult %x, 0`, which is always false) should be emitted, and the middle-end should cleanup it. This sanitizer is in the `-fsanitize=implicit-conversion` group, and is a logical continuation of D48958 `-fsanitize=implicit-integer-truncation`. As for the ordering, i'we opted to emit the check **after** `-fsanitize=implicit-integer-truncation`. At least on these simple 16 test cases, this results in 1 of the 12 emitted checks being optimized away, as compared to 0 checks being optimized away if the order is reversed. This is a clang part. The compiler-rt part is D50251. Finishes fixing [[ https://bugs.llvm.org/show_bug.cgi?id=21530 | PR21530 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=37552 | PR37552 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=35409 | PR35409 ]]. Finishes partially fixing [[ https://bugs.llvm.org/show_bug.cgi?id=9821 | PR9821 ]]. Finishes fixing https://github.com/google/sanitizers/issues/940. Only the bitfield handling is missing. Reviewers: vsk, rsmith, rjmccall, #sanitizers, erichkeane Reviewed By: rsmith Subscribers: chandlerc, filcab, cfe-commits, regehr Tags: #sanitizers, #clang Differential Revision: https://reviews.llvm.org/D50250 llvm-svn: 345660
author: Roman Lebedev <lebedev.ri@gmail.com> 2018-10-30 21:58:56 +0000
committer: Roman Lebedev <lebedev.ri@gmail.com> 2018-10-30 21:58:56 +0000
commit: 62debd8055159de9576eb443b138803bf11c4bd7 (patch)
tree: 4180552a5f9004a318866f7aae6612b0655488ed /clang/lib/CodeGen
parent: 320e9af3096bf7b52d5a6fd780bab83e117989a9 (diff)
download: bcm5719-llvm-62debd8055159de9576eb443b138803bf11c4bd7.tar.gz
bcm5719-llvm-62debd8055159de9576eb443b138803bf11c4bd7.zip
1 files changed, 223 insertions, 34 deletions
diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index a2b3f99573e..e95cd7d2d63 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -306,6 +306,8 @@ public:
     ICCK_IntegerTruncation = 0, // Legacy, was only used by clang 7.
     ICCK_UnsignedIntegerTruncation = 1,
     ICCK_SignedIntegerTruncation = 2,
+    ICCK_IntegerSignChange = 3,
+    ICCK_SignedIntegerTruncationOrSignChange = 4,
   };
 
   /// Emit a check that an [implicit] truncation of an integer  does not
@@ -313,15 +315,23 @@ public:
   void EmitIntegerTruncationCheck(Value *Src, QualType SrcType, Value *Dst,
                                   QualType DstType, SourceLocation Loc);
 
+  /// Emit a check that an [implicit] conversion of an integer does not change
+  /// the sign of the value. It is not UB, so we use the value after conversion.
+  /// NOTE: Src and Dst may be the exact same value! (point to the same thing)
+  void EmitIntegerSignChangeCheck(Value *Src, QualType SrcType, Value *Dst,
+                                  QualType DstType, SourceLocation Loc);
+
   /// Emit a conversion from the specified type to the specified destination
   /// type, both of which are LLVM scalar types.
   struct ScalarConversionOpts {
     bool TreatBooleanAsSigned;
     bool EmitImplicitIntegerTruncationChecks;
+    bool EmitImplicitIntegerSignChangeChecks;
 
     ScalarConversionOpts()
         : TreatBooleanAsSigned(false),
-          EmitImplicitIntegerTruncationChecks(false) {}
+          EmitImplicitIntegerTruncationChecks(false),
+          EmitImplicitIntegerSignChangeChecks(false) {}
   };
   Value *
   EmitScalarConversion(Value *Src, QualType SrcTy, QualType DstTy,
@@ -947,66 +957,231 @@ void ScalarExprEmitter::EmitFloatConversionCheck(
                 SanitizerHandler::FloatCastOverflow, StaticArgs, OrigSrc);
 }
 
+// Should be called within CodeGenFunction::SanitizerScope RAII scope.
+// Returns 'i1 false' when the truncation Src -> Dst was lossy.
+static std::pair<ScalarExprEmitter::ImplicitConversionCheckKind,
+                 std::pair<llvm::Value *, SanitizerMask>>
+EmitIntegerTruncationCheckHelper(Value *Src, QualType SrcType, Value *Dst,
+                                 QualType DstType, CGBuilderTy &Builder) {
+  llvm::Type *SrcTy = Src->getType();
+  llvm::Type *DstTy = Dst->getType();
+
+  // This should be truncation of integral types.
+  assert(Src != Dst);
+  assert(SrcTy->getScalarSizeInBits() > Dst->getType()->getScalarSizeInBits());
+  assert(isa<llvm::IntegerType>(SrcTy) && isa<llvm::IntegerType>(DstTy) &&
+         "non-integer llvm type");
+
+  bool SrcSigned = SrcType->isSignedIntegerOrEnumerationType();
+  bool DstSigned = DstType->isSignedIntegerOrEnumerationType();
+
+  // If both (src and dst) types are unsigned, then it's an unsigned truncation.
+  // Else, it is a signed truncation.
+  ScalarExprEmitter::ImplicitConversionCheckKind Kind;
+  SanitizerMask Mask;
+  if (!SrcSigned && !DstSigned) {
+    Kind = ScalarExprEmitter::ICCK_UnsignedIntegerTruncation;
+    Mask = SanitizerKind::ImplicitUnsignedIntegerTruncation;
+  } else {
+    Kind = ScalarExprEmitter::ICCK_SignedIntegerTruncation;
+    Mask = SanitizerKind::ImplicitSignedIntegerTruncation;
+  }
+
+  llvm::Value *Check = nullptr;
+  // 1. Extend the truncated value back to the same width as the Src.
+  Check = Builder.CreateIntCast(Dst, SrcTy, DstSigned, "anyext");
+  // 2. Equality-compare with the original source value
+  Check = Builder.CreateICmpEQ(Check, Src, "truncheck");
+  // If the comparison result is 'i1 false', then the truncation was lossy.
+  return std::make_pair(Kind, std::make_pair(Check, Mask));
+}
+
 void ScalarExprEmitter::EmitIntegerTruncationCheck(Value *Src, QualType SrcType,
                                                    Value *Dst, QualType DstType,
                                                    SourceLocation Loc) {
   if (!CGF.SanOpts.hasOneOf(SanitizerKind::ImplicitIntegerTruncation))
     return;
 
-  llvm::Type *SrcTy = Src->getType();
-  llvm::Type *DstTy = Dst->getType();
-
   // We only care about int->int conversions here.
   // We ignore conversions to/from pointer and/or bool.
   if (!(SrcType->isIntegerType() && DstType->isIntegerType()))
     return;
 
-  assert(isa<llvm::IntegerType>(SrcTy) && isa<llvm::IntegerType>(DstTy) &&
-         "clang integer type lowered to non-integer llvm type");
-
-  unsigned SrcBits = SrcTy->getScalarSizeInBits();
-  unsigned DstBits = DstTy->getScalarSizeInBits();
+  unsigned SrcBits = Src->getType()->getScalarSizeInBits();
+  unsigned DstBits = Dst->getType()->getScalarSizeInBits();
   // This must be truncation. Else we do not care.
   if (SrcBits <= DstBits)
     return;
 
   assert(!DstType->isBooleanType() && "we should not get here with booleans.");
 
+  // If the integer sign change sanitizer is enabled,
+  // and we are truncating from larger unsigned type to smaller signed type,
+  // let that next sanitizer deal with it.
   bool SrcSigned = SrcType->isSignedIntegerOrEnumerationType();
   bool DstSigned = DstType->isSignedIntegerOrEnumerationType();
+  if (CGF.SanOpts.has(SanitizerKind::ImplicitIntegerSignChange) &&
+      (!SrcSigned && DstSigned))
+    return;
 
-  // If both (src and dst) types are unsigned, then it's an unsigned truncation.
-  // Else, it is a signed truncation.
-  ImplicitConversionCheckKind Kind;
-  SanitizerMask Mask;
-  if (!SrcSigned && !DstSigned) {
-    Kind = ICCK_UnsignedIntegerTruncation;
-    Mask = SanitizerKind::ImplicitUnsignedIntegerTruncation;
-  } else {
-    Kind = ICCK_SignedIntegerTruncation;
-    Mask = SanitizerKind::ImplicitSignedIntegerTruncation;
-  }
+  CodeGenFunction::SanitizerScope SanScope(&CGF);
+
+  std::pair<ScalarExprEmitter::ImplicitConversionCheckKind,
+            std::pair<llvm::Value *, SanitizerMask>>
+      Check =
+          EmitIntegerTruncationCheckHelper(Src, SrcType, Dst, DstType, Builder);
+  // If the comparison result is 'i1 false', then the truncation was lossy.
 
   // Do we care about this type of truncation?
-  if (!CGF.SanOpts.has(Mask))
+  if (!CGF.SanOpts.has(Check.second.second))
     return;
 
-  CodeGenFunction::SanitizerScope SanScope(&CGF);
+  llvm::Constant *StaticArgs[] = {
+      CGF.EmitCheckSourceLocation(Loc), CGF.EmitCheckTypeDescriptor(SrcType),
+      CGF.EmitCheckTypeDescriptor(DstType),
+      llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first)};
+  CGF.EmitCheck(Check.second, SanitizerHandler::ImplicitConversion, StaticArgs,
+                {Src, Dst});
+}
+
+// Should be called within CodeGenFunction::SanitizerScope RAII scope.
+// Returns 'i1 false' when the conversion Src -> Dst changed the sign.
+static std::pair<ScalarExprEmitter::ImplicitConversionCheckKind,
+                 std::pair<llvm::Value *, SanitizerMask>>
+EmitIntegerSignChangeCheckHelper(Value *Src, QualType SrcType, Value *Dst,
+                                 QualType DstType, CGBuilderTy &Builder) {
+  llvm::Type *SrcTy = Src->getType();
+  llvm::Type *DstTy = Dst->getType();
+
+  assert(isa<llvm::IntegerType>(SrcTy) && isa<llvm::IntegerType>(DstTy) &&
+         "non-integer llvm type");
 
+  bool SrcSigned = SrcType->isSignedIntegerOrEnumerationType();
+  bool DstSigned = DstType->isSignedIntegerOrEnumerationType();
+  unsigned SrcBits = SrcTy->getScalarSizeInBits();
+  unsigned DstBits = DstTy->getScalarSizeInBits();
+  (void)SrcBits; // Only used in assert()
+  (void)DstBits; // Only used in assert()
+
+  assert(((SrcBits != DstBits) || (SrcSigned != DstSigned)) &&
+         "either the widths should be different, or the signednesses.");
+
+  // NOTE: zero value is considered to be non-negative.
+  auto EmitIsNegativeTest = [&Builder](Value *V, QualType VType,
+                                       const char *Name) -> Value * {
+    // Is this value a signed type?
+    bool VSigned = VType->isSignedIntegerOrEnumerationType();
+    llvm::Type *VTy = V->getType();
+    if (!VSigned) {
+      // If the value is unsigned, then it is never negative.
+      // FIXME: can we encounter non-scalar VTy here?
+      return llvm::ConstantInt::getFalse(VTy->getContext());
+    }
+    // Get the zero of the same type with which we will be comparing.
+    llvm::Constant *Zero = llvm::ConstantInt::get(VTy, 0);
+    // %V.isnegative = icmp slt %V, 0
+    // I.e is %V *strictly* less than zero, does it have negative value?
+    return Builder.CreateICmp(llvm::ICmpInst::ICMP_SLT, V, Zero,
+                              llvm::Twine(Name) + "." + V->getName() +
+                                  ".negativitycheck");
+  };
+
+  // 1. Was the old Value negative?
+  llvm::Value *SrcIsNegative = EmitIsNegativeTest(Src, SrcType, "src");
+  // 2. Is the new Value negative?
+  llvm::Value *DstIsNegative = EmitIsNegativeTest(Dst, DstType, "dst");
+  // 3. Now, was the 'negativity status' preserved during the conversion?
+  //    NOTE: conversion from negative to zero is considered to change the sign.
+  //    (We want to get 'false' when the conversion changed the sign)
+  //    So we should just equality-compare the negativity statuses.
   llvm::Value *Check = nullptr;
+  Check = Builder.CreateICmpEQ(SrcIsNegative, DstIsNegative, "signchangecheck");
+  // If the comparison result is 'false', then the conversion changed the sign.
+  return std::make_pair(
+      ScalarExprEmitter::ICCK_IntegerSignChange,
+      std::make_pair(Check, SanitizerKind::ImplicitIntegerSignChange));
+}
 
-  // 1. Extend the truncated value back to the same width as the Src.
-  Check = Builder.CreateIntCast(Dst, SrcTy, DstSigned, "anyext");
-  // 2. Equality-compare with the original source value
-  Check = Builder.CreateICmpEQ(Check, Src, "truncheck");
-  // If the comparison result is 'i1 false', then the truncation was lossy.
+void ScalarExprEmitter::EmitIntegerSignChangeCheck(Value *Src, QualType SrcType,
+                                                   Value *Dst, QualType DstType,
+                                                   SourceLocation Loc) {
+  if (!CGF.SanOpts.has(SanitizerKind::ImplicitIntegerSignChange))
+    return;
+
+  llvm::Type *SrcTy = Src->getType();
+  llvm::Type *DstTy = Dst->getType();
+
+  // We only care about int->int conversions here.
+  // We ignore conversions to/from pointer and/or bool.
+  if (!(SrcType->isIntegerType() && DstType->isIntegerType()))
+    return;
+
+  bool SrcSigned = SrcType->isSignedIntegerOrEnumerationType();
+  bool DstSigned = DstType->isSignedIntegerOrEnumerationType();
+  unsigned SrcBits = SrcTy->getScalarSizeInBits();
+  unsigned DstBits = DstTy->getScalarSizeInBits();
+
+  // Now, we do not need to emit the check in *all* of the cases.
+  // We can avoid emitting it in some obvious cases where it would have been
+  // dropped by the opt passes (instcombine) always anyways.
+  // If it's a cast between the same type, just differently-sugared. no check.
+  QualType CanonSrcType = CGF.getContext().getCanonicalType(SrcType);
+  QualType CanonDstType = CGF.getContext().getCanonicalType(DstType);
+  if (CanonSrcType == CanonDstType)
+    return;
+  // At least one of the values needs to have signed type.
+  // If both are unsigned, then obviously, neither of them can be negative.
+  if (!SrcSigned && !DstSigned)
+    return;
+  // If the conversion is to *larger* *signed* type, then no check is needed.
+  // Because either sign-extension happens (so the sign will remain),
+  // or zero-extension will happen (the sign bit will be zero.)
+  if ((DstBits > SrcBits) && DstSigned)
+    return;
+  if (CGF.SanOpts.has(SanitizerKind::ImplicitSignedIntegerTruncation) &&
+      (SrcBits > DstBits) && SrcSigned) {
+    // If the signed integer truncation sanitizer is enabled,
+    // and this is a truncation from signed type, then no check is needed.
+    // Because here sign change check is interchangeable with truncation check.
+    return;
+  }
+  // That's it. We can't rule out any more cases with the data we have.
+
+  CodeGenFunction::SanitizerScope SanScope(&CGF);
+
+  std::pair<ScalarExprEmitter::ImplicitConversionCheckKind,
+            std::pair<llvm::Value *, SanitizerMask>>
+      Check;
+
+  // Each of these checks needs to return 'false' when an issue was detected.
+  ImplicitConversionCheckKind CheckKind;
+  llvm::SmallVector<std::pair<llvm::Value *, SanitizerMask>, 2> Checks;
+  // So we can 'and' all the checks together, and still get 'false',
+  // if at least one of the checks detected an issue.
+
+  Check = EmitIntegerSignChangeCheckHelper(Src, SrcType, Dst, DstType, Builder);
+  CheckKind = Check.first;
+  Checks.emplace_back(Check.second);
+
+  if (CGF.SanOpts.has(SanitizerKind::ImplicitSignedIntegerTruncation) &&
+      (SrcBits > DstBits) && !SrcSigned && DstSigned) {
+    // If the signed integer truncation sanitizer was enabled,
+    // and we are truncating from larger unsigned type to smaller signed type,
+    // let's handle the case we skipped in that check.
+    Check =
+        EmitIntegerTruncationCheckHelper(Src, SrcType, Dst, DstType, Builder);
+    CheckKind = ICCK_SignedIntegerTruncationOrSignChange;
+    Checks.emplace_back(Check.second);
+    // If the comparison result is 'i1 false', then the truncation was lossy.
+  }
 
   llvm::Constant *StaticArgs[] = {
       CGF.EmitCheckSourceLocation(Loc), CGF.EmitCheckTypeDescriptor(SrcType),
       CGF.EmitCheckTypeDescriptor(DstType),
-      llvm::ConstantInt::get(Builder.getInt8Ty(), Kind)};
-  CGF.EmitCheck(std::make_pair(Check, Mask),
-                SanitizerHandler::ImplicitConversion, StaticArgs, {Src, Dst});
+      llvm::ConstantInt::get(Builder.getInt8Ty(), CheckKind)};
+  // EmitCheck() will 'and' all the checks together.
+  CGF.EmitCheck(Checks, SanitizerHandler::ImplicitConversion, StaticArgs,
+                {Src, Dst});
 }
 
 /// Emit a conversion from the specified type to the specified destination type,
@@ -1081,8 +1256,13 @@ Value *ScalarExprEmitter::EmitScalarConversion(Value *Src, QualType SrcType,
   }
 
   // Ignore conversions like int -> uint.
-  if (SrcTy == DstTy)
+  if (SrcTy == DstTy) {
+    if (Opts.EmitImplicitIntegerSignChangeChecks)
+      EmitIntegerSignChangeCheck(Src, NoncanonicalSrcType, Src,
+                                 NoncanonicalDstType, Loc);
+
     return Src;
+  }
 
   // Handle pointer conversions next: pointers can only be converted to/from
   // other pointers and integers. Check for pointer types in terms of LLVM, as
@@ -1226,6 +1406,10 @@ Value *ScalarExprEmitter::EmitScalarConversion(Value *Src, QualType SrcType,
     EmitIntegerTruncationCheck(Src, NoncanonicalSrcType, Res,
                                NoncanonicalDstType, Loc);
 
+  if (Opts.EmitImplicitIntegerSignChangeChecks)
+    EmitIntegerSignChangeCheck(Src, NoncanonicalSrcType, Res,
+                               NoncanonicalDstType, Loc);
+
   return Res;
 }
 
@@ -2010,9 +2194,14 @@ Value *ScalarExprEmitter::VisitCastExpr(CastExpr *CE) {
 
   case CK_IntegralCast: {
     ScalarConversionOpts Opts;
-    if (CGF.SanOpts.hasOneOf(SanitizerKind::ImplicitIntegerTruncation)) {
-      if (auto *ICE = dyn_cast<ImplicitCastExpr>(CE))
-        Opts.EmitImplicitIntegerTruncationChecks = !ICE->isPartOfExplicitCast();
+    if (auto *ICE = dyn_cast<ImplicitCastExpr>(CE)) {
+      if (CGF.SanOpts.hasOneOf(SanitizerKind::ImplicitConversion) &&
+          !ICE->isPartOfExplicitCast()) {
+        Opts.EmitImplicitIntegerTruncationChecks =
+            CGF.SanOpts.hasOneOf(SanitizerKind::ImplicitIntegerTruncation);
+        Opts.EmitImplicitIntegerSignChangeChecks =
+            CGF.SanOpts.has(SanitizerKind::ImplicitIntegerSignChange);
+      }
     }
     return EmitScalarConversion(Visit(E), E->getType(), DestTy,
                                 CE->getExprLoc(), Opts);
author	Roman Lebedev <lebedev.ri@gmail.com>	2018-10-30 21:58:56 +0000
committer	Roman Lebedev <lebedev.ri@gmail.com>	2018-10-30 21:58:56 +0000
commit	62debd8055159de9576eb443b138803bf11c4bd7 (patch)
tree	4180552a5f9004a318866f7aae6612b0655488ed /clang/lib/CodeGen
parent	320e9af3096bf7b52d5a6fd780bab83e117989a9 (diff)
download	bcm5719-llvm-62debd8055159de9576eb443b138803bf11c4bd7.tar.gz bcm5719-llvm-62debd8055159de9576eb443b138803bf11c4bd7.zip