diff options
author | Roman Lebedev <lebedev.ri@gmail.com> | 2019-10-10 09:25:02 +0000 |
---|---|---|
committer | Roman Lebedev <lebedev.ri@gmail.com> | 2019-10-10 09:25:02 +0000 |
commit | 536b0ee40ab97f2878dc124a321cf9108ee3d233 (patch) | |
tree | 00e71c9bd87325d531cf9ecd56b4df23f0affbed /clang/lib/CodeGen | |
parent | 0226c35262df380dafb701df999e1b77bcf8f7f9 (diff) | |
download | bcm5719-llvm-536b0ee40ab97f2878dc124a321cf9108ee3d233.tar.gz bcm5719-llvm-536b0ee40ab97f2878dc124a321cf9108ee3d233.zip |
[UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour
Summary:
Quote from http://eel.is/c++draft/expr.add#4:
```
4 When an expression J that has integral type is added to or subtracted
from an expression P of pointer type, the result has the type of P.
(4.1) If P evaluates to a null pointer value and J evaluates to 0,
the result is a null pointer value.
(4.2) Otherwise, if P points to an array element i of an array object x with n
elements ([dcl.array]), the expressions P + J and J + P
(where J has the value j) point to the (possibly-hypothetical) array
element i+j of x if 0≤i+j≤n and the expression P - J points to the
(possibly-hypothetical) array element i−j of x if 0≤i−j≤n.
(4.3) Otherwise, the behavior is undefined.
```
Therefore, as per the standard, applying non-zero offset to `nullptr`
(or making non-`nullptr` a `nullptr`, by subtracting pointer's integral value
from the pointer itself) is undefined behavior. (*if* `nullptr` is not defined,
i.e. e.g. `-fno-delete-null-pointer-checks` was *not* specified.)
To make things more fun, in C (6.5.6p8), applying *any* offset to null pointer
is undefined, although Clang front-end pessimizes the code by not lowering
that info, so this UB is "harmless".
Since rL369789 (D66608 `[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null`)
LLVM middle-end uses those guarantees for transformations.
If the source contains such UB's, said code may now be miscompiled.
Such miscompilations were already observed:
* https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html
* https://github.com/google/filament/pull/1566
Surprisingly, UBSan does not catch those issues
... until now. This diff teaches UBSan about these UB's.
`getelementpointer inbounds` is a pretty frequent instruction,
so this does have a measurable impact on performance;
I've addressed most of the obvious missing folds (and thus decreased the performance impact by ~5%),
and then re-performed some performance measurements using my [[ https://github.com/darktable-org/rawspeed | RawSpeed ]] benchmark:
(all measurements done with LLVM ToT, the sanitizer never fired.)
* no sanitization vs. existing check: average `+21.62%` slowdown
* existing check vs. check after this patch: average `22.04%` slowdown
* no sanitization vs. this patch: average `48.42%` slowdown
Reviewers: vsk, filcab, rsmith, aaron.ballman, vitalybuka, rjmccall, #sanitizers
Reviewed By: rsmith
Subscribers: kristof.beyls, nickdesaulniers, nikic, ychen, dtzWill, xbolva00, dberris, arphaman, rupprecht, reames, regehr, llvm-commits, cfe-commits
Tags: #clang, #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D67122
llvm-svn: 374293
Diffstat (limited to 'clang/lib/CodeGen')
-rw-r--r-- | clang/lib/CodeGen/CGExprScalar.cpp | 149 |
1 files changed, 99 insertions, 50 deletions
diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp index 19b1cedf32e..0da7a19debe 100644 --- a/clang/lib/CodeGen/CGExprScalar.cpp +++ b/clang/lib/CodeGen/CGExprScalar.cpp @@ -4542,7 +4542,7 @@ struct GEPOffsetAndOverflow { llvm::Value *OffsetOverflows; }; -/// Evaluate given GEPVal, which must be an inbounds GEP, +/// Evaluate given GEPVal, which is either an inbounds GEP, or a constant, /// and compute the total offset it applies from it's base pointer BasePtr. /// Returns offset in bytes and a boolean flag whether an overflow happened /// during evaluation. @@ -4550,10 +4550,28 @@ static GEPOffsetAndOverflow EmitGEPOffsetInBytes(Value *BasePtr, Value *GEPVal, llvm::LLVMContext &VMContext, CodeGenModule &CGM, CGBuilderTy Builder) { + const auto &DL = CGM.getDataLayout(); + + // The total (signed) byte offset for the GEP. + llvm::Value *TotalOffset = nullptr; + + // Was the GEP already reduced to a constant? + if (isa<llvm::Constant>(GEPVal)) { + // Compute the offset by casting both pointers to integers and subtracting: + // GEPVal = BasePtr + ptr(Offset) <--> Offset = int(GEPVal) - int(BasePtr) + Value *BasePtr_int = + Builder.CreatePtrToInt(BasePtr, DL.getIntPtrType(BasePtr->getType())); + Value *GEPVal_int = + Builder.CreatePtrToInt(GEPVal, DL.getIntPtrType(GEPVal->getType())); + TotalOffset = Builder.CreateSub(GEPVal_int, BasePtr_int); + return {TotalOffset, /*OffsetOverflows=*/Builder.getFalse()}; + } + auto *GEP = cast<llvm::GEPOperator>(GEPVal); + assert(GEP->getPointerOperand() == BasePtr && + "BasePtr must be the the base of the GEP."); assert(GEP->isInBounds() && "Expected inbounds GEP"); - const auto &DL = CGM.getDataLayout(); auto *IntPtrTy = DL.getIntPtrType(GEP->getPointerOperandType()); // Grab references to the signed add/mul overflow intrinsics for intptr_t. @@ -4563,8 +4581,6 @@ static GEPOffsetAndOverflow EmitGEPOffsetInBytes(Value *BasePtr, Value *GEPVal, auto *SMulIntrinsic = CGM.getIntrinsic(llvm::Intrinsic::smul_with_overflow, IntPtrTy); - // The total (signed) byte offset for the GEP. - llvm::Value *TotalOffset = nullptr; // The offset overflow flag - true if the total offset overflows. llvm::Value *OffsetOverflows = Builder.getFalse(); @@ -4635,69 +4651,102 @@ CodeGenFunction::EmitCheckedInBoundsGEP(Value *Ptr, ArrayRef<Value *> IdxList, if (!SanOpts.has(SanitizerKind::PointerOverflow)) return GEPVal; - // If the GEP has already been reduced to a constant, leave it be. - if (isa<llvm::Constant>(GEPVal)) - return GEPVal; + llvm::Type *PtrTy = Ptr->getType(); + + // Perform nullptr-and-offset check unless the nullptr is defined. + bool PerformNullCheck = !NullPointerIsDefined( + Builder.GetInsertBlock()->getParent(), PtrTy->getPointerAddressSpace()); + // Check for overflows unless the GEP got constant-folded, + // and only in the default address space + bool PerformOverflowCheck = + !isa<llvm::Constant>(GEPVal) && PtrTy->getPointerAddressSpace() == 0; - // Only check for overflows in the default address space. - if (GEPVal->getType()->getPointerAddressSpace()) + if (!(PerformNullCheck || PerformOverflowCheck)) return GEPVal; + const auto &DL = CGM.getDataLayout(); + SanitizerScope SanScope(this); + llvm::Type *IntPtrTy = DL.getIntPtrType(PtrTy); GEPOffsetAndOverflow EvaluatedGEP = EmitGEPOffsetInBytes(Ptr, GEPVal, getLLVMContext(), CGM, Builder); - auto *GEP = cast<llvm::GEPOperator>(GEPVal); - - const auto &DL = CGM.getDataLayout(); - auto *IntPtrTy = DL.getIntPtrType(GEP->getPointerOperandType()); + assert((!isa<llvm::Constant>(EvaluatedGEP.TotalOffset) || + EvaluatedGEP.OffsetOverflows == Builder.getFalse()) && + "If the offset got constant-folded, we don't expect that there was an " + "overflow."); auto *Zero = llvm::ConstantInt::getNullValue(IntPtrTy); - // Common case: if the total offset is zero, don't emit a check. - if (EvaluatedGEP.TotalOffset == Zero) + // Common case: if the total offset is zero, and we are using C++ semantics, + // where nullptr+0 is defined, don't emit a check. + if (EvaluatedGEP.TotalOffset == Zero && CGM.getLangOpts().CPlusPlus) return GEPVal; // Now that we've computed the total offset, add it to the base pointer (with // wrapping semantics). - auto *IntPtr = Builder.CreatePtrToInt(GEP->getPointerOperand(), IntPtrTy); + auto *IntPtr = Builder.CreatePtrToInt(Ptr, IntPtrTy); auto *ComputedGEP = Builder.CreateAdd(IntPtr, EvaluatedGEP.TotalOffset); - llvm::SmallVector<std::pair<llvm::Value *, SanitizerMask>, 1> Checks; - - // The GEP is valid if: - // 1) The total offset doesn't overflow, and - // 2) The sign of the difference between the computed address and the base - // pointer matches the sign of the total offset. - llvm::Value *ValidGEP; - auto *NoOffsetOverflow = Builder.CreateNot(EvaluatedGEP.OffsetOverflows); - if (SignedIndices) { - // GEP is computed as `unsigned base + signed offset`, therefore: - // * If offset was positive, then the computed pointer can not be - // [unsigned] less than the base pointer, unless it overflowed. - // * If offset was negative, then the computed pointer can not be - // [unsigned] greater than the bas pointere, unless it overflowed. - auto *PosOrZeroValid = Builder.CreateICmpUGE(ComputedGEP, IntPtr); - auto *PosOrZeroOffset = - Builder.CreateICmpSGE(EvaluatedGEP.TotalOffset, Zero); - llvm::Value *NegValid = Builder.CreateICmpULT(ComputedGEP, IntPtr); - ValidGEP = Builder.CreateSelect(PosOrZeroOffset, PosOrZeroValid, NegValid); - } else if (!IsSubtraction) { - // GEP is computed as `unsigned base + unsigned offset`, therefore the - // computed pointer can not be [unsigned] less than base pointer, - // unless there was an overflow. - // Equivalent to `@llvm.uadd.with.overflow(%base, %offset)`. - ValidGEP = Builder.CreateICmpUGE(ComputedGEP, IntPtr); - } else { - // GEP is computed as `unsigned base - unsigned offset`, therefore the - // computed pointer can not be [unsigned] greater than base pointer, - // unless there was an overflow. - // Equivalent to `@llvm.usub.with.overflow(%base, sub(0, %offset))`. - ValidGEP = Builder.CreateICmpULE(ComputedGEP, IntPtr); - } - ValidGEP = Builder.CreateAnd(ValidGEP, NoOffsetOverflow); - Checks.emplace_back(ValidGEP, SanitizerKind::PointerOverflow); + llvm::SmallVector<std::pair<llvm::Value *, SanitizerMask>, 2> Checks; + + if (PerformNullCheck) { + // In C++, if the base pointer evaluates to a null pointer value, + // the only valid pointer this inbounds GEP can produce is also + // a null pointer, so the offset must also evaluate to zero. + // Likewise, if we have non-zero base pointer, we can not get null pointer + // as a result, so the offset can not be -intptr_t(BasePtr). + // In other words, both pointers are either null, or both are non-null, + // or the behaviour is undefined. + // + // C, however, is more strict in this regard, and gives more + // optimization opportunities: in C, additionally, nullptr+0 is undefined. + // So both the input to the 'gep inbounds' AND the output must not be null. + auto *BaseIsNotNullptr = Builder.CreateIsNotNull(Ptr); + auto *ResultIsNotNullptr = Builder.CreateIsNotNull(ComputedGEP); + auto *Valid = + CGM.getLangOpts().CPlusPlus + ? Builder.CreateICmpEQ(BaseIsNotNullptr, ResultIsNotNullptr) + : Builder.CreateAnd(BaseIsNotNullptr, ResultIsNotNullptr); + Checks.emplace_back(Valid, SanitizerKind::PointerOverflow); + } + + if (PerformOverflowCheck) { + // The GEP is valid if: + // 1) The total offset doesn't overflow, and + // 2) The sign of the difference between the computed address and the base + // pointer matches the sign of the total offset. + llvm::Value *ValidGEP; + auto *NoOffsetOverflow = Builder.CreateNot(EvaluatedGEP.OffsetOverflows); + if (SignedIndices) { + // GEP is computed as `unsigned base + signed offset`, therefore: + // * If offset was positive, then the computed pointer can not be + // [unsigned] less than the base pointer, unless it overflowed. + // * If offset was negative, then the computed pointer can not be + // [unsigned] greater than the bas pointere, unless it overflowed. + auto *PosOrZeroValid = Builder.CreateICmpUGE(ComputedGEP, IntPtr); + auto *PosOrZeroOffset = + Builder.CreateICmpSGE(EvaluatedGEP.TotalOffset, Zero); + llvm::Value *NegValid = Builder.CreateICmpULT(ComputedGEP, IntPtr); + ValidGEP = + Builder.CreateSelect(PosOrZeroOffset, PosOrZeroValid, NegValid); + } else if (!IsSubtraction) { + // GEP is computed as `unsigned base + unsigned offset`, therefore the + // computed pointer can not be [unsigned] less than base pointer, + // unless there was an overflow. + // Equivalent to `@llvm.uadd.with.overflow(%base, %offset)`. + ValidGEP = Builder.CreateICmpUGE(ComputedGEP, IntPtr); + } else { + // GEP is computed as `unsigned base - unsigned offset`, therefore the + // computed pointer can not be [unsigned] greater than base pointer, + // unless there was an overflow. + // Equivalent to `@llvm.usub.with.overflow(%base, sub(0, %offset))`. + ValidGEP = Builder.CreateICmpULE(ComputedGEP, IntPtr); + } + ValidGEP = Builder.CreateAnd(ValidGEP, NoOffsetOverflow); + Checks.emplace_back(ValidGEP, SanitizerKind::PointerOverflow); + } assert(!Checks.empty() && "Should have produced some checks."); |