path: root/compiler-rt/lib/scudo/scudo_allocator.cpp
Commit message (Author, Age; Files, Lines)
* [GWP-ASan] [Scudo] Add GWP-ASan backtrace for alloc/free to Scudo. (Mitch Phillips, 2019-07-02; 1 file, -1/+5)
| | | | | | | | | | | | | | | | | | | | | Summary: Adds allocation and deallocation stack trace support to Scudo. The default provided backtrace library for GWP-ASan is supplied by the libc unwinder, and is suitable for production variants of Scudo. If Scudo in future has its own unwinder, it may choose to use its own over the generic unwinder instead. Reviewers: cryptoad Reviewed By: cryptoad Subscribers: kubamracek, mgorny, #sanitizers, llvm-commits, morehouse, vlad.tsyrklevich, eugenis Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D64085 llvm-svn: 364966
* [GWP-ASan] Integration with Scudo [5]. (Mitch Phillips, 2019-06-17; 1 file, -0/+47)
| | | | | | | | | | | | | | | | | | | | | | Summary: See D60593 for further information. This patch adds GWP-ASan support to the Scudo hardened allocator. It also implements end-to-end integration tests using Scudo as the backing allocator. The tests include crash handling for buffer over/underflow as well as use-after-free detection. Reviewers: vlad.tsyrklevich, cryptoad Reviewed By: vlad.tsyrklevich, cryptoad Subscribers: kubamracek, mgorny, #sanitizers, llvm-commits, morehouse Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D62929 llvm-svn: 363584
* [scudo] Tuning changes based on feedback from current use (Kostya Kortchinsky, 2019-01-24; 1 file, -2/+2)
| | | | | | | | | | | | | | | | | | | | | | Summary: This tunes several of the default parameters used within the allocator: - disable the deallocation type mismatch on Android by default; this was causing too many issues with third party libraries; - change the default `SizeClassMap` to `Dense`, it caches less entries and is way more memory efficient overall; - relax the timing of the RSS checks, 10 times per second was too much, lower it to 4 times (every 250ms), and update the test so that it passes with the new default. Reviewers: eugenis Reviewed By: eugenis Subscribers: srhines, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D57116 llvm-svn: 352057
* Update the file headers across all of the LLVM projects in the monorepo (Chandler Carruth, 2019-01-19; 1 file, -4/+3)
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* [scudo] Replace eraseHeader with compareExchangeHeader for Quarantined chunks (Kostya Kortchinsky, 2018-08-24; 1 file, -10/+5)
Summary: The reason for the existence of `eraseHeader` was that it was deemed faster to null-out a chunk header, effectively making it invalid, rather than marking it as available, which incurred a checksum computation and a cmpxchg. A previous use of `eraseHeader` was removed with D50655 due to a race. Now we remove the second use of it in the Quarantine deallocation path and replace it with a `compareExchangeHeader`. The reason for this is that it greatly helps debugging some heap bugs, as the chunk header is now valid and the chunk marked available, as opposed to the header being invalid. Eg: we get an invalid state error, instead of an invalid header error, which reduces the possibilities. The computational penalty is negligible.
Reviewers: alekseyshl, flowerhack, eugenis Reviewed By: eugenis Subscribers: delcypher, jfb, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D51224 llvm-svn: 340633
* [scudo] Fix race condition in deallocation path when Quarantine is bypassed (Kostya Kortchinsky, 2018-08-14; 1 file, -5/+7)
Summary: There is a race window in the deallocation path when the Quarantine is bypassed. Initially we would just erase the header of a chunk if we were not to use the Quarantine, as opposed to using a compare-exchange primitive, to make things faster. It turned out to be a poor decision, as 2 threads (or more) could simultaneously deallocate the same pointer, and if the checks were done before the header got erased, this would result in the pointer being added twice (or more) to distinct thread caches, and eventually be reused. Winning the race is not trivial but can happen with enough control over the allocation primitives. The repro added attempts to trigger the bug, with a moderate success rate, but it should be enough to notice if the bug ever makes its way back into the code. Since I am changing things in this file, there are 2 smaller changes tagging along: marking a variable `const`, and improving the Quarantine bypass test at runtime.
Reviewers: alekseyshl, eugenis, kcc, vitalybuka Reviewed By: eugenis, vitalybuka Subscribers: delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D50655 llvm-svn: 339705
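The race described above can be illustrated with a minimal sketch. This is not the Scudo code: `ToyChunk`, `ChunkState`, and both function names are hypothetical, and the header is reduced to a single state word with no checksum. It only shows why a check followed by a plain store lets two threads both "win" a free, while a compare-exchange lets exactly one succeed.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified chunk states; a real header packs more fields and
// a checksum into the same word.
enum ChunkState : std::uint64_t { Erased = 0, Allocated = 1, Available = 2 };

// Hypothetical stand-in for the header stored in front of a user block.
struct ToyChunk {
  std::atomic<std::uint64_t> Header{Allocated};
};

// Racy variant: validate the state, then erase the header with a plain store.
// Two threads can both pass the check before either store becomes visible.
inline bool deallocateByErasing(ToyChunk &C) {
  if (C.Header.load(std::memory_order_relaxed) != Allocated)
    return false;
  C.Header.store(Erased, std::memory_order_relaxed);
  return true;  // Both racing threads may reach this point.
}

// Safe variant: only the thread whose compare-exchange succeeds owns the
// deallocation; any other thread observes a non-Allocated header and fails.
inline bool deallocateByCompareExchange(ToyChunk &C) {
  std::uint64_t Expected = Allocated;
  return C.Header.compare_exchange_strong(Expected, Available,
                                          std::memory_order_acq_rel,
                                          std::memory_order_relaxed);
}
```

With the compare-exchange variant, a second deallocation of the same chunk returns `false` and leaves the header in a valid `Available` state, which is also what makes the later "invalid state" diagnostics possible.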
* [scudo] Simplify internal names (NFC) (Kostya Kortchinsky, 2018-07-20; 1 file, -46/+42)
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: There is currently too much redundancy in the class/variable/* names in Scudo: - we are in the namespace `__scudo`, so there is no point in having something named `ScudoX` to end up with a final name of `__scudo::ScudoX`; - there are a lot of types/* that have `Allocator` in the name, given that Scudo is an allocator I figure this doubles up as well. So change a bunch of the Scudo names to make them shorter, less redundant, and overall simpler. They should still be pretty self explaining (or at least it looks so to me). The TSD part will be done in another CL (eg `__scudo::ScudoTSD`). Reviewers: alekseyshl, eugenis Reviewed By: alekseyshl Subscribers: delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D49505 llvm-svn: 337557
* [scudo] Move noinline functions definitions out of line (Kostya Kortchinsky, 2018-06-19; 1 file, -63/+67)
Summary: Mark `isRssLimitExceeded` as `NOINLINE`, and move its definition as well as the one of `performSanityChecks` out of the class definition, as requested.
Reviewers: filcab, alekseyshl Reviewed By: alekseyshl Subscribers: delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D48228 llvm-svn: 335054
* [scudo] Add verbose failures in place of CHECK(0) (Kostya Kortchinsky, 2018-06-15; 1 file, -23/+42)
Summary: The current `FailureHandler` mechanism was fairly opaque with regard to the failure reason due to using `CHECK(0)`. Scudo is a bit different from the other Sanitizers as it prefers to avoid spurious processing in its failure path. So we just `dieWithMessage` using a somewhat explicit string. Adapted the tests for the new strings. While this takes care of the `OnBadRequest` & `OnOOM` failures, the next step is probably to migrate the other Scudo failures in the same fashion (header corruption, invalid state and so on).
Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: filcab, mgorny, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D48199 llvm-svn: 334843
* [scudo] Add C++17 aligned new/delete operators support (Kostya Kortchinsky, 2018-06-12; 1 file, -18/+12)
| | | | | | | | | | | | | | | | | | | Summary: This CL adds support for aligned new/delete operators (C++17). Currently we do not support alignment inconsistency detection on deallocation, as this requires a header change, but the APIs are introduced and are functional. Add a smoke test for the aligned version of the operators. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D48031 llvm-svn: 334505
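For context, the aligned operators mentioned above are the standard C++17 `operator new`/`operator delete` overloads taking `std::align_val_t`. The sketch below exercises them from the caller's side only; it assumes a C++17 compiler, and `CacheLine`, `isAligned`, and `alignedNewRoundTrip` are hypothetical names. It says nothing about how Scudo implements the operators internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Returns true if Ptr is a multiple of Alignment.
inline bool isAligned(const void *Ptr, std::size_t Alignment) {
  return reinterpret_cast<std::uintptr_t>(Ptr) % Alignment == 0;
}

// Hypothetical over-aligned type: its alignment exceeds the default new
// alignment on common platforms, so `new` selects the aligned overload.
struct alignas(64) CacheLine {
  unsigned char Bytes[64];
};

inline bool alignedNewRoundTrip() {
  // New-expression on an over-aligned type.
  CacheLine *Line = new CacheLine;
  const bool TypeOk = isAligned(Line, alignof(CacheLine));
  delete Line;

  // Direct call to the aligned allocation function; the matching aligned
  // deallocation function must be used to release the memory.
  void *Raw = ::operator new(256, std::align_val_t{128});
  const bool RawOk = isAligned(Raw, 128);
  ::operator delete(Raw, std::align_val_t{128});

  return TypeOk && RawOk;
}
```

Passing a different alignment to `operator delete` than the one used for allocation is the kind of inconsistency the summary says is not yet detected.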
* [scudo] Improve the scalability of the shared TSD model (Kostya Kortchinsky, 2018-06-11; 1 file, -8/+13)
Summary: The shared TSD model in its current form doesn't scale. Here is an example of rpc2-benchmark (with default parameters, which is threading heavy) on a 72-core machine (defaulting to a `CompactSizeClassMap` and no Quarantine):
- with tcmalloc: 337K reqs/sec, peak RSS of 338MB;
- with scudo (exclusive): 321K reqs/sec, peak RSS of 637MB;
- with scudo (shared): 241K reqs/sec, peak RSS of 324MB.

This isn't great, since the exclusive model uses a lot of memory, while the shared model doesn't even come close to being competitive. This is mostly due to the fact that we are consistently scanning the TSD pool starting at index 0 for an available TSD, which can result in a lot of failed lock attempts, and in touching some memory that need not be touched.

This CL attempts to make things better in most situations:
- first, use a thread local variable on Linux (instead of pthread APIs) to store the current TSD in the shared model;
- move the locking boolean out of the TSD: this allows the compiler to use a register and potentially optimize out a branch instead of reading it from the TSD every time (we also save a tiny bit of memory per TSD);
- 64-bit atomic operations on 32-bit ARM platforms happen to be expensive: so store the `Precedence` in a `uptr` instead of a `u64`. We lose some nanoseconds of precision and we'll wrap around at some point, but the benefit is worth it;
- change a `CHECK` to a `DCHECK`: this should never happen, but if something is ever terribly wrong, we'll crash on a near null AV if the TSD happens to be null;
- based on an idea by dvyukov@, we are implementing a bounded random scan for an available TSD. This requires computing the coprimes for the number of TSDs, and attempting to lock up to 4 TSDs in a random order before falling back to the current one. This is obviously slightly more expensive when we have just 2 TSDs (barely noticeable) but is otherwise beneficial. The `Precedence` still basically corresponds to the moment of the first contention on a TSD. To seed our random choice, we use the precedence of the current TSD since it is very likely to be non-zero (since we are in the slow path after a failed `tryLock`).

With those modifications, the benchmark yields:
- with scudo (shared): 330K reqs/sec, peak RSS of 327MB.

So the shared model for this specific situation not only becomes competitive but outperforms the exclusive model. I experimented with some values greater than 4 for the number of TSDs to attempt to lock and it yielded a decrease in QPS. Just sticking with the current TSD is also a tad slower. Numbers on platforms with fewer cores (eg: Android) remain similar.
Reviewers: alekseyshl, dvyukov, javed.absar Reviewed By: alekseyshl, dvyukov Subscribers: srhines, kristof.beyls, delcypher, llvm-commits, #sanitizers Differential Revision: https://reviews.llvm.org/D47289 llvm-svn: 334410
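The bounded random scan can be sketched as follows. This is an illustration under stated assumptions, not the code from the CL: `gcd`, `computeCoprimes`, and `scanOrder` are hypothetical names, the seed stands in for the `Precedence` value, and the function only returns the order in which slots would be tried instead of attempting any lock. Stepping through the pool by a value coprime with the pool size guarantees that the visited indices do not repeat until every slot has been seen.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Greatest common divisor (Euclid), kept local to avoid requiring C++17.
inline std::uint32_t gcd(std::uint32_t A, std::uint32_t B) {
  while (B != 0) {
    const std::uint32_t T = A % B;
    A = B;
    B = T;
  }
  return A;
}

// Returns every value in [1, N] that is coprime with N.
inline std::vector<std::uint32_t> computeCoprimes(std::uint32_t N) {
  std::vector<std::uint32_t> Coprimes;
  for (std::uint32_t I = 1; I <= N; ++I)
    if (gcd(I, N) == 1)
      Coprimes.push_back(I);
  return Coprimes;
}

// Returns the slot indices a bounded scan would try, in order. The start
// index and the step are both derived from Seed; at most MaxAttempts slots
// (and never more than N) are visited, all of them distinct.
inline std::vector<std::uint32_t> scanOrder(std::uint32_t N,
                                            std::uint32_t Seed,
                                            std::uint32_t MaxAttempts = 4) {
  assert(N > 0 && "the pool must contain at least one slot");
  const std::vector<std::uint32_t> Coprimes = computeCoprimes(N);
  const std::uint32_t Step = Coprimes[Seed % Coprimes.size()];
  std::uint32_t Index = Seed % N;
  std::vector<std::uint32_t> Order;
  for (std::uint32_t I = 0; I < MaxAttempts && I < N; ++I) {
    Order.push_back(Index);
    Index = (Index + Step) % N;
  }
  return Order;
}
```

For a pool of 8 slots and a seed of 5, the coprimes are 1, 3, 5, 7, the step is 3, and the scan tries slots 5, 0, 3, 6. A caller would attempt a `tryLock` on each in turn and fall back to its current slot if all attempts fail.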
* [scudo] Quarantine optimization (Kostya Kortchinsky, 2018-05-16; 1 file, -1/+2)
| | | | | | | | | | | | | | | | | | | | Summary: It turns out that the previous code construct was not optimizing the allocation and deallocation of batches. The class id was read as a class member (even though a precomputed one) and nothing else was optimized. By changing the construct this way, the compiler actually optimizes most of the allocation and deallocation away to only work with a single class id, which not only saves some CPU but also some code footprint. Reviewers: alekseyshl, dvyukov Reviewed By: dvyukov Subscribers: dvyukov, delcypher, llvm-commits, #sanitizers Differential Revision: https://reviews.llvm.org/D46961 llvm-svn: 332502
* [scudo] Adding an interface function to print allocator stats (Kostya Kortchinsky, 2018-04-25; 1 file, -0/+9)
| | | | | | | | | | | | | | | | Summary: This adds `__scudo_print_stats` as an interface function to display the Primary and Secondary allocator statistics for Scudo. Reviewers: alekseyshl, flowerhack Reviewed By: alekseyshl Subscribers: delcypher, llvm-commits, #sanitizers Differential Revision: https://reviews.llvm.org/D46016 llvm-svn: 330857
* [sanitizer] Allow for the allocator "names" to be set by the tools (Kostya Kortchinsky, 2018-04-13; 1 file, -0/+3)
| | | | | | | | | | | | | | | | | | | Summary: In the same spirit of SanitizerToolName, allow the Primary & Secondary allocators to have names that can be set by the tools via PrimaryAllocatorName and SecondaryAllocatorName. Additionally, set a non-default name for Scudo. Reviewers: alekseyshl, vitalybuka Reviewed By: alekseyshl, vitalybuka Subscribers: kubamracek, delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D45600 llvm-svn: 330055
* [scudo] Add Chunk::getSize, rework Chunk::getUsableSize (Kostya Kortchinsky, 2018-03-14; 1 file, -19/+28)
Summary: Using `getActuallyAllocatedSize` from the Combined resulted in mediocre compiled code, as the `ClassId != 0` predicate was not propagated there, resulting in additional branches and dead code. Move the logic into the frontend, which results in better compiled code. Also I think it makes it slightly easier to distinguish between the size the user requested, and the size that was actually allocated by the allocator. `const` a couple of things as well. This has no functional impact.
Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D44444 llvm-svn: 327525
* [scudo] Make logging more consistent (Kostya Kortchinsky, 2018-03-07; 1 file, -61/+38)
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: A few changes related to logging: - prepend `Scudo` to the error messages so that users can identify that we reported an error; - replace a couple of `Report` calls in the RSS check code with `dieWithMessage`/`Print`, mark a condition as `UNLIKELY` in the process; - change some messages so that they all look more or less the same. This includes the `CHECK` message; - adapt a couple of tests with the new strings. A couple of side notes: this results in a few 1-line-blocks, for which I left brackets. There doesn't seem to be any style guide for that, I can remove them if need be. I didn't use `SanitizerToolName` in the strings, but directly `Scudo` because we are the only users, I could change that too. Reviewers: alekseyshl, flowerhack Reviewed By: alekseyshl Subscribers: mgorny, delcypher, llvm-commits, #sanitizers Differential Revision: https://reviews.llvm.org/D44171 llvm-svn: 326901
* [scudo] Introduce Chunk::getHeaderSize (Kostya Kortchinsky, 2018-02-27; 1 file, -13/+12)
| | | | | | | | | | | | | | | | | | | | | Summary: Instead of using `AlignedChunkHeaderSize`, introduce a `constexpr` function `getHeaderSize` in the `Chunk` namespace. Switch `RoundUpTo` to a `constexpr` as well (so we can use it in `constexpr` declarations). Mark a few variables in the areas touched as `const`. Overall this has no functional change, and is mostly to make things a bit more consistent. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: delcypher, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D43772 llvm-svn: 326206
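A minimal sketch of the idea, assuming a power-of-two alignment: a `constexpr` rounding helper lets the header size be computed at compile time and used in other constant expressions. The names `roundUpTo`, `ToyHeader`, and `MinAlignment`, and the values 8 and 16, are illustrative assumptions and not the actual Scudo definitions.

```cpp
#include <cstddef>
#include <cstdint>

// Rounds Size up to the next multiple of Boundary. Boundary is assumed to be
// a power of two; the single-return form keeps this valid C++11 constexpr.
constexpr std::size_t roundUpTo(std::size_t Size, std::size_t Boundary) {
  return (Size + Boundary - 1) & ~(Boundary - 1);
}

namespace Chunk {

// Hypothetical packed header: a single 64-bit word.
struct ToyHeader {
  std::uint64_t Packed;
};

// Hypothetical minimum alignment guaranteed to user pointers.
constexpr std::size_t MinAlignment = 16;

// Size reserved for the header in front of each user block, padded so that
// the user pointer that follows it keeps the minimum alignment.
constexpr std::size_t getHeaderSize() {
  return roundUpTo(sizeof(ToyHeader), MinAlignment);
}

}  // namespace Chunk

// Usable in constant expressions, for instance in compile-time checks.
static_assert(Chunk::getHeaderSize() % Chunk::MinAlignment == 0,
              "header size must preserve the minimum alignment");
```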
* [scudo] Add default implementations for weak functions (Kostya Kortchinsky, 2018-01-30; 1 file, -0/+12)
| | | | | | | | | | | | | | | | Summary: This is in preparation for platforms where `SANITIZER_SUPPORTS_WEAK_HOOKS` is 0. They require a default implementation. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: delcypher, llvm-commits, #sanitizers Differential Revision: https://reviews.llvm.org/D42557 llvm-svn: 323795
* [scudo] Allow for weak hooks, gated by a define (Kostya Kortchinsky, 2018-01-23; 1 file, -2/+4)
| | | | | | | | | | | | | | | | | | | | | | Summary: Hooks in the allocation & deallocation paths can be a security risk (see for an example https://scarybeastsecurity.blogspot.com/2016/11/0day-exploit-advancing-exploitation.html which used the glibc's __free_hook to complete exploitation). But some users have expressed a need for them, even if only for tests and memory benchmarks. So allow for `__sanitizer_malloc_hook` & `__sanitizer_free_hook` to be called if defined, and gate them behind a global define `SCUDO_CAN_USE_HOOKS` defaulting to 0. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42430 llvm-svn: 323278
* [scudo] Fix for the Scudo interface function scope (Kostya Kortchinsky, 2018-01-17; 1 file, -2/+1)
Summary: A forgotten include in `scudo_allocator.cpp` made the symbol only local :/

Before:
```
nm ./lib/clang/7.0.0/lib/linux/libclang_rt.scudo-i686-android.so | grep rss
00024730 t __scudo_set_rss_limit
```
After:
```
nm ./lib/clang/7.0.0/lib/linux/libclang_rt.scudo-i686-android.so | grep rs
00024760 T __scudo_set_rss_limit
```
And we want `T`! This include also means that we can get rid of the `extern "C"` in the C++ file, the compiler does fine without it (note that this was already the case for all the `__sanitizer_*` interface functions).
Reviewers: alekseyshl, eugenis Reviewed By: eugenis Subscribers: #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D42199 llvm-svn: 322782
* [scudo] s/unsigned long/size_t/ for __scudo_set_rss_limit (Kostya Kortchinsky, 2018-01-04; 1 file, -3/+3)
| | | | | | | | | | | | | | | | | | | | Summary: `__scudo_set_rss_limit`'s `LimitMb` should really be a `size_t`. Update accordingly the prototype. To avoid the `NOLINT` and conform with the other Sanitizers, use the sanitizers types for the internal definition. This should have no functional change. Additionally, capitalize a variable name to follow the LLVM coding standards. Reviewers: alekseyshl, flowerhack Reviewed By: alekseyshl Subscribers: #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D41704 llvm-svn: 321803
* [scudo] Refactor ScudoChunk (Kostya Kortchinsky, 2017-12-14; 1 file, -119/+117)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The initial implementation used an ASan like Chunk class that was deriving from a Header class. Due to potential races, we ended up working with local copies of the Header and never using the parent class fields. ScudoChunk was never constructed but cast, and we were using `this` as the pointer needed for our computations. This was meh. So we refactored ScudoChunk to be now a series of static functions within the namespace `__scudo::Chunk` that take a "user" pointer as first parameter (former `this`). A compiled binary doesn't really change, but the code is more sensible. Clang tends to inline all those small function (in -O2), but GCC left a few not inlined, so we add the `INLINE` keyword to all. Since we don't have `ScudoChunk` pointers anymore, a few variables were renamed here and there to introduce a clearer distinction between a user pointer (usually `Ptr`) and a backend pointer (`BackendPtr`). Reviewers: alekseyshl, flowerhack Reviewed By: alekseyshl Subscribers: #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D41200 llvm-svn: 320745
* [scudo] Adding a public Scudo interface (Kostya Kortchinsky, 2017-12-13; 1 file, -0/+18)
| | | | | | | | | | | | | | | | Summary: The first and only function to start with allows to set the soft or hard RSS limit at runtime. Add associated tests. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: mgorny, #sanitizers, llvm-commits Differential Revision: https://reviews.llvm.org/D41128 llvm-svn: 320611
* [sanitizer] Introduce a vDSO aware timing function (Kostya Kortchinsky, 2017-12-13; 1 file, -2/+2)
Summary: See D40657 & D40679 for previous versions of this patch & description. A couple of things were fixed here to have it not break some bots. Weak symbols can't be used with `SANITIZER_GO` so the previous version was breaking TsanGo. I set up some additional local tests and those pass now. I changed the workaround for the glibc vDSO issue: `__progname` is initialized after the vDSO and is actually public and of known type, unlike `__vdso_clock_gettime`. This works better, and with all compilers. The rest is the same.
Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: srhines, kubamracek, krytarowski, llvm-commits, #sanitizers Differential Revision: https://reviews.llvm.org/D41121 llvm-svn: 320594
* [scudo] Inline getScudoChunk function. (Kostya Kortchinsky, 2017-12-13; 1 file, -1/+1)
| | | | | | | | | | | | | | | Summary: getScudoChunk function is implicitly inlined for optimized builds on clang, but not on gcc. It's a small enough function that it seems sensible enough to just inline it by default. Reviewers: cryptoad, alekseyshl Reviewed By: cryptoad Differential Revision: https://reviews.llvm.org/D41138 llvm-svn: 320592
* [sanitizer] Revert rL320409 (Kostya Kortchinsky, 2017-12-11; 1 file, -2/+2)
| | | | | | | | | | | | | | Summary: D40679 broke a couple of builds, reverting while investigating. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: srhines, kubamracek, krytarowski, llvm-commits, #sanitizers Differential Revision: https://reviews.llvm.org/D41088 llvm-svn: 320417
* [sanitizer] Introduce a vDSO aware time function, and use it in the allocator [redo] (Kostya Kortchinsky, 2017-12-11; 1 file, -2/+2)
Summary: Redo of D40657, which had the initial discussion. The initial code had to move into a libcdep file, and things had to be shuffled accordingly. `NanoTime` is a time sink when checking whether or not to release memory to the OS. While reducing the number of calls to said function is in the works, another solution that was found to be beneficial was to use a timing function that can leverage the vDSO. We hit a couple of snags along the way, like the fact that the glibc crashes when clock_gettime is called from a preinit_array, or the fact that `__vdso_clock_gettime` is mangled (for security purposes) and can't be used directly, and also that clock_gettime can be intercepted. The proposed solution takes care of all this as far as I can tell, and significantly improves performance in some Scudo load tests with memory reclaiming enabled. @mcgrathr: please feel free to follow up on https://reviews.llvm.org/D40657#940857 here. I posted a reply at https://reviews.llvm.org/D40657#940974.
Reviewers: alekseyshl, krytarowski, flowerhack, mcgrathr, kubamracek Reviewed By: alekseyshl, krytarowski Subscribers: #sanitizers, mcgrathr, srhines, llvm-commits, kubamracek Differential Revision: https://reviews.llvm.org/D40679 llvm-svn: 320409
* [scudo] Minor code generation improvement (Kostya Kortchinsky, 2017-12-08; 1 file, -11/+8)
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: It looks like clang was generating somewhat weird assembly with the current code. `FromPrimary`, even though `const`, was replaced every time with the code generated for `size <= SizeClassMap::kMaxSize` instead of using a variable or register, and `FromPrimary` didn't induce `ClassId != 0` for the compiler, so a dead branch was generated for `getActuallyAllocatedSize(Ptr, ClassId)` since it's never called for `ClassId = 0` (Secondary backed allocations) [this one was more wishful thinking on my side than anything else]. I rearranged the code bit so that the generated assembly is less clunky. Also changed 2 whitespace inconsistencies that were bothering me. Reviewers: alekseyshl, flowerhack Reviewed By: flowerhack Subscribers: llvm-commits, #sanitizers Differential Revision: https://reviews.llvm.org/D40976 llvm-svn: 320160
* [scudo] Get rid of the thread local PRNG & header salt (Kostya Kortchinsky, 2017-12-05; 1 file, -48/+54)
Summary: It was deemed that the salt in the chunk header didn't improve security significantly (and could actually decrease it). The initial idea was that the same chunk would have different headers on different allocations, allowing for less predictability. The issue is that gathering the same chunk header with different salts can give information about the other "secrets" (cookie, pointer), and that if an attacker leaks a header, they can reuse it for that same chunk anyway since we don't enforce the salt value. So we get rid of the salt in the header. This means we also get rid of the thread local Prng, and that we don't need a global Prng anymore as well. This makes everything faster. We reuse those 8 bits to store the `ClassId` of a chunk now (0 for a secondary based allocation). This way, we get some additional speed gains:
- `ClassId` is computed outside of the locked block;
- `getActuallyAllocatedSize` doesn't need the `GetSizeClass` call;
- same for `deallocatePrimary`.

We add a sanity check at init for this new field (all sanity checks are moved into their own function, `init` was getting crowded).
Reviewers: alekseyshl, flowerhack Reviewed By: alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40796 llvm-svn: 319791
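To make the "8 bits for the `ClassId`" point concrete, here is a toy packing scheme. The field names, widths, and bit positions below are assumptions chosen for illustration and do not mirror the real Scudo header layout; the sketch only shows how a class id stored in the header lets later code decide between Primary and Secondary without recomputing the size class.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical unpacked view of a chunk header.
struct ToyHeader {
  std::uint16_t Checksum;
  std::uint8_t ClassId;  // 0 means the chunk is backed by the Secondary.
  std::uint8_t State;    // Only the low 2 bits are stored.
};

// Packs the fields into one word: bits 0-15 checksum, 16-23 class id,
// 24-25 state.
inline std::uint64_t packHeader(const ToyHeader &H) {
  assert(H.State < 4 && "State must fit in 2 bits");
  return static_cast<std::uint64_t>(H.Checksum) |
         (static_cast<std::uint64_t>(H.ClassId) << 16) |
         (static_cast<std::uint64_t>(H.State) << 24);
}

// Inverse of packHeader.
inline ToyHeader unpackHeader(std::uint64_t Packed) {
  ToyHeader H;
  H.Checksum = static_cast<std::uint16_t>(Packed & 0xFFFF);
  H.ClassId = static_cast<std::uint8_t>((Packed >> 16) & 0xFF);
  H.State = static_cast<std::uint8_t>((Packed >> 24) & 0x3);
  return H;
}

// The deallocation path can read the class id straight from the header.
inline bool isSecondaryBacked(std::uint64_t Packed) {
  return unpackHeader(Packed).ClassId == 0;
}
```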
* [scudo] Overhaul hardware CRC32 feature detection (Kostya Kortchinsky, 2017-11-22; 1 file, -1/+1)
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch aims at condensing the hardware CRC32 feature detection and making it slightly more effective on Android. The following changes are included: - remove the `CPUFeature` enum, and get rid of one level of nesting of functions: we only used CRC32, so we just implement and use `hasHardwareCRC32`; - allow for a weak `getauxval`: the Android toolchain is compiled at API level 14 for Android ARM, meaning no `getauxval` at compile time, yet we will run on API level 27+ devices. The `/proc/self/auxv` fallback can work but is worthless for a process like `init` where the proc filesystem doesn't exist yet. If a weak `getauxval` doesn't exist, then fallback. - couple of extra corrections. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, aemerson, srhines, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D40322 llvm-svn: 318859
* [scudo] Soft and hard RSS limit checks (Kostya Kortchinsky, 2017-11-15; 1 file, -0/+50)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This implements an opportunistic check for the RSS limit. For ASan, this was implemented thanks to a background thread checking the current RSS vs the set limit every 100ms. This was deemed problematic for Scudo due to potential Android concerns (Zygote as pointed out by Aleksey) as well as the general inconvenience of having a permanent background thread. If a limit (soft or hard) is specified, we will attempt to update the RSS limit status (exceeded or not) every 100ms. This is done in an opportunistic way: if we can update it, we do it, if not we return the current status, mostly because we don't need it to be fully consistent (it's done every 100ms anyway). If the limit is exceeded `allocate` will act as if OOM for a soft limit, or just die for a hard limit. We use the `common_flags()`'s `hard_rss_limit_mb` & `soft_rss_limit_mb` for configuration of the limits. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40038 llvm-svn: 318301
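The opportunistic check described above can be sketched without a background thread. Everything here is hypothetical (`RssLimiter`, its members, and the choice to pass the clock and the RSS reading in as parameters so the logic can be exercised deterministically); it is not the Scudo implementation. The idea shown: a caller refreshes the cached status only if the interval has elapsed and it wins a compare-exchange on the next-check timestamp; otherwise it returns the cached status.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical limiter; a limit of 0 means "no limit configured".
struct RssLimiter {
  RssLimiter(std::uint64_t Limit, std::uint64_t Interval)
      : LimitBytes(Limit), IntervalNs(Interval) {}

  // NowNs and CurrentRssBytes are supplied by the caller in this sketch; a
  // real allocator would query a monotonic clock and the OS instead.
  bool isExceeded(std::uint64_t NowNs, std::uint64_t CurrentRssBytes) {
    if (LimitBytes == 0)
      return false;
    std::uint64_t Next = NextCheckNs.load(std::memory_order_relaxed);
    // Too early for a refresh: return the possibly stale cached status.
    if (NowNs < Next)
      return Exceeded.load(std::memory_order_relaxed);
    // Only one thread wins the right to refresh for this interval; the
    // others return the cached status rather than wait.
    if (!NextCheckNs.compare_exchange_strong(Next, NowNs + IntervalNs,
                                             std::memory_order_relaxed))
      return Exceeded.load(std::memory_order_relaxed);
    const bool Status = CurrentRssBytes > LimitBytes;
    Exceeded.store(Status, std::memory_order_relaxed);
    return Status;
  }

  const std::uint64_t LimitBytes;
  const std::uint64_t IntervalNs;
  std::atomic<std::uint64_t> NextCheckNs{0};
  std::atomic<bool> Exceeded{false};
};
```

A caller on the allocation path would then treat a `true` result as an out-of-memory condition for a soft limit, or terminate for a hard limit, as the summary describes.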
* [scudo] Simplify initialization and flags (Kostya Kortchinsky, 2017-11-14; 1 file, -60/+20)
| | | | | | | | | | | | | | | | | | | | | | | Summary: This is mostly some cleanup and shouldn't affect functionalities. Reviewing some code for a future addition, I realized that the complexity of the initialization path was unnecessary, and so was maintaining a structure for the allocator options throughout the initialization. So we get rid of that structure, of an extraneous level of nesting for the `init` function, and correct a couple of related code inaccuracies in the flags cpp. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39974 llvm-svn: 318157
* [scudo] Allow for non-Android Shared TSD platforms, part 1 (Kostya Kortchinsky, 2017-10-12; 1 file, -6/+6)
| | | | | | | | | | | | | | | | | | | | | | Summary: This first part just prepares the grounds for part 2 and doesn't add any new functionality. It mostly consists of small refactors: - move the `pthread.h` include higher as it will be used in the headers; - use `errno.h` in `scudo_allocator.cpp` instead of the sanitizer one, update the `errno` assignments accordingly (otherwise it creates conflicts on some platforms due to `pthread.h` including `errno.h`); - introduce and use `getCurrentTSD` and `setCurrentTSD` for the shared TSD model code; Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: llvm-commits, srhines Differential Revision: https://reviews.llvm.org/D38826 llvm-svn: 315583
* [scudo] Scudo thread specific data refactor, part 3 (Kostya Kortchinsky, 2017-09-26; 1 file, -1/+1)
Summary: Previous parts: D38139, D38183. In this part of the refactor, we abstract the Linux vs Android TSD dissociation in favor of an Exclusive vs Shared one, allowing for easier platform introduction and configuration. Most of this change consists of shuffling the files around to reflect the new organization. We introduce `scudo_platform.h` where platform specific definitions lie. This involves the TSD model and the platform specific allocator parameters. In an upcoming CL, those will be configurable via defines, but we currently stick with conservative defaults.
Reviewers: alekseyshl, dvyukov Reviewed By: alekseyshl, dvyukov Subscribers: srhines, llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D38244 llvm-svn: 314224
* [scudo] Scudo thread specific data refactor, part 2 (Kostya Kortchinsky, 2017-09-25; 1 file, -41/+12)
Summary: Following D38139, we now consolidate the TSD definition, merging the shared TSD definition with the exclusive TSD definition. We introduce a boolean set at initialization denoting the need for the TSD to be unlocked or not. This adds some unused members to the exclusive TSD, but increases consistency and reduces the fragmentation of the definitions. We remove the fallback mechanism from `scudo_allocator.cpp` and add a fallback TSD in the non-shared version. Since the shared version doesn't require one, this makes more sense overall. There are a couple of additional cosmetic changes: removing the header guards from the remaining `.inc` files, and adding an error string to a `CHECK`. Question to reviewers: I thought about friending `getTSDAndLock` in `ScudoTSD` so that the `FallbackTSD` could `Mutex.Lock()` directly instead of `lock()` which involved zeroing out the `Precedence`, which is unused otherwise. Is it worth doing?
Reviewers: alekseyshl, dvyukov, kcc Reviewed By: dvyukov Subscribers: srhines, llvm-commits Differential Revision: https://reviews.llvm.org/D38183 llvm-svn: 314110
* [scudo] Scudo thread specific data refactor, part 1 (Kostya Kortchinsky, 2017-09-22; 1 file, -36/+24)
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: We are going through an overhaul of Scudo's TSD, to allow for new platforms to be integrated more easily, and make the code more sound. This first part is mostly renaming, preferring some shorter names, correcting some comments. I removed `getPrng` and `getAllocatorCache` to directly access the members, there was not really any benefit to them (and it was suggested by Dmitry in D37590). The only functional change is in `scudo_tls_android.cpp`: we enforce bounds to the `NumberOfTSDs` and most of the logic in `getTSDAndLockSlow` is skipped if we only have 1 TSD. Reviewers: alekseyshl, dvyukov, kcc Reviewed By: dvyukov Subscribers: llvm-commits, srhines Differential Revision: https://reviews.llvm.org/D38139 llvm-svn: 313987
* [scudo] Fix bad request handling when allocator has not been initialized | Kostya Kortchinsky | 2017-09-14 | 1 | -4/+9
Summary:
In a few functions (`scudoMemalign` and the like), we would call `ScudoAllocator::FailureHandler::OnBadRequest` if the parameters didn't check out. The issue is that if the allocator had not been initialized (e.g. if this is the first heap related function called), we would use variables like `allocator_may_return_null` and `exitcode` that still had their default value (as opposed to the one set by the user or the initialization path).

To solve this, we introduce `handleBadRequest` that will call `initThreadMaybe`, allowing the options to be correctly initialized.

Unfortunately, the tests were passing because `exitcode` was still 0, so the results looked like success. Change those tests to do what they were supposed to.

Reviewers: alekseyshl
Reviewed By: alekseyshl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37853
llvm-svn: 313294
* [scudo] Fix improper TSD init after TLS destructors are called | Kostya Kortchinsky | 2017-09-11 | 1 | -1/+7
Summary:
Some of glibc's own thread local data is destroyed after a user's thread local destructors are called, via __libc_thread_freeres. This might involve calling free, as is the case for strerror_thread_freeres. If there is no prior heap operation in the thread, this free would end up initializing some thread specific data that would never be destroyed properly (as user's pthread destructors have already been called), while still being deallocated when the TLS goes away. As a result, a program could SEGV, usually in __sanitizer::AllocatorGlobalStats::Unregister, where one of the doubly linked list links would refer to a now unmapped memory area.

To prevent this from happening, we will not do a full initialization from the deallocation path. This means that the fallback cache & quarantine will be used if no other heap operation has been called, and we effectively prevent the TSD being initialized and never destroyed. The TSD will be fully initialized for all other paths.

In the event of a thread doing only frees and nothing else, a TSD would never be initialized for that thread, but this situation is unlikely and we can live with that.

Reviewers: alekseyshl
Reviewed By: alekseyshl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37697
llvm-svn: 312939
* [scudo] Application & platform compatibility changes | Kostya Kortchinsky | 2017-08-16 | 1 | -8/+12
Summary:
This patch changes a few (small) things around for compatibility purposes for the current Android & Fuchsia work:
- `realloc`'ing some memory that was not allocated with `malloc`, `calloc` or `realloc`, while UB according to http://pubs.opengroup.org/onlinepubs/009695399/functions/realloc.html, is more common than one would think. We now only check this if `DeallocationTypeMismatch` is set; change the "mismatch" error messages to be more homogeneous;
- some sketchily written but widely used libraries expect a call to `realloc` to copy the usable size of the old chunk to the new one instead of the requested size. We have to begrudgingly abide by this de-facto standard. This doesn't seem to impact security either way, unless someone comes up with something we didn't think about;
- the CRC32 intrinsics for 64-bit take a 64-bit first argument. This is misleading as the upper 32 bits end up being ignored. This was also raising `-Wconversion` errors. Change things to take a `u32` as first argument. This also means we were (and are) only using 32 bits of the Cookie - not a big thing, but worth mentioning.
- Includes-wise: prefer `stddef.h` to `cstddef`, move `scudo_flags.h` where it is actually needed.
- Add tests for the memalign-realloc case, and the realloc-usable-size one.

(Edited typos)

Reviewers: alekseyshl
Reviewed By: alekseyshl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36754
llvm-svn: 311018
* [scudo] Check for pvalloc overflow | Kostya Kortchinsky | 2017-07-25 | 1 | -0/+4
Summary:
Previously we were rounding up the size passed to `pvalloc` to the next multiple of page size no matter what. There is an overflow possibility that wasn't accounted for. So now, return null in the event of an overflow. The man page doesn't seem to indicate the errno to set in this particular situation, but the glibc unit tests go for ENOMEM (https://code.woboq.org/userspace/glibc/malloc/tst-pvalloc.c.html#54) so we'll do the same.

Update the aligned allocation functions tests to check for properly aligned returned pointers, and the `pvalloc` corner cases.

@alekseyshl: do you want me to do the same in the other Sanitizers?

Reviewers: alekseyshl
Reviewed By: alekseyshl
Subscribers: kubamracek, alekseyshl, llvm-commits
Differential Revision: https://reviews.llvm.org/D35818
llvm-svn: 309033
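For illustration, the overflow check described in this commit message could look roughly like the sketch below. This is not the code from the patch: `checkedPvallocSize` is a hypothetical helper name, and the sketch assumes a power-of-two page size.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not Scudo's actual function): rounds Size up to the
// next multiple of PageSize. If the rounding would wrap around, it sets
// errno to ENOMEM (the value the glibc tests expect) and returns false, in
// which case the caller would return a null pointer.
inline bool checkedPvallocSize(std::size_t Size, std::size_t PageSize,
                               std::size_t *Rounded) {
  // Assumption of this sketch: PageSize is a non-zero power of two.
  assert(PageSize != 0 && (PageSize & (PageSize - 1)) == 0);
  const std::size_t Mask = PageSize - 1;
  if (Size > SIZE_MAX - Mask) {
    errno = ENOMEM;
    return false;
  }
  *Rounded = (Size + Mask) & ~Mask;
  return true;
}
```

The comparison is done before the addition, so the check itself cannot overflow.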
* [scudo] Quarantine overhaul | Kostya Kortchinsky | 2017-07-24 | 1 | -38/+31
Summary:
First, some context.

The main feedback we get about the quarantine is that it's too memory hungry. A single MB of quarantine will have an impact of 3 to 4MB of PSS/RSS, and things quickly get out of hand in terms of memory usage, and the quarantine ends up disabled.

The main objective of the quarantine is to protect from use-after-free exploitation by making it harder for an attacker to reallocate a controlled chunk in place of the targeted freed chunk. This is achieved by not making it available to the backend right away for reuse, but holding it a little while.

Historically, what has usually been the target of such attacks was objects, where vtable pointers or other function pointers could constitute a valuable target to replace. Those are usually on the smaller side. There is barely any advantage in putting several megabytes of RGB data or the like in the quarantine.

Now for the patch.

This patch introduces a new way the Quarantine behaves in Scudo. First of all, the size of the Quarantine will be defined in KB instead of MB, then we introduce a new option: the size up to which (lower than or equal to) a chunk will be quarantined. This way, we only quarantine smaller chunks, and the size of the quarantine remains manageable. It also prevents someone from triggering a recycle by allocating something huge. We default to 512 bytes on 32-bit and 2048 bytes on 64-bit platforms.

In detail, the patch includes the following:
- introduce `QuarantineSizeKb`, but honor `QuarantineSizeMb` if set to fall back to the old behavior (meaning no threshold in that case); `QuarantineSizeMb` is described as deprecated in the options descriptions; documentation update will follow;
- introduce `QuarantineChunksUpToSize`, the new threshold value;
- update the `quarantine.cpp` test, and other tests using `QuarantineSizeMb`;
- remove `AllocatorOptions::copyTo`, it wasn't used;
- slightly change the logic around `quarantineOrDeallocateChunk` to accommodate the new logic; rename a couple of variables there as well;

Rewriting the tests, I found a somewhat annoying bug where non-default aligned chunks would account for more than needed when placed in the quarantine due to `<< MinAlignment` instead of `<< MinAlignmentLog`. This is fixed and tested for now.

Reviewers: alekseyshl, kcc
Reviewed By: alekseyshl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35694
llvm-svn: 308884
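A minimal sketch of the two ideas in this commit message: the size threshold for quarantining, and the shift bug. The function names, the 16-byte minimum alignment, and the idea that an offset is stored in units of the minimum alignment are assumptions made for illustration; only the option names `QuarantineSizeKb` and `QuarantineChunksUpToSize` come from the message itself.

```cpp
#include <cstddef>

// Assumed values for this sketch: a 16-byte minimum alignment.
constexpr unsigned MinAlignmentLog = 4;
constexpr std::size_t MinAlignment = std::size_t{1} << MinAlignmentLog;

// Hypothetical: converts an offset stored in units of MinAlignment back to
// bytes. Shifting by MinAlignment (16) instead of MinAlignmentLog (4) would
// inflate the result, which is the kind of over-accounting described above.
constexpr std::size_t offsetUnitsToBytes(std::size_t OffsetUnits) {
  return OffsetUnits << MinAlignmentLog;
}

// Hypothetical decision helper: a chunk goes to the quarantine only if the
// quarantine is enabled and the chunk is at or below the size threshold.
constexpr bool shouldQuarantine(std::size_t ChunkSize,
                                std::size_t QuarantineSizeKb,
                                std::size_t QuarantineChunksUpToSize) {
  return QuarantineSizeKb != 0 && ChunkSize <= QuarantineChunksUpToSize;
}
```

With the 64-bit default threshold of 2048 bytes, a 2048-byte chunk would be quarantined and a 2049-byte chunk would be released directly.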
* [Sanitizers] ASan/MSan/LSan allocators set errno on failure. | Alex Shlyapnikov | 2017-07-18 | 1 | -18/+13
Summary:
ASan/MSan/LSan allocators set errno on allocation failures according to malloc/calloc/etc. expected behavior.

MSan allocator was refactored a bit to make its structure more similar to that of the other allocators.

Also switch Scudo allocator to the internal errno definitions.

TSan allocator changes will follow.

Reviewers: eugenis
Subscribers: llvm-commits, kubamracek
Differential Revision: https://reviews.llvm.org/D35275
llvm-svn: 308344
* [Sanitizers] Scudo allocator set errno on failure. | Alex Shlyapnikov | 2017-07-14 | 1 | -21/+29
Summary:
Set proper errno code on allocation failure and change pvalloc and posix_memalign implementation to satisfy their man-specified requirements.

Reviewers: cryptoad
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35429
llvm-svn: 308053
* [scudo] Do not grab a cache for secondary allocation & per related changes | Kostya Kortchinsky | 2017-07-13 | 1 | -41/+59
Summary:
Secondary backed allocations do not require a cache. While it's not necessarily an issue when each thread has its cache, it becomes one with a shared pool of caches (Android), as a Secondary backed allocation or deallocation holds a cache that could be useful to another thread doing a Primary backed allocation.

We introduce an additional PRNG and its mutex (to avoid contention with the Fallback one for Primary allocations) that will provide the `Salt` needed for Secondary backed allocations.

I changed some of the code in a way that feels more readable to me (e.g. using some values directly rather than going through ternary assigned variables, using directly `true`/`false` rather than `FromPrimary`). I will let reviewers decide if it actually is.

An additional change is to mark `CheckForCallocOverflow` as `UNLIKELY`.

Reviewers: alekseyshl
Reviewed By: alekseyshl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35358
llvm-svn: 307958
* [scudo] PRNG makeover | Kostya Kortchinsky | 2017-07-12 | 1 | -8/+8
Summary:
This follows the addition of `GetRandom` with D34412. We remove our `/dev/urandom` code and use the new function. Additionally, change the PRNG for a slightly faster version. One of the issues with the old code is that we have 64 full bits of randomness per "next", using only 8 of those for the Salt and discarding the rest. So we add a cached u64 in the PRNG that can serve up to 8 u8 before having to call the "next" function again.

During some integration work, I also realized that some very early processes (like `init`) do not benefit from `/dev/urandom` yet. So if there is no `getrandom` syscall either, we have to fall back to some sort of initialization of the PRNG.

Now a few words on why XoRoShiRo and not something else. I have played a while with various PRNGs on 32 & 64 bit platforms. Some results are below. LCG 32 & 64 are usually faster but produce respectively 15 & 31 bits of entropy, meaning that to get a full 64-bit, you would need to call them several times. The simple XorShift is fast, produces 32 bits but is mediocre with regard to PRNG test suites, PCG is slower overall, and XoRoShiRo is faster than XorShift128+ and produces full 64 bits.

%%%
root@tulip-chiphd:/data # ./randtest.arm
[+] starting xs32...
[?] xs32 duration: 22431833053ns
[+] starting lcg32...
[?] lcg32 duration: 14941402090ns
[+] starting pcg32...
[?] pcg32 duration: 44941973771ns
[+] starting xs128p...
[?] xs128p duration: 48889786981ns
[+] starting lcg64...
[?] lcg64 duration: 33831042391ns
[+] starting xos128p...
[?] xos128p duration: 44850878605ns

root@tulip-chiphd:/data # ./randtest.aarch64
[+] starting xs32...
[?] xs32 duration: 22425151678ns
[+] starting lcg32...
[?] lcg32 duration: 14954255257ns
[+] starting pcg32...
[?] pcg32 duration: 37346265726ns
[+] starting xs128p...
[?] xs128p duration: 22523807219ns
[+] starting lcg64...
[?] lcg64 duration: 26141304679ns
[+] starting xos128p...
[?] xos128p duration: 14937033215ns
%%%

Reviewers: alekseyshl
Reviewed By: alekseyshl
Subscribers: aemerson, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D35221
llvm-svn: 307798
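The byte cache described in this commit message can be sketched as follows. This is an illustrative reconstruction, not the patch's code: the struct and member names are hypothetical, the `NextCalls` counter exists only to make the caching visible, and the step function is assumed to be xoroshiro128+ with the 55/14/36 constants of the original public reference implementation. Seeding (which the patch takes from `GetRandom` or a fallback) is left to the caller, and the state must not be all zeros.

```cpp
#include <cstdint>

// Hypothetical sketch of a PRNG that hands out one byte at a time from a
// cached 64-bit output, calling next() only once every 8 bytes.
struct CachedBytePrng {
  std::uint64_t State[2];
  std::uint64_t CachedBytes = 0;
  unsigned CachedBytesAvailable = 0;
  unsigned NextCalls = 0;  // instrumentation for the example only

  CachedBytePrng(std::uint64_t S0, std::uint64_t S1) : State{S0, S1} {}

  static std::uint64_t rotl(std::uint64_t X, int K) {
    return (X << K) | (X >> (64 - K));
  }

  // Assumed step function: xoroshiro128+ (55/14/36 variant).
  std::uint64_t next() {
    const std::uint64_t S0 = State[0];
    std::uint64_t S1 = State[1];
    const std::uint64_t Result = S0 + S1;
    S1 ^= S0;
    State[0] = rotl(S0, 55) ^ S1 ^ (S1 << 14);
    State[1] = rotl(S1, 36);
    ++NextCalls;
    return Result;
  }

  // Serves the low byte of the cache, refilling it when it runs dry.
  std::uint8_t getU8() {
    if (CachedBytesAvailable == 0) {
      CachedBytes = next();
      CachedBytesAvailable = sizeof(CachedBytes);
    }
    const std::uint8_t Result = static_cast<std::uint8_t>(CachedBytes & 0xff);
    CachedBytes >>= 8;
    --CachedBytesAvailable;
    return Result;
  }
};
```

Eight consecutive `getU8()` calls consume a single `next()` output; the ninth triggers a refill.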
* Merge | Alex Shlyapnikov | 2017-06-29 | 1 | -2/+1
llvm-svn: 306748
* [scudo] Change aligned alloc functions to be more compliant & perf changes | Kostya Kortchinsky | 2017-06-29 | 1 | -26/+29
Summary:
We were not following the `man` documented behaviors for invalid arguments to `memalign` and associated functions. Using `CHECK` for those was a bit extreme, so we relax the behavior to return null pointers as expected when this happens. Adapt the associated test.

I am also using this change to make a few minor performance improvements:
- mark as `UNLIKELY` a bunch of unlikely conditions;
- the current `CHECK` in `__sanitizer::RoundUpTo` is redundant for us in *all* calls. So I am introducing our own version without said `CHECK`.
- change our combined allocator `GetActuallyAllocatedSize`. We already know if the pointer is from the Primary or Secondary, so the `PointerIsMine` check is redundant as well, and costly for the 32-bit Primary. So we get the size by directly using the available Primary functions.

Finally, change an `int` to `uptr` to avoid a warning/error when compiling on Android.

Reviewers: alekseyshl
Reviewed By: alekseyshl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34782
llvm-svn: 306698
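As a rough sketch of the two points above (validating the alignment instead of aborting, and rounding up without a redundant check), assuming hypothetical helper names that are not taken from the patch:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helpers for illustration only.
inline bool isPowerOfTwo(std::uintptr_t X) {
  return X != 0 && (X & (X - 1)) == 0;
}

// Unchecked round-up: the caller guarantees that Boundary is a power of two
// and that the addition cannot wrap, so no runtime check is performed here.
inline std::uintptr_t roundUpTo(std::uintptr_t Size, std::uintptr_t Boundary) {
  return (Size + Boundary - 1) & ~(Boundary - 1);
}

// Validates the request once; on an invalid alignment it returns false so
// that the caller can return a null pointer instead of aborting. Rounding
// the size here is only meant to exercise the unchecked helper.
inline bool prepareAlignedRequest(std::size_t Alignment, std::size_t Size,
                                  std::size_t *RoundedSize) {
  if (!isPowerOfTwo(Alignment))
    return false;
  *RoundedSize = roundUpTo(Size, Alignment);
  return true;
}
```

The validation happens once at the entry point, which is what makes the check inside the rounding helper redundant.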
* [Sanitizers] Move cached allocator_may_return_null flag to sanitizer_allocator | Alex Shlyapnikov | 2017-06-20 | 1 | -6/+9
Summary:
Move cached allocator_may_return_null flag to sanitizer_allocator.cc and provide API to consolidate and unify the behavior of all specific allocators.

Make all sanitizers using CombinedAllocator follow AllocatorReturnNullOrDieOnOOM() rules so that they behave the same way when OOM happens.

When OOM happens, turn allocator_out_of_memory flag on regardless of allocator_may_return_null flag value (it used not to be set when allocator_may_return_null == true).

release_to_os_interval_ms and rss_limit_exceeded will likely be moved to sanitizer_allocator.cc too (later).

Reviewers: eugenis
Subscribers: srhines, kubamracek, llvm-commits
Differential Revision: https://reviews.llvm.org/D34310
llvm-svn: 305858
* [scudo] Use our own combined allocator | Kostya Kortchinsky | 2017-05-11 | 1 | -38/+42
Summary:
The reasoning behind this change is twofold:
- the current combined allocator (sanitizer_allocator_combined.h) implements features that are not relevant for Scudo, making some code redundant, and some restrictions not pertinent (alignments for example). This forced us to do some weird things between the frontend and our secondary to make things work;
- we have enough information to be able to know if a chunk will be serviced by the Primary or Secondary, allowing us to avoid extraneous calls to functions such as `PointerIsMine` or `CanAllocate`.

As a result, the new scudo-specific combined allocator is very straightforward, and allows us to remove some now unnecessary code both in the frontend and the secondary. Unused functions have been left in as unimplemented for now.

It turns out to also be a sizeable performance gain (3% faster in some Android memory_replay benchmarks, doing some more on other platforms).

Reviewers: alekseyshl, kcc, dvyukov
Reviewed By: alekseyshl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33007
llvm-svn: 302830
* [scudo] CRC32 optimizations | Kostya Kortchinsky | 2017-05-09 | 1 | -21/+27
Summary:
This change optimizes several aspects of the checksum used for chunk headers.

First, there is no point in checking the weak symbol `computeHardwareCRC32` every time, it will either be there or not when we start, so check it once during initialization and set the checksum type accordingly.

Then, the loading of `HashAlgorithm` for SSE versions (and ARM equivalent) was not optimized out, while not necessary. So I reshuffled that part of the code, which duplicates a tiny bit of code, but ends up in a much cleaner assembly (and faster as we avoid an extraneous load and some calls).

The following code is the checksum at the end of `scudoMalloc` for x86_64 with full SSE 4.2, before:
```
mov     rax, 0FFFFFFFFFFFFFFh
shl     r10, 38h
mov     edi, dword ptr cs:_ZN7__scudoL6CookieE ; __scudo::Cookie
and     r14, rax
lea     rsi, [r13-10h]
movzx   eax, cs:_ZN7__scudoL13HashAlgorithmE ; __scudo::HashAlgorithm
or      r14, r10
mov     rbx, r14
xor     bx, bx
call    _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong)
mov     rsi, rbx
mov     edi, eax
call    _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong)
mov     r14w, ax
mov     rax, r13
mov     [r13-10h], r14
```
After:
```
mov     rax, cs:_ZN7__scudoL6CookieE ; __scudo::Cookie
lea     rcx, [rbx-10h]
mov     rdx, 0FFFFFFFFFFFFFFh
and     r14, rdx
shl     r9, 38h
or      r14, r9
crc32   eax, rcx
mov     rdx, r14
xor     dx, dx
mov     eax, eax
crc32   eax, rdx
mov     r14w, ax
mov     rax, rbx
mov     [rbx-10h], r14
```

Reviewers: dvyukov, alekseyshl, kcc
Reviewed By: alekseyshl
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32971
llvm-svn: 302538
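Read as C++, the "after" assembly appears to do the following: run a CRC32 step over the header address seeded with the cookie, run a second step over the packed header with its low 16 bits cleared, then store the low 16 bits of the result back into the header. The sketch below mirrors that sequence under loud assumptions: the function names are hypothetical, and a bitwise software CRC-32C (the polynomial used by the x86 `crc32` instruction) stands in for the hardware instruction so that the example is portable. It is not the allocator's actual software fallback.

```cpp
#include <cstdint>

// Software stand-in for one byte of the x86 `crc32` instruction
// (CRC-32C, reflected polynomial 0x82F63B78, no initial or final inversion).
inline std::uint32_t crc32cByte(std::uint32_t Crc, std::uint8_t Byte) {
  Crc ^= Byte;
  for (int I = 0; I < 8; ++I)
    Crc = (Crc >> 1) ^ (0x82F63B78u & (0u - (Crc & 1u)));
  return Crc;
}

// Processes a 64-bit operand, least significant byte first.
inline std::uint32_t crc32cU64(std::uint32_t Crc, std::uint64_t Data) {
  for (int I = 0; I < 8; ++I)
    Crc = crc32cByte(Crc, static_cast<std::uint8_t>(Data >> (8 * I)));
  return Crc;
}

// Hypothetical helper: 16-bit checksum over the header address and the
// packed header, ignoring whatever is currently in the checksum field
// (assumed here to be the low 16 bits of the packed header).
inline std::uint16_t computeHeaderChecksum(std::uint32_t Cookie,
                                           std::uint64_t HeaderAddress,
                                           std::uint64_t PackedHeader) {
  const std::uint64_t HeaderNoChecksum = PackedHeader & ~std::uint64_t{0xffff};
  std::uint32_t Crc = crc32cU64(Cookie, HeaderAddress);
  Crc = crc32cU64(Crc, HeaderNoChecksum);
  return static_cast<std::uint16_t>(Crc);
}
```

Because the checksum field is zeroed before hashing, verifying a header amounts to recomputing this value and comparing it with the stored 16 bits.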