summaryrefslogtreecommitdiffstats
path: root/compiler-rt/lib/scudo
Commit message (Collapse)AuthorAgeFilesLines
...
* Revert "[scudo] Make getNumberOfCPUs Fuchsia compliant"Reid Kleckner2017-11-081-1/+8
| | | | | | | | This reverts commit r317604. Android doesn't have cpu_set_t. llvm-svn: 317655
* [scudo] Make getNumberOfCPUs Fuchsia compliantKostya Kortchinsky2017-11-071-8/+1
| | | | | | | | | | | | | | Summary: This change allows Fuchsia to boot properly using the Scudo allocator. Reviewers: cryptoad, alekseyshl, krytarowski Reviewed By: cryptoad, krytarowski Subscribers: krytarowski, kubamracek, llvm-commits Differential Revision: https://reviews.llvm.org/D39490 llvm-svn: 317604
* [scudo] Rearrange #include orderKostya Kortchinsky2017-11-031-3/+3
| | | | | | | | | | | | | | | | Summary: To be compliant with https://llvm.org/docs/CodingStandards.html#include-style, system headers have to come after local headers. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39623 llvm-svn: 317390
* [scudo] Implement stricter separation of C vs C++Kostya Kortchinsky2017-11-011-23/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Initially, Scudo had a monolithic design where both C and C++ functions were living in the same library. This was not necessarily ideal, and with the work on -fsanitize=scudo, it became more apparent that this needed to change. We are splitting the new/delete interceptor in their own C++ library. This allows more flexibility, notably with regard to std::bad_alloc when the work is done. This also allows us to not link new & delete when using pure C. Additionally, we add the UBSan runtimes with Scudo, in order to be able to have a -fsanitize=scudo,undefined in Clang (see work in D39334). The changes in this patch: - split the cxx specific code in the scudo cmake file into a new library; (remove the spurious foreach loop, that was not necessary) - add the UBSan runtimes (both C and C++); - change the test cmake file to allow for specific C & C++ tests; - make C tests pure C, rename their extension accordingly. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: srhines, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D39461 llvm-svn: 317097
* [scudo] Allow to specify the maximum number of TSDs at compile timeKostya Kortchinsky2017-10-272-6/+8
| | | | | | | | | | | | | | | | | | | | Summary: This introduces `SCUDO_MAX_CACHES` allowing to define an upper bound to the number of `ScudoTSD` created in the Shared TSD model (by default 32U). This name felt clearer than `SCUDO_MAX_TSDS` which is technically what it really is. I am opened to suggestions if that doesn't feel right. Additionally change `getNumberOfCPUs` to return a `u32` to be more consistent. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39338 llvm-svn: 316788
* [scudo] Remove comment about security of the 32-bit allocatorKostya Kortchinsky2017-10-251-3/+0
| | | | | | | | | | | | | | | | | | Summary: The 32-bit allocator is now on par with the 64-bit in terms of security (chunks randomization is done, batches separation is done). Unless objection, the comment can go away. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39303 llvm-svn: 316620
* [scudo] Add a shared runtimeKostya Kortchinsky2017-10-231-3/+20
| | | | | | | | | | | | | | | | | Summary: Up to now, the Scudo cmake target only provided a static library that had to be linked to an executable to benefit from the hardened allocator. This introduces a shared library as well, that can be LD_PRELOAD'ed. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: srhines, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D38980 llvm-svn: 316342
* [scudo] Do not include sanitizer_posix.h if not on a Posix platformKostya Kortchinsky2017-10-161-2/+2
| | | | | | | | | | | | | | | | Summary: Move the `sanitizer_posix.h` include within the `SANITIZER_ANDROID` `#if`, otherwise this errors when built on non-Posix platforms (eg: Fuchsia). Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38956 llvm-svn: 315917
* [scudo] Allow for non-Android Shared TSD platforms, part 2Kostya Kortchinsky2017-10-133-12/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Follow up to D38826. We introduce `pthread_{get,set}specific` versions of `{get,set}CurrentTSD` to allow for non Android platforms to use the Shared TSD model. We now allow `SCUDO_TSD_EXCLUSIVE` to be defined at compile time. A couple of things: - I know that `#if SANITIZER_ANDROID` is not ideal within a function, but in the end I feel it looks more compact and clean than going the .inc route; I am open to an alternative if anyone has one; - `SCUDO_TSD_EXCLUSIVE=1` requires ELF TLS support (and not emutls as this uses malloc). I haven't found anything to enforce that, so it's currently not checked. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: srhines, llvm-commits Differential Revision: https://reviews.llvm.org/D38854 llvm-svn: 315751
* [scudo] Allow for non-Android Shared TSD platforms, part 1Kostya Kortchinsky2017-10-125-17/+22
| | | | | | | | | | | | | | | | | | | | | | Summary: This first part just prepares the grounds for part 2 and doesn't add any new functionality. It mostly consists of small refactors: - move the `pthread.h` include higher as it will be used in the headers; - use `errno.h` in `scudo_allocator.cpp` instead of the sanitizer one, update the `errno` assignments accordingly (otherwise it creates conflicts on some platforms due to `pthread.h` including `errno.h`); - introduce and use `getCurrentTSD` and `setCurrentTSD` for the shared TSD model code; Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: llvm-commits, srhines Differential Revision: https://reviews.llvm.org/D38826 llvm-svn: 315583
* [scudo] Scudo thread specific data refactor, part 3Kostya Kortchinsky2017-09-269-59/+94
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Previous parts: D38139, D38183. In this part of the refactor, we abstract the Linux vs Android TSD dissociation in favor of a Exclusive vs Shared one, allowing for easier platform introduction and configuration. Most of this change consist of shuffling the files around to reflect the new organization. We introduce `scudo_platform.h` where platform specific definition lie. This involves the TSD model and the platform specific allocator parameters. In an upcoming CL, those will be configurable via defines, but we currently stick with conservative defaults. Reviewers: alekseyshl, dvyukov Reviewed By: alekseyshl, dvyukov Subscribers: srhines, llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D38244 llvm-svn: 314224
* [scudo] Scudo thread specific data refactor, part 2Kostya Kortchinsky2017-09-258-145/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Following D38139, we now consolidate the TSD definition, merging the shared TSD definition with the exclusive TSD definition. We introduce a boolean set at initializaton denoting the need for the TSD to be unlocked or not. This adds some unused members to the exclusive TSD, but increases consistency and reduces the definitions fragmentation. We remove the fallback mechanism from `scudo_allocator.cpp` and add a fallback TSD in the non-shared version. Since the shared version doesn't require one, this makes overall more sense. There are a couple of additional cosmetic changes: removing the header guards from the remaining `.inc` files, added error string to a `CHECK`. Question to reviewers: I thought about friending `getTSDAndLock` in `ScudoTSD` so that the `FallbackTSD` could `Mutex.Lock()` directly instead of `lock()` which involved zeroing out the `Precedence`, which is unused otherwise. Is it worth doing? Reviewers: alekseyshl, dvyukov, kcc Reviewed By: dvyukov Subscribers: srhines, llvm-commits Differential Revision: https://reviews.llvm.org/D38183 llvm-svn: 314110
* [scudo] Scudo thread specific data refactor, part 1Kostya Kortchinsky2017-09-228-100/+90
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: We are going through an overhaul of Scudo's TSD, to allow for new platforms to be integrated more easily, and make the code more sound. This first part is mostly renaming, preferring some shorter names, correcting some comments. I removed `getPrng` and `getAllocatorCache` to directly access the members, there was not really any benefit to them (and it was suggested by Dmitry in D37590). The only functional change is in `scudo_tls_android.cpp`: we enforce bounds to the `NumberOfTSDs` and most of the logic in `getTSDAndLockSlow` is skipped if we only have 1 TSD. Reviewers: alekseyshl, dvyukov, kcc Reviewed By: dvyukov Subscribers: llvm-commits, srhines Differential Revision: https://reviews.llvm.org/D38139 llvm-svn: 313987
* [scudo] Fix bad request handling when allocator has not been initializedKostya Kortchinsky2017-09-141-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In a few functions (`scudoMemalign` and the like), we would call `ScudoAllocator::FailureHandler::OnBadRequest` if the parameters didn't check out. The issue is that if the allocator had not been initialized (eg: if this is the first heap related function called), we would use variables like `allocator_may_return_null` and `exitcode` that still had their default value (as opposed to the one set by the user or the initialization path). To solve this, we introduce `handleBadRequest` that will call `initThreadMaybe`, allowing the options to be correctly initialized. Unfortunately, the tests were passing because `exitcode` was still 0, so the results looked like success. Change those tests to do what they were supposed to. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37853 llvm-svn: 313294
* [scudo] Fix improper TSD init after TLS destructors are calledKostya Kortchinsky2017-09-116-9/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Some of glibc's own thread local data is destroyed after a user's thread local destructors are called, via __libc_thread_freeres. This might involve calling free, as is the case for strerror_thread_freeres. If there is no prior heap operation in the thread, this free would end up initializing some thread specific data that would never be destroyed properly (as user's pthread destructors have already been called), while still being deallocated when the TLS goes away. As a result, a program could SEGV, usually in __sanitizer::AllocatorGlobalStats::Unregister, where one of the doubly linked list links would refer to a now unmapped memory area. To prevent this from happening, we will not do a full initialization from the deallocation path. This means that the fallback cache & quarantine will be used if no other heap operation has been called, and we effectively prevent the TSD being initialized and never destroyed. The TSD will be fully initialized for all other paths. In the event of a thread doing only frees and nothing else, a TSD would never be initialized for that thread, but this situation is unlikely and we can live with that. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37697 llvm-svn: 312939
* [scudo] getauxval alternative for AndroidKostya Kortchinsky2017-09-061-12/+44
| | | | | | | | | | | | | | | | | | Summary: `getauxval` was introduced with API level 18. In order to get things to work at lower API levels (for the toolchain itself which is built at 14 for 32-bit), we introduce an alternative implementation reading directly from `/proc/self/auxv`. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: srhines, llvm-commits Differential Revision: https://reviews.llvm.org/D37488 llvm-svn: 312653
* [sanitizer] Re-introduce kUseSeparateSizeClassForBatch for the 32-bit PrimaryKostya Kortchinsky2017-08-281-13/+14
| | | | | | | | | | | | | | | | | | | | | | | Summary: Currently `TransferBatch` are located within the same memory regions as "regular" chunks. This is not ideal for security: they make for an interesting target to overwrite, and are not protected by the frontend (namely, Scudo). To solve this, we re-introduce `kUseSeparateSizeClassForBatch` for the 32-bit Primary allowing for `TransferBatch` to end up in their own memory region. Currently only Scudo would use this new feature, the default behavior remains unchanged. The separate `kBatchClassID` was used for a brief period of time previously but removed when the 64-bit ended up using the "free array". Reviewers: alekseyshl, kcc, eugenis Reviewed By: alekseyshl Subscribers: llvm-commits, kubamracek Differential Revision: https://reviews.llvm.org/D37082 llvm-svn: 311891
* [scudo] Application & platform compatibility changesKostya Kortchinsky2017-08-163-11/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch changes a few (small) things around for compatibility purposes for the current Android & Fuchsia work: - `realloc`'ing some memory that was not allocated with `malloc`, `calloc` or `realloc`, while UB according to http://pubs.opengroup.org/onlinepubs/009695399/functions/realloc.html is more common that one would think. We now only check this if `DeallocationTypeMismatch` is set; change the "mismatch" error messages to be more homogeneous; - some sketchily written but widely used libraries expect a call to `realloc` to copy the usable size of the old chunk to the new one instead of the requested size. We have to begrundingly abide by this de-facto standard. This doesn't seem to impact security either way, unless someone comes up with something we didn't think about; - the CRC32 intrinsics for 64-bit take a 64-bit first argument. This is misleading as the upper 32 bits end up being ignored. This was also raising `-Wconversion` errors. Change things to take a `u32` as first argument. This also means we were (and are) only using 32 bits of the Cookie - not a big thing, but worth mentioning. - Includes-wise: prefer `stddef.h` to `cstddef`, move `scudo_flags.h` where it is actually needed. - Add tests for the memalign-realloc case, and the realloc-usable-size one. (Edited typos) Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36754 llvm-svn: 311018
* [sanitizers] Add a blocking boolean to GetRandom prototypeKostya Kortchinsky2017-08-141-5/+8
| | | | | | | | | | | | | | | | | | | | | | Summary: On platforms with `getrandom`, the system call defaults to blocking. This becomes an issue in the very early stage of the boot for Scudo, when the RNG source is not set-up yet: the syscall will block and we'll stall. Introduce a parameter to specify that the function should not block, defaulting to blocking as the underlying syscall does. Update Scudo to use the non-blocking version. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: llvm-commits, kubamracek Differential Revision: https://reviews.llvm.org/D36399 llvm-svn: 310839
* [scudo] Check for pvalloc overflowKostya Kortchinsky2017-07-251-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | Summary: Previously we were rounding up the size passed to `pvalloc` to the next multiple of page size no matter what. There is an overflow possibility that wasn't accounted for. So now, return null in the event of an overflow. The man page doesn't seem to indicate the errno to set in this particular situation, but the glibc unit tests go for ENOMEM (https://code.woboq.org/userspace/glibc/malloc/tst-pvalloc.c.html#54) so we'll do the same. Update the aligned allocation funtions tests to check for properly aligned returned pointers, and the `pvalloc` corner cases. @alekseyshl: do you want me to do the same in the other Sanitizers? Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: kubamracek, alekseyshl, llvm-commits Differential Revision: https://reviews.llvm.org/D35818 llvm-svn: 309033
* [scudo] Quarantine overhaulKostya Kortchinsky2017-07-243-56/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: First, some context. The main feedback we get about the quarantine is that it's too memory hungry. A single MB of quarantine will have an impact of 3 to 4MB of PSS/RSS, and things quickly get out of hand in terms of memory usage, and the quarantine ends up disabled. The main objective of the quarantine is to protect from use-after-free exploitation by making it harder for an attacker to reallocate a controlled chunk in place of the targeted freed chunk. This is achieved by not making it available to the backend right away for reuse, but holding it a little while. Historically, what has usually been the target of such attacks was objects, where vtable pointers or other function pointers could constitute a valuable targeti to replace. Those are usually on the smaller side. There is barely any advantage in putting the quarantine several megabytes of RGB data or the like. Now for the patch. This patch introduces a new way the Quarantine behaves in Scudo. First of all, the size of the Quarantine will be defined in KB instead of MB, then we introduce a new option: the size up to which (lower than or equal to) a chunk will be quarantined. This way, we only quarantine smaller chunks, and the size of the quarantine remains manageable. It also prevents someone from triggering a recycle by allocating something huge. We default to 512 bytes on 32-bit and 2048 bytes on 64-bit platforms. In details, the patches includes the following: - introduce `QuarantineSizeKb`, but honor `QuarantineSizeMb` if set to fall back to the old behavior (meaning no threshold in that case); `QuarantineSizeMb` is described as deprecated in the options descriptios; documentation update will follow; - introduce `QuarantineChunksUpToSize`, the new threshold value; - update the `quarantine.cpp` test, and other tests using `QuarantineSizeMb`; - remove `AllocatorOptions::copyTo`, it wasn't used; - slightly change the logic around `quarantineOrDeallocateChunk` to accomodate for the new logic; rename a couple of variables there as well; Rewriting the tests, I found a somewhat annoying bug where non-default aligned chunks would account for more than needed when placed in the quarantine due to `<< MinAlignment` instead of `<< MinAlignmentLog`. This is fixed and tested for now. Reviewers: alekseyshl, kcc Reviewed By: alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35694 llvm-svn: 308884
* [Sanitizers] ASan/MSan/LSan allocators set errno on failure.Alex Shlyapnikov2017-07-181-18/+13
| | | | | | | | | | | | | | | | | | | | | Summary: ASan/MSan/LSan allocators set errno on allocation failures according to malloc/calloc/etc. expected behavior. MSan allocator was refactored a bit to make its structure more similar with other allocators. Also switch Scudo allocator to the internal errno definitions. TSan allocator changes will follow. Reviewers: eugenis Subscribers: llvm-commits, kubamracek Differential Revision: https://reviews.llvm.org/D35275 llvm-svn: 308344
* [Sanitizers] Scudo allocator set errno on failure.Alex Shlyapnikov2017-07-141-21/+29
| | | | | | | | | | | | | | | Summary: Set proper errno code on alloction failure and change pvalloc and posix_memalign implementation to satisfy their man-specified requirements. Reviewers: cryptoad Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35429 llvm-svn: 308053
* [scudo] Do not grab a cache for secondary allocation & per related changesKostya Kortchinsky2017-07-132-55/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Secondary backed allocations do not require a cache. While it's not necessary an issue when each thread has its cache, it becomes one with a shared pool of caches (Android), as a Secondary backed allocation or deallocation holds a cache that could be useful to another thread doing a Primary backed allocation. We introduce an additional PRNG and its mutex (to avoid contention with the Fallback one for Primary allocations) that will provide the `Salt` needed for Secondary backed allocations. I changed some of the code in a way that feels more readable to me (eg: using some values directly rather than going through ternary assigned variables, using directly `true`/`false` rather than `FromPrimary`). I will let reviewers decide if it actually is. An additional change is to mark `CheckForCallocOverflow` as `UNLIKELY`. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35358 llvm-svn: 307958
* [scudo] PRNG makeoverKostya Kortchinsky2017-07-124-56/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This follows the addition of `GetRandom` with D34412. We remove our `/dev/urandom` code and use the new function. Additionally, change the PRNG for a slightly faster version. One of the issues with the old code is that we have 64 full bits of randomness per "next", using only 8 of those for the Salt and discarding the rest. So we add a cached u64 in the PRNG that can serve up to 8 u8 before having to call the "next" function again. During some integration work, I also realized that some very early processes (like `init`) do not benefit from `/dev/urandom` yet. So if there is no `getrandom` syscall as well, we have to fallback to some sort of initialization of the PRNG. Now a few words on why XoRoShiRo and not something else. I have played a while with various PRNGs on 32 & 64 bit platforms. Some results are below. LCG 32 & 64 are usually faster but produce respectively 15 & 31 bits of entropy, meaning that to get a full 64-bit, you would need to call them several times. The simple XorShift is fast, produces 32 bits but is mediocre with regard to PRNG test suites, PCG is slower overall, and XoRoShiRo is faster than XorShift128+ and produces full 64 bits. %%% root@tulip-chiphd:/data # ./randtest.arm [+] starting xs32... [?] xs32 duration: 22431833053ns [+] starting lcg32... [?] lcg32 duration: 14941402090ns [+] starting pcg32... [?] pcg32 duration: 44941973771ns [+] starting xs128p... [?] xs128p duration: 48889786981ns [+] starting lcg64... [?] lcg64 duration: 33831042391ns [+] starting xos128p... [?] xos128p duration: 44850878605ns root@tulip-chiphd:/data # ./randtest.aarch64 [+] starting xs32... [?] xs32 duration: 22425151678ns [+] starting lcg32... [?] lcg32 duration: 14954255257ns [+] starting pcg32... [?] pcg32 duration: 37346265726ns [+] starting xs128p... [?] xs128p duration: 22523807219ns [+] starting lcg64... [?] lcg64 duration: 26141304679ns [+] starting xos128p... [?] xos128p duration: 14937033215ns %%% Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: aemerson, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D35221 llvm-svn: 307798
* MergeAlex Shlyapnikov2017-06-291-2/+1
| | | | llvm-svn: 306748
* [scudo] Change aligned alloc functions to be more compliant & perf changesKostya Kortchinsky2017-06-294-28/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: We were not following the `man` documented behaviors for invalid arguments to `memalign` and associated functions. Using `CHECK` for those was a bit extreme, so we relax the behavior to return null pointers as expected when this happens. Adapt the associated test. I am using this change also to change a few more minor performance improvements: - mark as `UNLIKELY` a bunch of unlikely conditions; - the current `CHECK` in `__sanitizer::RoundUpTo` is redundant for us in *all* calls. So I am introducing our own version without said `CHECK`. - change our combined allocator `GetActuallyAllocatedSize`. We already know if the pointer is from the Primary or Secondary, so the `PointerIsMine` check is redundant as well, and costly for the 32-bit Primary. So we get the size by directly using the available Primary functions. Finally, change a `int` to `uptr` to avoid a warning/error when compiling on Android. Reviewers: alekseyshl Reviewed By: alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34782 llvm-svn: 306698
* [Sanitizers] Operator new() interceptors always die on allocation errorAlex Shlyapnikov2017-06-281-2/+7
| | | | | | | | | | | | | | | | | | | | | Summary: Operator new interceptors behavior is now controlled by their nothrow property as well as by allocator_may_return_null flag value: - allocator_may_return_null=* + new() - die on allocation error - allocator_may_return_null=0 + new(nothrow) - die on allocation error - allocator_may_return_null=1 + new(nothrow) - return null Ideally new() should throw std::bad_alloc exception, but that is not trivial to achieve, hence TODO. Reviewers: eugenis Subscribers: kubamracek, llvm-commits Differential Revision: https://reviews.llvm.org/D34731 llvm-svn: 306604
* [Sanitizers] Move cached allocator_may_return_null flag to sanitizer_allocatorAlex Shlyapnikov2017-06-203-32/+13
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Move cached allocator_may_return_null flag to sanitizer_allocator.cc and provide API to consolidate and unify the behavior of all specific allocators. Make all sanitizers using CombinedAllocator to follow AllocatorReturnNullOrDieOnOOM() rules to behave the same way when OOM happens. When OOM happens, turn allocator_out_of_memory flag on regardless of allocator_may_return_null flag value (it used to not to be set when allocator_may_return_null == true). release_to_os_interval_ms and rss_limit_exceeded will likely be moved to sanitizer_allocator.cc too (later). Reviewers: eugenis Subscribers: srhines, kubamracek, llvm-commits Differential Revision: https://reviews.llvm.org/D34310 llvm-svn: 305858
* [scudo] Check the return values of the pthread_* functionsKostya Kortchinsky2017-05-261-7/+9
| | | | | | | | | | | | | | | | | | | Summary: Currently we are not enforcing the success of `pthread_once`, and `pthread_setspecific`. Errors could lead to harder to debug issues later in the thread's life. This adds checks for a 0 return value for both. If `pthread_setspecific` fails in the teardown path, opt for an immediate teardown as opposed to a fatal failure. Reviewers: alekseyshl, kcc Reviewed By: alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33555 llvm-svn: 303998
* [scudo] lower quarantine default sizesKostya Kortchinsky2017-05-181-2/+2
| | | | | | | | | | | | | | | | | | | | Summary: After discussing the current defaults with a couple of parties, the consensus is that they are too high. 1Mb of quarantine has about a 4Mb impact on PSS, so memory usage goes up quickly. This is obviously configurable, but the default value should be more "approachable", so both the global size and the thread local size are 1/4 of what they used to be. Reviewers: alekseyshl, kcc Reviewed By: alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33321 llvm-svn: 303380
* [sanitizer] Change SizeClassAllocator32 to accept just one templateKostya Kortchinsky2017-05-151-4/+14
| | | | | | | | | | | | | | | | | | | | | Summary: With rL279771, SizeClassAllocator64 was changed to accept only one template instead of 5, for the following reasons: "First, this will make the mangled names shorter. Second, this will make adding more parameters simpler". This patch mirrors that work for SizeClassAllocator32. This is in preparation for introducing the randomization of chunks in the 32-bit SizeClassAllocator in a later patch. Reviewers: kcc, alekseyshl, dvyukov Reviewed By: alekseyshl Subscribers: llvm-commits, kubamracek Differential Revision: https://reviews.llvm.org/D33141 llvm-svn: 303071
* [scudo] Use our own combined allocatorKostya Kortchinsky2017-05-114-115/+155
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The reasoning behind this change is twofold: - the current combined allocator (sanitizer_allocator_combined.h) implements features that are not relevant for Scudo, making some code redundant, and some restrictions not pertinent (alignments for example). This forced us to do some weird things between the frontend and our secondary to make things work; - we have enough information to be able to know if a chunk will be serviced by the Primary or Secondary, allowing us to avoid extraneous calls to functions such as `PointerIsMine` or `CanAllocate`. As a result, the new scudo-specific combined allocator is very straightforward, and allows us to remove some now unnecessary code both in the frontend and the secondary. Unused functions have been left in as unimplemented for now. It turns out to also be a sizeable performance gain (3% faster in some Android memory_replay benchmarks, doing some more on other platforms). Reviewers: alekseyshl, kcc, dvyukov Reviewed By: alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33007 llvm-svn: 302830
* [scudo] CRC32 optimizationsKostya Kortchinsky2017-05-094-98/+129
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change optimizes several aspects of the checksum used for chunk headers. First, there is no point in checking the weak symbol `computeHardwareCRC32` everytime, it will either be there or not when we start, so check it once during initialization and set the checksum type accordingly. Then, the loading of `HashAlgorithm` for SSE versions (and ARM equivalent) was not optimized out, while not necessary. So I reshuffled that part of the code, which duplicates a tiny bit of code, but ends up in a much cleaner assembly (and faster as we avoid an extraneous load and some calls). The following code is the checksum at the end of `scudoMalloc` for x86_64 with full SSE 4.2, before: ``` mov rax, 0FFFFFFFFFFFFFFh shl r10, 38h mov edi, dword ptr cs:_ZN7__scudoL6CookieE ; __scudo::Cookie and r14, rax lea rsi, [r13-10h] movzx eax, cs:_ZN7__scudoL13HashAlgorithmE ; __scudo::HashAlgorithm or r14, r10 mov rbx, r14 xor bx, bx call _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong) mov rsi, rbx mov edi, eax call _ZN7__scudo20computeHardwareCRC32Ejm ; __scudo::computeHardwareCRC32(uint,ulong) mov r14w, ax mov rax, r13 mov [r13-10h], r14 ``` After: ``` mov rax, cs:_ZN7__scudoL6CookieE ; __scudo::Cookie lea rcx, [rbx-10h] mov rdx, 0FFFFFFFFFFFFFFh and r14, rdx shl r9, 38h or r14, r9 crc32 eax, rcx mov rdx, r14 xor dx, dx mov eax, eax crc32 eax, rdx mov r14w, ax mov rax, rbx mov [rbx-10h], r14 ``` Reviewers: dvyukov, alekseyshl, kcc Reviewed By: alekseyshl Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D32971 llvm-svn: 302538
* [scudo] Add Android supportKostya Kortchinsky2017-05-0510-20/+261
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change adds Android support to the allocator (but doesn't yet enable it in the cmake config), and should be the last fragment of the rewritten change D31947. Android has more memory constraints than other platforms, so the idea of a unique context per thread would not have worked. The alternative chosen is to allocate a set of contexts based on the number of cores on the machine, and share those contexts within the threads. Contexts can be dynamically reassigned to threads to prevent contention, based on a scheme suggested by @dvyuokv in the initial review. Additionally, given that Android doesn't support ELF TLS (only emutls for now), we use the TSan TLS slot to make things faster: Scudo is mutually exclusive with other sanitizers so this shouldn't cause any problem. An additional change made here, is replacing `thread_local` by `THREADLOCAL` and using the initial-exec thread model in the non-Android version to prevent extraneous weak definition and checks on the relevant variables. Reviewers: kcc, dvyukov, alekseyshl Reviewed By: alekseyshl Subscribers: srhines, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D32649 llvm-svn: 302300
* [scudo] Move thread local variables into their own filesKostya Kortchinsky2017-04-278-143/+286
| | | | | | | | | | | | | | | | | | | Summary: This change introduces scudo_tls.h & scudo_tls_linux.cpp, where we move the thread local variables used by the allocator, namely the cache, quarantine cache & prng. `ScudoThreadContext` will hold those. This patch doesn't introduce any new platform support yet, this will be the object of a later patch. This also changes the PRNG so that the structure can be POD. Reviewers: kcc, dvyukov, alekseyshl Reviewed By: dvyukov, alekseyshl Subscribers: llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D32440 llvm-svn: 301584
* [scudo] Bypass Quarantine if its size is set to 0Kostya Kortchinsky2017-04-211-26/+36
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: In the current state of things, the deallocation path puts a chunk in the Quarantine whether it's enabled or not (size of 0). When the Quarantine is disabled, this results in the header being loaded (and checked) twice, and stored (and checksummed) once, in `deallocate` and `Recycle`. This change introduces a `quarantineOrDeallocateChunk` function that has a fast path to deallocation if the Quarantine is disabled. Even though this is not the preferred configuration security-wise, this change saves a sizeable amount of processing for that particular situation (which could be adopted by low memory devices). Additionally this simplifies a bit `deallocate` and `reallocate`. Reviewers: dvyukov, kcc, alekseyshl Reviewed By: dvyukov Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32310 llvm-svn: 301015
* [scudo] Remove GetActuallyAllocatedSize calls from the fast pathKostya Kortchinsky2017-04-202-30/+49
| | | | | | | | | | | | | | | | | | | | | | Summary: GetActuallyAllocatedSize is actually expensive. In order to avoid calling this function in the malloc/free fast path, we change the Scudo chunk header to store the size of the chunk, if from the Primary, or the amount of unused bytes if from the Secondary. This way, we only have to call the culprit function for Secondary backed allocations (and still in realloc). The performance gain on a singly threaded pure malloc/free benchmark exercising the Primary allocator is above 5%. Reviewers: alekseyshl, kcc, dvyukov Reviewed By: dvyukov Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32299 llvm-svn: 300861
* [scudo] Minor changes and refactoringKostya Kortchinsky2017-04-205-162/+171
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is part of D31947 that is being split into several smaller changes. This one deals with all the minor changes, more specifically: - Rename some variables and functions to make their purpose clearer; - Reorder some code; - Mark the hot termination incurring checks as `UNLIKELY`; if they happen, the program will die anyway; - Add a `getScudoChunk` method; - Add an `eraseHeader` method to ScudoChunk that will clear a header with 0s; - Add a parameter to `allocate` to know if the allocated chunk should be filled with zeros. This allows `calloc` to not have to call `GetActuallyAllocatedSize`; more changes to get rid of this function on the hot paths will follow; - reallocate was missing a check to verify that the pointer is properly aligned on `MinAlignment`; - The `Stats` in the secondary have to be protected by a mutex as the `Add` and `Sub` methods are actually not atomic; - The software CRC32 function was moved to the header to allow for inlining. Reviewers: dvyukov, alekseyshl, kcc Reviewed By: dvyukov Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32242 llvm-svn: 300846
* [scudo] 32-bit quarantine sizes adjustments and bug fixesKostya Kortchinsky2017-02-033-10/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The local and global quarantine sizes were not offering a distinction for 32-bit and 64-bit platforms. This is addressed with lower values for 32-bit. When writing additional tests for the quarantine, it was discovered that when calling some of the allocator interface function prior to any allocation operation having occured, the test would crash due to the allocator not being initialized. This was addressed by making sure the allocator is initialized for those scenarios. Relevant tests were added in interface.cpp and quarantine.cpp. Last change being the removal of the extraneous link dependencies for the tests thanks to rL293220, anf the addition of the gc-sections linker flag. Reviewers: kcc, alekseyshl Reviewed By: alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29341 llvm-svn: 294037
* [scudo] Replacing std::atomic with Sanitizer's atomicsKostya Kortchinsky2017-01-202-12/+8
| | | | | | | | | | | | | | | | | Summary: In an effort to getting rid of dependencies to external libraries, we are replacing atomic PackedHeader use of std::atomic with Sanitizer's atomic_uint64_t, which allows us to avoid -latomic. Reviewers: kcc, phosek, alekseyshl Reviewed By: alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28864 llvm-svn: 292630
* [scudo] Refactor of CRC32 and ARM runtime CRC32 detectionKostya Kortchinsky2017-01-187-63/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: ARM & AArch64 runtime detection for hardware support of CRC32 has been added via check of the AT_HWVAL auxiliary vector. Following Michal's suggestions in D28417, the CRC32 code has been further changed and looks better now. When compiled with full relro (which is strongly suggested to benefit from additional hardening), the weak symbol for computeHardwareCRC32 is read-only and the assembly generated is fairly clean and straight forward. As suggested, an additional optimization is to skip the runtime check if SSE 4.2 has been enabled globally, as opposed to only for scudo_crc32.cpp. scudo_crc32.h has no purpose anymore and was removed. Reviewers: alekseyshl, kcc, rengolin, mgorny, phosek Reviewed By: rengolin, mgorny Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D28574 llvm-svn: 292409
* [scudo] Eliminate the runtime dependency on libc++abiPetr Hosek2017-01-171-1/+1
| | | | | | | | | | | | Making this variable non-static avoids the need for locking to ensure that the initialization is thread-safe which in turns eliminates the runtime dependency on libc++abi library (for __cxa_guard_acquire and __cxa_guard_release) which makes it possible to link scudo against pure C programs. Differential Revision: https://reviews.llvm.org/D28757 llvm-svn: 292253
* Revert "[sancov] moving sancov rt to sancov/ directory"Mike Aizatsky2017-01-121-1/+0
| | | | | | | | This reverts commit https://reviews.llvm.org/rL291734 Reason: mac breakage http://lab.llvm.org:8080/green//job/clang-stage1-configure-RA_build/28798/consoleFull#1657087648e9a0fee5-ebcc-4238-a641-c5aa112c323e llvm-svn: 291736
* [sancov] moving sancov rt to sancov/ directoryMike Aizatsky2017-01-121-0/+1
| | | | | | | | Subscribers: kubabrecka, mgorny Differential Revision: https://reviews.llvm.org/D28541 llvm-svn: 291734
* [scudo] Separate hardware CRC32 routinesKostya Kortchinsky2017-01-106-34/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: As raised in D28304, enabling SSE 4.2 for the whole Scudo tree leads to the emission of SSE 4.2 instructions everywhere, while the runtime checks only applied to the CRC32 computing function. This patch separates the CRC32 function taking advantage of the hardware into its own file, and only enabled -msse4.2 for that file, if detected to be supported by the compiler. Another consequence of removing SSE4.2 globally is realizing that memcpy were not being optimized, which turned out to be due to the -fno-builtin in SANITIZER_COMMON_CFLAGS. So we now explicitely enable builtins for Scudo. The resulting assembly looks good, with some CALLs are introduced instead of the CRC32 code being inlined. Reviewers: kcc, mgorny, alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28417 llvm-svn: 291570
* [cmake] Disable appending -msse4.2 flag implicitlyMichal Gorny2017-01-061-1/+0
| | | | | | | | | | | | | | | | | | | | | | | Disable the code appending -msse4.2 flag implicitly when the compiler supports it. The compiler support for this flags do not indicate that the underlying CPU will support SSE4.2, and passing it may result in SSE4.2 code being emitted *implicitly*. If the target platform supports SSE4.2 appropriately, the relevant bits should be already enabled via -march= or equivalent. In this case passing -msse4.2 is redundant. If a runtime detection is desired (which seems to be a case with SCUDO), then (as gcc manpage points out) the specific SSE4.2 needs to be isolated into a separate file, the -msse4.2 flag can be forced only for that file and the function defined in that file can only be called when the CPU is determined to support SSE4.2. This fixes SIGILL on SCUDO when it is compiled using gcc-5.4. Differential Revision: https://reviews.llvm.org/D28304 llvm-svn: 291217
* [scudo] Use DefaultSizeClassMap for 32-bitKostya Kortchinsky2016-12-152-8/+8
| | | | | | | | | | | | | | | Summary: With the recent changes to the Secondary, we use less bits for UnusedBytes, which allows us in return to increase the bits used for Offset. That means that we can use a Primary SizeClassMap allowing for a larger maximum size. Reviewers: kcc, alekseyshl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27816 llvm-svn: 289838
* Corrected D27428: Do not use the alignment-rounded-up size with secondaryKostya Kortchinsky2016-12-132-19/+36
| | | | | | | | | | | | | | | | | | | | Summary: I atually had an integer overflow on 32-bit with D27428 that didn't reproduce locally, as the test servers would manage allocate addresses in the 0xffffxxxx range, which led to some issues when rounding addresses. At this point, I feel that Scudo could benefit from having its own combined allocator, as we don't get any benefit from the current one, but have to work around some hurdles (alignment checks, rounding up that is no longer needed, extraneous code). Reviewers: kcc, alekseyshl Subscribers: llvm-commits, kubabrecka Differential Revision: https://reviews.llvm.org/D27681 llvm-svn: 289572
* Reverting rL289088 while investigating some test issue on the build serversKostya Kortchinsky2016-12-091-4/+3
| | | | | | | | Subscribers: kubabrecka Differential Revision: https://reviews.llvm.org/D27605 llvm-svn: 289180
OpenPOWER on IntegriCloud