path: root/compiler-rt/lib/scudo/scudo_tsd.h
Commit message | Author | Age | Files | Lines
* Update the file headers across all of the LLVM projects in the monorepo | Chandler Carruth | 2019-01-19 | 1 | -4/+3
  to reflect the new license.

  We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.

  Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.

  llvm-svn: 351636
* [scudo] Simplify internal names (NFC) | Kostya Kortchinsky | 2018-07-20 | 1 | -1/+1
  Summary:
  There is currently too much redundancy in the class/variable/* names in Scudo:
  - we are in the namespace `__scudo`, so there is no point in having something named `ScudoX` to end up with a final name of `__scudo::ScudoX`;
  - there are a lot of types/* that have `Allocator` in the name, given that Scudo is an allocator I figure this doubles up as well.

  So change a bunch of the Scudo names to make them shorter, less redundant, and overall simpler. They should still be pretty self explaining (or at least it looks so to me).

  The TSD part will be done in another CL (eg `__scudo::ScudoTSD`).

  Reviewers: alekseyshl, eugenis
  Reviewed By: alekseyshl
  Subscribers: delcypher, #sanitizers, llvm-commits
  Differential Revision: https://reviews.llvm.org/D49505
  llvm-svn: 337557
* [scudo] Improve the scalability of the shared TSD model | Kostya Kortchinsky | 2018-06-11 | 1 | -14/+8
  Summary:
  The shared TSD model in its current form doesn't scale. Here is an example of rpc2-benchmark (with default parameters, which is threading heavy) on a 72-core machine (defaulting to a `CompactSizeClassMap` and no Quarantine):
  - with tcmalloc: 337K reqs/sec, peak RSS of 338MB;
  - with scudo (exclusive): 321K reqs/sec, peak RSS of 637MB;
  - with scudo (shared): 241K reqs/sec, peak RSS of 324MB.

  This isn't great, since the exclusive model uses a lot of memory, while the shared model doesn't even come close to being competitive. This is mostly due to the fact that we are consistently scanning the TSD pool starting at index 0 for an available TSD, which can result in a lot of failed lock attempts, and touching some memory that need not be touched.

  This CL attempts to make things better in most situations:
  - first, use a thread local variable on Linux (instead of pthread APIs) to store the current TSD in the shared model;
  - move the locking boolean out of the TSD: this allows the compiler to use a register and potentially optimize out a branch instead of reading it from the TSD every time (we also save a tiny bit of memory per TSD);
  - 64-bit atomic operations on 32-bit ARM platforms happen to be expensive: so store the `Precedence` in a `uptr` instead of a `u64`. We lose some nanoseconds of precision and we'll wrap around at some point, but the benefit is worth it;
  - change a `CHECK` to a `DCHECK`: this should never happen, but if something is ever terribly wrong, we'll crash on a near null AV if the TSD happens to be null;
  - based on an idea by dvyukov@, we are implementing a bounded random scan for an available TSD. This requires computing the coprimes for the number of TSDs, and attempting to lock up to 4 TSDs in a random order before falling back to the current one. This is obviously slightly more expensive when we have just 2 TSDs (barely noticeable) but is otherwise beneficial. The `Precedence` still basically corresponds to the moment of the first contention on a TSD. To seed the random choice, we use the precedence of the current TSD since it is very likely to be non-zero (since we are in the slow path after a failed `tryLock`).

  With those modifications, the benchmark yields:
  - with scudo (shared): 330K reqs/sec, peak RSS of 327MB.

  So the shared model for this specific situation not only becomes competitive but outperforms the exclusive model. I experimented with some values greater than 4 for the number of TSDs to attempt to lock and it yielded a decrease in QPS. Just sticking with the current TSD is also a tad slower. Numbers on platforms with fewer cores (eg: Android) remain similar.

  Reviewers: alekseyshl, dvyukov, javed.absar
  Reviewed By: alekseyshl, dvyukov
  Subscribers: srhines, kristof.beyls, delcypher, llvm-commits, #sanitizers
  Differential Revision: https://reviews.llvm.org/D47289
  llvm-svn: 334410
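  The bounded random scan described in this commit can be illustrated with a minimal C++17 sketch. It is not the code from the patch: `Slot`, `computeCoprimes`, and `pickSlot` are hypothetical names, and deriving both the starting index and the stride from a single seed is an assumption made for brevity. The property it relies on is that stepping through `N` slots with a stride coprime to `N` visits each slot at most once.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>   // std::gcd (C++17)
#include <optional>
#include <vector>

// Hypothetical stand-in for a TSD: only the lock matters for this sketch.
struct Slot {
  std::atomic<bool> Locked{false};
  // Returns true if the caller acquired the lock.
  bool tryLock() { return !Locked.exchange(true, std::memory_order_acquire); }
  void unlock() { Locked.store(false, std::memory_order_release); }
};

// Collects every stride in [1, N] that is coprime with N.
std::vector<std::size_t> computeCoprimes(std::size_t N) {
  std::vector<std::size_t> Result;
  for (std::size_t I = 1; I <= N; ++I)
    if (std::gcd(I, N) == 1)
      Result.push_back(I);
  return Result;
}

// Tries to lock up to MaxAttempts slots, walking the pool with a coprime
// stride chosen from Seed. Returns the index of the slot that was locked, or
// std::nullopt if every attempt failed (the caller would then fall back to
// blocking on its current slot).
std::optional<std::size_t> pickSlot(std::vector<Slot> &Slots,
                                    const std::vector<std::size_t> &Coprimes,
                                    std::uintptr_t Seed,
                                    std::size_t MaxAttempts = 4) {
  const std::size_t N = Slots.size();
  assert(N > 0 && !Coprimes.empty());
  const std::size_t Stride = Coprimes[Seed % Coprimes.size()];
  std::size_t Index = Seed % N;
  const std::size_t Attempts = MaxAttempts < N ? MaxAttempts : N;
  for (std::size_t I = 0; I < Attempts; ++I) {
    if (Slots[Index].tryLock())
      return Index;
    Index = (Index + Stride) % N;
  }
  return std::nullopt;
}
```

  With 8 slots the coprime strides are 1, 3, 5 and 7. A seed of 2 selects stride 5 and start index 2, so the scan probes slots 2, 7, 4 and 1 before giving up.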
* [sanitizer] Introduce a vDSO aware timing function | Kostya Kortchinsky | 2017-12-13 | 1 | -1/+1
  Summary:
  See D40657 & D40679 for previous versions of this patch & description.

  A couple of things were fixed here to have it not break some bots. Weak symbols can't be used with `SANITIZER_GO` so the previous version was breaking TsanGo. I set up some additional local tests and those pass now.

  I changed the workaround for the glibc vDSO issue: `__progname` is initialized after the vDSO and is actually public and of known type, unlike `__vdso_clock_gettime`. This works better, and with all compilers.

  The rest is the same.

  Reviewers: alekseyshl
  Reviewed By: alekseyshl
  Subscribers: srhines, kubamracek, krytarowski, llvm-commits, #sanitizers
  Differential Revision: https://reviews.llvm.org/D41121
  llvm-svn: 320594
* [sanitizer] Revert rL320409 | Kostya Kortchinsky | 2017-12-11 | 1 | -1/+1
  Summary: D40679 broke a couple of builds, reverting while investigating.

  Reviewers: alekseyshl
  Reviewed By: alekseyshl
  Subscribers: srhines, kubamracek, krytarowski, llvm-commits, #sanitizers
  Differential Revision: https://reviews.llvm.org/D41088
  llvm-svn: 320417
* [sanitizer] Introduce a vDSO aware time function, and use it in the allocator [redo] | Kostya Kortchinsky | 2017-12-11 | 1 | -1/+1
  Summary:
  Redo of D40657, which had the initial discussion. The initial code had to move into a libcdep file, and things had to be shuffled accordingly.

  `NanoTime` is a time sink when checking whether or not to release memory to the OS. While reducing the number of calls to said function is in the works, another solution that was found to be beneficial was to use a timing function that can leverage the vDSO.

  We hit a couple of snags along the way, like the fact that glibc crashes when clock_gettime is called from a preinit_array, or the fact that `__vdso_clock_gettime` is mangled (for security purposes) and can't be used directly, and also that clock_gettime can be intercepted.

  The proposed solution takes care of all this as far as I can tell, and significantly improves performance in some Scudo load tests with memory reclaiming enabled.

  @mcgrathr: please feel free to follow up on https://reviews.llvm.org/D40657#940857 here. I posted a reply at https://reviews.llvm.org/D40657#940974.

  Reviewers: alekseyshl, krytarowski, flowerhack, mcgrathr, kubamracek
  Reviewed By: alekseyshl, krytarowski
  Subscribers: #sanitizers, mcgrathr, srhines, llvm-commits, kubamracek
  Differential Revision: https://reviews.llvm.org/D40679
  llvm-svn: 320409
* [scudo] Get rid of the thread local PRNG & header salt | Kostya Kortchinsky | 2017-12-05 | 1 | -1/+0
  Summary:
  It was deemed that the salt in the chunk header didn't improve security significantly (and could actually decrease it). The initial idea was that the same chunk would have different headers on different allocations, allowing for less predictability. The issue is that gathering the same chunk header with different salts can give information about the other "secrets" (cookie, pointer), and that if an attacker leaks a header, they can reuse it for that same chunk anyway since we don't enforce the salt value.

  So we get rid of the salt in the header. This means we also get rid of the thread local Prng, and that we don't need a global Prng anymore as well. This makes everything faster.

  We reuse those 8 bits to store the `ClassId` of a chunk now (0 for a secondary based allocation). This way, we get some additional speed gains:
  - `ClassId` is computed outside of the locked block;
  - `getActuallyAllocatedSize` doesn't need the `GetSizeClass` call;
  - same for `deallocatePrimary`.

  We add a sanity check at init for this new field (all sanity checks are moved into their own function, `init` was getting crowded).

  Reviewers: alekseyshl, flowerhack
  Reviewed By: alekseyshl
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D40796
  llvm-svn: 319791
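  Storing the `ClassId` in the 8 header bits freed by the salt can be pictured with the following C++ sketch. Only the 8-bit width and the "0 means secondary" convention come from the commit message. The 64-bit header word, the bit position (`ClassIdShift`), and the helper names `setClassId`, `getClassId`, `isSecondary`, and `classIdFits` are illustrative assumptions, not the layout used by Scudo.

```cpp
#include <cassert>
#include <cstdint>

// Assumed position of the 8-bit ClassId field inside a 64-bit packed header.
constexpr unsigned ClassIdShift = 16;
constexpr std::uint64_t ClassIdMask = 0xFF;

// Returns Header with its ClassId field replaced; other bits are preserved.
constexpr std::uint64_t setClassId(std::uint64_t Header, std::uint8_t ClassId) {
  return (Header & ~(ClassIdMask << ClassIdShift)) |
         (static_cast<std::uint64_t>(ClassId) << ClassIdShift);
}

// Extracts the ClassId field from a packed header.
constexpr std::uint8_t getClassId(std::uint64_t Header) {
  return static_cast<std::uint8_t>((Header >> ClassIdShift) & ClassIdMask);
}

// A ClassId of 0 marks a chunk served by the secondary allocator.
constexpr bool isSecondary(std::uint64_t Header) {
  return getClassId(Header) == 0;
}

// Init-time style sanity check: the largest class id must fit in the field.
constexpr bool classIdFits(unsigned LargestClassId) {
  return LargestClassId <= ClassIdMask;
}
```

  Because the class id travels with the chunk, a deallocation path can read it straight from the header instead of recomputing it from the chunk size, which is the speed gain the commit message describes.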
* [scudo] Allow for non-Android Shared TSD platforms, part 1 | Kostya Kortchinsky | 2017-10-12 | 1 | -0/+2
  Summary:
  This first part just prepares the ground for part 2 and doesn't add any new functionality. It mostly consists of small refactors:
  - move the `pthread.h` include higher as it will be used in the headers;
  - use `errno.h` in `scudo_allocator.cpp` instead of the sanitizer one, update the `errno` assignments accordingly (otherwise it creates conflicts on some platforms due to `pthread.h` including `errno.h`);
  - introduce and use `getCurrentTSD` and `setCurrentTSD` for the shared TSD model code.

  Reviewers: alekseyshl
  Reviewed By: alekseyshl
  Subscribers: llvm-commits, srhines
  Differential Revision: https://reviews.llvm.org/D38826
  llvm-svn: 315583
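  A minimal sketch of what a `getCurrentTSD` / `setCurrentTSD` accessor pair could look like follows. The two function names come from the commit message. Everything else is an assumption for illustration: the `TSD` placeholder type, the fixed-size `Pool`, the round-robin `getOrAssignTSD` helper, and the use of C++ `thread_local` storage rather than any particular platform API.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical placeholder for the per-thread allocator state.
struct TSD {
  unsigned Unused = 0;
};

constexpr unsigned NumberOfTSDs = 4;        // assumed pool size
static TSD Pool[NumberOfTSDs];              // shared pool of TSDs
static std::atomic<unsigned> NextIndex{0};  // round-robin assignment counter

// Assumption: per-thread pointer kept in thread_local storage.
static thread_local TSD *CurrentTSD = nullptr;

TSD *getCurrentTSD() { return CurrentTSD; }
void setCurrentTSD(TSD *T) { CurrentTSD = T; }

// Returns the calling thread's TSD, assigning one from the pool on first use.
TSD *getOrAssignTSD() {
  TSD *T = getCurrentTSD();
  if (!T) {
    const unsigned I =
        NextIndex.fetch_add(1, std::memory_order_relaxed) % NumberOfTSDs;
    T = &Pool[I];
    setCurrentTSD(T);
  }
  return T;
}
```

  Hiding the storage behind two small accessors keeps the shared-model code independent of how a platform stores the per-thread pointer.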
* [scudo] Scudo thread specific data refactor, part 3 | Kostya Kortchinsky | 2017-09-26 | 1 | -0/+71
  Summary:
  Previous parts: D38139, D38183.

  In this part of the refactor, we abstract the Linux vs Android TSD dissociation in favor of an Exclusive vs Shared one, allowing for easier platform introduction and configuration.

  Most of this change consists of shuffling the files around to reflect the new organization.

  We introduce `scudo_platform.h` where platform specific definitions lie. This involves the TSD model and the platform specific allocator parameters. In an upcoming CL, those will be configurable via defines, but we currently stick with conservative defaults.

  Reviewers: alekseyshl, dvyukov
  Reviewed By: alekseyshl, dvyukov
  Subscribers: srhines, llvm-commits, mgorny
  Differential Revision: https://reviews.llvm.org/D38244
  llvm-svn: 314224