summaryrefslogtreecommitdiffstats
path: root/compiler-rt/lib/xray/tests/unit
Commit message (Collapse)AuthorAgeFilesLines
* Re-land "[compiler-rt] Migrate llvm::make_unique to std::make_unique"Jonas Devlieghere2019-08-151-13/+13
| | | | | | | With the compiler-rt check for C++14 updated in r368960, this should now be fine to land. llvm-svn: 369009
* Revert "[compiler-rt] Migrate llvm::make_unique to std::make_unique"Jonas Devlieghere2019-08-151-13/+13
| | | | | | | | | | The X-ray unit tests in compiler-rt are overriding the C++ version by explicitly passing -std=c++11 in the compiler invocation. This poses a problem as these tests are including LLVM headers that can now use C++14 features. I'm temporarily reverting this as I investigate the correct solution. llvm-svn: 368952
* [compiler-rt] Migrate llvm::make_unique to std::make_uniqueJonas Devlieghere2019-08-151-13/+13
| | | | | | | | | | Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. Differential revision: https://reviews.llvm.org/D66259 llvm-svn: 368946
* compiler-rt: Rename .cc file in lib/xray/tests/unit to .cppNico Weber2019-08-0110-17/+19
| | | | | | Like r367463, but for xray/texts/unit. llvm-svn: 367550
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-199-36/+27
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* [xray] [tests] Detect and handle missing LLVMTestingSupport gracefullyMichal Gorny2018-12-211-3/+9
| | | | | | | | | | | | | | | | | | Add a code to properly test for presence of LLVMTestingSupport library when performing a stand-alone build, and skip tests requiring it when it is not present. Since the library is not installed, llvm-config reported empty --libs for it and the tests failed to link with undefined references. Skipping the two fdr_* test files is better than failing to build, and should be good enough until we find a better solution. NB: both installing LLVMTestingSupport and building it automatically from within compiler-rt sources are non-trivial. The former due to dependency on gtest, the latter due to tight integration with LLVM source tree. Differential Revision: https://reviews.llvm.org/D55891 llvm-svn: 349899
* [XRay] Use preallocated memory for XRay profilingDean Michael Berris2018-12-073-12/+97
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change builds upon D54989, which removes memory allocation from the critical path of the profiling implementation. This also changes the API for the profile collection service, to take ownership of the memory and associated data structures per-thread. The consolidation of the memory allocation allows us to do two things: - Limits the amount of memory used by the profiling implementation, associating preallocated buffers instead of allocating memory on-demand. - Consolidate the memory initialisation and cleanup by relying on the buffer queue's reference counting implementation. We find a number of places which also display some problematic behaviour, including: - Off-by-factor bug in the allocator implementation. - Unrolling semantics in cases of "memory exhausted" situations, when managing the state of the function call trie. We also add a few test cases which verify our understanding of the behaviour of the system, with important edge-cases (especially for memory-exhausted cases) in the segmented array and profile collector unit tests. Depends on D54989. Reviewers: mboerger Subscribers: dschuff, mgorny, dmgreen, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D55249 llvm-svn: 348568
* Re-land "[XRay] Move-only Allocator, FunctionCallTrie, and Array"Dean Michael Berris2018-12-072-0/+116
| | | | | | | | | | | This reverts commit r348455, with some additional changes: - Work-around deficiency of gcc-4.8 by duplicating the implementation of `AppendEmplace` in `Append`, but instead of using brace-init for the copy construction, use a placement new explicitly calling the copy constructor. llvm-svn: 348563
* Revert "[XRay] Move-only Allocator, FunctionCallTrie, and Array"Dean Michael Berris2018-12-062-116/+0
| | | | | | | This reverts commits r348438, r348445, and r348449 due to breakages with gcc-4.8 builds. llvm-svn: 348455
* Re-land r348335 "[XRay] Move-only Allocator, FunctionCallTrie, and Array"Dean Michael Berris2018-12-062-0/+116
| | | | | | | | | | | | Continuation of D54989. Additional changes: - Use `.AppendEmplace(...)` instead of `.Append(Type{...})` to appease GCC 4.8 with confusion on when an initializer_list is used as opposed to a temporary aggregate initialized object. llvm-svn: 348438
* Revert r348335 "[XRay] Move-only Allocator, FunctionCallTrie, and Array"Hans Wennborg2018-12-052-116/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | .. and also the follow-ups r348336 r348338. It broke stand-alone compiler-rt builds with GCC 4.8: In file included from /work/llvm/projects/compiler-rt/lib/xray/xray_function_call_trie.h:20:0,                  from /work/llvm/projects/compiler-rt/lib/xray/xray_profile_collector.h:21,                  from /work/llvm/projects/compiler-rt/lib/xray/xray_profile_collector.cc:15: /work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h: In instantiation of ‘T* __xray::Array<T>::AppendEmplace(Args&& ...) [with Args = {const __xray::FunctionCallTrie::mergeInto(__xray::FunctionCallTrie&) const::NodeAndTarget&}; T = __xray::FunctionCallTrie::mergeInto(__xray::FunctionCallTrie&) const::NodeAndTarget]’: /work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h:383:71:   required from ‘T* __xray::Array<T>::Append(const T&) [with T = __xray::FunctionCallTrie::mergeInto(__xray::FunctionCallTrie&) const::NodeAndTarget]’ /work/llvm/projects/compiler-rt/lib/xray/xray_function_call_trie.h:517:54:   required from here /work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h:378:5: error: could not convert ‘{std::forward<const __xray::FunctionCallTrie::mergeInto(__xray::FunctionCallTrie&) const::NodeAndTarget&>((* & args#0))}’ from ‘<brace-enclosed initializer list>’ to ‘__xray::FunctionCallTrie::mergeInto(__xray::FunctionCallTrie&) const::NodeAndTarget’      new (AlignedOffset) T{std::forward<Args>(args)...};      ^ /work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h: In instantiation of ‘T* __xray::Array<T>::AppendEmplace(Args&& ...) [with Args = {const __xray::profileCollectorService::{anonymous}::ThreadTrie&}; T = __xray::profileCollectorService::{anonymous}::ThreadTrie]’: /work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h:383:71:   required from ‘T* __xray::Array<T>::Append(const T&) [with T = __xray::profileCollectorService::{anonymous}::ThreadTrie]’ /work/llvm/projects/compiler-rt/lib/xray/xray_profile_collector.cc:98:34:   required from here /work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h:378:5: error: could not convert ‘{std::forward<const __xray::profileCollectorService::{anonymous}::ThreadTrie&>((* & args#0))}’ from ‘<brace-enclosed initializer list>’ to ‘__xray::profileCollectorService::{anonymous}::ThreadTrie’ /work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h: In instantiation of ‘T* __xray::Array<T>::AppendEmplace(Args&& ...) [with Args = {const __xray::profileCollectorService::{anonymous}::ProfileBuffer&}; T = __xray::profileCollectorService::{anonymous}::ProfileBuffer]’: /work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h:383:71:   required from ‘T* __xray::Array<T>::Append(const T&) [with T = __xray::profileCollectorService::{anonymous}::ProfileBuffer] ’ /work/llvm/projects/compiler-rt/lib/xray/xray_profile_collector.cc:244:44:   required from here /work/llvm/projects/compiler-rt/lib/xray/xray_segmented_array.h:378:5: error: could not convert ‘{std::forward<const __xray::profileCollectorService::{anonymous}::ProfileBuffer&>((* & args#0))}’ from ‘<brace-enclosed initializer list>’ to ‘__xray::profileCollectorService::{anonymous}::ProfileBuffer’ > Summary: > This change makes the allocator and function call trie implementations > move-aware and remove the FunctionCallTrie's reliance on a > heap-allocated set of allocators. > > The change makes it possible to always have storage associated with > Allocator instances, not necessarily having heap-allocated memory > obtainable from these allocator instances. We also use thread-local > uninitialised storage. > > We've also re-worked the segmented array implementation to have more > precondition and post-condition checks when built in debug mode. This > enables us to better implement some of the operations with surrounding > documentation as well. The `trim` algorithm now has more documentation > on the implementation, reducing the requirement to handle special > conditions, and being more rigorous on the computations involved. > > In this change we also introduce an initialisation guard, through which > we prevent an initialisation operation from racing with a cleanup > operation. > > We also ensure that the ThreadTries array is not destroyed while copies > into the elements are still being performed by other threads submitting > profiles. > > Note that this change still has an issue with accessing thread-local > storage from signal handlers that are instrumented with XRay. We also > learn that with the testing of this patch, that there will be cases > where calls to mmap(...) (through internal_mmap(...)) might be called in > signal handlers, but are not async-signal-safe. Subsequent patches will > address this, by re-using the `BufferQueue` type used in the FDR mode > implementation for pre-allocated memory segments per active, tracing > thread. > > We still want to land this change despite the known issues, with fixes > forthcoming. > > Reviewers: mboerger, jfb > > Subscribers: jfb, llvm-commits > > Differential Revision: https://reviews.llvm.org/D54989 llvm-svn: 348346
* [XRay] Move-only Allocator, FunctionCallTrie, and ArrayDean Michael Berris2018-12-052-0/+116
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change makes the allocator and function call trie implementations move-aware and remove the FunctionCallTrie's reliance on a heap-allocated set of allocators. The change makes it possible to always have storage associated with Allocator instances, not necessarily having heap-allocated memory obtainable from these allocator instances. We also use thread-local uninitialised storage. We've also re-worked the segmented array implementation to have more precondition and post-condition checks when built in debug mode. This enables us to better implement some of the operations with surrounding documentation as well. The `trim` algorithm now has more documentation on the implementation, reducing the requirement to handle special conditions, and being more rigorous on the computations involved. In this change we also introduce an initialisation guard, through which we prevent an initialisation operation from racing with a cleanup operation. We also ensure that the ThreadTries array is not destroyed while copies into the elements are still being performed by other threads submitting profiles. Note that this change still has an issue with accessing thread-local storage from signal handlers that are instrumented with XRay. We also learn that with the testing of this patch, that there will be cases where calls to mmap(...) (through internal_mmap(...)) might be called in signal handlers, but are not async-signal-safe. Subsequent patches will address this, by re-using the `BufferQueue` type used in the FDR mode implementation for pre-allocated memory segments per active, tracing thread. We still want to land this change despite the known issues, with fixes forthcoming. Reviewers: mboerger, jfb Subscribers: jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D54989 llvm-svn: 348335
* [XRay] Add a test for allocator exhaustionDean Michael Berris2018-11-201-1/+19
| | | | | | | | Use a more representative test of allocating small chunks for oddly-sized (small) objects from an allocator that has a page's worth of memory. llvm-svn: 347286
* [XRay] Move buffer extents back to the heapDean Michael Berris2018-11-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change addresses an issue which shows up with the synchronised race between threads writing into a buffer, and another thread reading the buffer. In a lot of cases, we cannot guarantee that threads will always see the signal to finalise their buffers in time despite the grace periods and state machine maintained through atomic variables. This change addresses it by ensuring that the same instance being updated to indicate how much of the buffer is "used" by the writing thread is the same instance being read by the thread processing the buffer to be written out to disk or handled through the iterators. To do this, we ensure that all the "extents" instances live in their own the backing store, in a different contiguous page from the buffer-specific backing store. We also take precautions to ensure that the atomic variables are cache-line-sized to prevent false-sharing from unnecessarily causing cache contention on unrelated writes/reads. It's feasible that we may in the future be able to move the storage of the extents objects into the single backing store, slightly changing the way to compute the size(s) of the buffers, but in the meantime we'll settle for the isolation afforded by having a different backing store for the extents instances. Reviewers: mboerger Subscribers: jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D54684 llvm-svn: 347280
* [XRay] Add a test for function id encoding/decoding (NFC)Dean Michael Berris2018-11-091-0/+26
| | | | | | Increase test coverage for function enter/exit encoding/decoding. llvm-svn: 346477
* [XRay] Fix enter function tracing for record unwritingDean Michael Berris2018-11-091-0/+44
| | | | | | | | | | | | | | | | Summary: Before this change, we could run into a situation where we may try to undo tail exit records after writing metadata records before a function enter event. This change rectifies that by resetting the tail exit counter after writing the metadata records. Reviewers: mboerger Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54292 llvm-svn: 346475
* [XRay] Improve FDR trace handling and error messagingDean Michael Berris2018-11-091-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change covers a number of things spanning LLVM and compiler-rt, which are related in a non-trivial way. In LLVM, we have a library that handles the FDR mode even log loading, which uses C++'s runtime polymorphism feature to better faithfully represent the events that are written down by the FDR mode runtime. We do this by interpreting a trace that's serliased in a common format agreed upon by both the trace loading library and the FDR mode runtime. This library is under active development, which consists of features allowing us to reconstitute a higher-level event log. This event log is used by the conversion and visualisation tools we have for interpreting XRay traces. One of the tools we have is a diagnostic tool in llvm-xray called `fdr-dump` which we've been using to debug our expectations of what the FDR runtime should be writing and what the logical FDR event log structures are. We use this fairly extensively to reason about why some non-trivial traces we're generating with FDR mode runtimes fail to convert or fail to parse correctly. One of these failures we've found in manual debugging of some of the traces we've seen involve an inconsistency between the buffer extents (a record indicating how many bytes to follow are part of a logical thread's event log) and the record of the bytes written into the log -- sometimes it turns out the data could be garbage, due to buffers being recycled, but sometimes we're seeing the buffer extent indicating a log is "shorter" than the actual records associated with the buffer. This case happens particularly with function entry records with a call argument. This change for now updates the FDR mode runtime to write the bytes for the function call and arg record before updating the buffer extents atomically, allowing multiple threads to see a consistent view of the data in the buffer using the atomic counter associated with a buffer. What we're trying to prevent here is partial updates where we see the intermediary updates to the buffer extents (function record size then call argument record size) becoming observable from another thread, for instance, one doing the serialization/flushing. To do both diagnose this issue properly, we need to be able to honour the extents being set in the `BufferExtents` records marking the beginning of the logical buffers when reading an FDR trace. Since LLVM doesn't use C++'s RTTI mechanism, we instead follow the advice in the documentation for LLVM Style RTTI (https://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html). We then rely on this RTTI feature to ensure that our file-based record producer (our streaming "deserializer") can honour the extents of individual buffers as we interpret traces. This also sets us up to be able to eventually do smart skipping/continuation of FDR logs, seeking instead to find BufferExtents records in cases where we find potentially recoverable errors. In the meantime, we make this change to operate in a strict mode when reading logical buffers with extent records. Reviewers: mboerger Subscribers: hiraditya, llvm-commits, jfb Differential Revision: https://reviews.llvm.org/D54201 llvm-svn: 346473
* [XRay] Update XRayRecord to support Custom/Typed EventsDean Michael Berris2018-11-062-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change cuts across LLVM and compiler-rt to add support for rendering custom events in the XRayRecord type, to allow for including user-provided annotations in the output YAML (as raw bytes). This work enables us to add custom event and typed event records into the `llvm::xray::Trace` type for user-provided events. This can then be programmatically handled through the C++ API and can be included in some of the tooling as well. For now we support printing the raw data we encounter in the custom events in the converted output. Future work will allow us to start interpreting these custom and typed events through a yet-to-be-defined API for extending the trace analysis library. Reviewers: mboerger Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D54139 llvm-svn: 346214
* [XRay] Update TSC math to handle wraparoundDean Michael Berris2018-11-052-63/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Prior to this change, we can run into situations where the TSC we're getting when exiting a function is less than the TSC we got when entering it. This would sometimes cause the counter for cumulative call times overflow, which was erroneously also being stored as a signed 64-bit integer. This change addresses both these issues while adding provisions for tracking CPU migrations. We do this because moving from one CPU to another doesn't guarantee that the timestamp counter for some architectures aren't guaranteed to be synchronised. For the moment, we leave the provisions there until we can update the data format to include the counting of CPU migrations we can catch. We update the necessary tests as well, ensuring that our expectations for the cycle accounting to be met in case of counter wraparound. Reviewers: mboerger Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54088 llvm-svn: 346116
* [XRay] Update delta computations in runtimeDean Michael Berris2018-11-021-0/+24
| | | | | | | | | | | | | | | | | | Summary: Fix some issues discovered from mostly manual inspection of outputs from the `llvm-xray fdr-dump` tool. It turns out we haven't been writing the deltas properly, and have been writing down zeros for deltas of some records. This change fixes this oversight born by the recent refactoring. Reviewers: mboerger Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D54022 llvm-svn: 345954
* [XRay] Fix TSC and atomic custom/typed event accountingDean Michael Berris2018-11-013-2/+104
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is a follow-on change to D53858 which turns out to have had a TSC accounting bug when writing out function exit records in FDR mode. This change adds a number of tests to ensure that: - We are handling the delta between the exit TSC and the last TSC we've seen. - We are writing the custom event and typed event records as a single update to the buffer extents. - We are able to catch boundary conditions when loading FDR logs. We introduce a TSC matcher to the test helpers, which we use in the testing/verification of the TSC accounting change. Reviewers: mboerger Subscribers: mgorny, hiraditya, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D53967 llvm-svn: 345905
* [XRay] Use more portable control blockDean Michael Berris2018-10-291-2/+11
| | | | | | | | | | | | | | | | | | | | Summary: In D53560, we assumed a specific layout for memory without using an explicit structure. This follow-up change uses more portable layout control by using unions in a struct, and consolidating the memory management code in the buffer queue. We also take the opportunity to improve the documentation on the types and operations, along with simplifying some of the logic in the buffer queue implementation. Reviewers: mboerger, eizan Subscribers: jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D53802 llvm-svn: 345485
* [XRay] Support generational buffers in FDR controllerDean Michael Berris2018-10-271-4/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is an intermediary step in the full support for generational buffer management in the FDR runtime. This change makes the FDR controller aware of the new generation number in the buffers handed out by the BufferQueue type. In the process of making this change, we've realised that the cleanest way of ensuring that the backing store per generation is live while all the threads that need access to it will need reference counting to tie the backing store to the lifetime of all threads that have a handle on buffers associated with the memory. We also learn that we're missing the edge-case in the function exit handler's implementation where the first record being written into the buffer is a function exit, which is caught/fixed by the test for generational buffer management. We still haven't wired the controller into the FDR mode runtime, which will need the reference counting on the backing store implemented to ensure that we're being conservatively thread-safe with this approach. Depends on D52974. Reviewers: mboerger, eizan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53551 llvm-svn: 345445
* [XRay][compiler-rt] Generational Buffer ManagementDean Michael Berris2018-10-221-2/+114
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This change updates the buffer queue implementation to support using a generation number to identify the lifetime of buffers. This first part introduces the notion of the generation number, without changing the way we handle the buffers yet. What's missing here is the cleanup of the buffers. Ideally we'll keep the two most recent generations. We need to ensure that before we do any writes to the buffers, that we check the generation number(s) first. Those changes will follow-on from this change. Depends on D52588. Reviewers: mboerger, eizan Subscribers: llvm-commits, jfb Differential Revision: https://reviews.llvm.org/D52974 llvm-svn: 344881
* [XRay] Handle allocator exhaustion in segmented arrayDean Michael Berris2018-10-222-1/+40
| | | | | | | | | | | | | | | | | Summary: This change allows us to handle allocator exhaustion properly in the segmented array implementation. Before this change, we relied on the caller of the `trim` function to provide a valid number of elements to trim. This change allows us to do the right thing in case the elements to trim is greater than the size of the container. Reviewers: mboerger, eizan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53484 llvm-svn: 344880
* Revert commit r344670 as the test fails on a bot ↵Douglas Yung2018-10-191-114/+2
| | | | | | http://lab.llvm.org:8011/builders/clang-cmake-armv7-full/builds/2683/. llvm-svn: 344771
* [XRay][compiler-rt] Generational Buffer ManagementDean Michael Berris2018-10-171-2/+114
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This change updates the buffer queue implementation to support using a generation number to identify the lifetime of buffers. This first part introduces the notion of the generation number, without changing the way we handle the buffers yet. What's missing here is the cleanup of the buffers. Ideally we'll keep the two most recent generations. We need to ensure that before we do any writes to the buffers, that we check the generation number(s) first. Those changes will follow-on from this change. Depends on D52588. Reviewers: mboerger, eizan Subscribers: llvm-commits, jfb Differential Revision: https://reviews.llvm.org/D52974 llvm-svn: 344670
* [XRay][compiler-rt] FDR Mode ControllerDean Michael Berris2018-10-155-29/+441
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change implements a controller for abstracting away the details of what happens when tracing with FDR mode. This controller type allows us to test in isolation the various cases where we're encountering function entry, exit, and other kinds of events we are handling when FDR mode is enabled. This change introduces a number of testing facilities we've needed to better support expressing the conditions we need for the unit tests. We leave some TODOs for moving those utilities into the LLVM project, sitting in the `Testing` library, to make matching conditions on XRay `Trace` instances through googlemock more manageable and declarative. We don't wire in the controller right away, to allow us to incrementally update the implementation(s) as we increase testing coverage of the controller type. There's a need to re-think the way we're managing buffers in a multi-threaded environment, which is more invasive than this implementation. This step in the process allows us to encode our assumptions in the implementation of the controller, and then evolve the buffer queue implementation to support generational buffer management to ensure we can continue to support the cases we're already supporting with the controller. Reviewers: mboerger, eizan Subscribers: mgorny, llvm-commits, jfb Differential Revision: https://reviews.llvm.org/D52588 llvm-svn: 344488
* [XRay] Clean up XRay build configurationDean Michael Berris2018-09-241-12/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change spans both LLVM and compiler-rt, where we do the following: - Add XRay to the LLVMBuild system, to allow for distributing the XRay trace loading library along with the LLVM distributions. - Use `llvm-config` better in the compiler-rt XRay implementation, to depend on the potentially already-distributed LLVM XRay library. While this is tested with the standalone compiler-rt build, it does require that the LLVMXRay library (and LLVMSupport as well) are available during the build. In case the static libraries are available, the unit tests will build and work fine. We're still having issues with attempting to use a shared library version of the LLVMXRay library since the shared library might not be accessible from the standard shared library lookup paths. The larger change here is the inclusion of the LLVMXRay library in the distribution, which allows for building tools around the XRay traces and profiles that the XRay runtime already generates. Reviewers: echristo, beanz Subscribers: mgorny, hiraditya, mboerger, llvm-commits Differential Revision: https://reviews.llvm.org/D52349 llvm-svn: 342859
* [XRay][compiler-rt] FDRLogWriter AbstractionDean Michael Berris2018-09-202-0/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change introduces an `FDRLogWriter` type which is responsible for serialising metadata and function records to character buffers. This is the first step in a refactoring of the implementation of the FDR runtime to allow for more granular testing of the individual components of the implementation. The main contribution of this change is a means of hiding the details of how specific records are written to a buffer, and for managing the extents of these buffers. We make use of C++ features (templates and some metaprogramming) to reduce repetition in the act of writing out specific kinds of records to the buffer. In this process, we make a number of changes across both LLVM and compiler-rt to allow us to use the `Trace` abstraction defined in the LLVM project in the testing of the runtime implementation. This gives us a closer end-to-end test which version-locks the runtime implementation with the loading implementation in LLVM. We also allow using gmock in compiler-rt unit tests, by adding the requisite definitions in the `AddCompilerRT.cmake` module. We also add the terminfo library detection along with inclusion of the appropriate compiler flags for header include lookup. Finally, we've gone ahead and updated the FDR logging implementation to use the FDRLogWriter for the lowest-level record-writing details. Following patches will isolate the state machine transitions which manage the set-up and tear-down of the buffers we're using in multiple threads. Reviewers: mboerger, eizan Subscribers: mgorny, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D52220 llvm-svn: 342617
* Revert "[XRay][compiler-rt] FDRLogWriter Abstraction" and 1 more.Evgeniy Stepanov2018-09-192-96/+0
| | | | | | | | Revert the following 2 commits to fix standalone compiler-rt build: * r342523 [XRay] Detect terminfo library * r342518 [XRay][compiler-rt] FDRLogWriter Abstraction llvm-svn: 342596
* [XRay][compiler-rt] FDRLogWriter AbstractionDean Michael Berris2018-09-182-0/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change introduces an `FDRLogWriter` type which is responsible for serialising metadata and function records to character buffers. This is the first step in a refactoring of the implementation of the FDR runtime to allow for more granular testing of the individual components of the implementation. The main contribution of this change is a means of hiding the details of how specific records are written to a buffer, and for managing the extents of these buffers. We make use of C++ features (templates and some metaprogramming) to reduce repetition in the act of writing out specific kinds of records to the buffer. In this process, we make a number of changes across both LLVM and compiler-rt to allow us to use the `Trace` abstraction defined in the LLVM project in the testing of the runtime implementation. This gives us a closer end-to-end test which version-locks the runtime implementation with the loading implementation in LLVM. We also allow using gmock in compiler-rt unit tests, by adding the requisite definitions in the `AddCompilerRT.cmake` module. Finally, we've gone ahead and updated the FDR logging implementation to use the FDRLogWriter for the lowest-level record-writing details. Following patches will isolate the state machine transitions which manage the set-up and tear-down of the buffers we're using in multiple threads. Reviewers: mboerger, eizan Subscribers: mgorny, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D52220 llvm-svn: 342518
* [XRay] Remove the deprecated __xray_log_init APIPetr Hosek2018-09-152-204/+0
| | | | | | | | | This API has been deprecated three months ago and shouldn't be used anymore, all clients should migrate to the new string based API. Differential Revision: https://reviews.llvm.org/D51606 llvm-svn: 342318
* [XRay][compiler-rt] Update test to use similar structureDean Michael Berris2018-07-311-3/+3
| | | | | | This is a follow-up to D50037. llvm-svn: 338349
* [XRay][compiler-rt] Profiling Mode: Include file header in buffersDean Michael Berris2018-07-311-2/+34
| | | | | | | | | | | | | | | | | Summary: This change provides access to the file header even in the in-memory buffer processing. This allows in-memory processing of the buffers to also check the version, and the format, of the profile data. Reviewers: eizan, kpw Reviewed By: eizan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50037 llvm-svn: 338347
* [XRay][compiler-rt] Segmented Array: Simplify and OptimiseDean Michael Berris2018-07-181-31/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is a follow-on to D49217 which simplifies and optimises the implementation of the segmented array. In this patch we co-locate the book-keeping for segments in the `__xray::Array<T>` with the data it's managing. We take the chance in this patch to actually rename `Chunk` to `Segment` to better align with the high-level description of the segmented array. With measurements using benchmarks landed in D48879, we've identified that calls to `pthread_getspecific` started dominating the cycles, which led us to revert the change made in D49217 to use C++ thread_local initialisation instead (it reduces the cost by a huge margin, since we save one PLT-based call to pthread functions in the hot path). In particular, this is in `__xray::getThreadLocalData()`. We also took the opportunity to remove the least-common-multiple based calculation and instead pack as much data into segments of the array. This greatly simplifies the API of the container which hides as much of the implementation details as possible. For instance, we calculate the number of elements we need for the each segment internally in the Array instead of making it part of the type. With the changes here, we're able to get a measurable improvement on the performance of profiling mode on top of what D48879 already provides. Depends on D48879. Reviewers: kpw, eizan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49363 llvm-svn: 337343
* [XRay][compiler-rt] Simplify Allocator ImplementationDean Michael Berris2018-07-183-26/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change simplifies the XRay Allocator implementation to self-manage an mmap'ed memory segment instead of using the internal allocator implementation in sanitizer_common. We've found through benchmarks and profiling these benchmarks in D48879 that using the internal allocator in sanitizer_common introduces a bottleneck on allocating memory through a central spinlock. This change allows thread-local allocators to eliminate contention on the centralized allocator. To get the most benefit from this approach, we also use a managed allocator for the chunk elements used by the segmented array implementation. This gives us the chance to amortize the cost of allocating memory when creating these internal segmented array data structures. We also took the opportunity to remove the preallocation argument from the allocator API, simplifying the usage of the allocator throughout the profiling implementation. In this change we also tweak some of the flag values to reduce the amount of maximum memory we use/need for each thread, when requesting memory through mmap. Depends on D48956. Reviewers: kpw, eizan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49217 llvm-svn: 337342
* [XRay][compiler-rt] Add PID field to llvm-xray tool and add PID metadata ↵Dean Michael Berris2018-07-131-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | record entry in FDR mode Summary: llvm-xray changes: - account-mode - process-id {...} shows after thread-id - convert-mode - process {...} shows after thread - parses FDR and basic mode pid entries - Checks version number for FDR log parsing. Basic logging changes: - Update header version from 2 -> 3 FDR logging changes: - Update header version from 2 -> 3 - in writeBufferPreamble, there is an additional PID Metadata record (after thread id record and tsc record) Test cases changes: - fdr-mode.cc, fdr-single-thread.cc, fdr-thread-order.cc modified to catch process id output in the log. Reviewers: dberris Reviewed By: dberris Subscribers: hiraditya, llvm-commits, #sanitizers Differential Revision: https://reviews.llvm.org/D49153 llvm-svn: 336974
* [XRay][compiler-rt] xray::Array Freelist and Iterator UpdatesDean Michael Berris2018-07-102-20/+102
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: We found a bug while working on a benchmark for the profiling mode which manifests as a segmentation fault in the profiling handler's implementation. This change adds unit tests which replicate the issues in isolation. We've tracked this down as a bug in the implementation of the Freelist in the `xray::Array` type. This happens when we trim the array by a number of elements, where we've been incorrectly assigning pointers for the links in the freelist of chunk nodes. We've taken the chance to add more debug-only assertions to the code path and allow us to verify these assumptions in debug builds. In the process, we also took the opportunity to use iterators to implement both `front()` and `back()` which exposes a bug in the iterator decrement operation. In particular, when we decrement past a chunk size boundary, we end up moving too far back and reaching the `SentinelChunk` prematurely. This change unblocks us to allow for contributing the non-crashing version of the benchmarks in the test-suite as well. Reviewers: kpw Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D48653 llvm-svn: 336644
* [XRay][profiler] Part 4: Profiler Mode WiringDean Michael Berris2018-06-122-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is part of the larger XRay Profiling Mode effort. This patch implements the wiring required to enable us to actually select the `xray-profiling` mode, and install the handlers to start measuring the time and frequency of the function calls in call stacks. The current way to get the profile information is by working with the XRay API to `__xray_process_buffers(...)`. In subsequent changes we'll implement profile saving to files, similar to how the FDR and basic modes operate, as well as means for converting this format into those that can be loaded/visualised as flame graphs. We will also be extending the accounting tool in LLVM to support stack-based function call accounting. We also continue with the implementation to support building small histograms of latencies for the `FunctionCallTrie::Node` type, to allow us to actually approximate the distribution of latencies per function. Depends on D45758 and D46998. Reviewers: eizan, kpw, pelikan Reviewed By: kpw Subscribers: llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D44620 llvm-svn: 334469
* [XRay][compiler-rt] Remove __sanitizer:: from namespace __xray (NFC)Dean Michael Berris2018-06-052-2/+2
| | | | | | | This is a non-functional change that removes the full qualification of functions in __sanitizer:: being used in __xray. llvm-svn: 333983
* [XRay] Fixup: Explicitly call std::make_tuple(...)Dean Michael Berris2018-05-311-2/+2
| | | | | | Follow-up to D45758. llvm-svn: 333627
* [XRay][profiler] Part 3: Profile Collector ServiceDean Michael Berris2018-05-312-0/+181
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is part of the larger XRay Profiling Mode effort. This patch implements a centralised collector for `FunctionCallTrie` instances, associated per thread. It maintains a global set of trie instances which can be retrieved through the XRay API for processing in-memory buffers (when registered). Future changes will include the wiring to implement the actual profiling mode implementation. This central service provides the following functionality: * Posting a `FunctionCallTrie` associated with a thread, to the central list of tries. * Serializing all the posted `FunctionCallTrie` instances into in-memory buffers. * Resetting the global state of the serialized buffers and tries. Depends on D45757. Reviewers: echristo, pelikan, kpw Reviewed By: kpw Subscribers: llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D45758 llvm-svn: 333624
* [XRay][profiler] Part 2: XRay Function Call TrieDean Michael Berris2018-05-152-0/+255
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is part of the larger XRay Profiling Mode effort. This patch implements a central data structure for capturing statistics about XRay instrumented function call stacks. The `FunctionCallTrie` type does the following things: * It keeps track of a shadow function call stack of XRay instrumented functions as they are entered (function enter event) and as they are exited (function exit event). * When a function is entered, the shadow stack contains information about the entry TSC, and updates the trie (or prefix tree) representing the current function call stack. If we haven't encountered this function call before, this creates a unique node for the function in this position on the stack. We update the list of callees of the parent function as well to reflect this newly found path. * When a function is exited, we compute statistics (TSC deltas, function call count frequency) for the associated function(s) up the stack as we unwind to find the matching entry event. This builds upon the XRay `Allocator` and `Array` types in Part 1 of this series of patches. Depends on D45756. Reviewers: echristo, pelikan, kpw Reviewed By: kpw Subscribers: llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D45757 llvm-svn: 332313
* [XRay][profiler] Part 1: XRay Allocator and Array ImplementationsDean Michael Berris2018-04-293-0/+185
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change is part of the larger XRay Profiling Mode effort. Here we implement an arena allocator, for fixed sized buffers used in a segmented array implementation. This change adds the segmented array data structure, which relies on the allocator to provide and maintain the storage for the segmented array. Key features of the `Allocator` type: * It uses cache-aligned blocks, intended to host the actual data. These blocks are cache-line-size multiples of contiguous bytes. * The `Allocator` has a maximum memory budget, set at construction time. This allows us to cap the amount of data each specific `Allocator` instance is responsible for. * Upon destruction, the `Allocator` will clean up the storage it's used, handing it back to the internal allocator used in sanitizer_common. Key features of the `Array` type: * Each segmented array is always backed by an `Allocator`, which is either user-provided or uses a global allocator. * When an `Array` grows, it grows by appending a segment that's fixed-sized. The size of each segment is computed by the number of elements of type `T` that can fit into cache line multiples. * An `Array` does not return memory to the `Allocator`, but it can keep track of the current number of "live" objects it stores. * When an `Array` is destroyed, it will not return memory to the `Allocator`. Users should clean up the `Allocator` independently of the `Array`. * The `Array` type keeps a freelist of the chunks it's used before, so that trimming and growing will re-use previously allocated chunks. These basic data structures are used by the XRay Profiling Mode implementation to implement efficient and cache-aware storage for data that's typically read-and-write heavy for tracking latency information. We're relying on the cache line characteristics of the architecture to provide us good data isolation and cache friendliness, when we're performing operations like searching for elements and/or updating data hosted in these cache lines. Reviewers: echristo, pelikan, kpw Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D45756 llvm-svn: 331141
* Add Xray instrumentation support to FreeBSDKamil Rytarowski2018-02-151-5/+6
| | | | | | | | | | | | | | | | | | | Summary: - Enabling the build. - Using assembly for the cpuid parts. - Using thr_self FreeBSD call to get the thread id Patch by: David CARLIER Reviewers: dberris, rnk, krytarowski Reviewed By: dberris, krytarowski Subscribers: emaste, stevecheckoway, nglevin, srhines, kubamracek, dberris, mgorny, krytarowski, llvm-commits, #sanitizers Differential Revision: https://reviews.llvm.org/D43278 llvm-svn: 325240
* [XRay] Rename Buffer.Buffer to Buffer.DataDean Michael Berris2018-02-101-4/+4
| | | | | | | | | | | | | | | | | | Summary: some compiler (msvc) treats Buffer.Buffer as constructor and refuse to compile. NFC Authored by comicfans44. Reviewers: rnk, dberris Reviewed By: dberris Subscribers: llvm-commits Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D40346 llvm-svn: 324807
* [XRay] Add missing include to unit testJonas Hahnfeld2017-12-271-0/+1
| | | | | | | FDRLoggingTest::MultiThreadedCycling uses std::array so we need to include the right C++ header and not rely on transitive dependencies. llvm-svn: 321485
* [XRay] Use optimistic logging model for FDR modeDean Michael Berris2017-11-211-15/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Before this change, the FDR mode implementation relied on at thread-exit handling to return buffers back to the (global) buffer queue. This introduces issues with the initialisation of the thread_local objects which, even through the use of pthread_setspecific(...) may eventually call into an allocation function. Similar to previous changes in this line, we're finding that there is a huge potential for deadlocks when initialising these thread-locals when the memory allocation implementation is also xray-instrumented. In this change, we limit the call to pthread_setspecific(...) to provide a non-null value to associate to the key created with pthread_key_create(...). While this doesn't completely eliminate the potential for the deadlock(s), it does allow us to still clean up at thread exit when we need to. The change is that we don't need to do more work when starting and ending a thread's lifetime. We also have a test to make sure that we actually can safely recycle the buffers in case we end up re-using the buffer(s) available from the queue on multiple thread entry/exits. This change cuts across both LLVM and compiler-rt to allow us to update both the XRay runtime implementation as well as the library support for loading these new versions of the FDR mode logging. Version 2 of the FDR logging implementation makes the following changes: * Introduction of a new 'BufferExtents' metadata record that's outside of the buffer's contents but are written before the actual buffer. This data is associated to the Buffer handed out by the BufferQueue rather than a record that occupies bytes in the actual buffer. * Removal of the "end of buffer" records. This is in-line with the changes we described above, to allow for optimistic logging without explicit record writing at thread exit. The optimistic logging model operates under the following assumptions: * Threads writing to the buffers will potentially race with the thread attempting to flush the log. To avoid this situation from occuring, we make sure that when we've finalized the logging implementation, that threads will see this finalization state on the next write, and either choose to not write records the thread would have written or write the record(s) in two phases -- first write the record(s), then update the extents metadata. * We change the buffer queue implementation so that once it's handed out a buffer to a thread, that we assume that buffer is marked "used" to be able to capture partial writes. None of this will be safe to handle if threads are racing to write the extents records and the reader thread is attempting to flush the log. The optimism comes from the finalization routine being required to complete before we attempt to flush the log. This is a fairly significant semantics change for the FDR implementation. This is why we've decided to update the version number for FDR mode logs. The tools, however, still need to be able to support older versions of the log until we finally deprecate those earlier versions. Reviewers: dblaikie, pelikan, kpw Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D39526 llvm-svn: 318733
* [XRay][compiler-rt] Enable the XRay compiler-rt unit tests.Dean Michael Berris2017-08-312-4/+2
| | | | | | | | | | | | | | | | | | | Summary: Before this change we seemed to not be running the unit tests, and therefore we set out to run them. In the process of making this happen we found a divergence between the implementation and the tests. This includes changes to both the CMake files as well as the implementation and headers of the XRay runtime. We've also updated documentation on the changed functions. Reviewers: kpw, eizan Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D37290 llvm-svn: 312202
OpenPOWER on IntegriCloud