| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
LLVM uses .h as its extension for header files.
Differential Revision: https://reviews.llvm.org/D66509
llvm-svn: 369487
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch rewrites a few loops in deque and split_buffer to better
optimize the codegen. For constructors like
`deque<unsigned char> d(500000, 0);` this patch results in a 2x speedup.
The patch improves the codegen in roughly three ways:
1. Changes do { ... } while (...) loops into more typical for loops.
The optimizer can reason about normal looking loops better.
2. Split the iteration over a range into (A) iteration over the blocks,
then (B) iteration within the block. This nested structure helps LLVM
lower the inner loop to `memset`.
3. Do fewer things each iteration. Some of these loops were incrementing
or changing 4-5 variables every loop (in addition to the
construction). Previously most loops would increment the end pointer,
the size, and decrement the count of remaining items to construct.
Now we only increment a single pointer for most iterations.
llvm-svn: 368547
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The optimizer is petulant and temperamental. In this case LLVM failed to lower
the the "insert at end" loop used by`vector<unsigned char>` to a `memset` despite
`memset` being substantially faster over a range of bytes.
LLVM has the ability to lower loops to `memset` whet appropriate, but the
odd nature of libc++'s loops prevented the optimization from taking places.
This patch addresses the issue by rewriting the loops from the form
`do [ ... --__n; } while (__n > 0);` to instead use a for loop over a pointer
range (For example: `for (auto *__i = ...; __i < __e; ++__i)`).
This patch also rewrites the asan annotations to unposion all additional memory
at the start of the loop instead of once per iterations. This could potentially
permit false negatives where the constructor of element N attempts to access
element N + 1 during its construction.
The before and after results for the `BM_ConstructSize/vector_byte/5140480_mean`
benchmark (run 5 times) are:
--------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------------------------------------------------
Before
------
BM_ConstructSize/vector_byte/5140480_mean 12530140 ns 12469693 ns N/A
BM_ConstructSize/vector_byte/5140480_median 12512818 ns 12445571 ns N/A
BM_ConstructSize/vector_byte/5140480_stddev 106224 ns 107907 ns 5
-----
After
-----
BM_ConstructSize/vector_byte/5140480_mean 167285 ns 166500 ns N/A
BM_ConstructSize/vector_byte/5140480_median 166749 ns 166069 ns N/A
BM_ConstructSize/vector_byte/5140480_stddev 3242 ns 3184 ns 5
llvm-svn: 367183
|
|
|
|
| |
llvm-svn: 322812
|
|
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D25522
Patch written by Aditya Kumar.
llvm-svn: 284179
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I've put some work into the Google Benchmark library in order to make it easier
to benchmark libc++. These changes have already been upstreamed into
Google Benchmark and this patch applies the changes to the in-tree version.
The main improvement in the addition of a 'compare_bench.py' script which
makes it very easy to compare benchmarks. For example to compare the native
STL to libc++ you would run:
`$ compare_bench.py ./util_smartptr.native.out ./util_smartptr.libcxx.out`
And the output would look like:
RUNNING: ./util_smartptr.native.out
Benchmark Time CPU Iterations
----------------------------------------------------------------
BM_SharedPtrCreateDestroy 62 ns 62 ns 10937500
BM_SharedPtrIncDecRef 31 ns 31 ns 23972603
BM_WeakPtrIncDecRef 28 ns 28 ns 23648649
RUNNING: ./util_smartptr.libcxx.out
Benchmark Time CPU Iterations
----------------------------------------------------------------
BM_SharedPtrCreateDestroy 46 ns 46 ns 14957265
BM_SharedPtrIncDecRef 31 ns 31 ns 22435897
BM_WeakPtrIncDecRef 34 ns 34 ns 21084337
Comparing ./util_smartptr.native.out to ./util_smartptr.libcxx.out
Benchmark Time CPU
-----------------------------------------------------
BM_SharedPtrCreateDestroy -0.26 -0.26
BM_SharedPtrIncDecRef +0.00 +0.00
BM_WeakPtrIncDecRef +0.21 +0.21
llvm-svn: 278147
|
|
|
|
| |
llvm-svn: 276552
|
|
|
|
| |
llvm-svn: 276549
|
|
Summary:
This patch does the following:
1. Checks in a copy of the Google Benchmark library into the libc++ repo under `utils/google-benchmark`.
2. Teaches libc++ how to build Google Benchmark against both (A) in-tree libc++ and (B) the platforms native STL.
3. Allows performance benchmarks to be built as part of the libc++ build.
Building the benchmarks (and Google Benchmark) is off by default. It must be enabled using the CMake option `-DLIBCXX_INCLUDE_BENCHMARKS=ON`. When this option is enabled the tests under `libcxx/benchmarks` can be built using the `libcxx-benchmarks` target.
On Linux platforms where libstdc++ is the default STL the CMake option `-DLIBCXX_BUILD_BENCHMARKS_NATIVE_STDLIB=ON` can be used to build each benchmark test against libstdc++ as well. This is useful for comparing performance between standard libraries.
Support for benchmarks is currently very minimal. They must be manually run by the user and there is no mechanism for detecting performance regressions.
Known Issues:
* `-DLIBCXX_INCLUDE_BENCHMARKS=ON` is only supported for Clang, and not GCC, since the `-stdlib=libc++` option is needed to build Google Benchmark.
Reviewers: danalbert, dberlin, chandlerc, mclow.lists, jroelofs
Subscribers: chandlerc, dberlin, tberghammer, danalbert, srhines, hfinkel
Differential Revision: https://reviews.llvm.org/D22240
llvm-svn: 276049
|