|  | Commit message (Collapse) | Author | Age | Files | Lines | 
|---|
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | iteration.
Instead, load the byte at the needle length, compare it directly, and
save it to use in the lookup table of lengths we can skip forward.
I also added an annotation to expect that the comparison fails so that
the loop gets laid out contiguously without the call to memcpy (and the
substantial register shuffling that the ABI requires of that call).
Finally, because this behaves especially badly with a needle length of
one (by calling memcmp with a zero length) special case that to directly
call memchr, which is what we should have been doing anyways.
This was motivated by the fact that there are a large number of test
cases in 'check-llvm' where FileCheck's performance is dominated by
calls to StringRef::find (in a release, no-asserts build). I'm working
on patches to generally improve matters there, but this alone was worth
a 12.5% improvement in one test case where FileCheck spent 92% of its
time in this routine.
I experimented a bunch with different minor variations on this theme,
for example setting the pointer *at* the last byte and indexing
backwards for the call to memcmp. That didn't improve anything on this
version and seemed more complex. I also tried other things to make the
loop flow more nicely and none worked. =/ It is a bit unfortunate, the
generated code here remains pretty gross, but I don't see any obvious
ways to improve it. At this point, most of my ideas would be really
elaborate:
1) While the remainder of the string is long enough, we could load
   a 16-byte or 32-byte vector at the address of the last byte and use
   palignr to rotate that and check the first 15- or 31-bytes at the
   front of the next segment, essentially pre-loading the first several
   bytes of the next iteration so we could quickly detect a mismatch in
   those bytes without an additional memory access. Down side would be
   the code complexity, having a fallback loop, and likely misaligned
   vector load. Plus it would make the common case of the last byte not
   matching somewhat slower (need some extraction from a vector).
2) While we have space, we could do an aligned load of a 16- or 32-byte
   vector that *contains* the end byte, and use any peceding bytes to
   have a more precise "no" test, and any subsequent bytes could be
   saved for the next iteration. This remove any unaligned load penalty,
   but still requires us to pay the overhead of vector extraction for
   the cases where we didn't need to do anything other than load and
   compare the last byte.
3) Try to walk from the last byte in a way that is more friendly to
   cache and/or memory pre-fetcher considering we have to poke the last
   byte anyways.
No idea if any of these are really worth pursuing though. They all seem
somewhat unlikely to yield big wins in practice and to be a lot of work
and complexity. So I settled here, which at least seems like a strict
improvement over the previous version.
llvm-svn: 289373 | 
| | 
| 
| 
| 
| 
| | Differential Revision: https://reviews.llvm.org/D25299
llvm-svn: 286724 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | A recent patch added support for consumeInteger() and made
getAsInteger delegate to this function.  A few buildbots are
failing as a result with an assertion failure.  On a hunch,
I tested what happens if I call getAsInteger() on an empty
string, and sure enough it crashes the same way that the
buildbots are crashing.
I confirmed that getAsInteger() on an empty string did not
crash before my patch, so I suspect this to be the cause.
I also added a unit test for the empty string.
llvm-svn: 282170 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | StringRef::getInteger() exists and treats the entire string as
an integer of the specified radix, failing if any invalid characters
are encountered or the number overflows.
Sometimes you might have something like "123456foo" and you want
to get the number 123456 and leave the string "foo" remaining.
This is similar to what would be possible by using the standard
runtime library functions strtoul et al and specifying an end
pointer.
This patch adds consumeInteger(), which does exactly that.  It
consumes as much as possible until an invalid character is found,
and modifies the StringRef in place so that upon return only
the portion of the StringRef after the number remains.
Differential Revision: https://reviews.llvm.org/D24778
llvm-svn: 282164 | 
| | 
| 
| 
| 
| 
| | Differential Revision: http://reviews.llvm.org/D14781
llvm-svn: 263802 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | and tremendously less reliant on the optimizer to fix things.
The code is always necessarily looking for the entire length of the
string when doing the equality tests in this find implementation, but it
previously was needlessly re-checking the size each time among other
annoyances.
By writing this so simply an ddirectly in terms of memcmp, it also is
about 8x faster in a debug build, which in turn makes FileCheck about 2x
faster in 'ninja check-llvm'. This saves about 8% of the time for
FileCheck-heavy parts of the test suite like the x86 backend tests.
llvm-svn: 247269 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | with the StringRef::split method when used with a MaxSplit argument
other than '-1' (which nobody really does today, but which should
actually work).
The spec claimed both to split up to MaxSplit times, but also to append
<= MaxSplit strings to the vector. One of these doesn't make sense.
Given the name "MaxSplit", let's go with it being a max over how many
*splits* occur, which means the max on how many strings get appended is
MaxSplit+1. I'm not actually sure the implementation correctly provided
this logic either, as it used a really opaque loop structure.
The implementation was also playing weird games with nullptr in the data
field to try to rely on a totally opaque hidden property of the split
method that returns a pair. Nasty IMO.
Replace all of this with what is (IMO) simpler code that doesn't use the
pair returning split method, and instead just finds each separator and
appends directly. I think this is a lot easier to read, and it most
definitely matches the spec. Added some tests that exercise the corner
cases around StringRef() and StringRef("") that all now pass.
I'll start using this in code in the next commit.
llvm-svn: 247249 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | on StringRef. Finding and splitting on a single character is
substantially faster than doing it on even a single character StringRef
-- we immediately get to a *very* tuned memchr call this way.
Even nicer, we get to this even in a debug build, shaving 18% off the
runtime of TripleTest.Normalization, helping PR23676 some more.
llvm-svn: 247244 | 
| | 
| 
| 
| 
| 
| | just letting them be implicitly created.
llvm-svn: 216525 | 
| | 
| 
| 
| 
| 
| | added to work an old gcc bug. I believe its been fixed by now.
llvm-svn: 216156 | 
| | 
| 
| 
| | llvm-svn: 205697 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | This compiles with no changes to clang/lld/lldb with MSVC and includes
overloads to various functions which are used by those projects and llvm
which have OwningPtr's as parameters. This should allow out of tree
projects some time to move. There are also no changes to libs/Target,
which should help out of tree targets have time to move, if necessary.
llvm-svn: 203083 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | startswith_lower is ocassionally useful and I think worth adding.
endwith_lower is added for completeness.
Differential Revision: http://llvm-reviews.chandlerc.com/D2041
llvm-svn: 193706 | 
| | 
| 
| 
| 
| 
| | Patch by Ismail Pazarbasi.
llvm-svn: 189162 | 
| | 
| 
| 
| | llvm-svn: 185861 | 
| | 
| 
| 
| 
| 
| 
| 
| | Remove the implementation in include/llvm/Support/YAMLTraits.h.
Added a DenseMap type DITypeHashMap in DebugInfo.h:
  DenseMap<std::pair<StringRef, unsigned>, MDNode*>
llvm-svn: 185852 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.
Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]
llvm-svn: 169131 | 
| | 
| 
| 
| | llvm-svn: 165038 | 
| | 
| 
| 
| | llvm-svn: 156652 | 
| | 
| 
| 
| 
| 
| | fixes an assert reading "1239123123123123" when the result is already 64-bit.
llvm-svn: 155329 | 
| | 
| 
| 
| 
| 
| | StringRef::getAsInteger
llvm-svn: 155298 | 
| | 
| 
| 
| 
| 
| 
| 
| | it would fail with {,u}int64_t on x86-64 Linux.
This also removes code duplication.
llvm-svn: 152517 | 
| | 
| 
| 
| | llvm-svn: 152003 | 
| | 
| 
| 
| 
| 
| | of the StringRef.Split2 unittest on 32 bit machines.
llvm-svn: 151358 | 
| | 
| 
| 
| 
| 
| | and into StringRef.cpp, which is where the other StringRef stuff is.
llvm-svn: 151054 | 
| | 
| 
| 
| 
| 
| 
| 
| | Accomplished by moving the body of StringRef::edit_distance into
a separate function that accepts two ArrayRefs, and making
StringRef::edit_distance a wrapper around the new function.
llvm-svn: 150621 | 
| | 
| 
| 
| | llvm-svn: 143890 | 
| | 
| 
| 
| | llvm-svn: 143880 | 
| | 
| 
| 
| 
| 
| | Enable bounds checking to catch this kind of bug earlier.
llvm-svn: 142247 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | Based on Horspool's simplified version of Boyer-Moore. We use a constant-sized table of
uint8_ts to keep cache thrashing low, needles bigger than 255 bytes are uncommon anyways.
The worst case is still O(n*m) but we do a lot better on the average case now.
llvm-svn: 142061 | 
| | 
| 
| 
| 
| 
| | Thanks to Alexandru Dura and Jonas Paulsson for finding it.
llvm-svn: 140859 | 
| | 
| 
| 
| 
| 
| | and it is just as easy to use StringRef::substr() preceding StringRef::compare() to achieve the same thing.
llvm-svn: 130430 | 
| | 
| 
| 
| 
| 
| | strncmp(). Unit tests also included.
llvm-svn: 129582 | 
| | 
| 
| 
| 
| 
| | Luis Felipe Strano Moraes!
llvm-svn: 129558 | 
| | 
| 
| 
| 
| 
| 
| 
| | zextOrTrunc(), and APSInt methods extend(), extOrTrunc() and new method
trunc(), to be const and to return a new value instead of modifying the
object in place.
llvm-svn: 121120 | 
| | 
| 
| 
| | llvm-svn: 120495 | 
| | 
| 
| 
| | llvm-svn: 120166 | 
| | 
| 
| 
| 
| 
| | on an early return.
llvm-svn: 118370 | 
| | 
| 
| 
| 
| 
| | allowed edit distance
llvm-svn: 116867 | 
| | 
| 
| 
| 
| 
| | characters > 127.
llvm-svn: 112189 | 
| | 
| 
| 
| 
| 
| | consistent with compare in corner cases.
llvm-svn: 112185 | 
| | 
| 
| 
| 
| 
| 
| 
| | - Cache used characters in a bitset to reduce memory overhead to just 32 bytes.
- On my core2 this code is faster except when the checked string was very short
  (smaller than the list of delimiters).
llvm-svn: 111817 | 
| | 
| 
| 
| 
| 
| 
| | This means that our Registers are now ordered R7, R8, R9, R10, R12, ...
Not R1, R10, R11, R12, R2, R3, ...
llvm-svn: 104745 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | It gets its own implementation totally divorced from the (presumably
performance-sensitive) routines which parse into a uint64_t.
Add APInt::operator|=(uint64_t), which is situationally much better than
using a full APInt.
llvm-svn: 97381 | 
| | 
| 
| 
| | llvm-svn: 92896 | 
| | 
| 
| 
| 
| 
| 
| | std::vector and llvm::SmallVector have annoying performance
tradeoffs. No, I don't expect this to matter, and now it won't.
llvm-svn: 92884 | 
| | 
| 
| 
| 
| 
| | to SmallVector, and add a unit test.
llvm-svn: 92340 | 
| | 
| 
| 
| | llvm-svn: 92309 | 
| | 
| 
| 
| | llvm-svn: 89372 | 
| | 
| 
| 
| 
| 
| | StringsEqualNoCase (from StringExtras.h) to it.
llvm-svn: 87020 |