path: root/clang/lib/Lex/PTHLexer.cpp
Commit message | Author | Age | Files | Lines
...
* PTH: Don't emit the PTH offset of the IdentifierInfo string data as that data is referenced by other tables. | Ted Kremenek | 2009-02-11 | 1 | -4/+4
| | | | llvm-svn: 64304
* PTH: Replace ad hoc 'file name' -> 'PTH data' lookup table in the PTH file with an on-disk chained hash table. | Ted Kremenek | 2009-02-10 | 1 | -52/+158
| | | | This data structure is implemented using templates, and will be used to replace similar data structures. This change leads to no visible performance impact on Cocoa.h, but now we only pay a price for the table on the order of the number of files accessed and not the number in the PTH file. llvm-svn: 64245
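The idea of an on-disk chained hash table can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the byte layout (bucket count, one 32-bit offset per bucket, then length-prefixed chains), the hash function, and the names `BuildTable`, `LookupTable` and `HashKey` are hypothetical and are not taken from the PTH implementation. The point it shows is that a lookup touches only the bucket array entry and one chain, so the cost scales with the keys actually queried rather than with the number of entries stored.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

using Buffer = std::vector<unsigned char>;

// Little-endian helpers, assembled from bytes so host endianness is irrelevant.
inline void Emit16(Buffer &B, std::uint32_t V) {
  B.push_back(static_cast<unsigned char>(V & 0xFF));
  B.push_back(static_cast<unsigned char>((V >> 8) & 0xFF));
}
inline void Emit32(Buffer &B, std::uint32_t V) {
  for (int I = 0; I < 4; ++I)
    B.push_back(static_cast<unsigned char>((V >> (8 * I)) & 0xFF));
}
inline std::uint32_t Read16(const unsigned char *P) {
  return static_cast<std::uint32_t>(P[0]) |
         (static_cast<std::uint32_t>(P[1]) << 8);
}
inline std::uint32_t Read32(const unsigned char *P) {
  return Read16(P) | (Read16(P + 2) << 16);
}

// Hypothetical hash; writer and reader only need to agree on it.
inline std::uint32_t HashKey(std::string_view S) {
  std::uint32_t H = 5381;
  for (unsigned char C : S) H = H * 33 + C;
  return H;
}

// Hypothetical layout: [u32 bucket count][u32 chain offset per bucket, 0 = empty]
// then per chain: [u16 entry count] and entries of [u16 key length][key][u32 value].
// The sketch assumes key lengths and chain lengths fit in 16 bits.
inline Buffer BuildTable(const std::map<std::string, std::uint32_t> &Entries,
                         std::uint32_t NumBuckets) {
  assert(NumBuckets > 0 && "need at least one bucket");
  std::vector<std::vector<std::pair<std::string, std::uint32_t>>> Chains(NumBuckets);
  for (const auto &KV : Entries)
    Chains[HashKey(KV.first) % NumBuckets].emplace_back(KV.first, KV.second);

  Buffer Out;
  Emit32(Out, NumBuckets);
  Out.resize(Out.size() + 4 * NumBuckets, 0); // bucket offsets, patched below
  for (std::uint32_t I = 0; I < NumBuckets; ++I) {
    if (Chains[I].empty()) continue; // offset stays 0, meaning "empty bucket"
    const std::uint32_t Off = static_cast<std::uint32_t>(Out.size());
    for (int B = 0; B < 4; ++B)
      Out[4 + 4 * I + B] = static_cast<unsigned char>((Off >> (8 * B)) & 0xFF);
    Emit16(Out, static_cast<std::uint32_t>(Chains[I].size()));
    for (const auto &E : Chains[I]) {
      Emit16(Out, static_cast<std::uint32_t>(E.first.size()));
      Out.insert(Out.end(), E.first.begin(), E.first.end());
      Emit32(Out, E.second);
    }
  }
  return Out;
}

// Look a key up directly in the serialized bytes, without deserializing the table.
inline std::optional<std::uint32_t> LookupTable(const Buffer &Buf,
                                                std::string_view Key) {
  const unsigned char *Base = Buf.data();
  const std::uint32_t NumBuckets = Read32(Base);
  const std::uint32_t Off = Read32(Base + 4 + 4 * (HashKey(Key) % NumBuckets));
  if (Off == 0) return std::nullopt;
  const unsigned char *P = Base + Off;
  const std::uint32_t Count = Read16(P);
  P += 2;
  for (std::uint32_t I = 0; I < Count; ++I) {
    const std::uint32_t Len = Read16(P);
    P += 2;
    const std::string_view Candidate(reinterpret_cast<const char *>(P), Len);
    P += Len;
    const std::uint32_t Value = Read32(P);
    P += 4;
    if (Candidate == Key) return Value;
  }
  return std::nullopt;
}
```

In a reader, the buffer would typically be a memory-mapped file, so only the pages holding the queried bucket and chain need to be touched.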
* Add more PTH diagnostics for invalid PTH files, etc. | Ted Kremenek | 2009-01-28 | 1 | -11/+29
| | | | llvm-svn: 63232
* Enhance PTHManager::Create() to take an optional Diagnostic* argument that can be used to report issues such as a missing PTH file. | Ted Kremenek | 2009-01-28 | 1 | -2/+9
| | | | llvm-svn: 63231
* PTH: Use Token::setLiteralData() to directly store a pointer to cached spelling data in the PTH file. | Ted Kremenek | 2009-01-27 | 1 | -145/+23
| | | | This removes a ton of code for looking up spellings using source locations in the PTH file. This simplifies both PTH generation and reading.
| | | | Performance impact for -fsyntax-only on Cocoa.h (with Cocoa.h in the PTH file):
| | | | - PTH generation time improves by 5%
| | | | - PTH reading improves by 0.3%
| | | | llvm-svn: 63072
* Silence warning. | Ted Kremenek | 2009-01-26 | 1 | -1/+1
| | | | llvm-svn: 63054
* Add version number checking to PTH files. | Ted Kremenek | 2009-01-26 | 1 | -2/+8
| | | | llvm-svn: 63047
* Embed the offset of the PTH table inside the prologue of the PTH file. | Ted Kremenek | 2009-01-26 | 1 | -9/+10
| | | | This will help improve gradual versioning of PTH files instead of relying on the PTH table being at a fixed offset. llvm-svn: 63045
* Check in the long promised SourceLocation rewrite. | Chris Lattner | 2009-01-26 | 1 | -3/+2
| | | | This lays the groundwork for implementing #line, and fixes the "out of macro ID's" problem. There is nothing particularly tricky about the code, other than the very performance sensitive SourceManager::getFileID() method. llvm-svn: 62978
* This is a follow-up to r62675: | Chris Lattner | 2009-01-23 | 1 | -0/+6
| | | | Refactor how the preprocessor changes a token from being a tok::identifier to a keyword (e.g. tok::kw_for). Instead of doing this in HandleIdentifier, hoist this common case out into the caller, so that every keyword doesn't have to go through HandleIdentifier. This drops time in HandleIdentifier from 1.25ms to .62ms, and speeds up clang -Eonly with PTH by about 1%. llvm-svn: 62855
* Update comment. | Chris Lattner | 2009-01-23 | 1 | -2/+2
| | | | llvm-svn: 62819
* remove my gross #ifdef's, using portable abstractions now that the 32-bit load is always aligned. | Chris Lattner | 2009-01-22 | 1 | -11/+8
| | | | I verified that the bswap doesn't occur in the assembly code on x86. llvm-svn: 62815
* remove Read8/Read24, which are dead. Rename Read16/Read32 to be more descriptive. | Chris Lattner | 2009-01-22 | 1 | -58/+30
| | | | llvm-svn: 62775
* Fix <rdar://problem/6512717> by correctly reading the right offset in the token data in PTHLexer::getSourceLocation(). | Ted Kremenek | 2009-01-21 | 1 | -1/+1
| | | | llvm-svn: 62725
* merge two checks for identifiers in the pth loop into one. | Chris Lattner | 2009-01-21 | 1 | -9/+10
| | | | llvm-svn: 62677
* Add a bit to IdentifierInfo that acts as a simple predicate which tells us whether Preprocessor::HandleIdentifier needs to be called. | Chris Lattner | 2009-01-21 | 1 | -1/+3
| | | | Because this method is only rarely needed, this saves a call and a bunch of random checks. This drops the time in HandleIdentifier from 3.52ms to .98ms on cocoa.h on my machine. llvm-svn: 62675
* Don't crash on empty PTH files. This fixes <rdar://problem/6512714>. | Ted Kremenek | 2009-01-21 | 1 | -9/+19
| | | | llvm-svn: 62673
* really we only need one Read24! | Chris Lattner | 2009-01-21 | 1 | -16/+0
| | | | llvm-svn: 62672
* revert my previous patch, it assumed endianness. | Chris Lattner | 2009-01-21 | 1 | -6/+38
| | | | llvm-svn: 62671
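The endianness concern behind this revert can be sketched as follows. Loading a multi-byte field by reinterpreting the buffer as an `int` bakes in the host byte order, whereas assembling the value from individual bytes gives the same result on any host. The helper name `ReadLE32Portable` is hypothetical, and the sketch assumes the file stores values in little-endian order, which the log does not state explicitly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: read a 32-bit little-endian value from a byte buffer
// and advance the cursor. Shifting individual bytes into place does not
// depend on the byte order or alignment requirements of the host.
inline std::uint32_t ReadLE32Portable(const unsigned char *&Ptr) {
  assert(Ptr != nullptr && "cursor must point into a buffer");
  const std::uint32_t V = static_cast<std::uint32_t>(Ptr[0]) |
                          (static_cast<std::uint32_t>(Ptr[1]) << 8) |
                          (static_cast<std::uint32_t>(Ptr[2]) << 16) |
                          (static_cast<std::uint32_t>(Ptr[3]) << 24);
  Ptr += 4;
  return V;
}
```

On a little-endian target an optimizing compiler can usually turn this into a single load, which is consistent with the later note in this log that no bswap shows up on x86.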
* minor cleanups: now that tokens are 4-byte aligned in a PTH file, just load them directly as ints. | Chris Lattner | 2009-01-21 | 1 | -22/+6
| | | | llvm-svn: 62668
* Fix: <rdar://problem/6510344> [pth] PTH slows down regular lexer considerably (when it has substantial work) | Ted Kremenek | 2009-01-20 | 1 | -1/+1
| | | | Changes to IdentifierTable:
| | | | - High-level summary: StringMap never owns IdentifierInfos. It just references them.
| | | | - The string map now has StringMapEntry<IdentifierInfo*> instead of StringMapEntry<IdentifierInfo>. The IdentifierInfo object is allocated using the same bump pointer allocator as used by the StringMap.
| | | | Changes to IdentifierInfo:
| | | | - Added an extra pointer to point to the StringMapEntry<IdentifierInfo*> in the string map. This pointer will be null if the IdentifierInfo* is *only* used by the PTHLexer (that is, it isn't in the StringMap).
| | | | Algorithmic changes:
| | | | - Non-PTH case: IdentifierInfo::get() will always consult the StringMap first to see if we have an IdentifierInfo object. If that StringMapEntry references a null pointer, we allocate a new one from the BumpPtrAllocator and update the reference in the StringMapEntry.
| | | | - PTH case: We do the same lookup as with the non-PTH case, but if we don't get a hit in the StringMap we do a secondary lookup in the PTHManager for the IdentifierInfo. If we don't find an IdentifierInfo we create a new one as in the non-PTH case. If we do find an IdentifierInfo in the PTHManager, we update the StringMapEntry to refer to it so that the IdentifierInfo will be found on the next StringMap lookup. This way we only do a binary search in the PTH file at most once for a given IdentifierInfo. This greatly speeds things up for source files containing a non-trivial amount of code.
| | | | Performance impact: While these changes do add some extra indirection in IdentifierTable to access an IdentifierInfo*, I saw speedups even in the non-PTH case as well.
| | | | - Non-PTH: For -fsyntax-only on Cocoa.h, we see a 6% speedup.
| | | | - PTH (with Cocoa.h in token cache): 11% speedup.
| | | | I also did an experiment where we did -fsyntax-only on a source file including a large header and Cocoa.h, but the token cache did not contain the larger header. For this file, we were seeing a performance *regression* when using PTH of 3% over non-PTH. Now we are seeing a performance improvement of 9%!
| | | | Tests: The serialization tests are now failing. I looked at this extensively, and my belief is that this change is unmasking a bug rather than introducing a new one. I have disabled the serialization tests for now. llvm-svn: 62636
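The lookup order described in the "Algorithmic changes" list can be sketched roughly as below. This is a simplified model and not the clang code: `IdentInfo`, `SecondaryLookup` and `IdentTable` are hypothetical stand-ins for IdentifierInfo, the PTHManager lookup and IdentifierTable, and standard containers replace StringMap and the bump pointer allocator. It shows the map holding only pointers, the secondary source being asked at most once per name, and the result being cached in the map entry.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

struct IdentInfo {
  std::string Name;
  bool FromCache; // true if the object came from the secondary lookup
};

// Hypothetical secondary source, playing the role the commit gives PTHManager.
class SecondaryLookup {
public:
  explicit SecondaryLookup(const std::vector<std::string> &Names) {
    for (const auto &N : Names) Infos.emplace(N, IdentInfo{N, true});
  }
  IdentInfo *get(const std::string &Name) {
    ++Queries; // counts how often the (expensive) secondary search runs
    auto It = Infos.find(Name);
    return It == Infos.end() ? nullptr : &It->second;
  }
  unsigned Queries = 0;

private:
  std::unordered_map<std::string, IdentInfo> Infos;
};

class IdentTable {
public:
  explicit IdentTable(SecondaryLookup *Ext = nullptr) : External(Ext) {}

  IdentInfo &get(const std::string &Name) {
    IdentInfo *&Slot = Map[Name]; // the map references infos, it never owns them
    if (Slot) return *Slot;       // fast path: resolved on an earlier call
    if (External) Slot = External->get(Name); // asked at most once per name
    if (!Slot) {                  // not known anywhere: create a fresh object
      assert(!Name.empty() && "identifiers are non-empty");
      Owned.push_back(std::make_unique<IdentInfo>(IdentInfo{Name, false}));
      Slot = Owned.back().get();
    }
    return *Slot;
  }

private:
  std::unordered_map<std::string, IdentInfo *> Map;
  std::vector<std::unique_ptr<IdentInfo>> Owned;
  SecondaryLookup *External;
};
```

Because the slot is filled in on the first miss, repeated lookups of the same name never reach the secondary source again, which is the property the commit credits for the speedup.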
* PTH: Emitted tokens now consist of 12 bytes that are loaded using three 32-bit loads. | Ted Kremenek | 2009-01-19 | 1 | -5/+8
| | | | This reduces user time but increases system time because of the slightly larger PTH file. Although there is no performance win on Cocoa.h and -Eonly, overall this seems like a good step. llvm-svn: 62542
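A 12-byte token record read with three 32-bit loads might look like the sketch below. The field layout is an assumption chosen for illustration (kind, flags and length packed into the first word, then an identifier ID, then a file offset); the log only states the record size and the number of loads. The names `DecodedToken`, `EncodeToken`, `DecodeToken` and `LoadLE32` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical decoded form of one 12-byte token record.
struct DecodedToken {
  std::uint8_t Kind;
  std::uint8_t Flags;
  std::uint16_t Length;
  std::uint32_t IdentifierID;
  std::uint32_t FileOffset;
};

inline std::uint32_t LoadLE32(const unsigned char *P) {
  return static_cast<std::uint32_t>(P[0]) |
         (static_cast<std::uint32_t>(P[1]) << 8) |
         (static_cast<std::uint32_t>(P[2]) << 16) |
         (static_cast<std::uint32_t>(P[3]) << 24);
}

// Pack a token into 12 bytes: word 0 = kind | flags << 8 | length << 16,
// word 1 = identifier ID, word 2 = file offset (all little-endian).
inline void EncodeToken(const DecodedToken &T, unsigned char *Out) {
  assert(Out != nullptr);
  const std::uint32_t Words[3] = {
      static_cast<std::uint32_t>(T.Kind) |
          (static_cast<std::uint32_t>(T.Flags) << 8) |
          (static_cast<std::uint32_t>(T.Length) << 16),
      T.IdentifierID, T.FileOffset};
  for (int W = 0; W < 3; ++W)
    for (int B = 0; B < 4; ++B)
      Out[4 * W + B] = static_cast<unsigned char>((Words[W] >> (8 * B)) & 0xFF);
}

// Read the whole record with three word loads, then split the first word.
inline DecodedToken DecodeToken(const unsigned char *P) {
  assert(P != nullptr);
  const std::uint32_t Word0 = LoadLE32(P);
  const std::uint32_t Word1 = LoadLE32(P + 4);
  const std::uint32_t Word2 = LoadLE32(P + 8);
  DecodedToken T;
  T.Kind = static_cast<std::uint8_t>(Word0 & 0xFF);
  T.Flags = static_cast<std::uint8_t>((Word0 >> 8) & 0xFF);
  T.Length = static_cast<std::uint16_t>(Word0 >> 16);
  T.IdentifierID = Word1;
  T.FileOffset = Word2;
  return T;
}
```

Fixed-size, word-aligned records trade a slightly larger file for simpler decoding, which matches the user-time versus system-time trade-off the commit message describes.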
* rearrange GetIdentifierInfo so that the fast path can be partially inlined into PTHLexer::Lex. | Chris Lattner | 2009-01-18 | 1 | -10/+4
| | | | This speeds up the user time of PTH -Eonly by another 2ms (4.4%). llvm-svn: 62454
* rename some variables, only set a token's IdentifierInfo if non-null. | Chris Lattner | 2009-01-18 | 1 | -10/+11
| | | | llvm-svn: 62450
* On i386 and x86-64, just do unaligned loads instead of assembling from bytes. | Chris Lattner | 2009-01-18 | 1 | -0/+20
| | | | This speeds up -Eonly PTH reading of cocoa.h by about 2ms, which is 4.2%. llvm-svn: 62447
* switch PTHLexer to use Read32 and friends instead of lots of inlined copies. | Chris Lattner | 2009-01-18 | 1 | -107/+60
| | | | I verified that this causes no performance change in PTH. llvm-svn: 62445
* switch PTH lexer from using "const char*"s to "const unsigned char*"s internally. | Chris Lattner | 2009-01-18 | 1 | -48/+71
| | | | This is just a cleanup that reduces the need to cast to unsigned char before assembling a larger integer. llvm-svn: 62442
* simplify PTHManager::CreateLexer | Chris Lattner | 2009-01-17 | 1 | -1/+2
| | | | llvm-svn: 62424
* suck the call to "getSpellingLoc" that all clients do into the implementation of PTHManager::getSpelling. | Chris Lattner | 2009-01-17 | 1 | -2/+3
| | | | llvm-svn: 62408
* this massive patch introduces a simple new abstraction: it makes "FileID" a concept that is now enforced by the compiler's type checker instead of yet-another-random-unsigned floating around. | Chris Lattner | 2009-01-17 | 1 | -30/+33
| | | | This is an important distinction from the "FileID" currently tracked by SourceLocation. *That* FileID may refer to the start of a file or to a chunk within it. The new FileID *only* refers to the file (and its #include stack and eventually #line data), it cannot refer to a chunk.
| | | | FileID is a completely opaque datatype to all clients, only SourceManager is allowed to poke and prod it. llvm-svn: 62407
* Change some terminology in SourceLocation: instead of referring to the "physical" location of tokens, refer to the "spelling" location. | Chris Lattner | 2009-01-16 | 1 | -1/+1
| | | | This is more concrete and useful, tokens aren't really physical objects! llvm-svn: 62309
* PTH: Fix termination condition in binary search. | Ted Kremenek | 2009-01-15 | 1 | -1/+1
| | | | llvm-svn: 62277
* IdentifierInfo: | Ted Kremenek | 2009-01-15 | 1 | -11/+74
| | | | - IdentifierInfo can now (optionally) have its string data not be co-located with itself. This is for use with PTH. This aspect is a little gross, as getName() and getLength() now make assumptions about a possible alternate representation of IdentifierInfo. Perhaps we should make IdentifierInfo have virtual methods?
| | | | IdentifierTable:
| | | | - Added class "IdentifierInfoLookup" that can be used by IdentifierTable to perform "string -> IdentifierInfo" lookups using an auxiliary data structure. This is used by PTH.
| | | | - Performance tests show that IdentifierTable::get() does not slow down because of the extra check for the IdentifierInfoLookup object (the regular StringMap lookup does enough work to mitigate the impact of an extra null pointer check).
| | | | - The upshot is that some IdentifierInfo objects might now be owned by the IdentifierInfoLookup object. This should be reviewed.
| | | | PTH:
| | | | - Modified PTHManager::GetIdentifierInfo to *not* insert entries in IdentifierTable's string map, and instead create IdentifierInfo objects on the fly when mapping from persistent IDs to IdentifierInfos. This saves a ton of work with string copies, hashing, and StringMap lookup and resizing. This change was motivated because when processing source files in the PTH cache we don't need to do any string -> IdentifierInfo lookups.
| | | | - PTHManager now subclasses IdentifierInfoLookup, allowing clients of IdentifierTable to transparently use IdentifierInfo objects managed by the PTH file. PTHManager resolves "string -> IdentifierInfo" queries by doing a binary search over a sorted table of identifier strings in the PTH file (the exact algorithm we use can be changed as needed).
| | | | These changes lead to the following performance changes when using PTH on Cocoa.h:
| | | | -fsyntax-only: 10% performance improvement
| | | | -Eonly: 30% performance improvement
| | | | llvm-svn: 62273
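Resolving a "string -> IdentifierInfo" query by binary search over a sorted table of identifier strings can be sketched as below. The in-memory `SortedIdEntry` vector is a hypothetical stand-in for the sorted side-table stored in the PTH file, and `LookupPersistentID` is an illustrative name; the real code searches the memory-mapped file data directly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical entry of the sorted identifier table.
struct SortedIdEntry {
  std::string_view Name;      // identifier spelling
  std::uint32_t PersistentID; // ID that tokens use to refer to the identifier
};

inline bool NameLess(const SortedIdEntry &A, const SortedIdEntry &B) {
  return A.Name < B.Name;
}

// Binary search by name. The table must be sorted in the same (bytewise
// lexicographic) order that std::string_view comparison uses.
inline std::optional<std::uint32_t>
LookupPersistentID(const std::vector<SortedIdEntry> &Table,
                   std::string_view Name) {
  assert(std::is_sorted(Table.begin(), Table.end(), NameLess));
  auto It = std::lower_bound(
      Table.begin(), Table.end(), Name,
      [](const SortedIdEntry &E, std::string_view Key) { return E.Name < Key; });
  if (It == Table.end() || It->Name != Name)
    return std::nullopt; // not present in the table
  return It->PersistentID;
}
```

Using `std::lower_bound` and then checking for equality avoids the off-by-one termination mistakes that hand-written binary searches are prone to.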
* PTH: Embed a persistentID side-table in the PTH file that is sorted in the lexical order of the corresponding identifier strings. | Ted Kremenek | 2009-01-15 | 1 | -2/+3
| | | | This will be used for a forthcoming optimization. This slows down PTH generation time by 7%. We can revert this change if the optimization proves to not be valuable. llvm-svn: 62248
* PTH: | Ted Kremenek | 2009-01-13 | 1 | -1/+1
| | | | - Use canonical FileID when using getSpelling() caching. This addresses some cache misses we were seeing with -fsyntax-only on Cocoa.h.
| | | | - Added Preprocessor::getPhysicalCharacterAt() utility method for clients to grab the first character at a specified source location. This uses the PTH spelling cache.
| | | | - Modified Sema::ActOnNumericConstant() to use Preprocessor::getPhysicalCharacterAt() instead of SourceManager::getCharacterData() (to get PTH hits).
| | | | These changes cause -fsyntax-only to not page in any sources from Cocoa.h. We see a speedup of 27%. llvm-svn: 62193
* Fix corner cases in PTH getSpelling() binary search. | Ted Kremenek | 2009-01-13 | 1 | -0/+3
| | | | llvm-svn: 62187
* PTH: Fix remaining cases where the spelling cache in the PTH file was being missed when it shouldn't. | Ted Kremenek | 2009-01-13 | 1 | -16/+19
| | | | This shaves another 7% off PTH time for -Eonly on Cocoa.h. llvm-svn: 62186
* Enhance PTH 'getSpelling' caching: | Ted Kremenek | 2009-01-09 | 1 | -15/+104
| | | | - Refactor caching logic into a helper class PTHSpellingSearch
| | | | - Allow "random accesses" in the spelling cache, thus catching the remaining cases where 'getSpelling' wasn't hitting the PTH cache
| | | | For -Eonly, PTH, Cocoa.h:
| | | | - This reduces wall time by 3% (user time unchanged, sys time reduced)
| | | | - This reduces the amount of paged source by 1112K. The remaining 1112K still being paged in is from somewhere else (investigating).
| | | | llvm-svn: 62009
* Invert assertion condition. | Ted Kremenek | 2009-01-09 | 1 | -1/+1
| | | | llvm-svn: 61961
* PTH: Hook up getSpelling() caching in PTHLexer. | Ted Kremenek | 2009-01-08 | 1 | -4/+62
| | | | This results in a nice performance gain. Here's what we see for -Eonly on Cocoa.h (using PTH):
| | | | - wall time decreases by 21% (26% speedup overall)
| | | | - system time decreases by 35%
| | | | - user time decreases by 6%
| | | | These reductions are due to not paging source files just to get spellings for literals. The solution in place doesn't appear to be 100% yet, as we still see some of the pages for source files getting mapped in. Using -print-stats, we see that SourceManager maps in 7179K fewer bytes of source text (reduction of 75%). Will investigate why the remaining 25% are getting paged in.
| | | | With these changes, here's how PTH compares to non-PTH on Cocoa.h:
| | | | -Eonly: PTH takes 64% of the time as non-PTH (54% speedup)
| | | | -fsyntax-only: PTH takes 89% of the time as non-PTH (11% speedup)
| | | | llvm-svn: 61913
* PTH: | Ted Kremenek | 2009-01-08 | 1 | -6/+22
| | | | - Added stub PTHLexer::getSpelling() that will be used for fetching cached spellings from the PTH file. This doesn't do anything yet.
| | | | - Added a hook in Preprocessor::getSpelling() to call PTHLexer::getSpelling() when using a PTHLexer.
| | | | - Updated PTHLexer to read the offsets of spelling tables in the PTH file.
| | | | llvm-svn: 61911
* PTH: Remove some methods and simplify some conditions in PTHLexer::Lex(). No big functionality change. | Ted Kremenek | 2008-12-23 | 1 | -58/+30
| | | | llvm-svn: 61381
* PTH: Use 3 bytes instead of 4 bytes to encode the persistent ID for a token. | Ted Kremenek | 2008-12-23 | 1 | -9/+8
| | | | - This reduces the PTH size for Cocoa.h by 7%.
| | | | - This increases PTH -Eonly speed for Cocoa.h by 0.8%.
| | | | llvm-svn: 61377
* Cosmetics: rename a variable and tighten spacing. No functionality change. | Ted Kremenek | 2008-12-23 | 1 | -4/+2
| | | | llvm-svn: 61375
* PTH: | Ted Kremenek | 2008-12-23 | 1 | -4/+2
| | | | - Encode the token length with 2 bytes instead of 4.
| | | | - This reduces the size of the .pth file for Cocoa.h by 12%.
| | | | - This speeds up PTH time (-Eonly) on Cocoa.h by 1.6%.
| | | | llvm-svn: 61364
* PTH: | Ted Kremenek | 2008-12-23 | 1 | -30/+39
| | | | - In PTHLexer::Lex read all of the token data from PTH file before constructing the token. The idea is to enhance locality.
| | | | - Do not use Read8/Read32 in PTHLexer::Lex. Inline these operations manually.
| | | | - Change PTHManager::ReadIdentifierInfo() to PTHManager::GetIdentifierInfo(). They are functionally the same except that PTHLexer::Lex() reads the persistent id.
| | | | These changes result in a 3.3% speedup for PTH on Cocoa.h (-Eonly). llvm-svn: 61363
* PTH: | Ted Kremenek | 2008-12-23 | 1 | -111/+92
| | | | - Embed 'eom' tokens in PTH file.
| | | | - Use embedded 'eom' tokens to not lazily generate them in the PTHLexer. This means that PTHLexer can always advance to the next token after reading a token (instead of buffering tokens using a copy).
| | | | - Moved logic of 'ReadToken' into Lex. GetToken & ReadToken no longer exist.
| | | | - These changes result in a 3.3% speedup (-Eonly) on Cocoa.h.
| | | | - The code is a little gross. Many cleanups are possible and should be done.
| | | | llvm-svn: 61360
* Use '&' to test StartOfLine flag. | Ted Kremenek | 2008-12-18 | 1 | -1/+1
| | | | llvm-svn: 61205
* Rewrite PTHLexer::DiscardToEndOfLine() to not use GetToken and instead only read the bytes needed to determine if a token is not at the start of the line. | Ted Kremenek | 2008-12-17 | 1 | -9/+18
| | | | llvm-svn: 61172
* Change PTHLexer::getSourceLocation() to not call GetToken() and instead just read the file offset in the token data buffer directly. | Ted Kremenek | 2008-12-17 | 1 | -0/+15
| | | | llvm-svn: 61170