path: root/clang/lib/Lex/Lexer.cpp
Commit message | Author | Age | Files | Lines
* Use new UnicodeCharSet interface. | Alexander Kornienko | 2013-08-29 | 1 | -15/+35
  Summary: This is a Clang part of http://llvm-reviews.chandlerc.com/D1534
  Reviewers: jordan_rose, klimek, rsmith
  Reviewed By: rsmith
  CC: cfe-commits
  Differential Revision: http://llvm-reviews.chandlerc.com/D1535
  llvm-svn: 189583
* Fix "//" comments with -traditional-cpp in C++. | Eli Friedman | 2013-08-28 | 1 | -2/+4
  Apparently, gcc's -traditional-cpp behaves slightly differently in C++ mode; specifically, it discards "//" comments. Match gcc's behavior. <rdar://problem/14808126>
  llvm-svn: 189515
* Respect -Wnewline-eof even in C++11 mode. | Jordan Rose | 2013-08-23 | 1 | -4/+22
  If the user has requested this warning, we should emit it, even if it's not an extension in the current language mode. However, being an extension is more important, so prefer the pedantic warning or the pedantic-compatibility warning if those are enabled. <rdar://problem/12922063>
  llvm-svn: 189110
* ObjectiveC migrator: More work towards | Fariborz Jahanian | 2013-08-20 | 1 | -2/+3
  insertion of ObjC audit pragmas.
  llvm-svn: 188733
* C++1y literal suffix support: | Richard Smith | 2013-07-23 | 1 | -6/+18
  * Allow ns, us, ms, s, min, h as numeric ud-suffixes
  * Allow s as string ud-suffix
  llvm-svn: 186933
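For context, the suffixes named in this commit are the ones the C++14 standard library defines without a leading underscore. A minimal usage sketch, assuming a compiler and standard library in C++14 mode or newer; the function names `totalSketch` and `greetingSketch` are illustrative and do not come from Clang:

```cpp
#include <chrono>
#include <string>

using namespace std::chrono_literals;  // ns, us, ms, s, min, h on numeric literals
using namespace std::string_literals;  // s on string literals

// 1s and 500ms are numeric literals with standard ud-suffixes.
inline std::chrono::milliseconds totalSketch() {
  return 1s + 500ms;
}

// "lex"s is a std::string, not a const char array.
inline std::string greetingSketch() {
  return "lex"s + "er"s;
}
```

The same letter `s` means seconds after a number and `std::string` after a string literal, which is why the commit lists the two cases separately.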
* Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros. | Michael J. Spencer | 2013-05-24 | 1 | -1/+1
  llvm-svn: 182675
* [modules] If we hit a failure while loading a PCH/module, abort parsing instead of trying to continue in an invalid state. | Argyrios Kyrtzidis | 2013-05-24 | 1 | -0/+6
  Also don't let libclang create a PCH with such an error. Fixes rdar://13953768
  llvm-svn: 182629
* [Lexer] Improve Lexer::getSourceText() when the given range deals with function macro arguments. | Argyrios Kyrtzidis | 2013-05-16 | 1 | -33/+24
  This is a modified version of a patch by Manuel Klimek.
  llvm-svn: 182055
* Typo and misc comment fix. | Richard Smith | 2013-05-10 | 1 | -2/+4
  llvm-svn: 181583
* [libclang] Make sure the preamble does not truncate comments. | Argyrios Kyrtzidis | 2013-04-19 | 1 | -2/+15
  rdar://13647445
  llvm-svn: 179907
* Add -Wc99-compat warning for C11 unicode string and character literals. | Richard Smith | 2013-03-11 | 1 | -6/+8
  llvm-svn: 176817
* When lexing in C11 mode, accept unicode character and string literals, per C11 | Richard Smith | 2013-03-09 | 1 | -9/+13
  6.4.4.4/1 and 6.4.5/1.
  llvm-svn: 176780
* Preprocessor: don't consider // to be a line comment in -E -std=c89 mode. | Jordan Rose | 2013-03-05 | 1 | -4/+7
  It's beneficial when compiling to treat // as the start of a line comment even in -std=c89 mode, since it's not valid C code (with a few rare exceptions) and is usually intended as such. We emit a pedantic warning and then continue on as if line comments were enabled. This has been our behavior for quite some time.
  However, people use the preprocessor for things besides C source files. In today's prompting example, the input contains (unquoted) URLs, which contain // but should still be preserved.
  This change instructs the lexer to treat // as a plain token if Clang is in C90 mode and generating preprocessed output rather than actually compiling. <rdar://problem/13338743>
  llvm-svn: 176526
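The decision described in this commit message can be sketched as a small table. This is only an illustration of the stated behavior, not the code in Lexer.cpp; the type and function names (`LexModeSketch`, `SlashSlashSketch`, `classifySlashSlashSketch`) are hypothetical:

```cpp
// Hypothetical inputs: whether the language mode has line comments
// (false for -std=c89) and whether we are producing -E output.
struct LexModeSketch {
  bool languageHasLineComments;
  bool generatingPreprocessedOutput;
};

enum class SlashSlashSketch {
  LineComment,             // normal case: // starts a comment
  LineCommentWithWarning,  // C89 while compiling: comment plus pedantic warning
  PlainTokens              // C89 with -E: leave the two slashes alone
};

inline SlashSlashSketch classifySlashSlashSketch(LexModeSketch mode) {
  if (mode.languageHasLineComments)
    return SlashSlashSketch::LineComment;
  if (mode.generatingPreprocessedOutput)
    return SlashSlashSketch::PlainTokens;
  return SlashSlashSketch::LineCommentWithWarning;
}
```

Under this sketch, an unquoted URL in a file run through `-E -std=c89` keeps everything after its `//`, which is the case the message calls out.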
* Preprocessor: preserve whitespace in -traditional-cpp mode. | Jordan Rose | 2013-02-21 | 1 | -17/+28
  Note that unlike GNU cpp we currently do not preserve whitespace in macros (even in -traditional-cpp mode). <rdar://problem/12897179>
  llvm-svn: 175778
* Properly validate UCNs for C99 and C++03 (both more restrictive than C(++)11). | Jordan Rose | 2013-02-09 | 1 | -89/+86
  Add warnings under -Wc++11-compat, -Wc++98-compat, and -Wc99-compat when a particular UCN is incompatible with a different standard, and -Wunicode when a UCN refers to a surrogate character in C++03.
  llvm-svn: 174788
* Pull Lexer's CharInfo table out for general use throughout Clang. | Jordan Rose | 2013-02-08 | 1 | -170/+5
  Rewriting the same predicates over and over again is bad for code size and code maintenance. Using the functions in <ctype.h> is generally unsafe unless they are specified to be locale-independent (i.e. only isdigit and isxdigit). The next commit will try to clean up uses of <ctype.h> functions within Clang.
  llvm-svn: 174765
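To illustrate what "locale-independent" means here, the following sketch shows ASCII-only predicates that never consult the C locale. It uses plain range comparisons rather than a lookup table, so it is not the CharInfo table the commit refers to, and all names are hypothetical:

```cpp
// Hypothetical ASCII-only predicates. Unlike isalpha() from <ctype.h>,
// the result never depends on the current locale.
inline bool isAsciiDigitSketch(unsigned char c) {
  return c >= '0' && c <= '9';
}

inline bool isAsciiLetterSketch(unsigned char c) {
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// Characters allowed after the first character of a plain ASCII identifier.
inline bool isIdentifierBodySketch(unsigned char c) {
  return isAsciiLetterSketch(c) || isAsciiDigitSketch(c) || c == '_';
}
```

Taking `unsigned char` also sidesteps the undefined behavior of passing a negative `char` value to the `<ctype.h>` functions.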
* Lexer: Don't warn about Unicode in preprocessor directives. | Jordan Rose | 2013-01-31 | 1 | -2/+4
  This allows people to use Unicode in their #pragma mark and in macros that exist only to be string-ized. <rdar://problem/13107323&13121362>
  llvm-svn: 174081
* Fix r173881 to properly skip invalid UTF-8 characters in raw lexing and -E. | Jordan Rose | 2013-01-30 | 1 | -0/+1
  This caused hangs as we processed the same invalid byte over and over. <rdar://problem/13115651>
  llvm-svn: 173959
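The hang described here is the classic failure of a scanning loop that does not advance on bad input. A minimal sketch of the safe pattern, assuming a simplified UTF-8 check (no overlong or surrogate validation) and hypothetical names throughout; it is not the code from Lexer.cpp:

```cpp
#include <cstddef>
#include <string>

// Returns the byte length of the UTF-8 sequence starting at s[i],
// or 0 if the sequence is malformed or truncated (simplified check).
inline std::size_t sequenceLengthSketch(const std::string &s, std::size_t i) {
  unsigned char lead = static_cast<unsigned char>(s[i]);
  std::size_t len = 0;
  if (lead < 0x80) len = 1;
  else if ((lead >> 5) == 0x06) len = 2;
  else if ((lead >> 4) == 0x0E) len = 3;
  else if ((lead >> 3) == 0x1E) len = 4;
  if (len == 0 || i + len > s.size()) return 0;
  for (std::size_t k = 1; k < len; ++k)
    if ((static_cast<unsigned char>(s[i + k]) & 0xC0) != 0x80) return 0;
  return len;
}

struct ScanResultSketch {
  std::size_t characters;
  std::size_t invalidBytes;
};

inline ScanResultSketch scanSketch(const std::string &s) {
  ScanResultSketch result{0, 0};
  std::size_t i = 0;
  while (i < s.size()) {
    std::size_t len = sequenceLengthSketch(s, i);
    if (len == 0) {
      // The important step: consume the bad byte so the loop makes progress.
      ++result.invalidBytes;
      ++i;
      continue;
    }
    ++result.characters;
    i += len;
  }
  return result;
}
```

Without the `++i` in the invalid branch, the loop would examine the same byte forever, which matches the symptom in the commit message.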
* Move UTF conversion routines from clang/lib/Basic to llvm/lib/Support | Dmitri Gribenko | 2013-01-30 | 1 | -9/+11
  This is required to use them in TableGen.
  llvm-svn: 173924
* Don't warn about Unicode characters in -E mode. | Jordan Rose | 2013-01-30 | 1 | -18/+20
  People use the C preprocessor for things other than C files. Some of them have Unicode characters. We shouldn't warn about Unicode characters appearing outside of identifiers in this case.
  There's not currently a way for the preprocessor to tell if it's in -E mode, so I added a new flag, derived from the PreprocessorOutputOptions. This is only used by the Unicode warnings for now, but could conceivably be used by other warnings or even behavioral differences later. <rdar://problem/13107323>
  llvm-svn: 173881
* PR15067 (again): Don't warn about UCNs in C90 if we're raw-lexing. | Jordan Rose | 2013-01-28 | 1 | -1/+2
  Fixes a crash. Thanks, Richard.
  llvm-svn: 173701
* PR15067: Don't assert when a UCN appears in a C90 file. | Jordan Rose | 2013-01-27 | 1 | -3/+6
  Unfortunately, we can't accept the UCN as an extension because we're required to treat it as two tokens for preprocessing purposes.
  llvm-svn: 173622
* Lexer.cpp: Fix a warning with ptrdiff_t on i686. [-Wsign-compare] | NAKAMURA Takumi | 2013-01-25 | 1 | -1/+1
  llvm-svn: 173447
* Clarify comment: "diagnose" is better than "warn" when emitting an error. | Jordan Rose | 2013-01-25 | 1 | -1/+1
  Thanks, Dmitri.
  llvm-svn: 173400
* Add a fixit for \U1234 -> \u1234. | Jordan Rose | 2013-01-24 | 1 | -1/+9
  llvm-svn: 173371
* As an extension, treat Unicode whitespace characters as whitespace. | Jordan Rose | 2013-01-24 | 1 | -0/+23
  llvm-svn: 173370
* Handle universal character names and Unicode characters outside of literals. | Jordan Rose | 2013-01-24 | 1 | -13/+275
  This is a missing piece for C99 conformance.
  This patch handles UCNs by adding a '\\' case to LexTokenInternal and LexIdentifier -- if we see a backslash, we tentatively try to read in a UCN. If the UCN is not syntactically well-formed, we fall back to the old treatment: a backslash followed by an identifier beginning with 'u' (or 'U').
  Because the spelling of an identifier with UCNs still has the UCN in it, we need to convert that to UTF-8 in Preprocessor::LookUpIdentifierInfo.
  Of course, valid code that does *not* use UCNs will see only a very minimal performance hit (checks after each identifier for non-ASCII characters, checks when converting raw_identifiers to identifiers that they do not contain UCNs, and checks when getting the spelling of an identifier that it does not contain a UCN).
  This patch also adds basic support for actual UTF-8 in the source. This is treated almost exactly the same as UCNs except that we consider stray Unicode characters to be mistakes and offer a fixit to remove them.
  llvm-svn: 173369
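The "tentatively try to read a UCN, otherwise fall back" step can be sketched as follows. This is an illustration only: it checks syntax (`\u` plus four hex digits, `\U` plus eight) and ignores which code points a given standard permits; `hexValueSketch` and `tryReadUCNSketch` are hypothetical names, not functions from Lexer.cpp:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Returns the value of a hex digit, or -1 if c is not one.
inline int hexValueSketch(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Tries to read a UCN starting at text[pos]. On success, stores the code
// point and returns the number of characters consumed; on failure returns 0
// and leaves codePoint untouched, so the caller can fall back to treating
// the backslash as an ordinary token.
inline std::size_t tryReadUCNSketch(const std::string &text, std::size_t pos,
                                    std::uint32_t &codePoint) {
  if (pos + 1 >= text.size() || text[pos] != '\\') return 0;
  std::size_t digits = 0;
  if (text[pos + 1] == 'u') digits = 4;
  else if (text[pos + 1] == 'U') digits = 8;
  if (digits == 0 || pos + 2 + digits > text.size()) return 0;

  std::uint32_t value = 0;
  for (std::size_t k = 0; k < digits; ++k) {
    int h = hexValueSketch(text[pos + 2 + k]);
    if (h < 0) return 0;  // not well-formed: fall back
    value = (value << 4) | static_cast<std::uint32_t>(h);
  }
  codePoint = value;
  return 2 + digits;
}
```

Returning 0 without consuming input is what makes the attempt tentative: a malformed `\u12` is then lexed the old way, as a backslash followed by an identifier starting with `u`.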
* Remove useless 'llvm::' qualifier from names like StringRef and others that are | Dmitri Gribenko | 2013-01-12 | 1 | -1/+1
  brought into 'clang' namespace by clang/Basic/LLVM.h
  llvm-svn: 172323
* Pull the bulk of Lexer::MeasureTokenLength() out into a new function, | Argyrios Kyrtzidis | 2013-01-07 | 1 | -5/+15
  Lexer::getRawToken(). No functionality change.
  llvm-svn: 171771
* s/CPlusPlus0x/CPlusPlus11/g | Richard Smith | 2013-01-02 | 1 | -7/+7
  llvm-svn: 171367
* Sort all of Clang's files under 'lib', and fix up the broken headers | Chandler Carruth | 2012-12-04 | 1 | -4/+4
  uncovered. This required manually correcting all of the incorrect main-module headers I could find, and running the new llvm/utils/sort_includes.py script over the files. I also manually added quite a few missing headers that were uncovered by shuffling the order or moving headers up to be main-module-headers.
  llvm-svn: 169237
* Teach Lexer::getSpelling about raw string literals. Specifically, if a raw | Richard Smith | 2012-11-28 | 1 | -42/+67
  string literal needs cleaning (because it contains line-splicing in the encoding prefix or in the ud-suffix), do not clean the section between the double-quotes -- that's the "raw" bit!
  llvm-svn: 168776
* Fix crash on end-of-file after \ in a char literal, fixes PR14369. | Nico Weber | 2012-11-17 | 1 | -6/+8
  This makes LexCharConstant() look more like LexStringLiteral(), which doesn't have this bug. Add tests for eof after \ for several other cases.
  llvm-svn: 168269
* Fix an assertion failure printing the unused-label fixit in files using CRLF line endings. | Eli Friedman | 2012-11-14 | 1 | -1/+8
  <rdar://problem/12639047>.
  llvm-svn: 167900
* Revert r167801, "[preprocessor] When #including something that contributes no | Daniel Dunbar | 2012-11-13 | 1 | -22/+0
  tokens at all,". This change broke External/Nurbs in LLVM test-suite.
  llvm-svn: 167858
* UCNs in char literals are done (in LiteralSupport), remove FIXME. Expand UCN FIXME in LexNumericConstant. | Nico Weber | 2012-11-13 | 1 | -2/+1
  llvm-svn: 167818
* [preprocessor] When #including something that contributes no tokens at all, | Argyrios Kyrtzidis | 2012-11-13 | 1 | -0/+22
  don't recursively continue lexing. This avoids a stack overflow with a sequence of many empty #includes. rdar://11988695
  llvm-svn: 167801
* In Lexer::LexTokenInternal, avoid code duplication; no functionality change. | Argyrios Kyrtzidis | 2012-11-13 | 1 | -39/+26
  llvm-svn: 167800
* s/BCPLComment/LineComment/ | Nico Weber | 2012-11-11 | 1 | -22/+22
  llvm-svn: 167690
* Take into account that there may be a BOM at the beginning of the file, | Argyrios Kyrtzidis | 2012-10-25 | 1 | -3/+6
  when computing the size of the precompiled preamble.
  llvm-svn: 166659
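As a rough illustration of the adjustment, a UTF-8 byte order mark occupies three bytes (EF BB BF) before the first source character, so any size measured from the start of the text has to account for it. The helper below is a hypothetical sketch, not the code this commit changed:

```cpp
#include <cstddef>
#include <string>

// Returns 3 if the buffer starts with a UTF-8 BOM, otherwise 0.
inline std::size_t utf8BomLengthSketch(const std::string &buffer) {
  return buffer.compare(0, 3, "\xEF\xBB\xBF") == 0 ? 3 : 0;
}
```

A caller that measures a preamble relative to the first real character would add this length back to get an offset from the start of the file.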
* StringRef'ize Preprocessor::CreateString(). | Dmitri Gribenko | 2012-09-24 | 1 | -1/+1
  llvm-svn: 164555
* Don't cast away const needlessly. Found by gcc48 -Wcast-qual. | Roman Divacky | 2012-09-06 | 1 | -1/+2
  llvm-svn: 163325
* Make a bunch of methods on Lexer private. | Eli Friedman | 2012-08-31 | 1 | -1/+1
  llvm-svn: 162970
* Lexer: remove dead stores. Found by Clang static analyzer! | Dmitri Gribenko | 2012-07-30 | 1 | -5/+2
  llvm-svn: 160973
* Add warning flag -Winvalid-pp-token for preprocessing-tokens which have | Richard Smith | 2012-06-28 | 1 | -3/+3
  undefined behaviour, and move the diagnostic for '' from an Error into an ExtWarn in this group. This is important for some users of the preprocessor, and is necessary for gcc compatibility.
  llvm-svn: 159335
* Documentation cleanup: | James Dennett | 2012-06-17 | 1 | -11/+7
  * Removed docs for Lexer::makeFileCharRange from Lexer.cpp, as they're in the header file;
  * Reworked the documentation for SkipBlockComment so that it doesn't confuse Doxygen's comment parsing;
  * Added another summary with \brief markup.
  llvm-svn: 158618
* [-E] Emit a rewritten _Pragma on its own line. | Jordan Rose | 2012-06-15 | 1 | -1/+1
  1. Teach Lexer that pragma lexers are like macro expansions at EOF.
  2. Treat pragmas like #define/#undef when printing.
  3. If we just printed a directive, add a newline before any more tokens.
  (4. Miscellaneous cleanup in PrintPreprocessedOutput.cpp)
  PR10594 and <rdar://problem/11562490> (two separate related problems)
  llvm-svn: 158571
* Documentation cleanup: escape backslashes in Doxygen comments. | James Dennett | 2012-06-15 | 1 | -4/+5
  llvm-svn: 158552
* PR12717: Clang supports hexadecimal floating-point literals in all language | Richard Smith | 2012-06-15 | 1 | -2/+14
  modes. For languages other than C99/C11, this isn't quite a conforming extension, and for C++11, it breaks some reasonable code containing user-defined literals.
  In languages which don't officially have hexfloats, pare back this extension to only apply in cases where the token starts 0x and does not contain an underscore. The extension is still not quite conforming, but it's a lot closer now.
  llvm-svn: 158487
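The narrowed condition in the second paragraph can be written as a small predicate over the token's spelling. This is a sketch of the rule as the message states it, with a hypothetical function name; accepting an uppercase `0X` prefix is an assumption, since the message only mentions `0x`:

```cpp
#include <string>

// True if the hex-float extension would still apply to this token in a
// language without official hexfloats, per the rule in the commit message:
// the token starts with 0x and contains no underscore.
inline bool hexFloatExtensionAppliesSketch(const std::string &token) {
  bool startsWithHexPrefix =
      token.size() >= 2 && token[0] == '0' &&
      (token[1] == 'x' || token[1] == 'X');  // 0X accepted by assumption
  return startsWithHexPrefix && token.find('_') == std::string::npos;
}
```

The underscore test matters for C++11 because a token such as `0x1p+1_km` is more plausibly a user-defined literal than a hex float.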
* Fix PR13065. | David Blaikie | 2012-06-15 | 1 | -1/+1
  This condition (added in r158093) was overly conservative.
  llvm-svn: 158483