path: root/clang/lib/Lex/Lexer.cpp
Commit message | Author | Age | Files | Lines
* Remove \brief commands from doxygen comments. | Adrian Prantl | 2018-05-09 | 1 | -8/+8
    This is similar to the LLVM change https://reviews.llvm.org/D46290.

    We've been running doxygen with the autobrief option for a couple of
    years now. This makes the \brief markers in our comments redundant.
    Since they are a visual distraction and we don't want to encourage
    more \brief markers in new code either, this patch removes them all.

    Patch produced by

      for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done
      for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

    Differential Revision: https://reviews.llvm.org/D46320

    llvm-svn: 331834
* PR37189 Fix incorrect end source location and spelling for a split '>>' token. | Richard Smith | 2018-04-30 | 1 | -15/+11
    When a '>>' token is split into two '>' tokens (in C++11 onwards), or
    (as an extension) when we do the same for other tokens starting with a
    '>', we can't just use a location pointing to the first '>' as the
    location of the split token, because that would result in our
    miscomputing the length and spelling for the token. As a consequence,
    for example, a refactoring replacing 'A<X>' with something else would
    sometimes replace one character too many, and similarly diagnostics
    highlighting a template-id source range would highlight one character
    too many.

    Fix this by creating an expansion range covering the first character of
    the '>>' token, whose spelling is '>'. For this to work, we generalize
    the expansion range of a macro FileID to be either a token range (the
    common case) or a character range (used in this new case).

    llvm-svn: 331155
* [CodeComplete] Fix completion in the middle of ident in ctor lists. | Ilya Biryukov | 2018-04-25 | 1 | -1/+15
    Summary:
    The example that was broken before (^ designates completion points):

      class Foo {
        Foo() : fie^ld^() {} // no completions were provided here.
        int field;
      };

    To fix it, we no longer cut off lexing after an identifier followed by
    a code completion token is lexed. Instead we skip the rest of the
    identifier and continue lexing. This is consistent with the behavior of
    completion when the completion token is right before the identifier.

    Reviewers: sammccall, aaron.ballman, bkramer, sepavloff, arphaman, rsmith
    Reviewed By: aaron.ballman
    Subscribers: cfe-commits

    Differential Revision: https://reviews.llvm.org/D44932

    llvm-svn: 330833
* [CodeComplete] Fix completion at the end of keywords | Ilya Biryukov | 2018-04-24 | 1 | -8/+12
    Summary:
    Make completion behave consistently no matter if it is run at the
    start, in the middle or at the end of an identifier that happens to be
    a keyword or a macro name. Since completion is often run on incomplete
    identifiers, they may turn into keywords by accident.

    For example, we should produce the same results for all of these
    completion points:

      // ^ is completion point.
      ^class
      cla^ss
      class^

    Previously clang produced different results for the last case (as if
    the completion point was after a space: `class ^`).

    This change also updates some offsets in tests that (unintentionally?)
    relied on the old behavior.

    Reviewers: sammccall, bkramer, arphaman, aaron.ballman
    Reviewed By: sammccall
    Subscribers: cfe-commits

    Differential Revision: https://reviews.llvm.org/D45887

    llvm-svn: 330717
* Fix typos in clang | Alexander Kornienko | 2018-04-06 | 1 | -4/+4
    Found via codespell -q 3 -I ../clang-whitelist.txt

    Where whitelist consists of:

      archtype cas classs checkk compres definit frome iff inteval ith
      lod methode nd optin ot pres statics te thru

    Patch by luzpaz! (This is a subset of D44188 that applies cleanly with
    a few files that have dubious fixes reverted.)

    Differential revision: https://reviews.llvm.org/D44188

    llvm-svn: 329399
* [Lex] Avoid out-of-bounds dereference in LexAngledStringLiteral. | Volodymyr Sapsai | 2018-01-12 | 1 | -8/+11
    Fix makes the loop in LexAngledStringLiteral more like the loops in
    LexStringLiteral, LexCharConstant. When we skip a character after
    backslash, we need to check if we reached the end of the file instead
    of reading the next character unconditionally.

    Discovered by OSS-Fuzz:
    https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3832

    rdar://problem/35572754

    Reviewers: arphaman, kcc, rsmith, dexonsmith
    Reviewed By: rsmith, dexonsmith
    Subscribers: cfe-commits, rsmith, dexonsmith

    Differential Revision: https://reviews.llvm.org/D41423

    llvm-svn: 322390
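The end-of-file check described in the commit above can be sketched outside of Clang as follows. This is a minimal illustration, not the code from the patch: `scanAngledLiteral` is a hypothetical helper, and it works on a `std::string_view` with explicit indices rather than on the lexer's buffer pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper (not Clang's LexAngledStringLiteral): given that
// buf[start] is '<', return the index one past the closing '>', or npos if
// the literal is cut off by a line ending or by the end of the buffer.
std::size_t scanAngledLiteral(std::string_view buf, std::size_t start) {
  assert(start < buf.size() && buf[start] == '<');
  std::size_t i = start + 1;
  while (i < buf.size()) {
    char c = buf[i];
    if (c == '>')
      return i + 1;
    if (c == '\n' || c == '\r')
      return std::string_view::npos;
    if (c == '\\') {
      // Skip the escaped character only if there is one; a trailing
      // backslash at end of buffer must not advance past the end.
      if (i + 1 >= buf.size())
        return std::string_view::npos;
      i += 2;
      continue;
    }
    ++i;
  }
  return std::string_view::npos;
}
```

With this shape, an input that ends right after a backslash is reported as unterminated instead of being read past its end.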
* Warn if we find a Unicode homoglyph for a symbol in an identifier. | Richard Smith | 2017-12-14 | 1 | -1/+78
    Specifically, warn if:
     * we find a character that the language standard says we must treat
       as an identifier, and
     * that character is not reasonably an identifier character (it's a
       punctuation character or similar), and
     * it renders identically to a valid non-identifier character in
       common fixed-width fonts.

    Some tools "helpfully" substitute the surprising characters for the
    expected characters, and replacing semicolons with Greek question
    marks is a common "prank".

    llvm-svn: 320697
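One plausible way to implement such a check is a small lookup table from a suspicious code point to the ASCII character it resembles. The sketch below is an assumption-laden illustration: `lookAlikeFor` and its three entries are chosen for this example and are not Clang's actual table.

```cpp
#include <cassert>

// Illustrative subset only; this is not the table used by Clang.
// Returns the ASCII character that `cp` could be mistaken for, or '\0' if the
// code point is not in the table.
char lookAlikeFor(char32_t cp) {
  assert(cp <= 0x10FFFF && "not a Unicode code point");
  struct Entry {
    char32_t codePoint;
    char looksLike;
  };
  static constexpr Entry kTable[] = {
      {0x01C3, '!'}, // LATIN LETTER RETROFLEX CLICK
      {0x037E, ';'}, // GREEK QUESTION MARK
      {0xA789, ':'}, // MODIFIER LETTER COLON
  };
  for (const Entry &e : kTable)
    if (e.codePoint == cp)
      return e.looksLike;
  return '\0';
}
```

A caller that finds a non-zero result while lexing an identifier could then emit a warning naming both the character found and the one it resembles.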
* [Lex] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). | Eugene Zelenko | 2017-12-08 | 1 | -49/+54
    llvm-svn: 320207
* Stringizing raw string literals containing newline | Taewook Oh | 2017-12-06 | 1 | -56/+65
    Summary:
    This patch implements 4.3 of
    http://open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4220.pdf. If a raw
    string contains a newline character, replace each newline character
    with the \n escape code. Without this patch, the included test case
    (macro_raw_string.cpp) results in a compilation failure.

    Reviewers: rsmith, doug.gregor, jkorous-apple
    Reviewed By: jkorous-apple
    Subscribers: jkorous-apple, vsapsai, cfe-commits

    Differential Revision: https://reviews.llvm.org/D39279

    llvm-svn: 319904
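The replacement rule from the commit above can be sketched as a standalone function. This is a simplified illustration under stated assumptions: `stringizeSpelling` is a hypothetical name, the function escapes every backslash and double quote in its input, and it treats `\r\n` as a single newline; Clang's own stringizing code is more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: wrap `spelling` in double quotes the way the '#'
// operator would, escaping '\' and '"' and turning each physical newline
// into the two-character escape sequence \n.
std::string stringizeSpelling(std::string_view spelling) {
  std::string out;
  out.reserve(spelling.size() + 2);
  out.push_back('"');
  for (std::size_t i = 0; i < spelling.size(); ++i) {
    char c = spelling[i];
    if (c == '\\' || c == '"') {
      out.push_back('\\');
      out.push_back(c);
    } else if (c == '\n' || c == '\r') {
      // Treat "\r\n" as one newline (an assumption of this sketch).
      if (c == '\r' && i + 1 < spelling.size() && spelling[i + 1] == '\n')
        ++i;
      out += "\\n";
    } else {
      out.push_back(c);
    }
  }
  out.push_back('"');
  assert(out.size() >= 2 && out.front() == '"' && out.back() == '"');
  return out;
}
```

The result contains no physical newline, so it remains a valid ordinary string literal even when the raw string argument spanned several lines.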
* Now that C++17 is official (https://www.iso.org/standard/68564.html), start changing the C++1z terminology over to C++17. | Aaron Ballman | 2017-12-04 | 1 | -2/+2
    NFC intended, these are all mechanical changes.

    llvm-svn: 319688
* [c++2a] P0515R3: lexer support for new <=> token. | Richard Smith | 2017-12-01 | 1 | -0/+18
    llvm-svn: 319509
* [refactor][extract] insert semicolons into extracted/inserted code when needed | Alex Lorenz | 2017-11-03 | 1 | -16/+20
    This commit implements the semicolon insertion logic into the extract
    refactoring. The following rules are used:

    - extracting expression: add terminating ';' to the extracted function.
    - extracting statements that don't require terminating ';' (e.g.
      switch): add terminating ';' to the callee.
    - extracting statements with ';': move (if possible) the original ';'
      from the callee and add terminating ';'.
    - otherwise, add ';' to both places.

    Differential Revision: https://reviews.llvm.org/D39441

    llvm-svn: 317343
* Add -f[no-]double-square-bracket-attributes as new driver options to control use of [[]] attributes in all language modes. | Aaron Ballman | 2017-10-15 | 1 | -1/+3
    This is the initial implementation of WG14 N2165, which is a proposal
    to add [[]] attributes to C2x, but also allows you to enable these
    attributes in C++98, or disable them in C++11 or later.

    llvm-svn: 315856
* [Lex] Avoid out-of-bounds dereference in SkipLineComment | Alex Lorenz | 2017-10-14 | 1 | -1/+2
    Credit to OSS-Fuzz for discovery:
    https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3145

    rdar://34526482

    llvm-svn: 315785
* A '<' with a trigraph '#' is not a valid editor placeholder | Alex Lorenz | 2017-10-11 | 1 | -1/+2
    Credit to OSS-Fuzz for discovery:
    https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3137#c5

    rdar://34923985

    llvm-svn: 315398
* Fixed unused variable warning introduced in r313796 causing build failure | Cameron Desrochers | 2017-09-20 | 1 | -3/+0
    llvm-svn: 313802
* [PCH] Fixed preamble breaking with BOM presence (and particularly, fluctuating BOM presence) | Cameron Desrochers | 2017-09-20 | 1 | -7/+7
    This patch fixes broken preamble-skipping when the preamble region
    includes a byte order mark (BOM). Previously, parsing would fail if
    preamble PCH generation was enabled and a BOM was present.

    This also fixes preamble invalidation when a BOM appears or disappears.
    This may seem to be an obscure edge case, but it happens regularly with
    IDEs that pass buffer overrides that never (or always) have a BOM, yet
    the underlying file from the initial parse that generated a PCH might
    (or might not) have a BOM.

    I've included a test case for these scenarios.

    Differential Revision: https://reviews.llvm.org/D37491

    llvm-svn: 313796
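To make the BOM issue concrete, here is a minimal sketch of detecting a UTF-8 byte order mark and accounting for it when skipping a preamble. Both function names are hypothetical, and the convention that the recorded preamble size excludes the BOM is an assumption of this example rather than something the commit message states.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Length in bytes of a leading UTF-8 byte order mark (EF BB BF), or 0 if the
// buffer does not start with one.
std::size_t utf8BOMLength(std::string_view buf) {
  return buf.substr(0, 3) == "\xEF\xBB\xBF" ? 3 : 0;
}

// Hypothetical helper: offset at which lexing resumes after a preamble whose
// recorded size does not count the BOM (an assumption of this sketch).
std::size_t offsetAfterPreamble(std::string_view buf, std::size_t preambleSize) {
  std::size_t offset = utf8BOMLength(buf) + preambleSize;
  assert(offset <= buf.size() && "preamble extends past end of buffer");
  return offset;
}
```

Computing the BOM length from the buffer actually being lexed, rather than caching it, keeps the offset correct when an IDE's buffer override and the on-disk file disagree about the BOM.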
* [Preprocessor] Correct internal token parsing of newline characters in CRLF | Erich Keane | 2017-09-05 | 1 | -2/+3
    Correct implementation: Apparently I managed in r311683 to submit the
    wrong version of the patch for this, so I'm correcting it now.

    Differential Revision: https://reviews.llvm.org/D37079

    llvm-svn: 312542
* [Preprocessor] Correct internal token parsing of newline characters in CRLF | Erich Keane | 2017-08-24 | 1 | -0/+2
    Discovered due to a goofy git setup, the test
    system-headerline-directive.c (and a few others) failed because the
    token-consumption will consume only the '\r' in CRLF, making the
    preprocessor's printed value give the wrong line number when returning
    from an include.

    For example:

      (line 1):#include <noline.h>\r\n

    The "file exit" code causes the printer to try to print the 'returned
    to the main file' line. It looks up what the current line number is.
    However, since the current 'token' is the '\n' (since only the \r was
    consumed), it will give the line number as '1', not '2'. This results
    in a few failed tests, but more importantly, results in error messages
    being incorrect when compiling a previously preprocessed file.

    Differential Revision: https://reviews.llvm.org/D37079

    llvm-svn: 311683
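The off-by-one in the line number can be reproduced with a small model. The sketch below is not the preprocessor's code: `newlineLength` and `lineNumberAt` are hypothetical helpers that treat `\r\n` as a single line ending and only count a line ending once the cursor has moved past all of it.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: number of bytes in the line ending that starts at
// `pos` (2 for "\r\n", 1 for a lone '\n' or '\r'), or 0 if there is none.
std::size_t newlineLength(std::string_view buf, std::size_t pos) {
  assert(pos <= buf.size());
  if (pos == buf.size())
    return 0;
  if (buf[pos] == '\n')
    return 1;
  if (buf[pos] == '\r')
    return (pos + 1 < buf.size() && buf[pos + 1] == '\n') ? 2 : 1;
  return 0;
}

// 1-based line number of `offset`. A line ending only counts once the whole
// sequence lies before `offset`, so a cursor parked on the '\n' of "\r\n" is
// still on the old line.
unsigned lineNumberAt(std::string_view buf, std::size_t offset) {
  assert(offset <= buf.size());
  unsigned line = 1;
  for (std::size_t i = 0; i < offset;) {
    std::size_t n = newlineLength(buf, i);
    if (n == 0) {
      ++i;
      continue;
    }
    if (i + n > offset)
      break;
    ++line;
    i += n;
  }
  return line;
}
```

For the commit's example line, a cursor that consumed only the `\r` still reports line 1, while a cursor advanced by `newlineLength` reports line 2.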
* [Lexer] Finding beginning of token with escaped new line | Alexander Kornienko | 2017-08-10 | 1 | -28/+44
    Summary:
    Lexer::GetBeginningOfToken produced an invalid location when
    backtracking across escaped new lines.

    This fixes PR26228

    Reviewers: akyrtzi, alexfh, rsmith, doug.gregor
    Reviewed By: alexfh
    Subscribers: alexfh, cfe-commits

    Patch by Paweł Żukowski!

    Differential Revision: https://reviews.llvm.org/D30748

    llvm-svn: 310576
* Fix invalid warnings for header guards in preambles | Erik Verbruggen | 2017-07-05 | 1 | -1/+1
    Fixes https://bugs.llvm.org/show_bug.cgi?id=33574

    Differential Revision: https://reviews.llvm.org/D34882

    llvm-svn: 307134
* [PR33394] Avoid lexing editor placeholders when Clang is used only for preprocessing | Alex Lorenz | 2017-06-16 | 1 | -1/+2
    r300667 added support for editor placeholder to Clang. That commit
    didn’t take into account that users who use Clang for preprocessing
    only (-E) will get the "editor placeholder in source file" error when
    preprocessing their source (PR33394). This commit ensures that Clang
    doesn't lex editor placeholders when running a preprocessor only
    action.

    rdar://32718000

    Differential Revision: https://reviews.llvm.org/D34256

    llvm-svn: 305576
* Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC. | Galina Kistanova | 2017-06-03 | 1 | -0/+2
    llvm-svn: 304643
* Allow for unfinished #if blocks in preambles | Erik Verbruggen | 2017-05-30 | 1 | -28/+11
    Previously, a preamble only included #if blocks (and friends like
    ifdef) if there was a corresponding #endif before any declaration or
    definition. The problem is that any header file that uses include
    guards will not have a preamble generated, which can make
    code-completion very slow.

    To prevent errors about unbalanced preprocessor conditionals in the
    preamble, and unbalanced preprocessor conditionals after a preamble
    containing unfinished conditionals, the conditional stack is stored in
    the pch file.

    This fixes PR26045.

    Differential Revision: http://reviews.llvm.org/D15994

    llvm-svn: 304207
* [Lexer] Ensure that the token is not an annotation token when retrieving the identifier info for an Objective-C keyword | Alex Lorenz | 2017-05-17 | 1 | -0/+4
    This commit fixes an assertion that's triggered in getIdentifier when
    the token is an annotation token.

    rdar://32225463

    llvm-svn: 303246
* Add a fix-it for -Wunguarded-availability | Alex Lorenz | 2017-05-05 | 1 | -17/+49
    This patch adds a fix-it for the -Wunguarded-availability warning. This
    fix-it is similar to the Swift one: it suggests that you wrap the
    statement in an `if (@available)` check. The produced fixits are
    indented (just like the Swift ones) to make them look nice in Xcode's
    fix-it preview.

    rdar://31680358

    Differential Revision: https://reviews.llvm.org/D32424

    llvm-svn: 302253
* Add support for editor placeholders to Clang | Alex Lorenz | 2017-04-19 | 1 | -0/+33
    This commit teaches Clang to recognize editor placeholders that are
    produced when an IDE like Xcode inserts a code-completion result that
    includes a placeholder. Now when the lexer sees a placeholder token, it
    emits an 'editor placeholder in source file' error and creates an
    identifier token that represents the placeholder. The parser/sema can
    now recognize the placeholders and can suppress the diagnostics related
    to the placeholders. This ensures that live issues in an IDE like Xcode
    won't get spurious diagnostics related to placeholders.

    This commit also adds a new compiler option named
    '-fallow-editor-placeholders' that silences the 'editor placeholder in
    source file' error. This is useful for an IDE like Xcode as we don't
    want to display those errors in live issues.

    rdar://31581400

    Differential Revision: https://reviews.llvm.org/D32081

    llvm-svn: 300667
* Do not warn about whitespace between ??/ trigraph and newline in line comments if trigraphs are disabled in the current language. | Richard Smith | 2017-04-18 | 1 | -4/+6
    llvm-svn: 300609
* Fix mishandling of escaped newlines followed by newlines or nuls. | Richard Smith | 2017-04-17 | 1 | -18/+10
    Previously, if an escaped newline was followed by a newline or a nul,
    we'd lex the escaped newline as a bogus space character. This led to a
    bunch of different broken corner cases:

    For the pattern "\\\n\0#", we would then have a (horizontal) space
    whose spelling ends in a newline, and would decide that the '#' is at
    the start of a line, and incorrectly start preprocessing a directive in
    the middle of a logical source line. If we were already in the middle
    of a directive, this would result in our attempting to process multiple
    directives at the same time! This resulted in crashes, asserts, and
    hangs on invalid input, as discovered by fuzz-testing.

    For the pattern "\\\n" at EOF (with an implicit following nul byte), we
    would produce a bogus trailing space character with spelling "\\\n".
    This was mostly harmless, but would lead to clang-format getting
    confused and misformatting in rare cases. We now produce a trailing EOF
    token with spelling "\\\n", consistent with our handling for other
    similar cases -- an escaped newline is always part of the token
    containing the next character, if any.

    For the pattern "\\\n\n", this was somewhat more benign, but would
    produce an extraneous whitespace token to clients who care about
    preserving whitespace. However, it turns out that our lexing for line
    comments was relying on this bug due to an off-by-one error in its
    computation of the end of the comment, on the slow path where the
    comment might contain escaped newlines.

    llvm-svn: 300515
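The rule that an escaped newline belongs to the token containing the next character can be modelled with two small helpers. This is a sketch under assumptions, not the lexer's implementation: the names `escapedNewlineSize` and `skipEscapedNewlines` are hypothetical, and only spaces and tabs are accepted between the backslash and the line ending.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: if buf[pos] starts an escaped newline (a backslash,
// optional spaces or tabs, then "\n", "\r" or "\r\n"), return its length in
// bytes; otherwise return 0.
std::size_t escapedNewlineSize(std::string_view buf, std::size_t pos) {
  assert(pos <= buf.size());
  if (pos >= buf.size() || buf[pos] != '\\')
    return 0;
  std::size_t i = pos + 1;
  while (i < buf.size() && (buf[i] == ' ' || buf[i] == '\t'))
    ++i;
  if (i >= buf.size())
    return 0;
  if (buf[i] == '\n')
    return i + 1 - pos;
  if (buf[i] == '\r')
    return (i + 1 < buf.size() && buf[i + 1] == '\n') ? i + 2 - pos
                                                      : i + 1 - pos;
  return 0;
}

// Index of the first character at or after `pos` that is not part of an
// escaped newline. Whatever token starts there (EOF if the result equals
// buf.size()) is the token the skipped escaped newlines are attributed to.
std::size_t skipEscapedNewlines(std::string_view buf, std::size_t pos) {
  while (std::size_t n = escapedNewlineSize(buf, pos))
    pos += n;
  return pos;
}
```

Because the skipped bytes are never emitted as a separate whitespace token, a following `#` is not mistaken for the start of a new line.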
* Skip Unicode character expansion in assembly files | Sanne Wouda | 2017-04-07 | 1 | -9/+11
    Summary:
    When using the C preprocessor with assembly files, either with a
    capital `S` file extension, or with `-xassembler-with-cpp`, the Unicode
    escape sequence `\u` is ignored. The `\u` pattern can be used for
    expanding a macro argument that starts with `u`.

    Author: Salman Arif <salman.arif@arm.com>

    Reviewers: rengolin, olista01
    Reviewed By: olista01
    Subscribers: cfe-commits

    Differential Revision: https://reviews.llvm.org/D31765

    llvm-svn: 299754
* Allow lexer to handle string_view literals. Patch from Anton Bikineev. | Eric Fiselier | 2016-12-30 | 1 | -3/+3
    This implements the compiler side of p0403r0. This patch was reviewed
    as https://reviews.llvm.org/D26829.

    llvm-svn: 290744
* Move UTF functions into namespace llvm. | Justin Lebar | 2016-09-30 | 1 | -12/+12
    Summary:
    This lets people link against LLVM and their own version of the UTF
    library.

    I determined this only affects llvm, clang, lld, and lldb by running

      $ git grep -wl 'UTF[0-9]\+\|\bConvertUTF\bisLegalUTF\|getNumBytesFor' | cut -f 1 -d '/' | sort | uniq
      clang
      lld
      lldb
      llvm

    Tested with

      ninja lldb
      ninja check-clang check-llvm check-lld

    (ninja check-lldb doesn't complete for me with or without this patch.)

    Reviewers: rnk
    Subscribers: klimek, beanz, mgorny, llvm-commits

    Differential Revision: https://reviews.llvm.org/D24996

    llvm-svn: 282822
* Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes. | Eugene Zelenko | 2016-09-07 | 1 | -20/+22
    Differential revision: https://reviews.llvm.org/D24115

    llvm-svn: 280870
* Implement filtering for code completion of identifiers. | Vassil Vassilev | 2016-07-27 | 1 | -1/+9
    Patch by Cristina Cristescu and Axel Naumann!

    Agreed on post commit review (D17820).

    llvm-svn: 276878
* [Lexer] Let the compiler infer string lengths. No functionality change intended. | Benjamin Kramer | 2016-04-01 | 1 | -2/+2
    llvm-svn: 265126
* [Lexer] Don't read out of bounds if a conflict marker is at the end of a file | Benjamin Kramer | 2016-04-01 | 1 | -1/+1
    This can happen as we look for '<<<<' while scanning tokens but then
    expect '<<<<\n' to tell apart perforce from diff3 conflict markers.
    Just harden the pointer arithmetic.

    Found by libfuzzer + asan!

    llvm-svn: 265125
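A bounds-safe way to test for a marker at a given position, in the spirit of the hardening described above, might look like this. It is only a sketch: `hasMarkerAt` is a hypothetical helper built on `std::string_view`, whereas the lexer itself works with raw buffer pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// True if `marker` occurs in `buf` exactly at `pos`. substr clamps the
// requested length to the end of the buffer, so a marker that would run past
// the end simply fails to match instead of being read out of bounds.
bool hasMarkerAt(std::string_view buf, std::size_t pos,
                 std::string_view marker) {
  assert(pos <= buf.size());
  return buf.substr(pos, marker.size()) == marker;
}
```

For a file that ends in `<<<<`, the shorter marker matches but the longer `<<<<\n` form does not, and no byte beyond the buffer is inspected.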
* Update diagnostics now that hexadecimal literals look likely to be part of C++17. | Richard Smith | 2016-03-04 | 1 | -2/+3
    llvm-svn: 262753
* Remove use of builtin comma operator. | Richard Trieu | 2016-02-18 | 1 | -1/+3
    Cleanup for upcoming Clang warning -Wcomma. No functionality change
    intended.

    llvm-svn: 261271
* [OpenCL] Adding reserved operator logical xor for OpenCL | Anastasia Stulova | 2016-02-03 | 1 | -0/+3
    This patch adds the reserved operator ^^ when compiling for OpenCL
    (spec v1.1 s6.3.g), which results in a more meaningful error message.

    Patch by Neil Hickey!

    Review: http://reviews.llvm.org/D13280

    M test/SemaOpenCL/unsupported.cl
    M include/clang/Basic/TokenKinds.def
    M include/clang/Basic/DiagnosticParseKinds.td
    M lib/Basic/OperatorPrecedence.cpp
    M lib/Lex/Lexer.cpp
    M lib/Parse/ParseExpr.cpp

    llvm-svn: 259651
* Fix -Wnull-conversion for long macros. | Richard Trieu | 2016-01-26 | 1 | -0/+25
    Move the function to get a macro name from DiagnosticRenderer.cpp to
    Lexer.cpp so that other files can use it. Lexer now has two functions
    to get the immediate macro name, the newly added one is better for
    diagnostic purposes. Make -Wnull-conversion use this function for
    better NULL macro detection.

    llvm-svn: 258778
* Emit a -Wmicrosoft warning when treating ^Z as EOF in MS mode. | Nico Weber | 2015-12-29 | 1 | -1/+4
    llvm-svn: 256596
* [clang] Disable Unicode in asm files | Vinicius Tinti | 2015-11-20 | 1 | -2/+6
    Clang should not convert tokens to Unicode when preprocessing assembly
    files.

    Fixes PR25558.

    llvm-svn: 253738
* Use %select to merge similar diagnostics. NFC | Craig Topper | 2015-11-14 | 1 | -5/+5
    llvm-svn: 253119
* Disable trigraph and escaped newline expansion on all types of raw string literals not just ASCII type. | Craig Topper | 2015-10-22 | 1 | -1/+1
    llvm-svn: 251025
* Replace a few std::string& with StringRef. NFC. | Rafael Espindola | 2015-06-01 | 1 | -1/+1
    Patch by Косов Евгений!

    llvm-svn: 238774
* Fix buffer overflow in Lexer | Kostya Serebryany | 2015-05-04 | 1 | -1/+1
    Summary:
    Fix PR22407, where the Lexer overflows the buffer when parsing
      #include<\
    (end of file after slash)

    Test Plan:
    Added a test that will trigger in asan build.
    This case is also covered by the clang-fuzzer bot.

    Reviewers: rnk
    Reviewed By: rnk
    Subscribers: cfe-commits

    Differential Revision: http://reviews.llvm.org/D9489

    llvm-svn: 236466
* Use delegating ctors to reduce code duplication. NFC. | Benjamin Kramer | 2015-03-06 | 1 | -8/+2
    llvm-svn: 231476
* Lex: Don't crash if both conflict markers are on the same line | David Majnemer | 2014-12-14 | 1 | -2/+2
    We would check if the terminator marker is on a newline. However, the
    logic would end up out-of-bounds if the terminator marker immediately
    follows the start marker.

    This fixes PR21820.

    llvm-svn: 224210
* [c++1z] Support for u8 character literals. | Richard Smith | 2014-11-08 | 1 | -6/+14
    llvm-svn: 221576
* Fix warning in Altivec code when building with GCC 4.8.2 on Ubuntu 14.04. | Jay Foad | 2014-10-29 | 1 | -1/+1
    llvm-svn: 220855