summaryrefslogtreecommitdiffstats
path: root/clang/lib/Lex/Lexer.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Fix typos in clangAlexander Kornienko2018-04-061-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Found via codespell -q 3 -I ../clang-whitelist.txt Where whitelist consists of: archtype cas classs checkk compres definit frome iff inteval ith lod methode nd optin ot pres statics te thru Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few files that have dubious fixes reverted.) Differential revision: https://reviews.llvm.org/D44188 llvm-svn: 329399
* [Lex] Avoid out-of-bounds dereference in LexAngledStringLiteral.Volodymyr Sapsai2018-01-121-8/+11
| | | | | | | | | | | | | | | | | | | | | | Fix makes the loop in LexAngledStringLiteral more like the loops in LexStringLiteral, LexCharConstant. When we skip a character after backslash, we need to check if we reached the end of the file instead of reading the next character unconditionally. Discovered by OSS-Fuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3832 rdar://problem/35572754 Reviewers: arphaman, kcc, rsmith, dexonsmith Reviewed By: rsmith, dexonsmith Subscribers: cfe-commits, rsmith, dexonsmith Differential Revision: https://reviews.llvm.org/D41423 llvm-svn: 322390
* Warn if we find a Unicode homoglyph for a symbol in an identifier.Richard Smith2017-12-141-1/+78
| | | | | | | | | | | | | | | | Specifically, warn if: * we find a character that the language standard says we must treat as an identifier, and * that character is not reasonably an identifier character (it's a punctuation character or similar), and * it renders identically to a valid non-identifier character in common fixed-width fonts. Some tools "helpfully" substitute the surprising characters for the expected characters, and replacing semicolons with Greek question marks is a common "prank". llvm-svn: 320697
* [Lex] Fix some Clang-tidy modernize and Include What You Use warnings; other ↵Eugene Zelenko2017-12-081-49/+54
| | | | | | minor fixes (NFC). llvm-svn: 320207
* Stringizing raw string literals containing newlineTaewook Oh2017-12-061-56/+65
| | | | | | | | | | | | | | Summary: This patch implements 4.3 of http://open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4220.pdf. If a raw string contains a newline character, replace each newline character with the \n escape code. Without this patch, included test case (macro_raw_string.cpp) results compilation failure. Reviewers: rsmith, doug.gregor, jkorous-apple Reviewed By: jkorous-apple Subscribers: jkorous-apple, vsapsai, cfe-commits Differential Revision: https://reviews.llvm.org/D39279 llvm-svn: 319904
* Now that C++17 is official (https://www.iso.org/standard/68564.html), start ↵Aaron Ballman2017-12-041-2/+2
| | | | | | changing the C++1z terminology over to C++17. NFC intended, these are all mechanical changes. llvm-svn: 319688
* [c++2a] P0515R3: lexer support for new <=> token.Richard Smith2017-12-011-0/+18
| | | | llvm-svn: 319509
* [refactor][extract] insert semicolons into extracted/inserted codeAlex Lorenz2017-11-031-16/+20
| | | | | | | | | | | | | | | | | | when needed This commit implements the semicolon insertion logic into the extract refactoring. The following rules are used: - extracting expression: add terminating ';' to the extracted function. - extracting statements that don't require terminating ';' (e.g. switch): add terminating ';' to the callee. - extracting statements with ';': move (if possible) the original ';' from the callee and add terminating ';'. - otherwise, add ';' to both places. Differential Revision: https://reviews.llvm.org/D39441 llvm-svn: 317343
* Add -f[no-]double-square-bracket-attributes as new driver options to control ↵Aaron Ballman2017-10-151-1/+3
| | | | | | use of [[]] attributes in all language modes. This is the initial implementation of WG14 N2165, which is a proposal to add [[]] attributes to C2x, but also allows you to enable these attributes in C++98, or disable them in C++11 or later. llvm-svn: 315856
* [Lex] Avoid out-of-bounds dereference in SkipLineCommentAlex Lorenz2017-10-141-1/+2
| | | | | | | | | Credit to OSS-Fuzz for discovery: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3145 rdar://34526482 llvm-svn: 315785
* A '<' with a trigraph '#' is not a valid editor placeholderAlex Lorenz2017-10-111-1/+2
| | | | | | | | | Credit to OSS-Fuzz for discovery: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3137#c5 rdar://34923985 llvm-svn: 315398
* Fixed unused variable warning introduced in r313796 causing build failureCameron Desrochers2017-09-201-3/+0
| | | | llvm-svn: 313802
* [PCH] Fixed preamble breaking with BOM presence (and particularly, ↵Cameron Desrochers2017-09-201-7/+7
| | | | | | | | | | | | | | fluctuating BOM presence) This patch fixes broken preamble-skipping when the preamble region includes a byte order mark (BOM). Previously, parsing would fail if preamble PCH generation was enabled and a BOM was present. This also fixes preamble invalidation when a BOM appears or disappears. This may seem to be an obscure edge case, but it happens regularly with IDEs that pass buffer overrides that never (or always) have a BOM, yet the underlying file from the initial parse that generated a PCH might (or might not) have a BOM. I've included a test case for these scenarios. Differential Revision: https://reviews.llvm.org/D37491 llvm-svn: 313796
* [Preprocessor] Correct internal token parsing of newline characters in CRLFErich Keane2017-09-051-2/+3
| | | | | | | | | Correct implementation: Apparently I managed in r311683 to submit the wrong version of the patch for this, so I'm correcting it now. Differential Revision: https://reviews.llvm.org/D37079 llvm-svn: 312542
* [Preprocessor] Correct internal token parsing of newline characters in CRLFErich Keane2017-08-241-0/+2
| | | | | | | | | | | | | | | | | | | | Discovered due to a goofy git setup, the test system-headerline-directive.c (and a few others) failed because the token-consumption will consume only the '\r' in CRLF, making the preprocessor's printed value give the wrong line number when returning from an include. For example: (line 1):#include <noline.h>\r\n The "file exit" code causes the printer to try to print the 'returned to the main file' line. It looks up what the current line number is. However, since the current 'token' is the '\n' (since only the \r was consumed), it will give the line number as '1", not '2'. This results in a few failed tests, but more importantly, results in error messages being incorrect when compiling a previously preprocessed file. Differential Revision: https://reviews.llvm.org/D37079 llvm-svn: 311683
* [Lexer] Finding beginning of token with escaped new lineAlexander Kornienko2017-08-101-28/+44
| | | | | | | | | | | | | | | | | | | | Summary: Lexer::GetBeginningOfToken produced invalid location when backtracking across escaped new lines. This fixes PR26228 Reviewers: akyrtzi, alexfh, rsmith, doug.gregor Reviewed By: alexfh Subscribers: alexfh, cfe-commits Patch by Paweł Żukowski! Differential Revision: https://reviews.llvm.org/D30748 llvm-svn: 310576
* Fix invalid warnings for header guards in preamblesErik Verbruggen2017-07-051-1/+1
| | | | | | | | Fixes https://bugs.llvm.org/show_bug.cgi?id=33574 Differential Revision: https://reviews.llvm.org/D34882 llvm-svn: 307134
* [PR33394] Avoid lexing editor placeholders when Clang is used onlyAlex Lorenz2017-06-161-1/+2
| | | | | | | | | | | | | | | | for preprocessing r300667 added support for editor placeholder to Clang. That commit didn’t take into account that users who use Clang for preprocessing only (-E) will get the "editor placeholder in source file" error when preprocessing their source (PR33394). This commit ensures that Clang doesn't lex editor placeholders when running a preprocessor only action. rdar://32718000 Differential Revision: https://reviews.llvm.org/D34256 llvm-svn: 305576
* Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.Galina Kistanova2017-06-031-0/+2
| | | | llvm-svn: 304643
* Allow for unfinished #if blocks in preamblesErik Verbruggen2017-05-301-28/+11
| | | | | | | | | | | | | | | | | | | Previously, a preamble only included #if blocks (and friends like ifdef) if there was a corresponding #endif before any declaration or definition. The problem is that any header file that uses include guards will not have a preamble generated, which can make code-completion very slow. To prevent errors about unbalanced preprocessor conditionals in the preamble, and unbalanced preprocessor conditionals after a preamble containing unfinished conditionals, the conditional stack is stored in the pch file. This fixes PR26045. Differential Revision: http://reviews.llvm.org/D15994 llvm-svn: 304207
* [Lexer] Ensure that the token is not an annotation token whenAlex Lorenz2017-05-171-0/+4
| | | | | | | | | | | retrieving the identifer info for an Objective-C keyword This commit fixes an assertion that's triggered in getIdentifier when the token is an annotation token. rdar://32225463 llvm-svn: 303246
* Add a fix-it for -Wunguarded-availabilityAlex Lorenz2017-05-051-17/+49
| | | | | | | | | | | | | This patch adds a fix-it for the -Wunguarded-availability warning. This fix-it is similar to the Swift one: it suggests that you wrap the statement in an `if (@available)` check. The produced fixits are indented (just like the Swift ones) to make them look nice in Xcode's fix-it preview. rdar://31680358 Differential Revision: https://reviews.llvm.org/D32424 llvm-svn: 302253
* Add support for editor placeholders to ClangAlex Lorenz2017-04-191-0/+33
| | | | | | | | | | | | | | | | | | | | | This commit teaches Clang to recognize editor placeholders that are produced when an IDE like Xcode inserts a code-completion result that includes a placeholder. Now when the lexer sees a placeholder token, it emits an 'editor placeholder in source file' error and creates an identifier token that represents the placeholder. The parser/sema can now recognize the placeholders and can suppress the diagnostics related to the placeholders. This ensures that live issues in an IDE like Xcode won't get spurious diagnostics related to placeholders. This commit also adds a new compiler option named '-fallow-editor-placeholders' that silences the 'editor placeholder in source file' error. This is useful for an IDE like Xcode as we don't want to display those errors in live issues. rdar://31581400 Differential Revision: https://reviews.llvm.org/D32081 llvm-svn: 300667
* Do not warn about whitespace between ??/ trigraph and newline in line ↵Richard Smith2017-04-181-4/+6
| | | | | | comments if trigraphs are disabled in the current language. llvm-svn: 300609
* Fix mishandling of escaped newlines followed by newlines or nuls.Richard Smith2017-04-171-18/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, if an escaped newline was followed by a newline or a nul, we'd lex the escaped newline as a bogus space character. This led to a bunch of different broken corner cases: For the pattern "\\\n\0#", we would then have a (horizontal) space whose spelling ends in a newline, and would decide that the '#' is at the start of a line, and incorrectly start preprocessing a directive in the middle of a logical source line. If we were already in the middle of a directive, this would result in our attempting to process multiple directives at the same time! This resulted in crashes, asserts, and hangs on invalid input, as discovered by fuzz-testing. For the pattern "\\\n" at EOF (with an implicit following nul byte), we would produce a bogus trailing space character with spelling "\\\n". This was mostly harmless, but would lead to clang-format getting confused and misformatting in rare cases. We now produce a trailing EOF token with spelling "\\\n", consistent with our handling for other similar cases -- an escaped newline is always part of the token containing the next character, if any. For the pattern "\\\n\n", this was somewhat more benign, but would produce an extraneous whitespace token to clients who care about preserving whitespace. However, it turns out that our lexing for line comments was relying on this bug due to an off-by-one error in its computation of the end of the comment, on the slow path where the comment might contain escaped newlines. llvm-svn: 300515
* Skip Unicode character expansion in assembly filesSanne Wouda2017-04-071-9/+11
| | | | | | | | | | | | | | | | | | | Summary: When using the C preprocessor with assembly files, either with a capital `S` file extension, or with `-xassembler-with-cpp`, the Unicode escape sequence `\u` is ignored. The `\u` pattern can be used for expanding a macro argument that starts with `u`. Author: Salman Arif <salman.arif@arm.com> Reviewers: rengolin, olista01 Reviewed By: olista01 Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D31765 llvm-svn: 299754
* Allow lexer to handle string_view literals. Patch from Anton Bikineev.Eric Fiselier2016-12-301-3/+3
| | | | | | | This implements the compiler side of p0403r0. This patch was reviewed as https://reviews.llvm.org/D26829. llvm-svn: 290744
* Move UTF functions into namespace llvm.Justin Lebar2016-09-301-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This lets people link against LLVM and their own version of the UTF library. I determined this only affects llvm, clang, lld, and lldb by running $ git grep -wl 'UTF[0-9]\+\|\bConvertUTF\bisLegalUTF\|getNumBytesFor' | cut -f 1 -d '/' | sort | uniq clang lld lldb llvm Tested with ninja lldb ninja check-clang check-llvm check-lld (ninja check-lldb doesn't complete for me with or without this patch.) Reviewers: rnk Subscribers: klimek, beanz, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D24996 llvm-svn: 282822
* Fix some Clang-tidy modernize-use-using and Include What You Use warnings; ↵Eugene Zelenko2016-09-071-20/+22
| | | | | | | | other minor fixes. Differential revision: https://reviews.llvm.org/D24115 llvm-svn: 280870
* Implement filtering for code completion of identifiers.Vassil Vassilev2016-07-271-1/+9
| | | | | | | | Patch by Cristina Cristescu and Axel Naumann! Agreed on post commit review (D17820). llvm-svn: 276878
* [Lexer] Let the compiler infer string lengths. No functionality change intended.Benjamin Kramer2016-04-011-2/+2
| | | | llvm-svn: 265126
* [Lexer] Don't read out of bounds if a conflict marker is at the end of a fileBenjamin Kramer2016-04-011-1/+1
| | | | | | | | | | This can happen as we look for '<<<<' while scanning tokens but then expect '<<<<\n' to tell apart perforce from diff3 conflict markers. Just harden the pointer arithmetic. Found by libfuzzer + asan! llvm-svn: 265125
* Update diagnostics now that hexadecimal literals look likely to be part of ↵Richard Smith2016-03-041-2/+3
| | | | | | C++17. llvm-svn: 262753
* Remove use of builtin comma operator.Richard Trieu2016-02-181-1/+3
| | | | | | Cleanup for upcoming Clang warning -Wcomma. No functionality change intended. llvm-svn: 261271
* [OpenCL] Adding reserved operator logical xor for OpenCLAnastasia Stulova2016-02-031-0/+3
| | | | | | | | | | | | | | | | | | This patch adds the reserved operator ^^ when compiling for OpenCL (spec v1.1 s6.3.g), which results in a more meaningful error message. Patch by Neil Hickey! Review: http://reviews.llvm.org/D13280 M test/SemaOpenCL/unsupported.cl M include/clang/Basic/TokenKinds.def M include/clang/Basic/DiagnosticParseKinds.td M lib/Basic/OperatorPrecedence.cpp M lib/Lex/Lexer.cpp M lib/Parse/ParseExpr.cpp llvm-svn: 259651
* Fix -Wnull-conversion for long macros.Richard Trieu2016-01-261-0/+25
| | | | | | | | | Move the function to get a macro name from DiagnosticRenderer.cpp to Lexer.cpp so that other files can use it. Lexer now has two functions to get the immediate macro name, the newly added one is better for diagnostic purposes. Make -Wnull-conversion use this function for better NULL macro detection. llvm-svn: 258778
* Emit a -Wmicrosoft warning when treating ^Z as EOF in MS mode.Nico Weber2015-12-291-1/+4
| | | | llvm-svn: 256596
* [clang] Disable Unicode in asm filesVinicius Tinti2015-11-201-2/+6
| | | | | | | | | Clang should not convert tokens to Unicode when preprocessing assembly files. Fixes PR25558. llvm-svn: 253738
* Use %select to merge similar diagnostics. NFCCraig Topper2015-11-141-5/+5
| | | | llvm-svn: 253119
* Disable trigraph and escaped newline expansion on all types of raw string ↵Craig Topper2015-10-221-1/+1
| | | | | | literals not just ASCII type. llvm-svn: 251025
* Replace a few std::string& with StringRef. NFC.Rafael Espindola2015-06-011-1/+1
| | | | | | Patch by Косов Евгений! llvm-svn: 238774
* Fix buffer overflow in LexerKostya Serebryany2015-05-041-1/+1
| | | | | | | | | | | | | | | | | | | | | Summary: Fix PR22407, where the Lexer overflows the buffer when parsing #include<\ (end of file after slash) Test Plan: Added a test that will trigger in asan build. This case is also covered by the clang-fuzzer bot. Reviewers: rnk Reviewed By: rnk Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D9489 llvm-svn: 236466
* Use delegating ctors to reduce code duplication. NFC.Benjamin Kramer2015-03-061-8/+2
| | | | llvm-svn: 231476
* Lex: Don't crash if both conflict markers are on the same lineDavid Majnemer2014-12-141-2/+2
| | | | | | | | | | We would check if the terminator marker is on a newline. However, the logic would end up out-of-bounds if the terminator marker immediately follows the start marker. This fixes PR21820. llvm-svn: 224210
* [c++1z] Support for u8 character literals.Richard Smith2014-11-081-6/+14
| | | | llvm-svn: 221576
* Fix warning in Altivec code when building with GCC 4.8.2 on Ubuntu 14.04.Jay Foad2014-10-291-1/+1
| | | | llvm-svn: 220855
* C++1y is now C++14!Aaron Ballman2014-08-191-2/+2
| | | | | | Changes diagnostic options, language standard options, diagnostic identifiers, diagnostic wording to use c++14 instead of c++1y. It also modifies related test cases to use the updated diagnostic wording. llvm-svn: 215982
* Use StringRef instead of MemoryBuffer&.Rafael Espindola2014-08-121-7/+7
| | | | | | | This code doesn't care where the data it is processing comes from, so a StringRef is probably the most natural interface. llvm-svn: 215448
* Change MemoryBuffer* to MemoryBuffer& parameter to Lexer::ComputePreambleDavid Blaikie2014-08-111-9/+9
| | | | | | | | | | | | | (dropping const from the reference as MemoryBuffer is immutable already, so const is just redundant - and while I'd personally put const everywhere, that's not the LLVM Way (see llvm::Type for another example of an immutable type where "const" is omitted for brevity)) Changing the pointer argument to a reference parameter makes call sites identical between callers with unique_ptrs or raw pointers, minimizing the churn in a pending unique_ptr migrations. llvm-svn: 215391
* Hide the concept of diagnostic levels from lex, parse and semaAlp Toker2014-06-151-6/+3
| | | | | | | | | | | | | | | | The compilation pipeline doesn't actually need to know about the high-level concept of diagnostic mappings, and hiding the final computed level presents several simplifications and other potential benefits. The only exceptions are opportunistic checks to see whether expensive code paths can be avoided for diagnostics that are guaranteed to be ignored at a certain SourceLocation. This commit formalizes that invariant by introducing and using DiagnosticsEngine::isIgnored() in place of individual level checks throughout lex, parse and sema. llvm-svn: 211005
OpenPOWER on IntegriCloud