path: root/llvm/lib/MC/MCParser/AsmLexer.cpp
Commit message | Author | Age | Files | Lines
* [MC] Fix floating-point literal lexing. | Eli Friedman | 2019-03-28 | 1 | -10/+13
  This patch has three related fixes to improve float literal lexing:
  1. Make AsmLexer::LexDigit handle floats without a decimal point more consistently.
  2. Make AsmLexer::LexFloatLiteral print an error for floats which are apparently missing an "e".
  3. Make APFloat::convertFromString use binutils-compatible exponent parsing.
  Together, this fixes some cases where a float would be incorrectly rejected, fixes some cases where the compiler would crash, and improves diagnostics in some cases.
  Patch by Brandon Jones.
  Differential Revision: https://reviews.llvm.org/D57321
  llvm-svn: 357214
* [NFC] Fix typos: preceeding -> preceding | Jordan Rupprecht | 2019-02-23 | 1 | -1/+1
  llvm-svn: 354715
* Update the file headers across all of the LLVM projects in the monorepo | Chandler Carruth | 2019-01-19 | 1 | -4/+3
  to reflect the new license.
  We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
  Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
  llvm-svn: 351636
* [WebAssembly] replaced .param/.result by .functype | Wouter van Oortmerssen | 2018-11-19 | 1 | -1/+6
  Summary: This makes it easier/cleaner to generate a single signature from this directive. Also:
  - Adds the symbol name, such that we don't depend on the location of this directive anymore.
  - Actually constructs the signature in the assembler, and makes the assembler own it.
  - Refactors the use of MVT vs ValType in the streamer and assembler to require fewer conversions overall.
  - Changed 700 or so tests to use it.
  Reviewers: sbc100, dschuff
  Subscribers: jgravelle-google, eraman, aheejin, sunfish, jfb, llvm-commits
  Differential Revision: https://reviews.llvm.org/D54652
  llvm-svn: 347228
* [MC] Separate masm integer literal lexer support from inline asm | Reid Kleckner | 2018-10-24 | 1 | -16/+20
  Summary: This renames the IsParsingMSInlineAsm member variable of AsmLexer to LexMasmIntegers and moves it up to MCAsmLexer. This is the only behavior controlled by that variable. I added a public setter, so that it can be set from outside or from the llvm-mc command line. We may need to arrange things so that users can get this behavior from clang, but that's future work.
  I also put additional hex literal lexing functionality under this flag to fix PR32973. It appears that this hex literal parsing wasn't intended to be enabled in non-masm-style blocks.
  Now, masm integers (0b1101 and 0ABCh) work in __asm blocks from clang, but 0b label references work when using .intel_syntax in standalone .s files. However, 0b label references will *not* work from __asm blocks in clang. They will work from GCC inline asm blocks, which it sounds like is important for Crypto++ as mentioned in PR36144.
  Essentially, we only lex masm literals for inline asm blobs that use intel syntax. If the .intel_syntax directive is used inside a gnu-style inline asm statement, masm literals will not be lexed, which is compatible with gas and llvm-mc standalone .s assembly.
  This fixes PR36144 and PR32973.
  Reviewers: Gerolf, avt77
  Subscribers: eraman, hiraditya, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53535
  llvm-svn: 345189
* [MC] Fix regression tests on Windows when git “core.autocrlf” is set to true. | Zhen Cao | 2017-11-17 | 1 | -0/+2
  Differential Revision: https://reviews.llvm.org/D39737
  This is the second attempt to commit this. The test was broken on Linux in the first attempt.
  llvm-svn: 318560
* Revert "[MC] Fix regression tests on Windows when git “core.autocrlf” is set to true." | Rafael Espindola | 2017-11-17 | 1 | -2/+0
  This reverts commit r318528.
  MC/AsmParser/preserve-comments-crlf.s fails on linux.
  llvm-svn: 318533
* [MC] Fix regression tests on Windows when git “core.autocrlf” is set to true. | Zhen Cao | 2017-11-17 | 1 | -0/+2
  Differential Revision: https://reviews.llvm.org/D39737
  llvm-svn: 318528
* [MC] Lex CRLF as one token | Reid Kleckner | 2017-10-16 | 1 | -1/+9
  This will prevent doubling of line endings when parsing assembly and emitting assembly. Otherwise we'd parse the directive, consume the end of statement, hit the next end of statement, and emit a fresh newline.
  llvm-svn: 315943
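To illustrate the CRLF change, here is a minimal sketch that counts end-of-line tokens while treating "\r\n" as a single token. The function name `countLineEndings` is hypothetical and is not part of AsmLexer; the real lexer works on token objects rather than counts.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from AsmLexer.cpp): counts end-of-line tokens,
// treating the two-character sequence "\r\n" as one token.
inline std::size_t countLineEndings(const std::string &Src) {
  std::size_t Count = 0;
  for (std::size_t I = 0; I < Src.size(); ++I) {
    // If we see "\r\n", step onto the '\n' so the pair is counted once.
    if (Src[I] == '\r' && I + 1 < Src.size() && Src[I + 1] == '\n')
      ++I;
    if (Src[I] == '\r' || Src[I] == '\n')
      ++Count;
  }
  return Count;
}
```

With this rule, a file with Windows line endings yields the same number of end-of-statement tokens as the same file with Unix line endings, so re-emitting the assembly does not double the newlines.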
* [MC] - Don't assert when non-english characters are used. | George Rimar | 2017-10-04 | 1 | -12/+13
  I found that llvm-mc does not like non-english characters even in comments, which it tries to tokenize.
  The problem happens because functions like isdigit() and isalnum() take an int argument and expect it not to be negative. At the same time, MCParser uses char* to store the input buffer pointer, and char is signed, so it is possible to pass a negative value to one of the functions above, which triggers an assert. A testcase for demonstration is provided.
  To fix the issue, helper functions were introduced in StringExtras.h.
  Differential revision: https://reviews.llvm.org/D38461
  llvm-svn: 314883
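The signed-char pitfall described above can be sketched as follows. The helpers `isDigitSafe` and `countDigits` are hypothetical names for illustration only; they are not the helpers that were added to StringExtras.h. The sketch uses the standard remedy of converting to `unsigned char` before calling a `<cctype>` function.

```cpp
#include <cctype>

// Hypothetical helper: convert to unsigned char first so that bytes >= 0x80
// never reach std::isdigit as a negative int.
inline bool isDigitSafe(char C) {
  return std::isdigit(static_cast<unsigned char>(C)) != 0;
}

// Counts decimal digits in a NUL-terminated buffer that may contain
// non-ASCII (for example UTF-8) bytes.
inline int countDigits(const char *Buf) {
  int N = 0;
  for (; *Buf; ++Buf)
    if (isDigitSafe(*Buf))
      ++N;
  return N;
}
```

Calling `countDigits` on a comment containing accented characters is then well defined, whereas passing the raw `char` to `std::isdigit` would be undefined behavior on platforms where `char` is signed.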
* Sort the remaining #include lines in include/... and lib/.... | Chandler Carruth | 2017-06-06 | 1 | -1/+1
  I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days.
  I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch.
  This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files.
  Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again).
  llvm-svn: 304787
* [MC] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). | Eugene Zelenko | 2017-02-10 | 1 | -7/+3
  llvm-svn: 294685
* Add a comment consumer mechanism to MCAsmLexer | Oliver Stannard | 2016-12-08 | 1 | -0/+15
  This allows clients to register an AsmCommentConsumer with the MCAsmLexer, which receives a callback each time a comment is parsed.
  Differential Revision: https://reviews.llvm.org/D27511
  llvm-svn: 289036
* Prevent out of order HashDirective lexing in AsmLexer. | Nirav Dave | 2016-10-03 | 1 | -26/+17
  Retrying after buildbot reset.
  To lex hash directives we peek ahead to find component tokens, create a unified token, and unlex the peeked tokens so the parser does not need to parse the tokens then. Make sure we do not lex another hash directive during the peek operation.
  This fixes PR28921.
  Reviewers: rnk, loladiro
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D24839
  llvm-svn: 283111
* Revert "[MC] Prevent out of order HashDirective lexing in AsmLexer." | Nirav Dave | 2016-10-01 | 1 | -17/+26
  This reverts commit r282992 which appears to be causing an LTO test failure.
  llvm-svn: 283034
* Use StringRef instead of raw pointers in MCAsmInfo/MCInstrInfo APIs (NFC) | Mehdi Amini | 2016-10-01 | 1 | -3/+3
  llvm-svn: 283018
* [MC] Prevent out of order HashDirective lexing in AsmLexer. | Nirav Dave | 2016-10-01 | 1 | -26/+17
  To lex hash directives we peek ahead to find component tokens, create a unified token, and unlex the peeked tokens so the parser does not need to parse the tokens then. Make sure we do not lex another hash directive during the peek operation.
  This fixes PR28921.
  Reviewers: rnk, loladiro
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D24839
  llvm-svn: 282992
* (LLVM part) Implement MASM-flavor intel syntax behavior for inline MS asm block: | Yunzhong Gao | 2016-09-02 | 1 | -2/+42
  1. 0xNN and NNh are accepted as valid hexadecimal numbers, but 0xNNh is not. 0xNN and NNh may come with optional U or L suffix.
  2. NNb is accepted as a valid binary (base-2) number, but 0bNN is not. NNb may come with optional U or L suffix.
  Differential Revision: https://reviews.llvm.org/D22112
  llvm-svn: 280555
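The two acceptance rules above can be sketched as a small standalone parser. This is an illustration under stated assumptions, not the code from this commit: `parseDigits` and `parseMasmInteger` are hypothetical names, the optional U or L suffix is not handled, overflow is not checked, and a leading decimal digit is required (an assumption borrowed from MASM convention).

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: parse all of Digits in the given radix (2, 10 or 16).
// Fails on an empty string or on any character that is not a valid digit.
inline bool parseDigits(const std::string &Digits, unsigned Radix,
                        std::uint64_t &Out) {
  if (Digits.empty())
    return false;
  Out = 0;
  for (char C : Digits) {
    unsigned V;
    if (C >= '0' && C <= '9')
      V = C - '0';
    else if (C >= 'a' && C <= 'f')
      V = C - 'a' + 10;
    else if (C >= 'A' && C <= 'F')
      V = C - 'A' + 10;
    else
      return false;
    if (V >= Radix)
      return false;
    Out = Out * Radix + V;
  }
  return true;
}

// Accepts 0xNN, NNh and NNb; rejects 0xNNh and 0bNN.
inline bool parseMasmInteger(const std::string &Tok, std::uint64_t &Out) {
  if (Tok.empty() || Tok[0] < '0' || Tok[0] > '9')
    return false; // assumption: literals start with a decimal digit
  if (Tok.size() > 2 && (Tok[1] == 'x' || Tok[1] == 'X'))
    return Tok[0] == '0' && parseDigits(Tok.substr(2), 16, Out);
  char Last = Tok.back();
  std::string Body = Tok.substr(0, Tok.size() - 1);
  if (Last == 'h' || Last == 'H')
    return parseDigits(Body, 16, Out);
  if (Last == 'b' || Last == 'B')
    return parseDigits(Body, 2, Out);
  return parseDigits(Tok, 10, Out);
}
```

Because the `0x` prefix path parses everything after the prefix as hex digits, a trailing `h` is rejected there, and because the suffix path only strips a final `b`, the form `0bNN` falls through to the decimal path and fails on the letter.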
* Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes. | Eugene Zelenko | 2016-08-23 | 1 | -5/+10
  Differential revision: https://reviews.llvm.org/D23789
  llvm-svn: 279535
* Re-commit r277988: [mips][ias] Fix all the hacks related to MIPS-specific unary operators (%hi/%lo/%gp_rel/etc.). | Daniel Sanders | 2016-08-08 | 1 | -1/+43
  Hopefully with the MSVC builds fixed. I've added a missing '#include <tuple>' that gcc and clang don't seem to need.
  llvm-svn: 277995
* Revert r277988: [mips][ias] Fix all the hacks related to MIPS-specific unary operators (%hi/%lo/%gp_rel/etc.). | Daniel Sanders | 2016-08-08 | 1 | -41/+1
  It seems that MSVC doesn't like std::tie().
  llvm-svn: 277990
* [mips][ias] Fix all the hacks related to MIPS-specific unary operators (%hi/%lo/%gp_rel/etc.). | Daniel Sanders | 2016-08-08 | 1 | -1/+41
  Summary: They are now lexed as a single token on targets where MCAsmInfo::HasMipsExpressions is true and then parsed in a similar way to the '~' operator as part of MCExpr::parseExpression.
  As a result:
  * expressions and immediates no longer have different parsing rules. The difference is now solely down to whether evaluateAsAbsolute() succeeds.
  * %hi(%neg(%gp_rel(x))) are no longer parsed as a single operator and decomposed into the three MipsMCExpr nodes. They are parsed directly as three MipsMCExpr nodes.
  * parseMemOperand no longer needs to eat all the surrounding parentheses to get at the outermost operator to make this work.
  * %hi(%neg(%gp_rel(x))) and %lo(%neg(%gp_rel(x))) are no longer the only 3-in-1 relocs that parse for N64. They're still the only combinations that are permitted in relocatable expressions though. Fixing that should be a later patch.
  * We no longer need to list all the tokens that can occur as the first token of an expression or immediate.
  test/MC/Mips/expr1.s: This change also prevents the incorrect lowering of %lo(2*4)+foo to %lo(8+foo), which is not an equivalent expression (the difference is whether foo is truncated to 16-bit or not), and the test has been updated to account for the macro expansion the correct expression requires.
  Reviewers: sdardis
  Subscribers: dsanders, sdardis, llvm-commits
  Differential Revision: https://reviews.llvm.org/D23110
  llvm-svn: 277988
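The remark that %lo(2*4)+foo and %lo(8+foo) are not equivalent can be checked with a little arithmetic. The sketch below models %lo as "keep the low 16 bits", which is a deliberate simplification (it ignores sign extension and relocation handling); `lo16`, `loThenAdd` and `addThenLo` are hypothetical names, not LLVM functions.

```cpp
#include <cstdint>

// Simplified stand-in for %lo: keep only the low 16 bits of a value.
constexpr std::uint32_t lo16(std::uint32_t V) { return V & 0xFFFFu; }

// Models %lo(2*4)+foo: only the constant is truncated, foo is added in full.
constexpr std::uint32_t loThenAdd(std::uint32_t Foo) {
  return lo16(2 * 4) + Foo;
}

// Models %lo(8+foo): the symbol value is truncated together with the constant.
constexpr std::uint32_t addThenLo(std::uint32_t Foo) {
  return lo16(8 + Foo);
}
```

For a small value of foo such as 0x100 both forms give 0x108, but for foo = 0x12345678 the first gives 0x12345680 and the second gives 0x5680, which is why folding the addition into the operator changes the result.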
* [MC] When emitting output hash comments always use standard line comment separator | Nirav Dave | 2016-07-29 | 1 | -1/+1
  llvm-svn: 277146
* Refactor and cleanup Assembly Parsing / Lexing | Nirav Dave | 2016-06-17 | 1 | -60/+77
  Recommitting after fixing non-atomic insert to front of SmallVector in MCAsmLexer.h
  Add explicit Comment Token in Assembly Lexing for future support for outputting explicit comments from inline assembly. As part of this, CPPHash Directives are now explicitly distinguished from Hash line comments in Lexer.
  Line comments are recorded as EndOfStatement tokens, not Comment tokens, to simplify compatibility with current TargetParsers. This slightly complicates comment output.
  This removes all lexing tasks from the parser, does minor cleanup to remove extraneous newlines from asm output, and makes some improvements to whitespace handling.
  Reviewers: rtrieu, dwmw2, rnk
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D20009
  llvm-svn: 273007
* Revert "Refactor and cleanup Assembly Parsing / Lexing" | Nirav Dave | 2016-06-16 | 1 | -77/+60
  Reverting for unexpected crashes on various platforms.
  This reverts commit r272953.
  llvm-svn: 272957
* Refactor and cleanup Assembly Parsing / Lexing | Nirav Dave | 2016-06-16 | 1 | -60/+77
  Add explicit Comment Token in Assembly Lexing for future support for outputting explicit comments from inline assembly. As part of this, CPPHash Directives are now explicitly distinguished from Hash line comments in Lexer.
  Line comments are recorded as EndOfStatement tokens, not Comment tokens, to simplify compatibility with current TargetParsers. This slightly complicates comment output.
  This removes all lexing tasks from the parser, does minor cleanup to remove extraneous newlines from asm output, and makes some improvements to whitespace handling.
  Reviewers: rtrieu, dwmw2, rnk
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D20009
  llvm-svn: 272953
* Ignore Lexing errors in macro body definitions | Nirav Dave | 2016-06-02 | 1 | -1/+1
  Do not issue lexing errors found during the parsing of macro body definitions and the parseIdentifier function in AsmParser. This changes the Parser to not issue a lexing error when we reach an error, but rather when it is consumed, allowing us time to examine and recover from an error.
  As a result, we stop issuing both a lexing error and a parsing error in the floating-literals test.
  Minor tweak to parseDirectiveRealValue to favor the more meaningful lexing error over the less helpful parse error.
  Reviewers: rnk, majnemer
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D20535
  llvm-svn: 271542
* [MCParser] Accept uppercase radix variants 0X and 0B | Colin LeMahieu | 2016-03-18 | 1 | -2/+2
  Differential Revision: http://reviews.llvm.org/D14781
  llvm-svn: 263802
* Remove uses of builtin comma operator. | Richard Trieu | 2016-02-18 | 1 | -23/+42
  Cleanup for upcoming Clang warning -Wcomma. No functionality change intended.
  llvm-svn: 261270
* Extend MCAsmLexer so that it can peek forward several tokens | Benjamin Kramer | 2015-08-17 | 1 | -3/+13
  This commit adds a virtual `peekTokens()` function to `MCAsmLexer` which can peek forward an arbitrary number of tokens.
  It also makes the `peekTok()` method call the `peekTokens()` method, but only requesting one token.
  The idea is to better support targets which have more ambiguous assembly syntaxes.
  Patch by Dylan McKay!
  llvm-svn: 245221
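The relationship between `peekTok()` and `peekTokens()` described above can be sketched with a toy lexer. `ToyLexer` is a hypothetical class for illustration only: it works on pre-split strings and does not reproduce the MCAsmLexer interface or its token type.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a lexer; tokens are pre-split strings for brevity.
class ToyLexer {
  std::vector<std::string> Source;
  std::size_t Pos = 0;

public:
  explicit ToyLexer(std::vector<std::string> Toks) : Source(std::move(Toks)) {}

  // Consume and return the next token, or "<eof>" at the end of input.
  std::string lex() { return Pos < Source.size() ? Source[Pos++] : "<eof>"; }

  // Fill Buf with upcoming tokens without consuming them.
  // Returns the number of tokens actually written.
  std::size_t peekTokens(std::vector<std::string> &Buf) const {
    std::size_t N = 0;
    for (; N < Buf.size() && Pos + N < Source.size(); ++N)
      Buf[N] = Source[Pos + N];
    return N;
  }

  // Single-token lookahead expressed as a one-element peekTokens() request.
  std::string peekTok() const {
    std::vector<std::string> Buf(1);
    return peekTokens(Buf) ? Buf[0] : std::string("<eof>");
  }
};
```

Expressing the single-token peek through the multi-token one keeps a single lookahead code path, so a target that needs to look several tokens ahead to disambiguate its syntax does not need a separate mechanism.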
* Fix uses of reserved identifiers starting with an underscore followed by an uppercase letter | David Blaikie | 2015-03-16 | 1 | -1/+1
  This covers essentially all of llvm's headers and libs. One or two weird cases I wasn't sure were worth/appropriate to fix.
  llvm-svn: 232394
* MC: AsmLexer: handle multi-character CommentStrings correctly | Saleem Abdulrasool | 2014-08-14 | 1 | -5/+13
  As X86MCAsmInfoDarwin uses '##' as CommentString although a single '#' starts a comment, a workaround for this special case is added.
  Fixes divisions in constant expressions for the AArch64 assembler and other targets which use '//' as CommentString.
  Patch by Janne Grunau!
  llvm-svn: 215615
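Why a multi-character comment string matters for division can be sketched as follows. `isAtStartOfComment` is a hypothetical name used here for illustration; the sketch compares the whole comment string rather than only its first character, and it does not model the Darwin '##' special case mentioned above.

```cpp
#include <cstring>

// Hypothetical helper: true only if Ptr begins with the entire comment string.
// An empty comment string never matches.
inline bool isAtStartOfComment(const char *Ptr, const char *CommentString) {
  std::size_t Len = std::strlen(CommentString);
  return Len != 0 && std::strncmp(Ptr, CommentString, Len) == 0;
}
```

With '//' as the comment string, the input `/ 2` is not treated as a comment, so an expression such as `8 / 2` can still be lexed as a division, while `// note` is recognized as a comment.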
* This only needs a StringRef. | Rafael Espindola | 2014-07-06 | 1 | -11/+8
  llvm-svn: 212401
* MC: do not add comment string to the AsmToken in AsmLexer::LexLineComment | Saleem Abdulrasool | 2014-06-18 | 1 | -2/+2
  Fixes macros with varargs if the macro instantiation has a trailing comment.
  Patch by Janne Grunau!
  llvm-svn: 211219
* [C++] Use 'nullptr'. | Craig Topper | 2014-04-24 | 1 | -4/+4
  llvm-svn: 207083
* MCParser: add a single token lookahead | Saleem Abdulrasool | 2014-02-09 | 1 | -0/+22
  Some of the more complex directive and macro handling for GAS compatibility requires lookahead. Add a single token lookahead in the MCAsmLexer.
  llvm-svn: 201058
* MC: Add AsmLexer::BigNum token for integers greater than 64 bits | David Woodhouse | 2014-02-01 | 1 | -17/+17
  This will be needed for .octa support, but we don't want to just use the existing AsmLexer::Integer for it and then have to litter all its users with explicit checks for the size, and make them use the new getAPIntVal() method.
  So let the lexer produce an AsmLexer::Integer as before for numbers which are small enough, which appears to cover what was previously a nasty special case handling of numbers which don't fit in int64_t but *do* fit in uint64_t.
  Where the number is too large even for that, produce an AsmLexer::BigNum instead. We do nothing with these except complain about them for now, but that will be changed shortly...
  Based on a patch from PaX Team <pageexec@freemail.hu>
  llvm-svn: 200613
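The Integer versus BigNum split can be sketched as a classification on whether a literal fits in 64 unsigned bits. `TokKind` and `classifyDecimal` are hypothetical names for this illustration; the actual lexer works with APInt and handles other radices, which this sketch does not.

```cpp
#include <cstdint>
#include <string>

enum class TokKind { Integer, BigNum, Error };

// Hypothetical helper: classify a decimal literal by whether it fits in
// uint64_t. Returns BigNum as soon as the accumulated value would overflow
// (remaining characters are not validated in that case).
inline TokKind classifyDecimal(const std::string &Digits) {
  if (Digits.empty())
    return TokKind::Error;
  std::uint64_t V = 0;
  for (char C : Digits) {
    if (C < '0' || C > '9')
      return TokKind::Error;
    std::uint64_t D = static_cast<std::uint64_t>(C - '0');
    if (V > (UINT64_MAX - D) / 10)
      return TokKind::BigNum; // V * 10 + D would not fit in 64 bits
    V = V * 10 + D;
  }
  return TokKind::Integer;
}
```

Values up to 18446744073709551615 stay ordinary integers, including those above the int64_t range, and only larger values become BigNum, which mirrors the intent described in the message.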
* Cache AllowAtInIdentifier as class variable in AsmLexer | David Peixotto | 2013-12-06 | 1 | -1/+1
  This commit caches the value of the AllowAtInIdentifier variable as a class variable in AsmLexer. We do this to avoid repeated MAI queries and string comparisons each time we lex an identifier.
  llvm-svn: 196622
* Integrated assembler incorrectly lexes ARM-style comments | David Peixotto | 2013-12-06 | 1 | -4/+7
  The integrated assembler fails to properly lex arm comments when they are adjacent to an identifier in the input stream. The reason is that the arm comment symbol '@' is also used as symbol variant in other assembly languages, so when lexing an identifier it allows the '@' symbol as part of the identifier.
  Example:
      $ cat comment.s
      foo:
        add r0, r0@got to parse this as a comment
      $ llvm-mc -triple armv7 comment.s
      comment.s:4:18: error: unexpected token in argument list
        add r0, r0@got to parse this as a comment
                       ^
  This should be parsed correctly as `add r0, r0`.
  This commit modifies the assembly lexer to not include the '@' symbol in identifiers when lexing for targets that use '@' for comments.
  llvm-svn: 196607
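The effect of allowing or disallowing '@' inside identifiers can be sketched like this. `isIdentifierChar` and `lexIdentifier` below are simplified, hypothetical versions written for this illustration; the set of accepted characters is an assumption and may differ from what AsmLexer accepts.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Simplified identifier-character test. AllowAt controls whether '@' may
// appear inside an identifier (it should be false when '@' starts a comment).
inline bool isIdentifierChar(char C, bool AllowAt) {
  return std::isalnum(static_cast<unsigned char>(C)) || C == '_' ||
         C == '$' || C == '.' || (AllowAt && C == '@');
}

// Returns the longest identifier prefix of Src.
inline std::string lexIdentifier(const std::string &Src, bool AllowAt) {
  std::size_t End = 0;
  while (End < Src.size() && isIdentifierChar(Src[End], AllowAt))
    ++End;
  return Src.substr(0, End);
}
```

With `AllowAt` set to false, lexing `r0@got to parse ...` stops at `r0`, leaving the '@' to be handled as the start of a comment; with it set to true, the identifier swallows `@got`, which is the behavior that produced the error shown in the message.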
* MC asm parser: allow ?'s in symbol names, and handle @'s in names in MS asm | Hans Wennborg | 2013-10-18 | 1 | -2/+2
  This is another (final?) stab at making us able to parse our own asm output on Windows.
  Symbols on Windows often contain @'s and ?'s in their names. Our asm parser didn't like this. ?'s were not allowed, and @'s were interpreted as trying to reference PLT/GOT/etc.
  We can't just add quotes around the bad names, since e.g. for MinGW, we use gas to assemble, and it doesn't like quotes in some places (notably in .def directives).
  This commit makes us allow ?'s in symbol names, and @'s in symbol names for MS assembly.
  Differential Revision: http://llvm-reviews.chandlerc.com/D1978
  llvm-svn: 193000
* Support C99 hexadecimal floating-point literals in assembly | Tim Northover | 2013-08-14 | 1 | -1/+53
  It's useful to be able to write down floating-point numbers without having to worry about what they'll be rounded to (as C99 discovered). This extends that ability to the MC assembly parsers.
  llvm-svn: 188370
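As a reminder of what the C99 hexadecimal floating-point notation means, the sketch below parses such literals with the standard library's `std::strtod`, which accepts the format. This is only an illustration of the notation; `parseHexFloat` is a hypothetical wrapper and the MC lexer does not use `strtod`.

```cpp
#include <cstdlib>

// Hypothetical wrapper: parse a C99-style hexadecimal floating-point literal
// such as "0x1.8p1" (hex significand 1.8 = 1.5 decimal, times 2^1).
inline double parseHexFloat(const char *Text) {
  return std::strtod(Text, nullptr);
}
```

Because the significand is written in hexadecimal and the exponent is a power of two, every such literal names an exactly representable binary value (within the type's range and precision), so no decimal rounding is involved.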
* AsmParser: More generic support for integer type suffixes. | Jim Grosbach | 2013-02-26 | 1 | -6/+9
  For integer constants, allow 'L', 'UL' as well as 'ULL' and 'LL'. This provides better support for shared headers between .s and .c files that define bunches of constant values.
  rdar://9321056
  llvm-svn: 176118
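Skipping these suffixes can be sketched as an optional 'U' followed by up to two 'L' characters. `skipIntegerSuffix` is a hypothetical name for this illustration; note that, as written, the sketch also accepts a bare 'U', which goes slightly beyond the four forms listed in the message.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: given the index Pos just past the digits of an integer
// literal in S, return the index just past an optional U, L, UL, LL or ULL
// suffix. Only uppercase suffix letters are handled.
inline std::size_t skipIntegerSuffix(const std::string &S, std::size_t Pos) {
  if (Pos < S.size() && S[Pos] == 'U')
    ++Pos;
  if (Pos < S.size() && S[Pos] == 'L')
    ++Pos;
  if (Pos < S.size() && S[Pos] == 'L')
    ++Pos;
  return Pos;
}
```

A constant such as `42ULL` taken from a header shared with C code then lexes as the integer 42, with the suffix ignored.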
* 'Hexadecimal' has two 'a's and only one 'i'. | Matt Beaumont-Gay | 2013-02-25 | 1 | -1/+1
  llvm-svn: 176031
* Revert r15266. This fixes llvm.org/pr15266. | Rafael Espindola | 2013-02-14 | 1 | -40/+19
  llvm-svn: 175173
* [ms-inline asm] Add support for lexing binary integers with a [bB] suffix. | Chad Rosier | 2013-02-12 | 1 | -19/+40
  This is complicated by backward labels (e.g., 0b can be both a backward label and a binary zero). The current implementation assumes [0-9]b is always a label and thus it's possible for 0b and 1b to not be interpreted correctly for ms-style inline assembly. However, this is relatively simple to fix in the inline assembly (i.e., drop the [bB]).
  This patch also limits backward labels to [0-9]b, so that only 0b and 1b are ambiguous.
  Part of rdar://12470373
  llvm-svn: 174983
* Update error message due to previous commit, r174926. | Chad Rosier | 2013-02-12 | 1 | -1/+3
  llvm-svn: 174927
* [ms-inline asm] Add support for lexing hexadecimal integers with a [hH] suffix. | Chad Rosier | 2013-02-12 | 1 | -14/+47
  Part of rdar://12470373
  llvm-svn: 174926
* Use the new script to sort the includes of every file under lib. | Chandler Carruth | 2012-12-03 | 1 | -2/+2
  Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented.
  Many forward declarations and missing includes were added to header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =]
  llvm-svn: 169131
* Add support for macro parameters/arguments delimited by spaces, | Preston Gurd | 2012-09-19 | 1 | -2/+11
  to improve compatibility with GNU as.
  Based on a patch by PaX Team.
  Fixed assertion failures on non-Darwin and added additional test cases.
  llvm-svn: 164248
* Handle missing newline at EOF more gracefully in MC AsmLexer. | Jim Grosbach | 2011-09-15 | 1 | -1/+8
  If we see an EOF w/o a preceding end-of-line, return an EndOfStatement token before returning the Eof token.
  Based on patch by Stepan Dyatkovskiy.
  llvm-svn: 139798
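The EOF rule above can be sketched with a toy line lexer that synthesizes an end-of-statement token when the input ends without a newline. `Tok` and `lexLines` are hypothetical names for this illustration; each run of non-newline characters is collapsed into a single `Text` token, which is far coarser than what the real lexer produces.

```cpp
#include <string>
#include <vector>

enum class Tok { Text, EndOfStatement, Eof };

// Toy lexer: emits one Text token per non-empty line, an EndOfStatement per
// '\n', and a synthesized EndOfStatement if the input ends mid-line.
inline std::vector<Tok> lexLines(const std::string &Src) {
  std::vector<Tok> Out;
  bool InText = false;
  for (char C : Src) {
    if (C == '\n') {
      Out.push_back(Tok::EndOfStatement);
      InText = false;
    } else if (!InText) {
      Out.push_back(Tok::Text);
      InText = true;
    }
  }
  if (InText)
    Out.push_back(Tok::EndOfStatement); // missing newline at end of input
  Out.push_back(Tok::Eof);
  return Out;
}
```

Under this rule the inputs `nop` and `nop\n` produce the same token sequence, so a parser that expects every statement to be terminated does not need a special case for the last line of a file.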