| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
| |
Rewriting the same predicates over and over again is bad for code size and
code maintenance. Using the functions in <ctype.h> is generally unsafe
unless they are specified to be locale-independent (i.e. only isdigit and
isxdigit).
The next commit will try to clean up uses of <ctype.h> functions within Clang.
llvm-svn: 174765
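
As an illustration of the kind of shared, locale-independent predicates this message argues for, here is a minimal sketch. The namespace and function names are hypothetical and are not taken from the commit itself; the only point is that each check compares against fixed ASCII ranges instead of consulting the C locale.

```cpp
// Hypothetical helpers for illustration; not Clang's actual API.
namespace ascii {

// True only for '0'..'9', whatever the current C locale is.
constexpr bool isDigit(unsigned char c) { return c >= '0' && c <= '9'; }

// True only for 'a'..'z' and 'A'..'Z'; bytes >= 0x80 never match.
constexpr bool isLetter(unsigned char c) {
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// Characters that may start an ASCII identifier.
constexpr bool isIdentifierHead(unsigned char c) {
  return isLetter(c) || c == '_';
}

// Characters that may continue an ASCII identifier.
constexpr bool isIdentifierBody(unsigned char c) {
  return isIdentifierHead(c) || isDigit(c);
}

// Space, tab, form feed and vertical tab; newlines are excluded.
constexpr bool isHorizontalWhitespace(unsigned char c) {
  return c == ' ' || c == '\t' || c == '\f' || c == '\v';
}

} // namespace ascii
```

Taking `unsigned char` also sidesteps the undefined behaviour that the <ctype.h> functions have when passed a negative `char` value.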
|
|
|
|
|
|
|
|
|
| |
This allows people to use Unicode in their #pragma mark and in macros
that exist only to be string-ized.
<rdar://problem/13107323&13121362>
llvm-svn: 174081
|
|
|
|
|
|
|
|
| |
This caused hangs as we processed the same invalid byte over and over.
<rdar://problem/13115651>
llvm-svn: 173959
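
The general way to avoid this class of hang is to guarantee forward progress: when a byte cannot start a valid sequence, consume it anyway. The sketch below is a simplified illustration and not the code from this commit; `sequenceLength` and `countInvalidBytes` are hypothetical names, and the validity check ignores overlong encodings and surrogates.

```cpp
#include <cstddef>
#include <string>

// Returns the length of the UTF-8 sequence starting at s[i], or 0 if the
// lead byte is invalid or the sequence is truncated or malformed.
// Simplification: overlong forms and surrogates are not rejected.
std::size_t sequenceLength(const std::string &s, std::size_t i) {
  unsigned char lead = static_cast<unsigned char>(s[i]);
  std::size_t len;
  if (lead < 0x80) len = 1;
  else if ((lead & 0xE0) == 0xC0) len = 2;
  else if ((lead & 0xF0) == 0xE0) len = 3;
  else if ((lead & 0xF8) == 0xF0) len = 4;
  else return 0;
  if (i + len > s.size()) return 0;
  for (std::size_t k = 1; k < len; ++k)
    if ((static_cast<unsigned char>(s[i + k]) & 0xC0) != 0x80) return 0;
  return len;
}

// Scans the buffer and counts bytes that are not part of a valid sequence.
// The important part is the ++i on the error path: the loop always advances,
// so the same invalid byte is never examined twice.
std::size_t countInvalidBytes(const std::string &s) {
  std::size_t invalid = 0;
  for (std::size_t i = 0; i < s.size();) {
    std::size_t len = sequenceLength(s, i);
    if (len == 0) {
      ++invalid;
      ++i; // consume the bad byte
    } else {
      i += len;
    }
  }
  return invalid;
}
```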
|
|
|
|
|
|
| |
This is required to use them in TableGen.
llvm-svn: 173924
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
People use the C preprocessor for things other than C files. Some of them
have Unicode characters. We shouldn't warn about Unicode characters
appearing outside of identifiers in this case.
There's not currently a way for the preprocessor to tell if it's in -E mode,
so I added a new flag, derived from the PreprocessorOutputOptions. This is
only used by the Unicode warnings for now, but could conceivably be used by
other warnings or even behavioral differences later.
<rdar://problem/13107323>
llvm-svn: 173881
|
|
|
|
|
|
| |
Fixes a crash. Thanks, Richard.
llvm-svn: 173701
|
|
|
|
|
|
|
| |
Unfortunately, we can't accept the UCN as an extension because we're
required to treat it as two tokens for preprocessing purposes.
llvm-svn: 173622
|
|
|
|
| |
llvm-svn: 173447
|
|
|
|
|
|
| |
Thanks, Dmitri.
llvm-svn: 173400
|
|
|
|
| |
llvm-svn: 173371
|
|
|
|
| |
llvm-svn: 173370
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a missing piece for C99 conformance.
This patch handles UCNs by adding a '\\' case to LexTokenInternal and
LexIdentifier -- if we see a backslash, we tentatively try to read in a UCN.
If the UCN is not syntactically well-formed, we fall back to the old
treatment: a backslash followed by an identifier beginning with 'u' (or 'U').
Because the spelling of an identifier with UCNs still has the UCN in it, we
need to convert that to UTF-8 in Preprocessor::LookUpIdentifierInfo.
Of course, valid code that does *not* use UCNs will see only a very minimal
performance hit (checks after each identifier for non-ASCII characters,
checks when converting raw_identifiers to identifiers that they do not
contain UCNs, and checks when getting the spelling of an identifier that it
does not contain a UCN).
This patch also adds basic support for actual UTF-8 in the source. This is
treated almost exactly the same as UCNs except that we consider stray
Unicode characters to be mistakes and offer a fixit to remove them.
llvm-svn: 173369
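
The "tentatively read a UCN, otherwise fall back" idea and the later conversion of a spelling to UTF-8 can be sketched as follows. This is a standalone illustration, not the code in LexTokenInternal or Preprocessor::LookUpIdentifierInfo; the function names are hypothetical, and rejecting surrogates and values above U+10FFFF inside the reader is an assumption made to keep the encoder simple.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Tries to read \uXXXX or \UXXXXXXXX starting at text[pos]. On success,
// stores the code point and returns the number of characters consumed.
// Returns 0 if the text is not a well-formed UCN, so the caller can fall
// back to treating the backslash as an ordinary character.
std::size_t tryReadUCN(const std::string &text, std::size_t pos,
                       std::uint32_t &codePoint) {
  if (pos + 1 >= text.size() || text[pos] != '\\') return 0;
  std::size_t digits;
  if (text[pos + 1] == 'u') digits = 4;
  else if (text[pos + 1] == 'U') digits = 8;
  else return 0;
  if (pos + 2 + digits > text.size()) return 0;

  std::uint32_t value = 0;
  for (std::size_t i = 0; i < digits; ++i) {
    char c = text[pos + 2 + i];
    std::uint32_t d;
    if (c >= '0' && c <= '9') d = static_cast<std::uint32_t>(c - '0');
    else if (c >= 'a' && c <= 'f') d = static_cast<std::uint32_t>(c - 'a' + 10);
    else if (c >= 'A' && c <= 'F') d = static_cast<std::uint32_t>(c - 'A' + 10);
    else return 0;
    value = (value << 4) | d;
  }
  // Assumption: surrogates and out-of-range values are treated as malformed.
  if (value > 0x10FFFF || (value >= 0xD800 && value <= 0xDFFF)) return 0;
  codePoint = value;
  return 2 + digits;
}

// Appends the UTF-8 encoding of cp, which must be a valid scalar value.
void appendUTF8(std::uint32_t cp, std::string &out) {
  if (cp < 0x80) {
    out += static_cast<char>(cp);
  } else if (cp < 0x800) {
    out += static_cast<char>(0xC0 | (cp >> 6));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else if (cp < 0x10000) {
    out += static_cast<char>(0xE0 | (cp >> 12));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else {
    out += static_cast<char>(0xF0 | (cp >> 18));
    out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  }
}

// Rewrites an identifier spelling so that every well-formed UCN becomes
// UTF-8; anything else is copied through unchanged.
std::string spellingToUTF8(const std::string &spelling) {
  std::string out;
  for (std::size_t i = 0; i < spelling.size();) {
    std::uint32_t cp = 0;
    std::size_t consumed = tryReadUCN(spelling, i, cp);
    if (consumed != 0) {
      appendUTF8(cp, out);
      i += consumed;
    } else {
      out += spelling[i++];
    }
  }
  return out;
}
```

With this sketch, the spelling `caf\u00E9` maps to the same bytes as the identifier written directly in UTF-8, which is what lets both forms resolve to one identifier entry.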
|
|
|
|
|
|
| |
brought into 'clang' namespace by clang/Basic/LLVM.h
llvm-svn: 172323
|
|
|
|
|
|
|
|
| |
Lexer::getRawToken().
No functionality change.
llvm-svn: 171771
|
|
|
|
| |
llvm-svn: 171367
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
uncovered.
This required manually correcting all of the incorrect main-module
headers I could find, and running the new llvm/utils/sort_includes.py
script over the files.
I also manually added quite a few missing headers that were uncovered by
shuffling the order or moving headers up to be main-module-headers.
llvm-svn: 169237
|
|
|
|
|
|
|
|
| |
string literal needs cleaning (because it contains line-splicing in the
encoding prefix or in the ud-suffix), do not clean the section between the
double-quotes -- that's the "raw" bit!
llvm-svn: 168776
|
|
|
|
|
|
|
| |
This makes LexCharConstant() look more like LexStringLiteral(), which doesn't
have this bug. Add tests for eof after \ for several other cases.
llvm-svn: 168269
|
|
|
|
|
|
| |
line endings. <rdar://problem/12639047>.
llvm-svn: 167900
|
|
|
|
|
|
| |
tokens at all,". This change broke External/Nurbs in LLVM test-suite.
llvm-svn: 167858
|
|
|
|
|
|
| |
FIXME in LexNumericConstant.
llvm-svn: 167818
|
|
|
|
|
|
|
|
|
| |
don't recursively continue lexing.
This avoids a stack overflow with a sequence of many empty #includes.
rdar://11988695
llvm-svn: 167801
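
The recursion-to-loop change described here can be pictured with a much simplified model. This is not Clang's lexer: `Source` and `lexNext` are hypothetical, and files are modelled as a stack of token lists. When the current file is exhausted, the function pops it and loops, instead of calling itself to continue in the including file, so stack depth stays constant however many empty files are processed.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one file being lexed.
struct Source {
  std::vector<std::string> tokens;
  std::size_t next;
  Source() : next(0) {}
};

// Stores the next token in 'out' and returns true, or returns false once
// every source is exhausted. Reaching the end of a file pops it and retries
// in the same loop rather than recursing.
bool lexNext(std::vector<Source> &includeStack, std::string &out) {
  while (!includeStack.empty()) {
    Source &top = includeStack.back();
    if (top.next < top.tokens.size()) {
      out = top.tokens[top.next++];
      return true;
    }
    includeStack.pop_back(); // end of file: resume the includer iteratively
  }
  return false;
}
```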
|
|
|
|
| |
llvm-svn: 167800
|
|
|
|
| |
llvm-svn: 167690
|
|
|
|
|
|
| |
when computing the size of the precompiled preamble.
llvm-svn: 166659
|
|
|
|
| |
llvm-svn: 164555
|
|
|
|
| |
llvm-svn: 163325
|
|
|
|
| |
llvm-svn: 162970
|
|
|
|
| |
llvm-svn: 160973
|
|
|
|
|
|
|
|
| |
undefined behaviour, and move the diagnostic for '' from an Error into
an ExtWarn in this group. This is important for some users of the preprocessor,
and is necessary for gcc compatibility.
llvm-svn: 159335
|
|
|
|
|
|
|
|
|
|
| |
* Removed docs for Lexer::makeFileCharRange from Lexer.cpp, as they're in
the header file;
* Reworked the documentation for SkipBlockComment so that it doesn't confuse
Doxygen's comment parsing;
* Added another summary with \brief markup.
llvm-svn: 158618
|
|
|
|
|
|
|
|
|
|
|
| |
1. Teach Lexer that pragma lexers are like macro expansions at EOF.
2. Treat pragmas like #define/#undef when printing.
3. If we just printed a directive, add a newline before any more tokens.
(4. Miscellaneous cleanup in PrintPreprocessedOutput.cpp)
PR10594 and <rdar://problem/11562490> (two separate related problems)
llvm-svn: 158571
|
|
|
|
| |
llvm-svn: 158552
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
modes. For languages other than C99/C11, this isn't quite a conforming
extension, and for C++11, it breaks some reasonable code containing
user-defined literals.
In languages which don't officially have hexfloats, pare back this extension
to only apply in cases where the token starts 0x and does not contain an
underscore. The extension is still not quite conforming, but it's a lot closer
now.
llvm-svn: 158487
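
The narrowed rule can be written as a small predicate. The sketch below is only an illustration of the condition stated above; `LangMode` and `mayLexAsHexFloat` are hypothetical names, and accepting an uppercase `0X` prefix as well as `0x` is an assumption.

```cpp
#include <string>

// Hypothetical stand-in for the relevant language option.
struct LangMode {
  bool hasHexFloats; // true in modes that officially have hexfloats
};

// Decides whether a 'p' exponent may extend the numeric token lexed so far.
// In modes without official hexfloats, the extension applies only when the
// token starts with 0x (or 0X, by assumption) and contains no underscore.
bool mayLexAsHexFloat(const std::string &tokenSoFar, const LangMode &mode) {
  if (mode.hasHexFloats) return true;
  bool startsWithHexPrefix =
      tokenSoFar.size() >= 2 && tokenSoFar[0] == '0' &&
      (tokenSoFar[1] == 'x' || tokenSoFar[1] == 'X');
  return startsWithHexPrefix &&
         tokenSoFar.find('_') == std::string::npos;
}
```

Under this rule a token such as `0x1.8` can still take a `p+3` exponent, while a token with a user-defined-literal style suffix such as `1_p` is left alone, so a following `+` is lexed as a separate operator.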
|
|
|
|
|
|
| |
This condition (added in r158093) was overly conservative.
llvm-svn: 158483
|
|
|
|
|
|
| |
to a change done long ago in r57393.
llvm-svn: 158243
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was a problem for people who write 'return(result);'
Also fix ARCMT's corresponding code, though there's no test case for this
because implicit casts like this are rejected by the migrator for being
ambiguous, and explicit casts have no problem.
<rdar://problem/11577346>
llvm-svn: 158130
|
|
|
|
|
|
|
|
|
| |
only expands #include directives.
Patch contributed by Lubos Lunak (l.lunax@suse.cz).
Review by Matt Beaumont-Gay (matthewbg@google.com).
llvm-svn: 158093
|
|
|
|
| |
llvm-svn: 158091
|
|
|
|
|
|
| |
so in a less malloc-intensive way.
llvm-svn: 157064
|
|
|
|
|
|
| |
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20120409/056126.html
llvm-svn: 154655
|
|
|
|
| |
llvm-svn: 154643
|
|
|
|
|
|
|
|
| |
MicrosoftMode. Hence create ext_ms_reserved_user_defined_literal that doesn't default to Error; otherwise MSVC headers won't parse.
Fixes PR12383.
llvm-svn: 154273
|
|
|
|
|
|
|
|
|
|
| |
(Lex to AST).
The member variable is always "LangOpts" and the member function is always "getLangOpts".
Reviewed by Chris Lattner
llvm-svn: 152536
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
starting with an underscore is ill-formed.
Since this rule rejects programs that were using <inttypes.h>'s macros, recover
from this error by treating the ud-suffix as a separate preprocessing-token,
with a DefaultError ExtWarn. The approach of treating such cases as two tokens
is under discussion for standardization, but is in any case a conforming
extension and allows existing codebases to keep building while the committee
makes up its mind.
Reword the warning on the definition of literal operators not starting with
underscores (which are, strangely, legal) to more explicitly state that such
operators can't be called by literals. Remove the special-case diagnostic for
hexfloats, since it was both triggering in the wrong cases and incorrect.
llvm-svn: 152287
|
|
|
|
|
|
|
| |
identifiers, in cases where those identifiers would be treated as
user-defined literal suffixes in C++11.
llvm-svn: 152198
|
|
|
|
|
|
|
|
|
|
| |
grammar requires a string-literal and not a user-defined-string-literal. The
two constructs are still represented by the same TokenKind, in order to prevent
a combinatorial explosion of different kinds of token. A flag on Token tracks
whether a ud-suffix is present, in order to prevent clients from needing to look
at the token's spelling.
llvm-svn: 152098
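
The design described here, one token kind plus a flag, might look roughly like the following. This is a hedged sketch rather than Clang's Token class: the enum values, the flag name `HasUDSuffix`, and `isPlainStringLiteral` are all chosen for illustration.

```cpp
// Hypothetical token kinds; note there is no separate kind for
// user-defined string literals.
enum class TokenKind { StringLiteral, NumericConstant, Identifier };

class Token {
public:
  enum Flags : unsigned char {
    HasUDSuffix = 0x01 // set by the lexer when a ud-suffix follows the literal
  };

  explicit Token(TokenKind kind) : kind_(kind), flags_(0) {}

  TokenKind kind() const { return kind_; }
  void setFlag(Flags f) { flags_ = static_cast<unsigned char>(flags_ | f); }
  bool hasUDSuffix() const { return (flags_ & HasUDSuffix) != 0; }

private:
  TokenKind kind_;
  unsigned char flags_;
};

// A grammar position that requires a plain string-literal can reject a
// user-defined one by testing the flag, without re-reading the spelling.
bool isPlainStringLiteral(const Token &tok) {
  return tok.kind() == TokenKind::StringLiteral && !tok.hasUDSuffix();
}
```

Keeping the suffix as a flag means code that switches on the token kind does not need a new case for every literal kind with and without a suffix.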
|
|
|
|
|
|
|
| |
kinds as the underlying string literals, and we silently drop the ud-suffix;
those issues will be fixed by subsequent patches.
llvm-svn: 152012
|
|
|
|
|
|
|
| |
instead of a SourceRange, and handle the case where the range is
a char (not token) range.
llvm-svn: 149677
|
|
|
|
|
|
|
|
|
| |
of macro arguments.
For "MAC1( MAC2(foo) )" and location of 'foo' token it would return
"MAC1" instead of "MAC2".
llvm-svn: 148704
|