Add file comments for ScriptParser.cpp.

llvm-svn: 295023
author: Rui Ueyama <ruiu@google.com> 2017-02-14 04:47:24 +0000
committer: Rui Ueyama <ruiu@google.com> 2017-02-14 04:47:24 +0000
commit: 4c82b4f6fa4da4981d680012391de8f4c75bab9f (patch)
tree: c0499edaa55d790a3bfaf0de40f9ae8d41ded7d3 /lld/ELF/ScriptLexer.cpp
parent: 794366a237c566f5bc8230910ff9dde789f96c01 (diff)
1 files changed, 31 insertions, 2 deletions
diff --git a/lld/ELF/ScriptLexer.cpp b/lld/ELF/ScriptLexer.cpp
index 6398a52a026..418ec93695f 100644
--- a/lld/ELF/ScriptLexer.cpp
+++ b/lld/ELF/ScriptLexer.cpp
@@ -7,8 +7,37 @@
 //
 //===----------------------------------------------------------------------===//
 //
-// This file contains the base parser class for linker script and dynamic
-// list.
+// This file defines a lexer for the linker script.
+//
+// The linker script's grammar is not complex, but it is ambiguous because
+// the language has no formal specification. What we are trying to
+// do in this and other files in LLD is to make a "reasonable" linker
+// script processor.
+//
+// Among simplicity, compatibility and efficiency, we put the most
+// emphasis on simplicity when we wrote this lexer. Compatibility with the
+// GNU linkers is important, but we did not try to clone every tiny corner
+// case of their lexers, as even ld.bfd and ld.gold are subtly different
+// in various corner cases. We do not care much about efficiency because
+// the time spent in parsing linker scripts is usually negligible.
+//
+// Our grammar of the linker script is LL(2), meaning that it needs at
+// most two-token lookahead to parse. The only place we need two-token
+// lookahead is labels in version scripts, where we need to parse "local :"
+// as if it were "local:".
+//
+// Overall, this lexer works fine for most linker scripts. There's room
+// for improving compatibility, but that's probably not at the top of our
+// todo list.
+//
+// A caveat: This lexer splits an input string into tokens ahead of time,
+// so the lexer is not context aware. There's one known corner case. Let's
+// say the next string is "val*3" (without quotes). In the context where
+// the parser is expecting an expression, that should be tokenized to
+// "val", "*" and "3". In other contexts, it should be just a single
+// token. (If it is in a filename context, it'll be interpreted as a glob
+// pattern, for example.)  We want to fix this, but it probably needs a
+// redesign of this lexer.
 //
 //===----------------------------------------------------------------------===//
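
The two behaviours described in the new file comment (tokenizing ahead of time without context, and needing two tokens of lookahead for "local :") can be illustrated with a small standalone sketch. This is not LLD's code: tokenizeAheadOfTime, splitForExpression and isLocalLabel are hypothetical helpers invented for illustration, and the choice of "{", "}" and ";" as the only single-character tokens is an assumption.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical context-unaware lexer: splits on whitespace and treats
// '{', '}' and ';' as one-character tokens. It has no idea whether the
// parser will later want an expression or a filename pattern.
std::vector<std::string> tokenizeAheadOfTime(const std::string &input) {
  const std::string punct = "{};";
  std::vector<std::string> tokens;
  std::size_t i = 0;
  while (i < input.size()) {
    if (std::isspace(static_cast<unsigned char>(input[i]))) {
      ++i;
      continue;
    }
    if (punct.find(input[i]) != std::string::npos) {
      tokens.push_back(std::string(1, input[i]));
      ++i;
      continue;
    }
    std::size_t start = i;
    while (i < input.size() &&
           !std::isspace(static_cast<unsigned char>(input[i])) &&
           punct.find(input[i]) == std::string::npos)
      ++i;
    tokens.push_back(input.substr(start, i - start));
  }
  return tokens;
}

// What an expression context would want instead: the same text broken
// around arithmetic operators.
std::vector<std::string> splitForExpression(const std::string &token) {
  const std::string ops = "+-*/";
  std::vector<std::string> parts;
  std::string current;
  for (char c : token) {
    if (ops.find(c) != std::string::npos) {
      if (!current.empty())
        parts.push_back(current);
      current.clear();
      parts.push_back(std::string(1, c));
    } else {
      current.push_back(c);
    }
  }
  if (!current.empty())
    parts.push_back(current);
  return parts;
}

// Two-token lookahead: accept either the single token "local:" or the
// pair "local", ":" starting at position pos.
bool isLocalLabel(const std::vector<std::string> &tokens, std::size_t pos) {
  if (pos < tokens.size() && tokens[pos] == "local:")
    return true;
  return pos + 1 < tokens.size() && tokens[pos] == "local" &&
         tokens[pos + 1] == ":";
}
```

In this sketch, tokenizeAheadOfTime("val*3") yields the single token "val*3", which suits a glob pattern, while splitForExpression("val*3") yields "val", "*" and "3". Because the split happens before the parser knows which context applies, a pre-tokenizing lexer has to commit to one of the two.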