author | Graydon Hoare <ghoare@apple.com> | 2018-03-27 19:52:45 +0000 |
---|---|---|
committer | Graydon Hoare <ghoare@apple.com> | 2018-03-27 19:52:45 +0000 |
commit | 926cd9b83783e2c55a5289542197c198eeb4cba5 (patch) | |
tree | 8a96d527cd64172370621678fb2025155480e8e8 /llvm/lib/Support/YAMLParser.cpp | |
parent | 0272cb077f4da79a7ac23c4079a29aaa517c2d7f (diff) | |
[YAML] Escape non-printable multibyte UTF8 in Output::scalarString.

The existing YAML Output::scalarString code path includes a partial and
incorrect implementation of YAML escaping logic. In particular, the logic put
in place in rL321283 escapes non-printable bytes only if they are not part of a
multibyte UTF8 sequence; implicitly this means that all multibyte UTF8
sequences -- printable and non -- are passed through verbatim.

The simplest solution to this is to direct the Output::scalarString method to
use the standalone yaml::escape function, and this _almost_ works, except that
the existing code in that function _over_ escapes: any multibyte UTF8 sequence
is escaped, even printable ones. While this is permitted for YAML, it is also
more aggressive (and hard to read for non-English locales) than necessary,
and the entire point of rL321283 was to back off such aggressive over-escaping.

So in this change, I have both redirected Output::scalarString to use
yaml::escape _and_ modified yaml::escape to optionally restrict its escaping to
non-printables. This preserves behaviour of any existing clients while giving
them a path to more moderate escaping should they desire.

Reviewers: JDevlieghere, thegameg, MatzeB, vladimir.plyashkun
Reviewed By: thegameg
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44863
llvm-svn: 328661
Diffstat (limited to 'llvm/lib/Support/YAMLParser.cpp')
-rw-r--r-- | llvm/lib/Support/YAMLParser.cpp | 6 |
1 files changed, 5 insertions, 1 deletions
diff --git a/llvm/lib/Support/YAMLParser.cpp b/llvm/lib/Support/YAMLParser.cpp
index e2f21a56a81..3f71ab8fc6f 100644
--- a/llvm/lib/Support/YAMLParser.cpp
+++ b/llvm/lib/Support/YAMLParser.cpp
@@ -26,6 +26,7 @@
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/SMLoc.h"
 #include "llvm/Support/SourceMgr.h"
+#include "llvm/Support/Unicode.h"
 #include "llvm/Support/raw_ostream.h"
 #include <algorithm>
 #include <cassert>
@@ -687,7 +688,7 @@ bool yaml::scanTokens(StringRef Input) {
   return true;
 }
 
-std::string yaml::escape(StringRef Input) {
+std::string yaml::escape(StringRef Input, bool EscapePrintable) {
   std::string EscapedInput;
   for (StringRef::iterator i = Input.begin(), e = Input.end(); i != e; ++i) {
     if (*i == '\\')
@@ -734,6 +735,9 @@ std::string yaml::escape(StringRef Input) {
         EscapedInput += "\\L";
       else if (UnicodeScalarValue.first == 0x2029)
         EscapedInput += "\\P";
+      else if (!EscapePrintable &&
+               sys::unicode::isPrintable(UnicodeScalarValue.first))
+        EscapedInput += StringRef(i, UnicodeScalarValue.second);
       else {
         std::string HexStr = utohexstr(UnicodeScalarValue.first);
         if (HexStr.size() <= 2)
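
For readers without an LLVM checkout, the effect of the new branch can be approximated with a standalone sketch. This is not the LLVM implementation: `escapeSketch`, `decodeUtf8`, and `isPrintableSketch` are hypothetical names, the printability check is a toy stand-in for `sys::unicode::isPrintable` (which consults Unicode tables), ASCII handling is elided, and the default of `true` for `EscapePrintable` is an assumption made here to mirror the "preserves behaviour of any existing clients" remark in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <utility>

// Toy stand-in (assumption) for sys::unicode::isPrintable: rejects only C1
// controls, the line/paragraph separators, and U+FEFF.
static bool isPrintableSketch(uint32_t CP) {
  if (CP >= 0x80 && CP <= 0x9F) return false;
  if (CP == 0x2028 || CP == 0x2029 || CP == 0xFEFF) return false;
  return true;
}

// Minimal UTF-8 decoder for multibyte sequences: returns {code point, length},
// or length 0 on a malformed sequence. Overlong forms are not rejected here.
static std::pair<uint32_t, unsigned> decodeUtf8(const std::string &S,
                                                std::size_t Pos) {
  assert(Pos < S.size());
  unsigned char B0 = static_cast<unsigned char>(S[Pos]);
  unsigned Len = (B0 & 0xE0) == 0xC0 ? 2
               : (B0 & 0xF0) == 0xE0 ? 3
               : (B0 & 0xF8) == 0xF0 ? 4 : 0;
  if (Len == 0 || Pos + Len > S.size()) return {0, 0};
  uint32_t CP = B0 & (0xFF >> (Len + 1));  // keep the payload bits of byte 0
  for (unsigned K = 1; K < Len; ++K) {
    unsigned char B = static_cast<unsigned char>(S[Pos + K]);
    if ((B & 0xC0) != 0x80) return {0, 0};  // not a continuation byte
    CP = (CP << 6) | (B & 0x3F);
  }
  return {CP, Len};
}

// Sketch of the escaping policy: with EscapePrintable == false, printable
// multibyte sequences are copied through; everything else is escaped.
std::string escapeSketch(const std::string &In, bool EscapePrintable = true) {
  std::string Out;
  for (std::size_t I = 0; I < In.size(); ++I) {
    if (static_cast<unsigned char>(In[I]) < 0x80) {
      Out += In[I];  // ASCII escaping is elided in this sketch
      continue;
    }
    std::pair<uint32_t, unsigned> D = decodeUtf8(In, I);
    if (D.second == 0) {      // malformed input: emit U+FFFD and stop
      Out += "\xEF\xBF\xBD";  // (a policy chosen for this sketch)
      return Out;
    }
    uint32_t CP = D.first;
    if (CP == 0x85) Out += "\\N";         // YAML named escapes
    else if (CP == 0xA0) Out += "\\_";
    else if (CP == 0x2028) Out += "\\L";
    else if (CP == 0x2029) Out += "\\P";
    else if (!EscapePrintable && isPrintableSketch(CP))
      Out.append(In, I, D.second);        // pass the original bytes through
    else {
      char Buf[16];
      const char *Fmt = CP <= 0xFF ? "\\x%02X"
                      : CP <= 0xFFFF ? "\\u%04X" : "\\U%08X";
      std::snprintf(Buf, sizeof Buf, Fmt, static_cast<unsigned>(CP));
      Out += Buf;
    }
    I += D.second - 1;  // skip the remaining bytes of this sequence
  }
  return Out;
}
```

Under these assumptions, `escapeSketch("caf\xC3\xA9", false)` leaves the `é` untouched, while the default call rewrites it as `\xE9`. A byte-order mark (`"\xEF\xBB\xBF"`) is escaped as `\uFEFF` in both modes.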