summaryrefslogtreecommitdiffstats
path: root/llvm/test/Demangle
Commit message (Collapse)AuthorAgeFilesLines
* llvm-undname: Fix assert-on->4GiB-string-literal, found by oss-fuzzNico Weber2019-04-241-0/+5
| | | | llvm-svn: 359109
* llvm-undname: Support demangling the spaceship operatorNico Weber2019-04-231-0/+9
| | | | | | Also add a test for demanling the co_await operator. llvm-svn: 359007
* llvm-undname: Fix an assert-on-invalid, found by oss-fuzzNico Weber2019-04-221-0/+5
| | | | llvm-svn: 358891
* llvm-undname: Fix hex escapes in wchar_t, char16_t, char32_t stringsNico Weber2019-04-211-5/+6
| | | | | | | | | | | | | | | | | llvm-undname used to put '\x' in front of every pair of nibbles, but u"\xD7\xFF" produces a string with 6 bytes: \xD7 \0 \xFF \0 (and \0\0). Correct for a single character (plus terminating \0) is u\xD7FF instead. Now, wchar_t, char16_t, and char32_t strings roundtrip from source to clang-cl (and cl.exe) and then llvm-undname. (...at least as long as it's not a string like L"\xD7FF" L"foo" which gets demangled as L"\xD7FFfoo", where the compiler then considers the "f" as part of the hex escape. That seems ok.) Also add a comment saying that the "almost-valid" char32_t string I added in my last commit is actually produced by compilers. llvm-svn: 358857
* llvm-undname: Fix stack overflow on almost-validNico Weber2019-04-211-0/+10
| | | | | | | | | | | | | | | | | If a unsigned with all 4 bytes non-0 was passed to outputHex(), there were two off-by-ones in it: - Both MaxPos and Pos left space for the final \0, which left the buffer one byte to small. Set MaxPos to 16 instead of 15 to fix. - The `assert(Pos >= 0);` was after a `Pos--`, move it up one line. Since valid Unicode codepoints are <= 0x10ffff, this could never really happen in practice. Found by oss-fuzz. llvm-svn: 358856
* llvm-undname: Fix stack overflow on invalid found by oss-fuzzNico Weber2019-04-211-0/+5
| | | | llvm-svn: 358852
* llvm-undname: Improve string literal demangling with embedded \0 charsNico Weber2019-04-201-0/+10
| | | | | | | | | - Don't assert when a string looks like a u32 string to the heuristic but doesn't have a length that's 0 mod 4. Instead, classify those as u16 with embedded \0 chars. Found by oss-fuzz. - Print embedded nul bytes as \0 instead of \x00. llvm-svn: 358835
* llvm-undname: Fix two more asserts-on-invalid, found by oss-fuzzNico Weber2019-04-181-0/+10
| | | | llvm-svn: 358708
* llvm-undname: Fix two asserts-on-invalidNico Weber2019-04-181-0/+10
| | | | llvm-svn: 358707
* llvm-undname: Fix nullptr deref on invalid structor names in template argsNico Weber2019-04-161-0/+5
| | | | | | | | | | | | Similar to r358421: A StructorIndentifierNode has a Class field which is read when printing it, but if the StructorIndentifierNode appears in a template argument then demangleFullyQualifiedSymbolName() which sets Class isn't called. Since StructorIndentifierNodes are always leaf names, we can just reject them as well. Found by oss-fuzz. llvm-svn: 358491
* llvm-undname: Tweak arena allocatorNico Weber2019-04-161-0/+3
| | | | | | | | | | | | | | | | - Make `allocUnalignedBuffer` look more like `allocArray` and `alloc`. No behavior change. - Change `Head->Used < Head->Capacity` to `Head->Used <= Head->Capacity` in `allocArray` and `alloc`. No intended behavior change, might be a minuscule memory usage improvement. Noticed this since it was the logic used in `allocUnalignedBuffer`. - Don't let `allocArray` alloc too small buffers for names that have more than 512 levels of nesting (in 64-bit builds). Fixes a heap buffer overflow found by oss-fuzz. Differential Revision: https://reviews.llvm.org/D60774 llvm-svn: 358489
* llvm-undname: add a missing CHECK: to a passing testNico Weber2019-04-161-2/+2
| | | | llvm-svn: 358488
* Fix llvm-undname tests after r358485Nico Weber2019-04-161-1/+1
| | | | llvm-svn: 358487
* llvm-undname: Fix nullptr deref on invalid conversion operator names in ↵Nico Weber2019-04-151-0/+5
| | | | | | | | | | | | | | | | | template args A ConversionOperatorIdentifierNode has a TargetType which is read when printing it, but if the ConversionOperatorIdentifierNode appears in a template argument there's nothing that can provide the TargetType. Normally the COIN is a symbol (leaf) name and takes its TargetType from the symbol's type, but in a template argument context the COIN can only be either a non-leaf name piece or a type, and must hence be invalid. Similar to the COIN check in demangleDeclarator(). Found by oss-fuzz. llvm-svn: 358421
* llvm-undname: Fix oss-fuzz-foudn crash-on-invalid with incomplete special ↵Nico Weber2019-04-141-0/+10
| | | | | | table nodes llvm-svn: 358367
* llvm-undname: Fix another crash-on-invalid found by oss-fuzzNico Weber2019-04-141-0/+5
| | | | llvm-svn: 358363
* llvm-undname: Fix out-of-bounds read on invalid intrinsic function codeNico Weber2019-04-111-0/+5
| | | | | | Found by inspection. llvm-svn: 358239
* llvm-undname: Don't crash on incomplete enum tag manglingsNico Weber2019-04-111-0/+5
| | | | | | Found by inspection. llvm-svn: 358238
* llvm-undname: Fix crash on incomplete virtual this adjustsNico Weber2019-04-111-0/+5
| | | | | | | | Found by oss-fuzz. Also remove an else-after-return, this part has no behavior change. llvm-svn: 358237
* llvm-undname: Fix crash on invalid name in a template parameter pointer to ↵Nico Weber2019-04-111-0/+5
| | | | | | | | member arg Found by oss-fuzz. llvm-svn: 358234
* llvm-undname: Fix another crash-on-invalidNico Weber2019-04-101-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | This fixes a regression from https://reviews.llvm.org/D60354. We used to SymbolNode *Symbol = demangleEncodedSymbol(MangledName, QN); if (Symbol) { Symbol->Name = QN; } but changed that to SymbolNode *Symbol = demangleEncodedSymbol(MangledName, QN); if (Error) return nullptr; Symbol->Name = QN; and one branch somewhere returned a nullptr without setting Error. Looking at the code changed in r340083 and r340710 that branch looks like a remnant from an earlier attempt to demangle RTTI descriptors that has since been rewritten -- so just remove this branch. It shouldn't change behavior for correctly mangled symbols. llvm-svn: 358112
* llvm-undname: Fix more crashes and asserts on invalid inputsNico Weber2019-04-081-0/+60
| | | | | | | | | | | | | | | | | | | | | For functions whose callers don't check that enough input is present, add checks at the start of the function that enough input is there and set Error otherwise. For functions that return AST objects, return nullptr instead of incomplete AST objects with nullptr fields if an error occurred during the function. Introduce a new function demangleDeclarator() for the sequence demangleFullyQualifiedSymbolName(); demangleEncodedSymbol() and use it in the two places that had this sequence. Let this new function check that ConversionOperatorIdentifiers have a valid TargetType. Some of the bad inputs found by oss-fuzz, others by inspection. Differential Revision: https://reviews.llvm.org/D60354 llvm-svn: 357936
* llvm-undname: Fix a crash-on-invalidNico Weber2019-04-031-0/+5
| | | | | | | | Found by oss-fuzz, fixes issue 13260 on oss-fuzz. Differential Revision: https://reviews.llvm.org/D60207 llvm-svn: 357649
* llvm-undame: Fix an assert-on-invalidNico Weber2019-04-031-0/+5
| | | | | | | | Found by oss-fuzz, fixes issue 12432 on os-fuzz. Differential Revision: https://reviews.llvm.org/D60206 llvm-svn: 357648
* llvm-undname: Fix an assert-on-invalidNico Weber2019-04-031-0/+5
| | | | | | | | Found by oss-fuzz, fixes issues 12428 and 12429 on oss-fuzz. Differential Revision: https://reviews.llvm.org/D60204 llvm-svn: 357647
* llvm-undname: Fix a crash-on-invalidNico Weber2019-04-031-1/+6
| | | | | | | | Found by oss-fuzz, fixes issues 12435 and 12438 on oss-fuzz. Differential Revision: https://reviews.llvm.org/D60202 llvm-svn: 357646
* [llvm-undname] Add support for demangling msvc's noexcept types.Zachary Turner2019-01-081-0/+25
| | | | | | | | | | | Starting in C++17, MSVC introduced a new mangling for function parameters that are themselves noexcept functions. This patch makes llvm-undname properly demangle them. Patch by Zachary Henkel Differential Revision: https://reviews.llvm.org/D55769 llvm-svn: 350656
* [MS Demangler] Fail gracefully on invalid pointer types.Zachary Turner2018-12-141-0/+5
| | | | | | | | | Once we detect a 'P', we know we a pointer type is upcoming, so we make some assumptions about the output that follows. If those assumptions didn't hold, we would assert. Instead, we should fail gracefully and propagate the error up. llvm-svn: 349169
* [MS Demangler] Add a regression test for an invalid mangled name.Zachary Turner2018-12-141-0/+6
| | | | llvm-svn: 349168
* [MS Demangler] Print public:, protected:, private: if set in FunctionClass ↵Nico Weber2018-11-138-27/+27
| | | | | | | | | | | | or a variable's StorageClass. undname prints them, and the information is in the decorated name, so we probably shouldn't lose it when undecorating. I spot-checked a few of the funnier-looking outputs, and undname has the same output. Differential Revision: https://reviews.llvm.org/D54396 llvm-svn: 346791
* [MS demangler] Use a slightly shorter unmangling for mangled strings.Nico Weber2018-11-092-372/+372
| | | | | | | | | | Before: const wchar_t * {L"%"} Now: L"%" See also PR39593. Differential Revision: https://reviews.llvm.org/D54294 llvm-svn: 346544
* [MS Demangler] Add support for $$Z parameter pack separator.Zachary Turner2018-08-301-0/+4
| | | | | | | | | $$Z appears between adjacent expanded parameter packs in the same template instantiation. We don't need to print it, it's only there to disambiguate between manglings that would otherwise be ambiguous. So we just need to parse it and throw it away. llvm-svn: 341119
* [MS Demangler] Fix several crashes and demangling bugs.Zachary Turner2018-08-293-1/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These bugs were found by writing a Python script which spidered the entire Chromium build directory tree demangling every symbol in every object file. At the start, the tool printed: Processed 27443 object files. 2926377/2936108 symbols successfully demangled (99.6686%) 9731 symbols could not be demangled (0.3314%) 14589 files crashed while demangling (53.1611%) After this patch, it prints: Processed 27443 object files. 41295518/41295617 symbols successfully demangled (99.9998%) 99 symbols could not be demangled (0.0002%) 0 files crashed while demangling (0.0000%) The issues fixed in this patch are: * Ignore empty parameter packs. Previously we would encounter a mangling for an empty parameter pack and add a null node to the AST. Since we don't print these anyway, we now just don't add anything to the AST and ignore it entirely. This fixes some of the crashes. * Account for "incorrect" string literal demanglings. Apparently an older version of clang would not truncate mangled string literals to 32 bytes of encoded character data. The demangling code however would allocate a 32 byte buffer thinking that it would not encounter more than this, and overrun the buffer. We now demangle up to 128 bytes of data, since the buggy clang would encode up to 32 *characters* of data. * Extended support for demangling init-fini stubs. If you had something like struct Foo { static vector<string> S; }; this would generate a dynamic atexit initializer *for the variable*. We didn't handle this, but now we print something nice. This is actually an improvement over undname, which will fail to demangle this at all. * Fixed one case of static this adjustment. We weren't handling several thunk codes so we didn't recognize the mangling. These are now handled. * Fixed a back-referencing problem. Member pointer templates should have their components considered for back-referencing The remaining 99 symbols which can't be demangled are all symbols which are compiler-generated and undname can't demangle either. llvm-svn: 341000
* Add support for various C++14 demanglings.Zachary Turner2018-08-291-0/+37
| | | | | | | | | | Mostly this includes <auto> and <decltype-auto> return values. Additionally, this fixes a fairly obscure back-referencing bug that was encountered in one of the C++14 tests, which is that if you have something like Foo<&bar, &bar> then the `bar` forms a backreference. llvm-svn: 340896
* [MS Demangler] Re-write the Microsoft demangler.Zachary Turner2018-08-274-15/+11
| | | | | | | | | | | | | | | | | | | | This is a pretty large refactor / re-write of the Microsoft demangler. The previous one was a little hackish because it evolved as I was learning about all the various edge cases, exceptions, etc. It didn't have a proper AST and so there was lots of custom handling of things that should have been much more clean. Taking what was learned from that experience, it's now re-written with a completely redesigned and much more sensible AST. It's probably still not perfect, but at least it's comprehensible now to someone else who wants to come along and make some modifications or read the code. Incidentally, this fixed a couple of bugs, so I've enabled the tests which now pass. llvm-svn: 340710
* [MS Demangler] Print template constructor args.Zachary Turner2018-08-211-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Previously if you had something like this: template<typename T> struct Foo { template<typename U> Foo(U); }; Foo F(3.7); this would mangle as ??$?0N@?$Foo@H@@QEAA@N@Z and this would be demangled as: undname: __cdecl Foo<int>::Foo<int><double>(double) llvm-undname: __cdecl Foo<int>::Foo<int>(double) Note the lack of the constructor template parameter in our demangling. This patch makes it so we print the constructor argument list. llvm-svn: 340356
* [MS Demangler] Fix a few more edge cases.Zachary Turner2018-08-213-0/+8
| | | | | | | | | | | | | | | | I found these by running llvm-undname over a couple hundred megabytes of object files generated as part of building chromium. The issues fixed in this patch are: 1) decltype-auto return types. 2) Indirect vtables (e.g. const A::`vftable'{for `B'}) 3) Pointers, references, and rvalue-references to member pointers. I have exactly one remaining symbol out of a few hundred MB of object files that produces a name we can't demangle, and it's related to back-referencing. llvm-svn: 340341
* [MS Demangler] Demangle special operator 'dynamic initializer'.Zachary Turner2018-08-201-0/+6
| | | | | | | | | | | This is encoded as __E and should print something like "dynamic initializer for 'Foo'(void)" This also adds support for dynamic atexit destructor, which is basically identical but encoded as __F with slightly different description. llvm-svn: 340239
* [MS Demangler] Anonymous namespace hashes can be backreferenced.Zachary Turner2018-08-201-0/+3
| | | | | | | Previously we were not remembering the key values of anonymous namespaces, but we need to do this. llvm-svn: 340238
* [MS Demangler] Properly demangle anonymous namespaces.Zachary Turner2018-08-201-0/+2
| | | | llvm-svn: 340237
* [MS Demangler] Demangle member pointer template parameters.Zachary Turner2018-08-202-0/+126
| | | | llvm-svn: 340199
* [MS Demangler] Resolve backreferences eagerly, not lazily.Zachary Turner2018-08-182-2/+119
| | | | | | | | | | | | | | | | | | | | A while back I submitted a patch to resolve backreferences lazily, thinking this that it was not always possible to know in advance what type you were looking at until you had completed a full pass over the input, and therefore it would be impossible to resolve backreferences eagerly. This was mistaken though, and turned out to be an unrelated problem. In fact, the reverse is true. You *must* resolve backreferences eagerly. This is because certain types of nested mangled symbols do not share a backreference context with their parent symbol, and as such, if you try to resolve them lazily their backreference context will have been lost by the time you finish demangling the entire input. On the other hand, resolving them eagerly appears to always work, and enables us to port many more tests over. llvm-svn: 340126
* [MS Demangler] Properly print all thunk types.Zachary Turner2018-08-172-3/+15
| | | | | | | | | We were only printing the vtordisp thunk before as the previous patch was more aimed at getting special operators working, one of which was a thunk. This patch gets all thunk types to print properly, and adds a test for each one. llvm-svn: 340088
* [MS Demangler] Demangle all remaining types of operators.Zachary Turner2018-08-171-14/+14
| | | | | | | This demangles all remaining special operators including thunks, RTTI Descriptors, and local static guard variables. llvm-svn: 340083
* [MS Demangler] Rework the way operators are demangled.Zachary Turner2018-08-172-1/+230
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, some of the code for actually parsing mangled operator names was more like formatting code in nature, and was interspersed with the demangling code which builds the AST. This means that by the time we got to the printing code, we had lost all information about what type of operator we had, and all we were left with was a string that we just had to print. However, not all operators are actually even operators. it's basically just a catch-all mangling for "special names", and for some of the other types it helps to know when we're actually doing the printing what it is. This patch changes the way things work by introducing an OperatorInfo structure and corresponding enumeration. When we demangle we store the enumeration value and demangled components separately. This gives more flexibility during printing. In doing so, some demanglings of special names which we didn't previously support come out of this for free, so we now demangle those. A few are more complex and are better left for a followup patch though. An exhaustive test of every possible operator code is included, with the ones that don't yet work commented out. llvm-svn: 340046
* [MS Demangler] Demangle string literals.Zachary Turner2018-08-161-0/+758
| | | | | | | | | | | | | | | | | | | | When demangling string literals, Microsoft's undname simply prints 'string'. This patch implements string literal demangling while doing a bit better than this by decoding as much of the string as possible and trying to faithfully reproduce the original string literal definition. This is a bit tricky because the different character types char, char16_t, and char32_t are not uniquely identified by the mangling, so we have to use a heuristic to try to guess the character type. But it works pretty well, and many tests are added to illustrate the behavior. Differential Revision: https://reviews.llvm.org/D50806 llvm-svn: 339892
* [MS Demangler] Don't fail on MD5-mangled names.Zachary Turner2018-08-161-0/+11
| | | | | | | | When we have an MD5 mangled name, we shouldn't choke and say that it's an invalid name. Even though it's impossible to demangle, we should just output the original name. llvm-svn: 339891
* [MS Demangler] Fix some minor formatting bugs.Zachary Turner2018-08-143-14/+12
| | | | | | | | | | | | | | | | 1) We print __restrict twice on member pointers. This is fixed and relevant tests are re-enabled. 2) Several tests were disabled because of printing slightly different output than undname. These were confirmed to be bugs in undname, so we just re-enable the tests. 3) The test for printing reference temporaries is re-enabled. This is a clang mangling extension, so we have some flexibility with how we demangle it. The output currently looks fine, so we just re-enable the test with no fixes. llvm-svn: 339708
* [MS Demangler] Support extern "C" functions.Zachary Turner2018-08-101-10/+8
| | | | | | | | | | | | | | There are two cases we need to support with extern "C" functions. The first is the case of a '9' indicating that the function has no prototype. This occurs when we mangle a symbol inside of an extern "C" function, but not the function itself. The second case is when we have an overloaded extern "C" functions. In this case we emit $$J0 to indicate this. This patch adds support for both of these cases. llvm-svn: 339471
* Resubmit r339450 - [MS Demangler] Add conversion operator testsZachary Turner2018-08-101-1/+31
| | | | | | | | | This was broken because of a malformed check line. Incidentally, this exposed a case where we crash when we should just be returning an error, so we should fix that. The demangler shouldn't crash due to user input. llvm-svn: 339466
OpenPOWER on IntegriCloud