Diffstat (limited to 'llvm')
| -rw-r--r-- | llvm/docs/CommandGuide/tblgen.rst | 5 |
| -rw-r--r-- | llvm/docs/TableGen/BackEnds.rst | 121 |
| -rw-r--r-- | llvm/docs/TableGen/index.rst | 7 |
| -rw-r--r-- | llvm/include/llvm/TableGen/Record.h | 2 |
| -rw-r--r-- | llvm/lib/TableGen/CMakeLists.txt | 1 |
| -rw-r--r-- | llvm/lib/TableGen/JSONBackend.cpp | 189 |
| -rw-r--r-- | llvm/test/TableGen/JSON-check.py | 51 |
| -rw-r--r-- | llvm/test/TableGen/JSON.td | 146 |
| -rw-r--r-- | llvm/utils/TableGen/TableGen.cpp | 6 |
9 files changed, 526 insertions, 2 deletions
diff --git a/llvm/docs/CommandGuide/tblgen.rst b/llvm/docs/CommandGuide/tblgen.rst
index c23af5eea93..55b54294846 100644
--- a/llvm/docs/CommandGuide/tblgen.rst
+++ b/llvm/docs/CommandGuide/tblgen.rst
@@ -57,6 +57,11 @@ OPTIONS
 
  Print all records to standard output (default).
 
+.. option:: -dump-json
+
+ Print a JSON representation of all records, suitable for further
+ automated processing.
+
 .. option:: -print-enums
 
  Print enumeration values for a class.
diff --git a/llvm/docs/TableGen/BackEnds.rst b/llvm/docs/TableGen/BackEnds.rst
index a39ea61cd1c..8b313383566 100644
--- a/llvm/docs/TableGen/BackEnds.rst
+++ b/llvm/docs/TableGen/BackEnds.rst
@@ -435,6 +435,127 @@ AttrDocs
 **Purpose**: Creates ``AttributeReference.rst`` from ``AttrDocs.td``, and is
 used for documenting user-facing attributes.
 
+General BackEnds
+================
+
+JSON
+----
+
+**Purpose**: Output all the values in every ``def``, as a JSON data
+structure that can be easily parsed by a variety of languages. Useful
+for writing custom backends without having to modify TableGen itself,
+or for performing auxiliary analysis on the same TableGen data passed
+to a built-in backend.
+
+**Output**:
+
+The root of the output file is a JSON object (i.e. dictionary),
+containing the following fixed keys:
+
+* ``!tablegen_json_version``: a numeric version field that will
+  increase if an incompatible change is ever made to the structure of
+  this data. The format described here corresponds to version 1.
+
+* ``!instanceof``: a dictionary whose keys are the class names defined
+  in the TableGen input. For each key, the corresponding value is an
+  array of strings giving the names of ``def`` records that derive
+  from that class. So ``root["!instanceof"]["Instruction"]``, for
+  example, would list the names of all the records deriving from the
+  class ``Instruction``.
+
+For each ``def`` record, the root object also has a key for the record
+name. The corresponding value is a subsidiary object containing the
+following fixed keys:
+
+* ``!superclasses``: an array of strings giving the names of all the
+  classes that this record derives from.
+
+* ``!fields``: an array of strings giving the names of all the variables
+  in this record that were defined with the ``field`` keyword.
+
+* ``!name``: a string giving the name of the record. This is always
+  identical to the key in the JSON root object corresponding to this
+  record's dictionary. (If the record is anonymous, the name is
+  arbitrary.)
+
+* ``!anonymous``: a boolean indicating whether the record's name was
+  specified by the TableGen input (if it is ``false``), or invented by
+  TableGen itself (if ``true``).
+
+For each variable defined in a record, the ``def`` object for that
+record also has a key for the variable name. The corresponding value
+is a translation into JSON of the variable's value, using the
+conventions described below.
+
+Some TableGen data types are translated directly into the
+corresponding JSON type:
+
+* A completely undefined value (e.g. for a variable declared without
+  initializer in some superclass of this record, and never initialized
+  by the record itself or any other superclass) is emitted as the JSON
+  ``null`` value.
+
+* ``int`` and ``bit`` values are emitted as numbers. Note that
+  TableGen ``int`` values are capable of holding integers too large to
+  be exactly representable in IEEE double precision. The integer
+  literal in the JSON output will show the full exact integer value.
+  So if you need to retrieve large integers with full precision, you
+  should use a JSON reader capable of translating such literals back
+  into 64-bit integers without losing precision, such as Python's
+  standard ``json`` module.
+
+* ``string`` and ``code`` values are emitted as JSON strings.
+
+* ``list<T>`` values, for any element type ``T``, are emitted as JSON
+  arrays. Each element of the array is represented in turn using these
+  same conventions.
+
+* ``bits`` values are also emitted as arrays. A ``bits`` array is
+  ordered from least-significant bit to most-significant. So the
+  element with index ``i`` corresponds to the bit described as
+  ``x{i}`` in TableGen source. However, note that this means that
+  scripting languages are likely to *display* the array in the
+  opposite order from the way it appears in the TableGen source or in
+  the diagnostic ``-print-records`` output.
+
+All other TableGen value types are emitted as a JSON object,
+containing two standard fields: ``kind`` is a discriminator describing
+which kind of value the object represents, and ``printable`` is a
+string giving the same representation of the value that would appear
+in ``-print-records``.
+
+* A reference to a ``def`` object has ``kind=="def"``, and has an
+  extra field ``def`` giving the name of the object referred to.
+
+* A reference to another variable in the same record has
+  ``kind=="var"``, and has an extra field ``var`` giving the name of
+  the variable referred to.
+
+* A reference to a specific bit of a ``bits``-typed variable in the
+  same record has ``kind=="varbit"``, and has two extra fields:
+  ``var`` gives the name of the variable referred to, and ``index``
+  gives the index of the bit.
+
+* A value of type ``dag`` has ``kind=="dag"``, and has two extra
+  fields. ``operator`` gives the initial value after the opening
+  parenthesis of the dag initializer; ``args`` is an array giving the
+  following arguments. The elements of ``args`` are arrays of length
+  2, giving the value of each argument followed by its colon-suffixed
+  name (if any). For example, in the JSON representation of the dag
+  value ``(Op 22, "hello":$foo)`` (assuming that ``Op`` is the name of
+  a record defined elsewhere with a ``def`` statement):
+
+  * ``operator`` will be an object in which ``kind=="def"`` and
+    ``def=="Op"``
+
+  * ``args`` will be the array ``[[22, null], ["hello", "foo"]]``.
+
+* If any other kind of value or complicated expression appears in the
+  output, it will have ``kind=="complex"``, and no additional fields.
+  These values are not expected to be needed by backends. The standard
+  ``printable`` field can be used to extract a representation of them
+  in TableGen source syntax if necessary.
+
 How to write a back-end
 =======================
 
diff --git a/llvm/docs/TableGen/index.rst b/llvm/docs/TableGen/index.rst
index a19b4f82b63..0697bd0298e 100644
--- a/llvm/docs/TableGen/index.rst
+++ b/llvm/docs/TableGen/index.rst
@@ -76,11 +76,14 @@ example, to get a list of all of the definitions that subclass a particular type
   ADD16rr, ADD32mi, ADD32mi8, ADD32mr, ADD32ri, ADD32ri8, ADD32rm, ADD32rr,
   ADD64mi32, ADD64mi8, ADD64mr, ADD64ri32, ...
 
-The default backend prints out all of the records.
+The default backend prints out all of the records. There is also a general
+backend which outputs all the records as a JSON data structure, enabled using
+the `-dump-json` option.
 
 If you plan to use TableGen, you will most likely have to write a `backend`_
 that extracts the information specific to what you need and formats it in the
-appropriate way.
+appropriate way. You can do this by extending TableGen itself in C++, or by
+writing a script in any language that can consume the JSON output.
 
 Example
 -------
diff --git a/llvm/include/llvm/TableGen/Record.h b/llvm/include/llvm/TableGen/Record.h
index 229e8dcc5c3..e022bc82b4e 100644
--- a/llvm/include/llvm/TableGen/Record.h
+++ b/llvm/include/llvm/TableGen/Record.h
@@ -1900,6 +1900,8 @@ public:
   Init *resolve(Init *VarName) override;
 };
 
+void EmitJSON(RecordKeeper &RK, raw_ostream &OS);
+
 } // end namespace llvm
 
 #endif // LLVM_TABLEGEN_RECORD_H
diff --git a/llvm/lib/TableGen/CMakeLists.txt b/llvm/lib/TableGen/CMakeLists.txt
index 9333b653777..e7bb0ada5fb 100644
--- a/llvm/lib/TableGen/CMakeLists.txt
+++ b/llvm/lib/TableGen/CMakeLists.txt
@@ -1,5 +1,6 @@
 add_llvm_library(LLVMTableGen
   Error.cpp
+  JSONBackend.cpp
   Main.cpp
   Record.cpp
   SetTheory.cpp
diff --git a/llvm/lib/TableGen/JSONBackend.cpp b/llvm/lib/TableGen/JSONBackend.cpp
new file mode 100644
index 00000000000..94fa5209715
--- /dev/null
+++ b/llvm/lib/TableGen/JSONBackend.cpp
@@ -0,0 +1,189 @@
+//===- JSONBackend.cpp - Generate a JSON dump of all records. -*- C++ -*-=====//
+//
+//                     The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// This TableGen back end generates a machine-readable representation
+// of all the classes and records defined by the input, in JSON format.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/ADT/BitVector.h"
+#include "llvm/Support/Debug.h"
+#include "llvm/TableGen/Error.h"
+#include "llvm/TableGen/Record.h"
+#include "llvm/TableGen/TableGenBackend.h"
+#include "llvm/Support/JSON.h"
+
+#define DEBUG_TYPE "json-emitter"
+
+using namespace llvm;
+
+namespace {
+
+class JSONEmitter {
+private:
+  RecordKeeper &Records;
+
+  json::Value translateInit(const Init &I);
+  json::Array listSuperclasses(const Record &R);
+
+public:
+  JSONEmitter(RecordKeeper &R);
+
+  void run(raw_ostream &OS);
+};
+
+} // end anonymous namespace
+
+JSONEmitter::JSONEmitter(RecordKeeper &R) : Records(R) {}
+
+json::Value JSONEmitter::translateInit(const Init &I) {
+
+  // Init subclasses that we return as JSON primitive values of one
+  // kind or another.
+
+  if (isa<UnsetInit>(&I)) {
+    return nullptr;
+  } else if (auto *Bit = dyn_cast<BitInit>(&I)) {
+    return Bit->getValue() ? 1 : 0;
+  } else if (auto *Bits = dyn_cast<BitsInit>(&I)) {
+    json::Array array;
+    for (unsigned i = 0, limit = Bits->getNumBits(); i < limit; i++)
+      array.push_back(translateInit(*Bits->getBit(i)));
+    return array;
+  } else if (auto *Int = dyn_cast<IntInit>(&I)) {
+    return Int->getValue();
+  } else if (auto *Str = dyn_cast<StringInit>(&I)) {
+    return Str->getValue();
+  } else if (auto *Code = dyn_cast<CodeInit>(&I)) {
+    return Code->getValue();
+  } else if (auto *List = dyn_cast<ListInit>(&I)) {
+    json::Array array;
+    for (auto val : *List)
+      array.push_back(translateInit(*val));
+    return array;
+  }
+
+  // Init subclasses that we return as JSON objects containing a
+  // 'kind' discriminator. For these, we also provide the same
+  // translation back into TableGen input syntax that -print-records
+  // would give.
+
+  json::Object obj;
+  obj["printable"] = I.getAsString();
+
+  if (auto *Def = dyn_cast<DefInit>(&I)) {
+    obj["kind"] = "def";
+    obj["def"] = Def->getDef()->getName();
+    return obj;
+  } else if (auto *Var = dyn_cast<VarInit>(&I)) {
+    obj["kind"] = "var";
+    obj["var"] = Var->getName();
+    return obj;
+  } else if (auto *VarBit = dyn_cast<VarBitInit>(&I)) {
+    if (auto *Var = dyn_cast<VarInit>(VarBit->getBitVar())) {
+      obj["kind"] = "varbit";
+      obj["var"] = Var->getName();
+      obj["index"] = VarBit->getBitNum();
+      return obj;
+    }
+  } else if (auto *Dag = dyn_cast<DagInit>(&I)) {
+    obj["kind"] = "dag";
+    obj["operator"] = translateInit(*Dag->getOperator());
+    if (auto name = Dag->getName())
+      obj["name"] = name->getAsUnquotedString();
+    json::Array args;
+    for (unsigned i = 0, limit = Dag->getNumArgs(); i < limit; ++i) {
+      json::Array arg;
+      arg.push_back(translateInit(*Dag->getArg(i)));
+      if (auto argname = Dag->getArgName(i))
+        arg.push_back(argname->getAsUnquotedString());
+      else
+        arg.push_back(nullptr);
+      args.push_back(std::move(arg));
+    }
+    obj["args"] = std::move(args);
+    return obj;
+  }
+
+  // Final fallback: anything that gets past here is simply given a
+  // kind field of 'complex', and the only other field is the standard
+  // 'printable' representation.
+
+  assert(!I.isConcrete());
+  obj["kind"] = "complex";
+  return obj;
+}
+
+void JSONEmitter::run(raw_ostream &OS) {
+  json::Object root;
+
+  root["!tablegen_json_version"] = 1;
+
+  // Prepare the arrays that will list the instances of every class.
+  // We mostly fill those in by iterating over the superclasses of
+  // each def, but we also want to ensure we store an empty list for a
+  // class with no instances at all, so we do a preliminary iteration
+  // over the classes, invoking std::map::operator[] to default-
+  // construct the array for each one.
+  std::map<std::string, json::Array> instance_lists;
+  for (const auto &C : Records.getClasses()) {
+    auto &Name = C.second->getNameInitAsString();
+    (void)instance_lists[Name];
+  }
+
+  // Main iteration over the defs.
+  for (const auto &D : Records.getDefs()) {
+    auto &Name = D.second->getNameInitAsString();
+    auto &Def = *D.second;
+
+    json::Object obj;
+    json::Array fields;
+
+    for (const RecordVal &RV : Def.getValues()) {
+      if (!Def.isTemplateArg(RV.getNameInit())) {
+        auto Name = RV.getNameInitAsString();
+        if (RV.getPrefix())
+          fields.push_back(Name);
+        obj[Name] = translateInit(*RV.getValue());
+      }
+    }
+
+    obj["!fields"] = std::move(fields);
+
+    json::Array superclasses;
+    for (const auto &SuperPair : Def.getSuperClasses())
+      superclasses.push_back(SuperPair.first->getNameInitAsString());
+    obj["!superclasses"] = std::move(superclasses);
+
+    obj["!name"] = Name;
+    obj["!anonymous"] = Def.isAnonymous();
+
+    root[Name] = std::move(obj);
+
+    // Add this def to the instance list for each of its superclasses.
+    for (const auto &SuperPair : Def.getSuperClasses()) {
+      auto SuperName = SuperPair.first->getNameInitAsString();
+      instance_lists[SuperName].push_back(Name);
+    }
+  }
+
+  // Make a JSON object from the std::map of instance lists.
+  json::Object instanceof;
+  for (auto kv: instance_lists)
+    instanceof[kv.first] = std::move(kv.second);
+  root["!instanceof"] = std::move(instanceof);
+
+  // Done. Write the output.
+  OS << json::Value(std::move(root)) << "\n";
+}
+
+namespace llvm {
+
+void EmitJSON(RecordKeeper &RK, raw_ostream &OS) { JSONEmitter(RK).run(OS); }
+} // end namespace llvm
diff --git a/llvm/test/TableGen/JSON-check.py b/llvm/test/TableGen/JSON-check.py
new file mode 100644
index 00000000000..b6bc4ee6c90
--- /dev/null
+++ b/llvm/test/TableGen/JSON-check.py
@@ -0,0 +1,51 @@
+#!/usr/bin/env python
+
+import sys
+import subprocess
+import traceback
+import json
+
+data = json.load(sys.stdin)
+testfile = sys.argv[1]
+
+prefix = "CHECK: "
+
+fails = 0
+passes = 0
+with open(testfile) as testfh:
+    lineno = 0
+    for line in iter(testfh.readline, ""):
+        lineno += 1
+        line = line.rstrip("\r\n")
+        try:
+            prefix_pos = line.index(prefix)
+        except ValueError:
+            continue
+        check_expr = line[prefix_pos + len(prefix):]
+
+        try:
+            exception = None
+            result = eval(check_expr, {"data":data})
+        except Exception:
+            result = False
+            exception = traceback.format_exc().splitlines()[-1]
+
+        if exception is not None:
+            sys.stderr.write(
+                "{file}:{line:d}: check threw exception: {expr}\n"
+                "{file}:{line:d}: exception was: {exception}\n".format(
+                    file=testfile, line=lineno,
+                    expr=check_expr, exception=exception))
+            fails += 1
+        elif not result:
+            sys.stderr.write(
+                "{file}:{line:d}: check returned False: {expr}\n".format(
+                    file=testfile, line=lineno, expr=check_expr))
+            fails += 1
+        else:
+            passes += 1
+
+if fails != 0:
+    sys.exit("{} checks failed".format(fails))
+else:
+    sys.stdout.write("{} checks passed\n".format(passes))
diff --git a/llvm/test/TableGen/JSON.td b/llvm/test/TableGen/JSON.td
new file mode 100644
index 00000000000..968c2577fa9
--- /dev/null
+++ b/llvm/test/TableGen/JSON.td
@@ -0,0 +1,146 @@
+// RUN: llvm-tblgen -dump-json %s | %python %S/JSON-check.py %s
+
+// CHECK: data['!tablegen_json_version'] == 1
+
+// CHECK: all(data[s]['!name'] == s for s in data if not s.startswith("!"))
+
+class Base {}
+class Intermediate : Base {}
+class Derived : Intermediate {}
+
+def D : Intermediate {}
+// CHECK: 'D' in data['!instanceof']['Base']
+// CHECK: 'D' in data['!instanceof']['Intermediate']
+// CHECK: 'D' not in data['!instanceof']['Derived']
+// CHECK: 'Base' in data['D']['!superclasses']
+// CHECK: 'Intermediate' in data['D']['!superclasses']
+// CHECK: 'Derived' not in data['D']['!superclasses']
+
+def ExampleDagOp;
+
+def FieldKeywordTest {
+    int a;
+    field int b;
+    // CHECK: 'a' not in data['FieldKeywordTest']['!fields']
+    // CHECK: 'b' in data['FieldKeywordTest']['!fields']
+}
+
+class Variables {
+    int i;
+    string s;
+    bit b;
+    bits<8> bs;
+    code c;
+    list<int> li;
+    Base base;
+    dag d;
+}
+def VarNull : Variables {
+    // A variable not filled in at all has its value set to JSON
+    // 'null', which translates to Python None
+    // CHECK: data['VarNull']['i'] is None
+}
+def VarPrim : Variables {
+    // Test initializers that map to primitive JSON types
+
+    int i = 3;
+    // CHECK: data['VarPrim']['i'] == 3
+
+    // Integer literals should be emitted in the JSON at full 64-bit
+    // precision, for the benefit of JSON readers that preserve that
+    // much information. Python's is one such.
+    int enormous_pos = 9123456789123456789;
+    int enormous_neg = -9123456789123456789;
+    // CHECK: data['VarPrim']['enormous_pos'] == 9123456789123456789
+    // CHECK: data['VarPrim']['enormous_neg'] == -9123456789123456789
+
+    string s = "hello, world";
+    // CHECK: data['VarPrim']['s'] == 'hello, world'
+
+    bit b = 0;
+    // CHECK: data['VarPrim']['b'] == 0
+
+    // bits<> arrays are stored in logical order (array[i] is the same
+    // bit identified in .td files as bs{i}), which means the _visual_
+    // order of the list (in default rendering) is reversed.
+    bits<8> bs = { 0,0,0,1,0,1,1,1 };
+    // CHECK: data['VarPrim']['bs'] == [ 1,1,1,0,1,0,0,0 ]
+
+    code c = [{ \"  }];
+    // CHECK: data['VarPrim']['c'] == r' \"  '
+
+    list<int> li = [ 1, 2, 3, 4 ];
+    // CHECK: data['VarPrim']['li'] == [ 1, 2, 3, 4 ]
+}
+def VarObj : Variables {
+    // Test initializers that map to JSON objects containing a 'kind'
+    // discriminator
+
+    Base base = D;
+    // CHECK: data['VarObj']['base']['kind'] == 'def'
+    // CHECK: data['VarObj']['base']['def'] == 'D'
+    // CHECK: data['VarObj']['base']['printable'] == 'D'
+
+    dag d = (ExampleDagOp 22, "hello":$foo);
+    // CHECK: data['VarObj']['d']['kind'] == 'dag'
+    // CHECK: data['VarObj']['d']['operator']['kind'] == 'def'
+    // CHECK: data['VarObj']['d']['operator']['def'] == 'ExampleDagOp'
+    // CHECK: data['VarObj']['d']['operator']['printable'] == 'ExampleDagOp'
+    // CHECK: data['VarObj']['d']['args'] == [[22, None], ["hello", "foo"]]
+    // CHECK: data['VarObj']['d']['printable'] == '(ExampleDagOp 22, "hello":$foo)'
+
+    int undef_int;
+    field int ref_int = undef_int;
+    // CHECK: data['VarObj']['ref_int']['kind'] == 'var'
+    // CHECK: data['VarObj']['ref_int']['var'] == 'undef_int'
+    // CHECK: data['VarObj']['ref_int']['printable'] == 'undef_int'
+
+    bits<2> undef_bits;
+    bits<4> ref_bits;
+    let ref_bits{3-2} = 0b10;
+    let ref_bits{1-0} = undef_bits{1-0};
+    // CHECK: data['VarObj']['ref_bits'][3] == 1
+    // CHECK: data['VarObj']['ref_bits'][2] == 0
+    // CHECK: data['VarObj']['ref_bits'][1]['kind'] == 'varbit'
+    // CHECK: data['VarObj']['ref_bits'][1]['var'] == 'undef_bits'
+    // CHECK: data['VarObj']['ref_bits'][1]['index'] == 1
+    // CHECK: data['VarObj']['ref_bits'][1]['printable'] == 'undef_bits{1}'
+    // CHECK: data['VarObj']['ref_bits'][0]['kind'] == 'varbit'
+    // CHECK: data['VarObj']['ref_bits'][0]['var'] == 'undef_bits'
+    // CHECK: data['VarObj']['ref_bits'][0]['index'] == 0
+    // CHECK: data['VarObj']['ref_bits'][0]['printable'] == 'undef_bits{0}'
+
+    field int complex_ref_int = !add(undef_int, 2);
+    // CHECK: data['VarObj']['complex_ref_int']['kind'] == 'complex'
+    // CHECK: data['VarObj']['complex_ref_int']['printable'] == '!add(undef_int, 2)'
+}
+
+// Test the !anonymous member. This is tricky because when a def is
+// anonymous, almost by definition, the test can't reliably predict
+// the name it will be stored under! So we have to search all the defs
+// in the JSON output looking for the one that has the test integer
+// field set to the right value.
+
+def Named { int AnonTestField = 1; }
+// CHECK: data['Named']['AnonTestField'] == 1
+// CHECK: data['Named']['!anonymous'] is False
+
+def { int AnonTestField = 2; }
+// CHECK: next(rec for rec in data.values() if isinstance(rec, dict) and rec.get('AnonTestField') == 2)['!anonymous'] is True
+
+multiclass AnonTestMulticlass<int base> {
+    def _plus_one { int AnonTestField = !add(base,1); }
+    def { int AnonTestField = !add(base,2); }
+}
+
+defm NamedDefm : AnonTestMulticlass<10>;
+// CHECK: data['NamedDefm_plus_one']['!anonymous'] is False
+// CHECK: data['NamedDefm_plus_one']['AnonTestField'] == 11
+// CHECK: next(rec for rec in data.values() if isinstance(rec, dict) and rec.get('AnonTestField') == 12)['!anonymous'] is True
+
+// D47431 clarifies that a named def inside a multiclass gives a
+// *non*-anonymous output record, even if the defm that instantiates
+// that multiclass is anonymous.
+defm : AnonTestMulticlass<20>;
+// CHECK: next(rec for rec in data.values() if isinstance(rec, dict) and rec.get('AnonTestField') == 21)['!anonymous'] is False
+// CHECK: next(rec for rec in data.values() if isinstance(rec, dict) and rec.get('AnonTestField') == 22)['!anonymous'] is True
diff --git a/llvm/utils/TableGen/TableGen.cpp b/llvm/utils/TableGen/TableGen.cpp
index cf1404d8769..b78260625cb 100644
--- a/llvm/utils/TableGen/TableGen.cpp
+++ b/llvm/utils/TableGen/TableGen.cpp
@@ -24,6 +24,7 @@ using namespace llvm;
 
 enum ActionType {
   PrintRecords,
+  DumpJSON,
   GenEmitter,
   GenRegisterInfo,
   GenInstrInfo,
@@ -59,6 +60,8 @@ namespace {
   Action(cl::desc("Action to perform:"),
          cl::values(clEnumValN(PrintRecords, "print-records",
                                "Print all records to stdout (default)"),
+                    clEnumValN(DumpJSON, "dump-json",
+                               "Dump all records as machine-readable JSON"),
                     clEnumValN(GenEmitter, "gen-emitter",
                                "Generate machine code emitter"),
                     clEnumValN(GenRegisterInfo, "gen-register-info",
@@ -126,6 +129,9 @@ bool LLVMTableGenMain(raw_ostream &OS, RecordKeeper &Records) {
   case PrintRecords:
     OS << Records;           // No argument, dump all contents
     break;
+  case DumpJSON:
+    EmitJSON(Records, OS);
+    break;
   case GenEmitter:
     EmitCodeEmitter(Records, OS);
     break;
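
Illustrative note, not part of the commit: the BackEnds.rst hunk in this diff describes the JSON layout in prose, so a small consumer may make it easier to picture. The sketch below is in Python, the language the commit's own JSON-check.py uses. `SAMPLE` is a hand-written document shaped like the described version-1 format, not real `llvm-tblgen -dump-json` output, and the helper names `records_of_class`, `dag_args`, `bits_to_int` and `bits_in_source_order` are hypothetical.

```python
import json

# Hand-written sample following the documented version-1 layout.
# Assumption: illustrative only, not captured from llvm-tblgen.
SAMPLE = """
{
  "!tablegen_json_version": 1,
  "!instanceof": {"Base": ["D"], "Variables": ["VarObj"], "Unused": []},
  "D": {"!superclasses": ["Base"], "!fields": [], "!name": "D",
        "!anonymous": false},
  "VarObj": {
    "!superclasses": ["Variables"], "!fields": [], "!name": "VarObj",
    "!anonymous": false,
    "bs": [1, 1, 1, 0, 1, 0, 0, 0],
    "d": {"kind": "dag", "printable": "(Op 22, \\"hello\\":$foo)",
          "operator": {"kind": "def", "def": "Op", "printable": "Op"},
          "args": [[22, null], ["hello", "foo"]]}
  }
}
"""


def records_of_class(data, class_name):
    """Return the record objects that !instanceof lists for class_name."""
    return [data[name] for name in data["!instanceof"].get(class_name, [])]


def dag_args(value):
    """Return a dag's arguments as (name, value) pairs; name may be None."""
    if not (isinstance(value, dict) and value.get("kind") == "dag"):
        raise ValueError("not a dag value")
    # Each element of "args" is [value, name], per the documented format.
    return [(name, arg) for arg, name in value["args"]]


def bits_to_int(bits):
    """Fold a bits array (index i is bit x{i}, LSB first) into an integer.

    Returns None if any bit is unresolved, i.e. emitted as an object
    such as kind "varbit" rather than as the number 0 or 1.
    """
    result = 0
    for i, bit in enumerate(bits):
        if not isinstance(bit, int):
            return None
        result |= bit << i
    return result


def bits_in_source_order(bits):
    """Reverse a bits array into the MSB-first order used in .td source."""
    return list(reversed(bits))


data = json.loads(SAMPLE)
if data["!tablegen_json_version"] != 1:
    raise SystemExit("unexpected JSON format version")
for rec in records_of_class(data, "Variables"):
    print(rec["!name"], dag_args(rec["d"]), bits_to_int(rec["bs"]))
```

To try this on real output, the same helpers could be fed `json.load(sys.stdin)` with the `-dump-json` output piped in, as the RUN line of JSON.td does for JSON-check.py. Python's `json` module parses integer literals into arbitrary-precision `int`, which is why the 64-bit test values in JSON.td survive the round trip.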

