`toml-test` is a language-agnostic test suite to verify the correctness of
[TOML][t] parsers and writers.

Tests are divided into two groups: "invalid" and "valid". Decoders or encoders
that reject "invalid" tests pass the tests, and decoders that accept "valid"
tests and output precisely what is expected pass the tests. The output format is
JSON, described below.

Both encoders and decoders share valid tests, except an encoder accepts JSON and
outputs TOML rather than the reverse. The TOML an encoder outputs is read back
with a blessed decoder and then compared. Encoders have their own set of invalid
tests in the invalid-encoder directory. The JSON given to a TOML encoder is in
the same format as the JSON that a TOML decoder should output.

Compatible with TOML version [v1.0.0][v1].

[t]: https://toml.io
[v1]: https://toml.io/en/v1.0.0

Installation
------------
There are binaries on the [release page][r]; these are statically compiled and
should run in most environments. It's recommended you use a binary, or a tagged
release if you build from source, especially in CI environments. This prevents
your tests from breaking on changes to tests in this tool.

To compile from source you will need Go 1.16 or newer (older versions will *not*
work):

    $ git clone https://git.sr.ht/~pingoo/stdx/toml-test.git
    $ cd toml-test
    $ go build ./cmd/toml-test

This will build a `./toml-test` binary.

[r]: https://git.sr.ht/~pingoo/stdx/toml-test/releases

Usage
-----
`toml-test` accepts an encoder or decoder as the first positional argument, for
example:

    $ toml-test my-toml-decoder
    $ toml-test my-toml-encoder -encoder

The `-encoder` flag is used to signal that this is an encoder rather than a
decoder.

For example, to run the tests against the Go TOML library:

    # Install my parser
    $ go install git.sr.ht/~pingoo/stdx/toml/cmd/toml-test-decoder@master
    $ go install git.sr.ht/~pingoo/stdx/toml/cmd/toml-test-encoder@master

    $ toml-test toml-test-decoder
    toml-test [toml-test-decoder]: using embeded tests: 278 passed

    $ toml-test -encoder toml-test-encoder
    toml-test [toml-test-encoder]: using embeded tests:  94 passed,  0 failed

The default is to use the tests compiled in the binary; you can use `-testdir`
to load tests from the filesystem. You can use `-run [name]` or `-skip [name]`
to run or skip specific tests. Both flags can be given more than once and accept
glob patterns: `-run 'valid/string/*'`.

See `toml-test -help` for detailed usage.

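If you drive `toml-test` from a script (in CI, for example), a small wrapper can
assemble the command line from the flags described above. The following Python
is only a sketch: `build_command` and `run_suite` are hypothetical helper names,
it assumes `-testdir` takes a directory path as its argument, and it assumes
`toml-test` reports failures through its exit status, which this README does not
state.

```python
import subprocess


def build_command(implementation, *, encoder=False, run=(), skip=(),
                  testdir=None, binary="toml-test"):
    """Return the argument list for one toml-test invocation."""
    cmd = [binary]
    if encoder:
        cmd.append("-encoder")
    if testdir is not None:
        # Assumption: -testdir is followed by the directory to load tests from.
        cmd += ["-testdir", testdir]
    # -run and -skip may each be given more than once.
    for pattern in run:
        cmd += ["-run", pattern]
    for pattern in skip:
        cmd += ["-skip", pattern]
    cmd.append(implementation)
    return cmd


def run_suite(implementation, **options):
    """Run toml-test and return its exit status (assumed non-zero on failure)."""
    return subprocess.run(build_command(implementation, **options)).returncode
```

For example, `build_command("toml-test-decoder", run=["valid/string/*"])` yields
`["toml-test", "-run", "valid/string/*", "toml-test-decoder"]`.
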
### Implementing a decoder
For your decoder to be compatible with `toml-test` it **must** satisfy the
expected interface:

- Your decoder **must** accept TOML data on `stdin` until EOF.
- If the TOML data is invalid, your decoder **must** return with a non-zero
  exit code, indicating an error.
- If the TOML data is valid, your decoder **must** output a JSON encoding of
  that data on `stdout` and return with a zero exit code, indicating success.

An example in pseudocode:

    toml_data = read_stdin()

    parsed_toml = decode_toml(toml_data)

    if error_parsing_toml():
        print_error_to_stderr()
        exit(1)

    print_as_tagged_json(parsed_toml)
    exit(0)

Details on the tagged JSON are explained below in "JSON encoding".

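As a concrete counterpart to the pseudocode, here is a minimal sketch of a
decoder in Python. It assumes Python 3.11 or newer, whose standard library
includes the `tomllib` parser; `tag` and `main` are hypothetical names chosen
for this example. Datetimes are written with Python's `isoformat()`, which
spells a UTC offset as `+00:00` rather than `Z`; whether `toml-test` accepts
every spelling Python produces is not verified here.

```python
import datetime
import json
import sys


def tag(value):
    """Convert a parsed TOML value into the tagged JSON structure."""
    if isinstance(value, dict):
        return {key: tag(item) for key, item in value.items()}
    if isinstance(value, list):
        return [tag(item) for item in value]
    # bool must be tested before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return {"type": "bool", "value": "true" if value else "false"}
    if isinstance(value, int):
        return {"type": "integer", "value": str(value)}
    if isinstance(value, float):
        return {"type": "float", "value": str(value)}
    if isinstance(value, str):
        return {"type": "string", "value": value}
    # datetime must be tested before date: datetime is a subclass of date.
    if isinstance(value, datetime.datetime):
        kind = "datetime" if value.tzinfo is not None else "datetime-local"
        return {"type": kind, "value": value.isoformat()}
    if isinstance(value, datetime.date):
        return {"type": "date-local", "value": value.isoformat()}
    if isinstance(value, datetime.time):
        return {"type": "time-local", "value": value.isoformat()}
    raise TypeError(f"unsupported TOML value: {value!r}")


def main():
    """Read TOML from stdin, write tagged JSON to stdout, return the exit code."""
    import tomllib  # in the standard library since Python 3.11

    try:
        # TOML documents must be UTF-8, so decode the raw bytes explicitly.
        parsed = tomllib.loads(sys.stdin.buffer.read().decode("utf-8"))
    except (tomllib.TOMLDecodeError, UnicodeDecodeError) as err:
        print(err, file=sys.stderr)
        return 1
    json.dump(tag(parsed), sys.stdout)
    return 0
```

To use it as a decoder command, end the script with `sys.exit(main())` and pass
the script to `toml-test` as the first positional argument.
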
### Implementing an encoder
For your encoder to be compatible with `toml-test`, it **must** satisfy the
expected interface:

- Your encoder **must** accept JSON data on `stdin` until EOF.
- If the JSON data cannot be converted to a valid TOML representation, your
  encoder **must** return with a non-zero exit code indicating an error.
- If the JSON data can be converted to a valid TOML representation, your encoder
  **must** output a TOML encoding of that data on `stdout` and return with a
  zero exit code indicating success.

An example in pseudocode:

    json_data = read_stdin()

    parsed_json_with_tags = decode_json(json_data)

    if error_parsing_json():
        print_error_to_stderr()
        exit(1)

    print_as_toml(parsed_json_with_tags)
    exit(0)

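The encoder side can be sketched in Python as well. This is a deliberately
small illustration, not a full TOML writer: every nested table is written as an
inline table so that no section headers are needed, strings are quoted by
reusing JSON's escape rules (with one extra replacement for the DEL character),
and tagged scalar values other than strings are copied through on the
assumption that the tagged string is already valid TOML syntax. All function
names are hypothetical. A script would end with `sys.exit(main())`.

```python
import json
import re
import sys

BARE_KEY = re.compile(r"[A-Za-z0-9_-]+")
# Assumption: for these types the tagged string is already valid TOML syntax.
VERBATIM_TYPES = {"integer", "bool", "datetime", "datetime-local",
                  "date-local", "time-local"}


def quote(text):
    """Render text as a TOML basic string, reusing JSON's escape rules."""
    return json.dumps(text, ensure_ascii=False).replace("\x7f", "\\u007f")


def fmt_key(key):
    """Use a bare key where TOML allows it, otherwise a quoted key."""
    return key if BARE_KEY.fullmatch(key) else quote(key)


def is_tagged(node):
    """True for a leaf of the form {"type": "...", "value": "..."}."""
    return (isinstance(node, dict) and set(node) == {"type", "value"}
            and isinstance(node["type"], str)
            and isinstance(node["value"], str))


def fmt_value(node):
    """Render a tagged leaf, an array, or a table as a TOML value."""
    if is_tagged(node):
        kind, text = node["type"], node["value"]
        if kind == "string":
            return quote(text)
        if kind == "float":
            # A bare "3" would read back as an integer, so force a fraction.
            return text + ".0" if re.fullmatch(r"[+-]?\d+", text) else text
        if kind in VERBATIM_TYPES:
            return text
        raise ValueError(f"unknown type tag: {kind!r}")
    if isinstance(node, list):
        return "[" + ", ".join(fmt_value(item) for item in node) + "]"
    if isinstance(node, dict):
        pairs = ", ".join(f"{fmt_key(k)} = {fmt_value(v)}"
                          for k, v in node.items())
        return "{" + pairs + "}"
    raise ValueError(f"not tagged JSON: {node!r}")


def encode(document):
    """Render a tagged JSON document as top-level TOML key/value pairs."""
    if not isinstance(document, dict):
        raise ValueError("top level must be a JSON object")
    return "".join(f"{fmt_key(k)} = {fmt_value(v)}\n"
                   for k, v in document.items())


def main():
    """Read tagged JSON from stdin, write TOML to stdout, return the exit code."""
    try:
        output = encode(json.loads(sys.stdin.buffer.read()))
    except ValueError as err:  # also covers JSON and UTF-8 decoding errors
        print(err, file=sys.stderr)
        return 1
    sys.stdout.write(output)
    return 0
```
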
JSON encoding
-------------
The following JSON encoding applies equally to both encoders and decoders:

- TOML tables correspond to JSON objects.
- TOML arrays, including arrays of tables, correspond to JSON arrays.
- TOML values correspond to a special JSON object of the form:
  `{"type": "{TTYPE}", "value": {TVALUE}}`

In the above, `TTYPE` may be one of:

- string
- integer
- float
- bool
- datetime
- datetime-local
- date-local
- time-local

`TVALUE` is always a JSON string.

Empty tables correspond to empty JSON objects (`{}`) and empty arrays correspond
to empty JSON arrays (`[]`).

Offset datetimes should be encoded in RFC 3339; local datetimes should be
encoded following RFC 3339 without the offset part. Local dates should be
encoded as the date part of RFC 3339 and local times as the time part.

Examples:

    TOML                JSON

    a = 42              {"a": {"type": "integer", "value": "42"}}

---

    [tbl]               {"tbl": {
    a = 42                  "a": {"type": "integer", "value": "42"}
                        }}

---

    a = ["a", 2]        {"a": [
                            {"type": "string", "value": "a"},
                            {"type": "integer", "value": "2"}
                        ]}

Or a more complex example:

```toml
best-day-ever = 1987-07-05T17:45:00Z

[numtheory]
boring     = false
perfection = [6, 28, 496]
```

And the JSON encoding expected by `toml-test` is:

```json
{
  "best-day-ever": {"type": "datetime", "value": "1987-07-05T17:45:00Z"},
  "numtheory": {
    "boring": {"type": "bool", "value": "false"},
    "perfection": [
      {"type": "integer", "value": "6"},
      {"type": "integer", "value": "28"},
      {"type": "integer", "value": "496"}
    ]
  }
}
```

Note that the only JSON values ever used are objects, arrays and strings.

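Because the format is this restricted, hand-written expected output can be
checked mechanically. The following Python sketch walks a decoded JSON document
and reports anything that breaks the rules above; `check_tagged` is a
hypothetical helper, not part of `toml-test`, and it checks only the structure,
not whether each value string is well-formed for its type.

```python
TYPES = {"string", "integer", "float", "bool",
         "datetime", "datetime-local", "date-local", "time-local"}


def check_tagged(node, path="$"):
    """Return a list of problems found in a tagged JSON structure."""
    if isinstance(node, list):
        return [problem
                for index, item in enumerate(node)
                for problem in check_tagged(item, f"{path}[{index}]")]
    if not isinstance(node, dict):
        # Only objects, arrays, and (inside a tagged leaf) strings are allowed.
        return [f"{path}: expected object or array, got {type(node).__name__}"]
    if set(node) == {"type", "value"} and isinstance(node["type"], str):
        # A tagged leaf: the members of a table are never bare strings, so a
        # string under "type" cannot belong to an ordinary table.
        problems = []
        if node["type"] not in TYPES:
            problems.append(f"{path}: unknown type {node['type']!r}")
        if not isinstance(node["value"], str):
            problems.append(f"{path}: value must be a JSON string")
        return problems
    # Anything else is a table: check every member.
    return [problem
            for key, item in node.items()
            for problem in check_tagged(item, f"{path}.{key}")]
```

Called on the JSON document above (after `json.loads`), it returns an empty
list; a leaf such as `{"type": "integer", "value": 42}` is reported because its
value is a JSON number rather than a string.
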
An example implementation can be found in BurntSushi/toml:

- [Add tags](https://git.sr.ht/~pingoo/stdx/toml/blob/master/internal/tag/add.go)
- [Remove tags](https://git.sr.ht/~pingoo/stdx/toml/blob/master/internal/tag/rm.go)

Implementation-defined behaviour
--------------------------------
This only tests behaviour that should be true for every encoder implementing
TOML; a few things are left up to implementations, and are not tested here.

- Millisecond precision (3 digits) is required for datetimes and times. Further
  precision is implementation-specific, and any greater precision than is
  supported must be truncated (not rounded).

  This tests only millisecond precision, and not any further precision or the
  truncation of it.

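To illustrate the difference between truncating and rounding, here is a short
Python sketch for an implementation that keeps only millisecond precision. It
relies on the standard library's `isoformat(timespec="milliseconds")`, which
truncates the fractional part; the sample time is made up for the example.

```python
import datetime

# 17:45:00.999999 has more precision than milliseconds can hold.
moment = datetime.time(17, 45, 0, 999_999)

# timespec="milliseconds" drops the extra digits instead of rounding them.
# Rounding would have produced 17:45:01.000, which is not what TOML asks for.
truncated = moment.isoformat(timespec="milliseconds")
print(truncated)  # 17:45:00.999
```
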
Assumptions of Truth
--------------------
The following are taken as ground truths by `toml-test`:

- All tests classified as `invalid` **are** invalid.
- All tests classified as `valid` **are** valid.
- All expected outputs in `valid/test-name.json` are exactly correct.
- The Go standard library package `encoding/json` decodes JSON correctly.
- When testing encoders, the TOML decoder at
  [git.sr.ht/~pingoo/stdx/toml/toml](https://git.sr.ht/~pingoo/stdx/toml) is assumed to be
  correct. (Note that this assumption is not made when testing decoders!)

Of particular note is that **no TOML decoder** is taken as ground truth when
testing decoders. This means that most changes to the spec will only require an
update of the tests in `toml-test`. (Bigger changes may require an adjustment of
how two things are considered equal, particularly if a new type of data is
added.) Obviously, this advantage does not apply to testing TOML encoders, since
there must exist a TOML decoder that conforms to the specification in order to
read the output of a TOML encoder.

Adding tests
------------
`toml-test` was designed so that tests can be easily added and removed. As
mentioned above, tests are split into two groups: invalid and valid tests.

Invalid tests **only check if a decoder rejects invalid TOML data**. Or, in the
case of testing encoders, invalid tests **only check if an encoder rejects an
invalid representation of TOML** (e.g., a heterogeneous array). Therefore, all
invalid tests should try to **test one thing and one thing only**. Each invalid
test should be named after the fault it is trying to expose. Invalid tests for
decoders are in the `tests/invalid` directory while invalid tests for encoders
are in the `tests/invalid-encoder` directory.

Valid tests check that a decoder accepts valid TOML data **and** that the parser
has the correct representation of the TOML data. Therefore, valid tests need a
JSON encoding in addition to the TOML data. The tests should be small enough
that writing the JSON encoding by hand will not give you brain damage. The exact
reverse is true when testing encoders.

A valid test without either a `.json` or `.toml` file will automatically fail.

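Before submitting new valid tests, it can help to confirm that every test has
both files. The Python sketch below lists test names that have only one of the
two; `unpaired_tests` is a hypothetical helper, and it assumes valid tests live
in a directory tree (such as `tests/valid`, possibly with subdirectories like
`string/`) where the two files share a name and differ only in extension.

```python
import pathlib


def unpaired_tests(valid_dir):
    """Return test names under valid_dir that lack a .toml or a .json file."""
    root = pathlib.Path(valid_dir)
    names = {
        ext: {path.relative_to(root).with_suffix("").as_posix()
              for path in root.rglob(f"*{ext}")}
        for ext in (".toml", ".json")
    }
    # The symmetric difference keeps names present for only one extension.
    return sorted(names[".toml"] ^ names[".json"])
```

An empty result means every `.toml` file has a matching `.json` file and vice
versa.
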
If you have tests that you'd like to add, please submit a pull request.

Why JSON?
---------
In order for a language-agnostic test suite to work, we need some kind of data
exchange format. TOML cannot be used, as it would imply that a particular parser
has a blessing of correctness.

My decision to use JSON was not a careful one. It was based on expediency. The
Go standard library has an excellent `encoding/json` package built in, which
made it easy to compare JSON data.

The problem with JSON is that the types in TOML are not in one-to-one
correspondence with JSON. This is why every TOML value represented in JSON is
tagged with a type annotation, as described above.

YAML may be closer in correspondence with TOML, but I don't believe we should
rely on that correspondence. Making things explicit with JSON means that writing
tests is a little more cumbersome, but it also reduces the number of assumptions
we need to make.