github.com/fraugster/parquet-go@v0.12.0/parquetschema/doc.go

// Package parquetschema contains functions and data types to manage
// schema definitions for the parquet-go package. Most importantly, it
// provides a schema definition parser that turns a textual representation
// of a parquet schema into a SchemaDefinition object.
//
// To let users define parquet schemas in other ways, this package also
// exposes the necessary data types, so that a SchemaDefinition object can
// be assembled programmatically.
//
// To construct a schema definition, start with a SchemaDefinition object and
// set its RootDocument field to a ColumnDefinition. This "root column" describes
// the whole message. The root column doesn't have a type of its own, so its
// SchemaElement can be left unset. Inside the root column definition, you then
// need to populate the children. For each child, you need to set the SchemaElement,
// and either SchemaElement.Type or the child's own children, but not both.
// If no type is set, the column is a group consisting of its children;
// a group without children is nonsensical. If a type is set, the field
// is of that particular type and therefore can't have any children.
//
// To ensure that schema definitions that were not constructed by the schema
// parser are sound and don't miss any information, you can call the Validate()
// method on the SchemaDefinition. It checks the general soundness of the data
// types that are set, the overall structure (types vs groups), and whether
// elements that use logical types or converted types adhere to the conventions
// laid out in the parquet documentation:
// https://github.com/apache/parquet-format/blob/master/LogicalTypes.md
package parquetschema
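
The "type or children, never both" rule from the package comment can be illustrated with a small standalone sketch. It does not use the parquetschema API: `column`, `checkColumn` and `checkRoot` are hypothetical stand-ins invented for this example, and the string field `typ` only mimics the role of SchemaElement.Type. The real Validate() performs further checks, such as logical and converted type conventions, that are not modelled here.

```go
package main

import (
	"errors"
	"fmt"
)

// column is a hypothetical stand-in for a column definition.
// An empty typ means "no type set", i.e. the column is a group.
type column struct {
	name     string
	typ      string
	children []*column
}

// checkColumn enforces the structural rule: a column has either a type
// or children, never both and never neither. It recurses into groups.
func checkColumn(c *column) error {
	hasType := c.typ != ""
	hasChildren := len(c.children) > 0
	switch {
	case hasType && hasChildren:
		return fmt.Errorf("column %q: a typed field can't have children", c.name)
	case !hasType && !hasChildren:
		return fmt.Errorf("column %q: a group needs children", c.name)
	}
	for _, child := range c.children {
		if err := checkColumn(child); err != nil {
			return err
		}
	}
	return nil
}

// checkRoot validates the root column, which describes the whole message
// and therefore must not carry a type of its own.
func checkRoot(root *column) error {
	if root == nil {
		return errors.New("root column is not set")
	}
	if root.typ != "" {
		return errors.New("root column must not have a type")
	}
	return checkColumn(root)
}

func main() {
	// A message with one typed field and one nested group.
	root := &column{
		name: "msg",
		children: []*column{
			{name: "id", typ: "int64"},
			{name: "address", children: []*column{
				{name: "street", typ: "binary"},
			}},
		},
	}
	if err := checkRoot(root); err != nil {
		fmt.Println("invalid schema:", err)
		return
	}
	fmt.Println("schema is structurally sound")
}
```

Checking the structure recursively from the root keeps the rule in one place, so the same check applies to nested groups at any depth.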