github.com/cockroachdb/cockroach@v20.2.0-alpha.1+incompatible/pkg/ccl/importccl/testdata/avro/README.md (about)

###### Prerequisites

You will need two tools installed:

1. Avro tools: `$ brew install avro-tools`
2. jq, to manipulate JSON files: `$ brew install jq`

_simple-schema.json_ contains the Avro schema, written in JSON, for our _simple_ table.

###### Notes on file extensions

* .json: JSON records, one per line, compact format
* .bjson: Avro datum-encoded binary JSON
* .pjson: pretty-printed JSON
* .ocf: OCF (object container format) output
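
As a rough illustration of the difference between the compact (.json) and pretty-printed (.pjson) formats listed above, the sketch below converts a couple of made-up records in both directions with `jq`. The file names and record contents are assumptions for demonstration only; they are not part of the test data.

```shell
# Made-up sample records in compact form (.json): one record per line.
work_dir="$(mktemp -d)"
printf '%s\n' '{"i":1,"s":"a"}' '{"i":2,"s":"b"}' > "$work_dir/sample.json"

# Pretty-print each record (.pjson): one field per line, indented.
jq '.' "$work_dir/sample.json" > "$work_dir/sample.pjson"

# Collapse the pretty-printed records back to compact form.
jq -c '.' "$work_dir/sample.pjson" > "$work_dir/roundtrip.json"

# The round trip should reproduce the compact file byte for byte.
cmp "$work_dir/sample.json" "$work_dir/roundtrip.json" && echo "round trip ok"
```

Both forms hold the same records, so either one can be regenerated from the other.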

###### Test Data Generation

We use `avro-tools` to generate some random data, and then manipulate
the data using `jq`.

Generate OCF:

`$ avro-tools random --schema-file simple-schema.json --count 1000 simple.ocf`

Generate sorted pretty-printed JSON:

`$ avro-tools tojson simple.ocf | jq -s 'sort_by(.i) | .[]' > simple-sorted.pjson`

Generate sorted compact JSON:

`$ avro-tools tojson simple.ocf | jq -s -c 'sort_by(.i) | .[]' > simple-sorted.json`

Generate Avro fragments (BIN_RECORDS):

`$ avro-tools jsontofrag --schema-file simple-schema.json simple-sorted.json > simple-sorted-records.avro`
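
Before reusing a generated compact JSON file, it can help to confirm that it holds the expected number of records and that they really are ordered by the sort key. The following is a minimal sketch of such a check, not part of the original workflow: `check_sorted_records` is a hypothetical helper name, and the demonstration records are made up.

```shell
# Hypothetical helper: succeed only if FILE holds exactly COUNT records and
# they appear in ascending order of KEY (using jq's ordering of JSON values).
check_sorted_records() {
  key="$1"    # field to check, e.g. i
  count="$2"  # expected number of records, e.g. 1000
  file="$3"   # compact JSON file, e.g. simple-sorted.json
  jq -e -s --arg k "$key" --argjson n "$count" \
    'length == $n and (map(.[$k]) | . == sort)' "$file" > /dev/null
}

# Small self-contained demonstration on made-up records.
sorted_demo="$(mktemp)"
printf '%s\n' '{"i":1}' '{"i":2}' '{"i":3}' > "$sorted_demo"
check_sorted_records i 3 "$sorted_demo" && echo "sorted, 3 records"
```

For the files generated above, the equivalent call would be `check_sorted_records i 1000 simple-sorted.json`.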

Note: `avro-tools` generates random data, so values are not guaranteed to be unique.
Spot-check that your primary key data does not contain duplicates.
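
One possible way to do the duplicate check is sketched below. It assumes that `i`, the field the files are sorted by, is the primary key, and that the input is a compact JSON file with one record per line. `find_duplicate_keys` is a hypothetical helper name and the sample records are made up.

```shell
# Hypothetical helper: print every value of KEY that occurs more than once
# in FILE. Prints nothing when all values are unique.
find_duplicate_keys() {
  key="$1"   # field name, e.g. i (assumed here to be the primary key)
  file="$2"  # compact JSON file, e.g. simple-sorted.json
  jq -c --arg k "$key" '.[$k]' "$file" | sort | uniq -d
}

# Small self-contained demonstration: the value 2 appears twice.
dupes_demo="$(mktemp)"
printf '%s\n' '{"i":1,"s":"a"}' '{"i":2,"s":"b"}' '{"i":2,"s":"c"}' > "$dupes_demo"
find_duplicate_keys i "$dupes_demo"   # prints: 2
```

Against the generated data this would be run as `find_duplicate_keys i simple-sorted.json`; empty output means no duplicate keys were found.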