github.com/square/finch@v0.0.0-20240412205204-6530c03e2b96/docs/content/data/keys.md

github.com/square/finch@v0.0.0-20240412205204-6530c03e2b96/docs/content/data/keys.md (about)

     1  ---
     2  weight: 1
     3  ---
     4  
     5  _Data keys_ are placeholders in SQL statements, like "@d", that are replaced with real values when executed by a client.
     6  
     7  ![Finch data key and generator](/finch/img/finch_data_key.svg)
     8  
     9  Data keys are used in [trx files]({{< relref "syntax/trx-file" >}}), configured in [stage files]({{< relref "syntax/stage-file" >}}), and are replaced at runtime when executed by a client.
    10  
    11  {{< toc >}}
    12  
    13  ## Name and Configure
    14  
    15  The naming rule is simple: "@" followed by a single word (no spaces) in trx files/SQL statements, but no "@" prefix in stage files:
    16  
    17  {{< columns >}}
    18  _Use in Trx File/SQL Statement_
    19  ```
    20  @d
    21  @id
    22  @last_updated
    23  ```
    24  <--->
    25  _Configure in Stage File_
    26  ```
    27  d
    28  id
    29  last_updated
    30  
    31  ```
    32  {{< /columns >}}
    33  
    34  Examples in these docs use the canonical data key (@d) and other short names, but a good practice is to name data keys after the columns for which they're used:
    35  
    36  ```
    37  SELECT c FROM t WHERE id = @id AND n > @n AND @k IS NOT NULL
    38  ```
    39  
    40  Data keys are configured in [`stage.trx[].data`]({{< relref "syntax/stage-file#data" >}}): each data key must have a corresponding entry in that map.
    41  Since "@" is a special character in YAML, the data key names in a stage file do _not_ have the "@" prefix:
    42  
    43  ```yaml
    44  stage:
    45    trx:
    46      - file: read.sql
    47        data:
    48          d:                   # @d
    49            generator: int
    50          id:                  # @id
    51            generator: int
    52            params:
    53              max: 4294967296
    54          last_updated:        # @last_updated
    55            generator: int
    56  ```
    57  
    58  The only required configuration is `generator`: the name of a [data generator]({{< relref "data/generators" >}}) to use for the data key.
    59  But you most likely need to specify generator-specific parameters, as shown above for "id".
    60  See [`stage.trx[].data`]({{< relref "syntax/stage-file#data" >}}) for the full configuration.
    61  
    62  {{< hint type=note title="Terminology" >}}
    63  The terms "data key" and "data generator" are used interchangeably because they're two sides of the same coin.
    64  A specific term is used when necessary to make a technical distinction. 
    65  {{< /hint >}}
    66  
    67  ## Duplicates
    68  
    69  You can reuse the same data key name in a trx file, like @id shown below.
    70  
    71  ```
    72  SELECT c FROM t WHERE id = @id
    73  
    74  UPDATE t SET n=n+1 WHERE id = @id
    75  ```
    76  
    77  This works because the default [data scope]({{< relref "data/scope" >}}) is statement: Finch creates one data generator for @id in the `SELECT`, and another for @id in the `UPDATE`.
    78  However, there are two important points:
    79  
    80  * @id in both statements will have the same configuration because they'll be the same map key ("id") in [`stage.trx[].data`]({{< relref "syntax/stage-file#data" >}})&mdash;same name, same configuration.
    81  * [Data scope]({{< relref "data/scope" >}}) determines if the two @id are different or the same and when each is called to generate a new data value.
    82  
    83  {{< hint type=warning title="Data Scope" >}}
    84  Be sure to read [Data / Scope]({{< relref "data/scope" >}}) to fully understand the second point.
    85  {{< /hint >}}
    86  
    87  The second point&mdash;[data scope]({{< relref "data/scope" >}})&mdash;is important because, in that trx, it looks like the two statements are supposed to access the same row (read the row, then update it).
    88  If that's true, then the default statement data scope won't do what you want.
    89  Instead, you need to make @id the _same_ in both statements by giving it trx scope:
    90  
    91  ```yaml
    92  stage:
    93    trx:
    94      - file: read.sql
    95        data:
    96          id:
    97            scope: trx # <-- data scope
    98            generator: int
    99  ```
   100  
   101  Now the generator will return the same value for @id wherever @id is used in the trx.
   102  
   103  ## Meta
   104  
   105  Meta data keys don't directly generate a value but serve another purpose as documented.
   106  You do _not_ configure meta data keys in a stage file.
   107  
   108  ### @PREV
   109  
   110  @PREV refers to the previous data key.
   111  It's required for ranges: `BETWEEN @d AND @PREV`.
   112  For example:
   113  
   114  ```yaml
   115  stage:
   116    trx:
   117      - file: read.sql
   118        data:
   119          d:
   120            generator: int-range
   121  ```
   122  
   123  The [int-range generator]({{< relref "data/generators#int-range" >}}) returns two values (an ordered pair): the first value replaced @d, the second value replaces @PREV.
   124