---
title: Configuration
sidebar_label: About
description: Learn about Benthos configuration
---

Benthos pipelines are configured in a YAML file that consists of a number of root sections, arranged like so:

import Tabs from '@theme/Tabs';

<Tabs defaultValue="common" values={[
  { label: 'Common', value: 'common', },
  { label: 'Full', value: 'full', },
]}>

import TabItem from '@theme/TabItem';

<TabItem value="common">

```yaml
input:
  kafka:
    addresses: [ TODO ]
    topics: [ foo, bar ]
    consumer_group: foogroup

pipeline:
  processors:
  - bloblang: |
      root.message = this
      root.meta.link_count = this.links.length()

output:
  aws_s3:
    bucket: TODO
    path: '${! meta("kafka_topic") }/${! json("message.id") }.json'
```

</TabItem>
<TabItem value="full">

```yaml
http:
  address: 0.0.0.0:4195
  debug_endpoints: false

input:
  kafka:
    addresses: [ TODO ]
    topics: [ foo, bar ]
    consumer_group: foogroup

buffer:
  none: {}

pipeline:
  processors:
  - bloblang: |
      root.message = this
      root.meta.link_count = this.links.length()

output:
  aws_s3:
    bucket: TODO
    path: '${! meta("kafka_topic") }/${! json("message.id") }.json'

input_resources: []
cache_resources: []
processor_resources: []
rate_limit_resources: []
output_resources: []

logger:
  level: INFO
  static_fields:
    '@service': benthos

metrics:
  prometheus: {}

tracer:
  none: {}

shutdown_timeout: 20s
```

</TabItem>

</Tabs>

Most sections represent a component type, which you can read about in more detail in [this document][components].

These types are hierarchical. For example, an `input` can have a list of child `processor` types attached to it, which in turn can have their own `processor` children.

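As a rough sketch of that hierarchy, the config below attaches a `processors` list directly to an input, and one of those processors (`catch`, which also appears later in this document) carries child processors of its own. The `failed` field is a hypothetical name used purely for illustration.

```yaml
input:
  kafka:
    addresses: [ TODO ]
    topics: [ foo, bar ]
    consumer_group: foogroup
  # Child processors attached directly to the input
  processors:
    - bloblang: |
        root.message = this
    # catch is itself a processor with child processors, which are
    # applied only to messages that failed an earlier processing step
    - catch:
      # The mapping below sets a hypothetical marker field
      - bloblang: |
          root = this
          root.failed = true
```
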
This is powerful but can potentially lead to large and cumbersome configuration files. This document outlines tooling provided by Benthos to help with writing and managing these more complex configuration files.

### Testing

For guidance on how to write and run unit tests for your configuration files read [this guide][config.testing].

## Customising Your Configuration

Sometimes it's useful to write a configuration where certain fields can be defined during deployment. For this purpose Benthos supports [environment variable interpolation][config-interp], allowing you to set fields in your config with environment variables like so:

```yaml
input:
  kafka:
    addresses:
    - ${KAFKA_BROKER:localhost:9092}
    topics:
    - ${KAFKA_TOPIC:default-topic}
```

This is very useful for sharing configuration files across different deployment environments.

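The same `${NAME:default}` pattern can be applied to other sections. The sketch below assumes two hypothetical variable names, `S3_BUCKET` and `LOG_LEVEL`, and reuses the `path` value from the example at the top of this page to contrast the two interpolation styles: environment variables are resolved when the config is read, whereas `${! ... }` function interpolations are resolved for each message.

```yaml
output:
  aws_s3:
    # Falls back to "dev-bucket" when S3_BUCKET (hypothetical name) is unset
    bucket: ${S3_BUCKET:dev-bucket}
    # Function interpolation, resolved per message rather than at startup
    path: '${! meta("kafka_topic") }/${! json("message.id") }.json'

logger:
  # Falls back to INFO when LOG_LEVEL (hypothetical name) is unset
  level: ${LOG_LEVEL:INFO}
```
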
## Reusing Configuration Snippets

Sometimes it's necessary to use a rather large component multiple times. Instead of copy/pasting the configuration or using YAML anchors you can define your component [as a resource][config.resources].

In the following example we want to make an HTTP request with our payloads. Occasionally the payload might get rejected due to garbage within its contents, and so we catch these rejected requests, attempt to "cleanse" the contents and try to make the same HTTP request again. Since the HTTP request component is quite large (and likely to change over time) we make sure to avoid duplicating it by defining it as a resource `get_foo`:

```yaml
pipeline:
  processors:
    - resource: get_foo
    - catch:
      - bloblang: |
          root = this
          root.content = this.content.strip_html()
      - resource: get_foo

processor_resources:
  - label: get_foo
    http:
      url: http://example.com/foo
      verb: POST
      headers:
        SomeThing: "set-to-this"
        SomeThingElse: "set-to-something-else"
```

### Feature Toggles

Resources can be imported separately from your config file with the CLI flag `-r` or `--resources`, which is a useful way to switch out resources with common names based on your chosen environment. For example, with a main configuration file `config.yaml`:

```yaml
pipeline:
  processors:
    - resource: get_foo
```

And then two resource files, one stored at the path `./staging/request.yaml`:

```yaml
processor_resources:
  - label: get_foo
    http:
      url: http://example.com/foo
      verb: POST
      headers:
        SomeThing: "set-to-this"
        SomeThingElse: "set-to-something-else"
```

And another stored at the path `./production/request.yaml`:

```yaml
processor_resources:
  - label: get_foo
    http:
      url: http://example.com/bar
      verb: PUT
      headers:
        Desires: "are-empty"
```

We can select our chosen resource by changing which file we import, either running:

```sh
benthos -r ./staging/request.yaml -c ./config.yaml
```

Or:

```sh
benthos -r ./production/request.yaml -c ./config.yaml
```

These flags also support wildcards, which allows you to import an entire directory of resource files like `benthos -r "./staging/*.yaml" -c ./config.yaml`. You can find out more about configuration resources in the [resources document][config.resources].

### Templating

Resources can only be instantiated with a single configuration, which means they aren't suitable for cases where the configuration is required in multiple places but with slightly different parameters, ugh!

But hey, why don't you chill out? Benthos has a (currently experimental) alternative feature called templates, with which it's possible to define a custom configuration schema and a template for building a configuration from that schema. You can read more about templates [in this guide][config.templating].

## Reloading

It's possible to have a running instance of Benthos reload configurations, including resource files imported with `-r`/`--resources`, automatically when the files are updated without needing to manually restart the service. This is done by specifying the `-w`/`--watcher` flag when running Benthos in normal mode or in streams mode:

```sh
# Normal mode
benthos -w -r ./production/request.yaml -c ./config.yaml
```

```sh
# Streams mode
benthos -w -r ./production/request.yaml streams ./stream_configs/*.yaml
```

If a file update results in configuration parsing or linting errors then the change is ignored (with logs informing you of the problem) and the previous configuration will continue to be run (until the issues are fixed).

## Enabling Discovery

The discoverability of configuration fields is a common headache with any configuration-driven application. The classic solution is to provide curated documentation that is often hosted on a dedicated site.

However, a user often only needs to get their hands on a short, runnable example config file for their use case. They just need to see the format and field names as the fields themselves are usually self-explanatory. Forcing such a user to navigate a website, scrolling through paragraphs of text, seems inefficient when all they actually needed to see was something like:

```yaml
input:
  amqp_0_9:
    urls: [ amqp://guest:guest@localhost:5672/ ]
    consumer_tag: benthos-consumer
    queue: benthos-queue
    prefetch_count: 10
    prefetch_size: 0
output:
  stdout: {}
```

In order to make this process easier Benthos is able to generate usable configuration examples for any component types, and you can do this from the binary using the `create` subcommand.

If, for example, we wanted to generate a config with a websocket input, a Kafka output and a Bloblang processor in the middle, we could do it with the following command:

```text
benthos create websocket/bloblang/kafka
```

> If you need a gentle reminder as to which components Benthos offers you can see those as well with `benthos list`.

All of these generated configuration examples also include other useful config sections such as `metrics`, `logger`, etc. with sensible defaults.

For more information read the output from `benthos create --help`.

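To give a feel for how the `websocket/bloblang/kafka` argument maps onto config sections, here is a hand-written and heavily abridged sketch. It is not the actual output of the command, which contains many more fields populated with defaults, and every value shown is a placeholder assumption.

```yaml
# First segment: the input type
input:
  websocket:
    url: ws://localhost:4195/get/ws # placeholder address

# Second segment: the processor(s) placed in the pipeline
pipeline:
  processors:
    - bloblang: |
        root = this

# Third segment: the output type
output:
  kafka:
    addresses: [ localhost:9092 ] # placeholder broker
    topic: benthos_stream         # placeholder topic
```
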
## Help With Debugging

Once you have a config written you move on to the next headache of proving that it works, or understanding why it doesn't. Benthos, like most good config-driven services, performs validation on configs and tries to provide sensible error messages.

However, with validation it can be hard to capture all problems, and the user usually understands their intentions better than the service. In order to help expose and diagnose config errors Benthos provides two mechanisms: linting and echoing.

### Linting

If you attempt to run a config that has linting errors Benthos will print the errors and halt execution. If, however, you want to test your configs before deployment you can do so with the `lint` subcommand.

For example, imagine we have a config `foo.yaml`, where we intend to read from AMQP, but there is a typo in one of the field names:

```yaml
input:
  amqp_0_9:
    yourl: amqp://guest:guest@rabbitmqserver:5672/
```

We can catch this error before attempting to run the config:

```sh
$ benthos lint ./foo.yaml
./foo.yaml: line 3: field yourl not recognised
```

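For comparison, a corrected `foo.yaml` might look like the following, assuming the intended field was `urls` as used in the `amqp_0_9` example earlier in this document. With the field name fixed, the lint error above should no longer be reported.

```yaml
input:
  amqp_0_9:
    # urls takes a list of addresses, matching the earlier amqp_0_9 example
    urls: [ amqp://guest:guest@rabbitmqserver:5672/ ]
```
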
For more information read the output from `benthos lint --help`.

### Echoing

Echoing is where Benthos can print back your configuration _after_ it has been parsed. It is done with the `echo` subcommand, which is able to show you a normalised version of your config, allowing you to see how it was interpreted:

```sh
benthos -c ./your-config.yaml echo
```

You can check the output of the above command to see if certain sections are missing or fields are incorrect, which allows you to pinpoint typos in the config.

[processors]: /docs/components/processors/about
[config-interp]: /docs/configuration/interpolation
[config.testing]: /docs/configuration/unit_testing
[config.templating]: /docs/configuration/templating
[config.resources]: /docs/configuration/resources
[json-references]: https://tools.ietf.org/html/draft-pbryan-zyp-json-ref-03
[components]: /docs/components/about