+++
title = "Monitoring"
toc = true
weight = 10
pre = "<b>1. </b>"
+++

Choria Server is designed not to open any listening ports unless its Apple HomeKit integration is enabled.

With no listening ports available, the server is monitored through a status file that it writes at a regular interval when enabled:

```ini
plugin.choria.status_file_path = /var/log/choria-status.json
plugin.choria.status_update_interval = 30
```

The above configuration causes the status file to be updated every 30 seconds. The status file must be enabled for any
deep introspection of the server.

### Nagios Check

The `choria tool status` command includes a check that follows the `nagios` protocol. It can check various aspects of the server's
operation.

```nohighlight
# --disconnected          alert when the server is not connected to a broker
# --message-since 1h      alert when no RPC request was received within the last hour
# --max-age 1m            alert when the status file is older than 1 minute
# --token-age 24h         alert 1 day before the token expires
# --certificate-age 24h   alert 1 day before the certificate expires
# --provisioned           alert if the server is in provisioning mode
$ choria tool status --status-file /var/log/choria-status.json \
    --disconnected \
    --message-since 1h \
    --max-age 1m \
    --token-age 24h \
    --certificate-age 24h \
    --provisioned
```

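If the check is driven from a custom program rather than from Nagios itself, the exit code has to be interpreted by the caller. The following Go sketch assumes that `choria tool status` follows the standard Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, anything else UNKNOWN); that convention is not stated on this page, and `nagiosState` and `runCheck` are hypothetical helper names.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// nagiosState maps a plugin exit code to its conventional Nagios state name.
func nagiosState(code int) string {
	switch code {
	case 0:
		return "OK"
	case 1:
		return "WARNING"
	case 2:
		return "CRITICAL"
	default:
		return "UNKNOWN"
	}
}

// runCheck runs a command and returns its exit code and combined output.
func runCheck(name string, args ...string) (int, string) {
	out, err := exec.Command(name, args...).CombinedOutput()
	if err == nil {
		return 0, string(out)
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return exitErr.ExitCode(), string(out)
	}
	// The command could not be started at all, so report UNKNOWN.
	return 3, err.Error()
}

func main() {
	// Only flags shown in the example above are used here.
	code, out := runCheck("choria", "tool", "status",
		"--status-file", "/var/log/choria-status.json",
		"--disconnected", "--max-age", "1m")
	fmt.Printf("%s: %s\n", nagiosState(code), out)
}
```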
### Autonomous Agent Check

A running instance can check itself using an Autonomous Agent. It will then publish Cloud Events about its internal state and, optionally, expose that state to a local Prometheus Node Exporter via its text file directory.

```yaml
watchers:
  - name: check_choria
    type: nagios
    interval: 5m # checks every 5 minutes and requires the status file to be no older than 15 minutes
    properties:
      builtin: choria_status
      token_expire: 1d    # alerts when the token expires within 1 day
      pubcert_expire: 1d  # alerts when the certificate expires within 1 day
      last_message: 1h    # alerts when no RPC message was received in the last hour
```

Review the [Autonomous Agent](https://choria.io/docs/autoagents/) section for full details about these checks.

If Prometheus Node Exporter is running locally with the argument `--collector.textfile.directory=/var/lib/node_exporter/textfile`,
you can configure the same path in Choria, which causes the above Autonomous Agent to write its status to that directory:

```ini
plugin.choria.prometheus_textfile_directory = /var/lib/node_exporter/textfile
```

### Lifecycle Events

Choria publishes a number of events in [Cloud Events](https://cloudevents.io/) format. These can be
observed using `choria tool event` and include start, stop, provisioned and other events from every Choria Server instance.

More details about these events are in these blog posts:

 * [Choria Lifecycle Events](https://choria.io/blog/post/2019/01/03/lifecycle/)
 * [Transitioning Events to Cloud Events](https://choria.io/blog/post/2019/12/05/cloudevents_transition/)
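
For readers who want to process these events in their own tooling, the Go sketch below decodes a Cloud Events 1.0 JSON envelope and checks its required attributes. The sample payload, including its `type`, `source` and `data` values, is a made-up placeholder and not real Choria output, and `cloudEvent` and `parseEvent` are hypothetical names; consult the blog posts above for the actual event types.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// cloudEvent holds the Cloud Events 1.0 attributes used in this sketch.
type cloudEvent struct {
	SpecVersion string          `json:"specversion"`
	Type        string          `json:"type"`
	Source      string          `json:"source"`
	ID          string          `json:"id"`
	Time        time.Time       `json:"time"`
	Data        json.RawMessage `json:"data"`
}

// parseEvent decodes a JSON envelope and rejects it when a required attribute is missing.
func parseEvent(raw []byte) (cloudEvent, error) {
	var ev cloudEvent
	if err := json.Unmarshal(raw, &ev); err != nil {
		return ev, err
	}
	if ev.SpecVersion == "" || ev.Type == "" || ev.Source == "" || ev.ID == "" {
		return ev, errors.New("missing required Cloud Events attribute")
	}
	return ev, nil
}

// sample is a hypothetical envelope; the values are placeholders, not Choria output.
const sample = `{
  "specversion": "1.0",
  "type": "example.lifecycle.startup",
  "source": "example.choria.server",
  "id": "01e72410-d734-4611-9485-8c6a2dd2579b",
  "time": "2019-12-05T13:35:44Z",
  "data": {"identity": "c1.example.net"}
}`

func main() {
	ev, err := parseEvent([]byte(sample))
	if err != nil {
		fmt.Println("invalid event:", err)
		return
	}
	fmt.Println(ev.Type, ev.Source, ev.Time.Format(time.RFC3339), string(ev.Data))
}
```

Keeping `data` as `json.RawMessage` defers decoding of the event-specific payload until the `type` attribute has been inspected, which suits a stream that mixes several event types.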