Tessera can be used with InfluxDB and Prometheus time-series databases to record API usage metrics. The data recorded can be visualised either by creating a custom dashboard or by using an existing dashboarding tool such as Grafana.

In addition, Tessera logs can be searched, analyzed and monitored using Splunk. Splunk can be set up so that the logs for multiple Tessera nodes in a network are accessible from a single centralized Splunk instance.

## API Metrics
Tessera can record the following usage metrics for each endpoint of its API:

* Average Response Time
* Max Response Time
* Min Response Time
* Request Count
* Requests Per Second

These metrics can be stored in an InfluxDB or Prometheus time-series database for further analysis.

* [InfluxDB](https://www.influxdata.com/time-series-platform/influxdb/) should be used when metrics are to be "pushed" from Tessera to the DB (i.e. Tessera starts a service which periodically writes the latest metrics to the DB by calling the DB's API)
* [Prometheus](https://prometheus.io/) should be used when metrics are to be "pulled" from Tessera by the DB (i.e. Tessera exposes a `/metrics` API endpoint which the DB periodically calls to fetch the latest metrics)

Both databases integrate well with the open source dashboard editor [Grafana](https://grafana.com/), which makes it easy to create dashboards that visualise the data captured from Tessera.

### Using InfluxDB
See the [InfluxDB documentation](https://docs.influxdata.com/influxdb) for details on how to set up an InfluxDB database ready for use with Tessera. A summary of the steps is as follows:

1. [Install InfluxDB](https://docs.influxdata.com/influxdb/v1.7/introduction/installation/)
1. Start the InfluxDB server
    ```bash
    influxd -config /path/to/influx.conf
    ```
    For local development/testing the default configuration file (Linux: `/etc/influxdb/influxdb.conf`, macOS: `/usr/local/etc/influxdb.conf`) should be sufficient. For further configuration options see [Configuring InfluxDB](https://docs.influxdata.com/influxdb/v1.7/administration/config/)
1. Connect to the InfluxDB server using the [`influx` CLI](https://docs.influxdata.com/influxdb/v1.7/tools/shell/) and create a new DB. If using the default config, this is simply:
    ```bash
    influx
    > CREATE DATABASE myDb
    ```
1. To view data stored in the database use the [Influx Query Language](https://docs.influxdata.com/influxdb/v1.7/query_language/)
    ```bash
    influx
    > USE myDb
    > SHOW MEASUREMENTS
    > SELECT * FROM <measurement>
    ```

!!! info
    The InfluxDB HTTP API can be called directly as an alternative to using the `influx` CLI

Each Tessera server type (i.e. `P2P`, `Q2T`, `ADMIN`, `THIRDPARTY`, `ENCLAVE`) can be configured to store API metrics in an InfluxDB. These servers can be configured to store metrics in the same DB or in separate ones. Not all servers need to be configured to store metrics.

To configure a server to use an InfluxDB, add `influxConfig` to the server config. For example:

```json
"serverConfigs": [
    {
        "app":"Q2T",
        "enabled": true,
        "serverAddress":"unix:/path/to/tm.ipc",
        "communicationType" : "REST",
        "influxConfig": {
            "serverAddress": "https://localhost:8086",  // InfluxDB server address
            "dbName": "myDb",                           // InfluxDB DB name (DB must already exist)
            "pushIntervalInSecs": 15,                   // How frequently Tessera will push new metrics to the DB
            "sslConfig": {                              // Config required if InfluxDB server is using TLS
                "tls": "STRICT",
                "sslConfigType": "CLIENT_ONLY",
                "clientTrustMode": "CA",
                "clientTrustStore": "/path/to/truststore.jks",
                "clientTrustStorePassword": "password",
                "clientKeyStore": "/path/to/truststore.jks",
                "clientKeyStorePassword": "password"
            }
        }
    },
    {
        "app":"P2P",
        "enabled": true,
        "serverAddress":"http://localhost:9001",
        "communicationType" : "REST",
        "influxConfig": {
            "serverAddress": "http://localhost:8087",
            "dbName": "anotherDb",
            "pushIntervalInSecs": 15
        }
    }
]
```

#### InfluxDB TLS Configuration
InfluxDB supports 1-way TLS. This allows clients to validate the identity of the InfluxDB server and provides data encryption.

See [Enabling HTTPS with InfluxDB](https://docs.influxdata.com/influxdb/v1.7/administration/https_setup/) for details on how to secure an InfluxDB server with TLS. A summary of the steps is as follows:

1. Obtain a CA-signed or self-signed certificate and key (either as separate `.crt` and `.key` files or as a combined `.pem` file)
1. Enable HTTPS in `influx.conf`:
    ``` bash
    # Determines whether HTTPS is enabled.
    https-enabled = true

    # The SSL certificate to use when HTTPS is enabled.
    https-certificate = "/path/to/certAndKey.pem"

    # Use a separate private key location.
    https-private-key = "/path/to/certAndKey.pem"
    ```
1. Restart the InfluxDB server to apply the config changes

To allow Tessera to communicate with a TLS-secured InfluxDB, `sslConfig` must be provided. To configure Tessera as the client in 1-way TLS:
```json
"sslConfig": {
    "tls": "STRICT",
    "sslConfigType": "CLIENT_ONLY",
    "clientTrustMode": "CA",
    "clientTrustStore": "/path/to/truststore.jks",
    "clientTrustStorePassword": "password",
    "clientKeyStore": "/path/to/truststore.jks",
    "clientKeyStorePassword": "password",
    "environmentVariablePrefix": "INFLUX"
}
```
where `truststore.jks` is a Java KeyStore format file containing the trusted certificates for the Tessera client (e.g. the certificate of the CA used to create the InfluxDB certificate).

If the keystore is secured with a password, this password should be provided. Passwords can be provided either in the config (e.g. `clientTrustStorePassword`) or as environment variables (using `environmentVariablePrefix` and setting `<PREFIX>_TESSERA_CLIENT_TRUSTSTORE_PWD`). The [TLS Config](../../Configuration/TLS) documentation explains this in more detail.

As Tessera expects 2-way TLS, a `.jks` file for the `clientKeyStore` must also be provided. This file will not be used, so it can simply be set to the truststore.

### Using Prometheus
The [Prometheus documentation](https://prometheus.io/docs/introduction/overview/) provides all the information needed to get Prometheus set up and ready to integrate with Tessera. The [Prometheus First Steps](https://prometheus.io/docs/introduction/first_steps/) guide is a good starting point. A summary of the steps to store Tessera metrics in a Prometheus DB is as follows:

1. Install Prometheus
1. Create a `prometheus.yml` configuration file to provide Prometheus with the necessary information to pull metrics from Tessera. A simple Prometheus config for use with the [7nodes example network](../../../../Getting Started/7Nodes) is:
    ```yaml
    global:
      scrape_interval:     15s
      evaluation_interval: 15s

    scrape_configs:
      - job_name: tessera-7nodes
        static_configs:
          - targets: ['localhost:9001', 'localhost:9002', 'localhost:9003', 'localhost:9004', 'localhost:9005', 'localhost:9006', 'localhost:9007']
    ```
1. Start Tessera. As Tessera always exposes the `/metrics` endpoint, no additional configuration of Tessera is required
1. Start Prometheus
    ```bash
    prometheus --config.file=prometheus.yml
    ```
1. To view data stored in the database, access the Prometheus UI (by default `localhost:9090`; this address can be changed with the `--web.listen-address` flag) and use the [Prometheus Query Language](https://prometheus.io/docs/prometheus/latest/querying/basics/)

### Creating a Grafana dashboard
Grafana can be used to create dashboards from data stored in InfluxDB or Prometheus databases. See the [Grafana documentation](http://docs.grafana.org/) and [Grafana Getting Started](https://grafana.com/docs/guides/getting_started/) for details on how to set up a Grafana instance and integrate it with databases. A summary of the steps is as follows:

1. [Install and start Grafana](https://grafana.com/docs/) as described for your OS (if using the default config, Grafana will start on port `3000` and require login/password `admin/admin` to access the dashboard)
1. Create a Data Source to provide the necessary details to connect to the database
1. Create a new Dashboard
1. Add panels to the dashboard. Panels are the graphs, tables, statistics etc. that make up a dashboard. The New Panel wizard allows the components of the panel to be configured:
    * Queries: The query used to retrieve data from the datasource. See the following links for info on using the Query Editor for [InfluxDB](https://grafana.com/docs/features/datasources/influxdb/) and [Prometheus](https://grafana.com/docs/features/datasources/prometheus/)
    * Visualization: How to present the data queried, including panel type, axis headings etc.

#### Example dashboard
![](../../../../images/tessera/monitoring/example-grafana-dashboard.png)

To create this dashboard, a [7nodes example network](../../../../Getting Started/7Nodes) was started, with each Tessera node configured to store its `P2P` and `Q2T` metrics in the same InfluxDB. The Quorum Acceptance Tests were run against this network several times to simulate network activity.

As can be seen in the top-right corner, the dashboard was set to only show data collected in the past 15 mins.

To create a dashboard similar to this:

1. Create an InfluxDB datasource within Grafana:
    1. Hover over the cog icon in the left sidebar
    1. Data Sources
    1. Add data source
    1. Select the type of DB to connect to (e.g. InfluxDB or Prometheus)
    1. Fill out the form to provide all necessary DB connection information, e.g.:
    ![](../../../../images/tessera/monitoring/grafana-influxdb-datasource.png)

1. Create a new dashboard
    1. Hover over the plus icon in the left sidebar
    1. Dashboard
    1. Add Query to configure the first panel
    1. Add Panel in the top-right to add additional panels
    ![](../../../../images/tessera/monitoring/grafana-new-dashboard.png)

    !!! note
        For each of the following examples, additional options such as titles, axis labels and formatting can be configured by navigating the menus in the left-hand sidebar

    ![](../../../../images/tessera/monitoring/grafana-panel-sidebar.png)

1. Create *sendRaw requests* panel
    1. Select the correct datasource from the *Queries to* dropdown list
    1. Construct the query as shown in the below image. This retrieves the data for the `sendraw` API from the InfluxDB, finds the sum of the `RequestCount` for this data (i.e. the total number of requests) and groups by `instance` (i.e. each Tessera node). `time($__interval)` automatically scales the graph resolution for the time range and graph width.
    ![](../../../../images/tessera/monitoring/grafana-send-raw-query.png)

    This panel shows the number of private payloads sent to Tessera using the `sendraw` API over time.

1. Create *receiveRaw requests* panel
    1. Select the correct datasource from the *Queries to* dropdown list
    1. Construct the query as shown in the below image. This retrieves the data for the `receiveraw` API from the InfluxDB, finds the sum of the `RequestCount` for this data (i.e. the total number of requests) and groups by `instance` (i.e. each Tessera node). `time($__interval)` automatically scales the graph resolution for the time range and graph width.
    ![](../../../../images/tessera/monitoring/grafana-receive-raw-query.png)

    This panel shows the number of private payloads retrieved from Tessera using the `receiveraw` API over time.

1. Create *partyinfo request rate (Tessera network health)* panel
    1. Select the correct datasource from the *Queries to* dropdown list
    1. Construct the query as shown in the below image. This retrieves the data for the `partyinfo` API from the InfluxDB, finds the non-negative derivative of the `RequestCount` for this data and groups by `instance` (i.e. each Tessera node). `non_negative_derivative(1s)` calculates the per-second change in `RequestCount` and ignores the negative values that occur when a node is stopped and restarted.
    ![](../../../../images/tessera/monitoring/grafana-partyinfo-rate.png)

    This panel shows the rate of POST requests per second to `partyinfo`. For this network of 7 healthy nodes, this rate fluctuates between 5.5 and 6.5 requests/sec. At approx 09:37 node 1 was killed and the `partyinfo` rate across all nodes immediately dropped, because the other nodes were no longer receiving requests to their `partyinfo` API from node 1. At 09:41 node 1 was restarted and the rates returned to their original values.

    This metric can be used as an indirect method of monitoring the health of the network. Using some of the more advanced InfluxDB query options available in Grafana and the other metrics measurements available, it may be possible to make this result more explicit.

    [Alerts and rules](https://grafana.com/docs/alerting/notifications/) can be configured to determine when a node has disconnected and send notifications to pre-configured channels (e.g. Slack, email, etc.).

1. Create *sendRaw rate* panel
    1. Select the correct datasource from the *Queries to* dropdown list
    1. Construct the query as shown in the below image. This retrieves the data for the `sendraw` API from the InfluxDB, finds the sum of the `RequestRate` for this data and groups by `instance` (i.e. each Tessera node). `time($__interval)` automatically scales the graph resolution for the time range and graph width.
    ![](../../../../images/tessera/monitoring/grafana-sendraw-rate-query.png)

    The POST `sendraw` API is used by Quorum whenever a private transaction is sent using the `eth_sendTransaction` or `personal_sendTransaction` API. This panel gives a good indication of the private tx throughput in Quorum. Note that if the `sendraw` API is called by another process, the count will not be a true representation of Quorum traffic.

## Monitoring a Tessera network with Splunk
Splunk can be used to search, analyze and monitor the logs of Tessera nodes.

Consolidating the logs from multiple Tessera nodes in a network requires setting up Splunk and Splunk Universal Forwarders. The following pages from the Splunk documentation are a good starting point for understanding how to achieve this:

* [Consolidate data from multiple hosts](http://docs.splunk.com/Documentation/Forwarder/7.1.2/Forwarder/Consolidatedatafrommultiplehosts)
* [Set up the Universal Forwarder](http://docs.splunk.com/Documentation/Splunk/7.1.2/Forwarding/EnableforwardingonaSplunkEnterpriseinstance#Set_up_the_universal_forwarder)
* [Configure the Universal Forwarder](http://docs.splunk.com/Documentation/Forwarder/7.1.2/Forwarder/Configuretheuniversalforwarder)
* [Enable a receiver](http://docs.splunk.com/Documentation/Forwarder/7.1.2/Forwarder/Enableareceiver)

The general steps to consolidate the logs for a Tessera network in Splunk are:

1. Set up a central Splunk instance if one does not already exist. Typically this will be on a separate host to the hosts running the Tessera nodes. This is known as the *Receiver*.
1. Configure the Tessera hosts to forward their node's logs to the *Receiver* by:
    1. Configuring the format and output location of the node's logs. This is achieved by configuring logback (the logging framework used by Tessera) at node start-up.

        The following example XML configures logback to save Tessera's logs to a file. See the [Logback documentation](https://logback.qos.ch/manual/configuration.html#syntax) for more information on configuring logback:
        ``` xml
        <?xml version="1.0" encoding="UTF-8"?>
        <configuration>
            <appender name="FILE" class="ch.qos.logback.core.FileAppender">
                <file>/path/to/file.log</file>
                <encoder>
                    <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
                </encoder>
            </appender>

            <logger name="org.glassfish.jersey.internal.inject.Providers" level="ERROR" />
            <logger name="org.hibernate.validator.internal.util.Version" level="ERROR" />
            <logger name="org.hibernate.validator.internal.engine.ConfigurationImpl" level="ERROR" />

            <root level="INFO">
                <appender-ref ref="FILE"/>
            </root>
        </configuration>
        ```

        To start Tessera with an XML configuration file:

        ``` bash
        java -Dlogback.configurationFile=/path/to/logback-config.xml -jar /path/to/tessera-app-<version>-app.jar -configfile /path/to/config.json
        ```

    1. Setting up Splunk *Universal Forwarders* (lightweight Splunk clients) on each Tessera host to forward log data for their node to the *Receiver*
1. Set up the Splunk *Receiver* to listen for and receive logging data from the *Universal Forwarders*
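
With the logback pattern from the example config (`%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n`), every log line carries the same fields: time, thread, level, logger and message. These are the fields a Splunk search or field extraction would typically work with. As a rough illustration only, the Python sketch below splits such lines into those fields and counts entries per log level. The `parse_line` helper, the sample lines and the logger names in them are hypothetical; they are not part of Tessera or Splunk.

```python
import re
from collections import Counter

# Regex mirroring the logback pattern from the example config:
#   %d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n
# %-5level pads the level to 5 characters, hence the flexible whitespace.
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<thread>[^\]]*)\] "
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<logger>\S+) - "
    r"(?P<msg>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.rstrip("\n"))
    return match.groupdict() if match else None


# Hypothetical sample lines, invented for illustration only.
SAMPLE = [
    "09:37:01.123 [main] INFO  c.q.tessera.Launcher - Node started",
    "09:37:02.456 [worker-1] ERROR c.q.tessera.p2p.PartyInfo - Connection refused",
    "    at some.stack.Frame(Frame.java:1)",  # continuation line, not matched
]

parsed = [fields for fields in map(parse_line, SAMPLE) if fields is not None]
levels = Counter(fields["level"] for fields in parsed)
print(levels)
```

Lines that do not start with a timestamp, such as stack trace continuations, are skipped here. A real setup would need to decide how to attach them to the preceding event.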