---
tags: [AvalancheGo APIs]
description: This page is an overview of the Health API associated with AvalancheGo. This API can be used for measuring node health.
sidebar_label: Health API
pagination_label: Health API
---

# Health API

This API can be used for measuring node health.

:::info

This API set is for a specific node; it is unavailable on the [public server](/tooling/rpc-providers.md).

:::

## Health Checks

The node periodically runs all health checks, including health checks for each chain.

The frequency at which health checks are run can be specified with the [`--health-check-frequency`](/nodes/configure/avalanchego-config-flags.md) flag.

## Filterable Health Checks

The health checks that the node runs are filterable. You can specify which health checks
you want to see by using `tags` filters. Returned results include only the health checks that
match the specified tags, plus global health checks such as `network` and `database`.
Filtered results show only a subset of the health checks, not the full health of the node.
The node can therefore still be unhealthy in checks excluded by the filter,
even if the returned results show that the node is healthy.
AvalancheGo supports using subnetIDs as tags.

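The filtering rule described above can be illustrated with a toy model. This is only a sketch and not AvalancheGo's implementation: the `Check` type, the `filtered_health` function, and the check and tag names are all hypothetical, and it assumes that a check without tags counts as global.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    """Hypothetical stand-in for one registered health check."""
    name: str
    passing: bool
    tags: frozenset = frozenset()  # assumption: no tags means a global check


def filtered_health(checks, wanted_tags):
    """Report health over global checks plus checks matching any wanted tag."""
    wanted = set(wanted_tags)
    visible = [
        c for c in checks
        if not wanted or not c.tags or (c.tags & wanted)
    ]
    return all(c.passing for c in visible), [c.name for c in visible]


checks = [
    Check("network", passing=True),
    Check("subnet-a-chain", passing=True, tags=frozenset({"subnet-a"})),
    Check("subnet-b-chain", passing=False, tags=frozenset({"subnet-b"})),
]

# Filtered view: looks healthy because the failing check is hidden.
print(filtered_health(checks, ["subnet-a"]))  # (True, ['network', 'subnet-a-chain'])
# Unfiltered view: the failing check makes the node unhealthy.
print(filtered_health(checks, [])[0])  # False
```

The two calls return different verdicts for the same node, which is why a filtered result should not be read as a statement about overall node health.
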
## GET Request

To get an HTTP status code response that indicates the node's health, make a `GET` request.
If the node is healthy, it will return a `200` status code.
If the node is unhealthy, it will return a `503` status code.
In-depth information about the node's health is included in the response body.

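A monitoring script could turn the status code into a boolean along the following lines. This is a minimal sketch using only the Python standard library; the helper names `interpret_status` and `get_health` are illustrative and not part of any AvalancheGo client. It treats any status other than `200` or `503` as an error, which is an assumption of this sketch.

```python
import urllib.error
import urllib.request


def interpret_status(status):
    """Map the HTTP status code of a health GET to a boolean."""
    if status == 200:
        return True
    if status == 503:
        return False
    raise ValueError(f"unexpected status code from health endpoint: {status}")


def get_health(url, timeout=5.0):
    """Return (healthy, body_text) for a health endpoint URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status, body = resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # urllib raises on a 503, but the response body is still readable.
        status, body = err.code, err.read()
    return interpret_status(status), body.decode("utf-8", errors="replace")
```

For example, `get_health("http://localhost:9650/ext/health")` would return the verdict together with the raw report text. If the node cannot be reached at all, `urlopen` raises `urllib.error.URLError`, which this sketch leaves to the caller.
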
### Filtering

To filter GET health checks, add a `tag` query parameter to the request.
The `tag` parameter is a string.
For example, to filter health results by subnetID `29uVeLPJB1eQJkzRemU8g8wZDw5uJRqpab5U2mX9euieVwiEbL`,
use the following query:

```sh
curl 'http://localhost:9650/ext/health?tag=29uVeLPJB1eQJkzRemU8g8wZDw5uJRqpab5U2mX9euieVwiEbL'
```

In this example, the returned results contain the global health checks and the health checks
related to subnetID `29uVeLPJB1eQJkzRemU8g8wZDw5uJRqpab5U2mX9euieVwiEbL`.

**Note:** Filtered results can report the node as healthy even if it is unhealthy on other chains or Subnets.

To filter results by multiple tags, use multiple `tag` query parameters.
For example, to filter health results by subnetIDs `29uVeLPJB1eQJkzRemU8g8wZDw5uJRqpab5U2mX9euieVwiEbL` and
`28nrH5T2BMvNrWecFcV3mfccjs6axM1TVyqe79MCv2Mhs8kxiY`, use the following query:

```sh
curl 'http://localhost:9650/ext/health?tag=29uVeLPJB1eQJkzRemU8g8wZDw5uJRqpab5U2mX9euieVwiEbL&tag=28nrH5T2BMvNrWecFcV3mfccjs6axM1TVyqe79MCv2Mhs8kxiY'
```

The returned results will include health checks for both subnetIDs as well as global health checks.

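When the tag list is built at runtime, the repeated `tag` parameters can be generated rather than concatenated by hand. The sketch below uses the standard library's `urlencode`; the `health_url` helper is a hypothetical name introduced here for illustration.

```python
from urllib.parse import urlencode


def health_url(base_url, tags=(), path="/ext/health"):
    """Build a health URL with one `tag` query parameter per tag."""
    query = urlencode([("tag", tag) for tag in tags])
    url = base_url.rstrip("/") + path
    return f"{url}?{query}" if query else url


print(health_url(
    "http://localhost:9650",
    [
        "29uVeLPJB1eQJkzRemU8g8wZDw5uJRqpab5U2mX9euieVwiEbL",
        "28nrH5T2BMvNrWecFcV3mfccjs6axM1TVyqe79MCv2Mhs8kxiY",
    ],
))
```

With the two subnetIDs above, this prints the same URL as the multi-tag `curl` example. With an empty tag list it returns the unfiltered `/ext/health` URL.
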
### Endpoints

The available endpoints for GET requests are:

- `/ext/health` returns a holistic report of the status of the node.
  **Most operators should monitor this status.**
- `/ext/health/health` is the same as `/ext/health`.
- `/ext/health/readiness` returns healthy once the node has finished initializing.
- `/ext/health/liveness` returns healthy once the endpoint is available.

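A deployment script might use the readiness endpoint to wait for a freshly started node. The following is a sketch under stated assumptions: the function names, the default timeout, and the polling interval are arbitrary choices for illustration, and the probe assumes the readiness endpoint follows the `200`/`503` convention described for GET requests.

```python
import time
import urllib.error
import urllib.request


def readiness_probe(url="http://localhost:9650/ext/health/readiness", timeout=5.0):
    """Return True if the readiness endpoint answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False  # e.g. 503 while the node is still initializing


def wait_until_ready(probe, timeout=300.0, interval=5.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll `probe` until it returns True or `timeout` seconds elapse."""
    deadline = clock() + timeout
    while True:
        try:
            if probe():
                return True
        except OSError:
            # Connection refused and similar: the API is not up yet.
            pass
        if clock() >= deadline:
            return False
        sleep(interval)
```

Calling `wait_until_ready(readiness_probe)` blocks until the node reports ready or the timeout expires. The `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting.
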
## JSON RPC Request

### Format

This API uses the JSON-RPC 2.0 format. For more information on making JSON-RPC calls, see
[the guide to issuing API calls](/reference/standards/guides/issuing-api-calls.md).

### Endpoint

```text
/ext/health
```

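The Health API methods (`health.health`, `health.readiness`, and `health.liveness`) share the same JSON-RPC envelope and differ only in the method name and the optional `tags` parameter. A small helper can capture that. This is a sketch using the Python standard library; `build_rpc_request` and `call_health` are hypothetical names, and the default URL assumes a node listening on `localhost:9650`.

```python
import json
import urllib.request


def build_rpc_request(method, tags=None, request_id=1):
    """Build the JSON-RPC 2.0 request body for a Health API method."""
    payload = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if tags:
        payload["params"] = {"tags": list(tags)}
    return json.dumps(payload).encode("utf-8")


def call_health(method, tags=None,
                url="http://localhost:9650/ext/health", timeout=5.0):
    """POST the request and return the decoded `result` object."""
    req = urllib.request.Request(
        url,
        data=build_rpc_request(method, tags),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        reply = json.load(resp)
    if "error" in reply:
        raise RuntimeError(f"{method} failed: {reply['error']}")
    return reply["result"]
```

For instance, `call_health("health.health", tags=["29uVeLPJB1eQJkzRemU8g8wZDw5uJRqpab5U2mX9euieVwiEbL"])` would send a filtered request and return the `result` object.
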
### Methods

#### `health.health`

This method returns the last set of health check results.

**Example Call:**

```sh
curl -H 'Content-Type: application/json' --data '{
    "jsonrpc":"2.0",
    "id"     :1,
    "method" :"health.health",
    "params": {
        "tags": ["11111111111111111111111111111111LpoYY", "29uVeLPJB1eQJkzRemU8g8wZDw5uJRqpab5U2mX9euieVwiEbL"]
    }
}' 'http://localhost:9650/ext/health'
```

**Example Response:**

```json
{
    "jsonrpc": "2.0",
    "result": {
        "checks": {
            "C": {
                "message": {
                    "engine": {
                        "consensus": {
                            "lastAcceptedHeight": 31273749,
                            "lastAcceptedID": "2Y4gZGzQnu8UjnHod8j1BLewHFVEbzhULPNzqrSWEHkHNqDrYL",
                            "longestProcessingBlock": "0s",
                            "processingBlocks": 0
                        },
                        "vm": null
                    },
                    "networking": {
                        "percentConnected": 0.9999592612587486
                    }
                },
                "timestamp": "2024-03-26T19:44:45.2931-04:00",
                "duration": 20375
            },
            "P": {
                "message": {
                    "engine": {
                        "consensus": {
                            "lastAcceptedHeight": 142517,
                            "lastAcceptedID": "2e1FEPCBEkG2Q7WgyZh1v4nt3DXj1HDbDthyhxdq2Ltg3shSYq",
                            "longestProcessingBlock": "0s",
                            "processingBlocks": 0
                        },
                        "vm": null
                    },
                    "networking": {
                        "percentConnected": 0.9999592612587486
                    }
                },
                "timestamp": "2024-03-26T19:44:45.293115-04:00",
                "duration": 8750
            },
            "X": {
                "message": {
                    "engine": {
                        "consensus": {
                            "lastAcceptedHeight": 24464,
                            "lastAcceptedID": "XuFCsGaSw9cn7Vuz5e2fip4KvP46Xu53S8uDRxaC2QJmyYc3w",
                            "longestProcessingBlock": "0s",
                            "processingBlocks": 0
                        },
                        "vm": null
                    },
                    "networking": {
                        "percentConnected": 0.9999592612587486
                    }
                },
                "timestamp": "2024-03-26T19:44:45.29312-04:00",
                "duration": 23291
            },
            "bootstrapped": {
                "message": [],
                "timestamp": "2024-03-26T19:44:45.293078-04:00",
                "duration": 3375
            },
            "database": {
                "timestamp": "2024-03-26T19:44:45.293102-04:00",
                "duration": 1959
            },
            "diskspace": {
                "message": {
                    "availableDiskBytes": 227332591616
                },
                "timestamp": "2024-03-26T19:44:45.293106-04:00",
                "duration": 3042
            },
            "network": {
                "message": {
                    "connectedPeers": 284,
                    "sendFailRate": 0,
                    "timeSinceLastMsgReceived": "293.098ms",
                    "timeSinceLastMsgSent": "293.098ms"
                },
                "timestamp": "2024-03-26T19:44:45.2931-04:00",
                "duration": 2333
            },
            "router": {
                "message": {
                    "longestRunningRequest": "66.90725ms",
                    "outstandingRequests": 3
                },
                "timestamp": "2024-03-26T19:44:45.293097-04:00",
                "duration": 3542
            }
        },
        "healthy": true
    },
    "id": 1
}
```

In this example response, every check has passed, so the node is healthy.

**Response Explanation:**

- `checks` is a map from check name to that check's response.
  - A check response may include a `message` with additional context.
  - A check response may include an `error` describing why the check failed.
  - `timestamp` is the timestamp of the last health check.
  - `duration` is the execution duration of the last health check, in nanoseconds.
  - `contiguousFailures` is the number of times in a row this check failed.
  - `timeOfFirstFailure` is the time this check first failed.
- `healthy` is true if all the health checks are passing.

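The fields listed above are enough to summarize which checks are failing. The sketch below assumes that a check is failing exactly when its response carries an `error` field. The `failing_checks` helper and the sample failure data are made up for illustration and are not real node output.

```python
def failing_checks(result):
    """Return {check name: details} for checks that report an `error`."""
    failures = {}
    for name, check in result.get("checks", {}).items():
        if "error" not in check:
            continue
        failures[name] = {
            "error": check["error"],
            "contiguousFailures": check.get("contiguousFailures", 0),
            "timeOfFirstFailure": check.get("timeOfFirstFailure"),
            "duration_ms": check.get("duration", 0) / 1_000_000,  # ns to ms
        }
    return failures


# Hand-written sample in the shape of a `health.health` result.
result = {
    "checks": {
        "database": {
            "timestamp": "2024-03-26T19:44:45.293102-04:00",
            "duration": 1959,
        },
        "network": {
            "error": "example failure text",  # made-up error value
            "timestamp": "2024-03-26T19:44:45.2931-04:00",
            "duration": 2333,
            "contiguousFailures": 3,
            "timeOfFirstFailure": "2024-03-26T19:43:15.2931-04:00",
        },
    },
    "healthy": False,
}

print(sorted(failing_checks(result)))  # ['network']
```

Dividing `duration` by 1,000,000 converts the nanosecond value to milliseconds, which is usually easier to read in logs or alerts.
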
#### `health.readiness`

This method returns the last evaluation of the startup health check results.

**Example Call:**

```sh
curl -H 'Content-Type: application/json' --data '{
    "jsonrpc":"2.0",
    "id"     :1,
    "method" :"health.readiness",
    "params": {
        "tags": ["11111111111111111111111111111111LpoYY", "29uVeLPJB1eQJkzRemU8g8wZDw5uJRqpab5U2mX9euieVwiEbL"]
    }
}' 'http://localhost:9650/ext/health'
```

**Example Response:**

```json
{
    "jsonrpc": "2.0",
    "result": {
        "checks": {
            "bootstrapped": {
                "message": [],
                "timestamp": "2024-03-26T20:02:45.299114-04:00",
                "duration": 2834
            }
        },
        "healthy": true
    },
    "id": 1
}
```

In this example response, every check has passed, so the node has finished the startup process.

**Response Explanation:**

- `checks` is a map from check name to that check's response.
  - A check response may include a `message` with additional context.
  - A check response may include an `error` describing why the check failed.
  - `timestamp` is the timestamp of the last health check.
  - `duration` is the execution duration of the last health check, in nanoseconds.
  - `contiguousFailures` is the number of times in a row this check failed.
  - `timeOfFirstFailure` is the time this check first failed.
- `healthy` is true if all the health checks are passing.

#### `health.liveness`

This method returns healthy whenever the node is able to handle the request.

**Example Call:**

```sh
curl -H 'Content-Type: application/json' --data '{
    "jsonrpc":"2.0",
    "id"     :1,
    "method" :"health.liveness"
}' 'http://localhost:9650/ext/health'
```

**Example Response:**

```json
{
    "jsonrpc": "2.0",
    "result": {
        "checks": {},
        "healthy": true
    },
    "id": 1
}
```

In this example response, the node was able to handle the request and mark the service as healthy.

**Response Explanation:**

- `checks` is an empty object.
- `healthy` is true.