# RFC 006: Event Subscription

## Changelog

- 30-Oct-2021: Initial draft (@creachadair)

## Abstract

The Tendermint consensus node allows clients to subscribe to its event stream
via methods on its RPC service. The ability to view the event stream is
valuable for clients, but the current implementation has some deficiencies that
make it difficult for some clients to use effectively. This RFC documents these
issues and discusses possible approaches to solving them.


## Background

A running Tendermint consensus node exports a [JSON-RPC service][rpc-service]
that provides a [large set of methods][rpc-methods] for inspecting and
interacting with the node. One important cluster of these methods comprises
`subscribe`, `unsubscribe`, and `unsubscribe_all`, which permit clients
to subscribe to a filtered stream of the [events generated by the node][events]
as it runs.

Unlike the other methods of the service, the methods in the "event
subscription" cluster are not accessible via [ordinary HTTP GET or POST
requests][rpc-transport], but require upgrading the HTTP connection to a
[websocket][ws]. This is necessary because the `subscribe` request needs a
persistent channel to deliver results back to the client, and an ordinary HTTP
connection does not reliably persist across multiple requests. Since these
methods do not work properly without a persistent channel, they are _only_
exported via a websocket connection, and are not routed for plain HTTP.


## Discussion

There are some operational problems with the current implementation of event
subscription in the RPC service:

- **Event delivery is not valid JSON-RPC.** When a client issues a `subscribe`
  request, the server replies (correctly) with an initial empty acknowledgement
  (`{}`).
  After that, each matching event is delivered "unsolicited" (without
  another request from the client), as a separate [response object][json-response]
  with the same ID as the initial request.

  This matters because it means a standard JSON-RPC client library can't
  interact correctly with the event subscription mechanism.

  Even for clients that can handle unsolicited values pushed by the server,
  these responses are invalid: They have an ID, so they cannot be treated as
  [notifications][json-notify]; but the ID corresponds to a request that was
  already completed. In practice, this means that general-purpose JSON-RPC
  libraries cannot use this method correctly -- it requires a custom client.

  The Go RPC client from the Tendermint core can support this case, but clients
  in other languages have no easy solution.

  This is the cause of issue [#2949][issue2949].

- **Subscriptions are terminated by disconnection.** When the connection to the
  client is interrupted, the subscription is silently dropped.

  This is a reasonable behavior, but it matters because a client whose
  subscription is dropped gets no useful error feedback, just a closed
  connection. Should they try again? Is the node overloaded? Was the client
  too slow? Did the caller forget to respond to pings? Debugging these kinds
  of failures is unnecessarily painful.

  Websockets compound this, because websocket connections time out if no
  traffic is seen for a while, and keeping them alive requires active
  cooperation between the client and server. With a plain TCP socket, liveness
  is handled transparently by the keepalive mechanism. On a websocket,
  however, one side has to occasionally send a PING (if the connection is
  otherwise idle). The other side must return a matching PONG in time, or the
  connection is dropped. Apart from being tedious, this is highly susceptible
  to CPU load.

  The Tendermint Go implementation automatically sends and responds to pings.
  Clients in other languages (or not wanting to use the Tendermint libraries)
  need to handle it explicitly. This burdens the client for no practical
  benefit: A subscriber has no information about when matching events may be
  available, so it shouldn't have to participate in keeping the connection
  alive.

- **Mismatched load profiles.** Most of the RPC service is mainly important for
  low-volume local use, either by the application the node serves (e.g., the
  ABCI methods) or by the node operator (e.g., the info methods). Event
  subscription is important for remote clients, and may represent a much higher
  volume of traffic.

  This matters because both are using the same JSON-RPC mechanism. For
  low-volume local use, the ergonomics of JSON-RPC are a good fit: It's easy to
  issue queries from the command line (e.g., using `curl`) or to write scripts
  that call the RPC methods to monitor the running node.

  For high-volume remote use, JSON-RPC is not such a good fit: Even leaving
  aside the non-standard delivery protocol mentioned above, the time and memory
  cost of encoding event data matters for the stability of the node when there
  can be potentially hundreds of subscribers. Moreover, a subscription is
  long-lived compared to most RPC methods, in that it may persist as long as the
  node is active.

- **Mismatched security profiles.** The RPC service exports several methods
  that should not be open to arbitrary remote callers, both for correctness
  reasons (e.g., `remove_tx` and `broadcast_tx_*`) and for operational
  stability reasons (e.g., `tx_search`). A node may still need to expose
  events, however, to support UI tools.

  This matters, because all the methods share the same network endpoint.
  While it is possible to block the top-level GET and POST handlers with a proxy,
  exposing the `/websocket` handler exposes not _only_ the event subscription
  methods, but the rest of the service as well.

### Possible Improvements

There are several things we could do to improve the experience of developers
who need to subscribe to events from the consensus node. These are not all
mutually exclusive.

1. **Split event subscription into a separate service.** Instead of exposing
   event subscription on the same endpoint as the rest of the RPC service,
   dedicate a separate endpoint on the node for _only_ event subscription. The
   rest of the RPC services (_sans_ events) would remain as-is.

   This would make it easy to disable or firewall outside access to sensitive
   RPC methods, without blocking access to event subscription (and vice versa).
   This is probably worth doing, even if we don't take any of the other steps
   described here.

2. **Use a different protocol for event subscription.** There are various ways
   we could approach this, depending on how much we're willing to shake up the
   current API. Here are sketches of a few options:

   - Keep the websocket, but rework the API to be more JSON-RPC compliant,
     perhaps by converting event delivery into notifications. This is less
     up-front change for existing clients, but retains all of the existing
     implementation complexity, and doesn't contribute much toward more serious
     performance and UX improvements later.

   - Switch from websocket to plain HTTP, and rework the subscription API to
     use a more conventional request/response pattern instead of streaming.
     This is a little more up-front work for existing clients, but leverages
     better library support for clients not written in Go.

     The protocol would become more chatty, but we could mitigate that with
     batching, and in return we would get more control over what to do about
     slow clients: Instead of simply silently dropping them, as we do now, we
     could drop messages and signal the client that they missed some data ("M
     dropped messages since your last poll").

     This option is probably the best balance between work, API change, and
     benefit, and has a nice incidental effect that it would be easier to debug
     subscriptions from the command-line, like the other RPC methods.

   - Switch to gRPC: Preserves a persistent connection and gives us a more
     efficient binary wire format (protobuf), at the cost of much more work for
     clients and harder debugging. This may be the best option if performance
     and server load are our top concerns.

     Given that we are currently using JSON-RPC, however, I'm not convinced the
     costs of encoding and sending messages on the event subscription channel
     are the limiting factor on subscription efficiency.

3. **Delegate event subscriptions to a proxy.** Give responsibility for
   managing event subscription to a proxy that runs separately from the node,
   and switch the node to push events to the proxy (like a webhook) instead of
   serving subscribers directly. This is more work for the operator (another
   process to configure and run) but may scale better for big networks.

   I mention this option for completeness, but making this change would be a
   fairly substantial project. If we want to consider shifting responsibility
   for event subscription outside the node anyway, we should probably be more
   systematic about it. For a more principled approach, see point (4) below.

4. **Move event subscription downstream of indexing.** We are already planning
   to give applications more control over event indexing.
   By extension, we might allow the application to also control how events are
   filtered, queried, and subscribed. Having the application control these
   concerns, rather than the node, might make life easier for developers
   building UI and tools for that application.

   This is a much larger change, so I don't think it is likely to be practical
   in the near-term, but it's worth considering as a broader option. Some of
   the existing code for filtering and selection could be made more reusable,
   so applications would not need to reinvent everything.


## References

- [Tendermint RPC service][rpc-service]
- [Tendermint RPC routes][rpc-methods]
- [Discussion of the event system][events]
- [Discussion about RPC transport options][rpc-transport] (from RFC 002)
- [RFC 6455: The websocket protocol][ws]
- [JSON-RPC 2.0 Specification](https://www.jsonrpc.org/specification)

[rpc-service]: https://docs.tendermint.com/master/rpc/
[rpc-methods]: https://github.com/tendermint/tendermint/blob/master/internal/rpc/core/routes.go#L12
[events]: ./rfc-005-event-system.rst
[rpc-transport]: ./rfc-002-ipc-ecosystem.md#rpc-transport
[ws]: https://datatracker.ietf.org/doc/html/rfc6455
[json-response]: https://www.jsonrpc.org/specification#response_object
[json-notify]: https://www.jsonrpc.org/specification#notification
[issue2949]: https://github.com/tendermint/tendermint/issues/2949
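
To make the first problem in the Discussion concrete (event delivery is not
valid JSON-RPC), here is a minimal sketch in Go of how a general-purpose client
might classify incoming messages. The `client` type, the `classify` method, and
the event payload are hypothetical and exist only for illustration; they are
not taken from the Tendermint code. The sketch assumes the client tracks the
IDs of requests that have not yet been answered.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// message holds only the field needed to classify an incoming payload.
type message struct {
	ID json.RawMessage `json:"id"`
}

// client is a hypothetical stand-in for a general-purpose JSON-RPC client.
type client struct {
	pending map[string]bool // IDs of requests still awaiting a response
}

// classify reports how a client that matches responses to pending
// requests would treat the raw message.
func (c *client) classify(raw []byte) (string, error) {
	var m message
	if err := json.Unmarshal(raw, &m); err != nil {
		return "", err
	}
	id := string(m.ID)
	switch {
	case len(m.ID) == 0:
		// No "id" member: treated as a notification.
		return "notification", nil
	case c.pending[id]:
		// Matches an outstanding request, which is now complete.
		delete(c.pending, id)
		return "response", nil
	default:
		// Carries an ID, but no request is waiting for it.
		return "unmatched", nil
	}
}

func main() {
	// Assume a subscribe request was sent with ID 1.
	c := &client{pending: map[string]bool{"1": true}}

	msgs := []string{
		`{"jsonrpc":"2.0","id":1,"result":{}}`,                    // acknowledgement
		`{"jsonrpc":"2.0","id":1,"result":{"event":"placeholder"}}`, // pushed event (made-up payload)
	}
	for _, s := range msgs {
		kind, err := c.classify([]byte(s))
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(kind)
	}
	// Prints:
	// response
	// unmatched
}
```

The acknowledgement completes request 1, so the event that follows with the
same ID has nothing to match against. A library built this way would likely
discard it or report an error, which is why a custom client is needed today.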