# RFC 012: Event Indexing Revisited

## Changelog

- 11-Feb-2022: Add terminological notes.
- 10-Feb-2022: Updated from review feedback.
- 07-Feb-2022: Initial draft (@creachadair)

## Abstract

A Tendermint node allows ABCI events associated with block and transaction
processing to be "indexed" into persistent storage. The original Tendermint
implementation provided a fixed, built-in [proprietary indexer][kv-index] for
such events.

In response to user requests to customize indexing, [ADR 065][adr065]
introduced an "event sink" interface that allows developers (at least in
theory) to plug in alternative index storage.

Although ADR-065 was a good first step toward customization, its implementation
model does not satisfy all the user requirements. Moreover, this approach
leaves some existing technical issues with indexing unsolved.

This RFC documents these concerns, and discusses some potential approaches to
solving them. This RFC does _not_ propose a specific technical decision. It is
meant to unify and focus some of the disparate discussions of the topic.


## Background

We begin with some important terminological context. The term "event" in
Tendermint can be confusing, as the same word is used for multiple related but
distinct concepts:

1. **ABCI Events** refer to the key-value metadata attached to blocks and
   transactions by the application. These values are represented by the ABCI
   `Event` protobuf message type.

2. **Consensus Events** refer to the data published by the Tendermint node to
   its pubsub bus in response to various consensus state transitions and other
   important activities, such as round updates, votes, transaction delivery,
   and block completion.


This confusion is compounded because some "consensus event" values also have
"ABCI event" metadata attached to them. Notably, block and transaction items
typically have ABCI metadata assigned by the application.

Indexers and RPC clients subscribed to the pubsub bus receive **consensus
events**, but they identify which ones to care about using query expressions
that match against the **ABCI events** associated with them.

In the discussion that follows, we will use the term **event item** to refer to
a datum published to or received from the pubsub bus, and **ABCI event** or
**event metadata** to refer to the key/value annotations.

**Indexing** in this context means recording the association between certain
ABCI metadata and the blocks or transactions they're attached to. The ABCI
metadata typically carry application-specific details like sender and recipient
addresses, category tags, and so forth, that are not part of consensus but are
used by UI tools to find and display transactions of interest.

The consensus node records the blocks and transactions as part of its block
store, but does not persist the application metadata. Metadata persistence is
the task of the indexer, which can be (optionally) enabled by the node
operator.

### History

The [original indexer][kv-index] built in to Tendermint stored index data in an
embedded [`tm-db` database][tmdb] with a proprietary key layout.
In [ADR 065][adr065], we noted that this implementation has both performance
and scaling problems under load. Moreover, the only practical way to query the
index data is via the [query filter language][query] used for event
subscription. [Issue #1161][i1161] appears to have motivated that ADR.


To mitigate both of these concerns, we introduced the [`EventSink`][esink]
interface, combining the original transaction and block indexer interfaces
along with some service plumbing. Using this interface, a developer can plug
in an indexer that uses a more efficient storage engine, and provides a more
expressive query language. As a proof-of-concept, we built a [PostgreSQL event
sink][psql] that exports data to a [PostgreSQL database][postgres].

Although this approach addressed some of the immediate concerns, there are
several issues for custom indexing that have not been fully addressed. Here we
will discuss them in more detail.

For further context, including links to user reports and related work, see also
the [Pluggable custom event indexing tracking issue][i7135].

### Issue 1: Tight Coupling

The `EventSink` interface supports multiple implementations, but plugging in
implementations still requires tight integration with the node. In particular:

- Any custom indexer must either be written in Go and compiled in to the
  Tendermint binary, or the developer must write a Go shim to communicate with
  the implementation and build that into the Tendermint binary.

- This means that, to be supported, a custom indexer either has to be
  integrated into the Tendermint core repository, or every installation that
  uses that indexer must fetch or build a patched version of Tendermint.

The problem with integrating indexers into Tendermint Core is that every user
of Tendermint Core takes a dependency on all supported indexers, including
those they never use. Even if the unused code is disabled with build tags,
users have to remember to do this or potentially be exposed to security issues
that may arise in any of the custom indexers. This is a risk for Tendermint,
which is a trust-critical component of all applications built on it.


The problem with _not_ integrating indexers into Tendermint Core is that any
developer who wants to use a particular indexer must now fetch or build a
patched version of the core code that includes the custom indexer. Besides
being inconvenient, this makes it harder for users to upgrade their node, since
they need to either re-apply their patches directly or wait for an intermediary
to do it for them.

Even for developers who have written their applications in Go and link with the
consensus node directly (e.g., using the [Cosmos SDK][sdk]), these issues add a
potentially significant complication to the build process.

### Issue 2: Legacy Compatibility

The `EventSink` interface retains several limitations of the original
proprietary indexer. These include:

- The indexer has no control over which event items are reported. Only the
  exact block and transaction events that were reported to the original indexer
  are reported to a custom indexer.

- The interface requires the implementation to define methods for the legacy
  search and query API. This requirement comes from the integration with the
  [event subscription RPC API][event-rpc], but actually supporting these
  methods is not trivial.

At present, only the original KV indexer implements the query methods. Even the
proof-of-concept PostgreSQL implementation simply reports errors for all calls
to these methods.

Even for a plugin written in Go, implementing these methods "correctly" would
require parsing and translating the custom query language over whatever storage
platform the indexer uses.

For a plugin _not_ written in Go, even beyond the cost of integration the
developer would have to re-implement the entire query language.


### Issue 3: Indexing Delays Consensus

Within the node, indexing hooks in to the same internal pubsub dispatcher that
is used to export event items to the [event subscription RPC API][event-rpc].
In contrast with RPC subscribers, however, indexing is a "privileged"
subscriber: If an RPC subscriber is "too slow", the node may terminate the
subscription and disconnect the client. That means that RPC subscribers may
lose (miss) event items. The indexer, however, is "unbuffered", and the
publisher will never drop or disconnect from it. If the indexer is slow, the
publisher will block until it returns, to ensure that no event items are lost.

In practice, this means that the performance of the indexer has a direct effect
on the performance of the consensus node: If the indexer is slow or stalls, it
will slow or halt the progress of consensus. Users have already reported this
problem even with the built-in indexer (see, for example, [#7247][i7247]).
Extending this concern to arbitrary user-defined custom indexers gives that
risk a much larger surface area.


## Discussion

It is not possible to simultaneously guarantee that publishing event items will
not delay consensus, and also that all event items of interest are always
completely indexed.

Therefore, our choice is between eliminating delay (and minimizing loss) or
eliminating loss (and minimizing delay). Currently, we take the second
approach, which has led to user complaints about consensus delays due to
indexing and subscription overhead.

- If we agree that consensus performance supersedes index completeness, our
  design choices are to constrain the likelihood and frequency of missing event
  items.

- If we decide that index completeness is more important than consensus
  performance, our option is to minimize overhead on the event delivery path
  and document that indexer plugins constrain the rate of consensus.

Since we have user reports requesting both properties, we have to choose one or
the other. Since the primary job of the consensus engine is to correctly,
robustly, reliably, and efficiently replicate application state across the
network, I believe the correct choice is to favor consensus performance.

An important consideration for this decision is that a node does not index
application metadata separately: If indexing is disabled, there is no built-in
mechanism to go back and replay or reconstruct the data that an indexer would
have stored. The node _does_ store the blockchain itself (i.e., the blocks and
their transactions), so potentially some use cases currently handled by the
indexer could be handled by the node. For example, allowing clients to ask
whether a given transaction ID has been committed to a block could in principle
be done without an indexer, since it does not depend on application metadata.

Inevitably, a question will arise whether we could implement both strategies
and toggle between them with a flag. That would be a worst-case scenario,
requiring us to maintain the complexity of two very different operational
concerns. If our goal is that Tendermint should be as simple, efficient, and
trustworthy as possible, there is not a strong case for making these options
configurable: We should pick a side and commit to it.

### Design Principles

Although there is no unique "best" solution to the issues described above,
there are some specific principles that a solution should include:

1. **A custom indexer should not require integration into Tendermint core.** A
   developer or node operator can create, build, deploy, and use a custom
   indexer with a stock build of the Tendermint consensus node.

2. **Custom indexers cannot stall consensus.** An indexer that is slow or
   stalls cannot slow down or prevent core consensus from making progress.

   The plugin interface must give node operators control over the tolerances
   for acceptable indexer performance, and the means to detect when indexers
   are falling outside those tolerances, but indexer failures should "fail
   safe" with respect to consensus (even if that means the indexer may miss
   some data, in sufficiently-extreme circumstances).

3. **Custom indexers control which event items they index.** A custom indexer
   is not limited to only the current transaction and block events, but can
   observe any event item published by the node.

4. **Custom indexing is forward-compatible.** Adding new event item types or
   metadata to the consensus node should not require existing custom indexers
   to be rebuilt or modified, unless they want to take advantage of the new
   data.

5. **Indexers are responsible for answering queries.** An indexer plugin is not
   required to support the legacy query filter language, nor to be compatible
   with the legacy RPC endpoints for accessing them. Any APIs for clients to
   query a custom index are the responsibility of the indexer, not the node.

### Open Questions

Given the constraints outlined above, there are important design questions we
must answer to guide any specific changes:

1. **What is an acceptable probability that, given sufficiently extreme
   operational issues, an indexer might miss some number of events?**

   There are two parts to this question: One is what constitutes an extreme
   operational problem, the other is how likely we are to miss some number of
   event items.

   - If the consensus is that no event item must ever be missed, no matter how
     bad the operational circumstances, then we _must_ accept that indexing can
     slow or halt consensus arbitrarily. It is impossible to guarantee complete
     index coverage without potentially unbounded delays.

   - Otherwise, how much data can we afford to lose and how often? For example,
     if we can ensure no event item will be lost unless the indexer halts for
     at least five minutes, is that acceptable? What probabilities and time
     ranges are reasonable for real production environments?

2. **What level of operational overhead is acceptable to impose on node
   operators to support indexing?**

   Are node operators willing to configure and run custom indexers as
   sidecar-type processes alongside a node? How much indexer setup above and
   beyond the work of setting up the underlying node in isolation is tractable
   in production networks?

   The answer to this question also informs the question of whether we should
   keep an "in-process" indexing option, and to what extent that option needs
   to satisfy the suggested design principles.

   Relatedly, to what extent do we need to be concerned about the cost of
   encoding and sending event items to an external process (e.g., as JSON blobs
   or protobuf wire messages)? Given that the node already encodes event items
   as JSON for subscription purposes, the overhead would be negligible for the
   node itself, but the indexer would have to decode to process the results.

3. **What (if any) query APIs does the consensus node need to export,
   independent of the indexer implementation?**

   One typical example is whether the node should be able to answer queries
   like "is this transaction ID in a block?" Currently, a node cannot answer
   this query _unless_ it runs the built-in KV indexer. Does the node need to
   continue to support that query even for nodes that disable the KV indexer,
   or which use a custom indexer?

### Informal Design Intent

The design principles described above implicate several components of the
Tendermint node, beyond just the indexer. In the context of [ADR 075][adr075],
we are re-working the RPC event subscription API to improve some of the UX
issues discussed above for RPC clients. It is our expectation that a solution
for pluggable custom indexing will take advantage of some of the same work.

On that basis, the design approach I am considering for custom indexing looks
something like this (subject to refinement):

1. A custom indexer runs as a separate process from the node.

2. The indexer subscribes to event items via the ADR 075 events API.

   This means indexers would receive event payloads as JSON rather than
   protobuf, but since we already have to support JSON encoding for the RPC
   interface anyway, that should not increase complexity for the node.

3. The existing PostgreSQL indexer gets reworked to have this form, and is no
   longer built as part of the Tendermint core binary.

   We can retain the code in the core repository as a proof-of-concept, or
   perhaps create a separate repository with contributed indexers and move it
   there.

4. (Possibly) Deprecate and remove the legacy KV indexer, or disable it by
   default. If we decide to remove it, we can also remove the legacy RPC
   endpoints for querying the KV indexer.


   If we plan to do this, we should also investigate providing a way for
   clients to query whether a given transaction ID has landed in a block. That
   serves a common need, and currently _only_ works if the KV indexer is
   enabled, but could be addressed more simply using the other data a node
   already has stored, without having to answer more general queries.


## References

- [ADR 065: Custom Event Indexing][adr065]
- [ADR 075: RPC Event Subscription Interface][adr075]
- [Cosmos SDK][sdk]
- [Event subscription RPC][event-rpc]
- [KV transaction indexer][kv-index]
- [Pluggable custom event indexing][i7135] (#7135)
- [PostgreSQL event sink][psql]
- [PostgreSQL database][postgres]
- [Query filter language][query]
- [Stream events to postgres for indexing][i1161] (#1161)
- [Unbuffered event subscription slow down the consensus][i7247] (#7247)
- [`EventSink` interface][esink]
- [`tm-db` library][tmdb]

[adr065]: https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-065-custom-event-indexing.md
[adr075]: https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-075-rpc-subscription.md
[esink]: https://pkg.go.dev/github.com/tendermint/tendermint/internal/state/indexer#EventSink
[event-rpc]: https://docs.tendermint.com/master/rpc/#/Websocket/subscribe
[i1161]: https://github.com/tendermint/tendermint/issues/1161
[i7135]: https://github.com/tendermint/tendermint/issues/7135
[i7247]: https://github.com/tendermint/tendermint/issues/7247
[kv-index]: https://github.com/tendermint/tendermint/blob/master/internal/state/indexer/tx/kv
[postgres]: https://postgresql.org/
[psql]: https://github.com/tendermint/tendermint/blob/master/internal/state/indexer/sink/psql
[query]: https://pkg.go.dev/github.com/tendermint/tendermint/internal/pubsub/query/syntax
[sdk]: https://github.com/cosmos/cosmos-sdk
[tmdb]: https://pkg.go.dev/github.com/tendermint/tm-db#DB