# ADR 065: Custom Event Indexing

- [ADR 065: Custom Event Indexing](#adr-065-custom-event-indexing)
  - [Changelog](#changelog)
  - [Status](#status)
  - [Context](#context)
  - [Alternative Approaches](#alternative-approaches)
  - [Decision](#decision)
  - [Detailed Design](#detailed-design)
    - [EventSink](#eventsink)
    - [Supported Sinks](#supported-sinks)
      - [`KVEventSink`](#kveventsink)
      - [`PSQLEventSink`](#psqleventsink)
    - [Configuration](#configuration)
  - [Future Improvements](#future-improvements)
  - [Consequences](#consequences)
    - [Positive](#positive)
    - [Negative](#negative)
    - [Neutral](#neutral)
  - [References](#references)

## Changelog

- April 1, 2021: Initial Draft (@alexanderbez)
- April 28, 2021: Specify search capabilities are only supported through the KV indexer (@marbar3778)
- May 19, 2021: Update the SQL schema and the eventsink interface (@jayt106)
- Aug 30, 2021: Update the SQL schema and the psql implementation (@creachadair)
- Oct 5, 2021: Clarify goals and implementation changes (@creachadair)

## Status

Implemented

## Context

Currently, Tendermint Core supports block and transaction event indexing through
the `tx_index.indexer` configuration. Events are captured in transactions and
are indexed via a `TxIndexer` type. Events are captured in blocks, specifically
from `BeginBlock` and `EndBlock` application responses, and are indexed via a
`BlockIndexer` type. Both of these types are managed by a single `IndexerService`
which is responsible for consuming events and sending those events off to be
indexed by the respective type.

In addition to indexing, Tendermint Core also supports the ability to query for
both indexed transaction and block events via Tendermint's RPC layer.
The ability to query for these indexed events facilitates a great multitude of
upstream client and application capabilities, e.g. block explorers, IBC relayers,
and auxiliary data availability and indexing services.

Currently, Tendermint only supports indexing via a `kv` indexer, which is supported
by an underlying embedded key/value store database. The `kv` indexer implements
its own indexing and query mechanisms. While the former is somewhat trivial,
providing a rich and flexible query layer is not as trivial and has caused many
issues and UX concerns for upstream clients and applications.

The fragile nature of the proprietary `kv` query engine and the potential
performance and scaling issues that arise when a large number of consumers are
introduced motivate the need for a more robust and flexible indexing and query
solution.

## Alternative Approaches

With regard to alternative approaches to a more robust solution, the only serious
contender that was considered was to transition to using [SQLite](https://www.sqlite.org/index.html).

While the approach would work, it locks us into a specific query language and
storage layer, so in some ways it's only a bit better than our current approach.
In addition, the implementation would require the introduction of CGO into the
Tendermint Core stack, whereas right now CGO is only introduced depending on
the database used.

## Decision

We will adopt a similar approach to that of the Cosmos SDK's `KVStore` state
listening described in [ADR-038](https://github.com/cosmos/cosmos-sdk/blob/master/docs/architecture/adr-038-state-listening.md).

We will implement the following changes:

- Introduce a new interface, `EventSink`, that all data sinks must implement.
- Augment the existing `tx_index.indexer` configuration to now accept a series
  of one or more indexer types, i.e., sinks.
- Combine the current `TxIndexer` and `BlockIndexer` into a single `KVEventSink`
  that implements the `EventSink` interface.
- Introduce an additional `EventSink` implementation that is backed by
  [PostgreSQL](https://www.postgresql.org/).
  - Implement the necessary schemas to support both block and transaction event indexing.
- Update `IndexerService` to use a series of `EventSinks`.

In addition:

- The Postgres indexer implementation will _not_ implement the proprietary `kv`
  query language. Users wishing to write queries against the Postgres indexer
  will connect to the underlying DBMS directly and use SQL queries based on the
  indexing schema.

  Future custom indexer implementations will not be required to support the
  proprietary query language either.

- For now, the existing `kv` indexer will be left in place with its current
  query support, but will be marked as deprecated in a subsequent release, and
  the documentation will be updated to encourage users who need to query the
  event index to migrate to the Postgres indexer.

- In the future we may remove the `kv` indexer entirely, or replace it with a
  different implementation; that decision is deferred as future work.

- In the future, we may remove the index query endpoints from the RPC service
  entirely; that decision is deferred as future work, but recommended.


## Detailed Design

### EventSink

We introduce the `EventSink` interface type that all supported sinks must implement.
The interface is defined as follows:

```go
type EventSink interface {
	IndexBlockEvents(types.EventDataNewBlockHeader) error
	IndexTxEvents([]*abci.TxResult) error

	SearchBlockEvents(context.Context, *query.Query) ([]int64, error)
	SearchTxEvents(context.Context, *query.Query) ([]*abci.TxResult, error)

	GetTxByHash([]byte) (*abci.TxResult, error)
	HasBlock(int64) (bool, error)

	Type() EventSinkType
	Stop() error
}
```

The `IndexerService` will accept a list of one or more `EventSink` types. During
the `OnStart` method it will call the appropriate APIs on each `EventSink` to
index both block and transaction events.

### Supported Sinks

We will initially support two `EventSink` types out of the box.

#### `KVEventSink`

This type of `EventSink` is a combination of the `TxIndexer` and `BlockIndexer`
indexers, both of which are backed by a single embedded key/value database.

The bulk of the existing business logic will remain the same, but the existing APIs
will be mapped to the new `EventSink` API. Both types will be removed in favor of a
single `KVEventSink` type.

The `KVEventSink` will be the only `EventSink` enabled by default, so from a UX
perspective, operators should not notice a difference apart from a configuration
change.

We omit `EventSink` implementation details as it should be fairly straightforward
to map the existing business logic to the new APIs.

#### `PSQLEventSink`

This type of `EventSink` indexes block and transaction events into a
[PostgreSQL](https://www.postgresql.org/) database. We define and automatically
migrate the following schema when the `IndexerService` starts.

The PostgreSQL event sink will not support `tx_search`, `block_search`, `GetTxByHash`, and `HasBlock`.
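
Because the PostgreSQL sink rejects these operations, a node that runs several sinks has to answer such queries from a sink that supports them. The following is only a minimal sketch of that idea, not the ADR's implementation: `ErrUnsupported`, the reduced `Querier` interface, the stand-in `psqlSink` and `kvSink` types, and the `HasBlock` helper are all hypothetical names introduced for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrUnsupported is a hypothetical sentinel error; the ADR does not name one.
var ErrUnsupported = errors.New("operation not supported by this event sink")

// Querier is a reduced, illustrative slice of the EventSink interface,
// limited to one query method so the sketch stays self-contained.
type Querier interface {
	HasBlock(height int64) (bool, error)
}

// psqlSink stands in for PSQLEventSink: it reports an error for every query.
type psqlSink struct{}

func (psqlSink) HasBlock(int64) (bool, error) { return false, ErrUnsupported }

// kvSink stands in for KVEventSink, backed here by an in-memory set of heights.
type kvSink struct{ heights map[int64]bool }

func (s kvSink) HasBlock(h int64) (bool, error) { return s.heights[h], nil }

// HasBlock asks each sink in turn and returns the first answer from a sink
// that supports the query, skipping sinks that report ErrUnsupported.
func HasBlock(sinks []Querier, height int64) (bool, error) {
	for _, s := range sinks {
		ok, err := s.HasBlock(height)
		if errors.Is(err, ErrUnsupported) {
			continue
		}
		return ok, err
	}
	// No configured sink can answer the query.
	return false, ErrUnsupported
}

func main() {
	sinks := []Querier{psqlSink{}, kvSink{heights: map[int64]bool{10: true}}}
	ok, err := HasBlock(sinks, 10)
	fmt.Println(ok, err) // prints "true <nil>"
}
```

With only the `psql` sink configured, the helper returns `ErrUnsupported`, which matches the intent that such users query the database directly with SQL.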

```sql
-- Table Definition ----------------------------------------------

-- The blocks table records metadata about each block.
-- The block record does not include its events or transactions (see tx_results).
CREATE TABLE blocks (
  rowid      BIGSERIAL PRIMARY KEY,

  height     BIGINT NOT NULL,
  chain_id   VARCHAR NOT NULL,

  -- When this block header was logged into the sink, in UTC.
  created_at TIMESTAMPTZ NOT NULL,

  UNIQUE (height, chain_id)
);

-- Index blocks by height and chain, since we need to resolve block IDs when
-- indexing transaction records and transaction events.
CREATE INDEX idx_blocks_height_chain ON blocks(height, chain_id);

-- The tx_results table records metadata about transaction results. Note that
-- the events from a transaction are stored separately.
CREATE TABLE tx_results (
  rowid BIGSERIAL PRIMARY KEY,

  -- The block to which this transaction belongs.
  block_id BIGINT NOT NULL REFERENCES blocks(rowid),
  -- The sequential index of the transaction within the block.
  index INTEGER NOT NULL,
  -- When this result record was logged into the sink, in UTC.
  created_at TIMESTAMPTZ NOT NULL,
  -- The hex-encoded hash of the transaction.
  tx_hash VARCHAR NOT NULL,
  -- The protobuf wire encoding of the TxResult message.
  tx_result BYTEA NOT NULL,

  UNIQUE (block_id, index)
);

-- The events table records events. All events (both block and transaction) are
-- associated with a block ID; transaction events also have a transaction ID.
CREATE TABLE events (
  rowid BIGSERIAL PRIMARY KEY,

  -- The block and transaction this event belongs to.
  -- If tx_id is NULL, this is a block event.
  block_id BIGINT NOT NULL REFERENCES blocks(rowid),
  tx_id    BIGINT NULL REFERENCES tx_results(rowid),

  -- The application-defined type label for the event.
  type VARCHAR NOT NULL
);

-- The attributes table records event attributes.
CREATE TABLE attributes (
  event_id      BIGINT NOT NULL REFERENCES events(rowid),
  key           VARCHAR NOT NULL, -- bare key
  composite_key VARCHAR NOT NULL, -- composed type.key
  value         VARCHAR NULL,

  UNIQUE (event_id, key)
);

-- A joined view of events and their attributes. Events that do not have any
-- attributes are represented as a single row with empty key and value fields.
CREATE VIEW event_attributes AS
  SELECT block_id, tx_id, type, key, composite_key, value
  FROM events LEFT JOIN attributes ON (events.rowid = attributes.event_id);

-- A joined view of all block events (those having tx_id NULL).
CREATE VIEW block_events AS
  SELECT blocks.rowid as block_id, height, chain_id, type, key, composite_key, value
  FROM blocks JOIN event_attributes ON (blocks.rowid = event_attributes.block_id)
  WHERE event_attributes.tx_id IS NULL;

-- A joined view of all transaction events.
CREATE VIEW tx_events AS
  SELECT height, index, chain_id, type, key, composite_key, value, tx_results.created_at
  FROM blocks JOIN tx_results ON (blocks.rowid = tx_results.block_id)
  JOIN event_attributes ON (tx_results.rowid = event_attributes.tx_id)
  WHERE event_attributes.tx_id IS NOT NULL;
```

The `PSQLEventSink` will implement the `EventSink` interface as follows
(some details omitted for brevity):

```go
func NewEventSink(connStr, chainID string) (*EventSink, error) {
	db, err := sql.Open(driverName, connStr)
	// ...

	return &EventSink{
		store:   db,
		chainID: chainID,
	}, nil
}

func (es *EventSink) IndexBlockEvents(h types.EventDataNewBlockHeader) error {
	ts := time.Now().UTC()

	return runInTransaction(es.store, func(tx *sql.Tx) error {
		// Add the block to the blocks table and report back its row ID for use
		// in indexing the events for the block.
		blockID, err := queryWithID(tx, `
INSERT INTO blocks (height, chain_id, created_at)
  VALUES ($1, $2, $3)
  ON CONFLICT DO NOTHING
  RETURNING rowid;
`, h.Header.Height, es.chainID, ts)
		// ...

		// Insert the special block meta-event for height.
		if err := insertEvents(tx, blockID, 0, []abci.Event{
			makeIndexedEvent(types.BlockHeightKey, fmt.Sprint(h.Header.Height)),
		}); err != nil {
			return fmt.Errorf("block meta-events: %w", err)
		}
		// Insert all the block events. Order is important here.
		if err := insertEvents(tx, blockID, 0, h.ResultBeginBlock.Events); err != nil {
			return fmt.Errorf("begin-block events: %w", err)
		}
		if err := insertEvents(tx, blockID, 0, h.ResultEndBlock.Events); err != nil {
			return fmt.Errorf("end-block events: %w", err)
		}
		return nil
	})
}

func (es *EventSink) IndexTxEvents(txrs []*abci.TxResult) error {
	ts := time.Now().UTC()

	for _, txr := range txrs {
		// Encode the result message in protobuf wire format for indexing.
		resultData, err := proto.Marshal(txr)
		// ...

		// Index the hash of the underlying transaction as a hex string.
		txHash := fmt.Sprintf("%X", types.Tx(txr.Tx).Hash())

		if err := runInTransaction(es.store, func(tx *sql.Tx) error {
			// Find the block associated with this transaction.
			blockID, err := queryWithID(tx, `
SELECT rowid FROM blocks WHERE height = $1 AND chain_id = $2;
`, txr.Height, es.chainID)
			// ...

			// Insert a record for this tx_result and capture its ID for indexing events.
			txID, err := queryWithID(tx, `
INSERT INTO tx_results (block_id, index, created_at, tx_hash, tx_result)
  VALUES ($1, $2, $3, $4, $5)
  ON CONFLICT DO NOTHING
  RETURNING rowid;
`, blockID, txr.Index, ts, txHash, resultData)
			// ...

			// Insert the special transaction meta-events for hash and height.
			if err := insertEvents(tx, blockID, txID, []abci.Event{
				makeIndexedEvent(types.TxHashKey, txHash),
				makeIndexedEvent(types.TxHeightKey, fmt.Sprint(txr.Height)),
			}); err != nil {
				return fmt.Errorf("indexing transaction meta-events: %w", err)
			}
			// Index any events packaged with the transaction.
			if err := insertEvents(tx, blockID, txID, txr.Result.Events); err != nil {
				return fmt.Errorf("indexing transaction events: %w", err)
			}
			return nil

		}); err != nil {
			return err
		}
	}
	return nil
}

// SearchBlockEvents is not implemented by this sink, and reports an error for all queries.
func (es *EventSink) SearchBlockEvents(ctx context.Context, q *query.Query) ([]int64, error)

// SearchTxEvents is not implemented by this sink, and reports an error for all queries.
func (es *EventSink) SearchTxEvents(ctx context.Context, q *query.Query) ([]*abci.TxResult, error)

// GetTxByHash is not implemented by this sink, and reports an error for all queries.
func (es *EventSink) GetTxByHash(hash []byte) (*abci.TxResult, error)

// HasBlock is not implemented by this sink, and reports an error for all queries.
func (es *EventSink) HasBlock(h int64) (bool, error)
```

### Configuration

The current `tx_index.indexer` configuration would be changed to accept a list
of supported `EventSink` types instead of a single value.

Example:

```toml
[tx_index]

indexer = [
  "kv",
  "psql"
]
```

If the `indexer` list contains the `null` indexer, then no indexers will be used
regardless of what other values may exist.

Additional configuration parameters might be required depending on what event
sinks are supplied to `tx_index.indexer`. The `psql` sink will require additional
connection configuration.

```toml
[tx_index]

indexer = [
  "kv",
  "psql"
]

pqsql_conn = "postgresql://<user>:<password>@<host>:<port>/<db>?<opts>"
```

Any invalid or misconfigured `tx_index` configuration should yield an error as
early as possible.

## Future Improvements

Although not technically required to maintain feature parity with the existing
Tendermint indexer, it would be beneficial for operators to have a method
of performing a "re-index". Specifically, Tendermint operators could invoke an
RPC method that allows the Tendermint node to perform a re-indexing of all block
and transaction events between two given heights, H<sub>1</sub> and H<sub>2</sub>,
so long as the block store contains the blocks and transaction results for all
the heights specified in a given range.

## Consequences

### Positive

- A more robust and flexible indexing and query engine for indexing and searching
  block and transaction events.
- No need to support a custom indexing and query engine beyond
  the legacy `kv` type.
- The ability to offload/proxy indexing and querying to the underlying sink.
- Scalability and reliability that essentially come "for free" from the underlying
  sink, if it supports them.

### Negative

- The need to support multiple and potentially a growing set of custom `EventSink`
  types.

### Neutral

## References

- [Cosmos SDK ADR-038](https://github.com/cosmos/cosmos-sdk/blob/master/docs/architecture/adr-038-state-listening.md)
- [PostgreSQL](https://www.postgresql.org/)
- [SQLite](https://www.sqlite.org/index.html)