# ADR 065: Custom Event Indexing

- [ADR 065: Custom Event Indexing](#adr-065-custom-event-indexing)
  - [Changelog](#changelog)
  - [Status](#status)
  - [Context](#context)
  - [Alternative Approaches](#alternative-approaches)
  - [Decision](#decision)
  - [Detailed Design](#detailed-design)
    - [EventSink](#eventsink)
    - [Supported Sinks](#supported-sinks)
      - [`KVEventSink`](#kveventsink)
      - [`PSQLEventSink`](#psqleventsink)
    - [Configuration](#configuration)
  - [Future Improvements](#future-improvements)
  - [Consequences](#consequences)
    - [Positive](#positive)
    - [Negative](#negative)
    - [Neutral](#neutral)
  - [References](#references)

## Changelog

- April 1, 2021: Initial Draft (@alexanderbez)
- April 28, 2021: Specify search capabilities are only supported through the KV indexer (@marbar3778)
- May 19, 2021: Update the SQL schema and the eventsink interface (@jayt106)

## Status

Accepted

## Context

Currently, Tendermint Core supports block and transaction event indexing through
the `tx_index.indexer` configuration. Events are captured in transactions and
are indexed via a `TxIndexer` type. Events are captured in blocks, specifically
from `BeginBlock` and `EndBlock` application responses, and are indexed via a
`BlockIndexer` type. Both of these types are managed by a single `IndexerService`
which is responsible for consuming events and sending those events off to be
indexed by the respective type.

In addition to indexing, Tendermint Core also supports the ability to query for
both indexed transaction and block events via Tendermint's RPC layer. The ability
to query for these indexed events facilitates a great multitude of upstream client
and application capabilities, e.g.
block explorers, IBC relayers, and auxiliary
data availability and indexing services.

Currently, Tendermint only supports indexing via a `kv` indexer, which is backed
by an underlying embedded key/value store database. The `kv` indexer implements
its own indexing and query mechanisms. While the former is somewhat trivial,
providing a rich and flexible query layer is not as trivial and has caused many
issues and UX concerns for upstream clients and applications.

The fragile nature of the proprietary `kv` query engine and the potential
performance and scaling issues that arise when a large number of consumers are
introduced motivate the need for a more robust and flexible indexing and query
solution.

## Alternative Approaches

With regard to alternative approaches to a more robust solution, the only serious
contender that was considered was to transition to using [SQLite](https://www.sqlite.org/index.html).

While the approach would work, it locks us into a specific query language and
storage layer, so in some ways it's only a bit better than our current approach.
In addition, the implementation would require the introduction of CGO into the
Tendermint Core stack, whereas right now CGO is only introduced depending on
the database used.

## Decision

We will adopt a similar approach to that of the Cosmos SDK's `KVStore` state
listening described in [ADR-038](https://github.com/cosmos/cosmos-sdk/blob/master/docs/architecture/adr-038-state-listening.md).

Namely, we will perform the following:

- Introduce a new interface, `EventSink`, that all data sinks must implement.
- Augment the existing `tx_index.indexer` configuration to now accept a series
  of one or more indexer types, i.e., sinks.
- Combine the current `TxIndexer` and `BlockIndexer` into a single `KVEventSink`
  that implements the `EventSink` interface.
- Introduce an additional `EventSink` that is backed by [PostgreSQL](https://www.postgresql.org/).
  - Implement the necessary schemas to support both block and transaction event
    indexing.
- Update `IndexerService` to use a series of `EventSinks`.
- Proxy queries to the relevant sink's native query layer.
- Update all relevant RPC methods.

## Detailed Design

### EventSink

We introduce the `EventSink` interface type that all supported sinks must implement.
The interface is defined as follows:

```go
type EventSink interface {
	IndexBlockEvents(types.EventDataNewBlockHeader) error
	IndexTxEvents([]*abci.TxResult) error

	SearchBlockEvents(context.Context, *query.Query) ([]int64, error)
	SearchTxEvents(context.Context, *query.Query) ([]*abci.TxResult, error)

	GetTxByHash([]byte) (*abci.TxResult, error)
	HasBlock(int64) (bool, error)

	Type() EventSinkType
	Stop() error
}
```

The `IndexerService` will accept a list of one or more `EventSink` types. During
the `OnStart` method it will call the appropriate APIs on each `EventSink` to
index both block and transaction events.

### Supported Sinks

We will initially support two `EventSink` types out of the box.

#### `KVEventSink`

This type of `EventSink` is a combination of the `TxIndexer` and `BlockIndexer`
indexers, both of which are backed by a single embedded key/value database.

The bulk of the existing business logic will remain the same, but the existing APIs
will be mapped to the new `EventSink` API. Both types will be removed in favor of a
single `KVEventSink` type.

The `KVEventSink` will be the only `EventSink` enabled by default, so from a UX
perspective, operators should not notice a difference apart from a configuration
change.
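
As a rough illustration of the mapping described above, the following
self-contained sketch shows one way a single sink type could delegate to two
existing indexers. It is not the actual implementation: `TxResult`,
`BlockHeader`, `txIndexer`, `blockIndexer`, the in-memory indexers, and
`NewMemKVEventSink` are all hypothetical, simplified stand-ins for the real
Tendermint types, and only a subset of the `EventSink` methods is shown.

```go
package main

import (
	"errors"
	"fmt"
)

// TxResult and BlockHeader are simplified stand-ins (assumptions) for
// abci.TxResult and types.EventDataNewBlockHeader.
type TxResult struct {
	Height int64
	Hash   []byte
}

type BlockHeader struct {
	Height int64
}

// txIndexer and blockIndexer are hypothetical, trimmed-down versions of the
// existing indexer APIs that the sink wraps.
type txIndexer interface {
	Index(*TxResult) error
	Get(hash []byte) (*TxResult, error)
}

type blockIndexer interface {
	Index(BlockHeader) error
	Has(height int64) (bool, error)
}

// KVEventSink exposes both indexers behind one type by plain delegation.
type KVEventSink struct {
	txi txIndexer
	bi  blockIndexer
}

func (s *KVEventSink) IndexBlockEvents(h BlockHeader) error { return s.bi.Index(h) }

func (s *KVEventSink) IndexTxEvents(txs []*TxResult) error {
	for _, tx := range txs {
		if err := s.txi.Index(tx); err != nil {
			return err
		}
	}
	return nil
}

func (s *KVEventSink) GetTxByHash(hash []byte) (*TxResult, error) { return s.txi.Get(hash) }

func (s *KVEventSink) HasBlock(height int64) (bool, error) { return s.bi.Has(height) }

// memTxIndexer and memBlockIndexer are in-memory indexers used only to make
// the sketch runnable; a real sink would use the embedded key/value database.
type memTxIndexer map[string]*TxResult

func (m memTxIndexer) Index(tx *TxResult) error {
	m[string(tx.Hash)] = tx
	return nil
}

func (m memTxIndexer) Get(hash []byte) (*TxResult, error) {
	tx, ok := m[string(hash)]
	if !ok {
		return nil, errors.New("tx not found")
	}
	return tx, nil
}

type memBlockIndexer map[int64]struct{}

func (m memBlockIndexer) Index(h BlockHeader) error {
	m[h.Height] = struct{}{}
	return nil
}

func (m memBlockIndexer) Has(height int64) (bool, error) {
	_, ok := m[height]
	return ok, nil
}

// NewMemKVEventSink wires the two in-memory indexers into one sink.
func NewMemKVEventSink() *KVEventSink {
	return &KVEventSink{txi: memTxIndexer{}, bi: memBlockIndexer{}}
}

func main() {
	sink := NewMemKVEventSink()
	if err := sink.IndexBlockEvents(BlockHeader{Height: 1}); err != nil {
		panic(err)
	}
	if err := sink.IndexTxEvents([]*TxResult{{Height: 1, Hash: []byte("AB")}}); err != nil {
		panic(err)
	}
	ok, _ := sink.HasBlock(1)
	tx, _ := sink.GetTxByHash([]byte("AB"))
	fmt.Println(ok, tx.Height)
}
```

The sketch only shows the delegation pattern; search, `Type()`, and `Stop()`
would follow the same shape under these assumptions.
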

We omit `EventSink` implementation details as it should be fairly straightforward
to map the existing business logic to the new APIs.

#### `PSQLEventSink`

This type of `EventSink` indexes block and transaction events into a
[PostgreSQL](https://www.postgresql.org/) database. We define and automatically
migrate the following schema when the `IndexerService` starts.

The PostgreSQL event sink will not support `tx_search`, `block_search`,
`GetTxByHash`, or `HasBlock`.

```sql
-- Table Definition ----------------------------------------------

CREATE TYPE block_event_type AS ENUM ('begin_block', 'end_block', '');

CREATE TABLE block_events (
    id SERIAL PRIMARY KEY,
    key VARCHAR NOT NULL,
    value VARCHAR NOT NULL,
    height INTEGER NOT NULL,
    type block_event_type,
    created_at TIMESTAMPTZ NOT NULL,
    chain_id VARCHAR NOT NULL
);

CREATE TABLE tx_results (
    id SERIAL PRIMARY KEY,
    tx_result BYTEA NOT NULL,
    created_at TIMESTAMPTZ NOT NULL
);

CREATE TABLE tx_events (
    id SERIAL PRIMARY KEY,
    key VARCHAR NOT NULL,
    value VARCHAR NOT NULL,
    height INTEGER NOT NULL,
    hash VARCHAR NOT NULL,
    tx_result_id SERIAL,
    created_at TIMESTAMPTZ NOT NULL,
    chain_id VARCHAR NOT NULL,
    FOREIGN KEY (tx_result_id)
        REFERENCES tx_results(id)
        ON DELETE CASCADE
);

-- Indices -------------------------------------------------------

CREATE INDEX idx_block_events_key_value ON block_events(key, value);
CREATE INDEX idx_tx_events_key_value ON tx_events(key, value);
CREATE INDEX idx_tx_events_hash ON tx_events(hash);
```

The `PSQLEventSink` will implement the `EventSink` interface as follows
(some details omitted for brevity):

```go
func NewPSQLEventSink(connStr string, chainID string) (*PSQLEventSink, error) {
	db, err := sql.Open("postgres", connStr)
	if err != nil {
		return nil, err
	}

	// ...
}

func (es *PSQLEventSink) IndexBlockEvents(h types.EventDataNewBlockHeader) error {
	sqlStmt := sq.Insert("block_events").Columns("key", "value", "height", "type", "created_at", "chain_id")

	// index the reserved block height index
	ts := time.Now()
	sqlStmt = sqlStmt.Values(types.BlockHeightKey, h.Header.Height, h.Header.Height, "", ts, es.chainID)

	for _, event := range h.ResultBeginBlock.Events {
		// only index events with a non-empty type
		if len(event.Type) == 0 {
			continue
		}

		for _, attr := range event.Attributes {
			if len(attr.Key) == 0 {
				continue
			}

			// index iff the event specified index:true and it's not a reserved event
			compositeKey := fmt.Sprintf("%s.%s", event.Type, string(attr.Key))
			if compositeKey == types.BlockHeightKey {
				return fmt.Errorf("event type and attribute key \"%s\" is reserved; please use a different key", compositeKey)
			}

			if attr.GetIndex() {
				sqlStmt = sqlStmt.Values(compositeKey, string(attr.Value), h.Header.Height, BlockEventTypeBeginBlock, ts, es.chainID)
			}
		}
	}

	// index end_block events...
	// execute sqlStmt db query...
}

func (es *PSQLEventSink) IndexTxEvents(txr []*abci.TxResult) error {
	sqlStmtEvents := sq.Insert("tx_events").Columns("key", "value", "height", "hash", "tx_result_id", "created_at", "chain_id")
	sqlStmtTxResult := sq.Insert("tx_results").Columns("tx_result", "created_at")

	ts := time.Now()
	for _, tx := range txr {
		// store the tx result
		txBz, err := proto.Marshal(tx)
		if err != nil {
			return err
		}

		sqlStmtTxResult = sqlStmtTxResult.Values(txBz, ts)

		// execute sqlStmtTxResult db query...
		var txID uint32
		err = sqlStmtTxResult.QueryRow().Scan(&txID)
		if err != nil {
			return err
		}

		// index the reserved height and hash indices
		hash := types.Tx(tx.Tx).Hash()
		sqlStmtEvents = sqlStmtEvents.Values(types.TxHashKey, hash, tx.Height, hash, txID, ts, es.chainID)
		sqlStmtEvents = sqlStmtEvents.Values(types.TxHeightKey, tx.Height, tx.Height, hash, txID, ts, es.chainID)

		// iterate over the events of the current tx result
		for _, event := range tx.Result.Events {
			// only index events with a non-empty type
			if len(event.Type) == 0 {
				continue
			}

			for _, attr := range event.Attributes {
				if len(attr.Key) == 0 {
					continue
				}

				compositeKey := fmt.Sprintf("%s.%s", event.Type, string(attr.Key))

				// ensure event does not conflict with a reserved prefix key
				if compositeKey == types.TxHashKey || compositeKey == types.TxHeightKey {
					return fmt.Errorf("event type and attribute key \"%s\" is reserved; please use a different key", compositeKey)
				}

				// index only if `index: true` is set
				if attr.GetIndex() {
					sqlStmtEvents = sqlStmtEvents.Values(compositeKey, string(attr.Value), tx.Height, hash, txID, ts, es.chainID)
				}
			}
		}
	}

	// execute sqlStmtEvents db query...
}

func (es *PSQLEventSink) SearchBlockEvents(ctx context.Context, q *query.Query) ([]int64, error) {
	return nil, errors.New("block search is not supported via the postgres event sink")
}

func (es *PSQLEventSink) SearchTxEvents(ctx context.Context, q *query.Query) ([]*abci.TxResult, error) {
	return nil, errors.New("tx search is not supported via the postgres event sink")
}

func (es *PSQLEventSink) GetTxByHash(hash []byte) (*abci.TxResult, error) {
	return nil, errors.New("getTxByHash is not supported via the postgres event sink")
}

func (es *PSQLEventSink) HasBlock(h int64) (bool, error) {
	return false, errors.New("hasBlock is not supported via the postgres event sink")
}
```

### Configuration

The current `tx_index.indexer` configuration would be changed to accept a list
of supported `EventSink` types instead of a single value.

Example:

```toml
[tx_index]

indexer = [
  "kv",
  "psql"
]
```

If the `indexer` list contains the `null` indexer, then no indexers will be used
regardless of what other values may exist.

Additional configuration parameters might be required depending on what event
sinks are supplied to `tx_index.indexer`. The `psql` sink will require an
additional connection configuration.

```toml
[tx_index]

indexer = [
  "kv",
  "psql"
]

pqsql_conn = "postgresql://<user>:<password>@<host>:<port>/<db>?<opts>"
```

Any invalid or misconfigured `tx_index` configuration should yield an error as
early as possible.

## Future Improvements

Although not technically required to maintain feature parity with the existing
Tendermint indexer, it would be beneficial for operators to have a method
of performing a "re-index".
Specifically, Tendermint operators could invoke an
RPC method that allows the Tendermint node to perform a re-indexing of all block
and transaction events between two given heights, H<sub>1</sub> and H<sub>2</sub>,
so long as the block store contains the blocks and transaction results for all
the heights specified in a given range.

## Consequences

### Positive

- A more robust and flexible indexing and query engine for indexing and searching
  block and transaction events.
- The ability to not have to support a custom indexing and query engine beyond
  the legacy `kv` type.
- The ability to offload/proxy indexing and querying to the underlying sink.
- Scalability and reliability that essentially come "for free" from the underlying
  sink, if it supports them.

### Negative

- The need to support multiple and potentially a growing set of custom `EventSink`
  types.

### Neutral

## References

- [Cosmos SDK ADR-038](https://github.com/cosmos/cosmos-sdk/blob/master/docs/architecture/adr-038-state-listening.md)
- [PostgreSQL](https://www.postgresql.org/)
- [SQLite](https://www.sqlite.org/index.html)