**Draft** / **Work in Progress**

This page documents a proposal for a future ledger architecture based on community feedback. All input is welcome, as the goal is to make this a community effort.

##### Table of Contents
[Motivation](#motivation)
[API](#api)
[Point-in-Time Queries](#pointintime)
[Query Language](#querylanguage)


## <a name="motivation"></a> Motivation
The motivation for exploring a new ledger architecture is based on community feedback. While the existing ledger is able to support some (but not all) of the requirements below, we wanted to explore what a new ledger would look like given all that has been learned. Based on many discussions in the community over Slack, GitHub, and the face-to-face hackathons, it is clear that there is a strong desire to support the following requirements:

1. Point-in-time queries - The ability to query chaincode state at previous blocks and easily trace lineage **without** replaying transactions
2. SQL-like query language
3. Privacy - The complete ledger may not reside on all committers
4. Cryptographically secure ledger - Data integrity without consulting other nodes
5. Support for consensus algorithms that provide immediate finality, like PBFT
6. Support for consensus algorithms that require stochastic convergence, like PoW and PoET
7. Pruning - Ability to remove old transaction data as needed
8. Support separation of endorsement from consensus as described in the [Next Consensus Architecture Proposal](https://github.com/hyperledger/fabric/wiki/Next-Consensus-Architecture-Proposal). This implies that some peers may apply endorsed results to their ledger **without** executing transactions or viewing chaincode logic.
9. API / Engine separation - The ability to plug in different storage engines as needed

<a name="api"></a>
## API

Proposed API in Go pseudocode:

```
package ledger

import (
	"github.com/hyperledger/fabric/protos/common"
	pb "github.com/hyperledger/fabric/protos/peer"
)

// Encryptor is an interface that a ledger implementation can use to encrypt/decrypt the chaincode state
type Encryptor interface {
	Encrypt([]byte) []byte
	Decrypt([]byte) []byte
}

// PeerMgmt is an interface that a ledger implementation expects from the peer implementation
type PeerMgmt interface {
	// IsPeerEndorserFor returns 'true' if the peer is an endorser for the given chaincodeID
	IsPeerEndorserFor(chaincodeID string) bool

	// ListEndorsingChaincodes returns the chaincodeIDs for which the peer acts as one of the endorsers
	ListEndorsingChaincodes() []string

	// GetEncryptor returns the Encryptor for the given chaincodeID
	GetEncryptor(chaincodeID string) (Encryptor, error)
}

// In the case of a confidential chaincode, the simulation results from the ledger are expected to be encrypted using the 'Encryptor' corresponding to the chaincode.
// Similarly, the blocks returned by the GetBlock(s) methods of the ledger are expected to have the state updates in encrypted form.
// However, internally, the ledger can maintain the latest and historical state in plain text form for the chaincodes for which the peer is one of the endorsers.
// TODO - Is this assumption correct?

// DataHolder is a general purpose interface for forcing a data element to be serializable/de-serializable
type DataHolder interface {
	GetData() interface{}
	GetBytes() []byte
	DecodeBytes(b []byte) interface{}
}

// SimulationResults holds the serializable results of a transaction simulation
type SimulationResults interface {
	DataHolder
}

// QueryResult holds a single serializable result of a query
type QueryResult interface {
	DataHolder
}

type BlockHeader struct {
}

type PrunePolicy interface {
}

type BlockRangePrunePolicy struct {
	FirstBlockHash string
	LastBlockHash  string
}

// QueryExecutor executes the queries.
// Get* methods support the KV-based data model. The ExecuteQuery method supports a rich data model and rich queries.
//
// In the case of a rich data model, the ExecuteQuery method is expected to support queries on
// latest state, historical state, and on the intersection of state and transactions
type QueryExecutor interface {
	GetState(key string) ([]byte, error)
	GetStateRangeScanIterator(startKey string, endKey string) (ResultsIterator, error)
	GetStateMultipleKeys(keys []string) ([][]byte, error)
	GetTransactionsForKey(key string) (ResultsIterator, error)

	ExecuteQuery(query string) (ResultsIterator, error)
}

// TxSimulator simulates a transaction on a consistent snapshot of the most recent state available
type TxSimulator interface {
	QueryExecutor
	StartNewTx()

	// KV data model
	SetState(key string, value []byte)
	DeleteState(key string)
	SetStateMultipleKeys(kvs map[string][]byte)

	// for supporting a rich data model (see comments on QueryExecutor above)
	ExecuteUpdate(query string)

	// This can be a large payload
	CopyState(sourceChaincodeID string) error

	// GetTxSimulationResults encapsulates the results of the transaction simulation.
	// This should contain enough detail for
	// - The update in the chaincode state that would be caused if the transaction is to be committed
	// - The environment in which the transaction is executed, so as to be able to decide the validity of the environment
	//   (at a later time on a different peer) while committing the transactions
	// Different ledger implementations (or configurations of a single implementation) may want to represent the above two pieces
	// of information in different ways in order to support different data models or optimize the information representations.
	// TODO detailed illustration of a couple of representations.
	GetTxSimulationResults() SimulationResults
	HasConflicts() bool
	Clear()
}

type ResultsIterator interface {
	// Next moves to the next result. Returns true if a next result exists
	Next() bool
	// GetResult returns the result the iterator currently points to
	GetResult() QueryResult
	// Close releases resources occupied by the iterator
	Close()
}

// OrdererLedger implements methods required by the 'orderer ledger'
type OrdererLedger interface {
	Ledger
	// CommitBlock adds a new block
	CommitBlock(block *common.Block) error
}

// PeerLedger differs from the OrdererLedger in that PeerLedger locally maintains a bitmask
// that tells apart valid transactions from invalid ones
type PeerLedger interface {
	Ledger
	// GetTransactionByID retrieves a transaction by id
	GetTransactionByID(txID string) (*pb.Transaction, error)
	// GetBlockByHash returns a block given its hash
	GetBlockByHash(blockHash []byte) (*common.Block, error)
	// NewTxSimulator gives a handle to a transaction simulator.
	// A client can obtain more than one 'TxSimulator' for parallel execution.
	// Any snapshotting/synchronization should be performed at the implementation level if required
	NewTxSimulator() (TxSimulator, error)
	// NewQueryExecutor gives a handle to a query executor.
	// A client can obtain more than one 'QueryExecutor' for parallel execution.
	// Any synchronization should be performed at the implementation level if required
	NewQueryExecutor() (QueryExecutor, error)
	// NewHistoryQueryExecutor gives a handle to a history query executor
	// (the HistoryQueryExecutor interface is not specified in this draft).
	// A client can obtain more than one 'HistoryQueryExecutor' for parallel execution.
	// Any synchronization should be performed at the implementation level if required
	NewHistoryQueryExecutor() (HistoryQueryExecutor, error)
	// Commit commits the block into the ledger
	Commit(block *common.Block) error
	// Prune prunes the blocks/transactions that satisfy the given policy
	Prune(policy PrunePolicy) error
}

// ValidatedLedger represents the 'final ledger' after filtering out invalid transactions from PeerLedger.
// Post-v1
type ValidatedLedger interface {
	Ledger
}

// Ledger captures the methods that are common across the 'PeerLedger', 'OrdererLedger', and 'ValidatedLedger'
type Ledger interface {
	// GetBlockchainInfo returns basic info about the blockchain
	GetBlockchainInfo() (*pb.BlockchainInfo, error)
	// GetBlockByNumber returns the block at a given height.
	// A blockNumber of math.MaxUint64 will return the last block
	GetBlockByNumber(blockNumber uint64) (*common.Block, error)
	// GetBlocksIterator returns an iterator that starts from `startBlockNumber` (inclusive).
	// The iterator is a blocking iterator, i.e., it blocks until the next block becomes available in the ledger.
	// ResultsIterator contains type BlockHolder
	GetBlocksIterator(startBlockNumber uint64) (ResultsIterator, error)
	// Close closes the ledger
	Close()
}

// BlockChain represents an instance of a block chain. In the case of a consensus algorithm that could cause a fork, an instance of BlockChain
// represents one of the forks (i.e., one of the chains starting from the genesis block to one of the topmost blocks)
type BlockChain interface {
	GetTopBlockHash() string
	GetBlockchainInfo() (*pb.BlockchainInfo, error)
	GetBlockHeaders(startingBlockHash, endingBlockHash string) []*BlockHeader
	GetBlocks(startingBlockHash, endingBlockHash string) []*common.Block
	GetBlockByNumber(blockNumber uint64) *common.Block
	GetBlocksByNumber(startingBlockNumber, endingBlockNumber uint64) []*common.Block
	GetBlockchainSize() uint64
	VerifyChain(highBlock, lowBlock uint64) (uint64, error)
}
```

# Engine-specific thoughts

<a name="pointintime"></a>
### Point-in-Time Queries
In abstract temporal terms, there are three varieties of query important to chaincode and application developers:

1. Retrieve the most recent value of a key. (type: current; ex. How much money is in Alice's account?)
2. Retrieve the value of a key at a specific time. (type: historical; ex. What was Alice's account balance at the end of last month?)
3. Retrieve all values of a key over time. (type: lineage; ex. Produce a statement listing all of Alice's transactions.)

When formulating a query, a developer will benefit from the ability to filter, project, and relate transactions to one another. Consider the following examples:

1. Simple Filtering: Find all accounts that fell below a balance of $100 in the last month.
2. Complex Filtering: Find all of Trudy's transactions that occurred in Iraq or Syria where the amount is above a threshold and the other party has a name that matches a regular expression.
3. Relating: Determine if Alice has ever bought from the same gas station more than once in the same day. Feed this information into a fraud detection model.
4. Projection: Retrieve the city, state, country, and amount of Alice's last ten transactions. This information will be fed into a risk/fraud detection model.

<a name="querylanguage"></a>
### Query Language
Developing a query language to support such a diverse range of queries will not be simple. The challenges are:

1. Scaling the query language with developers as their needs grow. To date, the requests from developers have been modest. As the Hyperledger project's user base grows, so will the query complexity.
2. There are two nearly disjoint classes of query:
    1. Find a single value matching a set of constraints. Amenable to existing SQL and NoSQL grammars.
    2. Find a chain or chains of transactions satisfying a set of constraints. Amenable to graph query languages, such as Neo4j's Cypher or SPARQL.

<a rel="license" href="http://creativecommons.org/licenses/by/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</a>.
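
As an illustration of the three temporal query types listed under Point-in-Time Queries (current, historical, lineage), the following is a minimal in-memory sketch. It is not part of the proposed API: `VersionedStore`, `Commit`, `Current`, `AsOf`, and `Lineage` are hypothetical names introduced only for this example, values are plain integers, and block heights stand in for time. It assumes versions are committed in non-decreasing block order, which lets a historical lookup use binary search instead of replaying transactions.

```go
package main

import (
	"fmt"
	"sort"
)

// version records the value a key held as of a given block height.
type version struct {
	block uint64
	value int
}

// VersionedStore is a hypothetical in-memory store that keeps every
// committed value of every key, ordered by block height.
type VersionedStore struct {
	history map[string][]version
}

// NewVersionedStore returns an empty store.
func NewVersionedStore() *VersionedStore {
	return &VersionedStore{history: make(map[string][]version)}
}

// Commit appends a new value for key at the given block height.
// Assumption: callers commit in non-decreasing block order.
func (s *VersionedStore) Commit(block uint64, key string, value int) {
	s.history[key] = append(s.history[key], version{block, value})
}

// Current answers the "current" query: the most recent value of key.
func (s *VersionedStore) Current(key string) (int, bool) {
	h := s.history[key]
	if len(h) == 0 {
		return 0, false
	}
	return h[len(h)-1].value, true
}

// AsOf answers the "historical" query: the value of key as of block,
// located by binary search rather than by replaying transactions.
func (s *VersionedStore) AsOf(key string, block uint64) (int, bool) {
	h := s.history[key]
	// i is the index of the first version committed after block.
	i := sort.Search(len(h), func(i int) bool { return h[i].block > block })
	if i == 0 {
		return 0, false
	}
	return h[i-1].value, true
}

// Lineage answers the "lineage" query: all values of key over time.
func (s *VersionedStore) Lineage(key string) []int {
	h := s.history[key]
	out := make([]int, 0, len(h))
	for _, v := range h {
		out = append(out, v.value)
	}
	return out
}

func main() {
	s := NewVersionedStore()
	s.Commit(1, "alice", 100)
	s.Commit(4, "alice", 250)
	s.Commit(9, "alice", 70)

	cur, _ := s.Current("alice")
	old, _ := s.AsOf("alice", 5)
	fmt.Println(cur, old, s.Lineage("alice")) // prints: 70 250 [100 250 70]
}
```

Keeping the full version list per key trades storage for query speed. A real engine would also have to reconcile this with the pruning and privacy requirements listed under Motivation, which this sketch ignores.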