# The Ordering Service

**Audience:** Architects, ordering service admins, channel creators

This topic serves as a conceptual introduction to ordering, how orderers interact with peers, the role they play in a transaction flow, and the currently available implementations of the ordering service, with a particular focus on the recommended **Raft** ordering service implementation.

## What is ordering?

Many distributed blockchains, such as Ethereum and Bitcoin, are not permissioned, which means that any node can participate in the consensus process, wherein transactions are ordered and bundled into blocks. Because of this fact, these systems rely on **probabilistic** consensus algorithms which eventually guarantee ledger consistency to a high degree of probability, but which are still vulnerable to divergent ledgers (also known as a ledger "fork"), where different participants in the network have a different view of the accepted order of transactions.

Hechain works differently. It features a node called an **orderer** (also known as an "ordering node") that does this transaction ordering, which along with other orderer nodes forms an **ordering service**. Because Fabric's design relies on **deterministic** consensus algorithms, any block validated by the peer is guaranteed to be final and correct. Ledgers cannot fork the way they do in many other distributed and permissionless blockchain networks.

In addition to promoting finality, separating the endorsement of chaincode execution (which happens at the peers) from ordering gives Fabric advantages in performance and scalability, eliminating bottlenecks which can occur when execution and ordering are performed by the same nodes.

## Orderer nodes and channel configuration

Orderers also enforce basic access control for channels, restricting who can read and write data to them, and who can configure them. Remember that who is authorized to modify a configuration element in a channel is subject to the policies that the relevant administrators set when they created the consortium or the channel. Configuration transactions are processed by the orderer, as it needs to know the current set of policies to execute its basic form of access control. In this case, the orderer processes the configuration update to make sure that the requestor has the proper administrative rights. If so, the orderer validates the update request against the existing configuration, generates a new configuration transaction, and packages it into a block that is relayed to all peers on the channel. The peers then process the configuration transactions in order to verify that the modifications approved by the orderer do indeed satisfy the policies defined in the channel.

## Orderer nodes and identity

Everything that interacts with a blockchain network, including peers, applications, admins, and orderers, acquires their organizational identity from their digital certificate and their Membership Service Provider (MSP) definition.

For more information about identities and MSPs, check out our documentation on [Identity](../identity/identity.html) and [Membership](../membership/membership.html).

Just like peers, ordering nodes belong to an organization.
And similar to peers, a separate Certificate Authority (CA) should be used for each organization. Whether this CA will function as the root CA, or whether you choose to deploy a root CA and then intermediate CAs associated with that root CA, is up to you.

## Orderers and the transaction flow

### Phase one: Transaction Proposal and Endorsement

We've seen from our topic on [Peers](../peers/peers.html) that they form the basis for a blockchain network, hosting ledgers, which can be queried and updated by applications through smart contracts.

Specifically, applications that want to update the ledger are involved in a process with three phases that ensures all of the peers in a blockchain network keep their ledgers consistent with each other.

In the first phase, a client application sends a transaction proposal to the Fabric Gateway service, via a trusted peer. This peer executes the proposed transaction or forwards it to another peer in its organization for execution.

The gateway also forwards the transaction to peers in the organizations required by the endorsement policy. These endorsing peers run the transaction and return the transaction response to the gateway service. They do not apply the proposed update to their copy of the ledger at this time. The endorsed transaction proposals will ultimately be ordered into blocks in phase two, and then distributed to all peers for final validation and commitment to the ledger in phase three.

Note: Fabric v2.3 SDKs embed the logic of the v2.4 Fabric Gateway service in the client application --- refer to the [v2.3 Applications and Peers](https://hyperledger-fabric.readthedocs.io/en/release-2.3/peers/peers.html#applications-and-peers) topic for details.

For an in-depth look at phase one, refer back to the [Peers](../peers/peers.html#applications-and-peers) topic.

### Phase two: Transaction Submission and Ordering

With successful completion of the first transaction phase (proposal), the client application has received an endorsed transaction proposal response from the Fabric Gateway service for signing. For an endorsed transaction, the gateway service forwards the transaction to the ordering service, which orders it with other endorsed transactions, and packages them all into a block.

The ordering service creates these blocks of transactions, which will ultimately be distributed to all peers on the channel for validation and commitment to the ledger in phase three. The blocks themselves are also ordered and are the basic components of a blockchain ledger.

Ordering service nodes receive transactions from many different application clients (via the gateway) concurrently. These ordering service nodes collectively form the ordering service, which may be shared by multiple channels.

The number of transactions in a block depends on channel configuration parameters related to the desired size and maximum elapsed duration for a block (`BatchSize` and `BatchTimeout` parameters, to be exact). The blocks are then saved to the orderer's ledger and distributed to all peers on the channel. If a peer happens to be down at this time, or joins the channel later, it will receive the blocks by gossiping with another peer. We'll see how this block is processed by peers in the third phase.
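
To make the batching behavior concrete, the following Go sketch shows one way a block-cutting loop driven by a message-count limit and a timeout could work. It is an illustration only, not the orderer's actual code: the `Envelope` type, the `cutBlocks` function, and its parameters are stand-ins for the channel's `BatchSize` and `BatchTimeout` settings, and real batch cutting also considers byte-size limits, which this sketch ignores.

```go
package main

import (
	"fmt"
	"time"
)

// Envelope stands in for a signed, endorsed transaction submitted to the orderer.
type Envelope struct {
	Payload []byte
}

// cutBlocks is a simplified illustration of batching: a block is "cut" when
// either maxMessageCount transactions have accumulated (akin to BatchSize) or
// batchTimeout elapses with at least one pending transaction (akin to BatchTimeout).
func cutBlocks(in <-chan Envelope, maxMessageCount int, batchTimeout time.Duration, out chan<- []Envelope) {
	var pending []Envelope
	timer := time.NewTimer(batchTimeout)
	timer.Stop() // no pending transactions yet

	for {
		select {
		case env, ok := <-in:
			if !ok {
				if len(pending) > 0 {
					out <- pending // flush whatever remains
				}
				close(out)
				return
			}
			if len(pending) == 0 {
				timer.Reset(batchTimeout) // start the clock on the first transaction
			}
			pending = append(pending, env)
			if len(pending) >= maxMessageCount {
				out <- pending // size limit reached: cut a full block
				pending = nil
				timer.Stop()
			}
		case <-timer.C:
			if len(pending) > 0 {
				out <- pending // timeout reached: cut a smaller block rather than wait
				pending = nil
			}
		}
	}
}

func main() {
	in := make(chan Envelope)
	out := make(chan []Envelope)
	go cutBlocks(in, 10, 2*time.Second, out)

	go func() {
		for i := 0; i < 25; i++ {
			in <- Envelope{Payload: []byte(fmt.Sprintf("tx-%d", i))}
		}
		close(in)
	}()

	for block := range out {
		fmt.Printf("cut block with %d transactions\n", len(block))
	}
}
```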
It's worth noting that the sequencing of transactions in a block is not necessarily the same as the order received by the ordering service, since there can be multiple ordering service nodes that receive transactions at approximately the same time. What's important is that the ordering service puts the transactions into a strict order, and peers will use this order when validating and committing transactions.

This strict ordering of transactions within blocks makes Hechain a little different from other blockchains where the same transaction can be packaged into multiple different blocks that compete to form a chain. In Hechain, the blocks generated by the ordering service are **final**. Once a transaction has been written to a block, its position in the ledger is immutably assured. As we said earlier, Hechain's finality means that there are no **ledger forks** --- validated and committed transactions will never be reverted or dropped.

We can also see that, whereas peers execute smart contracts (chaincode) and process transactions, orderers most definitely do not. Every authorized transaction that arrives at an orderer is then mechanically packaged into a block --- the orderer makes no judgement as to the content of a transaction (except for channel configuration transactions, as mentioned earlier).

At the end of phase two, we see that orderers have been responsible for the simple but vital processes of collecting proposed transaction updates, ordering them, and packaging them into blocks, ready for distribution to the channel peers.

### Phase three: Transaction Validation and Commitment

The third phase of the transaction workflow involves the distribution of ordered and packaged blocks from the ordering service to the channel peers for validation and commitment to the ledger.

Phase three begins with the ordering service distributing blocks to all channel peers. It's worth noting that not every peer needs to be connected to an orderer --- peers can cascade blocks to other peers using the [**gossip**](../gossip.html) protocol --- although receiving blocks directly from the ordering service is recommended.

Each peer will validate distributed blocks independently, ensuring that ledgers remain consistent. Specifically, each peer in the channel will validate each transaction in the block to ensure it has been endorsed by the required organizations, that its endorsements match, and that it hasn't become invalidated by other recently committed transactions. Invalidated transactions are still retained in the immutable block created by the orderer, but they are marked as invalid by the peer and do not update the ledger's state.

*The second role of an ordering node is to distribute blocks to peers. In this example, orderer O1 distributes block B2 to peer P1 and peer P2. Peer P1 processes block B2, resulting in a new block being added to ledger L1 on P1. In parallel, peer P2 processes block B2, resulting in a new block being added to ledger L1 on P2. Once this process is complete, the ledger L1 has been consistently updated on peers P1 and P2, and each may inform connected applications that the transaction has been processed.*

In summary, phase three sees the blocks of transactions created by the ordering service applied consistently to the ledger by the peers. The strict ordering of transactions into blocks allows each peer to validate that transaction updates are consistently applied across the channel.
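
The per-transaction checks described above can be summarized in a short sketch. The Go code below is a simplified model of a committing peer's validation logic, assuming hypothetical `Transaction` and `Endorsement` types and a toy world-state version map; it is not the peer's real validation code.

```go
package main

import "fmt"

// Endorsement is a hypothetical, simplified view of an endorser's response.
type Endorsement struct {
	OrgMSPID   string
	ResultHash string // hash of the read/write set the endorser signed over
}

// Transaction is a hypothetical, simplified view of an ordered transaction.
type Transaction struct {
	ID           string
	Endorsements []Endorsement
	ReadSet      map[string]int // key -> version observed at endorsement time
}

// validate mirrors the checks described above: the endorsement policy must be
// satisfied, all endorsers must have signed over the same result, and no key in
// the read set may have been updated by a more recently committed transaction.
// A transaction that fails these checks stays in the block but is marked invalid.
func validate(tx Transaction, requiredOrgs []string, worldStateVersions map[string]int) bool {
	// 1. Endorsement policy: every required organization endorsed the transaction.
	endorsedBy := map[string]bool{}
	for _, e := range tx.Endorsements {
		endorsedBy[e.OrgMSPID] = true
	}
	for _, org := range requiredOrgs {
		if !endorsedBy[org] {
			return false
		}
	}

	// 2. Endorsements match: all endorsers agreed on the same execution result.
	for _, e := range tx.Endorsements {
		if e.ResultHash != tx.Endorsements[0].ResultHash {
			return false
		}
	}

	// 3. Read-set versions are still current (no conflicting committed update).
	for key, version := range tx.ReadSet {
		if worldStateVersions[key] != version {
			return false
		}
	}
	return true
}

func main() {
	tx := Transaction{
		ID: "tx1",
		Endorsements: []Endorsement{
			{OrgMSPID: "Org1MSP", ResultHash: "abc"},
			{OrgMSPID: "Org2MSP", ResultHash: "abc"},
		},
		ReadSet: map[string]int{"asset1": 4},
	}
	world := map[string]int{"asset1": 4}
	fmt.Println("valid:", validate(tx, []string{"Org1MSP", "Org2MSP"}, world))
}
```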
For a deeper look at phase three, refer back to the [Peers](../peers/peers.html#phase-3-validation-and-commit) topic.

## Ordering service implementations

While every ordering service currently available handles transactions and configuration updates the same way, there are nevertheless several different implementations for achieving consensus on the strict ordering of transactions between ordering service nodes.

For information about how to stand up an ordering node (regardless of the implementation the node will be used in), check out [our documentation on deploying a production ordering service](../deployorderer/ordererplan.html).

* **Raft** (recommended)

  New as of v1.4.1, Raft is a crash fault tolerant (CFT) ordering service based on an implementation of the [Raft protocol](https://raft.github.io/raft.pdf) in [`etcd`](https://coreos.com/etcd/). Raft follows a "leader and follower" model, where a leader node is elected (per channel) and its decisions are replicated by the followers. Raft ordering services should be easier to set up and manage than Kafka-based ordering services, and their design allows different organizations to contribute nodes to a distributed ordering service.

* **Kafka** (deprecated in v2.x)

  Similar to Raft-based ordering, Apache Kafka is a CFT implementation that uses a "leader and follower" node configuration. Kafka utilizes a ZooKeeper ensemble for management purposes. The Kafka-based ordering service has been available since Fabric v1.0, but many users may find the additional administrative overhead of managing a Kafka cluster intimidating or undesirable.

* **Solo** (deprecated in v2.x)

  The Solo implementation of the ordering service is intended for test only and consists of a single ordering node. It has been deprecated and may be removed entirely in a future release. Existing users of Solo should move to a single-node Raft network for equivalent function.

## Raft

For information on how to customize the `orderer.yaml` file that determines the configuration of an ordering node, check out the [Checklist for a production ordering node](../deployorderer/ordererchecklist.html).

The go-to ordering service choice for production networks, the Fabric implementation of the established Raft protocol uses a "leader and follower" model, in which a leader is dynamically elected among the ordering nodes in a channel (this collection of nodes is known as the "consenter set"), and that leader replicates messages to the follower nodes. Because the system can sustain the loss of nodes, including leader nodes, as long as there is a majority of ordering nodes (what's known as a "quorum") remaining, Raft is said to be "crash fault tolerant" (CFT). In other words, if there are three nodes in a channel, it can withstand the loss of one node (leaving two remaining). If you have five nodes in a channel, you can lose two nodes (leaving three remaining nodes). This feature of a Raft ordering service is a factor in the establishment of a high availability strategy for your ordering service. Additionally, in a production environment, you would want to spread these nodes across data centers and even locations, for example by putting one node in each of three different data centers. That way, if a data center or entire location becomes unavailable, the nodes in the other data centers continue to operate.
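
The fault-tolerance arithmetic above follows directly from the quorum rule: a consenter set of `n` nodes requires a strict majority to keep ordering, so it can tolerate the loss of `n` minus that majority. The small Go helper below is illustrative only and just makes the relationship explicit.

```go
package main

import "fmt"

// quorum returns the minimum number of consenters that must remain available
// for a Raft consenter set of size n to keep ordering: a strict majority.
func quorum(n int) int {
	return n/2 + 1
}

// faultTolerance returns how many consenters can fail while a quorum remains.
func faultTolerance(n int) int {
	return n - quorum(n)
}

func main() {
	for _, n := range []int{1, 3, 5, 7} {
		fmt.Printf("consenters=%d quorum=%d tolerated failures=%d\n",
			n, quorum(n), faultTolerance(n))
	}
}
```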
From the perspective of the service they provide to a network or a channel, Raft and the existing Kafka-based ordering service (which we'll talk about later) are similar. They're both CFT ordering services using the leader and follower design. If you are an application developer, smart contract developer, or peer administrator, you will not notice a functional difference between an ordering service based on Raft versus Kafka. However, there are a few major differences worth considering, especially if you intend to manage an ordering service.

* Raft is easier to set up. Although Kafka has many admirers, even those admirers will (usually) admit that deploying a Kafka cluster and its ZooKeeper ensemble can be tricky, requiring a high level of expertise in Kafka infrastructure and settings. Additionally, there are many more components to manage with Kafka than with Raft, which means that there are more places where things can go wrong. Kafka also has its own versions, which must be coordinated with your orderers. **With Raft, everything is embedded into your ordering node**.

* Kafka and ZooKeeper are not designed to be run across large networks. While Kafka is CFT, it should be run in a tight group of hosts. This means that, practically speaking, you need to have one organization run the Kafka cluster. Given that, having ordering nodes run by different organizations when using Kafka (which Fabric supports) doesn't decentralize the nodes, because ultimately the nodes all go to a Kafka cluster which is under the control of a single organization. With Raft, each organization can have its own ordering nodes participating in the ordering service, which leads to a more decentralized system.

* Raft is supported natively, whereas with Kafka users are required to get the requisite images and learn how to use Kafka and ZooKeeper on their own. Likewise, support for Kafka-related issues is handled through [Apache](https://kafka.apache.org/), the open-source developer of Kafka, not Hechain. The Fabric Raft implementation, on the other hand, has been developed and will be supported within the Fabric developer community and its support apparatus.

* Where Kafka uses a pool of servers (called "Kafka brokers") and the admin of the orderer organization specifies how many nodes they want to use on a particular channel, Raft allows the users to specify which ordering nodes will be deployed to which channel. In this way, peer organizations can make sure that, if they also own an orderer, this node will be made a part of the ordering service of that channel, rather than trusting and depending on a central admin to manage the Kafka nodes.

* Raft is the first step toward Fabric's development of a byzantine fault tolerant (BFT) ordering service. As we'll see, some decisions in the development of Raft were driven by this. If you are interested in BFT, learning how to use Raft should ease the transition.

For all of these reasons, support for the Kafka-based ordering service is being deprecated in Fabric v2.x.

Note: Similar to Solo and Kafka, a Raft ordering service can lose transactions after acknowledgement of receipt has been sent to a client (for example, if the leader crashes at approximately the same time as a follower provides acknowledgement of receipt). Therefore, application clients should listen on peers for transaction commit events regardless (to check for transaction validity), but extra care should be taken to ensure that the client also gracefully tolerates a timeout in which the transaction does not get committed in a configured timeframe. Depending on the application, it may be desirable to resubmit the transaction or collect a new set of endorsements upon such a timeout.
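
As a rough illustration of the client-side pattern just described (wait for the commit event, bound the wait, and be prepared to resubmit or re-endorse), here is a Go sketch. The `TransactionSubmitter` interface and its methods are hypothetical placeholders rather than the API of any particular Fabric client SDK; only the timeout-and-retry structure is the point.

```go
package txclient

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// TransactionSubmitter is a hypothetical abstraction over a Fabric client SDK:
// Submit sends a signed, endorsed transaction to the ordering service, and
// WaitForCommit blocks until a peer reports the transaction committed (or the
// context is cancelled).
type TransactionSubmitter interface {
	Submit(ctx context.Context, txID string) error
	WaitForCommit(ctx context.Context, txID string) (valid bool, err error)
}

// SubmitWithRetry submits a transaction and listens for its commit event,
// tolerating the case where the ordering service acknowledged the transaction
// but it never appears in a committed block within the configured timeout.
func SubmitWithRetry(s TransactionSubmitter, txID string, timeout time.Duration, attempts int) error {
	for i := 0; i < attempts; i++ {
		valid, err := submitOnce(s, txID, timeout)
		switch {
		case err == nil && valid:
			return nil // committed and valid
		case err == nil && !valid:
			return fmt.Errorf("transaction %s committed but marked invalid", txID)
		case errors.Is(err, context.DeadlineExceeded):
			// No commit event within the timeout: resubmit. Depending on the
			// application, fresh endorsements may be needed before retrying.
			continue
		default:
			return err
		}
	}
	return fmt.Errorf("transaction %s not committed after %d attempts", txID, attempts)
}

func submitOnce(s TransactionSubmitter, txID string, timeout time.Duration) (bool, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	if err := s.Submit(ctx, txID); err != nil {
		return false, err
	}
	return s.WaitForCommit(ctx, txID)
}
```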
### Raft concepts

While Raft offers many of the same features as Kafka --- albeit in a simpler and easier-to-use package --- it functions substantially differently under the covers from Kafka and introduces a number of new concepts, or twists on existing concepts, to Fabric.

**Log entry**. The primary unit of work in a Raft ordering service is a "log entry", with the full sequence of such entries known as the "log". We consider the log consistent if a majority (a quorum, in other words) of members agree on the entries and their order, making the logs on the various orderers replicated.

**Consenter set**. The ordering nodes actively participating in the consensus mechanism for a given channel and receiving replicated logs for the channel.

**Finite-State Machine (FSM)**. Every ordering node in Raft has an FSM and collectively they're used to ensure that the sequence of logs in the various ordering nodes is deterministic (written in the same sequence).

**Quorum**. Describes the minimum number of consenters that need to affirm a proposal so that transactions can be ordered. For every consenter set, this is a **majority** of nodes. In a cluster with five nodes, three must be available for there to be a quorum. If a quorum of nodes is unavailable for any reason, the ordering service cluster becomes unavailable for both read and write operations on the channel, and no new logs can be committed.

**Leader**. This is not a new concept --- Kafka also uses leaders --- but it's critical to understand that at any given time, a channel's consenter set elects a single node to be the leader (we'll describe how this happens in Raft later). The leader is responsible for ingesting new log entries, replicating them to follower ordering nodes, and managing when an entry is considered committed. This is not a special **type** of orderer. It is only a role that an orderer may have at certain times, and then not others, as circumstances determine.

**Follower**. Again, not a new concept, but what's critical to understand about followers is that the followers receive the logs from the leader and replicate them deterministically, ensuring that logs remain consistent. As we'll see in our section on leader election, the followers also receive "heartbeat" messages from the leader. In the event that the leader stops sending those messages for a configurable amount of time, the followers will initiate a leader election and one of them will be elected the new leader.
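
To tie several of these terms together, the sketch below models a leader appending a log entry, replicating it to the consenter set, and treating the entry as committed once a quorum holds it. This is an illustrative toy model, not the `etcd` Raft implementation that Fabric uses.

```go
package main

import "fmt"

// LogEntry is a toy log entry; in the ordering service its payload would be a
// block of ordered transactions.
type LogEntry struct {
	Index int
	Data  string
}

// Consenter is a toy model of one ordering node in the consenter set.
type Consenter struct {
	ID  string
	Log []LogEntry
}

// replicate appends an entry on the leader, sends it to every reachable
// follower, and reports whether the entry is committed, i.e. stored by a
// quorum (a majority of the consenter set, leader included).
func replicate(leader *Consenter, followers []*Consenter, reachable map[string]bool, entry LogEntry) bool {
	leader.Log = append(leader.Log, entry)
	acks := 1 // the leader already has the entry

	for _, f := range followers {
		if reachable[f.ID] {
			f.Log = append(f.Log, entry) // follower persists the entry deterministically
			acks++
		}
	}

	quorum := (len(followers)+1)/2 + 1
	return acks >= quorum
}

func main() {
	leader := &Consenter{ID: "orderer0"}
	followers := []*Consenter{{ID: "orderer1"}, {ID: "orderer2"}, {ID: "orderer3"}, {ID: "orderer4"}}

	// Two followers are unreachable; with 3 of 5 consenters holding the entry,
	// a quorum still exists and the entry commits.
	reachable := map[string]bool{"orderer1": true, "orderer2": true, "orderer3": false, "orderer4": false}

	entry := LogEntry{Index: 1, Data: "block-1"}
	fmt.Println("committed:", replicate(leader, followers, reachable, entry))
}
```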
### Raft in a transaction flow

Every channel runs on a **separate** instance of the Raft protocol, which allows each instance to elect a different leader. This configuration also allows further decentralization of the service in use cases where clusters are made up of ordering nodes controlled by different organizations. Ordering nodes can be added or removed from a channel as needed, as long as only a single node is added or removed at a time. While this configuration creates more overhead in the form of redundant heartbeat messages and goroutines, it lays necessary groundwork for BFT.

In Raft, transactions (in the form of proposals or configuration updates) are automatically routed by the ordering node that receives the transaction to the current leader of that channel. This means that peers and applications do not need to know who the leader node is at any particular time. Only the ordering nodes need to know.

When the orderer validation checks have been completed, the transactions are ordered, packaged into blocks, consented on, and distributed, as described in phase two of our transaction flow.

### Architectural notes

#### How leader election works in Raft

Although the process of electing a leader happens within the orderer's internal processes, it's worth noting how the process works.

Raft nodes are always in one of three states: follower, candidate, or leader. All nodes initially start out as a **follower**. In this state, they can accept log entries from a leader (if one has been elected), or cast votes for leader. If no log entries or heartbeats are received for a set amount of time (for example, five seconds), nodes self-promote to the **candidate** state. In the candidate state, nodes request votes from other nodes. If a candidate receives a quorum of votes, then it is promoted to a **leader**. The leader must accept new log entries and replicate them to the followers.

For a visual representation of how the leader election process works, check out [The Secret Lives of Data](http://thesecretlivesofdata.com/raft/).
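
The three states and the transitions just described can be captured as a small state machine. The following Go sketch models those transitions (election timeout, winning a quorum of votes, hearing a leader's heartbeat) for illustration only; it is not the `etcd` Raft code that Fabric embeds, and it omits terms and vote counting.

```go
package main

import "fmt"

// State models the three roles a Raft node can hold.
type State int

const (
	Follower State = iota
	Candidate
	Leader
)

func (s State) String() string {
	return [...]string{"follower", "candidate", "leader"}[s]
}

// Event models the inputs that drive transitions between roles.
type Event int

const (
	ElectionTimeout     Event = iota // no heartbeat or log entry received in time
	WonElection                      // received votes from a quorum of consenters
	HeartbeatFromLeader              // another node is acting as leader
)

// next returns the state a node moves to when an event occurs.
func next(s State, e Event) State {
	switch {
	case s == Follower && e == ElectionTimeout:
		return Candidate // self-promote and request votes from the other nodes
	case s == Candidate && e == WonElection:
		return Leader // a quorum voted for this node
	case s == Candidate && e == ElectionTimeout:
		return Candidate // split vote: start a new election
	case e == HeartbeatFromLeader:
		return Follower // another node is leading: step back to follower
	default:
		return s
	}
}

func main() {
	s := Follower
	for _, e := range []Event{ElectionTimeout, WonElection, HeartbeatFromLeader} {
		s = next(s, e)
		fmt.Println("state is now", s)
	}
}
```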
#### Snapshots

If an ordering node goes down, how does it get the logs it missed when it is restarted?

While it's possible to keep all logs indefinitely, in order to save disk space, Raft uses a process called "snapshotting", in which users can define how many bytes of data will be kept in the log. This amount of data corresponds to a certain number of blocks, which depends on the amount of data in the blocks (note that only full blocks are stored in a snapshot).

For example, let's say lagging replica `R1` was just reconnected to the network. Its latest block is `100`. Leader `L` is at block `196`, and is configured to snapshot at an amount of data that in this case represents 20 blocks. `R1` would therefore receive block `180` from `L` and then make a `Deliver` request for blocks `101` to `180`. Blocks `180` to `196` would then be replicated to `R1` through the normal Raft protocol.

### Kafka (deprecated in v2.x)

The other crash fault tolerant ordering service supported by Fabric is an adaptation of the Kafka distributed streaming platform for use as a cluster of ordering nodes. You can read more about Kafka at the [Apache Kafka Web site](https://kafka.apache.org/intro), but at a high level, Kafka uses the same conceptual "leader and follower" configuration used by Raft, in which transactions (which Kafka calls "messages") are replicated from the leader node to the follower nodes. In the event the leader node goes down, one of the followers becomes the leader and ordering can continue, ensuring fault tolerance, just as with Raft.

The management of the Kafka cluster, including the coordination of tasks, cluster membership, access control, and controller election, among others, is handled by a ZooKeeper ensemble and its related APIs.

Kafka clusters and ZooKeeper ensembles are notoriously tricky to set up, so our documentation assumes a working knowledge of Kafka and ZooKeeper. If you decide to use Kafka without having this expertise, you should complete, *at a minimum*, the first six steps of the [Kafka Quickstart guide](https://kafka.apache.org/quickstart) before experimenting with the Kafka-based ordering service. You can also consult [this sample configuration file](https://github.com/hechain20/hechain/blob/release-1.1/bddtests/dc-orderer-kafka.yml) for a brief explanation of the sensible defaults for Kafka and ZooKeeper.

To learn how to bring up a Kafka-based ordering service, check out [our documentation on Kafka](../kafka.html).

<!--- Licensed under Creative Commons Attribution 4.0 International License
https://creativecommons.org/licenses/by/4.0/) -->