# The Ordering Service

**Audience:** Architects, ordering service admins, channel creators

This topic serves as a conceptual introduction to ordering, how orderers interact with peers, the role they play in a transaction flow, and the currently available implementations of the ordering service, with a particular focus on the recommended **Raft** ordering service implementation.

## What is ordering?

Many distributed blockchains, such as Ethereum and Bitcoin, are not permissioned, which means that any node can participate in the consensus process, wherein transactions are ordered and bundled into blocks. Because of this fact, these systems rely on **probabilistic** consensus algorithms which eventually guarantee ledger consistency to a high degree of probability, but which are still vulnerable to divergent ledgers (also known as a ledger "fork"), where different participants in the network have a different view of the accepted order of transactions.

Hechain works differently. It features a node called an **orderer** (also known as an "ordering node") that does this transaction ordering, which along with other orderer nodes forms an **ordering service**. Because Fabric's design relies on **deterministic** consensus algorithms, any block validated by the peer is guaranteed to be final and correct. Ledgers cannot fork the way they do in many other distributed and permissionless blockchain networks.

In addition to promoting finality, separating the endorsement of chaincode execution (which happens at the peers) from ordering gives Fabric advantages in performance and scalability, eliminating bottlenecks which can occur when execution and ordering are performed by the same nodes.

## Orderer nodes and channel configuration

Orderers also enforce basic access control for channels, restricting who can read and write data to them, and who can configure them. Remember that who is authorized to modify a configuration element in a channel is subject to the policies that the relevant administrators set when they created the consortium or the channel. Configuration transactions are processed by the orderer, as it needs to know the current set of policies to execute its basic form of access control. In this case, the orderer processes the configuration update to make sure that the requestor has the proper administrative rights. If so, the orderer validates the update request against the existing configuration, generates a new configuration transaction, and packages it into a block that is relayed to all peers on the channel. The peers then process the configuration transactions in order to verify that the modifications approved by the orderer do indeed satisfy the policies defined in the channel.

## Orderer nodes and identity

Everything that interacts with a blockchain network, including peers, applications, admins, and orderers, acquires their organizational identity from their digital certificate and their Membership Service Provider (MSP) definition.

For more information about identities and MSPs, check out our documentation on [Identity](../identity/identity.html) and [Membership](../membership/membership.html).

Just like peers, ordering nodes belong to an organization.
And similar to peers, a separate Certificate Authority (CA) should be used for each organization. Whether this CA will function as the root CA, or whether you choose to deploy a root CA and then intermediate CAs associated with that root CA, is up to you.

## Orderers and the transaction flow

### Phase one: Transaction Proposal and Endorsement

We've seen from our topic on [Peers](../peers/peers.html) that they form the basis for a blockchain network, hosting ledgers, which can be queried and updated by applications through smart contracts.

Specifically, applications that want to update the ledger are involved in a process with three phases that ensures all of the peers in a blockchain network keep their ledgers consistent with each other.

In the first phase, a client application sends a transaction proposal to the Fabric Gateway service, via a trusted peer. This peer executes the proposed transaction or forwards it to another peer in its organization for execution.

The gateway also forwards the transaction to peers in the organizations required by the endorsement policy. These endorsing peers run the transaction and return the transaction response to the gateway service. They do not apply the proposed update to their copy of the ledger at this time. The endorsed transaction proposals will ultimately be ordered into blocks in phase two, and then distributed to all peers for final validation and commitment to the ledger in phase three.

Note: Fabric v2.3 SDKs embed the logic of the v2.4 Fabric Gateway service in the client application --- refer to the [v2.3 Applications and Peers](https://hyperledger-fabric.readthedocs.io/en/release-2.3/peers/peers.html#applications-and-peers) topic for details.

For an in-depth look at phase one, refer back to the [Peers](../peers/peers.html#applications-and-peers) topic.

### Phase two: Transaction Submission and Ordering

With successful completion of the first transaction phase (proposal), the client application has received an endorsed transaction proposal response from the Fabric Gateway service for signing. For an endorsed transaction, the gateway service forwards the transaction to the ordering service, which orders it with other endorsed transactions, and packages them all into a block.

The ordering service creates these blocks of transactions, which will ultimately be distributed to all peers on the channel for validation and commitment to the ledger in phase three. The blocks themselves are also ordered and are the basic components of a blockchain ledger.

Ordering service nodes receive transactions from many different application clients (via the gateway) concurrently. These ordering service nodes collectively form the ordering service, which may be shared by multiple channels.

The number of transactions in a block depends on channel configuration parameters related to the desired size and maximum elapsed duration for a block (`BatchSize` and `BatchTimeout` parameters, to be exact). The blocks are then saved to the orderer's ledger and distributed to all peers on the channel. If a peer happens to be down at this time, or joins the channel later, it will receive the blocks by gossiping with another peer. We'll see how this block is processed by peers in the third phase.
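
To make the batching behavior concrete, the following Go sketch shows one way a block-cutting loop driven by a message-count limit and a timeout could work. It is an illustration only, not the orderer's actual code: the `Envelope` type, the `cutBlocks` function, and its parameters are stand-ins for the channel's `BatchSize` and `BatchTimeout` settings, and real batch cutting also considers byte-size limits, which this sketch ignores.

```go
package main

import (
	"fmt"
	"time"
)

// Envelope stands in for a signed, endorsed transaction submitted to the orderer.
type Envelope struct {
	Payload []byte
}

// cutBlocks is a simplified illustration of batching: a block is "cut" when
// either maxMessageCount transactions have accumulated (akin to BatchSize) or
// batchTimeout elapses with at least one pending transaction (akin to BatchTimeout).
func cutBlocks(in <-chan Envelope, maxMessageCount int, batchTimeout time.Duration, out chan<- []Envelope) {
	var pending []Envelope
	timer := time.NewTimer(batchTimeout)
	timer.Stop() // no pending transactions yet

	for {
		select {
		case env, ok := <-in:
			if !ok {
				if len(pending) > 0 {
					out <- pending // flush whatever remains
				}
				close(out)
				return
			}
			if len(pending) == 0 {
				timer.Reset(batchTimeout) // start the clock on the first transaction
			}
			pending = append(pending, env)
			if len(pending) >= maxMessageCount {
				out <- pending // size limit reached: cut a full block
				pending = nil
				timer.Stop()
			}
		case <-timer.C:
			if len(pending) > 0 {
				out <- pending // timeout reached: cut a smaller block rather than wait
				pending = nil
			}
		}
	}
}

func main() {
	in := make(chan Envelope)
	out := make(chan []Envelope)
	go cutBlocks(in, 10, 2*time.Second, out)

	go func() {
		for i := 0; i < 25; i++ {
			in <- Envelope{Payload: []byte(fmt.Sprintf("tx-%d", i))}
		}
		close(in)
	}()

	for block := range out {
		fmt.Printf("cut block with %d transactions\n", len(block))
	}
}
```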
It's worth noting that the sequencing of transactions in a block is not necessarily the same as the order received by the ordering service, since there can be multiple ordering service nodes that receive transactions at approximately the same time. What's important is that the ordering service puts the transactions into a strict order, and peers will use this order when validating and committing transactions.

This strict ordering of transactions within blocks makes Hechain a little different from other blockchains where the same transaction can be packaged into multiple different blocks that compete to form a chain. In Hechain, the blocks generated by the ordering service are **final**. Once a transaction has been written to a block, its position in the ledger is immutably assured. As we said earlier, Hechain's finality means that there are no **ledger forks** --- validated and committed transactions will never be reverted or dropped.

We can also see that, whereas peers execute smart contracts (chaincode) and process transactions, orderers most definitely do not. Every authorized transaction that arrives at an orderer is then mechanically packaged into a block --- the orderer makes no judgement as to the content of a transaction (except for channel configuration transactions, as mentioned earlier).

At the end of phase two, we see that orderers have been responsible for the simple but vital processes of collecting proposed transaction updates, ordering them, and packaging them into blocks, ready for distribution to the channel peers.

### Phase three: Transaction Validation and Commitment

The third phase of the transaction workflow involves the distribution of ordered and packaged blocks from the ordering service to the channel peers for validation and commitment to the ledger.

Phase three begins with the ordering service distributing blocks to all channel peers. It's worth noting that not every peer needs to be connected to an orderer --- peers can cascade blocks to other peers using the [**gossip**](../gossip.html) protocol --- although receiving blocks directly from the ordering service is recommended.

Each peer will validate distributed blocks independently, ensuring that ledgers remain consistent. Specifically, each peer in the channel will validate each transaction in the block to ensure it has been endorsed by the required organizations, that its endorsements match, and that it hasn't become invalidated by other recently committed transactions. Invalidated transactions are still retained in the immutable block created by the orderer, but they are marked as invalid by the peer and do not update the ledger's state.

*The second role of an ordering node is to distribute blocks to peers. In this example, orderer O1 distributes block B2 to peer P1 and peer P2. Peer P1 processes block B2, resulting in a new block being added to ledger L1 on P1. In parallel, peer P2 processes block B2, resulting in a new block being added to ledger L1 on P2. Once this process is complete, the ledger L1 has been consistently updated on peers P1 and P2, and each may inform connected applications that the transaction has been processed.*

In summary, phase three sees the blocks of transactions created by the ordering service applied consistently to the ledger by the peers. The strict ordering of transactions into blocks allows each peer to validate that transaction updates are consistently applied across the channel.
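
The per-transaction checks described above can be summarized in a short sketch. The Go code below is a simplified model of a committing peer's validation logic, assuming hypothetical `Transaction` and `Endorsement` types and a toy world-state version map; it is not the peer's real validation code.

```go
package main

import "fmt"

// Endorsement is a hypothetical, simplified view of an endorser's response.
type Endorsement struct {
	OrgMSPID   string
	ResultHash string // hash of the read/write set the endorser signed over
}

// Transaction is a hypothetical, simplified view of an ordered transaction.
type Transaction struct {
	ID           string
	Endorsements []Endorsement
	ReadSet      map[string]int // key -> version observed at endorsement time
}

// validate mirrors the checks described above: the endorsement policy must be
// satisfied, all endorsers must have signed over the same result, and no key in
// the read set may have been updated by a more recently committed transaction.
// A transaction that fails these checks stays in the block but is marked invalid.
func validate(tx Transaction, requiredOrgs []string, worldStateVersions map[string]int) bool {
	// 1. Endorsement policy: every required organization endorsed the transaction.
	endorsedBy := map[string]bool{}
	for _, e := range tx.Endorsements {
		endorsedBy[e.OrgMSPID] = true
	}
	for _, org := range requiredOrgs {
		if !endorsedBy[org] {
			return false
		}
	}

	// 2. Endorsements match: all endorsers agreed on the same execution result.
	for _, e := range tx.Endorsements {
		if e.ResultHash != tx.Endorsements[0].ResultHash {
			return false
		}
	}

	// 3. Read-set versions are still current (no conflicting committed update).
	for key, version := range tx.ReadSet {
		if worldStateVersions[key] != version {
			return false
		}
	}
	return true
}

func main() {
	tx := Transaction{
		ID: "tx1",
		Endorsements: []Endorsement{
			{OrgMSPID: "Org1MSP", ResultHash: "abc"},
			{OrgMSPID: "Org2MSP", ResultHash: "abc"},
		},
		ReadSet: map[string]int{"asset1": 4},
	}
	world := map[string]int{"asset1": 4}
	fmt.Println("valid:", validate(tx, []string{"Org1MSP", "Org2MSP"}, world))
}
```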
For a deeper look at phase three, refer back to the [Peers](../peers/peers.html#phase-3-validation-and-commit) topic.

## Ordering service implementations

While every ordering service currently available handles transactions and configuration updates the same way, there are nevertheless several different implementations for achieving consensus on the strict ordering of transactions between ordering service nodes.

For information about how to stand up an ordering node (regardless of the implementation the node will be used in), check out [our documentation on deploying a production ordering service](../deployorderer/ordererplan.html).

* **Raft** (recommended)

  New as of v1.4.1, Raft is a crash fault tolerant (CFT) ordering service based on an implementation of the [Raft protocol](https://raft.github.io/raft.pdf) in [`etcd`](https://coreos.com/etcd/). Raft follows a "leader and follower" model, where a leader node is elected (per channel) and its decisions are replicated by the followers. Raft ordering services should be easier to set up and manage than Kafka-based ordering services, and their design allows different organizations to contribute nodes to a distributed ordering service.

* **Kafka** (deprecated in v2.x)

  Similar to Raft-based ordering, Apache Kafka is a CFT implementation that uses a "leader and follower" node configuration. Kafka utilizes a ZooKeeper ensemble for management purposes. The Kafka-based ordering service has been available since Fabric v1.0, but many users may find the additional administrative overhead of managing a Kafka cluster intimidating or undesirable.

* **Solo** (deprecated in v2.x)

  The Solo implementation of the ordering service is intended for test only and consists of a single ordering node. It has been deprecated and may be removed entirely in a future release. Existing users of Solo should move to a single-node Raft network for equivalent function.

## Raft

For information on how to customize the `orderer.yaml` file that determines the configuration of an ordering node, check out the [Checklist for a production ordering node](../deployorderer/ordererchecklist.html).

The go-to ordering service choice for production networks, the Fabric implementation of the established Raft protocol uses a "leader and follower" model, in which a leader is dynamically elected among the ordering nodes in a channel (this collection of nodes is known as the "consenter set"), and that leader replicates messages to the follower nodes. Because the system can sustain the loss of nodes, including leader nodes, as long as there is a majority of ordering nodes (what's known as a "quorum") remaining, Raft is said to be "crash fault tolerant" (CFT). In other words, if there are three nodes in a channel, it can withstand the loss of one node (leaving two remaining). If you have five nodes in a channel, you can lose two nodes (leaving three remaining nodes). This feature of a Raft ordering service is a factor in the establishment of a high availability strategy for your ordering service. Additionally, in a production environment, you would want to spread these nodes across data centers and even locations, for example by putting one node in each of three different data centers. That way, if a data center or entire location becomes unavailable, the nodes in the other data centers continue to operate.
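
The fault-tolerance arithmetic above follows directly from the quorum rule: a consenter set of `n` nodes requires a strict majority to keep ordering, so it can tolerate the loss of `n` minus that majority. The small Go helper below is illustrative only and just makes the relationship explicit.

```go
package main

import "fmt"

// quorum returns the minimum number of consenters that must remain available
// for a Raft consenter set of size n to keep ordering: a strict majority.
func quorum(n int) int {
	return n/2 + 1
}

// faultTolerance returns how many consenters can fail while a quorum remains.
func faultTolerance(n int) int {
	return n - quorum(n)
}

func main() {
	for _, n := range []int{1, 3, 5, 7} {
		fmt.Printf("consenters=%d quorum=%d tolerated failures=%d\n",
			n, quorum(n), faultTolerance(n))
	}
}
```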
From the perspective of the service they provide to a network or a channel, Raft and the existing Kafka-based ordering service (which we'll talk about later) are similar. They're both CFT ordering services using the leader and follower design. If you are an application developer, smart contract developer, or peer administrator, you will not notice a functional difference between an ordering service based on Raft versus Kafka. However, there are a few major differences worth considering, especially if you intend to manage an ordering service.

* Raft is easier to set up. Although Kafka has many admirers, even those admirers will (usually) admit that deploying a Kafka cluster and its ZooKeeper ensemble can be tricky, requiring a high level of expertise in Kafka infrastructure and settings. Additionally, there are many more components to manage with Kafka than with Raft, which means that there are more places where things can go wrong. Kafka also has its own versions, which must be coordinated with your orderers. **With Raft, everything is embedded into your ordering node**.

* Kafka and ZooKeeper are not designed to be run across large networks. While Kafka is CFT, it should be run in a tight group of hosts. This means that, practically speaking, you need to have one organization run the Kafka cluster. Given that, having ordering nodes run by different organizations when using Kafka (which Fabric supports) doesn't decentralize the nodes, because ultimately the nodes all go to a Kafka cluster which is under the control of a single organization. With Raft, each organization can have its own ordering nodes participating in the ordering service, which leads to a more decentralized system.

* Raft is supported natively, whereas with Kafka users are required to get the requisite images and learn how to use Kafka and ZooKeeper on their own. Likewise, support for Kafka-related issues is handled through [Apache](https://kafka.apache.org/), the open-source developer of Kafka, not Hechain. The Fabric Raft implementation, on the other hand, has been developed and will be supported within the Fabric developer community and its support apparatus.

* Where Kafka uses a pool of servers (called "Kafka brokers") and the admin of the orderer organization specifies how many nodes they want to use on a particular channel, Raft allows the users to specify which ordering nodes will be deployed to which channel. In this way, peer organizations can make sure that, if they also own an orderer, this node will be made a part of the ordering service of that channel, rather than trusting and depending on a central admin to manage the Kafka nodes.

* Raft is the first step toward Fabric's development of a byzantine fault tolerant (BFT) ordering service. As we'll see, some decisions in the development of Raft were driven by this. If you are interested in BFT, learning how to use Raft should ease the transition.

For all of these reasons, support for the Kafka-based ordering service is being deprecated in Fabric v2.x.

Note: Similar to Solo and Kafka, a Raft ordering service can lose transactions after acknowledgement of receipt has been sent to a client (for example, if the leader crashes at approximately the same time as a follower provides acknowledgement of receipt). Therefore, application clients should listen on peers for transaction commit events regardless (to check for transaction validity), but extra care should be taken to ensure that the client also gracefully tolerates a timeout in which the transaction does not get committed in a configured timeframe. Depending on the application, it may be desirable to resubmit the transaction or collect a new set of endorsements upon such a timeout.
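
As a rough illustration of the client-side pattern just described (wait for the commit event, bound the wait, and be prepared to resubmit or re-endorse), here is a Go sketch. The `TransactionSubmitter` interface and its methods are hypothetical placeholders rather than the API of any particular Fabric client SDK; only the timeout-and-retry structure is the point.

```go
package txclient

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// TransactionSubmitter is a hypothetical abstraction over a Fabric client SDK:
// Submit sends a signed, endorsed transaction to the ordering service, and
// WaitForCommit blocks until a peer reports the transaction committed (or the
// context is cancelled).
type TransactionSubmitter interface {
	Submit(ctx context.Context, txID string) error
	WaitForCommit(ctx context.Context, txID string) (valid bool, err error)
}

// SubmitWithRetry submits a transaction and listens for its commit event,
// tolerating the case where the ordering service acknowledged the transaction
// but it never appears in a committed block within the configured timeout.
func SubmitWithRetry(s TransactionSubmitter, txID string, timeout time.Duration, attempts int) error {
	for i := 0; i < attempts; i++ {
		valid, err := submitOnce(s, txID, timeout)
		switch {
		case err == nil && valid:
			return nil // committed and valid
		case err == nil && !valid:
			return fmt.Errorf("transaction %s committed but marked invalid", txID)
		case errors.Is(err, context.DeadlineExceeded):
			// No commit event within the timeout: resubmit. Depending on the
			// application, fresh endorsements may be needed before retrying.
			continue
		default:
			return err
		}
	}
	return fmt.Errorf("transaction %s not committed after %d attempts", txID, attempts)
}

func submitOnce(s TransactionSubmitter, txID string, timeout time.Duration) (bool, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	if err := s.Submit(ctx, txID); err != nil {
		return false, err
	}
	return s.WaitForCommit(ctx, txID)
}
```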
### Raft concepts

While Raft offers many of the same features as Kafka --- albeit in a simpler and easier-to-use package --- it functions substantially differently under the covers from Kafka and introduces a number of new concepts, or twists on existing concepts, to Fabric.

**Log entry**. The primary unit of work in a Raft ordering service is a "log entry", with the full sequence of such entries known as the "log". We consider the log consistent if a majority (a quorum, in other words) of members agree on the entries and their order, making the logs on the various orderers replicated.

**Consenter set**. The ordering nodes actively participating in the consensus mechanism for a given channel and receiving replicated logs for the channel.

**Finite-State Machine (FSM)**. Every ordering node in Raft has an FSM and collectively they're used to ensure that the sequence of logs in the various ordering nodes is deterministic (written in the same sequence).

**Quorum**. Describes the minimum number of consenters that need to affirm a proposal so that transactions can be ordered. For every consenter set, this is a **majority** of nodes. In a cluster with five nodes, three must be available for there to be a quorum. If a quorum of nodes is unavailable for any reason, the ordering service cluster becomes unavailable for both read and write operations on the channel, and no new logs can be committed.

**Leader**. This is not a new concept --- Kafka also uses leaders --- but it's critical to understand that at any given time, a channel's consenter set elects a single node to be the leader (we'll describe how this happens in Raft later). The leader is responsible for ingesting new log entries, replicating them to follower ordering nodes, and managing when an entry is considered committed. This is not a special **type** of orderer. It is only a role that an orderer may have at certain times, and then not others, as circumstances determine.

**Follower**. Again, not a new concept, but what's critical to understand about followers is that the followers receive the logs from the leader and replicate them deterministically, ensuring that logs remain consistent. As we'll see in our section on leader election, the followers also receive "heartbeat" messages from the leader. In the event that the leader stops sending those messages for a configurable amount of time, the followers will initiate a leader election and one of them will be elected the new leader.
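
To tie several of these terms together, the sketch below models a leader appending a log entry, replicating it to the consenter set, and treating the entry as committed once a quorum holds it. This is an illustrative toy model, not the `etcd` Raft implementation that Fabric uses.

```go
package main

import "fmt"

// LogEntry is a toy log entry; in the ordering service its payload would be a
// block of ordered transactions.
type LogEntry struct {
	Index int
	Data  string
}

// Consenter is a toy model of one ordering node in the consenter set.
type Consenter struct {
	ID  string
	Log []LogEntry
}

// replicate appends an entry on the leader, sends it to every reachable
// follower, and reports whether the entry is committed, i.e. stored by a
// quorum (a majority of the consenter set, leader included).
func replicate(leader *Consenter, followers []*Consenter, reachable map[string]bool, entry LogEntry) bool {
	leader.Log = append(leader.Log, entry)
	acks := 1 // the leader already has the entry

	for _, f := range followers {
		if reachable[f.ID] {
			f.Log = append(f.Log, entry) // follower persists the entry deterministically
			acks++
		}
	}

	quorum := (len(followers)+1)/2 + 1
	return acks >= quorum
}

func main() {
	leader := &Consenter{ID: "orderer0"}
	followers := []*Consenter{{ID: "orderer1"}, {ID: "orderer2"}, {ID: "orderer3"}, {ID: "orderer4"}}

	// Two followers are unreachable; with 3 of 5 consenters holding the entry,
	// a quorum still exists and the entry commits.
	reachable := map[string]bool{"orderer1": true, "orderer2": true, "orderer3": false, "orderer4": false}

	entry := LogEntry{Index: 1, Data: "block-1"}
	fmt.Println("committed:", replicate(leader, followers, reachable, entry))
}
```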
### Raft in a transaction flow

Every channel runs on a **separate** instance of the Raft protocol, which allows each instance to elect a different leader. This configuration also allows further decentralization of the service in use cases where clusters are made up of ordering nodes controlled by different organizations. Ordering nodes can be added or removed from a channel as needed, as long as only a single node is added or removed at a time. While this configuration creates more overhead in the form of redundant heartbeat messages and goroutines, it lays necessary groundwork for BFT.

In Raft, transactions (in the form of proposals or configuration updates) are automatically routed by the ordering node that receives the transaction to the current leader of that channel. This means that peers and applications do not need to know who the leader node is at any particular time. Only the ordering nodes need to know.

When the orderer validation checks have been completed, the transactions are ordered, packaged into blocks, consented on, and distributed, as described in phase two of our transaction flow.

### Architectural notes

#### How leader election works in Raft

Although the process of electing a leader happens within the orderer's internal processes, it's worth noting how the process works.

Raft nodes are always in one of three states: follower, candidate, or leader. All nodes initially start out as a **follower**. In this state, they can accept log entries from a leader (if one has been elected), or cast votes for leader. If no log entries or heartbeats are received for a set amount of time (for example, five seconds), nodes self-promote to the **candidate** state. In the candidate state, nodes request votes from other nodes. If a candidate receives a quorum of votes, then it is promoted to a **leader**. The leader must accept new log entries and replicate them to the followers.

For a visual representation of how the leader election process works, check out [The Secret Lives of Data](http://thesecretlivesofdata.com/raft/).
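
The three states and the transitions just described can be captured as a small state machine. The following Go sketch models those transitions (election timeout, winning a quorum of votes, hearing a leader's heartbeat) for illustration only; it is not the `etcd` Raft code that Fabric embeds, and it omits terms and vote counting.

```go
package main

import "fmt"

// State models the three roles a Raft node can hold.
type State int

const (
	Follower State = iota
	Candidate
	Leader
)

func (s State) String() string {
	return [...]string{"follower", "candidate", "leader"}[s]
}

// Event models the inputs that drive transitions between roles.
type Event int

const (
	ElectionTimeout     Event = iota // no heartbeat or log entry received in time
	WonElection                      // received votes from a quorum of consenters
	HeartbeatFromLeader              // another node is acting as leader
)

// next returns the state a node moves to when an event occurs.
func next(s State, e Event) State {
	switch {
	case s == Follower && e == ElectionTimeout:
		return Candidate // self-promote and request votes from the other nodes
	case s == Candidate && e == WonElection:
		return Leader // a quorum voted for this node
	case s == Candidate && e == ElectionTimeout:
		return Candidate // split vote: start a new election
	case e == HeartbeatFromLeader:
		return Follower // another node is leading: step back to follower
	default:
		return s
	}
}

func main() {
	s := Follower
	for _, e := range []Event{ElectionTimeout, WonElection, HeartbeatFromLeader} {
		s = next(s, e)
		fmt.Println("state is now", s)
	}
}
```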
#### Snapshots

If an ordering node goes down, how does it get the logs it missed when it is restarted?

While it's possible to keep all logs indefinitely, in order to save disk space, Raft uses a process called "snapshotting", in which users can define how many bytes of data will be kept in the log. This amount of data corresponds to a certain number of blocks, which depends on the amount of data in the blocks (note that only full blocks are stored in a snapshot).

For example, let's say lagging replica `R1` was just reconnected to the network. Its latest block is `100`. Leader `L` is at block `196`, and is configured to snapshot at an amount of data that in this case represents 20 blocks. `R1` would therefore receive block `180` from `L` and then make a `Deliver` request for blocks `101` to `180`. Blocks `180` to `196` would then be replicated to `R1` through the normal Raft protocol.

### Kafka (deprecated in v2.x)

The other crash fault tolerant ordering service supported by Fabric is an adaptation of the Kafka distributed streaming platform for use as a cluster of ordering nodes. You can read more about Kafka at the [Apache Kafka Web site](https://kafka.apache.org/intro), but at a high level, Kafka uses the same conceptual "leader and follower" configuration used by Raft, in which transactions (which Kafka calls "messages") are replicated from the leader node to the follower nodes. In the event the leader node goes down, one of the followers becomes the leader and ordering can continue, ensuring fault tolerance, just as with Raft.

The management of the Kafka cluster, including the coordination of tasks, cluster membership, access control, and controller election, among others, is handled by a ZooKeeper ensemble and its related APIs.

Kafka clusters and ZooKeeper ensembles are notoriously tricky to set up, so our documentation assumes a working knowledge of Kafka and ZooKeeper. If you decide to use Kafka without having this expertise, you should complete, *at a minimum*, the first six steps of the [Kafka Quickstart guide](https://kafka.apache.org/quickstart) before experimenting with the Kafka-based ordering service. You can also consult [this sample configuration file](https://github.com/hechain20/hechain/blob/release-1.1/bddtests/dc-orderer-kafka.yml) for a brief explanation of the sensible defaults for Kafka and ZooKeeper.

To learn how to bring up a Kafka-based ordering service, check out [our documentation on Kafka](../kafka.html).

<!--- Licensed under Creative Commons Attribution 4.0 International License
https://creativecommons.org/licenses/by/4.0/) -->