Bringing up a Kafka-based Ordering Service
===========================================

.. _kafka-caveat:

Caveat emptor
-------------

This document assumes that the reader knows how to set up a Kafka cluster and a
ZooKeeper ensemble, and keep them secure for general usage by preventing
unauthorized access. The sole purpose of this guide is to identify the steps you
need to take so as to have a set of Hyperledger Fabric ordering service nodes
(OSNs) use your Kafka cluster and provide an ordering service to your blockchain
network.

For information about the role orderers play in a network and in a transaction
flow, check out our :doc:`orderer/ordering_service` documentation.

For information on how to set up an ordering node, check out our :doc:`orderer_deploy`
documentation.

For information about configuring Raft ordering services, check out :doc:`raft_configuration`.

Big picture
-----------

Each channel maps to a separate single-partition topic in Kafka. When an OSN
receives transactions via the ``Broadcast`` RPC, it checks to make sure that the
broadcasting client has permissions to write on the channel, then relays (i.e.
produces) those transactions to the appropriate partition in Kafka. This
partition is also consumed by the OSN, which groups the received transactions
into blocks locally, persists them in its local ledger, and serves them to
receiving clients via the ``Deliver`` RPC. For low-level details, refer to `the
document that describes how we came to this design <https://docs.google.com/document/d/19JihmW-8blTzN99lAubOfseLUZqdrB6sBR0HsRgCAnY/edit>`_.
**Figure 8** in that document is a schematic representation of the process
described above.

Steps
-----

Let ``K`` and ``Z`` be the number of nodes in the Kafka cluster and the
ZooKeeper ensemble respectively:

1. At a minimum, ``K`` should be set to 4. (As we will explain in Step 4 below,
   this is the minimum number of nodes necessary in order to exhibit crash fault
   tolerance, i.e. with 4 brokers, you can have 1 broker go down while all
   channels continue to be writeable and readable, and new channels can be
   created.)

2. ``Z`` will either be 3, 5, or 7. It has to be an odd number to avoid
   split-brain scenarios, and larger than 1 in order to avoid single points of
   failure. Anything beyond 7 ZooKeeper servers is considered overkill.

Then proceed as follows:

3. Orderers: **Encode the Kafka-related information in the network's genesis
   block.** If you are using ``configtxgen``, edit ``configtx.yaml``. Alternatively,
   pick a preset profile for the system channel's genesis block, so that:

   * ``Orderer.OrdererType`` is set to ``kafka``.
   * ``Orderer.Kafka.Brokers`` contains the address of *at least two* of the Kafka
     brokers in your cluster in ``IP:port`` notation. The list does not need to be
     exhaustive. (These are your bootstrap brokers.)
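
   For illustration, the relevant stanza in ``configtx.yaml`` might look like
   the following sketch (the broker addresses are placeholders for your own):

   ::

       Orderer:
           OrdererType: kafka
           Kafka:
               Brokers:
                   - kafka0.example.com:9092
                   - kafka1.example.com:9092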

4. Orderers: **Set the maximum block size.** Each block will have at most
   ``Orderer.AbsoluteMaxBytes`` bytes (not including headers), a value that you can
   set in ``configtx.yaml``. Let the value you pick here be ``A`` and make note of
   it; it will affect how you configure your Kafka brokers in Step 6.
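
   As a sketch, if you set ``A`` to 10 MiB, the corresponding ``configtx.yaml``
   entry (which the sample configuration file nests under ``BatchSize``) might
   look like:

   ::

       Orderer:
           BatchSize:
               AbsoluteMaxBytes: 10 MB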

5. Orderers: **Create the genesis block.** Use ``configtxgen``. The settings you
   picked in Steps 3 and 4 above are system-wide settings, i.e. they apply across the
   network for all the OSNs. Make note of the genesis block's location.
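
   For example, assuming your ``configtx.yaml`` defines a Kafka-based profile
   named ``SampleDevModeKafka`` (the profile name, channel ID, and output path
   here are illustrative):

   ::

       configtxgen -profile SampleDevModeKafka \
           -channelID orderer-system-channel \
           -outputBlock ./genesis.block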

6. Kafka cluster: **Configure your Kafka brokers appropriately.** Ensure that every
   Kafka broker has these keys configured (a consolidated sketch follows this
   list):

   * ``unclean.leader.election.enable = false`` — Data consistency is key in a
     blockchain environment. We cannot have a channel leader chosen outside of
     the in-sync replica set, or we run the risk of overwriting the offsets that
     the previous leader produced, and —as a result— rewrite the blockchain that
     the orderers produce.

   * ``min.insync.replicas = M`` — Where you pick a value ``M`` such that
     ``1 < M < N`` (see ``default.replication.factor`` below). Data is
     considered committed when it is written to at least ``M`` replicas
     (which are then considered in-sync and belong to the in-sync replica
     set, or ISR). In any other case, the write operation returns an error.
     Then:

     * If up to ``N-M`` replicas —out of the ``N`` that the channel data is
       written to— become unavailable, operations proceed normally.

     * If more replicas become unavailable, Kafka cannot maintain an ISR set
       of ``M``, so it stops accepting writes. Reads work without issues.
       The channel becomes writeable again when ``M`` replicas get in-sync.

   * ``default.replication.factor = N`` — Where you pick a value ``N`` such
     that ``N < K``. A replication factor of ``N`` means that each channel will
     have its data replicated to ``N`` brokers. These are the candidates for the
     ISR set of a channel. As we noted in the ``min.insync.replicas`` section
     above, not all of these brokers have to be available all the time. ``N``
     should be set *strictly smaller* than ``K`` because channel creations cannot
     go forward if fewer than ``N`` brokers are up. So if you set ``N = K``, a
     single broker going down means that no new channels can be created on the
     blockchain network — the crash fault tolerance of the ordering service is
     non-existent.

     Based on what we've described above, the minimum allowed values for ``M``
     and ``N`` are 2 and 3 respectively. This configuration allows for the
     creation of new channels to go forward, and for all channels to continue
     to be writeable.

   * ``message.max.bytes`` and ``replica.fetch.max.bytes`` should be set to
     a value larger than ``A``, the value you picked in ``Orderer.AbsoluteMaxBytes``
     in Step 4 above. Add some buffer to account for headers; 1 MiB is more than
     enough. The following condition applies:

     ::

         Orderer.AbsoluteMaxBytes < replica.fetch.max.bytes <= message.max.bytes

     (For completeness, we note that ``message.max.bytes`` should be strictly
     smaller than ``socket.request.max.bytes``, which is set by default to 100
     MiB. If you wish to have blocks larger than 100 MiB, you will need to edit
     the hard-coded value in ``brokerConfig.Producer.MaxMessageBytes`` in
     ``fabric/orderer/kafka/config.go`` and rebuild the binary from source.
     This is not advisable.)

   * ``log.retention.ms = -1`` — Until the ordering service adds support for
     pruning of the Kafka logs, you should disable time-based retention and
     prevent segments from expiring. (Size-based retention
     — see ``log.retention.bytes`` — is disabled by default in Kafka at the time
     of this writing, so there's no need to set it explicitly.)
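
   Putting these settings together for the minimal crash-fault-tolerant
   deployment described above (``K = 4``, ``N = 3``, ``M = 2``, and ``A`` set
   to 10 MiB as in the earlier sketch), a broker's ``server.properties`` might
   contain:

   ::

       unclean.leader.election.enable=false
       min.insync.replicas=2
       default.replication.factor=3
       # A (10 MiB) plus 1 MiB of headroom for headers = 11534336 bytes
       message.max.bytes=11534336
       replica.fetch.max.bytes=11534336
       log.retention.ms=-1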

7. Orderers: **Point each OSN to the genesis block.** Edit
   ``General.BootstrapFile`` in ``orderer.yaml`` so that it points to the genesis
   block created in Step 5 above. While you are at it, ensure all other keys in
   that YAML file are set appropriately.
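
   A minimal sketch of the relevant ``orderer.yaml`` key (the path is a
   placeholder):

   ::

       General:
           BootstrapFile: /var/hyperledger/orderer/genesis.block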

8. Orderers: **Adjust polling intervals and timeouts.** (Optional step.)

   * The ``Kafka.Retry`` section in the ``orderer.yaml`` file allows you to
     adjust the frequency of the metadata/producer/consumer requests, as well as
     the socket timeouts. (These are all settings you would expect to see in a
     Kafka producer or consumer.)

   * Additionally, when a new channel is created, or when an existing channel is
     reloaded (in case of a just-restarted orderer), the orderer interacts with
     the Kafka cluster in the following ways:

     * It creates a Kafka producer (writer) for the Kafka partition that
       corresponds to the channel.

     * It uses that producer to post a no-op ``CONNECT`` message to that
       partition.

     * It creates a Kafka consumer (reader) for that partition.

     * If any of these steps fail, you can adjust the frequency with which they
       are repeated. Specifically, they will be re-attempted every
       ``Kafka.Retry.ShortInterval`` for a total of ``Kafka.Retry.ShortTotal``,
       and then every ``Kafka.Retry.LongInterval`` for a total of
       ``Kafka.Retry.LongTotal`` until they succeed. Note that the orderer will
       be unable to write to or read from a channel until all of the steps above
       have been completed successfully.
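
   A sketch of these knobs in ``orderer.yaml``, shown with the values from the
   sample configuration file:

   ::

       Kafka:
           Retry:
               ShortInterval: 5s
               ShortTotal: 10m
               LongInterval: 5m
               LongTotal: 12h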

9. **Set up the OSNs and Kafka cluster so that they communicate over SSL.**
   (Optional step, but highly recommended.) Refer to `the Confluent guide <https://docs.confluent.io/2.0.0/kafka/ssl.html>`_
   for the Kafka cluster side of the equation, and set the keys under
   ``Kafka.TLS`` in ``orderer.yaml`` on every OSN accordingly.
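
   A minimal sketch of the ``Kafka.TLS`` section in ``orderer.yaml`` (the PEM
   file paths are placeholders):

   ::

       Kafka:
           TLS:
               Enabled: true
               PrivateKey:
                   File: /etc/hyperledger/orderer/tls/client.key
               Certificate:
                   File: /etc/hyperledger/orderer/tls/client.crt
               RootCAs:
                   File: /etc/hyperledger/orderer/tls/ca.crt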

10. **Bring up the nodes in the following order: ZooKeeper ensemble, Kafka
    cluster, ordering service nodes.**

Additional considerations
-------------------------

1. **Preferred message size.** In Step 4 above (see the `Steps`_ section) you can
   also set the preferred size of blocks by setting the
   ``Orderer.BatchSize.PreferredMaxBytes`` key. Kafka offers higher throughput
   when dealing with relatively small messages; aim for a value no bigger than 1
   MiB.
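
   As a sketch in ``configtx.yaml``, using the default from the sample
   configuration file:

   ::

       Orderer:
           BatchSize:
               PreferredMaxBytes: 512 KB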

2. **Using environment variables to override settings.** When using the
   sample Kafka and ZooKeeper Docker images provided with Fabric (see
   ``images/kafka`` and ``images/zookeeper`` respectively), you can override a
   Kafka broker's or a ZooKeeper server's settings by using environment variables.
   Replace the dots of the configuration key with underscores. For example,
   ``KAFKA_UNCLEAN_LEADER_ELECTION_ENABLE=false`` will allow you to override the
   default value of ``unclean.leader.election.enable``. The same applies to the
   OSNs for their *local* configuration, i.e. what can be set in ``orderer.yaml``.
   For example, ``ORDERER_KAFKA_RETRY_SHORTINTERVAL=1s`` allows you to override the
   default value for ``Orderer.Kafka.Retry.ShortInterval``.
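
   For instance, a hypothetical ``docker-compose.yaml`` fragment for a Kafka
   broker, carrying over the settings from Step 6:

   ::

       environment:
           - KAFKA_UNCLEAN_LEADER_ELECTION_ENABLE=false
           - KAFKA_MIN_INSYNC_REPLICAS=2
           - KAFKA_DEFAULT_REPLICATION_FACTOR=3
           - KAFKA_MESSAGE_MAX_BYTES=11534336
           - KAFKA_REPLICA_FETCH_MAX_BYTES=11534336
           - KAFKA_LOG_RETENTION_MS=-1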

Kafka Protocol Version Compatibility
------------------------------------

Fabric uses the `sarama client library <https://github.com/Shopify/sarama>`_ and
vendors a version of it that supports Kafka 0.10 to 1.0, yet is still known to
work with older versions.

Using the ``Kafka.Version`` key in ``orderer.yaml``, you can configure which
version of the Kafka protocol is used to communicate with the Kafka cluster's
brokers. Kafka brokers are backward compatible with older protocol versions.
Because of this backward compatibility, upgrading your Kafka brokers to a new
version does not require an update of the ``Kafka.Version`` key value, but the
Kafka cluster might suffer a `performance
penalty <https://kafka.apache.org/documentation/#upgrade_11_message_format>`_
while using an older protocol version.
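
For example, to pin the protocol version the orderer's Kafka client speaks (the
version shown here is illustrative):

::

    Kafka:
        Version: 0.10.2.0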

Debugging
---------

Set the environment variable ``FABRIC_LOGGING_SPEC`` to ``DEBUG`` and set
``Kafka.Verbose`` to ``true`` in ``orderer.yaml``.
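
Using the environment-variable override convention described above, this can be
done in one line when launching the orderer binary:

::

    FABRIC_LOGGING_SPEC=DEBUG ORDERER_KAFKA_VERBOSE=true orderer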

.. Licensed under Creative Commons Attribution 4.0 International License
   https://creativecommons.org/licenses/by/4.0/