Bringing up a Kafka-based Ordering Service
==========================================

Caveat emptor
-------------

This document assumes that the reader generally knows how to set up a Kafka
cluster and a ZooKeeper ensemble. The purpose of this guide is to identify the
steps you need to take so that a set of Hyperledger Fabric ordering service
nodes (OSNs) can use your Kafka cluster and provide an ordering service to
your blockchain network.

Big picture
-----------

Each channel in Fabric maps to a separate single-partition topic in Kafka.
When an OSN receives transactions via the ``Broadcast`` RPC, it checks to make
sure that the broadcasting client has permission to write to the channel, then
relays (i.e. produces) those transactions to the appropriate partition in
Kafka. This partition is also consumed by the OSN, which groups the received
transactions into blocks locally, persists them in its local ledger, and
serves them to receiving clients via the ``Deliver`` RPC. For low-level
details, refer to `the document that describes how we came to this design
<https://docs.google.com/document/d/1vNMaM7XhOlu9tB_10dKnlrhy5d7b1u8lSY8a-kVjCO4/edit>`_
-- Figure 8 is a schematic representation of the process described above.

Steps
-----

Let ``K`` and ``Z`` be the number of nodes in the Kafka cluster and the
ZooKeeper ensemble respectively:

i. At a minimum, ``K`` should be set to 4. (As we will explain in Step 4
   below, this is the minimum number of nodes necessary in order to exhibit
   crash fault tolerance, i.e. with 4 brokers, you can have 1 broker go down,
   all channels will continue to be writeable and readable, and new channels
   can be created.)

ii. ``Z`` will either be 3, 5, or 7. It has to be an odd number to avoid
    split-brain scenarios, and greater than 1 to avoid a single point of
    failure. Anything beyond 7 ZooKeeper servers is considered overkill.

Proceed as follows:

1. Orderers: **Encode the Kafka-related information in the network's genesis
   block.** If you are using ``configtxgen``, edit ``configtx.yaml`` -- or
   pick a preset profile for the system channel's genesis block -- so that:

   a. ``Orderer.OrdererType`` is set to ``kafka``.

   b. ``Orderer.Kafka.Brokers`` contains the address of *at least two* of the
      Kafka brokers in your cluster in ``IP:port`` notation. The list does not
      need to be exhaustive. (These are your seed brokers.)

2. Orderers: **Set the maximum block size.** Each block will have at most
   ``Orderer.AbsoluteMaxBytes`` bytes (not including headers), a value that
   you can set in ``configtx.yaml``. Let the value you pick here be ``A`` and
   make note of it -- it will affect how you configure your Kafka brokers in
   Step 4.
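   As an illustrative sketch only -- the exact nesting of these keys in
   ``configtx.yaml`` can vary between Fabric versions, and the broker
   addresses and the 10 MB figure below are placeholders -- the settings from
   Steps 1 and 2 map to a fragment along these lines:

   ::

       Orderer:
           OrdererType: kafka
           Kafka:
               # Seed brokers; the list does not need to be exhaustive.
               Brokers:
                   - kafka0:9092
                   - kafka1:9092
           BatchSize:
               # "A", the absolute maximum block size.
               AbsoluteMaxBytes: 10 MB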
3. Orderers: **Create the genesis block.** Use ``configtxgen``. The settings
   you picked in Steps 1 and 2 above are system-wide settings, i.e. they apply
   across the network for all the OSNs. Make note of the genesis block's
   location.

4. Kafka cluster: **Configure your Kafka brokers appropriately.** Ensure that
   every Kafka broker has these keys configured:

   a. ``unclean.leader.election.enable = false`` -- Data consistency is key in
      a blockchain environment. We cannot have a channel leader chosen outside
      of the in-sync replica set, or we run the risk of overwriting the
      offsets that the previous leader produced, and, as a result, rewriting
      the blockchain that the orderers produce.

   b. ``min.insync.replicas = M`` -- Where you pick a value ``M`` such that
      1 < M < N (see ``default.replication.factor`` below). Data is considered
      committed when it is written to at least ``M`` replicas (which are then
      considered in-sync and belong to the in-sync replica set, or ISR). In
      any other case, the write operation returns an error. Then:

      i. If up to N-M replicas -- out of the ``N`` that the channel data is
         written to -- become unavailable, operations proceed normally.
      ii. If more replicas become unavailable, Kafka cannot maintain an ISR
          set of ``M``, so it stops accepting writes. Reads work without
          issues. The channel becomes writeable again when ``M`` replicas get
          in-sync.

   c. ``default.replication.factor = N`` -- Where you pick a value ``N`` such
      that N < K. A replication factor of ``N`` means that each channel will
      have its data replicated to ``N`` brokers. These are the candidates for
      the ISR set of a channel. As we noted in the ``min.insync.replicas``
      section above, not all of these brokers have to be available all the
      time. ``N`` should be set *strictly smaller* than ``K`` because channel
      creation cannot go forward if fewer than ``N`` brokers are up. So if you
      set N = K, a single broker going down means that no new channels can be
      created on the blockchain network -- the crash fault tolerance of the
      ordering service is non-existent.

   d. ``message.max.bytes`` and ``replica.fetch.max.bytes`` should be set to a
      value larger than ``A``, the value you picked in
      ``Orderer.AbsoluteMaxBytes`` in Step 2 above. Add some buffer to account
      for headers -- 1 MiB is more than enough. The following condition
      applies:

      ::

          Orderer.AbsoluteMaxBytes < replica.fetch.max.bytes <= message.max.bytes

      (For completeness, we note that ``message.max.bytes`` should be strictly
      smaller than ``socket.request.max.bytes``, which is set by default to
      100 MiB. If you wish to have blocks larger than 100 MiB you will need to
      edit the hard-coded value in ``brokerConfig.Producer.MaxMessageBytes``
      in ``fabric/orderer/kafka/config.go`` and rebuild the binary from
      source. This is not advisable.)

   e. ``log.retention.ms = -1``. Until the ordering service in Fabric adds
      support for pruning of the Kafka logs, you should disable time-based
      retention and prevent segments from expiring. (Size-based retention --
      see ``log.retention.bytes`` -- is disabled by default in Kafka at the
      time of this writing, so there's no need to set it explicitly.)

   Based on what we've described above, the minimum allowed values for ``M``
   and ``N`` are 2 and 3 respectively. This configuration allows for the
   creation of new channels to go forward, and for all channels to continue to
   be writeable.
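   As an illustrative sketch only, and assuming those minimum values (``M`` =
   2, ``N`` = 3) together with an ``A`` of 10 MiB, the relevant entries in
   each broker's ``server.properties`` would look roughly as follows (adjust
   the two byte values to your own ``A``):

   ::

       unclean.leader.election.enable=false
       min.insync.replicas=2
       default.replication.factor=3
       # A (10 MiB here) plus 1 MiB of headroom for headers
       message.max.bytes=11534336
       replica.fetch.max.bytes=11534336
       log.retention.ms=-1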
5. Orderers: **Point each OSN to the genesis block.** Edit
   ``General.GenesisFile`` in ``orderer.yaml`` so that it points to the
   genesis block created in Step 3 above. (While at it, ensure all other keys
   in that YAML file are set appropriately.)

6. Orderers: **Adjust polling intervals and timeouts.** (Optional step.)

   a. The ``Kafka.Retry`` section in the ``orderer.yaml`` file allows you to
      adjust the frequency of the metadata/producer/consumer requests, as well
      as the socket timeouts. (These are all settings you would expect to see
      in a Kafka producer or consumer.)

   b. Additionally, when a new channel is created, or when an existing channel
      is reloaded (in case of a just-restarted orderer), the orderer interacts
      with the Kafka cluster in the following ways:

      a. It creates a Kafka producer (writer) for the Kafka partition that
         corresponds to the channel.

      b. It uses that producer to post a no-op ``CONNECT`` message to that
         partition.

      c. It creates a Kafka consumer (reader) for that partition.

      If any of these steps fail, you can adjust the frequency with which they
      are repeated. Specifically they will be re-attempted every
      ``Kafka.Retry.ShortInterval`` for a total of ``Kafka.Retry.ShortTotal``,
      and then every ``Kafka.Retry.LongInterval`` for a total of
      ``Kafka.Retry.LongTotal`` until they succeed. Note that the orderer will
      be unable to write to or read from a channel until all of the steps
      above have been completed successfully.

7. **Set up the OSNs and Kafka cluster so that they communicate over SSL.**
   (Optional step, but highly recommended.) Refer to `the Confluent guide
   <http://docs.confluent.io/2.0.0/kafka/ssl.html>`_ for the Kafka cluster
   side of the equation, and set the keys under ``Kafka.TLS`` in
   ``orderer.yaml`` on every OSN accordingly.

8. **Bring up the nodes in the following order: ZooKeeper ensemble, Kafka
   cluster, ordering service nodes.**

Additional considerations
-------------------------

1. **Preferred message size.** In Step 2 above (see the `Steps`_ section) you
   can also set the preferred size of blocks by setting the
   ``Orderer.BatchSize.PreferredMaxBytes`` key. Kafka offers higher throughput
   when dealing with relatively small messages; aim for a value no bigger than
   1 MiB.

2. **Using environment variables to override settings.** You can override a
   Kafka broker or a ZooKeeper server's settings by using environment
   variables. Replace the dots of the configuration key with underscores --
   e.g. ``KAFKA_UNCLEAN_LEADER_ELECTION_ENABLE=false`` will allow you to
   override the default value of ``unclean.leader.election.enable``. The same
   applies to the OSNs for their *local* configuration, i.e. what can be set
   in ``orderer.yaml``. For example, ``ORDERER_KAFKA_RETRY_SHORTINTERVAL=1s``
   allows you to override the default value of
   ``Orderer.Kafka.Retry.ShortInterval``.

Supported Kafka versions and upgrading
--------------------------------------

Supported Kafka versions for v1 are ``0.9`` and ``0.10``. (Fabric uses the
`sarama client library <https://github.com/Shopify/sarama>`_ and vendors a
version of it that supports Kafka 0.9 and 0.10.)

Out of the box the Kafka version defaults to ``0.9.0.1``. If you wish to use a
different supported version, you will have to edit the source code (modify the
``Version`` field of the ``defaults`` struct in
``orderer/localconfig/config.go``) and rebuild the ``orderer`` binary. For
example, if you wish to run the ordering service against a Kafka cluster
running ``0.10.0.1``, you would edit the file like so:

::

    ...
    Verbose: false,
    Version: sarama.V0_10_0_1,
    TLS: TLS{
    ...

And then rebuild the binary. (This process will be improved with
`FAB-4619 <https://jira.hyperledger.org/browse/FAB-4619>`_.)

Debugging
---------

In ``orderer.yaml``, set ``General.LogLevel`` to ``DEBUG`` and
``Kafka.Verbose`` to ``true``.
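As a minimal sketch, assuming the default layout of ``orderer.yaml`` (all
other keys omitted), those two settings correspond to:

::

    General:
        LogLevel: DEBUG
    Kafka:
        Verbose: true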
Example
-------

Sample Docker Compose configuration files in line with the recommended
settings above can be found under the ``fabric/bddtests`` directory. Look for
``dc-orderer-kafka-base.yml`` and ``dc-orderer-kafka.yml``.

.. Licensed under Creative Commons Attribution 4.0 International License
   https://creativecommons.org/licenses/by/4.0/