Private Data
============

.. note:: This topic assumes an understanding of the conceptual material in the
          `documentation on private data <private-data/private-data.html>`_.

Private data collection definition
----------------------------------

A collection definition contains one or more collections, each having a policy
definition listing the organizations in the collection, as well as properties
used to control dissemination of private data at endorsement time and,
optionally, whether the data will be purged.

Beginning with the Fabric chaincode lifecycle introduced with Fabric v2.0, the
collection definition is part of the chaincode definition. The chaincode definition,
including its collection definition, must be approved by the required channel members, and
then becomes effective when the chaincode definition is committed to the channel.
The collection definition that is approved must be identical for each of the required
channel members. When using the peer CLI to approve and commit the
chaincode definition, use the ``--collections-config`` flag to specify the path
to the collection definition file.

Collection definitions are composed of the following properties:

* ``name``: Name of the collection.

* ``policy``: The private data collection distribution policy defines which
  organizations' peers are allowed to retrieve and persist the collection data.
  It is expressed using the ``Signature`` policy syntax, with each member being
  included in an ``OR`` signature policy list. To support read/write transactions,
  the private data distribution policy must define a broader set of organizations than the
  endorsement policy, as peers must have the private data in order to endorse
  proposed transactions.
  For example, in a channel with ten organizations,
  five of the organizations might be included in a private data collection
  distribution policy, but the endorsement policy might call for any three
  of the organizations in the channel to endorse a read/write transaction.
  For write-only transactions, organizations that are not members of the
  collection distribution policy but are included in the chaincode level
  endorsement policy may endorse transactions that write to the private data
  collection. If this is not desirable, utilize a collection level
  ``endorsementPolicy`` to restrict the set of allowed endorsers to the private data
  distribution policy members.

* ``requiredPeerCount``: Minimum number of peers (across authorized organizations)
  that each endorsing peer must successfully disseminate private data to before the
  peer signs the endorsement and returns the proposal response back to the client.
  Requiring dissemination as a condition of endorsement will ensure that private data
  is available in the network even if the endorsing peer(s) become unavailable. When
  ``requiredPeerCount`` is ``0``, it means that no distribution is **required**,
  but there may be some distribution if ``maxPeerCount`` is greater than zero. A
  ``requiredPeerCount`` of ``0`` would typically not be recommended, as it could
  lead to loss of private data in the network if the endorsing peer(s) become unavailable.
  Typically you would want to require at least some distribution of the private
  data at endorsement time to ensure redundancy of the private data on multiple
  peers in the network.

* ``maxPeerCount``: For data redundancy purposes, the maximum number of other
  peers (across authorized organizations) that each endorsing peer will attempt
  to distribute the private data to.
  If an endorsing peer becomes unavailable between
  endorsement time and commit time, other peers that are collection members but
  did not yet receive the private data at endorsement time will be able to pull
  the private data from peers the private data was disseminated to. If this value
  is set to ``0``, the private data is not disseminated at endorsement time,
  forcing private data pulls against endorsing peers on all authorized peers at
  commit time.

* ``blockToLive``: Represents how long the data should live on the private
  database in terms of blocks. The data will live for this specified number of
  blocks on the private database and after that it will get purged, making this
  data obsolete from the network so that it cannot be queried from chaincode,
  and cannot be made available to requesting peers. To keep private data
  indefinitely, that is, to never purge private data, set the ``blockToLive``
  property to ``0``.

* ``memberOnlyRead``: a value of ``true`` indicates that peers automatically
  enforce that only clients belonging to one of the collection member organizations
  are allowed read access to private data. If a client from a non-member org
  attempts to execute a chaincode function that performs a read of a private data key,
  the chaincode invocation is terminated with an error. Utilize a value of
  ``false`` if you would like to encode more granular access control within
  individual chaincode functions.

* ``memberOnlyWrite``: a value of ``true`` indicates that peers automatically
  enforce that only clients belonging to one of the collection member organizations
  are allowed to write private data from chaincode. If a client from a non-member org
  attempts to execute a chaincode function that performs a write on a private data key,
  the chaincode invocation is terminated with an error.
  Utilize a value of
  ``false`` if you would like to encode more granular access control within
  individual chaincode functions, for example you may want certain clients
  from non-member organizations to be able to create private data in a certain
  collection.

* ``endorsementPolicy``: An optional endorsement policy to utilize for the
  collection that overrides the chaincode level endorsement policy. A
  collection level endorsement policy may be specified in the form of a
  ``signaturePolicy`` or may be a ``channelConfigPolicy`` reference to
  an existing policy from the channel configuration. The ``endorsementPolicy``
  may be the same as the collection distribution ``policy``, or may require
  fewer or additional organization peers.

Here is a sample collection definition JSON file, containing an array of two
collection definitions:

.. code:: json

    [
      {
        "name": "collectionMarbles",
        "policy": "OR('Org1MSP.member', 'Org2MSP.member')",
        "requiredPeerCount": 0,
        "maxPeerCount": 3,
        "blockToLive": 1000000,
        "memberOnlyRead": true,
        "memberOnlyWrite": true
      },
      {
        "name": "collectionMarblePrivateDetails",
        "policy": "OR('Org1MSP.member')",
        "requiredPeerCount": 0,
        "maxPeerCount": 3,
        "blockToLive": 3,
        "memberOnlyRead": true,
        "memberOnlyWrite": true,
        "endorsementPolicy": {
          "signaturePolicy": "OR('Org1MSP.member')"
        }
      }
    ]

This example uses the organizations from the Fabric test network, ``Org1`` and
``Org2``. The policy in the ``collectionMarbles`` definition authorizes both
organizations to access the private data. This is a typical configuration when the
chaincode data needs to remain private from the ordering service nodes. However,
the policy in the ``collectionMarblePrivateDetails`` definition restricts access
to a subset of organizations in the channel (in this case ``Org1``).
Additionally,
writing to this collection requires endorsement from an ``Org1`` peer, even
though the chaincode level endorsement policy may require endorsement from
``Org1`` or ``Org2``. And since ``memberOnlyWrite`` is ``true``, only clients from
``Org1`` may invoke chaincode that writes to the private data collection.
In this way you can control which organizations are entrusted to write to certain
private data collections.

Implicit private data collections
---------------------------------

In addition to explicitly defined private data collections,
every chaincode has an implicit private data namespace reserved for organization-specific
private data. These implicit organization-specific private data collections can
be used to store an individual organization's private data, and do not need to
be defined explicitly.

The private data dissemination policy and the endorsement policy for an implicit
organization-specific collection are both the respective organization itself.
The implication is that if data exists in an implicit private data collection,
it was endorsed by the respective organization. Implicit private data collections
can therefore be used by an organization to record its agreement or vote
for some fact, which is a useful pattern to leverage in multi-party business
processes implemented in chaincode since other organizations can check
the on-chain hash to verify the organization's record. Private data
can also be shared or transferred to an implicit collection of another organization,
making implicit collections a useful pattern to leverage in chaincode
applications, without the need to explicitly manage collection definitions.

Since implicit private data collections are not explicitly defined,
it is not possible to set the additional collection properties.
Specifically,
``memberOnlyRead`` and ``memberOnlyWrite`` are not available,
meaning that access control for clients reading data from or writing data to
an implicit private data collection must be encoded in the chaincode on the organization's peer.
Furthermore, ``blockToLive`` is not available, meaning that private data is never automatically purged.

The properties ``requiredPeerCount`` and ``maxPeerCount`` can however be set in the peer's ``core.yaml``
(``peer.gossip.pvtData.implicitCollectionDisseminationPolicy.requiredPeerCount`` and
``peer.gossip.pvtData.implicitCollectionDisseminationPolicy.maxPeerCount``). An organization
can set these properties based on the number of peers that they deploy, as described
in the next section.

.. note:: Since implicit private data collections are not explicitly defined,
          it is not possible to associate CouchDB indexes with them. Utilize
          key-based queries and key-range queries rather than JSON queries.

Private data dissemination
--------------------------

Since private data is not included in the transactions that get submitted to
the ordering service, and therefore not included in the blocks that get distributed
to all peers in a channel, the endorsing peer plays an important role in
disseminating private data to other peers of authorized organizations. This ensures
the availability of private data in the channel's collection, even if endorsing
peers become unavailable after their endorsement. To assist with this dissemination,
the ``maxPeerCount`` and ``requiredPeerCount`` properties
control the degree of dissemination at endorsement time.

If the endorsing peer cannot successfully disseminate the private data to at least
the ``requiredPeerCount``, it will return an error back to the client.
The endorsing
peer will attempt to disseminate the private data to peers of different organizations,
in an effort to ensure that each authorized organization has a copy of the private
data. Since transactions are not committed at chaincode execution time, the endorsing
peer and recipient peers store a copy of the private data in a local ``transient store``
alongside their blockchain until the transaction is committed.

When authorized peers do not have a copy of the private data in their transient
data store at commit time (either because they were not an endorsing peer or because
they did not receive the private data via dissemination at endorsement time),
they will attempt to pull the private data from another authorized
peer, *for a configurable amount of time* based on the peer property
``peer.gossip.pvtData.pullRetryThreshold`` in the peer configuration ``core.yaml``
file.

.. note:: The peers being asked for private data will only return the private data
          if the requesting peer is a member of the collection as defined by the
          private data dissemination policy.

Considerations when using ``pullRetryThreshold``:

* If the requesting peer is able to retrieve the private data within the
  ``pullRetryThreshold``, it will commit the transaction to its ledger
  (including the private data hash), and store the private data in its
  state database, logically separated from other channel state data.

* If the requesting peer is not able to retrieve the private data within
  the ``pullRetryThreshold``, it will commit the transaction to its blockchain
  (including the private data hash), without the private data.

* If the peer was entitled to the private data but it is missing, then
  that peer will not be able to endorse future transactions that reference
  the missing private data. A chaincode query for a key that is missing will
  be detected (based on the presence of the key's hash in the state database),
  and the chaincode will receive an error.

Therefore, it is important to set the ``requiredPeerCount`` and ``maxPeerCount``
properties large enough to ensure the availability of private data in your
channel. For example, if all of the endorsing peers become unavailable
before the transaction commits, the ``requiredPeerCount`` and ``maxPeerCount``
properties will have ensured the private data is available on other peers.

.. note:: For collections to work, it is important to have cross organizational
          gossip configured correctly. Refer to our documentation on :doc:`gossip`,
          paying particular attention to the "anchor peers" and "external endpoint"
          configuration.

Referencing collections from chaincode
--------------------------------------

A set of `shim APIs <https://godoc.org/github.com/hyperledger/fabric-chaincode-go/shim>`_
is available for setting and retrieving private data.

The same chaincode data operations can be applied to channel state data and
private data, but in the case of private data, a collection name is specified
along with the data in the chaincode APIs, for example
``PutPrivateData(collection,key,value)`` and ``GetPrivateData(collection,key)``.

A single chaincode can reference multiple collections.

Referencing implicit collections from chaincode
-----------------------------------------------

Starting in v2.0, an implicit private data collection can be used for each
organization in a channel, so that you don't have to define collections if you'd
like to utilize per-organization collections.
Each org-specific implicit collection
has a distribution policy and endorsement policy of the matching organization.
You can therefore utilize implicit collections for use cases where you'd like
to ensure that a specific organization has written to a collection key namespace.
The v2.0 chaincode lifecycle uses implicit collections to track which organizations
have approved a chaincode definition. Similarly, you can use implicit collections
in application chaincode to track which organizations have approved or voted
for some change in state.

To write and read an implicit private data collection key, in the ``PutPrivateData``
and ``GetPrivateData`` chaincode APIs, specify the collection parameter as
``"_implicit_org_<MSPID>"``, for example ``"_implicit_org_Org1MSP"``.

.. note:: Application defined collection names are not allowed to start with an underscore,
          therefore there is no chance for an implicit collection name to collide
          with an application defined collection name.

How to pass private data in a chaincode proposal
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since the chaincode proposal gets stored on the blockchain, it is also important
not to include private data in the main part of the chaincode proposal. A special
field in the chaincode proposal called the ``transient`` field can be used to pass
private data from the client (or data that chaincode will use to generate private
data), to chaincode invocation on the peer. The chaincode can retrieve the
``transient`` field by calling the `GetTransient() API <https://godoc.org/github.com/hyperledger/fabric-chaincode-go/shim#ChaincodeStub.GetTransient>`_.
This ``transient`` field gets excluded from the channel transaction.

Protecting private data content
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If the private data is relatively simple and predictable (e.g.
transaction dollar
amount), channel members who are not authorized to the private data collection
could try to guess the content of the private data via brute force hashing of
the domain space, in hopes of finding a match with the private data hash on the
chain. Private data that is predictable should therefore include a random "salt"
that is concatenated with the private data key and included in the private data
value, so that a matching hash cannot realistically be found via brute force.
The random "salt" can be generated at the client side (e.g. by sampling a secure
pseudo-random source) and then passed along with the private data in the transient
field at the time of chaincode invocation.

Protecting private data responses
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Chaincode can return any data to a client application in the proposal response payload field.
For read-only chaincode functions that query private data and which will not get submitted as transactions to the ordering service,
private data may be returned in the proposal response payload field to the requesting client.
For chaincode functions that propose private data writes however, take care not to include
private data in the proposal response payload field, since this field will get
included in the transaction which all channel members can access.

Access control for private data
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Until version 1.3, access control to private data based on collection membership
was enforced for peers only. Access control based on the organization of the
chaincode proposal submitter was required to be encoded in chaincode logic.
Collection configuration options ``memberOnlyRead`` (since version v1.4) and
``memberOnlyWrite`` (since version v2.0) can automatically enforce that the chaincode
proposal submitter must be from a collection member in order to read or write
private data keys. For more information about collection
configuration definitions and how to set them, refer back to the
`Private data collection definition`_ section of this topic.

.. note:: If you would like more granular access control, you can set
          ``memberOnlyRead`` and ``memberOnlyWrite`` to false (implicit collections always
          behave as if ``memberOnlyRead`` and ``memberOnlyWrite`` are false). You can then apply your
          own access control logic in chaincode, for example by calling the ``GetCreator()``
          chaincode API or using the client identity
          `chaincode library <https://godoc.org/github.com/hyperledger/fabric-chaincode-go/shim#ChaincodeStub.GetCreator>`__.

Querying Private Data
~~~~~~~~~~~~~~~~~~~~~

Private data collections can be queried just like normal channel data, using
shim APIs:

* ``GetPrivateDataByRange(collection, startKey, endKey string)``
* ``GetPrivateDataByPartialCompositeKey(collection, objectType string, keys []string)``

And if using explicit private data collections and CouchDB state database,
JSON content queries can be passed using the shim API:

* ``GetPrivateDataQueryResult(collection, query string)``

Limitations:

* Clients that call chaincode that executes key range queries or JSON queries should be aware
  that they may receive a subset of the result set if the peer they query has missing
  private data, as explained in the `Private data dissemination`_ section
  above. Clients can query multiple peers and compare the results to
  determine if a peer may be missing some of the result set.
* Chaincode that executes key range queries or JSON queries and updates data in a single
  transaction is not supported, as the query results cannot be validated on the peers
  that don't have access to the private data, or on peers that are missing the
  private data that they have access to. If a chaincode invocation both queries
  and updates private data, the proposal request will return an error. If your application
  can tolerate result set changes between chaincode execution and validation/commit time,
  then you could call one chaincode function to perform the query, and then call a second
  chaincode function to make the updates. Note that calls to ``GetPrivateData()`` to retrieve
  individual keys can be made in the same transaction as ``PutPrivateData()`` calls, since
  all peers can validate key reads based on the hashed key version.
* Since implicit private data collections are not explicitly defined,
  it is not possible to associate CouchDB indexes with them.
  It is therefore not recommended to utilize JSON queries with implicit private data collections.

Using Indexes with collections
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The topic :doc:`couchdb_as_state_database` describes indexes that can be
applied to the channel's state database to enable JSON content queries, by
packaging indexes in a ``META-INF/statedb/couchdb/indexes`` directory at chaincode
installation time. Similarly, indexes can also be applied to private data
collections that are explicitly defined, by packaging indexes in a ``META-INF/statedb/couchdb/collections/<collection_name>/indexes``
directory. An example index is available `here <https://github.com/hyperledger/fabric-samples/blob/{BRANCH}/chaincode/marbles02_private/go/META-INF/statedb/couchdb/collections/collectionMarbles/indexes/indexOwner.json>`_.

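
As a rough illustration of the collection-level index packaging described above,
the following is a minimal sketch of an index definition that could be placed in a
``META-INF/statedb/couchdb/collections/collectionMarbles/indexes`` directory.
The ``docType`` and ``owner`` field names, and the ``indexOwnerDoc`` and ``indexOwner``
names, are assumptions chosen for illustration rather than values this topic
prescribes. Substitute the fields that your own JSON queries filter on.

.. code:: json

    {
      "index": {
        "fields": ["docType", "owner"]
      },
      "ddoc": "indexOwnerDoc",
      "name": "indexOwner",
      "type": "json"
    }

Under these assumptions, a JSON query passed to ``GetPrivateDataQueryResult``
that selects on ``docType`` and ``owner`` could be served from this index
instead of scanning the whole collection.
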

Considerations when using private data
--------------------------------------

Private data purging
~~~~~~~~~~~~~~~~~~~~

Private data in explicitly defined private data collections can be periodically purged from peers.
For more details, see the ``blockToLive`` collection definition property above.

Additionally, recall that prior to commit, peers store private data in a local
transient data store. This data automatically gets purged when the transaction
commits. But if a transaction was never submitted to the channel and
therefore never committed, the private data would remain in each peer's
transient store. This data is purged from the transient store after a
configurable number of blocks, set by the peer's
``peer.gossip.pvtData.transientstoreMaxBlockRetention`` property in the peer
``core.yaml`` file.

Updating a collection definition
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To update a collection definition or add a new collection, you can update
the chaincode definition and pass the new collection configuration
in the chaincode approve and commit transactions, for example using the ``--collections-config``
flag if using the CLI. If a collection configuration is specified when updating
the chaincode definition, a definition for each of the existing collections must be
included.

When updating a chaincode definition, you can add new private data collections,
and update existing private data collections, for example to add new
members to an existing collection or change one of the collection definition
properties. Note that you cannot update the collection name or the
``blockToLive`` property, since a consistent ``blockToLive`` is required
regardless of a peer's block height.

Collection updates become effective when a peer commits the block with the updated
chaincode definition.
Note that collections cannot be
deleted, as there may be prior private data hashes on the channel's blockchain
that cannot be removed.

Private data reconciliation
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Starting in v1.4, peers of organizations that are added to an existing collection
will automatically fetch private data that was committed to the collection before
they joined the collection.

This private data "reconciliation" also applies to peers that
were entitled to receive private data but did not yet receive it, for example
because of a network failure. The peer keeps track of private data that was "missing"
at the time of block commit.

Private data reconciliation occurs periodically based on the
``peer.gossip.pvtData.reconciliationEnabled`` and ``peer.gossip.pvtData.reconcileSleepInterval``
properties in ``core.yaml``. The peer will periodically attempt to fetch the private
data from other collection member peers that are expected to have it.

Note that this private data reconciliation feature only works on peers running
v1.4 or later of Fabric.

.. Licensed under Creative Commons Attribution 4.0 International License
   https://creativecommons.org/licenses/by/4.0/