Private Data
============

.. note:: This topic assumes an understanding of the conceptual material in the
          `documentation on private data <private-data/private-data.html>`_.

Private data collection definition
----------------------------------

A collection definition contains one or more collections, each having a policy
definition listing the organizations in the collection, as well as properties
used to control dissemination of private data at endorsement time and,
optionally, whether the data will be purged.

Beginning with the Fabric chaincode lifecycle introduced with Fabric v2.0, the
collection definition is part of the chaincode definition. The chaincode definition,
including the collection definition, must be approved by the required channel members,
and then becomes effective when the chaincode definition is committed to the channel.
The collection definition that is approved must be identical for each of the required
channel members. When using the peer CLI to approve and commit the
chaincode definition, use the ``--collections-config`` flag to specify the path
to the collection definition file.

Collection definitions are composed of the following properties:

* ``name``: Name of the collection.

* ``policy``: The private data collection distribution policy defines which
  organizations' peers are allowed to retrieve and persist the collection data.
  It is expressed using the ``Signature`` policy syntax, with each member being
  included in an ``OR`` signature policy list. To support read/write transactions,
  the private data distribution policy must define a broader set of organizations
  than the endorsement policy, as peers must have the private data in order to endorse
  proposed transactions. For example, in a channel with ten organizations,
  five of the organizations might be included in a private data collection
  distribution policy, but the endorsement policy might call for any three
  of the organizations in the channel to endorse a read/write transaction.
  For write-only transactions, organizations that are not members of the
  collection distribution policy but are included in the chaincode level
  endorsement policy may endorse transactions that write to the private data
  collection. If this is not desirable, use a collection level
  ``endorsementPolicy`` to restrict the set of allowed endorsers to the private data
  distribution policy members.

* ``requiredPeerCount``: Minimum number of peers (across authorized organizations)
  that each endorsing peer must successfully disseminate private data to before the
  peer signs the endorsement and returns the proposal response back to the client.
  Requiring dissemination as a condition of endorsement ensures that private data
  is available in the network even if the endorsing peer(s) become unavailable. A
  ``requiredPeerCount`` of ``0`` means that no distribution is **required**,
  but there may be some distribution if ``maxPeerCount`` is greater than zero. A
  ``requiredPeerCount`` of ``0`` is typically not recommended, as it could
  lead to loss of private data in the network if the endorsing peer(s) become unavailable.
  Typically you would want to require at least some distribution of the private
  data at endorsement time to ensure redundancy of the private data on multiple
  peers in the network.

* ``maxPeerCount``: For data redundancy purposes, the maximum number of other
  peers (across authorized organizations) that each endorsing peer will attempt
  to distribute the private data to. If an endorsing peer becomes unavailable between
  endorsement time and commit time, other peers that are collection members but
  did not receive the private data at endorsement time will be able to pull
  the private data from peers the private data was disseminated to. If this value
  is set to ``0``, the private data is not disseminated at endorsement time,
  forcing all authorized peers to pull the private data from the endorsing peers
  at commit time.

* ``blockToLive``: How long the data should live in the private database,
  expressed as a number of blocks. The data lives in the private database for
  this number of blocks and is then purged, after which it cannot be queried
  from chaincode and cannot be made available to requesting peers. To keep
  private data indefinitely, that is, to never purge private data, set the
  ``blockToLive`` property to ``0``.

* ``memberOnlyRead``: A value of ``true`` indicates that peers automatically
  enforce that only clients belonging to one of the collection member organizations
  are allowed read access to private data. If a client from a non-member org
  attempts to execute a chaincode function that performs a read of a private data key,
  the chaincode invocation is terminated with an error. Use a value of
  ``false`` if you would like to encode more granular access control within
  individual chaincode functions.

* ``memberOnlyWrite``: A value of ``true`` indicates that peers automatically
  enforce that only clients belonging to one of the collection member organizations
  are allowed to write private data from chaincode. If a client from a non-member org
  attempts to execute a chaincode function that performs a write on a private data key,
  the chaincode invocation is terminated with an error. Use a value of
  ``false`` if you would like to encode more granular access control within
  individual chaincode functions. For example, you may want certain clients
  from non-member organizations to be able to create private data in a certain
  collection.

* ``endorsementPolicy``: An optional endorsement policy to use for the
  collection that overrides the chaincode level endorsement policy. A
  collection level endorsement policy may be specified in the form of a
  ``signaturePolicy`` or may be a ``channelConfigPolicy`` reference to
  an existing policy from the channel configuration. The ``endorsementPolicy``
  may be the same as the collection distribution ``policy``, or may require
  fewer or additional organization peers.

Here is a sample collection definition JSON file, containing an array of two
collection definitions:

.. code:: json

 [
   {
     "name": "collectionMarbles",
     "policy": "OR('Org1MSP.member', 'Org2MSP.member')",
     "requiredPeerCount": 0,
     "maxPeerCount": 3,
     "blockToLive": 1000000,
     "memberOnlyRead": true,
     "memberOnlyWrite": true
   },
   {
     "name": "collectionMarblePrivateDetails",
     "policy": "OR('Org1MSP.member')",
     "requiredPeerCount": 0,
     "maxPeerCount": 3,
     "blockToLive": 3,
     "memberOnlyRead": true,
     "memberOnlyWrite": true,
     "endorsementPolicy": {
       "signaturePolicy": "OR('Org1MSP.member')"
     }
   }
 ]

This example uses the organizations from the Fabric test network, ``Org1`` and
``Org2``. The policy in the ``collectionMarbles`` definition authorizes both
organizations to the private data. This is a typical configuration when the
chaincode data needs to remain private from the ordering service nodes. However,
the policy in the ``collectionMarblePrivateDetails`` definition restricts access
to a subset of organizations in the channel (in this case ``Org1``). Additionally,
writing to this collection requires endorsement from an ``Org1`` peer, even
though the chaincode level endorsement policy may require endorsement from
``Org1`` or ``Org2``. And since ``memberOnlyWrite`` is ``true``, only clients from
``Org1`` may invoke chaincode that writes to the private data collection.
In this way you can control which organizations are entrusted to write to certain
private data collections.
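
Before passing a file to ``--collections-config``, it can help to sanity-check it
locally. The following is a minimal sketch in Go, assuming a hypothetical helper
named ``ParseCollections``; the struct and the checks are illustrative only and are
not Fabric's own validation logic. The check that ``requiredPeerCount`` does not
exceed ``maxPeerCount`` is an assumption about what a sensible definition looks like.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// CollectionConfig mirrors the JSON properties described in this section.
// The optional endorsementPolicy property is ignored by this sketch.
type CollectionConfig struct {
	Name              string `json:"name"`
	Policy            string `json:"policy"`
	RequiredPeerCount int    `json:"requiredPeerCount"`
	MaxPeerCount      int    `json:"maxPeerCount"`
	BlockToLive       uint64 `json:"blockToLive"`
	MemberOnlyRead    bool   `json:"memberOnlyRead"`
	MemberOnlyWrite   bool   `json:"memberOnlyWrite"`
}

// ParseCollections (hypothetical helper) decodes a collection definition file
// and applies a few illustrative sanity checks.
func ParseCollections(data []byte) ([]CollectionConfig, error) {
	var colls []CollectionConfig
	if err := json.Unmarshal(data, &colls); err != nil {
		return nil, fmt.Errorf("invalid collection JSON: %w", err)
	}
	for _, c := range colls {
		// Application defined names must not start with an underscore.
		if c.Name == "" || strings.HasPrefix(c.Name, "_") {
			return nil, fmt.Errorf("collection %q: invalid name", c.Name)
		}
		// Assumed rule: the required count cannot exceed the maximum count.
		if c.RequiredPeerCount < 0 || c.RequiredPeerCount > c.MaxPeerCount {
			return nil, fmt.Errorf("collection %q: requiredPeerCount must be between 0 and maxPeerCount", c.Name)
		}
	}
	return colls, nil
}

func main() {
	sample := []byte(`[{
		"name": "collectionMarbles",
		"policy": "OR('Org1MSP.member', 'Org2MSP.member')",
		"requiredPeerCount": 0, "maxPeerCount": 3, "blockToLive": 1000000,
		"memberOnlyRead": true, "memberOnlyWrite": true
	}]`)
	colls, err := ParseCollections(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range colls {
		if c.RequiredPeerCount == 0 {
			fmt.Printf("warning: %s does not require dissemination at endorsement time\n", c.Name)
		}
	}
}
```

Run against the sample definition, a check like this would flag both collections,
since each sets ``requiredPeerCount`` to ``0``.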

Implicit private data collections
---------------------------------

In addition to explicitly defined private data collections,
every chaincode has an implicit private data namespace reserved for organization-specific
private data. These implicit organization-specific private data collections can
be used to store an individual organization's private data, and do not need to
be defined explicitly.

For an implicit organization-specific collection, both the private data
dissemination policy and the endorsement policy are the respective organization itself.
The implication is that if data exists in an implicit private data collection,
it was endorsed by the respective organization. Implicit private data collections
can therefore be used by an organization to record its agreement or vote
for some fact, which is a useful pattern to leverage in multi-party business
processes implemented in chaincode, since other organizations can check
the on-chain hash to verify the organization's record. Private data
can also be shared or transferred to an implicit collection of another organization,
making implicit collections a useful pattern to leverage in chaincode
applications, without the need to explicitly manage collection definitions.

Since implicit private data collections are not explicitly defined,
it is not possible to set the additional collection properties. Specifically,
``memberOnlyRead`` and ``memberOnlyWrite`` are not available,
meaning that access control for clients reading data from or writing data to
an implicit private data collection must be encoded in the chaincode on the organization's peer.
Furthermore, ``blockToLive`` is not available, meaning that private data is never automatically purged.

The properties ``requiredPeerCount`` and ``maxPeerCount`` can, however, be set in the peer's ``core.yaml``
(``peer.gossip.pvtData.implicitCollectionDisseminationPolicy.requiredPeerCount`` and
``peer.gossip.pvtData.implicitCollectionDisseminationPolicy.maxPeerCount``). An organization
can set these properties based on the number of peers that it deploys, as described
in the next section.

.. note:: Since implicit private data collections are not explicitly defined,
          it is not possible to associate CouchDB indexes with them. Use
          key-based queries and key-range queries rather than JSON queries.

Private data dissemination
--------------------------

Since private data is not included in the transactions that get submitted to
the ordering service, and therefore not included in the blocks that get distributed
to all peers in a channel, the endorsing peer plays an important role in
disseminating private data to other peers of authorized organizations. This ensures
the availability of private data in the channel's collection, even if endorsing
peers become unavailable after their endorsement. To assist with this dissemination,
the ``maxPeerCount`` and ``requiredPeerCount`` properties
control the degree of dissemination at endorsement time.

If the endorsing peer cannot successfully disseminate the private data to at least
``requiredPeerCount`` peers, it will return an error back to the client. The endorsing
peer will attempt to disseminate the private data to peers of different organizations,
in an effort to ensure that each authorized organization has a copy of the private
data. Since transactions are not committed at chaincode execution time, the endorsing
peer and recipient peers store a copy of the private data in a local ``transient store``
alongside their blockchain until the transaction is committed.

When authorized peers do not have a copy of the private data in their transient
data store at commit time (either because they were not an endorsing peer or because
they did not receive the private data via dissemination at endorsement time),
they will attempt to pull the private data from another authorized
peer, *for a configurable amount of time* based on the peer property
``peer.gossip.pvtData.pullRetryThreshold`` in the peer configuration ``core.yaml``
file.

.. note:: The peers being asked for private data will only return the private data
          if the requesting peer is a member of the collection as defined by the
          private data dissemination policy.

Considerations when using ``pullRetryThreshold``:

* If the requesting peer is able to retrieve the private data within the
  ``pullRetryThreshold``, it will commit the transaction to its ledger
  (including the private data hash), and store the private data in its
  state database, logically separated from other channel state data.

* If the requesting peer is not able to retrieve the private data within
  the ``pullRetryThreshold``, it will commit the transaction to its blockchain
  (including the private data hash), without the private data.

* If the peer was entitled to the private data but it is missing, then
  that peer will not be able to endorse future transactions that reference
  the missing private data. A chaincode query for a key that is missing will
  be detected (based on the presence of the key’s hash in the state database),
  and the chaincode will receive an error.

Therefore, it is important to set the ``requiredPeerCount`` and ``maxPeerCount``
properties large enough to ensure the availability of private data in your
channel. For example, if each of the endorsing peers becomes unavailable
before the transaction commits, the ``requiredPeerCount`` and ``maxPeerCount``
properties will have ensured the private data is available on other peers.

.. note:: For collections to work, it is important to have cross organizational
          gossip configured correctly. Refer to our documentation on :doc:`gossip`,
          paying particular attention to the "anchor peers" and "external endpoint"
          configuration.

Referencing collections from chaincode
--------------------------------------

A set of `shim APIs <https://godoc.org/github.com/hyperledger/fabric-chaincode-go/shim>`_
is available for setting and retrieving private data.

The same chaincode data operations can be applied to channel state data and
private data, but in the case of private data, a collection name is specified
along with the data in the chaincode APIs, for example
``PutPrivateData(collection,key,value)`` and ``GetPrivateData(collection,key)``.

A single chaincode can reference multiple collections.

Referencing implicit collections from chaincode
-----------------------------------------------

Starting in v2.0, an implicit private data collection can be used for each
organization in a channel, so that you don't have to define collections if you'd
like to use per-organization collections. Each org-specific implicit collection
has a distribution policy and endorsement policy of the matching organization.
You can therefore use implicit collections for use cases where you'd like
to ensure that a specific organization has written to a collection key namespace.
The v2.0 chaincode lifecycle uses implicit collections to track which organizations
have approved a chaincode definition. Similarly, you can use implicit collections
in application chaincode to track which organizations have approved or voted
for some change in state.

To write and read an implicit private data collection key, in the ``PutPrivateData``
and ``GetPrivateData`` chaincode APIs, specify the collection parameter as
``"_implicit_org_<MSPID>"``, for example ``"_implicit_org_Org1MSP"``.

.. note:: Application defined collection names are not allowed to start with an underscore,
          therefore there is no chance for an implicit collection name to collide
          with an application defined collection name.
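
The naming convention above can be captured in a pair of small helpers. This is a
minimal sketch; ``ImplicitCollectionName`` and ``IsImplicitCollection`` are
hypothetical names introduced for illustration and are not part of the shim API.

```go
package main

import (
	"fmt"
	"strings"
)

// implicitPrefix is the prefix used for org-specific implicit collections.
const implicitPrefix = "_implicit_org_"

// ImplicitCollectionName (hypothetical helper) builds the collection name to
// pass to PutPrivateData or GetPrivateData for the given MSP ID.
func ImplicitCollectionName(mspID string) string {
	return implicitPrefix + mspID
}

// IsImplicitCollection (hypothetical helper) reports whether a collection name
// refers to an implicit org-specific collection.
func IsImplicitCollection(name string) bool {
	return strings.HasPrefix(name, implicitPrefix)
}

func main() {
	name := ImplicitCollectionName("Org1MSP")
	fmt.Println(name)                                    // _implicit_org_Org1MSP
	fmt.Println(IsImplicitCollection(name))              // true
	fmt.Println(IsImplicitCollection("collectionMarbles")) // false
}
```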

How to pass private data in a chaincode proposal
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since the chaincode proposal gets stored on the blockchain, it is also important
not to include private data in the main part of the chaincode proposal. A special
field in the chaincode proposal called the ``transient`` field can be used to pass
private data from the client (or data that chaincode will use to generate private
data), to chaincode invocation on the peer. The chaincode can retrieve the
``transient`` field by calling the `GetTransient() API <https://godoc.org/github.com/hyperledger/fabric-chaincode-go/shim#ChaincodeStub.GetTransient>`_.
This ``transient`` field gets excluded from the channel transaction.
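
The flow from the ``transient`` field into a collection can be sketched as follows.
To keep the example self-contained, it declares a small local interface,
``privateDataStub``, containing only the three shim methods it needs, and an
in-memory fake, ``memStub``, standing in for the stub a peer would provide. Both
names, as well as ``storeFromTransient`` and the ``"marble"`` transient key, are
assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// privateDataStub lists the shim methods used by this sketch. In real
// chaincode the stub supplied by the peer would be used instead.
type privateDataStub interface {
	GetTransient() (map[string][]byte, error)
	PutPrivateData(collection string, key string, value []byte) error
	GetPrivateData(collection, key string) ([]byte, error)
}

// storeFromTransient (hypothetical) reads a value from the transient field
// and writes it to the named collection. It returns only an error, so no
// private data ends up in the proposal response.
func storeFromTransient(stub privateDataStub, collection, key, transientKey string) error {
	transient, err := stub.GetTransient()
	if err != nil {
		return fmt.Errorf("reading transient field: %w", err)
	}
	value, ok := transient[transientKey]
	if !ok || len(value) == 0 {
		return errors.New("missing transient entry: " + transientKey)
	}
	return stub.PutPrivateData(collection, key, value)
}

// memStub is an in-memory fake used only to exercise the sketch.
type memStub struct {
	transient map[string][]byte
	private   map[string][]byte
}

func (s *memStub) GetTransient() (map[string][]byte, error) { return s.transient, nil }

func (s *memStub) PutPrivateData(collection string, key string, value []byte) error {
	s.private[collection+"/"+key] = value
	return nil
}

func (s *memStub) GetPrivateData(collection, key string) ([]byte, error) {
	return s.private[collection+"/"+key], nil
}

func main() {
	stub := &memStub{
		transient: map[string][]byte{"marble": []byte(`{"price": 99}`)},
		private:   map[string][]byte{},
	}
	if err := storeFromTransient(stub, "collectionMarblePrivateDetails", "marble1", "marble"); err != nil {
		fmt.Println("error:", err)
		return
	}
	stored, _ := stub.GetPrivateData("collectionMarblePrivateDetails", "marble1")
	fmt.Println(string(stored))
}
```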

Protecting private data content
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If the private data is relatively simple and predictable (e.g. transaction dollar
amount), channel members who are not authorized to the private data collection
could try to guess the content of the private data via brute force hashing of
the domain space, in hopes of finding a match with the private data hash on the
chain. Private data that is predictable should therefore include a random "salt"
that is concatenated with the private data key and included in the private data
value, so that a matching hash cannot realistically be found via brute force.
The random "salt" can be generated at the client side (e.g. by sampling a secure
pseudo-random source) and then passed along with the private data in the transient
field at the time of chaincode invocation.
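
A client-side sketch of the salting idea is shown below. The ``saltedAmount``
type, the helper names, and the 16-byte salt length are assumptions made for
illustration, and SHA-256 is used here only to demonstrate the effect on a hash;
the on-chain hash itself is computed by the peers, not by this code.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// saltedAmount is a hypothetical private data value: a predictable amount
// plus a random salt that makes the serialized value hard to guess.
type saltedAmount struct {
	Amount int    `json:"amount"`
	Salt   string `json:"salt"`
}

// newSalt returns n random bytes from a secure source, hex encoded.
func newSalt(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// hashValue serializes the value and returns its SHA-256 digest, hex encoded.
func hashValue(v saltedAmount) (string, error) {
	raw, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	salt, err := newSalt(16)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The same amount hashes differently for every salt, so enumerating
	// plausible amounts no longer reveals a match.
	h, err := hashValue(saltedAmount{Amount: 100, Salt: salt})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(h)
}
```

The serialized value, including the salt, is what the client would place in the
transient field for the chaincode to store.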

Protecting private data responses
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Chaincode can return any data to a client application in the proposal response payload field.
For read-only chaincode functions that query private data and will not get submitted
as transactions to the ordering service, private data may be returned in the proposal
response payload field to the requesting client.
For chaincode functions that propose private data writes, however, take care not to include
private data in the proposal response payload field, since this field will get
included in the transaction which all channel members can access.

Access control for private data
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Until version 1.3, access control to private data based on collection membership
was enforced for peers only. Access control based on the organization of the
chaincode proposal submitter was required to be encoded in chaincode logic.
Collection configuration options ``memberOnlyRead`` (since version v1.4) and
``memberOnlyWrite`` (since version v2.0) can automatically enforce that the chaincode
proposal submitter must be from a collection member in order to read or write
private data keys. For more information about collection
configuration definitions and how to set them, refer back to the
`Private data collection definition`_ section of this topic.

.. note:: If you would like more granular access control, you can set
          ``memberOnlyRead`` and ``memberOnlyWrite`` to false (implicit collections always
          behave as if ``memberOnlyRead`` and ``memberOnlyWrite`` are false). You can then apply your
          own access control logic in chaincode, for example by calling the ``GetCreator()``
          chaincode API or using the client identity
          `chaincode library <https://godoc.org/github.com/hyperledger/fabric-chaincode-go/shim#ChaincodeStub.GetCreator>`__.

Querying Private Data
~~~~~~~~~~~~~~~~~~~~~

Private data collections can be queried just like normal channel data, using
shim APIs:

* ``GetPrivateDataByRange(collection, startKey, endKey string)``
* ``GetPrivateDataByPartialCompositeKey(collection, objectType string, keys []string)``

And if using explicit private data collections and CouchDB state database,
JSON content queries can be passed using the shim API:

* ``GetPrivateDataQueryResult(collection, query string)``

Limitations:

* Clients that call chaincode that executes key range queries or JSON queries should be aware
  that they may receive a subset of the result set if the peer they query has missing
  private data, as explained in the `Private data dissemination`_ section
  above. Clients can query multiple peers and compare the results to
  determine if a peer may be missing some of the result set.
* Chaincode that executes key range queries or JSON queries and updates data in a single
  transaction is not supported, as the query results cannot be validated on the peers
  that don’t have access to the private data, or on peers that are missing the
  private data that they have access to. If a chaincode invocation both queries
  and updates private data, the proposal request will return an error. If your application
  can tolerate result set changes between chaincode execution and validation/commit time,
  then you could call one chaincode function to perform the query, and then call a second
  chaincode function to make the updates. Note that calls to ``GetPrivateData()`` to retrieve
  individual keys can be made in the same transaction as ``PutPrivateData()`` calls, since
  all peers can validate key reads based on the hashed key version.
* Since implicit private data collections are not explicitly defined,
  it is not possible to associate CouchDB indexes with them.
  It is therefore not recommended to use JSON queries with implicit private data collections.
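
The suggestion to compare results from several peers can be sketched on the client
side as below. ``MissingKeys`` is a hypothetical helper, and the input is assumed
to be the list of keys each peer returned for the same query; how those lists are
obtained depends on the client SDK and is outside the scope of this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// MissingKeys (hypothetical helper) takes the keys returned by each peer for
// the same query and reports, per peer, the keys that at least one other peer
// returned but that peer did not.
func MissingKeys(resultsByPeer map[string][]string) map[string][]string {
	// Build the union of all keys seen on any peer.
	union := map[string]struct{}{}
	for _, keys := range resultsByPeer {
		for _, k := range keys {
			union[k] = struct{}{}
		}
	}
	missing := map[string][]string{}
	for peer, keys := range resultsByPeer {
		have := make(map[string]struct{}, len(keys))
		for _, k := range keys {
			have[k] = struct{}{}
		}
		for k := range union {
			if _, ok := have[k]; !ok {
				missing[peer] = append(missing[peer], k)
			}
		}
		// Sort for stable output; peers with nothing missing get no entry.
		sort.Strings(missing[peer])
	}
	return missing
}

func main() {
	results := map[string][]string{
		"peer0.org1": {"marble1", "marble2", "marble3"},
		"peer1.org1": {"marble1", "marble3"},
	}
	for peer, keys := range MissingKeys(results) {
		fmt.Printf("%s may be missing private data for: %v\n", peer, keys)
	}
}
```

A difference between peers only indicates that one of them may be missing private
data; it does not by itself say which result set is complete.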

Using Indexes with collections
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The topic :doc:`couchdb_as_state_database` describes indexes that can be
applied to the channel’s state database to enable JSON content queries, by
packaging indexes in a ``META-INF/statedb/couchdb/indexes`` directory at chaincode
installation time. Similarly, indexes can also be applied to private data
collections that are explicitly defined, by packaging indexes in a ``META-INF/statedb/couchdb/collections/<collection_name>/indexes``
directory. An example index is available `here <https://github.com/hyperledger/fabric-samples/blob/{BRANCH}/chaincode/marbles02_private/go/META-INF/statedb/couchdb/collections/collectionMarbles/indexes/indexOwner.json>`_.

Considerations when using private data
--------------------------------------

Private data purging
~~~~~~~~~~~~~~~~~~~~

Private data in explicitly defined private data collections can be periodically purged from peers.
For more details, see the ``blockToLive`` collection definition property above.

Additionally, recall that prior to commit, peers store private data in a local
transient data store. This data automatically gets purged when the transaction
commits. But if a transaction was never submitted to the channel and
therefore never committed, the private data would remain in each peer’s
transient store. This data is purged from the transient store after a
configurable number of blocks, set with the
``peer.gossip.pvtData.transientstoreMaxBlockRetention`` property in the peer
``core.yaml`` file.

Updating a collection definition
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To update a collection definition or add a new collection, you can update
the chaincode definition and pass the new collection configuration
in the chaincode approve and commit transactions, for example using the ``--collections-config``
flag if using the CLI. If a collection configuration is specified when updating
the chaincode definition, a definition for each of the existing collections must be
included.

When updating a chaincode definition, you can add new private data collections,
and update existing private data collections, for example to add new
members to an existing collection or change one of the collection definition
properties. Note that you cannot update the collection name or the
``blockToLive`` property, since a consistent ``blockToLive`` is required
regardless of a peer's block height.

Collection updates become effective when a peer commits the block with the updated
chaincode definition. Note that collections cannot be
deleted, as there may be prior private data hashes on the channel’s blockchain
that cannot be removed.

Private data reconciliation
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Starting in v1.4, peers of organizations that are added to an existing collection
will automatically fetch private data that was committed to the collection before
they joined the collection.

This private data "reconciliation" also applies to peers that
were entitled to receive private data but did not yet receive it, for example
because of a network failure. Peers keep track of private data that was "missing"
at the time of block commit.

Private data reconciliation occurs periodically based on the
``peer.gossip.pvtData.reconciliationEnabled`` and ``peer.gossip.pvtData.reconcileSleepInterval``
properties in ``core.yaml``. The peer will periodically attempt to fetch the private
data from other collection member peers that are expected to have it.

Note that this private data reconciliation feature only works on peers running
v1.4 or later of Fabric.

.. Licensed under Creative Commons Attribution 4.0 International License
   https://creativecommons.org/licenses/by/4.0/