github.com/kaituanwang/hyperledger@v2.0.1+incompatible/docs/source/private-data-arch.rst (about)

     1  Private Data
     2  ============
     3  
     4  .. note:: This topic assumes an understanding of the conceptual material in the
     5            `documentation on private data <private-data/private-data.html>`_.
     6  
     7  Private data collection definition
     8  ----------------------------------
     9  
    10  A collection definition contains one or more collections, each having a policy
    11  definition listing the organizations in the collection, as well as properties
    12  used to control dissemination of private data at endorsement time and,
    13  optionally, whether the data will be purged.
    14  
    15  Beginning with the Fabric chaincode lifecycle introduced with Fabric v2.0, the
    16  collection definition is part of the chaincode definition. The collection is
    17  approved by channel members, and then deployed when the chaincode definition
    18  is committed to the channel. The collection file needs to be the same for all
    19  channel members. If you are using the peer CLI to approve and commit the
    20  chaincode definition, use the ``--collections-config`` flag to specify the path
    21  to the collection definition file. If you are using the Fabric SDK for Node.js,
    22  visit `How to install and start your chaincode <https://hyperledger.github.io/fabric-sdk-node/master/tutorial-chaincode-lifecycle.html>`_.
    23  To use the `previous lifecycle process <https://hyperledger-fabric.readthedocs.io/en/release-1.4/chaincode4noah.html>`_ to deploy a private data collection,
    24  use the ``--collections-config`` flag when `instantiating your chaincode <https://hyperledger-fabric.readthedocs.io/en/latest/commands/peerchaincode.html#peer-chaincode-instantiate>`_.
    25  
    26  Collection definitions are composed of the following properties:
    27  
    28  * ``name``: Name of the collection.
    29  
    30  * ``policy``: The private data collection distribution policy defines which
    31    organizations' peers are allowed to persist the collection data. It is expressed
    32    using the ``Signature`` policy syntax, with each member included in an ``OR``
    33    signature policy list. To support read/write transactions, the private data
    34    distribution policy must define a broader set of organizations than the chaincode
    35    endorsement policy, as peers must have the private data in order to endorse
    36    proposed transactions. For example, in a channel with ten organizations,
    37    five of the organizations might be included in a private data collection
    38    distribution policy, but the endorsement policy might call for any three
    39    of the organizations to endorse.
    40  
    41  * ``requiredPeerCount``: Minimum number of peers (across authorized organizations)
    42    that each endorsing peer must successfully disseminate private data to before the
    43    peer signs the endorsement and returns the proposal response back to the client.
    44    Requiring dissemination as a condition of endorsement will ensure that private data
    45    is available in the network even if the endorsing peer(s) become unavailable. When
    46    ``requiredPeerCount`` is ``0``, it means that no distribution is **required**,
    47    but there may be some distribution if ``maxPeerCount`` is greater than zero. A
    48    ``requiredPeerCount`` of ``0`` would typically not be recommended, as it could
    49    lead to loss of private data in the network if the endorsing peer(s) become unavailable.
    50    Typically you would want to require at least some distribution of the private
    51    data at endorsement time to ensure redundancy of the private data on multiple
    52    peers in the network.
    53  
    54  * ``maxPeerCount``: For data redundancy purposes, the maximum number of other
    55    peers (across authorized organizations) that each endorsing peer will attempt
    56    to distribute the private data to. If an endorsing peer becomes unavailable between
    57    endorsement time and commit time, other collection member peers that did not
    58    receive the private data at endorsement time will be able to pull it from the
    59    peers that the private data was disseminated to. If this value
    60    is set to ``0``, the private data is not disseminated at endorsement time,
    61    forcing private data pulls against endorsing peers on all authorized peers at
    62    commit time.
    63  
    64  * ``blockToLive``: Represents how long the data should live on the private
    65    database in terms of blocks. The data will live for this specified number of
    66    blocks on the private database and after that it will get purged, making this
    67    data obsolete from the network so that it cannot be queried from chaincode,
    68    and cannot be made available to requesting peers. To keep private data
    69    indefinitely, that is, to never purge private data, set the ``blockToLive``
    70    property to ``0``.
    71  
    72  * ``memberOnlyRead``: A value of ``true`` indicates that peers automatically
    73    enforce that only clients belonging to one of the collection member organizations
    74    are allowed read access to private data. If a client from a non-member org
    75    attempts to execute a chaincode function that performs a read of a private data key,
    76    the chaincode invocation is terminated with an error. Use a value of
    77    ``false`` if you would like to encode more granular access control within
    78    individual chaincode functions.
    79  
    80  * ``memberOnlyWrite``: A value of ``true`` indicates that peers automatically
    81    enforce that only clients belonging to one of the collection member organizations
    82    are allowed to write private data from chaincode. If a client from a non-member org
    83    attempts to execute a chaincode function that performs a write on a private data key,
    84    the chaincode invocation is terminated with an error. Use a value of
    85    ``false`` if you would like to encode more granular access control within
    86    individual chaincode functions. For example, you may want certain clients
    87    from non-member organizations to be able to create private data in a certain
    88    collection.
    89  
    90  * ``endorsementPolicy``: An optional endorsement policy to utilize for the
    91    collection that overrides the chaincode level endorsement policy. A
    92    collection level endorsement policy may be specified in the form of a
    93    ``signaturePolicy`` or may be a ``channelConfigPolicy`` reference to
    94    an existing policy from the channel configuration. The ``endorsementPolicy``
    95    may be the same as the collection distribution ``policy``, or may require
    96    fewer or additional organization peers.
    97  
    98  Here is a sample collection definition JSON file, containing an array of two
    99  collection definitions:
   100  
   101  .. code:: json
   102  
   103   [
   104    {
   105       "name": "collectionMarbles",
   106       "policy": "OR('Org1MSP.member', 'Org2MSP.member')",
   107       "requiredPeerCount": 0,
   108       "maxPeerCount": 3,
   109       "blockToLive": 1000000,
   110       "memberOnlyRead": true,
   111       "memberOnlyWrite": true
   112    },
   113    {
   114       "name": "collectionMarblePrivateDetails",
   115       "policy": "OR('Org1MSP.member')",
   116       "requiredPeerCount": 0,
   117       "maxPeerCount": 3,
   118       "blockToLive": 3,
   119       "memberOnlyRead": true,
   120       "memberOnlyWrite": true,
   121       "endorsementPolicy": {
   122         "signaturePolicy": "OR('Org1MSP.member')"
   123       }
   124    }
   125   ]
   126  
   127  This example uses the organizations from the Fabric test network, ``Org1`` and
   128  ``Org2``. The policy in the ``collectionMarbles`` definition authorizes both
   129  organizations to access the private data. This is a typical configuration when the
   130  chaincode data needs to remain private from the ordering service nodes. However,
   131  the policy in the ``collectionMarblePrivateDetails`` definition restricts access
   132  to a subset of organizations in the channel (in this case ``Org1``). Additionally,
   133  writing to this collection requires endorsement from an ``Org1`` peer, even
   134  though the chaincode level endorsement policy may require endorsement from
   135  ``Org1`` or ``Org2``. And since ``memberOnlyWrite`` is ``true``, only clients from
   136  ``Org1`` may invoke chaincode that writes to the private data collection.
   137  In this way you can control which organizations are entrusted to write to certain
   138  private data collections.
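
        The property descriptions above imply a few sanity checks that can be run on
        a collection definition file before it is passed to ``--collections-config``.
        The following is a minimal sketch using only the Go standard library. The
        ``collectionConfig`` struct and ``checkCollections`` helper are hypothetical
        names for this example, and the checks are assumptions drawn from this topic
        (names must not start with an underscore, ``maxPeerCount`` should not be
        smaller than ``requiredPeerCount``), not the peer's own validation code.

        .. code:: go

           package main

           import (
               "encoding/json"
               "fmt"
               "strings"
           )

           // collectionConfig mirrors the JSON properties described above. It is an
           // illustrative struct for this sketch, not a type from the Fabric code base.
           type collectionConfig struct {
               Name              string `json:"name"`
               Policy            string `json:"policy"`
               RequiredPeerCount int    `json:"requiredPeerCount"`
               MaxPeerCount      int    `json:"maxPeerCount"`
               BlockToLive       uint64 `json:"blockToLive"`
               MemberOnlyRead    bool   `json:"memberOnlyRead"`
               MemberOnlyWrite   bool   `json:"memberOnlyWrite"`
           }

           // checkCollections (hypothetical helper) parses a collection definition
           // file and returns the first problem it finds.
           func checkCollections(raw []byte) ([]collectionConfig, error) {
               var cfgs []collectionConfig
               if err := json.Unmarshal(raw, &cfgs); err != nil {
                   return nil, fmt.Errorf("invalid collection JSON: %w", err)
               }
               for _, c := range cfgs {
                   switch {
                   case c.Name == "" || strings.HasPrefix(c.Name, "_"):
                       return nil, fmt.Errorf("collection %q: name must be non-empty and must not start with an underscore", c.Name)
                   case c.RequiredPeerCount < 0:
                       return nil, fmt.Errorf("collection %q: requiredPeerCount must not be negative", c.Name)
                   case c.MaxPeerCount < c.RequiredPeerCount:
                       return nil, fmt.Errorf("collection %q: maxPeerCount is smaller than requiredPeerCount", c.Name)
                   }
               }
               return cfgs, nil
           }

           func main() {
               raw := []byte(`[{"name": "collectionMarbles",
                   "policy": "OR('Org1MSP.member', 'Org2MSP.member')",
                   "requiredPeerCount": 0, "maxPeerCount": 3, "blockToLive": 1000000}]`)
               cfgs, err := checkCollections(raw)
               if err != nil {
                   fmt.Println("rejected:", err)
                   return
               }
               for _, c := range cfgs {
                   fmt.Printf("%s: push to at most %d peers, require %d\n",
                       c.Name, c.MaxPeerCount, c.RequiredPeerCount)
               }
           }

        Unknown properties such as ``endorsementPolicy`` are ignored by
        ``json.Unmarshal``, so the sketch accepts the full sample file as well.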
   139  
   140  Private data dissemination
   141  --------------------------
   142  
   143  Since private data is not included in the transactions that get submitted to
   144  the ordering service, and therefore not included in the blocks that get distributed
   145  to all peers in a channel, the endorsing peer plays an important role in
   146  disseminating private data to other peers of authorized organizations. This ensures
   147  the availability of private data in the channel's collection, even if endorsing
   148  peers become unavailable after their endorsement. To assist with this dissemination,
   149  the ``maxPeerCount`` and ``requiredPeerCount`` properties in the collection definition
   150  control the degree of dissemination at endorsement time.
   151  
   152  If the endorsing peer cannot successfully disseminate the private data to at least
   153  the ``requiredPeerCount``, it will return an error back to the client. The endorsing
   154  peer will attempt to disseminate the private data to peers of different organizations,
   155  in an effort to ensure that each authorized organization has a copy of the private
   156  data. Since transactions are not committed at chaincode execution time, the endorsing
   157  peer and recipient peers store a copy of the private data in a local ``transient store``
   158  alongside their blockchain until the transaction is committed.
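
        The interplay between ``maxPeerCount`` and ``requiredPeerCount`` at
        endorsement time can be modeled roughly as shown below. This is a deliberately
        simplified sketch: the real peer chooses recipients through gossip and spreads
        them across organizations, which is not modeled here. The names
        ``disseminate``, ``send`` and ``errNotDisseminated`` are hypothetical.

        .. code:: go

           package main

           import (
               "errors"
               "fmt"
           )

           // errNotDisseminated stands in for the error returned to the client.
           var errNotDisseminated = errors.New("private data not disseminated to enough peers")

           // disseminate models one endorsing peer pushing private data to other
           // authorized peers. send stands in for the gossip push and reports
           // whether the recipient acknowledged the data.
           func disseminate(peers []string, requiredPeerCount, maxPeerCount int, send func(peer string) bool) (int, error) {
               acked := 0
               for i, p := range peers {
                   if i >= maxPeerCount {
                       break // never attempt more than maxPeerCount recipients
                   }
                   if send(p) {
                       acked++
                   }
               }
               if acked < requiredPeerCount {
                   // The endorsement fails, so the client receives an error.
                   return acked, fmt.Errorf("%w: acknowledged by %d, need %d",
                       errNotDisseminated, acked, requiredPeerCount)
               }
               return acked, nil
           }

           func main() {
               reachable := map[string]bool{"peer0.org2": true, "peer1.org2": false}
               send := func(peer string) bool { return reachable[peer] }
               acked, err := disseminate([]string{"peer0.org2", "peer1.org2"}, 2, 2, send)
               fmt.Println(acked, err)
           }

        With ``requiredPeerCount`` set to ``0`` the function never returns an error,
        which mirrors the case where no distribution is required.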
   159  
   160  When authorized peers do not have a copy of the private data in their transient
   161  data store at commit time (either because they were not an endorsing peer or because
   162  they did not receive the private data via dissemination at endorsement time),
   163  they will attempt to pull the private data from another authorized
   164  peer, *for a configurable amount of time* based on the peer property
   165  ``peer.gossip.pvtData.pullRetryThreshold`` in the peer configuration ``core.yaml``
   166  file.
   167  
   168  .. note:: The peers being asked for private data will only return the private data
   169            if the requesting peer is a member of the collection as defined by the
   170            private data dissemination policy.
   171  
   172  Considerations when using ``pullRetryThreshold``:
   173  
   174  * If the requesting peer is able to retrieve the private data within the
   175    ``pullRetryThreshold``, it will commit the transaction to its ledger
   176    (including the private data hash), and store the private data in its
   177    state database, logically separated from other channel state data.
   178  
   179  * If the requesting peer is not able to retrieve the private data within
   180    the ``pullRetryThreshold``, it will commit the transaction to its blockchain
   181    (including the private data hash), without the private data.
   182  
   183  * If the peer was entitled to the private data but it is missing, then
   184    that peer will not be able to endorse future transactions that reference
   185    the missing private data - a chaincode query for a key that is missing will
   186    be detected (based on the presence of the key’s hash in the state database),
   187    and the chaincode will receive an error.
   188  
   189  Therefore, it is important to set the ``requiredPeerCount`` and ``maxPeerCount``
   190  properties large enough to ensure the availability of private data in your
   191  channel. For example, if all of the endorsing peers become unavailable
   192  before the transaction commits, the ``requiredPeerCount`` and ``maxPeerCount``
   193  properties will have ensured the private data is available on other peers.
   194  
   195  .. note:: For collections to work, it is important to have cross-organizational
   196            gossip configured correctly. Refer to our documentation on :doc:`gossip`,
   197            paying particular attention to the "anchor peers" and "external endpoint"
   198            configuration.
   199  
   200  Referencing collections from chaincode
   201  --------------------------------------
   202  
   203  A set of `shim APIs <https://godoc.org/github.com/hyperledger/fabric-chaincode-go/shim>`_
   204  are available for setting and retrieving private data.
   205  
   206  The same chaincode data operations can be applied to channel state data and
   207  private data, but in the case of private data, a collection name is specified
   208  along with the data in the chaincode APIs, for example
   209  ``PutPrivateData(collection,key,value)`` and ``GetPrivateData(collection,key)``.
   210  
   211  A single chaincode can reference multiple collections.
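
        As a sketch of how one chaincode function might write to two collections, the
        example below uses the collection names from the sample definition earlier in
        this topic. To stay runnable without the Fabric modules, it declares a local
        ``privateDataStub`` interface with the same method signatures as the shim stub,
        plus a hypothetical in-memory ``memStub``. In real chaincode the stub passed
        in by the peer would be used instead.

        .. code:: go

           package main

           import "fmt"

           // privateDataStub lists the two shim methods used here. Declaring a
           // local interface is an assumption made to keep the sketch self-contained.
           type privateDataStub interface {
               GetPrivateData(collection, key string) ([]byte, error)
               PutPrivateData(collection string, key string, value []byte) error
           }

           // memStub is a hypothetical in-memory stand-in for the peer.
           type memStub map[string]map[string][]byte

           func (m memStub) GetPrivateData(collection, key string) ([]byte, error) {
               return m[collection][key], nil // nil value when the key is absent
           }

           func (m memStub) PutPrivateData(collection string, key string, value []byte) error {
               if m[collection] == nil {
                   m[collection] = map[string][]byte{}
               }
               m[collection][key] = value
               return nil
           }

           // storeMarble writes the shared details and the price of a marble to
           // two different collections under the same key.
           func storeMarble(stub privateDataStub, id string, details, price []byte) error {
               if err := stub.PutPrivateData("collectionMarbles", id, details); err != nil {
                   return fmt.Errorf("writing marble details: %w", err)
               }
               if err := stub.PutPrivateData("collectionMarblePrivateDetails", id, price); err != nil {
                   return fmt.Errorf("writing marble price: %w", err)
               }
               return nil
           }

           // readPrice reads the price back from the more restricted collection.
           func readPrice(stub privateDataStub, id string) ([]byte, error) {
               price, err := stub.GetPrivateData("collectionMarblePrivateDetails", id)
               if err != nil {
                   return nil, err
               }
               if price == nil {
                   return nil, fmt.Errorf("no private details found for %s", id)
               }
               return price, nil
           }

           func main() {
               stub := memStub{}
               if err := storeMarble(stub, "marble1", []byte(`{"color":"blue"}`), []byte(`{"price":99}`)); err != nil {
                   fmt.Println("error:", err)
                   return
               }
               price, err := readPrice(stub, "marble1")
               fmt.Println(string(price), err)
           }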
   212  
   213  Referencing implicit collections from chaincode
   214  -----------------------------------------------
   215  
   216  Starting in v2.0, an implicit private data collection can be used for each
   217  organization in a channel, so that you don't have to define collections if you'd
   218  like to utilize per-organization collections. Each org-specific implicit collection
   219  has a distribution policy and endorsement policy of the matching organization.
   220  You can therefore utilize implicit collections for use cases where you'd like
   221  to ensure that a specific organization has written to a collection key namespace.
   222  The v2.0 chaincode lifecycle uses implicit collections to track which organizations
   223  have approved a chaincode definition. Similarly, you can use implicit collections
   224  in application chaincode to track which organizations have approved or voted
   225  for some change in state.
   226  
   227  To write and read an implicit private data collection key, in the ``PutPrivateData``
   228  and ``GetPrivateData`` chaincode APIs, specify the collection parameter as
   229  ``"_implicit_org_<MSPID>"``, for example ``"_implicit_org_Org1MSP"``.
   230  
   231  .. note:: Application-defined collection names are not allowed to start with an underscore,
   232            so there is no chance for an implicit collection name to collide
   233            with an application-defined collection name.
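
        A small helper can build the implicit collection name from an MSP ID so that
        the prefix is not repeated throughout chaincode. The helper names below are
        hypothetical; only the ``_implicit_org_<MSPID>`` naming pattern comes from
        this topic.

        .. code:: go

           package main

           import (
               "fmt"
               "strings"
           )

           // implicitPrefix is the prefix described above for per-organization
           // implicit collections.
           const implicitPrefix = "_implicit_org_"

           // implicitCollection returns the implicit collection name for an MSP ID.
           func implicitCollection(mspID string) string {
               return implicitPrefix + mspID
           }

           // isImplicitCollection reports whether a collection name follows the
           // implicit naming pattern.
           func isImplicitCollection(name string) bool {
               return strings.HasPrefix(name, implicitPrefix)
           }

           func main() {
               name := implicitCollection("Org1MSP")
               fmt.Println(name, isImplicitCollection(name)) // _implicit_org_Org1MSP true
           }

        The returned name can be passed as the collection parameter of
        ``PutPrivateData`` and ``GetPrivateData``.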
   234  
   235  How to pass private data in a chaincode proposal
   236  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   237  
   238  Since the chaincode proposal gets stored on the blockchain, it is also important
   239  not to include private data in the main part of the chaincode proposal. A special
   240  field in the chaincode proposal called the ``transient`` field can be used to pass
   241  private data from the client (or data that chaincode will use to generate private
   242  data), to chaincode invocation on the peer.  The chaincode can retrieve the
   243  ``transient`` field by calling the `GetTransient() API <https://godoc.org/github.com/hyperledger/fabric-chaincode-go/shim#ChaincodeStub.GetTransient>`_.
   244  This ``transient`` field gets excluded from the channel transaction.
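
        On the chaincode side, reading private input from the transient map might look
        like the sketch below. The ``transientReader`` interface copies the signature
        of ``GetTransient()`` so the example runs without the Fabric modules. The
        ``"marble"`` map key, the ``marbleInput`` struct and the ``fakeProposal`` type
        are assumptions made for illustration.

        .. code:: go

           package main

           import (
               "encoding/json"
               "errors"
               "fmt"
           )

           // transientReader has the same method signature as GetTransient on the
           // shim stub. It is declared locally to keep the sketch self-contained.
           type transientReader interface {
               GetTransient() (map[string][]byte, error)
           }

           // marbleInput is a hypothetical private payload sent by the client.
           type marbleInput struct {
               Name  string `json:"name"`
               Price int    `json:"price"`
           }

           // marbleFromTransient extracts and decodes the private payload. The
           // payload never appears in the chaincode arguments.
           func marbleFromTransient(stub transientReader) (*marbleInput, error) {
               transient, err := stub.GetTransient()
               if err != nil {
                   return nil, fmt.Errorf("reading transient field: %w", err)
               }
               raw, ok := transient["marble"]
               if !ok {
                   return nil, errors.New(`"marble" key missing from transient map`)
               }
               var in marbleInput
               if err := json.Unmarshal(raw, &in); err != nil {
                   return nil, fmt.Errorf("decoding marble input: %w", err)
               }
               return &in, nil
           }

           // fakeProposal is a hypothetical stand-in for a proposal's transient map.
           type fakeProposal map[string][]byte

           func (f fakeProposal) GetTransient() (map[string][]byte, error) { return f, nil }

           func main() {
               stub := fakeProposal{"marble": []byte(`{"name":"marble1","price":99}`)}
               in, err := marbleFromTransient(stub)
               if err != nil {
                   fmt.Println("error:", err)
                   return
               }
               fmt.Println(in.Name, in.Price)
           }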
   245  
   246  Protecting private data content
   247  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   248  If the private data is relatively simple and predictable (e.g. transaction dollar
   249  amount), channel members who are not authorized to the private data collection
   250  could try to guess the content of the private data via brute force hashing of
   251  the domain space, in hopes of finding a match with the private data hash on the
   252  chain. Private data that is predictable should therefore include a random "salt"
   253  that is concatenated with the private data key and included in the private data
   254  value, so that a matching hash cannot realistically be found via brute force.
   255  The random "salt" can be generated at the client side (e.g. by sampling a secure
   256  pseudo-random source) and then passed along with the private data in the transient
   257  field at the time of chaincode invocation.
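
        The brute-force risk and the effect of a salt can be illustrated with a short
        standard-library sketch. It assumes the on-chain hash is a SHA-256 digest of
        the private value and that the value is a whole-number amount encoded as a
        decimal string. Both are assumptions for illustration, as are the helper names
        ``guessAmount`` and ``saltedValue``.

        .. code:: go

           package main

           import (
               "crypto/rand"
               "crypto/sha256"
               "encoding/hex"
               "fmt"
               "strconv"
           )

           // guessAmount tries every amount in [0, limit) and reports the one whose
           // SHA-256 hash equals target, as a non-member observer of the chain could.
           func guessAmount(target [32]byte, limit int) (int, bool) {
               for v := 0; v < limit; v++ {
                   if sha256.Sum256([]byte(strconv.Itoa(v))) == target {
                       return v, true
                   }
               }
               return 0, false
           }

           // saltedValue appends 16 random bytes (hex encoded) to the amount, so the
           // stored value is no longer drawn from a small, guessable domain.
           func saltedValue(amount int) ([]byte, error) {
               salt := make([]byte, 16)
               if _, err := rand.Read(salt); err != nil {
                   return nil, err
               }
               return []byte(strconv.Itoa(amount) + ":" + hex.EncodeToString(salt)), nil
           }

           func main() {
               // Unsalted: the amount is recovered from its hash alone.
               v, ok := guessAmount(sha256.Sum256([]byte("1250")), 100000)
               fmt.Println(v, ok)

               // Salted: the same search no longer finds a match.
               salted, err := saltedValue(1250)
               if err != nil {
                   fmt.Println("error:", err)
                   return
               }
               _, ok = guessAmount(sha256.Sum256(salted), 100000)
               fmt.Println(ok)
           }

        In practice the client would generate the salt and pass it in the transient
        field together with the private data, as described above.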
   258  
   259  Access control for private data
   260  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   261  
   262  Through v1.3, access control to private data based on collection membership
   263  was enforced for peers only. Access control based on the organization of the
   264  chaincode proposal submitter had to be encoded in chaincode logic.
   265  The collection configuration options ``memberOnlyRead`` (since v1.4) and
   266  ``memberOnlyWrite`` (since v2.0) can automatically enforce that the chaincode
   267  proposal submitter must be from a collection member organization in order to read or write
   268  private data keys. For more information about collection
   269  configuration definitions and how to set them, refer back to the
   270  `Private data collection definition`_  section of this topic.
   271  
   272  .. note:: If you would like more granular access control, you can set
   273            ``memberOnlyRead`` and ``memberOnlyWrite`` to ``false``. You can then apply your
   274            own access control logic in chaincode, for example by calling the ``GetCreator()``
   275            chaincode API or using the client identity
   276            `chaincode library <https://godoc.org/github.com/hyperledger/fabric-chaincode-go/shim#ChaincodeStub.GetCreator>`__.
   277  
   278  Querying Private Data
   279  ~~~~~~~~~~~~~~~~~~~~~
   280  
   281  Private data collections can be queried just like normal channel data, using
   282  shim APIs:
   283  
   284  * ``GetPrivateDataByRange(collection, startKey, endKey string)``
   285  * ``GetPrivateDataByPartialCompositeKey(collection, objectType string, keys []string)``
   286  
   287  And for the CouchDB state database, JSON content queries can be passed using the
   288  shim API:
   289  
   290  * ``GetPrivateDataQueryResult(collection, query string)``
   291  
   292  Limitations:
   293  
   294  * Clients that call chaincode that executes range or rich JSON queries should be aware
   295    that they may receive a subset of the result set if the peer they query is missing
   296    private data, as explained in the `Private data dissemination`_ section
   297    above. Clients can query multiple peers and compare the results to
   298    determine if a peer may be missing some of the result set.
   299  * Chaincode that executes range or rich JSON queries and updates data in a single
   300    transaction is not supported, as the query results cannot be validated on the peers
   301    that don’t have access to the private data, or on peers that are missing the
   302    private data that they have access to. If a chaincode invocation both queries
   303    and updates private data, the proposal request will return an error. If your application
   304    can tolerate result set changes between chaincode execution and validation/commit time,
   305    then you could call one chaincode function to perform the query, and then call a second
   306    chaincode function to make the updates. Note that calls to ``GetPrivateData()`` to retrieve
   307    individual keys can be made in the same transaction as ``PutPrivateData()`` calls, since
   308    all peers can validate key reads based on the hashed key version.
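
        The suggestion to compare results from several peers can be sketched on the
        client side as follows. The function takes the keys that each peer returned
        for the same query and lists, per peer, the keys that other peers returned
        but that peer did not. ``missingKeys`` and the peer names are hypothetical, and
        a difference only suggests missing private data, since it does not prove it.

        .. code:: go

           package main

           import (
               "fmt"
               "sort"
           )

           // missingKeys maps each peer to the keys seen on some other peer but
           // absent from that peer's own result set. Peers with complete results
           // do not appear in the returned map.
           func missingKeys(results map[string][]string) map[string][]string {
               union := map[string]bool{}
               for _, keys := range results {
                   for _, k := range keys {
                       union[k] = true
                   }
               }
               missing := map[string][]string{}
               for peer, keys := range results {
                   have := map[string]bool{}
                   for _, k := range keys {
                       have[k] = true
                   }
                   for k := range union {
                       if !have[k] {
                           missing[peer] = append(missing[peer], k)
                       }
                   }
                   sort.Strings(missing[peer]) // stable output for reporting
               }
               return missing
           }

           func main() {
               results := map[string][]string{
                   "peer0.org1": {"marble1", "marble2", "marble3"},
                   "peer0.org2": {"marble1", "marble3"},
               }
               fmt.Println(missingKeys(results))
           }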
   309  
   310  Using Indexes with collections
   311  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   312  
   313  The topic :doc:`couchdb_as_state_database` describes indexes that can be
   314  applied to the channel’s state database to enable JSON content queries, by
   315  packaging indexes in a ``META-INF/statedb/couchdb/indexes`` directory at chaincode
   316  installation time.  Similarly, indexes can also be applied to private data
   317  collections, by packaging indexes in a ``META-INF/statedb/couchdb/collections/<collection_name>/indexes``
   318  directory. An example index is available `here <https://github.com/hyperledger/fabric-samples/blob/master/chaincode/marbles02_private/go/META-INF/statedb/couchdb/collections/collectionMarbles/indexes/indexOwner.json>`_.
   319  
   320  Considerations when using private data
   321  --------------------------------------
   322  
   323  Private data purging
   324  ~~~~~~~~~~~~~~~~~~~~
   325  
   326  Private data can be periodically purged from peers. For more details,
   327  see the ``blockToLive`` collection definition property above.
   328  
   329  Additionally, recall that prior to commit, peers store private data in a local
   330  transient data store. This data automatically gets purged when the transaction
   331  commits.  But if a transaction was never submitted to the channel and
   332  therefore never committed, the private data would remain in each peer’s
   333  transient store.  This data is purged from the transient store after a
   334  configurable number of blocks, set by the peer’s
   335  ``peer.gossip.pvtData.transientstoreMaxBlockRetention`` property in the peer
   336  ``core.yaml`` file.
   337  
   338  Updating a collection definition
   339  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   340  
   341  To update a collection definition or add a new collection, you can update
   342  the chaincode definition and pass the new collection configuration
   343  in the chaincode approve and commit transactions, for example using the ``--collections-config``
   344  flag if using the CLI. If a collection configuration is specified when updating
   345  the chaincode definition, a definition for each of the existing collections must be
   346  included.
   347  
   348  When updating a chaincode definition, you can add new private data collections,
   349  and update existing private data collections, for example to add new
   350  members to an existing collection or change one of the collection definition
   351  properties. Note that you cannot update the collection name or the
   352  ``blockToLive`` property, since a consistent ``blockToLive`` is required
   353  regardless of a peer's block height.
   354  
   355  Collection updates become effective when a peer commits the block with the updated
   356  chaincode definition. Note that collections cannot be
   357  deleted, as there may be prior private data hashes on the channel’s blockchain
   358  that cannot be removed.
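
        The update rules above can be expressed as a pre-flight check on the old and
        new collection lists before a new chaincode definition is approved. This is a
        minimal sketch of those rules only. The ``collection`` struct and
        ``checkUpdate`` function are hypothetical and are not part of Fabric.

        .. code:: go

           package main

           import "fmt"

           // collection keeps only the properties that matter for the update rules.
           type collection struct {
               Name        string
               BlockToLive uint64
           }

           // checkUpdate verifies that every existing collection is still present
           // under the same name and with the same blockToLive. New collections
           // may be added freely.
           func checkUpdate(current, proposed []collection) error {
               byName := map[string]collection{}
               for _, c := range proposed {
                   byName[c.Name] = c
               }
               for _, old := range current {
                   next, ok := byName[old.Name]
                   if !ok {
                       return fmt.Errorf("collection %q is missing: existing collections cannot be deleted or renamed", old.Name)
                   }
                   if next.BlockToLive != old.BlockToLive {
                       return fmt.Errorf("collection %q: blockToLive cannot change (%d -> %d)",
                           old.Name, old.BlockToLive, next.BlockToLive)
                   }
               }
               return nil
           }

           func main() {
               current := []collection{{"collectionMarbles", 1000000}, {"collectionMarblePrivateDetails", 3}}
               proposed := []collection{{"collectionMarbles", 1000000}, {"collectionMarblePrivateDetails", 10}}
               fmt.Println(checkUpdate(current, proposed))
           }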
   359  
   360  Private data reconciliation
   361  ~~~~~~~~~~~~~~~~~~~~~~~~~~~
   362  
   363  Starting in v1.4, peers of organizations that are added to an existing collection
   364  will automatically fetch private data that was committed to the collection before
   365  they joined the collection.
   366  
   367  This private data "reconciliation" also applies to peers that
   368  were entitled to receive private data but did not yet receive it --- because of
   369  a network failure, for example --- by keeping track of private data that was "missing"
   370  at the time of block commit.
   371  
   372  Private data reconciliation occurs periodically based on the
   373  ``peer.gossip.pvtData.reconciliationEnabled`` and ``peer.gossip.pvtData.reconcileSleepInterval``
   374  properties in ``core.yaml``. The peer will periodically attempt to fetch the private
   375  data from other collection member peers that are expected to have it.
   376  
   377  Note that this private data reconciliation feature only works on peers running
   378  v1.4 or later of Fabric.
   379  
   380  .. Licensed under Creative Commons Attribution 4.0 International License
   381     https://creativecommons.org/licenses/by/4.0/