# Private data

## What is private data?

In cases where a group of organizations on a channel need to keep data private from
other organizations on that channel, they have the option to create a new channel
comprising just the organizations that need access to the data. However, creating
separate channels in each of these cases creates additional administrative overhead
(maintaining chaincode versions, policies, MSPs, etc.), and doesn't allow for use
cases in which you want all channel participants to see a transaction while keeping
a portion of the data private.

That's why Fabric offers the ability to create
**private data collections**, which allow a defined subset of organizations on a
channel to endorse, commit, or query private data without having to
create a separate channel.

Private data collections can be defined explicitly within a chaincode definition.
Additionally, every chaincode has an implicit private data namespace reserved for
organization-specific private data. These implicit organization-specific private data
collections can be used to store an individual organization's private data,
such as details about an asset owned by an organization or an organization's
approval for a step in a multi-party business process implemented in chaincode.

## What is a private data collection?

A collection is the combination of two elements:

1. **The actual private data**, sent peer-to-peer [via gossip protocol](../gossip.html)
   to only the organization(s) authorized to see it. This data is stored in a
   private state database on the peers of authorized organizations,
   which can be accessed from chaincode on these authorized peers.
   The ordering service is not involved here and does not see the
   private data. Note that because gossip distributes the private data peer-to-peer
   across authorized organizations, you must set up anchor peers on the channel
   and configure `CORE_PEER_GOSSIP_EXTERNALENDPOINT` on each peer
   in order to bootstrap cross-organization communication.

2. **A hash of that data**, which is endorsed, ordered, and written to the ledgers
   of every peer on the channel. The hash serves as evidence of the transaction,
   is used for state validation, and can be used for audit purposes.

The following diagram illustrates the ledger contents of a peer authorized to have
private data and one which is not.

![private-data.private-data](./PrivateDataConcept-2.png)

Collection members may decide to share the private data with other parties if they
get into a dispute or if they want to transfer the asset to a third party. The
third party can then compute the hash of the private data and see if it matches the
state on the channel ledger, proving that the state existed between the collection
members at a certain point in time.

In some cases, you may decide to have a set of collections each comprising a
single organization. For example, an organization may record private data in its own
collection, which could later be shared with other channel members and
referenced in chaincode transactions. We'll see examples of this in the sharing
private data topic below.
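
The hash check that a third party performs can be sketched in a few lines of Go. This is a minimal illustration, assuming the on-chain hash is a SHA-256 digest of the private value bytes; `hashOf` and `matchesOnChainHash` are hypothetical helper names, not Fabric APIs.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// hashOf returns the SHA-256 digest of value (assumed hash function).
func hashOf(value []byte) []byte {
	sum := sha256.Sum256(value)
	return sum[:]
}

// matchesOnChainHash reports whether privateValue hashes to onChainHash.
func matchesOnChainHash(privateValue, onChainHash []byte) bool {
	return bytes.Equal(hashOf(privateValue), onChainHash)
}

func main() {
	// Private value shared out of band by a collection member (made-up data).
	shared := []byte(`{"assetID":"asset1","appraisedValue":100}`)
	// Stand-in for the hash the third party reads from the channel ledger.
	recorded := hashOf(shared)

	fmt.Println(matchesOnChainHash(shared, recorded))             // true
	fmt.Println(matchesOnChainHash([]byte("tampered"), recorded)) // false
}
```

A match shows that the shared value is the one whose hash was committed; it does not by itself show who committed it.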

### When to use a collection within a channel vs. a separate channel

* Use **channels** when entire transactions (and ledgers) must be kept
  confidential within a set of organizations that are members of the channel.

* Use **collections** when transactions (and ledgers) must be shared among a set
  of organizations, but when only a subset of those organizations should have
  access to some (or all) of the data within a transaction. Additionally,
  since private data is disseminated peer-to-peer rather than via blocks,
  use private data collections when transaction data must be kept confidential
  from ordering service nodes.

## A use case to explain collections

Consider a group of five organizations on a channel that trade produce:

* **A Farmer** selling his goods abroad
* **A Distributor** moving goods abroad
* **A Shipper** moving goods between parties
* **A Wholesaler** purchasing goods from distributors
* **A Retailer** purchasing goods from shippers and wholesalers

The **Distributor** might want to make private transactions with the
**Farmer** and **Shipper** to keep the terms of the trades confidential from
the **Wholesaler** and the **Retailer** (so as not to expose the markup they're
charging).

The **Distributor** may also want to have a separate private data relationship
with the **Wholesaler** because it charges them a lower price than it does the
**Retailer**.

The **Wholesaler** may also want to have a private data relationship with the
**Retailer** and the **Shipper**.

Rather than defining many small channels for each of these relationships, multiple
private data collections **(PDC)** can be defined to share private data between:

1. PDC1: **Distributor**, **Farmer** and **Shipper**
2. PDC2: **Distributor** and **Wholesaler**
3. PDC3: **Wholesaler**, **Retailer** and **Shipper**

![private-data.private-data](./PrivateDataConcept-1.png)

Using this example, peers owned by the **Distributor** will have multiple private
databases inside their ledger, which include the private data from the
**Distributor**, **Farmer** and **Shipper** relationship and the
**Distributor** and **Wholesaler** relationship.

![private-data.private-data](./PrivateDataConcept-3.png)
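
To make the membership picture concrete, the three collections can be modeled as a simple lookup table. The sketch below is plain Go, not Fabric code; the `collections` map and `collectionsFor` function are hypothetical names used only to show which private databases each organization's peers would hold.

```go
package main

import (
	"fmt"
	"sort"
)

// collections maps each illustrative collection name to its member organizations.
var collections = map[string][]string{
	"PDC1": {"Distributor", "Farmer", "Shipper"},
	"PDC2": {"Distributor", "Wholesaler"},
	"PDC3": {"Wholesaler", "Retailer", "Shipper"},
}

// collectionsFor returns, in sorted order, the collections that org belongs to.
func collectionsFor(org string) []string {
	var names []string
	for name, members := range collections {
		for _, m := range members {
			if m == org {
				names = append(names, name)
				break
			}
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	for _, org := range []string{"Distributor", "Shipper", "Farmer"} {
		fmt.Println(org, collectionsFor(org))
	}
}
```

For the **Distributor** this yields PDC1 and PDC2, matching the two private databases described above.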

## Transaction flow with private data

When private data collections are referenced in chaincode, the transaction flow
is slightly different in order to protect the confidentiality of the private
data as transactions are proposed, endorsed, and committed to the ledger.

For details on transaction flows that don't use private data, refer to our
documentation on [transaction flow](../txflow.html).

1. The client application submits a proposal request to invoke a chaincode
   function (reading or writing private data) to a target peer, which will manage
   the transaction submission on behalf of the client. The client application can
   [specify which organizations](../gateway.html#targeting-specific-endorsement-peers)
   should endorse the proposal request, or it can delegate the
   [endorser selection logic](../gateway.html#how-the-gateway-endorses-your-transaction-proposal)
   to the gateway service in the target peer. In the latter case, the gateway will
   attempt to select a set of endorsing peers which are part of authorized organizations
   of the collection(s) affected by the chaincode. The private data, or data used to
   generate private data in chaincode, is sent in a `transient` field in the proposal.

2. The endorsing peers simulate the transaction and store the private data in
   a `transient data store` (a temporary storage local to the peer). They
   distribute the private data, based on the collection policy, to authorized peers
   via [gossip](../gossip.html).

3. The endorsing peers send the proposal response back to the target peer. The proposal
   response includes the endorsed read/write set, which includes public
   data, as well as a hash of any private data keys and values. *No private data is
   sent back to the target peer or client*. For more information on how endorsement works with
   private data, see the [endorsement](../private-data-arch.html#endorsement) section
   of the private data reference topic.

4. The target peer verifies the proposal responses are the same before assembling the
   endorsements into a transaction, which is sent back to the client for signing.
   The target peer "broadcasts" the transaction (which includes the proposal
   response with the private data hashes) to the ordering service. The transactions
   with the private data hashes get included in blocks as normal.
   The block with the private data hashes is distributed to all the peers. In this way,
   all peers on the channel can validate transactions with the hashes of the private
   data in a consistent way, without knowing the actual private data.

5. At block commit time, peers use the collection policy to
   determine whether they are authorized to have access to the private data. If they are,
   they will first check their local `transient data store` to determine if they
   have already received the private data at chaincode endorsement time. If not,
   they will attempt to pull the private data from another authorized peer. Then they
   will validate the private data against the hashes in the public block and commit the
   transaction and the block. Upon validation/commit, the private data is moved to
   their copy of the private state database and private writeset storage. The
   private data is then deleted from the `transient data store`.

Note: The client application can collect the endorsements instead of delegating that step to the target peer.
Refer to the [v2.3 Peers and Applications](https://hyperledger-fabric.readthedocs.io/en/release-2.3/peers/peers.html#applications-and-peers) topic for details.
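
The commit-time logic of step 5 can be sketched as a toy model. This is not Fabric's implementation: the `peer` type, its maps, the `pull` callback, and the use of SHA-256 over the raw value are all assumptions chosen to keep the example self-contained.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"errors"
	"fmt"
)

// peer is a toy model of a peer's private data stores.
type peer struct {
	authorized     bool
	transientStore map[string][]byte // txID -> data received at endorsement time
	privateState   map[string][]byte // txID -> committed private data
}

func newPeer(authorized bool) *peer {
	return &peer{authorized, map[string][]byte{}, map[string][]byte{}}
}

// hashOf returns the SHA-256 digest of value (assumed hash function).
func hashOf(value []byte) []byte {
	sum := sha256.Sum256(value)
	return sum[:]
}

// commitPrivateData follows step 5: use the transient store if possible,
// otherwise pull from another authorized peer, validate against the hash in
// the block, commit, and clear the transient store entry.
func (p *peer) commitPrivateData(txID string, blockHash []byte, pull func(string) ([]byte, bool)) error {
	if !p.authorized {
		return nil // unauthorized peers keep only the hash from the block
	}
	data, ok := p.transientStore[txID]
	if !ok {
		if data, ok = pull(txID); !ok {
			return errors.New("private data not available")
		}
	}
	if !bytes.Equal(hashOf(data), blockHash) {
		return errors.New("private data does not match hash in block")
	}
	p.privateState[txID] = data
	delete(p.transientStore, txID)
	return nil
}

func main() {
	secret := []byte("price=42")
	endorser := newPeer(true)
	endorser.transientStore["tx1"] = secret // received during endorsement

	other := newPeer(true) // authorized, but did not endorse
	pullFromEndorser := func(txID string) ([]byte, bool) {
		d, ok := endorser.privateState[txID]
		return d, ok
	}

	fmt.Println(endorser.commitPrivateData("tx1", hashOf(secret), nil))
	fmt.Println(other.commitPrivateData("tx1", hashOf(secret), pullFromEndorser))
	fmt.Println(len(endorser.transientStore), string(other.privateState["tx1"]))
}
```

In this model the second peer only succeeds because it can pull the data from a peer that already holds it, which is why dissemination during endorsement matters.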

## Sharing private data

In many scenarios private data keys/values in one collection may need to be shared with
other channel members or with other private data collections, for example when you
need to transact on private data with a channel member or group of channel members
who were not included in the original private data collection. The receiving parties
will typically want to verify the private data against the on-chain hashes
as part of the transaction.

There are several aspects of private data collections that enable the
sharing and verification of private data:

* First, you don't necessarily have to be a member of a collection to write to a key in
  a collection, as long as the endorsement policy is satisfied.
  Endorsement policy can be defined at the chaincode level, key level (using state-based
  endorsement), or collection level (starting in Fabric v2.0).

* Second, starting in v1.4.2 there is a chaincode API `GetPrivateDataHash()` that allows
  chaincode on non-member peers to read the hash value of a private data key. This is an
  important feature as you will see later, because it allows chaincode to verify private
  data against the on-chain hashes that were created from private data in previous transactions.

This ability to share and verify private data should be considered when designing
applications and the associated private data collections.
While you can certainly create sets of multilateral private data collections to share data
among various combinations of channel members, this approach may result in a large
number of collections that need to be defined.
Alternatively, consider using a smaller number of private data collections (e.g.
one collection per organization, or one collection per pair of organizations), and
then sharing private data with other channel members, or with other
collections as the need arises. Starting in Fabric v2.0, implicit organization-specific
collections are available for any chaincode to utilize,
so that you don't even have to define these per-organization collections when
deploying chaincode.

### Private data sharing patterns

When modeling private data collections per organization, multiple patterns become available
for sharing or transferring private data without the overhead of defining many multilateral
collections. Here are some of the sharing patterns that could be leveraged in chaincode
applications:

* **Use a corresponding public key for tracking public state** -
  You can optionally have a matching public key for tracking public state (e.g. asset
  properties, current ownership, etc.), and for every organization that should have access
  to the asset's corresponding private data, you can create a private key/value in each
  organization's private data collection.

* **Chaincode access control** -
  You can implement access control in your chaincode, to specify which clients can
  query private data in a collection. For example, store an access control list
  for a private data collection key or range of keys, then in the chaincode get the
  client submitter's credentials (using the `GetCreator()` chaincode API or the CID library
  APIs `GetID()` or `GetMSPID()`), and verify they have access before returning the private
  data. Similarly, you could require a client to pass a passphrase into chaincode,
  which must match a passphrase stored at the key level, in order to access the
  private data. Note that this pattern can also be used to restrict client access to public
  state data.

* **Sharing private data out of band** -
  As an off-chain option, you could share private data out of band with other
  organizations, and they can hash the key/value to verify it matches
  the on-chain hash by using the `GetPrivateDataHash()` chaincode API. For example,
  an organization that wishes to purchase an asset from you may want to verify
  an asset's properties and that you are the legitimate owner by checking the
  on-chain hash, prior to agreeing to the purchase.

* **Sharing private data with other collections** -
  You could 'share' the private data on-chain with chaincode that creates a matching
  key/value in the other organization's private data collection. You'd pass the
  private data key/value to chaincode via the transient field, and the chaincode
  could confirm a hash of the passed private data matches the on-chain hash from
  your collection using `GetPrivateDataHash()`, and then write the private data to
  the other organization's private data collection.

* **Transferring private data to other collections** -
  You could 'transfer' the private data with chaincode that deletes the private data
  key in your collection, and creates it in another organization's collection.
  Again, use the transient field to pass the private data upon chaincode invoke,
  and in the chaincode use `GetPrivateDataHash()` to confirm that the data exists in
  your private data collection, before deleting the key from your collection and
  creating the key in another organization's collection. To ensure that a
  transaction always deletes from one collection and adds to another collection,
  you may want to require endorsements from additional parties, such as a
  regulator or auditor.

* **Using private data for transaction approval** -
  If you want to get a counterparty's approval for a transaction before it is
  completed (e.g. an on-chain record that they agree to purchase an asset for
  a certain price), the chaincode can require them to 'pre-approve' the transaction
  by writing a private data key either to their private data collection or to yours,
  which the chaincode will then check using `GetPrivateDataHash()`. In fact, this is
  exactly the same mechanism that the built-in lifecycle system chaincode uses to
  ensure organizations agree to a chaincode definition before it is committed to
  a channel. Starting with Fabric v2.0, this pattern
  becomes more powerful with collection-level endorsement policies, to ensure
  that the chaincode is executed and endorsed on the collection owner's own trusted
  peer. Alternatively, a mutually agreed key with a key-level endorsement policy
  could be used, that is then updated with the pre-approval terms and endorsed
  on peers from the required organizations.

* **Keeping transactors private** -
  Variations of the prior pattern can also eliminate leaking the transactors for a given
  transaction. For example, a buyer indicates agreement to buy on their own collection,
  then in a subsequent transaction the seller references the buyer's private data in
  their own private data collection. The proof of transaction with hashed references
  is recorded on-chain. Only the buyer and seller know that they are the transactors,
  but they can reveal the pre-images if a need-to-know arises, such as in a subsequent
  transaction with another party who could verify the hashes.
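
The 'transfer' pattern can be sketched against a small in-memory stand-in for the chaincode stub. The `privateDataStub` interface below mirrors only the three stub methods the pattern needs, `memStub` is a hypothetical test double, and the collection names and SHA-256 hashing are assumptions. In real chaincode the value would arrive through the transient field rather than as a function argument.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"errors"
	"fmt"
)

// privateDataStub is a stand-in for the part of the chaincode stub used here.
type privateDataStub interface {
	GetPrivateDataHash(collection, key string) ([]byte, error)
	PutPrivateData(collection, key string, value []byte) error
	DelPrivateData(collection, key string) error
}

// memStub is a hypothetical in-memory implementation: collection -> key -> value.
type memStub map[string]map[string][]byte

func (s memStub) GetPrivateDataHash(collection, key string) ([]byte, error) {
	value, ok := s[collection][key]
	if !ok {
		return nil, nil // no hash recorded for this key
	}
	sum := sha256.Sum256(value)
	return sum[:], nil
}

func (s memStub) PutPrivateData(collection, key string, value []byte) error {
	if s[collection] == nil {
		s[collection] = map[string][]byte{}
	}
	s[collection][key] = value
	return nil
}

func (s memStub) DelPrivateData(collection, key string) error {
	delete(s[collection], key)
	return nil
}

// transferPrivateData verifies value against the hash in the source collection,
// writes it to the destination collection, then deletes the source key.
func transferPrivateData(stub privateDataStub, from, to, key string, value []byte) error {
	onChain, err := stub.GetPrivateDataHash(from, key)
	if err != nil {
		return err
	}
	sum := sha256.Sum256(value)
	if onChain == nil || !bytes.Equal(onChain, sum[:]) {
		return errors.New("passed value does not match hash in source collection")
	}
	if err := stub.PutPrivateData(to, key, value); err != nil {
		return err
	}
	return stub.DelPrivateData(from, key)
}

func main() {
	stub := memStub{}
	details := []byte(`{"color":"blue"}`)
	if err := stub.PutPrivateData("sellerCollection", "asset1", details); err != nil {
		panic(err)
	}
	err := transferPrivateData(stub, "sellerCollection", "buyerCollection", "asset1", details)
	fmt.Println(err, string(stub["buyerCollection"]["asset1"]))
}
```

Dropping the final `DelPrivateData` call turns the same function into the 'share' pattern, where both collections end up holding the value.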

Coupled with the patterns above, it is worth noting that transactions with private
data can be bound to the same conditions as regular channel state data, specifically:

* **Key level transaction access control** -
  You can include ownership credentials in a private data value, so that subsequent
  transactions can verify that the submitter has ownership privilege to share or transfer
  the data. In this case the chaincode would get the submitter's credentials
  (e.g. using the `GetCreator()` chaincode API or the CID library APIs `GetID()` or `GetMSPID()`),
  combine them with other private data that gets passed to the chaincode, hash the result,
  and use `GetPrivateDataHash()` to verify that it matches the on-chain hash before
  proceeding with the transaction.

* **Key level endorsement policies** -
  As with normal channel state data, you can use state-based endorsement
  to specify which organizations must endorse transactions that share or transfer
  private data, using the `SetPrivateDataValidationParameter()` chaincode API,
  for example to specify that only an owner's organization peer, custodian's organization
  peer, or other third party must endorse such transactions.
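
The key-level access control idea, hashing the submitter's credentials together with the passed private data and comparing against the on-chain hash, might look like the sketch below. The encoding (owner ID, a zero byte separator, then the details) and the names `recordHash` and `submitterOwns` are assumptions for illustration. Real chaincode would hash whatever serialized value it originally stored, and would read the on-chain hash with `GetPrivateDataHash()`.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// recordHash hashes the owner identity together with the asset details.
func recordHash(ownerID string, details []byte) []byte {
	h := sha256.New()
	h.Write([]byte(ownerID))
	h.Write([]byte{0}) // separator so ("ab","c") and ("a","bc") hash differently
	h.Write(details)
	return h.Sum(nil)
}

// submitterOwns reports whether the submitter's ID combined with the passed
// details reproduces the hash recorded on-chain.
func submitterOwns(submitterID string, details, onChainHash []byte) bool {
	return bytes.Equal(recordHash(submitterID, details), onChainHash)
}

func main() {
	details := []byte(`{"size":10}`)
	onChain := recordHash("Org1MSP:alice", details) // stand-in for the ledger hash

	fmt.Println(submitterOwns("Org1MSP:alice", details, onChain)) // true
	fmt.Println(submitterOwns("Org2MSP:bob", details, onChain))   // false
}
```

Because the owner identity is part of the hashed value, a submitter who knows the details but is not the recorded owner cannot reproduce the hash.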

### Example scenario: Asset transfer using private data collections

The private data sharing patterns mentioned above can be combined to enable powerful
chaincode-based applications. For example, consider how an asset transfer scenario
could be implemented using per-organization private data collections:

* An asset may be tracked by a UUID key in public chaincode state. Only the asset's
  ownership is recorded; nothing else is known about the asset.

* The chaincode will require that any transfer request must originate from the owning client,
  and the key is bound by state-based endorsement requiring that a peer from the
  owner's organization and a regulator's organization must endorse any transfer requests.

* The asset owner's private data collection contains the private details about
  the asset, keyed by a hash of the UUID. Other organizations and the ordering
  service will only see a hash of the asset details.

* Let's assume the regulator is a member of each collection as well, and therefore
  persists the private data, although this need not be the case.

A transaction to trade the asset would unfold as follows:

1. Off-chain, the owner and a potential buyer strike a deal to trade the asset
   for a certain price.

2. The seller provides proof of their ownership, by either passing the private details
   out of band, or by providing the buyer with credentials to query the private
   data on their node or the regulator's node.

3. The buyer verifies that a hash of the private details matches the on-chain public hash.

4. The buyer invokes chaincode to record their bid details in their own private data collection.
   The chaincode is invoked on the buyer's peer, and potentially on the regulator's peer if required
   by the collection endorsement policy.

5. The current owner (seller) invokes chaincode to sell and transfer the asset, passing in the
   private details and bid information. The chaincode is invoked on peers of the
   seller, buyer, and regulator, in order to meet the endorsement policy of the public
   key, as well as the endorsement policies of the buyer and seller private data collections.

6. The chaincode verifies that the submitting client is the owner, verifies the private
   details against the hash in the seller's collection, and verifies the bid details
   against the hash in the buyer's collection. The chaincode then writes the proposed
   updates for the public key (setting ownership to the buyer, and setting endorsement
   policy to be the buying organization and regulator), writes the private details to the
   buyer's private data collection, and potentially deletes the private details from the seller's
   collection. Prior to final endorsement, the endorsing peers ensure private data is
   disseminated to any other authorized peers of the seller and regulator.

7. The seller submits the transaction with the public data and private data hashes
   for ordering, and it is distributed to all channel peers in a block.

8. Each peer's block validation logic will consistently verify the endorsement policy
   was met (buyer, seller, regulator all endorsed), and verify that public and private
   state that was read in the chaincode has not been modified by any other transaction
   since chaincode execution.

9. All peers commit the transaction as valid since it passed validation checks.
   Buyer peers and regulator peers retrieve the private data from other authorized
   peers if they did not receive it at endorsement time, and persist the private
   data in their private data state database (assuming the private data matched
   the hashes from the transaction).

10. With the transaction completed, the asset has been transferred, and other
    channel members interested in the asset may query the history of the public
    key to understand its provenance, but will not have access to any private
    details unless an owner shares them on a need-to-know basis.

The basic asset transfer scenario could be extended for other considerations,
for example the transfer chaincode could verify that a payment record is available
to satisfy payment versus delivery requirements, or verify that a bank has
submitted a letter of credit, prior to the execution of the transfer chaincode.
And instead of transactors directly hosting peers, they could transact through
custodian organizations who are running peers.

## Purging private data

For very sensitive data, even the parties sharing the private data might want
(or might be required by government regulations) to periodically "purge" the data
on their peers, leaving behind a hash of the data on the blockchain
to serve as immutable evidence of the private data.

In some of these cases, the private data only needs to exist on the peer's private
database until it can be replicated into a database external to the peer's
blockchain. The data might also only need to exist on the peers until a chaincode business
process is done with it (trade settled, contract fulfilled, etc.).

To support these use cases, private data can be purged if it has not been modified
for a configurable number of blocks. Purged private data cannot be queried from chaincode,
and is not available to other requesting peers.
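
The purge rule can be expressed as a small predicate. This is a sketch under stated assumptions: the parameter is named `blockToLive` after the collection property that sets the number of blocks, a value of 0 is taken to mean "never purge", and the exact boundary (purged once more than `blockToLive` blocks have followed the last modification) should be checked against the private data reference topic.

```go
package main

import "fmt"

// isPurged reports whether private data last modified in block lastModified
// would already be purged once block current has been committed.
func isPurged(lastModified, current, blockToLive uint64) bool {
	if blockToLive == 0 {
		return false // assumed meaning: keep the data indefinitely
	}
	return current > lastModified+blockToLive
}

func main() {
	// Data written in block 100 with a block-to-live of 3.
	for _, height := range []uint64{100, 103, 104} {
		fmt.Println(height, isPurged(100, height, 3))
	}
}
```

Note that modifying the key again resets the count, because the rule is based on the block in which the data was last modified.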

## How a private data collection is defined

For more details on collection definitions, and other low-level information about
private data and collections, refer to the [private data reference topic](../private-data-arch.html).
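
As a rough orientation, a collection definition is a list of JSON objects, one per collection. The Go sketch below builds one entry for PDC2 from the use case above and prints it as JSON. The field names follow the collection definition properties as commonly documented, but the MSP IDs, peer counts, and other values are made up for illustration; treat the reference topic as the authority on the exact schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// collectionConfig models a subset of the properties of one collection definition.
type collectionConfig struct {
	Name              string `json:"name"`
	Policy            string `json:"policy"`
	RequiredPeerCount int    `json:"requiredPeerCount"`
	MaxPeerCount      int    `json:"maxPeerCount"`
	BlockToLive       uint64 `json:"blockToLive"`
	MemberOnlyRead    bool   `json:"memberOnlyRead"`
}

// exampleCollections returns an illustrative definition for PDC2.
// DistributorMSP and WholesalerMSP are hypothetical MSP IDs.
func exampleCollections() []collectionConfig {
	return []collectionConfig{{
		Name:              "PDC2",
		Policy:            "OR('DistributorMSP.member', 'WholesalerMSP.member')",
		RequiredPeerCount: 1,
		MaxPeerCount:      2,
		BlockToLive:       0,
		MemberOnlyRead:    true,
	}}
}

func main() {
	out, err := json.MarshalIndent(exampleCollections(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```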

<!--- Licensed under Creative Commons Attribution 4.0 International License
https://creativecommons.org/licenses/by/4.0/ -->