# Private data

## What is private data?

In cases where a group of organizations on a channel need to keep data private from
other organizations on that channel, they have the option to create a new channel
comprising just the organizations who need access to the data. However, creating
separate channels in each of these cases creates additional administrative overhead
(maintaining chaincode versions, policies, MSPs, etc.), and doesn't allow for use
cases in which you want all channel participants to see a transaction while keeping
a portion of the data private.

That's why, starting in v1.2, Fabric offers the ability to create
**private data collections**, which give a defined subset of organizations on a
channel the ability to endorse, commit, or query private data without having to
create a separate channel.

## What is a private data collection?

A collection is the combination of two elements:

1. **The actual private data**, sent peer-to-peer [via gossip protocol](../gossip.html)
   to only the organization(s) authorized to see it. This data is stored in a
   private state database on the peers of authorized organizations (sometimes
   called a "side" database, or "SideDB"), which can be accessed from chaincode
   on these authorized peers.
   The ordering service is not involved here and does not see the
   private data. Note that because gossip distributes the private data peer-to-peer
   across authorized organizations, it is required to set up anchor peers on the channel,
   and configure CORE_PEER_GOSSIP_EXTERNALENDPOINT on each peer,
   in order to bootstrap cross-organization communication.

2. **A hash of that data**, which is endorsed, ordered, and written to the ledgers
   of every peer on the channel. The hash serves as evidence of the transaction,
   is used for state validation, and can be used for audit purposes.

The following diagram illustrates the ledger contents of a peer authorized to have
private data and one which is not.



Collection members may decide to share the private data with other parties if they
get into a dispute or if they want to transfer the asset to a third party. The
third party can then compute the hash of the private data and see if it matches the
state on the channel ledger, proving that the state existed between the collection
members at a certain point in time.

### When to use a collection within a channel vs. a separate channel

* Use **channels** when entire transactions (and ledgers) must be kept
  confidential within a set of organizations that are members of the channel.

* Use **collections** when transactions (and ledgers) must be shared among a set
  of organizations, but when only a subset of those organizations should have
  access to some (or all) of the data within a transaction. Additionally,
  since private data is disseminated peer-to-peer rather than via blocks,
  use private data collections when transaction data must be kept confidential
  from ordering service nodes.

## A use case to explain collections

Consider a group of five organizations on a channel who trade produce:

* **A Farmer** selling his goods abroad
* **A Distributor** moving goods abroad
* **A Shipper** moving goods between parties
* **A Wholesaler** purchasing goods from distributors
* **A Retailer** purchasing goods from shippers and wholesalers

The **Distributor** might want to make private transactions with the
**Farmer** and **Shipper** to keep the terms of the trades confidential from
the **Wholesaler** and the **Retailer** (so as not to expose the markup they're
charging).

The **Distributor** may also want to have a separate private data relationship
with the **Wholesaler** because it charges them a lower price than it does the
**Retailer**.

The **Wholesaler** may also want to have a private data relationship with the
**Retailer** and the **Shipper**.

Rather than defining many small channels for each of these relationships, multiple
private data collections **(PDC)** can be defined to share private data between:

1. PDC1: **Distributor**, **Farmer** and **Shipper**
2. PDC2: **Distributor** and **Wholesaler**
3. PDC3: **Wholesaler**, **Retailer** and **Shipper**



Using this example, peers owned by the **Distributor** will have multiple private
databases inside their ledger, which include the private data from the
**Distributor**, **Farmer** and **Shipper** relationship and the
**Distributor** and **Wholesaler** relationship. Because these databases are kept
separate from the database that holds the channel ledger, private data is
sometimes referred to as "SideDB".



## Transaction flow with private data

When private data collections are referenced in chaincode, the transaction flow
is slightly different in order to protect the confidentiality of the private
data as transactions are proposed, endorsed, and committed to the ledger.

For details on transaction flows that don't use private data, refer to our
documentation on [transaction flow](../txflow.html).

1. The client application submits a proposal request to invoke a chaincode
   function (reading or writing private data) to endorsing peers which are
   part of authorized organizations of the collection. The private data, or
   data used to generate private data in chaincode, is sent in a `transient`
   field of the proposal.

2. The endorsing peers simulate the transaction and store the private data in
   a `transient data store` (a temporary storage local to the peer). They
   distribute the private data, based on the collection policy, to authorized peers
   via [gossip](../gossip.html).

3. The endorsing peer sends the proposal response back to the client. The proposal
   response includes the endorsed read/write set, which includes public
   data, as well as a hash of any private data keys and values. *No private data is
   sent back to the client*. For more information on how endorsement works with
   private data, click [here](../private-data-arch.html#endorsement).

4. The client application submits the transaction (which includes the proposal
   response with the private data hashes) to the ordering service. The transactions
   with the private data hashes get included in blocks as normal.
   The block with the private data hashes is distributed to all the peers. In this way,
   all peers on the channel can validate transactions with the hashes of the private
   data in a consistent way, without knowing the actual private data.

5. At block commit time, authorized peers use the collection policy to
   determine if they are authorized to have access to the private data. If they are,
   they will first check their local `transient data store` to determine if they
   have already received the private data at chaincode endorsement time. If not,
   they will attempt to pull the private data from another authorized peer. Then they
   will validate the private data against the hashes in the public block and commit the
   transaction and the block. Upon validation/commit, the private data is moved to
   their copy of the private state database and private writeset storage. The
   private data is then deleted from the `transient data store`.

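The validation in step 5 (and the third-party check described earlier) comes down to recomputing hashes over the private key and value and comparing them with the hashes recorded in the block. The following is a minimal sketch of that idea, not Fabric's actual code: it assumes SHA-256 as the hash function, and the `HashedWrite` type and the `hashOf` and `MatchesHashes` functions are hypothetical names introduced only for illustration.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// HashedWrite is a hypothetical stand-in for one entry of the hashed
// write set that every peer on the channel sees in a block.
type HashedWrite struct {
	KeyHash   []byte
	ValueHash []byte
}

// hashOf returns the SHA-256 digest of b (SHA-256 is an assumption here).
func hashOf(b []byte) []byte {
	sum := sha256.Sum256(b)
	return sum[:]
}

// MatchesHashes reports whether a private key/value pair is consistent
// with the hashes recorded on the channel ledger.
func MatchesHashes(key string, value []byte, hw HashedWrite) bool {
	return bytes.Equal(hashOf([]byte(key)), hw.KeyHash) &&
		bytes.Equal(hashOf(value), hw.ValueHash)
}

func main() {
	key, value := "asset1", []byte(`{"price":100}`)

	// What the block would carry: only hashes, never the private data.
	onLedger := HashedWrite{KeyHash: hashOf([]byte(key)), ValueHash: hashOf(value)}

	// Genuine private data matches the on-ledger hashes.
	fmt.Println(MatchesHashes(key, value, onLedger))
	// A tampered value does not.
	fmt.Println(MatchesHashes(key, []byte(`{"price":90}`), onLedger))
}
```

Because only the hashes travel through the ordering service, any party that later receives the private data out of band can run the same comparison without trusting the sender.
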
## Purging private data

For very sensitive data, even the parties sharing the private data might want,
or might be required by government regulations, to periodically "purge" the data
on their peers, leaving behind a hash of the data on the blockchain
to serve as immutable evidence of the private data.

In some of these cases, the private data only needs to exist on the peer's private
database until it can be replicated into a database external to the peer's
blockchain. The data might also only need to exist on the peers until a chaincode business
process is done with it (trade settled, contract fulfilled, etc.).

To support these use cases, private data can be purged if it has not been modified
for a configurable number of blocks. Purged private data cannot be queried from chaincode,
and is not available to other requesting peers.

## How a private data collection is defined

For more details on collection definitions, and other low-level information about
private data and collections, refer to the [private data reference topic](../private-data-arch.html).

<!--- Licensed under Creative Commons Attribution 4.0 International License
https://creativecommons.org/licenses/by/4.0/ -->
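
As a rough illustration of how a collection definition ties into the purge rule described under "Purging private data", the sketch below models a collection with a block-count setting and checks whether data is eligible for purging. The `Collection` type and `Purgeable` method are hypothetical; the field name `BlockToLive`, the convention that `0` means "never purge", and the exact boundary (strictly more than `BlockToLive` blocks since the last modification) are assumptions that should be checked against the reference topic.

```go
package main

import "fmt"

// Collection is a hypothetical, simplified view of a collection definition.
// Only the fields needed for this sketch are modeled.
type Collection struct {
	Name        string
	BlockToLive uint64 // assumed: 0 means the private data is never purged
}

// Purgeable reports whether private data last modified in block lastModified
// is eligible for purging once the ledger has reached block number current.
// The strict ">" boundary is an assumption made for illustration.
func (c Collection) Purgeable(lastModified, current uint64) bool {
	if c.BlockToLive == 0 || current < lastModified {
		return false
	}
	return current-lastModified > c.BlockToLive
}

func main() {
	pdc := Collection{Name: "PDC1", BlockToLive: 3}

	fmt.Println(pdc.Purgeable(10, 13)) // false: only 3 blocks since last change
	fmt.Println(pdc.Purgeable(10, 14)) // true: more than 3 blocks have passed

	keepForever := Collection{Name: "PDC2", BlockToLive: 0}
	fmt.Println(keepForever.Purgeable(10, 1000)) // false: purging disabled
}
```

In either case the hash on the channel ledger is untouched; only the private copy on the authorized peers is removed.
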