CouchDB as the State Database
=============================

State Database options
----------------------

The current options for the peer state database are LevelDB and CouchDB. LevelDB is the default
key-value state database embedded in the peer process. CouchDB is an alternative external state database.
Like the LevelDB key-value store, CouchDB can store any binary data that is modeled in chaincode
(CouchDB attachment functionality is used internally for non-JSON binary data). But as a document
object store, CouchDB allows you to store data in JSON format, issue rich queries against your data,
and use indexes to support your queries.

Both LevelDB and CouchDB support core chaincode operations such as getting and setting a key
(asset), and querying based on keys. Keys can be queried by range, and composite keys can be
modeled to enable equivalence queries against multiple parameters. For example a composite
key of ``owner,asset_id`` can be used to query all assets owned by a certain entity. These key-based
queries can be used for read-only queries against the ledger, as well as in transactions that
update the ledger.

Modeling your data in JSON allows you to issue rich queries against the values of your data,
instead of only being able to query the keys. This makes it easier for your applications and
chaincode to read the data stored on the blockchain ledger. Using CouchDB can help you meet
auditing and reporting requirements for many use cases that are not supported by LevelDB. If you use
CouchDB and model your data in JSON, you can also deploy indexes with your chaincode.
Using indexes makes queries more flexible and efficient and enables you to query large
datasets from chaincode.

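
As a minimal sketch of what modeling an asset as JSON can look like, the Go
program below serializes a struct into the bytes that chaincode would store as
the ledger value (for example by passing them to ``PutState``). The ``Marble``
type, its field names, and the ``encodeMarble`` helper are hypothetical and
chosen only for illustration; the program uses only the Go standard library and
does not call the chaincode shim.

.. code:: go

    package main

    import (
        "encoding/json"
        "fmt"
    )

    // Marble is a hypothetical asset type used only for illustration.
    // The docType field lets rich queries tell asset types apart when
    // several kinds of documents share one chaincode namespace.
    type Marble struct {
        DocType string `json:"docType"`
        Name    string `json:"name"`
        Color   string `json:"color"`
        Size    int    `json:"size"`
        Owner   string `json:"owner"`
    }

    // encodeMarble returns the JSON bytes that chaincode would store as
    // the value for this asset's key.
    func encodeMarble(m Marble) ([]byte, error) {
        return json.Marshal(m)
    }

    func main() {
        value, err := encodeMarble(Marble{
            DocType: "marble",
            Name:    "marble1",
            Color:   "blue",
            Size:    35,
            Owner:   "tom",
        })
        if err != nil {
            panic(err)
        }
        // With CouchDB as the state database, a value like this is stored
        // as a JSON document whose fields can be queried and indexed.
        fmt.Println(string(value))
    }

Because the value is plain JSON, the same chaincode also works with LevelDB;
the fields simply cannot be queried there.
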
CouchDB runs as a separate database process alongside the peer, so there are additional
considerations in terms of setup, management, and operations. You may consider starting with the
default embedded LevelDB, and move to CouchDB if you require the additional complex rich queries.
It is a good practice to model asset data as JSON, so that you have the option to perform
complex rich queries if needed in the future.

.. note:: The key for a CouchDB JSON document can only contain valid UTF-8 strings and cannot begin
          with an underscore ("_"). Whether you are using CouchDB or LevelDB, you should avoid using
          U+0000 (nil byte) in keys.

JSON documents in CouchDB cannot use the following values as top level field names. These values
are reserved for internal use.

- Any field beginning with an underscore (``_``)
- ``~version``

Using CouchDB from Chaincode
----------------------------

Chaincode queries
~~~~~~~~~~~~~~~~~

Most of the `chaincode shim APIs <https://godoc.org/github.com/hyperledger/fabric-chaincode-go/shim#ChaincodeStubInterface>`__
can be used with either the LevelDB or the CouchDB state database, e.g. ``GetState``, ``PutState``,
``GetStateByRange``, ``GetStateByPartialCompositeKey``. Additionally, when you use CouchDB as
the state database and model assets as JSON in chaincode, you can perform rich queries against
the JSON in the state database by using the ``GetQueryResult`` API and passing a CouchDB query string.
The query string follows the `CouchDB JSON query syntax <http://docs.couchdb.org/en/2.1.1/api/database/find.html>`__.

The `marbles02 fabric sample <https://github.com/hyperledger/fabric-samples/blob/master/chaincode/marbles02/go/marbles_chaincode.go>`__
demonstrates use of CouchDB queries from chaincode. It includes a ``queryMarblesByOwner()`` function
that demonstrates parameterized queries by passing an owner id into chaincode.
It then queries the
state data for JSON documents matching the docType of ``marble`` and the owner id using the JSON query
syntax:

.. code:: bash

    {"selector":{"docType":"marble","owner":<OWNER_ID>}}

The responses to rich queries are useful for understanding the data on the ledger. However,
there is no guarantee that the result set for a rich query will be stable between
the chaincode execution and commit time. As a result, you should not use a rich query and
update the channel ledger in a single transaction. For example, if you perform a
rich query for all assets owned by Alice and transfer them to Bob, a new asset may
be assigned to Alice by another transaction between chaincode execution time
and commit time.


.. _couchdb-pagination:

CouchDB pagination
^^^^^^^^^^^^^^^^^^

Fabric supports paging of query results for rich queries and range based queries.
APIs supporting pagination allow page size and bookmarks to be used for
both range and rich queries. To support efficient pagination, the Fabric
pagination APIs must be used. Specifically, the CouchDB ``limit`` keyword will
not be honored in CouchDB queries since Fabric itself manages the pagination of
query results and implicitly sets the pageSize limit that is passed to CouchDB.

If a pageSize is specified using the paginated query APIs (``GetStateByRangeWithPagination()``,
``GetStateByPartialCompositeKeyWithPagination()``, and ``GetQueryResultWithPagination()``),
a set of results (bound by the pageSize) will be returned to the chaincode along with
a bookmark. The bookmark can be returned from chaincode to invoking clients,
which can use the bookmark in a follow-on query to receive the next "page" of results.

The pagination APIs are for use in read-only transactions only; the query results
are intended to support client paging requirements.
For transactions
that need to read and write, use the non-paginated chaincode query APIs. Within
chaincode you can iterate through result sets to your desired depth.

Regardless of whether the pagination APIs are used, all chaincode queries are
bound by ``totalQueryLimit`` (default 100000) from ``core.yaml``. This is the maximum
number of results that chaincode will iterate through and return to the client,
in order to avoid accidental or malicious long-running queries.

.. note:: Regardless of whether chaincode uses paginated queries or not, the peer will
          query CouchDB in batches based on ``internalQueryLimit`` (default 1000)
          from ``core.yaml``. This behavior ensures reasonably sized result sets are
          passed between the peer and CouchDB when executing chaincode, and is
          transparent to chaincode and the calling client.

An example using pagination is included in the :doc:`couchdb_tutorial` tutorial.

CouchDB indexes
~~~~~~~~~~~~~~~

Indexes in CouchDB are required in order to make JSON queries efficient and are required for
any JSON query with a sort. Indexes enable you to query data from chaincode when you have
a large amount of data on your ledger. Indexes can be packaged alongside chaincode
in a ``/META-INF/statedb/couchdb/indexes`` directory. Each index must be defined in
its own text file with extension ``*.json`` with the index definition formatted in JSON
following the `CouchDB index JSON syntax <http://docs.couchdb.org/en/2.1.1/api/database/find.html#db-index>`__.
For example, to support the above marble query, a sample index on the ``docType`` and ``owner``
fields is provided:

.. code:: json

    {"index":{"fields":["docType","owner"]},"ddoc":"indexOwnerDoc", "name":"indexOwner","type":"json"}

The sample index can be found `here <https://github.com/hyperledger/fabric-samples/blob/master/chaincode/marbles02/go/META-INF/statedb/couchdb/indexes/indexOwner.json>`__.

Any index in the chaincode’s ``META-INF/statedb/couchdb/indexes`` directory
will be packaged up with the chaincode for deployment. The index will be deployed
to a peer's channel and chaincode specific database when the chaincode package is
installed on the peer and the chaincode definition is committed to the channel. If you
install the chaincode first and then commit the chaincode definition to the
channel, the index will be deployed at commit time. If the chaincode has already
been defined on the channel and the chaincode package is subsequently installed on
a peer joined to the channel, the index will be deployed at chaincode
**installation** time.

Upon deployment, the index will automatically be used by chaincode queries. CouchDB can automatically
determine which index to use based on the fields being used in a query. Alternatively, in the
selector query the index can be specified using the ``use_index`` keyword.

The same index may exist in subsequent versions of the chaincode that gets installed. To change the
index, use the same index name but alter the index definition. Upon installation/instantiation, the index
definition will get re-deployed to the peer’s state database.

If you have a large volume of data already, and later install the chaincode, the index creation upon
installation may take some time. Similarly, if you have a large volume of data already and commit the
definition of a subsequent chaincode version, the index creation may take some time.
Avoid calling chaincode
functions that query the state database at these times as the chaincode query may time out while the
index is getting initialized. During transaction processing, the indexes will automatically get refreshed
as blocks are committed to the ledger. If the peer crashes during chaincode installation, the CouchDB
indexes may not get created. If this occurs, you need to reinstall the chaincode to create the indexes.

CouchDB Configuration
---------------------

CouchDB is enabled as the state database by changing the ``stateDatabase`` configuration option from
``goleveldb`` to ``CouchDB``. Additionally, the ``couchDBAddress`` needs to be configured to point to the
CouchDB to be used by the peer. The username and password properties should be populated with
an admin username and password if CouchDB is configured with a username and password. Additional
options are provided in the ``couchDBConfig`` section and are documented in place. Changes to
*core.yaml* take effect after the peer is restarted.

You can also pass in docker environment variables to override core.yaml values, for example
``CORE_LEDGER_STATE_STATEDATABASE`` and ``CORE_LEDGER_STATE_COUCHDBCONFIG_COUCHDBADDRESS``.

Below is the ``stateDatabase`` section from *core.yaml*:

.. code:: yaml

    state:
      # stateDatabase - options are "goleveldb", "CouchDB"
      # goleveldb - default state database stored in goleveldb.
      # CouchDB - store state database in CouchDB
      stateDatabase: goleveldb
      # Limit on the number of records to return per query
      totalQueryLimit: 100000
      couchDBConfig:
        # It is recommended to run CouchDB on the same server as the peer, and
        # not map the CouchDB container port to a server port in docker-compose.
        # Otherwise proper security must be provided on the connection between
        # CouchDB client (on the peer) and server.
        couchDBAddress: couchdb:5984
        # This username must have read and write authority on CouchDB
        username:
        # The password is recommended to pass as an environment variable
        # during start up (e.g. CORE_LEDGER_STATE_COUCHDBCONFIG_PASSWORD).
        # If it is stored here, the file must be access control protected
        # to prevent unintended users from discovering the password.
        password:
        # Number of retries for CouchDB errors
        maxRetries: 3
        # Number of retries for CouchDB errors during peer startup
        maxRetriesOnStartup: 10
        # CouchDB request timeout (unit: duration, e.g. 20s)
        requestTimeout: 35s
        # Limit on the number of records per each CouchDB query
        # Note that chaincode queries are only bound by totalQueryLimit.
        # Internally the chaincode may execute multiple CouchDB queries,
        # each of size internalQueryLimit.
        internalQueryLimit: 1000
        # Limit on the number of records per CouchDB bulk update batch
        maxBatchUpdateSize: 1000
        # Warm indexes after every N blocks.
        # This option warms any indexes that have been
        # deployed to CouchDB after every N blocks.
        # A value of 1 will warm indexes after every block commit,
        # to ensure fast selector queries.
        # Increasing the value may improve write efficiency of peer and CouchDB,
        # but may degrade query response time.
        warmIndexesAfterNBlocks: 1

The CouchDB docker containers supplied with Hyperledger Fabric allow the
CouchDB username and password to be set with the ``COUCHDB_USER`` and
``COUCHDB_PASSWORD`` environment variables, passed in using Docker Compose
scripting.

For CouchDB installations outside of the docker images supplied with Fabric,
the
`local.ini file of that installation
<http://docs.couchdb.org/en/2.1.1/config/intro.html#configuration-files>`__
must be edited to set the admin username and password.

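
The two override variables named earlier in this section suggest a naming
pattern: the path of the *core.yaml* key is upper-cased, its levels are joined
with underscores, and ``CORE_`` is prefixed. The Go sketch below reproduces
that pattern as an illustration. The ``envVarFor`` helper is hypothetical (it
is not part of Fabric), and the dotted key paths assume the ``state`` section
is nested under a top-level ``ledger`` key, as the variable names imply.

.. code:: go

    package main

    import (
        "fmt"
        "strings"
    )

    // envVarFor is a hypothetical helper that derives the environment
    // variable name for a dotted core.yaml key path: dots become
    // underscores, letters are upper-cased, and "CORE_" is prefixed.
    func envVarFor(key string) string {
        return "CORE_" + strings.ToUpper(strings.ReplaceAll(key, ".", "_"))
    }

    func main() {
        // Assumed key paths; they match the two variables named in the text.
        keys := []string{
            "ledger.state.stateDatabase",
            "ledger.state.couchDBConfig.couchDBAddress",
        }
        for _, key := range keys {
            fmt.Printf("%s -> %s\n", key, envVarFor(key))
        }
    }

Treat the output as a naming aid only, and confirm any derived variable
against the peer's own documentation before relying on it.
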
Docker compose scripts only set the username and password at the creation of
the container. The *local.ini* file must be edited if the username or password
is to be changed after creation of the container.

If you choose to map the fabric-couchdb container port to a host port, make sure you
are aware of the security implications. Mapping the CouchDB container port in a
development environment exposes the CouchDB REST API and allows you to visualize
the database via the CouchDB web interface (Fauxton). In a production environment
you should refrain from mapping the host port, so that only the peer can access
the CouchDB container.

.. note:: CouchDB peer options are read on each peer startup.

Good practices for queries
--------------------------

Avoid using chaincode for queries that will result in a scan of the entire
CouchDB database. Full-length database scans will result in long response
times and will degrade the performance of your network. You can take some of
the following steps to avoid long queries:

- When using JSON queries:

  * Be sure to create indexes in the chaincode package.
  * Avoid query operators such as ``$or``, ``$in`` and ``$regex``, which lead
    to full database scans.

- For range queries, composite key queries, and JSON queries:

  * Use paging support instead of one large result set.

- If you want to build a dashboard or collect aggregate data as part of your
  application, you can query an off-chain database that replicates the data
  from your blockchain network. This will allow you to query and analyze the
  blockchain data in a data store optimized for your needs, without degrading
  the performance of your network or disrupting transactions. To achieve this,
  applications may use block or chaincode events to write transaction data
  to an off-chain database or analytics engine. For each block received, the block
  listener application would iterate through the block transactions and build a
  data store using the key/value writes from each valid transaction's ``rwset``.
  The :doc:`peer_event_services` provide replayable events to ensure the
  integrity of downstream data stores.

.. Licensed under Creative Commons Attribution 4.0 International License
   https://creativecommons.org/licenses/by/4.0/