CouchDB as the State Database
=============================

State Database options
----------------------

State database options include LevelDB and CouchDB. LevelDB is the default key-value state
database embedded in the peer process. CouchDB is an optional alternative external state database.
Like the LevelDB key-value store, CouchDB can store any binary data that is modeled in chaincode
(CouchDB attachment functionality is used internally for non-JSON binary data). But as a JSON
document store, CouchDB additionally enables rich query against the chaincode data, when chaincode
values (e.g. assets) are modeled as JSON data.

Both LevelDB and CouchDB support core chaincode operations such as getting and setting a key
(asset), and querying based on keys. Keys can be queried by range, and composite keys can be
modeled to enable equivalence queries against multiple parameters. For example a composite
key of ``owner,asset_id`` can be used to query all assets owned by a certain entity. These key-based
queries can be used for read-only queries against the ledger, as well as in transactions that
update the ledger.

If you model assets as JSON and use CouchDB, you can also perform complex rich queries against the
chaincode data values, using the CouchDB JSON query language within chaincode. These types of
queries are excellent for understanding what is on the ledger. Proposal responses for these types
of queries are typically useful to the client application, but are not typically submitted as
transactions to the ordering service.
In fact, there is no guarantee that the result set is stable
between chaincode execution and commit time for rich queries, and therefore rich queries
are not appropriate for use in update transactions, unless your application can guarantee the
result set is stable between chaincode execution time and commit time, or can handle potential
changes in subsequent transactions. For example, if you perform a rich query for all assets
owned by Alice and transfer them to Bob, a new asset may be assigned to Alice by another
transaction between chaincode execution time and commit time, and you would miss this "phantom"
item.

CouchDB runs as a separate database process alongside the peer; therefore, there are additional
considerations in terms of setup, management, and operations. You may consider starting with the
default embedded LevelDB, and move to CouchDB if you require the additional complex rich queries.
It is a good practice to model chaincode asset data as JSON, so that you have the option to perform
complex rich queries if needed in the future.

.. note:: The key for a CouchDB JSON document can only contain valid UTF-8 strings and cannot begin
   with an underscore ("_"). Whether you are using CouchDB or LevelDB, you should avoid using
   U+0000 (nil byte) in keys.

JSON documents in CouchDB cannot use the following values as top level field names. These values
are reserved for internal use.

- Any field beginning with an underscore (``_``)
- ``~version``

Using CouchDB from Chaincode
----------------------------

Chaincode queries
~~~~~~~~~~~~~~~~~

Most of the `chaincode shim APIs <https://godoc.org/github.com/hyperledger/fabric-chaincode-go/shim#ChaincodeStubInterface>`__
can be utilized with either LevelDB or CouchDB state database, e.g. ``GetState``, ``PutState``,
``GetStateByRange``, ``GetStateByPartialCompositeKey``.
Additionally, when you utilize CouchDB as
the state database and model assets as JSON in chaincode, you can perform rich queries against
the JSON in the state database by using the ``GetQueryResult`` API and passing a CouchDB query string.
The query string follows the `CouchDB JSON query syntax <http://docs.couchdb.org/en/2.1.1/api/database/find.html>`__.

The `marbles02 fabric sample <https://github.com/hyperledger/fabric-samples/blob/master/chaincode/marbles02/go/marbles_chaincode.go>`__
demonstrates use of CouchDB queries from chaincode. It includes a ``queryMarblesByOwner()`` function
that demonstrates parameterized queries by passing an owner id into chaincode. It then queries the
state data for JSON documents matching the docType of “marble” and the owner id using the JSON query
syntax:

.. code:: bash

   {"selector":{"docType":"marble","owner":<OWNER_ID>}}

.. couchdb-pagination:

CouchDB pagination
^^^^^^^^^^^^^^^^^^

Fabric supports paging of query results for rich queries and range based queries.
APIs supporting pagination allow a page size and bookmarks to be used for
both range and rich queries. To support efficient pagination, the Fabric
pagination APIs must be used. Specifically, the CouchDB ``limit`` keyword will
not be honored in CouchDB queries since Fabric itself manages the pagination of
query results and implicitly sets the pageSize limit that is passed to CouchDB.

If a pageSize is specified using the paginated query APIs (``GetStateByRangeWithPagination()``,
``GetStateByPartialCompositeKeyWithPagination()``, and ``GetQueryResultWithPagination()``),
a set of results (bound by the pageSize) will be returned to the chaincode along with
a bookmark. The bookmark can be returned from chaincode to invoking clients,
which can use the bookmark in a follow-on query to receive the next "page" of results.

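
The bookmark exchange described above can be illustrated with a small, self-contained
Go sketch. It does not call the Fabric shim: ``queryPage`` and ``fetchAll`` are hypothetical
helpers that imitate a paginated query over an in-memory key list. The integer-encoded
bookmark is an assumption made for illustration (real bookmarks should be treated as opaque
values and passed back unchanged), and signalling the last page with an empty bookmark is a
simplification of this sketch.

.. code:: go

   package main

   import (
   	"fmt"
   	"strconv"
   )

   // page is a hypothetical stand-in for one page of results plus the
   // bookmark that a paginated query returns alongside it.
   type page struct {
   	Keys     []string
   	Bookmark string // empty in this sketch when no results remain
   }

   // queryPage imitates a paginated query over an ordered, in-memory key list.
   // The bookmark is the index of the next unread key encoded as a string;
   // this encoding is an assumption of the sketch, not Fabric's format.
   func queryPage(keys []string, pageSize int, bookmark string) (page, error) {
   	if pageSize <= 0 {
   		return page{}, fmt.Errorf("page size must be positive, got %d", pageSize)
   	}
   	start := 0
   	if bookmark != "" {
   		n, err := strconv.Atoi(bookmark)
   		if err != nil || n < 0 || n > len(keys) {
   			return page{}, fmt.Errorf("invalid bookmark %q", bookmark)
   		}
   		start = n
   	}
   	end := start + pageSize
   	if end > len(keys) {
   		end = len(keys)
   	}
   	next := ""
   	if end < len(keys) {
   		next = strconv.Itoa(end)
   	}
   	return page{Keys: keys[start:end], Bookmark: next}, nil
   }

   // fetchAll plays the role of the client: it requests one page at a time
   // and feeds each returned bookmark into the follow-on query.
   func fetchAll(keys []string, pageSize int) ([][]string, error) {
   	var pages [][]string
   	bookmark := ""
   	for {
   		p, err := queryPage(keys, pageSize, bookmark)
   		if err != nil {
   			return nil, err
   		}
   		pages = append(pages, p.Keys)
   		if p.Bookmark == "" {
   			return pages, nil
   		}
   		bookmark = p.Bookmark
   	}
   }

   func main() {
   	keys := []string{"marble1", "marble2", "marble3", "marble4", "marble5"}
   	pages, err := fetchAll(keys, 2)
   	if err != nil {
   		panic(err)
   	}
   	for i, p := range pages {
   		fmt.Printf("page %d: %v\n", i+1, p)
   	}
   }

In a real application the client loop would live outside the peer, and each call to
``queryPage`` would correspond to a separate chaincode query that passes along the
bookmark received from the previous response.
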

The pagination APIs are for use in read-only transactions only; the query results
are intended to support client paging requirements. For transactions
that need to read and write, use the non-paginated chaincode query APIs. Within
chaincode you can iterate through result sets to your desired depth.

Regardless of whether the pagination APIs are utilized, all chaincode queries are
bound by ``totalQueryLimit`` (default 100000) from ``core.yaml``. This is the maximum
number of results that chaincode will iterate through and return to the client,
in order to avoid accidental or malicious long-running queries.

.. note:: Regardless of whether chaincode uses paginated queries or not, the peer will
   query CouchDB in batches based on ``internalQueryLimit`` (default 1000)
   from ``core.yaml``. This behavior ensures reasonably sized result sets are
   passed between the peer and CouchDB when executing chaincode, and is
   transparent to chaincode and the calling client.

An example using pagination is included in the :doc:`couchdb_tutorial` tutorial.

CouchDB indexes
~~~~~~~~~~~~~~~

Indexes in CouchDB are required in order to make JSON queries efficient and are required for
any JSON query with a sort. Indexes can be packaged alongside chaincode in a
``/META-INF/statedb/couchdb/indexes`` directory. Each index must be defined in its own
text file with extension ``*.json`` with the index definition formatted in JSON following the
`CouchDB index JSON syntax <http://docs.couchdb.org/en/2.1.1/api/database/find.html#db-index>`__.
For example, to support the above marble query, a sample index on the ``docType`` and ``owner``
fields is provided:

.. code:: bash

   {"index":{"fields":["docType","owner"]},"ddoc":"indexOwnerDoc", "name":"indexOwner","type":"json"}

The sample index can be found `here <https://github.com/hyperledger/fabric-samples/blob/master/chaincode/marbles02/go/META-INF/statedb/couchdb/indexes/indexOwner.json>`__.

Any index in the chaincode’s ``META-INF/statedb/couchdb/indexes`` directory
will be packaged up with the chaincode for deployment. When the chaincode is
both installed on a peer and instantiated on one of the peer’s channels, the
index will automatically be deployed to the peer’s channel and chaincode
specific state database (if it has been configured to use CouchDB). If you
install the chaincode first and then instantiate the chaincode on the channel,
the index will be deployed at chaincode **instantiation** time. If the
chaincode is already instantiated on a channel and you later install the
chaincode on a peer, the index will be deployed at chaincode **installation**
time.

Upon deployment, the index will automatically be utilized by chaincode queries. CouchDB can automatically
determine which index to use based on the fields being used in a query. Alternatively, in the
selector query the index can be specified using the ``use_index`` keyword.

The same index may exist in subsequent versions of the chaincode that gets installed. To change the
index, use the same index name but alter the index definition. Upon installation/instantiation, the index
definition will get re-deployed to the peer’s state database.

If you have a large volume of data already, and later install the chaincode, the index creation upon
installation may take some time. Similarly, if you have a large volume of data already and instantiate
a subsequent version of the chaincode, the index creation may take some time.
Avoid calling chaincode
functions that query the state database at these times as the chaincode query may time out while the
index is getting initialized. During transaction processing, the indexes will automatically get refreshed
as blocks are committed to the ledger.

CouchDB Configuration
---------------------

CouchDB is enabled as the state database by changing the ``stateDatabase`` configuration option from
``goleveldb`` to ``CouchDB``. Additionally, the ``couchDBAddress`` needs to be configured to point to the
CouchDB to be used by the peer. The username and password properties should be populated with
an admin username and password if CouchDB is configured with a username and password. Additional
options are provided in the ``couchDBConfig`` section and are documented in place. Changes to the
*core.yaml* will be effective immediately after restarting the peer.

You can also pass in docker environment variables to override core.yaml values, for example
``CORE_LEDGER_STATE_STATEDATABASE`` and ``CORE_LEDGER_STATE_COUCHDBCONFIG_COUCHDBADDRESS``.

Below is the ``stateDatabase`` section from *core.yaml*:

.. code:: bash

   state:
     # stateDatabase - options are "goleveldb", "CouchDB"
     # goleveldb - default state database stored in goleveldb.
     # CouchDB - store state database in CouchDB
     stateDatabase: goleveldb
     # Limit on the number of records to return per query
     totalQueryLimit: 100000
     couchDBConfig:
        # It is recommended to run CouchDB on the same server as the peer, and
        # not map the CouchDB container port to a server port in docker-compose.
        # Otherwise proper security must be provided on the connection between
        # CouchDB client (on the peer) and server.
        couchDBAddress: couchdb:5984
        # This username must have read and write authority on CouchDB
        username:
        # The password is recommended to pass as an environment variable
        # during start up (e.g. LEDGER_COUCHDBCONFIG_PASSWORD).
        # If it is stored here, the file must be access control protected
        # to prevent unintended users from discovering the password.
        password:
        # Number of retries for CouchDB errors
        maxRetries: 3
        # Number of retries for CouchDB errors during peer startup
        maxRetriesOnStartup: 10
        # CouchDB request timeout (unit: duration, e.g. 20s)
        requestTimeout: 35s
        # Limit on the number of records per each CouchDB query
        # Note that chaincode queries are only bound by totalQueryLimit.
        # Internally the chaincode may execute multiple CouchDB queries,
        # each of size internalQueryLimit.
        internalQueryLimit: 1000
        # Limit on the number of records per CouchDB bulk update batch
        maxBatchUpdateSize: 1000
        # Warm indexes after every N blocks.
        # This option warms any indexes that have been
        # deployed to CouchDB after every N blocks.
        # A value of 1 will warm indexes after every block commit,
        # to ensure fast selector queries.
        # Increasing the value may improve write efficiency of peer and CouchDB,
        # but may degrade query response time.
        warmIndexesAfterNBlocks: 1

The CouchDB Docker images supplied with Hyperledger Fabric allow the CouchDB
username and password to be set through the ``COUCHDB_USER`` and
``COUCHDB_PASSWORD`` environment variables, for example in Docker Compose
scripts.

For CouchDB installations outside of the docker images supplied with Fabric,
the
`local.ini file of that installation
<http://docs.couchdb.org/en/2.1.1/config/intro.html#configuration-files>`__
must be edited to set the admin username and password.

Docker compose scripts only set the username and password at the creation of
the container. The *local.ini* file must be edited if the username or password
is to be changed after creation of the container.

.. note:: CouchDB peer options are read on each peer startup.

Good practices for queries
--------------------------

Avoid using chaincode for queries that will result in a scan of the entire
CouchDB database. Full length database scans will result in long response
times and will degrade the performance of your network. You can take some of
the following steps to avoid long queries:

- When using JSON queries:

  * Be sure to create indexes in the chaincode package.
  * Avoid query operators such as ``$or``, ``$in`` and ``$regex``, which lead
    to full database scans.

- For range queries, composite key queries, and JSON queries:

  * Utilize paging support (as of v1.3) instead of one large result set.

- If you want to build a dashboard or collect aggregate data as part of your
  application, you can query an off-chain database that replicates the data
  from your blockchain network. This will allow you to query and analyze the
  blockchain data in a data store optimized for your needs, without degrading
  the performance of your network or disrupting transactions. To achieve this,
  applications may use block or chaincode events to write transaction data
  to an off-chain database or analytics engine. For each block received, the block
  listener application would iterate through the block transactions and build a
  data store using the key/value writes from each valid transaction's ``rwset``.
  The :doc:`peer_event_services` provide replayable events to ensure the
  integrity of downstream data stores.

.. Licensed under Creative Commons Attribution 4.0 International License
   https://creativecommons.org/licenses/by/4.0/
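
The off-chain replication approach from the last good practice can be sketched in Go
as follows. This is a minimal illustration under stated assumptions, not a working block
listener: ``kvWrite``, ``transaction``, ``block`` and ``applyBlock`` are hypothetical,
simplified stand-ins for the decoded block contents (they are not Fabric's protobuf
types), and a plain map stands in for the off-chain database.

.. code:: go

   package main

   import "fmt"

   // kvWrite is a simplified, hypothetical view of one key/value write
   // taken from a transaction's rwset.
   type kvWrite struct {
   	Key      string
   	Value    []byte
   	IsDelete bool
   }

   // transaction is a hypothetical, already-decoded transaction.
   type transaction struct {
   	Valid  bool      // whether the transaction was marked valid at commit
   	Writes []kvWrite // writes from the transaction's rwset
   }

   // block is a hypothetical, already-decoded block.
   type block struct {
   	Number       uint64
   	Transactions []transaction
   }

   // applyBlock replays the writes of every valid transaction, in block
   // order, onto the off-chain store. Writes belonging to invalid
   // transactions are skipped because they never changed the ledger state.
   func applyBlock(store map[string][]byte, b block) {
   	for _, tx := range b.Transactions {
   		if !tx.Valid {
   			continue
   		}
   		for _, w := range tx.Writes {
   			if w.IsDelete {
   				delete(store, w.Key)
   				continue
   			}
   			store[w.Key] = w.Value
   		}
   	}
   }

   func main() {
   	store := map[string][]byte{}
   	blocks := []block{
   		{Number: 1, Transactions: []transaction{
   			{Valid: true, Writes: []kvWrite{{Key: "marble1", Value: []byte(`{"owner":"alice"}`)}}},
   			{Valid: false, Writes: []kvWrite{{Key: "marble2", Value: []byte(`{"owner":"eve"}`)}}},
   		}},
   		{Number: 2, Transactions: []transaction{
   			{Valid: true, Writes: []kvWrite{{Key: "marble1", Value: []byte(`{"owner":"bob"}`)}}},
   		}},
   	}
   	for _, b := range blocks {
   		applyBlock(store, b)
   	}
   	for k, v := range store {
   		fmt.Printf("%s = %s\n", k, v)
   	}
   }

Processing blocks strictly in order matters here, since a later write to the same key
must overwrite an earlier one for the off-chain copy to match the ledger state.
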