CouchDB as the State Database
=============================

State Database options
----------------------

State database options include LevelDB and CouchDB. LevelDB is the default key-value state
database embedded in the peer process. CouchDB is an optional alternative external state database.
Like the LevelDB key-value store, CouchDB can store any binary data that is modeled in chaincode
(CouchDB attachment functionality is used internally for non-JSON binary data). But as a JSON
document store, CouchDB additionally enables rich queries against the chaincode data when chaincode
values (e.g. assets) are modeled as JSON data.

Both LevelDB and CouchDB support core chaincode operations such as getting and setting a key
(asset), and querying based on keys. Keys can be queried by range, and composite keys can be
modeled to enable equivalence queries against multiple parameters. For example, a composite
key of ``owner,asset_id`` can be used to query all assets owned by a certain entity. These key-based
queries can be used for read-only queries against the ledger, as well as in transactions that
update the ledger.
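
The following Go sketch illustrates why a composite key of ``owner,asset_id`` supports
"all assets owned by X" queries: keys that share the owner prefix sort next to each other,
so a prefix scan finds them. It is an illustration only, not the shim's implementation.
The names ``sep``, ``compositeKey`` and ``assetsOwnedBy`` are hypothetical, and the
``owner~asset`` object type is an assumption. Real chaincode would normally build keys
with the shim's ``CreateCompositeKey`` and query them with ``GetStateByPartialCompositeKey``.

.. code:: go

  package main

  import (
      "fmt"
      "sort"
      "strings"
  )

  // sep delimits the parts of a composite key in this sketch (an assumption
  // made for illustration; real chaincode should let the shim build the key).
  const sep = "\x00"

  // compositeKey joins an object type and its attributes into a single key.
  func compositeKey(objectType string, attrs ...string) string {
      return objectType + sep + strings.Join(attrs, sep) + sep
  }

  // assetsOwnedBy returns the asset IDs stored under owner~asset keys for the
  // given owner. It sorts a copy of the keys and walks the contiguous run that
  // shares the owner prefix, much as a range scan over sorted keys would.
  func assetsOwnedBy(keys []string, owner string) []string {
      sorted := append([]string(nil), keys...)
      sort.Strings(sorted)
      prefix := compositeKey("owner~asset", owner)
      var ids []string
      for _, k := range sorted[sort.SearchStrings(sorted, prefix):] {
          if !strings.HasPrefix(k, prefix) {
              break
          }
          ids = append(ids, strings.TrimSuffix(strings.TrimPrefix(k, prefix), sep))
      }
      return ids
  }

  func main() {
      keys := []string{
          compositeKey("owner~asset", "alice", "asset2"),
          compositeKey("owner~asset", "bob", "asset3"),
          compositeKey("owner~asset", "alice", "asset1"),
      }
      fmt.Println(assetsOwnedBy(keys, "alice")) // prints [asset1 asset2]
  }

Because the owner comes first in the key, the same layout cannot answer "who owns asset X"
without a second composite key that leads with the asset ID.
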

If you model assets as JSON and use CouchDB, you can also perform complex rich queries against the
chaincode data values, using the CouchDB JSON query language within chaincode. These types of
queries are excellent for understanding what is on the ledger. Proposal responses for these types
of queries are typically useful to the client application, but are not typically submitted as
transactions to the ordering service. In fact, for rich queries there is no guarantee that the
result set is stable between chaincode execution time and commit time. Rich queries are therefore
not appropriate for use in update transactions, unless your application can guarantee that the
result set is stable between chaincode execution time and commit time, or can handle potential
changes in subsequent transactions. For example, if you perform a rich query for all assets
owned by Alice and transfer them to Bob, a new asset may be assigned to Alice by another
transaction between chaincode execution time and commit time, and you would miss this "phantom"
item.

CouchDB runs as a separate database process alongside the peer, so there are additional
considerations in terms of setup, management, and operations. You may consider starting with the
default embedded LevelDB, and moving to CouchDB if you require the additional complex rich queries.
It is a good practice to model chaincode asset data as JSON, so that you have the option to perform
complex rich queries if needed in the future.

.. note:: The key for a CouchDB JSON document can only contain valid UTF-8 strings and cannot begin
   with an underscore ("_"). Whether you are using CouchDB or LevelDB, you should avoid using
   U+0000 (nil byte) in keys.

   JSON documents in CouchDB cannot use the following values as top level field names. These values
   are reserved for internal use.

   - Any field beginning with an underscore (``_``)
   - ``~version``
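
As a minimal sketch, the rules in the note above could be checked in Go before a value is
written to the ledger. The helper names ``validateKey`` and ``validateFields`` are
hypothetical and are not part of the chaincode shim. The checks apply the note literally
and do not attempt to cover every restriction the peer may enforce.

.. code:: go

  package main

  import (
      "errors"
      "fmt"
      "strings"
      "unicode/utf8"
  )

  // validateKey applies the key rules from the note to a state key.
  func validateKey(key string) error {
      switch {
      case !utf8.ValidString(key):
          return errors.New("key is not valid UTF-8")
      case strings.HasPrefix(key, "_"):
          return errors.New("key begins with an underscore")
      case strings.ContainsRune(key, 0):
          return errors.New("key contains U+0000")
      }
      return nil
  }

  // validateFields rejects the reserved top-level field names listed in the note.
  func validateFields(doc map[string]interface{}) error {
      for name := range doc {
          if strings.HasPrefix(name, "_") || name == "~version" {
              return fmt.Errorf("reserved top-level field name %q", name)
          }
      }
      return nil
  }

  func main() {
      fmt.Println(validateKey("marble1"))
      fmt.Println(validateKey("_marble1"))
      fmt.Println(validateFields(map[string]interface{}{"docType": "marble", "_rev": "1"}))
  }

Only top-level field names are inspected here, which matches the scope of the restriction
described in the note.
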

Using CouchDB from Chaincode
----------------------------

Chaincode queries
~~~~~~~~~~~~~~~~~

Most of the `chaincode shim APIs <https://godoc.org/github.com/hyperledger/fabric-chaincode-go/shim#ChaincodeStubInterface>`__
can be utilized with either the LevelDB or the CouchDB state database, e.g. ``GetState``, ``PutState``,
``GetStateByRange``, ``GetStateByPartialCompositeKey``. Additionally, when you utilize CouchDB as
the state database and model assets as JSON in chaincode, you can perform rich queries against
the JSON in the state database by using the ``GetQueryResult`` API and passing a CouchDB query string.
The query string follows the `CouchDB JSON query syntax <http://docs.couchdb.org/en/2.1.1/api/database/find.html>`__.

The `marbles02 fabric sample <https://github.com/hyperledger/fabric-samples/blob/master/chaincode/marbles02/go/marbles_chaincode.go>`__
demonstrates use of CouchDB queries from chaincode. It includes a ``queryMarblesByOwner()`` function
that demonstrates parameterized queries by passing an owner id into chaincode. It then queries the
state data for JSON documents matching the ``docType`` of ``marble`` and the owner id using the JSON query
syntax:

.. code:: bash

  {"selector":{"docType":"marble","owner":<OWNER_ID>}}
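
One way to build such a query string in Go is to marshal it with ``encoding/json`` so that
the owner id is escaped correctly. This is a hedged sketch rather than the code used by
the sample. The function name ``marblesByOwnerQuery`` is hypothetical. In chaincode, the
resulting string would be passed to ``GetQueryResult``.

.. code:: go

  package main

  import (
      "encoding/json"
      "fmt"
  )

  // marblesByOwnerQuery returns a CouchDB selector query for all documents
  // with docType "marble" that belong to the given owner.
  func marblesByOwnerQuery(owner string) (string, error) {
      query := map[string]interface{}{
          "selector": map[string]string{
              "docType": "marble",
              "owner":   owner,
          },
      }
      // json.Marshal escapes quotes and control characters in owner.
      b, err := json.Marshal(query)
      if err != nil {
          return "", err
      }
      return string(b), nil
  }

  func main() {
      q, err := marblesByOwnerQuery("tom")
      if err != nil {
          panic(err)
      }
      fmt.Println(q) // prints {"selector":{"docType":"marble","owner":"tom"}}
  }

Marshaling instead of concatenating strings means an owner id that contains a double quote
cannot change the structure of the selector.
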

.. _couchdb-pagination:

CouchDB pagination
^^^^^^^^^^^^^^^^^^

Fabric supports paging of query results for rich queries and range based queries.
APIs supporting pagination accept a page size and a bookmark for
both range and rich queries. To support efficient pagination, the Fabric
pagination APIs must be used. Specifically, the CouchDB ``limit`` keyword will
not be honored in CouchDB queries since Fabric itself manages the pagination of
query results and implicitly sets the pageSize limit that is passed to CouchDB.

If a pageSize is specified using the paginated query APIs (``GetStateByRangeWithPagination()``,
``GetStateByPartialCompositeKeyWithPagination()``, and ``GetQueryResultWithPagination()``),
a set of results (bound by the pageSize) will be returned to the chaincode along with
a bookmark. The bookmark can be returned from chaincode to invoking clients,
which can use the bookmark in a follow-on query to receive the next "page" of results.
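
The client side of this bookmark flow can be sketched in Go as follows. Everything here
is an assumption made for illustration: ``page``, ``pageFetcher``, ``fetchAll`` and
``fakeFetcher`` are hypothetical names, and ``fakeFetcher`` merely simulates a chaincode
function that would call one of the paginated APIs and return the records together with
the bookmark.

.. code:: go

  package main

  import (
      "errors"
      "fmt"
      "strconv"
  )

  // page is a hypothetical shape for one page of results plus its bookmark.
  type page struct {
      Records  []string
      Bookmark string
  }

  // pageFetcher stands in for a client call that evaluates a paginated
  // chaincode query with the given page size and bookmark.
  type pageFetcher func(pageSize int, bookmark string) (page, error)

  // fetchAll requests pages until one comes back with fewer records than
  // pageSize, passing each returned bookmark into the next request.
  func fetchAll(fetch pageFetcher, pageSize int) ([]string, error) {
      if pageSize <= 0 {
          return nil, errors.New("pageSize must be positive")
      }
      var all []string
      bookmark := "" // an empty bookmark asks for the first page
      for {
          p, err := fetch(pageSize, bookmark)
          if err != nil {
              return nil, err
          }
          all = append(all, p.Records...)
          if len(p.Records) < pageSize {
              return all, nil
          }
          bookmark = p.Bookmark
      }
  }

  // fakeFetcher simulates the peer: its bookmark is the index of the next record.
  func fakeFetcher(data []string) pageFetcher {
      return func(pageSize int, bookmark string) (page, error) {
          start := 0
          if bookmark != "" {
              n, err := strconv.Atoi(bookmark)
              if err != nil || n < 0 || n > len(data) {
                  return page{}, errors.New("invalid bookmark")
              }
              start = n
          }
          end := start + pageSize
          if end > len(data) {
              end = len(data)
          }
          return page{Records: data[start:end], Bookmark: strconv.Itoa(end)}, nil
      }
  }

  func main() {
      data := []string{"m1", "m2", "m3", "m4", "m5"}
      all, err := fetchAll(fakeFetcher(data), 2)
      fmt.Println(all, err)
  }

A real bookmark should be treated as an opaque value. The numeric bookmark used here is
only a convenience of the simulation.
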

The pagination APIs are for use in read-only transactions only; the query results
are intended to support client paging requirements. For transactions
that need to read and write, use the non-paginated chaincode query APIs. Within
chaincode you can iterate through result sets to your desired depth.

Regardless of whether the pagination APIs are utilized, all chaincode queries are
bound by ``totalQueryLimit`` (default 100000) from ``core.yaml``. This is the maximum
number of results that chaincode will iterate through and return to the client,
in order to avoid accidental or malicious long-running queries.

.. note:: Regardless of whether chaincode uses paginated queries or not, the peer will
          query CouchDB in batches based on ``internalQueryLimit`` (default 1000)
          from ``core.yaml``. This behavior ensures reasonably sized result sets are
          passed between the peer and CouchDB when executing chaincode, and is
          transparent to chaincode and the calling client.

An example using pagination is included in the :doc:`couchdb_tutorial` tutorial.

CouchDB indexes
~~~~~~~~~~~~~~~

Indexes in CouchDB are required in order to make JSON queries efficient and are required for
any JSON query with a sort. Indexes can be packaged alongside chaincode in a
``/META-INF/statedb/couchdb/indexes`` directory. Each index must be defined in its own
text file with extension ``*.json`` with the index definition formatted in JSON following the
`CouchDB index JSON syntax <http://docs.couchdb.org/en/2.1.1/api/database/find.html#db-index>`__.
For example, to support the above marble query, a sample index on the ``docType`` and ``owner``
fields is provided:

.. code:: bash

  {"index":{"fields":["docType","owner"]},"ddoc":"indexOwnerDoc", "name":"indexOwner","type":"json"}

The sample index can be found `here <https://github.com/hyperledger/fabric-samples/blob/master/chaincode/marbles02/go/META-INF/statedb/couchdb/indexes/indexOwner.json>`__.
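
Before packaging, an index file could be sanity-checked with a small Go helper such as
the sketch below. The type ``indexDefinition`` and the function ``checkIndex`` are
hypothetical. Requiring ``ddoc``, ``name`` and a ``type`` of ``json`` is a convention
chosen for this sketch so that every index looks like the sample above. It is not
presented as a CouchDB requirement.

.. code:: go

  package main

  import (
      "encoding/json"
      "errors"
      "fmt"
  )

  // indexDefinition mirrors the fields used by the sample index definition.
  type indexDefinition struct {
      Index struct {
          Fields []interface{} `json:"fields"`
      } `json:"index"`
      Ddoc string `json:"ddoc"`
      Name string `json:"name"`
      Type string `json:"type"`
  }

  // checkIndex reports whether raw parses as JSON and follows the sample's shape.
  func checkIndex(raw []byte) error {
      var def indexDefinition
      if err := json.Unmarshal(raw, &def); err != nil {
          return fmt.Errorf("invalid JSON: %w", err)
      }
      switch {
      case len(def.Index.Fields) == 0:
          return errors.New("index.fields is empty")
      case def.Ddoc == "" || def.Name == "":
          return errors.New("ddoc and name should both be set")
      case def.Type != "json":
          return errors.New(`type should be "json"`)
      }
      return nil
  }

  func main() {
      sample := `{"index":{"fields":["docType","owner"]},"ddoc":"indexOwnerDoc", "name":"indexOwner","type":"json"}`
      fmt.Println(checkIndex([]byte(sample)))
      fmt.Println(checkIndex([]byte(`{"index":{"fields":[]}}`)))
  }
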

Any index in the chaincode’s ``META-INF/statedb/couchdb/indexes`` directory
will be packaged up with the chaincode for deployment. When the chaincode is
both installed on a peer and instantiated on one of the peer’s channels, the
index will automatically be deployed to the peer’s channel and chaincode
specific state database (if it has been configured to use CouchDB). If you
install the chaincode first and then instantiate the chaincode on the channel,
the index will be deployed at chaincode **instantiation** time. If the
chaincode is already instantiated on a channel and you later install the
chaincode on a peer, the index will be deployed at chaincode **installation**
time.

Upon deployment, the index will automatically be utilized by chaincode queries. CouchDB can automatically
determine which index to use based on the fields being used in a query. Alternatively, in the
selector query the index can be specified using the ``use_index`` keyword.

The same index may exist in subsequent versions of the chaincode that gets installed. To change the
index, use the same index name but alter the index definition. Upon installation/instantiation, the index
definition will get re-deployed to the peer’s state database.

If you have a large volume of data already, and later install the chaincode, the index creation upon
installation may take some time. Similarly, if you have a large volume of data already and instantiate
a subsequent version of the chaincode, the index creation may take some time. Avoid calling chaincode
functions that query the state database at these times as the chaincode query may time out while the
index is getting initialized. During transaction processing, the indexes will automatically get refreshed
as blocks are committed to the ledger.

CouchDB Configuration
---------------------

CouchDB is enabled as the state database by changing the ``stateDatabase`` configuration option from
goleveldb to CouchDB. Additionally, the ``couchDBAddress`` needs to be configured to point to the
CouchDB to be used by the peer. The username and password properties should be populated with
an admin username and password if CouchDB is configured with a username and password. Additional
options are provided in the ``couchDBConfig`` section and are documented in place. Changes to
*core.yaml* take effect after the peer is restarted.

You can also pass in docker environment variables to override core.yaml values, for example
``CORE_LEDGER_STATE_STATEDATABASE`` and ``CORE_LEDGER_STATE_COUCHDBCONFIG_COUCHDBADDRESS``.
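
The naming pattern behind these two variables can be sketched in Go: take the path of the
option inside *core.yaml*, replace the dots with underscores, upper-case the result, and
prefix it with ``CORE_``. The function name ``envOverride`` is hypothetical, and the sketch
assumes that other options follow the same pattern as the two examples above.

.. code:: go

  package main

  import (
      "fmt"
      "strings"
  )

  // envOverride derives an environment variable name from a dotted
  // core.yaml path such as "ledger.state.stateDatabase".
  func envOverride(path string) string {
      return "CORE_" + strings.ToUpper(strings.ReplaceAll(path, ".", "_"))
  }

  func main() {
      fmt.Println(envOverride("ledger.state.stateDatabase"))
      fmt.Println(envOverride("ledger.state.couchDBConfig.couchDBAddress"))
  }
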

Below is the ``stateDatabase`` section from *core.yaml*:

.. code:: bash

    state:
      # stateDatabase - options are "goleveldb", "CouchDB"
      # goleveldb - default state database stored in goleveldb.
      # CouchDB - store state database in CouchDB
      stateDatabase: goleveldb
      # Limit on the number of records to return per query
      totalQueryLimit: 100000
      couchDBConfig:
         # It is recommended to run CouchDB on the same server as the peer, and
         # not map the CouchDB container port to a server port in docker-compose.
         # Otherwise proper security must be provided on the connection between
         # CouchDB client (on the peer) and server.
         couchDBAddress: couchdb:5984
         # This username must have read and write authority on CouchDB
         username:
         # The password is recommended to pass as an environment variable
         # during start up (e.g. LEDGER_COUCHDBCONFIG_PASSWORD).
         # If it is stored here, the file must be access control protected
         # to prevent unintended users from discovering the password.
         password:
         # Number of retries for CouchDB errors
         maxRetries: 3
         # Number of retries for CouchDB errors during peer startup
         maxRetriesOnStartup: 10
         # CouchDB request timeout (unit: duration, e.g. 20s)
         requestTimeout: 35s
         # Limit on the number of records per each CouchDB query
         # Note that chaincode queries are only bound by totalQueryLimit.
         # Internally the chaincode may execute multiple CouchDB queries,
         # each of size internalQueryLimit.
         internalQueryLimit: 1000
         # Limit on the number of records per CouchDB bulk update batch
         maxBatchUpdateSize: 1000
         # Warm indexes after every N blocks.
         # This option warms any indexes that have been
         # deployed to CouchDB after every N blocks.
         # A value of 1 will warm indexes after every block commit,
         # to ensure fast selector queries.
         # Increasing the value may improve write efficiency of peer and CouchDB,
         # but may degrade query response time.
         warmIndexesAfterNBlocks: 1

The CouchDB docker containers supplied with Hyperledger Fabric allow the
CouchDB username and password to be set through the ``COUCHDB_USER`` and
``COUCHDB_PASSWORD`` environment variables, which can be passed in using
Docker Compose scripting.

For CouchDB installations outside of the docker images supplied with Fabric,
the
`local.ini file of that installation
<http://docs.couchdb.org/en/2.1.1/config/intro.html#configuration-files>`__
must be edited to set the admin username and password.

Docker compose scripts only set the username and password at the creation of
the container. The *local.ini* file must be edited if the username or password
is to be changed after creation of the container.

.. note:: CouchDB peer options are read on each peer startup.

Good practices for queries
--------------------------

Avoid using chaincode for queries that will result in a scan of the entire
CouchDB database. Full database scans will result in long response
times and will degrade the performance of your network. You can take some of
the following steps to avoid long queries:

- When using JSON queries:

    * Be sure to create indexes in the chaincode package.
    * Avoid query operators such as ``$or``, ``$in`` and ``$regex``, which lead
      to full database scans.

- For range queries, composite key queries, and JSON queries:

    * Utilize paging support (as of v1.3) instead of one large result set.

- If you want to build a dashboard or collect aggregate data as part of your
  application, you can query an off-chain database that replicates the data
  from your blockchain network. This will allow you to query and analyze the
  blockchain data in a data store optimized for your needs, without degrading
  the performance of your network or disrupting transactions. To achieve this,
  applications may use block or chaincode events to write transaction data
  to an off-chain database or analytics engine. For each block received, the block
  listener application would iterate through the block transactions and build a
  data store using the key/value writes from each valid transaction's ``rwset``.
  The :doc:`peer_event_services` provide replayable events to ensure the
  integrity of downstream data stores.
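
The block listener loop described in the last item can be sketched in Go as follows.
The types ``write``, ``transaction`` and ``block`` are hypothetical simplifications made
for this sketch. They are not Fabric's block or ``rwset`` structures, and decoding real
blocks into this shape is left out. An in-memory map stands in for the off-chain database.

.. code:: go

  package main

  import "fmt"

  // write is a simplified key/value write taken from a transaction's rwset.
  type write struct {
      Key      string
      Value    []byte
      IsDelete bool
  }

  // transaction carries its validation result and its writes.
  type transaction struct {
      Valid  bool
      Writes []write
  }

  // block is a simplified block holding transactions in commit order.
  type block struct {
      Number       uint64
      Transactions []transaction
  }

  // applyBlock replays the writes of every valid transaction, in order,
  // into the off-chain store. Invalid transactions are skipped.
  func applyBlock(store map[string][]byte, b block) {
      for _, tx := range b.Transactions {
          if !tx.Valid {
              continue
          }
          for _, w := range tx.Writes {
              if w.IsDelete {
                  delete(store, w.Key)
                  continue
              }
              store[w.Key] = w.Value
          }
      }
  }

  func main() {
      store := map[string][]byte{}
      applyBlock(store, block{Number: 1, Transactions: []transaction{
          {Valid: true, Writes: []write{{Key: "marble1", Value: []byte(`{"owner":"tom"}`)}}},
          {Valid: false, Writes: []write{{Key: "marble1", Value: []byte(`{"owner":"eve"}`)}}},
      }})
      fmt.Println(string(store["marble1"]))
  }

Applying blocks strictly in block-number order matters in such a design, since a later
write to the same key must win over an earlier one.
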

.. Licensed under Creative Commons Attribution 4.0 International License
   https://creativecommons.org/licenses/by/4.0/