     1  CouchDB as the State Database
     2  =============================
     3  
     4  State Database options
     5  ----------------------
     6  
     7  The current options for the peer state database are LevelDB and CouchDB. LevelDB is the default
     8  key-value state database embedded in the peer process. CouchDB is an alternative external state database.
     9  Like the LevelDB key-value store, CouchDB can store any binary data that is modeled in chaincode
    10  (CouchDB attachment functionality is used internally for non-JSON binary data). But as a document
    11  object store, CouchDB allows you to store data in JSON format, issue rich queries against your data,
    12  and use indexes to support your queries.
    13  
    14  Both LevelDB and CouchDB support core chaincode operations such as getting and setting a key
    15  (asset), and querying based on keys. Keys can be queried by range, and composite keys can be
    16  modeled to enable equivalence queries against multiple parameters. For example, a composite
    17  key of ``owner,asset_id`` can be used to query all assets owned by a certain entity. These key-based
    18  queries can be used for read-only queries against the ledger, as well as in transactions that
    19  update the ledger.
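
        The composite-key pattern described above can be illustrated without a running
        peer. The following is a minimal sketch that uses a sorted in-memory key list in
        place of the state database. ``makeKey`` and ``queryByOwner`` are hypothetical
        helpers, and the ``\x00`` separator is an assumption chosen for illustration. In
        chaincode, the stub functions ``CreateCompositeKey`` and
        ``GetStateByPartialCompositeKey`` play these roles.

        .. code:: go

          package main

          import (
              "fmt"
              "sort"
              "strings"
          )

          // sep separates key parts. It is an illustrative choice for this sketch only.
          const sep = "\x00"

          // makeKey joins an owner and an asset ID into one composite key
          // (hypothetical helper).
          func makeKey(owner, assetID string) string {
              return owner + sep + assetID
          }

          // queryByOwner returns the asset IDs whose composite key starts with the
          // owner prefix. keys must be sorted, as keys in a key-value store are.
          func queryByOwner(keys []string, owner string) []string {
              prefix := owner + sep
              // Binary search for the first key that is >= prefix.
              start := sort.SearchStrings(keys, prefix)
              var assets []string
              for _, k := range keys[start:] {
                  if !strings.HasPrefix(k, prefix) {
                      break // past the end of this owner's key range
                  }
                  assets = append(assets, strings.TrimPrefix(k, prefix))
              }
              return assets
          }

          func main() {
              keys := []string{
                  makeKey("tom", "marble1"),
                  makeKey("jerry", "marble2"),
                  makeKey("tom", "marble3"),
              }
              sort.Strings(keys)
              fmt.Println(queryByOwner(keys, "tom"))
          }

        Because the owner comes first in the key, all of one owner's assets sit next to
        each other in key order, which is what makes the prefix scan possible.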
    20  
    21  Modeling your data in JSON allows you to issue rich queries against the values of your data,
    22  instead of only being able to query the keys. This makes it easier for your applications and
    23  chaincode to read the data stored on the blockchain ledger. Using CouchDB can help you meet
    24  auditing and reporting requirements for many use cases that are not supported by LevelDB. If you use
    25  CouchDB and model your data in JSON, you can also deploy indexes with your chaincode.
    26  Using indexes makes queries more flexible and efficient and enables you to query large
    27  datasets from chaincode.
    28  
    29  CouchDB runs as a separate database process alongside the peer, so there are additional
    30  considerations in terms of setup, management, and operations. Consider starting with the
    31  default embedded LevelDB and moving to CouchDB if you require the additional complex rich queries.
    32  It is a good practice to model asset data as JSON, so that you have the option to perform
    33  complex rich queries if needed in the future.
    34  
    35  .. note:: The key for a CouchDB JSON document can only contain valid UTF-8 strings and cannot begin
    36     with an underscore ("_"). Whether you are using CouchDB or LevelDB, you should avoid using
    37     U+0000 (nil byte) in keys.
    38  
    39     JSON documents in CouchDB cannot use the following values as top level field names. These values
    40     are reserved for internal use.
    41  
    42  - Any field beginning with an underscore (``_``)
    43     - ``~version``
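
        The restrictions in the note above can be checked before data is written. The
        following is a minimal sketch, not part of Fabric: ``validKey`` and
        ``reservedFields`` are hypothetical helper names, and the checks cover only the
        rules listed in the note.

        .. code:: go

          package main

          import (
              "encoding/json"
              "fmt"
              "sort"
              "strings"
              "unicode/utf8"
          )

          // validKey reports whether key is valid UTF-8, does not begin with an
          // underscore, and contains no U+0000 character.
          func validKey(key string) bool {
              return utf8.ValidString(key) &&
                  !strings.HasPrefix(key, "_") &&
                  !strings.Contains(key, "\x00")
          }

          // reservedFields returns the top-level field names in doc that are
          // reserved: any name beginning with "_" and the name "~version".
          func reservedFields(doc map[string]interface{}) []string {
              var bad []string
              for name := range doc {
                  if strings.HasPrefix(name, "_") || name == "~version" {
                      bad = append(bad, name)
                  }
              }
              sort.Strings(bad) // map iteration order is random, so sort for stable output
              return bad
          }

          func main() {
              fmt.Println(validKey("marble1"), validKey("_marble1"), validKey("a\x00b"))

              raw := `{"docType":"marble","_rev":"1-abc","~version":"x"}`
              var doc map[string]interface{}
              if err := json.Unmarshal([]byte(raw), &doc); err != nil {
                  panic(err)
              }
              fmt.Println(reservedFields(doc))
          }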
    44  
    45  Using CouchDB from Chaincode
    46  ----------------------------
    47  
    48  Chaincode queries
    49  ~~~~~~~~~~~~~~~~~
    50  
    51  Most of the `chaincode shim APIs <https://godoc.org/github.com/hyperledger/fabric-chaincode-go/shim#ChaincodeStubInterface>`__
    52  can be utilized with either LevelDB or CouchDB state database, e.g. ``GetState``, ``PutState``,
    53  ``GetStateByRange``, ``GetStateByPartialCompositeKey``. Additionally, when you utilize CouchDB as
    54  the state database and model assets as JSON in chaincode, you can perform rich queries against
    55  the JSON in the state database by using the ``GetQueryResult`` API and passing a CouchDB query string.
    56  The query string follows the `CouchDB JSON query syntax <http://docs.couchdb.org/en/2.1.1/api/database/find.html>`__.
    57  
    58  The `marbles02 fabric sample <https://github.com/hyperledger/fabric-samples/blob/master/chaincode/marbles02/go/marbles_chaincode.go>`__
    59  demonstrates use of CouchDB queries from chaincode. It includes a ``queryMarblesByOwner()`` function
    60  that demonstrates parameterized queries by passing an owner id into chaincode. It then queries the
    61  state data for JSON documents matching the docType of “marble” and the owner id using the JSON query
    62  syntax:
    63  
    64  .. code:: bash
    65  
    66    {"selector":{"docType":"marble","owner":<OWNER_ID>}}
    67  
    68  The responses to rich queries are useful for understanding the data on the ledger. However,
    69  there is no guarantee that the result set for a rich query will be stable between
    70  the chaincode execution and commit time. As a result, you should not use a rich query and
    71  update the channel ledger in a single transaction. For example, if you perform a
    72  rich query for all assets owned by Alice and transfer them to Bob, a new asset may
    73  be assigned to Alice by another transaction between chaincode execution time
    74  and commit time.
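
        A parameterized query string such as the owner selector shown earlier can be
        assembled with a JSON encoder rather than string concatenation, so that quotes
        in the parameter are escaped. The following is a minimal sketch using only the
        Go standard library. ``ownerQuery`` is a hypothetical helper, not the sample's
        own function; in chaincode the resulting string would be passed to
        ``GetQueryResult``.

        .. code:: go

          package main

          import (
              "encoding/json"
              "fmt"
          )

          // ownerQuery builds the selector for documents with docType "marble"
          // that belong to the given owner.
          func ownerQuery(owner string) (string, error) {
              query := map[string]interface{}{
                  "selector": map[string]string{
                      "docType": "marble",
                      "owner":   owner,
                  },
              }
              // json.Marshal escapes special characters in owner and writes map
              // keys in sorted order, so the output is deterministic.
              b, err := json.Marshal(query)
              if err != nil {
                  return "", err
              }
              return string(b), nil
          }

          func main() {
              q, err := ownerQuery("tom")
              if err != nil {
                  panic(err)
              }
              fmt.Println(q)
          }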
    75  
    76  
    77  .. _couchdb-pagination:
    78  
    79  CouchDB pagination
    80  ^^^^^^^^^^^^^^^^^^
    81  
    82  Fabric supports paging of query results for rich queries and range-based queries.
    83  The APIs that support pagination accept a page size and a bookmark for
    84  both range and rich queries. To support efficient pagination, the Fabric
    85  pagination APIs must be used. Specifically, the CouchDB ``limit`` keyword will
    86  not be honored in CouchDB queries since Fabric itself manages the pagination of
    87  query results and implicitly sets the pageSize limit that is passed to CouchDB.
    88  
    89  If a pageSize is specified using the paginated query APIs (``GetStateByRangeWithPagination()``,
    90  ``GetStateByPartialCompositeKeyWithPagination()``, and ``GetQueryResultWithPagination()``),
    91  a set of results (bounded by the pageSize) will be returned to the chaincode along with
    92  a bookmark. The bookmark can be returned from chaincode to invoking clients,
    93  which can use the bookmark in a follow-on query to receive the next "page" of results.
    94  
    95  The pagination APIs are for use in read-only transactions only; the query results
    96  are intended to support client paging requirements. For transactions
    97  that need to read and write, use the non-paginated chaincode query APIs. Within
    98  chaincode you can iterate through result sets to your desired depth.
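
        The client-side paging loop described above can be sketched as follows. This is
        a minimal illustration under stated assumptions: ``queryFunc`` is a hypothetical
        stand-in for a client call into a chaincode function that wraps one of the
        paginated APIs, ``fakeLedger`` is an in-memory substitute for the ledger, and
        the loop treats an empty page as the end of the results.

        .. code:: go

          package main

          import (
              "fmt"
              "strconv"
          )

          // page is one page of results plus the bookmark for the next page.
          type page struct {
              records  []string
              bookmark string
          }

          // queryFunc stands in for one paginated query round trip
          // (hypothetical signature).
          type queryFunc func(pageSize int, bookmark string) (page, error)

          // fetchAll requests pages until an empty page is returned, passing each
          // returned bookmark into the follow-on query.
          func fetchAll(query queryFunc, pageSize int) ([]string, error) {
              if pageSize <= 0 {
                  return nil, fmt.Errorf("page size must be positive, got %d", pageSize)
              }
              var all []string
              bookmark := "" // an empty bookmark requests the first page
              for {
                  p, err := query(pageSize, bookmark)
                  if err != nil {
                      return nil, err
                  }
                  if len(p.records) == 0 {
                      return all, nil
                  }
                  all = append(all, p.records...)
                  bookmark = p.bookmark
              }
          }

          // fakeLedger returns a queryFunc over an in-memory slice. Its bookmark is
          // simply the next start offset encoded as a string.
          func fakeLedger(data []string) queryFunc {
              return func(pageSize int, bookmark string) (page, error) {
                  start := 0
                  if bookmark != "" {
                      n, err := strconv.Atoi(bookmark)
                      if err != nil {
                          return page{}, err
                      }
                      start = n
                  }
                  if start < 0 || start > len(data) {
                      return page{}, fmt.Errorf("bookmark %q out of range", bookmark)
                  }
                  end := start + pageSize
                  if end > len(data) {
                      end = len(data)
                  }
                  return page{records: data[start:end], bookmark: strconv.Itoa(end)}, nil
              }
          }

          func main() {
              ledger := fakeLedger([]string{"marble1", "marble2", "marble3", "marble4", "marble5"})
              all, err := fetchAll(ledger, 2)
              if err != nil {
                  panic(err)
              }
              fmt.Println(all)
          }

        The client treats the bookmark as an opaque value: it only stores the bookmark
        from the previous page and passes it back unchanged.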
    99  
   100  Regardless of whether the pagination APIs are utilized, all chaincode queries are
   101  bounded by ``totalQueryLimit`` (default 100000) from ``core.yaml``. This is the maximum
   102  number of results that chaincode will iterate through and return to the client,
   103  in order to avoid accidental or malicious long-running queries.
   104  
   105  .. note:: Regardless of whether chaincode uses paginated queries or not, the peer will
   106            query CouchDB in batches based on ``internalQueryLimit`` (default 1000)
   107            from ``core.yaml``. This behavior ensures reasonably sized result sets are
   108            passed between the peer and CouchDB when executing chaincode, and is
   109            transparent to chaincode and the calling client.
   110  
   111  An example using pagination is included in the :doc:`couchdb_tutorial` tutorial.
   112  
   113  CouchDB indexes
   114  ~~~~~~~~~~~~~~~
   115  
   116  Indexes in CouchDB are required in order to make JSON queries efficient and are required for
   117  any JSON query with a sort. Indexes enable you to query data from chaincode when you have
   118  a large amount of data on your ledger. Indexes can be packaged alongside chaincode
   119  in a ``/META-INF/statedb/couchdb/indexes`` directory. Each index must be defined in
   120  its own text file with extension ``*.json`` with the index definition formatted in JSON
   121  following the `CouchDB index JSON syntax <http://docs.couchdb.org/en/2.1.1/api/database/find.html#db-index>`__.
   122  For example, to support the above marble query, a sample index on the ``docType`` and ``owner``
   123  fields is provided:
   124  
   125  .. code:: json
   126  
   127    {"index":{"fields":["docType","owner"]},"ddoc":"indexOwnerDoc", "name":"indexOwner","type":"json"}
   128  
   129  The sample index can be found `here <https://github.com/hyperledger/fabric-samples/blob/master/chaincode/marbles02/go/META-INF/statedb/couchdb/indexes/indexOwner.json>`__.
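
        An index definition like the one above can be checked before it is packaged
        with the chaincode. The following is a minimal sketch using only the Go
        standard library. ``indexDef`` and ``parseIndex`` are hypothetical names, and
        the struct assumes that every entry in ``fields`` is a plain string, as in the
        sample index.

        .. code:: go

          package main

          import (
              "encoding/json"
              "errors"
              "fmt"
          )

          // indexDef mirrors the fields used by the sample index definition.
          type indexDef struct {
              Index struct {
                  Fields []string `json:"fields"`
              } `json:"index"`
              Ddoc string `json:"ddoc"`
              Name string `json:"name"`
              Type string `json:"type"`
          }

          // parseIndex decodes an index definition and rejects one that lists
          // no fields.
          func parseIndex(raw []byte) (indexDef, error) {
              var def indexDef
              if err := json.Unmarshal(raw, &def); err != nil {
                  return def, err
              }
              if len(def.Index.Fields) == 0 {
                  return def, errors.New("index definition lists no fields")
              }
              return def, nil
          }

          func main() {
              raw := []byte(`{"index":{"fields":["docType","owner"]},"ddoc":"indexOwnerDoc", "name":"indexOwner","type":"json"}`)
              def, err := parseIndex(raw)
              if err != nil {
                  panic(err)
              }
              fmt.Println(def.Ddoc, def.Name, def.Index.Fields)
          }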
   130  
   131  Any index in the chaincode’s ``META-INF/statedb/couchdb/indexes`` directory
   132  will be packaged up with the chaincode for deployment. The index will be deployed
   133  to a peer's channel- and chaincode-specific database when the chaincode package is
   134  installed on the peer and the chaincode definition is committed to the channel. If you
   135  install the chaincode first and then commit the chaincode definition to the
   136  channel, the index will be deployed at commit time. If the chaincode has already
   137  been defined on the channel and the chaincode package subsequently installed on
   138  a peer joined to the channel, the index will be deployed at chaincode
   139  **installation** time.
   140  
   141  Upon deployment, the index will automatically be utilized by chaincode queries. CouchDB can automatically
   142  determine which index to use based on the fields being used in a query. Alternatively, in the
   143  selector query the index can be specified using the ``use_index`` keyword.
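
        A query that names its index explicitly can be built as shown below. This is a
        minimal sketch: ``richQuery`` and ``queryWithIndex`` are hypothetical names, and
        the ``use_index`` value follows the CouchDB convention of a design document
        name followed by an index name, using the ``indexOwnerDoc`` and ``indexOwner``
        names from the sample index.

        .. code:: go

          package main

          import (
              "encoding/json"
              "fmt"
          )

          // richQuery holds the parts of a CouchDB query used in this sketch.
          type richQuery struct {
              Selector map[string]interface{} `json:"selector"`
              UseIndex []string               `json:"use_index,omitempty"`
          }

          // queryWithIndex returns a query string that asks CouchDB to use the
          // index called name in the design document ddoc.
          func queryWithIndex(selector map[string]interface{}, ddoc, name string) (string, error) {
              q := richQuery{
                  Selector: selector,
                  UseIndex: []string{"_design/" + ddoc, name},
              }
              b, err := json.Marshal(q)
              if err != nil {
                  return "", err
              }
              return string(b), nil
          }

          func main() {
              selector := map[string]interface{}{"docType": "marble", "owner": "tom"}
              q, err := queryWithIndex(selector, "indexOwnerDoc", "indexOwner")
              if err != nil {
                  panic(err)
              }
              fmt.Println(q)
          }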
   144  
   145  The same index may exist in subsequent versions of the chaincode that are installed. To change the
   146  index, use the same index name but alter the index definition. Upon installation/instantiation, the index
   147  definition will be redeployed to the peer’s state database.
   148  
   149  If you have a large volume of data already, and later install the chaincode, the index creation upon
   150  installation may take some time. Similarly, if you have a large volume of data already and commit the
   151  definition of a subsequent chaincode version, the index creation may take some time. Avoid calling chaincode
   152  functions that query the state database at these times as the chaincode query may time out while the
   153  index is getting initialized. During transaction processing, the indexes will automatically get refreshed
   154  as blocks are committed to the ledger. If the peer crashes during chaincode installation, the CouchDB
   155  indexes may not get created. If this occurs, you need to reinstall the chaincode to create the indexes.
   156  
   157  CouchDB Configuration
   158  ---------------------
   159  
   160  CouchDB is enabled as the state database by changing the ``stateDatabase`` configuration option from
   161  ``goleveldb`` to ``CouchDB``. Additionally, the ``couchDBAddress`` needs to be configured to point to the
   162  CouchDB instance to be used by the peer. The ``username`` and ``password`` properties should be populated with
   163  an admin username and password if CouchDB is configured with a username and password. Additional
   164  options are provided in the ``couchDBConfig`` section and are documented in place. Changes to the
   165  *core.yaml* will be effective immediately after restarting the peer.
   166  
   167  You can also pass in Docker environment variables to override core.yaml values, for example
   168  ``CORE_LEDGER_STATE_STATEDATABASE`` and ``CORE_LEDGER_STATE_COUCHDBCONFIG_COUCHDBADDRESS``.
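
        The two variable names above suggest a naming pattern: the ``CORE_`` prefix
        followed by the upper-cased path of the ``core.yaml`` key, with the levels
        joined by underscores. The following minimal sketch encodes that pattern.
        ``envName`` is a hypothetical helper, and the rule is inferred from the two
        examples rather than taken from a specification.

        .. code:: go

          package main

          import (
              "fmt"
              "strings"
          )

          // envName converts a dotted core.yaml key path into the matching
          // environment variable name, following the pattern of the examples.
          func envName(key string) string {
              return "CORE_" + strings.ToUpper(strings.ReplaceAll(key, ".", "_"))
          }

          func main() {
              fmt.Println(envName("ledger.state.stateDatabase"))
              fmt.Println(envName("ledger.state.couchDBConfig.couchDBAddress"))
          }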
   169  
   170  Below is the ``stateDatabase`` section from *core.yaml*:
   171  
   172  .. code:: yaml
   173  
   174      state:
   175        # stateDatabase - options are "goleveldb", "CouchDB"
   176        # goleveldb - default state database stored in goleveldb.
   177        # CouchDB - store state database in CouchDB
   178        stateDatabase: goleveldb
   179        # Limit on the number of records to return per query
   180        totalQueryLimit: 100000
   181        couchDBConfig:
   182           # It is recommended to run CouchDB on the same server as the peer, and
   183           # not map the CouchDB container port to a server port in docker-compose.
   184           # Otherwise proper security must be provided on the connection between
   185           # CouchDB client (on the peer) and server.
   186           couchDBAddress: couchdb:5984
   187           # This username must have read and write authority on CouchDB
   188           username:
   189           # The password is recommended to pass as an environment variable
   190           # during start up (e.g. LEDGER_COUCHDBCONFIG_PASSWORD).
   191           # If it is stored here, the file must be access control protected
   192           # to prevent unintended users from discovering the password.
   193           password:
   194           # Number of retries for CouchDB errors
   195           maxRetries: 3
   196           # Number of retries for CouchDB errors during peer startup
   197           maxRetriesOnStartup: 10
   198           # CouchDB request timeout (unit: duration, e.g. 20s)
   199           requestTimeout: 35s
   200           # Limit on the number of records per each CouchDB query
   201           # Note that chaincode queries are only bound by totalQueryLimit.
   202           # Internally the chaincode may execute multiple CouchDB queries,
   203           # each of size internalQueryLimit.
   204           internalQueryLimit: 1000
   205           # Limit on the number of records per CouchDB bulk update batch
   206           maxBatchUpdateSize: 1000
   207           # Warm indexes after every N blocks.
   208           # This option warms any indexes that have been
   209           # deployed to CouchDB after every N blocks.
   210           # A value of 1 will warm indexes after every block commit,
   211           # to ensure fast selector queries.
   212           # Increasing the value may improve write efficiency of peer and CouchDB,
   213           # but may degrade query response time.
   214           warmIndexesAfterNBlocks: 1
   215  
   216  The CouchDB Docker images supplied with Hyperledger Fabric allow the CouchDB
   217  username and password to be set with the ``COUCHDB_USER`` and
   218  ``COUCHDB_PASSWORD`` environment variables, for example in a Docker Compose
   219  file.
   220  
   221  For CouchDB installations outside of the Docker images supplied with Fabric,
   222  the
   223  `local.ini file of that installation
   224  <http://docs.couchdb.org/en/2.1.1/config/intro.html#configuration-files>`__
   225  must be edited to set the admin username and password.
   226  
   227  Docker Compose scripts only set the username and password at the creation of
   228  the container. The *local.ini* file must be edited if the username or password
   229  is to be changed after creation of the container.
   230  
   231  If you choose to map the fabric-couchdb container port to a host port, make sure you
   232  are aware of the security implications. Mapping the CouchDB container port in a
   233  development environment exposes the CouchDB REST API and allows you to visualize
   234  the database via the CouchDB web interface (Fauxton). In a production environment
   235  you should not map the container port to a host port, so that only the peer can
   236  access the CouchDB container.
   237  
   238  .. note:: CouchDB peer options are read on each peer startup.
   239  
   240  Good practices for queries
   241  --------------------------
   242  
   243  Avoid using chaincode for queries that will result in a scan of the entire
   244  CouchDB database. Full database scans will result in long response
   245  times and will degrade the performance of your network. You can take some of
   246  the following steps to avoid long queries:
   247  
   248  - When using JSON queries:
   249  
   250      * Be sure to create indexes in the chaincode package.
   251      * Avoid query operators such as ``$or``, ``$in`` and ``$regex``, which lead
   252        to full database scans.
   253  
   254  - For range queries, composite key queries, and JSON queries:
   255  
   256      * Utilize paging support instead of one large result set.
   257  
   258  - If you want to build a dashboard or collect aggregate data as part of your
   259    application, you can query an off-chain database that replicates the data
   260    from your blockchain network. This will allow you to query and analyze the
   261    blockchain data in a data store optimized for your needs, without degrading
   262    the performance of your network or disrupting transactions. To achieve this,
   263    applications may use block or chaincode events to write transaction data
   264    to an off-chain database or analytics engine. For each block received, the block
   265    listener application would iterate through the block transactions and build a
   266    data store using the key/value writes from each valid transaction's ``rwset``.
   267    The :doc:`peer_event_services` provide replayable events to ensure the
   268    integrity of downstream data stores.
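
        The block listener logic described in the last item above can be sketched as
        follows. This is a minimal illustration, not a working listener: ``block``,
        ``transaction``, ``write``, and ``applyBlock`` are hypothetical simplified
        types and names, and a real application would decode blocks received from the
        peer event service rather than build them by hand.

        .. code:: go

          package main

          import "fmt"

          // write is one key/value write from a transaction's write set.
          type write struct {
              Key      string
              Value    []byte
              IsDelete bool
          }

          // transaction carries its validation result and its writes.
          type transaction struct {
              Valid  bool
              Writes []write
          }

          // block is an ordered list of transactions.
          type block struct {
              Number uint64
              Txs    []transaction
          }

          // applyBlock replays the writes of every valid transaction in b into the
          // off-chain store, in transaction order. Invalid transactions are skipped.
          func applyBlock(store map[string][]byte, b block) {
              for _, tx := range b.Txs {
                  if !tx.Valid {
                      continue
                  }
                  for _, w := range tx.Writes {
                      if w.IsDelete {
                          delete(store, w.Key)
                          continue
                      }
                      store[w.Key] = w.Value
                  }
              }
          }

          func main() {
              store := map[string][]byte{}
              blocks := []block{
                  {Number: 1, Txs: []transaction{
                      {Valid: true, Writes: []write{
                          {Key: "marble1", Value: []byte("tom")},
                          {Key: "marble2", Value: []byte("jerry")},
                      }},
                      {Valid: false, Writes: []write{{Key: "marble1", Value: []byte("eve")}}},
                  }},
                  {Number: 2, Txs: []transaction{
                      {Valid: true, Writes: []write{
                          {Key: "marble2", IsDelete: true},
                          {Key: "marble1", Value: []byte("jerry")},
                      }},
                  }},
              }
              for _, b := range blocks {
                  applyBlock(store, b)
              }
              fmt.Println(len(store), string(store["marble1"]))
          }

        Applying blocks strictly in order keeps the off-chain store consistent with
        the ledger state.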
   269  
   270  .. Licensed under Creative Commons Attribution 4.0 International License
   271     https://creativecommons.org/licenses/by/4.0/