# CouchDB Quirks

## Mango indexes

### Exists operator

The
[`$exists` operator](http://docs.couchdb.org/en/stable/api/database/find.html#condition-operators)
can be used with a mango index for the `true` value, but not for the `false`
value. For `false`, a heavier solution is required:
[a partial index](http://docs.couchdb.org/en/stable/api/database/find.html#find-partial-indexes).

### Index selection

CouchDB may accept or refuse to use a mango index for a query, for obscure
reasons. In general, you can follow these two rules of thumb:

1. An index on the fields `foo, bar, baz` can be used only to fetch documents
   where `foo`, `bar`, and `baz` exist. It means that a query that filters only
   on the value of `foo` won't use the mango index, because it could miss a
   document where `foo` has the expected value but `bar` or `baz` is missing. If
   you know that all the documents that you want have the `bar` and `baz`
   fields, you can just add two `$exists: true` filters (one for `bar`, the
   other for `baz`).

2. You should use exactly the same sequence of fields for creating the index and
   for the `sort` operator of the query. If you have an index on `os, browser, ip`
   for the `io.cozy.sessions.logins` doctype, and you want all the documents for
   a login from `windows`, sorted by `browser`, you can use the index, but you
   should use `os, browser, ip` for the sort (or at least `os, browser`, even if
   it seems weird to sort on `os` when all the sorted documents will have
   the same value, `windows`). Please note that when `use_index` is set on a
   request, the results are sorted by default according to this rule. So, you can
   omit the `sort` operator on the query (unless you want the `descending`
   order).

### Warnings for slow requests

When running a mango query, CouchDB can use an index.
But there are also
cases where no index can be used, or where the index is not optimal. Let's
see the different scenarios:

- If CouchDB doesn't use an index, it responds with a warning, and cozy-stack
  transforms this warning into an error, as developers should really avoid
  this issue.

- If CouchDB can use an index for the selector but not for the sort, it
  responds with an error, and cozy-stack just forwards the error.

- If CouchDB can use an index, but still looks at many more documents in
  the index than will be in the response (it happens with the `$or` and `$in`
  operators, which should be avoided), CouchDB 3+ sends a warning and
  cozy-stack forwards the documents and the warning to the client.

### Comparison of strings

Comparison of strings is done using ICU, which implements the Unicode Collation
Algorithm, giving a dictionary sorting of keys. This can give surprising
results if you were expecting ASCII ordering. Note that:

- All symbols sort before numbers and letters (even the “high” symbols like tilde, `0x7e`).
- Differing sequences of letters are compared without regard to case, so `a < aa` but also `A < aa` and `a < AA`.
- Identical sequences of letters are compared with regard to case, with lowercase before uppercase, so `a < A`.

## Old revisions

CouchDB keeps for each document a list of its revisions (or more exactly a tree,
because of replication and conflicts).

It's possible to ask for the list of the old revisions of a document with
[`GET /db/{docid}?revs_info=true`](http://docs.couchdb.org/en/stable/api/document/common.html#get--db-docid).
It works only if the document has not been deleted. For a deleted document,
[a trick](https://stackoverflow.com/questions/10854883/retrieve-just-deleted-document/10857330#10857330)
is to query the changes feed to find the last revision of the document, and to
recreate the document from this revision.
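
The first step of that trick, reading a revision out of the changes feed, can
be sketched as follows. This is only an illustration: the
`last_revision_from_changes` helper and the sample payload are made up, and the
code works on an already parsed `GET /{db}/_changes` response instead of
calling CouchDB.

```python
from typing import Optional


def last_revision_from_changes(changes: dict, doc_id: str) -> Optional[str]:
    """Return the latest revision listed for doc_id in a parsed _changes body."""
    last_rev = None
    for row in changes.get("results", []):
        if row.get("id") != doc_id:
            continue
        # With the default style (main_only), "changes" holds only the
        # winning revision of the document.
        revs = row.get("changes", [])
        if revs:
            last_rev = revs[0]["rev"]
    return last_rev


# Hypothetical response body, trimmed to the fields used above.
sample = {
    "results": [
        {"seq": "1-x", "id": "bar", "changes": [{"rev": "1-aaa"}]},
        {"seq": "2-y", "id": "foo", "changes": [{"rev": "3-ghi"}], "deleted": True},
    ],
    "last_seq": "2-y",
}

print(last_revision_from_changes(sample, "foo"))
```

Note that for a deleted document, the revision reported by the changes feed is
the one of the tombstone itself.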

With an old revision, it's possible to get the content of the document at this
revision with `GET /db/{docid}?rev={rev}` if the database has not been compacted. On
CouchDB 2.x, compactions happen automatically on all databases from time to time.

A `purge` operation consists of removing the tombstones of deleted documents.
It is a manual operation, triggered by a
[`POST /db/_purge`](http://docs.couchdb.org/en/stable/api/database/misc.html).

## Conflicts

It is possible to create a conflict in CouchDB, as replication does,
by using `new_edits: false`, but it is not well documented, to say the least. The
most accurate description was in the old wiki, which [no longer
exists](https://wiki.apache.org/couchdb/HTTP_Bulk_Document_API#Posting_Existing_Revisions).
Here is a copy of what it said:

> The replicator uses a special mode of \_bulk_docs. The documents it writes to
> the destination database already have revision IDs that need to be preserved for
> the two databases to be in sync (otherwise it would not be possible to tell that
> the two represent the same revision.) To prevent the database from assigning
> them new revision IDs, a "new_edits":false property is added to the JSON request
> body.

> Note that this changes the interpretation of the \_rev parameter in each
> document: rather than being the parent revision ID to be matched against, it's
> the existing revision ID that will be saved as-is into the database. And since
> it's important to retain revision history when adding to the database, each
> document body in this mode should have a \_revisions property that lists its
> revision history; the format of this property is described on the HTTP document
> API.
> For example:

> `curl -X POST -d '{"new_edits":false,"docs":[{"_id":"person","_rev":"2-3595405","_revisions":{"start":2,"ids":["3595405","877727288"]},"name":"jim"}]}' "$OTHER_DB/_bulk_docs"`

> This command will replicate one of the revisions created above, into a
> separate database `OTHER_DB`. It will have the same revision ID as in `DB`,
> `2-3595405`, and it will be known to have a parent revision with ID
> `1-877727288`. (Even though `OTHER_DB` will not have the body of that revision,
> the history will help it detect conflicts in future replications.)

> As with \_all_or_nothing, this mode can create conflicts; in fact, this is
> where the conflicts created by replication come from.
> In short, it's a `PUT /doc/{id}?new_edits=false` with `_rev` the new revision of
> the document, and `_revisions` the parents of this revision in the revisions
> tree of this document.

### Conflict example

Here is an example of a CouchDB conflict.

Let's assume the following document, with the revision history `[1-abc, 2-def]`,
is saved in the database:

```
{
  "_id": "foo",
  "_rev": "2-def",
  "bar": "tender",
  "_revisions": {
    "start": 2,
    "ids": [
      "def",
      "abc"
    ]
  }
}
```

The `_revisions` block is returned when passing `revs=true` to the query and
gives all the revision ids, which are the part of each revision after the dash.
For instance, in `2-def`, `2` is called the "generation" and `def` the "id".

We update the document with a `POST /{db}/_bulk_docs` query, with the following
content:

```
{
  "docs": [
    {
      "_id": "foo",
      "_rev": "3-ghi",
      "_revisions": { "start": 3, "ids": ["ghi", "xyz", "abc"] },
      "bar": "racuda"
    }
  ],
  "new_edits": false
}
```

This produces a conflict between `2-def` and `2-xyz`: the former was first saved
in the database, but we forced the latter to be a new child of `1-abc`.
Hence, this
document has two revision branches: `1-abc, 2-def` and `1-abc, 2-xyz, 3-ghi`.

### Sharing

In the [sharing protocol](https://docs.cozy.io/en/cozy-stack/sharing-design/),
we implement this behaviour as we follow the CouchDB replication model. However,
we prevent CouchDB conflicts for files and directories: see [this
explanation](https://docs.cozy.io/en/cozy-stack/sharing-design/#couchdb-conflicts).

## Design docs in \_all_docs

When querying `GET /{db}/_all_docs`, the response includes the design docs. It's
quite difficult to filter them out, particularly when pagination is involved. We have
added an endpoint `GET /data/:doctype/_normal_docs` to the stack to help
client-side applications deal with this.
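
To illustrate why filtering on the client side interacts badly with
pagination, here is a minimal sketch that drops the design docs from an already
parsed `_all_docs` response. The `normal_rows` helper and the sample page are
made up for this example; this is not how the `_normal_docs` endpoint is
implemented.

```python
def normal_rows(all_docs: dict) -> list:
    """Keep only the rows of a parsed _all_docs body that are not design docs."""
    return [
        row
        for row in all_docs.get("rows", [])
        # Design documents always have an id starting with "_design/".
        if not row["id"].startswith("_design/")
    ]


# Hypothetical page, as if it had been fetched with limit=3.
page = {
    "total_rows": 4,
    "offset": 0,
    "rows": [
        {"id": "_design/by-date", "key": "_design/by-date", "value": {"rev": "1-aaa"}},
        {"id": "a1", "key": "a1", "value": {"rev": "1-bbb"}},
        {"id": "b2", "key": "b2", "value": {"rev": "2-ccc"}},
    ],
}

rows = normal_rows(page)
print(len(page["rows"]), len(rows))
```

The page asked for three rows but only two of them are normal documents, so a
client that filters this way has to fetch extra rows to fill each page, and
`total_rows` still counts the design docs.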