//                           _       _
// __      _____  __ ___   ___  __ _| |_ ___
// \ \ /\ / / _ \/ _` \ \ / / |/ _` | __/ _ \
//  \ V  V /  __/ (_| |\ V /| | (_| | || __/
//   \_/\_/ \___|\__,_| \_/ |_|\__,_|\__\___|
//
//  Copyright © 2016 - 2024 Weaviate B.V. All rights reserved.
//
//  CONTACT: hello@weaviate.io
//

// Package roaringset contains all the LSM business logic that is unique
// to the "RoaringSet" strategy.
//
// This package alone does not contain an entire LSM store. It is intended to
// be used as part of the
// [github.com/weaviate/weaviate/adapters/repos/db/lsmkv] package.
//
// # Motivation
//
// What makes the RoaringSet strategy unique is that it is essentially a fully
// persistent Roaring Bitmap that can be built up and updated incrementally
// (without write amplification) while remaining extremely fast to query.
//
// Without this strategy, it would not be efficient to use roaring bitmaps in
// an LSM store. For example:
//
//   - Lucene stores posting lists in the on-disk inverted index and supports
//     converting them to a Roaring Bitmap at query time. The resulting bitmap
//     can then be cached. However, the initial cost of converting a posting
//     list to a roaring bitmap is considerable: in our own tests, inserting
//     90M out of 100M possible ids into a [github.com/weaviate/sroar.Bitmap]
//     takes about 3.5s.
//
//   - You could store a regular roaring bitmap, such as
//     [github.com/weaviate/sroar.Bitmap], in a general-purpose LSM store,
//     such as RocksDB. This would fix the retrieval issue: you should be able
//     to retrieve and initialize a bitmap containing 90M objects in a few
//     milliseconds. However, the cost of incrementally updating this bitmap
//     would be extreme. You would have to use a read-modify-write pattern,
//     which leads to huge write amplification on large setups.
//     A 90M-id roaring bitmap is about 10.5MB, so to add a single entry
//     (which takes up anywhere from 1 bit to 2 bytes), you would have to
//     read 10.5MB and write 10.5MB again. That is not feasible except for
//     bulk loading, and in Weaviate we cannot always assume bulk loading, as
//     user behavior and insert orders are generally unpredictable.
//
// We solve this issue by making the LSM store roaring-bitmap-native. This
// way, we keep the benefits of an LSM store (very fast writes) combined with
// the benefits of a serialized roaring bitmap (very fast
// reads/initializations).
//
// Essentially, this means the RoaringSet strategy behaves like a fully
// persistent (and durable) Roaring Bitmap. See the next section to learn how
// it works under the hood.
//
// # Internals
//
// The public-facing methods make use of [github.com/weaviate/sroar.Bitmap].
// This serialized bitmap already fulfills many of the criteria needed in
// Weaviate: it can be initialized at almost no cost (sharing memory) or very
// little cost (copying some memory), and its set, remove, and intersection
// methods work well for the inverted-index use cases in Weaviate.
//
// So the novel part of the lsmkv.RoaringSet strategy does not sit in the
// roaring bitmap itself, but rather in the way it is persisted. It uses the
// standard principles of an LSM store: each new write is first cached in a
// memtable (and, of course, written to a Write-Ahead Log to make it
// durable). The memtable is flushed into a disk segment when specific
// criteria are met (memtable size, WAL size, idle time, time since last
// flush, etc.).
//
// This means that each layer (represented by [BitmapLayer]) only contains
// the deltas that were written in a specific time interval. When reading,
// all layers must be combined into a single bitmap (see
// [BitmapLayers.Flatten]).
//
// Over time, segments can be combined into fewer, larger segments through an
// LSM compaction process. The logic for that can be found in
// [BitmapLayers.Merge].
//
// To keep access efficient, the entire RoaringSet strategy is built to avoid
// encoding/decoding steps. Instead, data is stored internally as simple byte
// slices; see [SegmentNode] for an example. You can access bitmaps without
// any meaningful allocations using [SegmentNode.Additions] and
// [SegmentNode.Deletions]. If you plan to hold on to a bitmap for longer
// than you hold the lock that prevents a compaction, you need to copy the
// data (e.g. using [SegmentNode.AdditionsWithCopy]). Even with such a copy,
// reading a bitmap with 90M ids takes only single-digit milliseconds.
package roaringset