# Bhojpur Cache - In-Memory Storage Engine

It is a key/value database `storage engine` inspired by [Howard Chu's][hyc_symas]
[LMDB project][lmdb]. The goal of the project is to provide a simple, fast, and
reliable in-memory database storage engine for projects that do not require a
full-fledged database server such as PostgreSQL or MySQL.

[hyc_symas]: https://twitter.com/hyc_symas
[lmdb]: http://symas.com/mdb/

Since the [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database
storage engine is meant to be used as a low-level piece of functionality,
simplicity is key. The database APIs are small and focus only on getting and
setting values.

## Getting Started

### Installing

To start using the [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory
database storage engine, install Go and run `go get`:

```sh
$ go get github.com/bhojpur/cache/pkg/memory...
```

This retrieves the `in-memory storage` database engine library and installs the
[Bhojpur Cache](https://github.com/bhojpur/cache) command-line utility into
your `$GOBIN` path.

### Opening an In-Memory Database

The top-level object in a [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory
database storage engine is a `DB`. It is represented as a single `file` on your data
storage volume and represents a consistent `snapshot` of your in-memory data.

To open your `in-memory database`, simply use the `memory.Open()` function:

```go
package main

import (
	"log"

	memory "github.com/bhojpur/cache/pkg/memory"
)

func main() {
	// Open the my.db data file in your current directory.
	// It will be created if the file doesn't exist.
	db, err := memory.Open("my.db", 0600, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	...
}
```

Please note that the [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory
database storage engine obtains a **file lock** on the data file, so multiple
processes cannot open the same database at the same time. Opening an already open
[Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database file will
cause it to hang until the other process closes it. To prevent an indefinite
wait, you can pass a `timeout` option to the `Open()` function:

```go
db, err := memory.Open("my.db", 0600, &memory.Options{Timeout: 1 * time.Second})
```

### In-Memory Transactions

The [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database storage
engine allows only `one read-write` transaction at a time, but allows as `many read-only`
transactions as you want at a time. Each transaction has a consistent view of
the data as it existed when the transaction started.

Individual transactions and all `objects` created from them (e.g. buckets, keys)
are not `thread safe`. To work with data in multiple `goroutines` you must start
a transaction for each one or use `locking` to ensure only one goroutine accesses
a transaction at a time. Creating a transaction from the `DB` is `thread safe`.

`Read-only` transactions and `read-write` transactions should not depend on
one another and generally shouldn't be opened simultaneously in the same goroutine.
This can cause a deadlock, as the `read-write` transaction needs to periodically
re-map the data file, but it cannot do so while a `read-only` transaction is open.
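For example, here is a minimal sketch of the one-transaction-per-goroutine
pattern, assuming a `db` opened as above, the standard `sync` package, and a
bucket named `MyBucket` (the `DB.View()` helper is described below):

```go
var wg sync.WaitGroup
for i := 0; i < 4; i++ {
	wg.Add(1)
	go func() {
		defer wg.Done()
		// Each goroutine runs its own read-only transaction; the
		// transaction itself must never be shared between goroutines.
		_ = db.View(func(tx *memory.Tx) error {
			if b := tx.Bucket([]byte("MyBucket")); b != nil {
				_ = b.Get([]byte("answer"))
			}
			return nil
		})
	}()
}
wg.Wait()
```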
#### Read-write Transactions

To start a `read-write` transaction, you can use the `DB.Update()` function:

```go
err := db.Update(func(tx *memory.Tx) error {
	...
	return nil
})
```

Inside the closure, you have a consistent view of the database. You `commit` the
transaction by returning `nil` at the end. You can also `rollback` the transaction
at any point by returning an error. All database operations are allowed inside
a `read-write` transaction.

Always check the returned `error`, as it will report any disk failures that could
cause your transaction to remain incomplete. If you return an `error` within your
closure it will be passed through.

#### Read-only Transactions

To start a `read-only` transaction, you can use the `DB.View()` function:

```go
err := db.View(func(tx *memory.Tx) error {
	...
	return nil
})
```

You also get a consistent view of the database within this closure. However,
no mutating operations are allowed within a `read-only` transaction. You can
only retrieve `buckets`, retrieve values, and copy the database within a
`read-only` transaction.

#### Batch read-write Transactions

Each `DB.Update()` operation waits for the storage disk volumes to commit the
`writes`. This overhead can be minimized by combining multiple updates with
the `DB.Batch()` function:

```go
err := db.Batch(func(tx *memory.Tx) error {
	...
	return nil
})
```

Concurrent `Batch` calls are opportunistically combined into larger
transactions. `Batch` is only useful when there are multiple goroutines
calling it.

The trade-off is that `Batch` can call the given function multiple times
if parts of the transaction fail. The function must be idempotent, and side
effects must take effect only after a successful return from `DB.Batch()`.

For example: do not display messages from inside the function; instead,
set variables in the enclosing scope:

```go
var id uint64
err := db.Batch(func(tx *memory.Tx) error {
	// Find last key in bucket, decode as bigendian uint64, increment
	// by one, encode back to []byte, and add new key.
	...
	id = newValue
	return nil
})
if err != nil {
	return ...
}
fmt.Printf("Allocated ID %d\n", id)
```

#### Managing transactions manually

The `DB.View()` and `DB.Update()` functions are wrappers around the `DB.Begin()`
function. These helper functions will start the transaction, execute a function,
then safely close the transaction if an error is returned. This is the recommended
way to use [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database
transactions.

However, sometimes you may want to manually `start` and `end` your transactions.
You can use the `DB.Begin()` function directly, but **please** be sure to close
the transaction.

```go
// Start a writable transaction.
tx, err := db.Begin(true)
if err != nil {
	return err
}
defer tx.Rollback()

// Use the transaction...
_, err = tx.CreateBucket([]byte("MyBucket"))
if err != nil {
	return err
}

// Commit the transaction and check for error.
if err := tx.Commit(); err != nil {
	return err
}
```

The first argument to `DB.Begin()` is a `boolean` stating whether the transaction
should be `writable`.
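For completeness, here is a sketch of a manually managed read-only transaction,
assuming (as in similar single-writer engines) that a read-only transaction is
ended with `Rollback()` rather than `Commit()`:

```go
// Start a read-only transaction.
tx, err := db.Begin(false)
if err != nil {
	return err
}
// Read-only transactions are released with Rollback();
// there is nothing to commit.
defer tx.Rollback()

// Read from the transaction...
if b := tx.Bucket([]byte("MyBucket")); b != nil {
	value := b.Get([]byte("answer"))
	fmt.Printf("answer=%s\n", value)
}
```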
### Using Buckets

Buckets are collections of key/value pairs within the database. All keys
in a bucket must be unique. You can create a `Bucket` using the
`Tx.CreateBucket()` function:

```go
db.Update(func(tx *memory.Tx) error {
	_, err := tx.CreateBucket([]byte("MyBucket"))
	if err != nil {
		return fmt.Errorf("create bucket: %s", err)
	}
	return nil
})
```

You can also create a `Bucket` only if it doesn't exist by using the
`Tx.CreateBucketIfNotExists()` function. It's a common pattern to call this
function for all your top-level buckets after you open your database, so that
you can guarantee they exist for future transactions.

To delete a `Bucket`, simply call the `Tx.DeleteBucket()` function.

### Using key/value Pairs

To save a key/value pair to a `Bucket`, use the `Bucket.Put()` function:

```go
db.Update(func(tx *memory.Tx) error {
	b := tx.Bucket([]byte("MyBucket"))
	err := b.Put([]byte("answer"), []byte("42"))
	return err
})
```

This sets the value of the `"answer"` key to `"42"` in the `MyBucket`
bucket. To retrieve this value, we can use the `Bucket.Get()` function:

```go
db.View(func(tx *memory.Tx) error {
	b := tx.Bucket([]byte("MyBucket"))
	v := b.Get([]byte("answer"))
	fmt.Printf("The answer is: %s\n", v)
	return nil
})
```

The `Get()` function does not return an error, because its operation is
guaranteed to work (unless there is some kind of system failure). If the `key`
exists, it will return its byte slice value. If it doesn't exist, it will
return `nil`. It is important to note that you can set a zero-length value
for a `key`, which is different from the key not existing.

Use the `Bucket.Delete()` function to delete a key from the `Bucket`.

Please note that values returned from `Get()` are only valid while the
transaction is open. If you need to use a value outside of the transaction,
you must use `copy()` to copy it to another byte slice.

### Auto-incrementing integer for the bucket

By using the `NextSequence()` function, you can let the
[Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database storage
engine determine a `sequence`, which can be used as the unique identifier for
your key/value pairs. See the example below.

```go
// CreateUser saves u to the in-memory database. The new user ID is set on u once the data is persisted.
func (s *Store) CreateUser(u *User) error {
	return s.db.Update(func(tx *memory.Tx) error {
		// Retrieve the users Bucket.
		// This should be created when the in-memory database is first opened.
		b := tx.Bucket([]byte("users"))

		// Generate an ID for the user.
		// This returns an error only if the Tx is closed or not writeable.
		// That can't happen in an Update() call, so the error check is ignored.
		id, _ := b.NextSequence()
		u.ID = int(id)

		// Marshal user data into bytes.
		buf, err := json.Marshal(u)
		if err != nil {
			return err
		}

		// Persist bytes to the users Bucket.
		return b.Put(itob(u.ID), buf)
	})
}

// itob returns an 8-byte big endian representation of v.
func itob(v int) []byte {
	b := make([]byte, 8)
	binary.BigEndian.PutUint64(b, uint64(v))
	return b
}

type User struct {
	ID int
	...
}
```
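To read such a record back, use the same big-endian encoding as the lookup key.
A minimal sketch under the same assumptions as above (the `GetUser` helper name
is illustrative, not part of the engine's API):

```go
// GetUser loads a user by ID; a hypothetical helper shown for illustration.
func (s *Store) GetUser(id int) (*User, error) {
	var u User
	err := s.db.View(func(tx *memory.Tx) error {
		b := tx.Bucket([]byte("users"))
		v := b.Get(itob(id))
		if v == nil {
			return fmt.Errorf("user %d not found", id)
		}
		// Unmarshal copies the data, so u stays valid after the
		// transaction closes.
		return json.Unmarshal(v, &u)
	})
	if err != nil {
		return nil, err
	}
	return &u, nil
}
```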
### Iterating over Keys

The [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database storage
engine stores its `keys` in byte-sorted order within a `Bucket`. This makes
sequential iteration over these `keys` extremely fast. To iterate over the
`keys`, we'll use a `Cursor`:

```go
db.View(func(tx *memory.Tx) error {
	// Assume that Bucket exists and has keys
	b := tx.Bucket([]byte("MyBucket"))

	c := b.Cursor()

	for k, v := c.First(); k != nil; k, v = c.Next() {
		fmt.Printf("key=%s, value=%s\n", k, v)
	}

	return nil
})
```

The `Cursor` allows you to move to a specific point in the list of `keys` and
move `forward` or `backward` through the keys one at a time.

The following functions are available on the `Cursor` object:

```
First()  Move to the first key.
Last()   Move to the last key.
Seek()   Move to a specific key.
Next()   Move to the next key.
Prev()   Move to the previous key.
```

Each of those functions has a return signature of `(key []byte, value []byte)`.
When you have iterated to the end of the `Cursor`, `Next()` will return a
`nil` key. You must seek to a position using `First()`, `Last()`, or `Seek()`
before calling `Next()` or `Prev()`. If you do not seek to a position,
these functions will return a `nil` key.

During iteration, if the `key` is non-`nil` but the value is `nil`, that means
the `key` refers to a `Bucket` rather than a value. Use `Bucket.Bucket()` to
access the sub-bucket.

#### Prefix Scans

To iterate over a `key` prefix, you can combine `Seek()` and `bytes.HasPrefix()`:

```go
db.View(func(tx *memory.Tx) error {
	// Assume that Bucket exists and has keys
	c := tx.Bucket([]byte("MyBucket")).Cursor()

	prefix := []byte("1234")
	for k, v := c.Seek(prefix); k != nil && bytes.HasPrefix(k, prefix); k, v = c.Next() {
		fmt.Printf("key=%s, value=%s\n", k, v)
	}

	return nil
})
```

#### Range Scans

Another common use case is scanning over a `range`, such as a `time range`. If
you use a sortable time encoding, such as `RFC3339`, then you can query a
specific `date range` like this:

```go
db.View(func(tx *memory.Tx) error {
	// Assume our events bucket exists and has RFC3339 encoded time keys.
	c := tx.Bucket([]byte("Events")).Cursor()

	// Our time range spans the 2010's decade.
	min := []byte("2010-01-01T00:00:00Z")
	max := []byte("2020-01-01T00:00:00Z")

	// Iterate over the 2010's.
	for k, v := c.Seek(min); k != nil && bytes.Compare(k, max) <= 0; k, v = c.Next() {
		fmt.Printf("%s: %s\n", k, v)
	}

	return nil
})
```

Note that, while `RFC3339` is sortable, the Go implementation of `RFC3339Nano`
does not use a fixed number of digits after the decimal point and is therefore
not sortable.

#### ForEach()

You can also use the `ForEach()` function if you know you'll be iterating over
all the `keys` in a `Bucket`:

```go
db.View(func(tx *memory.Tx) error {
	// Assume that Bucket exists and has keys
	b := tx.Bucket([]byte("MyBucket"))

	b.ForEach(func(k, v []byte) error {
		fmt.Printf("key=%s, value=%s\n", k, v)
		return nil
	})
	return nil
})
```

Please note that `keys` and `values` in a `ForEach()` call are only valid while
the transaction is `open`. If you need to use a `key` or `value` outside of the
transaction, you must use `copy()` to copy it to another byte slice.
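For example, a minimal sketch of copying a value out of a transaction so that
it remains usable after the transaction closes:

```go
var answer []byte
db.View(func(tx *memory.Tx) error {
	v := tx.Bucket([]byte("MyBucket")).Get([]byte("answer"))
	// v is only valid while this transaction is open, so copy it
	// into a slice we own.
	answer = make([]byte, len(v))
	copy(answer, v)
	return nil
})
// answer can now be used safely outside the transaction.
```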
### Nested Buckets

You can also store a `Bucket` in a key to create nested buckets. The API is the
same as the bucket-management API on the `Tx` object:

```go
func (*Bucket) CreateBucket(key []byte) (*Bucket, error)
func (*Bucket) CreateBucketIfNotExists(key []byte) (*Bucket, error)
func (*Bucket) DeleteBucket(key []byte) error
```

Suppose, for example, you had a `multi-tenant` application where the root-level
bucket was the `Account` bucket. Inside this bucket is a sequence of `accounts`,
which are themselves buckets. And inside each account bucket, you could have many
more buckets pertaining to the `Account` itself (e.g. Users, Notes, etc),
isolating the information into logical groupings.

```go
// createUser creates a new user in the given account.
func createUser(accountID int, u *User) error {
	// Start the in-memory database transaction.
	tx, err := db.Begin(true)
	if err != nil {
		return err
	}
	defer tx.Rollback()

	// Retrieve the root Bucket for the account.
	// Assume this has already been created when the account was set up.
	root := tx.Bucket([]byte(strconv.FormatUint(uint64(accountID), 10)))

	// Set up the users Bucket.
	bkt, err := root.CreateBucketIfNotExists([]byte("USERS"))
	if err != nil {
		return err
	}

	// Generate an ID for the new User.
	userID, err := bkt.NextSequence()
	if err != nil {
		return err
	}
	u.ID = int(userID)

	// Marshal and save the encoded User.
	if buf, err := json.Marshal(u); err != nil {
		return err
	} else if err := bkt.Put([]byte(strconv.FormatUint(userID, 10)), buf); err != nil {
		return err
	}

	// Commit the in-memory database transaction.
	if err := tx.Commit(); err != nil {
		return err
	}

	return nil
}
```

### In-Memory Database Backups

The [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database storage
engine stores data in a single `file`, so it's easy to back up. You can use the
`Tx.WriteTo()` function to write a consistent view of the in-memory database to a
writer. If you call this from a `read-only` transaction, it will perform a
`hot backup` and not block your other database reads and writes.

By default, it will use a regular file handle, which will utilize the operating
system's page cache.

A common use case is `backup over HTTP`, so that you can use tools like `cURL`
to do in-memory database backups:

```go
func BackupHandleFunc(w http.ResponseWriter, req *http.Request) {
	err := db.View(func(tx *memory.Tx) error {
		w.Header().Set("Content-Type", "application/octet-stream")
		w.Header().Set("Content-Disposition", `attachment; filename="my.db"`)
		w.Header().Set("Content-Length", strconv.Itoa(int(tx.Size())))
		_, err := tx.WriteTo(w)
		return err
	})
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
	}
}
```

Then you can `backup` the data using this command:

```sh
$ curl http://localhost/backup > my.db
```

Or you can open a web browser and point it to `http://localhost/backup`. It will
download the in-memory data `snapshot` automatically.

If you want to back up to another file, you can use the `Tx.CopyFile()` helper
function.
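For example, a minimal sketch of a file-to-file backup, assuming `Tx.CopyFile()`
takes a destination path and a file mode (as in similar storage engines; the
`backup.db` path is illustrative):

```go
err := db.View(func(tx *memory.Tx) error {
	// Write a consistent snapshot of the database to backup.db.
	return tx.CopyFile("backup.db", 0600)
})
if err != nil {
	log.Fatal(err)
}
```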
### Statistics

The [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database keeps
a running count of many of the internal operations as it performs them, so that
you can better understand what's going on. By grabbing an in-memory data
`snapshot` of these stats at two points in time, we can analyze what operations
were performed during that `time range`.

For example, we could start a goroutine to log the `stats` every 10 seconds:

```go
go func() {
	// Grab the initial stats.
	prev := memdb.Stats()

	for {
		// Wait for 10s.
		time.Sleep(10 * time.Second)

		// Grab the current stats and diff them.
		stats := memdb.Stats()
		diff := stats.Sub(&prev)

		// Encode stats to JSON and print to STDERR.
		json.NewEncoder(os.Stderr).Encode(diff)

		// Save stats for the next loop.
		prev = stats
	}
}()
```

It's also useful to `pipe` these stats to a service such as `statsd` for
monitoring, or to provide an HTTP endpoint that performs a fixed-length sample.

### Read-only Mode

Sometimes it is useful to create a shared, `read-only`
[Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database. To do
this, set the `Options.ReadOnly` flag when opening your in-memory database.
The `read-only` mode uses a shared lock to allow multiple processes to read
from the in-memory database, but it will block any process from opening the
database file in `read-write` mode.

```go
db, err := memory.Open("my.db", 0666, &memory.Options{ReadOnly: true})
if err != nil {
	log.Fatal(err)
}
```

### Mobile Platform usage (e.g., Android / iOS)

The [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database storage
engine is able to run on mobile devices by leveraging the binding feature of the
[GoMobile](https://github.com/golang/mobile) tool. Create a `struct` that
contains your [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory
database storage logic and a reference to a `*memory.DB`, with an initializing
constructor that takes in a file path where the database file will be stored.
Neither `Android` nor `iOS` requires extra permissions or cleanup for using
this method.

```go
func NewCacheDB(filepath string) *CacheDB {
	db, err := memory.Open(filepath+"/demo.db", 0600, nil)
	if err != nil {
		log.Fatal(err)
	}

	return &CacheDB{db}
}

type CacheDB struct {
	db *memory.DB
	...
}

func (b *CacheDB) Path() string {
	return b.db.Path()
}

func (b *CacheDB) Close() {
	b.db.Close()
}
```

The database logic should be defined as `methods` on this wrapper `struct`.
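To generate the native bindings, you can run `gomobile bind` against the package
containing the wrapper struct (the `github.com/example/cachemobiledemo` package
path below is illustrative; output file names vary by gomobile version):

```sh
# Android: produces an archive (.aar) to import into the project.
$ gomobile bind -target=android github.com/example/cachemobiledemo

# iOS: produces a framework to add to the Xcode project.
$ gomobile bind -target=ios github.com/example/cachemobiledemo
```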
To initialize this `struct` from the native language, use the snippets below.
Both mobile platforms now sync their local storage to the cloud, so these
snippets also disable that functionality for the database file.

#### Android Platform

```java
String path;
if (android.os.Build.VERSION.SDK_INT >= android.os.Build.VERSION_CODES.LOLLIPOP) {
    path = getNoBackupFilesDir().getAbsolutePath();
} else {
    path = getFilesDir().getAbsolutePath();
}
Cachemobiledemo.CacheDB cacheDB = Cachemobiledemo.NewCacheDB(path);
```

#### iOS Platform

```objc
- (void)demo {
    NSString* path = [NSSearchPathForDirectoriesInDomains(NSLibraryDirectory,
                                                          NSUserDomainMask,
                                                          YES) objectAtIndex:0];
    GoCachemobiledemoCacheDB * demo = GoCachemobiledemoNewCacheDB(path);
    [self addSkipBackupAttributeToItemAtPath:demo.path];
    // Some DB logic would go here
    [demo close];
}

- (BOOL)addSkipBackupAttributeToItemAtPath:(NSString *)filePathString
{
    NSURL* URL = [NSURL fileURLWithPath:filePathString];
    assert([[NSFileManager defaultManager] fileExistsAtPath:[URL path]]);

    NSError *error = nil;
    BOOL success = [URL setResourceValue:[NSNumber numberWithBool:YES]
                                  forKey:NSURLIsExcludedFromBackupKey
                                   error:&error];
    if (!success) {
        NSLog(@"Error excluding %@ from backup %@", [URL lastPathComponent], error);
    }
    return success;
}
```

## Comparing with other Database Systems

### PostgreSQL, MySQL, & other relational databases

Relational databases structure data into rows and are only accessible through
the use of SQL. This approach provides flexibility in how you store and query
your data, but also incurs overhead in parsing and planning SQL statements. The
[Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database engine
accesses all data by a byte slice key. This makes
[Bhojpur Cache](https://github.com/bhojpur/cache) fast to read and write data
by key, but provides no built-in support for joining values together.

Most relational databases (with the exception of `SQLite`) are standalone servers
that run separately from your application. This gives your systems flexibility
to connect multiple application servers to a single database server, but also
adds overhead in serializing and transporting data over the network. The
[Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database engine runs
as a library included in your application, so all data access has to go through
your application's process. This brings data closer to your application, but
limits multi-process access to the data.

### LevelDB, RocksDB

`LevelDB` and its derivatives (e.g., RocksDB, HyperLevelDB) are similar to
the [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database storage
engine in that they are libraries bundled into the application; however, their
underlying structure is a log-structured merge-tree (LSM tree). An `LSM` tree
optimizes random writes by using a `write ahead` log and multi-tiered, sorted
files called `SSTables`. The [Bhojpur Cache](https://github.com/bhojpur/cache)
in-memory database storage engine uses a `B+tree` internally and only a single
file. Both approaches have trade-offs.

If you require a high random write throughput (>10,000 w/sec) or you need to use
spinning disk drives, then `LevelDB` could be a good choice.
If your application
is `read-heavy` or does a lot of range scans, then the
[Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database storage
engine could be a good choice.

Another important consideration is that `LevelDB` does not have transactions.
It supports batch writing of key/value pairs and it supports read snapshots,
but it will not give you the ability to do a `compare-and-swap` operation safely.
The [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database storage
engine supports fully serializable ACID transactions.

### LMDB

The [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database storage
engine was originally a port of `LMDB`, so it is architecturally similar. Both use
a `B+tree`, have ACID semantics with fully serializable transactions, and support
lock-free MVCC using a `single writer` and `multiple readers`.

The two projects have somewhat diverged. LMDB heavily focuses on raw performance,
while the [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database
storage engine has focused on simplicity and ease of use. For example, LMDB
allows several unsafe actions, such as `direct writes`, for the sake of
performance. The [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory
database storage engine opts to disallow actions which can leave the database
in a corrupted state. The only exception to this in
[Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database storage
engine is `DB.NoSync`.

There are also a few differences in API. LMDB requires a maximum `mmap` size when
opening an `mdb_env`, whereas the [Bhojpur Cache](https://github.com/bhojpur/cache)
in-memory database storage engine handles incremental `mmap` resizing
automatically. LMDB overloads the `getter` and `setter` functions with
multiple flags, whereas [Bhojpur Cache](https://github.com/bhojpur/cache)
in-memory database splits these specialized cases into their own functions.

## Caveats & Limitations

It's important to pick the right tool for the job, and the
[Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database storage
engine is no exception. Here are a few things to note when evaluating and
using it:

* The [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database storage
  engine is good for read-intensive workloads. Sequential write performance is
  also fast, but random writes can be slow. You can use `DB.Batch()` or add a
  write-ahead log to help mitigate this issue.

* The [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database storage
  engine uses a `B+tree` internally, so there can be a lot of random page access.
  **Solid-state drives** (SSDs) provide a significant performance boost over
  spinning disk drives.

* Try to avoid `long running` read transactions. The
  [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database storage
  engine uses `copy-on-write`, so old pages cannot be reclaimed while an old
  transaction is using them.

* Byte slices returned from the [Bhojpur Cache](https://github.com/bhojpur/cache)
  in-memory database storage engine are only valid during a transaction. Once
  the transaction has been committed or rolled back, the memory they point
  to can be reused by a new page or can be unmapped from virtual memory, and
  you'll see an `unexpected fault address` panic when accessing it.
* The [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database storage
  engine uses an exclusive write lock on the database file, so it cannot be
  shared by multiple processes.

* Be careful when using `Bucket.FillPercent`. Setting a high fill percent for
  `Buckets` that have random inserts will cause your database to have very
  poor page utilization.

* In general, use **larger** buckets. Smaller buckets cause poor memory page
  utilization once they become larger than the page size (typically 4KB).

* Bulk loading a lot of random writes into a new `Bucket` can be slow, as the
  page will not split until the transaction is committed. Randomly inserting
  more than 100,000 key/value pairs into a single new `Bucket` in a single
  transaction is not advised.

* The [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database storage
  engine uses a `memory-mapped` file, so the underlying operating system handles
  the caching of the data. Typically, the OS will cache as much of the file as
  it can in memory and will release memory as needed to other processes.
  This means that the [Bhojpur Cache](https://github.com/bhojpur/cache) storage
  engine can show very high memory usage when working with large databases.
  However, this is expected, and the OS will release memory as needed. The
  [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory storage engine can
  handle databases much larger than the available physical RAM, provided its
  `memory-map` fits in the process virtual address space. This may be
  problematic on 32-bit systems.

* The data structures in the [Bhojpur Cache](https://github.com/bhojpur/cache)
  in-memory database are memory mapped, so the data file will be `endian`
  specific. This means that you cannot copy a
  [Bhojpur Cache](https://github.com/bhojpur/cache) database file from a
  little-endian machine to a big-endian machine and have it work. For most
  users this is not a concern, since most modern CPUs are little endian.

* Because of the way `pages` are laid out on disk, the
  [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database storage
  engine cannot truncate data files and return free pages back to the disk.
  Instead, it maintains a `free list` of unused pages within its data file.
  These free pages can be reused by later transactions. This works well for
  many use cases, as databases generally tend to grow. However, it's important
  to note that deleting large chunks of data will not allow you to reclaim
  that space on disk.

## Reading the Source Code

The [Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database storage
engine is a relatively small code base (<3KLOC) for an embedded, serializable,
transactional key/value database, so it can be a good starting point for people
interested in how databases work.

The best places to start are the main entry points into the
[Bhojpur Cache](https://github.com/bhojpur/cache) in-memory database storage
engine:

- `Open()` - Initializes the reference to the database. It's responsible for
  creating the database if it doesn't exist, obtaining an exclusive lock on the
  file, reading the meta pages, and memory-mapping the file.
- `DB.Begin()` - Starts a read-only or read-write transaction depending on the
  value of the `writable` argument. This requires briefly obtaining the **meta**
  lock to keep track of open transactions. Only one read-write transaction can
  exist at a time, so the **rwlock** is acquired during the life of a read-write
  transaction.

- `Bucket.Put()` - Writes a key/value pair into a `Bucket`. After validating the
  arguments, a cursor is used to traverse the B+tree to the page and position
  where the key & value will be written. Once the position is found, the bucket
  materializes the underlying page and the page's parent pages into memory as
  "nodes". These nodes are where mutations occur during read-write transactions.
  These changes get flushed to disk during commit.

- `Bucket.Get()` - Retrieves a key/value pair from a `Bucket`. This uses a cursor
  to move to the page & position of the key/value pair. During a `read-only`
  transaction, the key and value data are returned as a direct reference to the
  underlying mmap file, so there's no allocation overhead. For `read-write`
  transactions, this data may reference the mmap file or one of the in-memory
  node values.

- `Cursor` - This object is simply for traversing the B+tree of on-disk pages
  or in-memory nodes. It can seek to a specific key, move to the first or last
  value, or move forward or backward. The cursor handles the movement up
  and down the B+tree transparently to the end user.

- `Tx.Commit()` - Converts the in-memory dirty nodes and the list of free pages
  into pages to be written to disk. Writing to disk then occurs in two phases.
  First, the dirty pages are written to disk and an `fsync()` occurs. Second,
  a new meta page with an incremented transaction ID is written and another
  `fsync()` occurs. This `two-phase write` ensures that partially written data
  pages are ignored in the event of a crash, since the meta page pointing to
  them is never written. Partially written meta pages are invalidated, because
  they are written with a checksum.