# Ristretto
[![Go Doc](https://img.shields.io/badge/godoc-reference-blue.svg)](http://godoc.org/github.com/outcaste-io/ristretto)
[![Go Report Card](https://img.shields.io/badge/go%20report-A%2B-brightgreen)](https://goreportcard.com/report/github.com/outcaste-io/ristretto)
[![Coverage](https://gocover.io/_badge/github.com/outcaste-io/ristretto)](https://gocover.io/github.com/outcaste-io/ristretto)
![Tests](https://github.com/outcaste-io/ristretto/workflows/tests/badge.svg)

**This is a fork of dgraph-io/ristretto, maintained by @manishrjain.**

Ristretto is a fast, concurrent cache library built with a focus on performance and correctness.

The motivation to build Ristretto comes from the need for a contention-free
cache.

[issues]: https://github.com/outcaste-io/issues

## Features

* **High Hit Ratios** - with our unique admission/eviction policy pairing, Ristretto's performance is best in class.
	* **Eviction: SampledLFU** - on par with exact LRU and better performance on Search and Database traces.
	* **Admission: TinyLFU** - extra performance with little memory overhead (12 bits per counter).
* **Fast Throughput** - we use a variety of techniques for managing contention and the result is excellent throughput.
* **Cost-Based Eviction** - any large new item deemed valuable can evict multiple smaller items (cost could be anything).
* **Fully Concurrent** - you can use as many goroutines as you want with little throughput degradation.
* **Metrics** - optional performance metrics for throughput, hit ratios, and other stats.
* **Simple API** - just figure out your ideal `Config` values and you're off and running.

## Note on jemalloc

We have been using jemalloc v5.2.1.
To use jemalloc, configure it with these flags:

```
./configure --with-install-suffix='_outcaste' --with-jemalloc-prefix='je_' --with-malloc-conf='background_thread:true,metadata_thp:auto'; \
make
make install_lib install_include # Use sudo if needed in this step.
```

The outserv/outserv Makefile already includes these build steps. You can run
`make jemalloc` to install it. This jemalloc build does not interfere with any
other jemalloc installation that might already be present on the system.

## Status

Ristretto is production-ready. See [Projects using Ristretto](#projects-using-ristretto).

## Table of Contents

* [Usage](#Usage)
	* [Example](#Example)
	* [Config](#Config)
		* [NumCounters](#Config)
		* [MaxCost](#Config)
		* [BufferItems](#Config)
		* [Metrics](#Config)
		* [OnEvict](#Config)
		* [KeyToHash](#Config)
		* [Cost](#Config)
* [Benchmarks](#Benchmarks)
	* [Hit Ratios](#Hit-Ratios)
		* [Search](#Search)
		* [Database](#Database)
		* [Looping](#Looping)
		* [CODASYL](#CODASYL)
	* [Throughput](#Throughput)
		* [Mixed](#Mixed)
		* [Read](#Read)
		* [Write](#Write)
* [Projects using Ristretto](#projects-using-ristretto)
* [FAQ](#FAQ)

## Usage

### Example

```go
package main

import (
	"fmt"

	"github.com/outcaste-io/ristretto"
)

func main() {
	cache, err := ristretto.NewCache(&ristretto.Config{
		NumCounters: 1e7,     // number of keys to track frequency of (10M).
		MaxCost:     1 << 30, // maximum cost of cache (1GB).
		BufferItems: 64,      // number of keys per Get buffer.
	})
	if err != nil {
		panic(err)
	}

	// set a value with a cost of 1
	cache.Set("key", "value", 1)

	// wait for value to pass through buffers
	cache.Wait()

	value, found := cache.Get("key")
	if !found {
		panic("missing value")
	}
	fmt.Println(value)

	// remove the value from the cache
	cache.Del("key")
}
```

### Config

The `Config` struct is passed to `NewCache` when creating Ristretto instances (see the example above).

**NumCounters** `int64`

NumCounters is the number of 4-bit access counters to keep for admission and eviction. We've seen good performance in setting this to 10x the number of items you expect to keep in the cache when full.

For example, if you expect each item to have a cost of 1 and MaxCost is 100, set NumCounters to 1,000. Or, if you use variable cost values but expect the cache to hold around 10,000 items when full, set NumCounters to 100,000. The important thing is the *number of unique items* in the full cache, not necessarily the MaxCost value.

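The 10x rule of thumb for NumCounters can be written down as a one-line helper. This is only a sketch of the arithmetic in the two examples above; `suggestedNumCounters` is a hypothetical name and not part of Ristretto.

```go
package main

import "fmt"

// suggestedNumCounters applies the 10x rule of thumb: roughly ten counters
// per item expected in the cache when it is full.
// Hypothetical helper for illustration; not part of Ristretto.
func suggestedNumCounters(expectedItems int64) int64 {
	return expectedItems * 10
}

func main() {
	// Uniform cost of 1 with MaxCost 100 means about 100 items when full.
	fmt.Println(suggestedNumCounters(100)) // 1000

	// Variable costs, but about 10,000 items expected when full.
	fmt.Println(suggestedNumCounters(10000)) // 100000
}
```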
**MaxCost** `int64`

MaxCost is the total cost the cache may hold, and eviction decisions are made against it. For example, if MaxCost is 100 and a new item with a cost of 1 increases total cache cost to 101, 1 item will be evicted.

MaxCost can also be used to denote the max size in bytes. For example, if MaxCost is 1,000,000 (1MB) and the cache is full with 1,000 1KB items, a new 5KB item (that's accepted) would cause 5 1KB items to be evicted.

MaxCost could be anything as long as it matches how you're using the cost values when calling Set.

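The eviction arithmetic in the MaxCost examples can be checked with a small calculation. The sketch below assumes every resident item has the same cost and that the incoming item in the byte-sized example costs 5,000, which is the value that yields five evictions. `evictionsNeeded` is a hypothetical helper; it does not model how Ristretto actually selects victims.

```go
package main

import "fmt"

// evictionsNeeded returns how many resident items of uniform cost itemCost
// must be removed so that an accepted item of cost incoming fits under
// maxCost. itemCost must be positive.
// Hypothetical helper for illustration; not part of Ristretto.
func evictionsNeeded(maxCost, used, incoming, itemCost int64) int64 {
	overflow := used + incoming - maxCost
	if overflow <= 0 {
		return 0 // the new item fits without evicting anything
	}
	// Round up: a partially covered overflow still needs one more eviction.
	return (overflow + itemCost - 1) / itemCost
}

func main() {
	// MaxCost 100, cache full of cost-1 items, new item of cost 1.
	fmt.Println(evictionsNeeded(100, 100, 1, 1)) // 1

	// MaxCost 1,000,000, cache full of 1,000 items of cost 1,000,
	// new item assumed to cost 5,000.
	fmt.Println(evictionsNeeded(1000000, 1000000, 5000, 1000)) // 5
}
```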
**BufferItems** `int64`

BufferItems is the size of the Get buffers. The best value we've found for this is 64.

If for some reason you see Get performance decreasing with lots of contention (you shouldn't), try increasing this value in increments of 64. This is a fine-tuning mechanism and you probably won't have to touch this.

**Metrics** `bool`

Metrics is true when you want real-time logging of a variety of stats. This is a Config flag because collecting metrics carries a 10% throughput overhead.

**OnEvict** `func(hashes [2]uint64, value interface{}, cost int64)`

OnEvict is called for every eviction.

**KeyToHash** `func(key interface{}) [2]uint64`

KeyToHash is the hashing algorithm used for every key. If this is nil, Ristretto has a variety of [defaults depending on the underlying interface type](https://github.com/outcaste-io/ristretto/blob/master/z/z.go#L19-L41).

Note that if you want 128bit hashes you should use the full `[2]uint64`,
otherwise just fill the `uint64` at the `0` position and it will behave like
any 64bit hash.

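A custom function with the KeyToHash signature might look like the sketch below. It fills only position `0`, so it behaves like a 64-bit hash. The use of FNV-1a from the standard library and the restriction to `string` and `[]byte` keys are assumptions made for illustration; they are not Ristretto's defaults.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// keyToHash has the KeyToHash signature described above. It supports only
// string and []byte keys and fills just position 0 of the result.
// FNV-1a is an illustrative choice, not what Ristretto uses by default.
func keyToHash(key interface{}) [2]uint64 {
	h := fnv.New64a()
	switch k := key.(type) {
	case string:
		h.Write([]byte(k))
	case []byte:
		h.Write(k)
	default:
		panic(fmt.Sprintf("unsupported key type %T", key))
	}
	return [2]uint64{h.Sum64(), 0}
}

func main() {
	fromString := keyToHash("key")
	fromBytes := keyToHash([]byte("key"))

	// The same bytes hash to the same value, and position 1 stays zero.
	fmt.Println(fromString == fromBytes, fromString[1] == 0)
}
```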
**Cost** `func(value interface{}) int64`

Cost is an optional function you can pass to the Config in order to evaluate
item cost at runtime, and only for the Set calls that aren't dropped (this is
useful if calculating item cost is particularly expensive and you don't want to
waste time on items that will be dropped anyway).

To signal to Ristretto that you'd like to use this Cost function:

1. Set the Cost field to a non-nil function.
2. When calling Set for new items or item updates, use a `cost` of 0.

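As a rough illustration of the Cost signature, the function below charges each value by its length in bytes. The name `byteCost` and the fallback cost of 1 for other types are assumptions made for this sketch.

```go
package main

import "fmt"

// byteCost has the Cost signature described above and charges each value
// by its length in bytes. Unknown types fall back to a cost of 1, which is
// an arbitrary choice for this sketch.
func byteCost(value interface{}) int64 {
	switch v := value.(type) {
	case string:
		return int64(len(v))
	case []byte:
		return int64(len(v))
	default:
		return 1
	}
}

func main() {
	fmt.Println(byteCost("value"))         // 5
	fmt.Println(byteCost([]byte{1, 2, 3})) // 3
	fmt.Println(byteCost(42))              // 1
}
```

Following the two steps above, a function like this would be assigned to the `Cost` field of `Config`, and Set would then be called with a `cost` of 0.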
## Benchmarks

The benchmarks can be found in https://github.com/dgraph-io/benchmarks/tree/master/cachebench/ristretto.

### Hit Ratios

#### Search

This trace is described as "disk read accesses initiated by a large commercial
search engine in response to various web search requests."

<p align="center">
	<img src="https://raw.githubusercontent.com/dgraph-io/ristretto/master/benchmarks/Hit%20Ratios%20-%20Search%20(ARC-S3).svg">
</p>

#### Database

This trace is described as "a database server running at a commercial site
running an ERP application on top of a commercial database."

<p align="center">
	<img src="https://raw.githubusercontent.com/dgraph-io/ristretto/master/benchmarks/Hit%20Ratios%20-%20Database%20(ARC-DS1).svg">
</p>

#### Looping

This trace demonstrates a looping access pattern.

<p align="center">
	<img src="https://raw.githubusercontent.com/dgraph-io/ristretto/master/benchmarks/Hit%20Ratios%20-%20Glimpse%20(LIRS-GLI).svg">
</p>

#### CODASYL

This trace is described as "references to a CODASYL database for a one hour
period."

<p align="center">
	<img src="https://raw.githubusercontent.com/dgraph-io/ristretto/master/benchmarks/Hit%20Ratios%20-%20CODASYL%20(ARC-OLTP).svg">
</p>

### Throughput

All throughput benchmarks were run on an Intel Core i7-8700K (3.7GHz) with 16GB
of RAM.

#### Mixed

<p align="center">
	<img src="https://raw.githubusercontent.com/dgraph-io/ristretto/master/benchmarks/Throughput%20-%20Mixed.svg">
</p>

#### Read

<p align="center">
	<img src="https://raw.githubusercontent.com/dgraph-io/ristretto/master/benchmarks/Throughput%20-%20Read%20(Zipfian).svg">
</p>

#### Write

<p align="center">
	<img src="https://raw.githubusercontent.com/dgraph-io/ristretto/master/benchmarks/Throughput%20-%20Write%20(Zipfian).svg">
</p>

## Projects Using Ristretto

Below is a list of known projects that use Ristretto:

- [Badger](https://github.com/dgraph-io/badger) - Embeddable key-value DB in Go
- [Dgraph](https://github.com/dgraph-io/dgraph) - Horizontally scalable and distributed GraphQL database with a graph backend
- [Vitess](https://github.com/vitessio/vitess) - Database clustering system for horizontal scaling of MySQL
- [SpiceDB](https://github.com/authzed/spicedb) - Horizontally scalable permissions database

## FAQ

### How are you achieving this performance? What shortcuts are you taking?

We go into detail in the [Ristretto blog post](https://blog.dgraph.io/post/introducing-ristretto-high-perf-go-cache/), but in short: our throughput performance can be attributed to a mix of batching and eventual consistency. Our hit ratio performance is mostly due to an excellent [admission policy](https://arxiv.org/abs/1512.00727) and SampledLFU eviction policy.

As for "shortcuts," the only thing Ristretto does that could be construed as one is dropping some Set calls. That means a Set call for a new item (updates are guaranteed) isn't guaranteed to make it into the cache. The new item could be dropped at two points: when passing through the Set buffer or when passing through the admission policy. However, this barely affects hit ratios, as we expect the most popular items to be Set multiple times and eventually make it into the cache.

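The first drop point, the Set buffer, can be pictured as a bounded buffer with a non-blocking send: when the buffer is full, the new item is discarded instead of making the caller wait. The sketch below illustrates that general technique under stated assumptions and is not taken from Ristretto's source; `tryEnqueue` and the channel of string keys are hypothetical.

```go
package main

import "fmt"

// tryEnqueue attempts a non-blocking send on buf. It reports false when the
// buffer is full, in which case the key is dropped rather than blocking the
// caller. Hypothetical helper for illustration; not part of Ristretto.
func tryEnqueue(buf chan string, key string) bool {
	select {
	case buf <- key:
		return true
	default:
		return false
	}
}

func main() {
	// A tiny capacity makes the drop easy to observe.
	buf := make(chan string, 2)
	for _, key := range []string{"a", "b", "c"} {
		fmt.Println(key, tryEnqueue(buf, key))
	}
	// Prints:
	// a true
	// b true
	// c false
}
```

Because a popular key is likely to be Set again later, a dropped attempt like `c` above gets further chances to enter the buffer once there is room.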
### Is Ristretto distributed?

No, it's just like any other Go library that you can import into your project and use in a single process.