github.com/etecs-ru/ristretto@v0.9.1/README.md (about)

# Ristretto
[![Go Reference](https://pkg.go.dev/badge/github.com/etecs-ru/ristretto.svg)](https://pkg.go.dev/github.com/etecs-ru/ristretto)
[![Go Report Card](https://img.shields.io/badge/go%20report-A%2B-brightgreen)](https://goreportcard.com/report/github.com/etecs-ru/ristretto)
[![Release](https://img.shields.io/github/release/etecs-ru/ristretto.svg)](https://github.com/etecs-ru/ristretto/releases/latest)
[![Coverage](https://codecov.io/gh/etecs-ru/ristretto/branch/master/graph/badge.svg?token=UDWD7LBVTK)](https://codecov.io/gh/etecs-ru/ristretto)
![Tests](https://github.com/etecs-ru/ristretto/workflows/build/badge.svg)

## This is a fork of [github.com/dgraph-io/ristretto](https://github.com/dgraph-io/ristretto).

This fork applies selected PRs from the original repo and runs GitHub Actions on each PR. See CHANGELOG.md.

----

Ristretto is a fast, concurrent cache library built with a focus on performance and correctness.

The motivation to build Ristretto comes from the need for a contention-free
cache in [Dgraph][].

**Use [Discuss Issues](https://discuss.dgraph.io/tags/c/issues/35/ristretto/40) for reporting issues about this repository.**

[Dgraph]: https://github.com/dgraph-io/dgraph

## Features

* **High Hit Ratios** - with our unique admission/eviction policy pairing, Ristretto's performance is best in class.
	* **Eviction: SampledLFU** - on par with exact LRU and better performance on Search and Database traces.
	* **Admission: TinyLFU** - extra performance with little memory overhead (12 bits per counter).
* **Fast Throughput** - we use a variety of techniques for managing contention and the result is excellent throughput.
* **Cost-Based Eviction** - any large new item deemed valuable can evict multiple smaller items (cost could be anything).
* **Fully Concurrent** - you can use as many goroutines as you want with little throughput degradation.
* **Metrics** - optional performance metrics for throughput, hit ratios, and other stats.
* **Simple API** - just figure out your ideal `Config` values and you're off and running.

## Status

Ristretto is production-ready. See [Projects using Ristretto](#projects-using-ristretto).

## Table of Contents

* [Usage](#Usage)
	* [Example](#Example)
	* [Config](#Config)
		* [NumCounters](#Config)
		* [MaxCost](#Config)
		* [BufferItems](#Config)
		* [Metrics](#Config)
		* [OnEvict](#Config)
		* [KeyToHash](#Config)
		* [Cost](#Config)
	* [Testing](#Testing)
* [Benchmarks](#Benchmarks)
	* [Hit Ratios](#Hit-Ratios)
		* [Search](#Search)
		* [Database](#Database)
		* [Looping](#Looping)
		* [CODASYL](#CODASYL)
	* [Throughput](#Throughput)
		* [Mixed](#Mixed)
		* [Read](#Read)
		* [Write](#Write)
* [Projects using Ristretto](#projects-using-ristretto)
* [FAQ](#FAQ)

## Usage

### Example

```go
package main

import (
	"fmt"

	"github.com/etecs-ru/ristretto"
)

func main() {
	cache, err := ristretto.NewCache(&ristretto.Config{
		NumCounters: 1e7,     // number of keys to track frequency of (10M).
		MaxCost:     1 << 30, // maximum cost of cache (1GB).
		BufferItems: 64,      // number of keys per Get buffer.
	})
	if err != nil {
		panic(err)
	}

	// set a value with a cost of 1
	cache.Set("key", "value", 1)

	// wait for value to pass through buffers
	cache.Wait()

	value, found := cache.Get("key")
	if !found {
		panic("missing value")
	}
	fmt.Println(value)
	cache.Del("key")
}
```

### Config

The `Config` struct is passed to `NewCache` when creating Ristretto instances (see the example above).

**NumCounters** `int64`

NumCounters is the number of 4-bit access counters to keep for admission and eviction. We've seen good performance when setting this to 10x the number of items you expect to keep in the cache when full.

For example, if you expect each item to have a cost of 1 and MaxCost is 100, set NumCounters to 1,000. Or, if you use variable cost values but expect the cache to hold around 10,000 items when full, set NumCounters to 100,000. The important thing is the *number of unique items* in the full cache, not necessarily the MaxCost value.
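
The rule of thumb above is plain arithmetic, and a minimal sketch can make it concrete. The helper name `suggestNumCounters` is hypothetical and not part of the library; it only encodes the 10x guideline.

```go
package main

import "fmt"

// suggestNumCounters applies the README's rule of thumb: keep roughly
// 10x as many counters as the number of unique items in a full cache.
// The name is hypothetical; the library does not provide this helper.
func suggestNumCounters(expectedItems int64) int64 {
	return 10 * expectedItems
}

func main() {
	// MaxCost of 100 with cost-1 items means about 100 items when full.
	fmt.Println(suggestNumCounters(100)) // 1000

	// Variable costs, but around 10,000 items when full.
	fmt.Println(suggestNumCounters(10000)) // 100000
}
```

The result would be passed as `NumCounters` in the `Config`; the input is an estimate of item count, not the MaxCost value.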

**MaxCost** `int64`

MaxCost is the maximum total cost the cache may hold, and it is how eviction decisions are made. For example, if MaxCost is 100 and a new item with a cost of 1 increases total cache cost to 101, 1 item will be evicted.

MaxCost can also be used to denote the max size in bytes. For example, if MaxCost is 1,000,000 (1MB) and the cache is full with 1,000 1KB items, a new 5KB item (that's accepted) would cause five 1KB items to be evicted.

MaxCost could be anything as long as it matches how you're using the cost values when calling Set.
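
The two examples above can be checked with a short calculation. This is a sketch, not the library's eviction code: `victimsNeeded` is a hypothetical helper, it assumes every evicted item has the same cost, and the byte example assumes the incoming item costs 5,000 (a value chosen so the arithmetic yields five evictions).

```go
package main

import "fmt"

// victimsNeeded returns how many resident items of cost victimCost must be
// evicted so that an incoming item of cost newCost fits under maxCost.
// It assumes victimCost > 0 and that all victims share the same cost.
func victimsNeeded(maxCost, usedCost, newCost, victimCost int64) int64 {
	overflow := usedCost + newCost - maxCost
	if overflow <= 0 {
		return 0 // the new item fits without evicting anything
	}
	return (overflow + victimCost - 1) / victimCost // round up
}

func main() {
	// MaxCost 100, cache full, new item of cost 1.
	fmt.Println(victimsNeeded(100, 100, 1, 1)) // 1

	// MaxCost 1,000,000, full of 1,000-cost items, new item of cost 5,000.
	fmt.Println(victimsNeeded(1000000, 1000000, 5000, 1000)) // 5
}
```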

**BufferItems** `int64`

BufferItems is the size of the Get buffers. The best value we've found for this is 64.

If for some reason you see Get performance decreasing with lots of contention (you shouldn't), try increasing this value in increments of 64. This is a fine-tuning mechanism and you probably won't have to touch this.

**Metrics** `bool`

Metrics is true when you want real-time logging of a variety of stats. This is a Config flag because enabling it carries a 10% throughput overhead.

**OnEvict** `func(hashes [2]uint64, value interface{}, cost int64)`

OnEvict is called for every eviction.

**KeyToHash** `func(key interface{}) [2]uint64`

KeyToHash is the hashing algorithm used for every key. If this is nil, Ristretto has a variety of [defaults depending on the underlying interface type](https://github.com/etecs-ru/ristretto/blob/master/z/z.go#L19-L41).

Note that if you want 128-bit hashes you should use the full `[2]uint64`,
otherwise just fill the `uint64` at the `0` position and it will behave like
any 64-bit hash.
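
As an illustration of the 64-bit case, here is a minimal sketch of a function with the signature listed above that fills only position `0`. The name `keyToHash64` and the choice of FNV-1a from the standard library are assumptions made for this example, not the library's defaults; check the signature against the version you have installed before assigning it to `Config.KeyToHash`.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// keyToHash64 follows the signature this README lists for KeyToHash.
// It stores a 64-bit FNV-1a hash in position 0 and leaves position 1
// zero. FNV-1a is an arbitrary choice for this sketch.
func keyToHash64(key interface{}) [2]uint64 {
	h := fnv.New64a()
	switch k := key.(type) {
	case string:
		h.Write([]byte(k))
	case []byte:
		h.Write(k)
	default:
		// Fallback for other key types: hash their default formatting.
		fmt.Fprintf(h, "%v", k)
	}
	return [2]uint64{h.Sum64(), 0}
}

func main() {
	a := keyToHash64("key")
	b := keyToHash64("key")
	fmt.Println(a == b)    // true: the same key always hashes the same way
	fmt.Println(a[1] == 0) // true: only position 0 is filled
}
```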

**Cost** `func(value interface{}) int64`

Cost is an optional function you can pass to the Config in order to evaluate
item cost at runtime, and only for the Set calls that aren't dropped (this is
useful if calculating item cost is particularly expensive and you don't want to
waste time on items that will be dropped anyway).

To signal to Ristretto that you'd like to use this Cost function:

1. Set the Cost field to a non-nil function.
2. When calling Set for new items or item updates, use a `cost` of 0.
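
A Cost function with the signature listed above might look like the following sketch. The name `byteCost` and the policy of charging one unit for unknown types are assumptions made for illustration.

```go
package main

import "fmt"

// byteCost follows the signature this README lists for Cost.
// It charges strings and byte slices by length; the fallback of 1 for
// other types is an arbitrary choice for this sketch.
func byteCost(value interface{}) int64 {
	switch v := value.(type) {
	case string:
		return int64(len(v))
	case []byte:
		return int64(len(v))
	default:
		return 1
	}
}

func main() {
	fmt.Println(byteCost("value"))         // 5
	fmt.Println(byteCost([]byte{1, 2, 3})) // 3
	fmt.Println(byteCost(42))              // 1
}
```

Following the two steps above, such a function would be assigned to the `Cost` field and `Set` would then be called with a `cost` of 0.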

### Testing

If you wish to mock out the caching functionality for use in your tests, you should use the `CacheInterface` in your code.
This enables you to generate mocks using something like [mockery](https://github.com/vektra/mockery) or provide your own implementation.
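
To show the general idea of providing your own implementation, here is a minimal sketch of a map-backed test double. The `kvCache` interface below is a hypothetical, narrowed-down stand-in declared for this example; the real `CacheInterface` may have a different or larger method set, so check the package before copying it. The fake is not safe for concurrent use and never drops or evicts items.

```go
package main

import "fmt"

// kvCache is a hypothetical stand-in for the library's CacheInterface.
type kvCache interface {
	Get(key interface{}) (interface{}, bool)
	Set(key, value interface{}, cost int64) bool
	Del(key interface{})
}

// fakeCache is a map-backed test double that stores every item.
type fakeCache struct {
	items map[interface{}]interface{}
}

func newFakeCache() *fakeCache {
	return &fakeCache{items: make(map[interface{}]interface{})}
}

func (f *fakeCache) Get(key interface{}) (interface{}, bool) {
	v, ok := f.items[key]
	return v, ok
}

func (f *fakeCache) Set(key, value interface{}, cost int64) bool {
	f.items[key] = value // cost is ignored: the fake has no capacity limit
	return true
}

func (f *fakeCache) Del(key interface{}) {
	delete(f.items, key)
}

// greeting is example application code that depends only on the interface.
func greeting(c kvCache, user string) string {
	if v, ok := c.Get(user); ok {
		if s, ok := v.(string); ok {
			return s
		}
	}
	return "hello, stranger"
}

func main() {
	c := newFakeCache()
	c.Set("ann", "hello, Ann", 1)
	fmt.Println(greeting(c, "ann")) // hello, Ann
	c.Del("ann")
	fmt.Println(greeting(c, "ann")) // hello, stranger
}
```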

## Benchmarks

The benchmarks can be found in https://github.com/dgraph-io/benchmarks/tree/master/cachebench/ristretto.

### Hit Ratios

#### Search

This trace is described as "disk read accesses initiated by a large commercial
search engine in response to various web search requests."

<p align="center">
	<img src="https://raw.githubusercontent.com/etecs-ru/ristretto/master/benchmarks/Hit%20Ratios%20-%20Search%20(ARC-S3).svg">
</p>

#### Database

This trace is described as "a database server running at a commercial site
running an ERP application on top of a commercial database."

<p align="center">
	<img src="https://raw.githubusercontent.com/etecs-ru/ristretto/master/benchmarks/Hit%20Ratios%20-%20Database%20(ARC-DS1).svg">
</p>

#### Looping

This trace demonstrates a looping access pattern.

<p align="center">
	<img src="https://raw.githubusercontent.com/etecs-ru/ristretto/master/benchmarks/Hit%20Ratios%20-%20Glimpse%20(LIRS-GLI).svg">
</p>

#### CODASYL

This trace is described as "references to a CODASYL database for a one hour
period."

<p align="center">
	<img src="https://raw.githubusercontent.com/etecs-ru/ristretto/master/benchmarks/Hit%20Ratios%20-%20CODASYL%20(ARC-OLTP).svg">
</p>

### Throughput

All throughput benchmarks were run on an Intel Core i7-8700K (3.7GHz) with 16GB
of RAM.

#### Mixed

<p align="center">
	<img src="https://raw.githubusercontent.com/etecs-ru/ristretto/master/benchmarks/Throughput%20-%20Mixed.svg">
</p>

#### Read

<p align="center">
	<img src="https://raw.githubusercontent.com/etecs-ru/ristretto/master/benchmarks/Throughput%20-%20Read%20(Zipfian).svg">
</p>

#### Write

<p align="center">
	<img src="https://raw.githubusercontent.com/etecs-ru/ristretto/master/benchmarks/Throughput%20-%20Write%20(Zipfian).svg">
</p>

## Projects Using Ristretto

Below is a list of known projects that use Ristretto:

- [Badger](https://github.com/dgraph-io/badger) - Embeddable key-value DB in Go
- [Dgraph](https://github.com/dgraph-io/dgraph) - Horizontally scalable and distributed GraphQL database with a graph backend
- [Vitess](https://github.com/vitessio/vitess) - Database clustering system for horizontal scaling of MySQL
- [SpiceDB](https://github.com/authzed/spicedb) - Horizontally scalable permissions database

## FAQ

### How are you achieving this performance? What shortcuts are you taking?

We go into detail in the [Ristretto blog post](https://blog.dgraph.io/post/introducing-ristretto-high-perf-go-cache/), but in short: our throughput performance can be attributed to a mix of batching and eventual consistency. Our hit ratio performance is mostly due to an excellent [admission policy](https://arxiv.org/abs/1512.00727) and SampledLFU eviction policy.

As for "shortcuts," the only thing Ristretto does that could be construed as one is dropping some Set calls. That means a Set call for a new item isn't guaranteed to make it into the cache (updates to existing items are guaranteed). The new item could be dropped at two points: when passing through the Set buffer or when passing through the admission policy. However, this barely affects hit ratios, as we expect the most popular items to be Set multiple times and eventually make it into the cache.
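
To illustrate how a bounded buffer can drop a Set rather than block the caller, here is a minimal sketch using a non-blocking channel send. The `lossyBuffer` type and its `offer` method are hypothetical names for this example; this shows the general technique only and is not a copy of Ristretto's internals.

```go
package main

import "fmt"

// lossyBuffer is a hypothetical stand-in for a bounded Set buffer.
type lossyBuffer struct {
	ch chan string
}

func newLossyBuffer(size int) *lossyBuffer {
	return &lossyBuffer{ch: make(chan string, size)}
}

// offer enqueues key without blocking and reports whether it was accepted.
func (b *lossyBuffer) offer(key string) bool {
	select {
	case b.ch <- key:
		return true
	default:
		return false // buffer full: drop instead of making the caller wait
	}
}

func main() {
	buf := newLossyBuffer(2)
	for _, k := range []string{"a", "b", "c"} {
		fmt.Println(k, buf.offer(k))
	}
	// Prints:
	// a true
	// b true
	// c false
}
```

Because nothing drains the buffer in this sketch, the third offer is dropped. A popular key that is offered again later would get another chance, which is the intuition behind the limited effect on hit ratios.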

### Is Ristretto distributed?

No, it's just like any other Go library that you can import into your project and use in a single process.