
- Feature Name: Store Pool
- Status: completed
- Start Date: 2015-08-20
- RFC PR: [#2286](https://github.com/cockroachdb/cockroach/pull/2286),
          [#2336](https://github.com/cockroachdb/cockroach/pull/2336)
- Cockroach Issue: [#2149](https://github.com/cockroachdb/cockroach/issues/2149),
                   [#620](https://github.com/cockroachdb/cockroach/issues/620)

# Summary

Add a new `StorePool` service on each node that monitors all the stores and
reports on their current status and health. Given only a store ID, the pool
will report the health of that store. Initially this health will only
indicate whether the store is dead or alive, but it will expand to include
other factors in the future. The pool will also be the ideal place to add
any calculations about which store would be best suited to take on a new
replica, subsuming some of that work from the allocator.

This new service will work well with RFCs #2153 and #2171.

# Motivation

Decisions about when to add or remove replicas for rebalancing and repair
require knowledge of the health of other stores. There needs to be a local
source of truth for those decisions.

# Detailed design

## Configuration
Add a new configuration setting called `TimeUntilStoreDead`, the interval
after which a store that has not been heard from is considered dead. The
default value will be 5 minutes.

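As a rough sketch (the `Config` type, field names, and constructor here are
hypothetical, not the actual CockroachDB configuration code), the setting
could be represented like this:

```go
package storepool

import "time"

// defaultTimeUntilStoreDead is the proposed default: a store that has not
// been heard from for this long is considered dead.
const defaultTimeUntilStoreDead = 5 * time.Minute

// Config holds the StorePool settings (illustrative shape only).
type Config struct {
	// TimeUntilStoreDead is the interval after which a silent store is
	// treated as dead.
	TimeUntilStoreDead time.Duration
}

// DefaultConfig returns a Config with the 5-minute default described above.
func DefaultConfig() Config {
	return Config{TimeUntilStoreDead: defaultTimeUntilStoreDead}
}
```
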
## Monitor
Add a new service called `StorePool` that starts when the node starts. The
service will run until the stopper is triggered and will have access to
gossip.

`StorePool` will maintain a map of store IDs to store descriptors along with
a variety of health statistics about each store. It will also maintain a
`lastUpdatedTime` per store, set whenever that store's descriptor is
updated; if the store was previously marked as dead, it will be restored at
that point. To maintain this map, a callback from gossip for store
descriptors will be added. When the time since `lastUpdatedTime` exceeds
`TimeUntilStoreDead`, the store is considered dead and any replicas on it
may be removed. Note that the work to remove replicas is performed
elsewhere.

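A minimal sketch of this bookkeeping, assuming a simplified
`StoreDescriptor` and a hand-rolled `updateStore` callback in place of the
real gossip API:

```go
package storepool

import (
	"sync"
	"time"
)

// StoreDescriptor stands in for the gossiped store descriptor.
type StoreDescriptor struct {
	StoreID int32
	// Capacity, attributes, etc. would live here.
}

// storeDetail tracks the health state of a single store.
type storeDetail struct {
	desc            StoreDescriptor
	dead            bool
	lastUpdatedTime time.Time
}

// StorePool maintains a map from store ID to health details.
type StorePool struct {
	mu                 sync.Mutex
	timeUntilStoreDead time.Duration
	stores             map[int32]*storeDetail
}

// NewStorePool returns an empty pool with the given dead-store timeout.
func NewStorePool(timeUntilStoreDead time.Duration) *StorePool {
	return &StorePool{
		timeUntilStoreDead: timeUntilStoreDead,
		stores:             make(map[int32]*storeDetail),
	}
}

// updateStore is the gossip callback: it refreshes the descriptor and
// lastUpdatedTime, and restores a store that was previously marked dead.
func (sp *StorePool) updateStore(desc StoreDescriptor, now time.Time) {
	sp.mu.Lock()
	defer sp.mu.Unlock()
	detail, ok := sp.stores[desc.StoreID]
	if !ok {
		detail = &storeDetail{}
		sp.stores[desc.StoreID] = detail
	}
	detail.desc = desc
	detail.lastUpdatedTime = now
	detail.dead = false // a store that gossips again is no longer dead
}
```
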
The monitor will maintain a timespan `timeUntilNextDead`, calculated by
taking the oldest `lastUpdatedTime` across all stores and adding
`TimeUntilStoreDead`, together with the store ID associated with that
deadline.

The monitor will wake up when `timeUntilNextDead` elapses and check whether
that store has been updated in the meantime. If it has not, the store is
marked as dead. The monitor then calculates the next `timeUntilNextDead`
before going back to sleep.

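Continuing the sketch above, the wake-up logic might look like the
following: `nextDeadline` finds the store with the oldest
`lastUpdatedTime`, and the monitor marks that store dead only if it still
has not been refreshed when the timer fires. The method names and the
stopper channel are illustrative.

```go
// nextDeadline returns the store that will be declared dead soonest and the
// time at which that happens (oldest lastUpdatedTime + TimeUntilStoreDead).
func (sp *StorePool) nextDeadline() (storeID int32, deadline time.Time, ok bool) {
	sp.mu.Lock()
	defer sp.mu.Unlock()
	for id, detail := range sp.stores {
		if detail.dead {
			continue
		}
		d := detail.lastUpdatedTime.Add(sp.timeUntilStoreDead)
		if !ok || d.Before(deadline) {
			storeID, deadline, ok = id, d, true
		}
	}
	return storeID, deadline, ok
}

// monitor sleeps until the next deadline, marks the corresponding store dead
// if it has not been updated in the meantime, and then recomputes the next
// deadline. It exits when the stopper channel is closed.
func (sp *StorePool) monitor(stopper <-chan struct{}) {
	for {
		storeID, deadline, ok := sp.nextDeadline()
		if !ok {
			// Nothing to watch yet; check again after a full interval.
			deadline = time.Now().Add(sp.timeUntilStoreDead)
		}
		select {
		case <-time.After(time.Until(deadline)):
			if !ok {
				continue
			}
			sp.mu.Lock()
			if detail := sp.stores[storeID]; detail != nil &&
				time.Since(detail.lastUpdatedTime) >= sp.timeUntilStoreDead {
				detail.dead = true
			}
			sp.mu.Unlock()
		case <-stopper:
			return
		}
	}
}
```
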
# Drawbacks

None come to mind right now. Perhaps the fact that we are adding a new
service, but it should be very lightweight.

# Alternatives

1. Instead of creating a new store monitoring service, add all of this into
   gossip. Gossip already has most of the store information in it.
   - Gossip could use a good refactoring, and this can be seen as one of the
   first steps toward it: store lists would be available from somewhere
   other than gossip and would include more details as well. Adding these
   calculations into gossip itself seems cumbersome.
2. Instead of creating a store pool, create a node pool. This would allow each
   node to choose which of its stores a new range should be assigned to and
   give the nodes more control over their internal systems.
   - Right now, this is the wrong direction, but in the longer term, giving
   the node more control might simplify the decision making for allocations,
   repairs, and rebalances, and each node could report its own view of
   capacity and free space.

# Unresolved questions

If RFCs #2153 and #2171 aren't implemented, should we consider another option?