# Placement Policy

This document describes placement policies, their purpose, syntax and semantics.

## Index

- [Introduction](#introduction)
- [Operations](#operations)
  - [Basic Expressions](#basic-expressions)
  - [`FILTER`](#filter)
  - [`SELECT`](#select)
  - [`REP`](#rep)
- [Policies](#policies)
  - [The policy playground](#the-policy-playground)
  - [`CBF`](#cbf)
  - [`UNIQUE`](#unique)
  - [More examples](#more-examples)
- [Appendix 1: Operators](#appendix-1-operators)
- [Appendix 2: Policy playground commands](#appendix-2-policy-playground-commands)

## Introduction

The purpose of a **placement policy** is to determine which node(s) of a FrostFS system will store an object. Namely, given a **netmap** (a set of nodes) and a placement policy, a subset of those nodes is selected to store a given object. An important aspect is that, since nodes in a netmap come and go due to the distributed nature of the system, this selection must be deterministic and consistent, i.e. different nodes must come to the same conclusion as long as they share the same view of the netmap.

> ℹ️ Throughout this document, we will consider each node as a dictionary of attributes and a global unique ID.

One way to think about the placement policy is as a pipeline of operations which begins with a set of nodes (the entire netmap) and gradually refines it into only the nodes that will be in charge of storing the object. More specifically, each operation in this pipeline takes a set of nodes and transforms it into a subset of those nodes. The transformation is done purely on the basis of the node attributes.

The three main operations are:
1. `FILTER`: filters a set of nodes based on their attributes.
2. `SELECT`: selects a specific number of nodes from a set of nodes based on certain conditions.
3. `REP`: specifies how many nodes (and which ones) from a set of nodes are used to store an object.

In the next sections, we will explore each of them in detail.

## Operations

### Basic Expressions

Before exploring the operations in detail, we must get acquainted with the basic expressions that appear in a placement policy. As mentioned above, the placement policy operates solely on the basis of node attributes, and as such, basic expressions mostly revolve around node attribute comparison.

A comparison expression tests a node attribute against a specified value:
```
AttributeName Operation AttributeValue
```

For example, the following expression
```sql
City EQ 'Moscow'
```
asserts that the node attribute `City` equals the value `Moscow`.

Comparison expressions can be combined and nested via boolean operators and parentheses. For example, the following expression:
```sql
(City EQ 'Moscow') AND (Disks GT 2)
```
asserts that the node attribute `City` equals the value `Moscow` and that the node attribute `Disks` is greater than `2`. Note that the arguments can be either a string or a number.

See [Appendix 1](#appendix-1-operators) for a complete list of supported operators.

### `FILTER`

A `FILTER` operation takes as input a set of nodes and returns a subset of those nodes. It's useful for selecting nodes that have (or lack) specific attributes. Its basic syntax is as follows:
```bnf
FILTER <expr> AS <id>
```

For example, the following filter
```sql
FILTER Color EQ 'Red' AS RedNodes
```
selects those nodes for which the `Color` attribute equals `Red`, and discards the rest. The filter's identifier is `RedNodes`, which can be used to reference it in other parts of the placement policy.
For example, you could reference the above filter in another filter as follows:
```sql
FILTER @RedNodes AND (City EQ 'Moscow') AS RedMoscowNodes
```
which would select those nodes for which the `Color` attribute equals `Red` and the `City` attribute equals `Moscow`. You can think of the `@` operator as embedding the referenced filter expression verbatim where it's used. This makes it easy to compose filters. However, filters can be referenced via `@` only within filter expressions. In other places you can simply use the filter identifier directly.

> ⚠️ Every filter requires a unique identifier. What would be the use of a filter that you cannot reference?

The following diagram illustrates the filter operation

[Figure: the `FILTER` operation]

where the nodes are represented as colored circles, with their color representing the value of their `Color` attribute.

> ℹ️ A filter referring to all nodes in the netmap always exists and can be referenced by `*`.

### `SELECT`

A `SELECT` operation specifies how many and which nodes from a subset previously obtained from a `FILTER` will be available to build replica groups for object storage. It's not that different from a `FILTER` in that it transforms a set of nodes into a subset of those, but while a `FILTER` cannot control the size of the resulting subset and other characteristics, a `SELECT` can.

Its basic syntax is as follows:
```bnf
SELECT <count> {IN (SAME|DISTINCT) <attribute>} FROM <filter> {AS <id>}
```

In a nutshell, a `SELECT` takes a filter result as input and outputs a specific number of nodes, optionally enforcing that all output nodes must either share or differ in a specific attribute. Note that only the output node count and the source filter are required.
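
To make the semantics of `FILTER` and `SELECT` more tangible, the following Python sketch models both on a toy netmap. Everything in it is an assumption made for illustration: the node data, the function names `filter_nodes`, `score` and `select_nodes`, and the hash-based ranking, which merely stands in for some deterministic way of choosing among equally valid nodes. It is not the policy engine's implementation.

```python
import hashlib

# Hypothetical toy netmap: node ID -> dictionary of attributes.
NODES = {
    "01": {"Color": "Red", "City": "Moscow"},
    "02": {"Color": "Red", "City": "Berlin"},
    "03": {"Color": "Blue", "City": "Moscow"},
    "04": {"Color": "Green", "City": "Moscow"},
}


def filter_nodes(nodes, predicate):
    """FILTER: keep only the nodes whose attributes satisfy the predicate."""
    return {nid: attrs for nid, attrs in nodes.items() if predicate(attrs)}


def score(node_id, seed):
    """Deterministic pseudo-random score for a (seed, node) pair (illustrative only)."""
    digest = hashlib.sha256(f"{seed}:{node_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def select_nodes(nodes, count, seed, distinct_attr=None):
    """Simplified SELECT <count> [IN DISTINCT <attr>] FROM <nodes>."""
    # Rank the candidates deterministically so every caller gets the same answer.
    ranked = sorted(nodes, key=lambda nid: score(nid, seed), reverse=True)
    chosen, seen = [], set()
    for nid in ranked:
        if len(chosen) == count:
            break
        if distinct_attr is not None:
            value = nodes[nid].get(distinct_attr)
            if value in seen:
                continue  # another chosen node already has this attribute value
            seen.add(value)
        chosen.append(nid)
    if len(chosen) < count:
        raise ValueError("not enough nodes to satisfy the selection")
    return chosen
```

With this toy netmap, `filter_nodes(NODES, lambda a: a.get("Color") == "Red")` keeps nodes `01` and `02`, and a `DISTINCT` selection of two nodes from the red-or-blue subset always contains one red and one blue node, whichever seed is used.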

Let's see some examples:
```sql
-- Selects exactly one node from the entire netmap
SELECT 1 FROM *

-- Same as above, but with an identifier for the selection
SELECT 1 FROM * AS ONE

-- Selects two nodes from the RedOrBlueNodes filter, such that both selected nodes
-- share the same value for the Color attribute, i.e. both red or both blue.
SELECT 2 IN SAME Color FROM RedOrBlueNodes

-- Selects two nodes from the RedOrBlueNodes filter, such that the selected nodes
-- have distinct values for the Color attribute, i.e. one red and one blue.
-- The selection is also given an identifier.
SELECT 2 IN DISTINCT Color FROM RedOrBlueNodes AS MyNodes
```

The last example is illustrated in the following diagram:

[Figure: `SELECT 2 IN DISTINCT Color FROM RedOrBlueNodes AS MyNodes`]

> ℹ️ At this point, notice that while `FILTER`'s output is always unique (namely, every node in the input is either filtered in or out), that is not always the case for `SELECT`. In the last example above, there is more than one way to select two nodes with distinct `Color` attribute. Because we require the output to be deterministic and consistent (so that all nodes agree on which nodes to store a given object without having to communicate with each other), we need a way to reach this consensus efficiently. Internally, the policy engine uses [Rendezvous Hashing](https://en.wikipedia.org/wiki/Rendezvous_hashing) to ensure this. If you want more control over which nodes are actually selected, you can always use narrower filters/selections.

### `REP`

A `REP` operation specifies how many copies of an object need to be stored (`REP` stands for "replica"). A placement policy can contain multiple replica operations, with each of them representing a replica group, i.e. a group of nodes associated with the same replica. Following our analogy with a pipeline, `REP` operations are the sink or output nodes.

Its basic syntax is as follows:
```bnf
REP <count> {IN <select>}
```

If a select is not specified, then the entire netmap is used as input. The only exception to this rule is when the policy contains exactly one replica and one selector: in this case that selector is used instead of the whole netmap.
The resulting nodes will be used to actually store objects and they constitute a replica group (or simply, "a replica").

Examples:
```sql
-- A replica consisting of a single copy, stored in an arbitrary node of the netmap.
REP 1

-- A replica consisting of three copies, each stored in a different node from the selection
-- identified as 'MyNodes'.
REP 3 IN MyNodes
```

The following diagram illustrates the `REP` operation:

[Figure: the `REP` operation]

> ⚠️ Notice that although we use `REP 1` in the examples, in real-life scenarios you almost always want to have more than a single node in each replica for redundancy.

## Policies

In order to specify a complete placement policy, we just need to assemble it from the operations described above.

Its basic (simplified) syntax is as follows:
```bnf
<rep>+ <select>* <filter>*
```

We begin by stating all our `REP` operations, followed by all the `SELECT` operations and finally all the `FILTER` operations. Note that this is the reverse of the order in which they are applied. Also note that at least one `REP` operation is required.

Here's a complete example:
```sql
REP 1 IN MyNodes
SELECT 2 IN DISTINCT Color FROM RedOrBlueNodes AS MyNodes
FILTER Color EQ 'Red' AS RedNodes
FILTER Color EQ 'Blue' AS BlueNodes
FILTER @RedNodes OR @BlueNodes AS RedOrBlueNodes
```

In addition to this basic syntax, there are a couple of useful options that control which nodes, and how many of them, are actually selected to store objects. We explore these in the next sections.

### The policy playground

> ℹ️ This section assumes you have an up-to-date version of the `frostfs-cli`.

While simple placement policies have predictable results that can be understood at a glance, more complex ones need careful consideration before deployment. To make it easier to understand a policy's outcome and to experiment while learning, a built-in tool is provided as part of the `frostfs-cli`: the policy playground.

For the remainder of this guide, we will use the policy playground to set up a virtual netmap (that is, one that doesn't require any networking or deployment) and test various policies. In order to visualize this netmap easily, each node will have three attributes: a character, a shape and a color.

[Figure: the sample netmap, in which each node has a character, a shape and a color]

We can start the policy playground as follows:
```sh
$ frostfs-cli container policy-playground
>
```

Since we didn't pass any endpoint, the initial netmap is empty, which we can verify with the `ls` command (to list the nodes in the netmap):
```sh
> ls
>
```

Now let's add virtual nodes to represent our test netmap in the figure above
```sh
> add 01 Char:A Shape:Circle Color:Blue
> add 02 Char:B Shape:Circle Color:Green
> add 03 Char:C Shape:Circle Color:Red
> add 04 Char:D Shape:Square Color:Blue
> add 05 Char:E Shape:Square Color:Green
> add 06 Char:F Shape:Square Color:Red
> add 07 Char:G Shape:Diamond Color:Blue
> add 08 Char:H Shape:Diamond Color:Green
> add 09 Char:I Shape:Diamond Color:Red
```

and verify that the netmap now contains what we expect
```sh
> ls
1: id=06 attrs={Char:F Shape:Square Color:Red}
2: id=08 attrs={Char:H Shape:Diamond Color:Green}
3: id=01 attrs={Char:A Shape:Circle Color:Blue}
4: id=04 attrs={Char:D Shape:Square Color:Blue}
5: id=05 attrs={Char:E Shape:Square Color:Green}
6: id=09 attrs={Char:I Shape:Diamond Color:Red}
7: id=02 attrs={Char:B Shape:Circle Color:Green}
8: id=03 attrs={Char:C Shape:Circle Color:Red}
9: id=07 attrs={Char:G Shape:Diamond Color:Blue}
```

With our sample netmap set up, we can now continue.

### `CBF`

Consider the following policy:

```sql
REP 1
```

It builds a replica consisting of one copy, selected from the entire netmap. If we evaluate this policy in our sample netmap, we obtain a result which is probably unexpected:

```sh
> eval REP 1
1: [06 05 02]
```

The `eval` command evaluates a policy and lists, on a separate line for each `REP` operation, the nodes selected for it, in the order the operations appear in the policy. We were expecting a single node, but we got three instead. The reason is that there's a policy-wide parameter called **container backup factor** (or CBF). This parameter is a multiplier which controls the maximum number of storage nodes: for example, if a policy requires 10 nodes and the CBF is 3, it means that the policy can store an object in up to 10 × 3 nodes. However, if there are not enough nodes and fewer (but at least 10) are used, this is not considered an error.

The default value for CBF is `3`, which explains our result above, given that every node in the netmap agrees with the policy. The CBF can be explicitly set in the policy right after the `REP` operations. For example

```sh
> eval REP 1 CBF 1
1: [06]
```

results in what we expected in the first example. On the other hand

```sh
> eval REP 1 IN MyNodes SELECT 1 IN SAME Char FROM * AS MyNodes
1: [01]
```

results in a single node despite the default CBF, because there are not enough nodes compatible with the selection.
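
The arithmetic behind the CBF can be summarised in a few lines of Python. This is a simplified model under stated assumptions, not the engine's code: the function name `nodes_used` and its parameters are hypothetical, `required` is the number of nodes the policy asks for, and `available` is the number of nodes compatible with the selection.

```python
DEFAULT_CBF = 3  # default container backup factor, as stated above


def nodes_used(required, available, cbf=DEFAULT_CBF):
    """Return how many nodes end up in a replica group (simplified model).

    The policy may use up to required * cbf nodes. Using fewer is fine as
    long as at least `required` nodes are available.
    """
    if available < required:
        raise ValueError("not enough nodes to satisfy the policy")
    return min(required * cbf, available)
```

For the sample netmap, `nodes_used(1, 9)` gives 3 and `nodes_used(1, 9, cbf=1)` gives 1, matching the two `eval` results for `REP 1`. Likewise, `nodes_used(1, 1)` gives 1, as in the last example, where only one node is compatible with the selection.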

### `UNIQUE`

Consider the following policy:
```sql
REP 1
REP 1
CBF 2
```

If we evaluate it
```sh
> eval REP 1 REP 1 CBF 2
1: [06 05]
2: [06 05]
```

we find that each replica gets two nodes, in accordance with the CBF. However, these nodes are exactly the same and this might not be desirable. In order to force the policy engine to select different nodes for each replica, we can use the `UNIQUE` option, which is specified right before the `REP` operations.

In our example, if we change it to
```sql
UNIQUE
REP 1
REP 1
CBF 2
```

and evaluate it

```sh
> eval UNIQUE REP 1 REP 1 CBF 2
1: [06 05]
2: [02 03]
```

we find that the nodes selected for each replica are now distinct from each other.

### More examples

This section presents some more examples of placement policies and their results when applied to the sample netmap. Try to figure out each result before looking at it or evaluating the policy.

#### Example #1
```sql
REP 1 IN TwoRedNodes
SELECT 2 FROM RedNodes AS TwoRedNodes
FILTER Color EQ 'Red' AS RedNodes
```

<details>
<summary>Result</summary>

```sh
> eval REP 1 IN TwoRedNodes SELECT 2 FROM RedNodes AS TwoRedNodes FILTER Color EQ 'Red' AS RedNodes
1: [06 09 03]
```

</details>

#### Example #2
```sql
REP 1 IN TwoRedNodes
REP 1 IN TwoRedNodes
SELECT 2 FROM RedNodes AS TwoRedNodes
FILTER Color EQ 'Red' AS RedNodes
```

<details>
<summary>Result</summary>

```sh
> eval REP 1 IN TwoRedNodes REP 1 IN TwoRedNodes SELECT 2 FROM RedNodes AS TwoRedNodes FILTER Color EQ 'Red' AS RedNodes
1: [06 09 03]
2: [06 09 03]
```

</details>

#### Example #3
```sql
REP 2 IN MyNodes
REP 2 IN MyNodes
SELECT 2 FROM RedOrBlueNodes AS MyNodes
FILTER Color EQ 'Red' AS RedNodes
FILTER Color EQ 'Blue' AS BlueNodes
FILTER @RedNodes OR @BlueNodes AS RedOrBlueNodes
```

<details>
<summary>Result</summary>

```sh
> eval REP 2 IN MyNodes REP 2 IN MyNodes SELECT 2 FROM RedOrBlueNodes AS MyNodes FILTER Color EQ 'Red' AS RedNodes FILTER Color EQ 'Blue' AS BlueNodes FILTER @RedNodes OR @BlueNodes AS RedOrBlueNodes
1: [06 01 04 03 09 07]
2: [06 01 04 03 09 07]
```

</details>

#### Example #4
```sql
REP 2 IN MyRedNodes
REP 2 IN MyBlueNodes
CBF 1
SELECT 2 FROM RedNodes AS MyRedNodes
SELECT 2 FROM BlueNodes AS MyBlueNodes
FILTER Color EQ 'Red' AS RedNodes
FILTER Color EQ 'Blue' AS BlueNodes
```

<details>
<summary>Result</summary>

```sh
> eval REP 2 IN MyRedNodes REP 2 IN MyBlueNodes CBF 1 SELECT 2 FROM RedNodes AS MyRedNodes SELECT 2 FROM BlueNodes AS MyBlueNodes FILTER Color EQ 'Red' AS RedNodes FILTER Color EQ 'Blue' AS BlueNodes
1: [06 03]
2: [01 04]
```

</details>

#### Example #5
```sql
UNIQUE
REP 1 IN MyGreenNodes
REP 1 IN MyGreenNodes
REP 1 IN MyGreenNodes
CBF 1
SELECT 1 FROM GreenNodes AS MyGreenNodes
FILTER Color EQ 'Green' AS GreenNodes
```

<details>
<summary>Result</summary>

```sh
> eval UNIQUE REP 1 IN MyGreenNodes REP 1 IN MyGreenNodes REP 1 IN MyGreenNodes CBF 1 SELECT 1 FROM GreenNodes AS MyGreenNodes FILTER Color EQ 'Green' AS GreenNodes
1: [05]
2: [02]
3: [08]
```

</details>

#### Example #6
```sql
REP 1 IN MyNodes
REP 2
CBF 2
SELECT 1 FROM CuteNodes AS MyNodes
FILTER (Color EQ 'Blue') AND NOT (Shape EQ 'Circle' OR Shape EQ 'Square') AS CuteNodes
```

<details>
<summary>Result</summary>

```sh
> eval REP 1 IN MyNodes REP 2 CBF 2 SELECT 1 FROM CuteNodes AS MyNodes FILTER (Color EQ 'Blue') AND NOT (Shape EQ 'Circle' OR Shape EQ 'Square') AS CuteNodes
1: [07]
2: [06 05 02 03]
```

</details>

## Appendix 1: Operators

Comparison operators (all binary):
- `EQ`: equals
- `NE`: not equal
- `GE`: greater or equal
- `GT`: greater than
- `LE`: less or equal
- `LT`: less than

Pattern operator:
- `LIKE`: matches an attribute against a pattern, using `*` as the wildcard symbol.
  - `... ATTR LIKE "VAL"`: behaves the same as `EQ`
  - `... ATTR LIKE "VAL*"`: matches all values that start with `VAL`
  - `... ATTR LIKE "*VAL"`: matches all values that end with `VAL`
  - `... ATTR LIKE "*VAL*"`: matches all values that contain `VAL`

Logical operators:
- `NOT`: negation (unary)
- `AND`: conjunction (binary)
- `OR`: disjunction (binary)

Others:
- `@`: filter reference
- `(`: left parenthesis
- `)`: right parenthesis

## Appendix 2: Policy playground commands

- `ls`: list nodes in the current netmap and their attributes.
- `add`: add a node to the current netmap. If it already exists, it will be overwritten.
- `remove`: remove a node from the current netmap.
- `eval`: evaluate a placement policy on the current netmap.