# Cluster loader vision

Author: wojtek-t

Last update time: 1st Aug 2018

## Background

As of 31/03/2018, all our scalability tests are regular e2e tests written in Go.
This makes them really unfriendly for developers not working on scalability who just
want to load test the Kubernetes features they are working on. Doing so in many
situations requires understanding how those tests really work, modifying their code
to test the new feature, and only then running and debugging the tests. Alternatively,
developers may create a dedicated load test for their particular feature on their own,
which may be easier, but on the other hand may not exercise the important metrics that
our performance tests check. This workflow is far from optimal.

That said, a long time ago we came up with the idea that users should be able to
just bring their own object definitions in JSON format, potentially annotate them
with some metadata describing how load should be generated from them, and the
testing infrastructure should do everything else automatically.

In early 2017 a prototype of "Cluster Loader" was created. It proved that
configuring tests with json/yaml files is possible, but its functionality is very
limited and it is very far from enabling migration of any existing scalability
tests to that framework.

We would like to get back to that idea, build a fully functional Cluster Loader and
use it as a framework to run all our scalability tests. This doc describes a
high-level vision of how this will work.


## Vision

At a high level, a single test will consist of a number of steps. In each of
those steps we will be creating, updating and/or deleting a number of different
objects.
Additionally, we will introduce a set of predefined operations (that users will be
able to use as phases).
They will allow users to monitor/measure the performance impact
of user-defined phases.
The following subsections describe this in a bit more detail.

### Config

A single test scenario will be defined by a `Config`. Its schema will be as follows:

```
struct Config {
  // Number of namespaces automanaged by ClusterLoader.
  Namespaces int32
  // Steps of the test.
  Steps []Step
  // Tuning sets that are used by steps.
  TuningSets []TuningSet
}
```

With a test being defined by a single json/yaml file, it should be pretty simple
to modify scenarios and fork them into new ones.

Note that before running any steps, ClusterLoader will create all the requested
namespaces, and after running all of them it will delete them (together with all
objects that remained undeleted after running the test). Namespaces are described
in more detail in a later part of this document.

### Step

Each step will consist of a number of create, update and delete operations (potentially
on many different object types) or a number of monitoring/measurement-related actions.
A single step is defined as follows:

```
struct Step {
  // Only one of these can be non-empty.
  Phases []Phase
  Measurements []string
  Name string
}
```

We make `Phases` and `Measurements` separate concepts to ensure correct ordering
between those two types of actions. It's very important to ensure that the proper
measurements are started before we start given actions and that they are stopped when
all actions are done.

Also note that all `Phases` and `Measurements` within a single `Step` will be
run in parallel - a `Step` ends when all its `Phases` or `Measurements` finish.
That also means that individual steps run in serial.

A step has an optional `Name`. If a step is named, a timer will be fired
for that step automatically.

### Phase

A phase declaratively defines the state of objects we should reach in the underlying
Kubernetes cluster. A single declaration may result in a number of create, update
and delete operations depending on the current state of the cluster.

We define the phase as follows:

```
struct Phase {
  // Set of namespaces in which objects should be reconciled.
  // If null, objects are assumed to be cluster scoped.
  NamespaceRange *NamespaceRange
  // Number of instances of a given object to exist in each
  // of referenced namespaces.
  ReplicasPerNamespace int32
  // Name of TuningSet to be used.
  TuningSet string
  // A set of objects that should be reconciled.
  Objects []Object
}
```

```
struct Object {
  // Type definition for a given object.
  ObjectType ObjectType
  // Base name from which names of objects will be created.
  // Names of objects will be "basename-0", "basename-1", ...
  Basename string
  // A file path to object definition.
  ObjectTemplatePath string
}
```

The semantics of the above structure will be as follows:
- `Phases` within a single `Step` will be run in parallel (to recall,
  individual `Steps` run in serial).
- `Objects` within a single `Phase` will be reconciled in serial for a given
  (namespace, replica number) pair. For different (namespace, replica number)
  pairs they will be spread using a given tuning set.

The rationale for having such a structure is the following:
- `Objects` represent a collection of Kubernetes objects that can be logically
  thought of as a unit of workload (e.g. an application comprised of a service,
  a deployment and a volume). Conceptually, this collection is our unit of
  replication. Note that we process the `Objects` slice serially, which allows
  ordering between objects of a unit (e.g. create a service before a deployment).
  The replication itself is done according to the `TuningSet` and
  `ReplicasPerNamespace` parameters of the `Phase`.
- Running multiple `Phases` in parallel allows running different workloads at the
  same time. As an example, it allows creating two different types of
  applications in parallel (possibly using different tuning sets).
- Running `Steps` in serial allows you to synchronize between `Phases` (and,
  for example, block finishing a measurement on all phases from the previous step
  being finished).

Within a single `Phase` we make the explicit assumption that if `ReplicasPerNamespace`
changes, no `ObjectTemplatePath` can change at the same time (assuming it
already exists for a given set of objects). That basically means that within a
single `Phase`, operations for a given `Object` may only be of a single type
(create, update or delete).

All of the objects are assumed to be units of workload.
Therefore, if an object comes with dependents, all of its dependents will be affected
by the operation performed on this object. E.g. removing an instance of a `ReplicationController`
will also result in removing its dependent `Pods`.

To make it more explicit:
- if `ReplicasPerNamespace` is different than it previously was, we will create
  or delete a number of objects to reach the expected cluster state
- if `ReplicasPerNamespace` is the same as it previously was, we will update all
  objects to the referenced template.

Appropriate validation will be added to Cluster Loader to ensure the above for
a given input config.

Note that the (namespace number, object type, basename) tuple defines a set of
replicated objects.

All `Object` changes for a given (namespace, replica number) pair are treated as
a unit of action. Such units will be spread over time using the referenced tuning
set (described below).

This definition makes the API declarative and thus somewhat similar to the
Kubernetes API.

Caveats:

- Note that even with such a declarative approach, we may e.g. express a phase of
  randomly scaling a number of objects. This would be possible by expressing e.g.
  `spec.replicas: <3+RAND()%5>` in a DeploymentSpec.
  This will require evaluating templates once for every object, but that should
  be fine.

### Multiple copies of the same workload

To fill in large clusters, we need to spread objects across different namespaces.
In many cases, it will be enough for many namespaces to contain the same
objects (or to be more specific: objects created from the same templates). Obviously,
we want the config for the test to be as small as possible.
As a result, we will introduce the following rules:
- In the top-level test definition, we will define the number of namespaces that will
  be automanaged by ClusterLoader.
- The automanaged namespaces will have names of the form "namespace-<number>" for
  number in range 1..N (where N is the number of namespaces in a test).

However, users may want to create their own namespaces (as part of `Phases`) and
create objects in them. That is a perfectly valid usecase that will be supported.

To make it possible to reference a set of namespaces (both automanaged and user-created),
we introduce the following type:

```
struct NamespaceRange {
  Min int32
  Max int32
  Basename *string
}
```

The `NamespaceRange` selects all namespaces `<Basename>-<i>` for `i` in the
range [Min, Max]. If `Basename` is unset, it defaults to the basename used for
automanaged namespaces (i.e. `namespace`).

#### Defining object type

In order to update or delete an object, users need to be able to define the type of
object that this operation is about.
Thus, we introduce the following type for
this purpose:

```
struct ObjectType {
  APIGroup string
  APIVersion string
  Kind string
}
```

Using this will allow us to easily use the dynamic client in most places,
which may significantly simplify Cluster Loader itself.

#### Tuning Set

Since we would like to be able to fully load even very big clusters, we need to
be able to create a number of "similar" objects. The "Tuning Set" concept will allow
us to spread those operations over time.
We define a Tuning Set as follows:

```
struct TuningSet {
  Name string
  InitialDelay time.Duration
  // Exactly one of the following should be set.
  QpsLoad *QpsLoad
  RandomizedLoad *RandomizedLoad
  SteppedLoad *SteppedLoad
}

// QpsLoad defines a uniform load with a given QPS.
struct QpsLoad {
  Qps float
}

// RandomizedLoad defines a load that is spread randomly
// across a given total time.
struct RandomizedLoad {
  AverageQps float
}

// SteppedLoad defines a load that generates a burst of
// a given size every X seconds.
struct SteppedLoad {
  BurstSize int32
  StepDelay time.Duration
}
```

More policies can be introduced in the future.

### Measurements

A critical part of Cluster Loader is the ability to check whether tests (defined by
configs) satisfy a set of Kubernetes performance SLOs.
Fortunately, when testing a specific functionality, we don't really change the SLOs.
We may want to, from time to time, tweak how we measure existing SLOs or introduce
a new one, but it is fine to require changes to the framework to achieve that.

As a result, mechanisms to measure specific SLOs (or gather other types of metrics)
will be incorporated into the Cluster Loader framework.
We will expect that developers
trying to introduce a new SLO (or change how we measure an existing one) will
modify that part of the Cluster Loader codebase. Within the codebase, we will try
to provide a relatively easy framework for achieving this, though.

At a high level, to implement gathering a given portion of data or measuring a new
SLO, you will need to implement a very simple Go interface:

```
type Measurement interface {
  Execute(config *MeasurementConfig) error
}

// An instance of the below struct would be constructed by clusterloader during runtime
// and passed to the Execute method.
struct MeasurementConfig {
  // Client to access the k8s api.
  Clientset *k8sclient.ClientSet
  // Interface to access the cloud-provider api (can be skipped for initial version).
  CloudProvider *cloudprovider.Interface
  // Params is a map of {name: value} pairs enabling injection of arbitrary
  // config into the Execute method. This is copied over as-is from the Params field
  // in the Measurement config (explained later).
  Params map[string]interface{}
}
```

Once you implement such an interface, registering it in the correct
place will allow you to use it as a phase in your config.
As an example, consider gathering resource usage from system components.
It will be enough to implement something like the following:

```
struct ResourceGatherer {
  // Some fields that you need.
}

func (r *ResourceGatherer) Execute(c MeasurementConfig) error {
  if c.Params["start"] {
    // Initialize gatherer.
    // Start the gathering goroutines.
    return nil
  }
  if c.Params["stop"] {
    // Stop the gatherer goroutines.
    // Validate and/or save the results.
    return nil
  }
  // Handling of any other potential cases.
}
```

and registering this type in some factory, to enable the use of `ResourceGatherer`
as a measurement `Method` in your test. Finally, at the config level,
each `Measurement` is defined as:

```
struct Measurement {
  // The measurement method to be run.
  // Such a method has to be registered in the ClusterLoader factory.
  Method string
  // Identifier is a string for differentiating this measurement instance
  // from other instances of the same method.
  Identifier string
  // Params is a map of {name: value} pairs which will be passed to the
  // measurement method - allowing for injection of arbitrary parameters to it.
  Params map[string]interface{}
}
```

To begin with, we will provide a few built-in measurement methods, such as:

```
ResourceGatherer
ProfileGatherer
MetricsGatherer
APICallLatencyValidator
PodStartupLatencyValidator
```


## Future enhancements

This section contains future enhancements that will need to happen, but not
necessarily at the very beginning.

1. Simple templating in json files.
   This would be an extremely useful (necessary) feature to enable referencing
   objects from other objects. As an example, let's say that we want to reference
   secret number `i` from deployment number `i`.
   We would achieve that by providing a very simple templating mechanism at the
   level of files with object definitions. The exact details are TBD, but the
   high-level proposal is to:
   - use the `{{ param }}` syntax for templates
   - support only very simple mathematical operations and symbols:
     - `N` would mean the number of that object (as defined in `basename-<N>`)
     - `RAND` will be a random integer
     - the `%` (modulo) operation will be supported
     - the `+` operation will be supported
     - though, only simple expressions (like `{{ N+i%5 }}` or `{{ RAND%3+5 }}`) will
       be supported (at least initially).

2. Feedback loop from monitoring.
   Assume that we defined some SLO (that Cluster Loader is able to measure) and
   now we want to understand what conditions need to be satisfied to meet this
   SLO (e.g. what throughput we can support while meeting the latency SLO).
   Providing a feedback loop from measurements to load generation tuning can
   solve that problem for us automatically.
   There are a number of details that need to be figured out to do that; this is
   not needed for the initial version (or for migrating existing scalability
   tests), but it should happen once the framework is usable.