github.com/wmuizelaar/kpt@v0.0.0-20221018115725-bd564717b2ed/docs/design-docs/03-pipeline-merge.md (about) 1 # Pipeline Merge 2 3 * Author(s): Phani Teja Marupaka, phanimarupaka 4 * Approver: Sunil Arora, Mike Borozdin 5 6 ## Why 7 8 Currently, `kpt pkg update` doesn't merge the `pipeline` section in the Kptfile as expected. 9 The fact that the `pipeline` section is a non-associative list with no defined function identity makes 10 it very difficult to merge with upstream counterparts. Ordering of the functions is also important. 11 This friction is forcing users to use `setters` and discouraging them from declaring other functions in the `pipeline` as 12 they will be deleted during the `kpt pkg update`. [Here](https://github.com/GoogleContainerTools/kpt/issues/2529) 13 is the example issue. Merging the pipeline correctly will reduce huge amounts of 14 friction in declaring new functions. This will encourage users to declare more functions 15 in the pipeline which in turn helps to **avoid excessive parameterization**. 16 17 Consider the example of [Landing Zone](https://github.com/GoogleCloudPlatform/blueprints/tree/main/catalog) blueprints. 18 Parameters(setters) are the primary interface for the package. This is an anti-pattern for package-as-interface motivation 19 and one of the major blockers for not using other functions is the merge behavior 20 of the pipeline. If this problem is solved, LZ maintainers can rewrite the packages 21 with best practices aligned to the bigger goal of treating the package-as-interface, 22 discourage excessive use of setters and only use them as parameterization techniques of last resort. 23 24 ## Design 25 26 In order to solve this problem, we should merge the pipeline section of the Kptfile 27 in a custom manner based on the most common use-cases and the expected outputs in 28 such scenarios. There are no user interface changes. Users will invoke the `kpt pkg 29 update` command in the same way they do currently. This effort will improve the merged 30 output of the pipeline section. 31 32 **Is this change backwards compatible**: We are not making any changes to the api, 33 we are only improving the merged output of the `pipeline` section. This change will 34 produce a different output of Kptfile when compared to the current version but this 35 is not a breaking change. 36 37 ## User guide 38 39 Here is what users can expect when they invoke the command, `kpt pkg update` on a package. 40 41 #### Terminology 42 43 **Original**: Source of the last fetched version of the package, represented by the upstreamLock section in Kptfile. 44 45 **Updated upstream**: Declared source of the updated package, represented by the upstream section in Kptfile. 46 47 **Local**: Local fork of the package on disk. 48 49 Firstly, we need to define the identity of the function in order to uniquely identify 50 a function across three sources to perform a 3-way merge. We will introduce a 51 new field `name` for identifying functions and merge pipeline as associative list. 52 The aim is to encourage users eventually to have `name` field specified for all functions 53 similar to containers in deployment. But in the meanwhile, we will be using image name in 54 order to identify the function and make it an associative list for merging. The limitation 55 of this approach is the multiple functions with same image can be declared in the pipeline. 56 In that case we will fall back to current merge logic of merging pipeline as non-associative list. 57 In order to have deterministic behavior, we will use image as merge key iff none of the 58 functions have `name` field specified and iff there are no duplicate image names declare in functions. 59 60 So the algorithm to merge: 61 62 1. If `name` field is specified in all the functions across all sources, merge 63 it as associative list using `name` field as merge key. 64 2. If `name` field is not specified in all the functions across all sources, 65 and if there is no duplicate declaration of image names for functions, use `image` value 66 (excluding the version) as merge key and merge it as associative list. 67 3. In all other cases fall back to the current merge behavior. 68 69 Here is an example of the merging apply-setters function when new setters are added 70 upstream and existing setter values are updated locally. This is the most common 71 use case where kpt fails to merge them. Since `name` field is not specified, `image` 72 value is used to identify and merge the function 73 74 ``` 75 Original 76 ``` 77 ```yaml 78 pipeline: 79 mutators: 80 - image: gcr.io/kpt-fn/apply-setters:v0.1 81 configMap: 82 image: nginx 83 tag: 1.0.1 84 ``` 85 ``` 86 Updated upstream 87 ``` 88 ```yaml 89 pipeline: 90 mutators: 91 - image: gcr.io/kpt-fn/apply-setters:v0.1 92 configMap: 93 image: nginx 94 tag: 1.0.1 95 new-setter: new-setter-value // new setter is added 96 ``` 97 ``` 98 Local 99 ``` 100 101 ```yaml 102 pipeline: 103 mutators: 104 - image: gcr.io/kpt-fn/apply-setters:v0.1 105 configMap: 106 image: nginx 107 tag: 1.2.0 // value of tag is updated 108 ``` 109 ``` 110 Current Output 111 ``` 112 ```yaml 113 pipeline: 114 mutators: 115 - image: gcr.io/kpt-fn/apply-setters:v0.1 116 configMap: 117 image: nginx 118 tag: 1.0.1 // entire pipeline is overridden by upstream 119 new-setter: new-setter-value 120 ``` 121 ``` 122 Expected Output 123 ``` 124 ```yaml 125 pipeline: 126 mutators: 127 - image: gcr.io/kpt-fn/apply-setters:v0.1 128 configMap: 129 image: nginx 130 tag: 1.2.0 // updated tag is preserved 131 new-setter: new-setter-value // new setter is pulled 132 ``` 133 134 In the above scenario, the value of the setter tag is not updated upstream, so 135 the modified local value will be preserved. But in the following example, both 136 upstream and local values change. So, similar to merging resources, upstream 137 value wins if the same fields in both upstream and local are updated. 138 139 ``` 140 Original 141 ``` 142 ```yaml 143 pipeline: 144 mutators: 145 - image: gcr.io/kpt-fn/set-labels:v0.1 146 configPath: labels.yaml 147 ``` 148 ``` 149 Updated upstream 150 ``` 151 ```yaml 152 pipeline: 153 mutators: 154 - image: gcr.io/kpt-fn/set-labels:v0.1 155 configPath: labels-updated.yaml // upstream value changed 156 ``` 157 ``` 158 Local 159 ``` 160 ```yaml 161 pipeline: 162 mutators: 163 - image: gcr.io/kpt-fn/set-labels:v0.1 164 configPath: labels-local.yaml // local value changed 165 ``` 166 ``` 167 Expected Output 168 ``` 169 ```yaml 170 pipeline: 171 mutators: 172 - image: gcr.io/kpt-fn/set-labels:v0.1 173 configPath: labels-updated.yaml // upstream overrides local 174 ``` 175 176 Similarly, the upstream version wins if both upstream and local are updated, else local is preserved. 177 178 ``` 179 Original 180 ``` 181 ```yaml 182 pipeline: 183 mutators: 184 - image: gcr.io/kpt-fn/set-annotations:v0.1 185 configPath: annotations.yaml 186 ``` 187 ``` 188 Updated upstream 189 ``` 190 ```yaml 191 pipeline: 192 mutators: 193 - image: gcr.io/kpt-fn/set-annotations:v0.2.0 194 configPath: annotations.yaml 195 ``` 196 ``` 197 Local 198 ``` 199 ```yaml 200 pipeline: 201 mutators: 202 - image: gcr.io/kpt-fn/set-annotations:v0.1.1 203 configPath: annotations.yaml 204 ``` 205 ``` 206 Expected Output 207 ``` 208 ```yaml 209 pipeline: 210 mutators: 211 - image: gcr.io/kpt-fn/set-annotations:v0.2.0 212 configPath: annotations.yaml 213 ``` 214 215 #### More examples with expected output 216 217 Newly added upstream function. 218 219 ``` 220 Original 221 ``` 222 ```yaml 223 pipeline: 224 mutators: 225 - image: gcr.io/kpt-fn/apply-setters:v0.1 226 configPath: setters.yaml 227 ``` 228 ``` 229 Updated upstream 230 ``` 231 ```yaml 232 pipeline: 233 mutators: 234 - image: gcr.io/kpt-fn/apply-setters:v0.1 235 configPath: setters.yaml 236 - image: gcr.io/kpt-fn/generate-folders:v0.1 237 ``` 238 ``` 239 Local 240 ``` 241 ```yaml 242 pipeline: 243 mutators: 244 - image: gcr.io/kpt-fn/apply-setters:v0.1 245 configPath: setters.yaml 246 - image: gcr.io/kpt-fn/set-namespace:v0.1 247 configMap: 248 namespace: foo 249 ``` 250 ``` 251 Expected output 252 ``` 253 ```yaml 254 pipeline: 255 mutators: 256 - image: gcr.io/kpt-fn/apply-setters:v0.1 257 configPath: setters.yaml 258 - image: gcr.io/kpt-fn/set-namespace:v0.1 259 configMap: 260 namespace: foo 261 - image: gcr.io/kpt-fn/generate-folders:v0.1 262 ``` 263 264 If a function is deleted upstream and not changed on the local, it will be deleted on local. 265 266 ``` 267 Original 268 ``` 269 ```yaml 270 pipeline: 271 mutators: 272 - image: gcr.io/kpt-fn/apply-setters:v0.1 273 configPath: setters.yaml 274 - image: gcr.io/kpt-fn/generate-folders:v0.1 275 ``` 276 ``` 277 Updated upstream 278 ``` 279 ```yaml 280 pipeline: 281 mutators: 282 - image: gcr.io/kpt-fn/apply-setters:v0.1 283 configPath: setters.yaml 284 ``` 285 ``` 286 Local 287 ``` 288 ```yaml 289 pipeline: 290 mutators: 291 - image: gcr.io/kpt-fn/apply-setters:v0.1 292 configPath: setters.yaml 293 - image: gcr.io/kpt-fn/generate-folders:v0.1 294 - image: gcr.io/kpt-fn/set-namespace:v0.1 295 configMap: 296 namespace: foo 297 ``` 298 ``` 299 Expected output 300 ``` 301 ```yaml 302 pipeline: 303 mutators: 304 - image: gcr.io/kpt-fn/apply-setters:v0.1 305 configPath: setters.yaml 306 - image: gcr.io/kpt-fn/set-namespace:v0.1 307 configMap: 308 namespace: foo 309 ``` 310 311 Same function declared multiple times: If the same function is declared multiple 312 times and `name` field is not specified, fall back to the current merge behavior 313 314 ``` 315 Original 316 ``` 317 ```yaml 318 pipeline: 319 mutators: 320 - image: gcr.io/kpt-fn/search-replace:v0.1 321 configMap: 322 by-value: foo 323 put-value: bar 324 - image: gcr.io/kpt-fn/search-replace:v0.1 325 configMap: 326 by-value: abc 327 put-comment: ${some-setter-name} 328 ``` 329 ``` 330 Updated upstream 331 ``` 332 ```yaml 333 pipeline: 334 mutators: 335 - image: gcr.io/kpt-fn/search-replace:v0.1 336 configMap: 337 by-value: foo 338 put-value: bar-new 339 - image: gcr.io/kpt-fn/search-replace:v0.1 340 configMap: 341 by-value: abc 342 put-comment: ${updated-setter-name} 343 ``` 344 ``` 345 Local 346 ``` 347 ```yaml 348 pipeline: 349 mutators: 350 - image: gcr.io/kpt-fn/generate-folders:v0.1 351 - image: gcr.io/kpt-fn/search-replace:v0.1 352 configMap: 353 by-value: foo 354 put-value: bar 355 - image: gcr.io/kpt-fn/set-labels:v0.1 356 configMap: 357 app: db 358 - image: gcr.io/kpt-fn/search-replace:v0.1 359 configMap: 360 by-value: abc 361 put-comment: ${some-setter-name} 362 - image: gcr.io/kpt-fn/search-replace:v0.1 363 configMap: 364 by-value: YOUR_TEAM 365 put-value: my-team 366 ``` 367 ``` 368 Expected output 369 ``` 370 ```yaml 371 pipeline: 372 mutators: 373 - image: gcr.io/kpt-fn/search-replace:v0.1 374 configMap: 375 by-value: foo 376 put-value: bar-new 377 - image: gcr.io/kpt-fn/search-replace:v0.1 378 configMap: 379 by-value: abc 380 put-comment: ${updated-setter-name} 381 ``` 382 383 In the following scenario, only one function has the `name` field specified. All the 384 other functions doesn't have name specified, hence we fall back to current merge logic. 385 386 ``` 387 Original 388 ``` 389 ```yaml 390 pipeline: 391 mutators: 392 - image: gcr.io/kpt-fn/search-replace:v0.1 393 configMap: 394 by-value: foo 395 put-value: bar 396 - image: gcr.io/kpt-fn/search-replace:v0.1 397 configMap: 398 by-value: abc 399 put-comment: ${some-setter-name} 400 ``` 401 ``` 402 Updated upstream 403 ``` 404 ```yaml 405 pipeline: 406 mutators: 407 - image: gcr.io/kpt-fn/search-replace:v0.1 408 configMap: 409 by-value: foo 410 put-value: bar-new 411 - image: gcr.io/kpt-fn/search-replace:v0.1 412 configMap: 413 by-value: abc 414 put-comment: ${updated-setter-name} 415 ``` 416 ``` 417 Local 418 ``` 419 ```yaml 420 pipeline: 421 mutators: 422 - image: gcr.io/kpt-fn/search-replace:v0.1 423 name: my-new-function 424 configMap: 425 by-value: YOUR_TEAM 426 put-value: my-team 427 - image: gcr.io/kpt-fn/generate-folders:v0.1 428 - image: gcr.io/kpt-fn/search-replace:v0.1 429 configMap: 430 by-value: foo 431 put-value: bar 432 - image: gcr.io/kpt-fn/set-labels:v0.1 433 configMap: 434 app: db 435 - image: gcr.io/kpt-fn/search-replace:v0.1 436 configMap: 437 by-value: abc 438 put-comment: ${some-setter-name} 439 ``` 440 ``` 441 Expected output 442 ``` 443 ```yaml 444 pipeline: 445 mutators: 446 - image: gcr.io/kpt-fn/search-replace:v0.1 447 configMap: 448 by-value: foo 449 put-value: bar-new 450 - image: gcr.io/kpt-fn/search-replace:v0.1 451 configMap: 452 by-value: abc 453 put-comment: ${updated-setter-name} 454 ``` 455 456 Here is the ideal scenario which leads to deterministic merge behavior using `name` 457 field across all sources 458 459 ``` 460 Original 461 ``` 462 ```yaml 463 pipeline: 464 mutators: 465 - image: gcr.io/kpt-fn/search-replace:v0.1 466 name: sr1 467 configMap: 468 by-value: foo 469 put-value: bar 470 - image: gcr.io/kpt-fn/search-replace:v0.1 471 name: sr2 472 configMap: 473 by-value: abc 474 put-comment: ${some-setter-name} 475 ``` 476 ``` 477 Updated upstream 478 ``` 479 ```yaml 480 pipeline: 481 mutators: 482 - image: gcr.io/kpt-fn/search-replace:v0.1 483 name: sr1 484 configMap: 485 by-value: foo 486 put-value: bar-new 487 - image: gcr.io/kpt-fn/search-replace:v0.1 488 name: sr2 489 configMap: 490 by-value: abc 491 put-comment: ${updated-setter-name} 492 ``` 493 ``` 494 Local 495 ``` 496 ```yaml 497 pipeline: 498 mutators: 499 - image: gcr.io/kpt-fn/search-replace:v0.1 500 name: my-new-function 501 configMap: 502 by-value: YOUR_TEAM 503 put-value: my-team 504 - image: gcr.io/kpt-fn/generate-folders:v0.1 505 name: gf1 506 - image: gcr.io/kpt-fn/search-replace:v0.1 507 name: sr1 508 configMap: 509 by-value: foo 510 put-value: bar 511 - image: gcr.io/kpt-fn/set-labels:v0.1 512 name: sl1 513 configMap: 514 app: db 515 - image: gcr.io/kpt-fn/search-replace:v0.1 516 name: sr2 517 configMap: 518 by-value: abc 519 put-comment: ${some-setter-name} 520 ``` 521 ``` 522 Expected output 523 ``` 524 ```yaml 525 pipeline: 526 mutators: 527 - image: gcr.io/kpt-fn/search-replace:v0.1 528 name: my-new-function 529 configMap: 530 by-value: YOUR_TEAM 531 put-value: my-team 532 - image: gcr.io/kpt-fn/generate-folders:v0.1 533 name: gf1 534 - image: gcr.io/kpt-fn/search-replace:v0.1 535 name: sr1 536 configMap: 537 by-value: foo 538 put-value: bar-new 539 - image: gcr.io/kpt-fn/set-labels:v0.1 540 name: sl1 541 configMap: 542 app: db 543 - image: gcr.io/kpt-fn/search-replace:v0.1 544 name: sr2 545 configMap: 546 by-value: abc 547 put-comment: ${updated-setter-name} 548 ``` 549 550 Merging selectors is difficult as there is no identity. If both upstream and 551 local selectors for a given function diverge, the entire section of selectors 552 from upstream will override the selectors on local for that function. 553 554 ``` 555 Origin 556 ``` 557 ```yaml 558 pipeline: 559 mutators: 560 - image: gcr.io/kpt-fn/ensure-name-substring:v0.1 561 selectors: 562 - kind: Deployment 563 name: wordpress 564 - kind: Service 565 name: wordpress 566 ``` 567 ``` 568 Updated upstream 569 ``` 570 ```yaml 571 pipeline: 572 mutators: 573 - image: gcr.io/kpt-fn/ensure-name-substring:v0.1 574 selectors: 575 - kind: Deployment 576 name: wordpress 577 - kind: Service 578 name: wordpress 579 - kind: Foo 580 name: wordpress 581 ``` 582 ``` 583 Local 584 ``` 585 ```yaml 586 pipeline: 587 mutators: 588 - image: gcr.io/kpt-fn/ensure-name-substring:v0.1 589 selectors: 590 - kind: Deployment 591 name: my-wordpress 592 - kind: Service 593 name: my-wordpress 594 - namespace: my-space 595 ``` 596 ``` 597 Expected output 598 ``` 599 ```yaml 600 pipeline: 601 mutators: 602 - image: gcr.io/kpt-fn/apply-setters:v0.1 603 configPath: setters.yaml 604 - image: gcr.io/kpt-fn/set-namespace:v0.1 605 configMap: 606 namespace: foo 607 - image: gcr.io/kpt-fn/generate-folders:v0.1 608 ``` 609 610 ## Open issues 611 612 https://github.com/GoogleContainerTools/kpt/issues/2529 613 614 ## Alternatives Considered 615 616 For identifying the function, we can add the function version to the primary key 617 (in addition to the function name). But it is highly likely that 618 changing the function version means updating the function as opposed to adding a new function. 619 620 Why should upstream win in case of conflicts ? Is this what the user always expects? 621 - Not necessarily. User expectations might be different in different scenarios for resolving conflicts. 622 But since we already went down the path of upstream-wins strategy in case of conflicts for merging resources, 623 we are going down that route to maintain consistency. 624 - There is an open [issue](https://github.com/GoogleContainerTools/kpt/issues/1437) to support 625 multiple conflict resolution strategies and provide interactive update behavior to resolve 626 conflicts which is out of scope for this doc and will be addressed soon.