# Checking Go API Compatibility

The `apidiff` tool in this directory determines whether two examples of a
package or module are compatible. The goal is to help the developer make an
informed choice of semantic version after they have changed the code of their
module.

`apidiff` reports two kinds of changes: incompatible ones, which require
incrementing the major part of the semantic version, and compatible ones, which
require a minor version increment. If no API changes are reported but there are
code changes that could affect client code, then the patch version should
be incremented.

`apidiff` may be used to display API differences between any two packages or
modules, not just different versions of the same thing. It does this by ignoring
the package import paths when directly comparing two packages, and
by ignoring module paths when comparing two modules. That is to say, when
comparing two modules, the package import paths **do** matter, but are compared
_relative_ to their respective module root.

## Compatibility Desiderata

Any tool that checks compatibility can offer only an approximation. No tool can
detect behavioral changes; and even if it could, whether a behavioral change is
a breaking change or not depends on many factors, such as whether it closes a
security hole or fixes a bug. Even a change that causes some code to fail to
compile may not be considered a breaking change by the developers or their
users. It may only affect code marked as experimental or unstable, for
example, or the break may only manifest in unlikely cases.

For a tool to be useful, its notion of compatibility must be relaxed enough to
allow reasonable changes, like adding a field to a struct, but strict enough to
catch significant breaking changes.
A tool that is too lax will miss important
incompatibilities, and users will stop trusting it; one that is too strict may
generate so much noise that users will ignore it.

To a first approximation, this tool reports a change as incompatible if it could
cause client code to stop compiling. But `apidiff` ignores five ways in which
code may fail to compile after a change. Three of them are mentioned in the
[Go 1 Compatibility Guarantee](https://golang.org/doc/go1compat).

### Unkeyed Struct Literals

Code that uses an unkeyed struct literal would fail to compile if a field was
added to the struct, making any such addition an incompatible change. An example:

```
// old
type Point struct { X, Y int }

// new
type Point struct { X, Y, Z int }

// client
p := pkg.Point{1, 2}    // fails in new because there are more fields than expressions
```
Here and below, we provide three snippets: the code in the old version of the
package, the code in the new version, and the code written in a client of the package,
which refers to it by the name `pkg`. The client code compiles against the old
code but not the new.

### Embedding and Shadowing

Adding an exported field to a struct can break code that embeds that struct,
because the newly added field may conflict with an identically named field
at the same struct depth. A selector referring to the latter would become
ambiguous and thus erroneous.

```
// old
type Point struct { X, Y int }

// new
type Point struct { X, Y, Z int }

// client
type z struct { Z int }

var v struct {
    pkg.Point
    z
}

_ = v.Z  // fails in new
```
In the new version, the last line fails to compile because there are two embedded `Z`
fields at the same depth, one from `z` and one from `pkg.Point`.
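
The ambiguity can be reproduced with the standard `go/types` checker. The
following is a minimal sketch, not part of `apidiff`: it assumes, for brevity,
that the client code lives in the same package as `Point` (so it writes `Point`
rather than `pkg.Point`), and the helper name `typeCheck` is hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// Old and new declarations of Point, as in the example above.
const (
	oldPoint = `type Point struct{ X, Y int }`
	newPoint = `type Point struct{ X, Y, Z int }`
)

// client embeds Point next to a struct that already has a field Z.
// Assumption: it is checked in the same package as Point, so it
// writes Point instead of pkg.Point.
const client = `
type z struct{ Z int }

var v struct {
	Point
	z
}

var _ = v.Z
`

// typeCheck returns the first error found when the client code is
// type-checked together with the given declaration of Point.
func typeCheck(pointDecl string) error {
	src := "package p\n" + pointDecl + "\n" + client
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		return err
	}
	_, err = new(types.Config).Check("p", fset, []*ast.File{file}, nil)
	return err
}

func main() {
	fmt.Println("old:", typeCheck(oldPoint))
	fmt.Println("new:", typeCheck(newPoint))
}
```

With the old declaration the selector `v.Z` resolves to the field of `z`; with
the new one the checker reports an error for the selector, because two `Z`
fields are reachable at the same depth.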

### Using an Identical Type Externally

If it is possible for client code to write a type expression representing the
underlying type of a defined type in a package, then external code can use it in
assignments involving the package type, making any change to that type incompatible.
```
// old
type Point struct { X, Y int }

// new
type Point struct { X, Y, Z int }

// client
var p struct { X, Y int } = pkg.Point{} // fails in new because of Point's extra field
```
Here, the external code could have used the provided name `Point`, but chose not
to. I'll have more to say about this and related examples later.

### unsafe.Sizeof and Friends

Since `unsafe.Sizeof`, `unsafe.Offsetof` and `unsafe.Alignof` are constant
expressions, they can be used in an array type literal:

```
// old
type S struct{ X int }

// new
type S struct{ X, y int }

// client
var a [unsafe.Sizeof(pkg.S{})]int = [8]int{} // fails in new because S's size is not 8
```
Use of these operations could make many changes to a type potentially incompatible.

### Type Switches

A package change that merges two different types (with same underlying type)
into a single new type may break type switches in clients that refer to both
original types:

```
// old
type T1 int
type T2 int

// new
type T1 int
type T2 = T1

// client
switch x.(type) {
case T1:
case T2:
} // fails with new because two cases have the same type
```
This sort of incompatibility is sufficiently esoteric to ignore; the tool allows
merging types.

## First Attempt at a Definition

Our first attempt at defining compatibility captures the idea that all the
exported names in the old package must have compatible equivalents in the new
package.

A new package is compatible with an old one if and only if:
- For every exported package-level name in the old package, the same name is
  declared in the new at package level, and
- the names denote the same kind of object (e.g. both are variables), and
- the types of the objects are compatible.

We will work out the details (and make some corrections) below, but it is clear
already that we will need to determine what makes two types compatible. And
whatever the definition of type compatibility, it's certainly true that if two
types are the same, they are compatible. So we will need to decide what makes an
old and new type the same. We will call this sameness relation _correspondence_.

## Type Correspondence

Go already has a definition of when two types are the same:
[type identity](https://golang.org/ref/spec#Type_identity).
But identity isn't adequate for our purpose: it says that two defined
types are identical if they arise from the same definition, but it's unclear
what "same" means when talking about two different packages (or two versions of
a single package).

The obvious change to the definition of identity is to require that old and new
[defined types](https://golang.org/ref/spec#Type_definitions)
have the same name instead. But that doesn't work either, for two
reasons. First, type aliases can equate two defined types with different names:

```
// old
type E int

// new
type t int
type E = t
```
Second, an unexported type can be renamed:

```
// old
type u1 int
var V u1

// new
type u2 int
var V u2
```
Here, even though `u1` and `u2` are unexported, their exported fields and
methods are visible to clients, so they are part of the API. But since the name
`u1` is not visible to clients, it can be changed compatibly.
We say that `u1`
and `u2` are _exposed_: a type is exposed if a client package can declare variables of that type.

We will say that an old defined type _corresponds_ to a new one if they have the
same name, or one can be renamed to the other without otherwise changing the
API. In the first example above, old `E` and new `t` correspond. In the second,
old `u1` and new `u2` correspond.

Two or more old defined types can correspond to a single new type: we consider
"merging" two types into one to be a compatible change. As mentioned above,
code that uses both names in a type switch will fail, but we deliberately ignore
this case. However, a single old type can correspond to only one new type.

So far, we've explained what correspondence means for defined types. To extend
the definition to all types, we parallel the language's definition of type
identity. So, for instance, an old and a new slice type correspond if their
element types correspond.

## Definition of Compatibility

We can now present the definition of compatibility used by `apidiff`.

### Module Compatibility

> A new module is compatible with an old one if:
>1. Each package present in the old module also appears in the new module,
> with matching import paths relative to their respective module root, and
>2. Each package present in both modules fulfills Package Compatibility as
> defined below.
>
>Otherwise the modules are incompatible.

If a package is converted into a nested module of the original module then
comparing two versions of the module, before and after nested module creation,
will produce an incompatible package removal message. This removal message does
not necessarily mean that client code will need to change. If the package API
retains Package Compatibility after nested module creation, then only the
`go.mod` of the client code will need to change.
Take the following example:

```
./
  go.mod
  go.sum
  foo.go
  bar/bar.go
```

Where `go.mod` is:

```
module example.com/foo

go 1.20
```

Where `bar/bar.go` is:

```
package bar

var V int
```

And `foo.go` is:

```
package foo

import "example.com/foo/bar"

var _ = bar.V
```

Creating a nested module with the package `bar` while retaining Package
Compatibility is _code_ compatible, because the import path of the package does
not change:

```
./
  go.mod
  go.sum
  foo.go
  bar/
    bar.go
    go.mod
    go.sum
```

Where `bar/go.mod` is:
```
module example.com/foo/bar

go 1.20
```

And the top-level `go.mod` becomes:
```
module example.com/foo

go 1.20

// New dependency on nested module.
require example.com/foo/bar v1.0.0
```

If during nested module creation either Package Compatibility is broken, like so
in `bar/bar.go`:

```
package bar

// Changed from V to T.
var T int
```

Or the nested module uses a name other than the original package's import path,
like so in `bar/go.mod`:

```
// Completely different module name
module example.com/qux

go 1.20
```

Then the move is backwards incompatible for client code.

### Package Compatibility

> A new package is compatible with an old one if:
>1. Each exported name in the old package's scope also appears in the new
>package's scope, and the object (constant, variable, function or type) denoted
>by that name in the old package is compatible with the object denoted by the
>name in the new package, and
>2. For every exposed type that implements an exposed interface in the old package,
> its corresponding type should implement the corresponding interface in the new package.
>
>Otherwise the packages are incompatible.

As an aside, the tool also finds exported names in the new package that are not
exported in the old, and marks them as compatible changes.

Clause 2 is discussed further in "Whole-Package Compatibility."

### Object Compatibility

This section provides compatibility rules for constants, variables, functions
and types.

#### Constants

>A new exported constant is compatible with an old one of the same name if and only if
>1. Their types correspond, and
>2. Their values are identical.

It is tempting to allow changing a typed constant to an untyped one. That may
seem harmless, but it can break code like this:

```
// old
const C int64 = 1

// new
const C = 1

// client
var x = C          // old type is int64, new is int
var y int64 = x    // fails with new: different types in assignment
```

A change to the value of a constant can break compatibility if the value is used
in an array type:

```
// old
const C = 1

// new
const C = 2

// client
var a [C]int = [1]int{} // fails with new because [2]int and [1]int are different types
```
Changes to constant values are rare, and determining whether they are compatible
or not is better left to the user, so the tool reports them.

#### Variables

>A new exported variable is compatible with an old one of the same name if and
>only if their types correspond.

Correspondence doesn't look past names, so this rule does not prevent adding a
field to `MyStruct` if the package declares `var V MyStruct`. It does, however, mean that

```
var V struct { X int }
```
is incompatible with
```
var V struct { X, Y int }
```
I discuss this at length below in the section "Compatibility, Types and Names."

#### Functions

>A new exported function or variable is compatible with an old function of the
>same name if and only if their types (signatures) correspond.

This rule captures the fact that, although many signature changes are compatible
for all call sites, none are compatible for assignment:

```
var v func(int) = pkg.F
```
Here, `F` must be of type `func(int)` and not, for instance, `func(...int)` or `func(interface{})`.

Note that the rule permits changing a function to a variable. This is a common
practice, usually done for test stubbing, and cannot break any code at compile
time.

#### Exported Types

> A new exported type is compatible with an old one if and only if their
> names are the same and their types correspond.

This rule seems far too strict. But, ignoring aliases for the moment, it demands only
that the old and new _defined_ types correspond. Consider:
```
// old
type T struct { X int }

// new
type T struct { X, Y int }
```
The addition of `Y` is a compatible change, because this rule does not require
that the struct literals have to correspond, only that the defined types
denoted by `T` must correspond. (Remember that correspondence stops at type
names.)

If one type is an alias that refers to the corresponding defined type, the
situation is the same:

```
// old
type T struct { X int }

// new
type u struct { X, Y int }
type T = u
```
Here, the only requirement is that old `T` corresponds to new `u`, not that the
struct types correspond. (We can't tell from this snippet that the old `T` and
the new `u` do correspond; that depends on whether `u` replaces `T` throughout
the API.)

However, the following change is incompatible, because the names do not
denote corresponding types:

```
// old
type T = struct { X int }

// new
type T = struct { X, Y int }
```

### Type Literal Compatibility

Only five kinds of types can differ compatibly: defined types, structs,
interfaces, channels and numeric types. We only consider the compatibility of
the last four when they are the underlying type of a defined type. See
"Compatibility, Types and Names" for a rationale.

We justify the compatibility rules by enumerating all the ways a type
can be used, and by showing that the allowed changes cannot break any code that
uses values of the type in those ways.

Values of all types can be used in assignments (including argument passing and
function return), but we do not require that old and new types are assignment
compatible. That is because we assume that the old and new packages are never
used together: any given binary will link in either the old package or the new.
So in describing how a type can be used in the sections below, we omit
assignment.

Any type can also be used in a type assertion or conversion. The changes we allow
below may affect the run-time behavior of these operations, but they cannot affect
whether they compile. The only such breaking change would be to change
the type `T` in an assertion `x.(T)` so that it no longer implements the interface
type of `x`; but the rules for interfaces below disallow that.

> A new type is compatible with an old one if and only if they correspond, or
> one of the cases below applies.

#### Defined Types

Other than assignment, the only ways to use a defined type are to access its
methods, or to make use of the properties of its underlying type. Rule 2 below
covers the latter, and rules 3 and 4 cover the former.

> A new defined type is compatible with an old one if and only if all of the
> following hold:
>1. They correspond.
>2. Their underlying types are compatible.
>3. The new exported value method set is a superset of the old.
>4. The new exported pointer method set is a superset of the old.

An exported method set is a method set with all unexported methods removed.
When comparing methods of a method set, we require identical names and
corresponding signatures.

Removing an exported method is clearly a breaking change. But removing an
unexported one (or changing its signature) can be breaking as well, if it
results in the type no longer implementing an interface. See "Whole-Package
Compatibility," below.

#### Channels

> A new channel type is compatible with an old one if
>  1. The element types correspond, and
>  2. Either the directions are the same, or the new type has no direction.

Other than assignment, the only ways to use values of a channel type are to send
and receive on them, to close them, and to use them as map keys. Changes to a
channel type cannot cause code that closes a channel or uses it as a map key to
fail to compile, so we need not consider those operations.

Rule 1 ensures that any operations on the values sent or received will compile.
Rule 2 captures the fact that any program that compiles with a directed channel
must use either only sends, or only receives, so allowing the other operation
by removing the channel direction cannot break any code.

#### Interfaces

> A new interface is compatible with an old one if and only if:
> 1. The old interface does not have an unexported method, and it corresponds
>    to the new interface (i.e. they have the same method set), or
> 2. The old interface has an unexported method and the new exported method set is a
>    superset of the old.

Other than assignment, the only ways to use an interface are to implement it,
embed it, or call one of its methods. (Interface values can also be used as map
keys, but that cannot cause a compile-time error.)

Certainly, removing an exported method from an interface could break a client
call, so neither rule allows it.

Rule 1 also disallows adding a method to an interface without an existing unexported
method. Such an interface can be implemented in client code. If adding a method
were allowed, a type that implements the old interface could fail to implement
the new one:

```
type I interface { M1() }         // old
type I interface { M1(); M2() }   // new

// client
type t struct{}
func (t) M1() {}
var i pkg.I = t{} // fails with new, because t lacks M2
```

Rule 2 is based on the observation that if an interface has an unexported
method, the only way a client can implement it is to embed it.
Adding a method is compatible in this case, because the embedding struct will
continue to implement the interface. Adding a method also cannot break any call
sites, since no program that compiles could have any such call sites.

#### Structs

> A new struct is compatible with an old one if all of the following hold:
> 1. The new set of top-level exported fields is a superset of the old.
> 2. The new set of _selectable_ exported fields is a superset of the old.
> 3. If the old struct is comparable, so is the new one.

The set of selectable exported fields is the set of exported fields `F`
such that `x.F` is a valid selector expression for a value `x` of the struct
type. `F` may be at the top level of the struct, or it may be a field of an
embedded struct.

Two fields are the same if they have the same name and corresponding types.
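
Rules 1 and 3 can be approximated with the standard `go/types` package. The
sketch below is illustrative only and is not how `apidiff` is implemented: the
helpers `structOf`, `exportedFields` and `structCompatible` are hypothetical,
rule 2 (selectable fields) is omitted, and `types.Identical` stands in for
correspondence, which is only adequate for field types that involve no defined
types.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// structOf type-checks a small package body and returns the struct type
// underlying the defined type called name.
func structOf(decls, name string) *types.Struct {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "p.go", "package p\n"+decls, 0)
	if err != nil {
		panic(err)
	}
	pkg, err := new(types.Config).Check("p", fset, []*ast.File{file}, nil)
	if err != nil {
		panic(err)
	}
	return pkg.Scope().Lookup(name).Type().Underlying().(*types.Struct)
}

// exportedFields maps each top-level exported field name to its type.
func exportedFields(s *types.Struct) map[string]types.Type {
	fields := map[string]types.Type{}
	for i := 0; i < s.NumFields(); i++ {
		if f := s.Field(i); f.Exported() {
			fields[f.Name()] = f.Type()
		}
	}
	return fields
}

// structCompatible checks rule 1 (top-level exported fields) and rule 3
// (comparability). Assumption: types.Identical approximates correspondence.
func structCompatible(oldS, newS *types.Struct) bool {
	newFields := exportedFields(newS)
	for name, oldType := range exportedFields(oldS) {
		newType, ok := newFields[name]
		if !ok || !types.Identical(oldType, newType) {
			return false
		}
	}
	return !types.Comparable(oldS) || types.Comparable(newS)
}

func main() {
	oldT := structOf(`type T struct{ X int }`, "T")
	added := structOf(`type T struct{ X, Y int }`, "T")
	removed := structOf(`type T struct{ Y int }`, "T")
	notComparable := structOf(`type T struct{ X int; F func() }`, "T")

	fmt.Println("add field Y:      ", structCompatible(oldT, added))
	fmt.Println("remove field X:   ", structCompatible(oldT, removed))
	fmt.Println("add func field F: ", structCompatible(oldT, notComparable))
}
```

Adding `Y` passes both checks. Removing `X` violates rule 1, and adding a
field of function type violates rule 3, because the old struct is comparable
and the new one is not.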

Other than assignment, there are only four ways to use a struct: write a struct
literal, select a field, use a value of the struct as a map key, or compare two
values for equality. The first clause ensures that struct literals compile; the
second, that selections compile; and the third, that equality expressions and
map index expressions compile.

#### Numeric Types

> A new numeric type is compatible with an old one if and only if they are
> both unsigned integers, both signed integers, both floats or both complex
> types, and the new one is at least as large as the old on both 32-bit and
> 64-bit architectures.

Other than in assignments, numeric types appear in arithmetic and comparison
expressions. Since all arithmetic operations but shifts (see below) require that
operand types be identical, and by assumption the old and new types underlie
defined types (see "Compatibility, Types and Names," below), there is no way for
client code to write an arithmetic expression that compiles with operands of the
old type but not the new.

Numeric types can also appear in type switches and type assertions. Again, since
the old and new types underlie defined types, type switches and type assertions
that compiled using the old defined type will continue to compile with the new
defined type.

Going from an unsigned to a signed integer type is an incompatible change for
the sole reason that only an unsigned type can appear as the right operand of a
shift. If this rule is relaxed, then changes from an unsigned type to a larger
signed type would be compatible. See [this
issue](https://github.com/golang/go/issues/19113).

Only integer types can be used in bitwise and shift operations, and for indexing
slices and arrays.
That is why switching from an integer to a floating-point
type--even one that can represent all values of the integer type--is an
incompatible change.

Conversions from floating-point to complex types or vice versa are not permitted
(the predeclared functions `real`, `imag`, and `complex` must be used instead). To
prevent valid floating-point or complex conversions from becoming invalid,
changing a floating-point type to a complex type or vice versa is considered an
incompatible change.

Although conversions between any two integer types are valid, assigning a
constant value to a variable of integer type that is too small to represent the
constant is not permitted. That is why the only compatible changes are to
a new type whose values are a superset of the old. The requirement that the new
set of values must include the old on both 32-bit and 64-bit machines allows
conversions from `int32` to `int` and from `int` to `int64`, but not the other
direction; and similarly for `uint`.

Changing a type to or from `uintptr` is considered an incompatible change. Since
its size is not specified, there is no way to know whether the new type's values
are a superset of the old type's.

## Whole-Package Compatibility

Some changes that are compatible for a single type are not compatible when the
package is considered as a whole. For example, if you remove an unexported
method on a defined type, it may no longer implement an interface of the
package. This can break client code:

```
// old
type T int
func (T) m() {}
type I interface { m() }

// new
type T int // no method m anymore

// client
var i pkg.I = pkg.T(0) // fails with new because T lacks m
```

Similarly, adding a method to an interface can cause defined types
in the package to stop implementing it.
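
The loss of the interface implementation in the example above can be observed
with `types.Implements` from the standard `go/types` package. This is a minimal
sketch rather than the check `apidiff` performs; the helper `implementsI` is
hypothetical, and the type and interface are looked up by the names `T` and `I`
instead of through correspondence.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// oldSrc and newSrc are the two versions of the package from the example.
const oldSrc = `package pkg

type T int

func (T) m() {}

type I interface{ m() }
`

const newSrc = `package pkg

type T int // no method m anymore

type I interface{ m() }
`

// implementsI type-checks src and reports whether the type named T
// implements the interface named I declared in the same package.
func implementsI(src string) bool {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "pkg.go", src, 0)
	if err != nil {
		panic(err)
	}
	pkg, err := new(types.Config).Check("pkg", fset, []*ast.File{file}, nil)
	if err != nil {
		panic(err)
	}
	scope := pkg.Scope()
	t := scope.Lookup("T").Type()
	i := scope.Lookup("I").Type().Underlying().(*types.Interface)
	return types.Implements(t, i)
}

func main() {
	fmt.Println("old: T implements I:", implementsI(oldSrc))
	fmt.Println("new: T implements I:", implementsI(newSrc))
}
```

Each version type-checks on its own, which is why a per-type comparison does
not notice the problem; only comparing the implements relation across the two
versions reveals it.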

The second clause in the definition for package compatibility handles these
cases. To repeat:

> 2. For every exposed type that implements an exposed interface in the old package,
> its corresponding type should implement the corresponding interface in the new package.

Recall that a type is exposed if it is part of the package's API, even if it is
unexported.

Other incompatibilities that involve more than one type in the package can arise
whenever two types with identical underlying types exist in the old or new
package. Here, a change "splits" an identical underlying type into two, breaking
conversions:

```
// old
type B struct { X int }
type C struct { X int }

// new
type B struct { X int }
type C struct { X, Y int }

// client
var b B
_ = C(b) // fails with new: cannot convert B to C
```
Finally, changes that are compatible for the package in which they occur can
break downstream packages. That can happen even if they involve unexported
methods, thanks to embedding.

The definitions given here don't account for these sorts of problems.

## Compatibility, Types and Names

The above definitions state that the only types that can differ compatibly are
defined types and the types that underlie them. Changes to other type literals
are considered incompatible. For instance, it is considered an incompatible
change to add a field to the struct in this variable declaration:

```
var V struct { X int }
```
or this alias definition:
```
type T = struct { X int }
```

We make this choice to keep the definition of compatibility (relatively) simple.
A more precise definition could, for instance, distinguish between

```
func F(struct { X int })
```
where any changes to the struct are incompatible, and

```
func F(struct { X, u int })
```
where adding a field is compatible (since clients cannot write the signature,
and thus cannot assign `F` to a variable of the signature type). The definition
should then also allow other function signature changes that only require
call-site compatibility, like

```
func F(struct { X, u int }, ...int)
```
The result would be a much more complex definition with little benefit, since
the examples in this section rarely arise in practice.