# Checking Go Package API Compatibility

The `apidiff` tool in this directory determines whether two versions of the same
package are compatible. The goal is to help the developer make an informed
choice of semantic version after they have changed the code of their module.

`apidiff` reports two kinds of changes: incompatible ones, which require
incrementing the major part of the semantic version, and compatible ones, which
require a minor version increment. If no API changes are reported but there are
code changes that could affect client code, then the patch version should
be incremented.

Because `apidiff` ignores package import paths, it may be used to display API
differences between any two packages, not just different versions of the same
package.

The current version of `apidiff` compares only packages, not modules.


## Compatibility Desiderata

Any tool that checks compatibility can offer only an approximation. No tool can
detect behavioral changes; and even if it could, whether a behavioral change is
a breaking change or not depends on many factors, such as whether it closes a
security hole or fixes a bug. Even a change that causes some code to fail to
compile may not be considered a breaking change by the developers or their
users. It may only affect code marked as experimental or unstable, for
example, or the break may only manifest in unlikely cases.

For a tool to be useful, its notion of compatibility must be relaxed enough to
allow reasonable changes, like adding a field to a struct, but strict enough to
catch significant breaking changes. A tool that is too lax will miss important
incompatibilities, and users will stop trusting it; one that is too strict may
generate so much noise that users will ignore it.

To a first approximation, this tool reports a change as incompatible if it could
cause client code to stop compiling. But `apidiff` ignores five ways in which
code may fail to compile after a change. Three of them are mentioned in the
[Go 1 Compatibility Guarantee](https://golang.org/doc/go1compat).

### Unkeyed Struct Literals

Code that uses an unkeyed struct literal would fail to compile if a field was
added to the struct, making any such addition an incompatible change. An example:

```
// old
type Point struct { X, Y int }

// new
type Point struct { X, Y, Z int }

// client
p := pkg.Point{1, 2}    // fails in new because there are more fields than expressions
```
Here and below, we provide three snippets: the code in the old version of the
package, the code in the new version, and the code written in a client of the package,
which refers to it by the name `pkg`. The client code compiles against the old
code but not the new.

### Embedding and Shadowing

Adding an exported field to a struct can break code that embeds that struct,
because the newly added field may conflict with an identically named field
at the same struct depth. A selector referring to the latter would become
ambiguous and thus erroneous.


```
// old
type Point struct { X, Y int }

// new
type Point struct { X, Y, Z int }

// client
type z struct { Z int }

var v struct {
    pkg.Point
    z
}

_ = v.Z  // fails in new
```
In the new version, the last line fails to compile because there are two embedded `Z`
fields at the same depth, one from `z` and one from `pkg.Point`.


### Using an Identical Type Externally

If it is possible for client code to write a type expression representing the
underlying type of a defined type in a package, then external code can use it in
assignments involving the package type, making any change to that type incompatible.
```
// old
type Point struct { X, Y int }

// new
type Point struct { X, Y, Z int }

// client
var p struct { X, Y int } = pkg.Point{}   // fails in new because of Point's extra field
```
Here, the external code could have used the provided name `Point`, but chose not
to. I'll have more to say about this and related examples later.

### unsafe.Sizeof and Friends

Since `unsafe.Sizeof`, `unsafe.Offsetof` and `unsafe.Alignof` are constant
expressions, they can be used in an array type literal:

```
// old
type S struct{ X int }

// new
type S struct{ X, y int }

// client
var a [unsafe.Sizeof(pkg.S{})]int = [8]int{}  // fails in new because S's size is not 8
```
Use of these operations could make many changes to a type potentially incompatible.


### Type Switches

A package change that merges two different types (with the same underlying type)
into a single new type may break type switches in clients that refer to both
original types:

```
// old
type T1 int
type T2 int

// new
type T1 int
type T2 = T1

// client
switch x.(type) {
case pkg.T1:
case pkg.T2:
} // fails with new because two cases have the same type
```
This sort of incompatibility is sufficiently esoteric to ignore; the tool allows
merging types.

## First Attempt at a Definition

Our first attempt at defining compatibility captures the idea that all the
exported names in the old package must have compatible equivalents in the new
package.

A new package is compatible with an old one if and only if:
- For every exported package-level name in the old package, the same name is
  declared in the new at package level, and
- the names denote the same kind of object (e.g. both are variables), and
- the types of the objects are compatible.

We will work out the details (and make some corrections) below, but it is clear
already that we will need to determine what makes two types compatible. And
whatever the definition of type compatibility, it's certainly true that if two
types are the same, they are compatible. So we will need to decide what makes an
old and new type the same. We will call this sameness relation _correspondence_.

## Type Correspondence

Go already has a definition of when two types are the same:
[type identity](https://golang.org/ref/spec#Type_identity).
But identity isn't adequate for our purpose: it says that two defined
types are identical if they arise from the same definition, but it's unclear
what "same" means when talking about two different packages (or two versions of
a single package).

The obvious change to the definition of identity is to require that old and new
[defined types](https://golang.org/ref/spec#Type_definitions)
have the same name instead. But that doesn't work either, for two
reasons. First, type aliases can equate two defined types with different names:

```
// old
type E int

// new
type t int
type E = t
```
Second, an unexported type can be renamed:

```
// old
type u1 int
var V u1

// new
type u2 int
var V u2
```
Here, even though `u1` and `u2` are unexported, their exported fields and
methods are visible to clients, so they are part of the API. But since the name
`u1` is not visible to clients, it can be changed compatibly.
We say that `u1`
and `u2` are _exposed_: a type is exposed if a client package can declare variables of that type.

We will say that an old defined type _corresponds_ to a new one if they have the
same name, or one can be renamed to the other without otherwise changing the
API. In the first example above, old `E` and new `t` correspond. In the second,
old `u1` and new `u2` correspond.

Two or more old defined types can correspond to a single new type: we consider
"merging" two types into one to be a compatible change. As mentioned above,
code that uses both names in a type switch will fail, but we deliberately ignore
this case. However, a single old type can correspond to only one new type.

So far, we've explained what correspondence means for defined types. To extend
the definition to all types, we parallel the language's definition of type
identity. So, for instance, an old and a new slice type correspond if their
element types correspond.

## Definition of Compatibility

We can now present the definition of compatibility used by `apidiff`.

### Package Compatibility

> A new package is compatible with an old one if:
>1. Each exported name in the old package's scope also appears in the new
>package's scope, and the object (constant, variable, function or type) denoted
>by that name in the old package is compatible with the object denoted by the
>name in the new package, and
>2. For every exposed type that implements an exposed interface in the old package,
> its corresponding type should implement the corresponding interface in the new package.
>
>Otherwise the packages are incompatible.

As an aside, the tool also finds exported names in the new package that are not
exported in the old, and marks them as compatible changes.

Clause 2 is discussed further in "Whole-Package Compatibility."
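
The name-matching half of clause 1 can be sketched with the standard `go/types`
package. This is only an illustration, not how `apidiff` itself is implemented:
the helper names (`check`, `kind`, `nameChanges`) and the two tiny source
strings are made up for the example, and the sketch compares only the presence
and kind of each exported name, not the types behind them. It treats a function
that becomes a variable as allowed, following the rule for functions given
later in this document.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// Hypothetical old and new versions of a package, kept import-free so that
// no importer is needed to type-check them.
const oldSrc = `package p

const C = 1
var V int
func F() {}
func G() {}
type T int
func h() {}
`

const newSrc = `package p

var C = 1
var V int
var F = func() {}
type T int
`

// check parses and type-checks a single-file package given as a string.
func check(path, src string) (*types.Package, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, path+".go", src, 0)
	if err != nil {
		return nil, err
	}
	var conf types.Config
	return conf.Check(path, fset, []*ast.File{f}, nil)
}

// kind names the sort of object a package-level name denotes.
func kind(obj types.Object) string {
	switch obj.(type) {
	case *types.Const:
		return "const"
	case *types.Var:
		return "var"
	case *types.Func:
		return "func"
	case *types.TypeName:
		return "type"
	}
	return "other"
}

// nameChanges lists exported names of oldPkg that are missing from newPkg
// or that denote a different kind of object there. Changing a function to
// a variable is deliberately not reported.
func nameChanges(oldPkg, newPkg *types.Package) []string {
	var msgs []string
	for _, name := range oldPkg.Scope().Names() { // Names is sorted
		if !ast.IsExported(name) {
			continue
		}
		oldObj := oldPkg.Scope().Lookup(name)
		newObj := newPkg.Scope().Lookup(name)
		if newObj == nil {
			msgs = append(msgs, name+": removed")
			continue
		}
		ok, nk := kind(oldObj), kind(newObj)
		if ok != nk && !(ok == "func" && nk == "var") {
			msgs = append(msgs, fmt.Sprintf("%s: changed from %s to %s", name, ok, nk))
		}
	}
	return msgs
}

func main() {
	oldPkg, err := check("p", oldSrc)
	if err != nil {
		panic(err)
	}
	newPkg, err := check("p", newSrc)
	if err != nil {
		panic(err)
	}
	for _, m := range nameChanges(oldPkg, newPkg) {
		fmt.Println(m)
	}
}
```

A fuller check would go on to compare the types of the matched objects using
correspondence, as the following sections describe.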

### Object Compatibility

This section provides compatibility rules for constants, variables, functions
and types.

#### Constants

>A new exported constant is compatible with an old one of the same name if and only if
>1. Their types correspond, and
>2. Their values are identical.

It is tempting to allow changing a typed constant to an untyped one. That may
seem harmless, but it can break code like this:

```
// old
const C int64 = 1

// new
const C = 1

// client
var x = C          // old type is int64, new is int
var y int64 = x    // fails with new: different types in assignment
```

A change to the value of a constant can break compatibility if the value is used
in an array type:

```
// old
const C = 1

// new
const C = 2

// client
var a [C]int = [1]int{} // fails with new because [2]int and [1]int are different types
```
Changes to constant values are rare, and determining whether they are compatible
or not is better left to the user, so the tool reports them.

#### Variables

>A new exported variable is compatible with an old one of the same name if and
>only if their types correspond.

Correspondence doesn't look past names, so this rule does not prevent adding a
field to `MyStruct` if the package declares `var V MyStruct`. It does, however, mean that

```
var V struct { X int }
```
is incompatible with
```
var V struct { X, Y int }
```
I discuss this at length below in the section "Compatibility, Types and Names."

#### Functions

>A new exported function or variable is compatible with an old function of the
>same name if and only if their types (signatures) correspond.

This rule captures the fact that, although many signature changes are compatible
for all call sites, none are compatible for assignment:

```
var v func(int) = pkg.F
```
Here, `F` must be of type `func(int)` and not, for instance, `func(...int)` or `func(interface{})`.

Note that the rule permits changing a function to a variable. This is a common
practice, usually done for test stubbing, and cannot break any code at compile
time.

#### Exported Types

> A new exported type is compatible with an old one if and only if their
> names are the same and their types correspond.

This rule seems far too strict. But, ignoring aliases for the moment, it demands only
that the old and new _defined_ types correspond. Consider:
```
// old
type T struct { X int }

// new
type T struct { X, Y int }
```
The addition of `Y` is a compatible change, because this rule does not require
that the struct literals have to correspond, only that the defined types
denoted by `T` must correspond. (Remember that correspondence stops at type
names.)

If one type is an alias that refers to the corresponding defined type, the
situation is the same:

```
// old
type T struct { X int }

// new
type u struct { X, Y int }
type T = u
```
Here, the only requirement is that old `T` corresponds to new `u`, not that the
struct types correspond. (We can't tell from this snippet that the old `T` and
the new `u` do correspond; that depends on whether `u` replaces `T` throughout
the API.)

However, the following change is incompatible, because the names do not
denote corresponding types:

```
// old
type T = struct { X int }

// new
type T = struct { X, Y int }
```

### Type Literal Compatibility

Only five kinds of types can differ compatibly: defined types, structs,
interfaces, channels and numeric types. We only consider the compatibility of
the last four when they are the underlying type of a defined type. See
"Compatibility, Types and Names" for a rationale.

We justify the compatibility rules by enumerating all the ways a type
can be used, and by showing that the allowed changes cannot break any code that
uses values of the type in those ways.

Values of all types can be used in assignments (including argument passing and
function return), but we do not require that old and new types are assignment
compatible. That is because we assume that the old and new packages are never
used together: any given binary will link in either the old package or the new.
So in describing how a type can be used in the sections below, we omit
assignment.

Any type can also be used in a type assertion or conversion. The changes we allow
below may affect the run-time behavior of these operations, but they cannot affect
whether they compile. The only such breaking change would be to change
the type `T` in an assertion `x.(T)` so that it no longer implements the interface
type of `x`; but the rules for interfaces below disallow that.

> A new type is compatible with an old one if and only if they correspond, or
> one of the cases below applies.

#### Defined Types

Other than assignment, the only ways to use a defined type are to access its
methods, or to make use of the properties of its underlying type. Rule 2 below
covers the latter, and rules 3 and 4 cover the former.

> A new defined type is compatible with an old one if and only if all of the
> following hold:
>1. They correspond.
>2. Their underlying types are compatible.
>3. The new exported value method set is a superset of the old.
>4. The new exported pointer method set is a superset of the old.

An exported method set is a method set with all unexported methods removed.
When comparing methods of a method set, we require identical names and
corresponding signatures.

Removing an exported method is clearly a breaking change. But removing an
unexported one (or changing its signature) can be breaking as well, if it
results in the type no longer implementing an interface. See "Whole-Package
Compatibility," below.

#### Channels

> A new channel type is compatible with an old one if
>  1. The element types correspond, and
>  2. Either the directions are the same, or the new type has no direction.

Other than assignment, the only ways to use values of a channel type are to send
and receive on them, to close them, and to use them as map keys. Changes to a
channel type cannot cause code that closes a channel or uses it as a map key to
fail to compile, so we need not consider those operations.

Rule 1 ensures that any operations on the values sent or received will compile.
Rule 2 captures the fact that any program that compiles with a directed channel
must use either only sends, or only receives, so allowing the other operation
by removing the channel direction cannot break any code.


#### Interfaces

> A new interface is compatible with an old one if and only if:
> 1. The old interface does not have an unexported method, and it corresponds
>    to the new interface (i.e. they have the same method set), or
> 2. The old interface has an unexported method and the new exported method set is a
>    superset of the old.

Other than assignment, the only ways to use an interface are to implement it,
embed it, or call one of its methods. (Interface values can also be used as map
keys, but that cannot cause a compile-time error.)

Certainly, removing an exported method from an interface could break a client
call, so neither rule allows it.

Rule 1 also disallows adding a method to an interface without an existing unexported
method. Such an interface can be implemented in client code. If adding a method
were allowed, a type that implements the old interface could fail to implement
the new one:

```
type I interface { M1() }         // old
type I interface { M1(); M2() }   // new

// client
type t struct{}
func (t) M1() {}
var i pkg.I = t{} // fails with new, because t lacks M2
```

Rule 2 is based on the observation that if an interface has an unexported
method, the only way a client can implement it is to embed it.
Adding a method is compatible in this case, because the embedding struct will
continue to implement the interface. Adding a method also cannot break any call
sites, since no program that compiles could have any such call sites.

#### Structs

> A new struct is compatible with an old one if all of the following hold:
> 1. The new set of top-level exported fields is a superset of the old.
> 2. The new set of _selectable_ exported fields is a superset of the old.
> 3. If the old struct is comparable, so is the new one.

The set of selectable exported fields is the set of exported fields `F`
such that `x.F` is a valid selector expression for a value `x` of the struct
type. `F` may be at the top level of the struct, or it may be a field of an
embedded struct.

Two fields are the same if they have the same name and corresponding types.
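
As a rough illustration of the struct rules, the sketch below checks clauses 1
and 3 with `go/types`. It is an assumption-laden simplification, not the tool's
implementation: the helper names (`check`, `structOf`, `structCompatible`) and
the sample types are hypothetical, all structs live in one package so that
`types.Identical` can stand in for correspondence, and clause 2 (selectable
fields reached through embedding) is not checked at all.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// Hypothetical struct types; Old plays the role of the old version and the
// others are candidate new versions.
const src = `package p

type Old struct{ X int }
type Wider struct{ X, Y int }
type WithSlice struct {
	X int
	s []int
}
type Renamed struct{ Y int }
`

// check parses and type-checks a single-file, import-free package.
func check(path, src string) (*types.Package, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, path+".go", src, 0)
	if err != nil {
		return nil, err
	}
	var conf types.Config
	return conf.Check(path, fset, []*ast.File{f}, nil)
}

// structOf returns the struct underlying the named package-level type.
func structOf(pkg *types.Package, name string) *types.Struct {
	return pkg.Scope().Lookup(name).Type().Underlying().(*types.Struct)
}

// structCompatible checks clause 1 (top-level exported fields are kept) and
// clause 3 (comparability is kept). types.Identical is used here in place
// of correspondence, which is only valid because both structs come from
// the same type-checked package.
func structCompatible(oldS, newS *types.Struct) bool {
	newFields := make(map[string]types.Type)
	for i := 0; i < newS.NumFields(); i++ {
		if f := newS.Field(i); f.Exported() {
			newFields[f.Name()] = f.Type()
		}
	}
	for i := 0; i < oldS.NumFields(); i++ {
		f := oldS.Field(i)
		if !f.Exported() {
			continue
		}
		t, ok := newFields[f.Name()]
		if !ok || !types.Identical(f.Type(), t) {
			return false // clause 1 violated
		}
	}
	if types.Comparable(oldS) && !types.Comparable(newS) {
		return false // clause 3 violated
	}
	return true
}

func main() {
	pkg, err := check("p", src)
	if err != nil {
		panic(err)
	}
	old := structOf(pkg, "Old")
	for _, name := range []string{"Wider", "WithSlice", "Renamed"} {
		fmt.Printf("Old -> %s: %v\n", name, structCompatible(old, structOf(pkg, name)))
	}
}
```

In this sketch, `Wider` passes because it only adds a field, `WithSlice` fails
because the unexported slice field makes the struct non-comparable, and
`Renamed` fails because the exported field `X` is gone.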

Other than assignment, there are only four ways to use a struct: write a struct
literal, select a field, use a value of the struct as a map key, or compare two
values for equality. The first clause ensures that struct literals compile; the
second, that selections compile; and the third, that equality expressions and
map index expressions compile.

#### Numeric Types

> A new numeric type is compatible with an old one if and only if they are
> both unsigned integers, both signed integers, both floats or both complex
> types, and the new one is at least as large as the old on both 32-bit and
> 64-bit architectures.

Other than in assignments, numeric types appear in arithmetic and comparison
expressions. Since all arithmetic operations but shifts (see below) require that
operand types be identical, and by assumption the old and new types underlie
defined types (see "Compatibility, Types and Names," below), there is no way for
client code to write an arithmetic expression that compiles with operands of the
old type but not the new.

Numeric types can also appear in type switches and type assertions. Again, since
the old and new types underlie defined types, type switches and type assertions
that compiled using the old defined type will continue to compile with the new
defined type.

Going from an unsigned to a signed integer type is an incompatible change for
the sole reason that only an unsigned type can appear as the right operand of a
shift. If this rule is relaxed, then changes from an unsigned type to a larger
signed type would be compatible. See [this
issue](https://github.com/golang/go/issues/19113).

Only integer types can be used in bitwise and shift operations, and for indexing
slices and arrays.
That is why switching from an integer to a floating-point
type--even one that can represent all values of the integer type--is an
incompatible change.


Conversions from floating-point to complex types or vice versa are not permitted
(the predeclared functions `real`, `imag`, and `complex` must be used instead). To
prevent valid floating-point or complex conversions from becoming invalid,
changing a floating-point type to a complex type or vice versa is considered an
incompatible change.

Although conversions between any two integer types are valid, assigning a
constant value to a variable of integer type that is too small to represent the
constant is not permitted. That is why the only compatible changes are to
a new type whose values are a superset of the old. The requirement that the new
set of values must include the old on both 32-bit and 64-bit machines allows
changes from `int32` to `int` and from `int` to `int64`, but not the other
direction; and similarly for `uint`.

Changing a type to or from `uintptr` is considered an incompatible change. Since
its size is not specified, there is no way to know whether the new type's values
are a superset of the old type's.

## Whole-Package Compatibility

Some changes that are compatible for a single type are not compatible when the
package is considered as a whole. For example, if you remove an unexported
method on a defined type, it may no longer implement an interface of the
package. This can break client code:

```
// old
type T int
func (T) m() {}
type I interface { m() }

// new
type T int // no method m anymore
type I interface { m() }

// client
var i pkg.I = pkg.T(0) // fails with new because T lacks m
```

Similarly, adding a method to an interface can cause defined types
in the package to stop implementing it.
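
This kind of break can be detected mechanically by asking the type checker
whether a type still implements an interface. The sketch below uses
`types.Implements` from the standard `go/types` package on the example above;
the helper names (`check`, `implements`) are made up for illustration, and the
code looks at one hard-coded type and interface rather than every exposed pair
as a real check would need to.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// The old and new versions of the package from the example above.
const oldSrc = `package p

type T int
func (T) m() {}
type I interface{ m() }
`

const newSrc = `package p

type T int // no method m anymore
type I interface{ m() }
`

// check parses and type-checks a single-file, import-free package.
func check(path, src string) (*types.Package, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, path+".go", src, 0)
	if err != nil {
		return nil, err
	}
	var conf types.Config
	return conf.Check(path, fset, []*ast.File{f}, nil)
}

// implements reports whether the package-level type named typ implements
// the package-level interface named iface. It panics if either name is
// missing or iface is not an interface, which is acceptable for a sketch.
func implements(pkg *types.Package, typ, iface string) bool {
	t := pkg.Scope().Lookup(typ).Type()
	i := pkg.Scope().Lookup(iface).Type().Underlying().(*types.Interface)
	return types.Implements(t, i)
}

func main() {
	oldPkg, err := check("p", oldSrc)
	if err != nil {
		panic(err)
	}
	newPkg, err := check("p", newSrc)
	if err != nil {
		panic(err)
	}
	wasOK := implements(oldPkg, "T", "I")
	isOK := implements(newPkg, "T", "I")
	fmt.Printf("old: T implements I = %v\n", wasOK)
	fmt.Printf("new: T implements I = %v\n", isOK)
	if wasOK && !isOK {
		fmt.Println("incompatible: T no longer implements I")
	}
}
```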

The second clause in the definition for package compatibility handles these
cases. To repeat:
> 2. For every exposed type that implements an exposed interface in the old package,
> its corresponding type should implement the corresponding interface in the new package.

Recall that a type is exposed if it is part of the package's API, even if it is
unexported.

Other incompatibilities that involve more than one type in the package can arise
whenever two types with identical underlying types exist in the old or new
package. Here, a change "splits" an identical underlying type into two, breaking
conversions:

```
// old
type B struct { X int }
type C struct { X int }

// new
type B struct { X int }
type C struct { X, Y int }

// client
var b pkg.B
_ = pkg.C(b) // fails with new: cannot convert B to C
```
Finally, changes that are compatible for the package in which they occur can
break downstream packages. That can happen even if they involve unexported
methods, thanks to embedding.

The definitions given here don't account for these sorts of problems.


## Compatibility, Types and Names

The above definitions state that the only types that can differ compatibly are
defined types and the types that underlie them. Changes to other type literals
are considered incompatible. For instance, it is considered an incompatible
change to add a field to the struct in this variable declaration:

```
var V struct { X int }
```
or this alias definition:
```
type T = struct { X int }
```

We make this choice to keep the definition of compatibility (relatively) simple.
A more precise definition could, for instance, distinguish between

```
func F(struct { X int })
```
where any changes to the struct are incompatible, and

```
func F(struct { X, u int })
```
where adding a field is compatible (since clients cannot write the signature,
and thus cannot assign `F` to a variable of the signature type). The definition
should then also allow other function signature changes that only require
call-site compatibility, like

```
func F(struct { X, u int }, ...int)
```
The result would be a much more complex definition with little benefit, since
the examples in this section rarely arise in practice.