github.com/cosmos/cosmos-sdk@v0.50.10/docs/architecture/adr-054-semver-compatible-modules.md (about) 1 # ADR 054: Semver Compatible SDK Modules 2 3 ## Changelog 4 5 * 2022-04-27: First draft 6 7 ## Status 8 9 DRAFT 10 11 ## Abstract 12 13 In order to move the Cosmos SDK to a system of decoupled semantically versioned 14 modules which can be composed in different combinations (ex. staking v3 with 15 bank v1 and distribution v2), we need to reassess how we organize the API surface 16 of modules to avoid problems with go semantic import versioning and 17 circular dependencies. This ADR explores various approaches we can take to 18 addressing these issues. 19 20 ## Context 21 22 There has been [a fair amount of desire](https://github.com/cosmos/cosmos-sdk/discussions/10162) 23 in the community for semantic versioning in the SDK and there has been significant 24 movement to splitting SDK modules into [standalone go modules](https://github.com/cosmos/cosmos-sdk/issues/11899). 25 Both of these will ideally allow the ecosystem to move faster because we won't 26 be waiting for all dependencies to update synchronously. For instance, we could 27 have 3 versions of the core SDK compatible with the latest 2 releases of 28 CosmWasm as well as 4 different versions of staking . This sort of setup would 29 allow early adopters to aggressively integrate new versions, while allowing 30 more conservative users to be selective about which versions they're ready for. 31 32 In order to achieve this, we need to solve the following problems: 33 34 1. because of the way [go semantic import versioning](https://research.swtch.com/vgo-import) (SIV) 35 works, moving to SIV naively will actually make it harder to achieve these goals 36 2. circular dependencies between modules need to be broken to actually release 37 many modules in the SDK independently 38 3. pernicious minor version incompatibilities introduced through correctly 39 [evolving protobuf schemas](https://developers.google.com/protocol-buffers/docs/proto3#updating) 40 without correct [unknown field filtering](./adr-020-protobuf-transaction-encoding.md#unknown-field-filtering) 41 42 Note that all the following discussion assumes that the proto file versioning and state machine versioning of a module 43 are distinct in that: 44 45 * proto files are maintained in a non-breaking way (using something 46 like [buf breaking](https://docs.buf.build/breaking/overview) 47 to ensure all changes are backwards compatible) 48 * proto file versions get bumped much less frequently, i.e. we might maintain `cosmos.bank.v1` through many versions 49 of the bank module state machine 50 * state machine breaking changes are more common and ideally this is what we'd want to semantically version with 51 go modules, ex. `x/bank/v2`, `x/bank/v3`, etc. 52 53 ### Problem 1: Semantic Import Versioning Compatibility 54 55 Consider we have a module `foo` which defines the following `MsgDoSomething` and that we've released its state 56 machine in go module `example.com/foo`: 57 58 ```protobuf 59 package foo.v1; 60 61 message MsgDoSomething { 62 string sender = 1; 63 uint64 amount = 2; 64 } 65 66 service Msg { 67 DoSomething(MsgDoSomething) returns (MsgDoSomethingResponse); 68 } 69 ``` 70 71 Now consider that we make a revision to this module and add a new `condition` field to `MsgDoSomething` and also 72 add a new validation rule on `amount` requiring it to be non-zero, and that following go semantic versioning we 73 release the next state machine version of `foo` as `example.com/foo/v2`. 74 75 ```protobuf 76 // Revision 1 77 package foo.v1; 78 79 message MsgDoSomething { 80 string sender = 1; 81 82 // amount must be a non-zero integer. 83 uint64 amount = 2; 84 85 // condition is an optional condition on doing the thing. 86 // 87 // Since: Revision 1 88 Condition condition = 3; 89 } 90 ``` 91 92 Approaching this naively, we would generate the protobuf types for the initial 93 version of `foo` in `example.com/foo/types` and we would generate the protobuf 94 types for the second version in `example.com/foo/v2/types`. 95 96 Now let's say we have a module `bar` which talks to `foo` using this keeper 97 interface which `foo` provides: 98 99 ```go 100 type FooKeeper interface { 101 DoSomething(MsgDoSomething) error 102 } 103 ``` 104 105 #### Scenario A: Backward Compatibility: Newer Foo, Older Bar 106 107 Imagine we have a chain which uses both `foo` and `bar` and wants to upgrade to 108 `foo/v2`, but the `bar` module has not upgraded to `foo/v2`. 109 110 In this case, the chain will not be able to upgrade to `foo/v2` until `bar` 111 has upgraded its references to `example.com/foo/types.MsgDoSomething` to 112 `example.com/foo/v2/types.MsgDoSomething`. 113 114 Even if `bar`'s usage of `MsgDoSomething` has not changed at all, the upgrade 115 will be impossible without this change because `example.com/foo/types.MsgDoSomething` 116 and `example.com/foo/v2/types.MsgDoSomething` are fundamentally different 117 incompatible structs in the go type system. 118 119 #### Scenario B: Forward Compatibility: Older Foo, Newer Bar 120 121 Now let's consider the reverse scenario, where `bar` upgrades to `foo/v2` 122 by changing the `MsgDoSomething` reference to `example.com/foo/v2/types.MsgDoSomething` 123 and releases that as `bar/v2` with some other changes that a chain wants. 124 The chain, however, has decided that it thinks the changes in `foo/v2` are too 125 risky and that it'd prefer to stay on the initial version of `foo`. 126 127 In this scenario, it is impossible to upgrade to `bar/v2` without upgrading 128 to `foo/v2` even if `bar/v2` would have worked 100% fine with `foo` other 129 than changing the import path to `MsgDoSomething` (meaning that `bar/v2` 130 doesn't actually use any new features of `foo/v2`). 131 132 Now because of the way go semantic import versioning works, we are locked 133 into either using `foo` and `bar` OR `foo/v2` and `bar/v2`. We cannot have 134 `foo` + `bar/v2` OR `foo/v2` + `bar`. The go type system doesn't allow this 135 even if both versions of these modules are otherwise compatible with each 136 other. 137 138 #### Naive Mitigation 139 140 A naive approach to fixing this would be to not regenerate the protobuf types 141 in `example.com/foo/v2/types` but instead just update `example.com/foo/types` 142 to reflect the changes needed for `v2` (adding `condition` and requiring 143 `amount` to be non-zero). Then we could release a patch of `example.com/foo/types` 144 with this update and use that for `foo/v2`. But this change is state machine 145 breaking for `v1`. It requires changing the `ValidateBasic` method to reject 146 the case where `amount` is zero, and it adds the `condition` field which 147 should be rejected based 148 on [ADR 020 unknown field filtering](./adr-020-protobuf-transaction-encoding.md#unknown-field-filtering). 149 So adding these changes as a patch on `v1` is actually incorrect based on semantic 150 versioning. Chains that want to stay on `v1` of `foo` should not 151 be importing these changes because they are incorrect for `v1.` 152 153 ### Problem 2: Circular dependencies 154 155 None of the above approaches allow `foo` and `bar` to be separate modules 156 if for some reason `foo` and `bar` depend on each other in different ways. 157 For instance, we can't have `foo` import `bar/types` while `bar` imports 158 `foo/types`. 159 160 We have several cases of circular module dependencies in the SDK 161 (ex. staking, distribution and slashing) that are legitimate from a state machine 162 perspective. Without separating the API types out somehow, there would be 163 no way to independently semantically version these modules without some other 164 mitigation. 165 166 ### Problem 3: Handling Minor Version Incompatibilities 167 168 Imagine that we solve the first two problems but now have a scenario where 169 `bar/v2` wants the option to use `MsgDoSomething.condition` which only `foo/v2` 170 supports. If `bar/v2` works with `foo` `v1` and sets `condition` to some non-nil 171 value, then `foo` will silently ignore this field resulting in a silent logic 172 possibly dangerous logic error. If `bar/v2` were able to check whether `foo` was 173 on `v1` or `v2` and dynamically, it could choose to only use `condition` when 174 `foo/v2` is available. Even if `bar/v2` were able to perform this check, however, 175 how do we know that it is always performing the check properly. Without 176 some sort of 177 framework-level [unknown field filtering](./adr-020-protobuf-transaction-encoding.md#unknown-field-filtering), 178 it is hard to know whether these pernicious hard to detect bugs are getting into 179 our app and a client-server layer such as [ADR 033: Inter-Module Communication](./adr-033-protobuf-inter-module-comm.md) 180 may be needed to do this. 181 182 ## Solutions 183 184 ### Approach A) Separate API and State Machine Modules 185 186 One solution (first proposed in https://github.com/cosmos/cosmos-sdk/discussions/10582) is to isolate all protobuf 187 generated code into a separate module 188 from the state machine module. This would mean that we could have state machine 189 go modules `foo` and `foo/v2` which could use a types or API go module say 190 `foo/api`. This `foo/api` go module would be perpetually on `v1.x` and only 191 accept non-breaking changes. This would then allow other modules to be 192 compatible with either `foo` or `foo/v2` as long as the inter-module API only 193 depends on the types in `foo/api`. It would also allow modules `foo` and `bar` 194 to depend on each other in that both of them could depend on `foo/api` and 195 `bar/api` without `foo` directly depending on `bar` and vice versa. 196 197 This is similar to the naive mitigation described above except that it separates 198 the types into separate go modules which in and of itself could be used to 199 break circular module dependencies. It has the same problems as the naive solution, 200 otherwise, which we could rectify by: 201 202 1. removing all state machine breaking code from the API module (ex. `ValidateBasic` and any other interface methods) 203 2. embedding the correct file descriptors for unknown field filtering in the binary 204 205 #### Migrate all interface methods on API types to handlers 206 207 To solve 1), we need to remove all interface implementations from generated 208 types and instead use a handler approach which essentially means that given 209 a type `X`, we have some sort of resolver which allows us to resolve interface 210 implementations for that type (ex. `sdk.Msg` or `authz.Authorization`). For 211 example: 212 213 ```go 214 func (k Keeper) DoSomething(msg MsgDoSomething) error { 215 var validateBasicHandler ValidateBasicHandler 216 err := k.resolver.Resolve(&validateBasic, msg) 217 if err != nil { 218 return err 219 } 220 221 err = validateBasicHandler.ValidateBasic() 222 ... 223 } 224 ``` 225 226 In the case of some methods on `sdk.Msg`, we could replace them with declarative 227 annotations. For instance, `GetSigners` can already be replaced by the protobuf 228 annotation `cosmos.msg.v1.signer`. In the future, we may consider some sort 229 of protobuf validation framework (like https://github.com/bufbuild/protoc-gen-validate 230 but more Cosmos-specific) to replace `ValidateBasic`. 231 232 #### Pinned FileDescriptor's 233 234 To solve 2), state machine modules must be able to specify what the version of 235 the protobuf files was that they were built against. For instance if the API 236 module for `foo` upgrades to `foo/v2`, the original `foo` module still needs 237 a copy of the original protobuf files it was built with so that ADR 020 238 unknown field filtering will reject `MsgDoSomething` when `condition` is 239 set. 240 241 The simplest way to do this may be to embed the protobuf `FileDescriptor`s into 242 the module itself so that these `FileDescriptor`s are used at runtime rather 243 than the ones that are built into the `foo/api` which may be different. Using 244 [buf build](https://docs.buf.build/build/usage#output-format), [go embed](https://pkg.go.dev/embed), 245 and a build script we can probably come up with a solution for embedding 246 `FileDescriptor`s into modules that is fairly straightforward. 247 248 #### Potential limitations to generated code 249 250 One challenge with this approach is that it places heavy restrictions on what 251 can go in API modules and requires that most of this is state machine breaking. 252 All or most of the code in the API module would be generated from protobuf 253 files, so we can probably control this with how code generation is done, but 254 it is a risk to be aware of. 255 256 For instance, we do code generation for the ORM that in the future could 257 contain optimizations that are state machine breaking. We 258 would either need to ensure very carefully that the optimizations aren't 259 actually state machine breaking in generated code or separate this generated code 260 out from the API module into the state machine module. Both of these mitigations 261 are potentially viable but the API module approach does require an extra level 262 of care to avoid these sorts of issues. 263 264 #### Minor Version Incompatibilities 265 266 This approach in and of itself does little to address any potential minor 267 version incompatibilities and the 268 requisite [unknown field filtering](./adr-020-protobuf-transaction-encoding.md#unknown-field-filtering). 269 Likely some sort of client-server routing layer which does this check such as 270 [ADR 033: Inter-Module communication](./adr-033-protobuf-inter-module-comm.md) 271 is required to make sure that this is done properly. We could then allow 272 modules to perform a runtime check given a `MsgClient`, ex: 273 274 ```go 275 func (k Keeper) CallFoo() error { 276 if k.interModuleClient.MinorRevision(k.fooMsgClient) >= 2 { 277 k.fooMsgClient.DoSomething(&MsgDoSomething{Condition: ...}) 278 } else { 279 ... 280 } 281 } 282 ``` 283 284 To do the unknown field filtering itself, the ADR 033 router would need to use 285 the [protoreflect API](https://pkg.go.dev/google.golang.org/protobuf/reflect/protoreflect) 286 to ensure that no fields unknown to the receiving module are set. This could 287 result in an undesirable performance hit depending on how complex this logic is. 288 289 ### Approach B) Changes to Generated Code 290 291 An alternate approach to solving the versioning problem is to change how protobuf code is generated and move modules 292 mostly or completely in the direction of inter-module communication as described 293 in [ADR 033](./adr-033-protobuf-inter-module-comm.md). 294 In this paradigm, a module could generate all the types it needs internally - including the API types of other modules - 295 and talk to other modules via a client-server boundary. For instance, if `bar` needs to talk to `foo`, it could 296 generate its own version of `MsgDoSomething` as `bar/internal/foo/v1.MsgDoSomething` and just pass this to the 297 inter-module router which would somehow convert it to the version which foo needs (ex. `foo/internal.MsgDoSomething`). 298 299 Currently, two generated structs for the same protobuf type cannot exist in the same go binary without special 300 build flags (see https://developers.google.com/protocol-buffers/docs/reference/go/faq#fix-namespace-conflict). 301 A relatively simple mitigation to this issue would be to set up the protobuf code to not register protobuf types 302 globally if they are generated in an `internal/` package. This will require modules to register their types manually 303 with the app-level level protobuf registry, this is similar to what modules already do with the `InterfaceRegistry` 304 and amino codec. 305 306 If modules _only_ do ADR 033 message passing then a naive and non-performant solution for 307 converting `bar/internal/foo/v1.MsgDoSomething` 308 to `foo/internal.MsgDoSomething` would be marshaling and unmarshaling in the ADR 033 router. This would break down if 309 we needed to expose protobuf types in `Keeper` interfaces because the whole point is to try to keep these types 310 `internal/` so that we don't end up with all the import version incompatibilities we've described above. However, 311 because of the issue with minor version incompatibilities and the need 312 for [unknown field filtering](./adr-020-protobuf-transaction-encoding.md#unknown-field-filtering), 313 sticking with the `Keeper` paradigm instead of ADR 033 may be unviable to begin with. 314 315 A more performant solution (that could maybe be adapted to work with `Keeper` interfaces) would be to only expose 316 getters and setters for generated types and internally store data in memory buffers which could be passed from 317 one implementation to another in a zero-copy way. 318 319 For example, imagine this protobuf API with only getters and setters is exposed for `MsgSend`: 320 321 ```go 322 type MsgSend interface { 323 proto.Message 324 GetFromAddress() string 325 GetToAddress() string 326 GetAmount() []v1beta1.Coin 327 SetFromAddress(string) 328 SetToAddress(string) 329 SetAmount([]v1beta1.Coin) 330 } 331 332 func NewMsgSend() MsgSend { return &msgSendImpl{memoryBuffers: ...} } 333 ``` 334 335 Under the hood, `MsgSend` could be implemented based on some raw memory buffer in the same way 336 that [Cap'n Proto](https://capnproto.org) 337 and [FlatBuffers](https://google.github.io/flatbuffers/) so that we could convert between one version of `MsgSend` 338 and another without serialization (i.e. zero-copy). This approach would have the added benefits of allowing zero-copy 339 message passing to modules written in other languages such as Rust and accessed through a VM or FFI. It could also make 340 unknown field filtering in inter-module communication simpler if we require that all new fields are added in sequential 341 order, ex. just checking that no field `> 5` is set. 342 343 Also, we wouldn't have any issues with state machine breaking code on generated types because all the generated 344 code used in the state machine would actually live in the state machine module itself. Depending on how interface 345 types and protobuf `Any`s are used in other languages, however, it may still be desirable to take the handler 346 approach described in approach A. Either way, types implementing interfaces would still need to be registered 347 with an `InterfaceRegistry` as they are now because there would be no way to retrieve them via the global registry. 348 349 In order to simplify access to other modules using ADR 033, a public API module (maybe even one 350 [remotely generated by Buf](https://docs.buf.build/bsr/remote-generation/go)) could be used by client modules instead 351 of requiring to generate all client types internally. 352 353 The big downsides of this approach are that it requires big changes to how people use protobuf types and would be a 354 substantial rewrite of the protobuf code generator. This new generated code, however, could still be made compatible 355 with 356 the [`google.golang.org/protobuf/reflect/protoreflect`](https://pkg.go.dev/google.golang.org/protobuf/reflect/protoreflect) 357 API in order to work with all standard golang protobuf tooling. 358 359 It is possible that the naive approach of marshaling/unmarshaling in the ADR 033 router is an acceptable intermediate 360 solution if the changes to the code generator are seen as too complex. However, since all modules would likely need 361 to migrate to ADR 033 anyway with this approach, it might be better to do this all at once. 362 363 ### Approach C) Don't address these issues 364 365 If the above solutions are seen as too complex, we can also decide not to do anything explicit to enable better module 366 version compatibility, and break circular dependencies. 367 368 In this case, when developers are confronted with the issues described above they can require dependencies to update in 369 sync (what we do now) or attempt some ad-hoc potentially hacky solution. 370 371 One approach is to ditch go semantic import versioning (SIV) altogether. Some people have commented that go's SIV 372 (i.e. changing the import path to `foo/v2`, `foo/v3`, etc.) is too restrictive and that it should be optional. The 373 golang maintainers disagree and only officially support semantic import versioning. We could, however, take the 374 contrarian perspective and get more flexibility by using 0.x-based versioning basically forever. 375 376 Module version compatibility could then be achieved using go.mod replace directives to pin dependencies to specific 377 compatible 0.x versions. For instance if we knew `foo` 0.2 and 0.3 were both compatible with `bar` 0.3 and 0.4, we 378 could use replace directives in our go.mod to stick to the versions of `foo` and `bar` we want. This would work as 379 long as the authors of `foo` and `bar` avoid incompatible breaking changes between these modules. 380 381 Or, if developers choose to use semantic import versioning, they can attempt the naive solution described above 382 and would also need to use special tags and replace directives to make sure that modules are pinned to the correct 383 versions. 384 385 Note, however, that all of these ad-hoc approaches, would be vulnerable to the minor version compatibility issues 386 described above unless [unknown field filtering](./adr-020-protobuf-transaction-encoding.md#unknown-field-filtering) 387 is properly addressed. 388 389 ### Approach D) Avoid protobuf generated code in public APIs 390 391 An alternative approach would be to avoid protobuf generated code in public module APIs. This would help avoid the 392 discrepancy between state machine versions and client API versions at the module to module boundaries. It would mean 393 that we wouldn't do inter-module message passing based on ADR 033, but rather stick to the existing keeper approach 394 and take it one step further by avoiding any protobuf generated code in the keeper interface methods. 395 396 Using this approach, our `foo.Keeper.DoSomething` method wouldn't have the generated `MsgDoSomething` struct (which 397 comes from the protobuf API), but instead positional parameters. Then in order for `foo/v2` to support the `foo/v1` 398 keeper it would simply need to implement both the v1 and v2 keeper APIs. The `DoSomething` method in v2 could have the 399 additional `condition` parameter, but this wouldn't be present in v1 at all so there would be no danger of a client 400 accidentally setting this when it isn't available. 401 402 So this approach would avoid the challenge around minor version incompatibilities because the existing module keeper 403 API would not get new fields when they are added to protobuf files. 404 405 Taking this approach, however, would likely require making all protobuf generated code internal in order to prevent 406 it from leaking into the keeper API. This means we would still need to modify the protobuf code generator to not 407 register `internal/` code with the global registry, and we would still need to manually register protobuf 408 `FileDescriptor`s (this is probably true in all scenarios). It may, however, be possible to avoid needing to refactor 409 interface methods on generated types to handlers. 410 411 Also, this approach doesn't address what would be done in scenarios where modules still want to use the message router. 412 Either way, we probably still want a way to pass messages from one module to another router safely even if it's just for 413 use cases like `x/gov`, `x/authz`, CosmWasm, etc. That would still require most of the things outlined in approach (B), 414 although we could advise modules to prefer keepers for communicating with other modules. 415 416 The biggest downside of this approach is probably that it requires a strict refactoring of keeper interfaces to avoid 417 generated code leaking into the API. This may result in cases where we need to duplicate types that are already defined 418 in proto files and then write methods for converting between the golang and protobuf version. This may end up in a lot 419 of unnecessary boilerplate and that may discourage modules from actually adopting it and achieving effective version 420 compatibility. Approaches (A) and (B), although heavy handed initially, aim to provide a system which once adopted 421 more or less gives the developer version compatibility for free with minimal boilerplate. Approach (D) may not be able 422 to provide such a straightforward system since it requires a golang API to be defined alongside a protobuf API in a 423 way that requires duplication and differing sets of design principles (protobuf APIs encourage additive changes 424 while golang APIs would forbid it). 425 426 Other downsides to this approach are: 427 * no clear roadmap to supporting modules in other languages like Rust 428 * doesn't get us any closer to proper object capability security (one of the goals of ADR 033) 429 * ADR 033 needs to be done properly anyway for the set of use cases which do need it 430 431 ## Decision 432 433 The latest **DRAFT** proposal is: 434 435 1. we are alignment on adopting [ADR 033](./adr-033-protobuf-inter-module-comm.md) not just as an addition to the 436 framework, but as a core replacement to the keeper paradigm entirely. 437 2. the ADR 033 inter-module router will accommodate any variation of approach (A) or (B) given the following rules: 438 a. if the client type is the same as the server type then pass it directly through, 439 b. if both client and server use the zero-copy generated code wrappers (which still need to be defined), then pass 440 the memory buffers from one wrapper to the other, or 441 c. marshal/unmarshal types between client and server. 442 443 This approach will allow for both maximal correctness and enable a clear path to enabling modules within in other 444 languages, possibly executed within a WASM VM. 445 446 ### Minor API Revisions 447 448 To declare minor API revisions of proto files, we propose the following guidelines (which were already documented 449 in [cosmos.app.v1alpha module options](../proto/cosmos/app/v1alpha1/module.proto)): 450 * proto packages which are revised from their initial version (considered revision `0`) should include a `package` 451 * comment in some .proto file containing the test `Revision N` at the start of a comment line where `N` is the current 452 revision number. 453 * all fields, messages, etc. added in a version beyond the initial revision should add a comment at the start of a 454 comment line of the form `Since: Revision N` where `N` is the non-zero revision it was added. 455 456 It is advised that there is a 1:1 correspondence between a state machine module and versioned set of proto files 457 which are versioned either as a buf module a go API module or both. If the buf schema registry is used, the version of 458 this buf module should always be `1.N` where `N` corresponds to the package revision. Patch releases should be used when 459 only documentation comments are updated. It is okay to include proto packages named `v2`, `v3`, etc. in this same 460 `1.N` versioned buf module (ex. `cosmos.bank.v2`) as long as all these proto packages consist of a single API intended 461 to be served by a single SDK module. 462 463 ### Introspecting Minor API Revisions 464 465 In order for modules to introspect the minor API revision of peer modules, we propose adding the following method 466 to `cosmossdk.io/core/intermodule.Client`: 467 468 ```go 469 ServiceRevision(ctx context.Context, serviceName string) uint64 470 ``` 471 472 Modules could all this using the service name statically generated by the go grpc code generator: 473 474 ```go 475 intermoduleClient.ServiceRevision(ctx, bankv1beta1.Msg_ServiceDesc.ServiceName) 476 ``` 477 478 In the future, we may decide to extend the code generator used for protobuf services to add a field 479 to client types which does this check more concisely, ex: 480 481 ```go 482 package bankv1beta1 483 484 type MsgClient interface { 485 Send(context.Context, MsgSend) (MsgSendResponse, error) 486 ServiceRevision(context.Context) uint64 487 } 488 ``` 489 490 ### Unknown Field Filtering 491 492 To correctly perform [unknown field filtering](./adr-020-protobuf-transaction-encoding.md#unknown-field-filtering), 493 the inter-module router can do one of the following: 494 495 * use the `protoreflect` API for messages which support that 496 * for gogo proto messages, marshal and use the existing `codec/unknownproto` code 497 * for zero-copy messages, do a simple check on the highest set field number (assuming we can require that fields are 498 adding consecutively in increasing order) 499 500 ### `FileDescriptor` Registration 501 502 Because a single go binary may contain different versions of the same generated protobuf code, we cannot rely on the 503 global protobuf registry to contain the correct `FileDescriptor`s. Because `appconfig` module configuration is itself 504 written in protobuf, we would like to load the `FileDescriptor`s for a module before loading a module itself. So we 505 will provide ways to register `FileDescriptor`s at module registration time before instantiation. We propose the 506 following `cosmossdk.io/core/appmodule.Option` constructors for the various cases of how `FileDescriptor`s may be 507 packaged: 508 509 ```go 510 package appmodule 511 512 // this can be used when we are using google.golang.org/protobuf compatible generated code 513 // Ex: 514 // ProtoFiles(bankv1beta1.File_cosmos_bank_v1beta1_module_proto) 515 func ProtoFiles(file []protoreflect.FileDescriptor) Option {} 516 517 // this can be used when we are using gogo proto generated code. 518 func GzippedProtoFiles(file [][]byte) Option {} 519 520 // this can be used when we are using buf build to generated a pinned file descriptor 521 func ProtoImage(protoImage []byte) Option {} 522 ``` 523 524 This approach allows us to support several ways protobuf files might be generated: 525 * proto files generated internally to a module (use `ProtoFiles`) 526 * the API module approach with pinned file descriptors (use `ProtoImage`) 527 * gogo proto (use `GzippedProtoFiles`) 528 529 ### Module Dependency Declaration 530 531 One risk of ADR 033 is that dependencies are called at runtime which are not present in the loaded set of SDK modules. 532 Also we want modules to have a way to define a minimum dependency API revision that they require. Therefore, all 533 modules should declare their set of dependencies upfront. These dependencies could be defined when a module is 534 instantiated, but ideally we know what the dependencies are before instantiation and can statically look at an app 535 config and determine whether the set of modules. For example, if `bar` requires `foo` revision `>= 1`, then we 536 should be able to know this when creating an app config with two versions of `bar` and `foo`. 537 538 We propose defining these dependencies in the proto options of the module config object itself. 539 540 ### Interface Registration 541 542 We will also need to define how interface methods are defined on types that are serialized as `google.protobuf.Any`'s. 543 In light of the desire to support modules in other languages, we may want to think of solutions that will accommodate 544 other languages such as plugins described briefly in [ADR 033](./adr-033-protobuf-inter-module-comm.md#internal-methods). 545 546 ### Testing 547 548 In order to ensure that modules are indeed with multiple versions of their dependencies, we plan to provide specialized 549 unit and integration testing infrastructure that automatically tests multiple versions of dependencies. 550 551 #### Unit Testing 552 553 Unit tests should be conducted inside SDK modules by mocking their dependencies. In a full ADR 033 scenario, 554 this means that all interaction with other modules is done via the inter-module router, so mocking of dependencies 555 means mocking their msg and query server implementations. We will provide both a test runner and fixture to make this 556 streamlined. The key thing that the test runner should do to test compatibility is to test all combinations of 557 dependency API revisions. This can be done by taking the file descriptors for the dependencies, parsing their comments 558 to determine the revisions various elements were added, and then created synthetic file descriptors for each revision 559 by subtracting elements that were added later. 560 561 Here is a proposed API for the unit test runner and fixture: 562 563 ```go 564 package moduletesting 565 566 import ( 567 "context" 568 "testing" 569 570 "cosmossdk.io/core/intermodule" 571 "cosmossdk.io/depinject" 572 "google.golang.org/grpc" 573 "google.golang.org/protobuf/proto" 574 "google.golang.org/protobuf/reflect/protodesc" 575 ) 576 577 type TestFixture interface { 578 context.Context 579 intermodule.Client // for making calls to the module we're testing 580 BeginBlock() 581 EndBlock() 582 } 583 584 type UnitTestFixture interface { 585 TestFixture 586 grpc.ServiceRegistrar // for registering mock service implementations 587 } 588 589 type UnitTestConfig struct { 590 ModuleConfig proto.Message // the module's config object 591 DepinjectConfig depinject.Config // optional additional depinject config options 592 DependencyFileDescriptors []protodesc.FileDescriptorProto // optional dependency file descriptors to use instead of the global registry 593 } 594 595 // Run runs the test function for all combinations of dependency API revisions. 596 func (cfg UnitTestConfig) Run(t *testing.T, f func(t *testing.T, f UnitTestFixture)) { 597 // ... 598 } 599 ``` 600 601 Here is an example for testing bar calling foo which takes advantage of conditional service revisions in the expected 602 mock arguments: 603 604 ```go 605 func TestBar(t *testing.T) { 606 UnitTestConfig{ModuleConfig: &foomodulev1.Module{}}.Run(t, func (t *testing.T, f moduletesting.UnitTestFixture) { 607 ctrl := gomock.NewController(t) 608 mockFooMsgServer := footestutil.NewMockMsgServer() 609 foov1.RegisterMsgServer(f, mockFooMsgServer) 610 barMsgClient := barv1.NewMsgClient(f) 611 if f.ServiceRevision(foov1.Msg_ServiceDesc.ServiceName) >= 1 { 612 mockFooMsgServer.EXPECT().DoSomething(gomock.Any(), &foov1.MsgDoSomething{ 613 ..., 614 Condition: ..., // condition is expected in revision >= 1 615 }).Return(&foov1.MsgDoSomethingResponse{}, nil) 616 } else { 617 mockFooMsgServer.EXPECT().DoSomething(gomock.Any(), &foov1.MsgDoSomething{...}).Return(&foov1.MsgDoSomethingResponse{}, nil) 618 } 619 res, err := barMsgClient.CallFoo(f, &MsgCallFoo{}) 620 ... 621 }) 622 } 623 ``` 624 625 The unit test runner would make sure that no dependency mocks return arguments which are invalid for the service 626 revision being tested to ensure that modules don't incorrectly depend on functionality not present in a given revision. 627 628 #### Integration Testing 629 630 An integration test runner and fixture would also be provided which instead of using mocks would test actual module 631 dependencies in various combinations. Here is the proposed API: 632 633 ```go 634 type IntegrationTestFixture interface { 635 TestFixture 636 } 637 638 type IntegrationTestConfig struct { 639 ModuleConfig proto.Message // the module's config object 640 DependencyMatrix map[string][]proto.Message // all the dependent module configs 641 } 642 643 // Run runs the test function for all combinations of dependency modules. 644 func (cfg IntegationTestConfig) Run(t *testing.T, f func (t *testing.T, f IntegrationTestFixture)) { 645 // ... 646 } 647 ``` 648 649 And here is an example with foo and bar: 650 651 ```go 652 func TestBarIntegration(t *testing.T) { 653 IntegrationTestConfig{ 654 ModuleConfig: &barmodulev1.Module{}, 655 DependencyMatrix: map[string][]proto.Message{ 656 "runtime": []proto.Message{ // test against two versions of runtime 657 &runtimev1.Module{}, 658 &runtimev2.Module{}, 659 }, 660 "foo": []proto.Message{ // test against three versions of foo 661 &foomodulev1.Module{}, 662 &foomodulev2.Module{}, 663 &foomodulev3.Module{}, 664 } 665 } 666 }.Run(t, func (t *testing.T, f moduletesting.IntegrationTestFixture) { 667 barMsgClient := barv1.NewMsgClient(f) 668 res, err := barMsgClient.CallFoo(f, &MsgCallFoo{}) 669 ... 670 }) 671 } 672 ``` 673 674 Unlike unit tests, integration tests actually pull in other module dependencies. So that modules can be written 675 without direct dependencies on other modules and because golang has no concept of development dependencies, integration 676 tests should be written in separate go modules, ex. `example.com/bar/v2/test`. Because this paradigm uses go semantic 677 versioning, it is possible to build a single go module which imports 3 versions of bar and 2 versions of runtime and 678 can test these all together in the six various combinations of dependencies. 679 680 ## Consequences 681 682 ### Backwards Compatibility 683 684 Modules which migrate fully to ADR 033 will not be compatible with existing modules which use the keeper paradigm. 685 As a temporary workaround we may create some wrapper types that emulate the current keeper interface to minimize 686 the migration overhead. 687 688 ### Positive 689 690 * we will be able to deliver interoperable semantically versioned modules which should dramatically increase the 691 ability of the Cosmos SDK ecosystem to iterate on new features 692 * it will be possible to write Cosmos SDK modules in other languages in the near future 693 694 ### Negative 695 696 * all modules will need to be refactored somewhat dramatically 697 698 ### Neutral 699 700 * the `cosmossdk.io/core/appconfig` framework will play a more central role in terms of how modules are defined, this 701 is likely generally a good thing but does mean additional changes for users wanting to stick to the pre-depinject way 702 of wiring up modules 703 * `depinject` is somewhat less needed or maybe even obviated because of the full ADR 033 approach. If we adopt the 704 core API proposed in https://github.com/cosmos/cosmos-sdk/pull/12239, then a module would probably always instantiate 705 itself with a method `ProvideModule(appmodule.Service) (appmodule.AppModule, error)`. There is no complex wiring of 706 keeper dependencies in this scenario and dependency injection may not have as much of (or any) use case. 707 708 ## Further Discussions 709 710 The decision described above is considered in draft mode and is pending final buy-in from the team and key stakeholders. 711 Key outstanding discussions if we do adopt that direction are: 712 713 * how do module clients introspect dependency module API revisions 714 * how do modules determine a minor dependency module API revision requirement 715 * how do modules appropriately test compatibility with different dependency versions 716 * how to register and resolve interface implementations 717 * how do modules register their protobuf file descriptors depending on the approach they take to generated code (the 718 API module approach may still be viable as a supported strategy and would need pinned file descriptors) 719 720 ## References 721 722 * https://github.com/cosmos/cosmos-sdk/discussions/10162 723 * https://github.com/cosmos/cosmos-sdk/discussions/10582 724 * https://github.com/cosmos/cosmos-sdk/discussions/10368 725 * https://github.com/cosmos/cosmos-sdk/pull/11340 726 * https://github.com/cosmos/cosmos-sdk/issues/11899 727 * [ADR 020](./adr-020-protobuf-transaction-encoding.md) 728 * [ADR 033](./adr-033-protobuf-inter-module-comm.md)