
     1  # ADR 028: Public Key Addresses
     2  
     3  ## Changelog
     4  
     5  * 2020/08/18: Initial version
     6  * 2021/01/15: Analysis and algorithm update
     7  
     8  ## Status
     9  
    10  Proposed
    11  
    12  ## Abstract
    13  
    14  This ADR defines an address format for all addressable Cosmos SDK accounts. That includes: new public key algorithms, multisig public keys, and module accounts.
    15  
    16  ## Context
    17  
Issue [\#3685](https://github.com/cosmos/cosmos-sdk/issues/3685) identified that public key
address spaces are currently overlapping. We confirmed that this significantly decreases the security of the Cosmos SDK.
    20  
    21  ### Problem
    22  
An attacker can control an input for an address generation function. This leads to a birthday attack, which significantly decreases the security space.
To overcome this, we need to separate the inputs for different kinds of account types:
a security break of one account type shouldn't impact the security of other account types.
    26  
    27  ### Initial proposals
    28  
    29  One initial proposal was extending the address length and
    30  adding prefixes for different types of addresses.
    31  
    32  @ethanfrey explained an alternate approach originally used in https://github.com/iov-one/weave:
    33  
    34  > I spent quite a bit of time thinking about this issue while building weave... The other cosmos Sdk.
    35  > Basically I define a condition to be a type and format as human readable string with some binary data appended. This condition is hashed into an Address (again at 20 bytes). The use of this prefix makes it impossible to find a preimage for a given address with a different condition (eg ed25519 vs secp256k1).
    36  > This is explained in depth here https://weave.readthedocs.io/en/latest/design/permissions.html
    37  > And the code is here, look mainly at the top where we process conditions. https://github.com/iov-one/weave/blob/master/conditions.go
    38  
    39  And explained how this approach should be sufficiently collision resistant:
    40  
    41  > Yeah, AFAIK, 20 bytes should be collision resistance when the preimages are unique and not malleable. A space of 2^160 would expect some collision to be likely around 2^80 elements (birthday paradox). And if you want to find a collision for some existing element in the database, it is still 2^160. 2^80 only is if all these elements are written to state.
    42  > The good example you brought up was eg. a public key bytes being a valid public key on two algorithms supported by the codec. Meaning if either was broken, you would break accounts even if they were secured with the safer variant. This is only as the issue when no differentiating type info is present in the preimage (before hashing into an address).
    43  > I would like to hear an argument if the 20 bytes space is an actual issue for security, as I would be happy to increase my address sizes in weave. I just figured cosmos and ethereum and bitcoin all use 20 bytes, it should be good enough. And the arguments above which made me feel it was secure. But I have not done a deeper analysis.
    44  
    45  This led to the first proposal (which we proved to be not good enough):
    46  we concatenate a key type with a public key, hash it and take the first 20 bytes of that hash, summarized as `sha256(keyTypePrefix || keybytes)[:20]`.
    47  
    48  ### Review and Discussions
    49  
    50  In [\#5694](https://github.com/cosmos/cosmos-sdk/issues/5694) we discussed various solutions.
We agreed that 20 bytes is not future-proof, and extending the address length is the only way to allow addresses of different types, various signature types, etc.
    52  This disqualifies the initial proposal.
    53  
    54  In the issue we discussed various modifications:
    55  
    56  * Choice of the hash function.
    57  * Move the prefix out of the hash function: `keyTypePrefix + sha256(keybytes)[:20]` [post-hash-prefix-proposal].
    58  * Use double hashing: `sha256(keyTypePrefix + sha256(keybytes)[:20])`.
* Increase the keybytes hash slice from 20 bytes to 32 or 40 bytes. We concluded that 32 bytes, produced by a good hash function, is future secure.
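
For orientation, the sizes discussed above can be related through the generic birthday bound. This is a standard estimate for an ideal hash function, not a figure taken from the issue discussion: with an $n$-bit output, a collision becomes likely after roughly $2^{n/2}$ hashed inputs.

$$
q_{\text{collision}} \approx \sqrt{2^{n}} = 2^{n/2}, \qquad n = 160 \Rightarrow q \approx 2^{80}, \qquad n = 256 \Rightarrow q \approx 2^{128}
$$

Under that estimate, moving from a 20-byte to a 32-byte hash slice raises the generic collision cost from about $2^{80}$ to about $2^{128}$ hash evaluations.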
    60  
    61  ### Requirements
    62  
    63  * Support currently used tools - we don't want to break an ecosystem, or add a long adaptation period. Ref: https://github.com/cosmos/cosmos-sdk/issues/8041
    64  * Try to keep the address length small - addresses are widely used in state, both as part of a key and object value.
    65  
    66  ### Scope
    67  
    68  This ADR only defines a process for the generation of address bytes. For end-user interactions with addresses (through the API, or CLI, etc.), we still use bech32 to format these addresses as strings. This ADR doesn't change that.
    69  Using Bech32 for string encoding gives us support for checksum error codes and handling of user typos.
    70  
    71  ## Decision
    72  
    73  We define the following account types, for which we define the address function:
    74  
1. simple accounts: represented by a regular public key (e.g. secp256k1, sr25519)
2. naive multisig: accounts composed of other addressable objects (e.g. naive multisig)
3. composed accounts with a native address key (e.g. bls, group module accounts)
    78  4. module accounts: basically any accounts which cannot sign transactions and which are managed internally by modules
    79  
    80  ### Legacy Public Key Addresses Don't Change
    81  
    82  Currently (Jan 2021), the only officially supported Cosmos SDK user accounts are `secp256k1` basic accounts and legacy amino multisig.
    83  They are used in existing Cosmos SDK zones. They use the following address formats:
    84  
    85  * secp256k1: `ripemd160(sha256(pk_bytes))[:20]`
    86  * legacy amino multisig: `sha256(aminoCdc.Marshal(pk))[:20]`
    87  
    88  We don't want to change existing addresses. So the addresses for these two key types will remain the same.
    89  
    90  The current multisig public keys use amino serialization to generate the address. We will retain
    91  those public keys and their address formatting, and call them "legacy amino" multisig public keys
    92  in protobuf. We will also create multisig public keys without amino addresses to be described below.
    93  
    94  ### Hash Function Choice
    95  
    96  As in other parts of the Cosmos SDK, we will use `sha256`.
    97  
    98  ### Basic Address
    99  
   100  We start with defining a base algorithm for generating addresses which we will call `Hash`. Notably, it's used for accounts represented by a single key pair. For each public key schema we have to have an associated `typ` string, explained in the next section. `hash` is the cryptographic hash function defined in the previous section.
   101  
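```go
import "crypto/sha256"

const A_LEN = 32

// Hash computes hash(hash(typ) + key)[:A_LEN], using sha256 as `hash`.
func Hash(typ string, key []byte) []byte {
	typHash := sha256.Sum256([]byte(typ))
	sum := sha256.Sum256(append(typHash[:], key...))
	return sum[:A_LEN]
}
```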
   109  
   110  The `+` is bytes concatenation, which doesn't use any separator.
   111  
   112  This algorithm is the outcome of a consultation session with a professional cryptographer.
   113  Motivation: this algorithm keeps the address relatively small (length of the `typ` doesn't impact the length of the final address)
   114  and it's more secure than [post-hash-prefix-proposal] (which uses the first 20 bytes of a pubkey hash, significantly reducing the address space).
   115  Moreover the cryptographer motivated the choice of adding `typ` in the hash to protect against a switch table attack.
   116  
   117  `address.Hash` is a low level function to generate _base_ addresses for new key types. Example:
   118  
   119  * BLS: `address.Hash("bls", pubkey)`
   120  
   121  ### Composed Addresses
   122  
   123  For simple composed accounts (like a new naive multisig) we generalize the `address.Hash`. The address is constructed by recursively creating addresses for the sub accounts, sorting the addresses and composing them into a single address. It ensures that the ordering of keys doesn't impact the resulting address.
   124  
```go
import (
	"bytes"
	"sort"
)

// We don't need a PubKey interface - we need anything which is addressable.
type Addressable interface {
	Address() []byte
}

func Composed(typ string, subaccounts []Addressable) []byte {
	// length-prefix each sub account address, then sort the results
	addresses := make([][]byte, len(subaccounts))
	for i, a := range subaccounts {
		addresses[i] = LengthPrefix(a.Address())
	}
	sort.Slice(addresses, func(i, j int) bool {
		return bytes.Compare(addresses[i], addresses[j]) < 0
	})
	// addresses[0] + ... + addresses[n]
	return Hash(typ, bytes.Join(addresses, nil))
}
```
   137  
   138  The `typ` parameter should be a schema descriptor, containing all significant attributes with deterministic serialization (eg: utf8 string).
`LengthPrefix` is a function which prepends 1 byte to the address. The value of that byte is the length of the address in bytes before prepending. The address must be at most 255 bytes long.
We are using `LengthPrefix` to eliminate conflicts: it ensures that for 2 lists of addresses `as = {a1, a2, ..., an}` and `bs = {b1, b2, ..., bm}`, such that every `bi` and `ai` is at most 255 bytes long, `concatenate(map(as, (a) => LengthPrefix(a))) = concatenate(map(bs, (b) => LengthPrefix(b)))` if and only if `as = bs`.
   141  
   142  Implementation Tip: account implementations should cache addresses.
   143  
   144  #### Multisig Addresses
   145  
For new multisig public keys, we define the `typ` parameter not based on any encoding scheme (amino or protobuf). This avoids issues with non-determinism in the encoding scheme.
   147  
   148  Example:
   149  
```protobuf
package cosmos.crypto.multisig;

message PubKey {
  uint32 threshold = 1;
  repeated google.protobuf.Any pubkeys = 2;
}
```
   158  
```go
func (multisig PubKey) Address() []byte {
	// first gather all nested pub keys
	var keys []address.Addressable  // cryptotypes.PubKey implements Addressable
	for _, key := range multisig.Pubkeys {
		keys = append(keys, key.GetCachedValue().(cryptotypes.PubKey))
	}

	// form the type from the message name (cosmos.crypto.multisig.PubKey) and the threshold joined together
	prefix := fmt.Sprintf("%s/%d", proto.MessageName(multisig), multisig.Threshold)

	// use the Composed function defined above
	return address.Composed(prefix, keys)
}
```
   174  
   175  
   176  ### Derived Addresses
   177  
   178  We must be able to cryptographically derive one address from another one. The derivation process must guarantee hash properties, hence we use the already defined `Hash` function:
   179  
```go
func Derive(address, derivationKey []byte) []byte {
	// Hash takes the type as a string, so the parent address is converted
	return Hash(string(address), derivationKey)
}
```
   185  
   186  ### Module Account Addresses
   187  
   188  A module account will have `"module"` type. Module accounts can have sub accounts. The submodule account will be created based on module name, and sequence of derivation keys. Typically, the first derivation key should be a class of the derived accounts. The derivation process has a defined order: module name, submodule key, subsubmodule key... An example module account is created using:
   189  
```go
address.Module(moduleName, key)
```
   193  
   194  An example sub-module account is created using:
   195  
```go
groupPolicyAddresses := []byte{1}
address.Module(moduleName, groupPolicyAddresses, policyID)
```
   200  
The `address.Module` function uses `address.Hash` with `"module"` as the type argument, and the byte representation of the module name concatenated with the submodule key as the key argument. The last two components must be uniquely separated to avoid potential clashes (example: modulename="ab" & submodulekey="bc" would produce the same derivation key as modulename="a" & submodulekey="bbc").
We use a null byte (`'\x00'`) to separate the module name from the submodule key. This works because a null byte is not part of a valid module name. Finally, the sub-submodule accounts are created by applying the `Derive` function recursively.
We could also use the `Derive` function in the first step (rather than concatenating the module name with a zero byte and the submodule key). We decided to use concatenation to avoid one level of derivation and speed up computation.
   204  
For backward compatibility with the existing `authtypes.NewModuleAddress`, we add a special case in the `Module` function: when no derivation key is provided, we fall back to the "legacy" implementation.
   206  
```go
func Module(moduleName string, derivationKeys ...[]byte) []byte {
	if len(derivationKeys) == 0 {
		return authtypes.NewModuleAddress(moduleName) // legacy case
	}
	// a null byte separates the module name from the submodule key
	mKey := append([]byte(moduleName), 0)
	addr := Hash("module", append(mKey, derivationKeys[0]...))
	// apply Derive for every remaining (sub-submodule) key
	for _, k := range derivationKeys[1:] {
		addr = Derive(addr, k)
	}
	return addr
}
```
   216  
   217  **Example 1**  A lending BTC pool address would be:
   218  
```go
btcPool := address.Module("lending", btc.Address())
```
   222  
   223  If we want to create an address for a module account depending on more than one key, we can concatenate them:
   224  
```go
btcAtomAMM := address.Module("amm", bytes.Join([][]byte{btc.Address(), atom.Address()}, nil))
```
   228  
**Example 2**  A smart-contract address could be constructed by:

```go
smartContractAddr = Module("mySmartContractVM", smartContractsNamespace, smartContractKey)

// which is equal to:
smartContractAddr = Derive(
    Module("mySmartContractVM", smartContractsNamespace),
    smartContractKey)
```
   239  
   240  ### Schema Types
   241  
   242  A `typ` parameter used in `Hash` function SHOULD be unique for each account type.
   243  Since all Cosmos SDK account types are serialized in the state, we propose to use the protobuf message name string.
   244  
   245  Example: all public key types have a unique protobuf message type similar to:
   246  
```protobuf
package cosmos.crypto.sr25519;

message PubKey {
	bytes key = 1;
}
```
   254  
   255  All protobuf messages have unique fully qualified names, in this example `cosmos.crypto.sr25519.PubKey`.
   256  These names are derived directly from .proto files in a standardized way and used
   257  in other places such as the type URL in `Any`s. We can easily obtain the name using
   258  `proto.MessageName(msg)`.
   259  
   260  ## Consequences
   261  
   262  ### Backwards Compatibility
   263  
   264  This ADR is compatible with what was committed and directly supported in the Cosmos SDK repository.
   265  
   266  ### Positive
   267  
   268  * a simple algorithm for generating addresses for new public keys, complex accounts and modules
   269  * the algorithm generalizes _native composed keys_
   270  * increased security and collision resistance of addresses
   271  * the approach is extensible for future use-cases - one can use other address types, as long as they don't conflict with the address length specified here (20 or 32 bytes).
   272  * support new account types.
   273  
   274  ### Negative
   275  
   276  * addresses do not communicate key type, a prefixed approach would have done this
   277  * addresses are 60% longer and will consume more storage space
   278  * requires a refactor of KVStore store keys to handle variable length addresses
   279  
   280  ### Neutral
   281  
   282  * protobuf message names are used as key type prefixes
   283  
   284  ## Further Discussions
   285  
Some accounts can have a fixed name or may be constructed in another way (eg: modules). We discussed the idea of an account with a predefined name (eg: `me.regen`), which could be used by institutions.
   287  Without going into details, these kinds of addresses are compatible with the hash based addresses described here as long as they don't have the same length.
   288  More specifically, any special account address must not have a length equal to 20 or 32 bytes.
   289  
   290  ## Appendix: Consulting session
   291  
At the end of Dec 2020 we had a session with [Alan Szepieniec](https://scholar.google.be/citations?user=4LyZn8oAAAAJ&hl=en) to consult on the approach presented above.

Alan's general observations:

* we don’t need 2-preimage resistance
* we need a 32-byte address space for collision resistance
* when an attacker can control an input for an object with an address, we have a problem with birthday attacks
* there is an issue with smart-contracts for hashing
* sha2 mining can be used to break address pre-images
   301  
   302  Hashing algorithm
   303  
   304  * any attack breaking blake3 will break blake2
   305  * Alan is pretty confident about the current security analysis of the blake hash algorithm. It was a finalist, and the author is well known in security analysis.
   306  
   307  Algorithm:
   308  
* Alan recommends hashing the prefix: `address(pub_key) = hash(hash(key_type) + pub_key)[:32]`, main benefits:
    * we are free to use arbitrarily long prefix names
    * we still don’t risk collisions
    * switch tables
* discussion about penalization -> about adding prefix post hash
* Aaron asked about post hash prefixes (`address(pub_key) = key_type + hash(pub_key)`) and differences. Alan noted that this approach has a longer address space and it’s stronger.
   315  
   316  Algorithm for complex / composed keys:
   317  
* merging tree-like addresses with the same algorithm is fine

Module addresses: Should module addresses have a different size to differentiate them?

* we will need to set a pre-image prefix for module addresses to keep them in the 32-byte space: `hash(hash('module') + module_key)`
* Aaron's observation: we already need to deal with variable length (to not break secp256k1 keys).

Discussion about arithmetic hash functions for ZKP

* Poseidon / Rescue
* Problem: much bigger risk because we don’t know many techniques or much history of cryptanalysis of arithmetic constructions. It’s still new ground and an area of active research.
   329  
   330  Post quantum signature size
   331  
* Alan's suggestion: Falcon: speed / size ratio - very good.
* Aaron - should we think about it?
  Alan: based on early extrapolation this thing will be able to break EC cryptography in 2050. But there is a lot of uncertainty. But there is magic happening with recursions / linking / simulation and that can speed up the progress.
   335  
   336  Other ideas
   337  
   338  * Let’s say we use same key and two different address algorithms for 2 different use cases. Is it still safe to use it? Alan: if we want to hide the public key (which is not our use case), then it’s less secure but there are fixes.
   339  
   340  ### References
   341  
   342  * [Notes](https://hackmd.io/_NGWI4xZSbKzj1BkCqyZMw)