
Authentication in the Cloud,
i.e. A Survey of Approaches to Authentication for Tao Applications
------------------------------------------------------------------

Let's consider how services (components, programs, etc.) running within a cloud
infrastructure can authenticate each other. Specifically, let's assume:
* There are two programs, A and B. Program B opens a connection to A, and B
  wants assurance that it is really speaking with A (or vice versa).
* A and B can be identified by code hashes Ha and Hb.
* A and B are part of the same administrative domain, Dapp, which I'll represent
  as signing key Kd.
* When executed, A and B are configured with domain-specific
  information containing, e.g. Ha, Hb, and Kd.
* A and B may be running on the same cloud node or on different nodes within a
  single cloud.
* The underlying cloud nodes and infrastructure are probably a different
  administrative domain, Dcloud, which I'll represent as signing key Kc.
* The cloud infrastructure implements Tao, or something like Tao, which provides
  to A and B various trusted local services, e.g. management of encryption or
  signing keys, sealed storage, measurement and attestation, etc.

We want to construct an authenticated channel between A and B.

Authentication Across Cloud Nodes
---------------------------------

Suppose A and B are executing on different cloud nodes. Here are some approaches
we could take:

1. TLS + local asym. keys + local Tao attestation + offline TPM attestation.

This is what is currently used in the Tao implementation.
* Kd publishes an attestation describing what constitutes a "good platform".
  This includes, e.g. Kc and various other parameters.
* Kd publishes an attestation describing what constitutes a "good program", i.e.
  containing Ha and Hb.
* Each local cloud node platform i has a TPM, represented by key Ktpm_i, that
  serves as a local root of trust.
* Kc attests that Ktpm_i is held by a good platform, i.e. a platform with
  certain properties P_i. This step is done offline, e.g. during
  installation/configuration of the cloud node.
* Programs A and B generate asymmetric keys Ka and Kb. This might happen when
  A and B first execute or when they are installed and configured.
* Ktpm_i attests (through one or more levels of Tao) that Ka is held by
  program A and that A has been configured to run under domain Kd.
* B opens a TLS channel to A and the two authenticate using Ka and Kb and
  self-signed certificates.
* A then sends to B the attestations from Kc and from Ktpm_i.
* B does this:
  - Using Kd's attestation about platforms, B checks that Kc and Kc's
    attestation about Ktpm_i satisfy the definition for Ktpm_i to be a "good
    platform".
  - B extracts the domain key from Ktpm_i's attestation and checks that it
    matches the app domain key Kd that B itself was configured with.
  - Using Kd's attestation about programs, B extracts the hash from Ktpm_i's
    attestation and checks that it meets the domain's definition of "program A".
  - B extracts the key from Ktpm_i's attestation and checks that it matches the
    TLS key that was used during connection handshaking.

At this point, B knows that it is connected to program A.

Messages:
  C -> D : "I am key=Kc and I run various platforms"
  D -> ? : D1a = sign(kd, "key=Kc platforms with properties prop=P1 running Tao hash=Hta, are good")
  D -> ? : D1b = sign(kd, "key=Kc platforms with properties prop=P2 running Tao hash=Htb, are good")
  D -> ? : D2a = sign(kd, "On good platforms, prog=Ha(Kd) are good")
  D -> ? : D2b = sign(kd, "On good platforms, prog=Hb(Kd) are good")
  C -> M_1 -> A: C1 = sign(kc, "tpm=Ktpm_1 has properties prop=P1")
  C -> M_2 -> B: C2 = sign(kc, "tpm=Ktpm_2 has properties prop=P2")
  M_1    : run code that constitutes Tao and A(Kd)
  M_2    : run code that constitutes Tao and B(Kd)
  A      : generate Ka,ka
  B      : generate Kb,kb
  Ktpm_1 -> A : A1 = sign(ktpm_1, "Code with hash=Ht says key=Ka binds to id=Ha(Kd)")
  M_1 -> A    : C1
  Ktpm_2 -> B : B1 = sign(ktpm_2, "Code with hash=Ht says key=Kb binds to id=Hb(Kd)")
  M_2 -> B    : C2
  A <-> B     : Establish TLS channel with Ka and Kb and self-signed certs
  A -> B      : C1
  A -> B      : A1
  B : verify(Kd, D1a)              // B learns D's policy about platforms
    : verify(D1a.key, C1)          // B learns C's view of A's purported platform
    : verify C1.prop == D1a.prop   // A's purported platform is good for D
    : verify(C1.tpm, A1)           // B learns A's purported platform's view of A's code
    : verify A1.hash == D1a.hash   // B learns A's purported platform OS is good for D
    : verify(Kd, D2a)              // B learns D's policy about programs
    : verify A1.id == D2a.prog     // A's purported program is good for D
    : verify A1.key == TLS.peerkey // B learns peer is really as purported (*)
    : conclude my peer is A1.id
  A : similar...

  (*) Note - Any good program A would not leak ka to any bad program, and no
  good program would use ka even if it had access (e.g. Tao or TPM). And a good
  TPM would not generate A1 for a bad program. So when B sees the combination of
  Ka in the TLS handshake, plus a matching A1, it knows the peer is good.
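
B's chain of checks can be sketched in Python. This is only an illustration
under loud assumptions: sign/verify are a TOY stand-in built on HMAC with a
lookup table of private keys (a real deployment would use a real asymmetric
signature scheme), attestations are plain dicts, and the sample values
(including using hash=Hta in A1) are made up for the example.

```python
import hashlib
import hmac
import json

# TOY stand-in for asymmetric signatures (assumption for this sketch only):
# a public key is just a name, and verify() looks up the matching private key
# in a table that no real verifier would ever hold.
_PRIVATE = {"Kd": b"kd", "Kc": b"kc", "Ktpm_1": b"ktpm_1"}


def _tag(pub, stmt):
    body = json.dumps(stmt, sort_keys=True).encode()
    return hmac.new(_PRIVATE[pub], body, hashlib.sha256).hexdigest()


def sign(pub, stmt):
    return {"stmt": stmt, "tag": _tag(pub, stmt)}


def verify(pub, att):
    return pub in _PRIVATE and hmac.compare_digest(_tag(pub, att["stmt"]), att["tag"])


def peer_identity(Kd, D1a, D2a, C1, A1, tls_peerkey):
    """Run B's checks in order; return the peer's identity, or None."""
    d1, d2, c1, a1 = D1a["stmt"], D2a["stmt"], C1["stmt"], A1["stmt"]
    ok = (
        verify(Kd, D1a)                # D's policy about platforms
        and verify(d1["key"], C1)      # C's view of A's purported platform
        and c1["prop"] == d1["prop"]   # platform properties are good for D
        and verify(c1["tpm"], A1)      # platform's view of A's code
        and a1["hash"] == d1["hash"]   # platform's Tao is good for D
        and verify(Kd, D2a)            # D's policy about programs
        and a1["id"] == d2["prog"]     # program is good for D
        and a1["key"] == tls_peerkey   # peer holds the attested key
    )
    return a1["id"] if ok else None


# Hypothetical sample attestations mirroring D1a, D2a, C1 and A1.
D1a = sign("Kd", {"key": "Kc", "prop": "P1", "hash": "Hta"})
D2a = sign("Kd", {"prog": "Ha(Kd)"})
C1 = sign("Kc", {"tpm": "Ktpm_1", "prop": "P1"})
A1 = sign("Ktpm_1", {"hash": "Hta", "key": "Ka", "id": "Ha(Kd)"})
```

Calling peer_identity("Kd", D1a, D2a, C1, A1, "Ka") yields the attested
identity, while a mismatched TLS peer key or a tampered A1 yields None.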

Performance:
  Setup: 3 asymmetric signature generations
    * D signs 1 message (or many) during configuration of domain
    * C signs 1 message per platform during installation/setup of platform(s)
  Program launch: 2 asym key generations, 2 asym signatures
    * A and B generate asymmetric keys
    * TPMs each sign 1 message per program (or more with Tao indirect key)
  Connection: 6 asym verifications, 2-party asym TLS with length-1 cert chains.
    * A and B each verify 1 message from D
    * A and B each verify 1 message from C
    * A and B each verify 1 message from TPM
    * A and B do TLS

Assumptions:
  Unforgeable asymmetric key signatures
    (used for Kd platform attestation, Kd program attestation, Kc platform
    attestation, ...)
  [NO - An ideal hash function (used to compute program identity)]
  [NO - Ability to audit code]
  A suitable and well-known mechanism for associating short identifiers with OS
    (i.e. TPM+tboot PCR scheme)
  A suitable and well-known mechanism for associating short, unique identifiers with
    configured, executing programs.
    (used for Tao, e.g. an ideal hash function and a schema for code, args, env, etc.)
  Various non-interactive (i.e. offline) principals can keep long-term secrets
    (holders of Kd and Kc need to protect corresponding private half)
  Cloud provider key Kc is well-known
    (in practice - via web and dns/dnssec, social mechanisms, etc.)
  Cloud provider has a way to learn Ktpm_i and learn/eval platform properties
    (this is completely unspecified, probably has some physical aspects to take
    ownership and get Ktpm_i out of the TPM, i.e. TPM "physical presence", and
    may involve generating and keeping TPM "owner" passwords)
  TPMs and Tao work as advertised
    (tpm protects ktpm_i, only signs correct statements, etc.)
  Online/interactive programs can generate random asymmetric keys
  Online/interactive programs can keep secrets in short term (or long-term?)
    (kc is secret from all but local TPM, local OS/Tao, C, and D)
  TLS works as advertised
    (B can authenticate peer as a public key)
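
As an illustration of the "short, unique identifier for a configured,
executing program" assumption, here is a minimal sketch of a hash over code,
args, and env. The schema (length-prefixed fields, sorted env, SHA-256) and the
name program_identity are assumptions for this example, not Tao's actual
format.

```python
import hashlib


def program_identity(code, args, env):
    """Hash code bytes, argument list, and environment into one identifier.

    Every field is length-prefixed so that different (code, args, env)
    triples cannot serialize to the same byte string.
    """
    h = hashlib.sha256()

    def put(field):
        h.update(len(field).to_bytes(8, "big"))
        h.update(field)

    put(code)
    h.update(len(args).to_bytes(8, "big"))
    for arg in args:
        put(arg.encode())
    h.update(len(env).to_bytes(8, "big"))
    for name in sorted(env):  # sort so dict ordering does not matter
        put(name.encode())
        put(env[name].encode())
    return h.hexdigest()


# Hypothetical program image and configuration.
Ha = program_identity(b"\x7fELF...", ["--domain", "Kd"], {"MODE": "prod", "PORT": "443"})
```

The length prefixes are the design point here: without them, args ["ab", "c"]
and ["a", "bc"] would produce the same identifier.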

2. TLS + local asym. keys + app certs (+ bootstrap via option 1)

Similar to option 1, but don't use self-signed TLS certificates. Instead, A and
B, after generating their local keys Ka and Kb, both contact some service that
represents the application domain. That service provides A and B with x509
certificates signed by Kd. At connection time, B just does normal TLS
authentication then checks that the peer certificate was provided by Kd and
contains the name "program A". This is close to typical https/TLS usage. How
does the service decide whether to issue a certificate for some key? Presumably
using option 1 between A (or B) and the service.

Messages:
  C -> D : C0 = "I am key=Kc and I run various platforms"
  C -> M_1 -> A: C1 = sign(kc, "tpm=Ktpm_1 has properties prop=P1")
  C -> M_2 -> B: C2 = sign(kc, "tpm=Ktpm_2 has properties prop=P2")
  M_1    : run code that constitutes Tao and A(Kd)
  M_2    : run code that constitutes Tao and B(Kd)
  A      : generate Ka,ka
  B      : generate Kb,kb
  Ktpm_1 -> A : A1 = sign(ktpm_1, "Code with hash=Ht says key=Ka binds to id=Ha(Kd)")
  M_1 -> A    : C1
  Ktpm_2 -> B : B1 = sign(ktpm_2, "Code with hash=Ht says key=Kb binds to id=Hb(Kd)")
  M_2 -> B    : C2
  A -> D      : Kc, C1, A1
  D : verify Kc is reasonable
    : verify(Kc, C1)
    : verify C1.prop are reasonable
    : verify(C1.tpm, A1)
    : verify A1.hash and A1.id are reasonable
  D -> A : D1a = sign_x509(kd, "key=Ka binds to commonname=Ha(Kd) for ca=Kd ...")
  B -> D      : Kc, C2, B1
  D : same as above
  D -> B : D1b = sign_x509(kd, "key=Kb binds to commonname=Hb(Kd) for ca=Kd ...")
  A <-> B     : Establish TLS channel with Ka and Kb and certs D1a, D1b
  B : verify TLS.peercert.ca == Kd && TLS.peercert.key == TLS.peerkey
    : conclude my peer is TLS.peercert.commonname
  A : similar...
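
The connection-time check in option 2 is much shorter than in option 1, which
the following sketch tries to show. It is not real x509: the "certificate" is a
dict, and the signature is a TOY HMAC stand-in with a hypothetical private-key
table. The vetting that D performs before issuing (option 1 between the
program and D) is assumed to have already succeeded.

```python
import hashlib
import hmac
import json

# TOY stand-in for the domain's signing key pair (assumption for this sketch).
_CA_PRIVATE = {"Kd": b"kd"}


def _tag(ca, stmt):
    body = json.dumps(stmt, sort_keys=True).encode()
    return hmac.new(_CA_PRIVATE[ca], body, hashlib.sha256).hexdigest()


def issue_cert(ca, subject_key, commonname):
    """D's side: bind a program key to a name under the domain CA."""
    stmt = {"key": subject_key, "commonname": commonname, "ca": ca}
    return {"stmt": stmt, "tag": _tag(ca, stmt)}


def peer_name(Kd, peercert, tls_peerkey):
    """B's side: accept the peer's name only if Kd issued the cert for this key."""
    stmt = peercert["stmt"]
    ok = (
        stmt["ca"] == Kd
        and hmac.compare_digest(_tag(Kd, stmt), peercert["tag"])
        and stmt["key"] == tls_peerkey
    )
    return stmt["commonname"] if ok else None


D1a = issue_cert("Kd", "Ka", "Ha(Kd)")  # certificate D hands to A
```

B needs only Kd and the peer certificate at connection time; the attestations
from Kc and the TPM are consumed by D at program launch instead.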

Performance:
  Setup: 2 asymmetric signature generations, 2 asym verifications
    * C signs 1 message per platform during installation/setup of platform(s)
    * D verifies 2 messages from C
  Program launch: 2 asym key generations, 4 asym signatures, 2 asym
    verifications, and extra round-trip to D (twice).
    * A and B generate asymmetric keys
    * TPMs each sign 1 message per program (or more with Tao indirect key)
    * D verifies 2 messages from TPMs
    * D signs 2 messages
  Connection: 2-party asym TLS with length-2 cert chains.
    * A and B do TLS

Assumptions: Same as (1) except...
  Fewer non-interactive principals involved
  Interactive principal can keep long-term secrets
    (holder of Kd needs to be online but also protect the private half)
  x509 is secure

3. TLS + preshared sym. keys (+ bootstrap via option 1 or 2)

The shared app service generates a symmetric key Kab for the pair (A, B), and
provides A with (B, Kab) and B with (A, Kab). B connects to A using TLS
pre-shared key mode with Kab. Since B was given (A, Kab) by the trusted shared
service, B knows it is connected to A (or to itself -- but that case is easy to
rule out by just exchanging names).  How does the service decide to give some
program the key Kab? Presumably using option 1 or 2 between A (or B) and the
service, or using any of a variety of (very interesting) proposals found in the
literature for identity-based crypto and symmetric key generation/distribution
among a group of nodes.
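
A rough sketch of the key-distribution idea follows. It assumes a hypothetical
KeyService class and replaces the TLS pre-shared key handshake with a simple
HMAC challenge-response, purely to show why holding Kab identifies the peer and
how including names rules out the "connected to itself" case. Authenticating
the requester to the service (via option 1 or 2) is omitted.

```python
import hashlib
import hmac
import secrets


class KeyService:
    """Hypothetical shared app service that hands out one key per pair."""

    def __init__(self):
        self._keys = {}

    def pair_key(self, requester, peer):
        # A real service must first authenticate `requester` (option 1 or 2).
        pair = frozenset((requester, peer))
        if pair not in self._keys:
            self._keys[pair] = secrets.token_bytes(32)
        return peer, self._keys[pair]


def respond(key, my_name, challenge):
    """Prove possession of the pair key, bound to the responder's own name."""
    return hmac.new(key, my_name.encode() + b"|" + challenge, hashlib.sha256).digest()


def check(key, expected_peer, challenge, response):
    return hmac.compare_digest(respond(key, expected_peer, challenge), response)


svc = KeyService()
peer_for_a, k_a = svc.pair_key("A", "B")  # A receives (B, Kab)
peer_for_b, k_b = svc.pair_key("B", "A")  # B receives (A, Kab)
challenge = secrets.token_bytes(16)       # B's fresh challenge to its peer
```

Here B accepts respond(k_a, "A", challenge) as coming from A, and rejects its
own reflected response because that one is bound to the name "B".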

Performance:
  Setup: 2 asymmetric signature generations, 2 asym verifications
    * C signs 1 message per platform during installation/setup of platform(s)
    * D verifies 2 messages from C
  Program launch e.g.: 2 asym key generations, 2 asym signatures, 1-party TLS
    with length-1 cert chain (twice), extra round-trip to D (twice), 2 asym
    verifications, and 1 sym key generation (twice).
    * A and B generate asymmetric keys
    * TPMs each sign 1 message per program (or more with Tao indirect key)
    * D verifies 2 messages from TPMs
    * D generates 1 symmetric shared key
  Connection: 2-party sym TLS with psk and no cert chains.
    * A and B do TLS

Assumptions: Same as (2) except...
  Ability to generate short-term symmetric shared keys
  No x509

4. TLS + shared sym. keys (+ bootstrap via Tao)

Suppose the underlying cloud platform can generate a shared key for any pair of
programs. A and B both request a shared key, then use TLS pre-shared key to
establish a connection. There are lots of ways for the underlying platforms to
generate shared keys. One is for them to derive shared keys from a shared
master key, presumably installed at platform-configuration time.

Messages:
  ? -> M_* : km
  M_1      : run code that constitutes Tao and A(...)
  M_2      : run code that constitutes Tao and B(...)
  A        : generate Ka,ka
  B        : generate Kb,kb
  A -> OSa : request key for Ha(...),Hb(...)
  OSa      : generate Kab
  OSa -> A : Kab
  B -> OSb : request key for Ha(...),Hb(...)
  OSb      : generate Kab
  OSb -> B : Kab
  A <-> B  : TLS with psk, hint Ha(...),Hb(...)
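
One way the "generate Kab" step could work is to derive the pair key from the
master key km with a keyed hash, so that OSa and OSb compute the same Kab
without talking to each other. The sketch below uses HMAC-SHA256 as the
derivation function; the function names, the "pair-key" label, and the rule
that identities are sorted before derivation are assumptions for illustration.

```python
import hashlib
import hmac


def _framed(name):
    raw = name.encode()
    return len(raw).to_bytes(4, "big") + raw


def derive_pair_key(km, id_1, id_2):
    """Derive Kab from master key km; the order of the two identities is irrelevant."""
    first, second = sorted((id_1, id_2))
    info = b"pair-key" + _framed(first) + _framed(second)
    return hmac.new(km, info, hashlib.sha256).digest()


def os_request_key(km, requester, id_1, id_2):
    """Platform-side handler: only a member of the pair may obtain its key.

    `requester` is the identity the platform itself measured for the caller.
    """
    if requester not in (id_1, id_2):
        raise PermissionError("requester is not a member of the pair")
    return derive_pair_key(km, id_1, id_2)


km = b"\x01" * 32                              # hypothetical master key on M_*
k_a = os_request_key(km, "Ha", "Ha", "Hb")     # OSa serving A
k_b = os_request_key(km, "Hb", "Ha", "Hb")     # OSb serving B
```

The membership check matters as much as the derivation: the platform must
refuse to hand Kab to any program whose measured identity is neither Ha nor Hb.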

Performance:
  Setup: None.
  Program launch: 2 sym key generations.
  Connection: 2-party sym TLS with psk and no cert chains.

Assumptions:
  Ability to generate short-term symmetric shared keys
  Interactive principal can keep long-term master secret
    (OS is online but also holds a master shared secret)
  Ability to safeguard shared master secret

5. Platform-provided authentication

Let the underlying platform perform authentication on behalf of A and B. A and B
would establish a TLS or TLS-like connection, but the authentication handshaking
would be done by the underlying platform. How do the underlying platforms
authenticate? Presumably using option 1, 2, 3, or 6.

6. TLS + cached sessions (+ bootstrap via option 1 or 2)

B connects to A using option 1 or 2. Both sides then cache their TLS sessions.
Subsequently, B connects to A using the cached TLS sessions.
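
The caching idea can be sketched independently of TLS. The SessionCache class
below is hypothetical: it stores whatever session object the full handshake
(option 1 or 2) returns and reuses it until a time-to-live expires. Real TLS
session resumption has its own ticket and lifetime rules, which this sketch
does not model.

```python
import time


class SessionCache:
    """Reuse an authenticated session per peer until it expires."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._sessions = {}  # peer -> (session, expiry time)

    def connect(self, peer, full_handshake):
        """Return (session, resumed). Runs full_handshake only on a cache miss."""
        now = self._clock()
        entry = self._sessions.get(peer)
        if entry is not None and entry[1] > now:
            return entry[0], True
        session = full_handshake(peer)  # expensive: attestations, verifications
        self._sessions[peer] = (session, now + self._ttl)
        return session, False


# Usage with a fake clock and a stand-in for the expensive handshake.
now = [0.0]
handshakes = []


def full_handshake(peer):
    handshakes.append(peer)
    return "session-%d" % len(handshakes)


cache = SessionCache(ttl_seconds=10, clock=lambda: now[0])
first = cache.connect("A", full_handshake)
now[0] = 5.0
second = cache.connect("A", full_handshake)
now[0] = 20.0
third = cache.connect("A", full_handshake)
```

Only the first and third connections pay for a full handshake; the second
reuses the cached session.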

Authentication Within a Cloud Node
----------------------------------

Suppose A and B are executing on the same cloud node.  Any of options 1 - 6
still work, of course. Option 4 becomes somewhat easier, because there is no
longer a need to share a master key across cloud platforms.  Option 5 becomes
easier because the underlying platform can trivially authenticate to itself. And
for options that rely on a shared app service for generating or attesting to
keys, if the shared app service is co-located on the same cloud node, then
bootstrapping can be done using option 4 or 5 or any of the other options.

Some other options become available when A and B execute on the same cloud node:

7. OS-secured channels

Within a single machine, TLS and cryptography aren't really necessary if we have
some other means of establishing authenticated channels. Linux pipes, for
example.
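
As one concrete instance of an OS-secured channel, the sketch below uses a
Unix-domain socket pair rather than a pipe, because on Linux the kernel will
report the peer's pid, uid, and gid through the SO_PEERCRED socket option. The
helper name peer_credentials is an assumption, and the function returns None on
platforms that lack SO_PEERCRED. Mapping a pid or uid to a program identity
such as Ha is left unspecified here.

```python
import os
import socket
import struct


def peer_credentials(sock):
    """Return (pid, uid, gid) of the peer as recorded by the kernel, or None.

    Relies on the Linux SO_PEERCRED option for Unix-domain sockets.
    """
    option = getattr(socket, "SO_PEERCRED", None)
    if option is None:
        return None  # not available on this platform
    # struct ucred is { pid_t pid; uid_t uid; gid_t gid; }
    size = struct.calcsize("iII")
    raw = sock.getsockopt(socket.SOL_SOCKET, option, size)
    return struct.unpack("iII", raw)


# Both ends live in this process here, so the reported peer is ourselves.
end_a, end_b = socket.socketpair()
creds = peer_credentials(end_a)
end_a.close()
end_b.close()
```

No keys or handshakes are involved: the authentication statement comes from the
operating system, which is exactly the trust shift this option describes.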

Authorization Without Authentication
------------------------------------

8. Macaroons / Cookies

In some scenarios, we care more about authorization than about authentication or
auditing. In Macaroons, for example, clients (which are essentially anonymous)
hold cookie-like secret tokens, and services grant any bearer of an appropriate
token access to resources. Applying this scheme to our intra-cloud scenario,
program A might hold a macaroon which it sends to B over a TLS connection, and B
makes decisions on the basis of that macaroon without ever authenticating the
connection. However, there are issues:

Before program A sends a macaroon to B over some connection, A needs to
authenticate the connection. Presumably this is done with options 1-7 (in
typical end-user https scenarios, this would be option 2).

Program A needs to obtain the macaroon from B either directly or through some
intermediary. These connections need to be authenticated as well, presumably
using options 1-7 (in typical end-user https scenarios, this would be passwords
which function here like pre-shared keys).
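
For readers unfamiliar with macaroons, here is a minimal sketch of the chained
HMAC construction behind them: the service mints a token from a root key, any
holder can append restricting caveats, and the service re-derives the chain to
authorize a bearer. The dict layout and the function names are assumptions for
illustration; third-party caveats and serialization are omitted.

```python
import hashlib
import hmac


def _mac(key, text):
    return hmac.new(key, text.encode(), hashlib.sha256).digest()


def mint(root_key, identifier):
    """Service B creates a fresh macaroon for some resource identifier."""
    return {"id": identifier, "caveats": [], "sig": _mac(root_key, identifier)}


def attenuate(macaroon, caveat):
    """Any bearer can add a caveat; the old signature becomes the new MAC key."""
    return {
        "id": macaroon["id"],
        "caveats": macaroon["caveats"] + [caveat],
        "sig": _mac(macaroon["sig"], caveat),
    }


def authorize(root_key, macaroon, caveat_holds):
    """Service B recomputes the chain and checks that every caveat holds."""
    sig = _mac(root_key, macaroon["id"])
    for caveat in macaroon["caveats"]:
        if not caveat_holds(caveat):
            return False
        sig = _mac(sig, caveat)
    return hmac.compare_digest(sig, macaroon["sig"])


root_key = b"\x07" * 32  # hypothetical secret known only to service B
token = attenuate(mint(root_key, "resource-1"), "op=read")
```

B never learns who presented the token. It only learns that the bearer holds a
token derived from its own root key and that the listed caveats are satisfied.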

Other Approaches or Existing Work?
----------------------------------

Are there other approaches and related work I should be looking at? I am not
finding a whole lot of literature on authentication (or authorization) *within*
cloud services.

* Check how Amazon EC2, or Google or Microsoft cloud providers authenticate.

Short Paper Outline
-------------------

* Problem introduction
* Survey of approaches 1-8
* Discussion of tradeoffs
 - Setup costs (per-platform and/or per-program) vs. connection-time costs
 - Crypto tradeoffs: need for secure storage, performance of asymmetric vs. symmetric crypto
 - Trust issues, size of trusted computing base, multi-tenant issues
* Quantitative Evaluation (?)
 - Can we look at patterns of connections in a cloud service so that we can make
   a more informed discussion of the costs?