     1  - Feature Name: Session Authentication for GRPC API
     2  - Status: In Progress
     3  - Start Date: 2017-7-11
     4  - Authors: Matt Tracy
     5  - RFC PR: #16829
     6  - Cockroach Issue: #6307
     7  
     8  # Summary
     9  The Cockroach Server currently provides a number of HTTP endpoints (the Admin UI
    10  and /debug endpoints) which return data about the cluster; however, these
    11  endpoints are not secured by any sort of authorization or
    12  authentication mechanism. This RFC proposes an authentication method which will
    13  require each incoming request to be associated with a *login session*. A login
    14  session will be created using a username/password combination. Incoming requests
    15  which are not associated with a valid session will be rejected.
    16  
    17  Changes to the CockroachDB server will be the addition of a sessions table, the
    18  addition of a new RPC endpoint to allow the Admin UI to create a new session,
    19  and a modification of our current RPC gateway server to enforce the new session
    20  requirement for incoming requests.  Also included in this proposal is a method
    21  for preventing CSRF (Cross-site request forgery) attacks.
    22  
    23  The AdminUI will be modified to require that users create a new session by
    24  "signing in" before the UI becomes usable. Outgoing requests will be modified
    25  slightly to utilize the new CSRF mitigation method.
    26  
    27  This RFC does not propose any new *authorization* mechanisms; it only
    28  authenticates sessions, without enforcing any specific permissions on different
    29  users.
    30  
    31  # Motivation
    32  Long-term goals for the UI include many scenarios in which users can actually
    33  modify CockroachDB settings through the UI. Proper authentication and
    34  authorization are a hard requirement for this; not all users of the system
    35  should have access to modify it, and any modifications made must be auditable.
    36  
    37  Even for the read-only scenarios currently enabled by the Admin UI, we are not
    38  properly enforcing permissions; for example, all Admin UI users can see the
    39  entire schema of the cluster, while database users can be restricted to seeing a
    40  subset of databases and tables.
    41  
    42  # Detailed design
    43  
    44  The design has the following components:
    45  
    46  + A "web_sessions" table which holds information about currently valid user
    47  sessions.
    48  + A "login" RPC accessible from the Admin UI over HTTP.  This RPC is called with
    49  a username/password pair, which are validated by the server; if valid, a new
    50  session is created and a cookie with a session ID is returned with the response.
    51  + All Admin UI methods, other than the new login request, will be modified to
    52  require a valid session cookie be sent with incoming http requests. This will be
    53  done at the service level before dispatching a specific method; the session's
    54  username will be added to the method context if the session is valid.
    55  + The Admin UI will be modified to require a logged-in session before allowing
    56  users to navigate to the current pages. If the user is not logged in, a dialog
    57  will be displayed for the user to input a username and password.
    58  
    59  + A CSRF mitigation method will be added to both the client and the server.
    60      + The server gives the client a cookie containing a cryptographically random
    61        value. This is done when static assets are loaded.
    62      + HTTP Requests from the Admin UI will be augmented to read this cookie and
    63        use it to write a custom HTTP header.
    64      + The server verifies the custom header is present and matches the cookie on
    65        all incoming requests.
    66  
    67  ## Backend Changes
    68  
    69  ### Session Table
    70  A new metadata table will be created to hold system sessions:
    71  
    72  ```sql
    73  CREATE TABLE system.web_sessions (
    74      id              SERIAL      PRIMARY KEY,
    75      "hashedSecret"  BYTES       NOT NULL,
    76      username        STRING      NOT NULL,
    77      "createdAt"     TIMESTAMP   NOT NULL DEFAULT now(),
    78      "expiresAt"     TIMESTAMP   NOT NULL,
    79      "revokedAt"     TIMESTAMP,
    80      "lastUsedAt"    TIMESTAMP   NOT NULL DEFAULT now(),
    81      "auditInfo"     STRING,
    82      INDEX ("expiresAt"),
    83      INDEX ("createdAt")
    84  );
    85  ```
    86  
    87  `id` is the identifier of the session, which is of type SERIAL (equivalent to
    88  INT DEFAULT unique_rowid()).
    89  
    90  `hashedSecret` is the hash of a cryptographically random byte array generated
    91  and shared only with the original creator of the session. The server does not
    92  store the original bytes generated, but rather hashes the generated value and
    93  stores the hash. The server requires incoming requests to have the original
    94  version of the secret for the provided session id. This allows the session's id
    95  to be used in auditing logs which are readable by non-root users; if we did not
    96  require this secret, any user with access to auditing logs would be able to
    97  impersonate sessions.
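
As a minimal sketch of the secret handling described above: the server generates a random secret, hands it to the client, and stores only its hash. The RFC does not name a hash function or a secret length, so SHA-256 and 16 bytes are assumptions here, as are the function names.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
)

// secretLength is an assumed size for the random session secret.
const secretLength = 16

// newSessionSecret returns a fresh random secret together with the hash that
// would be stored in the "hashedSecret" column. Only the secret is handed to
// the client; only the hash is persisted.
func newSessionSecret() (secret, hashed []byte, err error) {
	secret = make([]byte, secretLength)
	if _, err = rand.Read(secret); err != nil {
		return nil, nil, err
	}
	sum := sha256.Sum256(secret) // SHA-256 is an assumption of this sketch
	return secret, sum[:], nil
}

// secretMatches reports whether a secret presented by a client hashes to the
// stored value. The comparison runs in constant time.
func secretMatches(presented, storedHash []byte) bool {
	sum := sha256.Sum256(presented)
	return subtle.ConstantTimeCompare(sum[:], storedHash) == 1
}
```

Because only the hash is stored, a reader of the sessions table or of audit logs learns the session ID but cannot reconstruct the secret needed to present a valid cookie.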
    98  
    99  `username` stores the username used to create a session.
   100  
   101  Each session records four important timestamps:
   102  + `createdAt` indicates when the session was created. When a new session is
   103  created, this is set to the current time.
   104  + `expiresAt` indicates when the session will expire. When a new session is
   105  created, this is set to (current time + `server.web_session_timeout`);
   106  web_session_timeout is a new server setting (a duration) added as part of this
   107  RFC.
   108  + `revokedAt` is a possibly null timestamp indicating when the session was
   109  explicitly revoked. If it has not been revoked, this field is null. When a new
   110  session is created, this field is null.
   111  + `lastUsedAt` is a field being used to track whether a session is active. It
   112  will be periodically refreshed if a session continues to be used; this can help
   113  give an impression of whether a session is still being actively used after it is
   114  created. When a new session is created, this is set to the current time.
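
The timestamp rules above can be sketched in Go as follows. The struct and method names are hypothetical; `timeout` stands in for `server.web_session_timeout`, and `refreshInterval` for the refresh setting introduced later in this RFC.

```go
package main

import "time"

// session mirrors the timestamp columns of system.web_sessions.
type session struct {
	Username   string
	CreatedAt  time.Time
	ExpiresAt  time.Time
	RevokedAt  *time.Time // nil until the session is explicitly revoked
	LastUsedAt time.Time
}

// newSession fills in the timestamps for a freshly created session.
func newSession(username string, now time.Time, timeout time.Duration) session {
	return session{
		Username:   username,
		CreatedAt:  now,
		ExpiresAt:  now.Add(timeout),
		LastUsedAt: now,
	}
}

// revoke marks the session as explicitly revoked at the given time.
func (s *session) revoke(now time.Time) {
	s.RevokedAt = &now
}

// touch refreshes LastUsedAt, but only when the stored value is older than
// refreshInterval, so that an active session does not cause a write on every
// request. It reports whether the record was changed.
func (s *session) touch(now time.Time, refreshInterval time.Duration) bool {
	if now.Sub(s.LastUsedAt) <= refreshInterval {
		return false
	}
	s.LastUsedAt = now
	return true
}
```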
   115  
   116  The `auditInfo` string field contains an optional JSON-formatted string
   117  object containing other information relevant for auditing purposes. Examples of
   118  the type of information that may be included:
   119  + The IP address used to create the session originally.
   120  + If the session is revoked, the revoking username and session ID.
   121  
   122  Secondary indexes are added on the following fields:
   123  + `expiresAt`, which quickly allows an admin to query for possibly active
   124  sessions.
   125  + `createdAt`, which allows querying sessions in order of creation. This should
   126  be useful for auditing purposes.
   127  
   128  #### Creation of new table
   129  
   130  The new table will be added using a backwards-compatible migration, in the same
   131  fashion as the jobs and settings tables. This is trivially possible
   132  because no existing code interacts with the new table.
   133  
   134  The migration system, along with instructions for adding a new table, has an
   135  entry point in `pkg/migrations/migrations.go`.
   136  
   137  ### Session Creation
   138  Sessions will be created by calling a new HTTP endpoint "UserLogin". This new
   139  method will be on its own new service "Authentication" separate from existing
   140  services (Status/Admin); this is for reasons that will be explained in the
   141  Session Enforcement section.
   142  
   143  ```protobuf
   144  message UserLoginRequest {
   145      string username = 1;
   146      string password = 2;
   147  }
   148  
   149  message UserLoginResponse {
   150      // No information to return.
   151  }
   152  
   153  message UserLogoutRequest {
   154      // No information needed.
   155  }
   156  
   157  message UserLogoutResponse {
   158      // No information to return.
   159  }
   160  
   161  service Authentication {
   162    rpc UserLogin(UserLoginRequest) returns (UserLoginResponse) {
   163      // ...
   164    }
   165  
   166    rpc UserLogout(UserLogoutRequest) returns (UserLogoutResponse) {
   167      // ...
   168    }
   169  }
   170  ```
   171  
   172  When called, UserLogin will check the provided `username`/`password` pair against
   173  the `system.users` table, which stores usernames along with hashed versions of
   174  passwords.
   175  
   176  If the username/password is valid, a new entry is inserted into the
   177  `web_sessions` table and a 200 "OK" response is returned to the client. If the
   178  username or password is invalid, a 401 Unauthorized response is returned.
   179  
   180  For successful login responses a new cookie "session" is created containing the
   181  ID and secret of the newly created session. This is used to associate future
   182  requests from the client with the session.
   183  
   184  Cookie headers can be added to the response using GRPC's SetHeader method. These
   185  are attached as grpc metadata, which is then converted by GRPC Gateway into the
   186  appropriate HTTP response headers.
   187  
   188  The "session" cookie is marked as "httponly" and is thus not accessible from
   189  javascript; this is a defense-in-depth measure to prevent session exfiltration
   190  by XSS attacks. The cookie is also marked as "secure", which prevents the cookie
   191  from being sent over an unsecured http connection.
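
A sketch of how the "session" cookie might be built and parsed with Go's `net/http` package. The RFC says only that the cookie contains the ID and the secret; the `<id>.<base64 secret>` layout, the `Path` value, and the function names are assumptions made for illustration. Attaching the resulting header through GRPC metadata is outside the sketch.

```go
package main

import (
	"encoding/base64"
	"errors"
	"net/http"
	"strconv"
	"strings"
)

// sessionCookieName is the cookie name used in this RFC.
const sessionCookieName = "session"

// newSessionCookie builds the "session" cookie. The "<id>.<secret>" value
// layout is an assumption of this sketch.
func newSessionCookie(id int64, secret []byte) *http.Cookie {
	value := strconv.FormatInt(id, 10) + "." + base64.RawURLEncoding.EncodeToString(secret)
	return &http.Cookie{
		Name:     sessionCookieName,
		Value:    value,
		Path:     "/",
		HttpOnly: true, // not readable from javascript
		Secure:   true, // only sent over https
	}
}

// parseSessionCookie recovers the session ID and secret from a cookie value.
func parseSessionCookie(value string) (id int64, secret []byte, err error) {
	parts := strings.SplitN(value, ".", 2)
	if len(parts) != 2 {
		return 0, nil, errors.New("malformed session cookie")
	}
	if id, err = strconv.ParseInt(parts[0], 10, 64); err != nil {
		return 0, nil, err
	}
	if secret, err = base64.RawURLEncoding.DecodeString(parts[1]); err != nil {
		return 0, nil, err
	}
	return id, secret, nil
}

// clearSessionCookie builds a cookie that asks the browser to delete the
// "session" cookie, as a logout response would.
func clearSessionCookie() *http.Cookie {
	return &http.Cookie{
		Name:     sessionCookieName,
		Path:     "/",
		HttpOnly: true,
		Secure:   true,
		MaxAge:   -1, // a negative MaxAge means "delete now"
	}
}
```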
   192  
   193  The UserLogout method, when called successfully, will revoke the current session
   194  by setting its `revokedAt` field to the current time. It will then return the
   195  appropriate headers to delete the "session" cookie.
   196  
   197  ### Session Enforcement
   198  
   199  Session enforcement is handled by adding a new muxing wrapper around the
   200  existing "gwMux" used by grpc gateway services. Notably, the new wrapper will
   201  only be added for the existing services; it will *not* be added for the new
   202  Authentication service, because the UserLogin method must be accessible without
   203  a session.
   204  
   205  The new mux will enforce that all incoming connections have a valid session token.
   206  A valid request will pass all of the following checks:
   207  
   208  1. The incoming request has a "session" cookie.
   209  2. The ID in the "session" cookie matches the ID of a session in
   210  the session table. This is confirmed by performing a SELECT from the session
   211  table.
   212  3. The secret in the "session" cookie matches the "hashedSecret"
   213  of the session retrieved from the session table. This is confirmed by
   214  hashing the secret in the cookie and comparing.
   215  4. The session's `revokedAt` timestamp is null.
   216  5. The session's `expiresAt` timestamp is in the future.
   217  
   218  If any of the above criteria are not true, the incoming request is rejected with
   219  a 401 Unauthorized message.
   220  
   221  If the session *does* pass the above criteria, then the username and session
   222  ID are added to the context.Context before passing the call to gwMux. These
   223  values can later be accessed in specific methods by accessing them from the
   224  context.
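
The checks and the context hand-off described above could look roughly like this. It is a sketch only: the type and function names are hypothetical, the hash function is assumed to be SHA-256, and the session row is assumed to have already been looked up by the ID from the cookie.

```go
package main

import (
	"context"
	"crypto/sha256"
	"crypto/subtle"
	"errors"
	"time"
)

// session holds the columns of system.web_sessions needed for validation.
type session struct {
	ID           int64
	HashedSecret []byte
	Username     string
	ExpiresAt    time.Time
	RevokedAt    *time.Time // nil when the session has not been revoked
}

// errUnauthorized is what the wrapper would translate into a 401 response.
var errUnauthorized = errors.New("401 Unauthorized")

// hashSecret hashes a session secret; SHA-256 is an assumption.
func hashSecret(secret []byte) []byte {
	sum := sha256.Sum256(secret)
	return sum[:]
}

// validateSession applies the session checks to a row looked up by the ID in
// the cookie. found is false when no such row exists.
func validateSession(s session, found bool, secret []byte, now time.Time) error {
	switch {
	case !found:
		return errUnauthorized
	case subtle.ConstantTimeCompare(hashSecret(secret), s.HashedSecret) != 1:
		return errUnauthorized
	case s.RevokedAt != nil:
		return errUnauthorized
	case !s.ExpiresAt.After(now):
		return errUnauthorized
	}
	return nil
}

// sessionKey is an unexported context key, so other packages cannot collide
// with it.
type sessionKey struct{}

// sessionInfo is what later handlers can read from the context.
type sessionInfo struct {
	Username  string
	SessionID int64
}

// withSession attaches the username and session ID to the context.
func withSession(ctx context.Context, s session) context.Context {
	return context.WithValue(ctx, sessionKey{}, sessionInfo{Username: s.Username, SessionID: s.ID})
}

// sessionFromContext retrieves the values stored by withSession, if any.
func sessionFromContext(ctx context.Context) (sessionInfo, bool) {
	info, ok := ctx.Value(sessionKey{}).(sessionInfo)
	return info, ok
}
```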
   225  
   226  If the session's `lastUsedAt` value is older than the interval set by the new
   227  system setting `server.web_session_last_used_refresh`, the session record will
   228  be updated to set its `lastUsedAt` value to the current time.
   229  
   230  ### CSRF Enforcement
   231  
   232  CSRF protection is enforced using the "Double-submit cookie" method. This has two parts:
   233  
   234  + When the client sends a request to the server, it must read the value from a
   235  cookie "csrf-token" (if present) and write that value to an HTTP header on the
   236  request "x-csrf-token".
   237  + When the server receives any request, it ensures that the value of the
   238  "csrf-token" cookie sent with that request is the same as the value in
   239  "x-csrf-token" header.
   240  
   241  Any request that does not have a matching "csrf-token" cookie and "x-csrf-token"
   242  header is rejected with a 401 Unauthorized error.
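
The server-side half of this check could be written as an `http.Handler` wrapper along the following lines. The function names are hypothetical, and rejecting an empty token is an assumption of this sketch (an empty cookie and an empty header would otherwise trivially "match").

```go
package main

import (
	"crypto/subtle"
	"net/http"
)

const (
	csrfCookieName = "csrf-token"
	csrfHeaderName = "x-csrf-token"
)

// csrfTokensMatch reports whether the cookie and the header carry the same
// non-empty token.
func csrfTokensMatch(cookieValue, headerValue string) bool {
	if cookieValue == "" || headerValue == "" {
		return false
	}
	return subtle.ConstantTimeCompare([]byte(cookieValue), []byte(headerValue)) == 1
}

// requireCSRF wraps a handler and rejects any request whose "csrf-token"
// cookie is missing or does not match its "x-csrf-token" header.
func requireCSRF(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		cookie, err := r.Cookie(csrfCookieName)
		if err != nil || !csrfTokensMatch(cookie.Value, r.Header.Get(csrfHeaderName)) {
			http.Error(w, "invalid CSRF token", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```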
   243  
   244  In order for this to work, the "csrf-token" cookie needs to be created at some
   245  point. This is done on initial page load, when the server serves the static
   246  assets. The http.FileServer currently used will be wrapped with an outer
   247  handler that sets the cookie; the outer handler will only set the cookie
   248  for the main entry point of the website, `index.html`. The cookie contains a
   249  cryptographically random string. The generated value does not need to
   250  be stored anywhere on the server side. The cookie should have the `secure`
   251  attribute.
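
A sketch of the outer handler described above, assuming the entry point is served at `/` or `/index.html` and that a 32-byte token is sufficient; both are assumptions, as are the function names.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"net/http"
)

// newCSRFToken returns a cryptographically random token. The 32-byte size
// is an assumption of this sketch.
func newCSRFToken() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(buf), nil
}

// withCSRFCookie wraps the static asset handler (for example an
// http.FileServer) and sets the "csrf-token" cookie when the main entry
// point is requested.
func withCSRFCookie(assets http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/" || r.URL.Path == "/index.html" {
			token, err := newCSRFToken()
			if err != nil {
				http.Error(w, "internal error", http.StatusInternalServerError)
				return
			}
			http.SetCookie(w, &http.Cookie{
				Name:   "csrf-token",
				Value:  token,
				Path:   "/",
				Secure: true,
				// Deliberately not HttpOnly: the client must be able to
				// read this cookie in order to copy it into the header.
			})
		}
		assets.ServeHTTP(w, r)
	})
}
```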
   252  
   253  CSRF involves an attacker's third-party website constructing a request to your
   254  website, where the request contains some sort of malicious action. If the user
   255  is logged in to your website, their valid session cookie will be sent with the
   256  request and the action will be authorized. "Double-submit cookie" prevents this by
   257  requiring the requester to set the "x-csrf-token" header based on a
   258  domain-specific cookie; the third-party website cannot access that cookie, so it
   259  cannot properly set the header. Simply explained, "Double-submit cookie" works by
   260  requiring the sender to prove that it can read cookies from a trusted origin.
   261  
   262  The random value must be set on the server because it must be cryptographically
   263  random; the client application is javascript and does not have reliable access
   264  to a secure random source, and attackers could possibly guess any random
   265  values generated.
   266  
   267  CSRF protection will be added to all GRPC gateway services, including the new
   268  Authentication service (which does not itself require authentication, but will
   269  require CSRF protection).
   270  
   271  ### "Debug" pages
   272  
   273  A small number of "debug" pages are not part of the Admin UI application, but
   274  are instead provided as HTML generated directly from Go. These endpoints can
   275  also be moved underneath the session enforcement wrapper to require a valid
   276  login session.
   277  
   278  However, we will need to redirect users to the Admin UI in order for them to
   279  actually create a session; therefore, the wrapper for these methods will not
   280  return a "401" error response, but rather a redirection to the Admin UI.
   281  
   282  Because these pages are being migrated to the Admin UI proper, there is no
   283  pressing need to improve the user experience beyond the redirect.
   284  
   285  ### Insecure Mode
   286  
   287  If the cluster is running explicitly in "insecure" mode, the following changes
   288  will apply:
   289  
   290  + All requests will be acceptable. Where session validation would normally
   291  occur, an empty session ID and the "root" user will be added to the request
   292  context.
   293  + CSRF Token validation will not be applied.
   294  
   295  Insecure mode is trivially indicated to the client by serving from "http"
   296  addresses instead of "https". The client can thus check javascript variable
   297  `window.location.protocol` and adjust its behavior accordingly.
   298  
   299  ## Frontend Changes
   300  
   301  ### Logged-in State
   302  
   303  The front-end will be modified to have a "logged in" and "logged out" state.
   304  The logged-in state will be equivalent to the current behavior of the UI. The
   305  "logged out" state displays a full-screen prompt for the user to enter
   306  credentials.
   307  
   308  This mode will be enforced with the following mechanism:
   309  
   310  + A new "logged in" value will be added to the Admin UI state.
   311  + If the "logged in" value is not present, the top-level "Layout" element
   312  will display the full-screen login dialogue instead of the currently requested
   313  route component.
   314  + Upon successful login, the username used to log in is added to the "logged in"
   315  value in the Admin UI state. *The username is also recorded to LocalStorage*;
   316  this is necessary because javascript explicitly will not have access to the
   317  session cookie, and thus will not be able to recognize that it is logged in if
   318  the session is resumed later.
   319  + If the "logged in" value is present, the top-level "Layout" element will
   320  render the components of the currently requested route.
   321  + If any request comes back with a "401 Unauthorized", it is safe to assume that
   322  the user's session is no longer valid, and the "logged in" Admin UI state will
   323  thus be cleared.
   324  
   325  ### Login Dialog
   326  
   327  The Login Dialog is a full-screen component that prompts the user for a username
   328  and password.
   329  
   330  This component will be displayed by the top-level "Layout" element if the
   331  "logged in" value is not set in the Admin UI State. While the dialog is
   332  displayed, no other navigation controls are accessible.
   333  
   334  Upon successful login, the route requested by the user before seeing the login
   335  prompt will be rendered.
   336  
   337  ### Logout button
   338  
   339  After login, all pages will display a "log out" option in the top right corner
   340  of the screen.
   341  
   342  ### CSRF Support
   343  
   344  In order to support CSRF protection, all outgoing requests will need to read
   345  the value of the "csrf-token" cookie and send it back to the server in the
   346  "x-csrf-token" HTTP header. This can be added in a central location at the method
   347  `timeoutFetch`.
   348  
   349  # Drawbacks
   350  
   351  The primary drawback of this login system is that it requires a session lookup
   352  for every request. If that proves expensive, the cost of this could be
   353  significantly mitigated by adding a short-term cache for sessions on each node.
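
One possible shape for such a per-node cache is sketched below. This RFC does not specify a cache design; the types, names, and TTL-based eviction are all assumptions for illustration.

```go
package main

import (
	"sync"
	"time"
)

// cachedSession is a recently validated session.
type cachedSession struct {
	username string
	cachedAt time.Time
}

// sessionCache remembers validated sessions for a short TTL so that most
// requests can skip the lookup in the sessions table.
type sessionCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[int64]cachedSession
}

func newSessionCache(ttl time.Duration) *sessionCache {
	return &sessionCache{ttl: ttl, entries: make(map[int64]cachedSession)}
}

// put records that session id was validated for username at time now.
func (c *sessionCache) put(id int64, username string, now time.Time) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[id] = cachedSession{username: username, cachedAt: now}
}

// get returns the cached username if the entry is younger than the TTL.
// Stale entries are dropped so the caller falls back to the sessions table.
func (c *sessionCache) get(id int64, now time.Time) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[id]
	if !ok {
		return "", false
	}
	if now.Sub(e.cachedAt) >= c.ttl {
		delete(c.entries, id)
		return "", false
	}
	return e.username, true
}
```

The trade-off of any cache of this kind is that a revoked session can remain usable on a node until its cache entry expires, so the TTL bounds how long a revocation takes to become effective.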
   354  
   355  # Alternatives
   356  
   357  ### Stateless Sessions
   358  
   359  One alternative considered was a "Stateless Session", where there would be
   360  no sessions table, but instead session information would be encoded using a
   361  "JSON Web Token" (JWT). This is a signed object returned to the user
   362  instead of a session token; the object contains the username, csrf token,
   363  and session id. Using JWT, servers would not need to consult a sessions table
   364  for incoming requests, but instead would simply need to verify the signature
   365  on the token.
   366  
   367  The major issue with JWT is that it does not provide a way to revoke login
   368  sessions; to do this, we would need to store a blacklist of revoked session IDs,
   369  which removes much of the advantage of not having the sessions table in the
   370  first place.
   371  
   372  JWT would also require all machines on the cluster to maintain a shared secret
   373  for signing the tokens.
   374  
   375  ### HTTP Basic Auth
   376  
   377  The simplest option may be HTTP Basic authorization, wherein the browser allows
   378  the user to "log in", and sends a username/password combination with every
   379  request to the server. This requires no implementation on our part beyond
   380  verifying username/password, and in combination with HTTPS there is very little
   381  risk of being compromised by an attacker.
   382  
   383  However, this has two main drawbacks:
   384  
   385  + It does not allow us to track actions at the session level, which is desirable
   386  from the perspective of auditing.
   387  + It does not allow us to provide a custom "login" dialog, a "logout" button, or
   388  persistent client sessions.
   389  
   390  # Unresolved Questions
   391  
   392  ### 401 Unauthorized
   393  
   394  The "401 Unauthorized" response seems to be the most semantically correct HTTP
   395  code to return for unauthenticated attempts to access API methods. However,
   396  returning a 401 from a request seems to cause browsers to display a
   397  username/password dialog for HTTP Basic auth, which we do not want to happen.
   398  
   399  We can try returning 401 responses *without* the WWW-Authenticate header, but
   400  that seems to be in [violation of HTTP
   401  standards](https://tools.ietf.org/html/rfc7235#section-3.1). We could try
   402  sending a custom value in the WWW-Authenticate field (such as "custom"), but
   403  it's also not clear that this will prevent the browser from presenting a
   404  login dialog pop-up.
   405  
   406  If 401 proves to be problematic in this way, we will instead send 403 Forbidden
   407  in all cases where 401 is used in this RFC.