github.com/cockroachdb/cockroach@v20.2.0-alpha.1+incompatible/docs/RFCS/20170701_jobs_monitoring.md (about)

     1  - Feature Name: Jobs Monitoring
     2  - Status: completed ([#16733](https://github.com/cockroachdb/cockroach/pull/16733))
     3  - Start Date: 2017-06-22
     4  - Author: Julian Gilyadov
     5  - RFC PR: [#16688](https://github.com/cockroachdb/cockroach/pull/16688)
     6  - Cockroach Issue: None
     7  
     8  # Motivation
     9  
    10  Recent syntax additions such as `SHOW [QUERIES|SESSIONS]` syntax provide a
    11  convenient interface for monitoring queries and sessions, but for a consistent
    12  and symmetric interface it is important to add `SHOW JOBS` statement too,
    13  especially considering the forthcoming `CANCEL [QUERY|JOB]` statement.
    14  
    15  Although a `SHOW JOBS` statement is additional SQL syntax with no direct basis
    16  in another SQL dialect, it is better to have this implemented rather than have
    17  an inconsistent interface where it is possible to monitor queries and sessions
    18  but not jobs.
    19  
    20  It is also worth noting that at least one current customer has requested the
    21  ability to query job status from SQL directly.
    22  
    23  # Design
    24  
    25  For consistency, it is best to mimic the `SHOW [QUERIES|SESSIONS]` syntax
    26  ([RFC](20170505_monitoring_queries_and_sessions.md)):
    27  
    28  ```sql
    29  SHOW JOBS
    30  ```
    31  
    32  Since all jobs are cluster jobs, there's no need to specify `LOCAL` or `CLUSTER`
    33  (like in the RFC above). Users who want more fine-grained queries can use `SHOW
    34  JOBS` as a data source in a `FROM` clause, thanks to [#15590]:
    35  
    36  ```sql
    37  SELECT fraction_completed FROM [SHOW JOBS] WHERE type = 'RESTORE';
    38  ```
    39  
    40  Though the [`system.jobs`](20170215_system_jobs.md) table is the source for
    41  `SHOW JOBS` and can be queried by users (`SELECT * FROM system.jobs`),
    42  `system.jobs` cannot be used for monitoring. Users cannot retrieve any of
    43  several important fields stored in the `payload` protobuf, like 
    44  `fraction_completed` and `username`, from SQL directly.
    45  
    46  The `crdb_internal.jobs` table wraps the `system.jobs` table and exposes the
    47  important information by decoding the `payload` protobuf. The columns returned
    48  from `crdb_internal.jobs` are exactly the columns we intend intend to return
    49  from `SHOW JOBS`. (In v1.0, we weren't ready to make `crdb_internal.jobs` part
    50  of the public API.) Since we have two months before v1.1 is considered
    51  stable, we feel comfortable making `SHOW JOBS` part of the public API.
    52  
    53  For reference, the columns are:
    54  
    55  * `id` — the ID of the job
    56  * `type` — the type of the job (currently BACKUP or RESTORE)
    57  * `description` — the query used to create the job
    58  * `username` — the user who created the job
    59  * `descriptor_ids` — the IDs of the tables impacted by the job
    60  * `status` — pending, running, succeeded, failed (aborted soon)
    61  * `created, started, finished` — the time of state transitions
    62  * `modified` — the last time this row was updated
    63  * `fraction_completed` — the approximate progress of this job
    64  * `error` — the error that caused the job to fail, if it failed
    65  
    66  # Implementation
    67  
    68  The implementation of `SHOW JOBS` could essentially copy/paste what the internal
    69  `crdb_internal.jobs` table does, with the intention of removing
    70  `crdb_internal.jobs`; however, by keeping the virtual table, we can take
    71  advantage of a forthcoming optimization to push filtering down to the virtual
    72  table constructors, which avoids populating all jobs in memory if the user
    73  specifies a `WHERE` clause.
    74  
    75  Because the internal `crdb_internal.jobs` table is here to stay, the
    76  implementation of `SHOW JOBS` should be simply parsing the internal jobs table,
    77  similarly how `SHOW DATABASES` is implemented:
    78  
    79  ```go
    80  // ShowJobs returns all jobs, past and present, on the cluster.
    81  // Privileges: None.
    82  func (p *planner) ShowJobs(ctx context.Context, n *parser.ShowJobs) (planNode, error) {
    83    const getJobs = `TABLE crdb_internal.jobs`
    84    stmt, err := parser.ParseOne(getJobs)
    85    if err != nil {
    86      return nil, err
    87    }
    88    return p.newPlan(ctx, stmt, nil)
    89  }
    90  ```
    91  
    92  Tests for `SHOW JOBS` should be added in `jobs_test.go` by wrapping the contents
    93  of the `verifyJobRecord` in a function which takes a `source` string. The
    94  function is simply called twice, once with `crdb_internal.jobs` as `source` and
    95  once with `[SHOW JOBS]` as `source`.
    96  
    97  ``` go
    98  func verifyJobRecord(...) error {
    99    testSource := func(source string) error {
   100      var typ string
   101      var created pq.NullTime
   102      // Additional fields...
   103      db.QueryRow(fmt.Sprintf(
   104        `SELECT type, created, ...
   105        FROM %s WHERE created >= $1 ORDER BY created LIMIT 1`, source),
   106        expected.Before,
   107      ).Scan(
   108        &typ, &created
   109      )
   110      // Verification...
   111    }
   112    if err := testSource(`crdb_internal.jobs`); err != nil {
   113      return err
   114    }
   115    return testSource(`[SHOW JOBS]`)
   116  }
   117  ```
   118  
   119  *NB: The implementation of this RFC landed before the RFC itself in [#16733].*
   120  
   121  # Drawbacks
   122  
   123  `SHOW JOBS` is additional SQL syntax with no direct basis in another SQL
   124  dialect.
   125  
   126  # Alternatives
   127  
   128  We could avoid introducing a new syntax for jobs monitoring and simply document
   129  the `crdb_internal.jobs` table as the way to track the status of these
   130  long-running backup, restore, and schema change jobs. The consensus during the
   131  comment period, however, was that we'll be ready to commit to `SHOW JOBS` as a
   132  public API for the v1.1 release.
   133  
   134  # Unresolved questions
   135  
   136  None.
   137  
   138  [#15590]: https://github.com/cockroachdb/cockroach/pull/15590
   139  [#16733]: https://github.com/cockroachdb/cockroach/pull/16733