github.com/cockroachdb/cockroach@v20.2.0-alpha.1+incompatible/docs/RFCS/20170701_jobs_monitoring.md (about) 1 - Feature Name: Jobs Monitoring 2 - Status: completed ([#16733](https://github.com/cockroachdb/cockroach/pull/16733)) 3 - Start Date: 2017-06-22 4 - Author: Julian Gilyadov 5 - RFC PR: [#16688](https://github.com/cockroachdb/cockroach/pull/16688) 6 - Cockroach Issue: None 7 8 # Motivation 9 10 Recent syntax additions such as `SHOW [QUERIES|SESSIONS]` syntax provide a 11 convenient interface for monitoring queries and sessions, but for a consistent 12 and symmetric interface it is important to add `SHOW JOBS` statement too, 13 especially considering the forthcoming `CANCEL [QUERY|JOB]` statement. 14 15 Although a `SHOW JOBS` statement is additional SQL syntax with no direct basis 16 in another SQL dialect, it is better to have this implemented rather than have 17 an inconsistent interface where it is possible to monitor queries and sessions 18 but not jobs. 19 20 It is also worth noting that at least one current customer has requested the 21 ability to query job status from SQL directly. 22 23 # Design 24 25 For consistency, it is best to mimic the `SHOW [QUERIES|SESSIONS]` syntax 26 ([RFC](20170505_monitoring_queries_and_sessions.md)): 27 28 ```sql 29 SHOW JOBS 30 ``` 31 32 Since all jobs are cluster jobs, there's no need to specify `LOCAL` or `CLUSTER` 33 (like in the RFC above). Users who want more fine-grained queries can use `SHOW 34 JOBS` as a data source in a `FROM` clause, thanks to [#15590]: 35 36 ```sql 37 SELECT fraction_completed FROM [SHOW JOBS] WHERE type = 'RESTORE'; 38 ``` 39 40 Though the [`system.jobs`](20170215_system_jobs.md) table is the source for 41 `SHOW JOBS` and can be queried by users (`SELECT * FROM system.jobs`), 42 `system.jobs` cannot be used for monitoring. Users cannot retrieve any of 43 several important fields stored in the `payload` protobuf, like 44 `fraction_completed` and `username`, from SQL directly. 45 46 The `crdb_internal.jobs` table wraps the `system.jobs` table and exposes the 47 important information by decoding the `payload` protobuf. The columns returned 48 from `crdb_internal.jobs` are exactly the columns we intend intend to return 49 from `SHOW JOBS`. (In v1.0, we weren't ready to make `crdb_internal.jobs` part 50 of the public API.) Since we have two months before v1.1 is considered 51 stable, we feel comfortable making `SHOW JOBS` part of the public API. 52 53 For reference, the columns are: 54 55 * `id` — the ID of the job 56 * `type` — the type of the job (currently BACKUP or RESTORE) 57 * `description` — the query used to create the job 58 * `username` — the user who created the job 59 * `descriptor_ids` — the IDs of the tables impacted by the job 60 * `status` — pending, running, succeeded, failed (aborted soon) 61 * `created, started, finished` — the time of state transitions 62 * `modified` — the last time this row was updated 63 * `fraction_completed` — the approximate progress of this job 64 * `error` — the error that caused the job to fail, if it failed 65 66 # Implementation 67 68 The implementation of `SHOW JOBS` could essentially copy/paste what the internal 69 `crdb_internal.jobs` table does, with the intention of removing 70 `crdb_internal.jobs`; however, by keeping the virtual table, we can take 71 advantage of a forthcoming optimization to push filtering down to the virtual 72 table constructors, which avoids populating all jobs in memory if the user 73 specifies a `WHERE` clause. 74 75 Because the internal `crdb_internal.jobs` table is here to stay, the 76 implementation of `SHOW JOBS` should be simply parsing the internal jobs table, 77 similarly how `SHOW DATABASES` is implemented: 78 79 ```go 80 // ShowJobs returns all jobs, past and present, on the cluster. 81 // Privileges: None. 82 func (p *planner) ShowJobs(ctx context.Context, n *parser.ShowJobs) (planNode, error) { 83 const getJobs = `TABLE crdb_internal.jobs` 84 stmt, err := parser.ParseOne(getJobs) 85 if err != nil { 86 return nil, err 87 } 88 return p.newPlan(ctx, stmt, nil) 89 } 90 ``` 91 92 Tests for `SHOW JOBS` should be added in `jobs_test.go` by wrapping the contents 93 of the `verifyJobRecord` in a function which takes a `source` string. The 94 function is simply called twice, once with `crdb_internal.jobs` as `source` and 95 once with `[SHOW JOBS]` as `source`. 96 97 ``` go 98 func verifyJobRecord(...) error { 99 testSource := func(source string) error { 100 var typ string 101 var created pq.NullTime 102 // Additional fields... 103 db.QueryRow(fmt.Sprintf( 104 `SELECT type, created, ... 105 FROM %s WHERE created >= $1 ORDER BY created LIMIT 1`, source), 106 expected.Before, 107 ).Scan( 108 &typ, &created 109 ) 110 // Verification... 111 } 112 if err := testSource(`crdb_internal.jobs`); err != nil { 113 return err 114 } 115 return testSource(`[SHOW JOBS]`) 116 } 117 ``` 118 119 *NB: The implementation of this RFC landed before the RFC itself in [#16733].* 120 121 # Drawbacks 122 123 `SHOW JOBS` is additional SQL syntax with no direct basis in another SQL 124 dialect. 125 126 # Alternatives 127 128 We could avoid introducing a new syntax for jobs monitoring and simply document 129 the `crdb_internal.jobs` table as the way to track the status of these 130 long-running backup, restore, and schema change jobs. The consensus during the 131 comment period, however, was that we'll be ready to commit to `SHOW JOBS` as a 132 public API for the v1.1 release. 133 134 # Unresolved questions 135 136 None. 137 138 [#15590]: https://github.com/cockroachdb/cockroach/pull/15590 139 [#16733]: https://github.com/cockroachdb/cockroach/pull/16733