# Cron Pipeline

Pachyderm triggers pipelines when new changes appear in the input repository.
However, if you want to trigger a pipeline based on time instead of upon
arrival of input data, you can schedule such pipelines to run periodically
by using the Pachyderm built-in cron input type.

A standard pipeline with a PFS input might not satisfy
the requirements of the following tasks:

- Scrape websites
- Make API calls
- Query a database
- Retrieve a file from a location accessible through the S3 protocol
  or the File Transfer Protocol (FTP)

At a minimum, a cron pipeline must include the following parameters:

| Parameter | Description |
| ---------- | ------------ |
| `"name"` | A descriptive name of the cron pipeline. |
| `"spec"` | An interval between scheduled cron jobs, given as a cron expression or an `@every` interval. <br> For example, if you set `*/10 * * * *`, the pipeline runs every ten minutes. |

## Example of a Cron Pipeline

Suppose you want to query a database every ten seconds and update your
dataset with the new data every time the pipeline is triggered. The following
pipeline extract illustrates how you can specify this configuration.

!!! example

    ```json
    "input": {
        "cron": {
            "name": "tick",
            "spec": "@every 10s"
        }
    }
    ```

When you create this pipeline, Pachyderm creates a new input data repository
that corresponds to the `cron` input. Then, Pachyderm automatically commits
a timestamp file to the `cron` input repository every ten seconds, which
triggers the pipeline.

The pipeline runs every ten seconds, queries the database, and updates its
output.
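As a rough illustration of the user code such a pipeline might run on each
trigger, the following Python sketch writes one snapshot per tick. It assumes
that the cron input is mounted at `/pfs/tick` (matching the `name` above), that
results are written to `/pfs/out`, and that tick file names sort
chronologically. `fetch_rows` and `run_tick` are hypothetical names, and
`fetch_rows` is only a placeholder for a real database query.

```python
import json
import os
from typing import Callable, List


def fetch_rows() -> List[dict]:
    """Hypothetical placeholder for the real database query."""
    return [{"id": 1, "value": "example"}]


def run_tick(tick_dir: str, out_dir: str,
             fetch: Callable[[], List[dict]] = fetch_rows) -> str:
    """Write one JSON snapshot named after the newest tick file in tick_dir."""
    ticks = sorted(os.listdir(tick_dir))
    if not ticks:
        raise RuntimeError("no tick file found in " + tick_dir)
    latest = ticks[-1]  # assumption: tick names sort chronologically
    os.makedirs(out_dir, exist_ok=True)
    out_path = os.path.join(out_dir, latest + ".json")
    with open(out_path, "w") as f:
        json.dump(fetch(), f)
    return out_path
```

Inside the pipeline container, the script would call
`run_tick("/pfs/tick", "/pfs/out")`. The directories are parameters so that the
function can be exercised locally against temporary folders.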
By default, each cron trigger adds a new tick file to the cron input
repository, accumulating more datums over time. This behavior works for some
pipelines. For others, you might want each tick file to overwrite the
previous one. You can set the `overwrite` flag to `true` to overwrite the
timestamp file on each tick. To learn more about overwriting commits in
Pachyderm, see [Datum processing](../datum/index.md).

!!! example

    ```json
    "input": {
        "cron": {
            "name": "tick",
            "spec": "@every 10s",
            "overwrite": true
        }
    }
    ```

!!! note "See Also:"
    [Periodic Ingress from MongoDB](https://github.com/pachyderm/pachyderm/tree/master/examples/db)
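To make the effect of the `overwrite` flag concrete, the following Python
sketch models the contents of the cron input repository after a series of
ticks. It is a simplified toy model, not Pachyderm's implementation, and the
function name `apply_ticks` is hypothetical.

```python
from typing import List


def apply_ticks(timestamps: List[str], overwrite: bool) -> List[str]:
    """Return the tick files left in the cron input repo after all ticks."""
    files: List[str] = []
    for ts in timestamps:
        if overwrite:
            files = [ts]      # the new tick replaces the previous one
        else:
            files.append(ts)  # the new tick is kept alongside older ones
    return files
```

With three ticks, the default behavior leaves three tick files in this model,
while `overwrite=True` leaves only the most recent one.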