---
layout: "docs"
page_title: "Operating a Job: Updating Jobs"
sidebar_current: "docs-jobops-updating"
description: |-
  Learn how to safely update Nomad Jobs.
---

# Updating a Job

When operating a service, updating the version of the job will be a common
task. Under a cluster scheduler the same best practices apply for reliably
deploying new versions, including rolling updates, blue-green deploys and
canaries, which are a special case of blue-green deploys. This section explores
how to do each of these safely with Nomad.

## Rolling Updates

In order to update a service without introducing downtime, Nomad has built-in
support for rolling updates. When a job specifies a rolling update, with the
below syntax, Nomad will update only `max_parallel` task groups at a time and
will wait `stagger` duration before updating the next set.

```
job "rolling" {
  ...

  update {
    stagger      = "30s"
    max_parallel = 1
  }

  ...
}
```

We can use the [`nomad plan` command](/docs/commands/plan.html) while updating
jobs to ensure the scheduler will do as we expect. In this example, we have 3
web server instances whose version we want to update. After the job file is
modified we can run `plan`:

```
$ nomad plan my-web.nomad
+/- Job: "my-web"
+/- Task Group: "web" (3 create/destroy update)
  +/- Task: "web" (forces create/destroy update)
    +/- Config {
          +/- image:             "nginx:1.10" => "nginx:1.11"
              port_map[0][http]: "80"
        }

Scheduler dry-run:
- All tasks successfully allocated.
- Rolling update, next evaluation will be in 10s.

Job Modify Index: 7
To submit the job with version verification run:

nomad run -check-index 7 my-web.nomad

When running the job with the check-index flag, the job will only be run if the
server side version matches the job modify index returned. If the index has
changed, another user has modified the job and the plan's results are
potentially invalid.
```

Here we can see that Nomad will destroy the 3 existing tasks and create 3
replacements, but it will do so as a rolling update with a stagger of `10s`.
For more details on the update block, see the
[Jobspec documentation](/docs/jobspec/index.html#update).

## Blue-green and Canaries

Blue-green deploys go by several names, Red/Black, A/B, Blue/Green, but the
concept is the same: have two sets of applications with only one of them being
live at a given time, except while transitioning from one set to the other. The
"live" set is the set of applications receiving traffic.

So imagine we have an API server that has 10 instances deployed to production
at version 1 and we want to upgrade to version 2. Hopefully the new version has
been tested in a QA environment and is now ready to start accepting production
traffic.

In this case we would consider version 1 to be the live set and we want to
transition to version 2. We can model this workflow with the below job:

```
job "my-api" {
  ...

  group "api-green" {
    count = 10

    task "api-server" {
      driver = "docker"

      config {
        image = "api-server:v1"
      }
    }
  }

  group "api-blue" {
    count = 0

    task "api-server" {
      driver = "docker"

      config {
        image = "api-server:v2"
      }
    }
  }
}
```

Here we can see the live group is "api-green" since it has a non-zero count. To
transition to v2, we raise the count of "api-blue" and lower the count of
"api-green". We can now see how the canary process is a special case of
blue-green. If we set "api-blue" to `count = 1` and "api-green" to `count = 9`,
there will still be the original 10 instances but we will be testing only one
instance of the new version, essentially canarying it.
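To make that canary step concrete, here is a sketch of the same job during the
canary; only the `count` values differ from the job above:

```
job "my-api" {
  ...

  group "api-green" {
    # Scale the live set down by one to make room for the canary.
    count = 9

    task "api-server" {
      driver = "docker"

      config {
        image = "api-server:v1"
      }
    }
  }

  group "api-blue" {
    # Run a single canary instance of the new version.
    count = 1

    task "api-server" {
      driver = "docker"

      config {
        image = "api-server:v2"
      }
    }
  }
}
```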
If at any time we notice that the new version is behaving incorrectly and we
want to roll back, all we have to do is drop the count of the new group to 0
and restore the original group's count to 10. This fine-grained control lets
job operators be confident that deployments will not cause downtime. If the
deploy is successful and we fully transition from v1 to v2, the job file will
look like this:

```
job "my-api" {
  ...

  group "api-green" {
    count = 0

    task "api-server" {
      driver = "docker"

      config {
        image = "api-server:v1"
      }
    }
  }

  group "api-blue" {
    count = 10

    task "api-server" {
      driver = "docker"

      config {
        image = "api-server:v2"
      }
    }
  }
}
```

Now "api-blue" is the live group and when we are ready to update the API to v3,
we would modify "api-green" and repeat this process. The rate at which the
counts of the groups are incremented and decremented is entirely up to the
user. It is usually good practice to start by transitioning one instance at a
time until a certain confidence threshold is met, based on application-specific
logs and metrics.

## Handling Drain Signals

On operating systems that support signals, Nomad will signal the application
before killing it. This gives the application time to gracefully drain
connections and conduct any other cleanup that is necessary. Certain
applications take longer to drain than others and as such Nomad lets the job
file specify how long to wait between signaling the application to exit and
forcefully killing it. This is configurable via the `kill_timeout`. More
details can be seen in the
[Jobspec documentation](/docs/jobspec/index.html#kill_timeout).
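As an illustration, a task that needs extra time to drain connections might set
`kill_timeout` as below; the `45s` value is an assumed example, not a
recommendation:

```
job "my-api" {
  ...

  group "api-blue" {
    count = 10

    task "api-server" {
      driver = "docker"

      # Assumed example value: give the task up to 45 seconds between
      # receiving the exit signal and being forcefully killed.
      kill_timeout = "45s"

      config {
        image = "api-server:v2"
      }
    }
  }
}
```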