Charms in action
================

This document describes the behaviour of the Go implementation of the unit
agent, whose behaviour differs in some respects from that of the Python
implementation. This information is largely relevant to charm authors, and
potentially to developers interested in the unit agent.

Hooks
-----

A service unit's direct action is entirely defined by its charm's hooks. Hooks
are executable files in a charm's hooks directory; hooks with particular names
will be invoked by the juju unit agent at particular times, and thereby cause
changes to the world.

Whenever a hook-worthy event takes place, the unit agent tries to run the hook
with the appropriate name. If the hook doesn't exist, the agent continues
without complaint; if it does, it is invoked without arguments in a specific
environment, and its output is written to the unit agent's log. If it returns
a non-zero exit code, the agent enters an error state and awaits resolution;
otherwise it continues to process model changes as before.

In general, a unit will run hooks in a clear sequence, about which a number of
useful guarantees are made. All such guarantees come with the caveat that there
is [TODO: will be: `remove-unit --force`] a mechanism for forcible termination
of a unit, and that a unit so terminated will just stop, dead, and completely
fail to run anything else ever again. This shouldn't actually be a big deal in
practice.

Errors in hooks
---------------

Hooks should ideally be idempotent, so that they can fail and be re-executed
from scratch without trouble.
As a hook author, you don't have complete control
over the times your hook might be stopped: if the unit agent process is killed
for any reason while running a hook, then when it recovers it will treat that
hook as having failed -- just as if it had returned a non-zero exit code -- and
request user intervention.

It is unrealistic to expect great sophistication on the part of the average user,
and as a charm author you should expect that users will attempt to re-execute
failed hooks before attempting to investigate or understand the situation. You
should therefore make every effort to ensure your hooks are idempotent when
aborted and restarted.

[TODO: I have a vague feeling that `juju resolved` actually defaults to "just
pretend the hook ran successfully" mode. I'm not sure that's really the best
default, but I'm also not sure we're in a position to change the UI that much.]

The most sophisticated charms will consider the nature of their operations with
care, and will be prepared to internally retry any operations they suspect of
having failed transiently, to ensure that they only request user intervention in
the most trying circumstances; and will also be careful to log any relevant
information or advice before signalling the error.

[TODO: I just thought; it would be really nice to have a juju-fail hook tool,
which would allow charm authors to explicitly set the unit's error status to
something a bit more sophisticated than "X hook failed". Wishlist, really.]

Charm deployment
----------------

  * A charm is deployed into a directory that is entirely owned and controlled
    by juju.
  * At certain times, control of the directory is ceded to the charm (by
    running a hook) or to the user (by entering an error state).
  * At these times, and only at these times, should the charm directory be
    used by anything other than juju itself.

The most important consequence of this is that it is a mistake to conflate the
state of the charm with the state of the software deployed by the charm: it's
fine to store *charm* state in the charm directory, but the charm must deploy
its actual software elsewhere on the system.

To put it another way: deleting the charm directory should not impact the
software deployed by the charm in any way; and there is currently no mechanism
by which deployed software can safely feed information back into the charm
and/or expect that it will be acted upon in a timely way.

[TODO: this sucks a bit. We have plans for a tool called `juju-run`, which
would allow an arbitrary script to be invoked as though it were a hook at any
time (well, it'd block until no other hook were running, but still). Probably
isn't even that hard but it's still rolling around my brain, might either click
soon or be overridden by higher priorities and be left for ages. I'm less sure,
but have a suspicion, that `juju ssh <unit>` should also default to a juju-run
model: primarily because, without this, in the context of forced upgrades,
the system cannot offer *any* guarantees about what it might suddenly do to the
charm directory while the user's doing things with it. The alternative is to
allow unguarded ssh, but tell people that they have to use something like
`juju-run --interactive` before they modify the charm dir; this feels somewhat
user-hostile, though.]

Execution environment
---------------------

Every hook is run in the deployed charm directory, in an environment with the
following characteristics:

  * $PATH is prefixed by a directory containing command line tools through
    which the hooks can interact with juju.
  * $CHARM_DIR holds the path to the charm directory.
  * $JUJU_UNIT_NAME holds the name of the local unit.
  * $JUJU_CONTEXT_ID and $JUJU_AGENT_SOCKET are set (but should not be messed
    with: the command line tools won't work without them).
  * $JUJU_API_ADDRESSES holds a space separated list of juju API addresses.
  * $JUJU_MODEL_NAME holds the human friendly name of the current model.

Hook tools
----------

All hooks can directly use the following tools:

  * juju-log (write arguments directly to juju's log; potentially redundant,
    since hook output is all logged anyway, but --debug may remain useful)
  * unit-get (returns the local unit's private-address or public-address)
  * open-port (marks the supplied port/protocol as ready to open when the
    service is exposed)
  * close-port (reverses the effect of open-port)
  * config-get (get current service configuration values)
  * relation-get (get the settings of some related unit)
  * relation-set (write the local unit's relation settings)
  * relation-ids (list all relations using a given charm relation)
  * relation-list (list all units of a related service)
  * storage-add (add storage instances)
  * storage-get (get storage instance values)
  * status-get (get unit workload status information)
  * status-set (set unit workload status information)

Within the context of a single hook execution, the above tools present a
sandboxed view of the system with the following properties:

  * Any data retrieved corresponds to the real value of the underlying state at
    some point in time.
  * Once state data has been observed within a given hook execution, further
    requests for the same data will produce the same results, unless that data
    has been explicitly changed with relation-set.
  * Data changed by relation-set is only written to global state when the hook
    completes without error; changes made by a failing hook will be discarded
    and never observed by any other part of the system.
  * Not actually sandboxed: open-port and close-port operate directly on state.
    [TODO: lp:1089304 - might be a little tricky.]

Hook kinds
----------

There are 5 `unit hooks` with predefined names that can be implemented by any
charm:

  * install
  * config-changed
  * start
  * upgrade-charm
  * stop

For every relation defined by a charm, an additional 4 `relation hooks` can be
implemented, named after the charm relation:

  * <name>-relation-joined
  * <name>-relation-changed
  * <name>-relation-departed
  * <name>-relation-broken

Unit hooks
----------

The `install` hook always runs once, and only once, before any other hook.

The `config-changed` hook always runs once immediately after the install hook,
and likewise after the upgrade-charm hook. It also runs whenever the service
configuration changes, and when recovering from transient unit agent errors.

The `start` hook always runs once immediately after the first config-changed
hook; there are currently no other circumstances in which it will be called,
but this may change in the future.

The `upgrade-charm` hook always runs once immediately after the charm directory
contents have been changed by an unforced charm upgrade operation, and *may* do
so after a forced upgrade; but will *not* be run after a forced upgrade from an
existing error state. (Consequently, neither will the config-changed hook that
would ordinarily follow the upgrade-charm.)

The `stop` hook is the last hook to be run before the unit is destroyed. In the
future, it may be called in other situations.

In normal operation, a unit will run at least the install, start, config-changed
and stop hooks over the course of its lifetime.
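
As an illustration of how the five unit hook names might map onto code, here is
a minimal Python sketch of a single dispatcher script. It assumes (this is not
stated above) that every file in the hooks directory is a symlink to this one
script, so that the hook name can be recovered from the path the agent invoked.
The handler functions are hypothetical placeholders.

```python
import os

# Hypothetical handlers: a real charm would install, configure, start and
# stop its software here. Each returns a process exit code (0 = success).
def install():
    return 0

def config_changed():
    return 0

def start():
    return 0

def upgrade_charm():
    return 0

def stop():
    return 0

# The five unit hook names listed in this document, mapped to handlers.
UNIT_HOOKS = {
    "install": install,
    "config-changed": config_changed,
    "start": start,
    "upgrade-charm": upgrade_charm,
    "stop": stop,
}

def dispatch(invoked_path):
    """Run the handler whose hook name matches the invoked file's basename."""
    name = os.path.basename(invoked_path)
    handler = UNIT_HOOKS.get(name)
    if handler is None:
        # Invoked under a name we do not implement: succeed without acting.
        return 0
    return handler()

# In the real hook file one would finish with something like:
#     import sys; sys.exit(dispatch(sys.argv[0]))
```

Because hooks are invoked without arguments, the invoked path is the only input
the dispatcher needs; everything else comes from the environment and the hook
tools.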

It should be noted that, while all hook tools are available to all hooks, the
relation-* tools are not useful to the install, start, and stop hooks; this is
because the first two are run before the unit has any opportunity to participate
in any relations, and the stop hook will not be run while the unit is still
participating in one.

Relation hooks
--------------

For each charm relation, any or all of the 4 relation hooks can be implemented.
Relation hooks operate in an environment slightly different to that of unit
hooks, in the following ways:

  * JUJU_RELATION is set to the name of the charm relation. This is of limited
    value, because every relation hook already "knows" what charm relation it
    was written for; that is, in the "foo-relation-joined" hook, JUJU_RELATION
    is "foo".
  * JUJU_RELATION_ID is more useful, because it serves as a unique identifier
    for a particular relation, and thereby allows the charm to handle distinct
    relations over a single endpoint. In hooks for the "foo" charm relation,
    JUJU_RELATION_ID always has the form "foo:<id>", where id uniquely but
    opaquely identifies the runtime relation currently in play.
  * The relation-* hook tools, which ordinarily require that a relation be
    specified, assume they're being called with respect to the current
    relation. The default can of course be overridden as usual.

Furthermore, all relation hooks except relation-broken are notifications about
some specific unit of a related service, and operate in an environment with the
following additional properties:

  * JUJU_REMOTE_UNIT is set to the name of the current related unit.
  * The relation-get hook tool, which ordinarily requires that a related unit
    be specified, assumes that it is being called with respect to the current
    related unit. The default can of course be overridden as usual.
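
The environment variables above can be gathered into one value at the top of a
relation hook. The following Python sketch shows one way to do so; the
RelationContext type and relation_context function are illustrative names, not
part of juju, and the fallback that derives the relation name from
JUJU_RELATION_ID relies only on the "foo:<id>" form described above.

```python
import os
from typing import Mapping, NamedTuple, Optional

class RelationContext(NamedTuple):
    name: str                   # charm relation name, e.g. "foo"
    relation_id: str            # runtime relation id, e.g. "foo:123"
    remote_unit: Optional[str]  # e.g. "bar/99"; None in relation-broken

def relation_context(
    env: Mapping[str, str] = os.environ,
) -> Optional[RelationContext]:
    """Return the relation context, or None when not in a relation hook."""
    relation_id = env.get("JUJU_RELATION_ID")
    if not relation_id:
        return None
    # JUJU_RELATION should be set too; fall back to the id's "foo:" prefix.
    name = env.get("JUJU_RELATION") or relation_id.split(":", 1)[0]
    remote_unit = env.get("JUJU_REMOTE_UNIT") or None
    return RelationContext(name, relation_id, remote_unit)
```

A hook can then branch on remote_unit being None to tell a relation-broken
invocation apart from the per-unit relation hooks.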

For every relation in which a unit participates, hooks for the appropriate charm
relation are run according to the following rules.

The "relation-joined" hook always runs once when a related unit is first seen.

The "relation-changed" hook for a given unit always runs once immediately
following the relation-joined hook for that unit, and subsequently whenever
the related unit changes its settings (by calling relation-set and exiting
without error). Note that "immediately" only applies within the context of
this particular runtime relation -- that is, when "foo-relation-joined" is
run for unit "bar/99" in relation id "foo:123", the only guarantee is that
the next hook to be run *in relation id "foo:123"* will be "foo-relation-changed"
for "bar/99". Unit hooks may intervene, as may hooks for other relations,
and even for other "foo" relations.

The "relation-departed" hook for a given unit always runs once when a related
unit is no longer related. After the "relation-departed" hook has run, no
further notifications will be received from that unit; however, its settings
will remain accessible via relation-get for the complete lifetime of the
relation.

The "relation-broken" hook is not specific to any unit, and always runs once
when the local unit is ready to depart the relation itself. Before this hook
is run, a relation-departed hook will be executed for every unit known to be
related; it will never run while the relation appears to have members, but it
may be the first and only hook to run for a given relation. The stop hook will
not run while relations remain to be broken.

Relations in depth
------------------

A unit's `scope` consists of the group of units that are transitively connected
to that unit within a particular relation.
So, for a globally-scoped relation,
that means every unit of each service in the relation; for a locally-scoped
relation, it means only those sets of units which are deployed alongside one
another. That is to say: a globally-scoped relation has a single unit scope,
whilst a locally-scoped relation has one for each principal unit.

When a unit becomes aware that it is a member of a relation, its only
self-directed action is to `join` its scope within that relation. This involves
two steps:

  * Write initial relation settings (just one value, "private-address"), to
    ensure that they will be available to observers before they're triggered
    by the next step;
  * Signal its existence, and role in the relation, to the rest of the system.

The unit then starts observing and reacting to any other units in its scope
which are playing a role in which it is interested. To be specific:

  * Each provider unit observes every requirer unit
  * Each requirer unit observes every provider unit
  * Each peer unit observes every other peer unit

Now, suppose that some unit is the very first unit to join the relation; and
let's say it's a requirer. No provider units are present, so no hooks will fire.
But, when a provider unit joins the relation, the requirer and provider become
aware of each other almost simultaneously. (Similarly, the first two units in a
peer relation become aware of each other almost simultaneously.)

So, concurrently, the units on each side of the relation run their relation-joined
and relation-changed hooks with respect to their counterpart. The intent is that
they communicate appropriate information to each other to set up some sort of
connection, by using the relation-set and relation-get hook tools; but neither
unit is safe to assume that any particular setting has yet been set by its
counterpart.

This sounds kinda tricky to deal with, but merely requires suitable respect for
the relation-get tool: it is important to realise that relation-get is *never*
guaranteed to return any values at all, because we have decided that it's
perfectly legitimate for a unit to delete its own private-address value.

[TODO: There is a school of thought that maintains that we should add an
independent "juju-private-address" setting that *is* guaranteed, but for now
the reality is that relation-get can *always* fail to produce any given value.
However, in the name of sanity, it's probably reasonable to treat a missing
private-address as an error, and assume that `relation-get private-address` is
always safe. For all other values, we must operate with the understanding that
relation-get can always fail.]

In one specific kind of hook, this is easy to deal with. A relation-changed hook
can always exit without error when the current remote unit is missing data,
because the hook is guaranteed to be run again when that data changes -- and,
assuming the remote unit is running a charm that agrees on how to implement the
interface, the data *will* change and the hook *will* be run again.

In *all* other cases -- unit hooks, relation hooks for a different relation,
relation hooks for a different remote unit in the same relation, and even
relation hooks other than -changed for the *same* remote unit -- there is no
such guarantee. These hooks all run on their own schedule, and there is no
reason to expect them to be re-run on a predictable schedule, or in some cases
ever again.
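
The relation-changed case can be sketched in Python as follows. This is an
illustration under stated assumptions, not a reference implementation: the
"host" and "port" setting names and the configure callback are hypothetical,
and the relation_get wrapper assumes that an unset key produces empty output
from the relation-get tool.

```python
import subprocess
from typing import Callable, Dict, Optional, Sequence

def relation_get(key: str) -> Optional[str]:
    """Read one setting of the current remote unit via the relation-get tool.

    Only meaningful inside a relation hook. Assumption: a key the remote
    unit has not set yields empty output, which is reported here as None.
    """
    result = subprocess.run(
        ["relation-get", key], capture_output=True, text=True
    )
    value = result.stdout.strip()
    return value if result.returncode == 0 and value else None

def relation_changed(
    get: Callable[[str], Optional[str]],
    configure: Callable[[Dict[str, str]], None],
    keys: Sequence[str] = ("host", "port"),  # hypothetical setting names
) -> int:
    """Return the hook's exit code; missing data is never an error."""
    settings = {key: get(key) for key in keys}
    missing = sorted(key for key, value in settings.items() if value is None)
    if missing:
        # Exit cleanly: the hook runs again when the remote unit changes
        # its settings, so there is nothing to gain by failing here.
        print("remote unit has not yet set: " + ", ".join(missing))
        return 0
    configure(settings)
    return 0
```

In a real hook, get would be relation_get; passing it in as a parameter keeps
the decision logic testable outside a hook context.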

Every hook other than a relation-changed hook for the current remote unit
therefore needs to be able to handle missing relation data, and to complete
successfully; such hooks mustn't fail, because the user is powerless to resolve
the situation, and they can't even wait for state to change, because they all
see their own sandboxed composite snapshot of fairly-recent state, which never
changes.

So, outside a very narrow range of circumstances, relation-get should be treated
with particular care. The corresponding advice for relation-set is very simple
by comparison: relation-set should be called early and often. Because the unit
agent serializes hook execution, there is never any danger of concurrent changes
to the data, and so a null setting change can be safely ignored, and will not
cause other units to react.

Departing relations
-------------------

A unit will depart a relation when either the relation or the unit itself is
marked for termination. In either case, it follows the same sequence:

  * For every known related unit -- those which have joined and not yet
    departed -- run the relation-departed hook.
  * Run the relation-broken hook.
  * `depart` from its scope in the relation.

The unit's departure from its scope will in turn be detected by units of the
related service, and cause them to run relation-departed hooks. A unit's
relation settings persist beyond its own departure from the relation; the
final unit to depart a relation marked for termination is responsible for
destroying the relation and all associated data.

Debugging charms
----------------

Facilities are currently not good.

  * juju ssh
  * juju debug-hooks [TODO: not implemented]
  * juju debug-log [TODO: not implemented]