github.com/rogpeppe/juju@v0.0.0-20140613142852-6337964b789e/doc/charms-in-action.txt

     1  Charms in action
     2  ================
     3  
     4  This document describes the behaviour of the Go implementation of the unit
     5  agent, whose behaviour differs in some respects from that of the Python
     6  implementation. This information is largely relevant to charm authors, and
     7  potentially to developers interested in the unit agent.
     8  
     9  Hooks
    10  -----
    11  
    12  A service unit's direct action is entirely defined by its charm's hooks. Hooks
    13  are executable files in a charm's hooks directory; hooks with particular names
    14  will be invoked by the juju unit agent at particular times, and thereby cause
    15  changes to the world.
    16  
    17  Whenever a hook-worthy event takes place, the unit agent tries to run the hook
    18  with the appropriate name. If the hook doesn't exist, the agent continues
    19  without complaint; if it does, it is invoked without arguments in a specific
    20  environment, and its output is written to the unit agent's log. If it returns
    21  a non-zero exit code, the agent enters an error state and awaits resolution;
    22  otherwise it continues to process environment changes as before.
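
The dispatch rule described above can be modelled in a few lines of Python.
This is a minimal sketch for illustration only, not the unit agent's actual
code: `run_hook`, its `log` parameter and its return values are hypothetical
names, and the real agent also prepares the hook environment before running
anything.

```python
import os
import subprocess


def run_hook(hooks_dir, name, log=print):
    """Model of the dispatch rule; returns "skipped", "ok" or "error"."""
    path = os.path.join(hooks_dir, name)
    if not os.path.exists(path):
        # A missing hook is not an error: carry on without complaint.
        return "skipped"
    # Hooks are executable files and are invoked without arguments.
    proc = subprocess.run(
        [path], stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    # All hook output ends up in the log.
    for line in proc.stdout.decode(errors="replace").splitlines():
        log(line)
    # A non-zero exit code puts the unit into an error state.
    return "ok" if proc.returncode == 0 else "error"
```

The sketch assumes the hook file has its executable bit set; a file without
it would raise an exception here rather than being reported as a hook error.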
    23  
    24  In general, a unit will run hooks in a clear sequence, about which a number of
    25  useful guarantees are made. All such guarantees come with the caveat that there
    26  is [TODO: will be: `remove-unit --force`] a mechanism for forcible termination
    27  of a unit, and that a unit so terminated will just stop, dead, and completely
    28  fail to run anything else ever again. This shouldn't actually be a big deal in
    29  practice.
    30  
    31  Errors in hooks
    32  ---------------
    33  
    34  Hooks should ideally be idempotent, so that they can fail and be re-executed
    35  from scratch without trouble. As a hook author, you don't have complete control
    36  over the times your hook might be stopped: if the unit agent process is killed
    37  for any reason while running a hook, then when it recovers it will treat that
    38  hook as having failed -- just as if it had returned a non-zero exit code -- and
    39  request user intervention.
    40  
    41  It is unrealistic to expect great sophistication on the part of the average user,
    42  and as a charm author you should expect that users will attempt to re-execute
    43  failed hooks before attempting to investigate or understand the situation. You
    44  should therefore make every effort to ensure your hooks are idempotent when
    45  aborted and restarted.
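
As one illustration of what idempotence can mean in practice, the sketch below
writes a configuration file in such a way that the step can be aborted at any
point and simply run again. The function name `write_config` and the file
layout are assumptions made for the example; nothing here is specific to juju.

```python
import os
import tempfile


def write_config(path, content):
    """Write content to path; safe to repeat after an aborted run."""
    directory = os.path.dirname(path)
    # exist_ok makes a second run succeed when the directory is already there.
    os.makedirs(directory, exist_ok=True)
    # Write to a temporary file in the same directory, then rename it into
    # place, so an interrupted run never leaves a half-written file at path.
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write(content)
    os.replace(tmp, path)
```

Running the step twice with the same content leaves exactly the same state as
running it once, which is the property a re-executed hook relies on.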
    46  
    47  [TODO: I have a vague feeling that `juju resolved` actually defaults to "just
    48  pretend the hook ran successfully" mode. I'm not sure that's really the best
    49  default, but I'm also not sure we're in a position to change the UI that much.]
    50  
    51  The most sophisticated charms will consider the nature of their operations with
    52  care, and will be prepared to internally retry any operations they suspect of
    53  having failed transiently, to ensure that they only request user intervention in
    54  the most trying circumstances; and will also be careful to log any relevant
    55  information or advice before signalling the error.
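
A charm that wants to retry transient failures internally might use a helper
along these lines. This is a sketch under stated assumptions: `retry` and its
parameters are hypothetical, and `log` stands in for whatever the hook uses to
record messages (plain output would do, since hook output is logged anyway).

```python
import time


def retry(operation, attempts=3, delay=1.0, log=print):
    """Call operation() until it succeeds or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as err:
            # Record what went wrong before any error is signalled.
            log("attempt %d of %d failed: %s" % (attempt, attempts, err))
            if attempt == attempts:
                # Out of retries: re-raise so that the hook can exit
                # non-zero and request user intervention.
                raise
            time.sleep(delay)
```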
    56  
    57  [TODO: I just thought; it would be really nice to have a juju-fail hook tool,
    58  which would allow charm authors to explicitly set the unit's error status to
    59  something a bit more sophisticated than "X hook failed". Wishlist, really.]
    60  
    61  Charm deployment
    62  ----------------
    63  
    64    * A charm is deployed into a directory that is entirely owned and controlled
    65      by juju.
    66    * At certain times, control of the directory is ceded to the charm (by
    67      running a hook) or to the user (by entering an error state).
    68    * At these times, and only at these times, should the charm directory be
    69      used by anything other than juju itself.
    70  
    71  The most important consequence of this is that it is a mistake to conflate the
    72  state of the charm with the state of the software deployed by the charm: it's
    73  fine to store *charm* state in the charm directory, but the charm must deploy
    74  its actual software elsewhere on the system.
    75  
    76  To put it another way: deleting the charm directory should not impact the
    77  software deployed by the charm in any way; and there is currently no mechanism
    78  by which deployed software can safely feed information back into the charm
    79  and/or expect that it will be acted upon in a timely way.
    80  
    81  [TODO: this sucks a bit. We have plans for a tool called `juju-run`, which
    82  would allow an arbitrary script to be invoked as though it were a hook at any
    83  time (well, it'd block until no other hook were running, but still). Probably
    84  isn't even that hard but it's still rolling around my brain, might either click
    85  soon or be overridden by higher priorities and be left for ages. I'm less sure,
    86  but have a suspicion, that `juju ssh <unit>` should also default to a juju-run
    87  environment: primarily because, without this, in the context of forced upgrades,
    88  the system cannot offer *any* guarantees about what it might suddenly do to the
    89  charm directory while the user's doing things with it. The alternative is to
    90  allow unguarded ssh, but tell people that they have to use something like
    91  `juju-run --interactive` before they modify the charm dir; this feels somewhat
    92  user-hostile, though.]
    93  
    94  Execution environment
    95  ---------------------
    96  
    97  Every hook is run in the deployed charm directory, in an environment with the
    98  following characteristics:
    99  
   100    * $PATH is prefixed by a directory containing command line tools through
   101      which the hooks can interact with juju.
   102    * $CHARM_DIR holds the path to the charm directory.
   103    * $JUJU_UNIT_NAME holds the name of the local unit.
   104    * $JUJU_CONTEXT_ID and $JUJU_AGENT_SOCKET are set (but should not be messed
   105      with: the command line tools won't work without them).
   106    * $JUJU_API_ADDRESSES holds a space separated list of juju API addresses.
   107    * $JUJU_ENV_NAME holds the human friendly name of the current environment.
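
A hook written in Python could collect these variables as follows. This is a
minimal sketch: the function name `hook_environment` and the dictionary keys
are made up for the example, and only variables listed above are read.

```python
import os


def hook_environment(environ=os.environ):
    """Collect the documented hook variables into a dict."""
    return {
        "charm_dir": environ.get("CHARM_DIR"),
        "unit_name": environ.get("JUJU_UNIT_NAME"),
        "env_name": environ.get("JUJU_ENV_NAME"),
        # JUJU_API_ADDRESSES is a space separated list.
        "api_addresses": environ.get("JUJU_API_ADDRESSES", "").split(),
    }
```

JUJU_CONTEXT_ID and JUJU_AGENT_SOCKET are deliberately left alone; they only
need to be passed through untouched to the command line tools.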
   108  
   109  Hook tools
   110  ----------
   111  
   112  All hooks can directly use the following tools:
   113  
   114    * juju-log (writes its arguments directly to juju's log; potentially
   115      redundant, as hook output is logged anyway, but --debug may remain useful)
   116    * unit-get (returns the local unit's private-address or public-address)
   117    * open-port (marks the supplied port/protocol as ready to open when the
   118      service is exposed)
   119    * close-port (reverses the effect of open-port)
   120    * config-get (get current service configuration values)
   121    * relation-get (get the settings of some related unit)
   122    * relation-set (write the local unit's relation settings)
   123    * relation-ids (list all relations using a given charm relation)
   124    * relation-list (list all units of a related service)
   125  
   126  Within the context of a single hook execution, the above tools present a
   127  sandboxed view of the system with the following properties:
   128  
   129    * Any data retrieved corresponds to the real value of the underlying state at
   130      some point in time.
   131    * Once state data has been observed within a given hook execution, further
   132      requests for the same data will produce the same results, unless that data
   133      has been explicitly changed with relation-set.
   134    * Data changed by relation-set is only written to global state when the hook
   135      completes without error; changes made by a failing hook will be discarded
   136      and never observed by any other part of the system.
   137    * Not actually sandboxed: open-port and close-port operate directly on state.
   138      [TODO: lp:1089304 - might be a little tricky.]
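
The first three properties can be illustrated with a toy model. The class
below is not how the unit agent is implemented; `HookContext` and its methods
are hypothetical names, and relation settings are reduced to plain
dictionaries keyed by unit name.

```python
class HookContext:
    """Toy model of the view of relation settings within one hook."""

    def __init__(self, global_state, local_unit):
        self._global = global_state  # shared state: {unit: {key: value}}
        self._local = local_unit
        self._seen = {}              # settings already observed in this hook
        self._pending = {}           # relation-set changes not yet written

    def relation_get(self, unit):
        if unit not in self._seen:
            # First read: snapshot the real value at this point in time.
            self._seen[unit] = dict(self._global.get(unit, {}))
        settings = dict(self._seen[unit])
        if unit == self._local:
            # Our own relation-set changes are visible to us immediately.
            settings.update(self._pending)
        return settings

    def relation_set(self, **changes):
        self._pending.update(changes)

    def finish(self, success):
        # Changes reach global state only if the hook completed without
        # error; a failing hook's changes are discarded.
        if success:
            self._global.setdefault(self._local, {}).update(self._pending)
        self._pending = {}
```

The model leaves out open-port and close-port, which, as noted above, are not
sandboxed in this way.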
   139  
   140  Hook kinds
   141  ----------
   142  
   143  There are 5 `unit hooks` with predefined names that can be implemented by any
   144  charm:
   145  
   146    * install
   147    * config-changed
   148    * start
   149    * upgrade-charm
   150    * stop
   151  
   152  For every relation defined by a charm, an additional 4 `relation hooks` can be
   153  implemented, named after the charm relation:
   154  
   155    * <name>-relation-joined
   156    * <name>-relation-changed
   157    * <name>-relation-departed
   158    * <name>-relation-broken
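
The complete set of hook names a charm may implement follows mechanically from
its relation names, as the short sketch below shows. The helper `hook_names`
is hypothetical, and "db" in the usage note is just an example relation name.

```python
UNIT_HOOKS = ["install", "config-changed", "start", "upgrade-charm", "stop"]
RELATION_HOOK_KINDS = ["joined", "changed", "departed", "broken"]


def hook_names(relation_names):
    """List every hook name available to a charm with these relations."""
    names = list(UNIT_HOOKS)
    for relation in relation_names:
        for kind in RELATION_HOOK_KINDS:
            names.append("%s-relation-%s" % (relation, kind))
    return names
```

For a charm with a single relation called "db", this yields the 5 unit hooks
plus db-relation-joined, db-relation-changed, db-relation-departed and
db-relation-broken.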
   159  
   160  Unit hooks
   161  ----------
   162  
   163  The `install` hook always runs once, and only once, before any other hook.
   164  
   165  The `config-changed` hook always runs once immediately after the install hook,
   166  and likewise after the upgrade-charm hook. It also runs whenever the service
   167  configuration changes, and when recovering from transient unit agent errors.
   168  
   169  The `start` hook always runs once immediately after the first config-changed
   170  hook; there are currently no other circumstances in which it will be called,
   171  but this may change in the future.
   172  
   173  The `upgrade-charm` hook always runs once immediately after the charm directory
   174  contents have been changed by an unforced charm upgrade operation, and *may* do
   175  so after a forced upgrade; but will *not* be run after a forced upgrade from an
   176  existing error state. (Consequently, neither will the config-changed hook that
   177  would ordinarily follow the upgrade-charm.)
   178  
   179  The `stop` hook is the last hook to be run before the unit is destroyed. In the
   180  future, it may be called in other situations.
   181  
   182  In normal operation, a unit will run at least the install, config-changed,
   183  start and stop hooks over the course of its lifetime.
   184  
   185  It should be noted that, while all hook tools are available to all hooks, the
   186  relation-* tools are not useful to the install, start, and stop hooks; this is
   187  because the first two are run before the unit has any opportunity to participate
   188  in any relations, and the stop hooks will not be run while the unit is still
   189  participating in one.
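
The ordering rules in this section can be summarised as a small model of a
unit's lifetime. It is a sketch that covers only normal operation: forced
upgrades, error recovery and relation hooks are ignored, and the event names
"config" and "upgrade" are invented for the example.

```python
def unit_hook_sequence(events):
    """Return the unit hooks run over a lifetime with the given events."""
    # install runs first, then config-changed, then start.
    hooks = ["install", "config-changed", "start"]
    for event in events:
        if event == "config":
            hooks.append("config-changed")
        elif event == "upgrade":
            # An unforced upgrade runs upgrade-charm, and config-changed
            # always follows it.
            hooks.extend(["upgrade-charm", "config-changed"])
        else:
            raise ValueError("unknown event: %r" % (event,))
    # stop is the last hook to run before the unit is destroyed.
    hooks.append("stop")
    return hooks
```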
   190  
   191  Relation hooks
   192  --------------
   193  
   194  For each charm relation, any or all of the 4 relation hooks can be implemented.
   195  Relation hooks operate in an environment slightly different to that of unit
   196  hooks, in the following ways:
   197  
   198    * JUJU_RELATION is set to the name of the charm relation. This is of limited
   199      value, because every relation hook already "knows" what charm relation it
   200      was written for; that is, in the "foo-relation-joined" hook, JUJU_RELATION
   201      is "foo".
   202    * JUJU_RELATION_ID is more useful, because it serves as a unique identifier for
   203      a particular relation, and thereby allows the charm to handle distinct
   204      relations over a single endpoint. In hooks for the "foo" charm relation,
   205      JUJU_RELATION_ID always has the form "foo:<id>", where id uniquely but
   206      opaquely identifies the runtime relation currently in play.
   207    * The relation-* hook tools, which ordinarily require that a relation be
   208      specified, assume they're being called with respect to the current
   209      relation. The default can of course be overridden as usual.
   210  
   211  Furthermore, all relation hooks except relation-broken are notifications about
   212  some specific unit of a related service, and operate in an environment with the
   213  following additional properties:
   214  
   215    * JUJU_REMOTE_UNIT is set to the name of the current related unit.
   216    * The relation-get hook tool, which ordinarily requires that a related unit
   217      be specified, assumes that it is being called with respect to the current
   218      related unit. The default can of course be overridden as usual.
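
A relation hook written in Python might gather this context as shown below.
This is a minimal sketch: `relation_context` and its dictionary keys are
hypothetical, and the relation id is passed around whole because the part
after the colon is opaque.

```python
import os


def relation_context(environ=os.environ):
    """Collect the relation-specific hook variables into a dict."""
    return {
        # Name of the charm relation, e.g. "foo" in foo-relation-joined.
        "relation": environ.get("JUJU_RELATION"),
        # Unique identifier of the runtime relation, of the form "foo:<id>".
        "relation_id": environ.get("JUJU_RELATION_ID"),
        # Not set in relation-broken hooks, which concern no specific unit.
        "remote_unit": environ.get("JUJU_REMOTE_UNIT"),
    }
```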
   219  
   220  For every relation in which a unit participates, hooks for the appropriate charm
   221  relation are run according to the following rules.
   222  
   223  The "relation-joined" hook always runs once when a related unit is first seen.
   224  
   225  The "relation-changed" hook for a given unit always runs once immediately
   226  following the relation-joined hook for that unit, and subsequently whenever
   227  the related unit changes its settings (by calling relation-set and exiting
   228  without error). Note that "immediately" only applies within the context of
   229  this particular runtime relation -- that is, when "foo-relation-joined" is
   230  run for unit "bar/99" in relation id "foo:123", the only guarantee is that
   231  the next hook to be run *in relation id "foo:123"* will be "foo-relation-changed"
   232  for "bar/99". Unit hooks may intervene, as may hooks for other relations,
   233  and even for other "foo" relations.
   234  
   235  The "relation-departed" hook for a given unit always runs once when a related
   236  unit is no longer related. After the "relation-departed" hook has run, no
   237  further notifications will be received from that unit; however, its settings
   238  will remain accessible via relation-get for the complete lifetime of the
   239  relation.
   240  
   241  The "relation-broken" hook is not specific to any unit, and always runs once
   242  when the local unit is ready to depart the relation itself. Before this hook
   243  is run, a relation-departed hook will be executed for every unit known to be
   244  related; it will never run while the relation appears to have members, but it
   245  may be the first and only hook to run for a given relation. The stop hook will
   246  not run while relations remain to be broken.
   247  
   248  Relations in depth
   249  ------------------
   250  
   251  A unit's `scope` consists of the group of units that are transitively connected
   252  to that unit within a particular relation. So, for a globally-scoped relation,
   253  that means every unit of each service in the relation; for a locally-scoped
   254  relation, it means only those sets of units which are deployed alongside one
   255  another.  That is to say: a globally-scoped relation has a single unit scope,
   256  whilst a locally-scoped relation has one for each principal unit.
   257  
   258  When a unit becomes aware that it is a member of a relation, its only self-
   259  directed action is to `join` its scope within that relation. This involves two
   260  steps:
   261  
   262    * Write initial relation settings (just one value, "private-address"), to
   263      ensure that they will be available to observers before they're triggered
   264      by the next step;
   265    * Signal its existence, and role in the relation, to the rest of the system.
   266  
   267  The unit then starts observing and reacting to any other units in its scope
   268  which are playing a role in which it is interested. To be specific:
   269  
   270    * Each provider unit observes every requirer unit
   271    * Each requirer unit observes every provider unit
   272    * Each peer unit observes every other peer unit
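
The observation rules above amount to a simple lookup, sketched here with a
scope represented as a list of (unit name, role) pairs. The representation and
the name `observed_units` are assumptions made for the example.

```python
# The role that units playing each role are interested in.
COUNTERPART = {"provider": "requirer", "requirer": "provider", "peer": "peer"}


def observed_units(unit, role, scope):
    """Return the units in scope that the given unit observes."""
    wanted = COUNTERPART[role]
    # A peer observes every *other* peer, so the unit itself is excluded.
    return [u for u, r in scope if r == wanted and u != unit]
```

With a scope holding one requirer and two providers, the requirer observes
both providers, and each provider observes only the requirer.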
   273  
   274  Now, suppose that some unit is the very first unit to join the relation; and
   275  let's say it's a requirer. No provider units are present, so no hooks will fire.
   276  But, when a provider unit joins the relation, the requirer and provider become
   277  aware of each other almost simultaneously. (Similarly, the first two units in a
   278  peer relation become aware of each other almost simultaneously.)
   279  
   280  So, concurrently, the units on each side of the relation run their relation-joined
   281  and relation-changed hooks with respect to their counterpart. The intent is that
   282  they communicate appropriate information to each other to set up some sort of
   283  connection, by using the relation-set and relation-get hook tools; but neither
   284  unit is safe to assume that any particular setting has yet been set by its
   285  counterpart.
   286  
   287  This sounds kinda tricky to deal with, but merely requires suitable respect for
   288  the relation-get tool: it is important to realise that relation-get is *never*
   289  guaranteed to return any values at all, because we have decided that it's
   290  perfectly legitimate for a unit to delete its own private-address value.
   291  
   292  [TODO: There is a school of thought that maintains that we should add an
   293  independent "juju-private-address" setting that *is* guaranteed, but for now
   294  the reality is that relation-get can *always* fail to produce any given value.
   295  However, in the name of sanity, it's probably reasonable to treat a missing
   296  private-address as an error, and assume that `relation-get private-address` is
   297  always safe. For all other values, we must operate with the understanding that
   298  relation-get can always fail.]
   299  
   300  In one specific kind of hook, this is easy to deal with. A relation-changed hook
   301  can always exit without error when the current remote unit is missing data,
   302  because the hook is guaranteed to be run again when that data changes -- and,
   303  assuming the remote unit is running a charm that agrees on how to implement the
   304  interface, the data *will* change and the hook *will* be run again.
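
In code, the pattern for a relation-changed hook looks roughly like this. The
sketch assumes the remote unit's settings have already been fetched with
relation-get and turned into a dictionary; the keys "host" and "port", the
`configure` callback and the function name are all hypothetical.

```python
REQUIRED_KEYS = ("host", "port")  # hypothetical keys agreed by the interface


def relation_changed(settings, configure):
    """Handle a relation-changed hook; returns True once configured."""
    missing = [key for key in REQUIRED_KEYS if not settings.get(key)]
    if missing:
        # Not an error: exit cleanly and wait. The hook is guaranteed to
        # run again when the remote unit changes its settings.
        return False
    configure(settings["host"], settings["port"])
    return True
```

Either way the hook exits without error; the return value only tells the
caller whether there was enough data to act on this time round.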
   305  
   306  In *all* other cases -- unit hooks, relation hooks for a different relation,
   307  relation hooks for a different remote unit in the same relation, and even
   308  relation hooks other than -changed for the *same* remote unit -- there is no
   309  such guarantee. These hooks all run on their own schedule, and there is no
   310  reason to expect them to be re-run on a predictable schedule, or in some cases
   311  ever again.
   312  
   313  This means that all such hooks need to be able to handle missing relation data,
   314  and to complete successfully; they mustn't fail, because the user is powerless
   315  to resolve the situation, and they can't even wait for state to change, because
   316  they all see their own sandboxed composite snapshot of fairly-recent state,
   317  which never changes.
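
For hooks in this second category, the practical approach is to use whatever
complete data is available and quietly skip the rest. The sketch below assumes
a hook has gathered the settings of every related unit into a dictionary; the
keys "host" and "port" and the name `usable_backends` are hypothetical.

```python
def usable_backends(units_settings):
    """Return (unit, host, port) for every unit with complete settings."""
    backends = []
    for unit in sorted(units_settings):
        settings = units_settings[unit]
        host = settings.get("host")
        port = settings.get("port")
        if host and port:
            backends.append((unit, host, port))
        # Units with missing data are skipped rather than treated as an
        # error: failing would not help, and waiting cannot change the
        # snapshot this hook sees.
    return backends
```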
   318  
   319  So, outside a very narrow range of circumstances, relation-get should be treated
   320  with particular care. The corresponding advice for relation-set is very simple
   321  by comparison: relation-set should be called early and often. Because the unit
   322  agent serializes hook execution, there is never any danger of concurrent changes
   323  to the data, and so a null setting change can be safely ignored, and will not
   324  cause other units to react.
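
The reason repeated relation-set calls are harmless can be seen in a small
model of how a setting change is applied. This is an illustration of the idea
only, not the agent's implementation, and `apply_relation_set` is a
hypothetical name.

```python
def apply_relation_set(current, changes):
    """Return (new_settings, changed) for a relation-set call."""
    updated = dict(current)
    updated.update(changes)
    # A null change leaves the settings identical, so there is nothing
    # for other units to react to.
    return updated, updated != current
```

Calling it twice with the same values reports a change the first time and
none the second, so a hook can set its values unconditionally on every run.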
   325  
   326  Departing relations
   327  -------------------
   328  
   329  A unit will depart a relation when either the relation or the unit itself is
   330  marked for termination. In either case, it follows the same sequence:
   331  
   332    * For every known related unit -- those which have joined and not yet
   333      departed -- run the relation-departed hook.
   334    * Run the relation-broken hook.
   335    * `depart` from its scope in the relation.
   336  
   337  The unit's departure from its scope will in turn be detected by units of the
   338  related service, and cause them to run relation-departed hooks. A unit's
   339  relation settings persist beyond its own departure from the relation; the
   340  final unit to depart a relation marked for termination is responsible for
   341  destroying the relation and all associated data.
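
The departure sequence can be written down as a list of steps. The sketch
below only models the order of events on the departing unit; the function
name and the "depart-scope" marker for the final step are invented for the
example.

```python
def departure_steps(relation_name, known_units):
    """Return the ordered (action, unit) steps a departing unit performs."""
    # known_units are those which have joined and not yet departed.
    steps = [
        ("%s-relation-departed" % relation_name, unit) for unit in known_units
    ]
    # relation-broken is not specific to any unit.
    steps.append(("%s-relation-broken" % relation_name, None))
    # Finally the unit leaves its scope, which related units then detect.
    steps.append(("depart-scope", None))
    return steps
```

When no related units are known, relation-broken is the first and only hook
in the list, which matches the rule given for it earlier.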
   342  
   343  Debugging charms
   344  ----------------
   345  
   346  Facilities are currently not good.
   347  
   348    * juju ssh
   349    * juju debug-hooks [TODO: not implemented]
   350    * juju debug-log [TODO: not implemented]
   351  
   352  It may be helpful to note that the charm directory is a git repository that
   353  holds the complete hook-by-hook history of the deployment. This property is
   354  not guaranteed, and charms should not depend upon it, but humans who need to
   355  dig around in charm directories should be aware of it.