     1  Charms in action
     2  ================
     3  
     4  This document describes the behaviour of the Go implementation of the unit
     5  agent, whose behaviour differs in some respects from that of the Python
     6  implementation. This information is largely relevant to charm authors, and
     7  potentially to developers interested in the unit agent.
     8  
     9  Hooks
    10  -----
    11  
    12  A service unit's direct action is entirely defined by its charm's hooks. Hooks
    13  are executable files in a charm's hooks directory; hooks with particular names
    14  will be invoked by the juju unit agent at particular times, and thereby cause
    15  changes to the world.
    16  
    17  Whenever a hook-worthy event takes place, the unit agent tries to run the hook
    18  with the appropriate name. If the hook doesn't exist, the agent continues
    19  without complaint; if it does, it is invoked without arguments in a specific
    20  environment, and its output is written to the unit agent's log. If it returns
    21  a non-zero exit code, the agent enters an error state and awaits resolution;
    22  otherwise it continues to process model changes as before.
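
The dispatch rule above can be sketched in a few lines of Python. This is only
an illustration of the described behaviour, not the unit agent's actual code;
the function name `run_hook`, its return values and the logger name are
assumptions made for the example.

```python
import logging
import os
import subprocess

log = logging.getLogger("unit-agent")  # hypothetical logger name


def run_hook(charm_dir, hook_name, env=None):
    """Run one hook and report "skipped", "ok" or "error"."""
    hook_path = os.path.join(charm_dir, "hooks", hook_name)
    if not os.path.exists(hook_path):
        return "skipped"  # a missing hook is not an error
    # The hook is invoked without arguments, from the charm directory.
    result = subprocess.run(
        [hook_path],
        cwd=charm_dir,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    # All hook output ends up in the log.
    for line in result.stdout.splitlines():
        log.info("%s: %s", hook_name, line)
    # Any non-zero exit code means the agent would enter an error state.
    return "ok" if result.returncode == 0 else "error"
```

A caller would stop processing further events on "error" and wait for the
user to resolve the unit.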
    23  
    24  In general, a unit will run hooks in a clear sequence, about which a number of
    25  useful guarantees are made. All such guarantees come with the caveat that there
    26  is [TODO: will be: `remove-unit --force`] a mechanism for forcible termination
    27  of a unit, and that a unit so terminated will just stop, dead, and completely
    28  fail to run anything else ever again. This shouldn't actually be a big deal in
    29  practice.
    30  
    31  Errors in hooks
    32  ---------------
    33  
    34  Hooks should ideally be idempotent, so that they can fail and be re-executed
    35  from scratch without trouble. As a hook author, you don't have complete control
    36  over the times your hook might be stopped: if the unit agent process is killed
    37  for any reason while running a hook, then when it recovers it will treat that
    38  hook as having failed -- just as if it had returned a non-zero exit code -- and
    39  request user intervention.
    40  
    41  It is unrealistic to expect great sophistication on the part of the average user,
    42  and as a charm author you should expect that users will attempt to re-execute
    43  failed hooks before attempting to investigate or understand the situation. You
    44  should therefore make every effort to ensure your hooks are idempotent when
    45  aborted and restarted.
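
As one example of what idempotence can look like in practice, here is a
minimal sketch of a helper that a Python hook might use to write a
configuration file. The name `ensure_file` is hypothetical; the point is that
running it twice, or re-running it after an abort, leaves the same result.

```python
import os
import tempfile


def ensure_file(path, content):
    """Make `path` contain exactly `content`; return True if it changed."""
    try:
        with open(path) as f:
            if f.read() == content:
                return False  # already in the desired state, nothing to do
    except FileNotFoundError:
        pass
    # Write to a temporary file in the same directory, then rename it over
    # the target. A hook killed part-way leaves the old file or the new one,
    # never a half-written one (a stray temporary file may remain).
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write(content)
    os.replace(tmp_path, path)
    return True
```

The boolean result lets the hook restart the deployed software only when the
file actually changed.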
    46  
    47  [TODO: I have a vague feeling that `juju resolved` actually defaults to "just
    48  pretend the hook ran successfully" mode. I'm not sure that's really the best
    49  default, but I'm also not sure we're in a position to change the UI that much.]
    50  
    51  The most sophisticated charms will consider the nature of their operations with
    52  care, and will be prepared to internally retry any operations they suspect of
    53  having failed transiently, to ensure that they only request user intervention in
    54  the most trying circumstances; and will also be careful to log any relevant
    55  information or advice before signalling the error.
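
A minimal sketch of that retry-then-log pattern, assuming a Python hook. The
helper name `retry` and its parameters are illustrative only, and a real hook
should catch just the exceptions it knows to be transient rather than
`Exception`.

```python
import logging
import time

log = logging.getLogger("hook")  # hypothetical logger name


def retry(operation, attempts=3, delay=1.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as exc:
            log.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                # Leave advice in the log before signalling the error.
                log.error("giving up; see the warnings above before retrying")
                raise
            sleep(delay)
```

Letting the final exception propagate makes the hook exit non-zero, which is
what requests user intervention.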
    56  
    57  [TODO: I just thought; it would be really nice to have a juju-fail hook tool,
    58  which would allow charm authors to explicitly set the unit's error status to
    59  something a bit more sophisticated than "X hook failed". Wishlist, really.]
    60  
    61  Charm deployment
    62  ----------------
    63  
    64    * A charm is deployed into a directory that is entirely owned and controlled
    65      by juju.
    66    * At certain times, control of the directory is ceded to the charm (by
    67      running a hook) or to the user (by entering an error state).
    68    * At these times, and only at these times, should the charm directory be
    69      used by anything other than juju itself.
    70  
    71  The most important consequence of this is that it is a mistake to conflate the
    72  state of the charm with the state of the software deployed by the charm: it's
    73  fine to store *charm* state in the charm directory, but the charm must deploy
    74  its actual software elsewhere on the system.
    75  
    76  To put it another way: deleting the charm directory should not impact the
    77  software deployed by the charm in any way; and there is currently no mechanism
    78  by which deployed software can safely feed information back into the charm
    79  and/or expect that it will be acted upon in a timely way.
    80  
    81  [TODO: this sucks a bit. We have plans for a tool called `juju-run`, which
    82  would allow an arbitrary script to be invoked as though it were a hook at any
    83  time (well, it'd block until no other hook were running, but still). Probably
    84  isn't even that hard but it's still rolling around my brain, might either click
    85  soon or be overridden by higher priorities and be left for ages. I'm less sure,
    86  but have a suspicion, that `juju ssh <unit>` should also default to a juju-run
    87  model: primarily because, without this, in the context of forced upgrades,
    88  the system cannot offer *any* guarantees about what it might suddenly do to the
    89  charm directory while the user's doing things with it. The alternative is to
    90  allow unguarded ssh, but tell people that they have to use something like
    91  `juju-run --interactive` before they modify the charm dir; this feels somewhat
    92  user-hostile, though.]
    93  
    94  Execution environment
    95  ---------------------
    96  
    97  Every hook is run in the deployed charm directory, in an environment with the
    98  following characteristics:
    99  
   100    * $PATH is prefixed by a directory containing command line tools through
   101      which the hooks can interact with juju.
   102    * $CHARM_DIR holds the path to the charm directory.
   103    * $JUJU_UNIT_NAME holds the name of the local unit.
   104    * $JUJU_CONTEXT_ID and $JUJU_AGENT_SOCKET are set (but should not be messed
   105      with: the command line tools won't work without them).
   106    * $JUJU_API_ADDRESSES holds a space separated list of juju API addresses.
   107    * $JUJU_MODEL_NAME holds the human friendly name of the current model.
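
A hook written in Python could gather these variables as in the sketch below.
The function name and the dictionary keys are assumptions for illustration;
only the environment variable names come from the list above.

```python
import os


def hook_environment(environ=os.environ):
    """Collect the documented hook variables into a plain dict."""
    return {
        "charm_dir": environ.get("CHARM_DIR"),
        "unit_name": environ.get("JUJU_UNIT_NAME"),
        "model_name": environ.get("JUJU_MODEL_NAME"),
        # Space separated, so split() with no argument is enough.
        "api_addresses": environ.get("JUJU_API_ADDRESSES", "").split(),
    }
```

$JUJU_CONTEXT_ID and $JUJU_AGENT_SOCKET are deliberately left alone, since
only the hook tools need them.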
   108  
   109  Hook tools
   110  ----------
   111  
   112  All hooks can directly use the following tools:
   113  
   114    * juju-log (write arguments directly to juju's log; potentially redundant,
   115      since hook output is all logged anyway, but --debug may remain useful)
   116    * unit-get (returns the local unit's private-address or public-address)
   117    * open-port (marks the supplied port/protocol as ready to open when the
   118      service is exposed)
   119    * close-port (reverses the effect of open-port)
   120    * config-get (get current service configuration values)
   121    * relation-get (get the settings of some related unit)
   122    * relation-set (write the local unit's relation settings)
   123    * relation-ids (list all relations using a given charm relation)
   124    * relation-list (list all units of a related service)
   125    * storage-add (add storage instances)
   126    * storage-get (get storage instance values)
   127    * status-get (get unit workload status information)
   128    * status-set (set unit workload status information)
   129  
   130  Within the context of a single hook execution, the above tools present a
   131  sandboxed view of the system with the following properties:
   132  
   133    * Any data retrieved corresponds to the real value of the underlying state at
   134      some point in time.
   135    * Once state data has been observed within a given hook execution, further
   136      requests for the same data will produce the same results, unless that data
   137      has been explicitly changed with relation-set.
   138    * Data changed by relation-set is only written to global state when the hook
   139      completes without error; changes made by a failing hook will be discarded
   140      and never observed by any other part of the system.
   141    * Not actually sandboxed: open-port and close-port operate directly on state.
   142      [TODO: lp:1089304 - might be a little tricky.]
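
The first three properties can be pictured with a toy model. The class below
is not how the unit agent is implemented: it flattens all relation settings
into one dictionary and every name in it is made up, but it shows stable
reads, buffered writes, and writes discarded on failure.

```python
class HookSandbox:
    """Toy model of the settings view seen by a single hook execution."""

    def __init__(self, global_state):
        self._global = global_state  # shared dict: key -> value
        self._seen = {}              # values already observed in this hook
        self._pending = {}           # relation-set changes, not yet written

    def get(self, key):
        if key in self._pending:
            return self._pending[key]  # explicitly changed in this hook
        if key not in self._seen:
            self._seen[key] = self._global.get(key)
        return self._seen[key]         # same answer for the rest of the hook

    def set(self, key, value):
        self._pending[key] = value     # buffered until the hook completes

    def finish(self, exit_code):
        if exit_code == 0:
            self._global.update(self._pending)
        self._pending.clear()          # a failing hook's changes are dropped
```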
   143  
   144  Hook kinds
   145  ----------
   146  
   147  There are 5 `unit hooks` with predefined names that can be implemented by any
   148  charm:
   149  
   150    * install
   151    * config-changed
   152    * start
   153    * upgrade-charm
   154    * stop
   155  
   156  For every relation defined by a charm, an additional 4 `relation hooks` can be
   157  implemented, named after the charm relation:
   158  
   159    * <name>-relation-joined
   160    * <name>-relation-changed
   161    * <name>-relation-departed
   162    * <name>-relation-broken
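
Put together, the full set of hook names a charm may implement follows
mechanically from its relation names. A small sketch, with `hook_names` as a
hypothetical helper:

```python
UNIT_HOOKS = ["install", "config-changed", "start", "upgrade-charm", "stop"]
RELATION_HOOK_KINDS = ["joined", "changed", "departed", "broken"]


def hook_names(relation_names):
    """List every hook name available to a charm with these relations."""
    names = list(UNIT_HOOKS)
    for relation in relation_names:
        for kind in RELATION_HOOK_KINDS:
            names.append("%s-relation-%s" % (relation, kind))
    return names
```

For a charm with a single relation called "db" (a made-up name) this yields
the 5 unit hooks plus db-relation-joined, db-relation-changed,
db-relation-departed and db-relation-broken.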
   163  
   164  Unit hooks
   165  ----------
   166  
   167  The `install` hook always runs once, and only once, before any other hook.
   168  
   169  The `config-changed` hook always runs once immediately after the install hook,
   170  and likewise after the upgrade-charm hook. It also runs whenever the service
   171  configuration changes, and when recovering from transient unit agent errors.
   172  
   173  The `start` hook always runs once immediately after the first config-changed
   174  hook; there are currently no other circumstances in which it will be called,
   175  but this may change in the future.
   176  
   177  The `upgrade-charm` hook always runs once immediately after the charm directory
   178  contents have been changed by an unforced charm upgrade operation, and *may* do
   179  so after a forced upgrade; but will *not* be run after a forced upgrade from an
   180  existing error state. (Consequently, neither will the config-changed hook that
   181  would ordinarily follow the upgrade-charm.)
   182  
   183  The `stop` hook is the last hook to be run before the unit is destroyed. In the
   184  future, it may be called in other situations.
   185  
   186  In normal operation, a unit will run at least the install, config-changed, start
   187  and stop hooks over the course of its lifetime.
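
The ordering guarantees above can be written down as a small checker. This is
a sketch under stated assumptions: it looks only at unit hook names, expects
the complete history of a unit that ran in normal operation (no forced
upgrades, no forcible termination), and the function name is invented.

```python
def check_unit_hook_order(hooks):
    """Return the guarantees violated by a sequence of unit hook names."""
    problems = []
    if hooks.count("install") != 1 or hooks[0] != "install":
        problems.append("install must run exactly once, before any other hook")
    for i, name in enumerate(hooks):
        follows = hooks[i + 1:i + 2]  # the next hook, or [] at the end
        if name in ("install", "upgrade-charm") and follows != ["config-changed"]:
            problems.append("config-changed must immediately follow " + name)
    if "config-changed" in hooks:
        first = hooks.index("config-changed")
        if hooks.count("start") != 1 or hooks[first + 1:first + 2] != ["start"]:
            problems.append("start must run once, after the first config-changed")
    if "stop" in hooks and hooks[-1] != "stop":
        problems.append("stop must be the last hook")
    return problems
```

An empty result means the sequence is consistent with the rules as written
here; it says nothing about relation hooks, which are covered separately.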
   188  
   189  It should be noted that, while all hook tools are available to all hooks, the
   190  relation-* tools are not useful to the install, start, and stop hooks; this is
   191  because the first two are run before the unit has any opportunity to participate
   192  in any relations, and the stop hook will not be run while the unit is still
   193  participating in one.
   194  
   195  Relation hooks
   196  --------------
   197  
   198  For each charm relation, any or all of the 4 relation hooks can be implemented.
   199  Relation hooks operate in an environment slightly different to that of unit
   200  hooks, in the following ways:
   201  
   202    * JUJU_RELATION is set to the name of the charm relation. This is of limited
   203      value, because every relation hook already "knows" what charm relation it
   204      was written for; that is, in the "foo-relation-joined" hook, JUJU_RELATION
   205      is "foo".
   206    * JUJU_RELATION_ID is more useful, because it serves as a unique identifier
   207      for a particular relation, and thereby allows the charm to handle distinct
   208      relations over a single endpoint. In hooks for the "foo" charm relation,
   209      JUJU_RELATION_ID always has the form "foo:<id>", where <id> uniquely but
   210      opaquely identifies the runtime relation currently in play.
   211    * The relation-* hook tools, which ordinarily require that a relation be
   212      specified, assume they're being called with respect to the current
   213      relation. The default can of course be overridden as usual.
   214  
   215  Furthermore, all relation hooks except relation-broken are notifications about
   216  some specific unit of a related service, and operate in an environment with the
   217  following additional properties:
   218  
   219    * JUJU_REMOTE_UNIT is set to the name of the current related unit.
   220    * The relation-get hook tool, which ordinarily requires that a related unit
   221      be specified, assumes that it is being called with respect to the current
   222      related unit. The default can of course be overridden as usual.
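
A relation hook written in Python might read this context as sketched below.
The helper name and the returned keys are assumptions; the relation id is
kept whole, since the part after the colon is opaque.

```python
def relation_context(environ):
    """Collect the relation-specific variables of a relation hook."""
    relation = environ["JUJU_RELATION"]
    relation_id = environ["JUJU_RELATION_ID"]  # for example "foo:123"
    if not relation_id.startswith(relation + ":"):
        raise ValueError("unexpected relation id: %r" % relation_id)
    return {
        "relation": relation,
        "relation_id": relation_id,
        # Not set in relation-broken, which concerns no particular unit.
        "remote_unit": environ.get("JUJU_REMOTE_UNIT") or None,
    }
```

The relation id is a natural key for any per-relation state the charm keeps,
because several relations may share the same endpoint name.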
   223  
   224  For every relation in which a unit participates, hooks for the appropriate charm
   225  relation are run according to the following rules.
   226  
   227  The "relation-joined" hook always runs once when a related unit is first seen.
   228  
   229  The "relation-changed" hook for a given unit always runs once immediately
   230  following the relation-joined hook for that unit, and subsequently whenever
   231  the related unit changes its settings (by calling relation-set and exiting
   232  without error). Note that "immediately" only applies within the context of
   233  this particular runtime relation -- that is, when "foo-relation-joined" is
   234  run for unit "bar/99" in relation id "foo:123", the only guarantee is that
   235  the next hook to be run *in relation id "foo:123"* will be "foo-relation-changed"
   236  for "bar/99". Unit hooks may intervene, as may hooks for other relations,
   237  and even for other "foo" relations.
   238  
   239  The "relation-departed" hook for a given unit always runs once when a related
   240  unit is no longer related. After the "relation-departed" hook has run, no
   241  further notifications will be received from that unit; however, its settings
   242  will remain accessible via relation-get for the complete lifetime of the
   243  relation.
   244  
   245  The "relation-broken" hook is not specific to any unit, and always runs once
   246  when the local unit is ready to depart the relation itself. Before this hook
   247  is run, a relation-departed hook will be executed for every unit known to be
   248  related; it will never run while the relation appears to have members, but it
   249  may be the first and only hook to run for a given relation. The stop hook will
   250  not run while relations remain to be broken.
   251  
   252  Relations in depth
   253  ------------------
   254  
   255  A unit's `scope` consists of the group of units that are transitively connected
   256  to that unit within a particular relation. So, for a globally-scoped relation,
   257  that means every unit of each service in the relation; for a locally-scoped
   258  relation, it means only those sets of units which are deployed alongside one
   259  another.  That is to say: a globally-scoped relation has a single unit scope,
   260  whilst a locally-scoped relation has one for each principal unit.
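
To make the difference concrete, the sketch below computes the unit scopes of
a relation from a made-up description of which principal unit each unit is
deployed alongside. The function and its input format are assumptions for
illustration only.

```python
def unit_scopes(units, local):
    """Group units into scopes.

    `units` maps each unit name to the principal unit it is deployed
    alongside; a principal unit maps to itself.
    """
    if not local:
        return [sorted(units)]  # globally scoped: one scope for everyone
    scopes = {}
    for unit, principal in units.items():
        scopes.setdefault(principal, []).append(unit)
    # Locally scoped: one scope per principal unit.
    return [sorted(members) for _, members in sorted(scopes.items())]
```

With two principal units that each host one subordinate, the global case
gives a single scope of four units and the local case gives two scopes of
two.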
   261  
   262  When a unit becomes aware that it is a member of a relation, its only self-
   263  directed action is to `join` its scope within that relation. This involves two
   264  steps:
   265  
   266    * Write initial relation settings (just one value, "private-address"), to
   267      ensure that they will be available to observers before they're triggered
   268      by the next step;
   269    * Signal its existence, and role in the relation, to the rest of the system.
   270  
   271  The unit then starts observing and reacting to any other units in its scope
   272  which are playing a role in which it is interested. To be specific:
   273  
   274    * Each provider unit observes every requirer unit
   275    * Each requirer unit observes every provider unit
   276    * Each peer unit observes every other peer unit
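
These three rules fit in a small lookup table. In the sketch below, `scope`
is a hypothetical mapping from unit name to role for the units that have
joined one scope.

```python
COUNTERPART = {"provider": "requirer", "requirer": "provider", "peer": "peer"}


def observed_units(unit, role, scope):
    """Return the units in `scope` that `unit` observes, given its role."""
    wanted = COUNTERPART[role]
    return sorted(
        name for name, other_role in scope.items()
        if other_role == wanted and name != unit  # a peer never observes itself
    )
```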
   277  
   278  Now, suppose that some unit is the very first unit to join the relation; and
   279  let's say it's a requirer. No provider units are present, so no hooks will fire.
   280  But, when a provider unit joins the relation, the requirer and provider become
   281  aware of each other almost simultaneously. (Similarly, the first two units in a
   282  peer relation become aware of each other almost simultaneously.)
   283  
   284  So, concurrently, the units on each side of the relation run their relation-joined
   285  and relation-changed hooks with respect to their counterpart. The intent is that
   286  they communicate appropriate information to each other to set up some sort of
   287  connection, by using the relation-set and relation-get hook tools; but neither
   288  unit is safe to assume that any particular setting has yet been set by its
   289  counterpart.
   290  
   291  This sounds kinda tricky to deal with, but merely requires suitable respect for
   292  the relation-get tool: it is important to realise that relation-get is *never*
   293  guaranteed to return any values at all, because we have decided that it's
   294  perfectly legitimate for a unit to delete its own private-address value.
   295  
   296  [TODO: There is a school of thought that maintains that we should add an
   297  independent "juju-private-address" setting that *is* guaranteed, but for now
   298  the reality is that relation-get can *always* fail to produce any given value.
   299  However, in the name of sanity, it's probably reasonable to treat a missing
   300  private-address as an error, and assume that `relation-get private-address` is
   301  always safe. For all other values, we must operate with the understanding that
   302  relation-get can always fail.]
   303  
   304  In one specific kind of hook, this is easy to deal with. A relation-changed hook
   305  can always exit without error when the current remote unit is missing data,
   306  because the hook is guaranteed to be run again when that data changes -- and,
   307  assuming the remote unit is running a charm that agrees on how to implement the
   308  interface, the data *will* change and the hook *will* be run again.
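
A relation-changed hook following this advice might look like the sketch
below. Several things here are assumptions: the "password" key and the
`configure` callback are hypothetical, and the code assumes that
`relation-get <key>` prints nothing when the key is unset.

```python
import subprocess


def relation_get(key):
    """Read one setting of the current remote unit via the hook tool."""
    out = subprocess.check_output(["relation-get", key])
    return out.decode().strip()


def relation_changed(get=relation_get, configure=None):
    """Body of a <name>-relation-changed hook; returns the exit code."""
    host = get("private-address")
    password = get("password")  # hypothetical key set by the remote charm
    if not host or not password:
        # The remote unit has not published everything yet. Exit cleanly:
        # the hook runs again when the remote settings change.
        return 0
    configure(host, password)
    return 0
```

The hook script itself would end with `sys.exit(relation_changed(...))`,
passing whatever function actually configures the deployed software.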
   309  
   310  In *all* other cases -- unit hooks, relation hooks for a different relation,
   311  relation hooks for a different remote unit in the same relation, and even
   312  relation hooks other than -changed for the *same* remote unit -- there is no
   313  such guarantee. These hooks all run on their own schedule, and there is no
   314  reason to expect them to be re-run on a predictable schedule, or in some cases
   315  ever again.
   316  
   317  This means that all such hooks need to be able to handle missing relation data,
   318  and to complete successfully; they mustn't fail, because the user is powerless
   319  to resolve the situation, and they can't even wait for state to change, because
   320  they all see their own sandboxed composite snapshot of fairly-recent state,
   321  which never changes.
   322  
   323  So, outside a very narrow range of circumstances, relation-get should be treated
   324  with particular care. The corresponding advice for relation-set is very simple
   325  by comparison: relation-set should be called early and often. Because the unit
   326  agent serializes hook execution, there is never any danger of concurrent changes
   327  to the data, and so a null setting change can be safely ignored, and will not
   328  cause other units to react.
   329  
   330  Departing relations
   331  -------------------
   332  
   333  A unit will depart a relation when either the relation or the unit itself is
   334  marked for termination. In either case, it follows the same sequence:
   335  
   336    * For every known related unit -- those which have joined and not yet
   337      departed -- run the relation-departed hook.
   338    * Run the relation-broken hook.
   339    * `depart` from its scope in the relation.
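
The sequence can be sketched as a function that lists the steps in order. The
function name and the "depart-scope" marker are made up for the example; the
final step is an action of the unit agent, not a hook.

```python
def departure_steps(relation_name, known_units):
    """List (step, unit) pairs for a unit leaving one relation, in order."""
    steps = [
        ("%s-relation-departed" % relation_name, unit)
        for unit in known_units  # units that joined and have not yet departed
    ]
    steps.append(("%s-relation-broken" % relation_name, None))
    steps.append(("depart-scope", None))  # not a hook
    return steps
```

With no known related units the relation-broken hook is the first and only
hook to run for the relation.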
   340  
   341  The unit's departure from its scope will in turn be detected by units of the
   342  related service, and cause them to run relation-departed hooks. A unit's
   343  relation settings persist beyond its own departure from the relation; the
   344  final unit to depart a relation marked for termination is responsible for
   345  destroying the relation and all associated data.
   346  
   347  Debugging charms
   348  ----------------
   349  
   350  Facilities are currently not good.
   351  
   352    * juju ssh
   353    * juju debug-hooks [TODO: not implemented]
   354    * juju debug-log [TODO: not implemented]
   355