# Juju Architectural Overview

## Audience

This document is targeted at new developers of Juju, and may be useful to experienced
developers who need a refresher on some aspect of juju's operation. It is deliberately
light on detail, because the precise mechanisms of various components' operation are
expected to change much faster than the general interactions between components.


## The View From Space

A Juju environment is a distributed system comprising:

  * A data store (mongodb) which describes the desired state of the world, in terms
    of running workloads or *services*, and the *relations* between them; and of the
    *units* that comprise those services, and the *machines* on which those units run.

  * A bunch of *agents*, each of which runs the same `jujud` binary, and which are
    variously responsible for causing reality to converge towards the idealised world-
    state encoded in the data store.

  * Some number of *clients* which talk over an API, implemented by the agents, to
    update the desired world-state (and thereby cause the agents to update the world
    to match). The `juju` binary is one of many possible clients; the `juju-gui` web
    application, and the `juju-deployer` python tool, are other examples.

The whole system depends upon a substrate, or *provider*, which supplies the compute,
storage, and network resources used by the workloads (and by juju itself; but never
forget that *everything* described in this document is merely supporting infrastructure
geared towards the successful deployment and configuration of the workloads that solve
actual problems for actual users).


## The Data Store

There's a lot of *detail* to cover, but there's not much to say from an architectural
standpoint. We use a mongodb replicaset to support HA; we use the `mgo` package from
`labix.org` to implement multi-document transactions; we make use of the transaction
log to detect changes to particular documents, and convert them into business-object-
level events that get sent over the API to interested parties.
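
For flavour, a multi-document transaction via `mgo/txn` looks roughly like this
(the collection, document, and field names are illustrative, not juju's actual
schema, and error handling is elided):

```go
import (
	"labix.org/v2/mgo"
	"labix.org/v2/mgo/bson"
	"labix.org/v2/mgo/txn"
)

// markServiceDying sketches an assert-and-update transaction.
func markServiceDying(db *mgo.Database) error {
	runner := txn.NewRunner(db.C("txns"))
	ops := []txn.Op{{
		C:      "services",
		Id:     "wordpress",
		Assert: txn.DocExists,
		Update: bson.D{{"$set", bson.D{{"life", "dying"}}}},
	}}
	// All ops are applied atomically; the assertions guard against
	// concurrent modifications invalidating our view of the world.
	return runner.Run(ops, "", nil)
}
```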

The mongodb databases run on machines we refer to as *state servers*, and are only
accessed by agents running on those machines; it's important to keep the data store
locked down (and, honestly, to lock it down further and better than we currently do).

There's some documentation on how to work with [the state package](hacking-state);
and plenty more on the [state entities](lifecycles) and the details of their
[creation](entity-creation) and [destruction](death-and-destruction) from various
perspectives; but there's not a lot more to say in this context.

It *is* important to understand that the transaction-log watching is not an ideal
solution, and we'll be retiring it at some point, in favour of an in-memory model
of state and a pub-sub system for watchers; we *know* it's a scalability problem,
but we're not devoting resources to it until it becomes more pressing.

Code for dealing with mongodb is found primarily in the `state`, `state/watcher`,
`replicaset`, and `worker/peergrouper` packages.


## The Agents

Agents all use the same `jujud` binary, and all follow roughly the same model.
When starting up, they authenticate with an API server; possibly reset their
password, if the one they used has been stored persistently somewhere and is
thus vulnerable; determine their responsibilities; and run a set of tasks in
parallel until one of those tasks returns an error indicating that the agent
should either restart or terminate completely. Tasks that return any other error
will be automatically restarted after a short delay; tasks that return nil are
considered to be complete, and will not be restarted until the whole process is.
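
A minimal sketch of that contract, assuming a worker interface along the lines
of the one in the `worker` package (the real types differ in detail):

```go
import "time"

// Worker sketches the task contract: Kill asks the task to stop, and
// Wait blocks until it has stopped, reporting why.
type Worker interface {
	Kill()
	Wait() error
}

// supervise captures the restart behaviour described above: nil means
// the task is complete; a fatal error stops the whole agent so that it
// can restart or terminate; anything else is retried after a delay.
func supervise(start func() (Worker, error), isFatal func(error) bool) error {
	for {
		w, err := start()
		if err == nil {
			err = w.Wait()
		}
		switch {
		case err == nil:
			return nil // complete: don't restart
		case isFatal(err):
			return err // propagate: restart/terminate the agent
		}
		time.Sleep(3 * time.Second) // transient failure: retry
	}
}
```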

When comparing the unit agent with the machine agent, the truth of the above
may not be immediately apparent, because the responsibilities of the unit
agent are so much less varied than those of the machine agent; but we have
scheduled work to integrate the unit agent into the machine agent, rendering
each unit agent a single worker task within its responsible machine agent. It's
still better to consider a unit agent to be a simplistic and/or degenerate
implementation of a machine agent than to attach too much importance to the
differences.


### Jobs, Runners, and Workers

Machine agents all have at least one of two jobs: JobHostUnits and JobManageEnviron.
Each of these jobs represents a number of tasks the agent needs to execute to
fulfil its responsibilities; in addition, there are a number of tasks that are
executed by every machine agent. The terms *task* and *worker* are generally used
interchangeably in this document and in the source code; it's possible but not
generally helpful to draw the distinction that a worker executes a task. All
tasks are implemented by code in some subpackage of the `worker` package, and the
`worker.Runner` type implements the retry behaviour described above.
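
Registering tasks with a Runner looks roughly like this (simplified from the
machine agent's setup code; the constructor names here are illustrative, and
`isFatal` and `moreImportant` are the hooks that classify and rank errors):

```go
// startAgentWorkers sketches how an agent populates its Runner; the
// newUpgrader and newLoggerWorker constructors are hypothetical.
func startAgentWorkers(apiConn *api.State) error {
	runner := worker.NewRunner(isFatal, moreImportant)
	runner.StartWorker("upgrader", func() (worker.Worker, error) {
		return newUpgrader(apiConn), nil // hypothetical constructor
	})
	runner.StartWorker("logger", func() (worker.Worker, error) {
		return newLoggerWorker(apiConn), nil // hypothetical constructor
	})
	// The Runner is itself a worker.Worker, so it can be nested inside
	// another Runner; Wait blocks until a fatal error brings it down.
	return runner.Wait()
}
```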

It's useful to note that the Runner type is itself a worker, so we can and do
nest Runners inside one another; the details of *exactly* how and where a given
worker comes to be executed are generally elided in this document; but it's worth
being aware of the fact that all the workers that use an API connection share a
single one, mediated by a single Runner, such that when the API connection fails
that single Runner can stop all its workers; shut itself down; be restarted by
its parent worker; and set up a new API connection, which it then uses to start
all its child workers.

Please note that the lists of workers below should *not* be assumed to be
exhaustive. Juju evolves, and the workers evolve with it.


### Common Workers

All agents run workers with the following responsibilities:

  * Check for scheduled upgrades for their binaries, and replace themselves
    (implemented in `worker/upgrader`)
  * Watch logging config, and reconfigure the local logger (`worker/logger`; yes,
    we know; it is not the stupidest name in the codebase)
  * Watch and store the latest known addresses for the state servers
    (`worker/apiaddressupdater`)
  * Watch and store rsyslog targets (`worker/rsyslog`) -- this *could* probably be
    reasonably combined with the apiaddressupdater, but there's no guarantee that
    the two sets of machines will be the same for ever.
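
Most of these share the same watch/react shape; in sketch form (the channel
plumbing here is simplified from the real watcher types, which also carry
error reporting and cleanup machinery):

```go
import "errors"

// watchLoop sketches the common worker pattern: block until either the
// agent is dying or the watched config changes, then reconfigure.
func watchLoop(dying <-chan struct{}, changes <-chan struct{}, reconfigure func() error) error {
	for {
		select {
		case <-dying:
			return nil
		case _, ok := <-changes:
			if !ok {
				return errors.New("watcher closed")
			}
			if err := reconfigure(); err != nil {
				return err
			}
		}
	}
}
```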

### Machine Agent Workers

Machine agents additionally do the following:

  * Run upgrade code in the new binaries once they've replaced themselves
    (implemented directly in the machine agent's `upgradeWorker` method)
  * Handle SIGABRT and permanently stop the agent (`worker/terminationworker`)
  * Handle the machine entity's death and permanently stop the agent (`worker/machiner`)
  * Watch proxy config, and reconfigure the local machine (`worker/machineenvironmentworker`)
  * Watch for contained LXC or KVM machines and provision/decommission them
    (`worker/provisioner`)

*Almost* all machine agents have JobHostUnits -- the sole exception is the state
server in a local environment, which runs directly on the user's machine (not
in a container); we don't want to pollute their day-to-day working environment
by deploying charms there (especially because the local provider is explicitly
a development tool, and charms deployed there are disproportionately likely to
be flawed or incomplete). Those that do have the job run the `worker/deployer`
code, which watches for units assigned to the machine, and deploys/recalls
upstart configs for their respective unit agents as the units are assigned or
removed. We expect the deployer implementation to change to just directly run
the unit agents' workers in its own Runner.

There remain a couple of abominations in which the machine agent looks up
information it really shouldn't have access to -- i.e. the running provider type
-- and uses that information to decide whether to start other workers. These
instances should be killed with fire if you get any opportunity to do so;
basically, all the information in agent config which *isn't* about contacting an
API server represents a layering violation that (1) confuses us and slows us
down and (2) just encourages worse layering violations as we progress.


### State Server Workers

Machines with JobManageEnviron also run a number of other workers, which do
the following:

  * Run the API server used by all other workers (in this, and other, agents:
    `state/apiserver`)
  * Provision/decommission provider instances in response to the creation/
    destruction of machine entities (`worker/provisioner`, just like the
    container provisioners run in all machine agents anyway)
  * Manipulate provider networks in response to units opening/closing ports,
    and users exposing/unexposing services (`worker/firewaller`)
  * Update network addresses and associated information for provider instances
    (`worker/instancepoller`)
  * Respond to queued DB cleanup events (`worker/cleaner`)
  * Maintain the MongoDB replica set (`worker/peergrouper`)
  * Resume incomplete MongoDB transactions (`worker/resumer`)

Many of these workers (more than strictly need to be) are wrapped as "singular"
workers, which only run on the same machine as the current MongoDB replicaset
master. When the master changes, the state connection is dropped, causing all
those workers to be stopped; when they're restarted, they decline to run,
because they're no longer on the master.
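
Conceptually, the wrapper behaves like this (a hypothetical sketch reusing the
`Worker` shape from earlier, not the real code, which lives in `worker/singular`):

```go
// startIfMaster sketches the "singular" idea: only start the wrapped
// worker when this machine currently holds the replicaset mastership.
func startIfMaster(isMaster func() (bool, error), start func() (Worker, error)) (Worker, error) {
	master, err := isMaster()
	if err != nil {
		return nil, err
	}
	if !master {
		// Not the master: run a placeholder that does nothing. When
		// mastership changes, the dropped state connection restarts
		// everything and the check is made afresh.
		return idleWorker{done: make(chan struct{})}, nil
	}
	return start()
}

// idleWorker does nothing until killed.
type idleWorker struct {
	done chan struct{}
}

func (w idleWorker) Kill()       { close(w.done) }
func (w idleWorker) Wait() error { <-w.done; return nil }
```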


### Unit Agents

Unit agents run all the common workers, and the `worker/uniter` task as well;
this task is probably the single most forbiddingly complex part of Juju. (Side
note: it's a unit-er because it deals with units, and we're bad at names; but
it's also a unite-r because it's where all the various components of juju come
together to run actual workloads.) It's sufficiently large that it deserves its
own top-level heading, below.


## The Uniter

At the highest level, the Uniter is a state machine. After a "little bit" of setup,
it runs a tight loop in which it calls `Mode` functions one after another, with the
next mode run determined by the result of its predecessor. All mode functions are
implemented in `worker/uniter/modes.go`, which is actually pretty small: just a hair
over 400 lines.
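
The shape of the loop, paraphrased from `worker/uniter` (`ModeInit` is an
illustrative starting mode, and the error plumbing is simplified):

```go
// Mode is the signature of the mode functions: each one runs until it
// knows what should happen next, and returns the next Mode to run.
type Mode func(u *Uniter) (Mode, error)

// loop drives the state machine until a mode reports an error (which
// includes "please shut down", signalled via a distinguished error).
func (u *Uniter) loop() error {
	var err error
	for mode := ModeInit; err == nil; {
		mode, err = mode(u)
	}
	return err
}
```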

It's deliberately implemented as conceptually single-threaded (just like almost
everything else in juju -- rampaging concurrency is the root of much evil, and so
we save ourselves a huge number of headaches by hiding concurrency behind event
channels and handling a single event at a time), but this property has degraded
over time; in particular, the `RunListener` code can inject events at unhelpful
times, and while the `hookLock` *probably* renders this safe it's still deeply
suboptimal, because the fact of the concurrency requires that we be *extremely*
careful with further modifications, lest they subtly break assumptions. We hope
to address this by retiring the current implementation of `juju run`, but it's
not entirely clear how to do this; in the meantime, Here Be Dragons.

Leaving these woes aside, the mode functions make use of two fundamental components,
which are glommed together until someone refactors them to make more sense. There's
the `Filter`, which is responsible for communicating with the API server (and the
rest of the outside world) such that relevant events can be delivered to the mode
funcs via channels exposed on the filter; and then there's the `Uniter` itself, which
exposes a number of methods that are expected to be called by the mode funcs.
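
A mode function is thus mostly a select over the filter's channels. In
caricature (the channel and method names here are illustrative, as is the
choice of events handled):

```go
// ModeExample caricatures a mode function: wait for whichever event
// arrives first, act on it, and either stay in this mode or return
// the next one.
func ModeExample(u *Uniter) (Mode, error) {
	for {
		select {
		case <-u.f.UnitDying():
			return ModeTerminating, nil
		case ids := <-u.f.RelationsEvents():
			if err := u.updateRelations(ids); err != nil {
				return nil, err
			}
		case <-u.f.ConfigEvents():
			if err := u.runHook("config-changed"); err != nil {
				return ModeHookError, nil
			}
		}
	}
}
```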


### Uniter Modes

XXXX


### Hook Contexts

XXXX


### The Relation Model

XXXX


## The APIs

State servers expose an API endpoint over a websocket connection. The methods
available over the API are broken down by consumer; there's a `Client` facade that
exposes the methods used by clients, an `Agent` facade that exposes the methods
common to all agents, and a wide range of worker-specific *facades* that individually
deal with particular chunks of functionality implemented by one agent or another
(for example, `Provisioner`, `Upgrader`, and `Uniter`, each used by the eponymous
worker types).

Various facades share functionality; for example, the Life method is used by many
worker facades. In these cases, the method is implemented on a separate type, which
is embedded in the facade implementation.
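
In sketch form (trimmed to essentials; the real shared implementations take an
authorisation check and a state accessor, and the facade name here is
hypothetical):

```go
// LifeGetter implements Life once, for embedding into any facade that
// needs it; the lookup function stands in for real state access.
type LifeGetter struct {
	life func(tag string) (params.Life, error)
}

func (lg *LifeGetter) Life(args params.Entities) (params.LifeResults, error) {
	results := params.LifeResults{
		Results: make([]params.LifeResult, len(args.Entities)),
	}
	for i, entity := range args.Entities {
		l, err := lg.life(entity.Tag)
		if err != nil {
			results.Results[i].Error = common.ServerError(err)
			continue
		}
		results.Results[i].Life = l
	}
	return results, nil
}

// ExampleFacade is hypothetical: embedding LifeGetter gives it a Life
// method with no further code.
type ExampleFacade struct {
	*LifeGetter
}
```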

All APIs *should* be implemented such that they can be called in bulk, but not
all of them are. The agent facades are (almost?) all implemented correctly, but
the Client facade is almost exclusively not. As functionality evolves, and new
versions of the client APIs are implemented, we must take care to implement them
consistently -- this means both implementing bulk calls *and* splitting the
monolithic Client facade into smaller service-specific facades, such that we
can evolve interaction with (say) users without bumping global API versions
across the board.
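
Concretely, the difference is in the signatures (these are illustrative, not
the exact methods):

```go
// The non-bulk style, as much of the Client facade is written today:
// one entity per call.
type nonBulk interface {
	ServiceExpose(service string) error
}

// The bulk style, as the agent facades are written: any number of
// entities per call, with a result slot for each.
type bulk interface {
	Expose(args params.Entities) (params.ErrorResults, error)
}
```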


## The Providers

Each provider represents a different possible kind of substrate on which a juju
environment can run, and (as far as possible) abstracts away the differences
between them, by making them all conform to the Environ interface. The most
important thing to understand about the various providers is that they're all
implemented without reference to broader juju concepts; they are squeezed into
a shape that's convenient with respect to allowing juju to make use of them, but
if we allow juju-level concepts to infect the providers we will suffer greatly,
because we will open a path by which changes to *juju* end up causing changes
to *all the providers at once*.
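
A heavily trimmed caricature of that interface (see `environs.Environ` for the
real thing; these method names and signatures are simplified):

```go
// Environ caricatures the provider abstraction: raw compute resources,
// provider-level firewalling, and configuration access.
type Environ interface {
	// Provision and release instances on the substrate.
	StartInstance(args StartInstanceParams) (instance.Instance, error)
	StopInstances(ids ...instance.Id) error
	Instances(ids []instance.Id) ([]instance.Instance, error)

	// Expose and hide ports at the provider level.
	OpenPorts(ports []network.PortRange) error
	ClosePorts(ports []network.PortRange) error

	// Config carries everything about the environment, including
	// provider credentials; handle with care.
	Config() *config.Config
	SetConfig(cfg *config.Config) error
}
```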

However, we lack the ability to enforce this at present, because the package
dependency flows in the wrong direction, thanks primarily (purely?) to the
StateInfo method on Environ; and we jam all sorts of gymnastics into the state
package to allow us to use Environs without doing so explicitly (see the
state.Policy interface, and its many somewhat-inelegant uses). In other places,
we have (quite reasonably) moved code out of the environs package (see both
environs/config.Config, and instance.Instance).

Environ implementations are expected to be goroutine-safe; we don't make much
use of that property at the moment, but we will be coming to depend upon it as
we move to eliminate the wasteful proliferation of Environ instances in the
API server.

It's important to note that an environ Config will generally contain sensitive
information -- a user's authentication keys for a cloud provider -- and so we
must always be careful to avoid spreading those around further than we need to.
Basically, if an environ config gets off a state server, we've screwed up.


## Bootstrapping
   286  
   287  XXXX