Backup and Restore
===================

Backup of juju's state is a critical feature, not only for juju users
but for use inside juju itself.  This is likewise the case for the
ability to restore previous backups.  This doc is intended as an
overview of both, since changes in juju are prone to break either.

Backup
-------------------

Backing up juju state involves dumping the state database (currently
from mongo) and copying all files that are critical to juju's operation.
All the files are bundled up into an archive file.  Effectively the
archive represents a snapshot of juju state.

That snapshot is stored by the state server in such a way that only a
backup ID is needed for restore (no need to upload an archive).  Note
that if the state server is not available, restoring with just the ID is
not an option.  While that situation will need to be addressed in the
short term, it should not require much additional effort.

We make reasonable efforts to ensure that the archive is consistent with
the snapshot.  This includes stopping the database for the length of
time it takes to dump it.  There is, however, room for improvement with
regard to ensuring the consistency of the archive.

First of all, running juju commands will fail while the DB is
unavailable (already running services should not be affected).  While
this period of time is rather short, we expect that it will grow over
time.  Furthermore, the larger an environment's state, the larger the
impact of this downtime.  In the long term this makes it less than ideal
to run backup as often as it should be run.

Secondly, state currently does not block for the backup process as a
whole.  This means that if we dump the DB first, it may be out of date
by the time we finish gathering the state-related files.  In practice
this isn't a big concern since we do not expect the files to change
during the interval that backup is running.  However, we do back up some
log files, so there is a small chance they will differ from when backup
started.

Restore
-------------------

Restore involves reviving the juju state in a new environment by
reversing the steps taken by backup.  However, the process is a bit more
complicated than just gathering files and dumping the DB.

If no state server is present, restore will do the following:

1. bootstrap a new node in safe mode (ProvisionerSafeMode reports
   whether the provisioner should not destroy machines it does not know
   about),
2. stop juju-db,
3. stop jujud-machine,
4. load the backed-up db in place of the recently created one,
5. un-tar the fs files onto the root dir of the current machine,
6. run a set of bash scripts that replace the dns/instance names of the
   old machine with those of the new machine in the relevant config
   files and also in the db (if this step is not performed, peergrouper
   will kick our machine out of the vote list and fill it with the old
   dead ones),
7. restart all services.
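
The ordering of these steps matters: each one assumes the previous one
succeeded.  Below is a minimal Go sketch of that sequencing.  The step
type, runSteps, restoreSteps and errDemo are illustrative names that do
not come from juju, and the step bodies are placeholders rather than the
real restore logic.

```go
package main

import (
	"errors"
	"fmt"
)

// errDemo is a placeholder failure used to show early termination.
var errDemo = errors.New("demo failure")

// step is one stage of the restore sequence (hypothetical type).
type step struct {
	name string
	run  func() error
}

// runSteps executes the steps in order and stops at the first failure.
// It returns the names of the steps that completed.
func runSteps(steps []step) ([]string, error) {
	var done []string
	for _, s := range steps {
		if err := s.run(); err != nil {
			return done, fmt.Errorf("restore step %q failed: %v", s.name, err)
		}
		done = append(done, s.name)
	}
	return done, nil
}

// restoreSteps lists the seven steps in order.  Every step gets the same
// placeholder function because the real work is out of scope here.
func restoreSteps(run func() error) []step {
	return []step{
		{"bootstrap new node in safe mode", run},
		{"stop juju-db", run},
		{"stop jujud-machine", run},
		{"load backed-up db", run},
		{"un-tar fs files onto root dir", run},
		{"replace old dns/instance names", run},
		{"restart all services", run},
	}
}

func main() {
	ok := func() error { return nil }

	// All steps succeed.
	done, err := runSteps(restoreSteps(ok))
	fmt.Println(len(done), err)

	// Make the fourth step fail; later steps are never attempted.
	steps := restoreSteps(ok)
	steps[3].run = func() error { return errDemo }
	done, err = runSteps(steps)
	fmt.Println(len(done), err)
}
```

Stopping at the first error keeps a failed restore from restarting
services against a half-loaded database.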

As noted above, restoring via an uploaded archive (rather than by using
an ID) will need to be addressed in the short term, since the existing
restore functionality works this way.  However, it shouldn't involve
more than bootstrapping a new environment, uploading the archive to it,
and then requesting restore of that backup.  The design in this document
already accommodates doing this.

HA
-------------------

HA is a work in progress.  For the moment we have basic support, which
is an extension of the regular backup functionality.  Read carefully
before attempting backup/restore on an HA environment.

In the case of HA, the backup process will back up files/db for machine
0 only; support for "any working state server" is planned for the near
future.  We assume, for now, that if you are running restore it is
because you have lost all state server machines.  Out of this restore
you will get one functioning state server that you can use to start your
other state machines.

BEWARE: only run restore in the case where you no longer have working
state servers, since otherwise this will take them offline and possibly
cripple your environment.

Previous Implementation
-------------------

Backup and restore were both implemented as plugins (in cmd/plugins/) to
the juju CLI.  The plugins were essentially scripts we sent over SSH to
the state machine and ran there.  However, they were definitely distinct
pieces of software.


Implementation
===================

Key Points
-------------------

* Backups are created and then stored on the state machine.
* Each backup archive has an associated metadata document (stored in
  mongo).
* Each backup archive is stored relative to the state machine (currently
  env storage).
* Restore will have access to the state machine where the archive and
  metadata are stored.
* In the common case there is no need to upload or download a backup.
* The backups machinery has its own facade in state/apiserver/backups.
* The choice of mechanism for uploading and downloading backups has not
  been decided yet.
* The backups machinery is divided into 4 layers:
  - state-dependent functionality,
  - state-independent functionality,
  - the state API facade for backups,
  - the juju CLI sub-command for backups.
* The state-independent functionality can be broken down further:
  - a high-level backups interface/implementation,
  - low-level backup/restore functionality,
  - components of the backups machinery.
* Backups depend on the github.com/juju/utils/filestorage package.
* Backups have a special relationship with state (see note at
  beginning of state/backups.go).

Backup Archives
-------------------

Each backup archive is a gzipped tar file (.tar.gz).  It has the
following structure.

juju-backup/
    metadata.json - the backup metadata for the archive.
    root.tar      - the bundle of state-related files (excluding mongo).
    dump/         - all the files dumped from the DB (using mongodump).

At present we do not include any sort of manifest/index file in the
archive.

For more information, see:
  - state/backups/db/dump.go     - how the DB is dumped;
  - state/backups/files/files.go - which files are included in root.tar.
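
To make the layout concrete, here is a rough Go sketch that writes a
gzipped tar stream with that structure and reads the entry names back.
It uses only the standard library.  The function names, the metadata
contents and the dump file name are assumptions made for illustration;
the actual code lives in the files listed above.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"sort"
)

// writeArchive writes a .tar.gz stream laid out as juju-backup/...
// dump maps file names (relative to dump/) to their contents.
func writeArchive(w io.Writer, metadata, rootTar []byte, dump map[string][]byte) error {
	gz := gzip.NewWriter(w)
	tw := tar.NewWriter(gz)

	add := func(name string, data []byte) error {
		hdr := &tar.Header{
			Name:     name,
			Mode:     0600,
			Size:     int64(len(data)),
			Typeflag: tar.TypeReg,
		}
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		_, err := tw.Write(data)
		return err
	}

	if err := add("juju-backup/metadata.json", metadata); err != nil {
		return err
	}
	if err := add("juju-backup/root.tar", rootTar); err != nil {
		return err
	}
	// Sort the dump file names so the archive contents are deterministic.
	names := make([]string, 0, len(dump))
	for name := range dump {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		if err := add("juju-backup/dump/"+name, dump[name]); err != nil {
			return err
		}
	}
	// Close the tar writer first so its trailer is flushed into gzip.
	if err := tw.Close(); err != nil {
		return err
	}
	return gz.Close()
}

// listArchive returns the entry names found in a .tar.gz stream.
func listArchive(r io.Reader) ([]string, error) {
	gz, err := gzip.NewReader(r)
	if err != nil {
		return nil, err
	}
	defer gz.Close()
	tr := tar.NewReader(gz)
	var names []string
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		names = append(names, hdr.Name)
	}
}

// buildExample round-trips a tiny archive built from placeholder data.
func buildExample() ([]string, error) {
	var buf bytes.Buffer
	dump := map[string][]byte{"example.bson": []byte("placeholder")}
	err := writeArchive(&buf, []byte(`{"id":"example"}`), []byte("placeholder"), dump)
	if err != nil {
		return nil, err
	}
	return listArchive(&buf)
}

func main() {
	names, err := buildExample()
	fmt.Println(names, err)
}
```

Since the archive carries no manifest/index file, walking the entries
like this is the only way to discover what a given archive contains.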

File Layout
--------------------

The layering of the backups machinery and the divisions of the
state-independent functionality map almost directly to the following
structure in the filesystem.  The state API facade for backups is spread
between state/apiserver and state/api.

state/
    backups.go - state-dependent functionality (basically the
                 interaction with mongo and with env storage)
    backups/   - state-independent functionality
        backups.go - high-level/public backups interface/implementation
        create.go  - low-level implementation of backing up juju state
        restore.go - low-level implementation of restoring juju state
        archive/   - an abstraction of a backups archive
        db/        - all stuff related to external interaction with the
                     DB (internal interactions live in state/backups.go)
        files/     - all stuff related to files we back up and restore
        metadata/  - the backups metadata implementation
    apiserver/
        backups/ - the state API facade for backups
            backups.go - facade implementation (not including methods
                         for end-points)
            create.go  - implementation of the Create() end-point
            info.go    - (wraps state/backups/backups.go:Backups.Get)
            list.go
            remove.go
            restore.go - implementation of the Restore() end-point
    api/
        backups.go - the juju state API client implementation for the
                     backups facade
        params/
            backups.go - the backups-related API arg/result types
    cmd/
        juju/
            backups.go - the juju CLI sub-command implementation

Note that upload/download aren't accommodated in apiserver/backups/ yet.

Layers of Abstraction
--------------------

As noted above, the backups machinery is divided into 4 layers and the
state-independent portion into 3 parts.  Here is an example (using
"create") of how those layers interact.

* The juju CLI for backups wraps:
  - the backups facade's Create() method.
* The state API facade wraps:
  - the high-level backups implementation (state/backups/backups.go),
  - the state-backups interactions (state/backups.go).
* The backups implementation wraps:
  - a "filestorage" implementation (../utils/filestorage:FileStorage),
  - the low-level "create" implementation,
  - DB connection info (state/backups/db/info.go),
  - the backup's metadata.
* The "create" implementation makes use of:
  - the code in state/backups/{archive,db,files}.
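
The wrapping described above can be sketched in a few lines of Go.  This
is only an illustration of the layering.  Every type and function here
is a stand-in: storage is a map rather than a filestorage
implementation, the arg/result fields are guesses, and DB connection
info and metadata are left out.

```go
package main

import "fmt"

// createArchive stands in for the low-level "create" implementation.
func createArchive() ([]byte, error) {
	return []byte("placeholder archive"), nil
}

// storage stands in for a filestorage implementation (hypothetical).
type storage map[string][]byte

// backups is the high-level implementation: it wraps the storage and
// the low-level create function.
type backups struct {
	store  storage
	create func() ([]byte, error)
}

func (b *backups) Create() (string, error) {
	data, err := b.create()
	if err != nil {
		return "", err
	}
	id := fmt.Sprintf("backup-%d", len(b.store)+1)
	b.store[id] = data
	return id, nil
}

// The arg/result types are named in this document, but their fields
// are assumptions made for this sketch.
type BackupsCreateArgs struct{ Notes string }

type BackupsCreateResult struct {
	ID    string
	Error string
}

// facade is the API layer: it wraps the high-level implementation and
// turns Go errors into a serializable result.
type facade struct{ backups *backups }

func (f *facade) Create(args BackupsCreateArgs) BackupsCreateResult {
	// args.Notes would end up in the backup's metadata; ignored here.
	id, err := f.backups.Create()
	if err != nil {
		return BackupsCreateResult{Error: err.Error()}
	}
	return BackupsCreateResult{ID: id}
}

// runCreate is the CLI layer: it wraps the facade's Create() method.
func runCreate(f *facade, notes string) string {
	res := f.Create(BackupsCreateArgs{Notes: notes})
	if res.Error != "" {
		return "error: " + res.Error
	}
	return res.ID
}

func main() {
	f := &facade{backups: &backups{store: storage{}, create: createArchive}}
	fmt.Println(runCreate(f, "before upgrade"))
}
```

Each layer only knows about the one directly beneath it, which is what
lets the low-level pieces be tested without a running API server.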

Backups Interface
--------------------

Backups
  Add(meta Metadata, archive io.ReadCloser) error
  Create() (id string, err error)
  Get(id string) (Metadata, io.ReadCloser, error)
  List() ([]Metadata, error)
  Remove(id string) error
  Restore(id string) error

Note: Restore() makes use of Get().
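
Written out as a Go interface, with a toy in-memory implementation, the
listing above might look like the sketch below.  The Metadata fields,
memBackups, newMemBackups and errNotFound are assumptions for the sake
of a runnable example; only the method set comes from this document.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// Metadata is cut down to two illustrative fields.
type Metadata struct {
	ID    string
	Notes string
}

type Backups interface {
	Add(meta Metadata, archive io.ReadCloser) error
	Create() (id string, err error)
	Get(id string) (Metadata, io.ReadCloser, error)
	List() ([]Metadata, error)
	Remove(id string) error
	Restore(id string) error
}

var errNotFound = errors.New("backup not found")

// memBackups keeps everything in memory (hypothetical implementation).
type memBackups struct {
	metas    map[string]Metadata
	archives map[string][]byte
	order    []string // IDs in insertion order, so List is deterministic
	next     int      // counter used to mint IDs in Create
	restored string   // ID of the last backup handed to Restore
}

var _ Backups = (*memBackups)(nil) // compile-time interface check

func newMemBackups() *memBackups {
	return &memBackups{metas: map[string]Metadata{}, archives: map[string][]byte{}}
}

func (b *memBackups) Add(meta Metadata, archive io.ReadCloser) error {
	defer archive.Close()
	if _, ok := b.metas[meta.ID]; ok {
		return fmt.Errorf("backup %q already exists", meta.ID)
	}
	data, err := io.ReadAll(archive)
	if err != nil {
		return err
	}
	b.metas[meta.ID], b.archives[meta.ID] = meta, data
	b.order = append(b.order, meta.ID)
	return nil
}

func (b *memBackups) Create() (string, error) {
	b.next++
	id := fmt.Sprintf("backup-%d", b.next)
	archive := io.NopCloser(bytes.NewReader([]byte("placeholder archive")))
	return id, b.Add(Metadata{ID: id}, archive)
}

func (b *memBackups) Get(id string) (Metadata, io.ReadCloser, error) {
	meta, ok := b.metas[id]
	if !ok {
		return Metadata{}, nil, errNotFound
	}
	return meta, io.NopCloser(bytes.NewReader(b.archives[id])), nil
}

func (b *memBackups) List() ([]Metadata, error) {
	list := make([]Metadata, 0, len(b.order))
	for _, id := range b.order {
		list = append(list, b.metas[id])
	}
	return list, nil
}

func (b *memBackups) Remove(id string) error {
	if _, ok := b.metas[id]; !ok {
		return errNotFound
	}
	delete(b.metas, id)
	delete(b.archives, id)
	for i, v := range b.order {
		if v == id {
			b.order = append(b.order[:i], b.order[i+1:]...)
			break
		}
	}
	return nil
}

// Restore fetches the archive through Get, as the note above says.
func (b *memBackups) Restore(id string) error {
	meta, archive, err := b.Get(id)
	if err != nil {
		return err
	}
	defer archive.Close()
	if _, err := io.ReadAll(archive); err != nil {
		return err
	}
	b.restored = meta.ID
	return nil
}

func main() {
	var b Backups = newMemBackups()
	id, err := b.Create()
	fmt.Println(id, err, b.Restore(id))
}
```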

State API Facade
--------------------

BackupsAPI
  Create(BackupsCreateArgs) BackupsCreateResult
  Info(BackupsInfoArgs) BackupsMetadataResult
  List() BackupsMetadataListResult
  Remove(BackupsRemoveArgs)
  Restore(BackupsRestoreArgs)

Again note that upload and download are not yet included here.

CLI sub-command
--------------------

The juju CLI sub-command for backups is called "backups".  Its own
sub-commands have basically a 1-to-1 equivalence with the API client
methods of the same respective names.  The essential sub-commands are
exposed via the following options:

- juju backups [--create] [--quiet] [<notes>]
- juju backups --info <ID>
- juju backups --list [--brief]
- juju backups --remove <ID>
- juju backups --restore <ID>

Note: further options may be appropriate for later addition
      (e.g. [<filename>] on --restore).
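
As a rough illustration of how those options map onto a single action,
here is a sketch using Go's standard flag package.  That package is a
stand-in chosen for this example and is not necessarily what the juju
CLI uses.  The request type and parseBackups are hypothetical names, and
the rule that only one action may be given is an assumption.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
	"strings"
)

// request is the parsed form of one "juju backups" invocation.
type request struct {
	Action string // "create", "info", "list", "remove" or "restore"
	ID     string
	Notes  string
	Quiet  bool
	Brief  bool
}

// parseBackups parses the arguments that follow "juju backups".
func parseBackups(args []string) (request, error) {
	fs := flag.NewFlagSet("backups", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage output out of this sketch
	create := fs.Bool("create", false, "create a backup (the default)")
	quiet := fs.Bool("quiet", false, "suppress output on create")
	info := fs.String("info", "", "show metadata for the given ID")
	list := fs.Bool("list", false, "list all backups")
	brief := fs.Bool("brief", false, "shorter list output")
	remove := fs.String("remove", "", "remove the backup with the given ID")
	restore := fs.String("restore", "", "restore the backup with the given ID")
	if err := fs.Parse(args); err != nil {
		return request{}, err
	}

	req := request{Quiet: *quiet, Brief: *brief}
	chosen := 0
	pick := func(action, id string) {
		req.Action, req.ID = action, id
		chosen++
	}
	if *create {
		pick("create", "")
	}
	if *info != "" {
		pick("info", *info)
	}
	if *list {
		pick("list", "")
	}
	if *remove != "" {
		pick("remove", *remove)
	}
	if *restore != "" {
		pick("restore", *restore)
	}

	switch {
	case chosen > 1:
		return request{}, errors.New("only one action may be given")
	case chosen == 0:
		req.Action = "create" // --create is optional
	}
	if req.Action == "create" {
		req.Notes = strings.Join(fs.Args(), " ")
	} else if fs.NArg() > 0 {
		return request{}, fmt.Errorf("unexpected arguments: %v", fs.Args())
	}
	return req, nil
}

func main() {
	req, err := parseBackups([]string{"--info", "some-id"})
	fmt.Println(req.Action, req.ID, err)
}
```

Treating --create as the default keeps the common case short: a bare
"juju backups" with optional notes creates a backup.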

Other anticipated subcommands:

- juju backups --download <ID> [<filename>]
- juju backups --upload <filename>

Note that download and upload are only hypothetical, pending support in
the API facade.  When we add the ability to restore from an archive
(rather than an ID), --download and --upload (or a --filename option on
restore) will become essential.
   255  (rather than an ID), --download and --upload (or a --filename option on
   256  restore) will become essential.