Backup and Restore
===================

Backup of juju's state is a critical feature, not only for juju users
but for use inside juju itself.  This is likewise the case for the
ability to restore previous backups.  This doc is intended as an
overview of both, since changes in juju are prone to break both.

Backup
-------------------

Backing up juju state involves dumping the state database (currently
from mongo) and copying all files that are critical to juju's operation.
All the files are bundled up into an archive file.  Effectively the
archive represents a snapshot of juju state.

That snapshot is stored by the state server in such a way that only a
backup ID is needed for restore (no need to upload an archive).  Note
that if the state server is not available, restoring with just the ID is
not an option.  While that situation will need to be addressed in the
short term, it should not require much additional effort.

We make reasonable efforts to ensure that the archive is consistent with
the snapshot.  This includes stopping the database for the length of
time it takes to dump it.  There is, however, room for improvement with
regard to ensuring the consistency of the archive.

First of all, running juju commands will fail while the DB is
unavailable (already running services should not be affected).  While
this period of time is rather short, we expect that it will grow over
time.  Furthermore, the larger an environment's state, the larger the
impact of this downtime.  In the long term this makes it less than ideal
to run backups as often as they should be run.

Secondly, state currently does not block for the backup process as a
whole.  This means that if we dump the DB first, it may be out of date
by the time we finish gathering the state-related files.
In practice this isn't a big concern since we do not expect the files to
change during the interval that backup is running.  However, we do back
up some log files, so there is a small chance they will differ from
their state when the backup started.

Restore
-------------------

Restore involves reviving the juju state in a new environment by
reversing the steps taken by backup.  However, the process is a bit more
complicated than just gathering files and dumping the DB.

If no state server is present, restore will do the following:

1. bootstrap a new node in safe mode (ProvisionerSafeMode reports
   whether the provisioner should not destroy machines it does not know
   about),
2. stop juju-db,
3. stop jujud-machine,
4. load the backed-up db in place of the recently created one,
5. un-tar the fs files onto the root dir of the current machine,
6. run a set of bash scripts that replace the dns/instance names of the
   old machine with those of the new machine in the relevant config
   files and also in the db (if this step is not performed, peergrouper
   will kick our machine out of the vote list and fill it with the old
   dead ones),
7. restart all services.

As noted above, restoring via an uploaded archive (rather than by using
an ID) will need to be addressed in the short term, since the existing
restore functionality works this way.  However, it shouldn't involve
more than bootstrapping a new environment, uploading the archive to it,
and then requesting restore of that backup.  The design in this document
already accommodates doing this.

HA
-------------------

HA is a work in progress.  For the moment we have basic support, which
is an extension of the regular backup functionality.
Read carefully before attempting backup/restore on an HA environment.

In the case of HA, the backup process backs up the files and DB for
machine 0; support for "any working state server" is planned for the
near future.  We assume, for now, that if you are running restore it is
because you have lost all state-server machines.  Out of this restore
you will get one functioning state server that you can use to start your
other state machines.
BEWARE: only run restore when you no longer have any working state
servers, since otherwise this will take them offline and possibly
cripple your environment.

Previous Implementation
-------------------

Backup and restore were both implemented as plugins (in cmd/plugins/) to
the juju CLI.  The plugins were essentially scripts we sent over SSH to
the state machine and ran there.  However, they were definitely distinct
pieces of software.


Implementation
===================

Key Points
-------------------

* Backups are created and then stored on the state machine.
* Each backup archive has an associated metadata document (stored in
  mongo).
* Each backup archive is stored relative to the state machine (currently
  env storage).
* Restore will have access to the state machine where the archive and
  metadata are stored.
* In the common case there is no need to upload or download a backup.
* The backups machinery has its own facade in state/apiserver/backups.
* The choice of mechanism for uploading and downloading backups has not
  been decided yet.
* The backups machinery is divided into 4 layers:
  - state-dependent functionality,
  - state-independent functionality,
  - the state API facade for backups,
  - the juju CLI sub-command for backups.
* The state-independent functionality can be broken down further:
  - a high-level backups interface/implementation,
  - low-level backup/restore functionality,
  - components of the backups machinery.
* Backups depend on the github.com/juju/utils/filestorage package.
* Backups have a special relationship with state (see the note at the
  beginning of state/backups.go).

Backup Archives
-------------------

Each backup archive is a gzipped tar file (.tar.gz).  It has the
following structure.

juju-backup/
    metadata.json - the backup metadata for the archive.
    root.tar - the bundle of state-related files (excluding mongo).
    dump/ - all the files dumped from the DB (using mongodump).

At present we do not include any sort of manifest/index file in the
archive.

For more information, see:
- state/backups/db/dump.go - how the DB is dumped;
- state/backups/files/files.go - which files are included in root.tar.

File Layout
--------------------

The layering of the backups machinery and the divisions of the
state-independent functionality map almost directly to the following
structure in the filesystem.  The state API facade for backups is spread
between state/apiserver and state/api.

state/
    backups.go - state-dependent functionality (basically the
                 interaction with mongo and with env storage)
    backups/ - state-independent functionality
        backups.go - high-level/public backups interface/implementation
        create.go - low-level implementation of backing up juju state
        restore.go - low-level implementation of restoring juju state
        archive/ - an abstraction of a backups archive
        db/ - all stuff related to external interaction with the
              DB (internal interactions live in state/backups.go)
        files/ - all stuff related to files we back up and restore
        metadata/ - the backups metadata implementation
    apiserver/
        backups/ - the state API facade for backups
            backups.go - facade implementation (not including methods
                         for end-points)
            create.go - implementation of the Create() end-point
            info.go - (wraps state/backups/backups.go:Backups.Get)
            list.go
            remove.go
            restore.go - implementation of the Restore() end-point
    api/
        backups.go - the juju state API client implementation for the
                     backups facade
        params/
            backups.go - the backups-related API arg/result types
cmd/
    juju/
        backups.go - the juju CLI sub-command implementation

Note that upload/download aren't accommodated in apiserver/backups/ yet.

Layers of Abstraction
--------------------

As noted above, the backups machinery is divided into 4 layers and the
state-independent portion into 3 parts.  Here is an example (using
"create") of how those layers interact.

* The juju CLI for backups wraps:
  - the backups facade's Create() method.
* The state API facade wraps:
  - the high-level backups implementation (state/backups/backups.go),
  - the state-backups interactions (state/backups.go).
* The backups implementation wraps:
  - a "filestorage" implementation (../utils/filestorage:FileStorage),
  - the low-level "create" implementation,
  - DB connection info (state/backups/db/info.go),
  - the backup's metadata.
* The "create" implementation makes use of:
  - the code in state/backups/{archive,db,files}.

Backups Interface
--------------------

Backups
    Add(meta Metadata, archive io.ReadCloser) error
    Create() (id string, err error)
    Get(id string) (Metadata, io.ReadCloser, error)
    List() ([]Metadata, error)
    Remove(id string) error
    Restore(id string) error

Note: Restore() makes use of Get().

State API Facade
--------------------

BackupsAPI
    Create(BackupsCreateArgs) BackupsCreateResult
    Info(BackupsInfoArgs) BackupsMetadataResult
    List() BackupsMetadataListResult
    Remove(BackupsRemoveArgs)
    Restore(BackupsRestoreArgs)

Again note that upload and download are not yet included here.

CLI sub-command
--------------------

The juju CLI sub-command for backups is called "backups".  Its own
sub-commands have basically a 1-to-1 equivalence with the API client
methods of the same respective names.  The essential sub-commands are
exposed via the following options:

- juju backups [--create] [--quiet] [<notes>]
- juju backups --info <ID>
- juju backups --list [--brief]
- juju backups --remove <ID>
- juju backups --restore <ID>

Note: further options may be appropriate for later addition
(e.g. [<filename>] on --restore).

Other anticipated subcommands:

- juju backups --download <ID> [<filename>]
- juju backups --upload <filename>

Note that download and upload are only hypothetical, pending support in
the API facade.
When we add the ability to restore from an archive
(rather than an ID), --download and --upload (or a --filename option on
restore) will become essential.
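
To make the method set listed under "Backups Interface" more concrete,
here is a minimal Go sketch of it with an in-memory implementation.  It
is only an illustration: the Metadata struct, the memBackups type, the
"backup-N" ID scheme and the placeholder archive contents are all
assumptions made for this example, and none of it reflects how juju
actually stores or restores archives.  What it does show is the
relationship noted earlier, that Restore() is built on top of Get().

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sort"
)

// Metadata is a hypothetical stand-in for the real backups metadata type.
type Metadata struct {
	ID string
}

// Backups mirrors the method set listed under "Backups Interface".
type Backups interface {
	Add(meta Metadata, archive io.ReadCloser) error
	Create() (id string, err error)
	Get(id string) (Metadata, io.ReadCloser, error)
	List() ([]Metadata, error)
	Remove(id string) error
	Restore(id string) error
}

// memBackups is an assumed in-memory implementation, for this sketch only.
type memBackups struct {
	metas    map[string]Metadata
	data     map[string][]byte
	nextID   int
	restored []byte // archive bytes read by the most recent Restore
}

func newMemBackups() *memBackups {
	return &memBackups{metas: map[string]Metadata{}, data: map[string][]byte{}}
}

// Add stores an existing archive under the ID given in its metadata.
func (b *memBackups) Add(meta Metadata, archive io.ReadCloser) error {
	defer archive.Close()
	data, err := io.ReadAll(archive)
	if err != nil {
		return err
	}
	b.metas[meta.ID], b.data[meta.ID] = meta, data
	return nil
}

// Create fabricates a placeholder archive and stores it via Add.
func (b *memBackups) Create() (string, error) {
	b.nextID++
	id := fmt.Sprintf("backup-%d", b.nextID)
	archive := io.NopCloser(bytes.NewReader([]byte("placeholder archive for " + id)))
	if err := b.Add(Metadata{ID: id}, archive); err != nil {
		return "", err
	}
	return id, nil
}

func (b *memBackups) Get(id string) (Metadata, io.ReadCloser, error) {
	data, ok := b.data[id]
	if !ok {
		return Metadata{}, nil, fmt.Errorf("backup %q not found", id)
	}
	return b.metas[id], io.NopCloser(bytes.NewReader(data)), nil
}

// List returns the stored metadata sorted by ID.
func (b *memBackups) List() ([]Metadata, error) {
	metas := make([]Metadata, 0, len(b.metas))
	for _, m := range b.metas {
		metas = append(metas, m)
	}
	sort.Slice(metas, func(i, j int) bool { return metas[i].ID < metas[j].ID })
	return metas, nil
}

func (b *memBackups) Remove(id string) error {
	delete(b.metas, id)
	delete(b.data, id)
	return nil
}

// Restore fetches the archive through Get; a real restore would unpack it.
func (b *memBackups) Restore(id string) error {
	_, archive, err := b.Get(id)
	if err != nil {
		return err
	}
	defer archive.Close()
	b.restored, err = io.ReadAll(archive)
	return err
}

func main() {
	var backups Backups = newMemBackups()
	id, _ := backups.Create()
	fmt.Println("created", id, "restore error:", backups.Restore(id))
}
```

Because Restore() only needs an ID and goes through Get(), the common
case described in this document (no upload or download of an archive)
falls out naturally.  Restoring from an uploaded archive would, in this
sketch, amount to calling Add() first and then Restore() with the ID
from the supplied metadata.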