# Plugins in gpbackup

gpbackup and gprestore support backing up to and restoring from remote storage locations (e.g. S3) using a plugin architecture.

## Using plugins
The plugin executable must exist on all segments at the same path.

Backing up with a plugin:
```
gpbackup ... --plugin-config <Absolute path to config file>
```

Restoring with a plugin:
```
gprestore ... --plugin-config <Absolute path to config file>
```
The backup you are restoring must have been taken with the same plugin.

## Plugin configuration file format
The plugin configuration must be specified in a yaml file. This yaml file is only required to exist on the coordinator host, and is automatically copied to segment hosts.

The _executablepath_ is a required parameter and must point to the absolute path of the executable on each host. Additional parameters may be specified under the _options_ key as required by the specific plugin. Refer to the documentation for the plugin you are using for additional required parameters. The _options_ section will include "pgport" for one of the segments on a given host, in case the plugin requires usage of a postgres function. Upon a restore, the _options_ section may also contain "backup_plugin_version" if the information is available from historical records. With this historical version, a newer plugin can support backwards compatibility with backups created by older versions of the plugin.

```
executablepath: <Absolute path to plugin executable>
options:
  my_first_option: <value1>
  my_second_option: <value2>
  pgport: 5432
  backup_plugin_version: 1.3
  <Additional options for the specific plugin>
```

## Available plugins
[gpbackup_s3_plugin](https://github.com/greenplum-db/gpbackup-s3-plugin): Allows users to back up their Greenplum Database to Amazon S3.

## Developing plugins

Plugins can be written in any language as long as they can be called as an executable and adhere to the gpbackup plugin API.

gpbackup and gprestore will call the plugin executable in the format
```
[plugin_executable_name] [command] arg1 arg2
```

If an error occurs during plugin execution, plugins should write an error message to stderr and return a non-zero error code.

## Commands

The current version of our utility calls all commands listed below. Errors will occur if any of them are not defined. If your plugin does not require the functionality of one of these commands, leave the implementation empty.

See [Release Notes](#Release_Notes) for command modification history.

[setup_plugin_for_backup](#setup_plugin_for_backup)

[setup_plugin_for_restore](#setup_plugin_for_restore)

[cleanup_plugin_for_backup](#cleanup_plugin_for_backup)

[cleanup_plugin_for_restore](#cleanup_plugin_for_restore)

[backup_file](#backup_file)

[restore_file](#restore_file)

[backup_data](#backup_data)

[restore_data](#restore_data)

[plugin_api_version](#plugin_api_version)

[delete_backup](#delete_backup)

[--version](#--version)

## Command Arguments

These arguments are passed to the plugin by gpbackup/gprestore.

[config_path](#config_path): Absolute path to the config yaml file

[local_backup_directory](#local_backup_directory): The path to the directory where gpbackup would place backup files on the coordinator host if not using a plugin. Our plugins reference this path to recreate a similar directory structure on the destination system. gprestore will read files from this location so the plugin will need to create the directory during setup if it does not already exist.

[scope](#scope): The scope at which this plugin's setup/cleanup hook is invoked.
Values for this parameter are "coordinator", "segment_host" and "segment" (with "master" being a supported synonym for "coordinator" for backwards compatibility). Each setup/cleanup hook is invoked at each of these scopes. For example, in a cluster with a coordinator on 1 coordinator host and 2 segment hosts, each with 4 segments, each hook is executed as follows: there is 1 invocation of each command with the parameter "coordinator", offering a chance to perform setup/cleanup that must be done *once* per cluster. Creation/deletion of a remote directory is a perfect candidate here. Furthermore, there is 1 invocation of each command for each of the segment hosts, offering a chance to establish/tear down connectivity to a remote storage provider such as S3. Finally, there is 1 invocation of each command for each of the segments.

Note: "segment_host" and "segment" are both provided because a single physical segment host may house multiple segment processes in Greenplum. There may be some setup or cleanup required at the segment host level rather than for each segment process.

[contentID](#contentID): The contentID corresponding to the scope. This is passed in only for the "coordinator" and "segment" scopes.

[filepath](#filepath): The local path to a file written by gpbackup and/or read by gprestore.

[data_filekey](#data_filekey): The path where a data file would be written on local disk if not using a plugin. The plugin should use the filename specified in this argument when storing the streamed data on the remote system because the same path will be used as a key to the restore_data command to retrieve the data.

[timestamp](#timestamp): The timestamp key for a particular backup.

## Command API

### [setup_plugin_for_backup](#setup_plugin_for_backup)

Steps necessary to initialize plugin before backup begins. E.g.
Creating remote directories, validating connectivity, disk checks, etc.

**Usage within gpbackup:**

Called at the start of the backup process on the coordinator and each segment host.

**Arguments:**

[config_path](#config_path)

[local_backup_directory](#local_backup_directory)

[scope](#scope)

[contentID](#contentID)

**Stdout:** None

**Example:**
```
test_plugin setup_plugin_for_backup /home/test_plugin_config.yaml /data_dir-1/backups/20180101/20180101010101 coordinator -1
test_plugin setup_plugin_for_backup /home/test_plugin_config.yaml /data_dir0/backups/20180101/20180101010101 segment_host
test_plugin setup_plugin_for_backup /home/test_plugin_config.yaml /data_dir0/backups/20180101/20180101010101 segment 0
test_plugin setup_plugin_for_backup /home/test_plugin_config.yaml /data_dir1/backups/20180101/20180101010101 segment 1
```

### [setup_plugin_for_restore](#setup_plugin_for_restore)

Steps necessary to initialize plugin before restore begins. E.g. validating connectivity.

**Usage within gprestore:**

Called at the start of the restore process on the coordinator and each segment host.

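
As a rough illustration of how a setup hook might branch on the scope argument, here is a minimal shell sketch. The function name mirrors the command, but the division of work between scopes is an assumption made for this example, not something gpbackup prescribes; only the directory creation is taken from the description of local_backup_directory above.

```shell
# Sketch of a scope-aware setup hook. Which work happens at which scope is an
# assumption of this example, not a requirement of the plugin API.
setup_plugin_for_restore() {
  local config_path="$1"             # unused in this sketch
  local local_backup_directory="$2"
  local scope="$3"
  local content_id="${4:-}"          # absent for segment_host; unused here

  case "$scope" in
    coordinator|master|segment)
      # gprestore reads files from this directory, so make sure it exists.
      mkdir -p "$local_backup_directory"
      ;;
    segment_host)
      # A real plugin might check connectivity to its remote storage here,
      # once per host. This sketch does nothing.
      :
      ;;
    *)
      # Report errors on stderr and return a non-zero code.
      echo "setup_plugin_for_restore: unknown scope '$scope'" >&2
      return 1
      ;;
  esac
}
```

Accepting "master" alongside "coordinator" keeps the hook usable with older utilities that still pass the synonym.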

**Arguments:**

[config_path](#config_path)

[local_backup_directory](#local_backup_directory)

[scope](#scope)

[contentID](#contentID)

**Stdout:** None

**Example:**
```
test_plugin setup_plugin_for_restore /home/test_plugin_config.yaml /data_dir-1/backups/20180101/20180101010101 coordinator -1
test_plugin setup_plugin_for_restore /home/test_plugin_config.yaml /data_dir0/backups/20180101/20180101010101 segment_host
test_plugin setup_plugin_for_restore /home/test_plugin_config.yaml /data_dir0/backups/20180101/20180101010101 segment 0
test_plugin setup_plugin_for_restore /home/test_plugin_config.yaml /data_dir1/backups/20180101/20180101010101 segment 1
```

### [cleanup_plugin_for_backup](#cleanup_plugin_for_backup)

Steps necessary to tear down plugin once backup is complete. E.g. Disconnecting from backup service, removing temporary files created during backup, etc.

**Usage within gpbackup:**

Called during the backup teardown phase on the coordinator and each segment host. This will execute even if backup fails early due to an error.


**Arguments:**

[config_path](#config_path)

[local_backup_directory](#local_backup_directory)

[scope](#scope)

[contentID](#contentID)

**Stdout:** None

**Example:**
```
test_plugin cleanup_plugin_for_backup /home/test_plugin_config.yaml /data_dir-1/backups/20180101/20180101010101 coordinator -1
test_plugin cleanup_plugin_for_backup /home/test_plugin_config.yaml /data_dir0/backups/20180101/20180101010101 segment_host
test_plugin cleanup_plugin_for_backup /home/test_plugin_config.yaml /data_dir0/backups/20180101/20180101010101 segment 0
test_plugin cleanup_plugin_for_backup /home/test_plugin_config.yaml /data_dir1/backups/20180101/20180101010101 segment 1
```

### [cleanup_plugin_for_restore](#cleanup_plugin_for_restore)

Steps necessary to tear down plugin once restore is complete. E.g. Disconnecting from backup service, removing files created during restore, etc.

**Usage within gprestore:**

Called during the restore teardown phase on the coordinator and each segment host. This will execute even if restore fails early due to an error.


**Arguments:**

[config_path](#config_path)

[local_backup_directory](#local_backup_directory)

[scope](#scope)

[contentID](#contentID)

**Stdout:** None

**Example:**
```
test_plugin cleanup_plugin_for_restore /home/test_plugin_config.yaml /data_dir-1/backups/20180101/20180101010101 coordinator -1
test_plugin cleanup_plugin_for_restore /home/test_plugin_config.yaml /data_dir0/backups/20180101/20180101010101 segment_host
test_plugin cleanup_plugin_for_restore /home/test_plugin_config.yaml /data_dir0/backups/20180101/20180101010101 segment 0
test_plugin cleanup_plugin_for_restore /home/test_plugin_config.yaml /data_dir1/backups/20180101/20180101010101 segment 1
```

### [backup_file](#backup_file)

Given the path to a file gpbackup has created on local disk, this command should copy the file to the remote system. The original file should be left behind.

**Usage within gpbackup:**

Called once for each file created by gpbackup after the files have been written to the backup directories on local disk. Some files exist on the coordinator and others exist on the segments.

**Arguments:**

[config_path](#config_path)

[filepath_to_back_up](#filepath)

**Stdout:** None

**Example:**
```
test_plugin backup_file /home/test_plugin_config.yaml /data_dir/backups/20180101/20180101010101/gpbackup_20180101010101_metadata.sql
```

### [restore_file](#restore_file)

Given the path to a file gprestore will read on local disk, this command should recover this file from the remote system and place it at the specified path.

**Usage within gprestore:**

Called once for each file created by gpbackup to restore them to local disk so gprestore can read them. Some files will be restored to the coordinator and others to the segments.


**Arguments:**

[config_path](#config_path)

[filepath_to_restore](#filepath)

**Stdout:** None

**Example:**
```
test_plugin restore_file /home/test_plugin_config.yaml /data_dir/backups/20180101/20180101010101/gpbackup_20180101010101_metadata.sql
```

### [backup_data](#backup_data)

This command should read a potentially large stream of data from stdin and process/write this data to a remote system. The destination file should keep the same name as the provided argument for easier restore.

**Usage within gpbackup:**

Called by the gpbackup_helper agent process to stream all table data for a segment from the postgres process' stdout to the plugin's stdin. This is a single continuous stream per segment, and can be either compressed or uncompressed depending on flags provided to gpbackup.

**Arguments:**

[config_path](#config_path)

[data_filekey](#data_filekey)

**Stdout:** None

**Stdin:** Stream of data

**Example:**
```
COPY "<large amount of data>" | test_plugin backup_data /home/test_plugin_config.yaml /data_dir/backups/20180101/20180101010101/gpbackup_0_20180101010101
```

### [restore_data](#restore_data)

This command should read a potentially large data file, specified by the data_filekey argument, from the remote filesystem and process/write the contents to stdout. The data file on the remote system should have the same name as the data_filekey argument.

**Usage within gprestore:**

Called by the gpbackup_helper agent process to stream all table data for a segment from the remote system to be processed by the agent. If the backup_data command modified the data format (compression or otherwise), restore_data should perform the reverse operation before sending the data to gprestore.

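
To make the "reverse operation" requirement concrete, the following sketch pairs a backup_data that gzips the incoming stream with a restore_data that undoes it. It is only an illustration: `STORE_ROOT` is a hypothetical local directory standing in for the remote system, and the choice of gzip is an assumption of this example rather than anything gpbackup requires.

```shell
# STORE_ROOT is a hypothetical stand-in for the remote storage location.
STORE_ROOT="${STORE_ROOT:-/tmp/toy_plugin_store}"

backup_data() {
  # Args: config_path data_filekey. The table data stream arrives on stdin.
  local data_filekey="$2"
  # Store the stream under the same key so restore_data can find it later.
  mkdir -p "$STORE_ROOT$(dirname "$data_filekey")" &&
    gzip -c > "$STORE_ROOT$data_filekey"
}

restore_data() {
  # Args: config_path data_filekey. The original stream is written to stdout.
  local data_filekey="$2"
  # Reverse the compression applied by backup_data before handing data back.
  gzip -dc < "$STORE_ROOT$data_filekey"
}
```

Because both functions derive the storage location from the same data_filekey, the stream that restore_data emits is byte-for-byte what backup_data received.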

**Arguments:**

[config_path](#config_path)

[data_filekey](#data_filekey)

**Stdout:** Stream of data from the remote source

**Example:**
```
test_plugin restore_data /home/test_plugin_config.yaml /data_dir/backups/20180101/20180101010101/gpbackup_0_20180101010101 > COPY ...
```

### [plugin_api_version](#plugin_api_version)

This command should echo the gpbackup plugin API version to stdout.

**Usage within gpbackup and gprestore:**

Called to verify the plugin is using a version of the gpbackup plugin API that is compatible with the given version of gpbackup and gprestore.

**Arguments:**

None

**Stdout:** X.Y.Z

**Example:**
```
test_plugin plugin_api_version
```

### [delete_backup](#delete_backup)

This command should delete the directory specified by the given backup timestamp on the remote system.

**Arguments:**

[config_path](#config_path)

[timestamp](#timestamp)

**Stdout:** None

**Example:**
```
test_plugin delete_backup /home/test_plugin_config.yaml 20180108130802
```

### [--version](#--version)

This command should display the version of the plugin itself (not the API version).

**Arguments:** None

**Stdout:**
[plugin_name] version [git_version]

_e.g.:_ gpbackup_s3_plugin version 1.1.0+dev.2.g16b18a1

**Example:**
```
test_plugin --version
```


## Plugin flow within gpbackup and gprestore
### Backup Plugin Flow
[Figure: backup plugin flow diagram]

### Restore Plugin Flow
[Figure: restore plugin flow diagram]

## Custom yaml file
Parameters specific to a plugin can be specified through the plugin configuration yaml file. The _executablepath_ key is required and used by gpbackup and gprestore. Additional arguments should be specified under the _options_ keyword. A path to this file is passed as the first argument to every API command that takes arguments (plugin_api_version and --version take none).
Options and valid arguments should be documented by the plugin.

Example yaml file for s3:
```
executablepath: <full path to gpbackup_s3_plugin>
options:
  region: us-west-2
  aws_access_key_id: ...
  aws_secret_access_key: ...
  bucket: my_bucket_name
  folder: greenplum_backups
```

## Verification using the gpbackup plugin API test bench

We provide tests to ensure your plugin will work with gpbackup and gprestore. If the tests successfully run your plugin, you can be confident that your plugin will work with the utilities. The tests are located [here](https://github.com/cloudberrydb/gpbackup/blob/coordinator/plugins/plugin_test.sh).

Run the test bench script using:

```
plugin_test.sh [path_to_executable] [plugin_config] [optional_config_for_secondary_destination]
```

This will individually test each command and run a backup and restore using your plugin. This suite will upload small amounts of data to your destination system (<1MB total).

If the `[optional_config_for_secondary_destination]` is provided, the test bench will also restore from this secondary destination.


## [Release Notes](#Release_Notes)

### Version 0.4.0
- [delete_backup](#delete_backup) command added

### Version 0.2.0 - 0.3.0
- Added [scope](#scope) and [contentID](#contentID) arguments to setup and cleanup functions for more control over execution location.

### Version 0.1.0
- Initial commands added.
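
## Example plugin skeleton

To tie the command list together, here is a minimal sketch of a plugin that dispatches on the command name. It is not an official plugin. `toy_plugin`, `REMOTE_ROOT`, and `PLUGIN_VERSION` are hypothetical names, and the "remote system" is simply a local directory. The reported API version is taken from the most recent entry in the release notes above, and the config file is ignored for brevity.

```shell
#!/usr/bin/env bash
# Toy plugin sketch: the "remote system" is a local directory (hypothetical).
REMOTE_ROOT="${REMOTE_ROOT:-/tmp/toy_plugin_remote}"
API_VERSION="0.4.0"      # latest version listed in the release notes above
PLUGIN_VERSION="0.0.1"   # hypothetical version of this toy plugin

plugin_main() {
  local command="$1"
  shift
  case "$command" in
    setup_plugin_for_backup|setup_plugin_for_restore)
      # Args: config_path local_backup_directory scope [contentID]
      mkdir -p "$REMOTE_ROOT" "$2"
      ;;
    cleanup_plugin_for_backup|cleanup_plugin_for_restore)
      :  # nothing to tear down in this sketch
      ;;
    backup_file)
      # Args: config_path filepath. Copy the file; leave the original in place.
      mkdir -p "$REMOTE_ROOT$(dirname "$2")" && cp "$2" "$REMOTE_ROOT$2"
      ;;
    restore_file)
      # Args: config_path filepath. Put the file back at the requested path.
      mkdir -p "$(dirname "$2")" && cp "$REMOTE_ROOT$2" "$2"
      ;;
    backup_data)
      # Args: config_path data_filekey. Data arrives on stdin.
      mkdir -p "$REMOTE_ROOT$(dirname "$2")" && cat > "$REMOTE_ROOT$2"
      ;;
    restore_data)
      # Args: config_path data_filekey. Data leaves on stdout.
      cat "$REMOTE_ROOT$2"
      ;;
    plugin_api_version)
      echo "$API_VERSION"
      ;;
    delete_backup)
      # Args: config_path timestamp. Remove directories named after the timestamp.
      find "$REMOTE_ROOT" -depth -type d -name "$2" -exec rm -rf {} +
      ;;
    --version)
      echo "toy_plugin version $PLUGIN_VERSION"
      ;;
    *)
      # Errors go to stderr with a non-zero return code.
      echo "toy_plugin: unknown command '$command'" >&2
      return 1
      ;;
  esac
}

# gpbackup and gprestore always pass a command; a bare invocation does nothing.
if [ "$#" -gt 0 ]; then plugin_main "$@"; fi
```

Saved as an executable and referenced from _executablepath_, a script of this shape could be exercised with the test bench described above, although a real plugin would read its options from the config file instead of ignoring it.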