# Plugins in gpbackup

gpbackup and gprestore support backing up to and restoring from remote storage locations (e.g. S3) using a plugin architecture.

## Using plugins
The plugin executable must exist on all segments at the same path.

Backing up with a plugin:
```
gpbackup ... --plugin-config <Absolute path to config file>
```

Restoring with a plugin:
```
gprestore ... --plugin-config <Absolute path to config file>
```
The backup you are restoring must have been taken with the same plugin.

## Plugin configuration file format
The plugin configuration must be specified in a YAML file. This file is only required to exist on the coordinator host, and is automatically copied to segment hosts.

The _executablepath_ is a required parameter and must point to the absolute path of the executable on each host. Additional parameters may be specified under the _options_ key as required by the specific plugin. Refer to the documentation for the plugin you are using for additional required parameters. The _options_ section will include "pgport" for one of the segments on a given host, in case the plugin needs to call a postgres function. Upon a restore, the _options_ section may also contain "backup_plugin_version" if the information is available from historical records. With this historical version, a newer plugin can remain backward compatible with backups created by older plugin versions.

```
executablepath: <Absolute path to plugin executable>
options:
  my_first_option: <value1>
  my_second_option: <value2>
  pgport: 5432
  backup_plugin_version: 1.3
  <Additional options for the specific plugin>
```

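A plugin usually needs to read its own settings back out of this file. As a minimal sketch, assuming the options are flat `key: value` scalars as in the example above, a shell plugin could extract one with `awk`. The helper name `get_option` is hypothetical and not part of gpbackup; a plugin with richer configuration would more likely use a real YAML parser.

```shell
# get_option CONFIG_PATH KEY
# Prints the value of KEY found under the top-level "options:" key.
# Hypothetical helper: handles only flat "key: value" scalars.
get_option() {
  awk -v key="$2" '
    /^options:/ { in_options = 1; next }   # entering the options section
    /^[^ ]/     { in_options = 0 }         # any other top-level key ends it
    in_options && $1 == (key ":") {
      sub(/^[^:]*:[ ]*/, "")               # drop "  key: " and keep the value
      print
      exit
    }
  ' "$1"
}
```

For example, `pgport=$(get_option "$config_path" pgport)` would pick up the port that gpbackup adds to the _options_ section.
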
## Available plugins
[gpbackup_s3_plugin](https://github.com/greenplum-db/gpbackup-s3-plugin): Allows users to back up their Greenplum Database to Amazon S3.

## Developing plugins

Plugins can be written in any language as long as they can be called as an executable and adhere to the gpbackup plugin API.

gpbackup and gprestore will call the plugin executable in the format
```
[plugin_executable_name] [command] arg1 arg2
```

If an error occurs during plugin execution, plugins should write an error message to stderr and return a non-zero error code.

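The calling convention can be sketched as a small shell dispatcher. This is an illustration only: the function name `plugin_main`, the plugin name `test_plugin`, and both version strings are assumptions, and the file and data commands are left out for brevity, so here they would fall through to the error branch.

```shell
# Hypothetical skeleton: plugin_main and the version strings are illustrative.
plugin_main() {
  cmd="$1"
  case "$cmd" in
    setup_plugin_for_backup|setup_plugin_for_restore|cleanup_plugin_for_backup|cleanup_plugin_for_restore)
      # No work needed in this sketch; an empty implementation is allowed.
      return 0
      ;;
    plugin_api_version)
      echo "0.4.0"                         # assumed API version, format X.Y.Z
      ;;
    --version)
      echo "test_plugin version 1.0.0"     # [plugin_name] version [git_version]
      ;;
    *)
      # Report the problem on stderr and signal failure to the caller.
      echo "test_plugin: unknown command: $cmd" >&2
      return 1
      ;;
  esac
}

# A real plugin script would end with:  plugin_main "$@"; exit $?
```

Keeping the dispatch in one function makes it easy to see which commands are implemented and keeps the error path (stderr plus non-zero status) in a single place.
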
## Commands

The current version of our utility calls all commands listed below. Errors will occur if any of them are not defined. If your plugin does not require the functionality of one of these commands, leave the implementation empty.

See [Release Notes](#Release_Notes) for command modification history.

[setup_plugin_for_backup](#setup_plugin_for_backup)

[setup_plugin_for_restore](#setup_plugin_for_restore)

[cleanup_plugin_for_backup](#cleanup_plugin_for_backup)

[cleanup_plugin_for_restore](#cleanup_plugin_for_restore)

[backup_file](#backup_file)

[restore_file](#restore_file)

[backup_data](#backup_data)

[restore_data](#restore_data)

[plugin_api_version](#plugin_api_version)

[delete_backup](#delete_backup)

[--version](#--version)

## Command Arguments

These arguments are passed to the plugin by gpbackup/gprestore.

[config_path](#config_path): Absolute path to the config yaml file

[local_backup_directory](#local_backup_directory): The path to the directory where gpbackup would place backup files on the coordinator host if not using a plugin. Our plugins reference this path to recreate a similar directory structure on the destination system. gprestore will read files from this location so the plugin will need to create the directory during setup if it does not already exist.

[scope](#scope): The scope at which this plugin's setup/cleanup hook is invoked. Values for this parameter are "coordinator", "segment_host" and "segment" (with "master" being a supported synonym for "coordinator" for backwards compatibility). Each such hook is invoked at each of these scopes. For example, in a cluster with a coordinator on 1 coordinator host and 2 segment hosts with 4 segments each, each hook is executed as follows: there is 1 invocation with the parameter "coordinator", offering a chance to perform setup/cleanup that must be done *once* per cluster, such as creating or deleting a remote directory. There is also 1 invocation per segment host, offering a chance to establish or tear down connectivity to a remote storage provider such as S3. Finally, there is 1 invocation per segment.

Note: "segment_host" and "segment" are both provided because a single physical segment host may house multiple segment processes in Greenplum. There may be some setup or cleanup required at the segment host level rather than for each segment process.

[contentID](#contentID): The contentID corresponding to the scope. This is passed in only for the "coordinator" and "segment" scopes.

[filepath](#filepath): The local path to a file written by gpbackup and/or read by gprestore.

[data_filekey](#data_filekey): The path where a data file would be written on local disk if not using a plugin. The plugin should use the filename specified in this argument when storing the streamed data on the remote system because the same path will be used as a key to the restore_data command to retrieve the data.

[timestamp](#timestamp): The timestamp key for a particular backup.

## Command API

### [setup_plugin_for_backup](#setup_plugin_for_backup)

Steps necessary to initialize plugin before backup begins. E.g. Creating remote directories, validating connectivity, disk checks, etc.

**Usage within gpbackup:**

Called at the start of the backup process on the coordinator and each segment host.

**Arguments:**

[config_path](#config_path)

[local_backup_directory](#local_backup_directory)

[scope](#scope)

[contentID](#contentID)

**Stdout:** None

**Example:**
```
test_plugin setup_plugin_for_backup /home/test_plugin_config.yaml /data_dir-1/backups/20180101/20180101010101 coordinator -1
test_plugin setup_plugin_for_backup /home/test_plugin_config.yaml /data_dir0/backups/20180101/20180101010101 segment_host
test_plugin setup_plugin_for_backup /home/test_plugin_config.yaml /data_dir0/backups/20180101/20180101010101 segment 0
test_plugin setup_plugin_for_backup /home/test_plugin_config.yaml /data_dir1/backups/20180101/20180101010101 segment 1
```

### [setup_plugin_for_restore](#setup_plugin_for_restore)

Steps necessary to initialize plugin before restore begins. E.g. validating connectivity

**Usage within gprestore:**

Called at the start of the restore process on the coordinator and each segment host.

**Arguments:**

[config_path](#config_path)

[local_backup_directory](#local_backup_directory)

[scope](#scope)

[contentID](#contentID)

**Stdout:** None

**Example:**
```
test_plugin setup_plugin_for_restore /home/test_plugin_config.yaml /data_dir-1/backups/20180101/20180101010101 coordinator -1
test_plugin setup_plugin_for_restore /home/test_plugin_config.yaml /data_dir0/backups/20180101/20180101010101 segment_host
test_plugin setup_plugin_for_restore /home/test_plugin_config.yaml /data_dir0/backups/20180101/20180101010101 segment 0
test_plugin setup_plugin_for_restore /home/test_plugin_config.yaml /data_dir1/backups/20180101/20180101010101 segment 1
```

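As a rough sketch of how a setup hook might branch on the scope argument, the function below creates the local backup directory for the "coordinator" and "segment" scopes and does nothing per host. Which scope should create the directory is an assumption made for illustration; the only requirement stated in this document is that the directory exists before gprestore reads from it.

```shell
# Sketch of a setup hook. Arguments, in order:
#   config_path local_backup_directory scope [contentID]
setup_plugin_for_restore() {
  local_backup_directory="$2"
  scope="$3"
  case "$scope" in
    coordinator|master|segment)
      # gprestore reads files from this directory, so make sure it exists.
      mkdir -p "$local_backup_directory" || {
        echo "cannot create $local_backup_directory" >&2
        return 1
      }
      ;;
    segment_host)
      # Per-host work (for example, checking connectivity) would go here.
      :
      ;;
    *)
      echo "unknown scope: $scope" >&2
      return 1
      ;;
  esac
}
```
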
### [cleanup_plugin_for_backup](#cleanup_plugin_for_backup)

Steps necessary to tear down plugin once backup is complete. E.g. Disconnecting from backup service, removing temporary files created during backup, etc.

**Usage within gpbackup:**

Called during the backup teardown phase on the coordinator and each segment host. This will execute even if backup fails early due to an error.

**Arguments:**

[config_path](#config_path)

[local_backup_directory](#local_backup_directory)

[scope](#scope)

[contentID](#contentID)

**Stdout:** None

**Example:**
```
test_plugin cleanup_plugin_for_backup /home/test_plugin_config.yaml /data_dir-1/backups/20180101/20180101010101 coordinator -1
test_plugin cleanup_plugin_for_backup /home/test_plugin_config.yaml /data_dir0/backups/20180101/20180101010101 segment_host
test_plugin cleanup_plugin_for_backup /home/test_plugin_config.yaml /data_dir0/backups/20180101/20180101010101 segment 0
test_plugin cleanup_plugin_for_backup /home/test_plugin_config.yaml /data_dir1/backups/20180101/20180101010101 segment 1
```

### [cleanup_plugin_for_restore](#cleanup_plugin_for_restore)

Steps necessary to tear down plugin once restore is complete. E.g. Disconnecting from backup service, removing files created during restore, etc.

**Usage within gprestore:**

Called during the restore teardown phase on the coordinator and each segment host. This will execute even if restore fails early due to an error.

**Arguments:**

[config_path](#config_path)

[local_backup_directory](#local_backup_directory)

[scope](#scope)

[contentID](#contentID)

**Stdout:** None

**Example:**
```
test_plugin cleanup_plugin_for_restore /home/test_plugin_config.yaml /data_dir-1/backups/20180101/20180101010101 coordinator -1
test_plugin cleanup_plugin_for_restore /home/test_plugin_config.yaml /data_dir0/backups/20180101/20180101010101 segment_host
test_plugin cleanup_plugin_for_restore /home/test_plugin_config.yaml /data_dir0/backups/20180101/20180101010101 segment 0
test_plugin cleanup_plugin_for_restore /home/test_plugin_config.yaml /data_dir1/backups/20180101/20180101010101 segment 1
```

### [backup_file](#backup_file)

Given the path to a file gpbackup has created on local disk, this command should copy the file to the remote system. The original file should be left behind.

**Usage within gpbackup:**

Called once for each file created by gpbackup after the files have been written to the backup directories on local disk. Some files exist on the coordinator and others exist on the segments.

**Arguments:**

[config_path](#config_path)

[filepath_to_back_up](#filepath)

**Stdout:** None

**Example:**
```
test_plugin backup_file /home/test_plugin_config.yaml /data_dir/backups/20180101/20180101010101/gpbackup_20180101010101_metadata.sql
```

### [restore_file](#restore_file)

Given the path to a file gprestore will read on local disk, this command should recover this file from the remote system and place it at the specified path.

**Usage within gprestore:**

Called once for each file created by gpbackup to restore them to local disk so gprestore can read them. Some files will be restored to the coordinator and others to the segments.

**Arguments:**

[config_path](#config_path)

[filepath_to_restore](#filepath)

**Stdout:** None

**Example:**
```
test_plugin restore_file /home/test_plugin_config.yaml /data_dir/backups/20180101/20180101010101/gpbackup_20180101010101_metadata.sql
```

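The pairing of backup_file and restore_file can be sketched with a local directory standing in for the remote system. `REMOTE_ROOT` is a hypothetical variable introduced for this illustration; a real plugin would replace the `cp` calls with uploads to and downloads from its storage service.

```shell
# REMOTE_ROOT is a hypothetical local directory standing in for remote storage.
REMOTE_ROOT="${REMOTE_ROOT:-/tmp/plugin_remote}"

# backup_file CONFIG_PATH FILEPATH: copy the local file to the "remote" side.
# The original file is left in place.
backup_file() {
  mkdir -p "$REMOTE_ROOT$(dirname "$2")" && cp "$2" "$REMOTE_ROOT$2" || {
    echo "backup_file: failed to store $2" >&2
    return 1
  }
}

# restore_file CONFIG_PATH FILEPATH: bring the file back to the same local path.
restore_file() {
  mkdir -p "$(dirname "$2")" && cp "$REMOTE_ROOT$2" "$2" || {
    echo "restore_file: failed to fetch $2" >&2
    return 1
  }
}
```

Using the full local path as the remote key keeps the two commands symmetric: restore_file needs no lookup beyond the path it is handed.
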
### [backup_data](#backup_data)

This command should read a potentially large stream of data from stdin and process/write this data to a remote system. The destination file should keep the same name as the provided argument for easier restore.

**Usage within gpbackup:**

Called by the gpbackup_helper agent process to stream all table data for a segment from the postgres process' stdout to the plugin's stdin. This is a single continuous stream per segment, and can be either compressed or uncompressed depending on flags provided to gpbackup.

**Arguments:**

[config_path](#config_path)

[data_filekey](#data_filekey)

**Stdout:** None

**Stdin:** Stream of data to back up

**Example:**
```
COPY "<large amount of data>" | test_plugin backup_data /home/test_plugin_config.yaml /data_dir/backups/20180101/20180101010101/gpbackup_0_20180101010101
```

### [restore_data](#restore_data)

This command should read a potentially large data file specified by the data_filekey argument from the remote filesystem and process/write the contents to stdout. The data file on the remote system should have the same name as the data_filekey argument.

**Usage within gprestore:**

Called by the gpbackup_helper agent process to stream all table data for a segment from the remote system to be processed by the agent. If the backup_data command modified the data format (compression or otherwise), restore_data should perform the reverse operation before sending the data to gprestore.

**Arguments:**

[config_path](#config_path)

[data_filekey](#data_filekey)

**Stdout:** Stream of data from the remote source

**Example:**
```
test_plugin restore_data /home/test_plugin_config.yaml /data_dir/backups/20180101/20180101010101/gpbackup_0_20180101010101 > COPY ...
```

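The streaming pair can be sketched in the same spirit, again with a hypothetical `REMOTE_ROOT` directory standing in for the remote system. The sketch stores the stream unchanged, so restore_data has nothing to reverse; a plugin that compresses or encrypts in backup_data would undo that here.

```shell
# REMOTE_ROOT is a hypothetical local directory standing in for remote storage.
REMOTE_ROOT="${REMOTE_ROOT:-/tmp/plugin_remote}"

# backup_data CONFIG_PATH DATA_FILEKEY: store everything read from stdin
# under the given key, without buffering the whole stream in memory.
backup_data() {
  mkdir -p "$REMOTE_ROOT$(dirname "$2")" && cat > "$REMOTE_ROOT$2"
}

# restore_data CONFIG_PATH DATA_FILEKEY: write the stored stream to stdout.
restore_data() {
  cat "$REMOTE_ROOT$2"
}
```
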
### [plugin_api_version](#plugin_api_version)

This command should echo the gpbackup plugin API version to stdout.

**Usage within gpbackup and gprestore:**

Called to verify the plugin is using a version of the gpbackup plugin API that is compatible with the given version of gpbackup and gprestore.

**Arguments:**

None

**Stdout:** X.Y.Z

**Example:**
```
test_plugin plugin_api_version
```

### [delete_backup](#delete_backup)

This command should delete the directory specified by the given backup timestamp on the remote system.

**Arguments:**

[config_path](#config_path)

[timestamp](#timestamp)

**Stdout:** None

**Example:**
```
test_plugin delete_backup /home/test_plugin_config.yaml 20180108130802
```

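A sketch of delete_backup follows, under two assumptions that are not specified by the plugin API: the remote side mirrors the local `backups/<YYYYMMDD>/<YYYYMMDDHHMMSS>` layout seen in the examples above, and `REMOTE_ROOT` is a hypothetical local directory standing in for remote storage. The timestamp is validated before anything is removed, since the value ends up in a recursive delete.

```shell
# REMOTE_ROOT is a hypothetical local directory standing in for remote storage.
REMOTE_ROOT="${REMOTE_ROOT:-/tmp/plugin_remote}"

# delete_backup CONFIG_PATH TIMESTAMP
delete_backup() {
  timestamp="$2"
  # Accept only a 14-digit timestamp (YYYYMMDDHHMMSS) before deleting anything.
  case "$timestamp" in
    ""|*[!0-9]*)
      echo "delete_backup: invalid timestamp: $timestamp" >&2
      return 1
      ;;
  esac
  if [ "${#timestamp}" -ne 14 ]; then
    echo "delete_backup: invalid timestamp: $timestamp" >&2
    return 1
  fi
  date_dir="${timestamp%??????}"     # first 8 digits: YYYYMMDD
  rm -rf "$REMOTE_ROOT/backups/$date_dir/$timestamp"
}
```
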
### [--version](#--version)

This command should display the version of the plugin itself (not the API version).

**Arguments:** None

**Stdout:**
[plugin_name] version [git_version]

_e.g.:_ gpbackup_s3_plugin version 1.1.0+dev.2.g16b18a1

**Example:**
```
test_plugin --version
```

## Plugin flow within gpbackup and gprestore
### Backup Plugin Flow
![Backup Plugin Flow](https://github.com/greenplum-db/gpbackup/wiki/backup_plugin_flow.png)

### Restore Plugin Flow
![Restore Plugin Flow](https://github.com/cloudberrydb/gpbackup/wiki/restore_plugin_flow.png)

## Custom yaml file
Parameters specific to a plugin can be specified through the plugin configuration yaml file. The _executablepath_ key is required and used by gpbackup and gprestore. Additional arguments should be specified under the _options_ keyword. A path to this file is passed as the first argument to every API command. Options and valid arguments should be documented by the plugin.

Example yaml file for s3:
```
executablepath: <full path to gpbackup_s3_plugin>
options:
  region: us-west-2
  aws_access_key_id: ...
  aws_secret_access_key: ...
  bucket: my_bucket_name
  folder: greenplum_backups
```

## Verification using the gpbackup plugin API test bench

We provide tests to ensure your plugin will work with gpbackup and gprestore. If the tests successfully run your plugin, you can be confident that your plugin will work with the utilities. The tests are located [here](https://github.com/cloudberrydb/gpbackup/blob/coordinator/plugins/plugin_test.sh).

Run the test bench script using:

```
plugin_test.sh [path_to_executable] [plugin_config] [optional_config_for_secondary_destination]
```

This will individually test each command and run a backup and restore using your plugin. This suite will upload small amounts of data to your destination system (<1MB total).

If the `[optional_config_for_secondary_destination]` is provided, the test bench will also restore from this secondary destination.

## [Release Notes](#Release_Notes)

### Version 0.4.0
 - [delete_backup](#delete_backup) command added

### Version 0.2.0 - 0.3.0
 - Added [scope](#scope) and [contentID](#contentID) arguments to setup and cleanup functions for more control over execution location.

### Version 0.1.0
 - Initial commands added.