# exodus-rsync

exodus-aware drop-in replacement for rsync.

[![Coverage Status](https://coveralls.io/repos/github/release-engineering/exodus-rsync/badge.svg?branch=main)](https://coveralls.io/github/release-engineering/exodus-rsync?branch=main)

<!-- TOC -->

- [Overview](#overview)
- [Installation](#installation)
- [Configuration](#configuration)
- [Usage](#usage)
    - [Differences from rsync](#differences-from-rsync)
    - [Publish modes](#publish-modes)
        - [Standalone publish](#standalone-publish)
        - [Joined publish](#joined-publish)
- [License](#license)

<!-- /TOC -->

## Overview

exodus-rsync is a command-line file transfer tool which is partially compatible with
[rsync](https://rsync.samba.org/).

Rather than transferring content via the rsync protocol, exodus-rsync uploads and
publishes content via [exodus-gw](https://github.com/release-engineering/exodus-gw).

See [exodus architecture](https://release-engineering.github.io/exodus-lambda/arch.html)
for more information on how exodus-rsync works together with the other projects in the
Exodus CDN family.


## Installation

exodus-rsync is a standalone linux-amd64 binary which may be downloaded from the
[repository releases](https://github.com/release-engineering/exodus-rsync/releases).

It is designed to be installed as the `rsync` command in `$PATH`, ahead of the real
`rsync` command. In typical scenarios this can be accomplished by installing to
`/usr/local/bin`, for example:

```
curl -LO https://github.com/release-engineering/exodus-rsync/releases/latest/download/exodus-rsync
chmod +x exodus-rsync
mv exodus-rsync /usr/local/bin/rsync
```

exodus-rsync needs a configuration file before it can do anything useful;
see the next section.


## Configuration

exodus-rsync uses a configuration file found at one of the following locations:

- `exodus-rsync.conf`
- `$HOME/.config/exodus-rsync.conf`
- `/etc/exodus-rsync.conf`
- the path given by the `--exodus-conf` command-line argument
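
The lookup can be pictured with the following minimal sketch. It is not the exodus-rsync implementation (which is written in Go): the function name `find_config` is hypothetical, and the precedence shown (an explicit `--exodus-conf` path first, then the remaining locations in the order listed) is an assumption, since the list above does not state an order.

```python
import os
from typing import List, Optional

# Default locations taken from the list above. Checking them in this
# order is an assumption made for this sketch.
DEFAULT_CANDIDATES = [
    "exodus-rsync.conf",
    "$HOME/.config/exodus-rsync.conf",
    "/etc/exodus-rsync.conf",
]


def find_config(explicit: Optional[str] = None,
                candidates: Optional[List[str]] = None) -> Optional[str]:
    """Return the first existing config path, or None if none exists."""
    # An explicit path (as from --exodus-conf) replaces the default search.
    paths = [explicit] if explicit else list(candidates or DEFAULT_CANDIDATES)
    for raw in paths:
        path = os.path.expandvars(raw)  # expands $HOME and similar
        if os.path.isfile(path):
            return path
    return None
```

A `None` result corresponds to the "configuration file is absent" case, in which every command is passed through to the real rsync.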

The configuration file is written in YAML. The available config keys
are documented in the example below:

```yaml
###############################################################################
# exodus-gw environment settings
###############################################################################
#
# X509 PEM-format certificate and key for authentication to exodus-gw.
# Environment variable substitution is supported.
gwcert: $HOME/certs/$USER.crt
gwkey: $HOME/certs/$USER.key

# Base URL of the exodus-gw service to be used.
# Environment variable substitution is supported.
gwurl: https://exodus-gw.example.com

# Defines the exodus-gw "environment" for use.
#
# This value must match one of the environments configured on that service, see:
# https://release-engineering.github.io/exodus-gw/deployment.html#settings
#
# Additionally, the `gwcert` in use must grant the necessary roles for writing to
# this environment, such as `live-blob-uploader`, `live-publisher`.
#
# A typical deployment of exodus-gw will use a `pre` environment for the
# publishing of internal/pre-release content and a `live` environment for
# the publishing of live content.
#
# Environment variable substitution is supported.
gwenv: live

# The commit mode for exodus-gw publish objects, one of the following:
#
# "" or "auto" (default):
#    Commit occurs only if exodus-rsync created the publish (i.e. is operating
#    in "standalone publish" mode). A phase2 (full) commit will be used.
#
#    The default value is appropriate for most scenarios.
#
# "none":
#    Commit never occurs, regardless of whether exodus-rsync created
#    the publish.
#
# "phase1", "phase2", <other>...:
#    A commit of this type occurs, regardless of whether exodus-rsync created
#    the publish. See exodus-gw documentation for details on the behavior
#    of each mode:
#    https://release-engineering.github.io/exodus-gw/api.html#operation/commit_publish__env__publish__publish_id__commit_post
#
# The `--exodus-commit=MODE` option overrides this value.
gwcommit: auto

###############################################################################
# Environment configuration
###############################################################################
#
# When exodus-rsync is run as 'rsync' it will inspect the target
# user@host component of the command-line.
#
# If this component matches one of the configured prefixes, usage of
# exodus-gw is enabled and the specified config is used. Otherwise,
# exodus-rsync will delegate commands to the real rsync.
environments:

  # Defining a prefix like this enables explicitly syncing to exodus CDN,
  # for example:
  #
  #   rsync /my/src/tree exodus:/my/dest
  #
  # "prefix" is the only mandatory key here.
- prefix: exodus

  # Defining a prefix like this enables overriding publishes to existing non-exodus
  # targets and diverting them instead to exodus CDN, for example:
  #
  #   rsync /my/src/tree upload@example1.com:/root/my/dest
  #   => publishes to "/my/dest" on exodus
  #
- prefix: upload@example1.com:/root

  # By default, the path used in "prefix" will be stripped from the exodus
  # destination, meaning that the root directory of the prefix is equivalent to
  # the root path on exodus. This can be customized via "strip", for example:
  #
  #   rsync /src upload@example2.com:/1234/content/foo/bar
  #   => publishes to "/content/foo/bar" on exodus when using the below
  #      value for strip
  #
- prefix: upload@example2.com:/1234/content
  strip: upload@example2.com:/1234

  # All top-level configuration keys can also be overridden per environment;
  # for example, to use a different exodus-gw service & environment:
  gwurl: https://other-exodus-gw.example.com/
  gwenv: pre

###############################################################################
# Rsync configuration
###############################################################################
#
# Defines mode of operation for invoking rsync:
#
# - If "exodus", exodus-rsync only publishes to exodus CDN and does not invoke
#   rsync
#
# - If "rsync", exodus-rsync does not publish to exodus CDN and only invokes rsync
#
# - If "mixed", exodus-rsync both publishes to exodus CDN and also invokes rsync,
#   only exiting successfully if both succeed. Beware of the implications on
#   atomicity (e.g. it is possible for one of these to succeed and the other fail).
#
# - Mode is always forced to "rsync" when no environment is matched.
#
rsyncmode: exodus

###############################################################################
# Logging
###############################################################################
#
# Sets the minimum log level for logs sent to the local system log
# (journald or syslog). One of:
#
# "none"   - no logging
# "debug"  - for debugging exodus-rsync, very verbose
# "trace"  - sets debug level for exodus-rsync and the AWS SDK
# "info"   - outputs messages mostly when writes occur; default, and recommended.
# "warn"   - outputs messages when possible issues are encountered
# "error"  - outputs messages when errors occur
#
# Note that this log level is set independently from the level of verbosity
# sent to stdout/stderr, which is only controlled by the "-v" argument.
#
loglevel: info

#
# Force usage of a specific logger backend.
#
# "journald"       - use journald. Recommended, fully supports structured logging.
# "syslog"         - use syslog. Structured logs are embedded as JSON.
# "auto" or absent - autodetect best logger
#
logger: auto

#
# Diagnostic mode.
#
# In diagnostic mode, exodus-rsync will perform various self-checks and dump
# detailed info on the execution environment at the beginning of each
# invocation.
#
# Diagnostic mode is intended for debugging only. It negatively impacts
# performance and should generally be disabled in production.
#
# The `--exodus-diag` command-line option can also enable diagnostic mode.
#
diag: false

###############################################################################
# Tuning
###############################################################################
#
# The following fields, all optional, may affect the performance of
# exodus-rsync.
#
# They are listed here along with their default values.

# The number of threads (goroutines) used to upload blobs to S3.
uploadthreads: 4

# When awaiting an exodus-gw publish task, how long (in milliseconds) should
# we wait between each poll of the task status.
gwpollinterval: 5000

# When adding items onto an exodus-gw publish, what is the maximum number of
# items we'll include in a single HTTP request.
gwbatchsize: 10000

# How many times to retry failing HTTP requests.
gwmaxattempts: 10

# Maximum duration (in milliseconds) between retries of HTTP requests.
gwmaxbackoff: 20000
```
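
The `gwbatchsize` setting in the example configuration caps how many items go into a single HTTP request. A minimal sketch of that kind of batching follows; the helper name `batches` is hypothetical and this is not exodus-rsync's own code.

```python
from typing import Iterator, List, Sequence


def batches(items: Sequence[str], size: int = 10000) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])
```

With 25,000 items and the default size, this yields three batches of 10,000, 10,000 and 5,000 items, which would correspond to three requests.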

To publish to exodus CDN, it is necessary to configure all of the
`gw*` configuration items and add at least one entry under `environments`.
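
To make the `prefix` and `strip` rules from the example configuration concrete, here is a minimal sketch of how a `DEST` argument could be mapped to a path on exodus. The names `Environment`, `match_environment` and `exodus_path` are hypothetical, matching is simplified to a plain string-prefix test in configuration order, and none of this is taken from the exodus-rsync source.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Environment:
    prefix: str
    strip: Optional[str] = None  # when unset, the prefix itself is stripped


def match_environment(dest: str, envs: List[Environment]) -> Optional[Environment]:
    """Return the first environment whose prefix starts `dest`, else None."""
    for env in envs:
        if dest.startswith(env.prefix):
            return env
    return None


def exodus_path(dest: str, env: Environment) -> str:
    """Remove the strip value (and a separating ':' if one remains) from `dest`."""
    strip = env.strip if env.strip is not None else env.prefix
    rest = dest[len(strip):] if dest.startswith(strip) else dest
    return rest[1:] if rest.startswith(":") else rest


# The three entries from the example configuration.
ENVS = [
    Environment(prefix="exodus"),
    Environment(prefix="upload@example1.com:/root"),
    Environment(prefix="upload@example2.com:/1234/content",
                strip="upload@example2.com:/1234"),
]
```

Under these assumptions, `exodus:/my/dest` and `upload@example1.com:/root/my/dest` both resolve to `/my/dest`. A destination that matches no prefix makes `match_environment` return `None`, which is the case where exodus-rsync delegates to the real rsync.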

If the configuration file is absent, exodus-rsync passes all commands through
to the real rsync without using exodus-gw.
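
The interaction between a missing configuration file, environment matching and `rsyncmode` can be summarized in a small decision function. This is a sketch of the rules as documented here, not code from exodus-rsync; `plan_actions` and its parameters are hypothetical names.

```python
from typing import Tuple


def plan_actions(config_found: bool, env_matched: bool,
                 rsyncmode: str) -> Tuple[bool, bool]:
    """Return (publish_to_exodus, invoke_real_rsync) for one invocation."""
    # No config file, or DEST matches no environment: real rsync only.
    if not config_found or not env_matched:
        return (False, True)
    if rsyncmode == "exodus":
        return (True, False)
    if rsyncmode == "rsync":
        return (False, True)
    if rsyncmode == "mixed":
        # Both run; the command succeeds only if both succeed.
        return (True, True)
    raise ValueError(f"unknown rsyncmode: {rsyncmode!r}")
```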


## Usage

exodus-rsync provides an interface partially compatible with this form of the rsync
command:

```
exodus-rsync [OPTION]... SRC DEST
```

For example, `exodus-rsync /my/srctree exodus:/my/dest` will publish the content of
the `/my/srctree` directory onto Exodus CDN, using `/my/dest` as the root path for
the content.

In cases where the `DEST` argument does not refer to one of the environments in
exodus-rsync.conf, exodus-rsync will delegate to the real rsync command, passing
through the `SRC`, `DEST` and rsync-compatible `OPTION`s without modification.


### Differences from rsync

exodus-rsync does not aim to cover all rsync use-cases and has many limitations
compared to rsync, as well as a few unique features not supported by rsync. Here
is a summary of the differences:

- exodus-rsync only supports the "single local SRC, remote DEST" form of the rsync command.
  rsync supports other variants, such as multiple SRC directories or copying from a remote SRC to a local DEST.

- exodus-rsync supports a few additional arguments not supported by rsync. All of these are
  prefixed with `--exodus-` to avoid any clashes.

  | Argument | Notes |
  | -------- | ----- |
  | --exodus-conf=PATH | use this configuration file |
  | --exodus-publish=ID | join content to an existing publish (see "Publish modes") |
  | --exodus-commit=MODE | commit mode for publish (see `gwcommit` in config file) |
  | --exodus-diag | diagnostic mode, outputs various info for troubleshooting |

- exodus-rsync supports only the following rsync arguments, most of which do not have any
  effect.

  | Argument | Notes |
  | -------- | ----- |
  | --verbose, -v | increase log verbosity |
  | --archive, -a | ignored |
  | --recursive, -r | ignored; exodus-rsync is always recursive |
  | --relative, -R | use relative path names |
  | --links, -l | copy symlinks as symlinks without following¹ |
  | --copy-links, -L | follow symlinks |
  | --keep-dirlinks, -K | ignored; there are no directories on exodus CDN |
  | --hard-links, -H | ignored |
  | --perms, -p | ignored |
  | --executability, -E | ignored |
  | --acls, -A | ignored |
  | --xattrs, -X | ignored |
  | --owner, -o | ignored |
  | --group, -g | ignored |
  | --devices | ignored |
  | --specials | ignored |
  | -D | ignored; same as --devices and --specials |
  | --times, -t | ignored |
  | --atimes, -U | ignored |
  | --crtimes, -N | ignored |
  | --omit-dir-times, -O | ignored; there are no directories on exodus CDN |
  | --dry-run, -n | dry-run mode, don't upload or publish anything |
  | --rsh, -e | ignored; ssh is not used |
  | --ignore-existing | ignored |
  | --delete | ignored; deleting content is not supported |
  | --prune-empty-dirs, -m | ignored; there are no directories on exodus CDN |
  | --timeout | ignored |
  | --filter | add a file-filtering RULE (supports "+/-" rules and "/" modifier) |
  | --exclude | exclude files matching PATTERN |
  | --include | don't exclude files matching PATTERN |
  | --files-from | read list of source-file names from FILE |
  | --compress, -z | ignored |
  | --stats | ignored |
  | --itemize-changes, -i | ignored |

1. `--links` has the following restrictions:
   * All links must resolve to an item included within the current publish at the
     time of commit.
     (Note that multiple exodus-rsync commands can participate in a single publish,
     see "Publish modes".)
   * Only a single level of link resolution is permitted. This restriction may be
     revisited in the future.
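
The two `--links` restrictions can be expressed as a check over the items in a publish. The sketch below assumes a hypothetical in-memory model in which `items` maps each published path to either `None` (regular content) or the path its symlink points to; neither the model nor `check_links` comes from exodus-rsync or exodus-gw.

```python
from typing import Dict, List, Optional


def check_links(items: Dict[str, Optional[str]]) -> List[str]:
    """Return a list of problems; an empty list means every link is acceptable."""
    problems = []
    for path, target in items.items():
        if target is None:
            continue  # regular content, nothing to resolve
        if target not in items:
            # Restriction 1: the target must be part of the same publish.
            problems.append(f"{path}: target {target} is not in this publish")
        elif items[target] is not None:
            # Restriction 2: only one level of link resolution is allowed.
            problems.append(f"{path}: target {target} is itself a link")
    return problems
```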

### Publish modes

exodus-rsync supports two different modes of publishing to exodus CDN.

#### Standalone publish

This is the default mode.

exodus-rsync will create a new "publish" object within exodus-gw, add content to it,
and commit it.

In this mode, each individual execution of exodus-rsync will have atomic semantics,
but a sequence of publishes will not be atomic. For example, if we run these commands
in sequence:

```
$ exodus-rsync src1 exodus:/dest1
$ exodus-rsync src2 exodus:/dest2
$ exodus-rsync src3 exodus:/dest3
```

... each of dest1, dest2 and dest3 will either be fully exposed on the CDN
or not exposed at all, but if the sequence is interrupted partway through, it is
possible that (for example) dest1 and dest2 are published but dest3 is not.

#### Joined publish

This mode is activated by calling exodus-rsync with the `--exodus-publish=<publish_id>`
argument.

The given publish ID must have been created in exodus-gw prior to calling exodus-rsync.
exodus-rsync will add content onto the publish, but will not commit it.

In this mode, it is possible to achieve atomic behavior covering a group of exodus-rsync
commands, for example:

```
# Create a publish.
$ curl [...] -X POST https://exodus-gw.example.com/prod/publish
{"id":"4e59c1a0", ...}

# Let several syncs join this publish.
$ exodus-rsync --exodus-publish 4e59c1a0 src1 exodus:/dest1
$ exodus-rsync --exodus-publish 4e59c1a0 src2 exodus:/dest2
$ exodus-rsync --exodus-publish 4e59c1a0 src3 exodus:/dest3

# Commit the publish
$ curl [...] -X POST https://exodus-gw.example.com/prod/publish/4e59c1a0/commit
{"id":"fa9c4b26", ...}

# (...and we should wait for task fa9c4b26 to complete as well)
```

In the above example, either *all* of dest1, dest2 and dest3 are fully exposed from the
CDN or *none* of them are exposed at all, even if we are interrupted in the middle of
publishing. None of the published content becomes visible from the CDN until
the "commit" operation occurs, which exposes all content at once.

More complex scenarios are possible when specifying a custom commit mode via
the `gwcommit` config file option or the `--exodus-commit` argument.
See [the exodus-gw documentation](https://release-engineering.github.io/exodus-gw/api.html#section/Atomicity)
for more information on the supported commit modes and the atomicity
guarantees when publishing with exodus-rsync and exodus-gw.
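
The commit rules described for `gwcommit` and `--exodus-commit` can be condensed into one function. This is an illustrative sketch based only on the configuration comments in this README; `commit_to_perform` and its parameters are hypothetical names, not part of exodus-rsync.

```python
from typing import Optional


def commit_to_perform(gwcommit: str, cli_commit: Optional[str],
                      created_publish: bool) -> Optional[str]:
    """Return the commit mode to request, or None when no commit should occur."""
    # --exodus-commit, when given, overrides the config file value.
    mode = cli_commit if cli_commit is not None else gwcommit
    if mode in ("", "auto"):
        # Commit only a publish that this invocation created (standalone mode).
        return "phase2" if created_publish else None
    if mode == "none":
        return None
    # Any other value ("phase1", "phase2", ...) is requested as-is.
    return mode
```

With the default `auto` mode, a standalone publish (`created_publish=True`) gets a `phase2` commit, while a joined publish (`created_publish=False`) is left uncommitted, matching the two publish modes described above.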

## License

This program is free software: you can redistribute it and/or modify it under the terms
of the GNU General Public License as published by the Free Software Foundation,
either version 3 of the License, or (at your option) any later version.