:title: Backing Up and Restoring Data
:description: Backing up stateful data on Deis.

.. _backing_up_data:

Backing Up and Restoring Data
=============================

While applications deployed on Deis follow the Twelve-Factor methodology and are thus stateless,
Deis maintains platform state in the :ref:`Store` component.

The store component runs `Ceph`_, and is used by the :ref:`Database`, :ref:`Registry`,
:ref:`Controller`, and :ref:`Logger` components as a data store. The database and registry use
store-gateway, while the controller and logger use store-volume. Being backed by the store component
enables these components to move freely around the cluster while their state remains safely in Ceph.

The store component is configured to continue operating in a degraded state, and will automatically
recover should a host fail and then rejoin the cluster. Total data loss of Ceph is only possible
if all of the store containers are removed. However, backing up Ceph is fairly straightforward, and
is recommended before :ref:`Upgrading Deis <upgrading-deis>`.

Data stored in Ceph is accessible in two places: on the CoreOS filesystem at ``/var/lib/deis/store``
and in the store-gateway component. Backing up this data is straightforward: we can simply tarball
the filesystem data, and use any S3-compatible blob store tool to download all files in the
store-gateway component.
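
As a quick sanity check before taking a backup, we can see the filesystem side of this data by
listing the store mount on any CoreOS host (replace ``<hostname>`` with one of your machines):

.. code-block:: console

    $ ssh core@<hostname> 'ls /var/lib/deis/store'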

Setup
-----

The ``deis-store-gateway`` component exposes an S3-compatible API, so we can use a tool like `s3cmd`_
to work with the object store. First, `download s3cmd`_ and install it (you'll need at least version
1.5.0 for Ceph support).
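
For example, on a machine with ``pip`` available, one way to install a suitable version and check
that it is new enough is:

.. code-block:: console

    $ pip install 's3cmd>=1.5.0'
    $ s3cmd --version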

We'll need the generated access key and secret key for use with the gateway. We can get these using
``deisctl``, either on one of the cluster machines or on a remote machine with ``DEISCTL_TUNNEL`` set:

.. code-block:: console

    $ deisctl config store get gateway/accessKey
    $ deisctl config store get gateway/secretKey

Back on the local machine, run ``s3cmd --configure`` and enter your access key and secret key.

When prompted with the ``Use HTTPS protocol`` option, answer ``No``. Other settings can be left at
the defaults. If the configure script prompts to test the credentials, skip that step - it will
try to authenticate against Amazon S3 and fail.

You'll need to change two configuration settings - edit ``~/.s3cfg`` and change
``host_base`` and ``host_bucket`` to match ``deis-store.<your domain>``. For example, for my local
Vagrant setup, I've changed the lines to:

.. code-block:: console

    host_base = deis-store.local3.deisapp.com
    host_bucket = deis-store.local3.deisapp.com

We can now use ``s3cmd`` to back up and restore data from the store-gateway.
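
To verify the configuration, we can list the buckets in the gateway (which buckets appear depends on
which components have already written data):

.. code-block:: console

    $ s3cmd ls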

.. note::

    Some users have reported that the data transferred in this process can overwhelm the gateway
    component, and that scaling up to multiple gateways with ``deisctl scale`` before both the backup
    and restore alleviates this issue (see the example below).
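
For example, to run three gateways instead of one, using ``deisctl scale``'s ``component=count``
syntax (scale back down after the transfer if desired):

.. code-block:: console

    $ deisctl scale store-gateway=3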

Backing up
----------

Database backups and registry data
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The store-gateway component stores database backups and is used to store data for the registry.
On our local machine, we can use ``s3cmd sync`` to copy the objects locally:

.. code-block:: console

    $ s3cmd sync s3://db_wal .
    $ s3cmd sync s3://registry .
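
After the sync completes, the downloaded objects are plain files in the current directory (the same
``basebackups_005``, ``wal_005``, and ``registry`` directories that the restore section below pushes
back up), so a quick size check confirms that data was actually downloaded:

.. code-block:: console

    $ du -sh basebackups_005 wal_005 registry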

Log data
~~~~~~~~

The store-volume service mounts a filesystem which is used by the controller and logger components
to store and retrieve application and component logs.

Since this is just a POSIX filesystem, you can simply tarball the contents of this directory
and rsync it to a local machine:

.. code-block:: console

    $ ssh core@<hostname> 'cd /var/lib/deis/store && sudo tar cpzf ~/store_file_backup.tar.gz .'
    tar: /var/lib/deis/store/logs/deis-registry.log: file changed as we read it
    $ rsync -avhe ssh core@<hostname>:~/store_file_backup.tar.gz .

Note that you'll need to specify the SSH port when using Vagrant:

.. code-block:: console

    $ rsync -avhe 'ssh -p 2222' core@127.0.0.1:~/store_file_backup.tar.gz .

Note the warning from ``tar``: in a running cluster the log files are constantly being written to,
so the backup captures a specific moment in time.

Database data
~~~~~~~~~~~~~

While backing up the Ceph data is sufficient (as the database ships backups and WAL logs to the store),
we can also back up the PostgreSQL data using ``pg_dumpall`` so we have a plain-text dump of the database.

We can identify the machine running the database with ``deisctl list``, and from that machine:

.. code-block:: console

    core@deis-1 ~ $ docker exec deis-database sudo -u postgres pg_dumpall > dump_all.sql
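
Because the shell redirection happens on the CoreOS host rather than inside the container,
``dump_all.sql`` is written to the ``core`` user's home directory. To keep it with the rest of the
backup, copy it to the local machine the same way as the store tarball (add ``-p 2222`` to the ssh
options when using Vagrant):

.. code-block:: console

    $ rsync -avhe ssh core@<hostname>:~/dump_all.sql .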

Restoring
---------

.. note::

    Restoring data is only necessary when deploying a new cluster. Most users will use the normal
    in-place upgrade workflow which does not require a restore.

We want to restore the data on a new cluster before the rest of the Deis components come up and
initialize. So, we will install the whole platform, but only start the store components:

.. code-block:: console

    $ deisctl install platform
    $ deisctl start store-monitor
    $ deisctl start store-daemon
    $ deisctl start store-metadata
    $ deisctl start store-gateway@1
    $ deisctl start store-volume
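
Before moving on, it's worth confirming with ``deisctl list`` that all of these store units have
started successfully:

.. code-block:: console

    $ deisctl list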

We'll also need to start a router so we can access the gateway:

.. code-block:: console

    $ deisctl start router@1

The default maximum body size on the router is too small to support large uploads to the gateway,
so we need to increase it:

.. code-block:: console

    $ deisctl config router set bodySize=100m
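
Assuming the ``deisctl config ... get`` form used above for the store keys also works for the router,
we can read the setting back to confirm it took effect:

.. code-block:: console

    $ deisctl config router get bodySize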

The new cluster will have generated a new access key and secret key, so we'll need to get those again:

.. code-block:: console

    $ deisctl config store get gateway/accessKey
    $ deisctl config store get gateway/secretKey

Edit ``~/.s3cfg`` and update the keys.
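
This can be done in a text editor, or non-interactively. A minimal sketch using GNU ``sed``, assuming
the default ``access_key`` and ``secret_key`` option names in ``~/.s3cfg`` (fill in the placeholder
values by hand):

.. code-block:: console

    $ sed -i 's/^access_key = .*/access_key = <new access key>/' ~/.s3cfg
    $ sed -i 's/^secret_key = .*/secret_key = <new secret key>/' ~/.s3cfg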

Now we can restore the data!

Database backups and registry data
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Because neither the database nor the registry has started, the buckets we need to restore to do not
yet exist. So, we'll need to create them:

.. code-block:: console

    $ s3cmd mb s3://db_wal
    $ s3cmd mb s3://registry

Now we can restore the data:

.. code-block:: console

    $ s3cmd sync basebackups_005 s3://db_wal
    $ s3cmd sync wal_005 s3://db_wal
    $ s3cmd sync registry s3://registry
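
Listing the buckets again should now show the uploaded objects, so the database and registry will find
their data when they start:

.. code-block:: console

    $ s3cmd ls s3://db_wal
    $ s3cmd ls s3://registry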

Log data
~~~~~~~~

Once we copy the tarball back to one of the CoreOS machines, we can extract it:

.. code-block:: console

    $ rsync -avhe ssh store_file_backup.tar.gz core@<hostname>:~/store_file_backup.tar.gz
    $ ssh core@<hostname> 'cd /var/lib/deis/store && sudo tar -xzpf ~/store_file_backup.tar.gz --same-owner'

Note that you'll need to specify the SSH port when using Vagrant:

.. code-block:: console

    $ rsync -avhe 'ssh -p 2222' store_file_backup.tar.gz core@127.0.0.1:~/store_file_backup.tar.gz

Finishing up
~~~~~~~~~~~~

Now that the data is restored, the rest of the cluster should come up normally with a ``deisctl start platform``.
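
That is, once the restore steps above are complete:

.. code-block:: console

    $ deisctl start platform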

The last task is to instruct the controller to re-write user keys, application data, and domains to etcd.
Log into the machine which runs deis-controller and run the following. Note that the IP address to
use in the ``export`` command should correspond to the IP of the host machine which runs this container.

.. code-block:: console

    $ nse deis-controller
    $ cd /app
    $ export ETCD=172.17.8.100:4001
    $ ./manage.py shell <<EOF
    from api.models import *
    [k.save() for k in Key.objects.all()]
    [a.save() for a in App.objects.all()]
    [d.save() for d in Domain.objects.all()]
    [c.save() for c in Certificate.objects.all()]
    [c.save() for c in Config.objects.all()]
    EOF
    $ exit

That's it! The cluster should be fully restored.

Tools
-----

Various community members have developed tools to assist in automating the backup and restore process outlined above.
Information on the tools can be found on the `Community Contributions`_ page.

.. _`Ceph`: http://ceph.com
.. _`download s3cmd`: http://s3tools.org/download
.. _`Community Contributions`: https://github.com/deis/deis/blob/master/contrib/README.md#backup-tools
.. _`s3cmd`: http://s3tools.org/