github.com/ssube/gitlab-ci-multi-runner@v1.2.1-0.20160607142738-b8d1285632e6/docs/configuration/autoscale.md

# Runners autoscale configuration

> The autoscale feature was introduced in GitLab Runner 1.1.0.

---

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
**Table of Contents**  *generated with [DocToc](https://github.com/thlorenz/doctoc)*

- [Overview](#overview)
- [System requirements](#system-requirements)
- [Runner configuration](#runner-configuration)
    - [Runner global options](#runner-global-options)
    - [`[[runners]]` options](#runners-options)
    - [`[runners.machine]` options](#runners-machine-options)
    - [`[runners.cache]` options](#runners-cache-options)
    - [Additional configuration information](#additional-configuration-information)
- [Autoscaling algorithm and parameters](#autoscaling-algorithm-and-parameters)
- [How `concurrent`, `limit` and `IdleCount` generate the upper limit of running machines](#how-concurrent-limit-and-idlecount-generate-the-upper-limit-of-running-machines)
- [Distributed runners caching](#distributed-runners-caching)
- [Distributed Docker registry mirroring](#distributed-docker-registry-mirroring)
- [A complete example of `config.toml`](#a-complete-example-of-config-toml)
- [What are the supported cloud providers](#what-are-the-supported-cloud-providers)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

## Overview

Autoscale provides the ability to utilize resources in a more elastic and
dynamic way.

When this feature is enabled and configured properly, builds are executed on
machines created _on demand_. After a build finishes, those machines can wait
to run the next builds, or they can be removed after the configured `IdleTime`.
With many cloud providers, this helps you get the most out of instances you
are already paying for.

Thanks to runners being able to autoscale, your infrastructure contains only as
many build instances as necessary at any time. If you configure the Runner to
only use autoscale, the system on which the Runner is installed acts as a
bastion for all the machines it creates.

Below, you can see a real life example of the runners autoscale feature, tested
on GitLab.com for the [GitLab Community Edition][ce] project:

![Real life example of autoscaling](img/autoscale-example.png)

Each machine on the chart is an independent cloud instance, running build jobs
inside of Docker containers.

[ce]: https://gitlab.com/gitlab-org/gitlab-ce

## System requirements

To use the autoscale feature, the system which will host the Runner must have:

- GitLab Runner executable - installation guide can be found in
  [GitLab Runner Documentation][runner-installation]
- Docker Machine executable - installation guide can be found in
  [Docker Machine documentation][docker-machine-installation]

If you need to use any virtualization/cloud providers that aren't handled by
Docker Machine's internal drivers, the appropriate driver plugin must be
installed. Installing and configuring Docker Machine driver plugins is
out of the scope of this documentation. For more details please read the
[Docker Machine documentation][docker-machine-docs].

## Runner configuration

In this section we describe only the parameters that are significant from the
autoscale feature's point of view. For more configuration details please read
the [GitLab Runner - Installation][runner-installation]
and [GitLab Runner - Advanced Configuration][runner-configuration].

### Runner global options

| Parameter    | Value   | Description |
|--------------|---------|-------------|
| `concurrent` | integer | Limits how many jobs can run concurrently, globally. This is the absolute upper limit on the number of jobs using _all_ defined runners, local and autoscale. Together with `limit` (from the [`[[runners]]` section](#runners-options)) and `IdleCount` (from the [`[runners.machine]` section](advanced-configuration.md#the-runnersmachine-section)), it affects the upper limit of created machines. |

### `[[runners]]` options

| Parameter  | Value            | Description |
|------------|------------------|-------------|
| `executor` | string           | To use the autoscale feature, `executor` must be set to `docker+machine` or `docker-ssh+machine`. |
| `limit`    | integer          | Limits how many jobs can be handled concurrently by this specific token. `0` means no limit. For autoscale, it is the upper limit of machines created by this provider (in conjunction with `concurrent` and `IdleCount`). |
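
As a rough illustration of how the options from the two tables above sit together in `config.toml`, consider the fragment below. The values are arbitrary placeholders chosen for this sketch, not recommendations, and the rest of the `[[runners]]` section is left out:

```toml
concurrent = 10                 # global cap on jobs across all runners defined in this file

[[runners]]
  executor = "docker+machine"   # or "docker-ssh+machine"; required for autoscale
  limit = 5                     # cap for this runner; 0 would mean no limit
  # (...)
```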

### `[runners.machine]` options

Configuration parameter details can be found
in [GitLab Runner - Advanced Configuration - The runners.machine section](advanced-configuration.md#the-runnersmachine-section).

### `[runners.cache]` options

Configuration parameter details can be found
in [GitLab Runner - Advanced Configuration - The runners.cache section](advanced-configuration.md#the-runnerscache-section).

### Additional configuration information

There is also a special mode, when you set `IdleCount = 0`. In this mode,
machines are **always** created **on-demand** before each build (if there is no
available machine in _Idle_ state). After the build is finished, the autoscaling
algorithm works
[the same as it is described below](#autoscaling-algorithm-and-parameters).
The machine waits for the next builds, and if none is executed before the
`IdleTime` period ends, the machine is removed. If there are no builds, there
are no machines in _Idle_ state.
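
A minimal sketch of this on-demand mode, assuming an otherwise complete `[[runners]]` section. The `IdleTime` value is only an example:

```toml
[[runners]]
  executor = "docker+machine"
  # (...)
  [runners.machine]
    IdleCount = 0     # keep no spare machines; create one only when a build needs it
    IdleTime = 1800   # a machine that finished a build waits up to 30 minutes for the next one
    # (...)
```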

## Autoscaling algorithm and parameters

The autoscaling algorithm is based on three main parameters: `IdleCount`,
`IdleTime` and `limit`.

We say that each machine that does not run a build is in _Idle_ state. When
GitLab Runner is in autoscale mode, it monitors all machines and ensures that
there is always an `IdleCount` of machines in _Idle_ state.

At the same time, GitLab Runner is checking the duration of the _Idle_ state of
each machine. If the time exceeds the `IdleTime` value, the machine is
automatically removed.

---

**Example:**
Let's suppose that we have configured GitLab Runner with the following
autoscale parameters:

```toml
[[runners]]
  limit = 10
  # (...)
  executor = "docker+machine"
  [runners.machine]
    IdleCount = 2
    IdleTime = 1800
    # (...)
```

At the beginning, when no builds are queued, GitLab Runner starts two machines
(`IdleCount = 2`), and sets them in _Idle_ state. If there are 30 minutes
(`IdleTime = 1800`) of inactivity (since the last project finished building), both
machines will be removed. At that moment we have **zero** machines in _Idle_
state, so GitLab Runner starts 2 new machines to satisfy `IdleCount`, which is
set to 2.

Now, let's assume that 5 builds are queued in GitLab CI. The first 2 builds are
sent to the _Idle_ machines. GitLab Runner notices that the number of _Idle_
machines is less than `IdleCount` (`0 < 2`), so it starts 2 new machines. Then,
the next 2 builds from the queue are sent to those newly created machines.
Again, the number of _Idle_ machines is less than `IdleCount`, so GitLab Runner
starts 2 new machines and the last queued build is sent to one of the _Idle_
machines.

We now have 1 _Idle_ machine, so GitLab Runner starts 1 more machine to
satisfy `IdleCount`. Because there are no new builds in the queue, those two
machines stay in _Idle_ state and GitLab Runner is satisfied.

---

**This is what happened:**
We had 2 machines, waiting in _Idle_ state for new builds. After the 5 builds
were queued, new machines were created, so in total we had 7 machines. Five of
them were running builds, and 2 were in _Idle_ state, waiting for the next
builds.

The algorithm keeps working in the same way: GitLab Runner creates a new
_Idle_ machine for each machine used for build execution until `IdleCount`
is satisfied. Machines are created up to the number defined by the
`limit` parameter. Once the total number of created machines reaches `limit`,
GitLab Runner stops autoscaling, and new builds need to wait in the build
queue until machines start returning to _Idle_ state.

In the above example we will always have two idle machines. The `IdleTime`
applies only when we are over the `IdleCount`; in that case we try to reduce
the number of machines to `IdleCount`.

---

**Scaling down:**
After the build is finished, the machine is set to _Idle_ state and waits
for the next builds to be executed. Let's suppose that we have no new builds in
the queue. After the time designated by `IdleTime` passes, the _Idle_ machines
will be removed. In our example, after 30 minutes, all machines will be removed
(each machine 30 minutes after its last build execution ended) and GitLab
Runner will start to keep an `IdleCount` of _Idle_ machines running, just like
at the beginning of the example.

---

So, to sum up:

1. We start the Runner
2. Runner creates 2 idle machines
3. Runner picks one build
4. Runner creates one more machine to fulfill the strong requirement of always
   having two idle machines
5. Build finishes, we have 3 idle machines
6. When one of the three idle machines has been idle for longer than
   `IdleTime`, counted from the last time it picked a build, it is removed
7. The Runner will always have at least 2 idle machines waiting to pick up
   builds quickly

Below you can see a comparison chart of build statuses and machine statuses
over time:

![Autoscale state chart](img/autoscale-state-chart.png)

## How `concurrent`, `limit` and `IdleCount` generate the upper limit of running machines

There is no magic equation that will tell you what to set `limit` or
`concurrent` to. Act according to your needs. Having `IdleCount` of _Idle_
machines is a speedup feature: you don't need to wait 10s/20s/30s for the
instance to be created. But as a user, you'd want all your machines (for which
you need to pay) to be running builds, not staying in _Idle_ state. So you
should set `concurrent` and `limit` to values that will run the maximum count
of machines you are willing to pay for. As for `IdleCount`, it should be set to
a value that will generate a minimum number of _not used_ machines when the
build queue is empty.

Let's assume the following example:

```toml
concurrent = 20

[[runners]]
  limit = 40
  [runners.machine]
    IdleCount = 10
```

In the above scenario the total number of machines we could have is 30. The
`limit` of total machines (building and idle) can be 40. We can have 10 idle
machines but the `concurrent` builds are 20. So in total we can have 20
concurrent machines running builds and 10 idle, summing up to 30.

But what happens if the `limit` is less than the total number of machines that
could be created? The example below explains that case:

```toml
concurrent = 20

[[runners]]
  limit = 25
  [runners.machine]
    IdleCount = 10
```

In this example we will have at most 20 concurrent builds, and at most 25
machines created. In the worst case scenario regarding idle machines, we will
not be able to have 10 idle machines, but only 5, because the `limit` is 25.
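
Reading the two examples together, the number of machines appears to be capped by whichever is smaller: `limit`, or `concurrent` plus `IdleCount`. This rule is inferred from the examples above rather than quoted from the Runner source, so treat the arithmetic in the comments below as a sketch. The values are hypothetical:

```toml
concurrent = 8       # at most 8 builds run at the same time

[[runners]]
  limit = 10         # at most 10 machines exist at the same time
  [runners.machine]
    IdleCount = 4    # 4 idle machines wanted

# Busy machines when the queue is full:  8                  (concurrent)
# Machines wanted in total:              8 + 4 = 12         (concurrent + IdleCount)
# Machines actually allowed:             min(10, 12) = 10   (limit wins)
# Idle machines left in that worst case: 10 - 8 = 2         (fewer than IdleCount)
```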

## Distributed runners caching

To speed up your builds, GitLab Runner provides a [cache mechanism][cache]
where selected directories and/or files are saved and shared between subsequent
builds.

This works fine when builds are run on the same host, but when you start
using the Runners autoscale feature, most of your builds will be running on a
new (or almost new) host, which will execute each build in a new Docker
container. In that case, you will not be able to take advantage of the cache
feature.

To overcome this issue, together with the autoscale feature, the distributed
Runners cache feature was introduced.

It uses any S3-compatible server to share the cache between the Docker hosts in
use. When restoring and archiving the cache, GitLab Runner will query the S3
server and will download or upload the archive.

To enable distributed caching, you have to define it in `config.toml` using the
[`[runners.cache]` directive][runners-cache]:

```toml
[[runners]]
  limit = 10
  executor = "docker+machine"
  [runners.cache]
    Type = "s3"
    ServerAddress = "s3.example.com"
    AccessKey = "access-key"
    SecretKey = "secret-key"
    BucketName = "runner"
    Insecure = false
```

Read how to [install your own caching server][caching].
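
If the cache server is a self-hosted S3-compatible service rather than a public endpoint, the same section might look like the sketch below. The address, port and credentials are placeholders invented for this example, and `Insecure = true` assumes the server is reachable over plain HTTP only:

```toml
[[runners]]
  # (...)
  [runners.cache]
    Type = "s3"
    ServerAddress = "10.11.12.14:9005"   # hypothetical host:port of the cache server
    AccessKey = "access-key"
    SecretKey = "secret-key"
    BucketName = "runner"
    Insecure = true                      # assumption: the server speaks HTTP, not HTTPS
```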

## Distributed Docker registry mirroring

To speed up builds executed inside of Docker containers, you can use the [Docker
registry mirroring service][registry]. This will provide a proxy between your
Docker machines and all used registries. Images will be downloaded once by the
registry mirror. On each new host, or on an existing host where the image is
not available, it will be downloaded from the configured registry mirror.

Provided that the mirror exists in your Docker machines' LAN, the image
downloading step should be much faster on each host.

To configure the Docker registry mirroring, you have to add `MachineOptions` to
the configuration in `config.toml`:

```toml
[[runners]]
  limit = 10
  executor = "docker+machine"
  [runners.machine]
    # (...)
    MachineOptions = [
      # (...)
      "engine-registry-mirror=http://10.11.12.13:12345"
    ]
```

Where `10.11.12.13:12345` is the IP address and port where your registry mirror
is listening for connections from the Docker service. It must be accessible for
each host created by Docker Machine.

Read how to [install your own Docker registry server][registry-server].

## A complete example of `config.toml`

The `config.toml` below uses the `digitalocean` Docker Machine driver:

```toml
concurrent = 50   # All registered Runners can run up to 50 concurrent builds

[[runners]]
  url = "https://gitlab.com/ci"
  token = "RUNNER_TOKEN"
  name = "autoscale-runner"
  executor = "docker+machine"       # This Runner is using the 'docker+machine' executor
  limit = 10                        # This Runner can execute up to 10 builds (created machines)
  [runners.docker]
    image = "ruby:2.1"              # The default image used for builds is 'ruby:2.1'
  [runners.machine]
    IdleCount = 5                   # There must be 5 machines in Idle state
    IdleTime = 600                  # Each machine can be in Idle state up to 600 seconds (after this it will be removed)
    MaxBuilds = 100                 # Each machine can handle up to 100 builds in a row (after this it will be removed)
    MachineName = "auto-scale-%s"   # Each machine will have a unique name ('%s' is required)
    MachineDriver = "digitalocean"  # Docker Machine is using the 'digitalocean' driver
    MachineOptions = [
        "digitalocean-image=coreos-beta",
        "digitalocean-ssh-user=core",
        "digitalocean-access-token=DO_ACCESS_TOKEN",
        "digitalocean-region=nyc2",
        "digitalocean-size=4gb",
        "digitalocean-private-networking",
        "engine-registry-mirror=http://10.11.12.13:12345"   # Docker Machine is using registry mirroring
    ]
  [runners.cache]
    Type = "s3"   # The Runner is using a distributed cache with Amazon S3 service
    ServerAddress = "s3-eu-west-1.amazonaws.com"
    AccessKey = "AMAZON_S3_ACCESS_KEY"
    SecretKey = "AMAZON_S3_SECRET_KEY"
    BucketName = "runners"
    Insecure = false
```

Note that the `MachineOptions` parameter contains options for the `digitalocean`
driver which is used by Docker Machine to spawn machines hosted on Digital Ocean,
and one option for Docker Machine itself (`engine-registry-mirror`).

## What are the supported cloud providers

The autoscale mechanism is currently based on Docker Machine. Advanced
configuration options, including virtualization/cloud provider parameters, are
available in the [Docker Machine documentation][docker-machine-driver].

[cache]: http://doc.gitlab.com/ce/ci/yaml/README.html#cache
[runner-installation]: https://gitlab.com/gitlab-org/gitlab-ci-multi-runner#installation
[runner-configuration]: https://gitlab.com/gitlab-org/gitlab-ci-multi-runner#advanced-configuration
[docker-machine-docs]: https://docs.docker.com/machine/
[docker-machine-driver]: https://docs.docker.com/machine/drivers/
[docker-machine-installation]: https://docs.docker.com/machine/install-machine/
[runners-cache]: advanced-configuration.md#the-runnerscache-section
[registry]: https://docs.docker.com/docker-trusted-registry/overview/
[caching]: ../install/autoscaling.md#install-the-cache-server
[registry-server]: ../install/autoscaling.md#install-docker-registry