Overview
========

Velodrome is the dashboard, monitoring, and metrics stack for Kubernetes
Developer Productivity. It is hosted at:

http://velodrome.k8s.io

It comprises three components:

1. [Grafana stack](grafana-stack/) is the front-end website where users can
   visualize the metrics, along with the back-end databases that store those
   metrics. It has:
   * an InfluxDB instance (a time-series database) to save precalculated metrics,
   * a Prometheus instance to save poll-based metrics (more monitoring-oriented),
   * a Grafana instance to display graphs based on these metrics,
   * and an nginx instance to proxy all of these services behind a single URL.

2. A SQL database containing a copy of the issues, events, and PRs in GitHub
   repositories. It is used for calculating statistics about developer
   productivity. It has the following components:
   * [Fetcher](fetcher/): fetches GitHub data and stores it in a SQL database
   * [SQL Proxy](mysql/): SQL Proxy deployment to Cloud SQL
   * [Transform](transform/): transforms the SQL data (GitHub database) into valuable metrics

3. Other monitoring tools, only one for the moment:
   * [token-counter](token-counter/): monitors rate-limit usage of your GitHub
     tokens

GitHub statistics
=================

Here is how the GitHub statistics components communicate with each other:

```
A <= B   B pulls from A
A -> B   A pushes to B
*        External component

GitHub* <= Fetcher -> Cloud SQL* <= Transform -> InfluxDB
```
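
As a rough illustration of that flow, the sketch below uses in-memory stand-ins: a Python list plays the role of GitHub, `sqlite3` plays the role of Cloud SQL, and the returned list plays the role of the points written to InfluxDB. The function names, the table schema, and the `open_issues` metric are all hypothetical and are not taken from the Fetcher or Transform code.

```python
import sqlite3

# Hypothetical stand-in for the GitHub API (external component).
GITHUB_ISSUES = [
    {"id": 1, "repo": "kubernetes/kubernetes", "state": "open"},
    {"id": 2, "repo": "kubernetes/kubernetes", "state": "closed"},
    {"id": 3, "repo": "kubernetes/test-infra", "state": "open"},
]


def fetch(db, issues):
    """Fetcher role: pull from the source and push rows into SQL."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS issues "
        "(id INTEGER PRIMARY KEY, repo TEXT, state TEXT)"
    )
    # INSERT OR REPLACE keeps repeated fetches idempotent.
    db.executemany(
        "INSERT OR REPLACE INTO issues (id, repo, state) "
        "VALUES (:id, :repo, :state)",
        issues,
    )
    db.commit()


def transform(db):
    """Transform role: pull from SQL and return metric points to push."""
    rows = db.execute(
        "SELECT repo, COUNT(*) FROM issues "
        "WHERE state = 'open' GROUP BY repo ORDER BY repo"
    )
    return [
        {"measurement": "open_issues", "repo": repo, "value": count}
        for repo, count in rows
    ]


db = sqlite3.connect(":memory:")  # stand-in for Cloud SQL
fetch(db, GITHUB_ISSUES)
influx_points = transform(db)     # stand-in for what would go to InfluxDB
print(influx_points)
```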

Other metrics/monitoring components
===================================

Monitoring components can be set up in two different ways:

1. Push data directly into InfluxDB. InfluxDB uses a SQL-like query syntax and
   only receives data that is written to it (there is no scraping). If you
   have events that you would like to push from time to time rather than
   reporting a current status, you should push to InfluxDB. Examples: build
   time, test time, etc.

2. Let Prometheus poll the data at a regular interval. Prometheus scrapes the
   data and measures the current state of something. This is much more useful
   for monitoring, as you can see the health of a service at a given time.

As an example, the token counter measures the usage of our GitHub tokens and
has a new value every hour. We can push the new value to InfluxDB.
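
To make the contrast concrete, here is a minimal sketch of what the payload looks like in each model: an InfluxDB line-protocol point that a job would push, and a Prometheus text-format sample that a service would expose for scraping. The measurement, metric, tag, and label names are hypothetical, the helpers skip the escaping rules that both formats define, and no network call is made.

```python
def influx_line(measurement, tags, fields, timestamp_ns):
    """Format one point in InfluxDB line protocol (no escaping handled)."""
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_part} {field_part} {timestamp_ns}"


def prometheus_sample(name, labels, value):
    """Format one gauge in the Prometheus text exposition format."""
    label_part = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return f"# TYPE {name} gauge\n{name}{{{label_part}}} {value}"


# Push model: an event with its own timestamp, sent when it happens.
# All names and values here are made up for illustration.
pushed = influx_line(
    "github_token_usage", {"token": "example"}, {"used": 1200},
    1500000000000000000,
)

# Poll model: the current state, served again on every scrape.
scraped = prometheus_sample("service_up", {"app": "fetcher"}, 1)

print(pushed)
print(scraped)
```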

Naming convention
=================

To disambiguate how each word is used, here is the naming convention used by
Velodrome:
- Organization: This has the same meaning as a GitHub organization, which
  holds multiple repositories. For example, in `github.com/istio/manager`, the
  organization is `istio`.
- Repository: This can be either the last part of the GitHub repository URL
  (in `github.com/istio/manager`, it would be `manager`) or the fully
  qualified repository name: `istio/manager`.
- Project: A project describes a completely hermetic instance of the website
  for a given team. A project can span multiple organizations and multiple
  repositories. For example, the kubernetes project is made of repositories in
  the `kubernetes` and `kubernetes-incubator` organizations.
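
The first two terms can be illustrated with a small helper that splits a repository URL into its parts. This is only a sketch: `split_repository` is a hypothetical name and is not part of Velodrome.

```python
def split_repository(url):
    """Split a GitHub repository URL into organization and repository names."""
    # Drop an optional scheme and trailing slash, keeping "host/org/repo".
    path = url.split("://", 1)[-1].strip("/")
    host, organization, repository = path.split("/")[:3]
    return {
        "organization": organization,
        "repository": repository,
        # Fully qualified repository name, e.g. "istio/manager".
        "full_name": f"{organization}/{repository}",
    }


names = split_repository("github.com/istio/manager")
print(names)
```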

Adding a new project
====================

Adding a new project is as simple as adding it to [config.yaml](config.yaml).
Typically, you add the name of your project and the list of its repositories.
Don't worry about the `public-ip` field, as the IP will be created later. You
can also leave out the Prometheus configuration if you don't need it initially.

A new project also requires some project-specific deployments, which are
described [below](#new-project-deployments).

Deployment
==========

Update/Create deployments
-------------------------

[config.py](config.py) generates all the deployment files for you. It reads
the configuration in [config.yaml](config.yaml) to generate deployments for each
project and/or repository, with proper labels. You can then use `kubectl`
label selectors to choose exactly what you want to act on, for example:

```
./config.py # Generates the configuration and prints it on stdout
./config.py | kubectl apply -f - # Creates/Updates everything
./config.py | kubectl delete -f - # Deletes everything
./config.py | kubectl apply -f - -l project=kubernetes # Only creates/updates kubernetes
./config.py | kubectl apply -f - -l app=fetcher # Only creates/updates fetcher
```
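
For readers unfamiliar with label selectors, the sketch below mimics what an equality selector such as `-l project=kubernetes` does: it keeps only the objects whose `metadata.labels` contain every requested key/value pair. The manifests and their names are made up for illustration; the real objects come from `config.py`, and `kubectl` supports richer selector syntax than this.

```python
def select(manifests, selector):
    """Keep manifests whose labels match every key=value pair in selector."""
    wanted = dict(pair.split("=", 1) for pair in selector.split(","))
    return [
        m for m in manifests
        if all(
            m["metadata"].get("labels", {}).get(k) == v
            for k, v in wanted.items()
        )
    ]


# Hypothetical objects, shaped like the metadata of generated deployments.
manifests = [
    {"metadata": {"name": "fetcher-a",
                  "labels": {"project": "kubernetes", "app": "fetcher"}}},
    {"metadata": {"name": "grafana-a",
                  "labels": {"project": "kubernetes", "app": "grafana"}}},
    {"metadata": {"name": "fetcher-b",
                  "labels": {"project": "istio", "app": "fetcher"}}},
]

print([m["metadata"]["name"] for m in select(manifests, "project=kubernetes")])
print([m["metadata"]["name"] for m in select(manifests, "app=fetcher")])
```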

First time deployments
----------------------

- Make sure you create
  [the secrets for SQL Proxy](mysql/#set-up-google-cloud-sql-proxy)
- Make sure your GitHub tokens are also in a secret:

```
kubectl create secret generic github-tokens --from-file=${TOKEN_FILE_1} --from-file=${TOKEN_FILE_2}
```

New project deployments
-----------------------

- Create the [secret for InfluxDB](grafana-stack/#first-time-only)
- Deploy everything: `./config.py | kubectl apply -f - -l project=${NEW_PROJECT_NAME}`
- Once the Kubernetes service has its public IP, connect to the Grafana
  instance, add the default dashboard, star it, and set it as the default
  dashboard in the org preferences.
- Set the static IP in the GCP project, and update `config.yaml` with its
  value. Optionally create a domain name pointing to it.