
# Quick introduction to using a locally deployed microk8s Kubernetes configuration

This document gives the briefest of overviews for standing up a single CPU runner cluster, with optional encryption support, on a microk8s installation.

The microk8s use case this document describes targets an audience similar to that of the [docker oriented deployment](../docker/README.md), with the exception that it uses Kubernetes rather than docker itself.

The motivation behind this example is to provide, from a deployment perspective, something on a single machine or laptop that is as close as possible to a real production deployment.  The docker use case fits situations needing a functionally equivalent deployment that can be done on Mac or Windows as a minimal alternative to a Linux deployment.

<!--ts-->

Table of Contents
=================

* [Quick introduction to using a locally deployed microk8s Kubernetes configuration](#quick-introduction-to-using-a-locally-deployed-microk8s-kubernetes-configuration)
* [Table of Contents](#table-of-contents)
* [Prerequisites](#prerequisites)
  * [Docker and Microk8s](#docker-and-microk8s)
  * [Deployment Tooling](#deployment-tooling)
  * [Kubernetes CLI](#kubernetes-cli)
  * [Minio CLI](#minio-cli)
  * [Steps](#steps)
    * [Deployment](#deployment)
    * [Running jobs](#running-jobs)
    * [Retrieving results](#retrieving-results)
    * [Cleanup](#cleanup)
<!--te-->

# Prerequisites

Before using the following instructions experimenters will need to have the [Docker Desktop 2.3+ service installed](https://www.docker.com/products/docker-desktop).

This option requires at least 8GB of memory for minimal setups.

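On Linux the memory requirement can be checked before starting; the following sketch reads /proc/meminfo and so applies to Linux hosts only:

```shell
# Verify the host meets the 8GB minimum by reading /proc/meminfo (Linux only).
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
total_gb=$((total_kb / 1024 / 1024))
echo "Total memory: ${total_gb}GB"
if [ "$total_gb" -ge 8 ]; then
    echo "Memory requirement met"
else
    echo "Warning: less than 8GB detected"
fi
```
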
## Docker and Microk8s

You will need to install docker and microk8s using Ubuntu snap.  When installing docker, only the snap distribution is compatible with the microk8s deployment.

```console
sudo snap install docker --classic
sudo snap install microk8s --classic
```
When using microk8s during development builds, the setup simply involves configuring the services that you need to run under microk8s to support a docker registry, and enabling any GPU resources you have present to aid in testing.

```console
export LOGXI='*=DBG'
export LOGXI_FORMAT='happy,maxcol=1024'

export SNAP=/snap
export PATH=$SNAP/bin:$PATH

export KUBE_CONFIG=~/.kube/microk8s.config
export KUBECONFIG=~/.kube/microk8s.config

microk8s.stop
microk8s.start
microk8s.config > $KUBECONFIG
microk8s.enable registry:size=30Gi storage dns gpu
```

Now we need to perform some customization.  The first step is to locate a usable IP address for the host and then define an environment variable to reference the registry.

```console
export RegistryIP=`microk8s.kubectl --namespace container-registry get pod --selector=app=registry -o jsonpath="{.items[*].status.hostIP}"`
export RegistryPort=32000
echo $RegistryIP
172.31.39.52
```

Now that we have an IP address for our unsecured microk8s registry, we need to add it to the containerd configuration file used by microk8s, marking this specific endpoint as permitted for use with HTTP rather than HTTPS, as follows:

```console
sudo vim /var/snap/microk8s/current/args/containerd-template.toml
```

Add the last two lines in the following example to the file, substituting in the IP address we selected:

```console
    [plugins.cri.registry]
      [plugins.cri.registry.mirrors]
        [plugins.cri.registry.mirrors."docker.io"]
          endpoint = ["https://registry-1.docker.io"]
        [plugins.cri.registry.mirrors."local.insecure-registry.io"]
          endpoint = ["http://localhost:32000"]
        [plugins.cri.registry.mirrors."172.31.39.52:32000"]
          endpoint = ["http://172.31.39.52:32000"]
```

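If you prefer to script the edit, the two mirror lines can be appended using the $RegistryIP variable set earlier.  The following is only a sketch run against a scratch file; on a real host, apply it to containerd-template.toml with sudo:

```shell
# Sketch: append the mirror stanza for the local registry, demonstrated on a
# scratch file rather than the real containerd-template.toml (IP is an example).
RegistryIP=172.31.39.52
cat >> containerd-demo.toml <<EOF
        [plugins.cri.registry.mirrors."${RegistryIP}:32000"]
          endpoint = ["http://${RegistryIP}:32000"]
EOF
cat containerd-demo.toml
```
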
```console
sudo vim /var/snap/docker/current/config/daemon.json
```

Add the insecure-registries line in the following example to the file, substituting in the IP address we obtained from $RegistryIP:

```console
{
    "log-level":        "error",
    "storage-driver":   "overlay2",
    "insecure-registries" : ["172.31.39.52:32000"]
}
```

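This change can also be scripted.  The following sketch uses python3 to merge the key into a stand-in copy of the file; on a real host, point it at /var/snap/docker/current/config/daemon.json and run it with sudo:

```shell
# Sketch: merge the insecure-registries key into a daemon.json copy using python3.
# The demo file below stands in for /var/snap/docker/current/config/daemon.json.
cat > daemon-demo.json <<'EOF'
{
    "log-level":        "error",
    "storage-driver":   "overlay2"
}
EOF
RegistryIP=172.31.39.52
python3 - "$RegistryIP" <<'EOF'
import json, sys

path = "daemon-demo.json"
with open(path) as f:
    cfg = json.load(f)
# Add (or overwrite) the insecure registry entry using the supplied IP address.
cfg["insecure-registries"] = [sys.argv[1] + ":32000"]
with open(path, "w") as f:
    json.dump(cfg, f, indent=4)
EOF
cat daemon-demo.json
```
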
The services then need restarting.  Note that the image registry will be cleared of any existing images in this step:

```console
microk8s.disable registry
microk8s.stop
sudo snap disable docker
sudo snap enable docker
microk8s.start
microk8s.enable registry:size=30Gi
```

You now have a registry for which, with your requirements in mind, you can tag your own studio-go-runner images and push them to your local cluster using commands such as the following:

```
docker tag leafai/studio-go-runner:0.9.27 $RegistryIP:32000/leafai/studio-go-runner:0.9.27
docker push $RegistryIP:32000/leafai/studio-go-runner:0.9.27
```

## Deployment Tooling

Install a template processor based on the Go language templating used by Kubernetes.

```
wget -O stencil https://github.com/karlmutch/duat/releases/download/0.13.0/stencil-linux-amd64
chmod +x stencil
```

## Kubernetes CLI

kubectl can be installed using instructions found at:

- kubectl https://kubernetes.io/docs/tasks/tools/install-kubectl/

## Minio CLI

Minio offers a client, [mc](https://docs.min.io/docs/minio-client-quickstart-guide.html), for the file server inside the docker cluster.

The quickstart guide details installation for Linux using a wget download as follows:

```
wget https://dl.min.io/client/mc/release/linux-amd64/mc
chmod +x mc
```

## Steps

These steps summarize the actions needed to both initialize and access your locally deployed Kubernetes cluster.

### Deployment

Deployment uses the standard kubectl apply to instantiate all of the resources needed for a fully functioning cluster.  The stencil command is used to fill in details such as the name of the docker image to be used along with its registry host, and optional parameters such as a namespace dedicated to the deployed cluster.  Using a namespace is useful as it allows the go runner cluster to be isolated from other workloads.

The default cluster name, if one is not supplied, is local-go-runner.

```
stencil -input deployment.yaml -values Image=$RegistryIP:32000/leafai/studio-go-runner:0.9.27 | kubectl apply -f -
```

After deployment there will be three pods inside the namespace and you will also have two services defined, for example:

```
$ kubectl --namespace local-go-runner get pods
NAME                                             READY   STATUS    RESTARTS   AGE
minio-deployment-7954bdbdc9-7w55b                1/1     Running   0          25m
rabbitmq-controller-6mkq6                        1/1     Running   0          25m
studioml-go-runner-deployment-5bddbccc94-54tq9   1/1     Running   0          25m
```

In order to view the logs of the various components the following commands might prove useful:

```
kubectl logs --namespace local-go-runner -f --selector=app=studioml-go-runner
...
kubectl logs --namespace local-go-runner -f --selector=app=minio
...
kubectl logs --namespace local-go-runner -f --selector=component=rabbitmq
...
```

### Running jobs

In order to access the minio and rabbitMQ servers, the host names being used must match between the experiment host where experiments are launched and the host names inside the compute cluster.  To do this, add the following line to the /etc/hosts file of your local experiment host, typically using 'sudo vim /etc/hosts'.

```
127.0.0.1 minio-service.local-go-runner.svc.cluster.local rabbitmq-service.local-go-runner.svc.cluster.local
```

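The edit can also be scripted so that the line is only appended once.  The sketch below works on a scratch copy of the hosts file; on a real host, target /etc/hosts and use sudo tee -a for the append:

```shell
# Sketch: append the service aliases only if they are not already present,
# demonstrated on a scratch copy rather than the real /etc/hosts.
cp /etc/hosts hosts-demo
line="127.0.0.1 minio-service.local-go-runner.svc.cluster.local rabbitmq-service.local-go-runner.svc.cluster.local"
grep -qF "minio-service.local-go-runner" hosts-demo || echo "$line" >> hosts-demo
tail -1 hosts-demo
```
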
Before running a studioml job the configuration file should be populated as follows:

```
#export rmq_queue_port=`kubectl get svc --namespace local-go-runner rabbitmq-service -o=jsonpath='{.spec.ports[?(@.port==5672)].nodePort}'`
#export rmq_admin_port=`kubectl get svc --namespace local-go-runner rabbitmq-service -o=jsonpath='{.spec.ports[?(@.port==15672)].nodePort}'`
mkdir -p ~/.studioml
#stencil -input studioml.config -values RMQAdminPort=$rmq_admin_port,RMQPort=$rmq_queue_port,MinioPort=$minio_port > ~/.studioml/local_config.yaml

stencil -input studioml.config > ~/.studioml/local_config.yaml
kubectl port-forward --namespace local-go-runner replicationcontroller/rabbitmq-controller 5672:5672 &
kubectl port-forward --namespace local-go-runner deployment/minio-deployment 9000:9000 &

export minio_port=`kubectl get svc --namespace local-go-runner minio-service -o template --template="{{ range.spec.ports }}{{if .nodePort}}{{.nodePort}}{{end}}{{end}}"`
mc config host add local-cluster http://minio-service.local-go-runner.svc.cluster.local:9000 UserUser PasswordPassword
```

This example uses pyenv to create a python environment.  A pip-based virtualenv can also be used.

Now a virtual environment can be created, studioml installed, and a simple example run.

```
eval "$(pyenv init -)"
eval "$(pyenv virtualenv-init -)"

pyenv install 3.6.10
pyenv virtualenv 3.6.10 local-studioml
pyenv activate local-studioml
python3 -m pip install --upgrade pip setuptools
python3 -m pip install wheel
pip install tensorflow==1.15.2 --only-binary tensorflow,tensorboard,tensorflow-estimator,h5py
pip install rsa==4.0
pip install studioml
```

```
pip install keras
studio run --lifetime=30m --max-duration=20m --gpus 0 --queue=rmq_kmutch --force-git --config=/home/kmutch/.studioml/local_config.yaml app.py
```

### Retrieving results

When experiments are submitted using studioml, an experiment ID is displayed, typically on the second to last line, with the ID as the last item on the line, in this case 1591820141_664134e2-9d76-4c82-93cb-ea9ec09d790b.  This ID can be used to examine the S3 storage platform for output from the experiment, as shown in the following example:

```
2020-06-10 13:15:42 DEBUG  studio-runner - received ack for delivery tag: 1
2020-06-10 13:15:42 INFO   studio-runner - published 1 messages, 0 have yet to be confirmed, 1 were acked and 0 were nacked
2020-06-10 13:15:43 INFO   studio-runner - sent message acknowledged to amqp://UserUser:PasswordPassword@rabbitmq-service.local-go-runner.svc.cluster.local:5672/%2f?connection_attempts=30&retry_delay=.5&socket_timeout=5 after waiting 1 seconds
2020-06-10 13:15:43 INFO   studio-runner - studio run: submitted experiment 1591820141_664134e2-9d76-4c82-93cb-ea9ec09d790b
2020-06-10 13:15:43 INFO   studio-runner - Added 1 experiment(s) in 2 seconds to queue rmq_kmutch
$ mc ls local-cluster/storage/experiments/1591820141_664134e2-9d76-4c82-93cb-ea9ec09d790b
Handling connection for 9000
[2020-06-10 13:18:07 PDT]  9.3MiB modeldir.tar
[2020-06-10 13:18:08 PDT]   92KiB output.tar
[2020-06-10 13:18:08 PDT]  131KiB tb.tar
```

If you wish to stream the experiment log you can use the following, in this case to see if the runner has completed the job:

```
$ mc cat local-cluster/storage/experiments/1591820141_664134e2-9d76-4c82-93cb-ea9ec09d790b/output.tar | tar -x --to-stdout -f - | tail
+ command pyenv virtualenv-delete -f studioml-811d22f98b3ef7f8.0
+ pyenv virtualenv-delete -f studioml-811d22f98b3ef7f8.0
date
+ date
Wed Jun 10 20:18:07 UTC 2020
date -u
+ date -u
Wed Jun 10 20:18:07 UTC 2020
exit $result
+ exit 0
$ 
```

### Cleanup

```
kubectl delete namespace local-go-runner
```

Copyright © 2019-2020 Cognizant Digital Business, Evolutionary AI. All rights reserved. Issued under the Apache 2.0 license.