github.com/solo-io/cue@v0.4.7/doc/tutorial/kubernetes/README.md (about)

     1  # Kubernetes tutorial
     2  
     3  In this tutorial we show how to convert the Kubernetes configuration files
     4  of a collection of microservices to CUE.
     5  
     6  The configuration files are scrubbed and renamed versions of
     7  real-life configuration files.
     8  The files are organized in a directory hierarchy grouping related services
     9  in subdirectories.
    10  This is a common pattern.
    11  The `cue` tooling has been optimized for this use case.
    12  
    13  In this tutorial we will address the following topics:
    14  
    15  1. convert the given YAML files to CUE
    16  1. hoist common patterns to parent directories
    17  1. use the tooling to rewrite CUE files to drop unnecessary fields
    18  1. repeat from step 2 for different subdirectories
    19  1. define commands to operate on the configuration
    20  1. extract CUE templates directly from Kubernetes Go source
    21  1. manually tailor the configuration
    22  1. map a Kubernetes configuration to `docker-compose` (TODO)
    23  
    24  
    25  ## The given data set
    26  
    27  The data set is based on a real-life case, using different names for the
    28  services.
    29  All the inconsistencies of the real setup are replicated in the files
    30  to get a realistic impression of how a conversion to CUE would behave
    31  in practice.
    32  
    33  The given YAML files are organized in the following directory hierarchy
    34  (you can use `find` if you don't have `tree`):
    35  
    36  ```
    37  $ tree ./original | head
    38  .
    39  └── services
    40      ├── frontend
    41      │   ├── bartender
    42      │   │   └── kube.yaml
    43      │   ├── breaddispatcher
    44      │   │   └── kube.yaml
    45      │   ├── host
    46      │   │   └── kube.yaml
    47      │   ├── maitred
    48  ...
    49  ```
    50  
    51  Each subdirectory contains related microservices that often share similar
    52  characteristics and configurations.
    53  The configurations include a large variety of Kubernetes objects, including
    54  services, deployments, config maps,
    55  a daemon set, a stateful set, and a cron job.
    56  
    57  The result of the first tutorial is in the `quick` (for "quick and dirty")
    58  directory.
    59  A manually optimized configuration can be found in the `manual`
    60  directory.
    61  
    62  
    63  ## Importing existing configuration
    64  
    65  We first make a copy of the data directory.
    66  
    67  ```
    68  $ cp -a original tmp
    69  $ cd tmp
    70  ```
    71  
    72  We initialize a module so that we can treat all our configuration files
    73  in the subdirectories as part of one package.
    74  We do that later by giving all files the same package name.
    75  
    76  ```
    77  $ cue mod init
    78  ```
    79  
    80  We initialize a Go module so that later we can resolve the
    81  `k8s.io/api/apps/v1` Go package dependency:
    82  
    83  ```
    84  $ go mod init example.com
    85  ```
    86  
    87  Creating a module also allows our packages to import external packages.
    88  
    89  Let's try to use the `cue import` command to convert the given YAML files
    90  into CUE.
    91  
    92  ```
    93  $ cd services
    94  $ cue import ./...
    95  must specify package name with the -p flag
    96  ```
    97  
    98  Since we have multiple packages and files, we need to specify the package to
    99  which they should belong.
   100  
   101  ```
   102  $ cue import ./... -p kube
   103  path, list, or files flag needed to handle multiple objects in file "./frontend/bartender/kube.yaml"
   104  ```
   105  
   106  Many of the files contain more than one Kubernetes object.
   107  Moreover, we are creating a single configuration that contains all objects
   108  of all files.
   109  We need to organize all Kubernetes objects such that each is individually
   110  identifiable within a single configuration.
   111  We do so by defining a different struct for each type, putting each object
   112  in its respective struct, keyed by its name.
   113  This allows objects of different types to share the same name,
   114  just as is allowed by Kubernetes.
   115  To accomplish this, we tell `cue` to put each object in the configuration
   116  tree at the path with the "kind" as first element and "name" as second.
   117  
   118  ```
   119  $ cue import ./... -p kube -l 'strings.ToCamel(kind)' -l metadata.name -f
   120  ```
   121  
   122  The added `-l` flag defines the labels for each object, based on values from
   123  each object, using the usual CUE syntax for field labels.
   124  In this case, we use a camelcase variant of the `kind` field of each object and
   125  use the `name` field of the `metadata` section as the name for each object.
   126  We also added the `-f` flag to overwrite the few files that succeeded before.
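For example (a hypothetical object, not taken from one of the actual files), a YAML document with `kind: Service` and `metadata.name: bartender` would, under these flags, end up at the path `service: bartender:` in the resulting CUE:

```
// sketch of what `cue import` produces for such an object
service: bartender: {
    apiVersion: "v1"
    kind:       "Service"
    metadata: name: "bartender"
}
```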
   127  
   128  Let's see what happened:
   129  
   130  ```
   131  $ tree . | head
   132  .
   133  └── services
   134      ├── frontend
   135      │   ├── bartender
   136      │   │   ├── kube.cue
   137      │   │   └── kube.yaml
   138      │   ├── breaddispatcher
   139      │   │   ├── kube.cue
   140      │   │   └── kube.yaml
   141  ...
   142  ```
   143  
   144  Each YAML file is converted to a corresponding CUE file.
   145  Comments of the YAML files are preserved.
   146  
   147  The result is not fully pleasing, though.
   148  Take a look at `mon/prometheus/configmap.cue`.
   149  
   150  ```
   151  $ cat mon/prometheus/configmap.cue
   152  package kube
   153  
   154  apiVersion: "v1"
   155  kind:       "ConfigMap"
   156  metadata: name: "prometheus"
   157  data: {
   158      "alert.rules": """
   159          groups:
   160          - name: rules.yaml
   161  ...
   162  ```
   163  
   164  The configuration file still contains YAML embedded in a string value of one
   165  of the fields.
   166  The original YAML file might have looked like it was all structured data, but
   167  the majority of it was a string containing, hopefully, valid YAML.
   168  
   169  The `-R` option attempts to detect structured YAML or JSON strings embedded
   170  in the configuration files and then converts these recursively.
   171  
   172  <!-- TODO: update import label format -->
   173  
   174  ```
   175  $ cue import ./... -p kube -l 'strings.ToCamel(kind)' -l metadata.name -f -R
   176  ```
   177  
   178  Now the file looks like:
   179  
   180  ```
   181  $ cat mon/prometheus/configmap.cue
   182  package kube
   183  
   184  import "encoding/yaml"
   185  
   186  configMap: prometheus: {
   187      apiVersion: "v1"
   188      kind:       "ConfigMap"
   189      metadata: name: "prometheus"
   190      data: {
   191          "alert.rules": yaml.Marshal(_cue_alert_rules)
   192          _cue_alert_rules: {
   193              groups: [{
   194  ...
   195  ```
   196  
   197  That looks better!
   198  The resulting configuration file replaces the original embedded string
   199  with a call to `yaml.Marshal`, which converts structured CUE source to
   200  a string containing the equivalent YAML.
   201  Fields starting with an underscore (`_`) are not included when emitting
   202  a configuration file (they are when enclosed in double quotes).
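A minimal sketch of this distinction, using the same pattern as the generated file (hypothetical field names):

```
import "encoding/yaml"

_rules: groups: [{name: "rules.yaml"}]     // hidden: omitted from output
data: "alert.rules": yaml.Marshal(_rules)  // hidden fields remain referenceable
```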
   203  
   204  ```
   205  $ cue eval ./mon/prometheus -e configMap.prometheus
   206  apiVersion: "v1"
   207  kind: "ConfigMap"
   208  metadata: {
   209      name: "prometheus"
   210  }
   211  data: {
   212      "alert.rules": """
   213      groups:
   214      - name: rules.yaml
   215  ...
   216  ```
   217  
   218  Yay!
   219  
   220  
   221  ## Quick 'n Dirty Conversion
   222  
   223  In this tutorial we show how to quickly eliminate boilerplate from a set
   224  of configurations.
   225  Manual tailoring will usually give better results, but takes considerably
   226  more thought, while taking the quick and dirty approach gets you mostly there.
   227  The result of such a quick conversion also forms a good basis for
   228  a more thoughtful manual optimization.
   229  
   230  ### Create top-level template
   231  
   232  Now that we have imported the YAML files, we can start the simplification process.
   233  
   234  Before we start the restructuring, let's save a full evaluation so that we
   235  can verify that simplifications yield the same results.
   236  
   237  ```
   238  $ cue eval -c ./... > snapshot
   239  ```
   240  
   241  The `-c` option tells `cue` that only concrete values, that is, valid JSON,
   242  are allowed.
   243  We focus on the objects defined in the various `kube.cue` files.
   244  A quick inspection reveals that many of the Deployments and Services share
   245  common structure.
   246  
   247  We copy one of the files containing both to the root of the directory tree
   248  as a basis for creating our template.
   249  
   250  ```
   251  $ cp frontend/breaddispatcher/kube.cue .
   252  ```
   253  
   254  Modify this file as shown below.
   255  
   256  ```
   257  $ cat <<EOF > kube.cue
   258  package kube
   259  
   260  service: [ID=_]: {
   261      apiVersion: "v1"
   262      kind:       "Service"
   263      metadata: {
   264          name: ID
   265          labels: {
   266              app:       ID    // by convention
   267              domain:    "prod"  // always the same in the given files
   268              component: string  // varies per directory
   269          }
   270      }
   271      spec: {
   272          // Any port has the following properties.
   273          ports: [...{
   274              port:       int
   275              protocol:   *"TCP" | "UDP"      // from the Kubernetes definition
   276              name:       string | *"client"
   277          }]
   278          selector: metadata.labels // we want those to be the same
   279      }
   280  }
   281  
   282  deployment: [ID=_]: {
   283      apiVersion: "apps/v1"
   284      kind:       "Deployment"
   285      metadata: name: ID
   286      spec: {
   287          // 1 is the default, but we allow any number
   288          replicas: *1 | int
   289          template: {
   290              metadata: labels: {
   291                  app:       ID
   292                  domain:    "prod"
   293                  component: string
   294              }
   295              // we always have one namesake container
   296              spec: containers: [{ name: ID }]
   297          }
   298      }
   299  }
   300  EOF
   301  ```
   302  
   303  By replacing the service and deployment name with `[ID=_]` we have changed the
   304  definition into a template matching any field.
   305  CUE binds the field name to `ID` as a result.
   306  During importing we used `metadata.name` as a key for the object names,
   307  so we can now set this field to `ID`.
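To illustrate, with such a template in scope, a minimal (hypothetical) entry picks up its name automatically:

```
service: [ID=_]: metadata: name: ID

// unifies with the template above; metadata.name becomes "bartender"
service: bartender: {}
```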
   308  
   309  Templates are applied to (are unified with) all entries in the struct in which
   310  they are defined,
   311  so we need to either strip fields specific to the `breaddispatcher` definition,
   312  generalize them, or remove them.
   313  
   314  One of the labels defined in the Kubernetes metadata seems to always be set
   315  to the parent directory name.
   316  We enforce this by defining `component: string`, meaning that a field
   317  named `component` must be set to some string value; we then define this
   318  value later on.
   319  Any underspecified field results in an error when converting to, for instance,
   320  JSON.
   321  So a deployment or service will only be valid if this label is defined.
   322  
   323  <!-- TODO: once cycles in disjunctions are implemented
   324      port:       targetPort | int   // by default the same as targetPort
   325      targetPort: port | int         // by default the same as port
   326  
   327  Note that ports definition for service contains a cycle.
   328  Specifying one of the ports will break the cycle.
   329  The meaning of cycles are well-defined in CUE.
   330  In practice this means that a template writer does not have to make any
   331  assumptions about which of the fields that can be mutually derived from each
   332  other a user of the template will want to specify.
   333  -->
   334  
   335  Let's compare the result of merging our new template to our original snapshot.
   336  
   337  ```
   338  $ cue eval ./... -c > snapshot2
   339  --- ./mon/alertmanager
   340  service.alertmanager.metadata.labels.component: incomplete value (string):
   341      ./kube.cue:11:24
   342  service.alertmanager.spec.selector.component: incomplete value (string):
   343      ./kube.cue:11:24
   344  deployment.alertmanager.spec.template.metadata.labels.component: incomplete value (string):
   345      ./kube.cue:36:28
   346  service."node-exporter".metadata.labels.component: incomplete value (string):
   347      ./kube.cue:11:24
   348  ...
   349  ```
   350  
   351  <!-- TODO: better error messages -->
   352  
   353  Oops.
   354  The alert manager does not specify the `component` label.
   355  This demonstrates how constraints can be used to catch inconsistencies
   356  in your configurations.
   357  
   358  As there are very few objects that do not specify this label, we will modify
   359  the configurations to include it everywhere.
   360  We do this by setting a newly defined top-level field in each directory
   361  to the directory name and modify our master template file to use it.
   362  
   363  <!--
   364  ```
   365  $ cue add */kube.cue -p kube --list <<EOF
   366  #Component: "{{.DisplayPath}}"
   367  EOF
   368  ```
   369  -->
   370  
   371  ```
   372  # set the component label to our new top-level field
   373  $ sed -i.bak 's/component:.*string/component: #Component/' kube.cue && rm kube.cue.bak
   374  
   375  # add the new top-level field to our previous template definitions
   376  $ cat <<EOF >> kube.cue
   377  
   378  #Component: string
   379  EOF
   380  
   381  # add a file with the component label to each directory
   382  $ ls -d */ | sed 's/.$//' | xargs -I DIR sh -c 'cd DIR; echo "package kube
   383  
   384  #Component: \"DIR\"
   385  " > kube.cue; cd ..'
   386  
   387  # format the files
   388  $ cue fmt kube.cue */kube.cue
   389  ```
   390  
   391  Let's try again to see if it is fixed:
   392  
   393  ```
   394  $ cue eval -c ./... > snapshot2
   395  $ diff snapshot snapshot2
   396  ...
   397  ```
   398  
   399  Except for having more consistent labels and some reordering, nothing changed.
   400  We are happy and save the result as the new baseline.
   401  
   402  ```
   403  $ cp snapshot2 snapshot
   404  ```
   405  
   406  The corresponding boilerplate can now be removed with `cue trim`.
   407  
   408  ```
   409  $ find . | grep kube.cue | xargs wc | tail -1
   410      1792    3616   34815 total
   411  $ cue trim ./...
   412  $ find . | grep kube.cue | xargs wc | tail -1
   413      1223    2374   22903 total
   414  ```
   415  
   416  `cue trim` removes configuration from files that is already generated
   417  by templates or comprehensions.
   418  In doing so it removed over 500 lines of configuration, or over 30%!
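As a hypothetical sketch of what `trim` does to an individual file: fields whose values are already implied by a template are deleted, while fields that differ are kept.

```
// before trim (hypothetical instance file)
deployment: bartender: {
    apiVersion: "apps/v1"    // implied by the template: removable
    kind:       "Deployment" // implied by the template: removable
    spec: replicas: 3        // overrides the template default: kept
}

// after trim
deployment: bartender: spec: replicas: 3
```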
   419  
   420  The following is proof that nothing changed semantically:
   421  
   422  ```
   423  $ cue eval -c ./... > snapshot2
   424  $ diff snapshot snapshot2 | wc
   425         0       0       0
   426  ```
   427  
   428  We can do better, though.
   429  A first thing to note is that DaemonSets and StatefulSets share a similar
   430  structure to Deployments.
   431  We generalize the top-level template as follows:
   432  
   433  ```
   434  $ cat <<EOF >> kube.cue
   435  
   436  daemonSet: [ID=_]: _spec & {
   437      apiVersion: "apps/v1"
   438      kind:       "DaemonSet"
   439      _name:      ID
   440  }
   441  
   442  statefulSet: [ID=_]: _spec & {
   443      apiVersion: "apps/v1"
   444      kind:       "StatefulSet"
   445      _name:      ID
   446  }
   447  
   448  deployment: [ID=_]: _spec & {
   449      apiVersion: "apps/v1"
   450      kind:       "Deployment"
   451      _name:      ID
   452      spec: replicas: *1 | int
   453  }
   454  
   455  configMap: [ID=_]: {
   456      metadata: name: ID
   457      metadata: labels: component: #Component
   458  }
   459  
   460  _spec: {
   461      _name: string
   462  
   463      metadata: name: _name
   464      metadata: labels: component: #Component
   465      spec: selector: {}
   466      spec: template: {
   467          metadata: labels: {
   468              app:       _name
   469              component: #Component
   470              domain:    "prod"
   471          }
   472          spec: containers: [{name: _name}]
   473      }
   474  }
   475  EOF
   476  $ cue fmt
   477  ```
   478  
   479  The common configuration has been factored out into `_spec`.
   480  We introduced `_name` to aid both specifying and referring
   481  to the name of an object.
   482  For completeness, we added `configMap` as a top-level entry.
   483  
   484  Note that we have not yet removed the old definition of deployment.
   485  This is fine.
   486  As it is equivalent to the new one, unifying them will have no effect.
   487  We leave its removal as an exercise to the reader.
   488  
   489  Next we observe that all deployments, stateful sets and daemon sets have
   490  an accompanying service which shares many of the same fields.
   491  We add:
   492  
   493  ```
   494  $ cat <<EOF >> kube.cue
   495  
   496  // Define the _export option and set the default to true
   497  // for all ports defined in all containers.
   498  _spec: spec: template: spec: containers: [...{
   499      ports: [...{
   500          _export: *true | false // include the port in the service
   501      }]
   502  }]
   503  
   504  for x in [deployment, daemonSet, statefulSet] for k, v in x {
   505      service: "\(k)": {
   506          spec: selector: v.spec.template.metadata.labels
   507  
   508          spec: ports: [
   509              for c in v.spec.template.spec.containers
   510              for p in c.ports
   511              if p._export {
   512                  let Port = p.containerPort // Port is an alias
   513                  port:       *Port | int
   514                  targetPort: *Port | int
   515              }
   516          ]
   517      }
   518  }
   519  EOF
   520  $ cue fmt
   521  ```
   522  
   523  This example introduces a few new concepts.
   524  Open-ended lists are indicated with an ellipsis (`...`).
   525  The value following an ellipsis is unified with any subsequent elements and
   526  defines the "type", or template, for additional list elements.
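For instance (hypothetical values):

```
ports: [...{port: int}] // every element must be a struct with an int port
ports: [{port: 8080}]   // unifies with the pattern above
```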
   527  
   528  The `Port` declaration is an alias.
   529  Aliases are only visible in their lexical scope and are not part of the model.
   530  They can be used to make shadowed fields visible within nested scopes or,
   531  in this case, to reduce boilerplate without introducing new fields.
   532  
   533  Finally, this example introduces list and field comprehensions.
   534  List comprehensions are analogous to list comprehensions found in other
   535  languages.
   536  Field comprehensions allow inserting fields in structs.
   537  In this case, the field comprehension adds a namesake service for any
   538  deployment, daemonSet, and statefulSet.
   539  Field comprehensions can also be used to add a field conditionally.
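A minimal sketch of a conditional field comprehension (hypothetical names):

```
_debug: true

// the field is only added when the condition holds
if _debug {
    deployment: debugger: spec: replicas: 1
}
```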
   540  
   541  
   542  Specifying the `targetPort` is not necessary, but since many files define it,
   543  defining it here will allow those definitions to be removed
   544  using `cue trim`.
   545  We add an option `_export` for ports defined in containers to specify whether
   546  to include them in the service and explicitly set this to false
   547  for the respective ports in `infra/events`, `infra/tasks`, and `infra/watcher`.
   548  
   549  For the purpose of this tutorial, here are some quick patches:
   550  ```
   551  $ cat <<EOF >> infra/events/kube.cue
   552  
   553  deployment: events: spec: template: spec: containers: [{ ports: [{_export: false}, _] }]
   554  EOF
   555  
   556  $ cat <<EOF >> infra/tasks/kube.cue
   557  
   558  deployment: tasks: spec: template: spec: containers: [{ ports: [{_export: false}, _] }]
   559  EOF
   560  
   561  $ cat <<EOF >> infra/watcher/kube.cue
   562  
   563  deployment: watcher: spec: template: spec: containers: [{ ports: [{_export: false}, _] }]
   564  EOF
   565  ```
   566  In practice, it would be better form to add this field to the original
   567  port declaration.
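That is, instead of appending a patch, one would set `_export` directly in, say, `infra/events/kube.cue` where the port is declared (hypothetical sketch; the actual port value stays as it is in the file):

```
deployment: events: spec: template: spec: containers: [{
    ports: [{
        containerPort: 7788  // hypothetical value
        _export:       false // declared next to the port itself
    }, _]
}]
```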
   568  
   569  We verify that all changes are acceptable and store another snapshot.
   570  Then we run trim to further reduce our configuration:
   571  
   572  ```
   573  $ cue trim ./...
   574  $ find . | grep kube.cue | xargs wc | tail -1
   575      1129    2270   22073 total
   576  ```
   577  This is after removing the rewritten and now redundant deployment definition.
   578  
   579  We shaved off almost another 100 lines, even after adding the template.
   580  You can verify that the service definitions are now gone in most of the files.
   581  What remains is either some additional configuration, or inconsistencies that
   582  should probably be cleaned up.
   583  
   584  But we have another trick up our sleeve.
   585  With the `-s` or `--simplify` option we can tell `trim` or `fmt` to collapse
   586  structs with a single element onto a single line. For instance:
   587  
   588  ```
   589  $ head frontend/breaddispatcher/kube.cue
   590  package kube
   591  
   592  deployment: breaddispatcher: {
   593      spec: {
   594          template: {
   595              metadata: {
   596                  annotations: {
   597                      "prometheus.io.scrape": "true"
   598                      "prometheus.io.port":   "7080"
   599                  }
   600  $ cue trim ./... -s
   601  $ head -7 frontend/breaddispatcher/kube.cue
   602  package kube
   603  
   604  deployment: breaddispatcher: spec: template: {
   605      metadata: annotations: {
   606          "prometheus.io.scrape": "true"
   607          "prometheus.io.port":   "7080"
   608      }
   609  $ find . | grep kube.cue | xargs wc | tail -1
   610       975    2116   20264 total
   611  ```
   612  
   613  Another 150 lines lost!
   614  Collapsing lines like this can improve the readability of a configuration
   615  by removing considerable amounts of punctuation.
   616  
   617  
   618  ### Repeat for several subdirectories
   619  
   620  In the previous section we defined templates for services and deployments
   621  in the root of our directory structure to capture the common traits of all
   622  services and deployments.
   623  In addition, we defined a directory-specific label.
   624  In this section we will look into generalizing the objects per directory.
   625  
   626  
   627  #### Directory `frontend`
   628  
   629  We observe that all deployments in subdirectories of `frontend`
   630  have a single container with one port,
   631  which is usually `7080`, but sometimes `8080`.
   632  Also, most have two prometheus-related annotations, while some have one.
   633  We leave the inconsistencies in ports, but add both annotations
   634  unconditionally.
   635  
   636  ```
   637  $ cat <<EOF >> frontend/kube.cue
   638  
   639  deployment: [string]: spec: template: {
   640      metadata: annotations: {
   641          "prometheus.io.scrape": "true"
   642          "prometheus.io.port":   "\(spec.containers[0].ports[0].containerPort)"
   643      }
   644      spec: containers: [{
   645          ports: [{containerPort: *7080 | int}] // 7080 is the default
   646      }]
   647  }
   648  EOF
   649  $ cue fmt ./frontend
   650  
   651  # check differences
   652  $ cue eval -c ./... > snapshot2
   653  $ diff snapshot snapshot2
   654  368a369
   655  >                             prometheus.io.port:   "7080"
   656  577a579
   657  >                             prometheus.io.port:   "8080"
   658  $ cp snapshot2 snapshot
   659  ```
   660  
   661  Two lines with annotations added, improving consistency.
   662  
   663  ```
   664  $ cue trim -s ./frontend/...
   665  $ find . | grep kube.cue | xargs wc | tail -1
   666       931    2052   19624 total
   667  ```
   668  
   669  Another 40 lines removed.
   670  We may have gotten used to larger reductions, but at this point there is just
   671  not much left to remove: in some of the frontend files there are only 4 lines
   672  of configuration left.
   673  
   674  
   675  #### Directory `kitchen`
   676  
   677  In this directory we observe that all deployments have, without exception,
   678  one container with port `8080`, all have the same liveness probe,
   679  a single line of prometheus annotation, and most have
   680  two or three disks with similar patterns.
   681  
   682  Let's add everything but the disks for now:
   683  
   684  ```
   685  $ cat <<EOF >> kitchen/kube.cue
   686  
   687  deployment: [string]: spec: template: {
   688      metadata: annotations: "prometheus.io.scrape": "true"
   689      spec: containers: [{
   690          ports: [{
   691              containerPort: 8080
   692          }]
   693          livenessProbe: {
   694              httpGet: {
   695                  path: "/debug/health"
   696                  port: 8080
   697              }
   698              initialDelaySeconds: 40
   699              periodSeconds:       3
   700          }
   701      }]
   702  }
   703  EOF
   704  $ cue fmt ./kitchen
   705  ```
   706  
   707  A diff reveals that one prometheus annotation was added to a service.
   708  We assume this to be an accidental omission and accept the differences.
   709  
   710  Disks need to be defined both in the template spec section and in
   711  the container where they are used.
   712  We prefer to keep these two definitions together.
   713  We take the volumes definition from `expiditer` (the first config in that
   714  directory with two disks), and generalize it:
   715  
   716  ```
   717  $ cat <<EOF >> kitchen/kube.cue
   718  
   719  deployment: [ID=_]: spec: template: spec: {
   720      _hasDisks: *true | bool
   721  
   722      // field comprehension using just "if"
   723      if _hasDisks {
   724          volumes: [{
   725              name: *"\(ID)-disk" | string
   726              gcePersistentDisk: pdName: *"\(ID)-disk" | string
   727              gcePersistentDisk: fsType: "ext4"
   728          }, {
   729              name: *"secret-\(ID)" | string
   730              secret: secretName: *"\(ID)-secrets" | string
   731          }, ...]
   732  
   733          containers: [{
   734              volumeMounts: [{
   735                  name:      *"\(ID)-disk" | string
   736                  mountPath: *"/logs" | string
   737              }, {
   738                  mountPath: *"/etc/certs" | string
   739                  name:      *"secret-\(ID)" | string
   740                  readOnly:  true
   741              }, ...]
   742          }]
   743      }
   744  }
   745  EOF
   746  
   747  $ cat <<EOF >> kitchen/souschef/kube.cue
   748  
   749  deployment: souschef: spec: template: spec: {
   750      _hasDisks: false
   751  }
   752  
   753  EOF
   754  $ cue fmt ./kitchen/...
   755  ```
   756  
   757  This template definition is not ideal: the definitions are positional, so if
   758  configurations were to define the disks in a different order, there would be
   759  no reuse, or there could even be conflicts.
   760  Also note that in order to deal with this restriction, almost all field values
   761  are just default values and can be overridden by instances.
   762  A better way would be to define a map of volumes,
   763  similarly to how we organized the top-level Kubernetes objects,
   764  and then generate these two sections from this map.
   765  This requires some design, though, and does not belong in a
   766  "quick-and-dirty" tutorial.
   767  Later in this document we introduce a manually optimized configuration.
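Such a map-based approach could look roughly like this (hypothetical sketch, not part of the tutorial's configuration; mount paths and other details omitted):

```
deployment: [ID=_]: spec: template: spec: {
    // declare each volume once, keyed by name (hypothetical field)
    _volume: [Name=string]: {...}

    // generate both sections from the map
    volumes: [ for n, v in _volume { v & {name: n} } ]
    containers: [{
        volumeMounts: [ for n, _ in _volume { name: n } ]
    }]
}
```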
   768  
   769  We add the two disks by default and define a `_hasDisks` option to opt out.
   770  The `souschef` configuration is the only one that defines no disks.
   771  
   772  ```
   773  $ cue trim -s ./kitchen/...
   774  
   775  # check differences
   776  $ cue eval ./... > snapshot2
   777  $ diff snapshot snapshot2
   778  ...
   779  $ cp snapshot2 snapshot
   780  $ find . | grep kube.cue | xargs wc | tail -1
   781       807    1862   17190 total
   782  ```
   783  
   784  The diff shows that we added the `_hasDisks` option, but otherwise reveals no
   785  differences.
   786  We also reduced the configuration by a sizeable amount once more.
   787  
   788  However, on closer inspection of the remaining files we see a lot of remaining
   789  fields in the disk specifications as a result of inconsistent naming.
   790  Reducing configurations like we did in this exercise exposes inconsistencies.
   791  The inconsistencies can be removed by simply deleting the overrides in the
   792  specific configuration.
   793  Leaving them as is gives a clear signal that a configuration is inconsistent.
   794  
   795  
   796  ### Conclusion of Quick 'n Dirty tutorial
   797  
   798  There is still some gain to be made with the other directories.
   799  At a nearly 1000-line, or 55%, reduction, we leave the rest as an exercise to
   800  the reader.
   801  
   802  We have shown how CUE can be used to reduce boilerplate, enforce consistencies,
   803  and detect inconsistencies.
   804  Being able to enforce consistency and detect inconsistencies is a consequence
   805  of the constraint-based model and harder to do with inheritance-based languages.
   806  
   807  We have indirectly also shown how CUE is well-suited for machine manipulation.
   808  This is a consequence of its syntax and of the order independence that follows
   809  from its semantics.
   810  The `trim` command is one of many possible automated refactor tools made
   811  possible by this property.
   812  This, too, would be harder to do with inheritance-based configuration languages.
   813  
   814  
   815  ## Define commands
   816  
   817  The `cue export` command can be used to convert the created configuration back
   818  to JSON.
   819  In our case, this requires a top-level "emit value"
   820  to convert our mapped Kubernetes objects back to a list.
   821  Typically, this output is piped to tools like `kubectl` or `etcdctl`.
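For instance, once such an emit value is in place, applying the configuration might look like this (illustrative; the exact invocation depends on the setup):

```
$ cue export ./... | kubectl apply -f -
```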
   822  
   823  In practice this means typing the same commands ad nauseam.
   824  The next step is often to write wrapper tools.
   825  But as there is often no one-size-fits-all solution, this leads to the
   826  proliferation of marginally useful tools.
   827  The `cue` tool provides an alternative by allowing the declaration of
   828  frequently used commands in CUE itself.
   829  Advantages:
   830  
   831  - added domain knowledge that CUE may use for improved analysis,
   832  - only one language to learn,
   833  - easy discovery of commands,
   834  - no further configuration required,
   835  - enforce uniform CLI standards across commands,
   836  - standardized commands across an organization.
   837  
   838  Commands are defined in files ending with `_tool.cue` in the same package as
   839  where the configuration files are defined on which the commands should operate.
   840  Top-level values in the configuration are visible to the tool files
   841  as long as they are not shadowed by top-level fields in the tool files.
   842  Top-level fields in the tool files are not visible in the configuration files
   843  and are not part of any model.
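A minimal sketch of this visibility rule (hypothetical file and field name):

```
// some_tool.cue (hypothetical)
package kube

// `deployment` from the configuration is visible here;
// `names` itself never becomes part of the configuration model.
names: [ for k, _ in deployment { k } ]
```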
   844  
   845  The tool definitions also have access to additional builtin packages.
   846  A CUE configuration is fully hermetic, disallowing any outside influence.
   847  This property enables automated analysis and manipulation
   848  such as the `trim` command.
   849  The tool definitions, however, have access to such things as command line flags
   850  and environment variables, random generators, file listings, and so on.
   851  
   852  We define the following tools for our example:
   853  
   854  - ls: list the Kubernetes objects defined in our configuration
   855  - dump: dump all selected objects as a YAML stream
   856  - create: send all selected objects to `kubectl` for creation
   857  
   858  ### Preparations
   859  
   860  To work with Kubernetes we need to convert our map of Kubernetes objects
   861  back to a simple list.
   862  We create the tool file to do just that.
   863  
   864  ```
   865  $ cat <<EOF > kube_tool.cue
   866  package kube
   867  
   868  objects: [ for v in objectSets for x in v { x } ]
   869  
   870  objectSets: [
   871  	service,
   872  	deployment,
   873  	statefulSet,
   874  	daemonSet,
   875  	configMap,
   876  ]
   877  EOF
   878  ```
   879  
   880  ### Listing objects
   881  
   882  Commands are defined in the `command` section at the top-level of a tool file.
   883  A `cue` command defines command line flags, environment variables, as well as
   884  a set of tasks.
Example tasks are loading or writing a file, dumping something to the console,
downloading a web page, or executing a command.
   887  
We start by defining the `ls` command, which lists all our objects:
   889  
   890  ```
   891  $ cat <<EOF > ls_tool.cue
   892  package kube
   893  
   894  import (
   895  	"text/tabwriter"
   896  	"tool/cli"
   897  	"tool/file"
   898  )
   899  
   900  command: ls: {
   901  	task: print: cli.Print & {
   902  		text: tabwriter.Write([
   903  			for x in objects {
   904  				"\(x.kind)  \t\(x.metadata.labels.component)  \t\(x.metadata.name)"
   905  			}
   906  		])
   907  	}
   908  
   909  	task: write: file.Create & {
   910  		filename: "foo.txt"
   911  		contents: task.print.text
   912  	}
   913  }
   914  EOF
   915  ```
   916  <!-- TODO: use "let" once implemented-->
   917  
NOTE: THE API OF THE TASK DEFINITIONS WILL CHANGE.
We may, however, keep supporting this form if needed.
   920  
   921  The command is now available in the `cue` tool:
   922  
   923  ```
   924  $ cue cmd ls ./frontend/maitred
   925  Service         frontend        maitred
   926  Deployment      frontend        maitred
   927  ```
   928  
As long as the name does not conflict with an existing command, it can be
used as a top-level command as well:
   931  ```
   932  $ cue ls ./frontend/maitred
   933  ...
   934  ```
   935  
If more than one instance is selected, the `cue` tool may either operate
   937  on them one by one or merge them.
   938  The default is to merge them.
   939  Different instances of a package are typically not compatible:
   940  different subdirectories may have different specializations.
   941  A merge pre-expands templates of each instance and then merges their root
   942  values.
   943  The result may contain conflicts, such as our top-level `#Component` field,
   944  but our per-type maps of Kubernetes objects should be free of conflict
(if there are any, we have a problem with Kubernetes down the line).
   946  A merge thus gives us a unified view of all objects.
   947  
   948  ```
   949  $ cue ls ./...
   950  Service       infra      tasks
   951  Service       frontend   bartender
   952  Service       frontend   breaddispatcher
   953  Service       frontend   host
   954  Service       frontend   maitred
   955  Service       frontend   valeter
   956  Service       frontend   waiter
   957  Service       frontend   waterdispatcher
   958  Service       infra      download
   959  Service       infra      etcd
   960  Service       infra      events
   961  
   962  ...
   963  
   964  Deployment    proxy           nginx
   965  StatefulSet   infra           etcd
   966  DaemonSet     mon             node-exporter
   967  ConfigMap     mon        alertmanager
   968  ConfigMap     mon        prometheus
   969  ConfigMap     proxy      authproxy
   970  ConfigMap     proxy      nginx
   971  ```
   972  
   973  ### Dumping a YAML Stream
   974  
   975  The following adds a command to dump the selected objects as a YAML stream.
   976  
   977  <!--
   978  TODO: add command line flags to filter object types.
   979  -->
   980  ```
   981  $ cat <<EOF > dump_tool.cue
   982  package kube
   983  
   984  import (
   985  	"encoding/yaml"
   986  	"tool/cli"
   987  )
   988  
   989  command: dump: {
   990  	task: print: cli.Print & {
   991  		text: yaml.MarshalStream(objects)
   992  	}
   993  }
   994  EOF
   995  ```
   996  
   997  <!--
   998  TODO: with new API as well as conversions implemented
   999  command dump task print: cli.Print(text: yaml.MarshalStream(objects))
  1000  
  1001  or without conversions:
  1002  command dump task print: cli.Print & {text: yaml.MarshalStream(objects)}
  1003  -->
  1004  
  1005  The `MarshalStream` command converts the list of objects to a '`---`'-separated
  1006  stream of YAML values.
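For instance, a hypothetical two-object list:

```
import "encoding/yaml"

// Two small objects...
objects: [{kind: "Service"}, {kind: "Deployment"}]

// ...marshal to two YAML documents separated by a `---` line,
// roughly:
//
//     kind: Service
//     ---
//     kind: Deployment
text: yaml.MarshalStream(objects)
```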
  1007  
  1008  
  1009  ### Creating Objects
  1010  
  1011  The `create` command sends a list of objects to `kubectl create`.
  1012  
  1013  ```
  1014  $ cat <<EOF > create_tool.cue
  1015  package kube
  1016  
  1017  import (
  1018  	"encoding/yaml"
  1019  	"tool/exec"
  1020  	"tool/cli"
  1021  )
  1022  
  1023  command: create: {
  1024  	task: kube: exec.Run & {
  1025  		cmd:    "kubectl create --dry-run -f -"
  1026  		stdin:  yaml.MarshalStream(objects)
  1027  		stdout: string
  1028  	}
  1029  
  1030  	task: display: cli.Print & {
  1031  		text: task.kube.stdout
  1032  	}
  1033  }
  1034  EOF
  1035  ```
  1036  
  1037  This command has two tasks, named `kube` and `display`.
  1038  The `display` task depends on the output of the `kube` task.
The `cue` tool does a static analysis of the dependencies and runs all
tasks whose dependencies are satisfied in parallel, while blocking tasks
for which an input is missing.
  1042  
  1043  ```
  1044  $ cue create ./frontend/...
  1045  service "bartender" created (dry run)
  1046  service "breaddispatcher" created (dry run)
  1047  service "host" created (dry run)
  1048  service "maitred" created (dry run)
  1049  service "valeter" created (dry run)
  1050  service "waiter" created (dry run)
  1051  service "waterdispatcher" created (dry run)
  1052  deployment.apps "bartender" created (dry run)
  1053  deployment.apps "breaddispatcher" created (dry run)
  1054  deployment.apps "host" created (dry run)
  1055  deployment.apps "maitred" created (dry run)
  1056  deployment.apps "valeter" created (dry run)
  1057  deployment.apps "waiter" created (dry run)
  1058  deployment.apps "waterdispatcher" created (dry run)
  1059  ```
  1060  
A real-life production version of this should, of course, omit the `--dry-run`
flag.
  1063  
  1064  ### Extract CUE templates directly from Kubernetes Go source
  1065  
  1066  In order for `cue get go` to generate the CUE templates from Go sources, you first need to have the sources locally:
  1067  
  1068  ```
  1069  $ go get k8s.io/api/apps/v1
  1070  ```
  1071  
  1072  ```
  1073  $ cue get go k8s.io/api/apps/v1
  1074  
  1075  ```
  1076  
  1077  Now that we have the Kubernetes definitions in our module, we can import and use them:
  1078  
  1079  ```
  1080  $ cat <<EOF > k8s_defs.cue
  1081  package kube
  1082  
  1083  import (
  1084    "k8s.io/api/core/v1"
  1085    apps_v1 "k8s.io/api/apps/v1"
  1086  )
  1087  
  1088  service: [string]:     v1.#Service
  1089  deployment: [string]:  apps_v1.#Deployment
  1090  daemonSet: [string]:   apps_v1.#DaemonSet
  1091  statefulSet: [string]: apps_v1.#StatefulSet
  1092  EOF
  1093  ```
  1094  
  1095  And, finally, we'll format again:
  1096  
  1097  ```
  1098  cue fmt
  1099  ```
  1100  
  1101  ## Manually tailored configuration
  1102  
  1103  In Section "Quick 'n Dirty" we showed how to quickly get going with CUE.
  1104  With a bit more deliberation, one can reduce configurations even further.
  1105  Also, we would like to define a configuration that is more generic and less tied
  1106  to Kubernetes.
  1107  
We will rely heavily on CUE's order independence, which makes it easy to
  1109  combine two configurations of the same object in a well-defined way.
This makes it easy, for instance, to put frequently used fields in one file
and more esoteric ones in another, and then combine them without fear that one
will override the other.
  1113  We will take this approach in this section.
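As a sketch of what this buys us (the `frontend` service and its fields are
hypothetical): two files in the same package can each contribute fields to the
same object, and CUE unifies them regardless of file order; a genuine conflict
would be reported as an error rather than silently overridden.

```
// common.cue: the frequently used fields
service: frontend: port: http: 7080

// esoteric.cue: the more obscure ones
service: frontend: kubernetes: spec: loadBalancerIP: "1.2.3.4"
```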
  1114  
  1115  The end result of this tutorial is in the `manual` directory.
  1116  In the next sections we will show how to get there.
  1117  
  1118  
  1119  ### Outline
  1120  
  1121  The basic premise of our configuration is to maintain two configurations,
  1122  a simple and abstract one, and one compatible with Kubernetes.
  1123  The Kubernetes version is automatically generated from the simple configuration.
Each simplified object has a `kubernetes` section that gets merged into
the Kubernetes object upon conversion.
  1126  
  1127  We define one top-level file with our generic definitions.
  1128  
  1129  ```
  1130  // file cloud.cue
  1131  package cloud
  1132  
  1133  service: [Name=_]: {
  1134      name: *Name | string // the name of the service
  1135  
  1136      ...
  1137  
  1138      // Kubernetes-specific options that get mixed in when converting
  1139      // to Kubernetes.
  1140      kubernetes: {
  1141      }
  1142  }
  1143  
  1144  deployment: [Name=_]: {
  1145      name: *Name | string
  1146     ...
  1147  }
  1148  ```
  1149  
  1150  A Kubernetes-specific file then contains the definitions to
  1151  convert the generic objects to Kubernetes.
  1152  
Overall, the code modeling our services and the code generating the Kubernetes
objects are separated, while still allowing us to inject Kubernetes-specific
data into our general model.
At the same time, we can add additional information to our model without
it ending up in the Kubernetes definitions and causing Kubernetes to barf.
  1158  
  1159  
  1160  ### Deployment Definition
  1161  
  1162  For our design we assume that all Kubernetes Pod derivatives only define one
  1163  container.
This is clearly not the case in general, but it often is, and it is good
practice.
  1166  Conveniently, it simplifies our model as well.
  1167  
  1168  We base the model loosely on the master templates we derived in
  1169  Section "Quick 'n Dirty".
The first step is to eliminate `statefulSet` and `daemonSet` and
instead just have a `deployment` that allows different kinds.
  1172  
  1173  ```
  1174  deployment: [Name=_]: _base & {
  1175      name:     *Name | string
  1176      ...
  1177  ```
  1178  
The kind only needs to be specified if the deployment is a stateful set or
daemon set.
  1181  This also eliminates the need for `_spec`.
  1182  
  1183  The next step is to pull common fields, such as `image` to the top level.
  1184  
  1185  Arguments can be specified as a map.
  1186  ```
  1187      arg: [string]: string
  1188      args: [ for k, v in arg { "-\(k)=\(v)" } ] | [...string]
  1189  ```
  1190  
  1191  If order matters, users could explicitly specify the list as well.
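For example, a hypothetical deployment could then write:

```
deployment: bartender: arg: verbosity: "2"

// which the comprehension expands to:
//
//     args: ["-verbosity=2"]
```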
  1192  
  1193  For ports we define two simple maps from name to port number:
  1194  
  1195  ```
    // expose.port defines named ports that are exposed in the service
  1197      expose: port: [string]: int
  1198  
  1199      // port defines a named port that is not exposed in the service.
  1200      port: [string]: int
  1201  ```
Both maps get defined in the container definition, but only `expose.port`
gets included in the service definition.
This may not be the best model, and it does not support all features,
but it shows how one can choose a different representation.
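For instance, a hypothetical `nginx` deployment exposing HTTP while keeping a
debug port internal could be written as:

```
deployment: nginx: {
    // copied into the derived service definition
    expose: port: http: 80

    // container-only; not part of the service
    port: debug: 8081
}
```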
  1206  
  1207  A similar story holds for environment variables.
In most cases mapping strings to strings suffices.
  1209  The testdata uses other options though.
  1210  We define a simple `env` map and an `envSpec` for more elaborate cases:
  1211  
  1212  ```
  1213      env: [string]: string
  1214  
  1215      envSpec: [string]: {}
  1216      envSpec: {
  1217          for k, v in env {
        "\(k)": value: v
  1219          }
  1220      }
  1221  ```
The simple map automatically gets mapped into the more elaborate map,
which then presents the full picture.
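As an illustration (hypothetical variable and value):

```
env: CORS_ALLOW_ORIGIN: "https://example.com"

// which the comprehension lifts into the elaborate form:
//
//     envSpec: CORS_ALLOW_ORIGIN: value: "https://example.com"
```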
  1224  
  1225  Finally, our assumption that there is one container per deployment allows us
  1226  to create a single definition for volumes, combining the information for
  1227  volume spec and volume mount.
  1228  
  1229  ```
  1230      volume: [Name=_]: {
  1231          name:       *Name | string
  1232          mountPath:  string
  1233          subPath:    null | string
  1234          readOnly:   bool
  1235          kubernetes: {}
  1236      }
  1237  ```
  1238  
All other fields that we may want to define can go into a generic `kubernetes`
struct that gets merged in with all other generated Kubernetes data.
  1241  This even allows us to augment generated data, such as adding additional
  1242  fields to the container.
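For example, a hypothetical deployment could attach a purely Kubernetes-level
setting through this struct; the conversion templates then merge it into the
generated object:

```
deployment: nginx: kubernetes: {
    // merged verbatim into the generated apps/v1 Deployment
    spec: template: spec: terminationGracePeriodSeconds: 60
}
```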
  1243  
  1244  
  1245  ### Service Definition
  1246  
  1247  The service definition is straightforward.
  1248  As we eliminated stateful and daemon sets, the field comprehension to
  1249  automatically derive a service is now a bit simpler:
  1250  
  1251  ```
  1252  // define services implied by deployments
  1253  service: {
  1254      for k, spec in deployment {
  1255          "\(k)": {
  1256              // Copy over all ports exposed from containers.
  1257              for Name, Port in spec.expose.port {
  1258                  port: "\(Name)": {
  1259                      port:       *Port | int
  1260                      targetPort: *Port | int
  1261                  }
  1262              }
  1263  
  1264              // Copy over the labels
  1265              label: spec.label
  1266          }
  1267      }
  1268  }
  1269  ```
  1270  
  1271  The complete top-level model definitions can be found at
  1272  [doc/tutorial/kubernetes/manual/services/cloud.cue](https://cue.googlesource.com/cue/+/master/doc/tutorial/kubernetes/manual/services/cloud.cue).
  1273  
  1274  The tailorings for this specific project (the labels) are defined
  1275  [here](https://cue.googlesource.com/cue/+/master/doc/tutorial/kubernetes/manual/services/kube.cue).
  1276  
  1277  
  1278  ### Converting to Kubernetes
  1279  
  1280  Converting services is fairly straightforward.
  1281  
  1282  ```
  1283  kubernetes: services: {
  1284      for k, x in service {
  1285          "\(k)": x.kubernetes & {
  1286              apiVersion: "v1"
  1287              kind:       "Service"
  1288  
  1289              metadata: name:   x.name
  1290              metadata: labels: x.label
  1291              spec: selector:   x.label
  1292  
  1293              spec: ports: [ for p in x.port { p } ]
  1294          }
  1295      }
  1296  }
  1297  ```
  1298  
  1299  We add the Kubernetes boilerplate, map the top-level fields and mix in
  1300  the raw `kubernetes` fields for each service.
  1301  
  1302  Mapping deployments is a bit more involved, though analogous.
  1303  The complete definitions for Kubernetes conversions can be found at
  1304  [doc/tutorial/kubernetes/manual/services/k8s.cue](https://cue.googlesource.com/cue/+/master/doc/tutorial/kubernetes/manual/services/k8s.cue).
  1305  
  1306  Converting the top-level definitions to concrete Kubernetes code is the hardest
  1307  part of this exercise.
  1308  That said, most CUE users will never have to resort to this level of CUE
  1309  to write configurations.
  1310  For instance, none of the files in the subdirectories contain comprehensions,
  1311  not even the template files in these directories (such as `kitchen/kube.cue`).
  1312  Furthermore, none of the configuration files in any of the
  1313  leaf directories contain string interpolations.
  1314  
  1315  
  1316  ### Metrics
  1317  
  1318  The fully written out manual configuration can be found in the `manual`
  1319  subdirectory.
  1320  Running our usual count yields
  1321  ```
  1322  $ find . | grep kube.cue | xargs wc | tail -1
  1323       542    1190   11520 total
  1324  ```
  1325  This does not count our conversion templates.
  1326  Assuming that the top-level templates are reusable, and if we don't count them
  1327  for both approaches, the manual approach shaves off about another 150 lines.
  1328  If we count the templates as well, the two approaches are roughly equal.
  1329  
  1330  
  1331  ### Conclusions Manual Configuration
  1332  
  1333  We have shown that we can further compact a configuration by manually
  1334  optimizing template files.
  1335  However, we have also shown that the manual optimization only gives
  1336  a marginal benefit with respect to the quick-and-dirty semi-automatic reduction.
The benefit of the manual definition largely lies in the organizational
flexibility one gets.
  1339  
  1340  Manually tailoring your configurations allows creating an abstraction layer
  1341  between logical definitions and Kubernetes-specific definitions.
  1342  At the same time, CUE's order independence
  1343  makes it easy to mix in low-level Kubernetes configuration wherever it is
  1344  convenient and applicable.
  1345  
  1346  Manual tailoring also allows us to add our own definitions without breaking
  1347  Kubernetes.
This is crucial for keeping information that is relevant to our definitions,
but unrelated to Kubernetes, where it belongs.
  1350  
Separating abstract from concrete configuration also allows us to create
different adaptors for the same configuration.
  1353  
  1354  
  1355  <!-- TODO:
  1356  ## Conversion to `docker-compose`
  1357  -->