# Auto sync support design doc

* Author(s): Appu Goundan (@loosebazooka)
* Design Shepherds: Tejal Desai (@tejal29), Balint Pato (@balopat)
* Date: 09/17/2019
* Status: Implementation in progress

## Background

Currently skaffold does not support `sync` for files that are generated
during a build. For example, when syncing java files to a container, one
would normally expect `.class` files to be synced, but skaffold is
only aware of the `.java` files in a build.

1. Why is this required?
  - We would like to support the following features:
      - Skaffold can sync files that are not watched (these files are not
        considered inputs for the container builder). Motivating example: user
        compiles java class files outside of the container _manually_, while
        Spring Boot DevTools is running inside the container. The class files
        would be picked up by Skaffold and copied over, and the app would pick
        up the changes.
      - Skaffold can sync files that are generated by a buildscript. Motivating
        example: user changes java files, Skaffold's watcher notices this and runs the
        buildscript that generates class files. Since class files are marked as 'generated'
        syncables, they are synced to the container, where Spring Boot DevTools
        picks up the change.
      - And one or both of the following:
          - Skaffold can be configured to run a custom script when certain files change.
            Motivating example: Jib by default will run a full container build
            on "build", but we only want to generate intermediate assets (class
            files, etc) when doing a sync build.
          - Skaffold can sync an externally generated tar file to a remote container (see
            [alternative design for builder using tar](#delegate-generation-of-sync-tar-to-builder)).

2. If this is a redesign, what are the drawbacks of the current implementation?
  - This is not a redesign, but a new sync mode that may or may not be available
    in the skaffold.yaml directly to users. It could simply be an internal API
    that is usable by builders like Jib.
  - This is similar to `_smart_` described in [sync-improvements](sync-improvements.md).

3. Is there another workaround, and if so, what are its drawbacks?
  - Currently one can create a Dockerfile that only copies specific build
    results into the container and relies on a local build to generate those
    intermediate artifacts. This requires a user to trigger a first build
    manually before docker kicks in to containerize the application. While
    it may be possible to automate this (for example: gradle --continuous), it
    is not an acceptable solution to require manual external processes for
    a build to succeed. This is covered in [Hack it](#hack-it).

4. Mention related issues, if there are any.
  - This is not trying to solve the problem of dealing with a multistage
    docker build. It might still be possible to determine intermediate build
    artifacts, but that would require an extra mechanism to do a partial
    docker build and sync files from a built container to a running container --
    something we do not intend to cover here.

#### Problems with current API/config
The current `sync` system has the following problems:
1. No way to trigger local out-of-container processes. For example, in jib, to build a
   container, one would run `./gradlew jib`, but to update class files so the
   system may sync them, one would only be required to run `./gradlew classes`.
2. No way to tell the system to sync non-build inputs. Skaffold is
   normally only watching `.java` files for a build. In the sync case, we want
   it to watch `.java` files, trigger a partial build, and sync `.class` files
   so a remote server can pick up and reload the changes.

## Design

### Hack it
To get close to the functionality we want, without modifying skaffold at all, a
Dockerfile which depends on java build outputs could be used, like:
```
FROM openjdk:8
COPY build/dependencies/ /app/dependencies
COPY build/classes/java/main/ /app/classes

CMD ["java", "-cp", "/app/classes:/app/dependencies/*", "hello.Application"]
```

with a skaffold sync block that looks like:
```
sync:
  manual:
    - src: "build/classes/java/main/**/*.class"
      dest: "/app/classes"
      strip: "build/classes/java/main/"
```
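
As a rough illustration of what the `strip` and `dest` fields in this block do, the following Go sketch maps a local class file to its path in the container. It is a simplified model of the rule, not skaffold's actual implementation; the `destPath` function is a hypothetical name, and the example file is assumed to belong to the `hello.Application` class from the Dockerfile above.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// destPath removes the strip prefix from a local path and joins what is left
// onto dest. Hypothetical helper: a simplified model of a manual sync rule.
func destPath(src, strip, dest string) string {
	rel := strings.TrimPrefix(src, strip)
	return path.Join(dest, rel)
}

func main() {
	local := "build/classes/java/main/hello/Application.class"
	fmt.Println(destPath(local, "build/classes/java/main/", "/app/classes"))
}
```

With the values from the sync block, the class file lands under `/app/classes/hello/`, which matches the classpath used in the Dockerfile's `CMD`.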

A user's devloop then looks like this:

1. run `./gradlew classes copyDependencies`
1. run `skaffold dev`
1. *make changes to some java file*
1. run `./gradlew classes`
1. *skaffold syncs files*

which is far from ideal.

### A new `auto` option

Provide users with an `auto` option. Users that use `auto` should expect
the builder-sync to do the right thing and will not be required to do much
configuration.

`auto` will only work with builders that have implemented the `auto` spec. We
expect at least `jib` to implement the spec.

#### User Configuration

```yaml
build:
  artifacts:
  - image: ...
    context: jib-project
    jib: {}
    sync:
      auto: {}
```

#### Get necessary information from the builder

Skaffold can expose an API that accepts a complex configuration describing how
skaffold should be doing synchronization.

The builder will talk to the sync component by providing it with the following data:
1. A list of `generated` configs, each containing
    1. A `command` to generate files to sync
    1. A list of inputs to watch as triggers
    1. A list of syncs (src, dest) to execute after generation
1. A list of `direct` sync directives (src, dest) to execute without any script
   execution

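So maybe some data structures like (I don't really know a lot of go, so assume
this will be written in some consistent way eventually):

```golang
// AutoSync is the full sync configuration a builder hands to skaffold.
type AutoSync struct {
	generated []Generated
	direct    []Direct
}

// Generated describes files that a command must produce before they are synced.
type Generated struct {
	command   []string    // command to generate files to sync
	inputs    []string    // paths of input files to watch as triggers
	syncables []Syncables // syncs to execute after generation
}

// Direct describes files that are synced without any script execution.
type Direct struct {
	syncables []Syncables
}

// Syncables is a single (src, dest) sync directive.
type Syncables struct {
	src  string
	dest string
}
```
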
#### Jib - Skaffold Sync interface

How a tool like Jib might surface the necessary information to Skaffold:

I would expect to add a task like `_jibSkaffoldSyncMap` that will produce
json output for the skaffold-jib integration to consume and forward to the sync
system. An example output might look like:

```
BEGIN JIB JSON: SYNCMAP/1
{
  "generated": [
    {
      "src": "target/classes",
      "dest": "app/classes"
    },
    {
      "src": "target/resources",
      "dest": "app/resources"
    }
  ],
  "direct": [
    {
      "src": "src/main/extra1",
      "dest": "/"
    },
    {
      "src": "src/main/extra2",
      "dest": "/"
    },
    {
      "src": ".m2/some/dep/my-dep-SNAPSHOT.jar",
      "dest": "app/dependencies/my-dep-SNAPSHOT.jar"
    }
  ]
}
```
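
On the skaffold side, consuming this output could look roughly like the Go sketch below. It is only an illustration under a few assumptions: the task may print other build output before the `BEGIN JIB JSON: SYNCMAP/1` marker, the JSON document follows the marker, and the `Entry`, `SyncMap` and `parseSyncMap` names are hypothetical rather than existing skaffold API. The first line of the sample input is a made-up log line.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Entry is one (src, dest) pair from the sync map. Hypothetical type.
type Entry struct {
	Src  string `json:"src"`
	Dest string `json:"dest"`
}

// SyncMap mirrors the shape of the example output. Hypothetical type.
type SyncMap struct {
	Generated []Entry `json:"generated"`
	Direct    []Entry `json:"direct"`
}

const marker = "BEGIN JIB JSON: SYNCMAP/1"

// parseSyncMap skips everything up to and including the marker and decodes
// the first JSON value that follows it.
func parseSyncMap(output string) (*SyncMap, error) {
	idx := strings.Index(output, marker)
	if idx < 0 {
		return nil, fmt.Errorf("marker %q not found", marker)
	}
	var m SyncMap
	dec := json.NewDecoder(strings.NewReader(output[idx+len(marker):]))
	if err := dec.Decode(&m); err != nil {
		return nil, err
	}
	return &m, nil
}

func main() {
	// The first line stands in for arbitrary build tool output (assumption).
	out := "some build tool output\n" + marker + "\n" +
		`{"generated":[{"src":"target/classes","dest":"app/classes"}],` +
		`"direct":[{"src":"src/main/extra1","dest":"/"}]}`
	m, err := parseSyncMap(out)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(m.Generated), len(m.Direct), m.Generated[0].Dest)
}
```

Keeping the version in the marker (`SYNCMAP/1`) lets the consumer reject output from a format it does not understand instead of misparsing it.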

Files in the `generated` section will trigger a partial rebuild of the container
(everything before containerization) while files in the `direct` section can
just be synchronized to the running container without a rebuild of anything.

##### Sync or Rebuild?

Each builder implementing `auto` should be able to decide when a sync
should be skipped and a rebuild done instead. In the jib case, for instance, a
rebuild will be triggered if:
- a build file has changed (`build.gradle`, `pom.xml`, etc)
- a file is deleted
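
A minimal Go sketch of such a decision for the jib case is shown below. The `Change` type, the `needsRebuild` function and the contents of `buildFiles` are hypothetical; the list of build files only contains the two examples named above and would need to be extended by a real builder.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Change describes one file event. Hypothetical type, not skaffold API.
type Change struct {
	Path    string
	Deleted bool
}

// buildFiles holds build definitions whose modification forces a rebuild.
// Only the examples from the text are listed here (illustrative, not complete).
var buildFiles = map[string]bool{
	"build.gradle": true,
	"pom.xml":      true,
}

// needsRebuild reports whether any change rules out a sync: a deleted file
// or a modified build file.
func needsRebuild(changes []Change) bool {
	for _, c := range changes {
		if c.Deleted || buildFiles[filepath.Base(c.Path)] {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(needsRebuild([]Change{{Path: "src/main/java/hello/Application.java"}}))
	fmt.Println(needsRebuild([]Change{{Path: "pom.xml"}}))
}
```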

#### Open Issues/Questions

**What about files that have specific permissions?**

Jib allows users to customize file permissions; for instance, a file on the
container can be configured with a mode such as `0456`. One option, instead of
dealing with this, is to just make all synced files `777`. Alternatively, we allow
passthrough of permissions from the build system to the sync system.

**Should we allow the user to configure the auto block?**

Perhaps the user knows something that jib doesn't, and wants to exclude some files
from synchronization. They might want to do:

```
sync:
  auto:
    ignored:
    - "src/main/jib/myExtraFilesThatBreakADeployment"
```

## Implementation plan for jib

- [`schemas/<version>.go`] Add `Auto` to the schema under `Sync`
- Before the first build, initialize the `sync` state for builders in `auto` mode. This
sync state is saved in the builder's specific implementation of `auto`

- On file change in dev mode:
```
if (files were deleted)
  return REBUILD

if (changes were made to build def)
  return REBUILD

lastSyncState = syncStates["this project"]

if (all changed files are in lastSyncState.direct)
  return SYNC{list of direct files}

newSyncState = buildAndCalculateNewSyncState("this project")
syncStates["this project"] = newSyncState

diff = diff(lastSyncState, newSyncState)
return SYNC{files in diff}
```
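
The `diff` step could be sketched in Go as follows. This is a minimal illustration, assuming the sync state is a map from local path to some fingerprint of the file contents (a modification time or a hash, for example). The `SyncState` type and the `changedFiles` function are hypothetical names, and the fingerprints in `main` are made up. Deleted files are not handled here because the flow above already returns `REBUILD` for them.

```go
package main

import (
	"fmt"
	"sort"
)

// SyncState maps a local source path to a fingerprint of its contents.
// Hypothetical type: the real state format is up to the builder.
type SyncState map[string]string

// changedFiles returns, in sorted order, the paths that are new in next or
// whose fingerprint differs from the one recorded in prev.
func changedFiles(prev, next SyncState) []string {
	var out []string
	for p, fp := range next {
		if old, ok := prev[p]; !ok || old != fp {
			out = append(out, p)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	last := SyncState{"target/classes/A.class": "v1", "target/classes/B.class": "v1"}
	next := SyncState{"target/classes/A.class": "v2", "target/classes/B.class": "v1", "target/classes/C.class": "v1"}
	// Only A (modified) and C (new) need to be synced.
	fmt.Println(changedFiles(last, next))
}
```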

## Integration test plan

This can be a Kind-based test; we don't need GCP secrets.
Hence:
- gradle or maven needs to be installed on the travis integration test jobs, outside the cluster, for testing the build script
- the test would do the following steps:
    - spin up a simple java app built by jib with `skaffold dev` and auto sync enabled
    - get the pod name
    - change a `.java` file (this should trigger generation of a class file)
    - check with kubectl inside the pod that the class file is copied over, comparing its content with the external version (if the pod is not found because a rebuild was triggered, fail)

## Alternatives Explored

The following implementation had a few problems that were potential deal
breakers. It is left in here for informational purposes; no
attempt was made to implement it. It has the following issues:
1. Depends on `sync` using `tar`, which should be an implementation detail
2. Doesn't really provide a good base for exposing the `auto` sync configuration
   block. As skaffold takes care of fewer things, the user must be responsible
   for implementing it.

#### Delegate generation of sync tar to builder

One option is to completely hand over the detection of container updates to the
builder, which would provide skaffold with a tar to synchronize.

A potential runthrough might look like:

1. Skaffold detects changes
1. Skaffold asks build system if sync should happen
1. Build system can reply
  1. Yes: and here is the `tar` to sync
  1. No: this will tell skaffold it should do a rebuild

This allows the builder implementation to make the sync decisions within its
own system.

#### Jib - Skaffold Sync interface

For a potential consumer of this mechanism like jib, we would expose a task like
`_jibCreateSyncTar` which would do something like

1. If we should rebuild, tell skaffold to rebuild by outputting something like
  ```
  REBUILD
  ```
1. If we can sync
  1. Compare against last build
  2. Tar up files for synchronization
  3. Tell skaffold where to find that tar
  ```
  SYNC: /path/to/sync.tar
  ```
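
For completeness, interpreting the reply of such a task could look like the Go sketch below. It assumes the reply is the last non-empty line of the task's output and uses exactly the `REBUILD` and `SYNC: <path>` forms shown above; `parseReply` is a hypothetical name, and since this alternative was never implemented the sketch is purely illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// parseReply inspects the last non-empty line of the task output.
// It returns the tar path for a SYNC reply, or rebuild=true for REBUILD.
func parseReply(output string) (tarPath string, rebuild bool, err error) {
	lines := strings.Split(strings.TrimSpace(output), "\n")
	last := strings.TrimSpace(lines[len(lines)-1])
	switch {
	case last == "REBUILD":
		return "", true, nil
	case strings.HasPrefix(last, "SYNC:"):
		p := strings.TrimSpace(strings.TrimPrefix(last, "SYNC:"))
		if p == "" {
			return "", false, fmt.Errorf("SYNC reply has no tar path")
		}
		return p, false, nil
	}
	return "", false, fmt.Errorf("unrecognized reply %q", last)
}

func main() {
	// The first line stands in for arbitrary build output (assumption).
	p, rebuild, err := parseReply("some build log\nSYNC: /path/to/sync.tar\n")
	fmt.Println(p, rebuild, err)
}
```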