# Auto sync support design doc

* Author(s): Appu Goundan (@loosebazooka)
* Design Shepherds: Tejal Desai (@tejal29), Balint Pato (@balopat)
* Date: 09/17/2019
* Status: Implementation in progress

## Background

Currently skaffold does not support `sync` for files that are generated
during a build. For example, when syncing java files to a container, one
would normally expect `.class` files to be sync'd, but skaffold is
really only aware of `.java` files in a build.

1. Why is this required?
    - We would like to support the following features:
        - Skaffold can sync files that are not watched (these files are not
          considered inputs for the container builder). Motivating example: user
          compiles java class files outside of the container _manually_, while
          Spring Boot DevTools is running inside the container. The class files
          would be picked up by Skaffold and copied over, and the app would pick
          up the changes.
        - Skaffold can sync files that are generated by a buildscript. Motivating
          example: user changes java files, the Skaffold watcher notices this and runs the
          buildscript that generates class files. Since class files are marked 'generated'
          syncables, they are synced to the container, where Spring Boot DevTools
          picks up the change.
    - And one or both of the following:
        - Skaffold can be configured to run a custom script when certain files change.
          Motivating example: Jib by default will run a full container build
          on "build", but we only want to generate intermediate assets (class
          files, etc) when doing a sync build.
        - Skaffold can sync an externally generated tar file to a remote container (see
          [alternative design for builder using tar](#delegate-generation-of-sync-tar-to-builder))

2. If this is a redesign, what are the drawbacks of the current implementation?
    - This is not a redesign, but a new sync mode that may or may not be exposed
      directly to users in the skaffold.yaml. It could simply be an internal API
      that is usable by builders like Jib.
    - This is similar to `_smart_` described in [sync-improvements](sync-improvements.md)

3. Is there another workaround, and if so, what are its drawbacks?
    - Currently one can create a Dockerfile that only copies specific build
      results into the container and relies on a local build to generate those
      intermediate artifacts. This requires a user to trigger a first build
      manually before docker kicks in to containerize the application. While
      it may be possible to automate this (for example: gradle --continuous), it
      is not an acceptable solution to require manual external processes for
      a build to succeed. Covered in [Hack it](#hack-it)

4. Mention related issues, if there are any.
    - This is not trying to solve the problem of dealing with a multistage
      docker build. Intermediate build artifacts might still be possible to
      determine, however that would require an extra mechanism to do a partial
      docker build and sync files from a built container to a running container --
      something we do not intend to cover here.

#### Problems with current API/config
The current `sync` system has the following problems:
1. No way to trigger local out-of-container processes - for example, in jib, to build a
   container, one would run `./gradlew jib`, but to update class files so the
   system may sync them, one would only need to run `./gradlew classes`.
2. No way to tell the system to sync non build inputs. Skaffold is
   normally only watching `.java` files for a build; in the sync case, we want
   it to watch `.java` files, trigger a partial build, and sync `.class` files
   so a remote server can pick up and reload the changes.

## Design

### Hack it
To get close to the functionality we want, without modifying skaffold at all, a
Dockerfile which depends on java build outputs could be used, like:
```
FROM openjdk:8
COPY build/dependencies/ /app/dependencies
COPY build/classes/java/main/ /app/classes

CMD ["java", "-cp", "/app/classes:/app/dependencies/*", "hello.Application"]
```

with a skaffold sync block that looks like:
```
sync:
  manual:
  - src: "build/classes/java/main/**/*.class"
    dest: "/app/classes"
    strip: "build/classes/java/main/"
```

A user's devloop then looks like this:

1. run `./gradlew classes copyDependencies`
1. run `skaffold dev`
1. *make changes to some java file*
1. run `./gradlew classes`
1. *skaffold syncs files*

which is far from ideal.

### A new `auto` option

Provide users with an `auto` option. Users that use `auto` should expect
the builder-sync to do the right thing and will not be required to do much
configuration.

`auto` will only work with builders that have implemented the `auto` spec. We
expect at least `jib` to implement the spec.

#### User Configuration

```yaml
build:
  artifacts:
  - image: ...
    context: jib-project
    jib: {}
    sync:
      auto: {}
```


#### Get necessary information from the builder

Skaffold can expose an API that can accept a complex configuration on how
skaffold should be doing synchronization.

The builder will talk to the sync component by providing it with the following data:
1. A list of `generated` configs, each containing
    1. A `command` to generate files to sync
    1. A list of inputs to watch as triggers
    1. A list of syncs (src, dest) to execute after generation
1. A list of `direct` sync directives (src, dest) to execute without any script
   execution

So maybe some datastructures like (I don't really know a lot of go, so assume
this will be written in some consistent way eventually):

```golang
// AutoSync is what a builder hands to the sync component.
type AutoSync struct {
  generated []Generated
  direct    []Direct
}

// Generated describes files that must be built before they can be synced.
type Generated struct {
  command   []string    // command that generates the files to sync
  inputs    []string    // paths of input files to watch as triggers
  syncables []Syncables // syncs to execute after generation
}

// Direct describes files that can be synced without running anything.
type Direct struct {
  syncables []Syncables
}

type Syncables struct {
  src  string
  dest string
}
```

#### Jib - Skaffold Sync interface

How a tool like Jib might surface the necessary information to Skaffold:

I would expect to add a task like `_jibSkaffoldSyncMap` that will produce
json output for the skaffold-jib integration to consume and forward to the sync
system. An example output might look like:

```
BEGIN JIB JSON: SYNCMAP/1
{
  "generated": [
    {
      "src": "target/classes",
      "dest": "app/classes"
    },
    {
      "src": "target/resources",
      "dest": "app/resources"
    }
  ],
  "direct": [
    {
      "src": "src/main/extra1",
      "dest": "/"
    },
    {
      "src": "src/main/extra2",
      "dest": "/"
    },
    {
      "src": ".m2/some/dep/my-dep-SNAPSHOT.jar",
      "dest": "app/dependencies/my-dep-SNAPSHOT.jar"
    }
  ]
}
```

Files in the `generated` section will trigger a partial rebuild of the container
(everything before containerization) while files in the `direct` section can
just be synchronized to the running container without a rebuild of anything.

##### Sync or Rebuild?

Each builder implementing the `auto` spec should be able to decide when a sync
should be skipped and a rebuild done instead.
In the jib case for instance, a
rebuild will be triggered if:
- a build file has changed (`build.gradle`, `pom.xml`, etc)
- a file is deleted

#### Open Issues/Questions

**What about files that have specific permissions?**

Jib allows users to customize file permissions, for instance a file on the
container can be configured to be 0456 or something. One option, instead of
dealing with this, is to just make all sync'd files 777? Or we allow passthrough
of permissions from the build system to the sync system.

**Should we allow the user to configure the auto block?**

Perhaps the user knows something that jib doesn't, and wants to exclude some files
from synchronization. They might want to do:

```
sync:
  auto:
    ignored:
    - "src/main/jib/myExtraFilesThatBreakADeployment"
```

## Implementation plan for jib

- [`schemas/<version>.go`] Add `Auto` to the schema under `Sync`
- Before first build, initialize the `sync` state for builders in `auto` mode; this
  sync state is saved in the builder's specific implementation of `auto`

- On file change in dev mode:
  ```
  if (files were deleted)
    return REBUILD

  if (changes were made to build def)
    return REBUILD

  lastSyncState = syncStates["this project"]

  if (all file changes are in lastSyncState.direct)
    return SYNC{list of direct files}

  newSyncState = buildAndCalculateNewSyncState("this project")
  syncStates["this project"] = newSyncState

  diff = diff(lastSyncState, newSyncState)
  return SYNC{files in diff}
  ```

## Integration test plan


This can be a Kind based test, we don't need GCP secrets.
Hence:
- need to install gradle or maven on travis integration test jobs outside the cluster for testing the build script
- the test would do the following steps:
    - spin up a simple java app built by jib with skaffold dev + autosync enabled
    - get pod name
    - change a .java file (should trigger generation of a class file)
    - check with kubectl inside the pod (if pod not found as rebuild was triggered, fail) that the class file is copied over (compare content with external version)

## Alternatives Explored

The following implementation had a few problems that were potential deal
breakers. It is left in here for information purposes; no
attempt was made to implement it. It has the following issues:
1. Depends on `sync` using `tar`, which should be an implementation detail
2. Doesn't really provide a good base for exposing the `auto` sync configuration
   block. As skaffold takes care of fewer things, the user must be responsible
   for implementing it.

#### Delegate generation of sync tar to builder

One option is to completely hand over the detection of container updates to the
builder, which would provide skaffold with a tar to synchronize.

A potential runthrough might look like:

1. Skaffold detects changes
1. Skaffold asks build system if sync should happen
1. Build system can reply
    1. Yes: and here is the `tar` to sync
    1. No: this will tell skaffold it should do a rebuild

This allows the builder implementation to make the sync decisions within its
own system.

#### Jib - Skaffold Sync interface

For a potential consumer of this mechanism like jib, we would expose a task like
`_jibCreateSyncTar` which would do something like

1. If we should rebuild -> tell skaffold to rebuild, so output something like
    ```
    REBUILD
    ```
1. If we can sync
    1. Compare against last build
    2. Tar up files for synchronization
    3. Tell skaffold where to find that tar
    ```
    SYNC: /path/to/sync.tar
    ```
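
On the Skaffold side, the reply described above could be consumed roughly as follows. This is a minimal sketch and not part of the proposal: `Decision` and `parseBuilderReply` are hypothetical names, and the sketch assumes the builder prints its reply (`REBUILD` or `SYNC: <path>`) on a line of its own, possibly surrounded by other build output.

```go
package main

import (
	"bufio"
	"errors"
	"strings"
)

// Decision is a hypothetical type describing what Skaffold should do
// after asking the builder whether a sync is possible.
type Decision struct {
	Rebuild bool   // true when the builder asked for a full rebuild
	TarPath string // location of the tar to sync; set only when Rebuild is false
}

// parseBuilderReply scans the builder output line by line and returns the
// first REBUILD or SYNC reply it finds. Lines that match neither are
// treated as ordinary build output and skipped.
func parseBuilderReply(output string) (Decision, error) {
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "REBUILD" {
			return Decision{Rebuild: true}, nil
		}
		if strings.HasPrefix(line, "SYNC:") {
			path := strings.TrimSpace(strings.TrimPrefix(line, "SYNC:"))
			if path == "" {
				return Decision{}, errors.New("SYNC reply has no tar path")
			}
			return Decision{TarPath: path}, nil
		}
	}
	if err := sc.Err(); err != nil {
		return Decision{}, err
	}
	return Decision{}, errors.New("no REBUILD or SYNC reply in builder output")
}
```

With this sketch, `parseBuilderReply("SYNC: /path/to/sync.tar")` returns a `Decision` whose `TarPath` is `/path/to/sync.tar`, and output containing neither marker is reported as an error so the caller can fall back to a rebuild.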