# Gateway API Implementer's Guide

Everything you wanted to know about building a Gateway API implementation
but were too afraid to ask.

This document is a place to collect tips and tricks for _writing a Gateway API
implementation_ that have no straightforward place within the godoc fields of the
underlying types.

It's also intended to be a place to write down some guidelines to
help implementers of this API avoid common mistakes.

It may not be very relevant if you are intending to _use_ this API as an end
user as opposed to _building_ something that uses it.

This is a living document; if you see something missing, PRs are welcome!

## Important things to remember about Gateway API

Hopefully most of these are not surprising, but they sometimes have non-obvious
implications that we'll try to lay out here.

### Gateway API is a `kubernetes.io` API

Gateway API uses the `gateway.networking.k8s.io` API group. This means that,
like the APIs delivered in the core Kubernetes binaries, the APIs in each release
have been reviewed by upstream Kubernetes reviewers.

### Gateway API is delivered using CRDs

Gateway API is supplied as a set of CRDs, version controlled using our [versioning
policy][versioning].

The most important part of that versioning policy is that what _appears to be_
the same object (that is, it has the same `group`, `version`, and `kind`) may have
a slightly different schema. We make changes in ways that are _compatible_, so
things should generally "just work", but there are some actions implementations
need to take to make "just work"ing more reliable; these are detailed below.

The CRD-based delivery also means that if an implementation tries to use (that is,
get, list, watch, etc.) Gateway API objects when the CRDs have _not_ been installed,
then it's likely that your Kubernetes client code will return serious errors.
Tips to deal with this are also detailed below.

The CRD definitions for Gateway API objects all contain two specific
annotations:

- `gateway.networking.k8s.io/bundle-version: <semver-release-version>`
- `gateway.networking.k8s.io/channel: <channel-name>`

The concepts of "bundle version" and "channel" (short for "release channel") are
explained in our [versioning][versioning] documentation.

Implementations may use these to determine what schema versions are installed in
the cluster, if any.

[versioning]: /concepts/versioning

### Changes to the Gateway API CRDs are backwards compatible

Part of the contract for Gateway API CRDs is that changes _within an API version_
must be _compatible_.

"Within an API Version" means changes to a CRD that occur while the same API version
(`v1alpha2` or `v1` for example) is in use, and "compatible" means that any new
fields, values, or validation will be added in a way that ensures that _previous_
objects _will still be valid objects_ after the change.

This means that once Gateway API objects move to the `v1` API version, then _all_
changes must be compatible.

This contract also means that an implementation will not fail with a higher version
of the API than the version it was written with, because the newer schema being
stored by Kubernetes will definitely be able to be serialized into the older version
used in code by the implementation.

Similarly, if an implementation was written with a _higher_ version, the newer
values that it understands will simply _never be used_, as they are not present
in the older version.

## Implementation Rules and Guidelines

### CRD Management

For a Gateway API implementation to work, the Gateway API CRDs must be installed
in the Kubernetes cluster the implementation is watching.

Implementations have two possible options: installing CRDs themselves (implementation
controlled) or requiring installation by some other mechanism before working
(externally controlled). Both have tradeoffs, but the implementation-controlled
option has significantly more, so we DO NOT recommend using implementation-controlled
methods at this time.

Whichever option is chosen, certain things SHOULD be true:

Whatever method is used, infra and cluster admins SHOULD attempt to ensure that
the bundle version of the CRDs is not _downgraded_. Although we ensure that
API changes are backwards compatible, changing CRD definitions can change the
storage version of the resource, which could have unforeseen effects. Most of the
time, things will probably work, but if something doesn't work, it will most likely
break in weird ways.

Additionally, older versions of the API may be missing fields or features, which
could be very disruptive for users.

Try your best to ensure that the bundle version doesn't roll backwards. It's safer.

Implementations SHOULD also handle the Gateway API CRDs _not_ being present in
the cluster without crashing or panicking. Exiting with a clear fatal error is
acceptable in this case, as is disabling Gateway API support even if enabled in
configuration.

In practice, implementations using `controller-runtime` or similar tooling
may need to check for the _presence_ of the CRDs by
getting the list of installed CRDs before attempting to watch those resources.
(Note that this will require the implementation to have `read` access to those
resources though.)
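
The presence check described above can be sketched as follows. This is a minimal illustration in Python, not part of any Gateway API library: it assumes the CRD list has already been fetched and decoded into plain dicts shaped like CustomResourceDefinition objects, and the function names `installed_gateway_api_crds` and `missing_kinds` are hypothetical.

```python
GATEWAY_API_GROUP = "gateway.networking.k8s.io"
BUNDLE_VERSION_ANNOTATION = "gateway.networking.k8s.io/bundle-version"
CHANNEL_ANNOTATION = "gateway.networking.k8s.io/channel"


def installed_gateway_api_crds(crds):
    """Map each installed Gateway API kind to its (bundle version, channel).

    `crds` is a list of CustomResourceDefinition objects as plain dicts,
    for example the `items` of a CRD list response decoded from JSON.
    CRDs from other API groups are ignored. Missing annotations yield None.
    """
    found = {}
    for crd in crds:
        spec = crd.get("spec", {})
        if spec.get("group") != GATEWAY_API_GROUP:
            continue
        kind = spec["names"]["kind"]
        annotations = crd.get("metadata", {}).get("annotations") or {}
        found[kind] = (
            annotations.get(BUNDLE_VERSION_ANNOTATION),
            annotations.get(CHANNEL_ANNOTATION),
        )
    return found


def missing_kinds(crds, required_kinds):
    """Return the required kinds that have no CRD installed, sorted by name."""
    installed = installed_gateway_api_crds(crds)
    return sorted(set(required_kinds) - set(installed))
```

An implementation might call `missing_kinds` at startup with the kinds it intends to watch, then either exit with a clear error or disable Gateway API support if the result is non-empty.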

#### Implementation-controlled CRD installation

Implementation-controlled CRD installation also includes automatic installation
mechanisms such as Helm, if the CRDs are included in a Helm chart with the
implementation's installation.

Because of significant caveats, we DO NOT recommend doing implementation-controlled
CRD management at this time.

However, if you really must, CRD definitions MAY be installed by implementations,
but if they do, they MUST have a way to ensure:

- that there are no other Gateway API CRDs installed in the cluster before starting, or
- that the CRD definitions are only installed if they are a higher bundle version
  than any existing Gateway API CRDs. Note that even this may not be safe if there
  are breaking changes in the experimental channel resources, so implementations
  should be _very_ careful with doing this.

This avoids problems if another implementation is also installed in the cluster
and expects a higher version of the CRDs to be installed.

The worst outcome here would be two implementations trying to do automatic install
of _different_ CRD versions, resulting in the CRD versions flapping between
versions or channels. This would _not_ produce good outcomes.

The safer method for an automatic installation would require the implementation
to:

- Check if there are any Gateway API CRDs installed in the cluster.
- If not, install its most compatible version of the CRDs.
- If so, only install its version of the CRDs if the bundle version is higher
  than the existing one. The mechanism will also need to check whether any of the
  versions involved include incompatible changes.

This is going to be _very_ difficult to pull off in practice.
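
The decision steps in the list above can be sketched roughly as below. This is an illustration only, written in Python with hypothetical names (`parse_bundle_version`, `should_install`). It assumes bundle versions look like `v1.0.0` or `v1.0.0-rc1`, and it reduces the incompatible-changes check, which this guide does not specify, to a boolean supplied by the caller.

```python
import re

# Matches versions such as "v1.0.0", "1.0.0" or "v1.0.0-rc1".
_SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$")


def parse_bundle_version(text):
    """Parse a bundle version string into a tuple that sorts in version order.

    Simplification: pre-release identifiers are compared as plain strings,
    which is not full semver precedence (for example "rc10" sorts before "rc2").
    """
    match = _SEMVER.match(text)
    if match is None:
        raise ValueError(f"not a semver bundle version: {text!r}")
    major, minor, patch, prerelease = match.groups()
    # A release sorts after its own pre-releases, so releases get flag 1.
    release_flag = 1 if prerelease is None else 0
    return (int(major), int(minor), int(patch), release_flag, prerelease or "")


def should_install(own_version, installed_versions, has_incompatible_changes=False):
    """Decide whether the implementation should install its bundled CRDs.

    installed_versions: bundle versions read from existing Gateway API CRDs,
    empty if none are installed.
    has_incompatible_changes: hypothetical result of a separate check for
    breaking changes between the versions involved.
    """
    if not installed_versions:
        return True  # nothing installed yet, safe to install
    if has_incompatible_changes:
        return False  # never replace CRDs across a breaking change
    newest = max(parse_bundle_version(v) for v in installed_versions)
    return parse_bundle_version(own_version) > newest
```

Note that the comparison is strict: an implementation carrying the same bundle version as the cluster leaves the existing CRDs alone, which also avoids two implementations repeatedly rewriting identical definitions.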

It should also be noted that many infra and cluster admins manage CRDs using
externally controlled methods that will not be visible to a Gateway
implementation, so if you still proceed with automatic installation, it MUST be
able to be disabled by the installation owner (whether that is the infra or cluster
admin).

Because of all these caveats, we DO NOT recommend doing automatic CRD management
at this time.

#### Externally controlled CRD installation

Because of all of the complexities mentioned in the "Implementation controlled"
section of this document, we recommend that implementations supply documentation
on how to check if CRDs are installed and upgrade versions if required.

Additions to this document with suggested commands are welcome.

### Conformance and Version compatibility

A conformant Gateway API implementation is one that passes the conformance tests
that are included in each Gateway API bundle version release.

An implementation MUST pass the conformance suite with _no_ skipped tests to be
conformant. Tests may be skipped during development, but a version you want to
be conformant MUST have no skipped tests.

Extended features may, as per the contract for Extended status, be disabled.

Gateway API conformance is version-specific. An implementation that passes
conformance for version N may not pass conformance for version N+1 without changes.

Implementations SHOULD submit a report from the conformance testing suite back
to the Gateway API GitHub repo containing details of their testing.

The conformance suite output includes the Gateway API version supported.

#### Version compatibility

Once v1.0 is released, implementations supporting Gateway and GatewayClass
MUST set a new Condition, `SupportedVersion`, with `status: true` meaning
that the installed CRD version is supported, and `status: false` meaning that it
is not.

### Standard Status fields and Conditions

Gateway API has many resources, but when designing this, we've worked to keep
the status experience as consistent as possible across objects, using the
Condition type and the `status.conditions` field.

Most resources have a `status.conditions` field, but some also have a namespaced
field that _contains_ a `conditions` field.

For the latter, Gateway's `status.listeners` and the Route `status.parents`
fields are examples where each item in the slice identifies the Conditions
associated with some subset of configuration.

For the Gateway case, it's to allow Conditions per _Listener_, and in the Route
case, it's to allow Conditions per _implementation_ (since Route objects can
be used in multiple Gateways, and those Gateways can be reconciled by different
implementations).

In all of these cases, there are some relatively common Condition types that have
similar meanings:

- `Accepted` - the resource or part thereof contains acceptable config that will
  produce some configuration in the underlying data plane that the implementation
  controls. This does not mean that the _whole_ configuration is valid, just that
  _enough_ is valid to produce some effect.
- `Programmed` - this represents a later phase of operation, after `Accepted`,
  when the resource or part thereof has been Accepted and programmed into the
  underlying dataplane. Users should expect the configuration to be ready for
  traffic to flow _at some point in the near future_.
  This Condition does _not_
  say that the dataplane is ready _when it's set_, just that everything is valid
  and it _will become ready soon_. "Soon" may have different meanings depending
  on the implementation.
- `ResolvedRefs` - this Condition indicates that all references in the resource
  or part thereof were valid and pointed to an object that both exists and allows
  that reference. If this Condition is set to `status: false`, then _at least one_
  reference in the resource or part thereof is invalid for some reason, and the
  `message` field should indicate which ones are invalid.

Implementers should check the godoc for each type to see the exact details of
these Conditions on each resource or part thereof.

Additionally, the upstream `Condition` struct contains an optional
`observedGeneration` field - implementations MUST use this field and set it to
the `metadata.generation` field of the object at the time the status is generated.
This allows users of the API to determine if the status is relevant to the current
version of the object.


### Resource details

For each currently available conformance profile, there is a set of resources
that implementations are expected to reconcile.

The following section goes through each Gateway API object and indicates expected
behaviors.

#### GatewayClass

GatewayClass has one main `spec` field - `controllerName`. Each implementation
is expected to claim a domain-prefixed string value (like
`example.com/example-ingress`) as its `controllerName`.

Implementations MUST watch _all_ GatewayClasses, and reconcile GatewayClasses
that have a matching `controllerName`.
The implementation must choose at least
one compatible GatewayClass out of the set of GatewayClasses that have a matching
`controllerName`, and indicate that it accepts processing of that GatewayClass
by setting an `Accepted` Condition to `status: true` in each. Any GatewayClasses
that have a matching `controllerName` but are _not_ Accepted must have the
`Accepted` Condition set to `status: false`.

Implementations MAY choose only one GatewayClass out of the pool of otherwise
acceptable GatewayClasses if they can only reconcile one, or, if they are capable
of reconciling multiple GatewayClasses, they may also choose as many as they like.

If something in the GatewayClass renders it incompatible (at the time of writing,
the only possible reason for this is that there is a pointer to a `parametersRef`
object that is not supported by the implementation), then the implementation
SHOULD mark the incompatible GatewayClass as not `Accepted`.

#### Gateway

Gateway objects MUST refer in the `spec.gatewayClassName` field to a GatewayClass
that exists and is `Accepted` by an implementation for that implementation to
reconcile them.

Gateway objects that fall out of scope for reconciliation (for example, because
the GatewayClass they reference was deleted) MAY have their status removed by
the implementation as part of the delete process, but this is not required.

#### General Route information

All Route objects share some properties:

- They MUST be attached to an in-scope parent for the implementation to consider
  them reconcilable.
- The implementation MUST update the status for each in-scope Route with the
  relevant Conditions, using the namespaced `parents` field. See the specific Route
  types for details, but this usually includes `Accepted`, `Programmed` and
  `ResolvedRefs` Conditions.
- Routes that fall out of scope SHOULD NOT have status updated, since it's possible
  that these updates may overwrite any new owners. The `observedGeneration` field
  will indicate that any remaining status is out of date.


#### HTTPRoute

HTTPRoutes route HTTP traffic that is _unencrypted_ and available for inspection.
This includes HTTPS traffic that's terminated at the Gateway (since that is then
decrypted), and allows the HTTPRoute to use HTTP properties, like path, method,
or headers in its routing directives.

#### TLSRoute

TLSRoutes route encrypted TLS traffic using the SNI header, _without decrypting
the traffic stream_, to the relevant backends.

#### TCPRoute

TCPRoutes route a TCP stream that arrives at a Listener to one of the given
backends.

#### UDPRoute

UDPRoutes route UDP packets that arrive at a Listener to one of the given
backends.

#### ReferenceGrant

ReferenceGrant is a special resource that is used by resource owners in one
namespace to _selectively_ allow references from Gateway API objects in other
namespaces.

A ReferenceGrant is created in the same namespace as the thing it's granting
reference access to, and allows access from other namespaces, from other Kinds,
or both.

Implementations that support cross-namespace references MUST watch ReferenceGrant
and reconcile any ReferenceGrant that points to an object that's referred to by
an in-scope Gateway API object.
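
As an illustration of how a ReferenceGrant might be evaluated, the sketch below checks whether a cross-namespace reference is permitted. It is written in Python over plain dicts and is not taken from any implementation. It assumes the grant's `spec.from` entries carry `group`, `kind` and `namespace`, that `spec.to` entries carry `group`, `kind` and an optional `name`, and that an empty group means the core API group. The function name `reference_permitted` is hypothetical.

```python
def reference_permitted(grants, source, target):
    """Return True if `source` may reference `target`.

    grants: ReferenceGrant objects as plain dicts.
    source: dict with "group", "kind" and "namespace" of the referring object.
    target: dict with "group", "kind", "namespace" and "name" of the
            referenced object.
    """
    # References within a single namespace do not need a grant.
    if source["namespace"] == target["namespace"]:
        return True

    for grant in grants:
        # A grant only applies to objects in its own namespace.
        if grant["metadata"]["namespace"] != target["namespace"]:
            continue
        spec = grant.get("spec", {})
        from_ok = any(
            entry.get("group", "") == source["group"]
            and entry["kind"] == source["kind"]
            and entry["namespace"] == source["namespace"]
            for entry in spec.get("from", [])
        )
        to_ok = any(
            entry.get("group", "") == target["group"]
            and entry["kind"] == target["kind"]
            # An omitted or empty name allows every object of that kind.
            and entry.get("name") in (None, "", target["name"])
            for entry in spec.get("to", [])
        )
        if from_ok and to_ok:
            return True
    return False
```

A result of `False` here is the kind of case in which the `ResolvedRefs` Condition would be set to `status: false` on the referring object.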