sigs.k8s.io/gateway-api@v1.0.0/site-src/reference/implementers-guide.md

     1  # Gateway API Implementer's Guide
     2  
     3  Everything you wanted to know about building a Gateway API implementation
     4  but were too afraid to ask.
     5  
     6  This document is a place to collect tips and tricks for _writing a Gateway API
     7  implementation_ that have no straightforward place within the godoc fields of the
     8  underlying types.
     9  
    10  It's also intended to be a place to write down some guidelines to
    11  help implementers of this API avoid common mistakes.
    12  
    13  It may not be very relevant if you are intending to _use_ this API as an end
    14  user as opposed to _building_ something that uses it.
    15  
    16  This is a living document; if you see something missing, PRs are welcome!
    17  
    18  ## Important things to remember about Gateway API
    19  
    20  Hopefully most of these are not surprising, but they sometimes have non-obvious
    21  implications that we'll try and lay out here.
    22  
    23  ### Gateway API is a `kubernetes.io` API
    24  
    25  Gateway API uses the `gateway.networking.k8s.io` API group. This means that
    26  each time a release happens, the APIs have been reviewed by upstream
    27  Kubernetes reviewers, just like the APIs delivered in the core Kubernetes
    28  binaries.
    29  
    30  ### Gateway API is delivered using CRDs
    31  
    32  Gateway API is supplied as a set of CRDs, version controlled using our [versioning
    33  policy][versioning].
    34  
    35  The most important part of that versioning policy is that what _appears to be_
    36  the same object (that is, it has the same `group`, `version`, and `kind`) may have
    37  a slightly different schema. We make changes in ways that are _compatible_, so
    38  things should generally "just work", but there are some actions implementations
    39  need to take to make this more reliable; these are detailed below.
    40  
    41  The CRD-based delivery also means that if an implementation tries to use (that
    42  is, get, list, watch, etc.) Gateway API objects when the CRDs have _not_ been
    43  installed, then it's likely that your Kubernetes client code will return serious
    44  errors. Tips to deal with this are also detailed below.
    45  
    46  The CRD definitions for Gateway API objects all contain two specific
    47  annotations:
    48  
    49  - `gateway.networking.k8s.io/bundle-version: <semver-release-version>`
    50  - `gateway.networking.k8s.io/channel: <channel-name>`
    51  
    52  The concepts of "bundle version" and "channel" (short for "release channel") are
    53  explained in our [versioning][versioning] documentation.
    54  
    55  Implementations may use these to determine what schema versions are installed in
    56  the cluster, if any.
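
As a minimal sketch of reading these annotations, the Go function below takes the `metadata.annotations` map of a CRD and returns the two values. The `crdInfo` type and `readCRDInfo` function are hypothetical names introduced here for illustration; fetching the CRD from the cluster is left out.

```go
package main

import "errors"

// Annotation keys that the Gateway API CRDs carry, as listed above.
const (
	bundleVersionAnnotation = "gateway.networking.k8s.io/bundle-version"
	channelAnnotation       = "gateway.networking.k8s.io/channel"
)

// crdInfo is a hypothetical holder for the two values read from a CRD.
type crdInfo struct {
	BundleVersion string
	Channel       string
}

// readCRDInfo extracts the bundle version and channel from a CRD's
// metadata.annotations map. It returns an error if either is missing,
// which usually means the object is not a Gateway API CRD.
func readCRDInfo(annotations map[string]string) (crdInfo, error) {
	version, ok := annotations[bundleVersionAnnotation]
	if !ok || version == "" {
		return crdInfo{}, errors.New("missing bundle-version annotation")
	}
	channel, ok := annotations[channelAnnotation]
	if !ok || channel == "" {
		return crdInfo{}, errors.New("missing channel annotation")
	}
	return crdInfo{BundleVersion: version, Channel: channel}, nil
}
```

An implementation could call this once per Gateway API CRD at startup and log or compare the results before deciding whether to proceed.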
    57  
    58  [versioning]: /concepts/versioning
    59  
    60  ### Changes to the Gateway API CRDs are backwards compatible
    61  
    62  Part of the contract for Gateway API CRDs is that changes _within an API version_
    63  must be _compatible_.
    64  
    65  "Within an API Version" means changes to a CRD that occur while the same API version
    66  (`v1alpha2` or `v1` for example) is in use, and "compatible" means that any new
    67  fields, values, or validation will be added in a way that ensures _previous_
    68  objects _will still be valid objects_ after the change.
    69  
    70  This means that once Gateway API objects move to the `v1` API version, then _all_
    71  changes must be compatible.
    72  
    73  This contract also means that an implementation will not fail when run against a
    74  higher version of the API than the one it was written with, because objects stored
    75  by Kubernetes using the newer schema can always be deserialized into the older
    76  version used in code by the implementation.
    77  
    78  Similarly, if an implementation was written with a _higher_ version, the newer
    79  values that it understands will simply _never be used_, as they are not present
    80  in the older version.
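
The first case can be illustrated with a small sketch. It uses a toy `oldSpec` type (not a real Gateway API type) and Go's standard `encoding/json` package, which ignores unknown fields by default, to show how an object written under a newer schema still decodes into an older Go type.

```go
package main

import "encoding/json"

// oldSpec stands in for a type compiled into an implementation that was
// built against an older bundle version. It is a toy type, not a real
// Gateway API type.
type oldSpec struct {
	Hostname string `json:"hostname"`
	Port     int    `json:"port"`
}

// decodeOld decodes an object as stored by the API server into the older
// Go type. encoding/json ignores fields it does not know about, so a field
// added in a newer schema is dropped rather than causing an error.
func decodeOld(stored []byte) (oldSpec, error) {
	var spec oldSpec
	err := json.Unmarshal(stored, &spec)
	return spec, err
}
```

Note that this relies on lenient decoding. A decoder configured with `DisallowUnknownFields` would reject the newer object, so strict decoding is best avoided for this purpose.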
    81  
    82  ## Implementation Rules and Guidelines
    83  
    84  ### CRD Management
    85  
    86  For a Gateway API implementation to work, the Gateway API CRDs must be installed
    87  in the Kubernetes cluster the implementation is watching.
    88  
    89  Implementations have two possible options: installing CRDs themselves (implementation
    90  controlled) or requiring installation by some other mechanism before working 
    91  (externally controlled). Both have tradeoffs, but implementation controlled has
    92  significantly more, and so we DO NOT recommend using implementation controlled
    93  methods at this time.
    94  
    95  Whichever option is chosen, certain things SHOULD be true:
    96  
    97  Whatever method is used, infra and cluster admins SHOULD attempt to ensure that
    98  the Bundle version of the CRDs is not _downgraded_. Although we ensure that
    99  API changes are backwards compatible, changing CRD definitions can change the
   100  storage version of the resource, which could have unforeseen effects. Most of the
   101  time, things will probably work, but if they don't, they will most likely
   102  break in weird ways.
   103  
   104  Additionally, older versions of the API may be missing fields or features, which
   105  could be very disruptive for users.
   106  
   107  Try your best to ensure that the bundle version doesn't roll backwards. It's safer.
   108  
   109  Implementations SHOULD also handle the Gateway API CRDs _not_ being present in
   110  the cluster without crashing or panicking. Exiting with a clear fatal error is
   111  acceptable in this case, as is disabling Gateway API support even if enabled in
   112  configuration.
   113  
   114  In practice, implementations using `controller-runtime` or similar tooling
   115  may need to check for the _presence_ of the CRDs by getting the list of
   116  installed CRDs before attempting to watch those resources. (Note that this
   117  requires the implementation to have read access, that is the `get` and `list`
   118  verbs, on CustomResourceDefinition resources.)
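
A minimal sketch of the presence check follows. It assumes the implementation has already obtained the names of the installed CRDs by whatever client it uses; the `requiredCRDs` list and `missingCRDs` helper are hypothetical and only show the comparison step.

```go
package main

import "sort"

// requiredCRDs lists the CRD names a hypothetical implementation needs.
// CRD names take the form <plural>.<group>.
var requiredCRDs = []string{
	"gatewayclasses.gateway.networking.k8s.io",
	"gateways.gateway.networking.k8s.io",
	"httproutes.gateway.networking.k8s.io",
}

// missingCRDs returns the required CRD names that do not appear in the
// installed list, sorted for stable log output. An empty result means it
// is safe to start watches.
func missingCRDs(installed, required []string) []string {
	present := make(map[string]bool, len(installed))
	for _, name := range installed {
		present[name] = true
	}
	var missing []string
	for _, name := range required {
		if !present[name] {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing
}
```

If the result is non-empty, the implementation can exit with a clear fatal error naming the missing CRDs, or disable Gateway API support, as described above.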
   119  
   120  #### Implementation-controlled CRD installation
   121  
   122  Implementation-controlled CRD installation also includes automatic installation
   123  mechanisms such as Helm, if the CRDs are included in a Helm chart with the
   124  implementation's installation.
   125  
   126  Because of significant caveats we DO NOT recommend doing implementation-controlled
   127  CRD management at this time.
   128  
   129  However, if you really must, CRD definitions MAY be installed by implementations,
   130  but if they do, they MUST have a way to ensure:
   131  
   132  - there are no other Gateway API CRDs installed in the cluster before starting, or
   133  - that the CRD definitions are only installed if they are a higher bundle version
   134    than any existing Gateway API CRDs. Note that even this may not be safe if there
   135    are breaking changes in the experimental channel resources, so implementations
   136    should be _very_ careful with doing this.
   137  
   138  This avoids problems if another implementation is also installed in the cluster
   139  and expects a higher version of the CRDs to be installed.
   140  
   141  The worst outcome here would be two implementations trying to do automatic install
   142  of _different_ CRD versions, resulting in the CRD versions flapping between
   143  versions or channels. This would _not_ produce good outcomes.
   144  
   145  The safer method for an automatic installation would require the implementation
   146  to:
   147  
   148  - Check if there are any Gateway API CRDs installed in the cluster.
   149  - If not, install its most compatible version of the CRDs.
   150  - If so, only install its version of the CRDs if the bundle version is higher
   151    than the existing one, and the mechanism will also need to check if there are
   152    incompatible changes included in any versions as well.
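
The decision procedure above could be sketched roughly as follows. This is an illustration under stated assumptions, not a recommended implementation: `parseBundleVersion` and `shouldInstall` are hypothetical helpers, only plain `vMAJOR.MINOR.PATCH` versions are handled, and the check for incompatible changes is a caller-supplied function because this document does not define how to perform it.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseBundleVersion parses "vMAJOR.MINOR.PATCH" into three integers.
// Pre-release suffixes such as "-rc1" are rejected to keep the sketch small.
func parseBundleVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("invalid bundle version %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return out, fmt.Errorf("invalid bundle version %q", v)
		}
		out[i] = n
	}
	return out, nil
}

// shouldInstall reports whether the implementation may install its own CRDs.
// existing is the bundle version found in the cluster, or "" if no Gateway
// API CRDs are installed. incompatible is a caller-supplied (hypothetical)
// check for breaking changes between the two versions.
func shouldInstall(existing, ours string, incompatible func(from, to string) bool) (bool, error) {
	if existing == "" {
		return true, nil // nothing installed, safe to install
	}
	have, err := parseBundleVersion(existing)
	if err != nil {
		return false, err
	}
	want, err := parseBundleVersion(ours)
	if err != nil {
		return false, err
	}
	for i := range have {
		if want[i] != have[i] {
			higher := want[i] > have[i]
			return higher && !incompatible(existing, ours), nil
		}
	}
	return false, nil // same version, nothing to do
}
```

Even with a check like this, two implementations racing to install different versions can still conflict, which is part of why the approach is hard to get right.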
   153  
   154  This is going to be _very_ difficult to pull off in practice.
   155  
   156  It should also be noted that many infra and cluster admins manage CRDs using
   157  externally controlled methods that will not be visible to a Gateway
   158  implementation, so if you still proceed with automatic installation, it MUST be
   159  able to be disabled by the installation owner (whether that is the infra or cluster
   160  admin).
   161  
   162  Because of all these caveats, we DO NOT recommend doing automatic CRD management
   163  at this time.
   164  
   165  #### Externally controlled CRD installation
   166  
   167  Because of all of the complexities mentioned in the "Implementation controlled"
   168  section of this document, we recommend that implementations supply documentation
   169  on how to check if CRDs are installed and upgrade versions if required.
   170  
   171  Additions to this document to add suggested commands here are welcomed.
   172  
   173  ### Conformance and Version compatibility
   174  
   175  A conformant Gateway API implementation is one that passes the conformance tests
   176  that are included in each Gateway API bundle version release.
   177  
   178  An implementation MUST pass the conformance suite with _no_ skipped tests to be
   179  conformant. Tests may be skipped during development, but a version you want to
   180  be conformant MUST have no skipped tests.
   181  
   182  Extended features may, as per the contract for Extended status, be disabled.
   183  
   184  Gateway API conformance is version-specific. An implementation that passes
   185  conformance for version N may not pass conformance for version N+1 without changes.
   186  
   187  Implementations SHOULD submit a report from the conformance testing suite back
   188  to the Gateway API GitHub repo containing details of their testing.
   189  
   190  The conformance suite output includes the Gateway API version supported.
   191  
   192  #### Version compatibility
   193  
   194  Once v1.0 is released, implementations supporting Gateway and GatewayClass
   195  MUST set a new Condition, `SupportedVersion`, with `status: true` meaning
   196  that the installed CRD version is supported, and `status: false` meaning that it
   197  is not.
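
A rough sketch of building this Condition is shown below. The `condition` struct is a local stand-in for the Kubernetes Condition type, the list of supported bundle versions is an assumption of the sketch, and the `Reason` and `Message` strings are illustrative; check the godoc for the exact values expected on each resource.

```go
package main

// condition mirrors the commonly used fields of a Kubernetes status
// Condition. It is a local stand-in so the sketch has no dependencies.
type condition struct {
	Type               string
	Status             string // "True", "False", or "Unknown"
	Reason             string
	Message            string
	ObservedGeneration int64
}

// supportedVersionCondition builds the SupportedVersion Condition.
// supported is the set of bundle versions the implementation has been
// tested against (an assumption of this sketch), and installed is the
// bundle version read from the CRDs in the cluster.
func supportedVersionCondition(installed string, supported []string, generation int64) condition {
	c := condition{
		Type:               "SupportedVersion",
		Status:             "False",
		Reason:             "UnsupportedVersion", // illustrative reason string
		Message:            "installed Gateway API bundle version " + installed + " is not supported",
		ObservedGeneration: generation,
	}
	for _, v := range supported {
		if v == installed {
			c.Status = "True"
			c.Reason = "SupportedVersion" // illustrative reason string
			c.Message = "installed Gateway API bundle version " + installed + " is supported"
			break
		}
	}
	return c
}
```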
   198  
   199  ### Standard Status fields and Conditions
   200  
   201  Gateway API has many resources, but when designing this, we've worked to keep
   202  the status experience as consistent as possible across objects, using the
   203  Condition type and the `status.conditions` field.
   204  
   205  Most resources have a `status.conditions` field, but some also have a namespaced
   206  field that _contains_ a `conditions` field.
   207  
   208  For the latter, Gateway's `status.listeners` and the Route `status.parents`
   209  fields are examples where each item in the slice identifies the Conditions
   210  associated with some subset of configuration.
   211  
   212  For the Gateway case, it's to allow Conditions per _Listener_, and in the Route
   213  case, it's to allow Conditions per _implementation_ (since Route objects can
   214  be used in multiple Gateways, and those Gateways can be reconciled by different
   215  implementations).
   216  
   217  In all of these cases, there are some relatively-common Condition types that have
   218  similar meanings:
   219  
   220  - `Accepted` - the resource or part thereof contains acceptable config that will
   221  produce some configuration in the underlying data plane that the implementation
   222  controls. This does not mean that the _whole_ configuration is valid, just that
   223  _enough_ is valid to produce some effect.
   224  - `Programmed` - this represents a later phase of operation, after `Accepted`,
   225  when the resource or part thereof has been Accepted and programmed into the
   226  underlying dataplane. Users should expect the configuration to be ready for
   227  traffic to flow _at some point in the near future_. This Condition does _not_
   228  say that the dataplane is ready _when it's set_, just that everything is valid
   229  and it _will become ready soon_. "Soon" may have different meanings depending
   230  on the implementation.
   231  - `ResolvedRefs` - this Condition indicates that all references in the resource
   232  or part thereof were valid and pointed to an object that both exists and allows
   233  that reference. If this Condition is set to `status: false`, then _at least one_
   234  reference in the resource or part thereof is invalid for some reason, and the
   235  `message` field should indicate which ones are invalid.
   236  
   237  Implementers should check the godoc for each type to see the exact details of
   238  these Conditions on each resource or part thereof.
   239  
   240  Additionally, the upstream `Condition` struct contains an optional
   241  `observedGeneration` field - implementations MUST use this field and set it to
   242  the `metadata.generation` field of the object at the time the status is generated.
   243  This allows users of the API to determine if the status is relevant to the current
   244  version of the object.
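
To make the `observedGeneration` requirement concrete, here is a minimal sketch using a local `condition` struct as a stand-in for the upstream type. The `setCondition` and `isStale` helpers are hypothetical names: the first stamps each Condition with the generation the status was computed from, and the second shows how a consumer could detect out-of-date status.

```go
package main

// condition is a local stand-in for the upstream Condition type, reduced
// to the fields this sketch needs.
type condition struct {
	Type               string
	Status             string // "True", "False", or "Unknown"
	ObservedGeneration int64
}

// setCondition inserts or replaces the Condition with the same Type and
// stamps it with the generation of the object the status was computed from.
func setCondition(conds []condition, c condition, generation int64) []condition {
	c.ObservedGeneration = generation
	for i := range conds {
		if conds[i].Type == c.Type {
			conds[i] = c
			return conds
		}
	}
	return append(conds, c)
}

// isStale reports whether a Condition was computed from an older version
// of the object than the one currently stored.
func isStale(c condition, currentGeneration int64) bool {
	return c.ObservedGeneration < currentGeneration
}
```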
   245  
   246  
   247  ### Resource details
   248  
   249  For each currently available conformance profile, there is a set of resources
   250  that implementations are expected to reconcile.
   251  
   252  The following section goes through each Gateway API object and indicates expected
   253  behaviors.
   254  
   255  #### GatewayClass
   256  
   257  GatewayClass has one main `spec` field - `controllerName`. Each implementation
   258  is expected to claim a domain-prefixed string value (like
   259  `example.com/example-ingress`) as its `controllerName`.
   260  
   261  Implementations MUST watch _all_ GatewayClasses, and reconcile GatewayClasses
   262  that have a matching `controllerName`. The implementation must choose at least
   263  one compatible GatewayClass out of the set of GatewayClasses that have a matching
   264  `controllerName`, and indicate that it accepts processing of that GatewayClass
   265  by setting an `Accepted` Condition to `status: true` in each. Any GatewayClasses
   266  that have a matching `controllerName` but are _not_ Accepted must have the
   267  `Accepted` Condition set to `status: false`.
   268  
   269  Implementations MAY choose only one GatewayClass out of the pool of otherwise
   270  acceptable GatewayClasses if they can only reconcile one, or, if they are capable
   271  of reconciling multiple GatewayClasses, they may also choose as many as they like.
   272  
   273  If something in the GatewayClass renders it incompatible (at the time of writing,
   274  the only possible reason for this is that `parametersRef` points to an object
   275  that is not supported by the implementation), then the implementation
   276  SHOULD mark the incompatible GatewayClass as not `Accepted`.
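
The selection logic described in this section might be sketched as below. The `gatewayClass` struct is a pared-down stand-in for the real resource, `ourController` is a hypothetical `controllerName`, and the set of supported parameter kinds is an assumption of the sketch. It accepts every compatible class, which is only one of the choices the text above permits.

```go
package main

// gatewayClass is a pared-down stand-in for the GatewayClass resource.
type gatewayClass struct {
	Name           string
	ControllerName string
	ParamsRefKind  string // empty when the class references no parameters object
}

// ourController is a hypothetical controllerName claimed by this sketch.
const ourController = "example.com/example-ingress"

// acceptedStatuses returns, for every GatewayClass that names our
// controller, whether the Accepted Condition should be true. Classes owned
// by other controllers are left out of the result entirely, so their
// status is never touched.
func acceptedStatuses(classes []gatewayClass, supportedParamsKinds map[string]bool) map[string]bool {
	result := make(map[string]bool)
	for _, gc := range classes {
		if gc.ControllerName != ourController {
			continue
		}
		result[gc.Name] = gc.ParamsRefKind == "" || supportedParamsKinds[gc.ParamsRefKind]
	}
	return result
}
```

An implementation that can only reconcile a single GatewayClass would add its own tie-breaking rule on top of this and mark the remaining matching classes as not `Accepted`.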
   277  
   278  #### Gateway
   279  
   280  Gateway objects MUST refer in the `spec.gatewayClassName` field to a GatewayClass
   281  that exists and is `Accepted` by an implementation for that implementation to
   282  reconcile them.
   283  
   284  Gateway objects that fall out of scope (for example, because the GatewayClass
   285  they reference was deleted) for reconciliation MAY have their status removed by
   286  the implementation as part of the delete process, but this is not required.
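
As a small illustration of the scoping rule, the sketch below filters Gateways down to those whose `spec.gatewayClassName` names a GatewayClass the implementation has accepted. The `gateway` struct and `inScope` helper are hypothetical stand-ins, and the `accepted` map is assumed to come from the implementation's own GatewayClass reconciliation.

```go
package main

// gateway is a pared-down stand-in for the Gateway resource.
type gateway struct {
	Name             string
	GatewayClassName string
}

// inScope returns the Gateways whose spec.gatewayClassName refers to a
// GatewayClass that this implementation has marked Accepted. accepted maps
// GatewayClass name to its Accepted status; a class that is absent from
// the map (deleted, or owned by another controller) is treated as not
// accepted.
func inScope(gateways []gateway, accepted map[string]bool) []gateway {
	var out []gateway
	for _, gw := range gateways {
		if accepted[gw.GatewayClassName] {
			out = append(out, gw)
		}
	}
	return out
}
```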
   287  
   288  #### General Route information
   289  
   290  All Route objects share some properties:
   291  
   292  - They MUST be attached to an in-scope parent for the implementation to consider
   293  them reconcilable.
   294  - The implementation MUST update the status for each in-scope Route with the
   295  relevant Conditions, using the namespaced `parents` field. See the specific Route
   296  types for details, but this usually includes `Accepted`, `Programmed` and
   297  `ResolvedRefs` Conditions.
   298  - Routes that fall out of scope SHOULD NOT have status updated, since it's possible
   299  that these updates may overwrite status written by any new owners. The
   300  `observedGeneration` field will indicate that any remaining status is out of date.
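
One way to respect these rules when writing Route status is to touch only the `parents` entry that belongs to your own controller. The sketch below assumes each entry records the parent it refers to and the controller that wrote it; `routeParentStatus` is a pared-down stand-in and `upsertParentStatus` is a hypothetical helper.

```go
package main

// routeParentStatus is a pared-down stand-in for one entry in a Route's
// status.parents list.
type routeParentStatus struct {
	ParentName     string
	ControllerName string
	Accepted       bool
}

// upsertParentStatus writes our entry for the given parent and leaves
// entries written by other controllers untouched.
func upsertParentStatus(parents []routeParentStatus, ours routeParentStatus) []routeParentStatus {
	for i := range parents {
		if parents[i].ControllerName == ours.ControllerName && parents[i].ParentName == ours.ParentName {
			parents[i] = ours
			return parents
		}
	}
	return append(parents, ours)
}
```

Because entries from other controllers are never rewritten, a Route attached to Gateways managed by several implementations keeps each implementation's status intact.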
   301  
   302  
   303  #### HTTPRoute
   304  
   305  HTTPRoutes route HTTP traffic that is _unencrypted_ and available for inspection.
   306  This includes HTTPS traffic that's terminated at the Gateway (since that is then
   307  decrypted), and allows the HTTPRoute to use HTTP properties, like path, method,
   308  or headers in its routing directives.
   309  
   310  #### TLSRoute
   311  
   312  TLSRoutes route encrypted TLS traffic using the SNI header, _without decrypting
   313  the traffic stream_, to the relevant backends.
   314  
   315  #### TCPRoute
   316  
   317  TCPRoutes route a TCP stream that arrives at a Listener to one of the given
   318  backends.
   319  
   320  #### UDPRoute
   321  
   322  UDPRoutes route UDP packets that arrive at a Listener to one of the given
   323  backends.
   324  
   325  #### ReferenceGrant
   326  
   327  ReferenceGrant is a special resource that is used by resource owners in one
   328  namespace to _selectively_ allow references from Gateway API objects in other
   329  namespaces.
   330  
   331  A ReferenceGrant is created in the same namespace as the thing it's granting
   332  reference access to, and allows access from other namespaces, from other Kinds,
   333  or both.
   334  
   335  Implementations that support cross-namespace references MUST watch ReferenceGrant
   336  and reconcile any ReferenceGrant that points to an object that's referred to by
   337  an in-scope Gateway API object.
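
A simplified sketch of evaluating a reference against ReferenceGrants is shown below. All types here are local stand-ins introduced for illustration, and the matching rules (a grant in the target's namespace, with one matching "from" entry and one matching "to" entry, where an empty target name matches any name) are this sketch's reading of the resource; check the godoc for the authoritative semantics.

```go
package main

// grantFrom and grantTo mirror the shape of entries in a ReferenceGrant's
// from and to lists. They are local stand-ins for this sketch.
type grantFrom struct{ Group, Kind, Namespace string }

// An empty Name in grantTo matches any object of the given group and kind.
type grantTo struct{ Group, Kind, Name string }

// referenceGrant is a pared-down ReferenceGrant. Namespace is where the
// grant lives, which is also the namespace of the objects it exposes.
type referenceGrant struct {
	Namespace string
	From      []grantFrom
	To        []grantTo
}

// reference describes one reference from a Gateway API object to a target.
type reference struct {
	FromGroup, FromKind, FromNamespace   string
	ToGroup, ToKind, ToNamespace, ToName string
}

// allowed reports whether ref is permitted. Same-namespace references need
// no grant; cross-namespace references need a grant in the target namespace
// with a matching from entry and a matching to entry.
func allowed(ref reference, grants []referenceGrant) bool {
	if ref.FromNamespace == ref.ToNamespace {
		return true
	}
	for _, g := range grants {
		if g.Namespace != ref.ToNamespace {
			continue
		}
		fromOK, toOK := false, false
		for _, f := range g.From {
			if f.Group == ref.FromGroup && f.Kind == ref.FromKind && f.Namespace == ref.FromNamespace {
				fromOK = true
			}
		}
		for _, t := range g.To {
			if t.Group == ref.ToGroup && t.Kind == ref.ToKind && (t.Name == "" || t.Name == ref.ToName) {
				toOK = true
			}
		}
		if fromOK && toOK {
			return true
		}
	}
	return false
}
```

When a reference is not allowed, the implementation would typically report that through the `ResolvedRefs` Condition described earlier.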