---
title: "Run a TFJob"
date: 2023-08-23
weight: 6
description: >
  Run a Kueue scheduled TFJob
---

This page shows how to use Kueue's scheduling and resource management capabilities when running [Training Operator](https://www.kubeflow.org/docs/components/training/tftraining/) TFJobs.

This guide is for [batch users](/docs/tasks#batch-user) who have a basic understanding of Kueue. For more information, see [Kueue's overview](/docs/overview).

## Before you begin

Check [administer cluster quotas](/docs/tasks/administer_cluster_quotas) for details on the initial cluster setup.

Check [the Training Operator installation guide](https://github.com/kubeflow/training-operator#installation).

The minimum required Training Operator version is v1.7.0.

You can [modify the Kueue configuration of an installed release](/docs/installation#install-a-custom-configured-released-version) to include TFJobs as an allowed workload.

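As a rough sketch of what that configuration change might look like, the fragment below lists `kubeflow.org/tfjob` under `integrations.frameworks` in the Kueue controller manager configuration. The other framework entry is an assumption for illustration; compare the fragment against the configuration shipped with your release before applying it.

```yaml
# Fragment of the Kueue controller manager configuration (illustrative sketch).
apiVersion: config.kueue.x-k8s.io/v1beta1
kind: Configuration
integrations:
  frameworks:
    - "batch/job"           # plain Kubernetes Jobs (assumed to be already enabled)
    - "kubeflow.org/tfjob"  # lets Kueue manage TFJobs
```

If your installed configuration already lists `kubeflow.org/tfjob`, no change should be needed.
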
## TFJob definition

### a. Queue selection

The target [local queue](/docs/concepts/local_queue) should be specified in the `metadata.labels` section of the TFJob configuration.

```yaml
metadata:
  labels:
    kueue.x-k8s.io/queue-name: user-queue
```

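To show where the label sits in a full manifest, here is a minimal skeleton. The job name `tfjob-example` and the `default` namespace are hypothetical, and `user-queue` is assumed to be a LocalQueue that already exists in the same namespace as the TFJob. The replica specs are left out.

```yaml
apiVersion: kubeflow.org/v1
kind: TFJob
metadata:
  name: tfjob-example    # hypothetical name
  namespace: default     # assumed namespace; the LocalQueue must live here too
  labels:
    kueue.x-k8s.io/queue-name: user-queue
spec:
  # tfReplicaSpecs (PS, Worker, ...) go here; omitted in this sketch.
```

A LocalQueue is a namespaced object, so the label only resolves when the queue and the TFJob share a namespace.
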
### b. Optionally set the `suspend` field in TFJobs

```yaml
spec:
  runPolicy:
    suspend: true
```

By default, Kueue sets `suspend` to `true` through a webhook and unsuspends the TFJob when it is admitted.

## Sample TFJob

This example is based on the Training Operator's [`tf_job_mnist.yaml`](https://github.com/kubeflow/training-operator/blob/48dbbf0a8e90e52c55ec05d0f689fcbf83c6b441/examples/tensorflow/dist-mnist/tf_job_mnist.yaml).

{{< include "examples/jobs/sample-tfjob.yaml" "yaml" >}}