# Developer Guide

Kubeflow Training Operator is currently at v1.

## Requirements

- [Go](https://golang.org/) (1.20 or later)
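
The version requirement above can be checked before building. The following is a minimal sketch, not part of the repository: the helper name `go_version_ok` is hypothetical, and it assumes a release-style version string such as `go1.21.3`.

```sh
# go_version_ok VERSION: succeed if VERSION (e.g. "go1.21.3") is 1.20 or later.
# Hypothetical helper; assumes a release-style string, not a "devel" build.
go_version_ok() {
  v=${1#go}               # strip the leading "go"
  major=${v%%.*}          # text before the first dot
  rest=${v#*.}            # text after the first dot
  minor=${rest%%[!0-9]*}  # leading digits of the remainder
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "${minor:-0}" -ge 20 ]; }
}
```

With Go installed, it could be used as `go_version_ok "$(go env GOVERSION)" || echo "Go 1.20 or later is required"`.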

## Building the operator

Create a symbolic link inside your GOPATH that points to the location where you checked out the code:

```sh
mkdir -p "$(go env GOPATH)/src/github.com/kubeflow"
ln -sf "${GIT_TRAINING}" "$(go env GOPATH)/src/github.com/kubeflow/training-operator"
```

- GIT_TRAINING should be the location where you checked out https://github.com/kubeflow/training-operator
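
The linking step can also be wrapped in a small function that is safe to re-run. This is only a sketch: `link_checkout` is a hypothetical name, and it takes the GOPATH directory as an argument instead of calling `go env GOPATH` itself. It uses `ln -sfn` so that an existing link is replaced rather than followed.

```sh
# link_checkout SRC GOPATH_DIR: symlink the checkout SRC into GOPATH_DIR
# and print the path of the link. Hypothetical helper, not part of the repo.
link_checkout() {
  src=$1
  dest="$2/src/github.com/kubeflow/training-operator"
  [ -d "$src" ] || { echo "not a directory: $src" >&2; return 1; }
  mkdir -p "$(dirname "$dest")"
  # -n: if dest is already a symlink to a directory, replace the link itself
  # instead of creating a new link inside the directory it points to.
  ln -sfn "$src" "$dest"
  echo "$dest"
}
```

For example: `link_checkout "${GIT_TRAINING}" "$(go env GOPATH)"`.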

Install dependencies

```sh
go mod vendor
```

Build it

```sh
go install github.com/kubeflow/training-operator/cmd/training-operator.v1
```

## Running the Operator Locally

Running the operator locally (as opposed to deploying it on a K8s cluster) is convenient for debugging and development.

### Run a Kubernetes cluster

First, you need to run a Kubernetes cluster locally. There are several options, including:

- [local-up-cluster.sh in Kubernetes](https://github.com/kubernetes/kubernetes/blob/master/hack/local-up-cluster.sh)
- [minikube](https://github.com/kubernetes/minikube)

`local-up-cluster.sh` runs a single-node Kubernetes cluster directly on your machine, whereas Minikube runs a single-node Kubernetes cluster inside a VM. Both are compatible with the operator, but the Kubernetes version should be `1.8` or above.

Note: If you use `local-up-cluster.sh`, make sure that kube-dns is up; see [kubernetes/kubernetes#47739](https://github.com/kubernetes/kubernetes/issues/47739) for more details.

### Configure KUBECONFIG and KUBEFLOW_NAMESPACE

The operator can run locally and use the configuration in your kubeconfig to communicate with
a K8s cluster. Set your environment:

```sh
export KUBECONFIG="$HOME/.kube/config"
export KUBEFLOW_NAMESPACE=your_namespace  # replace with your namespace
```

- KUBEFLOW_NAMESPACE is the namespace in which the operator creates its internal resources (e.g. the resource lock) when deployed on Kubernetes. It is optional; the `default` namespace is used if it is not set.
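
A quick check of these two settings before starting the operator might look like the sketch below. The function name `preflight` is hypothetical. It assumes `KUBECONFIG` holds a single file path (not a colon-separated list) and that an empty `KUBEFLOW_NAMESPACE` is treated the same as an unset one.

```sh
# preflight: report the kubeconfig and namespace that would be used.
# Hypothetical helper; assumes KUBECONFIG is a single path.
preflight() {
  cfg=${KUBECONFIG:-$HOME/.kube/config}
  [ -r "$cfg" ] || { echo "kubeconfig not readable: $cfg" >&2; return 1; }
  # Fall back to the "default" namespace when KUBEFLOW_NAMESPACE is unset or empty.
  echo "kubeconfig=$cfg namespace=${KUBEFLOW_NAMESPACE:-default}"
}
```

Running `preflight` prints one line describing the effective settings, or fails if the kubeconfig file cannot be read.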

### Create the TFJob CRD

After the cluster is up, the TFJob CRD should be created on the cluster.

```bash
make install
```

### Run Operator

Now we are ready to run the operator locally:

```sh
make run
```

To verify that the local operator is working, create an example job; you should then see the resources the operator creates for it.

```sh
cd ./examples/v1/dist-mnist
docker build -f Dockerfile -t kubeflow/tf-dist-mnist-test:1.0 .
kubectl create -f ./tf_job_mnist.yaml
```
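
Resources may take a moment to appear after the job is created, so it can help to poll instead of checking once. The following is a generic retry sketch; `wait_until` is a hypothetical helper, not something the repository provides.

```sh
# wait_until TRIES CMD...: run CMD until it succeeds, at most TRIES times
# (at least once), sleeping one second between attempts.
wait_until() {
  tries=$1
  shift
  while :; do
    "$@" && return 0
    tries=$((tries - 1))
    [ "$tries" -gt 0 ] || return 1
    sleep 1
  done
}
```

For example, `wait_until 30 kubectl get tfjob <job-name>`, where `<job-name>` is a placeholder for the `metadata.name` value in `tf_job_mnist.yaml`.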

## Go version

On Ubuntu, the default Go package appears to be gccgo-go, which has problems (see [issue](https://github.com/golang/go/issues/15429)). The golang-go package is also really old, so install Go from the golang tarballs instead.

## Generate Python SDK

To generate the Python SDK for the operator, run:

```
./hack/python-sdk/gen-sdk.sh
```

This command regenerates the API and model files together with the documentation and model tests.
The following files/folders in `sdk/python` are auto-generated and should not be modified directly:

```
sdk/python/docs
sdk/python/kubeflow/training/models
sdk/python/kubeflow/training/*.py
sdk/python/test/*.py
```

The Training Operator client and public APIs are located here:

```
sdk/python/kubeflow/training/api
```
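
The auto-generated paths listed above can be turned into a small check, for instance before committing a manual edit. This is a sketch under assumptions: `is_generated` is a hypothetical helper, it expects paths relative to the repository root, and it reads the two `*.py` entries as covering only files directly inside those directories, so that `sdk/python/kubeflow/training/api` is not matched.

```sh
# is_generated PATH: succeed if PATH (relative to the repository root)
# falls under one of the auto-generated locations. Hypothetical helper.
is_generated() {
  path=$1
  dir=${path%/*}  # directory part of the path
  case "$path" in
    sdk/python/docs/*|sdk/python/kubeflow/training/models/*) return 0 ;;
    *.py)
      # Only .py files directly inside these two directories count.
      case "$dir" in
        sdk/python/kubeflow/training|sdk/python/test) return 0 ;;
      esac ;;
  esac
  return 1
}
```

For example, `is_generated "$file" && echo "$file is generated; re-run gen-sdk.sh instead of editing it"`.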

## Code Style

### Python

- Use [yapf](https://github.com/google/yapf) to format Python code
- `yapf` style is configured in `.style.yapf` file
- To autoformat code

  ```sh
  yapf -i py/**/*.py
  ```

- To sort imports

  ```sh
  isort path/to/module.py
  ```