### Distributed send/recv e2e test for xgboost rabit

This folder contains a Dockerfile and a distributed send/recv test.

**Build Image**

The default image name and tag is `kubeflow/xgboost-dist-rabit-test:1.2`.
You can build the image with a different name or tag to suit your requirements.

```shell
docker build -f Dockerfile -t kubeflow/xgboost-dist-rabit-test:1.2 ./
```

**Start and test XGBoost Rabit tracker**

```shell
kubectl create -f xgboostjob_v1alpha1_rabit_test.yaml
```
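After the job is created, it can be convenient to block until it finishes rather than polling by hand. The sketch below wraps `kubectl wait`, assuming the XGBoostJob publishes a `Succeeded` condition in `status.conditions` (as in the sample output in this README). The function name `wait_for_xgboostjob` and the 300-second default timeout are illustrative choices, not part of this example.

```shell
# Hypothetical helper: block until an XGBoostJob reports condition Succeeded=True.
# Assumes kubectl is on PATH and configured for the target cluster.
wait_for_xgboostjob() {
  job_name="$1"
  timeout="${2:-300s}"   # illustrative default; adjust as needed
  kubectl wait --for=condition=Succeeded \
    --timeout="$timeout" \
    "XGBoostJob/${job_name}"
}
```

For instance, `wait_for_xgboostjob xgboost-dist-test 120s` would wait up to two minutes. If the job fails, the `Succeeded` condition never becomes true, so the call returns only when the timeout expires.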

**Look at the job status**

```shell
kubectl get -o yaml XGBoostJob/xgboost-dist-test
```

Here is sample output. In this example the job has already completed, so the `Succeeded` condition is `"True"` and the `Running` condition is `"False"`:
```yaml
apiVersion: xgboostjob.kubeflow.org/v1alpha1
kind: XGBoostJob
metadata:
  creationTimestamp: "2019-06-21T03:32:57Z"
  generation: 7
  name: xgboost-dist-test
  namespace: default
  resourceVersion: "258466"
  selfLink: /apis/xgboostjob.kubeflow.org/v1alpha1/namespaces/default/xgboostjobs/xgboost-dist-test
  uid: 431dc182-93d5-11e9-bbab-080027dfbfe2
spec:
  RunPolicy:
    cleanPodPolicy: None
  xgbReplicaSpecs:
    Master:
      replicas: 1
      restartPolicy: Never
      template:
        metadata:
          creationTimestamp: null
        spec:
          containers:
          - image: docker.io/merlintang/xgboost-dist-rabit-test:1.2
            imagePullPolicy: Always
            name: xgboostjob
            ports:
            - containerPort: 9991
              name: xgboostjob-port
            resources: {}
    Worker:
      replicas: 2
      restartPolicy: Never
      template:
        metadata:
          creationTimestamp: null
        spec:
          containers:
          - image: docker.io/merlintang/xgboost-dist-rabit-test:1.2
            imagePullPolicy: Always
            name: xgboostjob
            ports:
            - containerPort: 9991
              name: xgboostjob-port
            resources: {}
status:
  completionTime: "2019-06-21T03:33:03Z"
  conditions:
  - lastTransitionTime: "2019-06-21T03:32:57Z"
    lastUpdateTime: "2019-06-21T03:32:57Z"
    message: xgboostJob xgboost-dist-test is created.
    reason: XGBoostJobCreated
    status: "True"
    type: Created
  - lastTransitionTime: "2019-06-21T03:32:57Z"
    lastUpdateTime: "2019-06-21T03:32:57Z"
    message: XGBoostJob xgboost-dist-test is running.
    reason: XGBoostJobRunning
    status: "False"
    type: Running
  - lastTransitionTime: "2019-06-21T03:33:03Z"
    lastUpdateTime: "2019-06-21T03:33:03Z"
    message: XGBoostJob xgboost-dist-test is successfully completed.
    reason: XGBoostJobSucceeded
    status: "True"
    type: Succeeded
  replicaStatuses:
    Master:
      succeeded: 1
    Worker:
      succeeded: 2
```
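To check the outcome from a script instead of reading the full YAML, the conditions can be reduced to a single state. The following is a minimal sketch: `job_conditions` uses a kubectl JSONPath template to print one `type=status` line per condition, and `latest_true_condition` picks the last condition whose status is `True`. Both function names are hypothetical, and the sketch assumes conditions are listed in the order they occurred, as in the sample output above.

```shell
# Hypothetical helper: print one "type=status" line per job condition.
# Assumes kubectl is on PATH and configured for the target cluster.
job_conditions() {
  kubectl get "XGBoostJob/$1" \
    -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'
}

# Read "type=status" lines on stdin and print the last type whose status is True.
# Prints nothing if no condition is True.
latest_true_condition() {
  awk -F= '$2 == "True" { last = $1 } END { if (last != "") print last }'
}
```

For the job shown above, `job_conditions xgboost-dist-test | latest_true_condition` should print `Succeeded`, since `Running` has already flipped to `"False"`.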