github.com/kubeflow/training-operator@v1.7.0/examples/pytorch

README.md
elastic
mnist
pytorch_cuda_docker
simple.yaml
smoke-dist