> INFO - Pachyderm 2.0 introduces profound architectural changes to the product. As a result, our examples pre and post 2.0 are kept in two separate branches:
> - Branch Master: Examples using Pachyderm 2.0 and later versions - https://github.com/pachyderm/pachyderm/tree/master/examples
> - Branch 1.13.x: Examples using Pachyderm 1.13 and older versions - https://github.com/pachyderm/pachyderm/tree/1.13.x/examples

# Pachyderm Kubeflow Examples

Pachyderm makes production data pipelines repeatable, scalable, and provable.
Data scientists and engineers use Pachyderm pipelines to connect data acquisition, cleaning, processing, modeling, and analysis code,
while using Pachyderm's versioned data repositories to keep a complete history of all the data, models, parameters, and code
that went into producing each result anywhere in their pipelines.
This is called data provenance.

If you're currently using Kubeflow to manage your machine learning workloads,
Pachyderm can add value to your Kubeflow deployment in a couple of ways.
You can use Pachyderm's pipelines and containers to call Kubeflow APIs to connect and orchestrate Kubeflow jobs.
You can use Pachyderm's versioned data repositories to provide data provenance for the data, models, and parameters you use with your Kubeflow code.

This directory contains an example of integrating Pachyderm with Kubeflow.

## MNIST with TFJob and Pachyderm

[This example](https://github.com/pachyderm/pachyderm/tree/1.13.x/examples/kubeflow/mnist)
uses the canonical MNIST dataset, Kubeflow, TFJobs, and Pachyderm to demonstrate an end-to-end machine learning workflow with data provenance.
Specifically, it copies data to and reads data from a Pachyderm versioned data repository
in a Kubeflow pipeline
using Pachyderm's S3 Gateway.
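
For orientation, the S3 Gateway interaction described above looks roughly like the sketch below. It is not the linked example's actual code: the endpoint (`http://localhost:30600`, e.g. via `pachctl port-forward`), the repo name `inputrepo`, and the file names are assumptions made for illustration; substitute the values from your own deployment and the example's instructions.

```python
# Minimal sketch: read from and write to a Pachyderm repo through the S3 Gateway.
# Endpoint, repo, and file names are illustrative assumptions, not the example's code.
import boto3

# Pachyderm's S3 Gateway; with `pachctl port-forward` it is typically reachable
# at localhost:30600. Credentials are placeholders unless auth is enabled.
s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:30600",
    aws_access_key_id="dummy",
    aws_secret_access_key="dummy",
)

# The gateway exposes buckets named "<branch>.<repo>", so "master.inputrepo"
# addresses the master branch of a (hypothetical) repo called "inputrepo".
s3.upload_file("mnist.npz", "master.inputrepo", "mnist.npz")

# Read the versioned data back, e.g. from inside a Kubeflow/TFJob container.
s3.download_file("master.inputrepo", "mnist.npz", "/tmp/mnist.npz")
```

Because the data lands in a versioned Pachyderm repository rather than plain object storage, every Kubeflow run that reads or writes through the gateway is tied to a specific commit, which is what gives the workflow its data provenance.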