github.com/pachyderm/pachyderm@v1.13.4/README.md (about)

<p align="center">
	<img src='doc/docs/master/assets/images/Pachyderm-Character_600.png' height='225' title='Pachyderm'>
</p>

[![GitHub release](https://img.shields.io/github/release/pachyderm/pachyderm.svg?style=flat-square)](https://github.com/pachyderm/pachyderm/releases)
[![GitHub license](https://img.shields.io/badge/license-Pachyderm-blue)](https://github.com/pachyderm/pachyderm/blob/master/LICENSE)
[![GoDoc](https://godoc.org/github.com/pachyderm/pachyderm?status.svg)](https://godoc.org/github.com/pachyderm/pachyderm/src/client)
[![Go Report Card](https://goreportcard.com/badge/github.com/pachyderm/pachyderm)](https://goreportcard.com/report/github.com/pachyderm/pachyderm)
[![Slack Status](https://badge.slack.pachyderm.io/badge.svg)](https://slack.pachyderm.io)
[![CLA assistant](https://cla-assistant.io/readme/badge/pachyderm/pachyderm)](https://cla-assistant.io/pachyderm/pachyderm)

# Pachyderm: Data Versioning, Data Pipelines, and Data Lineage

Pachyderm is a tool for production data pipelines. If you need to chain
together data scraping, ingestion, cleaning, munging, wrangling, processing,
modeling, and analysis in a sane way, then Pachyderm is for you. If you have an
existing set of scripts that do this in an ad hoc fashion and you're looking
for a way to "productionize" them, Pachyderm can make this easy for you.

## Features

- Containerized: Pachyderm is built on Docker and Kubernetes. Whatever
  languages or libraries your pipeline needs, they can run on Pachyderm, which
  can easily be deployed on any cloud provider or on-premises.
- Version Control: Pachyderm version-controls your data as it's processed. You
  can always ask the system how data has changed, see a diff, and, if something
  doesn't look right, revert.
- Provenance (aka data lineage): Pachyderm tracks where data comes from by
  keeping track of all the code and data that created a result.
- Parallelization: Pachyderm can efficiently schedule massively parallel
  workloads.
- Incremental Processing: Pachyderm understands how your data has changed and
  is smart enough to process only the new data.

## Getting Started
[Install Pachyderm locally](https://docs.pachyderm.com/latest/getting_started/local_installation/) or [deploy on AWS/GCE/Azure](https://docs.pachyderm.com/latest/deploy-manage/deploy/amazon_web_services/) in about 5 minutes.

You can also refer to our complete [documentation](https://docs.pachyderm.com) for tutorials, example projects, and advanced features of Pachyderm.

To see some examples and learn about core use cases for Pachyderm, check out:
- [Examples](https://docs.pachyderm.com/latest/examples/examples/)
- [Use Cases](https://www.pachyderm.com/use-cases/)
- [Case Studies](https://www.pachyderm.com/case-studies/)

## Documentation

[Official Documentation](https://docs.pachyderm.com/)

## Community
Keep up to date and get Pachyderm support via:
- [![Twitter](https://img.shields.io/twitter/follow/pachyderminc?style=social)](https://twitter.com/pachyderminc) Follow us on Twitter.
- [![Slack Status](https://badge.slack.pachyderm.io/badge.svg)](https://slack.pachyderm.io) Join our community [Slack channel](https://slack.pachyderm.io) to get help from the Pachyderm team and other users.

## Contributing

To get started, sign the [Contributor License Agreement](https://cla-assistant.io/pachyderm/pachyderm).

You should also check out our [contributing guide](https://docs.pachyderm.com/latest/contributing/setup/).

Send us PRs; we would love to see what you do! Our GitHub issues labeled "help-wanted" are a good place to start. We're sometimes bad about keeping that label up to date, so if you don't see any, just let us know.

## Join Us

WE'RE HIRING! Love Docker, Go, and distributed systems? Learn more about [our open positions](https://boards.greenhouse.io/pachyderm).

## Usage Metrics

Pachyderm automatically reports anonymized usage metrics. These metrics help us
understand how people are using Pachyderm and make it better. They can be
disabled by setting the environment variable `METRICS` to `false` in the pachd
container.
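
As a rough illustration of the setting described above, the sketch below patches a Kubernetes Deployment manifest (loaded as a Python dict) so that one container has `METRICS` set to `false`. It assumes that pachd runs as a Deployment and that its container is named `pachd`; check your own manifest before relying on either. The `disable_metrics` helper and the trimmed-down example manifest are hypothetical and not part of Pachyderm.

```python
import copy


def disable_metrics(deployment, container_name="pachd"):
    """Return a copy of a Deployment manifest with METRICS=false on one container.

    `container_name` defaults to "pachd", which is an assumption about how
    the container is named in your manifest.
    """
    patched = copy.deepcopy(deployment)  # leave the caller's manifest untouched
    containers = patched["spec"]["template"]["spec"]["containers"]
    for container in containers:
        if container.get("name") != container_name:
            continue
        env = container.setdefault("env", [])
        for entry in env:
            if entry.get("name") == "METRICS":
                # Overwrite an existing entry instead of adding a duplicate.
                entry.pop("valueFrom", None)
                entry["value"] = "false"
                break
        else:
            # Kubernetes env values are strings, so use "false", not False.
            env.append({"name": "METRICS", "value": "false"})
        return patched
    raise ValueError(f"no container named {container_name!r} in manifest")


# Hypothetical, heavily trimmed manifest used only for illustration.
example = {
    "kind": "Deployment",
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {"name": "pachd", "env": [{"name": "EXAMPLE_VAR", "value": "1"}]}
                ]
            }
        }
    },
}

patched = disable_metrics(example)
print(patched["spec"]["template"]["spec"]["containers"][0]["env"])
```

The helper returns a patched copy instead of mutating its input, so the original manifest can still be compared against the result before anything is applied to a cluster.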

## License Information
Pachyderm has moved some components of Pachyderm Platform to a [source-available limited license](LICENSE).

We remain committed to the culture of open source, developing our product transparently and collaboratively with our community, and giving our community and customers source code access and the ability to study and change the software to suit their needs.

Under the Pachyderm Community License, you can access the source code and modify or redistribute it. The one thing you cannot do is use it to make a competing offering.

Check out our [License FAQ Page](https://pachyderm.com/about/pachyderm-community-license-faq/) for more information.