<p align="center">
  <img src='doc/docs/master/assets/images/Pachyderm-Character_600.png' height='225' title='Pachyderm'>
</p>

[Releases](https://github.com/pachyderm/pachyderm/releases)
[License](https://github.com/pachyderm/pachyderm/blob/master/LICENSE)
[GoDoc](https://godoc.org/github.com/pachyderm/pachyderm/src/client)
[Go Report Card](https://goreportcard.com/report/github.com/pachyderm/pachyderm)
[Slack](https://slack.pachyderm.io)
[CLA](https://cla-assistant.io/pachyderm/pachyderm)

# Pachyderm: Data Versioning, Data Pipelines, and Data Lineage

Pachyderm is a tool for production data pipelines. If you need to chain
together data scraping, ingestion, cleaning, munging, wrangling, processing,
modeling, and analysis in a sane way, then Pachyderm is for you. If you have an
existing set of scripts which do this in an ad-hoc fashion and you're looking
for a way to "productionize" them, Pachyderm can make this easy for you.

## Features

- Containerized: Pachyderm is built on Docker and Kubernetes. Whatever
  languages or libraries your pipeline needs, they can run on Pachyderm which
  can easily be deployed on any cloud provider or on prem.
- Version Control: Pachyderm version controls your data as it's processed. You
  can always ask the system how data has changed, see a diff, and, if something
  doesn't look right, revert.
- Provenance (aka data lineage): Pachyderm tracks where data comes from. Pachyderm keeps track of all the code and data that created a result.
- Parallelization: Pachyderm can efficiently schedule massively parallel
  workloads.
- Incremental Processing: Pachyderm understands how your data has changed and
  is smart enough to only process the new data.

## Getting Started

[Install Pachyderm locally](https://docs.pachyderm.com/latest/getting_started/local_installation/) or [deploy on AWS/GCE/Azure](https://docs.pachyderm.com/latest/deploy-manage/deploy/amazon_web_services/) in about 5 minutes.

You can also refer to our complete [documentation](https://docs.pachyderm.com) to see tutorials, check out example projects, and learn about advanced features of Pachyderm.

If you'd like to see some examples and learn about core use cases for Pachyderm:
- [Examples](https://docs.pachyderm.com/latest/examples/examples/)
- [Use Cases](https://www.pachyderm.com/use-cases/)
- [Case Studies](https://www.pachyderm.com/case-studies/)

## Documentation

[Official Documentation](https://docs.pachyderm.com/)

## Community

Keep up to date and get Pachyderm support via:
- [Twitter](https://twitter.com/pachyderminc): follow us for updates.
- [Slack Channel](https://slack.pachyderm.io): join our community Slack to get help from the Pachyderm team and other users.

## Contributing

To get started, sign the [Contributor License Agreement](https://cla-assistant.io/pachyderm/pachyderm).

You should also check out our [contributing guide](https://docs.pachyderm.com/latest/contributing/setup/).

Send us PRs; we would love to see what you do! You can also check our GitHub issues for things labeled "help-wanted" as a good place to start. We're sometimes bad about keeping that label up to date, so if you don't see any, just let us know.

## Join Us

WE'RE HIRING! Love Docker, Go and distributed systems? Learn more about [our open positions](https://boards.greenhouse.io/pachyderm).

## Usage Metrics

Pachyderm automatically reports anonymized usage metrics. These metrics help us
understand how people are using Pachyderm and make it better. They can be
disabled by setting the env variable `METRICS` to `false` in the pachd
container.

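
As a rough sketch of the setting described above, the fragment below shows how the `METRICS` variable might appear in the container spec of a Kubernetes Deployment manifest for pachd. The surrounding field layout is standard Kubernetes, but where this manifest lives and how it is generated depend on how you deployed Pachyderm, so treat the placement as an assumption and not as the documented procedure.

```yaml
# Illustrative excerpt of a Deployment manifest; all other fields are omitted.
# Assumes the container is named "pachd", as referred to above.
spec:
  template:
    spec:
      containers:
        - name: pachd
          env:
            - name: METRICS
              value: "false"  # Kubernetes env values are strings, so the value is quoted
```
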
## License Information

Pachyderm has moved some components of Pachyderm Platform to a [source-available limited license](LICENSE).

We remain committed to the culture of open source, developing our product transparently and collaboratively with our community, and giving our community and customers source code access and the ability to study and change the software to suit their needs.

Under the Pachyderm Community License, you can access the source code and modify or redistribute it; there is only one thing you cannot do, and that is use it to make a competing offering.

Check out our [License FAQ Page](https://pachyderm.com/about/pachyderm-community-license-faq/) for more information.