# Overview

When you deploy a Kubernetes application, such as Pachyderm, it typically
cannot be accessed from outside the Kubernetes cluster right away. This default
helps keep your cluster secure and resilient to malicious attacks. In the
simplest example, such as running a Pachyderm cluster locally, implicit and
explicit port-forwarding enables you to communicate with `pachd`, the Pachyderm
daemon pod, and `pachd-dash`, the Pachyderm UI. Port-forwarding can be used in
cloud environments as well, but a production environment might require you to
define more sophisticated inbound connection rules.

Kubernetes provides multiple ways to deliver external traffic to a service,
including the following:

* Through a Service with `type: NodePort`. A NodePort service provides basic
  access to services. By default, the `pachd` service is deployed as `NodePort`
  to simplify interaction with your localhost. `NodePort` is a limited solution
  that might not be considered reliable or secure enough in production
  deployments.

* Through a Service with `type: LoadBalancer`. A Kubernetes service with
  `type: LoadBalancer` can perform basic load balancing. Typically, if you
  change the `pachd` service type to `LoadBalancer` in a cloud provider, the
  cloud provider automatically deploys a load balancer to serve your
  application. The downside of this approach is that you have to change
  all your services to the `LoadBalancer` type and maintain a separate load
  balancer for each service. This can become difficult to manage long term.

* Through the `Ingress` Kubernetes resource. An Ingress resource is
  completely independent of the services that you deploy. Because an Ingress
  resource provides advanced routing capabilities, it is the recommended option
  to use with Pachyderm.
  The only complication of this approach is that you need
  to deploy an ingress controller, such as NGINX or Traefik, in your Kubernetes
  cluster. Such an ingress controller is not deployed by default with the
  `pachctl deploy` command.

## Pachyderm Ingress Requirements

Kubernetes supports multiple ingress controller options, and you are free to
pick the one that works best for your environment. However, not all of them
might be fully compatible with Pachyderm. Moreover, exposing your cluster
through an ingress incorrectly might make your Pachyderm cluster and your
data insecure and vulnerable to external attacks. Regardless of which
ingress controller you choose, your environment must meet the
following security requirements to protect your data:

* **Use a secure connection**

  Exposing your application to the outside world might pose a security
  risk to your data and organization. Make sure that you have Transport
  Layer Security (TLS) enabled for Ingress connections.

* **Use Pachyderm authentication**

  Pachyderm authentication must be enabled and access provided to a
  verified list of users. Pachyderm authentication is an additional
  security layer to protect your data from malicious attacks.
  If you cannot use Pachyderm authentication providers, we highly recommend
  that you use Pachyderm port-forwarding for security reasons. Exposing
  Pachyderm services through an ingress without Pachyderm authentication might
  result in your Pachyderm and Kubernetes clusters being compromised, along
  with your data.

* **The ingress controller must support the gRPC protocol and WebSockets**

  Some of the ingress controllers that support gRPC include NGINX and Traefik.

## Ingress Configuration Workflow

This section outlines the general workflow for ingress configuration.
Depending on your use case, you might need to start from the bottom of
this list and determine your firewall and whitelist requirements first.
More commonly, you start by deciding which ingress controller
you want to use. In any case, read and understand the requirements
outlined below before you proceed with any configuration.

A general workflow for allowing external traffic into a Pachyderm
cluster includes the following steps:

* **Configure Kubernetes networking.**

  You can use one of the following options:

  * (Recommended) Deploy an ingress controller and configure an ingress
    resource.

    Pachyderm supports the [Traefik](https://docs.traefik.io/)
    ingress controller. For more information, see
    [Expose a Pachyderm UI Through an Ingress](../expose-pach-ui-ingress/).

  * Configure the `pachd` service as a `LoadBalancer` by changing
    `type: NodePort` to `type: LoadBalancer` in the `pachd` service
    manifest. As mentioned above, this is the simplest way to expose
    Pachyderm services to the outside world, but it does not provide
    any sophisticated control over load balancing. This option works
    on most cloud platforms, such as AWS and GKE, as well as in
    minikube, and is mostly used for internal access.

* **Configure access to your ingress public IP addresses through firewalls
  and whitelisting.**

  If you are deploying Pachyderm on a cloud provider, you need to make sure
  that the ingress IP is available to external users. For example, in AWS, you
  can configure access through security groups in the Virtual Private Cloud
  (VPC) in which the Kubernetes cluster with Pachyderm runs. Other cloud
  providers have similar functionality.

* **Secure the connection end-to-end.**

  If you run Pachyderm in a cloud platform, the cloud provider is responsible
  for securing the underlying infrastructure, such as the Kubernetes control
  plane.
  Most cloud providers have a security compliance program that addresses
  these issues. If you are running Kubernetes locally, the security of the
  Kubernetes APIs, kubelet, and other components becomes your responsibility.
  See security recommendations in the [Kubernetes documentation](https://kubernetes.io/docs/tasks/administer-cluster/securing-a-cluster/).

  As for Pachyderm, you need to make sure that you deploy Pachyderm with
  TLS enabled. You can deploy `pachd` and `dash` with different certificates
  if required. Self-signed certificates might require additional configuration.
  For instructions on deployment with TLS, see [Deploy Pachyderm with TLS](https://docs.pachyderm.com/latest/deploy-manage/deploy/deploy_w_tls/).

  In addition, you must have administrative access to the Domain Name
  System (DNS) server that you will use to access Pachyderm. If you are
  deploying Pachyderm to an internal site with a self-signed certificate,
  contact our support organization for assistance.

!!! note "See Also"

    - [Expose a Pachyderm UI Through an Ingress](../expose-pach-ui-ingress/)
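
To illustrate the `LoadBalancer` option from the workflow above, the change amounts to a single field in the `pachd` Service manifest. The fragment below is a minimal sketch, not a complete manifest. Only the service name `pachd` and the two `type` values come from this page. The `selector` and `ports` sections are intentionally left out and are assumed to stay exactly as generated for your deployment.

```yaml
# Illustrative fragment of the pachd Service manifest (not complete).
# Assumption: every field not shown here is left as generated by your deployment.
apiVersion: v1
kind: Service
metadata:
  name: pachd
spec:
  # Changed from `NodePort`. On most cloud platforms this causes the
  # provider to provision an external load balancer for the service.
  type: LoadBalancer
```

Because this exposes `pachd` directly, the TLS and Pachyderm authentication requirements described in this page still apply.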