---
layout: "guides"
page_title: "Apache Spark Integration - Monitoring Output"
sidebar_current: "guides-spark-monitoring"
description: |-
  Learn how to monitor Spark application output.
---

# Monitoring Spark Application Output

By default, `spark-submit` in `cluster` mode will submit your application
to the Nomad cluster and return immediately. You can use the
[spark.nomad.cluster.monitorUntil](/guides/spark/configuration.html#spark-nomad-cluster-monitoruntil)
configuration property to have `spark-submit` monitor the job continuously.
Note that, with this flag set, killing `spark-submit` will *not* stop the
Spark application, since it will be running independently in the Nomad cluster.

## Spark UI

In cluster mode, if `spark.ui.enabled` is set to `true` (the default), the
Spark web UI will be dynamically allocated a port. The web UI will be exposed
by Nomad as a service, and the UI's URL will appear in the Spark driver's log.
By default, the Spark web UI will terminate when the application finishes.
This can be problematic when debugging an application. You can delay
termination by setting `spark.ui.stopDelay` (e.g. `5m` for 5 minutes). Note
that this will cause the driver process to continue to run. You can force
termination immediately on the "Jobs" page of the web UI.

## Spark History Server

It is possible to reconstruct the web UI of a completed application using
Spark's [history server](https://spark.apache.org/docs/latest/monitoring.html#viewing-after-the-fact).
The history server requires the event log to have been written to an
accessible location like [HDFS](/guides/spark/hdfs.html) or Amazon S3.

Sample history server job file:

```hcl
job "spark-history-server" {
  datacenters = ["dc1"]
  type = "service"

  group "server" {
    count = 1

    task "history-server" {
      driver = "docker"

      config {
        image   = "barnardb/spark"
        command = "/spark/spark-2.1.0-bin-nomad/bin/spark-class"
        args    = ["org.apache.spark.deploy.history.HistoryServer"]
        port_map {
          ui = 18080
        }
        network_mode = "host"
      }

      env {
        "SPARK_HISTORY_OPTS" = "-Dspark.history.fs.logDirectory=hdfs://hdfs.service.consul/spark-events/"
        "SPARK_PUBLIC_DNS"   = "spark-history.service.consul"
      }

      resources {
        cpu    = 1000
        memory = 1024
        network {
          mbits = 250
          port "ui" {
            static = 18080
          }
        }
      }

      service {
        name = "spark-history"
        tags = ["spark", "ui"]
        port = "ui"
      }
    }
  }
}
```

The job file above can also be found [here](https://github.com/hashicorp/nomad/blob/master/terraform/examples/spark/spark-history-server-hdfs.nomad).

To run the history server, first [deploy HDFS](/guides/spark/hdfs.html) and
then create a directory in HDFS to store events:

```shell
$ hdfs dfs -fs hdfs://hdfs.service.consul:8020 -mkdir /spark-events
```

You can then deploy the history server with:

```shell
$ nomad job run spark-history-server-hdfs.nomad
```

You can get the private IP for the history server with a Consul DNS lookup:

```shell
$ dig spark-history.service.consul
```

Find the public IP that corresponds to the private IP returned by the `dig`
command above. You can access the history server at http://PUBLIC_IP:18080.
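If you are scripting access to the history server, you can also resolve the
registered service through Consul's HTTP catalog API instead of DNS. The
following is a minimal sketch, assuming a Consul agent is reachable on the
default HTTP port `8500` and that `jq` is installed:

```shell
# Query Consul's catalog for the "spark-history" service registered by the
# job above; ServiceAddress and ServicePort identify where the UI listens.
$ curl -s http://localhost:8500/v1/catalog/service/spark-history \
    | jq -r '.[0] | "\(.ServiceAddress):\(.ServicePort)"'
```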
Use the `spark.eventLog.enabled` and `spark.eventLog.dir` configuration
properties in `spark-submit` to log events for a given application:

```shell
$ spark-submit \
    --class org.apache.spark.examples.JavaSparkPi \
    --master nomad \
    --deploy-mode cluster \
    --conf spark.executor.instances=4 \
    --conf spark.nomad.cluster.monitorUntil=complete \
    --conf spark.eventLog.enabled=true \
    --conf spark.eventLog.dir=hdfs://hdfs.service.consul/spark-events \
    --conf spark.nomad.sparkDistribution=https://s3.amazonaws.com/nomad-spark/spark-2.1.0-bin-nomad.tgz \
    https://s3.amazonaws.com/nomad-spark/spark-examples_2.11-2.1.0-SNAPSHOT.jar 100
```

## Logs

Nomad clients collect the `stderr` and `stdout` of running tasks. The CLI or
the HTTP API can be used to inspect logs, as documented in
[Accessing Logs](https://www.nomadproject.io/guides/operating-a-job/accessing-logs.html).
In cluster mode, the `stderr` and `stdout` of the driver application can be
accessed in the same way. The [Log Shipper Pattern](https://www.nomadproject.io/guides/operating-a-job/accessing-logs.html#log-shipper-pattern)
uses sidecar tasks to forward logs to a central location. This can be done
using a job template as follows:

```hcl
job "template" {
  group "driver" {

    task "driver" {
      meta {
        "spark.nomad.role" = "driver"
      }
    }

    task "log-forwarding-sidecar" {
      # sidecar task definition here
    }
  }

  group "executor" {

    task "executor" {
      meta {
        "spark.nomad.role" = "executor"
      }
    }

    task "log-forwarding-sidecar" {
      # sidecar task definition here
    }
  }
}
```

## Next Steps

Review the Nomad/Spark [configuration properties](/guides/spark/configuration.html).