---
layout: "guides"
page_title: "Apache Spark Integration - Monitoring Output"
sidebar_current: "guides-spark-monitoring"
description: |-
  Learn how to monitor Spark application output.
---

# Monitoring Spark Application Output

By default, `spark-submit` in `cluster` mode will submit your application
to the Nomad cluster and return immediately. You can use the
[spark.nomad.cluster.monitorUntil](/guides/spark/configuration.html#spark-nomad-cluster-monitoruntil)
configuration property to have `spark-submit` monitor the job continuously.
Note that, with this flag set, killing `spark-submit` will *not* stop the
Spark application, since it will be running independently in the Nomad cluster.

## Spark UI

In cluster mode, if `spark.ui.enabled` is set to `true` (the default), the
Spark web UI will be dynamically allocated a port. The web UI will be exposed
by Nomad as a service, and the UI's URL will appear in the Spark driver's log.
By default, the Spark web UI will terminate when the application finishes.
This can be problematic when debugging an application. You can delay
termination by setting `spark.ui.stopDelay` (e.g. `5m` for 5 minutes). Note
that this will cause the driver process to continue to run. You can force
termination immediately on the "Jobs" page of the web UI.

## Spark History Server

It is possible to reconstruct the web UI of a completed application using
Spark's [history server](https://spark.apache.org/docs/latest/monitoring.html#viewing-after-the-fact).
The history server requires the event log to have been written to an
accessible location like [HDFS](/guides/spark/hdfs.html) or Amazon S3.

Sample history server job file:

```hcl
job "spark-history-server" {
  datacenters = ["dc1"]
  type = "service"

  group "server" {
    count = 1

    task "history-server" {
      driver = "docker"

      config {
        image   = "barnardb/spark"
        command = "/spark/spark-2.1.0-bin-nomad/bin/spark-class"
        args    = ["org.apache.spark.deploy.history.HistoryServer"]
        port_map {
          ui = 18080
        }
        network_mode = "host"
      }

      env {
        "SPARK_HISTORY_OPTS" = "-Dspark.history.fs.logDirectory=hdfs://hdfs.service.consul/spark-events/"
        "SPARK_PUBLIC_DNS"   = "spark-history.service.consul"
      }

      resources {
        cpu    = 1000
        memory = 1024
        network {
          mbits = 250
          port "ui" {
            static = 18080
          }
        }
      }

      service {
        name = "spark-history"
        tags = ["spark", "ui"]
        port = "ui"
      }
    }
  }
}
```

The job file above can also be found [here](https://github.com/hashicorp/nomad/blob/master/terraform/examples/spark/spark-history-server-hdfs.nomad).

To run the history server, first [deploy HDFS](/guides/spark/hdfs.html) and
then create a directory in HDFS to store events:

```shell
$ hdfs dfs -fs hdfs://hdfs.service.consul:8020 -mkdir /spark-events
```

You can then deploy the history server with:

```shell
$ nomad job run spark-history-server-hdfs.nomad
```

You can get the private IP for the history server with a Consul DNS lookup:

```shell
$ dig spark-history.service.consul
```

Find the public IP that corresponds to the private IP returned by the `dig`
command above. You can access the history server at http://PUBLIC_IP:18080.
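If you are scripting access to the history server, you can also resolve the
registered service through Consul's HTTP catalog API instead of DNS. The
following is a minimal sketch, assuming a Consul agent is reachable on the
default HTTP port `8500` and that `jq` is installed:

```shell
# Query Consul's catalog for the "spark-history" service registered by the
# job above; ServiceAddress and ServicePort identify where the UI listens.
$ curl -s http://localhost:8500/v1/catalog/service/spark-history \
    | jq -r '.[0] | "\(.ServiceAddress):\(.ServicePort)"'
```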
Use the `spark.eventLog.enabled` and `spark.eventLog.dir` configuration
properties in `spark-submit` to log events for a given application:

```shell
$ spark-submit \
    --class org.apache.spark.examples.JavaSparkPi \
    --master nomad \
    --deploy-mode cluster \
    --conf spark.executor.instances=4 \
    --conf spark.nomad.cluster.monitorUntil=complete \
    --conf spark.eventLog.enabled=true \
    --conf spark.eventLog.dir=hdfs://hdfs.service.consul/spark-events \
    --conf spark.nomad.sparkDistribution=https://s3.amazonaws.com/nomad-spark/spark-2.1.0-bin-nomad.tgz \
    https://s3.amazonaws.com/nomad-spark/spark-examples_2.11-2.1.0-SNAPSHOT.jar 100
```

## Logs

Nomad clients collect the `stderr` and `stdout` of running tasks. The CLI or
the HTTP API can be used to inspect logs, as documented in
[Accessing Logs](https://www.nomadproject.io/guides/operating-a-job/accessing-logs.html).
In cluster mode, the `stderr` and `stdout` of the driver application can be
accessed in the same way. The [Log Shipper Pattern](https://www.nomadproject.io/guides/operating-a-job/accessing-logs.html#log-shipper-pattern)
uses sidecar tasks to forward logs to a central location. This can be done
using a job template as follows:

```hcl
job "template" {
  group "driver" {

    task "driver" {
      meta {
        "spark.nomad.role" = "driver"
      }
    }

    task "log-forwarding-sidecar" {
      # sidecar task definition here
    }
  }

  group "executor" {

    task "executor" {
      meta {
        "spark.nomad.role" = "executor"
      }
    }

    task "log-forwarding-sidecar" {
      # sidecar task definition here
    }
  }
}
```

## Next Steps

Review the Nomad/Spark [configuration properties](/guides/spark/configuration.html).