---
layout: "guides"
page_title: "Apache Spark Integration - Submitting Applications"
sidebar_current: "guides-spark-submit"
description: |-
  Learn how to submit Spark jobs that run on a Nomad cluster.
---

# Submitting Applications

The [`spark-submit`](https://spark.apache.org/docs/latest/submitting-applications.html)
script located in Spark’s `bin` directory is used to launch applications on a
cluster. Spark applications can be submitted to Nomad in either `client` mode
or `cluster` mode.

## Client Mode

In `client` mode, the application driver runs on a machine that is not
necessarily in the Nomad cluster. The driver’s `SparkContext` creates a Nomad
job to run Spark executors. The executors connect to the driver and run Spark
tasks on behalf of the application. When the driver’s `SparkContext` is
stopped, the executors are shut down. Note that the machine running the driver
or `spark-submit` needs to be reachable from the Nomad clients so that the
executors can connect to it.

In `client` mode, application resources must initially be present on the
submitting machine, so JAR files (both the primary JAR and those added with the
`--jars` option) cannot be specified using `http:` or `https:` URLs. You can
either use files on the submitting machine (either as raw paths or `file:`
URLs), or use `local:` URLs to indicate that the files are independently
available on both the submitting machine and all of the Nomad clients where the
executors might run.

In this mode, the `spark-submit` invocation doesn’t return until the
application has finished running, and killing the `spark-submit` process kills
the application.

In the following example, the `spark-submit` command is used to run the
`SparkPi` sample application in `client` mode:

```shell
$ spark-submit --class org.apache.spark.examples.SparkPi \
    --master nomad \
    --conf spark.nomad.sparkDistribution=https://s3.amazonaws.com/nomad-spark/spark-2.1.0-bin-nomad.tgz \
    lib/spark-examples*.jar \
    10
```

## Cluster Mode

In `cluster` mode, the `spark-submit` process creates a Nomad job to run the
Spark application driver itself. The driver’s `SparkContext` then adds Spark
executors to the Nomad job. The executors connect to the driver and run Spark
tasks on behalf of the application. When the driver’s `SparkContext` is
stopped, the executors are shut down.

In `cluster` mode, application resources need to be hosted somewhere accessible
to the Nomad cluster, so JARs (both the primary JAR and those added with the
`--jars` option) can’t be specified using raw paths or `file:` URLs. You can
either use `http:` or `https:` URLs, or use `local:` URLs to indicate that the
files are independently available on all of the Nomad clients where the driver
and executors might run.

Note that in `cluster` mode, the Nomad master URL needs to be routable from
both the submitting machine and the Nomad client node that runs the driver. If
the Nomad cluster is integrated with Consul, you may want to use a DNS name for
the Nomad service served by Consul.
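As an illustrative sketch of that setup, a Consul-served name could stand in
for a fixed server address in the master URL, mirroring the cluster-mode
example below. This assumes Consul’s DNS interface is enabled with the default
`nomad` service name and HTTP port, and that the integration accepts a Nomad
API address appended to the master URL in the form `nomad:<address>` (both are
assumptions about your particular cluster and Spark distribution):

```shell
# Illustrative only: nomad.service.consul is resolved by Consul DNS to a
# live Nomad server, so the URL keeps working as servers come and go.
$ spark-submit --class org.apache.spark.examples.SparkPi \
    --master nomad:http://nomad.service.consul:4646 \
    --deploy-mode cluster \
    --conf spark.nomad.sparkDistribution=http://example.com/spark.tgz \
    http://example.com/spark-examples.jar \
    10
```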
For example, to submit an application in cluster mode:

```shell
$ spark-submit --class org.apache.spark.examples.SparkPi \
    --master nomad \
    --deploy-mode cluster \
    --conf spark.nomad.sparkDistribution=http://example.com/spark.tgz \
    http://example.com/spark-examples.jar \
    10
```
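Because a cluster-mode application runs as a regular Nomad job, the standard
Nomad CLI can be used to verify the submission. A minimal sketch; `spark-pi`
is a hypothetical placeholder, since the actual job name is generated by the
integration for each application:

```shell
# List registered jobs; the submitted Spark application appears among them
$ nomad status

# Inspect the application's job and its allocations
# ("spark-pi" is a placeholder for the generated job name)
$ nomad status spark-pi
```

## Next Steps

Learn how to [customize applications](/guides/spark/customizing.html).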