---
layout: "guides"
page_title: "Apache Spark Integration - Customizing Applications"
sidebar_current: "guides-analytical-workloads-spark-customizing"
description: |-
  Learn how to customize the Nomad job that is created to run a Spark
  application.
---

# Customizing Applications

There are two ways to customize the Nomad job that Spark creates to run an
application:

 - Use the default job template and set configuration properties
 - Use a custom job template

## Using the Default Job Template

The Spark integration will use a generic job template by default. The template
includes groups and tasks for the driver, executors, and (optionally) the
[shuffle service](/guides/analytical-workloads/spark/dynamic.html). The job itself
and the tasks that are created have a `spark.nomad.role` meta value that
identifies their role:

```hcl
job "structure" {
  meta {
    "spark.nomad.role" = "application"
  }

  # A driver group is only added in cluster mode
  group "driver" {
    task "driver" {
      meta {
        "spark.nomad.role" = "driver"
      }
    }
  }

  group "executors" {
    count = 2
    task "executor" {
      meta {
        "spark.nomad.role" = "executor"
      }
    }

    # Shuffle service tasks are only added when enabled (as it must be when
    # using dynamic allocation)
    task "shuffle-service" {
      meta {
        "spark.nomad.role" = "shuffle"
      }
    }
  }
}
```

The default template can be customized indirectly by explicitly [setting
configuration properties](/guides/analytical-workloads/spark/configuration.html).

## Using a Custom Job Template

An alternative to using the default template is to set the
`spark.nomad.job.template` configuration property to the path of a file
containing a custom job template. There are two important considerations:

  * The template must use the JSON format. You can convert an HCL jobspec to
  JSON by running `nomad job run -output <job.nomad>`.

  * `spark.nomad.job.template` should be set to a path on the submitting
  machine, not to a URL (even in cluster mode). The template does not need to
  be accessible to the driver or executors.

Using a job template, you can override Spark’s default resource utilization, add
additional metadata or constraints, set environment variables, add sidecar
tasks, and utilize the Consul and Vault integration. The template does
not need to be a complete Nomad job specification, since Spark will add
everything necessary to run the application. For example, your template
might set `job` metadata, but not contain any task groups, making it an
incomplete Nomad job specification but still a valid template to use with Spark.

To customize the driver task group, include a task group in your template that
has a task that contains a `spark.nomad.role` meta value set to `driver`.

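For example, the following template is a minimal sketch (the group and task
names, the constraint, and the resource figures are illustrative, not
defaults) that constrains the driver to Linux clients and overrides its
resources. Spark adds everything else needed to run the driver, and the group
is only used in cluster mode:

```hcl
job "template" {

  group "driver-group-name" {

    # Example constraint; restricts the driver to Linux clients
    constraint {
      attribute = "${attr.kernel.name}"
      value     = "linux"
    }

    task "driver-task-name" {
      # Identifies this task as the basis for the Spark driver
      meta {
        "spark.nomad.role" = "driver"
      }

      # Example resource override (MHz and MB); explicitly set configuration
      # properties still take precedence (see Order of Precedence below)
      resources {
        cpu    = 1000
        memory = 2048
      }
    }
  }
}
```
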
To customize the executor task group, include a task group in your template that
has a task that contains a `spark.nomad.role` meta value set to `executor` or
`shuffle`.

The following template adds a `meta` value at the job level and an environment
variable to the executor task group:

```hcl
job "template" {

  meta {
    "foo" = "bar"
  }

  group "executor-group-name" {

    task "executor-task-name" {
      meta {
        "spark.nomad.role" = "executor"
      }

      env {
        BAZ = "something"
      }
    }
  }
}
```

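The other customizations mentioned above can be carried in a template the same
way. The following sketch is illustrative rather than prescriptive (the Vault
policy, the sidecar image, and the names are placeholders, and it assumes the
cluster's Vault and Docker integrations are available): it requests a Vault
token for the executor task and adds a sidecar task to the same group:

```hcl
job "template" {

  group "executor-group-name" {

    task "executor-task-name" {
      meta {
        "spark.nomad.role" = "executor"
      }

      # Placeholder policy name; requires the Vault integration to be
      # configured on the cluster
      vault {
        policies = ["spark-executor"]
      }
    }

    # A sidecar task (no spark.nomad.role meta value) that runs alongside
    # the executor in the same group
    task "log-shipper" {
      driver = "docker"

      config {
        image = "example/log-shipper:1.0" # placeholder image
      }

      resources {
        cpu    = 100
        memory = 128
      }
    }
  }
}
```
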
## Order of Precedence

The order of precedence for customized settings is as follows:

1. Explicitly set configuration properties.
2. Settings in the job template (if provided).
3. Default values of the configuration properties.

## Next Steps

Learn how to [allocate resources](/guides/analytical-workloads/spark/resource.html) for your Spark
applications.