---
layout: "guides"
page_title: "Apache Spark Integration - Resource Allocation"
sidebar_current: "guides-spark-resource"
description: |-
  Learn how to configure resource allocation for your Spark applications.
---

# Resource Allocation

Resource allocation can be configured using a job template or through
configuration properties. Here is a sample template in HCL syntax (this would
need to be converted to JSON before use):

```hcl
job "template" {
  group "group-name" {

    task "executor" {
      meta {
        "spark.nomad.role" = "executor"
      }

      resources {
        cpu    = 2000
        memory = 2048
        network {
          mbits = 100
        }
      }
    }
  }
}
```

Resource-related configuration properties are covered below.

## Memory

The standard Spark memory properties will be propagated to Nomad to control
task resource allocation: `spark.driver.memory` (set by `--driver-memory`) and
`spark.executor.memory` (set by `--executor-memory`). You can additionally specify
[spark.nomad.shuffle.memory](/guides/spark/configuration.html#spark-nomad-shuffle-memory)
to control how much memory Nomad allocates to shuffle service tasks.

## CPU

Spark sizes its thread pools and allocates tasks based on the number of CPU
cores available. Nomad manages CPU allocation in terms of processing speed
rather than number of cores. When running Spark on Nomad, you can control how
much CPU share Nomad will allocate to tasks using the
[spark.nomad.driver.cpu](/guides/spark/configuration.html#spark-nomad-driver-cpu)
(set by `--driver-cpu`),
[spark.nomad.executor.cpu](/guides/spark/configuration.html#spark-nomad-executor-cpu)
(set by `--executor-cpu`), and
[spark.nomad.shuffle.cpu](/guides/spark/configuration.html#spark-nomad-shuffle-cpu)
properties. When running on Nomad, executors will be configured to use one core
by default, meaning they will only pull a single 1-core task at a time. You can
set the `spark.executor.cores` property (set by `--executor-cores`) to allow
more tasks to be executed concurrently on a single executor.

## Network

Nomad does not restrict the network bandwidth of running tasks, but it does
allocate a non-zero number of Mbit/s to each task and uses this when bin packing
task groups onto Nomad clients. Spark defaults to requesting the minimum of 1
Mbit/s per task, but you can change this with the
[spark.nomad.driver.networkMBits](/guides/spark/configuration.html#spark-nomad-driver-networkmbits),
[spark.nomad.executor.networkMBits](/guides/spark/configuration.html#spark-nomad-executor-networkmbits), and
[spark.nomad.shuffle.networkMBits](/guides/spark/configuration.html#spark-nomad-shuffle-networkmbits)
properties.

## Log rotation

Nomad performs log rotation on the `stdout` and `stderr` of its tasks. You can
configure the number and size of log files it will keep for driver and
executor task groups using
[spark.nomad.driver.logMaxFiles](/guides/spark/configuration.html#spark-nomad-driver-logmaxfiles)
and [spark.nomad.executor.logMaxFiles](/guides/spark/configuration.html#spark-nomad-executor-logmaxfiles).

## Next Steps

Learn how to [dynamically allocate Spark executors](/guides/spark/dynamic.html).
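
For quick reference, the properties covered on this page can be combined on a
single `spark-submit` invocation. The following is an illustrative sketch only:
it assumes the Nomad-enabled Spark distribution described in this guide, and the
example class, JAR URL, and all resource values shown are placeholders to adapt
to your own workload.

```shell
# Illustrative example: resource-related settings passed to spark-submit.
# Values below are placeholders, not recommendations.
spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master nomad \
  --deploy-mode cluster \
  --driver-memory 1G \
  --executor-memory 2G \
  --executor-cores 2 \
  --conf spark.nomad.driver.cpu=1000 \
  --conf spark.nomad.executor.cpu=2000 \
  --conf spark.nomad.executor.networkMBits=100 \
  --conf spark.nomad.executor.logMaxFiles=5 \
  https://example.com/spark-examples.jar 100
```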