---
layout: "aws"
page_title: "AWS: aws_emr_cluster"
sidebar_current: "docs-aws-resource-emr-cluster"
description: |-
  Provides an Elastic MapReduce Cluster
---

# aws\_emr\_cluster

Provides an Elastic MapReduce (EMR) cluster. Amazon EMR is a web service that
makes it easy to process large amounts of data efficiently. See the
[Amazon Elastic MapReduce Documentation](https://aws.amazon.com/documentation/elastic-mapreduce/)
for more information.

## Example Usage

```
resource "aws_emr_cluster" "emr-test-cluster" {
  name          = "emr-test-arn"
  release_label = "emr-4.6.0"
  applications  = ["Spark"]

  termination_protection            = false
  keep_job_flow_alive_when_no_steps = true

  ec2_attributes {
    subnet_id                         = "${aws_subnet.main.id}"
    emr_managed_master_security_group = "${aws_security_group.sg.id}"
    emr_managed_slave_security_group  = "${aws_security_group.sg.id}"
    instance_profile                  = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  master_instance_type = "m3.xlarge"
  core_instance_type   = "m3.xlarge"
  core_instance_count  = 1

  tags {
    role = "rolename"
    env  = "env"
  }

  bootstrap_action {
    path = "s3://elasticmapreduce/bootstrap-actions/run-if"
    name = "runif"
    args = ["instance.isMaster=true", "echo running on master node"]
  }

  configurations = "test-fixtures/emr_configurations.json"

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"
}
```

The `aws_emr_cluster` resource typically requires two IAM roles: one that the
EMR service assumes, and another that is placed on the cluster instances
(through an instance profile) so that they can interact with AWS. The suggested
role policy template for the EMR service is `AmazonElasticMapReduceRole`, and
`AmazonElasticMapReduceforEC2Role` for the EC2 profile. See the [Getting
Started](https://docs.aws.amazon.com/ElasticMapReduce/latest/ManagementGuide/emr-gs-launch-sample-cluster.html)
guide for more information on these IAM roles. There is also a fully bootable
example Terraform configuration at the bottom of this page.

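As a rough sketch, the two suggested policy templates could be attached as AWS
managed policies rather than written out inline. This assumes the
`aws_iam_role_policy_attachment` resource is available in your Terraform
version and that the managed policies live under the `service-role/` path. The
role names refer to the roles defined in the bootable example at the bottom of
this page, and the attachment resource names are made up for illustration.

```
# Sketch only: attach the AWS managed policies named above.
# Assumes the roles from the bootable example on this page.
resource "aws_iam_role_policy_attachment" "emr_service" {
  role       = "${aws_iam_role.iam_emr_service_role.name}"
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceRole"
}

resource "aws_iam_role_policy_attachment" "emr_ec2_profile" {
  role       = "${aws_iam_role.iam_emr_profile_role.name}"
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceforEC2Role"
}
```
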
## Argument Reference

The following arguments are supported:

* `name` - (Required) The name of the job flow
* `release_label` - (Required) The release label for the Amazon EMR release
* `master_instance_type` - (Required) The EC2 instance type of the master node
* `service_role` - (Required) IAM role that will be assumed by the Amazon EMR service to access AWS resources
* `core_instance_type` - (Optional) The EC2 instance type of the slave nodes
* `core_instance_count` - (Optional) Number of Amazon EC2 instances used to execute the job flow. EMR will use one node as the cluster's master node and use the remainder of the nodes (`core_instance_count`-1) as core nodes. Default `1`
* `log_uri` - (Optional) S3 bucket to write the log files of the job flow. If a value
	is not provided, logs are not created
* `applications` - (Optional) A list of applications for the cluster. Valid values are: `Flink`, `Hadoop`, `Hive`, `Mahout`, `Pig`, and `Spark`. Case insensitive
* `termination_protection` - (Optional) Whether termination protection is enabled (default is off)
* `keep_job_flow_alive_when_no_steps` - (Optional) Whether the cluster keeps running when it has no steps or when all steps are complete (default is on)
* `ec2_attributes` - (Optional) Attributes for the EC2 instances running the job
	flow. Defined below
* `bootstrap_action` - (Optional) List of bootstrap actions that will be run before Hadoop is started on
	the cluster nodes. Defined below
* `configurations` - (Optional) List of configurations supplied for the EMR cluster you are creating
* `visible_to_all_users` - (Optional) Whether the job flow is visible to all IAM users of the AWS account associated with the job flow. Default `true`
* `tags` - (Optional) List of tags to apply to the EMR Cluster

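The following sketch shows how some of the optional arguments above might be
combined. The resource name and the S3 bucket in `log_uri` are hypothetical,
and the IAM references assume the roles from the bootable example at the bottom
of this page.

```
# Sketch only: a cluster with logging and termination protection enabled.
resource "aws_emr_cluster" "with_logging" {
  name          = "emr-with-logging"
  release_label = "emr-4.6.0"
  applications  = ["Hadoop", "Spark"]

  master_instance_type = "m3.xlarge"
  core_instance_type   = "m3.xlarge"
  core_instance_count  = 2

  # Hypothetical bucket; if log_uri is omitted, logs are not created.
  log_uri = "s3://example-emr-logs/clusters/"

  termination_protection            = true
  keep_job_flow_alive_when_no_steps = true
  visible_to_all_users              = false

  ec2_attributes {
    instance_profile = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"
}
```
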
## ec2\_attributes

Attributes for the Amazon EC2 instances running the job flow:

* `key_name` - (Optional) Amazon EC2 key pair that can be used to SSH to the master
	node as the user called `hadoop`
* `subnet_id` - (Optional) VPC subnet ID where you want the job flow to launch.
	The `cc1.4xlarge` instance type cannot be specified for nodes of a job flow launched in an Amazon VPC
* `additional_master_security_groups` - (Optional) List of additional Amazon EC2 security group IDs for the master node
* `additional_slave_security_groups` - (Optional) List of additional Amazon EC2 security group IDs for the slave nodes
* `emr_managed_master_security_group` - (Optional) Identifier of the Amazon EC2 security group for the master node
* `emr_managed_slave_security_group` - (Optional) Identifier of the Amazon EC2 security group for the slave nodes
* `service_access_security_group` - (Optional) Identifier of the Amazon EC2 service-access security group. Required when the cluster runs on a private subnet
* `instance_profile` - (Required) Instance profile for the EC2 instances of the cluster, which assume this role

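As an illustration of the attributes above, a cluster launched into a private
subnet might look roughly like the following. The key pair name, the
`aws_subnet.private` resource, and the three security group resources are all
hypothetical and would need to be defined elsewhere in your configuration.

```
# Sketch only: ec2_attributes for a cluster on a private subnet.
resource "aws_emr_cluster" "private" {
  name          = "emr-private"
  release_label = "emr-4.6.0"

  master_instance_type = "m3.xlarge"
  core_instance_type   = "m3.xlarge"

  ec2_attributes {
    key_name  = "example-key"                # hypothetical key pair
    subnet_id = "${aws_subnet.private.id}"   # hypothetical private subnet

    emr_managed_master_security_group = "${aws_security_group.emr_master.id}"
    emr_managed_slave_security_group  = "${aws_security_group.emr_slave.id}"

    # Needed because the subnet is private.
    service_access_security_group = "${aws_security_group.emr_service_access.id}"

    instance_profile = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"
}
```

With `key_name` set, the master node should be reachable over SSH as the
`hadoop` user, provided the security groups and routing allow it.
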
## bootstrap\_action

* `name` - (Required) Name of the bootstrap action
* `path` - (Required) Location of the script to run during a bootstrap action. Can be either a location in Amazon S3 or on a local file system
* `args` - (Optional) List of command line arguments to pass to the bootstrap action script

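A bootstrap action that runs a custom script could be declared along these
lines. The bucket, script path, and arguments are placeholders, not files that
exist anywhere. The block belongs inside an `aws_emr_cluster` resource, shown
here with only the surrounding required arguments.

```
# Sketch only: run a custom script from S3 before Hadoop starts.
resource "aws_emr_cluster" "with_bootstrap" {
  name          = "emr-with-bootstrap"
  release_label = "emr-4.6.0"

  master_instance_type = "m3.xlarge"

  bootstrap_action {
    name = "install-deps"
    path = "s3://example-bucket/bootstrap/install-deps.sh" # hypothetical script
    args = ["--verbose"]                                   # hypothetical flag
  }

  ec2_attributes {
    instance_profile = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"
}
```
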
## Attributes Reference

The following attributes are exported:

* `id` - The ID of the EMR Cluster.
* `name` - The name of the cluster.
* `release_label` - The release label for the Amazon EMR release.
* `master_instance_type` - The EC2 instance type of the master node.
* `master_public_dns` - The public DNS name of the master EC2 instance.
* `core_instance_type` - The EC2 instance type of the slave nodes.
* `core_instance_count` - The number of slave nodes, i.e. EC2 instance nodes.
* `log_uri` - The path to the Amazon S3 location where logs for this cluster are stored.
* `applications` - The applications installed on this cluster.
* `ec2_attributes` - Provides information about the EC2 instances in a cluster grouped by category: key name, subnet ID, IAM instance profile, and so on.
* `bootstrap_action` - A list of bootstrap actions that will be run before Hadoop is started on the cluster nodes.
* `configurations` - The list of Configurations supplied to the EMR cluster.
* `service_role` - The IAM role that will be assumed by the Amazon EMR service to access AWS resources on your behalf.
* `visible_to_all_users` - Indicates whether the job flow is visible to all IAM users of the AWS account associated with the job flow.
* `tags` - The list of tags associated with a cluster.

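Exported attributes can be read with interpolation like any other resource
attribute. As a small sketch, the outputs below expose the cluster ID and the
master node's DNS name for the `emr-test-cluster` resource from the Example
Usage section. The output names are arbitrary.

```
# Sketch only: expose exported attributes as outputs.
output "emr_cluster_id" {
  value = "${aws_emr_cluster.emr-test-cluster.id}"
}

output "emr_master_public_dns" {
  value = "${aws_emr_cluster.emr-test-cluster.master_public_dns}"
}
```
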
## Example bootable config

**NOTE:** This example shows the minimal configuration needed to boot an EMR
Cluster. It is not meant to display best practices. Please use at your own
risk.

```
provider "aws" {
  region = "us-west-2"
}

resource "aws_emr_cluster" "tf-test-cluster" {
  name          = "emr-test-arn"
  release_label = "emr-4.6.0"
  applications  = ["Spark"]

  ec2_attributes {
    subnet_id                         = "${aws_subnet.main.id}"
    emr_managed_master_security_group = "${aws_security_group.allow_all.id}"
    emr_managed_slave_security_group  = "${aws_security_group.allow_all.id}"
    instance_profile                  = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  master_instance_type = "m3.xlarge"
  core_instance_type   = "m3.xlarge"
  core_instance_count  = 1

  tags {
    role     = "rolename"
    dns_zone = "env_zone"
    env      = "env"
    name     = "name-env"
  }

  bootstrap_action {
    path = "s3://elasticmapreduce/bootstrap-actions/run-if"
    name = "runif"
    args = ["instance.isMaster=true", "echo running on master node"]
  }

  configurations = "test-fixtures/emr_configurations.json"

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"
}

resource "aws_security_group" "allow_all" {
  name        = "allow_all"
  description = "Allow all inbound traffic"
  vpc_id      = "${aws_vpc.main.id}"

  ingress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  depends_on = ["aws_subnet.main"]

  lifecycle {
    ignore_changes = ["ingress", "egress"]
  }

  tags {
    name = "emr_test"
  }
}

resource "aws_vpc" "main" {
  cidr_block           = "168.31.0.0/16"
  enable_dns_hostnames = true

  tags {
    name = "emr_test"
  }
}

resource "aws_subnet" "main" {
  vpc_id     = "${aws_vpc.main.id}"
  cidr_block = "168.31.0.0/20"

  tags {
    name = "emr_test"
  }
}

resource "aws_internet_gateway" "gw" {
  vpc_id = "${aws_vpc.main.id}"
}

resource "aws_route_table" "r" {
  vpc_id = "${aws_vpc.main.id}"

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = "${aws_internet_gateway.gw.id}"
  }
}

resource "aws_main_route_table_association" "a" {
  vpc_id         = "${aws_vpc.main.id}"
  route_table_id = "${aws_route_table.r.id}"
}

###

# IAM Role setups

###

# IAM role for EMR Service
resource "aws_iam_role" "iam_emr_service_role" {
  name = "iam_emr_service_role"

  assume_role_policy = <<EOF
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "elasticmapreduce.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

resource "aws_iam_role_policy" "iam_emr_service_policy" {
  name = "iam_emr_service_policy"
  role = "${aws_iam_role.iam_emr_service_role.id}"

  policy = <<EOF
{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Resource": "*",
        "Action": [
            "ec2:AuthorizeSecurityGroupEgress",
            "ec2:AuthorizeSecurityGroupIngress",
            "ec2:CancelSpotInstanceRequests",
            "ec2:CreateNetworkInterface",
            "ec2:CreateSecurityGroup",
            "ec2:CreateTags",
            "ec2:DeleteNetworkInterface",
            "ec2:DeleteSecurityGroup",
            "ec2:DeleteTags",
            "ec2:DescribeAvailabilityZones",
            "ec2:DescribeAccountAttributes",
            "ec2:DescribeDhcpOptions",
            "ec2:DescribeInstanceStatus",
            "ec2:DescribeInstances",
            "ec2:DescribeKeyPairs",
            "ec2:DescribeNetworkAcls",
            "ec2:DescribeNetworkInterfaces",
            "ec2:DescribePrefixLists",
            "ec2:DescribeRouteTables",
            "ec2:DescribeSecurityGroups",
            "ec2:DescribeSpotInstanceRequests",
            "ec2:DescribeSpotPriceHistory",
            "ec2:DescribeSubnets",
            "ec2:DescribeVpcAttribute",
            "ec2:DescribeVpcEndpoints",
            "ec2:DescribeVpcEndpointServices",
            "ec2:DescribeVpcs",
            "ec2:DetachNetworkInterface",
            "ec2:ModifyImageAttribute",
            "ec2:ModifyInstanceAttribute",
            "ec2:RequestSpotInstances",
            "ec2:RevokeSecurityGroupEgress",
            "ec2:RunInstances",
            "ec2:TerminateInstances",
            "ec2:DeleteVolume",
            "ec2:DescribeVolumeStatus",
            "ec2:DescribeVolumes",
            "ec2:DetachVolume",
            "iam:GetRole",
            "iam:GetRolePolicy",
            "iam:ListInstanceProfiles",
            "iam:ListRolePolicies",
            "iam:PassRole",
            "s3:CreateBucket",
            "s3:Get*",
            "s3:List*",
            "sdb:BatchPutAttributes",
            "sdb:Select",
            "sqs:CreateQueue",
            "sqs:Delete*",
            "sqs:GetQueue*",
            "sqs:PurgeQueue",
            "sqs:ReceiveMessage"
        ]
    }]
}
EOF
}

# IAM Role for EC2 Instance Profile
resource "aws_iam_role" "iam_emr_profile_role" {
  name = "iam_emr_profile_role"

  assume_role_policy = <<EOF
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

resource "aws_iam_instance_profile" "emr_profile" {
  name  = "emr_profile"
  roles = ["${aws_iam_role.iam_emr_profile_role.name}"]
}

resource "aws_iam_role_policy" "iam_emr_profile_policy" {
  name = "iam_emr_profile_policy"
  role = "${aws_iam_role.iam_emr_profile_role.id}"

  policy = <<EOF
{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Resource": "*",
        "Action": [
            "cloudwatch:*",
            "dynamodb:*",
            "ec2:Describe*",
            "elasticmapreduce:Describe*",
            "elasticmapreduce:ListBootstrapActions",
            "elasticmapreduce:ListClusters",
            "elasticmapreduce:ListInstanceGroups",
            "elasticmapreduce:ListInstances",
            "elasticmapreduce:ListSteps",
            "kinesis:CreateStream",
            "kinesis:DeleteStream",
            "kinesis:DescribeStream",
            "kinesis:GetRecords",
            "kinesis:GetShardIterator",
            "kinesis:MergeShards",
            "kinesis:PutRecord",
            "kinesis:SplitShard",
            "rds:Describe*",
            "s3:*",
            "sdb:*",
            "sns:*",
            "sqs:*"
        ]
    }]
}
EOF
}
```