---
layout: "aws"
page_title: "AWS: aws_emr_cluster"
sidebar_current: "docs-aws-resource-emr-cluster"
description: |-
  Provides an Elastic MapReduce Cluster
---

# aws\_emr\_cluster

Provides an Elastic MapReduce Cluster, a web service that makes it easy to
process large amounts of data efficiently. See [Amazon Elastic MapReduce Documentation](https://aws.amazon.com/documentation/elastic-mapreduce/)
for more information.

## Example Usage

```hcl
resource "aws_emr_cluster" "emr-test-cluster" {
  name          = "emr-test-arn"
  release_label = "emr-4.6.0"
  applications  = ["Spark"]

  termination_protection            = false
  keep_job_flow_alive_when_no_steps = true

  ec2_attributes {
    subnet_id                         = "${aws_subnet.main.id}"
    emr_managed_master_security_group = "${aws_security_group.sg.id}"
    emr_managed_slave_security_group  = "${aws_security_group.sg.id}"
    instance_profile                  = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  master_instance_type = "m3.xlarge"
  core_instance_type   = "m3.xlarge"
  core_instance_count  = 1

  tags {
    role = "rolename"
    env  = "env"
  }

  bootstrap_action {
    path = "s3://elasticmapreduce/bootstrap-actions/run-if"
    name = "runif"
    args = ["instance.isMaster=true", "echo running on master node"]
  }

  configurations = "test-fixtures/emr_configurations.json"

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"
}
```

The `aws_emr_cluster` resource typically requires two IAM roles: one for the
EMR service to assume, and another attached to your cluster instances so they
can interact with AWS. The suggested role policy template for the EMR service
is `AmazonElasticMapReduceRole`, and `AmazonElasticMapReduceforEC2Role` for the
EC2 instance profile. See the [Getting
Started](https://docs.aws.amazon.com/ElasticMapReduce/latest/ManagementGuide/emr-gs-launch-sample-cluster.html)
guide for more information on these IAM roles. There is also a fully bootable
example Terraform configuration at the bottom of this page.
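
As a rough sketch, the two suggested policy templates could be attached as AWS managed policies rather than written inline. This assumes the roles `iam_emr_service_role` and `iam_emr_profile_role` are defined as in the bootable configuration at the bottom of this page; the attachment resource names are hypothetical.

```hcl
# Hypothetical attachment names. The two roles are assumed to be defined
# elsewhere, for example as in the bootable configuration on this page.
resource "aws_iam_role_policy_attachment" "emr_service" {
  role       = "${aws_iam_role.iam_emr_service_role.name}"
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceRole"
}

resource "aws_iam_role_policy_attachment" "emr_ec2" {
  role       = "${aws_iam_role.iam_emr_profile_role.name}"
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceforEC2Role"
}
```

Managed policies are maintained by AWS, so this trades fine-grained control for less policy JSON to keep up to date.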

## Argument Reference

The following arguments are supported:

* `name` - (Required) The name of the job flow
* `release_label` - (Required) The release label for the Amazon EMR release
* `master_instance_type` - (Required) The EC2 instance type of the master node
* `service_role` - (Required) IAM role that will be assumed by the Amazon EMR service to access AWS resources
* `security_configuration` - (Optional) The security configuration name to attach to the EMR cluster. Only valid for EMR clusters with `release_label` 4.8.0 or greater
* `core_instance_type` - (Optional) The EC2 instance type of the slave nodes
* `core_instance_count` - (Optional) Number of Amazon EC2 instances used to execute the job flow. EMR will use one node as the cluster's master node and use the remainder of the nodes (`core_instance_count` - 1) as core nodes. Default `1`
* `log_uri` - (Optional) S3 bucket to write the log files of the job flow. If a value is not provided, logs are not created
* `applications` - (Optional) A list of applications for the cluster. Valid values are: `Flink`, `Hadoop`, `Hive`, `Mahout`, `Pig`, and `Spark`. Case insensitive
* `termination_protection` - (Optional) Switch termination protection on or off (default is off)
* `keep_job_flow_alive_when_no_steps` - (Optional) Whether the cluster keeps running when it has no steps or when all steps are complete (default is on)
* `ec2_attributes` - (Optional) Attributes for the EC2 instances running the job flow. Defined below
* `bootstrap_action` - (Optional) List of bootstrap actions that will be run before Hadoop is started on the cluster nodes. Defined below
* `configurations` - (Optional) List of configurations supplied for the EMR cluster you are creating
* `visible_to_all_users` - (Optional) Whether the job flow is visible to all IAM users of the AWS account associated with the job flow. Default `true`
* `autoscaling_role` - (Optional) An IAM role for automatic scaling policies. The IAM role provides permissions that the automatic scaling feature requires to launch and terminate EC2 instances in an instance group
* `tags` - (Optional) List of tags to apply to the EMR Cluster
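
The sketch below shows how several of the optional arguments might be combined. The cluster name, the bucket `example-emr-logs`, and the `s3://` URI form for `log_uri` are assumptions for illustration; the IAM resources are assumed to exist as in the bootable configuration at the bottom of this page.

```hcl
# Sketch only: names and the log bucket are placeholders.
resource "aws_emr_cluster" "with_logging" {
  name          = "emr-with-logging"
  release_label = "emr-4.6.0"
  applications  = ["Spark", "Hive"]

  master_instance_type = "m3.xlarge"
  core_instance_type   = "m3.xlarge"

  # Per the description above: one master node plus (2 - 1) core nodes.
  core_instance_count = 2

  # Assumed bucket; without log_uri no logs are created.
  log_uri = "s3://example-emr-logs/"

  termination_protection            = false
  keep_job_flow_alive_when_no_steps = true
  visible_to_all_users              = true

  ec2_attributes {
    instance_profile = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"
}
```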

## ec2\_attributes

Attributes for the Amazon EC2 instances running the job flow

* `key_name` - (Optional) Amazon EC2 key pair that can be used to SSH into the master node as the user called `hadoop`
* `subnet_id` - (Optional) VPC subnet ID where you want the job flow to launch. The `cc1.4xlarge` instance type cannot be specified for nodes of a job flow launched in an Amazon VPC
* `additional_master_security_groups` - (Optional) List of additional Amazon EC2 security group IDs for the master node
* `additional_slave_security_groups` - (Optional) List of additional Amazon EC2 security group IDs for the slave nodes
* `emr_managed_master_security_group` - (Optional) Identifier of the Amazon EC2 security group for the master node
* `emr_managed_slave_security_group` - (Optional) Identifier of the Amazon EC2 security group for the slave nodes
* `service_access_security_group` - (Optional) Identifier of the Amazon EC2 service-access security group. Required when the cluster runs on a private subnet
* `instance_profile` - (Required) Instance profile for the EC2 instances of the cluster, which assume this role
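
For a cluster in a private subnet, the `ec2_attributes` block might look like the following sketch. The key pair name and the subnet and security group resources (`aws_subnet.private`, `aws_security_group.emr_master`, `aws_security_group.emr_slave`, `aws_security_group.emr_service_access`) are hypothetical and would need to be defined elsewhere.

```hcl
# Sketch only: the key pair, subnet, and security groups are placeholders.
resource "aws_emr_cluster" "private" {
  name          = "emr-private-subnet"
  release_label = "emr-4.6.0"

  master_instance_type = "m3.xlarge"

  ec2_attributes {
    key_name                          = "example-key-pair"
    subnet_id                         = "${aws_subnet.private.id}"
    emr_managed_master_security_group = "${aws_security_group.emr_master.id}"
    emr_managed_slave_security_group  = "${aws_security_group.emr_slave.id}"

    # Required when the cluster runs on a private subnet.
    service_access_security_group = "${aws_security_group.emr_service_access.id}"

    instance_profile = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"
}
```

With `key_name` set, the matching private key can be used to log in to the master node as the `hadoop` user.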

## bootstrap\_action

* `name` - (Required) Name of the bootstrap action
* `path` - (Required) Location of the script to run during a bootstrap action. Can be either a location in Amazon S3 or on a local file system
* `args` - (Optional) List of command line arguments to pass to the bootstrap action script
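
A sketch of a cluster with two bootstrap actions follows. It assumes the `bootstrap_action` block can be repeated, as "List of bootstrap actions" suggests. The first action is taken from the example on this page; the second script path, its name, and its arguments are hypothetical.

```hcl
# Sketch only: the second bootstrap script and its arguments are placeholders.
resource "aws_emr_cluster" "bootstrapped" {
  name          = "emr-bootstrapped"
  release_label = "emr-4.6.0"

  master_instance_type = "m3.xlarge"

  ec2_attributes {
    instance_profile = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  # Runs the given command only on the master node.
  bootstrap_action {
    name = "runif"
    path = "s3://elasticmapreduce/bootstrap-actions/run-if"
    args = ["instance.isMaster=true", "echo running on master node"]
  }

  # Hypothetical script stored in your own bucket.
  bootstrap_action {
    name = "install-deps"
    path = "s3://example-bucket/bootstrap/install-deps.sh"
    args = ["--verbose"]
  }

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"
}
```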

## Attributes Reference

The following attributes are exported:

* `id` - The ID of the EMR Cluster.
* `name` - The name of the cluster.
* `release_label` - The release label for the Amazon EMR release.
* `master_instance_type` - The EC2 instance type of the master node.
* `master_public_dns` - The public DNS name of the master EC2 instance.
* `core_instance_type` - The EC2 instance type of the slave nodes.
* `core_instance_count` - The number of slave nodes, i.e. EC2 instance nodes.
* `log_uri` - The path to the Amazon S3 location where logs for this cluster are stored.
* `applications` - The applications installed on this cluster.
* `ec2_attributes` - Provides information about the EC2 instances in a cluster grouped by category: key name, subnet ID, IAM instance profile, and so on.
* `bootstrap_action` - A list of bootstrap actions that will be run before Hadoop is started on the cluster nodes.
* `configurations` - The list of Configurations supplied to the EMR cluster.
* `service_role` - The IAM role that will be assumed by the Amazon EMR service to access AWS resources on your behalf.
* `visible_to_all_users` - Indicates whether the job flow is visible to all IAM users of the AWS account associated with the job flow.
* `tags` - The list of tags associated with a cluster.
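
Exported attributes can be referenced from other resources or surfaced as outputs. The sketch below assumes the `emr-test-cluster` resource from the Example Usage section; the output names are hypothetical.

```hcl
# Output names are placeholders; the cluster is the one from Example Usage.
output "emr_cluster_id" {
  value = "${aws_emr_cluster.emr-test-cluster.id}"
}

output "emr_master_public_dns" {
  value = "${aws_emr_cluster.emr-test-cluster.master_public_dns}"
}
```

After `terraform apply`, running `terraform output emr_master_public_dns` prints the master node's public DNS name.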

## Example bootable config

**NOTE:** This example shows the minimal configuration needed to boot an EMR
Cluster. It is not meant to display best practices. Please use at your own
risk.

```hcl
provider "aws" {
  region = "us-west-2"
}

resource "aws_emr_cluster" "tf-test-cluster" {
  name          = "emr-test-arn"
  release_label = "emr-4.6.0"
  applications  = ["Spark"]

  ec2_attributes {
    subnet_id                         = "${aws_subnet.main.id}"
    emr_managed_master_security_group = "${aws_security_group.allow_all.id}"
    emr_managed_slave_security_group  = "${aws_security_group.allow_all.id}"
    instance_profile                  = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  master_instance_type = "m3.xlarge"
  core_instance_type   = "m3.xlarge"
  core_instance_count  = 1

  tags {
    role     = "rolename"
    dns_zone = "env_zone"
    env      = "env"
    name     = "name-env"
  }

  bootstrap_action {
    path = "s3://elasticmapreduce/bootstrap-actions/run-if"
    name = "runif"
    args = ["instance.isMaster=true", "echo running on master node"]
  }

  configurations = "test-fixtures/emr_configurations.json"

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"
}

resource "aws_security_group" "allow_all" {
  name        = "allow_all"
  description = "Allow all inbound traffic"
  vpc_id      = "${aws_vpc.main.id}"

  ingress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  depends_on = ["aws_subnet.main"]

  lifecycle {
    ignore_changes = ["ingress", "egress"]
  }

  tags {
    name = "emr_test"
  }
}

resource "aws_vpc" "main" {
  cidr_block           = "168.31.0.0/16"
  enable_dns_hostnames = true

  tags {
    name = "emr_test"
  }
}

resource "aws_subnet" "main" {
  vpc_id     = "${aws_vpc.main.id}"
  cidr_block = "168.31.0.0/20"

  tags {
    name = "emr_test"
  }
}

resource "aws_internet_gateway" "gw" {
  vpc_id = "${aws_vpc.main.id}"
}

resource "aws_route_table" "r" {
  vpc_id = "${aws_vpc.main.id}"

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = "${aws_internet_gateway.gw.id}"
  }
}

resource "aws_main_route_table_association" "a" {
  vpc_id         = "${aws_vpc.main.id}"
  route_table_id = "${aws_route_table.r.id}"
}

###

# IAM Role setups

###

# IAM role for EMR Service
resource "aws_iam_role" "iam_emr_service_role" {
  name = "iam_emr_service_role"

  assume_role_policy = <<EOF
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "elasticmapreduce.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

resource "aws_iam_role_policy" "iam_emr_service_policy" {
  name = "iam_emr_service_policy"
  role = "${aws_iam_role.iam_emr_service_role.id}"

  policy = <<EOF
{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Resource": "*",
        "Action": [
            "ec2:AuthorizeSecurityGroupEgress",
            "ec2:AuthorizeSecurityGroupIngress",
            "ec2:CancelSpotInstanceRequests",
            "ec2:CreateNetworkInterface",
            "ec2:CreateSecurityGroup",
            "ec2:CreateTags",
            "ec2:DeleteNetworkInterface",
            "ec2:DeleteSecurityGroup",
            "ec2:DeleteTags",
            "ec2:DescribeAvailabilityZones",
            "ec2:DescribeAccountAttributes",
            "ec2:DescribeDhcpOptions",
            "ec2:DescribeInstanceStatus",
            "ec2:DescribeInstances",
            "ec2:DescribeKeyPairs",
            "ec2:DescribeNetworkAcls",
            "ec2:DescribeNetworkInterfaces",
            "ec2:DescribePrefixLists",
            "ec2:DescribeRouteTables",
            "ec2:DescribeSecurityGroups",
            "ec2:DescribeSpotInstanceRequests",
            "ec2:DescribeSpotPriceHistory",
            "ec2:DescribeSubnets",
            "ec2:DescribeVpcAttribute",
            "ec2:DescribeVpcEndpoints",
            "ec2:DescribeVpcEndpointServices",
            "ec2:DescribeVpcs",
            "ec2:DetachNetworkInterface",
            "ec2:ModifyImageAttribute",
            "ec2:ModifyInstanceAttribute",
            "ec2:RequestSpotInstances",
            "ec2:RevokeSecurityGroupEgress",
            "ec2:RunInstances",
            "ec2:TerminateInstances",
            "ec2:DeleteVolume",
            "ec2:DescribeVolumeStatus",
            "ec2:DescribeVolumes",
            "ec2:DetachVolume",
            "iam:GetRole",
            "iam:GetRolePolicy",
            "iam:ListInstanceProfiles",
            "iam:ListRolePolicies",
            "iam:PassRole",
            "s3:CreateBucket",
            "s3:Get*",
            "s3:List*",
            "sdb:BatchPutAttributes",
            "sdb:Select",
            "sqs:CreateQueue",
            "sqs:Delete*",
            "sqs:GetQueue*",
            "sqs:PurgeQueue",
            "sqs:ReceiveMessage"
        ]
    }]
}
EOF
}

# IAM Role for EC2 Instance Profile
resource "aws_iam_role" "iam_emr_profile_role" {
  name = "iam_emr_profile_role"

  assume_role_policy = <<EOF
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

resource "aws_iam_instance_profile" "emr_profile" {
  name  = "emr_profile"
  roles = ["${aws_iam_role.iam_emr_profile_role.name}"]
}

resource "aws_iam_role_policy" "iam_emr_profile_policy" {
  name = "iam_emr_profile_policy"
  role = "${aws_iam_role.iam_emr_profile_role.id}"

  policy = <<EOF
{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Resource": "*",
        "Action": [
            "cloudwatch:*",
            "dynamodb:*",
            "ec2:Describe*",
            "elasticmapreduce:Describe*",
            "elasticmapreduce:ListBootstrapActions",
            "elasticmapreduce:ListClusters",
            "elasticmapreduce:ListInstanceGroups",
            "elasticmapreduce:ListInstances",
            "elasticmapreduce:ListSteps",
            "kinesis:CreateStream",
            "kinesis:DeleteStream",
            "kinesis:DescribeStream",
            "kinesis:GetRecords",
            "kinesis:GetShardIterator",
            "kinesis:MergeShards",
            "kinesis:PutRecord",
            "kinesis:SplitShard",
            "rds:Describe*",
            "s3:*",
            "sdb:*",
            "sns:*",
            "sqs:*"
        ]
    }]
}
EOF
}
```