---
layout: "aws"
page_title: "AWS: aws_emr_cluster"
sidebar_current: "docs-aws-resource-emr-cluster"
description: |-
  Provides an Elastic MapReduce Cluster
---

# aws\_emr\_cluster

Provides an Elastic MapReduce Cluster, a web service that makes it easy to
process large amounts of data efficiently. See [Amazon Elastic MapReduce Documentation](https://aws.amazon.com/documentation/elastic-mapreduce/)
for more information.

## Example Usage

```
resource "aws_emr_cluster" "emr-test-cluster" {
  name          = "emr-test-arn"
  release_label = "emr-4.6.0"
  applications  = ["Spark"]

  ec2_attributes {
    subnet_id                         = "${aws_subnet.main.id}"
    emr_managed_master_security_group = "${aws_security_group.sg.id}"
    emr_managed_slave_security_group  = "${aws_security_group.sg.id}"
    instance_profile                  = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  master_instance_type = "m3.xlarge"
  core_instance_type   = "m3.xlarge"
  core_instance_count  = 1

  tags {
    role = "rolename"
    env  = "env"
  }

  bootstrap_action {
    path = "s3://elasticmapreduce/bootstrap-actions/run-if"
    name = "runif"
    args = ["instance.isMaster=true", "echo running on master node"]
  }

  configurations = "test-fixtures/emr_configurations.json"

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"
}
```

The `aws_emr_cluster` resource typically requires two IAM roles: one for the EMR Cluster
to use as a service, and another to place on your Cluster Instances so they can interact
with AWS. The suggested role policy template for the EMR service is `AmazonElasticMapReduceRole`,
and `AmazonElasticMapReduceforEC2Role` for the EC2 profile. See the [Getting
Started](https://docs.aws.amazon.com/ElasticMapReduce/latest/ManagementGuide/emr-gs-launch-sample-cluster.html)
guide for more information on these IAM roles. There is also a fully-bootable
example Terraform configuration at the bottom of this page.

## Argument Reference

The following arguments are supported:

* `name` - (Required) The name of the job flow
* `release_label` - (Required) The release label for the Amazon EMR release
* `master_instance_type` - (Required) The EC2 instance type of the master node
* `core_instance_type` - (Optional) The EC2 instance type of the slave nodes
* `core_instance_count` - (Optional) Number of Amazon EC2 instances used to execute the job flow. EMR will use one node as the cluster's master node and use the remainder of the nodes (`core_instance_count`-1) as core nodes. Default `1`
* `log_uri` - (Optional) S3 bucket to write the log files of the job flow. If a value
	is not provided, logs are not created
* `applications` - (Optional) A list of applications for the cluster. Valid values are: `Hadoop`, `Hive`,
	`Mahout`, `Pig`, and `Spark`. Case insensitive
* `ec2_attributes` - (Optional) Attributes for the EC2 instances running the job
flow. Defined below
* `bootstrap_action` - (Optional) List of bootstrap actions that will be run before Hadoop is started on
	the cluster nodes. Defined below
* `configurations` - (Optional) List of configurations supplied for the EMR cluster you are creating
* `service_role` - (Optional) IAM role that will be assumed by the Amazon EMR service to access AWS resources
* `visible_to_all_users` - (Optional) Whether the job flow is visible to all IAM users of the AWS account associated with the job flow. Default `true`
* `tags` - (Optional) List of tags to apply to the EMR Cluster

## ec2\_attributes

Attributes for the Amazon EC2 instances running the job flow

* `key_name` - (Optional) Amazon EC2 key pair that can be used to SSH to the master
	node as the user called `hadoop`
* `subnet_id` - (Optional) VPC subnet ID where you want the job flow to launch.
The `cc1.4xlarge` instance type cannot be specified for nodes of a job flow launched in an Amazon VPC
* `additional_master_security_groups` - (Optional) List of additional Amazon EC2 security group IDs for the master node
* `additional_slave_security_groups` - (Optional) List of additional Amazon EC2 security group IDs for the slave nodes
* `emr_managed_master_security_group` - (Optional) Identifier of the Amazon EC2 security group for the master node
* `emr_managed_slave_security_group` - (Optional) Identifier of the Amazon EC2 security group for the slave nodes
* `service_access_security_group` - (Optional) Identifier of the Amazon EC2 service-access security group. Required when the cluster runs on a private subnet
* `instance_profile` - (Optional) Instance profile that the cluster's EC2 instances assume

## bootstrap\_action

* `name` - (Required) Name of the bootstrap action
* `path` - (Required) Location of the script to run during a bootstrap action. Can be either a location in Amazon S3 or on a local file system
* `args` - (Optional) List of command line arguments to pass to the bootstrap action script

## Attributes Reference

The following attributes are exported:

* `id` - The ID of the EMR Cluster
* `name`
* `release_label`
* `master_instance_type`
* `core_instance_type`
* `core_instance_count`
* `log_uri`
* `applications`
* `ec2_attributes`
* `bootstrap_action`
* `configurations`
* `service_role`
* `visible_to_all_users`
* `tags`

## Example bootable config

**NOTE:** This configuration demonstrates a minimal configuration needed to
boot an example EMR Cluster. It is not meant to display best practices. Please
use at your own risk.

```
provider "aws" {
  region = "us-west-2"
}

resource "aws_emr_cluster" "tf-test-cluster" {
  name          = "emr-test-arn"
  release_label = "emr-4.6.0"
  applications  = ["Spark"]

  ec2_attributes {
    subnet_id                         = "${aws_subnet.main.id}"
    emr_managed_master_security_group = "${aws_security_group.allow_all.id}"
    emr_managed_slave_security_group  = "${aws_security_group.allow_all.id}"
    instance_profile                  = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  master_instance_type = "m3.xlarge"
  core_instance_type   = "m3.xlarge"
  core_instance_count  = 1

  tags {
    role     = "rolename"
    dns_zone = "env_zone"
    env      = "env"
    name     = "name-env"
  }

  bootstrap_action {
    path = "s3://elasticmapreduce/bootstrap-actions/run-if"
    name = "runif"
    args = ["instance.isMaster=true", "echo running on master node"]
  }

  configurations = "test-fixtures/emr_configurations.json"

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"
}

resource "aws_security_group" "allow_all" {
  name        = "allow_all"
  description = "Allow all inbound traffic"
  vpc_id      = "${aws_vpc.main.id}"

  ingress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  depends_on = ["aws_subnet.main"]

  lifecycle {
    ignore_changes = ["ingress", "egress"]
  }

  tags {
    name = "emr_test"
  }
}

resource "aws_vpc" "main" {
  cidr_block           = "168.31.0.0/16"
  enable_dns_hostnames = true

  tags {
    name = "emr_test"
  }
}

resource "aws_subnet" "main" {
  vpc_id     = "${aws_vpc.main.id}"
  cidr_block = "168.31.0.0/20"

  tags {
    name = "emr_test"
  }
}

resource "aws_internet_gateway" "gw" {
  vpc_id = "${aws_vpc.main.id}"
}

resource "aws_route_table" "r" {
  vpc_id = "${aws_vpc.main.id}"

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = "${aws_internet_gateway.gw.id}"
  }
}

resource "aws_main_route_table_association" "a" {
  vpc_id         = "${aws_vpc.main.id}"
  route_table_id = "${aws_route_table.r.id}"
}

###

# IAM Role setups

###

# IAM role for EMR Service
resource "aws_iam_role" "iam_emr_service_role" {
  name = "iam_emr_service_role"

  assume_role_policy = <<EOF
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "elasticmapreduce.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

resource "aws_iam_role_policy" "iam_emr_service_policy" {
  name = "iam_emr_service_policy"
  role = "${aws_iam_role.iam_emr_service_role.id}"

  policy = <<EOF
{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Resource": "*",
        "Action": [
            "ec2:AuthorizeSecurityGroupEgress",
            "ec2:AuthorizeSecurityGroupIngress",
            "ec2:CancelSpotInstanceRequests",
            "ec2:CreateNetworkInterface",
            "ec2:CreateSecurityGroup",
            "ec2:CreateTags",
            "ec2:DeleteNetworkInterface",
            "ec2:DeleteSecurityGroup",
            "ec2:DeleteTags",
            "ec2:DescribeAvailabilityZones",
            "ec2:DescribeAccountAttributes",
            "ec2:DescribeDhcpOptions",
            "ec2:DescribeInstanceStatus",
            "ec2:DescribeInstances",
            "ec2:DescribeKeyPairs",
            "ec2:DescribeNetworkAcls",
            "ec2:DescribeNetworkInterfaces",
            "ec2:DescribePrefixLists",
            "ec2:DescribeRouteTables",
            "ec2:DescribeSecurityGroups",
            "ec2:DescribeSpotInstanceRequests",
            "ec2:DescribeSpotPriceHistory",
            "ec2:DescribeSubnets",
            "ec2:DescribeVpcAttribute",
            "ec2:DescribeVpcEndpoints",
            "ec2:DescribeVpcEndpointServices",
            "ec2:DescribeVpcs",
            "ec2:DetachNetworkInterface",
            "ec2:ModifyImageAttribute",
            "ec2:ModifyInstanceAttribute",
            "ec2:RequestSpotInstances",
            "ec2:RevokeSecurityGroupEgress",
            "ec2:RunInstances",
            "ec2:TerminateInstances",
            "ec2:DeleteVolume",
            "ec2:DescribeVolumeStatus",
            "ec2:DescribeVolumes",
            "ec2:DetachVolume",
            "iam:GetRole",
            "iam:GetRolePolicy",
            "iam:ListInstanceProfiles",
            "iam:ListRolePolicies",
            "iam:PassRole",
            "s3:CreateBucket",
            "s3:Get*",
            "s3:List*",
            "sdb:BatchPutAttributes",
            "sdb:Select",
            "sqs:CreateQueue",
            "sqs:Delete*",
            "sqs:GetQueue*",
            "sqs:PurgeQueue",
            "sqs:ReceiveMessage"
        ]
    }]
}
EOF
}

# IAM Role for EC2 Instance Profile
resource "aws_iam_role" "iam_emr_profile_role" {
  name = "iam_emr_profile_role"

  assume_role_policy = <<EOF
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

resource "aws_iam_instance_profile" "emr_profile" {
  name  = "emr_profile"
  roles = ["${aws_iam_role.iam_emr_profile_role.name}"]
}

resource "aws_iam_role_policy" "iam_emr_profile_policy" {
  name = "iam_emr_profile_policy"
  role = "${aws_iam_role.iam_emr_profile_role.id}"

  policy = <<EOF
{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Resource": "*",
        "Action": [
            "cloudwatch:*",
            "dynamodb:*",
            "ec2:Describe*",
            "elasticmapreduce:Describe*",
            "elasticmapreduce:ListBootstrapActions",
            "elasticmapreduce:ListClusters",
            "elasticmapreduce:ListInstanceGroups",
            "elasticmapreduce:ListInstances",
            "elasticmapreduce:ListSteps",
            "kinesis:CreateStream",
            "kinesis:DeleteStream",
            "kinesis:DescribeStream",
            "kinesis:GetRecords",
            "kinesis:GetShardIterator",
            "kinesis:MergeShards",
            "kinesis:PutRecord",
            "kinesis:SplitShard",
            "rds:Describe*",
            "s3:*",
            "sdb:*",
            "sns:*",
            "sqs:*"
        ]
    }]
}
EOF
}
```