---
layout: "aws"
page_title: "AWS: aws_emr_cluster"
sidebar_current: "docs-aws-resource-emr-cluster"
description: |-
  Provides an Elastic MapReduce Cluster
---

# aws\_emr\_cluster

Provides an Elastic MapReduce Cluster, a web service that makes it easy to
process large amounts of data efficiently. See [Amazon Elastic MapReduce Documentation](https://aws.amazon.com/documentation/elastic-mapreduce/)
for more information.

## Example Usage

```
resource "aws_emr_cluster" "emr-test-cluster" {
  name          = "emr-test-arn"
  release_label = "emr-4.6.0"
  applications  = ["Spark"]

  ec2_attributes {
    subnet_id                         = "${aws_subnet.main.id}"
    emr_managed_master_security_group = "${aws_security_group.sg.id}"
    emr_managed_slave_security_group  = "${aws_security_group.sg.id}"
    instance_profile                  = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  master_instance_type = "m3.xlarge"
  core_instance_type   = "m3.xlarge"
  core_instance_count  = 1

  tags {
    role = "rolename"
    env  = "env"
  }

  bootstrap_action {
    path = "s3://elasticmapreduce/bootstrap-actions/run-if"
    name = "runif"
    args = ["instance.isMaster=true", "echo running on master node"]
  }

  configurations = "test-fixtures/emr_configurations.json"

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"
}
```

The `aws_emr_cluster` resource typically requires two IAM roles, one for the EMR Cluster
to use as a service, and another to place on your Cluster Instances to interact
with AWS from those instances. The suggested role policy template for the EMR service is `AmazonElasticMapReduceRole`,
and `AmazonElasticMapReduceforEC2Role` for the EC2 profile.
See the [Getting Started](https://docs.aws.amazon.com/ElasticMapReduce/latest/ManagementGuide/emr-gs-launch-sample-cluster.html)
guide for more information on these IAM roles. There is also a fully-bootable
example Terraform configuration at the bottom of this page.

## Argument Reference

The following arguments are supported:

* `name` - (Required) The name of the job flow
* `release_label` - (Required) The release label for the Amazon EMR release
* `master_instance_type` - (Required) The EC2 instance type of the master node
* `core_instance_type` - (Optional) The EC2 instance type of the slave nodes
* `core_instance_count` - (Optional) Number of Amazon EC2 instances used to execute the job flow. Default `0`
* `log_uri` - (Optional) S3 bucket to write the log files of the job flow. If a value
  is not provided, logs are not created
* `applications` - (Optional) A list of applications for the cluster. Valid values are: `Hadoop`, `Hive`,
  `Mahout`, `Pig`, and `Spark`. Case insensitive
* `ec2_attributes` - (Optional) Attributes for the EC2 instances running the job
  flow. Defined below
* `bootstrap_action` - (Optional) List of bootstrap actions that will be run before Hadoop is started on
  the cluster nodes. Defined below
* `configurations` - (Optional) List of configurations supplied for the EMR cluster you are creating
* `service_role` - (Optional) IAM role that will be assumed by the Amazon EMR service to access AWS resources
* `visible_to_all_users` - (Optional) Whether the job flow is visible to all IAM users of the AWS account associated with the job flow.
  Default `true`
* `tags` - (Optional) List of tags to apply to the EMR Cluster



## ec2\_attributes

Attributes for the Amazon EC2 instances running the job flow

* `key_name` - (Optional) Amazon EC2 key pair that can be used to SSH to the master
  node as the user called `hadoop`
* `subnet_id` - (Optional) VPC subnet ID where you want the job flow to launch.
  The `cc1.4xlarge` instance type cannot be specified for nodes of a job flow launched in an Amazon VPC
* `additional_master_security_groups` - (Optional) List of additional Amazon EC2 security group IDs for the master node
* `additional_slave_security_groups` - (Optional) List of additional Amazon EC2 security group IDs for the slave nodes
* `emr_managed_master_security_group` - (Optional) Identifier of the Amazon EC2 security group for the master node
* `emr_managed_slave_security_group` - (Optional) Identifier of the Amazon EC2 security group for the slave nodes
* `instance_profile` - (Optional) Instance profile for the EC2 instances of the cluster. The instances assume this role


## bootstrap\_action

* `name` - (Required) Name of the bootstrap action
* `path` - (Required) Location of the script to run during a bootstrap action. Can be either a location in Amazon S3 or on a local file system
* `args` - (Optional) List of command line arguments to pass to the bootstrap action script

## Attributes Reference

The following attributes are exported:

* `id` - The ID of the EMR Cluster
* `name`
* `release_label`
* `master_instance_type`
* `core_instance_type`
* `core_instance_count`
* `log_uri`
* `applications`
* `ec2_attributes`
* `bootstrap_action`
* `configurations`
* `service_role`
* `visible_to_all_users`
* `tags`


## Example bootable config

**NOTE:** This configuration demonstrates a minimal configuration needed to
boot an example EMR Cluster.
It is not meant to display best practices. Please
use at your own risk.


```
provider "aws" {
  region = "us-west-2"
}

resource "aws_emr_cluster" "tf-test-cluster" {
  name          = "emr-test-arn"
  release_label = "emr-4.6.0"
  applications  = ["Spark"]

  ec2_attributes {
    subnet_id                         = "${aws_subnet.main.id}"
    emr_managed_master_security_group = "${aws_security_group.allow_all.id}"
    emr_managed_slave_security_group  = "${aws_security_group.allow_all.id}"
    instance_profile                  = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  master_instance_type = "m3.xlarge"
  core_instance_type   = "m3.xlarge"
  core_instance_count  = 1

  tags {
    role     = "rolename"
    dns_zone = "env_zone"
    env      = "env"
    name     = "name-env"
  }

  bootstrap_action {
    path = "s3://elasticmapreduce/bootstrap-actions/run-if"
    name = "runif"
    args = ["instance.isMaster=true", "echo running on master node"]
  }

  configurations = "test-fixtures/emr_configurations.json"

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"
}

resource "aws_security_group" "allow_all" {
  name        = "allow_all"
  description = "Allow all inbound traffic"
  vpc_id      = "${aws_vpc.main.id}"

  ingress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  depends_on = ["aws_subnet.main"]

  lifecycle {
    ignore_changes = ["ingress", "egress"]
  }

  tags {
    name = "emr_test"
  }
}

resource "aws_vpc" "main" {
  cidr_block           = "168.31.0.0/16"
  enable_dns_hostnames = true

  tags {
    name = "emr_test"
  }
}

resource "aws_subnet" "main" {
  vpc_id     = "${aws_vpc.main.id}"
  cidr_block = "168.31.0.0/20"

  tags {
    name = "emr_test"
  }
}

resource "aws_internet_gateway" "gw" {
  vpc_id = "${aws_vpc.main.id}"
}

resource "aws_route_table" "r" {
  vpc_id = "${aws_vpc.main.id}"

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = "${aws_internet_gateway.gw.id}"
  }
}

resource "aws_main_route_table_association" "a" {
  vpc_id         = "${aws_vpc.main.id}"
  route_table_id = "${aws_route_table.r.id}"
}

###

# IAM Role setups

###

# IAM role for EMR Service
resource "aws_iam_role" "iam_emr_service_role" {
  name = "iam_emr_service_role"

  assume_role_policy = <<EOF
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "elasticmapreduce.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

resource "aws_iam_role_policy" "iam_emr_service_policy" {
  name = "iam_emr_service_policy"
  role = "${aws_iam_role.iam_emr_service_role.id}"

  policy = <<EOF
{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Resource": "*",
        "Action": [
            "ec2:AuthorizeSecurityGroupEgress",
            "ec2:AuthorizeSecurityGroupIngress",
            "ec2:CancelSpotInstanceRequests",
            "ec2:CreateNetworkInterface",
            "ec2:CreateSecurityGroup",
            "ec2:CreateTags",
            "ec2:DeleteNetworkInterface",
            "ec2:DeleteSecurityGroup",
            "ec2:DeleteTags",
            "ec2:DescribeAvailabilityZones",
            "ec2:DescribeAccountAttributes",
            "ec2:DescribeDhcpOptions",
            "ec2:DescribeInstanceStatus",
            "ec2:DescribeInstances",
            "ec2:DescribeKeyPairs",
            "ec2:DescribeNetworkAcls",
            "ec2:DescribeNetworkInterfaces",
            "ec2:DescribePrefixLists",
            "ec2:DescribeRouteTables",
            "ec2:DescribeSecurityGroups",
            "ec2:DescribeSpotInstanceRequests",
            "ec2:DescribeSpotPriceHistory",
            "ec2:DescribeSubnets",
            "ec2:DescribeVpcAttribute",
            "ec2:DescribeVpcEndpoints",
            "ec2:DescribeVpcEndpointServices",
            "ec2:DescribeVpcs",
            "ec2:DetachNetworkInterface",
            "ec2:ModifyImageAttribute",
            "ec2:ModifyInstanceAttribute",
            "ec2:RequestSpotInstances",
            "ec2:RevokeSecurityGroupEgress",
            "ec2:RunInstances",
            "ec2:TerminateInstances",
            "ec2:DeleteVolume",
            "ec2:DescribeVolumeStatus",
            "ec2:DescribeVolumes",
            "ec2:DetachVolume",
            "iam:GetRole",
            "iam:GetRolePolicy",
            "iam:ListInstanceProfiles",
            "iam:ListRolePolicies",
            "iam:PassRole",
            "s3:CreateBucket",
            "s3:Get*",
            "s3:List*",
            "sdb:BatchPutAttributes",
            "sdb:Select",
            "sqs:CreateQueue",
            "sqs:Delete*",
            "sqs:GetQueue*",
            "sqs:PurgeQueue",
            "sqs:ReceiveMessage"
        ]
    }]
}
EOF
}

# IAM Role for EC2 Instance Profile
resource "aws_iam_role" "iam_emr_profile_role" {
  name = "iam_emr_profile_role"

  assume_role_policy = <<EOF
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

resource "aws_iam_instance_profile" "emr_profile" {
  name  = "emr_profile"
  roles = ["${aws_iam_role.iam_emr_profile_role.name}"]
}

resource "aws_iam_role_policy" "iam_emr_profile_policy" {
  name = "iam_emr_profile_policy"
  role = "${aws_iam_role.iam_emr_profile_role.id}"

  policy = <<EOF
{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Resource": "*",
        "Action": [
            "cloudwatch:*",
            "dynamodb:*",
            "ec2:Describe*",
            "elasticmapreduce:Describe*",
            "elasticmapreduce:ListBootstrapActions",
            "elasticmapreduce:ListClusters",
            "elasticmapreduce:ListInstanceGroups",
            "elasticmapreduce:ListInstances",
            "elasticmapreduce:ListSteps",
            "kinesis:CreateStream",
            "kinesis:DeleteStream",
            "kinesis:DescribeStream",
            "kinesis:GetRecords",
            "kinesis:GetShardIterator",
            "kinesis:MergeShards",
            "kinesis:PutRecord",
            "kinesis:SplitShard",
            "rds:Describe*",
            "s3:*",
            "sdb:*",
            "sns:*",
            "sqs:*"
        ]
    }]
}
EOF
}
```