---
layout: "aws"
page_title: "AWS: aws_emr_cluster"
sidebar_current: "docs-aws-resource-emr-cluster"
description: |-
  Provides an Elastic MapReduce Cluster
---

# aws\_emr\_cluster

Provides an Elastic MapReduce Cluster, a web service that makes it easy to
process large amounts of data efficiently. See [Amazon Elastic MapReduce Documentation](https://aws.amazon.com/documentation/elastic-mapreduce/)
for more information.

## Example Usage

```
resource "aws_emr_cluster" "emr-test-cluster" {
  name          = "emr-test-arn"
  release_label = "emr-4.6.0"
  applications  = ["Spark"]

  ec2_attributes {
    subnet_id                         = "${aws_subnet.main.id}"
    emr_managed_master_security_group = "${aws_security_group.sg.id}"
    emr_managed_slave_security_group  = "${aws_security_group.sg.id}"
    instance_profile                  = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  master_instance_type = "m3.xlarge"
  core_instance_type   = "m3.xlarge"
  core_instance_count  = 1

  tags {
    role = "rolename"
    env  = "env"
  }

  bootstrap_action {
    path = "s3://elasticmapreduce/bootstrap-actions/run-if"
    name = "runif"
    args = ["instance.isMaster=true", "echo running on master node"]
  }

  configurations = "test-fixtures/emr_configurations.json"

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"
}
```

The `aws_emr_cluster` resource typically requires two IAM roles: one for the
EMR service to assume, and another for the instance profile placed on your
Cluster Instances so that they can interact with AWS. The suggested role policy
template for the EMR service is `AmazonElasticMapReduceRole`, and
`AmazonElasticMapReduceforEC2Role` for the EC2 profile. See the [Getting
Started](https://docs.aws.amazon.com/ElasticMapReduce/latest/ManagementGuide/emr-gs-launch-sample-cluster.html)
guide for more information on these IAM roles. There is also a fully bootable
example Terraform configuration at the bottom of this page.
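
As a rough sketch of the role setup described above, the two suggested policies could be attached to the roles with `aws_iam_role_policy_attachment` instead of inlining policy documents. This assumes the roles `iam_emr_service_role` and `iam_emr_profile_role` from the bootable example at the bottom of this page, and that both policies are available as AWS-managed policies under the `service-role/` path; check the ARNs in your own account before relying on them.

```
# Assumption: AWS-managed policy ARN for the EMR service role.
resource "aws_iam_role_policy_attachment" "emr_service" {
  role       = "${aws_iam_role.iam_emr_service_role.name}"
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceRole"
}

# Assumption: AWS-managed policy ARN for the role behind the EC2 instance profile.
resource "aws_iam_role_policy_attachment" "emr_ec2_profile" {
  role       = "${aws_iam_role.iam_emr_profile_role.name}"
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonElasticMapReduceforEC2Role"
}
```

The resource names `emr_service` and `emr_ec2_profile` are placeholders chosen for this illustration.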

## Argument Reference

The following arguments are supported:

* `name` - (Required) The name of the job flow
* `release_label` - (Required) The release label for the Amazon EMR release
* `master_instance_type` - (Required) The EC2 instance type of the master node
* `core_instance_type` - (Optional) The EC2 instance type of the slave nodes
* `core_instance_count` - (Optional) Number of Amazon EC2 instances used to execute the job flow. EMR will use one node as the cluster's master node and use the remainder of the nodes (`core_instance_count`-1) as core nodes. Default `1`
* `log_uri` - (Optional) S3 bucket to write the log files of the job flow. If a value is not provided, logs are not created
* `applications` - (Optional) A list of applications for the cluster. Valid values are: `Hadoop`, `Hive`, `Mahout`, `Pig`, and `Spark`. Case insensitive
* `ec2_attributes` - (Optional) Attributes for the EC2 instances running the job flow. Defined below
* `bootstrap_action` - (Optional) List of bootstrap actions that will be run before Hadoop is started on the cluster nodes. Defined below
* `configurations` - (Optional) Configurations supplied for the EMR cluster you are creating. The examples on this page pass the path to a JSON file
* `service_role` - (Optional) IAM role that will be assumed by the Amazon EMR service to access AWS resources
* `visible_to_all_users` - (Optional) Whether the job flow is visible to all IAM users of the AWS account associated with the job flow. Default `true`
* `tags` - (Optional) Key-value tags to apply to the EMR Cluster
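
To illustrate some of the optional arguments above, the following sketch turns on logging and restricts visibility of the job flow. The bucket `aws_s3_bucket.emr_logs` and the `logs/` prefix are hypothetical, and `log_uri` is assumed here to take an `s3://` URI; the IAM resources are the ones defined in the bootable example at the bottom of this page.

```
resource "aws_emr_cluster" "logged-cluster" {
  name          = "emr-logged-cluster"
  release_label = "emr-4.6.0"
  applications  = ["Hadoop", "Hive"]

  master_instance_type = "m3.xlarge"
  core_instance_type   = "m3.xlarge"

  # Per the description above, one of these nodes becomes the master.
  core_instance_count = 2

  # Hypothetical bucket; without log_uri no logs are written.
  log_uri = "s3://${aws_s3_bucket.emr_logs.id}/logs/"

  # Defaults to true when omitted.
  visible_to_all_users = false

  ec2_attributes {
    instance_profile = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"
}
```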

## ec2\_attributes

Attributes for the Amazon EC2 instances running the job flow

* `key_name` - (Optional) Amazon EC2 key pair that can be used to SSH to the master node as the user called `hadoop`
* `subnet_id` - (Optional) VPC subnet id where you want the job flow to launch. Cannot specify the `cc1.4xlarge` instance type for nodes of a job flow launched in an Amazon VPC
* `additional_master_security_groups` - (Optional) List of additional Amazon EC2 security group IDs for the master node
* `additional_slave_security_groups` - (Optional) List of additional Amazon EC2 security group IDs for the slave nodes
* `emr_managed_master_security_group` - (Optional) Identifier of the Amazon EC2 security group for the master node
* `emr_managed_slave_security_group` - (Optional) Identifier of the Amazon EC2 security group for the slave nodes
* `service_access_security_group` - (Optional) Identifier of the Amazon EC2 service-access security group. Required when the cluster runs on a private subnet
* `instance_profile` - (Optional) Instance profile whose role the EC2 instances of the cluster assume
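
As a hedged illustration of the attributes above, this sketch launches a cluster into a private subnet, which is the case where `service_access_security_group` is required. The key pair `aws_key_pair.emr`, the subnet `aws_subnet.private`, and the three security groups are hypothetical resources that are not defined on this page.

```
resource "aws_emr_cluster" "private-cluster" {
  name                 = "emr-private-cluster"
  release_label        = "emr-4.6.0"
  master_instance_type = "m3.xlarge"
  service_role         = "${aws_iam_role.iam_emr_service_role.arn}"

  ec2_attributes {
    # Hypothetical key pair; allows SSH to the master node as user "hadoop".
    key_name = "${aws_key_pair.emr.key_name}"

    # Hypothetical private subnet.
    subnet_id = "${aws_subnet.private.id}"

    # Hypothetical security groups, one per role.
    emr_managed_master_security_group = "${aws_security_group.emr_master.id}"
    emr_managed_slave_security_group  = "${aws_security_group.emr_slave.id}"
    service_access_security_group     = "${aws_security_group.emr_service_access.id}"

    instance_profile = "${aws_iam_instance_profile.emr_profile.arn}"
  }
}
```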

## bootstrap\_action

* `name` - (Required) Name of the bootstrap action
* `path` - (Required) Location of the script to run during a bootstrap action. Can be either a location in Amazon S3 or on a local file system
* `args` - (Optional) List of command line arguments to pass to the bootstrap action script
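
Because `bootstrap_action` is described as a list, more than one block can presumably be declared on the same cluster. The sketch below pairs the `run-if` action from the example usage with a custom script; the bucket `my-example-bucket`, the script `install-deps.sh`, and its `--verbose` flag are made up for illustration.

```
resource "aws_emr_cluster" "bootstrapped-cluster" {
  name                 = "emr-bootstrapped-cluster"
  release_label        = "emr-4.6.0"
  master_instance_type = "m3.xlarge"
  service_role         = "${aws_iam_role.iam_emr_service_role.arn}"

  ec2_attributes {
    instance_profile = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  # Runs the given command only on the master node.
  bootstrap_action {
    name = "runif"
    path = "s3://elasticmapreduce/bootstrap-actions/run-if"
    args = ["instance.isMaster=true", "echo running on master node"]
  }

  # Hypothetical script stored in a bucket you control.
  bootstrap_action {
    name = "install-deps"
    path = "s3://my-example-bucket/bootstrap/install-deps.sh"
    args = ["--verbose"]
  }
}
```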

## Attributes Reference

The following attributes are exported:

* `id` - The ID of the EMR Cluster
* `name`
* `release_label`
* `master_instance_type`
* `core_instance_type`
* `core_instance_count`
* `log_uri`
* `applications`
* `ec2_attributes`
* `bootstrap_action`
* `configurations`
* `service_role`
* `visible_to_all_users`
* `tags`
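
Exported attributes can be referenced from other resources or surfaced as outputs. A minimal sketch, assuming the `emr-test-cluster` resource from the example usage; the output names are arbitrary:

```
# The ID of the EMR Cluster, for example to pass to other tooling.
output "emr_cluster_id" {
  value = "${aws_emr_cluster.emr-test-cluster.id}"
}

# The release label the cluster was created with.
output "emr_release_label" {
  value = "${aws_emr_cluster.emr-test-cluster.release_label}"
}
```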

## Example bootable config

**NOTE:** This configuration demonstrates the minimal configuration needed to
boot an example EMR Cluster. It is not meant to display best practices. Please
use at your own risk.

```
provider "aws" {
  region = "us-west-2"
}

resource "aws_emr_cluster" "tf-test-cluster" {
  name          = "emr-test-arn"
  release_label = "emr-4.6.0"
  applications  = ["Spark"]

  ec2_attributes {
    subnet_id                         = "${aws_subnet.main.id}"
    emr_managed_master_security_group = "${aws_security_group.allow_all.id}"
    emr_managed_slave_security_group  = "${aws_security_group.allow_all.id}"
    instance_profile                  = "${aws_iam_instance_profile.emr_profile.arn}"
  }

  master_instance_type = "m3.xlarge"
  core_instance_type   = "m3.xlarge"
  core_instance_count  = 1

  tags {
    role     = "rolename"
    dns_zone = "env_zone"
    env      = "env"
    name     = "name-env"
  }

  bootstrap_action {
    path = "s3://elasticmapreduce/bootstrap-actions/run-if"
    name = "runif"
    args = ["instance.isMaster=true", "echo running on master node"]
  }

  configurations = "test-fixtures/emr_configurations.json"

  service_role = "${aws_iam_role.iam_emr_service_role.arn}"
}

resource "aws_security_group" "allow_all" {
  name        = "allow_all"
  description = "Allow all inbound traffic"
  vpc_id      = "${aws_vpc.main.id}"

  ingress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  depends_on = ["aws_subnet.main"]

  lifecycle {
    ignore_changes = ["ingress", "egress"]
  }

  tags {
    name = "emr_test"
  }
}

resource "aws_vpc" "main" {
  cidr_block           = "168.31.0.0/16"
  enable_dns_hostnames = true

  tags {
    name = "emr_test"
  }
}

resource "aws_subnet" "main" {
  vpc_id     = "${aws_vpc.main.id}"
  cidr_block = "168.31.0.0/20"

  tags {
    name = "emr_test"
  }
}

resource "aws_internet_gateway" "gw" {
  vpc_id = "${aws_vpc.main.id}"
}

resource "aws_route_table" "r" {
  vpc_id = "${aws_vpc.main.id}"

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = "${aws_internet_gateway.gw.id}"
  }
}

resource "aws_main_route_table_association" "a" {
  vpc_id         = "${aws_vpc.main.id}"
  route_table_id = "${aws_route_table.r.id}"
}

###

# IAM Role setups

###

# IAM role for EMR Service
resource "aws_iam_role" "iam_emr_service_role" {
  name = "iam_emr_service_role"

  assume_role_policy = <<EOF
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "elasticmapreduce.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

resource "aws_iam_role_policy" "iam_emr_service_policy" {
  name = "iam_emr_service_policy"
  role = "${aws_iam_role.iam_emr_service_role.id}"

  policy = <<EOF
{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Resource": "*",
        "Action": [
            "ec2:AuthorizeSecurityGroupEgress",
            "ec2:AuthorizeSecurityGroupIngress",
            "ec2:CancelSpotInstanceRequests",
            "ec2:CreateNetworkInterface",
            "ec2:CreateSecurityGroup",
            "ec2:CreateTags",
            "ec2:DeleteNetworkInterface",
            "ec2:DeleteSecurityGroup",
            "ec2:DeleteTags",
            "ec2:DescribeAvailabilityZones",
            "ec2:DescribeAccountAttributes",
            "ec2:DescribeDhcpOptions",
            "ec2:DescribeInstanceStatus",
            "ec2:DescribeInstances",
            "ec2:DescribeKeyPairs",
            "ec2:DescribeNetworkAcls",
            "ec2:DescribeNetworkInterfaces",
            "ec2:DescribePrefixLists",
            "ec2:DescribeRouteTables",
            "ec2:DescribeSecurityGroups",
            "ec2:DescribeSpotInstanceRequests",
            "ec2:DescribeSpotPriceHistory",
            "ec2:DescribeSubnets",
            "ec2:DescribeVpcAttribute",
            "ec2:DescribeVpcEndpoints",
            "ec2:DescribeVpcEndpointServices",
            "ec2:DescribeVpcs",
            "ec2:DetachNetworkInterface",
            "ec2:ModifyImageAttribute",
            "ec2:ModifyInstanceAttribute",
            "ec2:RequestSpotInstances",
            "ec2:RevokeSecurityGroupEgress",
            "ec2:RunInstances",
            "ec2:TerminateInstances",
            "ec2:DeleteVolume",
            "ec2:DescribeVolumeStatus",
            "ec2:DescribeVolumes",
            "ec2:DetachVolume",
            "iam:GetRole",
            "iam:GetRolePolicy",
            "iam:ListInstanceProfiles",
            "iam:ListRolePolicies",
            "iam:PassRole",
            "s3:CreateBucket",
            "s3:Get*",
            "s3:List*",
            "sdb:BatchPutAttributes",
            "sdb:Select",
            "sqs:CreateQueue",
            "sqs:Delete*",
            "sqs:GetQueue*",
            "sqs:PurgeQueue",
            "sqs:ReceiveMessage"
        ]
    }]
}
EOF
}

# IAM Role for EC2 Instance Profile
resource "aws_iam_role" "iam_emr_profile_role" {
  name = "iam_emr_profile_role"

  assume_role_policy = <<EOF
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

resource "aws_iam_instance_profile" "emr_profile" {
  name  = "emr_profile"
  roles = ["${aws_iam_role.iam_emr_profile_role.name}"]
}

resource "aws_iam_role_policy" "iam_emr_profile_policy" {
  name = "iam_emr_profile_policy"
  role = "${aws_iam_role.iam_emr_profile_role.id}"

  policy = <<EOF
{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Resource": "*",
        "Action": [
            "cloudwatch:*",
            "dynamodb:*",
            "ec2:Describe*",
            "elasticmapreduce:Describe*",
            "elasticmapreduce:ListBootstrapActions",
            "elasticmapreduce:ListClusters",
            "elasticmapreduce:ListInstanceGroups",
            "elasticmapreduce:ListInstances",
            "elasticmapreduce:ListSteps",
            "kinesis:CreateStream",
            "kinesis:DeleteStream",
            "kinesis:DescribeStream",
            "kinesis:GetRecords",
            "kinesis:GetShardIterator",
            "kinesis:MergeShards",
            "kinesis:PutRecord",
            "kinesis:SplitShard",
            "rds:Describe*",
            "s3:*",
            "sdb:*",
            "sns:*",
            "sqs:*"
        ]
    }]
}
EOF
}
```