Step 2: Connect a new EMR cluster to Unravel Server
Step 2 (a): Create a new EMR cluster and connect Unravel
  1. Go to https://console.aws.amazon.com/elasticmapreduce/ and click Create Cluster.

    aws-marketplace-step2a-create-cluster.png
  2. Click Go to advanced options.

    aws-marketplace-step2a-create-cluster-advanced-options.png
  3. Select the release and the services you want to install.

    aws-marketplace-step2a-create-cluster-select-sw.png
  4. Click Next.

  5. Select the same VPC and subnet that you chose for Unravel Server. Unravel Server can reside in a different VPC, but you then need to set up VPC peering.

    aws-marketplace-step2a-create-cluster-select-hw.png
  6. Click Next.

  7. Add a bootstrap action:

    • Go to the Bootstrap Actions section, select Custom Actions, and click Configure and Add.

      aws-marketplace-step2a-create-cluster-add-bootstrap1.png

      This opens a dialog box like the following:

      aws-marketplace-step2a-create-cluster-add-bootstrap2.png
  8. In the Script location text box, enter s3://unraveldatarepo/unravel_emr_bootstrap.py

  9. In the Optional Arguments text box, do the following:

    • Note the private IP address of your Unravel instance:

      aws-marketplace-step2a-create-cluster-add-bootstrap3.png
    • Add --unravel-server unravel-ec2-private-ip-address --bootstrap, replacing unravel-ec2-private-ip-address with the private IP address of your Unravel instance.

      aws-marketplace-step2a-create-cluster-add-bootstrap4.png
  10. Click Save in the dialog box, and then click Next.

  11. On the next screen, select a key pair so that you can connect to the EC2 nodes. Also note the names of the security groups circled in the screenshot below; you will modify them in Step 2 (b).

    aws-marketplace-step2a-create-cluster-add-bootstrap5.png
  12. Click Create Cluster.
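
If you script cluster creation instead of using the console, the bootstrap action from steps 7 to 9 can be written as the kind of entry that boto3's EMR run_job_flow call accepts in its BootstrapActions list. The following is a minimal sketch, not Unravel's documented procedure. The helper name unravel_bootstrap_action, the action name "Unravel bootstrap", and the example address 10.0.0.12 are assumptions made for illustration; only the script location and the two arguments come from the steps above.

```python
import ipaddress

# Script location taken from step 8 above.
UNRAVEL_BOOTSTRAP_SCRIPT = "s3://unraveldatarepo/unravel_emr_bootstrap.py"


def unravel_bootstrap_action(unravel_private_ip: str) -> dict:
    """Build one EMR bootstrap-action entry for the Unravel script.

    The helper name and the "Name" value are hypothetical; only the
    script path and the arguments come from the procedure above.
    """
    # Fail early on a typo: ip_address raises ValueError for a malformed address.
    address = ipaddress.ip_address(unravel_private_ip)
    if not address.is_private:
        raise ValueError(f"{unravel_private_ip} is not a private IP address")

    return {
        "Name": "Unravel bootstrap",
        "ScriptBootstrapAction": {
            "Path": UNRAVEL_BOOTSTRAP_SCRIPT,
            "Args": ["--unravel-server", str(address), "--bootstrap"],
        },
    }


if __name__ == "__main__":
    # 10.0.0.12 is a placeholder; use your Unravel instance's private IP.
    print(unravel_bootstrap_action("10.0.0.12"))
```

The returned dictionary would be passed as one element of BootstrapActions, alongside the release, instance, VPC, and key pair settings chosen in the other steps.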

Step 2 (b): Modify EMR cluster security groups so that Unravel Server has adequate access
  1. To find the ID of the security group named Unravel for Amazon EMR-1, open Security Groups in the left panel of your AWS console and search for that name.

    aws-marketplace-step2b-modify-security-group1.png
  2. Make a note of the group ID.

    For example, in the screenshot above, the group ID is sg-0564b1b8902ecf611.

  3. On the EMR cluster’s page, which shows the cluster’s status during creation, open each of the two security groups highlighted in the screenshot below.

    aws-marketplace-step2b-modify-security-group2.png

    Note

    You only need to change these security groups once.

  4. Select one of the groups, click the Inbound tab, and then click Edit.

    aws-marketplace-step2b-modify-security-group3.png
  5. Click Add Rule.

    aws-marketplace-step2b-modify-security-group4.png
  6. Add the following three rules, as shown in the screenshot below, and then click Save:

    • Type = Custom TCP, Protocol = TCP, Port Range = 8020, Source = security-group-ID-of-Unravel-Server’s-security-group

    • Type = Custom TCP, Protocol = TCP, Port Range = 50010, Source = security-group-ID-of-Unravel-Server’s-security-group

    • Type = Custom TCP, Protocol = TCP, Port Range = 50020, Source = security-group-ID-of-Unravel-Server’s-security-group

    aws-marketplace-step2b-modify-security-group5.png
  7. Repeat steps 4 through 6 for the other EMR cluster security group, so that both groups have the three rules.
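
The three inbound rules from step 6 can also be applied with a script. The sketch below builds them in the IpPermissions shape that boto3's EC2 authorize_security_group_ingress call expects, then applies them to each EMR security group. It assumes boto3 is installed and AWS credentials are configured. The function names unravel_ingress_rules and apply_rules are hypothetical, and the EMR group IDs in the usage comment are placeholders; the ports come from step 6 and the example Unravel group ID from step 2.

```python
# TCP ports listed in step 6 above.
UNRAVEL_PORTS = (8020, 50010, 50020)


def unravel_ingress_rules(unravel_sg_id: str) -> list:
    """Return one TCP ingress permission per port.

    The source of each rule is the Unravel Server's security group,
    matching the Source column in step 6.
    """
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "UserIdGroupPairs": [{"GroupId": unravel_sg_id}],
        }
        for port in UNRAVEL_PORTS
    ]


def apply_rules(emr_sg_ids, unravel_sg_id: str) -> None:
    """Add the rules to every EMR security group (step 7).

    Needs boto3 and valid AWS credentials, so it is not run here.
    """
    import boto3  # imported lazily so the rule builder works without boto3

    ec2 = boto3.client("ec2")
    for group_id in emr_sg_ids:
        ec2.authorize_security_group_ingress(
            GroupId=group_id,
            IpPermissions=unravel_ingress_rules(unravel_sg_id),
        )


# Example call; the two EMR group IDs are placeholders:
# apply_rules(["sg-EMR-MASTER-ID", "sg-EMR-SLAVE-ID"], "sg-0564b1b8902ecf611")
```

Keeping the rule builder separate from the AWS call makes it possible to inspect or print the rules before anything is changed in the account.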

To connect to an existing EMR cluster instead of a new one, or for more advanced options, see Connecting the Unravel EC2 instance to a new or existing EMR cluster.