Amazon Web Services (AWS) Databricks

Before installing Unravel on AWS Databricks, make sure that the installation requirements are met. Then follow these steps to install and configure Unravel:

1. Create an EC2 instance and connect Databricks to the Unravel VM
  1. In the AWS Console, go to the EC2 dashboard and click Launch Instance.

  2. Select the following options based on Unravel's instance requirements:

    • Base OS

    • Instance type and size

    • Ports

    • Networking

      The EC2 instance must be in the same region as the target clusters that the Unravel EC2 node will monitor.

    • Security groups or policies

      • Create a security group that allows inbound traffic on ports 3000 and 4043 from the cluster nodes' IP addresses, and add the security group used on the cluster to this rule.

      Sample inbound rule

      | Type            | Protocol | Port range | Source |
      |-----------------|----------|------------|--------|
      | All traffic     | All      | All        | Security group ID of this group or subnet IP block. For example, 10.10.0.0/16 |
      | SSH             | TCP      | 22         | 0.0.0.0/0 or trusted public IP for SSH access |
      | Custom TCP Rule | TCP      | 443        | Security group ID used on the cluster or subnet IP block (if the IP block belongs to a different VPC). Required for VPC peering connection. |
      | Custom TCP Rule | TCP      | 3000       | Security group ID used on the cluster or subnet IP block (if the IP block belongs to a different VPC). Required for VPC peering connection. |
      | Custom TCP Rule | TCP      | 4043       | Security group ID used on the cluster or subnet IP block (if the IP block belongs to a different VPC). Required for VPC peering connection. |
      | Custom TCP Rule | TCP      | 4443       | Security group ID used on the cluster or subnet IP block (if the IP block belongs to a different VPC). Required for VPC peering connection. |
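The custom TCP rules in the sample inbound rule could be created with the AWS CLI along the following lines. This is a minimal sketch, not an official Unravel procedure: the function name print_unravel_ingress_cmds and the security group ID are hypothetical placeholders, and the source is assumed to be a subnet IP block (10.10.0.0/16, as in the example above). The function only prints the commands so that they can be reviewed before anyone runs them.

```shell
#!/bin/sh
# Print (do not run) AWS CLI commands that open the Unravel ports
# to a source CIDR block. The group ID passed in is a placeholder.
print_unravel_ingress_cmds() (
    unravel_sg="$1"    # security group attached to the Unravel EC2 instance
    source_cidr="$2"   # subnet IP block of the cluster nodes

    # Ports taken from the sample inbound rule: 443, 3000, 4043, 4443.
    for port in 443 3000 4043 4443; do
        echo "aws ec2 authorize-security-group-ingress --group-id $unravel_sg --protocol tcp --port $port --cidr $source_cidr"
    done
)

# Hypothetical IDs, for illustration only.
print_unravel_ingress_cmds sg-0123456789abcdef0 10.10.0.0/16
```

Printing the commands first keeps the sketch safe to run anywhere; the output can be pasted into a shell that has AWS credentials once the group ID and CIDR block have been replaced with real values.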

Review the Virtual Private Cloud (VPC) Peering options to connect Databricks with the Unravel VM.

| Workspace                                    | VPC Peering Options |
|----------------------------------------------|---------------------|
| Workspace and Unravel VM are in the same VPC | -                   |
| Workspace VPC is in a different Region       | Use VPC Peering     |
| Workspace VPC is in a different AWS account  | Use VPC Peering     |
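For the two cases that call for VPC peering, the peering request could be issued with the AWS CLI roughly as sketched here. The function name print_vpc_peering_cmd and all IDs are hypothetical, and the sketch assumes the request is made from the Unravel VM's VPC. It only prints the aws ec2 create-vpc-peering-connection command; the peer side still has to accept the request and both VPCs need matching routes, which this sketch does not cover.

```shell
#!/bin/sh
# Print (do not run) the AWS CLI command that requests a VPC peering connection.
# Usage: print_vpc_peering_cmd <unravel_vpc_id> <workspace_vpc_id> [peer_region] [peer_account_id]
print_vpc_peering_cmd() (
    unravel_vpc="$1"
    workspace_vpc="$2"
    peer_region="${3:-}"    # set when the workspace VPC is in a different Region
    peer_account="${4:-}"   # set when the workspace VPC is in a different AWS account

    cmd="aws ec2 create-vpc-peering-connection --vpc-id $unravel_vpc --peer-vpc-id $workspace_vpc"
    if [ -n "$peer_region" ]; then
        cmd="$cmd --peer-region $peer_region"
    fi
    if [ -n "$peer_account" ]; then
        cmd="$cmd --peer-owner-id $peer_account"
    fi
    echo "$cmd"
)

# Hypothetical IDs: workspace VPC in another Region.
print_vpc_peering_cmd vpc-0aaa1111 vpc-0bbb2222 us-west-2
```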

2. Download Unravel
3. Deploy Unravel

Unravel binaries are available as a TAR file or an RPM package. You can deploy the Unravel binaries in any directory on the server. However, the user who installs Unravel must have write permission on the directory where the Unravel binaries are deployed.

After you extract the contents of the TAR file or RPM package, an unravel directory is created within the installation directory (<unravel_installation_directory>), and Unravel is available in <unravel_installation_directory>/unravel. The directory layout is unravel/versions/<Directories and files>.

Perform the following steps to deploy Unravel from a TAR file as the user who will run Unravel.

  1. Create an installation directory.

    mkdir -p /path/to/installation/directory
    

    For example: mkdir -p /opt/

  2. Extract the Unravel TAR file to the installation directory that you created in the first step. Extracting the TAR file creates an unravel directory within the installation directory.

    tar zxf unravel-<version>.tar.gz -C </path/to/installation/directory>
    

    For example: tar zxf unravel-4.7.0.0.tar.gz -C /opt

    The unravel directory will be available within /opt.

  3. Grant ownership of the directory to the user who will run Unravel.

    chown -R username:groupname </path/to/installation/directory>
    

    For example: chown -R hadoop:hadoop /opt/unravel/
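The three TAR deployment steps can be combined into one small helper, sketched below. The function name deploy_unravel_tar is hypothetical and not part of Unravel. The sketch assumes the archive unpacks into a top-level unravel directory, as described above, and treats the ownership change as optional because chown to another user normally requires root.

```shell
#!/bin/sh
# Deploy an Unravel TAR file into an installation directory.
# Usage: deploy_unravel_tar <tarball> <install_dir> [owner:group]
deploy_unravel_tar() (
    tarball="$1"
    install_dir="$2"
    owner="${3:-}"   # optional; skipped when empty

    # Step 1: create the installation directory (no error if it exists).
    mkdir -p "$install_dir" || exit 1

    # Step 2: extract the archive into the installation directory.
    tar zxf "$tarball" -C "$install_dir" || exit 1

    # The archive is expected to create an unravel directory; stop if it did not.
    if [ ! -d "$install_dir/unravel" ]; then
        echo "error: $install_dir/unravel was not created" >&2
        exit 1
    fi

    # Step 3: hand the directory to the user who will run Unravel.
    if [ -n "$owner" ]; then
        chown -R "$owner" "$install_dir/unravel" || exit 1
    fi

    # Report where Unravel now lives.
    echo "$install_dir/unravel"
)

# Example call (commented out so that sourcing this file has no side effects):
# deploy_unravel_tar unravel-4.7.0.0.tar.gz /opt hadoop:hadoop
```

Each step exits on failure, so a bad archive or an unwritable directory stops the deployment before ownership is changed.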

Important

Perform the following steps to deploy Unravel from an RPM package as the root user. After the RPM package is deployed, perform the remaining installation procedures as the unravel user.

  1. Create an installation directory.

    mkdir /usr/local/unravel
    
  2. Run the following command:

    rpm -i unravel-<version>.rpm
    

    For example: rpm -i unravel-4.7.0.0.rpm

    The unravel directory will be available in /usr/local.

    If you want to install to a different location, use the --prefix option.

    For example:

    mkdir /opt/unravel

    rpm -i unravel-4.7.0.0.rpm --prefix /opt

    The unravel directory will be available in /opt.

  3. Grant ownership of the directory to the user who will run Unravel. This user runs all the processes involved in the Unravel installation.

    chown -R username:groupname /usr/local/unravel
    

    For example: chown -R hadoop:hadoop /usr/local/unravel

  4. Continue with the installation procedures as the unravel user.
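The RPM steps can be sketched as a single root-run helper. The names run, deploy_unravel_rpm, and DRY_RUN are hypothetical and exist only for this illustration. DRY_RUN defaults to 1, so the commands are printed rather than executed; setting DRY_RUN=0 would execute them and therefore requires root and a real RPM file.

```shell
#!/bin/sh
# Print a command when DRY_RUN is 1 (the default); otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: deploy_unravel_rpm <rpm_file> <owner:group> [prefix]
deploy_unravel_rpm() (
    rpm_file="$1"
    owner="$2"
    prefix="${3:-/usr/local}"   # default location used in the steps above

    # Step 1: create the installation directory.
    run mkdir -p "$prefix/unravel" || exit 1

    # Step 2: install the package; pass --prefix only for a non-default location.
    if [ "$prefix" = "/usr/local" ]; then
        run rpm -i "$rpm_file" || exit 1
    else
        run rpm -i "$rpm_file" --prefix "$prefix" || exit 1
    fi

    # Step 3: hand the directory to the user who will run Unravel.
    run chown -R "$owner" "$prefix/unravel" || exit 1
)

# Dry run with the example values from the steps above.
deploy_unravel_rpm unravel-4.7.0.0.rpm hadoop:hadoop /opt
```

In dry-run mode the example call prints the mkdir, rpm -i ... --prefix /opt, and chown commands in order, which makes it easy to compare them with the manual steps before running anything as root.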

4. Install Unravel

You can install Unravel either with Interactive Precheck or manually without Interactive Precheck.

Note

Unravel recommends installation with Interactive Precheck.

To install Unravel with Interactive Precheck, first run the Interactive Precheck utility to generate a bootstrap configuration file for the installation.

5. Configure Unravel Log Receiver
6. Connect Databricks cluster to Unravel

Complete the following steps to connect the Databricks cluster to Unravel.