Part 1: Installing Unravel on a separate Azure VM

This topic explains how to install Unravel on an Azure VM and set up the network access required to connect the Unravel VM to your Azure Databricks workspaces.

1. Provision an Azure VM for Unravel Server
2. Configure the VM
3. Install MySQL

Complete the [Before Installing Unravel RPM] steps in Install and configure MySQL for Unravel.

4. Install Unravel Server on the VM
5. Configure MySQL

Complete the [After Installing Unravel RPM] steps in Install and configure MySQL for Unravel.

6. Configure Unravel Server with basic options
  1. Open an SSH session to the Unravel VM.

    ssh -i ssh-private-key ssh-user@unravel-host
  2. Set correct permissions on the Unravel configuration directory.

    cd /usr/local/unravel/etc
    sudo chown unravel:unravel *.properties
    sudo chmod 644 *.properties
  3. In /usr/local/unravel/etc/unravel.properties, add/modify the following properties:

    1. Set general properties:

      com.unraveldata.onprem=false
      com.unraveldata.cluster.type=DB
      com.unraveldata.python.enabled=false
      com.unraveldata.tagging.enabled=true
    2. Optionally modify the following properties:

      com.unraveldata.spark.appLoading.delayForRetry=60000
      com.unraveldata.databricks.http.conn.timeout=5
      com.unraveldata.databricks.http.read.timeout=10
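
    The property edits above could be scripted so that they are repeatable. The following is a minimal sketch, not part of the Unravel distribution: `set_prop` is a hypothetical helper that drops any existing line for a key and appends the new `key=value` line. It assumes each property sits on its own line with no spaces around `=`, and that the caller has write access to the file.

    ```shell
    # Hypothetical helper (not shipped with Unravel): set key=value in a properties file.
    # Usage: set_prop FILE KEY VALUE
    set_prop() {
      file=$1
      key=$2
      value=$3
      tmp=$(mktemp) || return 1
      # Keep every line that does not start with "KEY=" (literal match, no regex).
      awk -v p="${key}=" 'index($0, p) != 1' "$file" > "$tmp"
      # Append the new setting at the end of the file.
      printf '%s=%s\n' "$key" "$value" >> "$tmp"
      # Write back in place so the file keeps its owner and mode.
      cat "$tmp" > "$file"
      rm -f "$tmp"
    }
    ```

    For example, `set_prop /usr/local/unravel/etc/unravel.properties com.unraveldata.cluster.type DB` would set the cluster type. The helper moves a changed property to the end of the file, which is harmless if property order does not matter.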
  4. Allow network access to the Unravel VM from Databricks workspaces:

    Assign a public IP address to the Unravel VM and add firewall rules to open the required ports, as follows.

    Databricks cluster nodes are created in their own virtual network, "workers-vnet". For these cluster nodes to reach the Unravel VM’s secured service endpoints, the endpoints must be open to the public. This means the Unravel secure service ports (3000, 4043, and 4443) must be opened to the public on the Unravel VM’s virtual network and then mapped to the Unravel VM, as described in the steps below.

    1. Log into the Azure portal.

    2. Click Virtual Machines service and select the Unravel VM.

      azure-vm-networking1.png
    3. Click Networking.

      azure-vm-networking2.png

      The networking settings of the Unravel VM are displayed. The settings include the virtual network, subnet, and security group which the VM is currently using.

    4. Click Inbound port rule.

      azure-vm-networking3.png
    5. Add an inbound rule to the network security group to allow ports 3000, 4043, and 4443 on the internal IP address of the Unravel VM.

      In the screenshot below, the internal IP address of the Unravel VM is 10.10.1.63.

      azure-vm-networking4.png

      You should now see the following three inbound rules added to the network security group:

      | Rule name | Port | Protocol | Source | Destination |
      |---|---|---|---|---|
      | Allow_port_3000_to_Unravel_VM | 3000 | Any | Any | Unravel_VM_internal_IP |
      | Allow_port_4043_to_Unravel_VM | 4043 | Any | Any | Unravel_VM_internal_IP |
      | Allow_port_4443_to_Unravel_VM | 4443 | Any | Any | Unravel_VM_internal_IP |
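
    The same inbound rules could also be created from the Azure CLI instead of the portal. The sketch below assumes the `az` CLI is installed and logged in. The function name `open_unravel_port`, the `AZ` override variable, and the rule priority passed by the caller are illustrative choices, not values from this guide.

    ```shell
    # AZ can be overridden (for example AZ=echo) to preview the command without running it.
    AZ=${AZ:-az}

    # Hypothetical wrapper: create one inbound rule matching the rules listed above.
    # Usage: open_unravel_port RESOURCE_GROUP NSG_NAME UNRAVEL_VM_INTERNAL_IP PORT PRIORITY
    open_unravel_port() {
      "$AZ" network nsg rule create \
        --resource-group "$1" \
        --nsg-name "$2" \
        --name "Allow_port_${4}_to_Unravel_VM" \
        --priority "$5" \
        --direction Inbound \
        --access Allow \
        --protocol '*' \
        --source-address-prefixes '*' \
        --destination-address-prefixes "$3" \
        --destination-port-ranges "$4"
    }
    ```

    For example, `open_unravel_port my-rg my-nsg 10.10.1.63 3000 1001` would create the first rule, where `my-rg`, `my-nsg`, and the priority `1001` are placeholders. Repeat for ports 4043 and 4443 with a distinct priority for each rule.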

  5. Generate a Databricks personal access token for each workspace.

    In the Generate New Token dialog, enter these settings:

    • Workspace name

    • Workspace ID

    • Databricks instance

    • Databricks personal access token

      Tip

      Unravel uses this token to authenticate with and access the Databricks REST API and the DBFS CLI.

      For information on how to generate the token, see https://docs.databricks.com/api/latest/authentication.html#generate-a-token

      Unlike passwords, tokens expire and can be revoked. If a token expires, replace it with a new one and re-run databricks_setup.sh. To identify expired tokens, check the Unravel logs (look for "authentication errors") and Unravel UI (look for missing Databricks metadata).
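
    Before running the setup script, it may help to confirm that a token is still valid. The sketch below is one possible check, not an Unravel tool: it calls the Databricks REST API `clusters/list` endpoint with the token as a bearer credential and succeeds only if the API returns a success status. The function name `check_databricks_token` and the `CURL` override variable are assumptions made for illustration.

    ```shell
    # CURL can be overridden (for example CURL=echo) to preview the request.
    CURL=${CURL:-curl}

    # Hypothetical check: returns 0 if the workspace accepts the token.
    # Usage: check_databricks_token DATABRICKS_INSTANCE TOKEN
    check_databricks_token() {
      # --fail makes curl exit non-zero on HTTP errors such as 401 or 403.
      "$CURL" --silent --show-error --fail --output /dev/null \
        --header "Authorization: Bearer $2" \
        "https://$1/api/2.0/clusters/list"
    }
    ```

    A call such as `check_databricks_token "$DATABRICKS_INSTANCE" "$TOKEN" && echo "token accepted"` would report a working token. Both variable names are placeholders for the Databricks instance and token recorded in this step.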

  6. Enable token-based authentication.

  7. Install DBFS CLI on the Azure VM host.

    Note

    This CLI needs a Python version newer than 2.7.9.
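
    The Python requirement can be checked before installing the CLI. The following is a small sketch that assumes GNU `sort` with the `-V` (version sort) option is available on the VM. Both function names are hypothetical.

    ```shell
    # Hypothetical helper: succeeds if version $1 is strictly newer than version $2.
    version_newer_than() {
      [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
    }

    # Print the version of the given interpreter (default: python).
    # Python 2 writes its version to stderr, so both streams are merged.
    python_version() {
      "${1:-python}" --version 2>&1 | awk '{print $2}'
    }
    ```

    For example, `version_newer_than "$(python_version)" 2.7.9 || echo "Python is too old for the CLI"` flags an interpreter that does not meet the requirement.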

  8. Run databricks_setup.sh to install Unravel agents on Azure Databricks workspaces.

7. Start Unravel services

  sudo /etc/init.d/unravel_all.sh restart
8. Log into Unravel UI
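
If the UI or the agents cannot reach the server, a quick check of the three ports opened earlier (3000, 4043, and 4443) can narrow down the cause. The sketch below assumes `bash` and the coreutils `timeout` command are available on the machine running the check. The function names are hypothetical.

```shell
# Hypothetical helper: succeeds if a TCP connection to HOST:PORT opens within 3 seconds.
# Usage: port_open HOST PORT
port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Report the state of each Unravel service port on the given host.
# Usage: check_unravel_ports HOST
check_unravel_ports() {
  for port in 3000 4043 4443; do
    if port_open "$1" "$port"; then
      echo "port $port: open"
    else
      echo "port $port: closed or unreachable"
    fi
  done
}
```

Running `check_unravel_ports unravel-host` from a machine outside the VM's virtual network exercises both the Unravel services and the inbound rules.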

Congratulations! Unravel Server is up and running. Proceed to Part 2: Connecting Unravel to a Databricks cluster.