

Each version of Unravel has specific platform requirements. Confirm that your HDInsight platform meets the requirements for the version of Unravel that you're installing, including:

  • HDInsight 3.6

  • Spark 1.6.3, 2.1.0, 2.2.0, 2.3.0

    Limitations: Spark relies on the yarn-site configuration property yarn.log-aggregation.file-formats, whose only supported value is TFile.

  • Hive 2.1

  • Kafka 0.10.0, 1.0, 1.1 (preview)
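
The Spark limitation above can be checked against a copy of the cluster's yarn-site configuration. The following is a minimal sketch, assuming the standard Hadoop configuration layout (a `<configuration>` element containing `<property>` entries with `<name>` and `<value>`). The function names and the sample XML are hypothetical and are not part of Unravel or HDInsight.

```python
import xml.etree.ElementTree as ET
from typing import List

PROPERTY = "yarn.log-aggregation.file-formats"

def log_aggregation_formats(yarn_site_xml: str) -> List[str]:
    """Return the formats listed for PROPERTY in yarn-site XML text, or [] if unset."""
    root = ET.fromstring(yarn_site_xml)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == PROPERTY:
            value = prop.findtext("value") or ""
            # The property holds a comma-separated list of format names.
            return [v.strip() for v in value.split(",") if v.strip()]
    return []

def is_supported(yarn_site_xml: str) -> bool:
    """True only when TFile is the single configured format."""
    return log_aggregation_formats(yarn_site_xml) == ["TFile"]

# Hypothetical sample of the relevant yarn-site fragment.
SAMPLE = """<configuration>
  <property>
    <name>yarn.log-aggregation.file-formats</name>
    <value>TFile</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(is_supported(SAMPLE))
```

The check is deliberately strict: a value that lists any format in addition to TFile is reported as unsupported, and so is a missing property, because this sketch does not assume a default.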


Suggested minimum VM type: medium, memory optimized, such as Standard_E8s_v3


Image (underlying operating system for the VM): RHEL 7 or CentOS 7.2 - 7.6. The actual HDInsight Kafka/Spark cluster can run another OS.

  • You must already have an Azure account.

  • You must already have a resource group assigned to a region in order to group your policies, VMs, and storage blobs/lakes/drives.

    A resource group is a container that holds related resources for an Azure solution. In Azure, you logically group related resources such as storage accounts, virtual networks, and virtual machines (VMs) to deploy, manage, and maintain them as a single entity.

  • You must have root privileges to run some commands on the VM.

  • You must already have created Azure storage.

  • You must have an SSH key pair.

  • You must already have a virtual network and network security group set up for your resource group. Your virtual network and subnet(s) must be big enough to be shared by the Unravel VM and the target HDInsight cluster(s).

  • The Unravel VM must be located in the same virtual network (VNET) and subnet (VSNET) as the HDInsight cluster.

  • You must allow inbound SSH connections to the Unravel VM.

  • You must allow outbound Internet access and all traffic within the subnet (VSNET).

  • Port 443 must be open on the cluster so that Azure HDInsight can monitor applications.

  • Port 3000 (or 4020) must be open for Unravel Web UI access.

  • UDP and TCP ports 4041-4043 must be open from the cluster to Unravel Server.
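
Once the Unravel VM is running, the TCP port requirements in the list above can be spot-checked from a cluster node. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the host you pass in is whatever address your Unravel VM has. A successful connection shows that the network path is allowed and that something is listening. A failure can mean either a blocked port or no listener. UDP reachability cannot be verified this way.

```python
import socket
from typing import Dict, Iterable

def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False

def report(host: str, ports: Iterable[int]) -> Dict[int, bool]:
    """Map each port to whether a TCP connection to it succeeded."""
    return {port: tcp_port_open(host, port) for port in ports}
```

For example, calling `report(unravel_host, [3000, 4041, 4042, 4043])` from a cluster node, where `unravel_host` is a placeholder for your Unravel VM's address, covers the Web UI port and the TCP side of ports 4041-4043. Substitute 4020 for 3000 if your Web UI uses that port.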