Prerequisites
To deploy Unravel, first ensure that your cluster and host meet these requirements.
Platform
Each version of Unravel has specific platform requirements. Confirm that your HDP cluster meets the requirements for the version of Unravel that you're installing. Your HDP environment must be deployed using Ambari.
Sizing
Use Ambari Web UI to allocate a new node with these specifications in your cluster:
Software
On the new node, confirm the following:
All default clients are running:
If the new node is running Red Hat Enterprise Linux (RHEL) 6.x, set bootstrap.system_call_filter to false in elasticsearch.yml:
bootstrap.system_call_filter: false
libaio.x86_64 is installed.
For Unravel version 4.5.0.0, SELINUX is set to permissive or disabled in /etc/sysconfig/selinux. For Unravel versions 4.5.0.1+, SELINUX can be set to enabled.
PATH includes the path to the HDFS+Hive+YARN+Spark client/gateway, Hadoop commands, and Hive commands.
ZooKeeper is not installed.
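The SELinux and PATH checks above can be scripted before you start the install. The following is a minimal sketch, not part of Unravel: the function names are hypothetical, and the command list (hadoop, hdfs, hive, yarn) is an assumption about which client commands you want to find on PATH.

```python
import os
import shutil


def selinux_mode(config_text):
    """Return the SELINUX= value from the text of an SELinux config file, or None."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Match SELINUX exactly so that SELINUXTYPE= is not picked up.
        if key.strip() == "SELINUX":
            return value.strip().lower()
    return None


def missing_commands(commands, path=None):
    """Return the commands that cannot be found on PATH (or on the given path)."""
    return [c for c in commands if shutil.which(c, path=path) is None]


if __name__ == "__main__":
    config = "/etc/sysconfig/selinux"
    if os.path.exists(config):
        with open(config) as fh:
            print("SELINUX =", selinux_mode(fh.read()))
    # Assumed command names; adjust to the clients installed on your node.
    print("Missing from PATH:", missing_commands(["hadoop", "hdfs", "hive", "yarn"]))
```

Run it on the new node as the user that will run Unravel, so that PATH reflects that user's environment.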
Permissions
You must have root access or "sudo root" permission in order to install the Unravel Server RPM.
If you're using Kerberos, we'll explain how to create a principal and keytab for Unravel daemons to use to access these HDFS resources:
MapReduce logs (/mr-history/done)
YARN's log aggregation directory (/app-logs/*/logs*)
Spark and Spark2 event logs (hdfs:///user/spark/applicationHistory/, hdfs:///spark2-history/, hdfs:///spark-history/)
File and partition sizes in the Hive warehouse directory (typically hdfs:///apps/hive/warehouse)
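Once a principal exists, you may want a quick check that these HDFS locations are visible to it. The sketch below wraps `hdfs dfs -ls -d` for each path. The helper names are hypothetical, the Hive warehouse path is the typical location rather than a guaranteed one, and not every cluster has all three Spark history directories, so some failures can be expected. It is a rough visibility check, not a full permission audit.

```python
import subprocess

# Paths taken from the list above; adjust them to match your cluster.
HDFS_PATHS = [
    "/mr-history/done",
    "/app-logs/*/logs*",
    "hdfs:///user/spark/applicationHistory/",
    "hdfs:///spark2-history/",
    "hdfs:///spark-history/",
    "hdfs:///apps/hive/warehouse",  # typical location, an assumption
]


def list_command(path):
    """Build the hdfs command that lists the path itself, not its contents."""
    return ["hdfs", "dfs", "-ls", "-d", path]


def unreadable_paths(paths, run=subprocess.run):
    """Return the paths for which the listing command exits non-zero."""
    failed = []
    for path in paths:
        result = run(list_command(path), capture_output=True)
        if result.returncode != 0:
            failed.append(path)
    return failed
```

Call `unreadable_paths(HDFS_PATHS)` as the user whose access you want to verify, with a valid Kerberos ticket if the cluster is secured.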
Unravel needs access to the YARN Resource Manager's REST API (so that the principal can determine which resource manager is active).
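To illustrate how the active Resource Manager can be identified over REST, the sketch below queries the cluster information endpoint (/ws/v1/cluster/info) on each candidate and reads its haState field. This is an illustration only, not Unravel's implementation: the function names are hypothetical, and it assumes plain, unauthenticated HTTP access. A Kerberized cluster may require SPNEGO authentication instead.

```python
import json
from urllib.request import urlopen


def ha_state(cluster_info_json):
    """Extract haState from a /ws/v1/cluster/info response body."""
    info = json.loads(cluster_info_json).get("clusterInfo", {})
    return info.get("haState")


def _http_get(url, timeout=5):
    """Fetch a URL and return the body as text."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def find_active_rm(base_urls, fetch=None):
    """Return the first base URL whose Resource Manager reports haState ACTIVE."""
    fetch = fetch or _http_get
    for base in base_urls:
        try:
            body = fetch(base.rstrip("/") + "/ws/v1/cluster/info")
            state = ha_state(body)
        except (OSError, ValueError):
            # Unreachable host or a non-JSON response: try the next candidate.
            continue
        if state == "ACTIVE":
            return base
    return None
```

For example, `find_active_rm(["http://rm1:8088", "http://rm2:8088"])` would return whichever URL answers as active, where rm1 and rm2 are placeholder hostnames and 8088 is the default Resource Manager web port.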
Unravel needs JDBC access to the Hive Metastore. Read-only access is sufficient.
If you plan to use Unravel's move or kill auto actions, add the Unravel username to YARN's yarn.admin.acl property.
Unravel needs read-only access to the Application Timeline Server (ATS).
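For the yarn.admin.acl requirement above, it helps to know the value's layout: a comma-separated list of users, then a space, then a comma-separated list of groups, with * meaning everyone. The sketch below computes the new value when a user is added. The function name and the username unravel are hypothetical; substitute the account that Unravel actually runs as, and apply the result through Ambari.

```python
def add_user_to_acl(acl, user):
    """Return the ACL string with user added to the user list; groups are kept."""
    if acl.strip() == "*":
        return "*"  # the wildcard already admits every user
    # Split on the first space only: users come before it, groups after it.
    users_part, _, groups_part = acl.partition(" ")
    users = [u for u in users_part.split(",") if u]
    if user not in users:
        users.append(user)
    result = ",".join(users)
    groups = groups_part.strip()
    return f"{result} {groups}" if groups else result


if __name__ == "__main__":
    print(add_user_to_acl("yarn,ams hadoop", "unravel"))
```

A value that begins with a space, such as " hadoop", names groups only, which is why the function does not strip leading whitespace before splitting.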