Installing Unravel on HDP

This topic explains how to deploy Unravel on Hortonworks Data Platform (HDP).

Important

If you have not already done so, confirm your new node meets Unravel's hosting requirements.

1. Create a new node on your cluster
  1. Using Ambari, add a new node (host) to your cluster. This new node will be Unravel Server's host machine.

  2. Verify that Ambari installed the following clients on the new host:

    • Atlas Metadata

    • HCat

    • HDFS

    • Hive

    • Kerberos

    • MapReduce2

    • Oozie

    • Pig

    • Slider

    • Spark

    • Spark2

    • Tez

    • YARN

    • ZooKeeper

    hdp-new-node.png
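
One way to spot-check the client installation from a shell on the new host is to test whether each client's command-line entry point is on the PATH. The sketch below is only an illustration and is not part of Unravel or Ambari: the helper name missing_clients is hypothetical, and the command names in the last line (hdfs, hive, yarn, and so on) are assumptions about how the clients are exposed on your host.

```shell
# Hypothetical helper: print the names of any commands not found on PATH.
missing_clients() {
  missing=""
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || missing="$missing $cmd"
  done
  # Trim the leading space; prints an empty line when nothing is missing.
  echo "${missing# }"
}

# Assumed command names for some of the clients listed above; adjust as needed.
missing_clients hdfs hive yarn mapred oozie pig klist
```

An empty result suggests the listed commands are available; the Ambari host view remains the authoritative record of which clients are installed.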
4. Check Kerberos settings.
  1. If the cluster is kerberized, check Kerberos settings with the kinit and klist commands:

    klist -kt /etc/security/keytabs/unravel.keytab
    Keytab name: FILE:/etc/security/keytabs/unravel.keytab
    KVNO Timestamp           Principal
    ---- ------------------- ------------------------------------------------------
       3 03/08/2019 15:20:13 unravel/abcd25.unraveldata.com@lab.localdomain
    
    kinit -kt /etc/security/keytabs/unravel.keytab unravel/abcd25.unraveldata.com@lab.localdomain
    
    klist
    Ticket cache: FILE:/tmp/krb5cc_0
    Default principal: unravel/abcd25.unraveldata.com@lab.localdomain 
    Valid starting       Expires              Service principal
    03/14/2019 16:02:49  03/15/2019 16:02:49  krbtgt/lab.localdomain@lab.localdomain 
    # groups unravel
    unravel : unravel hadoop
    # kvno unravel/abcd25.unraveldata.com@lab.localdomain
    unravel/abcd25.unraveldata.com@lab.localdomain: kvno = 3
  2. Check the unravel user's network access.

    If the cluster is kerberized, make sure the unravel user can access http://timeline-host:8188/ws/v1/timeline. The curl command below works only after a successful kinit for the unravel user. The -u : option supplies an empty placeholder user, which curl ignores when authenticating through GSS-API.

    curl --negotiate -v -u : -X GET http://timeline-host:8188/ws/v1/timeline
    ...
    > Authorization: Negotiate token
    > User-Agent: curl/7.29.0
    > Host: congo52.unraveldata.com:8188
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    ...
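
For a scripted version of the same check, the verbose output can be reduced to the HTTP status code. The following is a minimal sketch under stated assumptions: the helper names timeline_status and explain_status are hypothetical, and the diagnosis messages are generic HTTP interpretations rather than Unravel-specific guidance. It relies on curl's -w '%{http_code}' option, which prints 000 when no HTTP response was received.

```shell
# Hypothetical helper: fetch only the HTTP status code of a SPNEGO-authenticated GET.
# Requires a valid ticket (kinit) for the unravel user.
timeline_status() {
  curl --negotiate -u : -s -o /dev/null -w '%{http_code}' "$1"
}

# Hypothetical helper: translate a status code into a short diagnosis.
explain_status() {
  case "$1" in
    200) echo "OK: timeline server reachable and authenticated" ;;
    401) echo "FAIL: authentication rejected; check kinit and klist output" ;;
    403) echo "FAIL: authenticated but not authorized" ;;
    000) echo "FAIL: no HTTP response; check host, port, and firewall" ;;
    *)   echo "FAIL: unexpected HTTP status $1" ;;
  esac
}

# Usage (timeline-host is the placeholder used in this topic):
#   explain_status "$(timeline_status http://timeline-host:8188/ws/v1/timeline)"
```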
    
6. (Optional) Enable additional daemons for high-volume workloads.
7. Configure Unravel Server with basic options
  1. Edit the Unravel properties through automatic or manual configuration. Following is a reference list of the basic properties that you can configure for HDP installation.

  2. If Kerberos is enabled, do the following:

    1. Create or identify a principal and keytab for Unravel to use to access the HDFS resources listed in the table below and the REST API.

    2. Use the manager tool to access the configuration UI and run the following commands to set these properties:

      manager config kerberos set --keytab /path/to/keytab --principal <service@REALM>
      
      manager config kerberos enable

      Also, refer to Kerberos configurations.

  3. If Ranger is enabled, do the following:

    1. Check/set Ranger permissions for the runtime user.

    2. Create or identify a principal and keytab for Unravel daemons to access the HDFS resources listed in the table below.

  4. If you are using a virus scanner, exclude the elasticsearch directories from scanning.

    These directories are located under <Unravel installation directory>/data.

  5. In the Ambari dashboard, set YARN ACLs.

    1. Select YARN | Configs | Advanced.

    2. Set yarn.acl.enable to true.

    3. Add the Unravel user specified in com.unraveldata.kerberos.principal to yarn.admin.acl.

      RangerEnabledYarn.png
    4. Save your changes.
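
After saving, the resulting yarn.admin.acl value can be sanity-checked from a shell. YARN ACL values take the form of a comma-separated user list, a space, and a comma-separated group list, with * meaning everyone. The sketch below tests whether a user appears in the user list of such a value; the helper name acl_has_user is hypothetical, and passing the short user name (for example unravel, rather than the full Kerberos principal) is an assumption about how your cluster maps principals to users.

```shell
# Hypothetical helper: succeed if USER is in the user list of a YARN ACL value.
# ACL format: "user1,user2 group1,group2"; "*" grants access to everyone.
acl_has_user() {
  acl=$1
  user=$2
  if [ "$acl" = "*" ]; then
    return 0
  fi
  # The user list is everything before the first space.
  users=${acl%% *}
  case ",$users," in
    *",$user,"*) return 0 ;;
    *)           return 1 ;;
  esac
}

# Usage with an example value copied from Ambari (the value is illustrative):
#   acl_has_user "yarn,unravel hadoop" unravel && echo "unravel is a YARN admin"
```

The check covers only the user list; a user who is an admin through group membership is not detected by this sketch.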

8. Set the Hive metastore password

Use the manager tool to access the configuration UI and set the Hive metastore password manually:

manager config properties set javax.jdo.option.ConnectionPassword --<encrypted>
9. Verify properties

Log in to the Ambari web UI and verify that the value of each Unravel property below matches the corresponding cluster property or setting.

Unravel Property                                          Cluster Property
com.unraveldata.job.collector.done.log.base               mapreduce.jobhistory.done-dir
com.unraveldata.job.collector.done.log.aggregation.base   yarn.nodemanager.remote-app-log-dir
javax.jdo.option.ConnectionURL                            jdbc:mysql://database-host:3306/database-url
javax.jdo.option.ConnectionDriverName                     JDBC Driver Class
javax.jdo.option.ConnectionUserName                       Database Username
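
As a cross-check against what Ambari shows, the cluster-side values can also be read from the client configuration files on the Unravel host. The sketch below is an illustration only: the helper name hadoop_prop is hypothetical, the file paths in the usage lines are assumptions about a typical layout, and the parsing assumes each <name> and <value> element sits on a single line.

```shell
# Hypothetical helper: print the value of a property from a Hadoop *-site.xml file.
# $1 = property name, $2 = path to the XML file.
hadoop_prop() {
  awk -v want="$1" '
    /<name>/  { n = $0; sub(/.*<name>/, "", n);  sub(/<\/name>.*/, "", n) }
    /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                if (n == want) { print v; exit } }
  ' "$2"
}

# Usage (paths are assumptions; adjust to your client configuration directory):
#   hadoop_prop mapreduce.jobhistory.done-dir /etc/hadoop/conf/mapred-site.xml
#   hadoop_prop yarn.nodemanager.remote-app-log-dir /etc/hadoop/conf/yarn-site.xml
```

The printed values can then be compared by eye with the Unravel properties in the table; the helper prints nothing if the property is absent from the file.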

10. Start Unravel

Run the following command to start Unravel:

<Unravel installation directory>/manager start

This completes the basic/core configuration.
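
The daemons may need some time before the UI answers. A minimal sketch for waiting on it from a shell is shown below; the helper name wait_until and the retry count and delay in the usage line are assumptions, and the URL is the UI address given in the login step of this topic.

```shell
# Hypothetical helper: run a command up to ATTEMPTS times, DELAY seconds apart,
# and succeed as soon as the command succeeds.
# $1 = attempts, $2 = delay in seconds, remaining arguments = command to run.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Usage: poll the UI for up to about 5 minutes (30 attempts, 10 seconds apart).
#   wait_until 30 10 curl -s -o /dev/null http://unravel-host:3000 && echo "UI is up"
```

Without the -f option, curl exits successfully for any HTTP response, so this only confirms that something is listening and answering on the port.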

11. Log in to the Unravel UI

Using a supported web browser (see Unravel's HDP compatibility matrix), navigate to http://unravel-host:3000 and log in with username admin and password unraveldata.

signin.png

The Unravel UI displays the collected data.