Part 1: Installing Unravel Server

Important

If you have not already done so, confirm your new node meets Unravel's hosting requirements.

1. Create a new node on your cluster
  1. Using Ambari, add a new node (host) to your cluster. This new node will be Unravel Server's host machine.

  2. Verify that Ambari installed the following clients on the new host:

    • Atlas Metadata

    • HCat

    • HDFS

    • Hive

    • Kerberos

    • MapReduce2

    • Oozie

    • Pig

    • Slider

    • Spark

    • Spark2

    • Tez

    • YARN

    • ZooKeeper

    hdp-new-node.png
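One way to spot-check the client installation from the new host's shell is to confirm that the client commands resolve on the PATH. The sketch below is a minimal helper, not part of Unravel or Ambari. The function name missing_commands is hypothetical, and the command names you pass to it are assumptions, because the executable that each Ambari client installs can differ between HDP versions.

```shell
# Hypothetical helper: print each named command that is NOT found on the PATH.
# Prints nothing when every command resolves.
missing_commands() {
    for cmd in "$@"; do
        # command -v exits non-zero when the name cannot be resolved
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}
```

For example, running missing_commands hdfs hive yarn pig oozie prints only the names that could not be found. Those five names are illustrative; Ambari's host page remains the authoritative list of installed clients.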
3. Install Unravel Server on the new node
  1. Download the Unravel Server RPM.

  2. Install the Unravel Server RPM.

    sudo rpm -Uvh unravel-version.rpm

    The installation creates the following items:

    • /usr/local/unravel/, which contains executables, scripts, the properties file (unravel.properties), and logs.

    • /etc/init.d/unravel_*, which are scripts for controlling services. For example, unravel_all.sh manually stops, starts, and reports the status of all daemons in the proper order.

    • User unravel if it doesn't exist already.

    • An HDFS directory for Hive Hook instrumentation, if your cluster is not secured. If your cluster is secured, you'll create this directory later.

  3. If the cluster is kerberized, check Kerberos settings with the kinit and klist commands:

    klist -kt /etc/security/keytabs/unravel.keytab
    Keytab name: FILE:/etc/security/keytabs/unravel.keytab
    KVNO Timestamp           Principal
    ---- ------------------- ------------------------------------------------------
       3 03/08/2019 15:20:13 unravel/congo52.unraveldata.com@lab.localdomain
    
    kinit -kt /etc/security/keytabs/unravel.keytab unravel/congo52.unraveldata.com@lab.localdomain
    
    klist
    Ticket cache: FILE:/tmp/krb5cc_0
    Default principal: unravel/congo52.unraveldata.com@lab.localdomain 
    Valid starting       Expires              Service principal
    03/14/2019 16:02:49  03/15/2019 16:02:49  krbtgt/lab.localdomain@lab.localdomain 
    # groups unravel
    unravel : unravel hadoop
    # kvno unravel/congo52.unraveldata.com@lab.localdomain
    unravel/congo52.unraveldata.com@lab.localdomain: kvno = 3
  4. Check the unravel user's network access.

    If the cluster is kerberized, make sure Unravel can access http://timeline-host:8188/ws/v1/timeline. The curl command below works only after a successful kinit as the unravel user. The -u : option supplies an empty placeholder user, which curl ignores when it authenticates through GSS-API.

    curl --negotiate -v -u : -X GET http://timeline-host:8188/ws/v1/timeline
    ...
    > Authorization: Negotiate token
    > User-Agent: curl/7.29.0
    > Host: congo52.unraveldata.com:8188
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    ...
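The keytab check in step 3 can also be scripted. The sketch below reads klist -kt output on standard input and reports whether a given principal is present. Both function names are hypothetical, and the parsing assumes the output layout shown in the sample above (three header lines, then KVNO, date, time, and principal columns); a klist build that prints timestamps differently would need a different field number.

```shell
# Hypothetical helper: read `klist -kt <keytab>` output on stdin and print
# the principal column of each entry.
keytab_principals() {
    # Skip the three header lines (keytab name, column titles, dashes).
    # On entry lines the fields are: KVNO, date, time, principal.
    awk 'NR > 3 && NF >= 4 { print $4 }'
}

# Hypothetical helper: succeed if principal $1 appears in the klist output on stdin.
keytab_has_principal() {
    # -x matches the whole line, -F treats the principal as a fixed string
    keytab_principals | grep -qxF -- "$1"
}
```

A possible invocation, using the keytab path and principal from the sample output:

klist -kt /etc/security/keytabs/unravel.keytab | keytab_has_principal unravel/congo52.unraveldata.com@lab.localdomain && echo "principal found"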
    

Follow these steps to install Unravel from a tarball.

This section describes how to deploy Unravel binaries from a tar file. The binaries can be deployed in any directory on the server. The user who installs Unravel must have write permission on the directory where the binaries are deployed.

To deploy Unravel binaries from a tar file, do the following:

  1. Deploy Unravel from a tar file.

    1. Create an installation directory and grant ownership of it to the user who installs Unravel. This user runs every process involved in the Unravel installation.

      mkdir /path/to/installation/directory
      chown -R username:groupname /path/to/installation/directory
    2. Download Unravel.

    3. Extract the Unravel tar file to the installation directory that you created above.

      tar zxf unravel-<version>.tar.gz -C /path/to/installation/directory
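The extraction step can be wrapped in a small guard that fails early when the tar file is missing or the installation directory is not writable. This is a minimal sketch under the assumption that the directory already exists and is owned by the installing user, as described above; the function name deploy_tarball is hypothetical and is not an Unravel command.

```shell
# Hypothetical helper: extract a tar file into an existing installation directory.
# $1 = path to the tar file, $2 = installation directory
deploy_tarball() {
    tarball=$1
    install_dir=$2
    if [ ! -f "$tarball" ]; then
        echo "tar file not found: $tarball" >&2
        return 1
    fi
    # The installing user must be able to write to the target directory.
    if [ ! -d "$install_dir" ] || [ ! -w "$install_dir" ]; then
        echo "installation directory missing or not writable: $install_dir" >&2
        return 1
    fi
    tar zxf "$tarball" -C "$install_dir"
}
```

Run it as the user who owns the installation directory, passing the downloaded tar file as the first argument and the installation directory as the second.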
  2. Run setup.

    After deploying the Unravel binaries, run the setup command.

    1. Run the following setup command:

      <installation_directory>/versions/4.6.x.x/setup

      For example:

      /opt/unravel/versions/4.6.1.6.659/setup --extra /tmp/mysql --external-database mysql localhost 3306 unravel_mysql_prod unravel unraveldata
    2. Configure Unravel using the following commands:

      <unravel_installation_directory>/manager config auto
      

      These will configure Unravel, start the daemons, and check the status of the daemon processes.

      Note

      Run the setup command, or any combination of its options, with --help to see the complete usage details.

      <unravel_installation_directory>/versions/4.6.x.x/setup --help
      <unravel_installation_directory>/manager/setup manager config auto --help
  3. Verify Unravel installation.

    1. Verify the Unravel sensor, properties, and files. Run the following command from the manager tool:

      manager verify <sensor|properties|files>
      • sensor: Checks the cluster configuration and ensures that the sensor parameters are correct.

      • properties: Validates unravel.properties, checking for invalid, unknown, or duplicate properties.

      • files: Verifies and reports on any unexpected changes to Unravel components, scripts, and configuration.

    2. Start all services and verify their status.

      manager start
      manager report
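The three verification targets in step 3 can be run in one pass. The sketch below loops over sensor, properties, and files and summarizes the results. The function name verify_all is hypothetical, and the sketch assumes that manager verify exits with a non-zero status when a check fails, which this guide does not state; confirm that behavior before relying on the summary.

```shell
# Hypothetical helper: run `manager verify` for each target and summarize.
# $1 = path to (or name of) the manager tool
# ASSUMPTION: manager returns a non-zero exit status when verification fails.
verify_all() {
    manager=$1
    failed=0
    for target in sensor properties files; do
        if "$manager" verify "$target"; then
            echo "verify $target: ok"
        else
            echo "verify $target: FAILED"
            failed=1
        fi
    done
    # 0 only if every target passed
    return "$failed"
}
```

For example, verify_all <unravel_installation_directory>/manager prints one summary line per target after the tool's own output.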
6. Configure Unravel Server with basic options

On Unravel Server, edit /usr/local/unravel/etc/unravel.properties as follows.

  1. Set general properties.

  2. Specify the location of HDFS logs.

    Important

    If you're using both Spark1 and Spark2, you must set com.unraveldata.spark.eventlog.location to both directories as a comma-separated list.

    com.unraveldata.job.collector.done.log.base=/mr-history/done
    com.unraveldata.job.collector.log.aggregation.base=/app-logs/*/logs*
    com.unraveldata.spark.eventlog.location=hdfs:///spark1-history/,hdfs:///spark2-history/
  3. If Kerberos is enabled, do the following:

    1. Create or identify a principal and keytab for Unravel to use to access the HDFS resources listed in the table below and the REST API.

    2. Set these properties:

  4. If Ranger is enabled, do the following:

    1. Check/set Ranger permissions for the runtime user.

    2. Create or identify a principal and keytab for Unravel daemons to access the HDFS resources listed in the table below.

  5. If you are using a virus scanner, exclude Unravel's Elasticsearch directories.

    We recommend that you configure your virus scanner to skip the Elasticsearch directories, which are located under /srv/unravel.

  6. Disable Unravel's Impala sensor.

    HDP doesn't officially support Impala, so set com.unraveldata.sensor.tasks.disabled to iw.

    com.unraveldata.sensor.tasks.disabled=iw
  7. In the Ambari dashboard, set YARN ACLs.

    1. Select YARN | Configs | Advanced.

    2. Set yarn.acl.enable to true.

    3. Add the Unravel user specified in com.unraveldata.kerberos.principal to yarn.admin.acl.

      RangerEnabledYarn.png
    4. Save your changes.
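Property edits such as com.unraveldata.sensor.tasks.disabled=iw can be applied repeatably with a small script instead of a manual edit. The sketch below replaces an existing key=value line or appends one if the key is absent. The function name set_prop is hypothetical, and the sketch assumes one property per line with no spaces around the equals sign; back up unravel.properties before trying it.

```shell
# Hypothetical helper: set key=value in a properties file.
# $1 = file, $2 = key, $3 = value
# ASSUMPTION: properties are written as key=value with no spaces around "=".
set_prop() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    # Pass key and value through the environment so awk does not
    # interpret backslashes in them.
    if PROP_KEY="$key" PROP_VALUE="$value" awk '
        BEGIN { FS = "="; k = ENVIRON["PROP_KEY"]; v = ENVIRON["PROP_VALUE"] }
        # Replace the first line for this key and drop any later duplicates.
        $1 == k { if (!done) print k "=" v; done = 1; next }
        { print }
        # Append the property if the key was never seen.
        END { if (!done) print k "=" v }
    ' "$file" > "$tmp"; then
        # Overwrite in place so the file keeps its owner and permissions.
        cat "$tmp" > "$file"
        rc=$?
    else
        rc=1
    fi
    rm -f "$tmp"
    return "$rc"
}
```

For example: set_prop /usr/local/unravel/etc/unravel.properties com.unraveldata.sensor.tasks.disabled iw. Running it a second time leaves the file unchanged.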

7. Modify Unravel Server for HDP
  1. On Unravel Server, stop all services and run the switch_to_hdp.sh script.

    sudo /etc/init.d/unravel_all.sh stop
    sudo /usr/local/unravel/install_bin/switch_to_hdp.sh

    Note

    The changes made by switch_to_hdp.sh are persistent through subsequent RPM upgrades; you don't need to re-run this when you upgrade the RPM.

  2. Open /usr/local/unravel/etc/unravel.properties and check whether any of the properties that switch_to_hdp.sh added or modified need further changes:

    • Log properties.

      Important

      If you're using both Spark1 and Spark2, you must set com.unraveldata.spark.eventlog.location to both directories as a comma-separated list.

      com.unraveldata.job.collector.done.log.base=/mr-history/done
      com.unraveldata.job.collector.log.aggregation.base=/app-logs/*/logs*
      com.unraveldata.spark.eventlog.location=hdfs:///spark1-history/,hdfs:///spark2-history/
    • Hive metastore properties.

      javax.jdo.option.ConnectionURL=jdbc:mysql://unravel-host:3306/database-name
      javax.jdo.option.ConnectionDriverName=com.mysql.jdbc.Driver
      javax.jdo.option.ConnectionPassword=hive-metastore-password
      javax.jdo.option.ConnectionURL=JDBC connection string
  3. Log into Ambari Web UI and set the following properties to match the values in /usr/local/unravel/etc/unravel.properties:

    Cluster Property                              Unravel Property
    mapreduce.jobhistory.done-dir                 com.unraveldata.job.collector.done.log.base
    yarn.nodemanager.remote-app-log-dir           com.unraveldata.job.collector.log.aggregation.base
    jdbc:mysql://database-host:3306/database-url  javax.jdo.option.ConnectionURL
    JDBC Driver Class                             javax.jdo.option.ConnectionDriverName
    Database Username                             javax.jdo.option.ConnectionUserName
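To collect the values to enter in Ambari, the relevant keys can be read out of unravel.properties in one pass. This is a minimal sketch: the function names get_prop and print_ambari_values are hypothetical, the key names are spelled as in the properties listings in step 2, and the parsing assumes key=value lines with no spaces around the equals sign. If a key appears more than once, the sketch prints the last occurrence.

```shell
# Hypothetical helper: print the value of key $2 from properties file $1.
# Returns non-zero if the key is not present.
get_prop() {
    PROP_KEY="$2" awk '
        BEGIN { FS = "=" }
        # Keep everything after the first "=", so values containing "=" survive.
        $1 == ENVIRON["PROP_KEY"] { value = substr($0, length($1) + 2); found = 1 }
        END {
            if (!found) exit 1
            print value
        }
    ' "$1"
}

# Hypothetical helper: list the Unravel properties that have Ambari counterparts.
print_ambari_values() {
    for key in \
        com.unraveldata.job.collector.done.log.base \
        com.unraveldata.job.collector.log.aggregation.base \
        javax.jdo.option.ConnectionURL \
        javax.jdo.option.ConnectionDriverName \
        javax.jdo.option.ConnectionUserName
    do
        if value=$(get_prop "$1" "$key"); then
            echo "$key=$value"
        else
            echo "$key is not set"
        fi
    done
}
```

For example: print_ambari_values /usr/local/unravel/etc/unravel.properties.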

8. Change the run-as user and group for Unravel daemons
9. Start Unravel services

Run the following command to start all Unravel services:

sudo /etc/init.d/unravel_all.sh start

This completes the basic/core configuration.
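The daemons may need some time after the start command before the UI responds. The sketch below is a generic retry loop that can wrap a probe command. The function name retry is hypothetical, and the attempt count and delay in the usage note are arbitrary choices, not Unravel recommendations.

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# $1 = maximum attempts, $2 = seconds to sleep between attempts, rest = command
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        # Stop as soon as the command succeeds.
        "$@" && return 0
        # Sleep only if another attempt will follow.
        if [ "$n" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        n=$((n + 1))
    done
    return 1
}
```

For example, retry 30 10 curl -sf -o /dev/null http://unravel-host:3000 probes the UI address up to 30 times, 10 seconds apart. That probe assumes the UI answers an unauthenticated request with a non-error HTTP status.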

10. Log into Unravel UI

Using a supported web browser (see Unravel's HDP compatibility matrix), navigate to http://unravel-host:3000 and log in with username admin and password unraveldata.

signin.png

Unravel UI displays collected data.