MapR

Before installing, check and complete the installation requirements. Then follow the instructions below to download, install, and set up Unravel for the MapR platform.

Notice

The following instructions are for a single cluster environment. To install Unravel in a multi-cluster environment, refer to Multi-cluster install.

1. Download Unravel
2. Deploy Unravel binaries

Unravel binaries are available as a tar file or RPM package. You can deploy the Unravel binaries in any directory on the server. However, the user who installs Unravel must have write permissions to the directory where the Unravel binaries are deployed.

After the Unravel binaries are deployed, the directory layout for both the tar file and the RPM package is unravel/versions/<Directories and files>. The binaries are deployed to <Unravel_installation_directory> and Unravel is available in <Unravel_installation_directory>/unravel.

Option 1: Deploy Unravel from a tar file

The following steps to deploy Unravel from a tar file should be performed by the user who will run Unravel.

  1. Create an installation directory.

    mkdir /path/to/installation/directory
    ## For example: mkdir /opt/unravel
    

    Note

    Some locations may require root access to create a directory. In such a case, after the directory is created, change the ownership to the unravel user and continue with the installation procedure as the unravel user.

    chown -R username:groupname /path/to/installation/directory
    ## For example: chown -R unravel:unravelgroup /opt/unravel
    
  2. Extract the Unravel tar file to the installation directory that you created in the first step. After you extract the contents of the tar file, an unravel directory is created within the installation directory.

    tar zxf unravel-<version>.tar.gz -C /path/to/installation/directory
    ## For example: tar zxf unravel-4.7.2.3.tar.gz -C /opt
    ## The unravel directory will be available within /opt
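
The tar-file steps above can be combined into one small helper. The following is a minimal sketch and is not shipped with Unravel: the function name deploy_unravel_tar and its two arguments are hypothetical. It checks that the current user can write to the installation directory before running the same tar command as in step 2.

```shell
#!/bin/sh
# Hypothetical helper (not part of Unravel): extract an Unravel tar file into an
# installation directory after checking that the current user can write to it.
# Usage: deploy_unravel_tar <tar_file> <installation_directory>
deploy_unravel_tar() {
    tarball=$1       # for example: unravel-4.7.2.3.tar.gz
    install_dir=$2   # for example: /opt

    if [ ! -f "$tarball" ]; then
        echo "tar file not found: $tarball" >&2
        return 1
    fi
    # Create the directory if needed; this fails where root access is required.
    mkdir -p "$install_dir" || return 1
    if [ ! -w "$install_dir" ]; then
        echo "no write permission on $install_dir for uid $(id -u)" >&2
        return 1
    fi
    # Same extraction command as in step 2 above.
    tar zxf "$tarball" -C "$install_dir" || return 1
    # Print the location of the extracted unravel directory.
    echo "$install_dir/unravel"
}
```

For example, deploy_unravel_tar unravel-4.7.2.3.tar.gz /opt prints /opt/unravel on success and returns a non-zero status otherwise.
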
Option 2: Deploy Unravel from an RPM package

Important

The following steps, to deploy Unravel from an RPM package, should be performed by a root user. After the RPM package is deployed, the remaining installation procedures should be performed by the unravel user.

  1. Create an installation directory.

    mkdir /usr/local/unravel
    
  2. Run the following command:

    rpm -i unravel-<version>.rpm
    ## For example: rpm -i unravel-4.7.2.3.rpm 
    ## The unravel directory will be available in /usr/local

    To use a different location, pass the --prefix option. For example:

    mkdir /opt/unravel
    chown -R username:groupname /opt/unravel
    rpm -i unravel-4.7.2.3.rpm --prefix /opt
    
    ## The unravel directory will be available in /opt
  3. Grant ownership of the directory to a user who will run Unravel. This user executes all the processes involved in Unravel installation.

    chown -R username:groupname /usr/local/unravel
    ## For example: chown -R unravel:unravelgroup /usr/local/unravel
  4. Continue with the installation procedures as unravel user.
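
Because the remaining steps must be run by the user who owns the installation directory, it can help to confirm the ownership change before continuing. The check below is a sketch only: the function name owned_by_current_user is hypothetical and nothing like it ships with Unravel. It compares numeric user IDs, so it does not depend on how user names are resolved on the host.

```shell
#!/bin/sh
# Hypothetical check (not part of Unravel): succeed only if the current user
# owns the given directory. Returns 2 if the directory does not exist.
owned_by_current_user() {
    dir=$1
    [ -d "$dir" ] || return 2
    # ls -ldn prints the numeric owner ID in the third column.
    owner_uid=$(ls -ldn "$dir" | awk '{print $3}')
    [ "$owner_uid" = "$(id -u)" ]
}
```

Run as the unravel user, for example: owned_by_current_user /usr/local/unravel && echo "ownership is correct"
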

3. Run the setup

Run the setup command to install Unravel. The setup command does the following:

  • Runs Precheck automatically to detect possible issues that prevent a successful installation and suggests how to resolve them. Refer to Precheck filters for the expected value for each filter.

  • Lets you pass extra parameters to integrate the database of your choice.

    The setup command allows you to use a managed database shipped with Unravel or an external database. The setup uses the Unravel managed PostgreSQL database when run without any additional parameters. Otherwise, you can specify one of the following types of databases in the setup command:

    • MySQL (Unravel managed as well as external MySQL database)

    • MariaDB (Unravel managed as well as external MariaDB database)

    • PostgreSQL (External PostgreSQL)

    Refer to Integrate database for details.

  • Lets you specify a path for the data directory other than the default path.

    The data directory holds the Unravel data and configurations. By default, the installer maintains the data directory under <Unravel installation directory>/data. You can change this location by passing additional parameters to the setup command.

  • Provides more setup options.

Notice

The Unravel user who owns the installation directory should run the setup command to install Unravel.

To install Unravel with the setup command, do the following:

  1. Switch to the Unravel user.

      su - <unravel user>
  2. Run the setup command:

    Note

    Refer to setup Options for all the additional parameters that you can run with the setup command.

    Before running the setup command with any database other than the Unravel managed PostgreSQL, which is shipped with the product, refer to the Integrate database topic and complete the prerequisites. Extra parameters must be passed with the setup command when using another database.

    Tip

    Optionally, if you want to provide a different data directory, you can pass an extra parameter (--data-directory) with the setup command as follows:

    <unravel_installation_directory>/unravel/versions/<Unravel version>/setup --data-directory /the/data/directory

    Similarly, you can configure separate locations for other Unravel directories. Contact support for assistance.

    • PostgreSQL

      • Unravel managed PostgreSQL

        <unravel_installation_directory>/unravel/versions/<Unravel version>/setup
      • External PostgreSQL

        <unravel_installation_directory>/unravel/versions/<Unravel version>/setup --external-database postgresql <HOST> <PORT> <SCHEMA> <USERNAME> <PASSWORD>
        

        The HOST, PORT, SCHEMA, USERNAME, and PASSWORD are optional fields and are prompted if missing. For example: /opt/unravel/versions/abcd.992/setup --external-database postgresql xyz.unraveldata.com 5432 unravel_db_prod unravel unraveldata

    • MySQL

      • Unravel managed MySQL

        <unravel_installation_directory>/unravel/versions/<Unravel version>/setup --extra /tmp/mysql
      • External MySQL

        <unravel_installation_directory>/unravel/versions/<Unravel version>/setup --extra /tmp/<MySQL-directory> --external-database mysql <HOST> <PORT> <SCHEMA> <USERNAME> <PASSWORD>
        

        The HOST, PORT, SCHEMA, USERNAME, and PASSWORD are optional fields and are prompted if missing.

    • MariaDB

      • Unravel managed MariaDB

        <unravel_installation_directory>/unravel/versions/<Unravel version>/setup --extra /tmp/mariadb
      • External MariaDB

        <unravel_installation_directory>/unravel/versions/<Unravel version>/setup --extra /tmp/<MariaDB-directory> --external-database mariadb <HOST> <PORT> <SCHEMA> <USERNAME> <PASSWORD>
        

        The HOST, PORT, SCHEMA, USERNAME, and PASSWORD are optional fields and are prompted if missing.

    Precheck is automatically run when you run the setup command. Refer to Precheck filters for the expected value for each filter.

  3. Apply the changes.

    <Unravel installation directory>/unravel/manager config apply
    
  4. Start all the services.

    <unravel_installation_directory>/unravel/manager start 
    
  5. Check the status of services.

    <unravel_installation_directory>/unravel/manager report 
    

    The following service statuses are reported:

    • OK: Service is up and running.

    • Not Monitored: Service is not running. (Has stopped or has failed to start)

    • Initializing: Services are starting up.

    • Does not exist: The process unexpectedly disappeared. A restart will be attempted ten times.

    You can also get the status and information for a specific service. Run the manager report command as follows:

    <unravel_installation_directory>/unravel/manager report <service>
    

    For example: /opt/unravel/manager report auto_action
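
The database variants of the setup command listed in step 2 differ only in their trailing parameters. The sketch below prints the command line for a given choice so that it can be reviewed before it is run. It is an illustration, not part of Unravel: the function name setup_command and the choice labels such as managed-mysql are hypothetical. The HOST, PORT, SCHEMA, USERNAME, and PASSWORD fields are left out because, as noted above, setup prompts for them when they are missing.

```shell
#!/bin/sh
# Hypothetical helper (not part of Unravel): print the setup command line for a
# database choice. Nothing is executed.
# Usage: setup_command <installation_dir> <unravel_version> <choice> [extra_dir]
setup_command() {
    setup="$1/unravel/versions/$2/setup"
    db=${3:-}
    extra=${4:-}
    case "$db" in
        managed-postgresql)  echo "$setup" ;;
        external-postgresql) echo "$setup --external-database postgresql" ;;
        # Defaults below are the paths used in the managed examples above.
        managed-mysql)       echo "$setup --extra ${extra:-/tmp/mysql}" ;;
        managed-mariadb)     echo "$setup --extra ${extra:-/tmp/mariadb}" ;;
        external-mysql|external-mariadb)
            if [ -z "$extra" ]; then
                echo "an --extra directory is required for $db" >&2
                return 1
            fi
            # Strip the "external-" label prefix to get the database type.
            echo "$setup --extra $extra --external-database ${db#external-}" ;;
        *)
            echo "unknown database choice: $db" >&2
            return 1 ;;
    esac
}
```

For example, setup_command /opt abcd.992 managed-mysql prints /opt/unravel/versions/abcd.992/setup --extra /tmp/mysql. After setup completes, continue with manager config apply, manager start, and manager report as described in steps 3 to 5.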

The Precheck output displays the issues that prevent a successful installation and provides suggestions to resolve them. You must resolve each of the issues before proceeding. See Precheck filters.

After resolving the precheck issues, you must re-login or reload the shell to execute the setup command again.

Note

You can skip the precheck using the setup --skip-precheck command in certain situations.

For example:

/opt/unravel/versions/<Unravel version>/setup --skip-precheck

You can also skip individual checks that you know will fail. For example, to skip the Check limits and Disk freespace checks, take the filter name shown in parentheses next to each failed check, prefix it with ~, and run the setup command as follows:

setup --filter-precheck ~check_limits,~check_freespace 

Tip

Run the setup command with --help, alone or combined with other setup options, for complete usage details.

<unravel_installation_directory>/unravel/versions/<Unravel version>/setup --help
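
The value passed to --filter-precheck is a comma-separated list of filter names, each prefixed with ~. As a small sketch, the hypothetical function below (precheck_skip_filter is an illustrative name, not an Unravel command) builds that value from a list of names. Only the two filter names that appear in the example above are used here.

```shell
#!/bin/sh
# Hypothetical helper: build the value for --filter-precheck from filter names.
# Each name is prefixed with "~" and the names are joined with commas.
precheck_skip_filter() {
    filter=""
    for check in "$@"; do
        # Add a comma separator only when the list is not empty yet.
        filter="${filter:+$filter,}~$check"
    done
    echo "$filter"
}
```

For example: setup --filter-precheck "$(precheck_skip_filter check_limits check_freespace)". Quoting the value is a precaution so that the shell passes the ~ characters through unchanged.
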
Precheck filters
Precheck Sample
/opt/unravel/versions/abcd.1004/setup 
2021-04-05 15:51:30 Sending logs to: /tmp/unravel-setup-20210405-155130.log
2021-04-05 15:51:30 Running preinstallation check...
2021-04-05 15:51:31 Gathering information ................. Ok
2021-04-05 15:51:51 Running checks .................. Ok
--------------------------------------------------------------------------------
system
 Check limits        : PASSED
 Clock sync          : PASSED
 CPU requirement     : PASSED, Available cores: 8 cores
 Disk access         : PASSED, /opt/unravel/versions/develop.1004/healthcheck/healthcheck/plugins/system is writable
 Disk freespace      : PASSED, 229 GB of free disk space is available for precheck dir.
 Kerberos tools      : PASSED
 Memory requirement  : PASSED, Available memory: 79 GB
 Network ports       : PASSED
 OS libraries        : PASSED
 OS release          : PASSED, OS release version: centos 7.6
 OS settings         : PASSED
 SELinux             : PASSED
--------------------------------------------------------------------------------
Healthcheck report bundle: /tmp/healthcheck-20210405155130-xyz.unraveldata.com.tar.gz
2021-04-05 15:51:53 Prepare to install with: /opt/unravel/versions/abcd.1004/installer/installer/../installer/conf/presets/default.yaml
2021-04-05 15:51:57 Sending logs to: /opt/unravel/logs/setup.log
2021-04-05 15:51:57 Instantiating templates ................................................................................................................................................................................................................................ Ok
2021-04-05 15:52:05 Creating parcels .................................... Ok
2021-04-05 15:52:20 Installing sensors file ............................ Ok
2021-04-05 15:52:20 Installing pgsql connector ... Ok
2021-04-05 15:52:22 Starting service monitor ... Ok
2021-04-05 15:52:27 Request start for elasticsearch_1 .... Ok
2021-04-05 15:52:27 Waiting for elasticsearch_1 for 120 sec ......... Ok
2021-04-05 15:52:35 Request start for zookeeper .... Ok
2021-04-05 15:52:35 Request start for kafka .... Ok
2021-04-05 15:52:35 Waiting for kafka for 120 sec ...... Ok
2021-04-05 15:52:37 Waiting for kafka to be alive for 120 sec ..... Ok
2021-04-05 15:52:42 Initializing pgsql ... Ok
2021-04-05 15:52:46 Request start for pgsql .... Ok
2021-04-05 15:52:46 Waiting for pgsql for 120 sec ..... Ok
2021-04-05 15:52:47 Creating database schema ................. Ok
2021-04-05 15:52:50 Generating hashes .... Ok
2021-04-05 15:52:52 Loading elasticsearch templates ............ Ok
2021-04-05 15:52:55 Creating kafka topics .................... Ok
2021-04-05 15:53:36 Creating schema objects ........................................ Ok
2021-04-05 15:54:03 Request stop ....................................................... Ok
2021-04-05 15:54:16 Done
[unravel@xyz ~]$
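
When a precheck report is long, it can be convenient to list only the checks that did not pass. The following sketch reads setup output in the layout of the sample above and prints the name of every check whose status does not begin with PASSED. The function name non_passed_checks is hypothetical, and the sample above shows only PASSED results, so the wording of any other status is an assumption; the helper therefore matches on "not PASSED" instead of a specific failure word.

```shell
#!/bin/sh
# Hypothetical helper: read precheck output on standard input and print the
# names of checks whose status does not start with PASSED.
non_passed_checks() {
    awk -F' *: ' '
        # Lines made only of dashes open and close the report section.
        /^-+$/ { in_report = !in_report; next }
        # Inside the report, "name : status" lines have at least two fields.
        in_report && NF >= 2 && $2 !~ /^PASSED/ {
            sub(/^ +/, "", $1)
            print $1
        }
    '
}
```

For example, pipe the log file named on the "Sending logs to" line, or the terminal output of setup, into non_passed_checks.
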
4. Configuring Unravel for MapR

You must manually configure Unravel properties after installing Unravel. Do the following to manually configure the required set of properties.

  1. Create a file containing the required properties for MapR installation and their corresponding values. Here is a sample file with the mandatory properties for MapR installation. Replace the placeholders in angle brackets and the sample host names and paths with the values for your environment.

    # site-specific unravel properties
    com.unraveldata.is_mapr=true
    fs.defaultFS=maprfs://
    hadoop.conf.dir=/opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop
    com.unraveldata.job.collector.done.log.base=/var/mapr/cluster/yarn/rm/staging/history/done
    com.unraveldata.spark.eventlog.location=maprfs:///apps/spark
    com.unraveldata.mr.parse.input.splits=false
    com.unraveldata.cluster.name=my.cluster.com
    com.unraveldata.job.collector.log.aggregation.base=/tmp/logs/*/logs/
    
    # Hive metastore properties
    javax.jdo.option.ConnectionURL=jdbc:mysql://<hive_metastore_host>:<hive_metastore_port>/hive?createDatabaseIfNotExist=true&useLegacyDatetimeCode=false&serverTimezone=<server_time_zone>
    javax.jdo.option.ConnectionDriverName=com.mysql.jdbc.Driver
    javax.jdo.option.ConnectionUserName=<hive_metastore_user>
    javax.jdo.option.ConnectionPassword=<hive_metastore_password>
    
    # Resource Manager (RM) properties. Enables https access to Resource Manager.
    https.protocols=TLSv1.2
    yarn.resourcemanager.webapp.address=https://<yarn_resource_manager_host>:<yarn_resource_manager_port>
    yarn.resourcemanager.webapp.username=<yarn_resource_manager_user>
    yarn.resourcemanager.webapp.password=<yarn_resource_manager_password>
         
    # oozie properties
    oozie.server.url=http://example.localdomain:11000/oozie
    

    The following tables provide descriptions and details about all the properties that can be set for MapR.

  2. From the installation directory, where Unravel binaries are installed, run the following command and provide the path to the properties file.

    Note

    Make sure that the user who installs Unravel can access the properties file, that is, the user must have read permission for the file.

    <Unravel installation directory>/unravel/manager config properties import <path to the properties file>
    
    #For example:
    /opt/unravel/manager config properties import /opt/properties.txt

    Note

    You can set individual properties using the manager utility. Run the following command:

    <Unravel installation directory>/unravel/manager config properties set <property> <value>
    
    For example:
    /opt/unravel/manager config properties set com.unraveldata.cluster.type MAPR
  3. Configure the MapR ticket.

    1. Stop Unravel.

      <Unravel installation directory>/unravel/manager stop
      
    2. Use an editor to open the <Installation_directory>/unravel/data/conf/unravel.yaml file.

    3. Find the unravel: block and add the following:

        environment:
          MAPRTICKET_LOCATION: /path/to/MapR_Ticket 

      For example:

      unravel:
        ... other existing configuration attributes ...
        environment:
          MAPRTICKET_LOCATION: /path/to/MapR_Ticket
    4. Apply the changes.

      <Unravel installation directory>/unravel/manager config apply
      
  4. Configure tagging.

    1. Stop Unravel

      <Unravel installation directory>/unravel/manager stop
      
    2. Run the following to enable tagging.

      <Unravel installation directory>/unravel/manager config tagging enable
    3. Set the tagging method through a Python script. Refer to Writing a Python script for more details.

      <Unravel installation directory>/unravel/manager config tagging set <tagging_method> </path/to/tagging/script>

      <tagging_method> is the method to call in the tagging script.

    4. Apply the changes.

      Note

      Make sure that Unravel is stopped before applying the changes.

      <Unravel installation directory>/unravel/manager config apply
      
  5. Start Unravel

    <Unravel installation directory>/unravel/manager start
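
Before importing the properties file in step 2, it is worth confirming that the file is readable and that no sample placeholder such as <hive_metastore_host> was left in place. The check below is a sketch under that assumption: check_placeholders is a hypothetical name and is not an Unravel command. It treats any remaining <name> token outside a comment line as an unreplaced placeholder.

```shell
#!/bin/sh
# Hypothetical pre-import check: print the numbered lines of a properties file
# that still contain a <placeholder>. Comment lines are ignored.
# Exit status: 0 if none remain, 1 if some remain, 2 if the file is unreadable.
check_placeholders() {
    file=$1
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi
    # The first grep numbers matching lines; the second drops comment lines.
    if grep -n '<[A-Za-z_][A-Za-z0-9_-]*>' "$file" | grep -v '^[0-9]*:[[:space:]]*#'; then
        return 1
    fi
    return 0
}
```

For example: check_placeholders /opt/properties.txt && /opt/unravel/manager config properties import /opt/properties.txt
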
5. Enable additional instrumentation (MapR)

You can use the following steps to enable additional instrumentation for MapR:

Note

After the files are updated on the Unravel host, you can use the scp command to copy them to other hosts in case of a multi-node MapR cluster. Back up your original files in case you need to roll back changes.

  1. Deploy sensors.

    1. Download the Hive Hook jar.

      Create /usr/local/unravel_client directory, if it does not exist already, and download the Hive Hook jar to this directory.

      mkdir -p /usr/local/unravel_client
      cd /usr/local/unravel_client
      wget <protocol>://<unravel_base_url>:<port>/hh/unravel-hive-<hive_version>-hook.jar 
    2. Download and extract Spark Sensor zip.

      Create /usr/local/unravel-agent directory, if it does not exist, and enter the directory. Download and extract Spark Sensor zip to this directory.

      mkdir -p /usr/local/unravel-agent
      cd /usr/local/unravel-agent
      wget <protocol>://<unravel_base_url>:<port>/hh/unravel-agent-pack-bin.zip
      unzip unravel-agent-pack-bin.zip
    3. (Multi-node MapR clusters only) Distribute the sensors to all the hosts using scp.

      • SCP Hive Hook sensor

        scp -i <ssh_key> -r /usr/local/unravel_client/ ssh_user@host:/usr/local
      • SCP Spark/MR sensor

        scp -i <ssh_key> -r /usr/local/unravel-agent/ ssh_user@host:/usr/local
  2. Update hive-env.sh.

    In /opt/mapr/hive/hive-<hive-version>/conf/hive-env.sh, append these lines:

    export AUX_CLASSPATH=${AUX_CLASSPATH}:/usr/local/unravel_client/unravel-hive-<hive-version>-hook.jar 
    export HIVE_AUX_JARS_PATH=${HIVE_AUX_JARS_PATH}:/usr/local/unravel_client
  3. Configure hive-site.xml

    In /opt/mapr/hive/hive-<hive-version>/conf/hive-site.xml, add the following properties:

    <property>
      <name>com.unraveldata.hive.hook.tcp</name>
      <value>true</value>
      <source>yarn-site.xml</source>
    </property>
    <property>
      <name>hive.exec.failure.hooks</name>
      <value>com.unraveldata.dataflow.hive.hook.UnravelHiveHook</value>
      <source>yarn-site.xml</source>
    </property>
    <property>
      <name>com.unraveldata.hive.hdfs.dir</name>
      <value>/user/unravel/HOOK_RESULT_DIR</value>
      <source>yarn-site.xml</source>
    </property>
    <property>
      <name>hive.exec.driver.run.hooks</name>
      <value>com.unraveldata.dataflow.hive.hook.UnravelHiveHook</value>
      <source>yarn-site.xml</source>
    </property>
    <property>
      <name>hive.exec.post.hooks</name>
      <value>com.unraveldata.dataflow.hive.hook.UnravelHiveHook</value>
      <source>yarn-site.xml</source>
    </property>
    <property>
      <name>com.unraveldata.host</name>
      <value><unravel host name></value>
      <source>yarn-site.xml</source>
    </property>
    <property>
      <name>hive.exec.pre.hooks</name>
      <value>com.unraveldata.dataflow.hive.hook.UnravelHiveHook</value>
      <source>yarn-site.xml</source>
    </property>
  4. Update spark-defaults.conf.

    In /opt/mapr/spark/spark-<spark-version>/conf/spark-defaults.conf, append these lines:

    spark.unravel.server.hostport <unravel-host>:4043
    spark.eventLog.dir maprfs:///apps/spark
    spark.history.fs.logDirectory maprfs:///apps/spark 
    spark.driver.extraJavaOptions -javaagent:/usr/local/unravel-agent/btrace-agent.jar=libs=spark-<spark-version>,config=driver 
    spark.executor.extraJavaOptions -javaagent:/usr/local/unravel-agent/btrace-agent.jar=libs=spark-<spark-version>,config=executor 

    Note

    To enable live monitoring of Spark Streaming applications, add the following option to spark.driver.extraJavaOptions:

    script=StreamingProbe.btclass
    
    #For example:
    spark.driver.extraJavaOptions=-javaagent:/usr/local/unravel-agent/btrace-agent.jar=libs=spark-<spark-version>,script=StreamingProbe.btclass,config=driver
  5. Update hadoop-env.sh.

    In /opt/mapr/hadoop/hadoop-<hadoop-version>/etc/hadoop/hadoop-env.sh, append these lines:

    export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}:/usr/local/unravel_client/unravel-hive-<hive-version>-hook.jar
  6. Update mapred-site.xml.

    In /opt/mapr/hadoop/hadoop-<hadoop-version>/etc/hadoop/mapred-site.xml, add these properties:

    <property>
    	<name>mapreduce.task.profile</name>
    	<value>true</value>
    </property>
    <property>
    	<name>mapreduce.task.profile.maps</name>
    	<value>0-5</value>
    </property>
    <property>
    	<name>mapreduce.task.profile.reduces</name>
    	<value>0-5</value>
    </property>
    <property>
    	<name>mapreduce.task.profile.params</name>
    	<value>-javaagent:/usr/local/unravel-agent/btrace-agent.jar=libs=mr -Dunravel.server.hostport=<unravel-host>:4043</value>
    </property>
    <property>
    	<name>yarn.app.mapreduce.am.command-opts</name>
    	<value>-javaagent:/usr/local/unravel-agent/btrace-agent.jar=libs=mr -Dunravel.server.hostport=<unravel-host>:4043</value>
    </property>
    

    Note

    Make sure the original value of yarn.app.mapreduce.am.command-opts is preserved, by appending the Java agent setup rather than replacing the original value.

  7. Update tez-site.xml.

    In /opt/mapr/tez/tez-<tez-version>/conf/tez-site.xml, add these properties:

    <property>
      <name>tez.task.launch.cmd-opts</name>
      <value>-javaagent:/usr/local/unravel-agent/btrace-agent.jar=libs=mr,config=tez -Dunravel.server.hostport=<unravel-host>:4043</value>
      <description />
    </property>
    
    <property>
      <name>tez.am.launch.cmd-opts</name>
      <value>-javaagent:/usr/local/unravel-agent/btrace-agent.jar=libs=mr,config=tez -Dunravel.server.hostport=<unravel-host>:4043</value>
      <description />
    </property>
  8. Confirm and adjust the settings in yarn-site.xml

    Check specific properties in /opt/mapr/hadoop/hadoop-<hadoop-version>/etc/hadoop/yarn-site.xml to be sure that these settings are present:

    • yarn.resourcemanager.webapp.address

      <property>
      	<name>yarn.resourcemanager.webapp.address</name>
      	<value><your-resource-manager-webapp-ip-address>:8088</value>
      	<source>yarn-site.xml</source>
      </property>
      
    • yarn.log-aggregation-enable

      <property>
      	<name>yarn.log-aggregation-enable</name>
      	<value>true</value>
      	<description>For log aggregations</description>
      </property>
  9. Run the MapR setup script.

    1. Run the unravel_mapr_setup.py script.

      Replace <unravel-host>, <spark-version>, and <hive-version> with the appropriate values.

      <Unravel installation directory>/unravel/manager run script unravel_mapr_setup.py --unravel-server <unravel-host>:3000 --spark-version <spark-version> --hive-version <hive-version> -v
    2. Copy the following to all MapR nodes.

      /usr/local/unravel_client
      /usr/local/unravel-agent/
    3. Ensure the following modified configuration files in the Unravel node are applied to other nodes:

      • /opt/mapr/hive/hive-<hive-version>/conf/hive-env.sh

      • /opt/mapr/hive/hive-<hive-version>/conf/hive-site.xml

      • /opt/mapr/spark/spark-<spark-version>/conf/spark-defaults.conf

      • /opt/mapr/hadoop/hadoop-<hadoop-version>/etc/hadoop/hadoop-env.sh

      • /opt/mapr/hadoop/hadoop-<hadoop-version>/etc/hadoop/mapred-site.xml

      • /opt/mapr/tez/tez-<tez-version>/conf/tez-site.xml

      • /opt/mapr/hadoop/hadoop-<hadoop-version>/etc/hadoop/yarn-site.xml

  10. Go to MapR Control System UI > Services and restart all the services.
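
On a multi-node cluster, the scp commands from step 1 have to be repeated for every host. The sketch below prints those commands for each host listed in a file, so they can be reviewed before anything is copied. It is illustrative only: the function name print_sensor_copy_commands and the hosts file (one host name per line, with blank lines and # comments skipped) are assumptions, not part of Unravel or MapR.

```shell
#!/bin/sh
# Hypothetical helper: print the scp commands that copy the Unravel sensor
# directories to every host listed in a file. Nothing is copied by this function.
# Usage: print_sensor_copy_commands <hosts_file> <ssh_key> <ssh_user>
print_sensor_copy_commands() {
    hosts_file=$1
    ssh_key=$2
    ssh_user=$3
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r host || [ -n "$host" ]; do
        case "$host" in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        for dir in /usr/local/unravel_client /usr/local/unravel-agent; do
            echo "scp -i $ssh_key -r $dir/ $ssh_user@$host:/usr/local"
        done
    done < "$hosts_file"
}
```

To run the printed commands after reviewing them, pipe the output to sh. The modified configuration files listed in step 9 can be distributed with the same approach.
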

6. Log in to the Unravel UI
  1. Find the Unravel URL on the machine where Unravel is installed.

    echo "http://$(hostname -f):3000/"

    If you are using an SSH tunnel or HTTP proxy, you might need to make adjustments.

  2. Using a supported web browser (see Unravel's MapR compatibility matrix), navigate to the Unravel URL, for example http://unravel-host:3000, and log in with the username admin and the password unraveldata.

    Unravel UI displays the collected data.
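
If the login page does not load, a quick check from the command line can show whether the UI is answering at all. The following is a sketch only: both function names are hypothetical, port 3000 is taken from the example URL above, and the second function requires curl and network access to the Unravel host.

```shell
#!/bin/sh
# Hypothetical helper: build the Unravel UI URL for a host. Port 3000 is the
# port used in the example above; pass a second argument to override it.
unravel_url() {
    echo "http://$1:${2:-3000}/"
}

# Print the HTTP status code returned by the UI (requires curl).
unravel_ui_status() {
    curl -s -o /dev/null -w '%{http_code}\n' "$(unravel_url "$@")"
}
```

For example: unravel_ui_status "$(hostname -f)". If you are using an SSH tunnel or HTTP proxy, adjust the host and port accordingly.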