Part 1: Installing Unravel Server on MapR

This topic explains how to deploy Unravel Server on the MapR converged data platform.

Important

If you have not already done so, confirm your cluster meets Unravel's MapR compatibility matrix hosting requirements.

1. Configure the host

Allocate a cluster gateway/edge/client host with HDFS access.
Enable the hadoop fs command, Hive, and Spark.
Although Unravel Server doesn't launch Hive or Spark jobs, it's convenient to have Hive and Spark installed on this gateway/edge/client host.
Tip
For more information about the MapR client configuration, see MapR's configuration documentation.
1. Run the following commands on this gateway/edge/client host as root, substituting your site-specific values for name, cldb-list, and history-server.
```
sudo yum install mapr-client.x86_64
sudo /opt/mapr/server/configure.sh -N name -c -C cldb-list -HS history-server
sudo yum install mapr-hive.noarch
sudo yum install mapr-spark.noarch
```
Check/add/modify these MapR settings:
1. Run the following commands on Unravel Server as root:
```
sudo useradd -g mapr unravel
hadoop fs -mkdir /user/unravel
hadoop fs -chown unravel:mapr /user/unravel
```
2. Check/adjust available RAM on the Unravel gateway/client host:
```
free -g
```
  For instructions on adjusting RAM allocated to MapR-FS (mfs), see https://community.mapr.com/docs/DOC-1209.
  For example, edit /opt/mapr/conf/warden.conf as follows:
```
service.command.mfs.heapsize.maxpercent=10
```
3. Restart the MapR File System (mfs).
```
/etc/init.d/mapr-mfs restart
```

3. Install Unravel Server on the host

Download the Unravel Server RPM.
Ensure that the host machine's local disks have the minimum space required.
Unravel Server uses two separate disks: one for binaries (/usr/local/unravel) and one for data (/srv/unravel). Keeping /srv/unravel on its own disk improves performance. If either location lacks the minimum space required, create a symbolic link from that path to a disk that has enough space.
Tip
Use the df command to check the space on a volume. For example:
```
df -h /srv
```
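The space check can also be scripted. The following is a minimal sketch, not part of the Unravel tooling: the function names free_gb and has_space are hypothetical, and the thresholds in the usage comments are placeholders rather than Unravel's documented minimums.

```shell
#!/bin/sh
# Sketch: compare free space on a filesystem against a minimum you supply.
# The minimum values are NOT taken from this guide; use the documented requirements.

# free_gb PATH -> prints the whole GB available on the filesystem holding PATH
free_gb() {
    # -P forces one line per filesystem; -k reports 1024-byte blocks.
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# has_space PATH MIN_GB -> succeeds if at least MIN_GB are free
has_space() {
    avail=$(free_gb "$1")
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}

# Usage (placeholder thresholds):
#   has_space /usr/local 10 || echo "not enough space for binaries"
#   has_space /srv 100 || echo "not enough space for data"
```

Run the checks before installing the RPM so that any symbolic links are in place first.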
Install the Unravel Server RPM.
```
sudo rpm -Uvh unravel-version.rpm
```

The installation creates the following items:

/usr/local/unravel/, which contains executables, scripts, properties file (unravel.properties), and logs.
/etc/init.d/unravel_*, which contains scripts for controlling services, such as unravel_all.sh for manually stopping, starting, and getting the status of all daemons in proper order.
User unravel if it doesn't exist already.

5. Export MapR ticket

If MapR tickets are enabled, make sure you have tickets for users unravel and mapr on the target host.

You may need to export ticket environment variables (such as MAPR_TICKETFILE_LOCATION) in /srv/unravel/unravel_ctl first.

For example:

[root@wnode55 ~]# cat /srv/unravel/unravel_ctl
export RUN_AS=mapr
export USE_GROUP=mapr
export MAPR_TICKETFILE_LOCATION=/tmp/maprticket_5000
[root@wnode55 ~]#
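
Before continuing, you may want to confirm that a ticket file exists for each user. The sketch below assumes tickets are stored at /tmp/maprticket_<uid>, as in the example above; the function names ticket_file_for and has_ticket are hypothetical, and the check only tests that the file is readable, not that the ticket is still valid.

```shell
#!/bin/sh
# Sketch: locate the ticket file for a user, assuming the
# /tmp/maprticket_<uid> naming shown in the example above.

# ticket_file_for USER -> prints the expected ticket path for USER
ticket_file_for() {
    uid=$(id -u "$1") || return 1
    printf '/tmp/maprticket_%s\n' "$uid"
}

# has_ticket USER -> succeeds if the expected ticket file is readable
has_ticket() {
    path=$(ticket_file_for "$1") || return 1
    [ -r "$path" ]
}

# Usage:
#   for u in unravel mapr; do
#       has_ticket "$u" || echo "no readable ticket for $u"
#   done
```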

6. Modify Unravel Server for MapR

Open the switch_to_mapr.sh script:

vi /usr/local/unravel/install_bin/switch_to_mapr.sh

Validate and edit the following paths in the switch_to_mapr.sh script if required:

MAPR_LOGIN_CONF=/opt/mapr/conf/mapr.login.conf
HADOOP_CONF_DIR=/opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop
SPARK_CONF_DIR=/opt/mapr/spark/spark-2.2.1/conf
HIVE_CONF_DIR=/opt/mapr/hive/hive-2.1/conf
MAPR_TICKETFILE_LOCATION=/tmp/maprticket_$(id -u $RUN_AS)
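
The version numbers in these paths vary by MapR release, so it helps to verify that each one exists before you run the script. A minimal sketch follows; report_missing is a hypothetical helper, and the paths in the usage comment are simply the example values listed above.

```shell
#!/bin/sh
# Sketch: list any configured paths that do not exist on this host.

# report_missing PATH... -> prints "missing: PATH" for each absent path;
# fails if at least one path is missing
report_missing() {
    rc=0
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "missing: $p"
            rc=1
        fi
    done
    return $rc
}

# Usage (substitute the values from your copy of switch_to_mapr.sh):
#   report_missing /opt/mapr/conf/mapr.login.conf \
#       /opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop \
#       /opt/mapr/spark/spark-2.2.1/conf \
#       /opt/mapr/hive/hive-2.1/conf
```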

Run the following commands on Unravel Server.
```
sudo /etc/init.d/unravel_all.sh stop
sudo /usr/local/unravel/install_bin/switch_to_mapr.sh
```
Note
This change is persistent; you need not do this when you upgrade the RPM.

7. Configure Unravel Server with basic options

(Optional) Enable additional daemons for high-volume workloads.

In /usr/local/unravel/etc/unravel.properties, set general properties for Unravel Server.

Property/Description	Set by user	Unit	Default
com.unraveldata.customer.organization Customer name. Used to identify your installation for reporting and notification purposes in Unravel UI.	Optional	string	Not Set
com.unraveldata.advertised.url Defines the Unravel Server URL for HTTP traffic. Example: http://unravelserver.company.com:3000		string	http://{host}:3000
com.unraveldata.hdfs.timezone Timezone of HDFS, for example, US/Eastern, Etc/GMT-4, America/New_York. If the timezone is not set, an error message is logged and UTC is used. Possible timezones can be obtained by calling `TimeZone.getAvailableIDs()`.		string	-
com.unraveldata.tmpdir The base location for Unravel process control files where Unravel's temp files reside.		string (path)	/srv/unravel/tmp
com.unraveldata.history.maxSize.weeks Number of weeks retained for search results in Elastic Search.		integer	5
com.unraveldata.retention.max.days Number of days to keep the heaviest data (such as error logs and drill-down details) in the SQL Database.		integer	30
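
If you prefer to script these changes rather than edit the file by hand, something like the following could be used. This is a sketch under stated assumptions: set_prop is a hypothetical helper, not an Unravel utility, and it assumes each property is written as key=value with no spaces around the equals sign.

```shell
#!/bin/sh
# Sketch: set KEY=VALUE in a properties file, replacing any existing KEY line.
# Assumes "key=value" lines without spaces around "=".

# set_prop FILE KEY VALUE
set_prop() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    # Keep every line that does not begin with "KEY=", then append the new value.
    awk -v k="$key" 'index($0, k "=") != 1' "$file" > "$tmp" &&
        printf '%s=%s\n' "$key" "$value" >> "$tmp" &&
        cat "$tmp" > "$file"   # overwrite in place to keep owner and mode
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Usage (run as a user who can write the file):
#   set_prop /usr/local/unravel/etc/unravel.properties \
#       com.unraveldata.advertised.url http://unravelserver.company.com:3000
```

Back up unravel.properties before running a helper like this against it.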

Point Unravel Server to logs on HDFS.

Unravel collects HDFS logs for analysis. To point Unravel Server to these logs, set the following properties in /usr/local/unravel/etc/unravel.properties:

Property/Description	Unit	Default
com.unraveldata.job.collector.done.log.base HDFS path to the `done` directory of MR logs as per cluster configuration. Don't include the hdfs:// prefix. For example: com.unraveldata.job.collector.done.log.base=/mr-history/done	string	/user/history/done
com.unraveldata.job.collector.log.aggregation.base HDFS path to the aggregated container logs (logs to process). Don't include the hdfs:// prefix. The log format defaults to TFile. You can specify multiple logs and log formats (TFile or IndexedFormat). Example: com.unraveldata.job.collector.log.aggregation.base=TFile:/tmp/logs/*/logs/,IndexedFormat:/tmp/logs/*/logs-ifile/	CSL	/tmp/logs/*/logs/
com.unraveldata.spark.eventlog.location Comma-separated list of HDFS paths to the Spark event logs as per cluster configuration. Each path must include the hdfs:/// prefix. For example: com.unraveldata.spark.eventlog.location=hdfs:///spark1-history/,hdfs:///spark2-history/	CSL	hdfs:///user/spark/applicationHistory/
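
The prefix rules differ between these properties, which makes them easy to mix up. The sketch below checks a value against each rule before you write it to unravel.properties; the function names are hypothetical and the checks cover only the prefix, not whether the path exists on the cluster.

```shell
#!/bin/sh
# Sketch: validate the prefix rules described in the table above.

# no_hdfs_prefix PATH -> succeeds for an absolute path without the hdfs: scheme
no_hdfs_prefix() {
    case $1 in
        hdfs:*) return 1 ;;
        /*) return 0 ;;
        *) return 1 ;;
    esac
}

# all_have_hdfs_prefix LIST -> succeeds if every comma-separated entry
# starts with hdfs:///
all_have_hdfs_prefix() {
    rest=$1
    [ -n "$rest" ] || return 1
    while [ -n "$rest" ]; do
        entry=${rest%%,*}              # text before the first comma
        case $entry in
            hdfs:///*) ;;
            *) return 1 ;;
        esac
        case $rest in
            *,*) rest=${rest#*,} ;;    # drop the entry just checked
            *) rest= ;;
        esac
    done
    return 0
}

# Usage:
#   no_hdfs_prefix /mr-history/done || echo "remove the hdfs:// prefix"
#   all_have_hdfs_prefix "hdfs:///spark1-history/,hdfs:///spark2-history/" \
#       || echo "each Spark event log path needs the hdfs:/// prefix"
```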

If the Application Timeline Server (ATS) requires user authentication, set the following properties:

Property/Description	Set by user	Unit	Default
yarn.ats.webapp.username Username required for authentication to the Application Timeline Server (if authentication is required).	Optional	string	-
yarn.ats.webapp.password Password required for authentication to the Application Timeline Server (if authentication is required).	Optional	string	-

Connect to the Oozie server by setting oozie.server.url.
Enable HTTPS access to the Resource Manager by setting https.protocol.
Define the monitoring frequency.
Property/Description	Set by user	Unit	Default
com.unraveldata.s3.batch.monitoring.interval.sec Defines the monitoring frequency. Set this property to 60 for lower latency.		s	180
If Kerberos is enabled, add authentication for HDFS:
1. Create or identify a principal and keytab for Unravel daemons to access HDFS and REST when Kerberos is enabled.
2. Find and verify the principal keytab by running this command:
```
klist -kt KEYTAB_FILE
```
  Note
  Set the Linux file permissions of the keytab file to 500 (chmod 500) and set its owner to unravel or to your chosen user, as explained in Run Unravel Daemons with Custom User.
3. In /usr/local/unravel/etc/unravel.properties, add/set these properties for Kerberos:
```
com.unraveldata.kerberos.principal=unravel/my-host.my-domain@my-realm
com.unraveldata.kerberos.keytab.path=/usr/local/unravel/etc/unravel.keytab
```
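To confirm that the keytab permissions match the note above, a check along these lines may help. It is a sketch only: keytab_mode_ok is a hypothetical helper, it verifies the mode (500) but not the owner, and the path in the usage comment is the example value from the properties above.

```shell
#!/bin/sh
# Sketch: verify that a keytab is a regular file with mode 500 (r-x------).

# keytab_mode_ok FILE -> succeeds if FILE exists and its mode is exactly 500
keytab_mode_ok() {
    [ -f "$1" ] || return 1
    # The first 10 characters of ls -l output are the file type and mode bits.
    mode=$(ls -ld "$1" | cut -c1-10)
    [ "$mode" = "-r-x------" ]
}

# Usage:
#   keytab_mode_ok /usr/local/unravel/etc/unravel.keytab \
#       || echo "run: chmod 500 on the keytab file"
```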

If Sentry is enabled, add these permissions:

Define your own alt principal with narrow privileges and the access permissions shown in the table below.
The alt principal can be unravel (default) or one of your choosing. The corresponding Kerberos principal does not need to be the same as the local user.

Verify that the user running the Unravel daemon /etc/unravel_ctl has the access permissions shown in the table below.

Resource	Principal	Access	Purpose
`hdfs:///user/spark/applicationHistory`	`mapr` or alternate	read+execute	Spark event log
`hdfs:///user/history/done`	`mapr` or alternate	read+execute	MapReduce logs
`hdfs:///tmp/logs`	`mapr` or alternate	read+execute	YARN aggregation folder
`hdfs:///user/hive/warehouse`	`mapr` or alternate	read+execute	Obtain table partition sizes
`Hive Metastore access`	`hive`	read+execute	Hive table information

8. Change the run-as user and group for Unravel daemons

9. Connect to the Hive metastore

10. Start Unravel services

Run the following command to start all Unravel services:

sudo /etc/init.d/unravel_all.sh start
sleep 60
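
As an alternative to a fixed 60-second sleep, you could poll until the UI responds. The following is a minimal sketch: wait_until is a hypothetical helper, the 120-second timeout and 5-second interval are arbitrary, and the usage comment assumes curl is installed and that the UI listens on port 3000 of the local host.

```shell
#!/bin/sh
# Sketch: retry a command until it succeeds or a timeout elapses.

# wait_until TIMEOUT_SEC INTERVAL_SEC CMD [ARGS...]
# Succeeds as soon as CMD succeeds; fails once TIMEOUT_SEC has been waited.
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    waited=0
    while ! "$@" >/dev/null 2>&1; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# Usage:
#   wait_until 120 5 curl -fsS http://localhost:3000/ \
#       || echo "Unravel UI did not respond within the timeout"
```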

This completes the basic/core configuration.

11. Log into Unravel UI

Find the hostname of Unravel Server.
```
echo "http://$(hostname -f):3000/"
```
If you're using an SSH tunnel or HTTP proxy, you might need to make adjustments.
Using a supported web browser (see Unravel's MapR compatibility matrix), navigate to http://unravel-host:3000 and log in with username admin and password unraveldata.
Unravel UI displays the collected data.

12. Enable additional instrumentation
