Populating the Unravel Data Insights page

Home

Populating the Unravel Data Insights page

Reports > Data Insights is populated by Unravel analyzing your Hive Metastore tables along with your Hive query information. You must configure Unravel's connection to the Hive Metastore. Unravel Server collects information through the Hive API, which is similar to the beeline CLI and uses the JDBC database connection protocol.

1. Obtaining Hive Metastore details

This topic explains how to find the connection URL, driver name, username, and password for Hive or HiveServer2. You need these four settings to connect Unravel Server to the Hive Metastore.

For CDH

From CDH version 5.5+, send the Cloudera Manager REST API request http://cloudera-manager-hostname-or-ip:7180/api/v12/cm/deployment.
Search the JSON response body for metastore.
Copy the connection URL, driver name, username, and password.

For Azure HDInsight

You can get 3 of the 4 properties from Ambari, or get all 4 from the Azure portal:

From Ambari...

Log in to the Azure portal.
Click HDInsight clusters, and select your cluster.
In the cluster's Overview window, click the link in the URL field.
This gets you to the Ambari UI.
In Ambari, click Hive | Configs | Advanced.
Get the four required settings:
- For connection URL, use the value in the Database URL field.
  In the screenshot, this is jdbc:sqlserver://ldnu62r6x6.database.windows.net;databaseName=v364574cc3cfd51438fb29a723243a9a71bhivemetastore;trustServerCertificate=false;encrypt=true;hostNameInCertificate=*.database.windows.net;
- For the driver name, use the value in the JDBC Driver Class field.
  In the screenshot, this is com.microsoft.sqlserver.jdbc.SQLServerDriver
- For username, use the value in Database Username field.
  In the screenshot, this is v364574cc3cfd51438fb29a723243a9a71bhivemetastoreLogin@ldnu62r6x6.database.windows.net
- The password is encrypted; Ambari doesn't show it. Obtain the password from Azure instead.

From the Azure portal...

From the Azure portal, find out the IP address to use for an SSH session to your cluster.
Open an SSH session to that IP address.
In /etc/hive/conf/hive-site.xml, search for these values, and copy them to use as-is:
javax.jdo.option.ConnectionURL
javax.jdo.option.ConnectionDriverName
javax.jdo.option.ConnectionUserName

To get the Hive password, run this command at the Linux prompt:

sudo java -cp '/var/lib/ambari-agent/cred/lib/*' org.apache.ambari.server.credentialapi.CredentialUtil  get   javax.jdo.option.ConnectionPassword  -provider 'jceks://file/var/lib/ambari-agent/cred/conf/hive_metastore/hive-site.jceks' 2>/dev/null

For Amazon EMR

Open an SSH session to the cluster, open the Hive configuration file, /etc/hive/conf/hive-site.xml, and search for the four settings. For example, on EMR version 4.7.2, you'd see something like this:

<property>
 <name>javax.jdo.option.ConnectionURL</name>
 <value>jdbc:mysql://ip-xx-xx-xx-xx:3306/hive?createDatabaseIfNotExist=true</value>
 <description>username to use against metastore database</description>
</property>

<property>
 <name>javax.jdo.option.ConnectionUserName</name>
 <value>hive</value>
 <description>username to use against metastore database</description>
</property>

<property>
 <name>javax.jdo.option.ConnectionPassword</name>
 <value>xxxxxxxxxxx</value>
 <description>password to use against metastore database</description>
</property>

For HDP of MapR

To get the four required settings, contact your cluster's administrator.

2. Configure Unravel to access Hive Metastore

In order for the tab to populate correctly, you must configure the Hive Metastore Access parameters. You also have the option to configure the JDBC parameters managing JDBC pooling.

Accessing Metastore with read-write user permission

Download the right JDBC connector JAR for your database:

If you're using MySQL as the Hive Metastore database, download the MySQL JDBC connector JAR from MySQL Download ConnectorJ.
If you're using Oracle as the Hive Metastore database, download the Oracle JDBC connector JARs ojdbc7.jar and ojdbc6.jar from Oracle Database 12c Release 1 JDBC Driver Downloads.
Note
Make sure the unravel account has read permission.

Accessing Metastore with read-only user permissions

Get details about the location of the Hive Metastore.
- In CDH, use the following Cloudera Manager API to get the Hive metastore database name and port. By default, in CDH and HDP, the hive metastore database name is hive.
  - For TLS-secured Cloudera Manager
    curl -k -u admin:admin "https://cloudera-manager-host:7183/api/v11/clusters/cluster-name/services/hive/config"
  - For non-TLS secured Cloudera Manager
    curl -k -u admin:admin "http://cloudera-manager-host:7183/api/v11/clusters/cluster-name/services/hive/config"
    or
    curl -k -u admin:admin "http://cloudera-manager-host:7180/api/v12/cm/deployment"
  Look for hive_metastore_database_host, hive_metastore_database_port, hive_metastore_database_user and hive_metastore_database_password in the JSON response body.
- For HDP or MapR, contact your cluster administrator.
Connect to the Hive metastore using the normal conversational interface for your underlying database (MySQL, PSQL Oracle, and so on) as an administrator or root user that can create new users and grant privileges.
For PostgreSQL, run the psql command as admin user cloudera-scm on Cloudera embedded PostgreSQL server within the Cloudera Manager node.
```
psql -U cloudera-scm -p 7432 -h localhost -d postgres
```
Create a new database user, such as unravelka, grant the user READ access to the database, grant that user SELECT privileges on all tables in the Hive database, and give that user access from the Unravel server host.
- For MySQL, run the following command to create a new user unravelka and grant it SELECT privilege on the Hive metastore database:
```
GRANT SELECT ON hive.* to 'unravelk'@'%' identified by 'unravelk_password';
flush privileges;
```
- For PostgreSQL, run the following commands:
```
CREATE USER unravelk WITH ENCRYPTED PASSWORD 'unravelk_password';

GRANT CONNECT ON DATABASE hive TO unravelk;
GRANT USAGE ON SCHEMA public TO unravelk;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO unravelk;
GRANT SELECT ON ALL SEQUENCES IN SCHEMA public TO unravelk;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO unravelk;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT USAGE ON SEQUENCES TO unravelk;
```
  Note
  The default PostgreSQL admin password created by Cloudera Manager is stored in /var/lib/cloudera-scm-server-db/data/generated_password.txt.
- For Oracle
  Due to restrictions for accessing an Oracle database you need to create synonyms and select access for the Unravel user. See Configuring Oracle.
As the new user, use the conversational interface (MySQL, psql, Oracle, and so on) from Unravel Server to verify that the new user has access.
Update /usr/local/unravel/etc/unravel.properties to use simple JDBC calls to get read-only details from Hive Metastore.
```
com.unraveldata.metastore.use.jdbc=true
```

Accessing Metastore with Microsoft JDBC

Download Microsoft JDBC Driver 7.0 for SQL Server JAR from Microsoft here to Unravel node.
Extract the downloaded file.
```
tar -xvzf sqljdbc_version_enu.tar.gz
```
Create the following directories if they do not exist.
```
mkdir -p /usr/local/unravel/share/java
```

Copy mssql-jdbc-7.0.0.jre8.jar to the following locations.

sudo cp sqljdbc_7.0/enu/mssql-jdbc-7.0.0.jre8.jar /usr/local/unravel/share/java

3. Add the following properties

In /usr/local/unravel/etc/unravel.properties, update Hive metastore access information with the read-only username and the values you retrieved in step 1.

Property/Description	Set by user	Unit	Default
javax.jdo.option.ConnectionDriverName JDBC Driver class name for the hive metastore containing the metadata Examples: MySQL:`com.mysql.jdbc.Driver` Oracle:`oracle.jdbc.driver.OracleDriver` Microsoft:`com.microsoft.sqlserver.jdbc.SQLServerDriver`	Required	string	-
javax.jdo.option.ConnectionPassword Password used to access the metastore.	Required	string	-
javax.jdo.option.ConnectionUserName Username used to access the metastore.	Required	string	-
javax.jdo.option.ConnectionURL JDBC connection string for the metastore containing the metadata of the form: jdbc:`DB_Driver`://`HOST`:`PORT`/hive Example: Oracle:`jdbc:oracle:thin:@//prodHost:1521/ORCL` Microsoft:`jdbc:sqlserver://jdbc_url`	Required	string (URL)	-

For example,

javax.jdo.option.ConnectionURL=jdbc:mysql://congo.unraveldata.com/hive javax.jdo.option.ConnectionDriverName=com.mysql.jdbc.Driver 
javax.jdo.option.ConnectionUserName=unravelka
javax.jdo.option.ConnectionPassword=hadoop

If you are using Microsoft JDBC Driver 7.0 for SQL Server set the properties as follows.

javax.jdo.option.ConnectionURL=jdbc:sqlserver://jdbc_url
javax.jdo.option.ConnectionDriverName=com.microsoft.sqlserver.jdbc.SQLServerDriver
javax.jdo.option.ConnectionUserName=username
javax.jdo.option.ConnectionPassword=password

4. Restart Unravel

sudo /etc/init.d/unravel_all.sh restart

5. Confirm Hive queries are displayed in the Unravel UI

Go to Unravel UI's Applications > Applications page to confirm that Hive queries are displayed. Approximately twenty-four hours after configuration the Reports > Data Insights > Details page displays a list of your Hive Metastore tables along with their KPIs and other details.

In this section:

Would you like to provide feedback? Just click here to suggest edits.

Home

Populating the Unravel Data Insights page

1. Obtaining Hive Metastore details

2. Configure Unravel to access Hive Metastore

Note

Note

3. Add the following properties

4. Restart Unravel

5. Confirm Hive queries are displayed in the Unravel UI

Search results