v4.7.4.0 Release notes

Software version

Release Date: 27/April/2022

See 4.7.4.0 for download information.

Software upgrade support

Fresh installations are supported along with the following upgrade path:

v4.7.0.x, v4.7.1.x, v4.7.2.x, v4.7.3.x → v4.7.4.0
v4.6.2.x → v4.7.4.0
v4.6.1.9 → v4.7.4.0
v4.6.1.8 or earlier → v4.6.1.9 → v4.7.4.0

Refer to Upgrading Unravel server for instructions to upgrade to Unravel v4.6.1.9.

Refer to Upgrading Unravel for instructions to upgrade to Unravel v4.7.4.0.

Refer to Installing Unravel for fresh installations.
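
The upgrade paths listed above can be sketched as a small lookup. This is only an illustration of the rule in this section, not an Unravel tool; the function names `parse_version` and `upgrade_path` are hypothetical, and versions that are not listed above are rejected rather than guessed.

```python
from typing import List, Tuple

TARGET = "4.7.4.0"

# Release lines that upgrade directly to the target, per the list above.
DIRECT_LINES = {(4, 7, 0), (4, 7, 1), (4, 7, 2), (4, 7, 3), (4, 6, 2)}


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn 'v4.6.1.9' or '4.6.1.9' into (4, 6, 1, 9)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def upgrade_path(current: str) -> List[str]:
    """Return the versions to install, in order, to reach the target."""
    parsed = parse_version(current)
    if parsed[:3] in DIRECT_LINES or parsed == (4, 6, 1, 9):
        return [TARGET]
    if parsed <= (4, 6, 1, 8):
        # v4.6.1.8 or earlier must go through v4.6.1.9 first.
        return ["4.6.1.9", TARGET]
    raise ValueError(f"No upgrade path is listed for version {current}")


if __name__ == "__main__":
    for installed in ("v4.6.1.8", "v4.6.2.1", "v4.7.3.0"):
        print(installed, "->", " -> ".join(upgrade_path(installed)))
```

Raising an error for unlisted versions keeps the sketch from implying support that this section does not state.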

Sensor upgrade

Sensor upgrade is mandatory.
Refer to Upgrading Sensors.

Certified platforms

The following platforms are tested and certified in this release:

Cloudera Distribution of Apache Hadoop (CDH)
Cloudera Data Platform (CDP)
Hortonworks Data Platform (HDP)
Amazon Elastic MapReduce (EMR)
Databricks (Azure and AWS)
Google Cloud Platform (Dataproc, BigQuery)

Review your platform's compatibility matrix before you install Unravel.

Updates to Unravel's configuration properties

Refer to 4.7.x - Updates to Unravel properties.

New features

Ansible
Ansible upgrade from 4.6.1.9 to 4.7.x for on-prem and cloud platforms.
LR Authentication
Support is provided for HTTP basic authentication with TLS for Log Receiver (LR). This is currently supported only on Databricks.
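
For readers unfamiliar with HTTP basic authentication, the following minimal sketch shows how the standard `Authorization` header is built from a username and password. It illustrates the general scheme only, not how Unravel sensors or the Log Receiver implement it; the function name and the credentials are hypothetical. Because the credentials are only Base64-encoded, not encrypted, the scheme relies on TLS to protect them in transit.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build the HTTP basic authentication header for the given credentials."""
    # The credentials are joined with a colon and Base64-encoded.
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    # Hypothetical credentials, for illustration only.
    print(basic_auth_header("lr-user", "example-password"))
```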

Improvements and enhancements

Billing
- Billing alert is raised only during the month of expiry. (CDI-466)
- Support for capturing discounts on VM and DBU prices as a percentage of public prices. (DT-1118)
- Customers can download the latest price information from Unravel. (DT-1117)
BigQuery
- Support is provided for cached queries.
- Support of insights for BigQuery.
Databricks
- Support for Global Init script. (DT-807)
- Redact property values containing passwords in the Cluster Configuration tab on Unravel UI. (DT-900)
- Process Databricks Run Annotation metrics in real time. (DT-1097)
- Skip storing Spark Job IDs in Databricks Run Annotation. (DT-1099)
Data page
- Support for a mix of metastore types (for example, BigQuery and Dataproc in GCP).
EMR
- Added new IAM permissions needed for AWS account usage. (EMR-269)
- Node downsizing event recommendation accounts for task nodes and fleet instances. (EMR-253)
Hive
- Store viz_json as a compressed field in the hive_queries table in the database instead of text to save space. (ASP-1399)
Impala
- The Impala pipeline is improved to scale for large Impala workloads.
Install
- Added show commands (config tls show and config tls trust show) for TLS/Trust manager command. (INSTALL-2444)
- Usability improvement for interactive precheck and manual configuration.
- Added capability to automatically accept TLS certificate and chain via the manager command. (INSTALL-2019)
Migration
- Update resource files for AWS to include the following: (REPORT-2008)
  - New regions for ec2, emr, s3, ebs
  - New instances for ec2 and emr
  - New services for emr
  - Instances and storage prices for AWS
- Update resource files for Azure to include the following: (REPORT-2007)
  - New regions for Azure
  - New VM instances for Azure
  - New services for Azure HDInsight
  - Updated Instances and storage prices for Azure
- Enable diagnostics for migration reports to improve debuggability. (REPORT-2025)
Monitoring
- Improve database monitoring with the MyBatis plugin that monitors query duration and collects statistics. (CDI-432)
- Log pipeline metrics for spark and event worker. (ASP-1404)
Security
- Kafka and Zookeeper upgraded to address the Log4J vulnerability.
  - Kafka upgraded to 3.1.0
  - Zookeeper upgraded to 3.6.3
- Java upgraded to 8u322.
- Python upgraded to 3.8.12.
- Capability to generate a token from UI. (UIX-4476)
Spark
- New property (com.unraveldata.spark.query.size.max) was added to truncate the query if it is more than the configured length.
- Add protobuf support for Spark Btrace sensor data. (ASP-1362)
UI
- Cost tabs - Update date range to support broader options. (UIX-4492)
- Remove the Close button for the Compute Details page. (UIX-4475)
- Multiple UI fixes for Cluster Insights page.
- Generate token for API from UI. (UIX-4476)

Unsupported

Billing

Unravel does not support Billing on on-prem platforms.

Data

On the Data page, File Reports, Small File reports, and file size information are not supported for MapR and cloud (EMR, Databricks, GCP) clusters.

Jobs

Impala jobs are not supported on the HDP platform.

Healthcheck

Monitoring the expiration of SSL certificates and Kerberos principals is not supported in Unravel multi-cluster deployments.

Log Receiver (LR) authentication

LR authentication is not supported for on-prem platforms. For cloud platforms, it is supported only for Databricks.

Platforms

Databricks

The Program tab does not get populated for a notebook job that is attached to an interactive cluster. (ASP-1432)

MapR

The following features are not supported for MapR:

Impala applications
Kerberos
The following features are not supported on the Data page:
- Forecasting
- Small Files
- File Reports
The following reports are not supported on MapR:
- File Reports
- Small Files Report
- Capacity Forecasting
- Migration Planning
The Tuning report is supported only for MR jobs.
Migration Planning
AutoAction is not supported for Impala applications
Migration
Billing
Insights Overview

Migration Planning

Migration Planning is not supported for the following regions for Azure Data Lake:
- Germany Central (Sovereign)
- Germany Northeast (Sovereign)
Forecasting and Migration: In a multi-cluster environment, you can configure only a single cluster at a time. Hence reports are generated only for that single cluster.
Migration Planning is not supported for MapR.

Multi-cluster deployment

Unravel does not support multi-cluster management of combined on-prem and cloud clusters.

Pipeline

In a multi-cluster environment, Unravel does not support pipelines whose apps are sourced from different clusters. A pipeline can only contain apps that belong to the same cluster.

Reports

No reports other than the TopX report are supported on Databricks and EMR.
Memory and CPU usage metrics are not supported for TopX reports on Databricks.

Sessions

In Jobs > Sessions, the feature of applying recommendations and then running the newly configured app is not supported.

UI

Pig and Cascading applications are not supported.

Bug fixes

AppStore
- AppStore fails to start when there are special characters in the database username/password. (APP-567)
AutoAction
- AutoAction (AA) policy scope is set as app instead of apps when AA is created without a template. (ASP-1419)
Healthcheck
- Exceptions are shown while running AppStore healthcheck. (INSTALL-2291)
Insights
- Ignore the join operator if a valid column ID is not found in the previous scan operators. (INSIGHTS-280)
- Change of impact for events in BigQuery applications. (INSIGHTS-301)
- The applications that show Total app time as zero are not analyzed for insights. (INSIGHTS-303)
Platform
- Group filtering is ignored when member-of search method is used for LDAP groups. (CDI-438)
Report
- Error while running Queue Analysis report. (REPORT-2022)
- Top X Report fails with HTTPError: HTTP Error 404: Not Found error. (REPORT-2014, REPORT-1979)
UI
- Unnecessary horizontal scrolling when you click the full log in Databricks playground for a Spark app. (UIX-4510)
- On the Manage page, the DB Stats are not displayed for untracked clusters. (UIX-4171)

Known issues

Applications

The Workflow/Jobs page displays empty Analysis, Resources, Daggraph, and Errors tabs. (DT-1093)
Event logs and YARN logs are not loaded for some applications in Google Dataproc clusters. (PG-170)

BigQuery

Mark duplicate insights as stale for BigQuery. (INSIGHTS-305)

Data page

Incorrect data is displayed in the Number of Queries KPI/Trend graph on the Overview page. (DATAPAGE-502)
The create time of partitions is not captured in the Hive metastore if the partition is created dynamically. This limits Unravel's ability to show Last Day KPIs in the partition section.
Wrong data is displayed for the Number of Partitions Created KPI/trend graph under the Partitions KPIs - Last Day section on the Data page. (DATAPAGE-473)

Databricks

When a job fails to submit a Spark application, the failed Databricks job is missing from the Unravel UI. (ASP-1427)
In Databricks, when a job in a workflow fails and a new job is launched instead, as a new attempt, the new job will not be part of the same workflow. (PG-269)
Cyclical dependency error is seen in the event_worker_1.err.log file. (DT-1127)
Workaround: Set the com.unravel.workflows.compareupdate.disable property to true.
Databricks jobs are intermittently missed in Unravel. (PG-232)
In the Databricks view, the application is shown in a running state, even though the corresponding Spark application is marked as finished. (ASP-1436)

Dataproc

Google Cloud Dataproc: Executor logs are not loaded for Spark applications. (PG-229)

EMR

The exception "Problem when retrieving bootstrap actions for cluster" is seen in the aws_worker daemon logs.

Workaround: While creating an AWS account for the EMR Chargeback/Insights Overview feature, you must include an additional entry in the Policy JSON file for "elasticmapreduce:ListBootstrapActions", as follows:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "pricing:GetProducts",
                "elasticmapreduce:ListClusters",
                "elasticmapreduce:DescribeCluster",
                "elasticmapreduce:ListInstanceFleets",
                "elasticmapreduce:ListInstanceGroups",
                "elasticmapreduce:ListBootstrapActions",
                "elasticmapreduce:ListInstances",
                "ec2:DescribeSpotPriceHistory"
            ],
            "Resource": "*"
        }
    ]
}

If the AWS account was already created without this entry (elasticmapreduce:ListBootstrapActions), you can add it to the policy later.
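
To confirm that a policy document contains the required entry before applying the workaround, a check along the following lines may help. This is a minimal sketch, not an Unravel or AWS utility: the function names are hypothetical, it only inspects "Allow" statements, and it ignores "Deny" statements, "NotAction", and conditions. It treats action names as case-insensitive and honors wildcards such as `elasticmapreduce:*`.

```python
import fnmatch
import json

REQUIRED_ACTION = "elasticmapreduce:ListBootstrapActions"


def allowed_action_patterns(policy: dict) -> set:
    """Collect the action patterns from every Allow statement."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    patterns = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # a single action may be a plain string
            actions = [actions]
        patterns.update(actions)
    return patterns


def policy_allows(policy_text: str, action: str = REQUIRED_ACTION) -> bool:
    """Return True if any Allow statement lists the action or a matching wildcard."""
    patterns = allowed_action_patterns(json.loads(policy_text))
    return any(
        fnmatch.fnmatchcase(action.lower(), pattern.lower())
        for pattern in patterns
    )


if __name__ == "__main__":
    # Hypothetical policy text that is missing the required entry.
    sample = '{"Statement": [{"Effect": "Allow", "Action": ["pricing:GetProducts"]}]}'
    print(policy_allows(sample))
```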

The workflow of multiple transient clusters (EMR) is not supported. (ASP-1424)

Email

Unravel node fails to send email notifications. (INSTALL-1694)

Insights Overview

The Insights Overview tab uses UTC as the timezone, while other pages use local time. Hence, the date and time that are shown on the Insights Overview tab and the other pages after redirection can be different. (UIX-4176)
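
The mismatch can be illustrated with a short sketch that renders the same instant in UTC and in a local timezone. The function name and the +5:30 offset are assumptions chosen for illustration; this is not Unravel code.

```python
from datetime import datetime, timedelta, timezone


def utc_and_local(instant: datetime, local_offset_hours: float):
    """Return the same instant expressed in UTC and at a fixed local offset."""
    utc = instant.astimezone(timezone.utc)
    local = utc.astimezone(timezone(timedelta(hours=local_offset_hours)))
    return utc, local


if __name__ == "__main__":
    # An event late in the evening in UTC, viewed from a UTC+5:30 locale.
    event = datetime(2022, 4, 27, 23, 30, tzinfo=timezone.utc)
    utc, local = utc_and_local(event, 5.5)
    print("UTC:  ", utc.isoformat())
    print("Local:", local.isoformat())
```

Both values describe the same moment, but the calendar date differs, which is why an entry can appear under different dates on the Insights Overview tab and on the page it redirects to.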

Kerberos

Kerberos can only be disabled manually from the unravel.yaml file.
```
 kerberos:
      enabled: False
```

Migration

WorkloadFit report
- A large number of tags can cause the Workload Fit report to fail. (PG-265)
- WorkloadFit report > Heatmap: The job count has data but Vcore and memory are empty. (MIG-262)

Reports

Cluster discovery
- If the metric retrieval for a host fails, then the CPU and memory capacity/usage graphs and heatmaps are not displayed.
  This happens on a CDH cluster when the Cloudera Manager agent of a host does not send any heartbeats to the Cloudera Manager server. Such a host is shown as Bad Health in Cloudera Manager. (REPORT-1706)
  Workaround: Ensure that the Cloudera Manager agent sends heartbeats to the Cloudera Manager on all hosts and that none of the hosts are shown as Bad Health.
- The On-prem Cluster Identity may show an incorrect Spark version on CDH. The report may incorrectly show Spark 1 when Spark 2 is installed on the CDH cluster. (REPORT-1702)
When using PostgreSQL, the % sign is duplicated and displayed in the Workload Fit report > Map to single cluster tab. (MIG-42)
Cloud Mapping Per Host report scheduled in v4.6.1.x will not work in v4.7.1.0. Users must schedule a new report. (REPORT-1886)
The TopX report email contains a link to the Unravel TopX report instead of showing the report content in the email as in the old reports.
Queue analysis: The log file name (unravel_us_1.log) displayed in the error message is incorrect. The correct name of the log file is unravel_sensor.log. (REPORT-1663)

Spark

There is a lag seen for SQL Streaming applications. (PLATFORM-2764)

Security

If the customer uses an active directory for Kerberos and the samAccountName and principal do not match, this can cause errors when accessing HDFS. (DOC-755)
In AAD login mode, when an external logout happens, the user still has access to their current logged-in UI. (UIX-4125)

Spark insights

For PySpark applications, the processCPUTime and the processCPULoad are not captured properly. (ASP-626)

Tez

The SQL events generator generates a SQL Like clause event when a query contains a like pattern, even if the pattern appears only in a literal. (TEZLLAP-349)

Upgrade

Notebooks will not work after upgrading to v4.7.1.0. You can configure them separately. (REPORT-1895)

If you have configured a single-cluster deployment for Unravel and the cluster name is not default, the Data page feature may not work properly.

To resolve this, explicitly set the following property after upgrading. (INSTALL-2151)

<Unravel installation directory>/unravel/manager stop
<Unravel installation directory>/unravel/manager config properties set hive.metastore.cluster.ids=<cluster-name>
<Unravel installation directory>/unravel/manager apply
<Unravel installation directory>/unravel/manager start

After upgrading from v4.6.x to v4.7.1.0, the Tez application details page does not initially show DAG data. The DAG data is visible only after you refresh the page. (ASP-1126)

UI

The new user interface (UI) can be accessed only from Chrome.
On the App summary page for Impala, the Query > Operator view is visible only after scrolling down. (UIX-3536)

Workflow

Jobs are falsely labeled as a Tez App for Oozie Sqoop and Shell actions. (PLATFORM-2403)

Support

For support issues, contact Unravel Support.
