v4.7.8.0 Release notes
Software version
Release date: February 20, 2023
See 4.7.8.0 for download information.
Software upgrade support
The following upgrade paths are supported:
4.7.x.x → 4.7.8.0
4.6.1.9 → 4.7.8.0
4.6.1.8 or earlier → 4.6.1.9 → 4.7.8.0
For instructions to upgrade to Unravel v4.6.1.9, see Upgrading Unravel server.
For instructions to upgrade to Unravel v4.7.8.x, see Upgrading Unravel.
For fresh installations, see Installing Unravel.
Certified platforms
The following platforms are tested and certified in this release:
Amazon EMR
Databricks (Azure, AWS)
Review your platform's compatibility matrix before you install Unravel.
Updates to Unravel's configuration properties
See 4.7.x - Updates to Unravel properties.
Updates to upgrading Unravel to v4.7.8.0
An existing license for any version earlier than 4.7.7.x does not work with the newer version of Unravel. Therefore, before upgrading Unravel, you must obtain a license file from Unravel Customer Support. For information about setting the license, see the Upgrading Unravel from version 4.7.x to 4.7.8.x section in Upgrading Unravel.
Optionally, you can regroup multiple Spark worker instances for enhanced performance after upgrading to v4.7.8.0.
Caution
This task requires planning and can be performed only in collaboration with the Unravel support team. This is a one-time task.
New features
Data quality integration with Great Expectations
Great Expectations is a data quality tool that validates your data assets by running an Expectation Suite (a set of quality assertions) against them. When integrated with Unravel, Great Expectations extends data quality measurement into Unravel, and Unravel provides unified visibility into every expectation validated while the Expectation Suite runs, adding data quality insights to Unravel's single-pane data monitoring.
You can view the Great Expectations data quality insights in the Unravel UI under Data > Tables detail page > Analysis tab, and under Jobs > Job details page > Analysis tab.
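The validation flow can be pictured with a small Python sketch; the helper names below (expect_*, run_suite) are hypothetical stand-ins for illustration only, not the Great Expectations API:

```python
# Conceptual sketch only: the helper names are illustrative and are
# NOT the Great Expectations API.

def expect_values_not_null(column):
    """Expectation: the column contains no null values."""
    return all(v is not None for v in column)

def expect_values_between(column, low, high):
    """Expectation: every non-null value falls inside an accepted range."""
    return all(low <= v <= high for v in column if v is not None)

def run_suite(column, suite):
    """Run each expectation and collect pass/fail results, similar in
    spirit to the validation results surfaced on the Analysis tab."""
    return {name: check(column) for name, check in suite.items()}

ages = [34, 29, None, 41]
suite = {
    "not_null": expect_values_not_null,
    "in_range": lambda col: expect_values_between(col, 0, 120),
}
results = run_suite(ages, suite)
# results -> {"not_null": False, "in_range": True}
```

The Expectation Suite plays the role of the `suite` dictionary here: a named collection of assertions whose per-expectation pass/fail outcomes become the data quality insights shown in the UI.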
Multi-node deployment of Spark workers for high-volume data processing
You can deploy additional Spark workers on a separate server, other than the server where Unravel is installed, with services to process high-volume data.
Notification channels
A new Notification channels option has been added to the Manage menu. Use it to set up notification channels that alert you when certain conditions are triggered. Notifications can send alerts through email or Slack messages to users or user groups.
For information about notification channels, see the following topics:
New topics in the Notification channels guide:
Creating a notification channel
Modifying an existing notification channel
Viewing notification channels
Updated topics in the Cost Budget (EMR) guide:
Creating a budget
Viewing a budget and its details
Updated topics in the Cost Budget (Databricks) guide:
Setting a budget
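As a rough illustration of what a notification channel does when a condition triggers, the sketch below builds a Slack-style webhook payload; the function and field names are assumptions for illustration, not Unravel's actual notification schema:

```python
# Hypothetical sketch of turning a triggered condition into a webhook
# payload; function and field names are illustrative, not Unravel's schema.
import json

def build_alert_payload(channel, entity, metric, value, threshold):
    text = (f"Alert on {channel}: {entity} {metric}={value} "
            f"exceeded threshold {threshold}")
    # Slack-style incoming webhooks accept a JSON body with a "text" field.
    return json.dumps({"text": text})

payload = build_alert_payload("cost-alerts", "cluster j-ABC123",
                              "cost_usd", 120, 100)
```

An email channel would format the same condition as a subject and body instead of a JSON payload; the channel abstracts the delivery target away from the alert condition.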
AutoActions support EMR apps and clusters to optimize cost
AutoActions can now monitor EMR apps and clusters. You can set an AutoAction policy to generate alerts for EMR apps and clusters. AutoActions can monitor EMR clusters based on cost, duration, and idle checks, and send alerts.
For more information, refer to the AutoActions > AutoActions (EMR) topic in the User Guide.
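The kinds of checks such a policy performs can be sketched as follows; the function, field names, and thresholds are illustrative assumptions, not Unravel's implementation:

```python
# Illustrative sketch (not Unravel's implementation) of the checks an EMR
# AutoAction policy evaluates: cost, duration, and idle-time thresholds.

def evaluate_policy(cluster, max_cost_usd, max_duration_min, max_idle_min):
    """Return the list of violated checks for one EMR cluster."""
    violations = []
    if cluster["cost_usd"] > max_cost_usd:
        violations.append("cost")
    if cluster["duration_min"] > max_duration_min:
        violations.append("duration")
    if cluster["idle_min"] > max_idle_min:
        violations.append("idle")
    return violations  # a non-empty list would trigger an alert

cluster = {"cost_usd": 120.0, "duration_min": 300, "idle_min": 90}
alerts = evaluate_policy(cluster, max_cost_usd=100.0,
                         max_duration_min=480, max_idle_min=60)
# alerts -> ["cost", "idle"]
```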
Improvements and enhancements
Databricks enhancements
A Databricks Job can be associated with multiple clusters. Each job entry now corresponds to a Databricks job. The following enhancements have been made to the Databricks Workflows > Jobs page (DT-1187):
Workflows > Jobs:
Removed the Cluster Name column.
Workflows > Job Runs:
Removed the Cluster Name and Cluster Type columns.
Removed the Job name link from the Run Name / ID column.
Renamed Run Name / ID to Job name / ID.
Added a link to the Run ID; clicking the Run ID displays the job run details page.
Updated the Search by ID, Keyword field to Search by keyword; you can search for the job name by typing a keyword.
Changed the Filter by Cluster Name search to Filter by Job name or ID.
The following enhancement has been made to the Resources tab on the Spark details page (DT-1456):
Compute > Spark > Resources > Host Metrics and Workflow > Job > Task > Resources > Host Metrics:
The following new metrics have been added to Host Metrics:
Total memory
Free memory
You can use these metrics to evaluate the memory consumed by processes other than Spark's.
For information, see the User Guide.
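As a rough worked example of how the two new metrics can be combined (the spark_gb figure is an assumed extra input, not one of the new Host Metrics):

```python
# Sketch of bounding non-Spark memory use on a host from Total memory and
# Free memory; all figures and the spark_gb input are illustrative.

def non_spark_memory_gb(total_gb, free_gb, spark_gb):
    """Memory held by processes other than Spark: total memory in use on
    the host minus what the Spark processes themselves consume."""
    used_gb = total_gb - free_gb
    return used_gb - spark_gb

other_gb = non_spark_memory_gb(total_gb=64, free_gb=16, spark_gb=40)
# other_gb -> 8 GB consumed by non-Spark processes
```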
Other enhancements
Node count and duration values are provided for the aggregated cost savings for each recommended node type. (EMR-620)
A new Account Id column has been added to the AWS Account Settings page so that you can view the configured AWS account ID in the Unravel UI. (UIX-5469)
On the Clusters page, the ID filter has been moved to the top and is now separate from the other filters. You cannot combine other filters (such as the date and time range) with an ID search. (UIX-5332)
For information, see the Monitor EMR clusters section in the User Guide.
The MySQL client library used by the user interface has been updated to version 12.0. (UIX-5383)
Enhanced performance by reducing the lag in the Impala pipeline. (ASP-1677)
Added support for a Download as CSV option on the EMR Clusters and EMR AutoAction pages. (UIX-4853)
For information, see the Viewing AutoAction and its details and Monitor EMR clusters sections in the User Guide.
Support for the EMR cluster idle state (EMR-465)
Unravel now supports AutoAction for the idle state of the cluster. You can set an AutoAction to trigger when the EMR cluster exceeds the idle duration threshold. For information, see Creating AutoActions in the User Guide.
Unsupported
Appstore does not support PostgreSQL over SSL.
Unravel does not support Billing for on-prem platforms.
Impala jobs are not supported on the HDP platform.
Monitoring the expiration of SSL certificates and Kerberos principals in Unravel multi-cluster deployments.
The following features are not supported for MapR:
Impala applications
Kerberos
The following features are not supported on the Data page:
Forecasting
Small Files
File Reports
The following reports are not supported on MapR:
File Reports
Small Files Report
Capacity Forecasting
Migration Planning
The Tuning report is supported only for MR jobs.
AutoAction is not supported for Impala applications.
The following features are not supported on MapR:
Migration
Billing
Insights Overview
Migration planning is not supported for the following regions for Azure Data Lake:
Germany Central (Sovereign)
Germany Northeast (Sovereign)
Forecasting and Migration: In a multi-cluster environment, you can configure only a single cluster at a time. Hence, reports are generated only for that single cluster.
Unravel does not support multi-cluster management of combined on-prem and cloud clusters.
In a multi-cluster environment, Unravel does not support apps that belong to the same pipeline but are sourced from different clusters. A pipeline can contain only apps that belong to the same cluster.
In Jobs > Sessions, the feature of applying recommendations and running the newly configured app is not supported.
Bug fixes
AutoActions
When multiple AutoAction policies are created with overlapping rulesets and scopes, only one of the policies is triggered. (AA-498)
Databricks
Duplicate job runs (with the same run IDs) are generated on the Job Runs page. (DT-1190)
On the Compute page, inaccurate information is displayed for clusters in the Inefficient category. (UIX-5064)
The downloaded TopX Report (in JSON format) lists the incorrect type of Spark app. (REPORT-2094)
In Databricks, when a job in a workflow fails and a new job is launched instead of a new attempt, the new job cannot be part of the same workflow. (PG-269)
On the Chargeback page, when you group by clusters, Unravel has a limitation of only grouping a maximum of 1000 clusters. (SUPPORT-1570)
EMR
After clicking the Hive Query link on a cluster that uses the bootstrap script, the "No apps found with the Id" message is displayed. (CLOUD-532)
On the Clusters page, search by cluster name returns incomplete search results. (UIX-5345)
On the Clusters page, the Name and Cluster tags filters return incomplete search results. (EMR-595)
On the Clusters page, the following issues are observed (EMR-588):
The cluster list omits clusters with a zero cost when the custom date range is selected
The cluster list omits the latest cluster cost when the custom date range is selected
If clusters terminate with errors without generating NodeDownSizingEvent, then such clusters are displayed in the Inefficient category on the Clusters page. (EMR-542)
The Spark sensor fails to start. (EMR-485)
On the Clusters page, the cluster IDs displayed in the ID drop-down list do not match the selected cluster category in the left panel. (EMR-435)
For clusters terminated with errors, the node downsizing recommendations are shown. (EMR-422)
Insights
Clicking the links for operators and stages in the SQLTooManyGroupByEvent does not result in any action. (INSIGHTS-355)
An exception occurs when generating memory insights for a Spark application. (INSIGHTS-363)
Installation
Databricks Healthcheck App Store celery daemon fails to start. (INSTALL-2945)
Installing Unravel fails when connecting with SSL-enabled MariaDB. (INSTALL-3071)
Spark
A blank page is displayed on the Databricks Run Details page for Spark structured streaming applications. (ASP-1629, UIX-5124)
UI
On the Clusters page, a discrepancy exists between the cost of clusters and the minimum and maximum cost displayed in the left pane. (UIX-5270)
From the Clusters page, after clicking the Spark action, refreshing the Spark details page takes longer than expected. (UIX-5247)
When you return from the application details > SQL tab> Stage page to the application details > Attempt page, the Duration, Data I/O, and Jobs Count fields are not displayed. (UIX-5048)
AutoActions stop responding due to an invalid or unsupported HTTP URL or webhook. (AA-575)
App Store tasks fail to start with SSL. (APP-614)
Workaround
To resolve this issue, do the following:
1. Stop Unravel.
<Unravel installation directory>/unravel/manager stop
2. Use an editor to open the <Installation_directory>/unravel/data/conf/unravel.yaml file.
3. In the unravel.yaml file, under the database > advanced > python_flags block, enter the path to the trusted certificates. For example, if Unravel is installed at /opt/unravel, edit the unravel.yaml file as follows:
unravel:
  ...snip...
  database:
    ...snip...
    advanced:
      python_flags:
        ssl_ca: /opt/unravel/data/certificates/trusted_certs.pem
4. Use the manager utility to upload the certificates.
<Unravel installation directory>/manager config tls trust add --pem /path/to/certificate
For example: /opt/unravel/manager config tls trust add --pem /path/to/certificate
5. Enable the truststore.
<Unravel installation directory>/manager config tls trust enable
6. Apply the changes and restart Unravel.
<Unravel installation directory>/unravel/manager config apply --restart
If tables are created with the same name and are accessed, deleted, and re-created, and if those tables are re-accessed, then their query and app counts do not match. (DATAPAGE-502)
For Hive metastore 3.1.0 or earlier versions, the create time of a partition is not captured if the partition is created dynamically. Therefore, in Unravel, the Last Day KPIs in the partition section are not shown. (DATAPAGE-473)
The Insights Overview tab uses UTC as the timezone, while other pages use local time. Hence, the date and time shown on the Insights Overview tab and the other pages after redirection can differ. (UIX-4176)
Kerberos can only be disabled manually from the unravel.yaml file:
kerberos:
  enabled: False
WorkloadFit report
A large number of tags can cause the Workload Fit report to fail. (PG-265, CUSTOMER-2084)
WorkloadFit report > Heatmap: The job count has data, but Vcore and memory are empty. (MIG-262)
Cluster discovery
The On-prem Cluster Identity report might show an incorrect Spark version on CDH; for example, it may show Spark 1 when Spark 2 is installed on the cluster. (REPORT-1702)
Queue analysis: The log file name (unravel_us_1.log) displayed in the error message is incorrect. The correct name of the log file is unravel_sensor.log. (REPORT-1663)
A Cloud Mapping Per Host report scheduled in v4.6.1.x does not work in v4.7.1.0. Users must schedule a new report. (REPORT-1886)
When using PostgreSQL, the percentage (%) sign is duplicated in the Workload Fit report > Map to single cluster tab. (MIG-42)
The SQL events generator generates a SQL Like clause event if the query contains a LIKE pattern, even within literals. (TEZLLAP-349)
After upgrading from v4.7.1.1 to v4.7.5.0, Hive jobs running with Tez as the execution engine are not linked. (EMR-406)
After upgrading to v4.7.1.0, Notebooks do not work and must be configured separately. (REPORT-1895)
After upgrading from v4.6.x to v4.7.1.0, the Tez application details page does not initially show DAG data. The DAG data is visible only after you refresh the page. (ASP-1126)
On the App summary page for Impala, the Query > Operator view is visible only after scrolling down. (UIX-3536)
Jobs are incorrectly labeled as Tez apps for Oozie Sqoop and Shell actions. (PLATFORM-2403)