4.8.2.0 Release notes
Software version
Release date: November 10, 2023
See 4.8.2.0 for download information.
Software upgrade support
Note
Upgrading is not supported in this release.
For fresh installations, see Installing Unravel.
Certified platforms
The following platforms are tested and certified in this release:
Cloud platforms
Google Cloud Platform (GCP) - BigQuery
Snowflake
Review your platform's compatibility matrix before you install Unravel.
Updates to Unravel's configuration properties
See 4.8.x - Updates to Unravel properties.
New features
Improvements and enhancements
OpenSearch and OpenSearch Dashboards are upgraded to version 2.9.0. (INSTALL-3348)
For enhanced security and permission management, the getData permissions are added to Terraform for BigQuery integration only if the user has opted to enable the Data page or set the Billing data feature. (INSTALL-3414)
The --create-credentials option has been replaced with --need-integration to prevent misinterpretation. (BIGQ-795)
UI and server Node.js are now upgraded to 18.14.0. (UIX-5577, UIX-5578)
Python is updated to version 3.8.17. (INSTALL-3417)
Bug fixes
BigQuery
The following bug fixes are included in this release for Unravel BigQuery:
Insights
Scheduled query insights are not generated for on-demand queries. (BIGQ-702)
The action plan associated with Reservation Insights is missing. (BIGQ-761)
Jobs
All projects that are configured for Unravel monitoring should be displayed under the Projects page even if the queries or jobs are not run for those projects. (BIGQ-449)
When comparing two jobs on the Jobs page, the Creation time is not shown on the Job details page when Show differences is enabled, even though the creation time is different. (BIGQ-615)
Snowflake
The following bug fix is included in this release for Unravel Snowflake:
Warehouses
Unravel Integration Service (IS) queries are preventing the suspension of warehouses in the environment. (SNOW-997)
Unravel does not support Billing for on-prem platforms.
On the Data page, File Reports, Small File reports, and file size information are not supported for EMR clusters.
On the Data page, File Reports, Small File reports, and file size information are not supported for Dataproc clusters.
Monitoring the expiration of SSL certificates and Kerberos principals is not supported in Unravel multi-cluster deployments.
Scheduled Query insights are not generated if you are using the INFORMATION_SCHEMA polling method.
In GCP - BigQuery, for the Data page, a count of more than 100 projects is not supported.
Great Expectations integration is not supported for Snapshot mode in Snowflake.
Profiler-based insights are not supported for Snapshot mode. Since the query profile is not received in the Snapshot mode, the insights that use the query profile are not generated.
Sustained Violation is not supported for Databricks AutoAction.
The following features are not supported for MapR:
Impala applications
Kerberos
The following features are not supported on the Data page:
Forecasting
Small Files Report
File Reports
The following reports are not supported on MapR:
File Reports
Small Files Report
Capacity Forecasting
Cloud Migration reports
AutoAction is not supported for Impala applications.
Billing
Insights Overview
Unravel does not support the Insights Overview tab on the UI for the Amazon EMR platform.
Migration planning is not supported for the following regions for Azure Data Lake:
Germany Central (Sovereign)
Germany Northeast (Sovereign)
Forecasting and Migration: In a multi-cluster environment, you can configure only a single cluster at a time. Hence, reports are generated only for that single cluster.
Migration planning is not supported for MapR.
Unravel does not support multi-cluster management of combined on-prem and cloud clusters.
Automatic FSImage processing is not supported for multi-cluster environments.
Unravel does not support apps that belong to the same pipeline in a multi-cluster environment but are sourced from different clusters. A pipeline can only contain apps that belong to the same cluster.
In GCP - BigQuery, the Data page supports up to a maximum of 100 projects.
Except for the TopX report, no reports are supported on Databricks and EMR.
BigQuery
The following known issues are included in this release for Unravel BigQuery:
For some cached jobs, if many jobs are running in parallel, the View original query link does not show on the Job details page. (BIGQ-466)
Cached query jobs are not analyzed on the Data page. (BIGQ-603)
Terraform does not delete the Service Account and Role in the specified project ID after running the remove command. (INSTALL-3268)
Issue: Sometimes, when you process a large number of BigQuery projects with the manager config bigquery integrate command, you may see the following error:
Provider produced inconsistent result after apply
Workaround: Wait for a few minutes and re-run the command. (INSTALL-2860, INSTALL-2934)
Snowflake
The following known issues are included in this release for Unravel Snowflake:
Great Expectations integrations are not supported for Snapshot mode. (SNOW-1281)
Issue: Insights may not be generated in some cases where default_runtime_data_connector_name or default_configured_asset_sqldataconnector is used as the data connector. (SNOW-596)
Workaround: The default_inferred_data_connector_name is the recommended data connector to use.
data_connectors:
  default_inferred_data_connector_name:
    class_name: InferredAssetSqlDataConnector
    include_schema_name: false
For more details, refer to Data Connectors and Configure your individual Data Connectors.
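As a rough illustration, a small Python check like the following could flag a datasource configuration that relies on one of the connectors named in the issue. It assumes the configuration has already been loaded into a dictionary. The function name and the sample dictionary are hypothetical and are not part of Unravel or Great Expectations.

```python
# Connector names taken from the issue text above (SNOW-596).
PROBLEM_CONNECTORS = {
    "default_runtime_data_connector_name",
    "default_configured_asset_sqldataconnector",
}
RECOMMENDED_CONNECTOR = "default_inferred_data_connector_name"


def check_data_connectors(datasource_config):
    """Return (uses_recommended, problem_names) for a datasource config dict."""
    connectors = datasource_config.get("data_connectors", {})
    problem_names = sorted(name for name in connectors if name in PROBLEM_CONNECTORS)
    return RECOMMENDED_CONNECTOR in connectors, problem_names


# Hypothetical config mirroring the workaround snippet above.
example_config = {
    "data_connectors": {
        "default_inferred_data_connector_name": {
            "class_name": "InferredAssetSqlDataConnector",
            "include_schema_name": False,
        }
    }
}
print(check_data_connectors(example_config))
```

A non-empty second element in the result means the configuration still uses a connector for which insights may not be generated.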
The Insights column in the query list does not show the insights count for queries with insight. (SNOW-1295)
Some of the features on the Data page, such as the trend graphs on the Data Overview page and on the Data Tables page, are not applicable for Snapshot mode in Snowflake. (SNOW-1281)
Data quality insights are not supported for Snapshot mode. (SNOW-1281)
If tables are created with the same name and are accessed, deleted, and re-created, and if those tables are re-accessed, then their query and app count do not match. (DATAPAGE-502)
The query to fetch tableDailyKPIs times out in the case of a huge table partition (27 million). (DATAPAGE-740)
For Hive metastore 3.1.0 or earlier versions, the creation time of partitions is not captured if a partition is created dynamically. Therefore, the Last Day KPI for the partition section is not shown in Unravel. (DATAPAGE-473)
On the Data page, size data is missing for certain tables in databases, although the partition size is correctly displayed in the Partition Detail section. (DATAPAGE-695)
The Workflow/Job page displays empty Analysis, Resources, Daggraph, and Errors tabs. (DT-1093)
Event logs and YARN logs are not loaded for some applications in Google Dataproc clusters. (ASP-1372)
On the Table Details page under the Applications tab, inaccurate data is displayed for a table. This issue occurs if a table is deleted and recreated multiple times and applications are executed to access the table before the next cycle of the table worker. (PG-156)
When you click the View Clusters link on the Cost-based pages and navigate to the Clusters page, the cluster numbers shown can vary. Sometimes fewer clusters are listed, and at times no clusters are shown. This is a known limitation due to differences in the definition of the time range selector for these pages. (UIX-5328)
Cost page
Shows all the clusters that have accrued cost in the selected period, which may be running or terminated in the selected period, irrespective of their start date.
Clusters page
Shows only those clusters that have started in the selected period.
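The difference between the two time-range definitions can be sketched in Python. This is only an illustration of the description above, not Unravel's implementation. The cluster records, field names, and dates are hypothetical.

```python
from datetime import datetime


def accrued_cost_in_period(cluster, period_start, period_end):
    """Cost-page style filter: the cluster's lifetime overlaps the period."""
    end = cluster["end"] or period_end  # a still-running cluster has no end time
    return cluster["start"] < period_end and end > period_start


def started_in_period(cluster, period_start, period_end):
    """Clusters-page style filter: the cluster started inside the period."""
    return period_start <= cluster["start"] < period_end


# Hypothetical clusters; names and dates are made up for illustration.
clusters = [
    {"name": "old-but-running", "start": datetime(2023, 10, 1), "end": None},
    {"name": "started-this-week", "start": datetime(2023, 11, 6), "end": datetime(2023, 11, 7)},
    {"name": "ended-before-period", "start": datetime(2023, 9, 1), "end": datetime(2023, 9, 5)},
]
period_start, period_end = datetime(2023, 11, 5), datetime(2023, 11, 12)

cost_page = [c["name"] for c in clusters if accrued_cost_in_period(c, period_start, period_end)]
clusters_page = [c["name"] for c in clusters if started_in_period(c, period_start, period_end)]
print(cost_page)
print(clusters_page)
```

In this sketch, the cluster that started before the selected period but is still running is counted by the first filter and dropped by the second, which is why fewer clusters can appear after following the View Clusters link.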
Due to the known limitations, node downsizing recommendations are not suggested for the following scenarios. (EMR-519, EMR-513, EMR-424)
When only cluster recommendations are applied without applying Application recommendations.
When the workloads require high IO and partitioning.
When the Spark configuration spark.dynamicAllocation.enabled is true.
When the AWS EMR autoscaling is enabled.
When the workload requires parallelism (multiple CPU cores).
On the Clusters details page, the Insights chart is not synchronized with all other graphs displayed on the Cost tab. (EMR-618)
On the Cost > Trends page, users with readonly permission can create a budget. (EMR-576)
The workflow of multiple transient clusters (EMR) is not supported. (ASP-1424, EMR-460)
Unable to run Spark applications on all the master nodes after Unravel bootstrap for high availability clusters. (EMR-49)
Support for high availability EMR master nodes. (EMR-31)
For the MapReduce failed job, error details are missing on the Errors tab. (UIX-5416)
Issue: Jobs created for PySpark application with UDF on a JobCluster fail after applying the recommendations for node downsizing. (DT-1404)
Workaround:
In your Databricks workspace, go to Configure Cluster > Advanced Options > Spark config
Add the following property, set to true, to the spark.driver.extraJavaOptions and spark.executor.extraJavaOptions Spark configurations:
-Dcom.unraveldata.metrics.proctree.enable=true
For example:
spark.executor.extraJavaOptions -Dcom.unraveldata.metrics.proctree.enable=true -javaagent:/dbfs/databricks/unravel/unravel-agent-pack-bin/btrace-agent.jar=config=executor,libs=spark-3.0
spark.driver.extraJavaOptions -Dcom.unraveldata.metrics.proctree.enable=true -javaagent:/dbfs/databricks/unravel/unravel-agent-pack-bin/btrace-agent.jar=config=driver,script=StreamingProbe.btclass,libs=spark-3.0
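If the extraJavaOptions values are generated by a script, a helper along these lines could add the property without duplicating it. This is a minimal sketch: the function name is hypothetical, and it assumes the options are separated by whitespace and contain no quoted values with spaces.

```python
PROCTREE_FLAG = "-Dcom.unraveldata.metrics.proctree.enable=true"


def with_proctree_flag(java_options):
    """Return the option string with the proctree flag added once, at the front."""
    tokens = java_options.split()
    if PROCTREE_FLAG in tokens:
        return " ".join(tokens)
    return " ".join([PROCTREE_FLAG] + tokens)


# Existing agent option copied from the example above; the flag is prepended.
executor_opts = (
    "-javaagent:/dbfs/databricks/unravel/unravel-agent-pack-bin/"
    "btrace-agent.jar=config=executor,libs=spark-3.0"
)
print("spark.executor.extraJavaOptions", with_proctree_flag(executor_opts))
```

The same helper can be applied to the spark.driver.extraJavaOptions value.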
Data is not displayed when you click the Optimize button corresponding to OTHERS for the Cost > Chargeback results shown in the table. (UIX-5624)
On the Job Runs page, the Cost slider stops working when the cost for the selected user is less than $1. (UIX-5508)
The job run count displayed on the Chargeback page differs from the job count shown on the Workflow page. (UIX-5581)
Errors and Logs data are missing in specific scenarios for a failed Spark application, such as an application failing with OutOfMemoryError. (ASP-1624)
On the Cost > Trends and Cost > Chargeback pages, the tooltip for the Last <number of> days field includes more days than the displayed days. (UIX-5042)
Clusters with an Unknown status are excluded from the dashboards often used for monitoring systems. (DT-1445)
When the Interactive cluster is restarted, the cluster count is increased on the Databricks Cost > Trends page. (DT-1275)
After navigating from the Trends and Chargeback pages with Tag filters, the No data available message is displayed on the Compute page. (DT-1094)
Inconsistent data is displayed for the cluster Duration and Start Time on the Compute page. (ASP-1636)
The DriverOOME and ExecutorOOME events are not generated for the Databricks notebook task. (DT-533)
When a job fails to submit a Spark application, the failed Databricks job is missing from the Unravel UI. (ASP-1427)
In the Databricks view, the application is shown in a running state, even though the corresponding Spark application is marked as finished. (ASP-1436)
On the Workflows > Jobs page, you can view only up to 100 records or jobs. (ASI-695)
The email template for the AutoAction policy contains an unformatted table. The issue occurs when the threshold value is zero (0). (ASI-688)
Azure Databricks jobs are randomly missing on the Unravel UI due to Azure Databricks File System (DBFS) mount issues. (PIPELINE-1626)
The Job Run details page displays a duplicate entry for tasks executed during the job. (DT-1461)
Google Cloud Dataproc: Executor Logs are not loaded for Spark applications. (ASP-1371)
An exception occurs when installing Unravel version 4.7.6.0 with the Azure MySQL database (SSL Enabled). (INSTALL-2799)
During precheck and healthcheck, the Hadoop check fails for the MapR cluster. You can ignore the messages. (INSTALL-2603)
The Insights Overview tab uses UTC as the timezone, while other pages use local time. Hence, the date and time shown on the Insights Overview tab and the other pages after redirection can differ. (UIX-4176)
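The following Python sketch shows how the same instant can fall on different calendar dates in UTC and in local time. The event time and the fixed UTC-08:00 offset are hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical event time, stored in UTC.
event_utc = datetime(2023, 11, 10, 2, 30, tzinfo=timezone.utc)

# Hypothetical local zone: a fixed UTC-08:00 offset, used here for illustration only.
local_zone = timezone(timedelta(hours=-8))
event_local = event_utc.astimezone(local_zone)

print("UTC:  ", event_utc.strftime("%Y-%m-%d %H:%M"))
print("Local:", event_local.strftime("%Y-%m-%d %H:%M"))
```

Both values describe the same moment, but the local rendering falls on the previous calendar day, which matches the kind of mismatch seen after redirection from the Insights Overview tab.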
Kerberos can only be disabled manually from the unravel.yaml file:
kerberos:
  enabled: False
WorkloadFit report
A large number of tags can cause the Workload Fit report to fail. (PG-265, CUSTOMER-2084)
WorkloadFit report > Heatmap: The job count has data, but Vcore and memory are empty. (MIG-262)
Inconsistency between the regions displayed on the Unravel user interface and the ones included in AWS EMR. (MIG-280, MIG-281)
The Cloud Mapping Per Host migration report fails for some regions. (MIG-303)
Cluster discovery
The On-prem Cluster Identity might show an incorrect Spark version on CDH. The report may incorrectly show Spark 1 when Spark 2 is installed on the CDH cluster. (REPORT-1702)
TopX report
The TopX report email links to the Unravel TopX report instead of showing the report content in the email as in the old reports.
Queue analysis:
The log file name (unravel_us_1.log) displayed in the error message is incorrect. The correct name of the log file is unravel_sensor.log. (REPORT-1663)
The Cloud Mapping Per Host report scheduled in v4.6.1.x does not work in v4.7.1.0. Users must schedule a new report. (REPORT-1886)
When using PostgreSQL, the percentage (%) sign is duplicated and displayed in the Workload Fit report > Map to single cluster tab. (MIG-42)
If the Spark job is not running for Databricks, the values for the Duration and End time fields are not updated on the Databricks Run Details page. (ASP-1616)
You can see a lag for SQL Streaming applications. (PLATFORM-2764)
On the Spark application details page, the timeline histogram is not generated correctly. (UX-632)
For PySpark applications, the processCPUTime and processCPULoad values are not captured properly. (ASP-626)
The SQL events generator generates a SQL Like clause event if the query contains a like pattern, even when the pattern appears only in literals. (TEZLLAP-349)
After upgrading from v4.7.1.1 to v4.7.5.0, the Hive jobs running with the Tez application as an execution engine are not linked. (EMR-406)
After upgrading to v4.7.1.0, Notebooks do not work. You can configure them separately. (REPORT-1895)
After upgrading from v4.6.x to v4.7.1.0, the Tez application details page does not initially show DAG data. The DAG data is visible only after you refresh the page. (ASP-1126)
On the App summary page for Impala, the Query > Operator view is visible only after scrolling down. (UIX-3536)
When you click a Hive query that was executed as part of a Hive on Spark application, a blank page is shown. (UIX-6037)
On the Clusters > Resources page, in the Group By drop-down list, the Application Type, User, and Queue options are duplicated for the YARN/IMPALA resource job type. The issue occurs if identical user-defined tags are used. (UIX-5898)
Jobs are falsely labeled as a Tez App for Oozie Sqoop and Shell actions. (PLATFORM-2403)
BigQuery
The following limitations are included in this release for Unravel BigQuery:
Data page table-level queries and users will be analyzed once the table is captured by DataPage. For older queries, analysis may not be performed. (BIGQ-602)
If tables are created with the same name and are accessed, deleted, and re-created, and if those tables are re-accessed, then their query and app count do not match. (DATAPAGE-502)
If the number of tag keys exceeds 1000, or the number of tag values per key exceeds 1000, the Tags filter does not work on the Unravel BigQuery > Jobs page. (UIX-6405)
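To check ahead of time whether a set of tags stays within these thresholds, a check along the following lines could be used. This is a sketch under assumptions: the function name and the input shape (a mapping from each tag key to its values) are hypothetical, and only the 1000 threshold comes from the limitation above.

```python
TAG_LIMIT = 1000  # threshold stated in the limitation above


def tags_filter_within_limits(tags):
    """tags maps each tag key to a collection of its values."""
    if len(tags) > TAG_LIMIT:
        return False
    # Count distinct values per key, since repeated values do not add new entries.
    return all(len(set(values)) <= TAG_LIMIT for values in tags.values())


# Hypothetical tag data for illustration.
print(tags_filter_within_limits({"team": ["data", "ml"], "env": ["prod"]}))
```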
Snowflake
The following limitations are included in this release for Unravel Snowflake:
Some of the trend graphs on the Data Overview page (Number of Tables created, Size of Tables Created, Total Number of Tables, and Total Size of Tables) and the Size trend graph on the Data Tables page are not applicable for Snapshot mode in Snowflake. (SNOW-1281)
For the Unravel installation in Snowflake, if you want to switch from the existing Real-time connection to a Snapshot connection or vice versa, then you must reinstall Unravel. This is because the existing data must be removed, and the new data must be polled. (SNOW-1217)
In the Unravel Snowflake UI, queries may appear with a delay when many Unravel queries are in the running or queued state due to low compute processing capacity. As a result, user queries may show up late on the Unravel UI. (SNOW-1032)
When a warehouse has not been running for a long time or has been deleted, it does not appear in the next iteration of the warehouse query. Therefore, that warehouse appears in an UNKNOWN state. (SNOW-1288)