v4.1.9.4 Release notes
Software version
On-premise RPM
EMR RPM
New features
Application data from the Resource Manager is published to the jc topic and then processed by the JCW2 daemon, which creates an entry in the jobs table and a hitdoc. This ensures that killed apps (whose jhist and job conf files are not available in HDFS) are still available in Unravel.
Cluster Analytics
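The Resource Manager flow described above can be pictured with a small sketch. It is only an illustration of the publish-then-consume pattern: the in-memory queue and dictionaries stand in for the jc topic, the jobs table, and the hitdoc store, and the names `publish_app`, `drain_topic`, and the record fields are hypothetical, not Unravel's actual code or schema.

```python
import json
import queue

# Hypothetical in-memory stand-ins; the real topic, table, and document
# store are not described in these notes.
jc_topic = queue.Queue()
jobs_table = {}
hitdocs = {}


def publish_app(app):
    """Publish one Resource Manager application record to the topic."""
    jc_topic.put(json.dumps(app))


def drain_topic():
    """Consume every pending record, creating a jobs row and a hitdoc for each."""
    processed = 0
    while not jc_topic.empty():
        app = json.loads(jc_topic.get())
        app_id = app["id"]
        # The row is built from Resource Manager data alone, so it does not
        # depend on jhist or job conf files existing in HDFS.
        jobs_table[app_id] = {"state": app["state"], "user": app["user"]}
        hitdocs[app_id] = {"id": app_id, "summary": f'{app["user"]}: {app["state"]}'}
        processed += 1
    return processed


# A killed application still ends up with a jobs row and a hitdoc.
publish_app({"id": "application_0001", "state": "KILLED", "user": "alice"})
drain_topic()
```

Because the consumer needs only the published record, an application killed before its history files were written is recorded the same way as any other.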
Tested platforms
CDH with Kerberos
HDP with Kerberos
EMR 5.2.X
Qubole Spark type cluster
Improvements and bug fixes
Fixed the issue of "Done directory setting via UI"
Fixed the issue of "Config wizard not showing the value of spark event log location"
Robustness/Reliability
Improved bootstrap scripts for Qubole and EMR
Improved setup scripts for HDP and MapR
Improved experience deploying and configuring the Unravel Resource Metrics Sensor
A single distribution package: there is no need to download separate ZIP files for MR and Spark
Simplified configuration - the common settings are all included and don't need to be specified by the user
Improved Unravel Resource Metrics Sensor performance
Spark support
Tag interesting Spark apps, usability improvements, time breakdown event
Put caps on the maximum number of executors in recommendations
Differentiate among actionable events in Spark using dynamic ranks as in Hive and MR
Bounding executor logs + efficient log processing, API to access Spark configuration from MySQL
Fixes for accessing S3 log files:
Mapping multiple S3 buckets to the same S3 profile
Setting the S3 region to a custom one
Scripts and DSL API extension to generate large Oozie workflows programmatically
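As a rough idea of what generating a large Oozie workflow programmatically can look like, the sketch below builds a linear chain of shell actions with Python's standard `xml.etree.ElementTree`. It is not the Unravel DSL, whose API is not described in these notes. The function name `build_workflow`, the `step-N` action names, and the script names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def build_workflow(name, scripts):
    """Return workflow XML with one shell action per script, run in sequence."""
    root = ET.Element("workflow-app", {"name": name, "xmlns": "uri:oozie:workflow:0.5"})
    action_names = [f"step-{i}" for i in range(1, len(scripts) + 1)]
    ET.SubElement(root, "start", {"to": action_names[0] if action_names else "end"})

    for i, (action_name, script) in enumerate(zip(action_names, scripts)):
        action = ET.SubElement(root, "action", {"name": action_name})
        shell = ET.SubElement(action, "shell", {"xmlns": "uri:oozie:shell-action:0.2"})
        ET.SubElement(shell, "job-tracker").text = "${jobTracker}"
        ET.SubElement(shell, "name-node").text = "${nameNode}"
        # A real workflow would also need <file> elements to ship each script.
        ET.SubElement(shell, "exec").text = script
        # Chain to the next action, or to the end node after the last one.
        next_node = action_names[i + 1] if i + 1 < len(action_names) else "end"
        ET.SubElement(action, "ok", {"to": next_node})
        ET.SubElement(action, "error", {"to": "fail"})

    kill = ET.SubElement(root, "kill", {"name": "fail"})
    ET.SubElement(kill, "message").text = "Workflow failed"
    ET.SubElement(root, "end", {"name": "end"})
    return ET.tostring(root, encoding="unicode")


# Generate a 500-action workflow from hypothetical script names.
workflow_xml = build_workflow("generated-wf", [f"job_{i}.sh" for i in range(500)])
```

Writing `workflow_xml` to a `workflow.xml` file gives a test input whose size is controlled by a single list.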
Workflow support
Oozie workflow pipeline:
Improve backend storage efficiency of Oozie workflows.
Tested the changes against large Oozie workflows, with all tests passing.
Tagged workflow change:
Enable using a Python script to tag any Unravel-supported application.
Enable users to tag Oozie workflows via a similar approach. If a user enables this feature, the Oozie workflows won't show up; instead, the selected (tagged) applications will be grouped into a single tagged workflow and presented in the Workflows tab.
Airflow improvements
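The tagged workflow behavior described above, where tagged applications are grouped into one workflow, can be sketched as follows. The interface that Unravel expects from a tagging script is not specified in these notes, so `tag_app`, `group_by_tag`, the `workflow` tag key, and the application fields are all hypothetical.

```python
from collections import defaultdict


def tag_app(app):
    """Return tags for one application record, or an empty dict to leave it untagged."""
    # Hypothetical rule: every app whose name starts with "etl-" belongs
    # to the same tagged workflow.
    if app.get("name", "").startswith("etl-"):
        return {"workflow": "nightly-etl"}
    return {}


def group_by_tag(apps, key="workflow"):
    """Group application ids by the value of the given tag key."""
    groups = defaultdict(list)
    for app in apps:
        value = tag_app(app).get(key)
        if value is not None:
            groups[value].append(app["id"])
    return dict(groups)


apps = [
    {"id": "a1", "name": "etl-extract"},
    {"id": "a2", "name": "adhoc-query"},
    {"id": "a3", "name": "etl-load"},
]
tagged_workflows = group_by_tag(apps)
```

Untagged applications such as `a2` are left out of every group, matching the idea that only the selected applications form the tagged workflow.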
MR insights
Improvement to Hive/MR time breakdown event.
Mark interesting events to be shown on the front end.
Get ApplicationMaster log for app performance tuning for MR jobs.
Known issues
MR jobs take considerable time to appear in the UI; on EMR the delay exceeds 15 to 25 minutes. The MR pipeline needs investigation to determine whether long SQL queries, non-indexed columns, or JCW are the cause.
A nested Oozie workflow has been running for 4 hours, and jhist files for many of its jobs are missing (source: unravel_emr_sensor.log on the EMR cluster).
The EMR bootstrap script needs to be improved or rewritten to eliminate all manual steps.