Limitations

Alerting on running apps

Apps of the following types don't support real-time alerts, that is, alerts raised while the app is running. Alerts are generated only after the app has finished, for policy violations that have already occurred.

  • Impala

  • Hive-on-Tez

Running duration versus final duration inconsistency

Unravel calculates and internally publishes the current duration of apps of the following types in real time, that is, while they are in the running state. When the app completes, Unravel receives the actual end time and calculates the final duration. This can lead to an inconsistency in which the aggregated duration published during the running state is greater than the duration published when the app completes.

  • Workflow

Missing AutoAction violation badge

The badge (AutoAlert.png) isn't displayed for the following app types.

  • Hive

  • Workflow

  • Running and failed Impala apps

  • Hive-on-Tez

Unsupported
Kill action for the following apps
  • Hive

  • Workflow

Move to Queue action for the following apps
  • Hive

  • Impala

  • Workflow

"Cloud" type setups
  • Unravel for EMR and Unravel for HDInsight

    • Kill and move actions for all types of apps.

    • Rules that span multiple clusters.

    • In multi-cluster configurations, AutoActions doesn't differentiate between the entities of each cluster and sets up a policy that targets all monitored clusters. For instance, creating a rule that targets the root queue results in that queue being monitored on all clusters.

      Workaround:

      • If the cluster ID is known, isolate the policy to that cluster using the policy options.

    • AutoActions uses the internal Hadoop cluster ID instead of the Unravel cluster ID/name. You must obtain the internal cluster ID in order to specify a Hadoop cluster in the policy options section. It can be obtained from the HDFS NameNode, where it’s stored in {dfs.namenode.name.dir}/current/VERSION.

    • In exceptionally rare cases, a transport message protocol synchronization error can cause an AutoAction to be triggered up to 180 seconds after the violation occurs. No data loss is expected.

    • Recent Events & Alerts shows the events across all clusters regardless of the currently selected cluster.

    • Application Master (AM) level metrics, such as job metrics and job counters, aren't collected by the EMR sensor by default and therefore can't be used in AutoActions policies. Collection of AM metrics can be enabled manually using the “am-polling” option in the EMR sensor.

    Note

    Prior to Unravel v4.5.2.0 Cloud Release, AutoActions aren't supported.

  • All other Cloud platforms, for example, Unravel for Azure Databricks and Qubole.

    • There is no support for AutoActions.
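
For the internal Hadoop cluster ID mentioned above, the following is a minimal sketch of how the value could be read from the NameNode's VERSION file. It assumes the file is in Java-properties style (key=value lines, with # comments) and contains a clusterID entry. The function names and the sample values are hypothetical and used only for illustration. The sketch also assumes you pass a single, already resolved local directory; dfs.namenode.name.dir may hold several comma-separated entries or file:// URIs, which this sketch does not handle.

```python
from pathlib import Path
from typing import Optional


def parse_cluster_id(version_text: str) -> Optional[str]:
    """Return the clusterID value from the text of a VERSION file, or None."""
    for raw_line in version_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and properties-style comment lines.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "clusterID":
            return value.strip()
    return None


def read_cluster_id(name_dir: str) -> Optional[str]:
    """Read <name_dir>/current/VERSION.

    name_dir is assumed to be one resolved local path taken from
    dfs.namenode.name.dir (hypothetical helper, not part of Unravel).
    """
    version_file = Path(name_dir) / "current" / "VERSION"
    return parse_cluster_id(version_file.read_text())


# Made-up sample content, for illustration only.
SAMPLE_VERSION = """#Mon Jan 01 00:00:00 UTC 2024
namespaceID=123456789
clusterID=CID-00000000-0000-0000-0000-000000000000
cTime=0
storageType=NAME_NODE
layoutVersion=-63
"""

if __name__ == "__main__":
    print(parse_cluster_id(SAMPLE_VERSION))
```

The returned value is what you would then supply in the policy options section to isolate a policy to one cluster.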

AutoActions properties

See AutoAction properties: general AutoActions properties and AutoAction daemon properties.