Alerts

alerts.png

Alerts are generated by AutoActions that you define. AutoActions automate the monitoring of your compute cluster by letting you define complex, actionable rules on cluster metrics. For example, you can use an AutoAction to:

  • Alert you to a situation needing manual intervention such as resource contention or stuck jobs.

  • Automatically kill an app or move it to a different queue.

The Unravel server processes AutoActions by:

  • Collecting various metrics from the cluster.

  • Aggregating the metrics according to user-defined triggers (rules) and scope.

  • Detecting violations.

  • Executing the defined actions for each rule violation.
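
The four processing steps above can be sketched in a few lines of Python. This is only an illustration of the collect, aggregate, detect, execute flow, not Unravel's implementation: the sample data, the function names, and the 2.0 GB threshold are all hypothetical.

```python
from collections import defaultdict

# Step 1 (collect): hypothetical metric samples as (app_id, owner, memory_gb).
samples = [
    ("app-1", "alice", 1.5),
    ("app-1", "alice", 2.5),
    ("app-2", "bob", 0.5),
]

def aggregate_peak_memory(samples):
    """Step 2 (aggregate): reduce the samples to peak memory per app."""
    peak = defaultdict(float)
    for app_id, _owner, memory_gb in samples:
        peak[app_id] = max(peak[app_id], memory_gb)
    return dict(peak)

def detect_violations(peaks, threshold_gb):
    """Step 3 (detect): return the apps whose peak exceeds the threshold."""
    return sorted(app for app, gb in peaks.items() if gb > threshold_gb)

def execute_actions(violations, action):
    """Step 4 (execute): run the action once per violating app."""
    return [action(app) for app in violations]

peaks = aggregate_peak_memory(samples)
violations = detect_violations(peaks, threshold_gb=2.0)
messages = execute_actions(violations, lambda app: f"alert: {app} exceeded 2.0 GB")
```

Here the action only builds a message string; in a real deployment the action would be whatever the AutoAction defines, such as sending an email or moving the app to another queue.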

Each rule consists of:

  • Trigger conditions: the conditions that determine when an AutoAction is violated, for example, apps using more than 2 GB of memory.

  • Scope: where the triggers apply, for example, apps owned by User X on cluster Y.

  • Actions: what to do when the AutoAction is triggered within its scope, for example, send an email to User X.
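
One way to picture how the three parts of a rule fit together is the following minimal sketch. It assumes a rule can be modeled as three callables; the `App` and `Rule` classes and their field names are hypothetical and do not reflect Unravel's actual AutoAction JSON or API.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class App:
    # Hypothetical view of an app's metrics and ownership.
    app_id: str
    owner: str
    cluster: str
    memory_gb: float

@dataclass
class Rule:
    # Hypothetical model: each part of the rule is a plain function.
    name: str
    trigger: Callable[[App], bool]   # when is the rule violated?
    scope: Callable[[App], bool]     # which apps does the rule apply to?
    action: Callable[[App], str]     # what to do on a violation

    def evaluate(self, app: App) -> Optional[str]:
        """Run the action only if the app is in scope and the trigger fires."""
        if self.scope(app) and self.trigger(app):
            return self.action(app)
        return None

# The examples from the text: more than 2 GB, owned by User X on cluster Y.
rule = Rule(
    name="high-memory",
    trigger=lambda app: app.memory_gb > 2.0,
    scope=lambda app: app.owner == "user_x" and app.cluster == "cluster_y",
    action=lambda app: f"email {app.owner}: {app.app_id} uses {app.memory_gb} GB",
)
```

Calling `rule.evaluate(App("a1", "user_x", "cluster_y", 3.0))` returns the email message, while an app owned by another user or using less memory yields `None`, because the scope is checked before the trigger.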

The Alerts page opens displaying the AutoActions created over the past 30 days. The following tabs are displayed:

AutoActions tab

Use the Created Date pull-down to set the display period. To filter the list by creator, click in the Created By text box and enter a string; the list then shows all AutoActions created by a user whose name contains that string. The left-hand side of each AutoAction's row shows its status: green for active, red for inactive. On the right-hand side, an arrow also indicates the status. Click the arrow to toggle the AutoAction's status.

In the example above, the list originally contained five inactive AutoActions. The third AutoAction was toggled to active; the green banner confirms the successful activation. When a status is toggled, the Inactive and Active counts update immediately. A banner appears whenever you take an action, for example, Copy JSON. If the action fails, the banner is red.

The tab initially displays All AutoActions. Click a status to display only Active, Inactive, or All AutoActions.

Click New from Templates to create an AutoAction using a predefined template or the expert mode. Click New AutoAction to start from an empty template.