Templates

Note

See here for limitations on AutoActions.

Whether you use Create from Template or Build Rule, you have five sections. Expert Rule is simply a text box.

The sections are:

Name and description

The name is mandatory and is used by the UI for all AutoActions' displays; we recommend using a name that reflects the AutoAction's purpose.

The description is optional, but we recommend completing it with a succinct description of the action. When users hover over the action's name, the description is displayed.

This example is from Create from Template. The name is already filled with the same name as the template and is highlighted. You can change the name.

AA-CAT-NameHigh.png
Ruleset

At least one rule type (User, Queue, Cluster, Apps, and Expert Rule) must be defined. You can define rules for each rule type using pull-down menus for a metric, type or state. The Expert Rule is available only in the Build Rule template.

A Metric rule takes the form "metric" "comparison operator" "value".

  • See Supported Cluster Metrics for a list and definition of available metrics.

  • The comparison operators: >, >=, ==, <, and <=.

  • Value: any valid numeric value. The default value is 0; if you leave it unchanged, the AutoAction triggers constantly.
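The metric rule form described above can be sketched as a small comparison function. This is only an illustration, not Unravel's implementation: the function name, the sample readings, and the 4096 threshold are hypothetical. It assumes a `>` comparison on a metric that is normally positive, which is the case where a threshold left at 0 matches every reading.

```python
import operator

# The five comparison operators listed above, mapped to Python functions.
COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
}


def metric_rule_violated(observed, comparator, threshold=0):
    """Return True when `observed <comparator> threshold` holds.

    Hypothetical helper; the default threshold of 0 mirrors the UI default.
    """
    return COMPARATORS[comparator](observed, threshold)


# Hypothetical readings of some metric, e.g. memory allocated to a queue.
readings = [512, 2048, 8192]

# Threshold left at the default 0: every positive reading violates the rule.
default_hits = [metric_rule_violated(r, ">") for r in readings]

# Threshold raised to 4096: only the last reading violates the rule.
tuned_hits = [metric_rule_violated(r, ">", 4096) for r in readings]
```

With the default threshold every reading counts as a violation, which is why the value should always be changed.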

The Type options are:

  • MapReduce, YARN, Tez, Spark, Impala, Workflow and Hive

The State options are:

  • new, new_saving, submitted, accepted, scheduled, allocated, allocatedSaving, launched, running, finishing, finished, killed, failed, undefined, newAny, allocatedAny, pending, and * (all).

Multiple rule types are evaluated in conjunction with each other using:

  • Or, And, or Same

  • Or and And work as you would expect. See Same Logical Operator for the definition of Same and its implementation.
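As a rough sketch of how And and Or combine the results of several rule types, assume each rule type has already been evaluated to a boolean. The function name and the sample results are hypothetical, and Same is deliberately left out because its definition lives in Same Logical Operator.

```python
def combine(results, operator_name):
    """Combine per-rule-type results (booleans) with And or Or."""
    if operator_name == "And":
        return all(results)
    if operator_name == "Or":
        return any(results)
    # Same is defined in "Same Logical Operator" and is not modeled here.
    raise ValueError(f"unsupported operator: {operator_name}")


# Hypothetical outcome: the Queue rule is violated, the Apps rule is not.
queue_violated, apps_violated = True, False

and_result = combine([queue_violated, apps_violated], "And")  # both must hold
or_result = combine([queue_violated, apps_violated], "Or")    # one is enough
```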

Using create from template

This template has the Ruleset predefined to fulfill the template type, for example, Rogue Contention In Queues (allocated memory). The template highlights the fields you can or should change. You can't add or delete rule types or rules. Where available, you can change the metric, comparison, type, and state using the pull-down menus. If you change the Metric, Type, or State, the AutoAction no longer performs the task the template was designed for, for example, Rogue Contention In Queues (allocated memory). The default value for the metric comparison is 0. You must change the value; otherwise the AutoAction triggers constantly. Multiple rule types are combined using Same. See Same Logical Operator for its definition and implementation.

Using build rule

The Ruleset initially lists the types of rules available: User, Queue, Cluster, App, or EXPERT RULE. Click the rule type you want to define. Below, Add Queue is selected, with the options to add rules for metric, type, and state. These options, and only these options, are available for every rule type except Expert Rule, which is a text box. You must define at least one rule for each rule type selected.

AA-20180424-RuleSetQueueSel.png

In the example below, Metric and Type are selected for the Queue rule type. Use the pull-down menus to select the metric, comparison operator, type, or state. See above for further information. A second rule, Apps, has been added. When multiple rules are selected, you must choose how they are evaluated in conjunction with each other. The default is the Same operator, but you may select Or or And. See Same Logical Operator for the definition of Same and its implementation. Or and And work as you would expect. You can choose up to two rules, e.g., user & user, expert rule & queue, etc.

AA-20180424-RuleSet-QueueRuleSetWithApp.png

Click Close to delete a rule type and click trash (Trash.png) to delete a specific rule. If you close the rule before saving the AutoActions, your settings are lost.

Options

Define the scope (User, Queue, Cluster, and Application Name), the period in which the AutoAction acts on a violation (Time), and how long/short the violation must occur before the AutoAction takes action (Sustained).

When you select an option, its default is All, except Time, which defaults to always, and Sustained Violation, which defaults to none.

When using Create from Template the required option is already checked and uses the default. Any changes you make may cause the AutoAction to not perform as expected.

Note

As of 4.5.0.7, when using Create from Template the Options section lists only the options that are available for the chosen template. For instance, the Rogue Impala templates have only the Queue, Cluster, and Application Name options.

AA-20180426-OptNoneSel.png

Check the box next to the option's name to select it.

  • You can narrow the scope of User, Queue, Cluster, and Application Name by using Only or Except. Only applies the rule to only those apps specified, while Except applies them to all but those specified. Use the Transform to specify the names using a regular expression. The example below is using the Application Name in the Except mode with the app MyApp. You can add more apps by clicking Add Application. Since no regular expression is specified, this option applies to all apps except MyApp. Create from Template defaults to All.

    AA-0180426-AppSelected.png
  • Time sets the time range and time zone during which the AutoAction can be triggered. The AutoAction remains active but doesn't trigger outside of the specified time range. The default start and end times are both the time at which you defined the AutoAction, with the time zone set to America/Los Angeles. If you don't change the default times, the AutoAction can be triggered for only one minute a day. Enter the time directly or click on the clock (clock.png) in the time box. Time is entered in 24-hour format. The end time must be later than the start time.

    AA-20180426-TimeSelected.png
  • Sustained violation specifies the length of time a violation must occur before the AutoAction is triggered. This allows time for the violator to "self correct" and decreases false positives. The default is zero, i.e., all AutoActions are triggered immediately upon violation and the specified action is carried out. You can select minimum or maximum mode. In both cases the AutoAction must be continually violated.

    • Minimum sustained mode triggers the action only if the violation was continuously detected for at least the specified period. This suppresses triggering of violation actions for “on-offs” and metric spikes. These are normal in multi-tenant cluster environments and can return to normal operation on their own. If a violation stops before the minimum time period, the clock is reset for that app. For instance, if the minimum time is one hour and the app violates the AutoAction for 58 minutes and then returns to normal, no action is taken and the time period for that app resets to 0.

    • Maximum sustained mode triggers the action only if the violation is continuously detected for less than the specified time period. This suppresses triggering for long-running apps and limits the AutoAction to ad hoc, short-lived user apps within the rule's scope.

    AA-20180426-SustainSelected.png
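The three kinds of options above (scope, Time, and Sustained) can be read as three independent checks. The sketch below is one interpretation of the text, not Unravel's code, and all function names are hypothetical. It assumes that each scope entry is treated as a regular expression matched against the whole name, that the time window includes both endpoints, and that durations are measured in minutes.

```python
import re
from datetime import time


def in_scope(name, mode, patterns):
    """Apply Only/Except narrowing to a user, queue, cluster, or app name."""
    matched = any(re.fullmatch(p, name) for p in patterns)
    if mode == "Only":
        return matched
    if mode == "Except":
        return not matched
    return True  # "All": no narrowing


def in_time_window(now, start, end):
    """True when start <= now <= end; the end must be later than the start."""
    return start <= now <= end


def sustained_ok(violation_minutes, mode, period_minutes):
    """Check how long a continuous violation has lasted against Sustained."""
    if mode == "minimum":
        return violation_minutes >= period_minutes
    if mode == "maximum":
        return violation_minutes < period_minutes
    return True  # none: trigger immediately on violation


# Except mode with the single entry MyApp, as in the example above.
myapp_in_scope = in_scope("MyApp", "Except", ["MyApp"])
other_in_scope = in_scope("OtherApp", "Except", ["MyApp"])

# The 58-minute violation against a one-hour minimum does not trigger.
short_violation = sustained_ok(58, "minimum", 60)
```

Under these assumptions an AutoAction would act only when all three checks pass for the violating app.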
Actions

Defines the actions to take when the AutoAction is triggered.

Note

As of 4.5.0.7 when using Create from Template the actions section lists only options that are available for the chosen template. For instance, Impala and Workflow templates don't have the option to Move the app or workflow.

With both Build Rule and Create from Template, except for Impala queries, you can specify the following actions: Send an Email, HTTP Post, Post to Slack, Move App to Queue, and Kill App. Use Build Rule to enter an Expert Action. See Expert Mode for information on defining an action in JSON. See AutoActions and Pagerduty for using Unravel's API to receive notifications via Pagerduty. You can't kill a Hive, Impala, or Workflow app.

You can choose one or more actions. Check the box to choose an action. If you choose no actions, the UI simply records the violation and saves the data for the cluster view. Shown below are all the possible actions; in Create from Template only actions valid for the template are available.

AA-Actions.png
  • For Send Email you must enter at least one recipient. Add more recipients by clicking Add Recipient. You can also include the owner of the app by selecting the Include Owner radio button.

    AA-20180424-ActionsExpMail.png

    Note

    If you need to send an email notification to the owner of the application who is an LDAP user, configure the additional LDAP properties.

  • For HTTP post you must enter at least one URL. You can add more URLs by clicking Add URL.

    AA-20180424-ActionsExpHttp.png
  • Post to Slack: Unravel provides integration with SlackApp, allowing you to post information to one or more Slack channels and users. To use this feature you need the following from Slack:

    • Webhook URL: The incoming webhook URL configured in Slack for the channel or user. You can post to multiple webhooks.

    • Token: The OAuth access token for the SlackApp.

    See Slack's Incoming Webhooks for further information on creating/obtaining the above.

    AA-post-to-slack.png
  • Move App to Queue or Kill App. For Move App to Queue, you must enter the queue to move the app to.

    AA-20180424-ActionsExpMovKil.png

    Warning

    Move App and Kill App are mutually exclusive. If you select both, Kill App takes precedence and Move App is ignored. For either action to be executed, the app must:

    • Have directly caused the rule violation, and

    • Have allocated resources, that is, the app is in the allocated or running state.

    Move App is a non-destructive action that shouldn't affect the cluster performance and its availability to the user; however, we suggest using it with caution.

    Kill App is a destructive action. It can affect the cluster's performance and its availability to users. This option is primarily for killing rogue apps that are causing contention for cluster resources.

  • Use Build Rule to enter an action using JSON. See Expert Rules for examples.

    AA-20180424-ActionsExpExerpt.pn
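The Move App and Kill App rules in the warning above can be sketched as a small decision function. The function name, its parameters, and the string return values are hypothetical; the sketch only encodes what the warning states: Kill App wins over Move App, and either action applies only to an app that directly caused the violation and is in the allocated or running state.

```python
def resolve_app_action(move_selected, kill_selected, caused_violation, app_state):
    """Return "kill", "move", or None for a single app."""
    # Both actions require an app that caused the violation and holds resources.
    if not caused_violation or app_state not in ("allocated", "running"):
        return None
    if kill_selected:
        return "kill"  # Kill App takes precedence; Move App is ignored
    if move_selected:
        return "move"
    return None


# Both boxes checked for a running app that caused the violation.
both_selected = resolve_app_action(True, True, True, "running")

# Same selection, but the app is still in the accepted state.
not_allocated = resolve_app_action(True, True, True, "accepted")
```

In this sketch, an app that has not yet been allocated resources is left untouched even when both actions are selected.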