SAME logical operator

SAME: logically ANDs the rules, with the added constraint that the rules must be violated within the same scope to trigger an AutoAction.

Example: a rule designed to alert on rogue users.
Human-readable form

If any user runs more than 10 jobs on a cluster and the same user has more than 5 jobs pending, then report that user as rogue.

More formally

(any user has > 10 running apps) SAME (any user has > 5 pending apps)

JSON definition
"rules":[
   "SAME":[
      {
         "scope":"users",
         "metric":"appCount",
         "operator":">",
         "value":10,
         "state":"running"
      },
      {
         "scope":"users",
         "metric":"appCount",
         "operator":">",
         "value":5,
         "state":"pending"
      }
   ]
]
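
To make the two leaf rules concrete, here is a minimal Python sketch of how one such definition could be checked against a single user's counts. The `COMPARATORS` table, the `leaf_matches` helper, and the shape of `app_counts` are assumptions made for illustration; they are not part of the AutoActions back end.

```python
import operator

# Hypothetical mapping from the "operator" string in a rule definition to a
# Python comparison function. Only ">" appears in the example above.
COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}

# The two leaf rules from the definition above, written as Python dicts.
same_rules = [
    {"scope": "users", "metric": "appCount", "operator": ">", "value": 10, "state": "running"},
    {"scope": "users", "metric": "appCount", "operator": ">", "value": 5, "state": "pending"},
]


def leaf_matches(rule, app_counts):
    """Return True if the count for the rule's state violates the rule.

    app_counts is an assumed shape: {"running": int, "pending": int}.
    """
    compare = COMPARATORS[rule["operator"]]
    return compare(app_counts[rule["state"]], rule["value"])


# A user with 21 running and 11 pending apps violates both leaf rules.
print([leaf_matches(rule, {"running": 21, "pending": 11}) for rule in same_rules])
# [True, True]
```
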
Implementation

Internally, the back end uses a clustering technique to implement the SAME operator. AutoActions runs all metric aggregations simultaneously. Once the metrics are received and aggregated, it evaluates all rules and expressions, starting at the leaf expressions of the evaluation tree and working up to the root expression.

Assume the above rule, three users (A, B, and C), and the following conditions:

  • User A has 12 running and 3 pending apps

  • User B has 7 running and 1 pending app

  • User C has 21 running and 11 pending apps

First, the two simple rules are evaluated:

  • Does the user have more than 10 apps running?

    • User A has 12 → TRUE

    • User B has 7 → FALSE

    • User C has 21 → TRUE

  • Does the user have more than 5 apps pending?

    • User A has 3 → FALSE

    • User B has 1 → FALSE

    • User C has 11 → TRUE
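
This first step can be sketched in a few lines of Python. The `app_counts` structure and the variable names are assumptions chosen for the illustration; the real back end works on aggregated metrics rather than a hard-coded table.

```python
# App counts from the example: user -> (running, pending).
app_counts = {"A": (12, 3), "B": (7, 1), "C": (21, 11)}

RUNNING_LIMIT = 10
PENDING_LIMIT = 5

# One result table per simple rule, keyed by user.
running_rule = {
    user: running > RUNNING_LIMIT for user, (running, _) in app_counts.items()
}
pending_rule = {
    user: pending > PENDING_LIMIT for user, (_, pending) in app_counts.items()
}

print(running_rule)  # {'A': True, 'B': False, 'C': True}
print(pending_rule)  # {'A': False, 'B': False, 'C': True}
```
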

Second, the back end clusters the results by scope and, for each cluster, counts the number of rules triggered. In the back-end code, this procedure is called the “linking” of rules (see Ruleset.java).

  • Cluster “User A”, link count = 1.

    • User A > 10 running apps? → TRUE

    • User A > 5 pending apps? → FALSE

  • Cluster “User B”, link count = 0.

    • User B > 10 running apps? → FALSE

    • User B > 5 pending apps? → FALSE

  • Cluster “User C”, link count = 2.

    • User C > 10 running apps? → TRUE

    • User C > 5 pending apps? → TRUE
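
The link counting can be illustrated with a small Python sketch. It is not taken from Ruleset.java; the `leaf_results` layout and the `link_counts` function are hypothetical names used only to show the grouping idea.

```python
# Leaf results from the first step: rule label -> {user: triggered?}.
leaf_results = {
    "running > 10": {"A": True, "B": False, "C": True},
    "pending > 5": {"A": False, "B": False, "C": True},
}


def link_counts(results):
    """Group results by scope value (here, the user) and count triggered rules."""
    counts = {}
    for per_user in results.values():
        for user, triggered in per_user.items():
            counts[user] = counts.get(user, 0) + int(triggered)
    return counts


print(link_counts(leaf_results))  # {'A': 1, 'B': 0, 'C': 2}
```
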

Third, all clusters with fewer than the required number of links (2 in this case) are discarded. If any rules were triggered within a discarded cluster, those rules are reset for that cluster.

  • Cluster “User A”, link count = 1, so it's reset and discarded.

    • User A > 10 running apps? → TRUE reset to FALSE

    • User A > 5 pending apps? → FALSE

  • Cluster “User B”, link count = 0, so it's discarded.

    • User B > 10 running apps? → FALSE

    • User B > 5 pending apps? → FALSE

Finally, only the users that have triggered all rules remain.

  • Cluster “User C”, link count = 2:

    • User C > 10 running apps? → TRUE

    • User C > 5 pending apps? → TRUE

User C meets the criteria for the Rogue User AutoAction. Therefore, User C triggers the AutoAction, and the alert is sent, the actions are performed, or both.
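
The discard-and-reset step, and the resulting set of users, can be sketched as follows. This is a simplified illustration under the assumption that every rule under SAME must be triggered for a cluster to survive; `apply_same` and its data layout are hypothetical and do not mirror the actual back-end code.

```python
def apply_same(leaf_results):
    """Keep only clusters that triggered every rule; reset all others to False.

    leaf_results maps a rule label to {user: triggered?}.
    Returns (linked_users, reset_results).
    """
    required_links = len(leaf_results)

    # Collect every scope value (user) seen by any rule.
    users = set()
    for per_user in leaf_results.values():
        users.update(per_user)

    # A cluster survives only if its link count equals the number of rules.
    linked = {
        user
        for user in users
        if sum(per_user.get(user, False) for per_user in leaf_results.values())
        == required_links
    }

    # Triggered rules in discarded clusters are reset to False.
    reset_results = {
        rule: {user: triggered and user in linked for user, triggered in per_user.items()}
        for rule, per_user in leaf_results.items()
    }
    return linked, reset_results


leaf_results = {
    "running > 10": {"A": True, "B": False, "C": True},
    "pending > 5": {"A": False, "B": False, "C": True},
}

linked, reset_results = apply_same(leaf_results)
print(sorted(linked))                 # ['C']
print(reset_results["running > 10"])  # {'A': False, 'B': False, 'C': True}
```
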

Comparison to AND

Had AND been used instead of SAME, that is, (any user has > 10 running apps) AND (any user has > 5 pending apps), both User A and User C would have triggered the above rule, because AND only requires each condition to be met by some user, not necessarily the same one.

To achieve the same result as the above example using AND instead of SAME, you would need to create the following AutoAction rule for every user on the cluster:

  • (Username has > 10 running apps) AND (Username has > 5 pending apps)
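
The difference between the two operators can be shown with set operations in Python. This sketch assumes, following the comparison above, that under AND every user who violates either condition is reported once both conditions hold for some user; the variable names are illustrative only.

```python
# App counts from the example: user -> (running, pending).
app_counts = {"A": (12, 3), "B": (7, 1), "C": (21, 11)}

over_running = {user for user, (running, _) in app_counts.items() if running > 10}
over_pending = {user for user, (_, pending) in app_counts.items() if pending > 5}

# AND (assumed semantics): each condition only has to hold for some user,
# so every violator of either condition is reported when both hold.
and_fires = bool(over_running) and bool(over_pending)
and_reported = (over_running | over_pending) if and_fires else set()

# SAME: both conditions must hold for the same user.
same_reported = over_running & over_pending

print(sorted(and_reported))   # ['A', 'C']
print(sorted(same_reported))  # ['C']

# Emulating SAME with AND means writing one rule per user.
per_user_rules = [
    f"({user} has > 10 running apps) AND ({user} has > 5 pending apps)"
    for user in sorted(app_counts)
]
print(len(per_user_rules))  # 3
```

With three users this is only three rules, but the list grows with every user added to the cluster, which is the maintenance cost SAME avoids.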