Detecting Resource Contention in the Cluster
When many applications run on a cluster, they compete for its resources. Unravel Web UI helps you detect and resolve this resource contention.
Click Operations | Usage Details | Resources.
In the Cluster VCores or Cluster MemoryMB graph, click a spike.
At the bottom of the page, Unravel Web UI lists the applications that were running or pending in the cluster at the spike's timestamp. Many applications in the ACCEPTED state (rather than RUNNING) indicate that they are waiting for resources. For example, the screenshot below shows only one Spark application in the RUNNING state (consuming resources) and four MR applications in the ACCEPTED state (waiting for resources). You can now take steps to resolve the problem.
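The check described above (comparing ACCEPTED applications against RUNNING ones) can be sketched in a few lines of Python. This is only an illustration of the reasoning, not Unravel's implementation or API: the SAMPLE_APPS records, the field names, the function names, and the 0.5 waiting-ratio cutoff are all hypothetical and chosen to mirror the example of one Spark application RUNNING and four MR applications ACCEPTED.

```python
from collections import Counter

# Hypothetical snapshot of the application list at a spike's timestamp.
# Field names and IDs are illustrative, not an Unravel data format.
SAMPLE_APPS = [
    {"id": "app-1", "type": "SPARK", "state": "RUNNING"},
    {"id": "app-2", "type": "MR", "state": "ACCEPTED"},
    {"id": "app-3", "type": "MR", "state": "ACCEPTED"},
    {"id": "app-4", "type": "MR", "state": "ACCEPTED"},
    {"id": "app-5", "type": "MR", "state": "ACCEPTED"},
]


def summarize_states(apps):
    """Count applications per state (for example RUNNING or ACCEPTED)."""
    return Counter(app["state"] for app in apps)


def looks_contended(apps, max_waiting_ratio=0.5):
    """Return True when the share of ACCEPTED (waiting) applications among
    ACCEPTED and RUNNING ones exceeds max_waiting_ratio (an assumed cutoff)."""
    counts = summarize_states(apps)
    waiting = counts["ACCEPTED"]
    total = waiting + counts["RUNNING"]
    return total > 0 and waiting / total > max_waiting_ratio


if __name__ == "__main__":
    print(dict(summarize_states(SAMPLE_APPS)))
    print(looks_contended(SAMPLE_APPS))
```

With the sample data, four of the five applications are waiting, so the sketch flags the snapshot as contended. The cutoff is arbitrary; a suitable value depends on the workload.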
Setting Up Auto Actions (Alerts)
To define an action for Unravel to execute automatically when it detects resource contention in the cluster:
Click Manage | Auto Actions.
Select Resource Contention in Cluster.
Specify the rules for triggering this auto action, such as a memory threshold or a job count threshold.
Select the Send Email check box.
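To make the idea of a threshold-based trigger concrete, the following Python sketch evaluates a rule with a memory threshold and a job count threshold. It is an assumption-laden illustration, not Unravel's rule engine: the ContentionRule class, its field names, the sample values, and the choice to require both thresholds to be met are hypothetical. How Unravel actually combines the conditions is defined in the Auto Actions page itself.

```python
from dataclasses import dataclass


@dataclass
class ContentionRule:
    """Hypothetical rule holding two thresholds of the kind named in the steps above."""
    memory_threshold_mb: int
    job_count_threshold: int


def should_trigger(rule, allocated_memory_mb, job_count):
    """Return True when both thresholds are met or exceeded.

    Combining the conditions with AND is an assumption made for this sketch.
    """
    return (
        allocated_memory_mb >= rule.memory_threshold_mb
        and job_count >= rule.job_count_threshold
    )


if __name__ == "__main__":
    # Illustrative values only.
    rule = ContentionRule(memory_threshold_mb=100_000, job_count_threshold=5)
    print(should_trigger(rule, allocated_memory_mb=120_000, job_count=6))
    print(should_trigger(rule, allocated_memory_mb=120_000, job_count=2))
```

In this sketch, the first call meets both thresholds and the second does not, because only two jobs are counted.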