Identifying Rogue Applications
Rogue applications can degrade cluster health and lead to missed SLAs, so it is best practice to identify and eliminate them. Symptoms of a cluster with rogue applications include jobs that take too long to run and applications that use too many vcores. Unravel Web UI makes it easy to identify rogue applications:
In the Cluster VCores or Cluster MemoryMB graph, click on a spike.
At the bottom of the page, Unravel Web UI lists the applications that were running or pending in the cluster at the spike's timestamp.
Click the application that has been allocated the most vcores. In this example, a MapReduce application has been allocated 240 of the cluster's vcores.
Check the event panel in the application's APM to see Unravel's recommendations for improving the efficiency of this MapReduce application.
Set up an AutoAction to alert you proactively when a rogue application is occupying the cluster.
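The selection logic in the steps above (rank applications by allocated vcores, then flag any that hold an outsized share of the cluster) can be sketched in a few lines of Python. This is only an illustration, not Unravel's implementation or API. It assumes application records shaped like those returned by the YARN ResourceManager REST API (the id, applicationType, and allocatedVCores fields). The sample records, the 400-vcore cluster size, the 50% threshold, and the function names rank_by_vcores and flag_rogue are all hypothetical.

```python
from typing import Dict, List


def rank_by_vcores(apps: List[Dict]) -> List[Dict]:
    """Return the applications sorted by allocated vcores, largest first."""
    return sorted(apps, key=lambda app: app.get("allocatedVCores", 0), reverse=True)


def flag_rogue(apps: List[Dict], total_vcores: int, max_share: float = 0.5) -> List[Dict]:
    """Return applications holding more than max_share of the cluster's vcores.

    max_share is an illustrative threshold (an assumption), not an Unravel default.
    """
    limit = total_vcores * max_share
    return [app for app in apps if app.get("allocatedVCores", 0) > limit]


# Hypothetical snapshot of applications running at the spike's timestamp.
SAMPLE_APPS = [
    {"id": "application_0002", "applicationType": "SPARK", "allocatedVCores": 16},
    {"id": "application_0001", "applicationType": "MAPREDUCE", "allocatedVCores": 240},
    {"id": "application_0003", "applicationType": "TEZ", "allocatedVCores": 8},
]
TOTAL_VCORES = 400  # hypothetical cluster capacity

top = rank_by_vcores(SAMPLE_APPS)[0]
rogue = flag_rogue(SAMPLE_APPS, TOTAL_VCORES)

print(f"Top consumer: {top['id']} ({top['applicationType']}, {top['allocatedVCores']} vcores)")
print("Flagged as possibly rogue:", [app["id"] for app in rogue])
```

A fixed share of cluster capacity is a deliberately simple rule. In practice, the right threshold depends on the workload and on how queues are configured.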