Overview

By default, the page opens on the Overview tab. This tab displays KPIs and graphs for a single selected cluster, aggregated over the selected time period. Select a cluster and a time period to view the overview of that cluster for that period.

The graphs plot the KPIs and other metrics for the selected cluster over the selected time range. If a graph plots multiple items, select or clear the corresponding checkboxes to show or hide each item.

The Overview page displays the following tiles:

Cluster

The KPIs in the Cluster tile show the current value and change based on the selected time range. The following KPIs and graphs are shown in this tile:

cluster-overview-cluster.png
  • Nodes: Shows the total number of nodes in the selected cluster, that is, the current number of active nodes plus the number of bad nodes. The Nodes graph plots the active and bad nodes in the cluster over the specified period.

    cluster-overview-cluster-node.png
  • Available VCores: Shows the current number of available VCores in the selected cluster. The VCores graph plots the available and allocated VCores in the cluster over the specified period.

    cluster-overview-cluster-VCores.png
  • Available Memory: Shows the current amount of available memory in the selected cluster. The Memory graph plots the available and allocated memory in the cluster over the specified period.

    cluster-overview-cluster-memory.png
  • Alerts: Shows the count of alert notifications. The list of alert notifications is displayed in the right panel.

    Note

    The Alerts KPI is displayed only if the count is greater than zero.

Jobs

The Jobs tile displays the KPIs for jobs run by the various application types in the selected cluster over the selected period.

cluster-overview-jobs.png

The following KPIs and graphs are shown in this tile:

  • Running: The current number of running jobs (apps). The Running Jobs chart plots the running and accepted (pending) jobs over the selected period.

    cluster-overview-jobs-running.png
  • Pending: The current number of pending jobs. The Running Jobs chart graphs this metric.

  • Success/Failed/Killed: Displays the number of successful, failed, and killed jobs for the selected period. The By Status graph plots all three statuses.

    cluster-overview-jobs-by-status.png
  • Inefficient Events: Shows the number of events that occurred in the cluster during the specified period because of inefficiencies in job runs. Check Jobs > Inefficient Apps for the list of jobs (apps) experiencing these events.

    Note

    Jobs can experience multiple events; therefore, the number of jobs listed in the Jobs > Inefficient Apps tab is typically less than the number of events.

    cluster-overview-jobs-inefficientevents.png
  • By Type: Displays the count of jobs run in the specified time period, by application type. The counts are shown as pie chart segments and are also plotted in a graph.

    cluster-overview-jobs-bytype.png
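
The note under Inefficient Events can be illustrated with a minimal Python sketch. It assumes a hypothetical list of (job ID, event name) records; the record layout, job IDs, event names, and the summarize helper are all invented for illustration and do not represent an actual export format or API of the product. The sketch only shows why the event count is at least as large as the count of affected jobs.

```python
from collections import Counter

# Hypothetical event records as (job_id, event_name) pairs.
# The IDs and names are illustrative assumptions, not real product data.
events = [
    ("job-1", "event-a"),
    ("job-1", "event-b"),
    ("job-2", "event-a"),
    ("job-3", "event-c"),
    ("job-3", "event-a"),
]


def summarize(records):
    """Return (total number of events, number of distinct jobs with an event)."""
    events_per_job = Counter(job_id for job_id, _ in records)
    return sum(events_per_job.values()), len(events_per_job)


event_count, job_count = summarize(events)
print(f"events: {event_count}, jobs with events: {job_count}")
```

Because job-1 and job-3 each have two events in this made-up data, the event total is higher than the number of jobs that would appear in a per-job list.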
Alerts

Lists all alerts and events triggered by jobs in the selected time range. Click CyanWhitePlus.png to create a new AutoAction.

cluster-overview-alerts.png