Smart Spark Select

The Smart Spark Select app helps users optimize Spark application performance through intelligent resource recommendations and actionable insights. Designed to analyze memory usage and application metrics, this app identifies opportunities to enhance resource efficiency and reduce costs.

The app consists of two tabs:

  • Spark Select: Focused on delivering recommendations for optimizing driver and executor memory. It includes a settings section where users can configure thresholds to suit their specific needs.

  • Parametric Dashboard: Offers detailed application metrics such as vCore allocation and succeeded task analysis.

Installing and opening the Smart Spark Select app

Refer to Installing and opening apps in Unravel.

Spark Select
How we recommend resources

Executor Memory

  • The app first checks whether Unravel already provides a recommendation for executor memory. If one is available, the app recommends that value. Otherwise, it calculates a value from observed usage.

  • If (max executor usage + 20%) is less than the allocated executor memory, the app recommends (max executor usage + 20%). Otherwise, no recommendation is made.

Driver Memory

  • The app first checks whether Unravel already provides a recommendation for driver memory. If one is available, the app recommends that value. Otherwise, it calculates a value from observed usage.

  • If (max driver usage + 20%) is less than the allocated driver memory, the app recommends (max driver usage + 20%). Otherwise, no recommendation is made.
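
The rule described above for executor and driver memory can be sketched in a few lines of Python. This is an illustration only, not the app's actual implementation: the function name `recommend_memory`, its parameters, and the use of megabytes are hypothetical, and "+ 20%" is assumed to mean 20% of the maximum observed usage.

```python
from typing import Optional

# Assumption: "+ 20%" means a buffer of 20% of the peak observed usage.
HEADROOM = 0.20


def recommend_memory(
    allocated_mb: float,
    max_used_mb: float,
    unravel_recommendation_mb: Optional[float] = None,
) -> Optional[float]:
    """Return a recommended memory size in MB, or None for no recommendation.

    Hypothetical helper: the same rule is applied to executors and drivers.
    """
    # Step 1: an existing Unravel recommendation takes precedence.
    if unravel_recommendation_mb is not None:
        return unravel_recommendation_mb

    # Step 2: otherwise derive a candidate from peak usage plus the buffer.
    candidate = max_used_mb * (1 + HEADROOM)

    # Recommend the candidate only when it is below the current allocation.
    if candidate < allocated_mb:
        return candidate

    # The allocation is already at or below the candidate: no recommendation.
    return None
```

Under these assumptions, a job with 8192 MB allocated and a peak usage of 4000 MB would get a candidate of about 4800 MB, while a job with 4096 MB allocated and the same peak usage would get no recommendation because the buffered value exceeds the allocation.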

Analyzing the Spark Select dashboard

Under Input Options, select the required Data Range, Queue, Users, and Recommendation, and then click Submit. The Spark Select data is displayed in a table with the following fields:

| Field | Description |
| --- | --- |
| Queue | The queue in which the Spark job is running. |
| Allocated Executor Memory | The amount of memory allocated for Spark executors. |
| Max Executor Memory Used | The maximum memory utilized by Spark executors during the job. |
| Recommended Executor Memory | The memory recommended for Spark executors based on Unravel's calculations. |
| Executor Memory Used | The actual memory used by Spark executors during the job. |
| Executor Save Memory | The memory savings achieved for executors based on recommendations. |
| Max Driver Memory Used | The maximum memory utilized by the Spark driver during the job. |
| Recommended Driver Memory | The memory recommended for the Spark driver based on Unravel's calculations. |
| Duration | The total runtime duration of the Spark job. |
| Status | The current status of the Spark job. |
| Recommendation | Specific recommendations for optimizing Spark job resource usage. |

Parametric Dashboard

Under Input Options, select the required Data Range, Kind, Parameters, and Value, and then click Submit. The parametric data for the selected options is displayed in a table.

Note

Two options are available for Kind: Vcores and Number of succeeded tasks.

  • The Vcores report is available for all app kinds.

  • The Number of succeeded tasks report is available only for Tez.

The following fields are available in the Vcores table:

| Field | Description |
| --- | --- |
| App ID | The unique identifier for the application. |
| User | The username associated with the Spark job. |
| Queue | The queue in which the Spark job is running. |
| Cluster UID | The unique identifier of the cluster where the job is executed. |
| Allocated Vcores | The number of virtual cores allocated to the application. |
| Kind | The type of resource allocation (Vcores or Tez). |
| Start Time | The timestamp indicating when the application started. |
| End Time | The timestamp indicating when the application ended. |
| Duration | The total runtime of the application. |
| Status | The current status of the application. |

[Image: Vcores.png]

The following fields are available in the Tez table:

| Field | Description |
| --- | --- |
| App ID | The unique identifier for the Tez application. |
| DAG ID | The identifier for the DAG associated with the application. |
| Cluster UID | The unique identifier for the cluster where the Tez application is running. |
| Total Launched Tasks | The total number of tasks launched by the Tez application. |
| Number of Succeeded Tasks | The count of tasks that completed successfully within the Tez application. |
| Tag | A tag (Greater or Lesser) that shows how the application compares with the value entered in the Input Value field. |
| Start Time | The timestamp indicating when the Tez application started. |
| End Time | The timestamp indicating when the Tez application ended. |
| Duration | The total runtime of the Tez application. |
| Status | The current state of the Tez application. |

[Image: Tez.png]