Optimizing the Performance of Spark Applications

Unravel Web UI makes it easy to identify underperforming Spark apps: click Operations | Dashboard and scroll down to INEFFICIENT APPLICATIONS.

The following case illustrates the performance of a Spark app before and after tuning it based on Unravel's performance analysis:

Before Tuning

Before tuning, Unravel Web UI indicated that this app had a running duration of 34 min 11 sec:

gui_spark_app_details_before.png

In addition, Unravel Web UI reported the following events:

  • LOW UTILIZATION OF MEMORY RESOURCES

  • LOW UTILIZATION OF SPARK STORAGE MEMORY

  • CONTENTION FOR CPU RESOURCES

  • OPPORTUNITY FOR RDD CACHING

    • Save up to 9 minutes by caching at PetFoodAnalysisCaching.scala:129, with StorageLevel.MEMORY_AND_DISK_SER

  • TOO FEW PARTITIONS W.R.T. AVAILABLE PARALLELISM

    • Change executor instances from 2 to 127, partitions from 2 to 289, and adjust driver memory (to 1161908224) and YARN overhead (to 819 MB).
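
One way to apply the sizing recommendation is to pass it as configuration properties when the app is resubmitted. The following is a minimal sketch, not Unravel's own tooling. It assumes the app runs on YARN, that the driver memory value is in bytes, and that "partitions" maps to spark.default.parallelism (it could equally mean a repartition call in the app code). The jar name is hypothetical. The YARN overhead setting is left out because its property name depends on the Spark version.

```python
import math
import shlex
from typing import List

# Values copied from the Unravel recommendation.
EXECUTOR_INSTANCES = 127
PARTITIONS = 289
DRIVER_MEMORY_BYTES = 1161908224  # assumption: the unit is bytes


def to_mebibytes(num_bytes: int) -> str:
    """Round a byte count up to whole MiB and format it as a size string such as '512m'."""
    return f"{math.ceil(num_bytes / (1024 * 1024))}m"


def build_submit_command(app_jar: str) -> List[str]:
    """Build a spark-submit argument list that carries the recommended settings."""
    conf = {
        "spark.executor.instances": str(EXECUTOR_INSTANCES),
        # Assumption: "partitions" is applied through the default parallelism.
        "spark.default.parallelism": str(PARTITIONS),
        "spark.driver.memory": to_mebibytes(DRIVER_MEMORY_BYTES),
    }
    command = ["spark-submit", "--master", "yarn"]
    for key, value in conf.items():
        command += ["--conf", f"{key}={value}"]
    command.append(app_jar)
    return command


if __name__ == "__main__":
    # "pet-food-analysis.jar" is a hypothetical artifact name.
    print(shlex.join(build_submit_command("pet-food-analysis.jar")))
```

Building the command as a list rather than a single string keeps each argument separate, so it can be handed to a process launcher without extra quoting.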

After Tuning

After tuning, Unravel Web UI indicates that this app now has a running duration of 1 min 19 sec:

gui_spark_app_details_after.png
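
To put the two durations in perspective, the improvement can be worked out directly from the reported values. This short sketch only does the arithmetic; the helper name to_seconds is introduced here for illustration.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds duration to total seconds."""
    return minutes * 60 + seconds


before = to_seconds(34, 11)  # running duration reported before tuning
after = to_seconds(1, 19)    # running duration reported after tuning

speedup = before / after
reduction_pct = (before - after) / before * 100

print(f"before: {before} s, after: {after} s")
# prints "before: 2051 s, after: 79 s"
print(f"speedup: {speedup:.1f}x, time saved: {reduction_pct:.1f}%")
# prints "speedup: 26.0x, time saved: 96.1%"
```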

Unravel Web UI displays these events:

  • LOW UTILIZATION OF MEMORY RESOURCES

  • LOW UTILIZATION OF SPARK STORAGE MEMORY

  • LARGE IDLE TIME FOR EXECUTORS

  • TOO FEW PARTITIONS W.R.T. AVAILABLE PARALLELISM

    • Change executor instances from 127 to 138