Part 2: Connecting Unravel to a Databricks cluster

Using the Azure Databricks UI or the clusters API, connect Unravel to each Databricks job or cluster that you want Unravel to monitor:

  1. Add a cluster node initialization script:

    • On the cluster configuration page, click Advanced Options.

    • At the bottom of the page, click the Init Scripts tab.

    • In the Destination drop-down, select a destination type.

    • Specify the path to the initialization script: dbfs:/databricks/unravel/unravel-db-sensor-archive/dbin/install-unravel.sh

    • Click Add.
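    If you configure the cluster through the clusters API rather than the UI, the init script step above roughly corresponds to an init_scripts entry in the cluster specification. The following is a minimal sketch, assuming the init_scripts field accepts a dbfs destination in your API version; the helper name init_scripts_fragment is hypothetical, and you should check the field names against the API reference for your workspace.

```python
import json

# Path taken from the step above.
INIT_SCRIPT_PATH = (
    "dbfs:/databricks/unravel/unravel-db-sensor-archive/dbin/install-unravel.sh"
)


def init_scripts_fragment(path):
    """Return the init_scripts portion of a cluster specification (hypothetical helper)."""
    if not path.startswith("dbfs:/"):
        raise ValueError("expected a dbfs:/ path, got %r" % path)
    # Assumed shape: one entry per script, keyed by destination type.
    return {"init_scripts": [{"dbfs": {"destination": path}}]}


fragment = init_scripts_fragment(INIT_SCRIPT_PATH)
print(json.dumps(fragment, indent=2))
```

    Merge the printed fragment into the rest of your cluster specification before you send it to the API.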

  2. Configure Spark:

    • On the cluster configuration page, click Advanced Options.

    • At the bottom of the page, click the Spark tab.

    • Paste the snippet generated by databricks_setup.sh into the text box.

    • Click Add.
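    When you use the clusters API instead of the UI, the same settings are typically supplied as a spark_conf map rather than pasted text. The sketch below assumes the snippet consists of one setting per line, with the key separated from the value by whitespace. The sample lines are placeholders, not the actual output of databricks_setup.sh, and parse_spark_snippet is a hypothetical helper.

```python
import json


def parse_spark_snippet(text):
    """Convert 'key value' lines into a dict suitable for a spark_conf field."""
    conf = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        # Split only at the first run of whitespace so values may contain spaces.
        parts = line.split(None, 1)
        conf[parts[0]] = parts[1] if len(parts) > 1 else ""
    return conf


# Placeholder lines only; paste the real databricks_setup.sh output here.
SAMPLE_SNIPPET = """
spark.eventLog.enabled true
spark.executor.extraJavaOptions -Dexample.flag=1 -Dexample.other=2
"""

spark_conf = parse_spark_snippet(SAMPLE_SNIPPET)
print(json.dumps({"spark_conf": spark_conf}, indent=2))
```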

    Note

    If you submit Spark jobs through spark-submit, you can't configure Spark on the Spark tab.

    Instead, pass the settings from the snippet generated by databricks_setup.sh as spark-submit parameters.

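    As an illustration of the note above, each Spark setting can be expressed as a --conf key=value pair in the spark-submit parameter list. This is a sketch only: the settings, class name, and JAR path are placeholders, to_spark_submit_params is a hypothetical helper, and the real values must come from the snippet that databricks_setup.sh generates.

```python
def to_spark_submit_params(conf):
    """Expand a dict of Spark settings into ['--conf', 'key=value', ...]."""
    params = []
    for key, value in conf.items():
        params.extend(["--conf", "%s=%s" % (key, value)])
    return params


# Placeholder settings; replace with the generated snippet's settings.
example_conf = {
    "spark.eventLog.enabled": "true",
    "spark.executor.extraJavaOptions": "-Dexample.flag=1",
}

# Hypothetical application class and JAR location.
parameters = to_spark_submit_params(example_conf) + [
    "--class",
    "com.example.Main",
    "dbfs:/path/to/app.jar",
]
print(parameters)
```

    Keeping each option and its value as separate list items avoids shell-quoting problems when a value contains spaces.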
  3. Connect to cluster logs:

    To configure the log delivery location:

    • On the cluster configuration page, click Advanced Options.

    • At the bottom of the page, click the Logging tab.

    • In the Destination drop-down, select DBFS.

    • Enter the cluster log path.

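    For clusters created through the clusters API, the log delivery location is typically set with a cluster_log_conf entry. The sketch below assumes a dbfs destination. The path dbfs:/cluster-logs is a placeholder rather than a value from this guide, and cluster_log_conf_fragment is a hypothetical helper.

```python
import json


def cluster_log_conf_fragment(log_path):
    """Return the cluster_log_conf portion of a cluster specification."""
    if not log_path.startswith("dbfs:/"):
        raise ValueError("expected a dbfs:/ path, got %r" % log_path)
    return {"cluster_log_conf": {"dbfs": {"destination": log_path}}}


# Placeholder path; use the cluster log path chosen for your deployment.
LOG_PATH = "dbfs:/cluster-logs"

log_fragment = cluster_log_conf_fragment(LOG_PATH)
print(json.dumps(log_fragment, indent=2))
```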
Next steps

Enable additional instrumentation and configure optional settings.