
How to connect your Databricks workspace

  1. Configure the Databricks cluster with Unravel. Review our Cluster Setup Guide and follow the instructions.

  2. On the Unravel UI, enter the workspace details.

    1. Click Manage tab > Workspaces.

    2. Click Add Workspace button. The Add Workspace dialog is displayed.

    3. In the Add Workspace dialog, enter your workspace details.

      Workspace Id: The Databricks workspace ID.

      Workspace Name: The Databricks workspace name.

      Instance (Region) URL: The regional URL where the Databricks workspace is deployed.

      Tier: The subscription option, Standard or Premium.

      Token: A personal access token used to authenticate to the Databricks REST APIs. See Authentication using Databricks personal access tokens for instructions on creating one.

    4. Click Add.
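If you prefer to create the personal access token programmatically rather than through the Databricks UI, the Databricks Token API (`POST /api/2.0/token/create`) can mint one. Below is a minimal, standard-library-only sketch that builds the request; the instance URL, the existing credential used to authorize the call, and the 90-day lifetime are placeholder assumptions, not values from this guide.

```python
import json
import urllib.request

def build_token_request(instance_url, auth_token, comment, lifetime_seconds=7776000):
    """Build a POST request for the Databricks Token API (/api/2.0/token/create).

    instance_url is the regional workspace URL entered in the Add Workspace
    dialog; auth_token is an existing credential authorized to call the API.
    The 90-day default lifetime here is an illustrative assumption.
    """
    body = json.dumps({
        "comment": comment,
        "lifetime_seconds": lifetime_seconds,
    }).encode("utf-8")
    return urllib.request.Request(
        url=instance_url.rstrip("/") + "/api/2.0/token/create",
        data=body,
        headers={
            "Authorization": "Bearer " + auth_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Sending the request (urllib.request.urlopen) returns JSON containing
# "token_value" -- the value to paste into the Token field above.
```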

  3. In your Databricks workspace, click Configure Cluster > Advanced Options and update the following configurations. These configurations must be updated for every cluster (Automated/Interactive) in your workspace.

    • Spark Config

      Copy the following snippet to Spark > Spark Conf. Replace <Unravel DNS or IP Address> with the DNS name or IP address of your Unravel host.

      spark.eventLog.enabled true
      spark.eventLog.dir dbfs:/databricks/unravel/eventLogs/
      spark.unravel.server.hostport <Unravel DNS or IP Address>:4043
      spark.unravel.shutdown.delay.ms 300
      spark.executor.extraJavaOptions -Dcom.unraveldata.client.rest.request.timeout.ms=1000 -Dcom.unraveldata.client.rest.conn.timeout.ms=1000 -javaagent:/dbfs/databricks/unravel/unravel-agent-pack-bin/btrace-agent.jar=config=executor,libs=spark-2.3
      spark.driver.extraJavaOptions -Dcom.unraveldata.client.rest.request.timeout.ms=1000 -Dcom.unraveldata.client.rest.conn.timeout.ms=1000 -javaagent:/dbfs/databricks/unravel/unravel-agent-pack-bin/btrace-agent.jar=config=driver,script=StreamingProbe.btclass,libs=spark-2.3

      Note

      For spark-submit jobs, click Configure spark-submit and copy the following snippet into the Set Parameters > Parameters text box as spark-submit parameters. Replace <Unravel DNS or IP Address> in the same way.

      "--conf", "spark.eventLog.enabled=true",
      "--conf", "spark.eventLog.dir=dbfs:/databricks/unravel/eventLogs/",
      "--conf", "spark.unravel.shutdown.delay.ms=300",
      "--conf", "spark.unravel.server.hostport=<Unravel DNS or IP Address>:4043",
      "--conf", "spark.executor.extraJavaOptions= -Dcom.unraveldata.client.rest.request.timeout.ms=1000 -Dcom.unraveldata.client.rest.conn.timeout.ms=1000 -javaagent:/dbfs/databricks/unravel/unravel-agent-pack-bin/btrace-agent.jar=config=executor,libs=spark-2.3",
      "--conf", "spark.driver.extraJavaOptions= -Dcom.unraveldata.client.rest.request.timeout.ms=1000 -Dcom.unraveldata.client.rest.conn.timeout.ms=1000 -javaagent:/dbfs/databricks/unravel/unravel-agent-pack-bin/btrace-agent.jar=config=driver,script=StreamingProbe.btclass,libs=spark-2.3"
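The same Spark properties can also be applied without the UI, for example as the `spark_conf` block of a cluster specification sent to the Databricks Clusters API. The sketch below only assembles that block; the Unravel host remains a placeholder you must replace, and the surrounding API call is not shown.

```python
# Sketch: the Spark properties above expressed as the spark_conf block of a
# Databricks Clusters API cluster specification. Placeholder host included.
UNRAVEL_HOSTPORT = "<Unravel DNS or IP Address>:4043"  # replace before use

AGENT_JAR = "/dbfs/databricks/unravel/unravel-agent-pack-bin/btrace-agent.jar"
TIMEOUTS = ("-Dcom.unraveldata.client.rest.request.timeout.ms=1000 "
            "-Dcom.unraveldata.client.rest.conn.timeout.ms=1000")

spark_conf = {
    "spark.eventLog.enabled": "true",
    "spark.eventLog.dir": "dbfs:/databricks/unravel/eventLogs/",
    "spark.unravel.server.hostport": UNRAVEL_HOSTPORT,
    "spark.unravel.shutdown.delay.ms": "300",
    "spark.executor.extraJavaOptions":
        f"{TIMEOUTS} -javaagent:{AGENT_JAR}=config=executor,libs=spark-2.3",
    "spark.driver.extraJavaOptions":
        f"{TIMEOUTS} -javaagent:{AGENT_JAR}=config=driver,"
        "script=StreamingProbe.btclass,libs=spark-2.3",
}
```

Keeping the configuration in code like this makes it easy to apply the identical settings to every Automated and Interactive cluster, as the step above requires.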
    • Logging

      Select DBFS as the Destination, and copy the following as the Cluster Log Path.

      dbfs:/cluster-logs/
    • Init Script

      In the Init Scripts tab, set Destination to DBFS. Copy the following as the Init script path and click Add.

      dbfs:/databricks/unravel/unravel-db-sensor-archive/dbin/install-unravel.sh
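The logging and init script settings above can likewise be expressed in a Clusters API cluster specification, which accepts `cluster_log_conf` and `init_scripts` entries with DBFS destinations. A small sketch of just those two fields, using the paths from this guide; the rest of the cluster specification is omitted.

```python
# Sketch: the Cluster Log Path and Init script path above as the
# corresponding fields of a Databricks Clusters API cluster specification.
cluster_spec_fragment = {
    "cluster_log_conf": {
        "dbfs": {"destination": "dbfs:/cluster-logs/"},
    },
    "init_scripts": [
        {"dbfs": {"destination":
            "dbfs:/databricks/unravel/unravel-db-sensor-archive"
            "/dbin/install-unravel.sh"}},
    ],
}
```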