How to connect your Databricks workspace

  1. On the Unravel UI, click Workspaces. The Workspaces manager page is displayed.

  2. In STEP-1: Configure Unravel with Databricks Workspaces, click Add Workspaces. The Add Workspace dialog is displayed.

  3. Enter the following details:

    Workspace Id: The Databricks workspace ID.

    Workspace Name: The Databricks workspace name.

    Instance (Region) URL: The regional URL where the Databricks workspace is deployed.

    Tier: The subscription option, either Standard or Premium.

    Token: A personal access token used to authenticate to and access the Databricks REST APIs. Refer to Authentication using Databricks personal access tokens to create one. (A verification sketch follows this table.)
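    Before entering the URL and token in the Add Workspace dialog, you may want to confirm they work. The following Python snippet is a minimal sketch (not part of the Unravel dialog itself) that calls the Databricks Clusters API and fails if either value is wrong; the URL and token shown are placeholders you must fill in.

      import requests

      # Placeholders: replace with your workspace's regional URL and personal access token.
      instance_url = "https://<databricks-instance>"
      token = "<personal-access-token>"

      # List clusters; a successful response confirms the URL and token are usable.
      resp = requests.get(
          f"{instance_url}/api/2.0/clusters/list",
          headers={"Authorization": f"Bearer {token}"},
      )
      resp.raise_for_status()
      print(f"Token OK; workspace reports {len(resp.json().get('clusters', []))} cluster(s)")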

  4. In STEP-2: Instructions to setup Databricks with Unravel, follow the instructions to configure the Databricks cluster with Unravel.

    Based on these instructions, go to Configure Cluster > Advanced Options and update the following configurations. These configurations must be updated for every cluster (automated or interactive) in your workspace.

    1. Spark Config

      Copy the following snippet into Spark > Spark Config. Replace <Unravel DNS or IP Address> with the DNS name or IP address of your Unravel server.

      spark.eventLog.enabled true
      spark.eventLog.dir dbfs:/databricks/unravel/eventLogs/
      spark.unravel.server.hostport <Unravel DNS or IP Address>:4043
      spark.unravel.shutdown.delay.ms 300
      spark.executor.extraJavaOptions -Dcom.unraveldata.client.rest.request.timeout.ms=1000 -Dcom.unraveldata.client.rest.conn.timeout.ms=1000 -javaagent:/dbfs/databricks/unravel/unravel-agent-pack-bin/btrace-agent.jar=config=executor,libs=spark-2.3
      spark.driver.extraJavaOptions -Dcom.unraveldata.client.rest.request.timeout.ms=1000 -Dcom.unraveldata.client.rest.conn.timeout.ms=1000 -javaagent:/dbfs/databricks/unravel/unravel-agent-pack-bin/btrace-agent.jar=config=driver,script=StreamingProbe.btclass,libs=spark-2.3

      Note

      For spark-submit jobs, click Configure spark-submit and copy the following snippet into the Set Parameters > Parameters text box as spark-submit parameters. Replace <Unravel DNS or IP Address> with the DNS name or IP address of your Unravel server.

      "--conf", "spark.eventLog.enabled=true",
      "--conf", "spark.eventLog.dir=dbfs:/databricks/unravel/eventLogs/",
      "--conf", "spark.unravel.shutdown.delay.ms=300",
      "--conf", "spark.unravel.server.hostport=<Unravel DNS or IP Address>:4043",
      "--conf", "spark.executor.extraJavaOptions= -Dcom.unraveldata.client.rest.request.timeout.ms=1000 -Dcom.unraveldata.client.rest.conn.timeout.ms=1000 -javaagent:/dbfs/databricks/unravel/unravel-agent-pack-bin/btrace-agent.jar=config=executor,libs=spark-2.3",
      "--conf", "spark.driver.extraJavaOptions= -Dcom.unraveldata.client.rest.request.timeout.ms=1000 -Dcom.unraveldata.client.rest.conn.timeout.ms=1000 -javaagent:/dbfs/databricks/unravel/unravel-agent-pack-bin/btrace-agent.jar=config=driver,script=StreamingProbe.btclass,libs=spark-2.3"
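
      If you manage clusters with the Databricks Clusters REST API rather than the UI, the same Spark Config settings can be expressed as a spark_conf map in a cluster specification. The following Python snippet is a minimal sketch of that map; it mirrors the values above, and <Unravel DNS or IP Address> remains a placeholder.

      # Sketch: the Spark Config settings above as a Clusters API spark_conf map.
      UNRAVEL_HOST = "<Unravel DNS or IP Address>"  # placeholder
      AGENT_JAR = "/dbfs/databricks/unravel/unravel-agent-pack-bin/btrace-agent.jar"
      TIMEOUTS = ("-Dcom.unraveldata.client.rest.request.timeout.ms=1000 "
                  "-Dcom.unraveldata.client.rest.conn.timeout.ms=1000")

      unravel_spark_conf = {
          "spark.eventLog.enabled": "true",
          "spark.eventLog.dir": "dbfs:/databricks/unravel/eventLogs/",
          "spark.unravel.server.hostport": f"{UNRAVEL_HOST}:4043",
          "spark.unravel.shutdown.delay.ms": "300",
          "spark.executor.extraJavaOptions":
              f"{TIMEOUTS} -javaagent:{AGENT_JAR}=config=executor,libs=spark-2.3",
          "spark.driver.extraJavaOptions":
              f"{TIMEOUTS} -javaagent:{AGENT_JAR}=config=driver,script=StreamingProbe.btclass,libs=spark-2.3",
      }
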
    2. Logging

      In the Logging tab, select DBFS as the Destination and copy the following as the Cluster Log Path.

      dbfs:/cluster-logs/
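
      For API-based cluster configuration, this log destination corresponds to the cluster_log_conf field of the cluster specification. A minimal Python sketch of that field, mirroring the path above:

      # Sketch: the DBFS cluster log path as a Clusters API cluster_log_conf field.
      unravel_cluster_log_conf = {
          "dbfs": {"destination": "dbfs:/cluster-logs/"}
      }
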
    3. Init Script

      In the Init Scripts tab, set Destination to DBFS. Copy the following as the Init script path and click Add.

      dbfs:/databricks/unravel/unravel-db-sensor-archive/dbin/install-unravel.sh
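
      Similarly, the init script corresponds to an entry in the init_scripts field of the cluster specification. The Python snippet below is a minimal sketch that continues the earlier ones: it assumes unravel_spark_conf and unravel_cluster_log_conf (from the Spark Config and Logging sketches) are defined in the same script, uses placeholder values for the workspace URL, token, and cluster ID, and applies all three Unravel settings to one existing cluster through the Clusters API (clusters/get followed by clusters/edit). clusters/edit expects a full, editable cluster specification, so read-only fields returned by clusters/get may need to be removed first.

      import requests

      instance_url = "https://<databricks-instance>"  # placeholder: regional workspace URL
      token = "<personal-access-token>"               # placeholder
      cluster_id = "<cluster-id>"                     # placeholder: cluster to update
      headers = {"Authorization": f"Bearer {token}"}

      # Fetch the current cluster specification.
      spec = requests.get(
          f"{instance_url}/api/2.0/clusters/get",
          headers=headers,
          params={"cluster_id": cluster_id},
      ).json()

      # Merge the Unravel settings (maps defined in the sketches above).
      spec.setdefault("spark_conf", {}).update(unravel_spark_conf)
      spec["cluster_log_conf"] = unravel_cluster_log_conf
      spec.setdefault("init_scripts", []).append(
          {"dbfs": {"destination": "dbfs:/databricks/unravel/unravel-db-sensor-archive/dbin/install-unravel.sh"}}
      )

      # Drop read-only fields that clusters/edit rejects (indicative, not exhaustive).
      for field in ("state", "state_message", "start_time", "terminated_time",
                    "termination_reason", "spark_context_id", "driver", "executors",
                    "jdbc_port", "cluster_cores", "cluster_memory_mb", "default_tags",
                    "creator_user_name", "cluster_source", "last_state_loss_time"):
          spec.pop(field, None)

      resp = requests.post(f"{instance_url}/api/2.0/clusters/edit", headers=headers, json=spec)
      resp.raise_for_status()

      This sketch updates one cluster; to cover every automated and interactive cluster in the workspace, the same merge could be repeated over the clusters returned by clusters/list.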