Databricks workspace setup guide

Configuring a Databricks Workspace

Use the following command to configure Azure Key Vault for Unravel:

<Unravel Directory>/manager config databricks set-azure-keyvault <client-id> <tenant-id> <client-secret> <keyvault-url> <token-key-name-pattern>

Here:

  • <client-id> is the Azure service principal ID.

  • <tenant-id> is the Azure tenant ID.

  • <client-secret> is the client secret of the Azure service principal.

  • <keyvault-url> is the URL of the Azure Key Vault.

  • <token-key-name-pattern> (Optional) defines a pattern for token secret names. Use {workspace.id} as a placeholder for dynamic secret name creation. For example, workspace-{workspace.id}-token could result in secret names like workspace-124-token.
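
The placeholder substitution described for <token-key-name-pattern> can be illustrated with a short sketch. This is not Unravel's implementation; the function name expand_token_secret_name is hypothetical, and the sketch only assumes that {workspace.id} is replaced literally with the workspace ID.

```python
def expand_token_secret_name(pattern: str, workspace_id: str) -> str:
    """Return the secret name obtained by substituting the workspace ID.

    Hypothetical helper: assumes a plain textual replacement of the
    {workspace.id} placeholder, as in the example from the guide.
    """
    return pattern.replace("{workspace.id}", str(workspace_id))


# Example from the guide: workspace 124 with the pattern workspace-{workspace.id}-token
print(expand_token_secret_name("workspace-{workspace.id}-token", "124"))
```

A sketch like this can help you check which secret names must exist in the Key Vault before you register each workspace.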

Note

Ensure that the service principal has "Get" and "List" permissions for the configured Key Vault URI.

Run the following command to disable the Token Key Vault feature:

<Unravel Directory>/manager config databricks unset-azure-keyvault

Use the following command to set the Azure AD details of the Databricks service principal:

<Unravel Directory>/manager config databricks set-azure-ad <databricks-client-id> <databricks-tenant-id>

Here:

  • <databricks-client-id> is the client ID of the Databricks service principal.

  • <databricks-tenant-id> is the tenant ID of the Databricks service principal.

Ensure the appropriate permissions and configurations are set for the service principal used.

Create a Workspace

Register a new Databricks workspace or edit details of an existing Databricks workspace.

  1. Sign in to the Unravel UI and, in the upper-right corner, click the manage icon (manage-icon.png) > Workspaces. The Workspaces Manager page is displayed.

  2. Click Add Workspace. The Add Workspace dialog box is displayed. Enter the following details:

    Token-secret-name.png

If you have not configured Azure Key Vault and prefer to create the workspace with a personal access token (PAT), follow these steps:

  1. Enable Personal Access Token in Databricks.

    Note

    If you use service principals to generate the personal access token (PAT), refer to Manage service principals for Databricks on AWS or Manage service principals for Databricks on Azure for the steps to enable the PAT.

    1. Enable personal access token authentication for the workspace.

      1. Go to your workspace, and from the dropdown located in the upper right corner, select Settings.

        DBX-PAT.png
      2. Click the Advanced tab and click the Personal Access Tokens toggle. For more details, refer to Manage personal access tokens.

    2. Create a Databricks personal access token for your Databricks workspace user.

      1. Go to your workspace, and from the dropdown located in the upper right corner, select Settings.

      2. Click Developer > Access Tokens > Manage. The Access Tokens page is displayed.

        manage.png
      3. Click the Generate New Token button. The new token is generated. Save this token; you need it to register a new Databricks workspace. For more details, refer to Authentication using Databricks personal access tokens.

        Generate-new-token.png
  2. Register a new Databricks workspace or edit details of an existing Databricks workspace.

    1. Sign in to the Unravel UI and, in the upper-right corner, click the manage icon (manage-icon.png) > Workspaces. The Workspaces Manager page is displayed.

    2. Click Add Workspace. The Add Workspace dialog box is displayed. Enter the following details:

      workspace-config.png

Add Workspace fields

Field

Description

Workspace Id

The Databricks workspace ID is the number that follows o= in the Databricks URL.

For example, in the URL https://<databricks-instance>/?o=3205148689792956, the workspace ID is 3205148689792956.

Workspace-id.png

Workspace Name

A name for the Databricks workspace, for example, ACME-Workspace. You can get the workspace name from the Azure portal.

Instance (Region) URL

Regional URL where the Databricks workspace is deployed. Specify the complete URL in the format protocol://<DNS name or IP address>(:port), and make sure that it does not end with a slash. For example, https://eastus.azuredatabricks.net is valid, but https://eastus.azuredatabricks.net/ is not.

You can get the URL from the Azure portal.

Tier

Select a subscription option: Standard, Premium, Enterprises, or Dedicated. For Databricks on Azure, you can get the pricing information from the Azure portal. For Databricks on AWS, see Databricks AWS pricing for details about pricing tiers.

Note

To enable Databricks SQL monitoring, select the Premium or Enterprises option and then select Enable Databricks SQL.

Token Secret Name

Provide the Azure Key Vault secret name associated with the PAT. For details on how to configure the Azure Key Vault in Unravel, see here.

Note

This option is available only after you configure Azure Key Vault in Unravel. You are responsible for keeping the PATs in the Azure Key Vault up to date.

Token

Use the personal access token to secure authentication to the Databricks APIs. You can generate the token from your Databricks workspace.

To generate a personal access token through a service principal (SPN), see here for Databricks on AWS and here for Databricks on Azure.

See Authentication using Databricks personal access tokens for more details.

Note

Users with admin or non-admin roles can create personal access tokens. Non-admin users must meet certain requirements before they can create personal access tokens.
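
As an illustration of how a personal access token authenticates a Databricks API call, the following minimal sketch builds (but does not send) a request that carries the token as a Bearer credential. The helper name build_databricks_request and the placeholder token are hypothetical; /api/2.0/clusters/list is used only as a sample endpoint, and this is not a description of how Unravel itself calls Databricks.

```python
import urllib.request


def build_databricks_request(instance_url: str, token: str, api_path: str) -> urllib.request.Request:
    """Build an HTTP request that authenticates with a personal access token.

    Hypothetical helper: the token is sent in the Authorization header
    using the Bearer scheme. Nothing is sent over the network here.
    """
    url = instance_url.rstrip("/") + api_path
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# Placeholder values; substitute your own instance URL and token.
request = build_databricks_request(
    "https://eastus.azuredatabricks.net",
    "<personal-access-token>",
    "/api/2.0/clusters/list",
)
print(request.full_url)
```

Passing the request to urllib.request.urlopen would send it; a quick call like that is one way to confirm that a token is valid before you enter it in the Token field.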

Note

After you click Add, it takes around 2-3 minutes to register the Databricks Workspace with Unravel.
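
The rules given for the Workspace Id and Instance (Region) URL fields can be checked before you submit the dialog. The sketch below is only an illustration under stated assumptions: the function names and the host example.cloud.databricks.com are hypothetical, and the URL check covers just the rules in this guide (a protocol, a host, an optional port, and no trailing slash).

```python
from urllib.parse import parse_qs, urlparse


def workspace_id_from_url(url: str) -> str:
    """Return the number that follows o= in a Databricks URL."""
    values = parse_qs(urlparse(url).query).get("o", [])
    if not values or not values[0].isdigit():
        raise ValueError(f"No numeric o= parameter found in {url!r}")
    return values[0]


def is_valid_instance_url(url: str) -> bool:
    """Check the protocol://host(:port) format with no trailing slash."""
    if url.endswith("/"):
        return False
    parsed = urlparse(url)
    # A path, query, or fragment means the value is more than an instance URL.
    has_extra = bool(parsed.path or parsed.query or parsed.fragment)
    return bool(parsed.scheme) and bool(parsed.hostname) and not has_extra


# Hypothetical host; the workspace ID is the one used in this guide.
print(workspace_id_from_url("https://example.cloud.databricks.com/?o=3205148689792956"))
print(is_valid_instance_url("https://eastus.azuredatabricks.net"))
print(is_valid_instance_url("https://eastus.azuredatabricks.net/"))
```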

A global init script applies the Unravel configurations to all clusters in a workspace. The following steps take you through cluster configuration with a global init script. If you are configuring clusters without a global init script, proceed to the next step.

  1. Under Logging, set Destination to DBFS and copy the below snippet as Cluster Log Path.

    Loggin-spark.png
  2. Under Init Script, set Destination to Workspace, copy the snippet below as the Init Script Path, and click Add.

    init-script-spark.png
  • From your Azure Workspace, access Settings > Advanced and apply the following settings:

Requirements for non-admin tokens