Introduction

How Unravel works

Unravel connects to your data platforms as a read-first observer, then applies optimizations through tightly controlled, validated actions — with you in control of how much autonomy to grant at each step.

Deployment model

Unravel runs as a SaaS-hosted control plane. There's nothing to install in your production environment. A lightweight connector per platform handles telemetry collection and action execution. Your data never leaves your infrastructure.

Databricks: REST API + DBFS connector
Snowflake: Native app + INFORMATION_SCHEMA
BigQuery: IAM-scoped service account
Cloudera: On-prem agent + CDP API

Architecture overview

[Architecture diagram: your infrastructure connected to Unravel SaaS.]
Your infrastructure: Databricks (jobs, notebooks, clusters, pipelines), Snowflake (warehouses, queries, storage, pipes), BigQuery (jobs, slots, datasets, tables), Cloudera (YARN, Spark, Hive, CDP, CM API).
Access: metadata and telemetry only, no query data; read-only access by default. Actions flow back from Unravel to the platforms.
Arvix engine: context graph, validation loop, controllable automation.
Actions: AutoApply, Human-in-Loop, code rewrites.
Interfaces: Web UI, REST API, CI/CD plugins.
Security: SOC 2 Type II, TLS 1.3, customer-controlled encryption.

Key concepts

Arvix
Unravel's autonomous optimization engine. Identifies inefficiencies, generates validated fixes, and applies them — with configurable automation levels.
AutoApply vs Human-in-Loop
Set per workload. AutoApply handles routine, low-risk actions continuously. Human-in-Loop queues recommendations for engineer approval before execution.
Efficiency Rating
A composite 0–100 score across Cost, Performance, and Reliability. Your operational baseline — tracked over time across all connected platforms.
Context Graph
Unravel builds a relationship map across code, compute, data, and users. Fixes are informed by system-wide context, not isolated query analysis.
Tip: Most teams start in Human-in-Loop mode to build confidence in Unravel's recommendations, then progressively enable AutoApply per workload class. Typical ramp: 4–6 weeks to full AutoApply on routine infrastructure actions.
Databricks

Databricks integration

Unravel connects to Databricks via the Databricks REST API and workspace-level service principal. No agents run inside your workspace. Jobs, notebooks, pipelines, and clusters are observed continuously; optimizations are applied through the same API surface Databricks exposes to your own tools.

Architecture

[Architecture diagram: Databricks workspace, Arvix engine, and outputs. Telemetry flows from the workspace to Arvix; actions flow back.]
Data sources: REST APIs (Jobs, Clusters, Pipelines, SQL Warehouse, Notebooks, SCIM); System Tables (billing, mlflow, compute, lakeflow, jobs, access).
Data sharing mechanisms: Unity Catalog, Delta Share, DBFS Direct.
Authentication: SPN / OAuth M2M, PAT / Token.
Arvix engine: Workload Studio analysis, code + cluster optimization, validation loop, AutoApply / Human-in-Loop, cost attribution engine, context graph.
Outputs: query / job rewrites, cluster config changes, autoscaling adjustments, cost attribution reports, Delta / storage optimization, CI/CD PR annotations.

Installation

1. Choose authentication method
Unravel supports two auth methods. Use SPN / OAuth M2M (recommended for production): create a Databricks Service Principal and generate a client secret. Or use a PAT (Personal Access Token) for quick trials. Enter credentials in the Unravel onboarding flow.
https://<workspace-id>.azuredatabricks.net
2. Configure data sharing mechanism
Select how Unravel pulls telemetry from your workspace. Choose one or more:
Unity Catalog + SQL Warehouse: Unravel queries system.* tables via a dedicated SQL Warehouse. Requires Unity Catalog enabled and SELECT on system.* tables.
Delta Share: Unravel receives telemetry data shared via the Delta Sharing protocol. Provide the share name and the recipient profile. No direct workspace access required.
DBFS Direct Share: Unravel reads log and metrics files written to DBFS paths. Requires DBFS read scope on the configured paths.
3. Grant REST API permissions
The SPN or PAT must have: Jobs read/edit, Clusters read/edit (edit is optional, needed only for Auto Actions), Pipelines read, and Notebooks read (optional, for code analysis). Unity Catalog access is required for the Unity Catalog telemetry mechanism.
4. Select workspaces and clusters
Choose which workspaces to monitor. Scope coverage to specific clusters or job namespaces, which is useful for phased rollouts in large environments.
5. Set automation policy
Choose Human-in-Loop (default) or AutoApply per action category. Most teams start with Human-in-Loop and enable AutoApply for low-risk infrastructure actions after a few weeks.
6. First insights in ~30 minutes
Unravel begins ingesting job run history. Initial cost and performance insights appear within 30 minutes. Full workload profiling completes after 24–48 hours of data.
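Before entering SPN credentials in the onboarding flow, it can help to confirm that the service principal can obtain a token and read the Jobs API. The following is a minimal sketch, assuming Databricks OAuth M2M (client credentials) against the workspace token endpoint and the Jobs 2.1 list endpoint. The helper names are hypothetical and are not part of Unravel; the first two functions only build requests, and `fetch_json` is the only one that touches the network.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_token_request(workspace_url: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build the OAuth M2M (client credentials) token request for a workspace."""
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode()
    return urllib.request.Request(
        url=f"{workspace_url.rstrip('/')}/oidc/v1/token",
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def build_jobs_list_request(workspace_url: str, access_token: str, limit: int = 5) -> urllib.request.Request:
    """Build a read-only Jobs API call to confirm the principal has Jobs read access."""
    query = urllib.parse.urlencode({"limit": limit})
    return urllib.request.Request(
        url=f"{workspace_url.rstrip('/')}/api/2.1/jobs/list?{query}",
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


def fetch_json(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response (performs network I/O)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

To run the check, pass the token request to `fetch_json`, read `access_token` from the result, and then send the jobs request with it. An HTTP 401 or 403 at either step suggests the secret or the Jobs permission still needs attention.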

Required permissions

Resource | Access | Purpose
Jobs API | read, edit | Job run history, durations, costs
Clusters API | read, edit | Cluster config reading + rightsizing actions
DBFS / Delta | read | Storage profiling and Data Temperature analysis
Unity Catalog | read | Table lineage and storage attribution
Notebooks / Repos | read | Code analysis for Arvix rewrites (optional)

Key features

Workload Studio
Deep job and notebook profiling — stage-level metrics, shuffle analysis, PySpark diffs, and historical run comparisons.
Cluster rightsizing
Arvix recommends and applies autoscaling config corrections, worker type changes, and idle-cluster policies.
Code rewrites
Inefficient PySpark and SQL detected in production gets a one-click Arvix-generated fix — repartition, reorder, hint injection.
Delta / storage optimization
Data Temperature classification (Hot/Warm/Cold) drives archival and vacuum recommendations, with AutoApply support.
Cost attribution
Job-level and team-level cost breakdown. Chargeback-ready reports per business unit, with MoM trend tracking.
CI/CD integration
Unravel's CI plugin flags cost and performance regressions in PRs before they reach production. Supports GitHub Actions and Jenkins.
Snowflake

Snowflake integration

Unravel connects to Snowflake via a dedicated role with read access to INFORMATION_SCHEMA and ACCOUNT_USAGE views. Optimization actions — warehouse rightsizing, query rewrite suggestions, idle-suspend policies — are applied through a controlled write role that you define and audit separately.

Architecture

[Architecture diagram: Snowflake account, Arvix engine, and outputs. Telemetry flows from the account to Arvix; actions flow back.]
Snowflake account: ACCOUNT_USAGE, INFORMATION_SCHEMA, warehouses, dedicated Unravel role.
Arvix engine: warehouse profiling, query analysis + rewrite, storage optimization, credit optimization, SLA monitoring.
Outputs: warehouse resize / suspend / reconfigure, SQL query rewrites, storage cleanup / reclassification, credit attribution, SLA alerts.

Installation

1. Run the Unravel setup script
Unravel provides a Snowflake SQL script that creates a dedicated role, warehouse, and grants. Run it as ACCOUNTADMIN. The script is plain SQL, so you can review every statement before running it.
2. Generate connection credentials
Use key-pair authentication (recommended) or username/password. Provide the account identifier, role name, and warehouse to the Unravel onboarding UI.
3. Configure action permissions
Warehouse resize and auto-suspend actions require an additional MODIFY grant on the warehouse. These are gated behind a separate approval step in the UI.
4. Monitoring begins
All virtual warehouses in the account are discovered and monitored automatically. Unravel begins building workload profiles and usage patterns across every warehouse.

Required permissions

Resource | Access | Purpose
ACCOUNT_USAGE | read | Query history, warehouse usage, storage metadata, credit usage
INFORMATION_SCHEMA | read | Real-time query history
Virtual warehouses | read, modify | Read configs, update configs, apply resize
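To make the split between read access and action access concrete, here is a rough sketch of the kind of grants such a setup might contain. This is not the Unravel setup script; it only assembles standard Snowflake statements. The role and warehouse names are hypothetical placeholders, and granting MONITOR on the query warehouse is an assumption.

```python
import re

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def _check(name: str) -> str:
    """Reject anything that is not a plain unquoted Snowflake identifier."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"not a simple identifier: {name!r}")
    return name


def read_grants(role: str, query_warehouse: str) -> list[str]:
    """Statements for read-only telemetry access (assumed minimal set)."""
    role, wh = _check(role), _check(query_warehouse)
    return [
        f"CREATE ROLE IF NOT EXISTS {role};",
        # IMPORTED PRIVILEGES on the SNOWFLAKE database exposes ACCOUNT_USAGE views.
        f"GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO ROLE {role};",
        f"GRANT USAGE ON WAREHOUSE {wh} TO ROLE {role};",
        f"GRANT MONITOR ON WAREHOUSE {wh} TO ROLE {role};",
    ]


def action_grants(role: str, managed_warehouses: list[str]) -> list[str]:
    """Extra statements needed only if resize and auto-suspend actions are enabled."""
    role = _check(role)
    return [
        f"GRANT MODIFY ON WAREHOUSE {_check(wh)} TO ROLE {role};"
        for wh in managed_warehouses
    ]


if __name__ == "__main__":
    for stmt in read_grants("UNRAVEL_ROLE", "UNRAVEL_WH") + action_grants("UNRAVEL_ROLE", ["ETL_WH"]):
        print(stmt)
```

Keeping the MODIFY statements in a separate function mirrors the separate approval step: the read grants can be applied on day one, and the action grants added later per warehouse.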

Key features

Warehouse rightsizing
Arvix detects oversized warehouses, idle time patterns, and multi-cluster contention — then right-sizes or adjusts auto-suspend policies automatically.
Query optimization
Identifies expensive queries and generates rewritten SQL — clustering changes, join reorders, partition pruning, predicate pushdown, etc.
Storage optimization
Identifies cold tables, transient table opportunities, and excessive time-travel/fail-safe retention — recommends cleanup and reclassification to reduce storage costs.
Credit attribution
Breaks credit spend down to individual queries, users, and business units. Chargeback-ready exports to common FinOps tools.
SLA tracking
Monitors query SLAs in real time. Arvix can proactively reschedule or scale to prevent missed windows before they occur.
BigQuery

BigQuery integration

Unravel connects to BigQuery using a GCP service account with project-scoped IAM roles. Telemetry is collected from INFORMATION_SCHEMA views and BigQuery APIs. Optimization actions (slot reservation management, query rewrites, storage recommendations) are applied through the same service account with write roles you control.

Architecture

[Architecture diagram: GCP project, Arvix engine, and outputs. Telemetry flows from the project to Arvix; actions flow back.]
GCP project: INFORMATION_SCHEMA, billing export, IAM service account.
Arvix engine: slot utilization analysis, query analysis + rewrite, storage optimization, reservation optimization.
Outputs: slot reservation updates, SQL rewrites, storage cleanup / reconfiguration, project cost attribution.

Installation

1. Create a service account
Create a GCP service account in the target project. Assign the roles listed in the table below. Download the JSON key file.
2. Upload credentials to Unravel
Provide the service account JSON key to Unravel via the onboarding UI or a secure handoff.
3. Enable billing export
Set up billing export to a dataset in your project.
4. Select projects to monitor
Specify the list of projects for Unravel to monitor. For billing export, provide the GCP project ID, the dataset ID, and the target table name. Unravel builds a unified view across all the projects.

Required IAM roles

Role | Access | Purpose
BigQuery Resource Viewer | read | Project metadata, job history, and query stats
BigQuery Metadata Viewer | read | Storage metadata
BigQuery Data Viewer (for billing export table only) | read | Billing export dataset access
BigQuery User (for one project only) | create | Lets Unravel execute BigQuery queries
BigQuery Resource Editor (optional, for AutoApply) | write | Slot reservation management
BigQuery Resource Admin (optional, for AutoApply) | write | Slot capacity management

Key features

Slot optimization
Continuously analyzes slot demand patterns to find the optimal balance between on-demand and committed capacity — then automatically adjusts reservations so you pay less without sacrificing query performance.
Query optimization
Identifies expensive queries and generates rewritten SQL — clustering changes, join reorders, partition pruning, predicate pushdown, etc.
Storage optimization
Detects unused partitions, unqueried tables, and suboptimal clustering — recommends cleanup, partition expiration, and storage type changes to cut costs. Classifies tables as Hot/Warm/Cold based on access patterns. Auto-identifies long-tail tables generating storage cost with no query activity.
Multi-project attribution
Unified cost view across projects and teams. Chargeback reports at the project, dataset, user, and label level.
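As an illustration of the Hot/Warm/Cold classification mentioned under storage optimization, the sketch below buckets a table by how recently it was last queried. The 7-day and 90-day thresholds and the function name are purely hypothetical; the signals and cutoffs Arvix actually uses are not documented here.

```python
from datetime import datetime, timedelta
from typing import Optional


def classify_temperature(
    last_queried: Optional[datetime],
    now: datetime,
    hot_days: int = 7,    # assumed threshold, not an Unravel default
    cold_days: int = 90,  # assumed threshold, not an Unravel default
) -> str:
    """Bucket a table as Hot, Warm, or Cold from its last query time."""
    if last_queried is None:
        # Never queried: storage cost with no query activity.
        return "Cold"
    age = now - last_queried
    if age <= timedelta(days=hot_days):
        return "Hot"
    if age < timedelta(days=cold_days):
        return "Warm"
    return "Cold"


if __name__ == "__main__":
    now = datetime(2024, 6, 1)
    print(classify_temperature(datetime(2024, 5, 30), now))
    print(classify_temperature(datetime(2024, 4, 1), now))
    print(classify_temperature(None, now))
```

A real classifier would likely weigh query frequency and bytes scanned as well as recency; recency alone is simply the smallest example.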
Cloudera

Cloudera integration

Unravel's Cloudera integration covers both on-premises CDH/CDP environments and Cloudera Data Platform on public cloud. A lightweight Unravel server is deployed within your network boundary — no data leaves your infrastructure. Unravel reads from Cloudera Manager API and YARN, Spark, and Hive event streams.

Note: The Cloudera integration uses an on-premises agent model rather than a SaaS connector. The Unravel server runs in your environment and communicates outbound only for UI access and license management. All telemetry stays within your network.

Architecture

[Architecture diagram: the Cloudera cluster and the Unravel server both run inside your data center / VPC.]
Cloudera cluster: YARN / MR, Spark, Hive / Impala, HDFS, Cloudera Manager API. Platforms shown: CDP / CDH 6+ / CDP Public Cloud.
Unravel server (runs in your network): event collection agent, Arvix local analysis, action executor. Outbound connection for UI only; no data leaves the network.
Unravel web UI: Operations Center dashboard, optimization insights & actions.

Installation

1. Provision the Unravel server
Deploy the Unravel server on a dedicated host (or VM) within your cluster network. Minimum: 16 CPU cores, 64 GB RAM, 500 GB SSD. The server communicates with cluster nodes over your internal network only.
2. Install the Unravel agent package
Download the Unravel RPM or tarball from the customer portal. Run the installer script as root on the Unravel server host.
sudo ./unravel-install.sh --cluster-manager cloudera
3. Configure Cloudera Manager credentials
Provide your Cloudera Manager hostname, port, and a read-only API account. Unravel auto-discovers cluster services and begins registering event listeners.
4. Enable Spark and YARN instrumentation
Unravel adds a Spark listener JAR via Cloudera Manager parcel or classpath injection. A rolling restart of affected services is required. YARN history server integration requires no restart.
5. Access the Unravel UI
The UI is served from the Unravel server on port 3000 by default. It's accessible only within your network unless you configure a proxy or VPN. There are no inbound connections from Unravel SaaS infrastructure.
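Before running the installer, a quick preflight check against the minimum host specification from step 1 can save a failed install. The script below is a hypothetical helper, not something shipped with Unravel. It assumes a Linux host, compares in GiB, and does not verify that the disk is an SSD.

```python
import os
import shutil

# Minimums taken from the provisioning step above.
MIN_CPU_CORES = 16
MIN_RAM_GB = 64
MIN_DISK_GB = 500


def check_minimums(cpu_cores: int, ram_gb: float, disk_gb: float) -> list[str]:
    """Return a list of shortfalls; an empty list means the host qualifies."""
    shortfalls = []
    if cpu_cores < MIN_CPU_CORES:
        shortfalls.append(f"CPU cores: have {cpu_cores}, need {MIN_CPU_CORES}")
    if ram_gb < MIN_RAM_GB:
        shortfalls.append(f"RAM (GB): have {ram_gb:.0f}, need {MIN_RAM_GB}")
    if disk_gb < MIN_DISK_GB:
        shortfalls.append(f"Disk (GB): have {disk_gb:.0f}, need {MIN_DISK_GB}")
    return shortfalls


def probe_host(install_path: str = "/") -> tuple[int, float, float]:
    """Read core count, physical RAM, and disk size from the local Linux host."""
    cores = os.cpu_count() or 0
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    disk_bytes = shutil.disk_usage(install_path).total
    return cores, ram_bytes / 1024**3, disk_bytes / 1024**3


if __name__ == "__main__":
    problems = check_minimums(*probe_host())
    print("Host meets minimums" if not problems else "\n".join(problems))
```

A host sold as 64 GB often reports slightly less physical memory than that, so treat a marginal RAM shortfall as a prompt to check the hardware spec rather than a hard failure.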

Key features

Spark job analysis
Stage-level profiling, DAG visualization, executor skew detection, and Arvix-generated PySpark optimization recommendations.
YARN queue optimization
Queue utilization analysis, capacity planning recommendations, and workload scheduling optimization across YARN resource pools.
Hive / Impala query tuning
Query plan analysis, statistics freshness checks, and partition/bucket recommendation for slow-running Hive and Impala workloads.
Migration readiness
Workload inventory and compatibility scoring for teams planning migration from Cloudera to Databricks, Snowflake, or BigQuery.
Cloudera to cloud migration: Unravel's Cloudera integration is often the starting point for teams modernizing to cloud data platforms. The workload inventory and cost attribution data generated here carries forward into your Databricks or Snowflake environment.
Frequently asked questions

Technical FAQ

Real answers for engineers evaluating Unravel. If you don't see what you need, contact solutions@unraveldata.com.

Security & compliance
Does Unravel ever see or store my actual data?

No. Unravel operates exclusively on metadata and telemetry — query text, execution plans, timing, cost signals, cluster configuration, and schema information. The actual data inside your tables, the results of your queries, and the contents of your pipelines never leave your infrastructure.

The one exception is query text (SQL or PySpark code), which Unravel uses for optimization analysis. If your query text contains sensitive column names or business logic you'd prefer to keep private, you can enable query text masking in settings — Unravel will analyze structure without storing the literal text.

→ Full data handling documentation: trust.unraveldata.com

Is Unravel SOC 2 certified?

Yes. Unravel is SOC 2 Type II certified. The report is available to qualified prospects and customers under NDA. Contact your account team to request it.

Unravel also supports HIPAA-eligible deployments and has enterprise customers in regulated industries including financial services (Wells Fargo, Barclays, Equifax) and pharmaceuticals (Novartis).

→ View certifications and compliance docs: trust.unraveldata.com

What encryption does Unravel use?

All data in transit between Unravel and your data platforms uses TLS 1.3. Stored metadata in the Unravel platform is encrypted at rest using AES-256.

For customers with specific key management requirements, Unravel supports customer-managed encryption keys (CMEK) for the metadata store. Available on Enterprise plans — talk to your account team.

→ Full encryption and key management details: trust.unraveldata.com

Where is Unravel's SaaS infrastructure hosted?

Unravel's control plane runs on AWS (us-east-1 by default). EU-region deployment is available for customers with data residency requirements — metadata stays within the EU boundary.

For Cloudera on-premises deployments, Unravel runs entirely within your network. Only the UI connection is outbound — no telemetry leaves your data center.

→ Infrastructure and residency details: trust.unraveldata.com

What network access does Unravel need?

For cloud platforms (Databricks, Snowflake, BigQuery), Unravel makes outbound API calls from Unravel's SaaS IPs to your platform's API endpoints. No inbound firewall rules are required on your side.

Unravel's egress IP ranges are provided during onboarding for allow-listing. You can also use VPC peering or PrivateLink (available on Enterprise plans) to keep all traffic private.

→ Current IP ranges and network architecture: trust.unraveldata.com
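If you allow-list Unravel's egress ranges, a small check like the following can confirm that a source address seen in your platform's access logs falls inside them. The CIDR blocks used in the example are reserved documentation ranges standing in for the real ones, which are provided during onboarding; the function name is hypothetical.

```python
import ipaddress


def is_allowed(source_ip: str, allowed_cidrs: list[str]) -> bool:
    """Return True if source_ip falls inside any of the allow-listed CIDR blocks."""
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


if __name__ == "__main__":
    # Placeholder ranges (RFC 5737 documentation blocks), not Unravel's real IPs.
    allow_list = ["203.0.113.0/24", "198.51.100.0/28"]
    for ip in ("203.0.113.10", "198.51.100.200"):
        print(ip, is_allowed(ip, allow_list))
```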

Can we restrict which users inside Unravel can see our data?

Yes. Unravel supports role-based access control (RBAC) with SSO integration (SAML 2.0, Okta, Azure AD). You can scope team members to specific platforms, business units, or cost centers.

Sensitive cost data and query text can be hidden from non-admin roles. Audit logs for all Unravel UI actions are available and can be exported to your SIEM.

→ Access control and audit log documentation: trust.unraveldata.com

Deployment & setup
How long does deployment take?

For cloud platforms: initial connection takes under an hour. Provide credentials, scope the workspaces, and Unravel begins collecting telemetry immediately. Meaningful first insights appear within 30 minutes of data ingestion.

Full workload profiling — the context Arvix uses to generate high-confidence optimizations — typically completes after 48–72 hours of production traffic. Most customers find their first significant savings opportunity within the first 48 hours.

For Cloudera on-premises: plan for a 1–2 day engagement with an Unravel solutions engineer to complete server setup and cluster instrumentation.

Does Unravel require an agent or any code changes to our pipelines?

For Databricks, Snowflake, and BigQuery: no agents, no code changes, no instrumentation required. Unravel connects entirely through native APIs. You don't modify any existing jobs or pipelines.

For Cloudera: a lightweight Spark listener JAR is deployed via Cloudera Manager. This requires a rolling restart of Spark services — no application code changes.

The CI/CD integration (GitHub Actions, Jenkins plugin) requires adding a step to your existing pipeline. This is optional and takes about 10 minutes to configure.

Can we connect multiple Databricks workspaces or Snowflake accounts?

Yes. Multi-workspace and multi-account environments are fully supported and common in enterprise deployments. Each workspace or account is added as a separate connection — Unravel builds a unified cross-platform view across all of them.

Cost attribution, team-level reporting, and optimization recommendations span all connected platforms in a single Operations Center view.

What happens if Unravel loses connectivity to my platform?

Unravel queues telemetry collection and resumes when connectivity restores. No data is lost for gaps under 24 hours. For longer outages, there may be gaps in the historical timeline but no impact to ongoing optimization once connection is restored.

AutoApply actions pause automatically during connectivity loss — Unravel will not apply changes it cannot verify.

Actions & autonomy
Can Unravel break something in production?

Arvix's validation loop is specifically designed to prevent this. Before any optimization is flagged as AutoApply-ready, Arvix models the expected impact against historical run behavior and rejects changes where the outcome is uncertain.

As of today, Unravel has a zero production incident record from Arvix-applied optimizations across its enterprise customer base. That said, Unravel is designed for you to stay in control. Start in Human-in-Loop mode — you review every recommendation before anything runs — and expand AutoApply as your confidence grows.

What's the difference between AutoApply and Human-in-Loop?

AutoApply: Arvix identifies an optimization, validates it, applies it, and logs the result — without requiring any human action. You see what happened in the Operations Center. Suitable for routine, low-risk actions like idle cluster suspension or minor warehouse resizing.

Human-in-Loop: Arvix generates the recommendation and queues it for your approval. You review the proposed change, expected impact, and validation rationale — then click to apply or dismiss. Code rewrites and structural query changes typically default to Human-in-Loop regardless of your policy setting.

Both modes are configurable per action category, per workload, and per team. Most customers use AutoApply for infrastructure and Human-in-Loop for code-level changes.
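One way to picture how these settings combine is a small policy lookup. Everything in this sketch is hypothetical: the category names, the data model, and the rule that code-level changes always go to review are assumptions drawn from the description above, not Unravel's configuration schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    HUMAN_IN_LOOP = "human-in-loop"
    AUTO_APPLY = "autoapply"


# Assumed: categories that are queued for review whatever the policy says.
ALWAYS_REVIEWED = {"code_rewrite", "structural_query_change"}


@dataclass
class AutomationPolicy:
    """Maps (workload, action category) pairs to an automation mode."""
    default: Mode = Mode.HUMAN_IN_LOOP
    overrides: dict = field(default_factory=dict)

    def resolve(self, workload: str, category: str) -> Mode:
        if category in ALWAYS_REVIEWED:
            return Mode.HUMAN_IN_LOOP
        return self.overrides.get((workload, category), self.default)


if __name__ == "__main__":
    policy = AutomationPolicy(
        overrides={("nightly_etl", "idle_cluster_suspend"): Mode.AUTO_APPLY}
    )
    print(policy.resolve("nightly_etl", "idle_cluster_suspend").value)
    print(policy.resolve("nightly_etl", "code_rewrite").value)
    print(policy.resolve("ad_hoc", "idle_cluster_suspend").value)
```

Defaulting to Human-in-Loop and opting individual workloads into AutoApply matches the ramp most teams follow.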

Can we limit Unravel to read-only mode — insights only, no actions?

Yes. Read-only mode is available and some teams start here during evaluation. You get the full Operations Center, all Efficiency Rating data, and Arvix's optimization queue — but nothing is applied without you explicitly enabling action permissions.

Practically, "read-only" is controlled at the credential level. If you grant Unravel only read permissions for your platform, it physically cannot apply actions regardless of UI settings.

Does Unravel support change management / approval workflows?

Yes. Human-in-Loop recommendations can be routed through approval workflows with configurable approvers by action type, cost threshold, or business unit. Slack and email notifications are supported for pending approvals.

All applied actions are logged with full audit trails — what changed, who approved it (or that it was AutoApply), and what the measured outcome was.

Integrations & APIs
Does Unravel have a REST API?

Yes. The Unravel REST API exposes cost data, efficiency scores, optimization queues, and action history. Common use cases include pulling data into internal dashboards, triggering actions from external orchestration, and integrating optimization recommendations into CI/CD pipelines.

API documentation is available in the Unravel customer portal after onboarding.

Which CI/CD systems does Unravel integrate with?

Unravel currently provides native plugins for GitHub Actions and Jenkins. The plugin adds a cost and performance check step to your existing pipeline — PR comments flag regressions with Arvix-generated fix suggestions before code merges.

A generic webhook-based integration is also available for other CI systems. GitLab CI and Azure DevOps native plugins are on the roadmap.

Can Unravel export cost data to our FinOps tools (e.g., Apptio, Spot.io)?

Yes. Unravel exports chargeback-ready cost attribution data in CSV and via the REST API. Fields include platform, workspace, job/query name, user, team, business unit, and cost — compatible with the tagging structures most FinOps tools expect.

Unravel is designed to complement cloud-level FinOps tools, not replace them. Unravel gives you the within-platform detail (job-level) that those tools can't see; they continue to handle your cloud commitment strategy.
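As a rough example of working with the CSV export, the snippet below totals cost per business unit. The column names are assumptions based on the field list above (the real export headers may differ), and the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Made-up sample in the assumed export layout.
SAMPLE_EXPORT = """platform,workspace,job_name,user,team,business_unit,cost
databricks,prod-ws,nightly_etl,alice,data-eng,retail,12.50
snowflake,acct-1,daily_report,bob,analytics,retail,3.25
bigquery,proj-a,ad_hoc,carol,analytics,finance,7.00
"""


def cost_by(rows, key: str) -> dict:
    """Sum the cost column grouped by the given column name."""
    totals = defaultdict(Decimal)
    for row in rows:
        # Decimal avoids float rounding drift when summing currency values.
        totals[row[key]] += Decimal(row["cost"])
    return dict(totals)


rows = list(csv.DictReader(io.StringIO(SAMPLE_EXPORT)))

if __name__ == "__main__":
    for unit, total in sorted(cost_by(rows, "business_unit").items()):
        print(unit, total)
```

The same function groups by `team`, `platform`, or `user` by changing the key, which is usually enough to reshape the export for a chargeback import.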

Does Unravel support SSO?

Yes. Unravel supports SAML 2.0 for SSO. Tested integrations include Okta, Azure Active Directory, and Ping Identity. SCIM provisioning for automated user management is also supported on Enterprise plans.

AI & LLMs
What AI does Arvix use under the hood?

Arvix is built on two complementary layers. The first is a set of purpose-built ML models trained specifically on data platform workloads — query execution behavior, cluster utilization patterns, cost anomalies, and optimization outcomes. These models have been built and refined over years of production data across thousands of enterprise environments. They understand data platforms at a depth that general-purpose AI does not.

The second layer is an LLM integration for tasks that benefit from language understanding — code generation, natural language explanations of optimization recommendations, and conversational querying of your environment. Arvix uses whichever LLM your organization already trusts and has approved. If you use Azure OpenAI, Anthropic, Google Gemini, AWS Bedrock, or a self-hosted model, Arvix can connect to it. You're not locked into a specific LLM provider.

The combination matters: the purpose-built ML gives Arvix the domain depth to generate high-confidence, production-safe recommendations. The LLM makes those recommendations legible, actionable, and conversational. Neither alone does what both do together.

Can we use our own LLM or a self-hosted model?

Yes. Arvix's LLM layer is designed to connect to the model your organization already uses. Supported configurations include Azure OpenAI (with your own keys and endpoint), AWS Bedrock, Anthropic, Google Gemini, and self-hosted open-source models via a compatible API interface.

This matters for two reasons: data governance (your query text and metadata go to your LLM, under your terms) and control (you can update or swap the LLM without waiting on Unravel). The purpose-built ML layer is Unravel-managed and doesn't change based on your LLM choice.

Does Unravel send my data to an LLM?

When LLM features are used (code explanations, natural language recommendations, conversational queries), Unravel passes metadata and query structure — not row-level data. What gets sent to the LLM is the same class of information Unravel already holds: query text, schema references, optimization context.

Because you configure which LLM endpoint Arvix connects to, the data handling terms are governed by your existing agreement with that provider. If you use Azure OpenAI or a self-hosted model, data stays within the boundaries you've already established.

LLM features can also be disabled entirely if your security policy requires it — Arvix's core optimization capabilities (cost reduction, cluster rightsizing, performance tuning) don't depend on LLM integration.

How is Arvix different from just asking ChatGPT to optimize my queries?

A general-purpose LLM operates in isolation — it sees the query you paste, with no context about your environment, your data volumes, your historical run behavior, or your cost structure. It gives you an answer. You still have to evaluate it, test it, and apply it yourself. If it's wrong, you find out in production.

Arvix operates continuously inside your actual environment. It knows the full system: which jobs are expensive, why they're expensive, how they've behaved across hundreds of past runs, what changes have been tried before, and what the current cost and SLA constraints are. When it generates a fix, it validates it against that history before surfacing it. When AutoApply is enabled, it applies and measures the result.

The distinction is advisor vs. operator. An LLM tells you what to try. Arvix does it — and proves it worked.

Pricing & contracts
How is Unravel priced?

Unravel is priced as an annual subscription based on the size of your data platform environment — specifically the compute footprint across connected platforms. Once you're in, everything is unlimited: users, applications, workloads, API calls, optimization actions, and integrations. There are no per-seat fees and no action metering.

This model is intentional. Optimization works best when there are no incentives to limit who has access or how much Unravel does. Your entire team — engineers, FinOps, platform owners — can use the full product from day one.

Pricing is structured to deliver a significant ROI multiple — most customers see 5–10× return relative to subscription cost. Talk to the team for a quote based on your environment.

Is there a free trial or proof-of-concept option?

Yes. Unravel offers a structured proof-of-concept engagement for enterprise prospects — typically 2–4 weeks. The POC connects to one of your production environments, generates real optimization findings, and produces a quantified savings estimate against your actual spend.

Most POCs identify a meaningful savings opportunity within the first 48 hours of data collection. Request a demo to get started.