Prerequisites
Complete the following prerequisites before installing Unravel.
Each version of Unravel has specific platform requirements. Check the Compatibility Matrix to confirm that your Google Cloud platform meets the requirements for the Unravel version you are installing.
Compute Engine GCE type: General-purpose:
Minimum: n2-standard-16 / n1-standard-16 (64 GiB RAM)
Maximum: n2-standard-64 / n1-standard-64 (256 GiB RAM)
Recommended: n2-standard-32 / n1-standard-32 (128 GiB RAM)
Virtualization type: HVM
Root device type: Standard Persistent Disk / SSD persistent disks
Volume specifications:
Minimum: 200GiB.
In a PoC or evaluation, the minimum root disk space should be sufficient.
When monitoring many BigQuery clusters or a large number of jobs, we recommend a 300-500GB SSD persistent disk that can handle high IOPS rates.
For production use, we recommend 500GiB SSD persistent disks.
The Baseline IOPS (3 IOPS per GiB with a minimum of 100 IOPS, burstable to 3000 IOPS) is sufficient for Unravel.
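As a quick worked example of the baseline rule quoted above (3 IOPS per GiB with a floor of 100 IOPS), the following sketch computes the baseline for the disk sizes mentioned in this section. The function name `baseline_iops` is illustrative only, and the numbers come purely from the formula as stated here; confirm actual disk performance against your Google Cloud disk type.

```python
def baseline_iops(size_gib: int, per_gib: int = 3, floor: int = 100) -> int:
    """Baseline IOPS under the rule stated above: per_gib IOPS per GiB, never below floor."""
    if size_gib <= 0:
        raise ValueError("disk size must be a positive number of GiB")
    return max(floor, per_gib * size_gib)


# Disk sizes taken from the volume specifications above.
for size in (200, 300, 500):
    print(f"{size} GiB -> {baseline_iops(size)} baseline IOPS")
```

Under this rule, the 200 GiB minimum yields 600 baseline IOPS and the 500 GiB production recommendation yields 1500.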
Note
Unravel Server does not require heavy resources, but it's best to check your BigQuery Quotas as you proceed.
Important
You must have separate nodes for the Unravel server and the external database.
Each node must meet the minimum requirements for cores, RAM, and disk.
Operating system: RedHat/CentOS 6.4 - 7.4
The following ports must be open on the Unravel GCE. In addition, the Unravel GCE must be able to access all ports on the BigQuery cluster.
Ports | Direction | Description |
---|---|---|
3000 | Both | HTTPS traffic to and from Unravel UI. |
4043 | In | UDP and TCP ingest traffic from the entire cluster to Unravel Servers. |
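To confirm that the ports in the table above are reachable, a small connectivity probe can be run from a BigQuery cluster node or another host in the same network. This is a minimal sketch, not part of Unravel: the function names are hypothetical, and it checks TCP only, since a plain connect cannot verify UDP ingest on port 4043.

```python
import socket

# TCP ports from the table above: 3000 (Unravel UI) and 4043 (ingest).
UNRAVEL_TCP_PORTS = (3000, 4043)


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host: str, ports=UNRAVEL_TCP_PORTS, timeout: float = 3.0) -> dict:
    """Map each port to whether a TCP connection to it succeeded."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}


# Example call (replace the placeholder with your Unravel server address):
# print(check_ports("UNRAVEL_SERVER_IP"))
```

A result of `False` for a port usually points to a missing firewall rule or to the Unravel service not yet listening on that port.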
To manage, monitor, and optimize the modern data applications running on your BigQuery cluster, Unravel needs data from the cluster as well as from the apps running on it. This data includes metrics, configuration information, and logs. Part of this data is pushed to Unravel, and part is pulled by the daemons running on the Unravel server. For all the data to be accessible, there must be both inbound and outbound access between the Unravel server (on the GCE) and the BigQuery cluster.
The Unravel server must be in the same region as the target BigQuery clusters it is monitoring. There are two possible scenarios:
The BigQuery cluster and the Unravel server are created on the same VPC and the same subnet, and the firewall rules allow all traffic from within that subnet.
The BigQuery cluster is located on a different VPC than the Unravel server. In this case, you must configure VPC peering, create the required routes, and update the firewall policy.
The Unravel server needs a TCP and UDP connection to the BigQuery master node. To implement this, do either of the following:
Create a firewall rule that allows traffic on ports 3000 and 4043 from the BigQuery cluster nodes' IP addresses. Configure the firewall rule on the Unravel server to allow TCP traffic on port 3000 from the BigQuery cluster nodes.
Add the members of the firewall rule used on the BigQuery cluster to this rule.
The Unravel server and BigQuery clusters must allow all outbound traffic.
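The first option above can be sketched as a `gcloud compute firewall-rules create` command. The helper below only assembles and prints the command so that it can be reviewed before anyone runs it. The rule name, network name, and source range are placeholders, not values from this guide; substitute the ones for your project.

```python
import shlex


def firewall_rule_command(rule_name, network, source_ranges,
                          tcp_ports=(3000, 4043), udp_ports=(4043,)):
    """Build (but do not run) a gcloud command that opens the Unravel ports for ingress."""
    allow = [f"tcp:{p}" for p in tcp_ports] + [f"udp:{p}" for p in udp_ports]
    return [
        "gcloud", "compute", "firewall-rules", "create", rule_name,
        f"--network={network}",
        "--direction=INGRESS",
        f"--allow={','.join(allow)}",
        f"--source-ranges={','.join(source_ranges)}",
    ]


# Placeholder values for illustration only.
cmd = firewall_rule_command("unravel-ingest", "default", ["10.128.0.0/20"])
print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch side-effect free. Restrict the source range to the BigQuery cluster nodes' addresses rather than opening the ports broadly.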
These instructions are self-contained and require only basic knowledge of GCP. You don't need to create any scripts or be familiar with any specific programming or scripting language.
These instructions assume you're proficient in:
Provisioning GCEs.
Creating and configuring the required IAM roles, firewall rules, etc.
Understanding GCP networking concepts such as virtual private clouds (VPCs) and subnets.
Running Ansible scripts, basic Unix commands, and gcloud CLI commands.
Have the following ready before you add the projects:
Google account with required IAM role permissions for gcloud CLI authentication
Project ID file, if you are integrating multiple projects with Unravel at the same time.
Log Receiver (LR) endpoint (only for Push method)
This is required only when the push method is configured to fetch data from BigQuery.
Subscription ID (Optional)
The subscription ID that you want to configure with a Pub/Sub topic. This is optional. If you do not provide one, the default subscription ID unravel-bigquery-sub is used.
Note
If you have created the resources on GCP, then the Subscription ID is mandatory.
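The subscription ID rules above can be summarized as a small decision function. This is an illustrative sketch of the logic only, not Unravel's implementation; the function name and the `resources_precreated` flag are hypothetical.

```python
DEFAULT_SUBSCRIPTION_ID = "unravel-bigquery-sub"  # default named in this section


def resolve_subscription_id(subscription_id=None, resources_precreated=False):
    """Pick the subscription ID to use, following the rules described above."""
    if subscription_id and subscription_id.strip():
        return subscription_id.strip()
    if resources_precreated:
        # Resources already created on GCP: the ID is mandatory.
        raise ValueError("Subscription ID is mandatory when GCP resources already exist")
    return DEFAULT_SUBSCRIPTION_ID


print(resolve_subscription_id())          # falls back to the default
print(resolve_subscription_id("my-sub"))  # uses the supplied ID
```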