Multi-cluster
This document describes the properties used to configure a multi-cluster deployment.
Layout of multi-cluster configurations for a component:
- com.unraveldata.&lt;component&gt;.list = Comma-delimited list of variables, one per component instance.
- com.unraveldata.&lt;component&gt;.&lt;variable&gt;.&lt;property&gt; = Value. The variables are labels used for configuration only and do not need to correspond to hostnames or any other property of the components. Acceptable characters: alphanumerics, hyphens, and underscores only.
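The layout above can be sketched with a hypothetical example. Here "kafka" is the component, and "east" and "west" are arbitrary variable names for two Kafka clusters; the per-instance property name (bootstrap.servers) and host values are illustrative assumptions, not properties documented here.

```properties
# Hypothetical example: two Kafka instances registered under the
# arbitrary variable names "east" and "west".
com.unraveldata.kafka.list=east,west

# Per-instance properties are keyed by the variable name.
# (bootstrap.servers and the host:port values are illustrative only.)
com.unraveldata.kafka.east.bootstrap.servers=kafka-east-1:9092
com.unraveldata.kafka.west.bootstrap.servers=kafka-west-1:9092
```

Note that the variable names are labels only; they do not have to match the hostnames of the instances they describe.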
Core node configurations for components
The following components are accessed from the Core node, where Unravel is installed.
- Cloudera Manager (CM) 
- Ambari 
- Hive Metastore 
- Kafka 
- Pipeline (Workflows) 
The following properties must be configured on the Core node in a multi-cluster setup.
Cloudera Manager - Multi-cluster
Ambari
Hadoop
You must set the following property when you install the Core node on a server that has no Hadoop configuration:
Hive Metastore
Kafka
Pipelines
Edge node configurations for components
In a multi-cluster deployment on on-prem platforms, the following properties must be added on the Edge node server so that Unravel can load MapReduce jhist files, job configuration, and YARN logs from their HDFS paths:
| Property | Description | Default |
|---|---|---|
| com.unraveldata.min.job.duration.for.attempt.log | Minimum duration of a successful application for which executor logs are processed (in milliseconds). | 600000 (10 min) |
| com.unraveldata.job.collector.log.aggregation.base | HDFS path to the aggregated container logs (logs to process). Do not include the hdfs:// prefix. The log format defaults to TFile; you can specify multiple logs and log formats (TFile or IndexedFormat). For HDP set this to: | /tmp/logs/*/logs/ |
| com.unraveldata.job.collector.done.log.base | HDFS path to the "done" directory of MR logs. Do not include the hdfs:// prefix. For HDP set this to: | hdfs:///user/spark/applicationHistory/ |
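As a minimal sketch, the two properties with unambiguous defaults in the table above could appear in the Edge node's properties file as follows; the paths depend on your distribution's YARN and MapReduce history settings, so treat these values as placeholders to verify against your cluster.

```properties
# Process executor logs only for applications that ran at least
# 10 minutes (600000 ms) -- the documented default.
com.unraveldata.min.job.duration.for.attempt.log=600000

# Aggregated YARN container logs, without the hdfs:// prefix.
# Must match yarn.nodemanager.remote-app-log-dir on the cluster.
com.unraveldata.job.collector.log.aggregation.base=/tmp/logs/*/logs/
```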
In a multi-cluster deployment on on-prem platforms, the following Tez properties must also be added on the Edge node server.