Migrations

migration.png

Cluster discovery

The Cluster discovery report provides overall information about your cluster. The report includes the following sections:

  • On-Prem Cluster Identity contains the cluster configuration details and host information.

  • Overall cluster usage graphs of:

    • Applications submitted, by app type, by user, and by queue.

    • CPU

    • Memory

  • A CPU/Memory heatmap that aggregates usage by weekday and then by hour of the day.

Generating Cluster Discovery report
  1. Go to Migrations > Cluster Discovery.

  2. Click the run.png button to generate a new report.

    migration-clusterdiscovery-run.png
  3. Select a period range from the date picker and click Run to generate the report.

    The progress of the report generation is shown at the top of the page, and you are notified when the report is created successfully.

    All reports (successful or failed attempts) are in the Reports Archive.

Note

Before the initial report generation, the default is a seven-day history.

Scheduling Cluster Discovery report
  1. Click Schedule to generate the report regularly and provide the following details:

    • History (Date Range): Select a period from the date picker.

    • Schedule Name: Name of the schedule.

    • Schedule to Run: Select one of the following schedule options from the drop-down and set the time from the hours and minutes drop-downs:

      • Daily

      • Weekdays (Sun-Sat)

      • Every two weeks

      • Every month

    • Notification: Provide an email ID to receive the notification of the reports generated.

  2. Click Schedule.

Viewing the Cluster Discovery report
migration-cluster-discovery_report.png
On-Prem Cluster Identity

This tile contains information about your cluster, including the hosts. The Host Summary section shows the cluster's capacity across all hosts.

cluster-discovery-on-prem-cluster-identity.png

To see each host's hardware specifications and the host's roles, click the # Hosts link. The table can be searched on hostname. The potential roles are:

  • Server: Has at least one server component, such as Zookeeper Server, HDFS.

  • Worker: Has at least one daemon component such as HDFS DataNode, YARN NodeManager, or HBase RegionServer.

  • Client: Has at least one client component, such as Zookeeper Client, Hadoop Client, or Hive Client.
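The role rules above can be sketched as a simple lookup. This is only an illustration: the component-to-role table is hypothetical and contains just the example components named in the list, and it is not how Unravel necessarily derives roles.

```python
# Hypothetical component-to-role table, seeded only with the examples named above.
ROLE_BY_COMPONENT = {
    "Zookeeper Server": "Server",
    "HDFS DataNode": "Worker",
    "YARN NodeManager": "Worker",
    "HBase RegionServer": "Worker",
    "Zookeeper Client": "Client",
    "Hadoop Client": "Client",
    "Hive Client": "Client",
}

def host_roles(components):
    """Return the sorted roles implied by a host's components (unknown ones are ignored)."""
    return sorted({ROLE_BY_COMPONENT[c] for c in components if c in ROLE_BY_COMPONENT})

print(host_roles(["HDFS DataNode", "Hive Client"]))  # ['Client', 'Worker']
```

A host gets every role for which it has at least one matching component, so a single host can be both a Worker and a Client.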

Applications usage

The donut graphs present the overall usage, in a cluster, of the applications grouped by app type, user and queue. The top 10 in each category are shown. You can select or deselect the checkboxes corresponding to the categories to change the graph view.

cluster-discovery-graphs-resource_usage1.png
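As a rough illustration of the "top 10 in each category" grouping, the following sketch counts applications per app type, user, or queue. The record layout and field names are assumptions made for the example and are not taken from Unravel.

```python
from collections import Counter

# Hypothetical application records; field names are illustrative only.
apps = [
    {"app_type": "Spark", "user": "alice", "queue": "root.users.alice"},
    {"app_type": "Tez", "user": "bob", "queue": "root.default"},
    {"app_type": "Spark", "user": "alice", "queue": "root.default"},
]

def top_n(records, category, n=10):
    """Count records per value of `category` and keep the n largest groups."""
    return Counter(r[category] for r in records).most_common(n)

print(top_n(apps, "app_type"))  # [('Spark', 2), ('Tez', 1)]
```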
Resource availability and usage

The graphs display the cluster's CPU and memory utilization over the time period. The capacity and the actual usage trends are plotted on the graph. The average usage is listed on the right-hand side of the title bar. Hover over the text next to the resource's name to see Unravel's analysis of your cluster's usage for that resource.

cluster-discovery-graphs-resource_usage2.png

The heatmap shows CPU or memory usage and capacity by weekday and hour, for example, Monday between 5 and 6 a.m. Each time slot is color-coded to show how hot it is relative to the rest of the map, so you can quickly see the load distribution across your cluster. You can filter the heatmap by CPU or memory; it opens displaying CPU. In the example shown, the CPU graph above indicated that the CPU is under-utilized, and the heatmap graphically supports that analysis. Click expand.png to expand and view the heatmap.

cluster-discovery-cpu-heatmap.png
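The weekday-and-hour aggregation behind such a heatmap can be sketched as follows. The sample format is an assumption, and averaging per slot is one plausible choice; the report may aggregate differently.

```python
from collections import defaultdict
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

def usage_heatmap(samples):
    """Average usage per (weekday, hour) slot.

    `samples` is an iterable of (timestamp, usage) pairs; this shape is assumed.
    """
    totals = defaultdict(lambda: [0.0, 0])  # slot -> [sum of usage, sample count]
    for ts, usage in samples:
        slot = (WEEKDAYS[ts.weekday()], ts.hour)
        totals[slot][0] += usage
        totals[slot][1] += 1
    return {slot: total / count for slot, (total, count) in totals.items()}

samples = [
    (datetime(2024, 1, 1, 5, 10), 40.0),  # a Monday, between 5 and 6 a.m.
    (datetime(2024, 1, 8, 5, 45), 60.0),  # the following Monday, same hour
    (datetime(2024, 1, 2, 14, 0), 10.0),  # a Tuesday afternoon
]
heatmap = usage_heatmap(samples)
print(heatmap[("Monday", 5)])  # 50.0
```

Both Monday samples land in the same slot, which is why a multi-week history produces one value per weekday and hour.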

Cloud mapping per host

In this report, the cluster data is analyzed for the specified time period, and recommendations are provided based on the current hardware specifications and usage. You can view the following variations of the report:

  • Lift and Shift is a one-to-one mapping of each existing host's capacity (based on CPU, memory, and disk) to the closest fit on the cloud, which meets or exceeds the host's hardware specifications. This method provides an estimate of what your current on-prem cluster configuration can cost on the cloud. This does not consider workload or actual resource usage. However, it minimizes your risks associated with migrating to the cloud.

  • Cost Reduction is a one-to-one mapping of each existing host's actual usage (based on CPU, memory, and disk) to the closest fit on the cloud that meets or exceeds the host's usage requirements. This method optimizes for cost but not necessarily for peak usage. While still a one-to-one mapping, this method is usually more cost-effective than Lift and Shift because it minimizes over-provisioning (the under-utilization of your host's resources).
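The difference between the two variations can be sketched with a single "meets or exceeds" fit function that is fed either the host's capacity or its actual usage. The VM catalog, its prices, and the choice of "cheapest VM that fits" as the definition of closest fit are all assumptions for illustration; they are not Unravel's actual algorithm or real cloud prices.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Spec:
    cpu: int          # cores
    memory_gb: float
    disk_gb: float

# Hypothetical VM catalog: name -> (spec, hourly price). Not real cloud prices.
CATALOG = {
    "small": (Spec(4, 16, 100), 0.20),
    "medium": (Spec(8, 32, 200), 0.40),
    "large": (Spec(16, 64, 500), 0.80),
}

def closest_fit(need, catalog=CATALOG):
    """Cheapest VM whose CPU, memory, and disk all meet or exceed `need`."""
    fits = [
        (price, name)
        for name, (spec, price) in catalog.items()
        if spec.cpu >= need.cpu
        and spec.memory_gb >= need.memory_gb
        and spec.disk_gb >= need.disk_gb
    ]
    return min(fits)[1] if fits else None

capacity = Spec(cpu=12, memory_gb=48, disk_gb=400)  # what the host has
usage = Spec(cpu=3, memory_gb=20, disk_gb=150)      # what the host actually uses

print(closest_fit(capacity))  # Lift and Shift view: large
print(closest_fit(usage))     # Cost Reduction view: medium
```

In this made-up case the same host maps to a smaller, cheaper VM under Cost Reduction because its usage is well below its capacity.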

Generating Cloud Mapping Per Host report

Unravel generates reports using:

  • Lift and Shift: a one-to-one mapping of each on-prem host on the cloud based on your host's capacity.

  • Cost Reduction: a one-to-one mapping of each on-prem host on the cloud based on your host's actual usage.

To generate the report:

  1. Go to Migrations > Cloud Mapping Per Host.

  2. Click the run.png button to generate a new report.

  3. Select the following:

    • Cloud Product/Service

      • Google Compute Engine (IaaS)

      • Google DataProc

      • Amazon EC2 (IaaS)

      • Amazon EMR

      • Azure (IaaS)

      • Azure HDInsight

    • Region: Select a region. The available regions are specific to the Cloud Product/Service selected.

    • Storage Type: You can choose Local Storage or Object Storage. The Storage Name is determined by the storage type chosen.

    • One or more VM types: The list is populated based upon the Cloud Product/Service and Region you chose.

      To select a subset of VM types, check the box for each VM Type. To select all types, check the box in the header. Unravel maps each host to the best fit of the VM types you selected. Click Reset to clear your selections.

  4. (Optional) Enter a custom cost for a VM type. Unravel then uses that cost in the calculation for that VM type; otherwise, it uses publicly listed prices.

  5. Click Run to generate the report.

    The report generation's progress is shown at the top of the page, and you are notified about the successful creation of the report.

    All reports (successful or failed attempts) are in the Reports Archive.
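The optional custom cost in step 4 behaves like a per-VM-type override of the list price. A minimal sketch, assuming a hypothetical price table with made-up numbers:

```python
# Hypothetical list prices in $/hour; illustrative numbers, not real quotes.
LIST_PRICES = {"small": 0.20, "medium": 0.40, "large": 0.80}

def effective_price(vm_type, custom_costs=None):
    """Use the custom cost for a VM type when one is given, else the list price."""
    custom_costs = custom_costs or {}
    return custom_costs.get(vm_type, LIST_PRICES[vm_type])

print(effective_price("medium"))                    # 0.4
print(effective_price("medium", {"medium": 0.31}))  # 0.31
```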

Scheduling Cloud Mapping Per Host report
  1. Click Schedule to generate the report regularly and provide the following details along with the details to generate the report:

    • Schedule Name: Name of the schedule.

    • Schedule to Run: Select one of the following schedule options from the drop-down and set the time from the hours and minutes drop-downs:

      • Daily

      • Weekdays (Sun-Sat)

      • Every two weeks

      • Every month

    • Notification: Provide email IDs to receive the notification of the reports generated.

  2. Click Schedule.

Viewing the Cloud Mapping Per Host report

The latest generated report can be viewed from the following tabs:

  • Lift and Shift

  • Cost Reduction

By default, the tab opens, displaying the last report generated for Lift and Shift. Click Cost Reduction to see that report. Each host is mapped to one of the instances you selected when generating the report.

A brief explanation of the method and the Cloud Product/Service used for the report is provided.

A summary lists the Total Hourly Cost, the Total Object Storage Required (or Local Storage), and the instances your hosts are mapped to.

A table listing the mapping for each host to a VM type is shown. The table contains:

  • Host: Your on-prem host.

  • Host Role: The host's role: server, worker, or client. A host can have more than one role.

  • Actual Usage: The host's actual resource usage.

  • Capacity: The total capacity of the host.

  • Recommendation: The cloud instance Unravel maps your host to.

  • Total Cost ($/Hour): The hourly cost of the instance.
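The summary figures follow from the per-host rows. The sketch below assumes a hypothetical row layout with made-up hosts, VM names, and prices, and shows how a total hourly cost and an instance count could be derived from it.

```python
from collections import Counter

# Hypothetical rows of the mapping table; hosts, VM names, and prices are made up.
rows = [
    {"host": "node1", "recommendation": "large", "cost_per_hour": 0.80},
    {"host": "node2", "recommendation": "medium", "cost_per_hour": 0.40},
    {"host": "node3", "recommendation": "medium", "cost_per_hour": 0.40},
]

# Total hourly cost is the sum of the per-host instance costs.
total_hourly_cost = round(sum(r["cost_per_hour"] for r in rows), 2)

# Number of hosts mapped to each recommended instance.
instances = Counter(r["recommendation"] for r in rows)

print(total_hourly_cost)  # 1.6
print(dict(instances))    # {'large': 1, 'medium': 2}
```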

Lift and Shift

This method is a one-to-one mapping of each on-prem host's capacity to an equivalent VM type in the cloud. The mapped VM meets or, where necessary, exceeds your host's capacity.

cloudmapping-per-host-main.png

Cost Reduction

This method maps each host to an instance based on the host's actual usage, not its capacity.

cloudmapping-per-host-cost.png

Workload Fit

Report generation takes approximately 50 seconds for a slice that has approximately one million apps. The number of apps is dependent on both the slice's parameters and the selected time period.

This report analyzes your cluster workload based on your selected criteria. It enables you to map your workload (or a portion of it) onto a cloud product/service and see the costs associated with migrating the slice.

First, select a time range from the Date picker.

Then, examine your cluster by

  • Job Count

  • Memory Hours

  • CPU Hours

Then you select a slice by

  • Users

  • App Types

  • Queues

  • Tags

For instance, you can evaluate your workload grouped by Job Count and generate the migration costs on a slice selected by Application Type, Users, Queues, or Tags.
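The two choices described above, a metric to examine the cluster by and a slice to select, can be sketched as a grouping function and a filter. The record fields are hypothetical and chosen only for the example.

```python
from collections import defaultdict

# Hypothetical app records; field names are assumptions for illustration.
apps = [
    {"user": "alice", "app_type": "Spark", "queue": "root.users.alice",
     "memory_hours": 12.0, "cpu_hours": 3.0},
    {"user": "bob", "app_type": "Tez", "queue": "root.default",
     "memory_hours": 4.0, "cpu_hours": 1.0},
    {"user": "alice", "app_type": "Tez", "queue": "root.default",
     "memory_hours": 6.0, "cpu_hours": 2.0},
]

def usage_by(records, group_key, metric="job_count"):
    """Total the chosen metric per group; 'job_count' simply counts records."""
    totals = defaultdict(float)
    for r in records:
        totals[r[group_key]] += 1 if metric == "job_count" else r[metric]
    return dict(totals)

def select_slice(records, **selected):
    """Keep records whose fields fall in every selected set, e.g. app_type={'Tez'}."""
    return [r for r in records if all(r[k] in allowed for k, allowed in selected.items())]

tez = select_slice(apps, app_type={"Tez"})
print(usage_by(apps, "app_type"))             # {'Spark': 1.0, 'Tez': 2.0}
print(usage_by(tez, "user", "memory_hours"))  # {'bob': 4.0, 'alice': 6.0}
```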

Configuring the Workload fit report

To enable and configure the Workload fit report, set the properties in the following table, using the command:

<Installation directory>/manager config properties set <KEY> <VALUES>

For example:

<Unravel installation directory>/manager config properties set com.unraveldata.migrationplanning.workloadfit.enabled true
Generating the Workload Fit report

Run the following steps to generate the Workload Fit report:

Step 0: Select the time period for the report.

From the date picker on the right, select a time range for the report. Unravel recommends using at least seven days of metrics to generate a meaningful report.

Step 1: Select a slice of the cluster's applications.
migration-step1.png
  1. From the drop-down, select one of the following metrics to analyze your cluster's overall usage:

    • Job Count

    • Memory Hours

    • CPU Hours

  2. Select the criteria to use for analyzing the migration costs. You can select all or a subset of Application Type, User, Queue, or Tags. All available options for each type are listed.

    Note

    It is these selections that determine the slice of your cluster that is analyzed.

    Hover over a chart section to see the name and absolute number for that section.

    Tags

    Tip

    A tag consists of a key and a value (tag key:tag value); a tag key can have multiple values. The following images display the tag key Project, which has five values.

    If you are unfamiliar with tagging, see What is tagging.

    The Tags chart behaves differently from the other charts. Hovering over a donut slice shows the number of apps with a specific tag key, regardless of whether you have deselected a specific tag value. Hovering over the Project section shows there are 24 apps tagged with a Project:value. Note that the value Advertising is deselected. If all Project values are deselected, Project is not included in the donut.

    WorkloadFit-Tags-ProjectHover.png

    Click on the Project section to bring up the distribution of the tag values. Advertising is deselected, yet it is shown in the donut. Click Back to Tags to return to the tag key view.

    WorkloadFit-Tags-ProjectValuesHover.png

    Important

    Regardless of how the tags are displayed, the migration cost analysis only uses the selected values.

    Example workload slices

    • All Tez applications.

      • Select Tez under Application Type.

      • Select All for Users, Queues, and Tags.

    • All Spark applications on the root.users.root queue for all tag values for Project.

      • Select Spark for Application Type.

      • Select the root.users.root queue for Queues.

      • Deselect all tag keys except Project under Tags.

      • Select All for Users.
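The Tags behavior described in this step, where the hover count includes deselected values but the cost analysis uses only selected values, can be sketched as two separate functions. The app records and the Project values other than Advertising are made up, and matching on any selected tag key is an assumption.

```python
# Hypothetical tagged apps; the Project values other than Advertising are made up.
apps = [
    {"id": 1, "tags": {"Project": "Advertising"}},
    {"id": 2, "tags": {"Project": "Billing"}},
    {"id": 3, "tags": {"Project": "Search"}},
    {"id": 4, "tags": {}},
]
selected = {"Project": {"Billing", "Search"}}  # Advertising is deselected

def hover_count(records, tag_key):
    """Apps carrying the tag key, whatever value they have."""
    return sum(1 for r in records if tag_key in r["tags"])

def apps_in_slice(records, selected_values):
    """Apps whose tag value is among the selected values for some tag key."""
    return [
        r for r in records
        if any(r["tags"].get(k) in values for k, values in selected_values.items())
    ]

print(hover_count(apps, "Project"))                      # 3
print([r["id"] for r in apps_in_slice(apps, selected)])  # [2, 3]
```

The hover count still includes the Advertising app, while the slice passed to the cost analysis does not.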

Heatmap

Click Get Heatmap View to view the heatmap for your slice. You can view the heatmap grouped by Job Count, Memory, or vCores. The heatmap typically changes based upon your selection. The current view is noted in the title bar. In the following image, the heatmap is grouped by Job Count.

workload-heatmap.png

The heatmap is not automatically updated when you change your slice. After changing the slice, always click Get Heatmap View again to update the heatmap.

Step 2: Obtain cloud topology for the selection

You first must select your VM preferences for the cost analysis.

To edit the VM preferences:

  1. Click the Edit link. The VM preferences dialog box is displayed.

    migration-preferences-summary.png
    migration-vm-preferences.png
  2. Select the following:

    • Cloud Product/Service

      • Google Compute Engine (IaaS)

      • Google DataProc

      • Amazon EC2 (IaaS)

      • Amazon EMR

      • Azure (IaaS)

      • Azure HDInsight

    • Region: Select a region. The available regions are specific to the Cloud Product/Service selected.

    • One or more VM types: The list is populated based upon the Cloud Product/Service and Region you chose.

      To select a subset of VM types, check the box for each VM Type. To select all types, check the box in the header. Unravel maps each host to the best fit of the VM types you selected. Click Reset to clear your selections.

  3. (Optional) Enter a custom cost for a VM type. Unravel then uses that cost in the calculation for that VM type; otherwise, it uses publicly listed prices.

  4. Click Run to generate the report.

    The report generation's progress is shown at the top of the page, and you are notified about the successful creation of the report.

    All reports (successful or failed attempts) are in the Reports Archive.

Viewing the Workload Fit report

In the report window, the left tile lists your VM settings. The Storage Type is always Object Storage. At least two and at most three tabs/reports are generated; Map to Multiple Clusters is not necessarily created.

  • Map to Single Cluster: Maps your workload slice to the best cluster from your selected VMs.

    migrations-maptosingle.png

    Cluster topology details list the Total Cost (/hr) of the VM, the Object Storage Cost (/hr), and the Local Attached Storage Cost (/hr). The table has a row for each Node Type, showing the number of VM instances used for the mapping and the associated cost.

  • Map to Single Cluster with Auto Scaling: Compares the cost of migration if you use auto-scaling with a single cluster. The same VM used in Map to Single Cluster is used for the analysis. The chart graphs the number of VMs and their cost by weekday and hour.

    migrations-maptosingleclusterwith-as.png
  • Map to Multiple Clusters: Unravel maps your clusters to multiple VMs. If you have selected multiple App Types, there is a Cluster Topology Details section for each type. Click the App Type to see the type's details. The aggregated VM and storage types for all apps are listed above the table on the right side.

    migration-workloadfit-map-to-multicluster.png
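To see why the auto-scaling view can differ in cost from the single-cluster view, consider a sketch in which the number of VMs needed varies by hour. The demand profile, the price, and the assumption that a fixed cluster is sized for its peak hour are all hypothetical; Unravel's own cost model may differ.

```python
# Hypothetical demand: VMs needed in each hour of one day, and a made-up price.
vms_needed_per_hour = [2] * 8 + [10] * 8 + [4] * 8
price_per_vm_hour = 0.40

# Fixed-size cluster: provisioned for the peak hour all day (an assumption).
fixed_cost = round(
    max(vms_needed_per_hour) * len(vms_needed_per_hour) * price_per_vm_hour, 2
)

# Auto-scaled cluster: pay only for the VMs needed in each hour.
autoscaled_cost = round(sum(vms_needed_per_hour) * price_per_vm_hour, 2)

print(fixed_cost)       # 96.0
print(autoscaled_cost)  # 51.2
```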

Services and Versions compatibility

The report is a matrix that maps the services on your on-prem platform to the services that are available and compatible on a cloud provider.

Generating Services and Versions Compatibility report
  1. Go to Migrations > Services and Versions Compatibility.

  2. Click the run.png button to generate a new report.

    servicesversions-runreport.png
  3. Select a cloud product and click Run to generate the report.

    The progress of the report generation is shown at the top of the page, and you are notified when the report is created successfully.

    All reports (successful or failed attempts) are in the Reports Archive.

Note

Before the initial report generation, the default is a seven-day history.

Scheduling Services and Versions Compatibility report
  1. Click Schedule to generate the report regularly and provide the following details:

    servicesversions-schedulereport.png
    • Cloud Product: Select a cloud product.

    • Schedule Name: Name of the schedule.

    • Schedule to Run: Select one of the following schedule options from the drop-down and set the time from the hours and minutes drop-downs:

      • Daily

      • Weekdays (Sun-Sat)

      • Every two weeks

      • Every month

    • Notification: Provide an email ID to receive the notification of the reports generated.

  2. Click Schedule.

Viewing the Services and Versions Compatibility report

The compatibility of the cloud services can be checked based on the following color-coded categories:

  • Services and Versions are Compatible: The service is on your cluster and the cloud product is compatible. (4530-Rep-MigPlan-ServCompat.png )

  • Services and Versions are not Compatible: The service is on your cluster and the cloud product, but they aren't compatible. (4530-Rep-MigPlan-ServNotCompat.png )

  • Service available in Source, but missing in Target: The service is on your cluster but not on the cloud product. (4530-Rep-MigPlan-SrcNotTarg.png )

  • Service missing in Source, but available in Target: The service is missing in your cluster but available on the cloud product. (4530-Rep-MigPln-NotSrcTarget.png )

servicesandversions.png
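The four categories can be read as a small decision rule over three facts: whether the service is on your cluster, whether it is on the cloud product, and whether the versions are compatible. The function below is a hypothetical sketch of that rule; how Unravel determines version compatibility is not covered here.

```python
def compatibility(in_source, in_target, versions_compatible=False):
    """Return the report category for one service, or None if it is on neither side."""
    if in_source and in_target:
        if versions_compatible:
            return "Services and Versions are Compatible"
        return "Services and Versions are not Compatible"
    if in_source:
        return "Service available in Source, but missing in Target"
    if in_target:
        return "Service missing in Source, but available in Target"
    return None  # none of the four categories applies

print(compatibility(True, True, versions_compatible=True))
print(compatibility(True, False))
```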