Data

The Data page presents information about tables and partitions. This information includes the following:

  • Metadata: For example, database and table names, owner, path, storage format, and create date.

  • KPIs: For example, the number and size of tables and partitions, and the number of applications accessing each table.

  • Insights: For example, tables with too many small files or tables that do not have table statistics.

Unravel v4.6.2.0 introduces multi-cluster support. In this version, the Data page supports tables and partitions on multiple on-prem (CDH, CDP, HDP) clusters, each of which has its own Hive metastore and HDFS.

The following scenarios are currently not supported:

  • Multiple EMR or HDI clusters, where each cluster has its own metastore and HDFS.

  • Tables whose metadata is stored in an external metastore and shared by multiple clusters. For example, multiple EMR clusters that refer to the same external Hive metastore or Glue.

  • Tables whose data is stored on an external file system and shared by multiple clusters. For example, multiple EMR clusters that refer to data stored on S3.

The Data page has the following tabs:

  • Overview: Shows table and partition KPIs for a given cluster and metastore.

  • Tables: Provides details and insights into the tables for a given cluster and metastore.

  • Forecasting: Forecasts future disk capacity requirements based upon past performance.

  • Small Files: An ad hoc report that generates a list of directories containing small files.

  • File Reports: Similar to Small Files, but provides canned reports for large, medium, tiny, and empty files.
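
To make the idea behind a small-files listing concrete, the following is a minimal Python sketch of the kind of computation such a report performs: it counts, per directory, the files that fall below a size threshold. This is not Unravel's implementation. It walks a local directory tree as a stand-in for HDFS, and the function name `small_file_dirs`, its parameters, and the default thresholds are all hypothetical.

```python
import os
from collections import Counter


def small_file_dirs(root, max_bytes=1024 * 1024, min_files=2):
    """Map each directory under root to its number of small files.

    Hypothetical helper: a file is "small" if it is under max_bytes,
    and a directory is reported only if it holds at least min_files
    such files. Both thresholds are illustrative assumptions.
    """
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size < max_bytes:
                counts[dirpath] += 1
    # Keep only directories that reach the minimum small-file count.
    return {d: n for d, n in counts.items() if n >= min_files}
```

Calling `small_file_dirs("/some/path", max_bytes=1000, min_files=2)` would return a dictionary keyed by directory path. A canned variant in the spirit of File Reports could fix the size bounds ahead of time instead of taking them as parameters.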

Configuring the Data page

The Data page supports getting metadata from multiple Hive metastores. To configure the Data page, refer to Data Page configuration.