FSimage
Note
You must restart the unravel_ondemand daemon for any changes to take effect.
Scheduling
Property/Description | Set by user | Unit | Default |
---|---|---|---|
com.unraveldata.ngui.sfhivetable.schedule.interval Frequency, in days, at which to trigger FSimage extraction, for example, every 3 days. The scheduler schedules extractions relative to the 1st of the month: the 1st, 1st + X days, 1st + 2X days, and so on, until 1st + nX days crosses into the next month, at which point the schedule resets to the 1st. See below for an example. Format: a number of days followed by d, for example, 9d. | day | 1d | |
com.unraveldata.ngui.sfhivetable.schedule.time Hour of the day, in 24-hour time, at which to download the FSimage. Format: two digits between 00 and 23. | two digits (set member) | 00 |
The scheduler always calculates the extraction schedule from the first of the month; in other words, the schedule resets on the first. The time of day for each extraction is set by com.unraveldata.ngui.sfhivetable.schedule.time.
FSimage extraction is triggered on the 1st of the month, and further extractions fall on 1 + X days, 1 + 2X days, and so on. An extraction date that would cross into the next month is ignored; instead, extraction occurs on the 1st of the new month and is then scheduled every X days again.
For example:
The installation is on 23 July at 07:00.
com.unraveldata.ngui.sfhivetable.schedule.time=02
com.unraveldata.ngui.sfhivetable.schedule.interval=9d
The July schedule is 1, 10, 19, and 28 July. Because the installation happens after 19 July, the next extraction is on 28 July at 02:00 (1 + 9 days * 3). The process begins again with a 1 August extraction, then 10 August, 19 August, 28 August, 1 September, and so on.
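The scheduling rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Unravel's implementation: the function names `extraction_days` and `next_extraction` are hypothetical, the year 2023 is chosen arbitrarily, and the sketch assumes an extraction scheduled exactly at the current time is not counted as "next".

```python
import calendar
from datetime import date, datetime, time
from typing import List


def extraction_days(year: int, month: int, interval_days: int) -> List[date]:
    """Return the 1st, 1st + X, 1st + 2X, ... that still fall inside the month."""
    if interval_days < 1:
        raise ValueError("interval_days must be at least 1")
    last_day = calendar.monthrange(year, month)[1]
    return [date(year, month, d) for d in range(1, last_day + 1, interval_days)]


def next_extraction(now: datetime, interval_days: int, hour: int) -> datetime:
    """Return the first scheduled extraction strictly after `now`."""
    year, month = now.year, now.month
    while True:
        for day in extraction_days(year, month, interval_days):
            candidate = datetime.combine(day, time(hour))
            if candidate > now:
                return candidate
        # No slot left this month: the schedule resets to the 1st of the next month.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


# Example from the text: installed 23 July 07:00, interval=9d, time=02.
moment = datetime(2023, 7, 23, 7, 0)
for _ in range(6):
    moment = next_extraction(moment, interval_days=9, hour=2)
    print(moment.strftime("%d %B %H:%M"))
# 28 July, 01 August, 10 August, 19 August, 28 August, 01 September (all at 02:00)
```

Recomputing the day list for each month, rather than adding X days to the previous extraction, is what makes the schedule reset on the 1st.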
Process
Property/Description | Set by user | Unit | Default |
---|---|---|---|
unravel.python.reporting.files.external_fsimage_dir Directory for the FSimage when skip_fetch_fsimage=true. The externally fetched FSimage is expected to be in this directory. Unravel uses the latest file in this directory whose name starts with "fsimage_". This directory must be different from Unravel's internal directory, i.e., /srv/unravel/tmp/reports/fsimage. | string | - | |
unravel.python.reporting.files.skip_fetch_fsimage If HDFS admin privileges cannot be granted, set this to true to allow Unravel's OnDemand process to use an externally fetched FSimage. | boolean | false | |
unravel.python.reporting.fsimage.run_mode Unravel uses Hive or Spark to process the FSimage. Member of a set: spark, hive. As of v4.5.4.3, Unravel uses Spark by default. | string (set member) | spark |
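When the FSimage is fetched externally, it can help to check which file in the external directory would count as the latest one. The following is a minimal sketch of such a pre-flight check, not Unravel's own selection logic: the helper name `latest_fsimage` is hypothetical, and it assumes "latest" means the most recent modification time.

```python
from pathlib import Path
from typing import Optional


def latest_fsimage(directory: str) -> Optional[Path]:
    """Return the most recently modified file whose name starts with 'fsimage_'.

    Assumption: 'latest' is judged by modification time. Returns None when
    the directory contains no matching file.
    """
    candidates = [
        p for p in Path(directory).iterdir()
        if p.is_file() and p.name.startswith("fsimage_")
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Calling `latest_fsimage` on the directory configured in unravel.python.reporting.files.external_fsimage_dir shows whether a usable file is present before the next scheduled run.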
Resources
The following properties must be defined when unravel.python.reporting.fsimage.run_mode=spark.
Property/Description | Set by user | Unit | Default |
---|---|---|---|
unravel.python.reporting.files.hive.warehouse.dir The directory under which the table and partition sizes are calculated. Unravel automatically checks the platform and uses /user/hive/warehouse on CDH and /apps/hive/warehouse on HDP. This is only needed when unravel.python.reporting.fsimage.run_mode=spark. This property was deprecated as of v4.5.4.3, when Unravel started to use Spark to insert FSimage data into tables. | string (path) | - | |
unravel.python.reporting.files.spark.cores The number of cores used to process the FSimage. The default is the recommended value, so that FSimage runs do not overload the Unravel node. Unit: count > 0 | count | | |
unravel.python.reporting.files.spark.driver.memory The amount of memory to allocate to the JVM that runs Spark. The value must be a positive number. To specify bytes: #, for example, 30. To specify megabytes: #M, for example, 30M. To specify gigabytes: #G, for example, 30G. | count | |
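The memory format described for unravel.python.reporting.files.spark.driver.memory can be checked before the daemon is restarted. The sketch below is illustrative only: the function name `memory_to_bytes` is hypothetical, it assumes whole numbers with an uppercase M or G suffix as written in the table, and it assumes 1024-based multipliers as used by the JVM.

```python
import re

# Digits, optionally followed by M (megabytes) or G (gigabytes).
_MEMORY_RE = re.compile(r"(\d+)([MG]?)")

# Assumption: 1024-based units.
_MULTIPLIER = {"": 1, "M": 1024 ** 2, "G": 1024 ** 3}


def memory_to_bytes(value: str) -> int:
    """Convert a value such as '30', '30M', or '30G' to a number of bytes.

    Raises ValueError for anything else, including zero.
    """
    match = _MEMORY_RE.fullmatch(value.strip())
    if match is None or int(match.group(1)) == 0:
        raise ValueError(f"invalid memory value: {value!r}")
    number, suffix = match.groups()
    return int(number) * _MULTIPLIER[suffix]


print(memory_to_bytes("30"))   # 30
print(memory_to_bytes("30M"))  # 31457280
print(memory_to_bytes("30G"))  # 32212254720
```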