FSimage
Note
You must restart the unravel_ondemand daemon for any changes to take effect.
Property/Description | Set by user | Unit | Default |
---|---|---|---|
com.unraveldata.ngui.sfhivetable.schedule.interval Frequency, in days, at which to trigger FSimage extraction, for example, every 3 days. The scheduler schedules extractions relative to the 1st of the month: on the 1st, 1st + X days, 1st + 2X days, and so on, until 1st + nX days crosses into the next month, at which point the schedule resets to the 1st. See below for an example. Format: number of days followed by d, for example, 1d. | | day | 1d |
com.unraveldata.ngui.sfhivetable.schedule.time Hour of the day, in 24-hour time, at which to download the FSimage. Format: two digits between 00 and 23. | | two digits (member of set) | 00 |
unravel.python.reporting.files.external_fsimage_dir Directory for the FSimage when skip_fetch_fsimage=true. The externally fetched FSimage is expected to be in this directory. Unravel uses the latest file in this directory whose name starts with "fsimage_". This directory must be different from Unravel's internal directory, i.e., /srv/unravel/tmp/reports/fsimage. | | string | - |
unravel.python.reporting.files.skip_fetch_fsimage If HDFS admin privileges cannot be granted, set this to true to allow Unravel's Ondemand process to use an externally fetched FSimage. | | boolean | false |
unravel.python.reporting.fsimage.run_mode Unravel uses Hive or Spark to process the FSImage. Spark is the recommended mode for Unravel 4.6.0.0 or later. Member of set: spark, hive | | string (member of set) | spark |
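
The file-selection rule described for unravel.python.reporting.files.external_fsimage_dir can be sketched as follows. This is an illustration only, not Unravel's implementation: the function name latest_external_fsimage is hypothetical, and the sketch assumes that "latest" means the most recent modification time, which the description above does not state.

```python
from pathlib import Path

# Unravel's internal FSimage directory, as given in the table above.
INTERNAL_DIR = Path("/srv/unravel/tmp/reports/fsimage")


def latest_external_fsimage(external_dir):
    """Return the newest file named fsimage_* in external_dir, or None.

    Hypothetical helper. Assumption: "latest" is judged by modification time.
    """
    directory = Path(external_dir)
    # The external directory must differ from Unravel's internal directory.
    if directory.resolve() == INTERNAL_DIR.resolve():
        raise ValueError("external_fsimage_dir must differ from the internal directory")
    candidates = [
        path
        for path in directory.iterdir()
        if path.is_file() and path.name.startswith("fsimage_")
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

A check like this can be useful before setting skip_fetch_fsimage=true, to confirm that the directory contains at least one usable file.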
The following properties must be defined when unravel.python.reporting.fsimage.run_mode=spark.
Property/Description | Set by user | Unit | Default |
---|---|---|---|
unravel.python.reporting.files.hive.warehouse.dir The directory under which the table and partition sizes are calculated. Unravel automatically checks the platform and uses the corresponding directory. CDH: /user/hive/warehouse. HDP: /apps/hive/warehouse. This is only needed when unravel.python.reporting.fsimage.run_mode=spark. | | string (path) | - |
unravel.python.reporting.files.spark.cores The number of cores used to process the FSImage. The default is the recommended value, so that FSImage runs do not overload the Unravel node. Must be greater than 0. | | count | |
unravel.python.reporting.files.spark.driver.memory The amount of memory to allocate to the JVM that runs Spark. The value must be a positive number. To specify bytes: #, for example, 30. To specify megabytes: #M, for example, 30M. To specify gigabytes: #G, for example, 30G. | | count | |
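
The value format for unravel.python.reporting.files.spark.driver.memory (#, #M, or #G) can be checked with a small parser such as the sketch below. The function name parse_memory is hypothetical. The sketch assumes binary multiples (1M = 1024 × 1024 bytes), as the JVM uses for its heap options, and accepts only the uppercase suffixes shown in the table; neither point is confirmed by this page.

```python
import re

# Assumption: binary multiples, as used by JVM heap-size options.
_MULTIPLIERS = {"": 1, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_memory(value):
    """Convert a value such as "30", "30M", or "30G" to a number of bytes.

    Hypothetical helper; raises ValueError for anything outside the
    documented #, #M, #G forms or for a non-positive number.
    """
    match = re.fullmatch(r"(\d+)([MG]?)", value.strip())
    if match is None:
        raise ValueError(f"unsupported memory value: {value!r}")
    number = int(match.group(1))
    if number <= 0:
        raise ValueError("memory value must be a positive number")
    return number * _MULTIPLIERS[match.group(2)]
```

For example, parse_memory("30M") returns 30 × 1024 × 1024 under the binary-multiple assumption.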
The scheduler always calculates the extraction schedule from the first of the month. The time of day for each extraction is set by com.unraveldata.ngui.sfhivetable.schedule.time.
FSImage extraction is triggered on the 1st of the month, and further extractions follow on day 1 + X, day 1 + 2X, and so on, where X is the interval in days. An extraction date that would cross into the next month is ignored; instead, extraction occurs on the 1st of the new month and is again scheduled every X days from there.
For example:
The installation is on 23 July at 07:00.
com.unraveldata.ngui.sfhivetable.schedule.time=02
com.unraveldata.ngui.sfhivetable.schedule.interval=9d
The next extraction is on 28 July at 02:00 (day 1 + 9 days × 3 = day 28). The process then begins again with an extraction on 1 August, followed by 10 August, 19 August, 28 August, 1 September, and so on.
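
The scheduling rule can be reproduced with a short calculation. The sketch below is an illustration of the rule as described on this page, not Unravel's scheduler code. The function names next_extraction and extraction_schedule are hypothetical, and the year 2023 in the usage example is an arbitrary choice, since the example above does not give one.

```python
import calendar
from datetime import datetime


def next_extraction(now, interval_days, hour):
    """Return the first scheduled extraction strictly after `now`.

    Extractions fall on day 1, 1 + X, 1 + 2X, ... of each month at `hour`;
    days that would spill into the next month are dropped and the schedule
    restarts on the 1st.
    """
    year, month = now.year, now.month
    while True:
        days_in_month = calendar.monthrange(year, month)[1]
        for day in range(1, days_in_month + 1, interval_days):
            candidate = datetime(year, month, day, hour)
            if candidate > now:
                return candidate
        # No slot left in this month: move on to the 1st of the next month.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


def extraction_schedule(start, interval_days, hour, count):
    """Return the next `count` extraction times after `start`."""
    schedule = []
    current = start
    for _ in range(count):
        current = next_extraction(current, interval_days, hour)
        schedule.append(current)
    return schedule


if __name__ == "__main__":
    # Installation on 23 July at 07:00, schedule.time=02, schedule.interval=9d.
    for when in extraction_schedule(datetime(2023, 7, 23, 7, 0), 9, 2, 6):
        print(when.strftime("%d %B %H:%M"))
```

With the values from the example, the calculation yields 28 July, then 1, 10, 19, and 28 August, then 1 September, each at 02:00.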