Source Metadata Caching
Dremio provides several options for caching data source metadata. These options are configured individually for each source.
The Dataset Discovery option determines the refresh interval for top-level source object names, such as the names of databases, tables, and indexes. The default is one hour. This refresh is a lightweight operation. The Dataset Discovery option is not available for file-system sources such as HDFS, MapR-FS, or NAS.
Dataset Details is the metadata that Dremio needs for query planning, such as information on fields, types, shards, statistics, and locality.
There are three fetch modes:
Only Queried Datasets: Dremio updates details for previously queried objects in a source. This mode increases query performance because less work needs to be done at query time for these datasets.
All Datasets: Dremio updates details for all datasets in a source. This mode increases query performance because less work needs to be done at query time.
As Needed: Dremio updates details for a dataset at query time. This mode minimizes metadata queries against a source that is not in use, but might lead to longer planning times.
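The difference between the three fetch modes can be sketched as a small conceptual model in Python. This is only an illustration of which datasets a background refresh would touch under each mode; it is not Dremio's API, and the names FetchMode and datasets_to_refresh are hypothetical.

```python
from enum import Enum


class FetchMode(Enum):
    # Hypothetical enum mirroring the three fetch modes described above.
    ONLY_QUERIED_DATASETS = "Only Queried Datasets"
    ALL_DATASETS = "All Datasets"
    AS_NEEDED = "As Needed"


def datasets_to_refresh(mode, all_datasets, queried_datasets):
    """Return the set of datasets a background refresh would update.

    Conceptual model only: all_datasets is every dataset in the source,
    queried_datasets is the subset that has been queried before.
    """
    if mode is FetchMode.ALL_DATASETS:
        return set(all_datasets)
    if mode is FetchMode.ONLY_QUERIED_DATASETS:
        # Only datasets that still exist in the source and were queried before.
        return set(all_datasets) & set(queried_datasets)
    # AS_NEEDED: nothing is refreshed in the background; details are
    # fetched at query time instead.
    return set()
```

In this model, As Needed returns an empty set because the work is deferred to query time, which is why that mode can lengthen planning.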
Dremio expires the metadata it holds about datasets after the interval set in the Expire after value.
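The expiry rule can be illustrated with a minimal Python sketch. It assumes that expiry is measured from the time the metadata was last refreshed and that metadata counts as expired once the full interval has elapsed; both are assumptions made for illustration, and the function name is_expired is hypothetical rather than part of Dremio.

```python
from datetime import datetime, timedelta


def is_expired(last_refreshed, expire_after, now):
    """Return True if cached metadata is older than the Expire after interval.

    Assumption: age is measured from the last refresh, and metadata is
    treated as expired once the whole interval has elapsed.
    """
    return now - last_refreshed >= expire_after


# Example: metadata refreshed at 09:00 with an Expire after value of 3 hours.
refreshed_at = datetime(2024, 1, 1, 9, 0)
expire_after = timedelta(hours=3)
still_valid = not is_expired(refreshed_at, expire_after, datetime(2024, 1, 1, 11, 0))
```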