This topic describes how to configure the cache settings for data source metadata. Various caching options are available for individual data sources.
To configure caching settings for data source metadata:
For more information about metadata settings for specific data sources, see the topic for each data source. See HDFS for a list of data sources.
This section describes the configurable caching settings.
The Dataset Discovery option determines the refresh interval for top-level source object names, such as the names of databases, tables, and indexes. The default is one hour. This refresh is a lightweight operation. The Dataset Discovery option is not available for file-system sources such as HDFS, MapR-FS, or NAS.
Dataset Details is the metadata that Dremio needs for query planning, such as information on fields, types, shards, statistics, and locality.
The following fetch modes are available:
Only Queried Datasets - Dremio updates details for previously queried objects in a source. This mode increases query performance because less work needs to be done at query time for these datasets.
All Datasets - (Deprecated as of 3.3) Dremio updates details for all datasets in a source. This mode increases query performance because less work needs to be done at query time.
As Needed - (Not available as of 3.3) Dremio updates details for a dataset at query time. This mode minimizes metadata queries on a source when the source is not being used, but might lead to longer planning times.
Dremio expires the metadata it knows about datasets after the provided Expire after value.
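The settings described in this section can be pictured with a small Python model. This is an illustrative sketch only, not Dremio code or a Dremio API: the class name `MetadataCachePolicy`, its field names, and the two helper methods are hypothetical. The one-hour discovery default comes from the text above; the fetch mode default, the three-hour expiry used in the usage lines, and treating the exact boundary as "due" or "expired" are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class FetchMode(Enum):
    """Fetch modes listed in this topic (names are illustrative)."""
    ONLY_QUERIED_DATASETS = "Only Queried Datasets"
    ALL_DATASETS = "All Datasets"   # deprecated as of 3.3
    AS_NEEDED = "As Needed"         # not available as of 3.3


@dataclass(frozen=True)
class MetadataCachePolicy:
    """Hypothetical model of the metadata caching settings for one source."""
    # "Expire after" value; no default is assumed here.
    expire_after: timedelta
    # Dataset Discovery refresh interval; one hour is the documented default.
    discovery_refresh: timedelta = timedelta(hours=1)
    # Assumed default for the sketch, not a documented one.
    fetch_mode: FetchMode = FetchMode.ONLY_QUERIED_DATASETS

    def names_refresh_due(self, last_discovery: datetime, now: datetime) -> bool:
        """True once the discovery interval has elapsed (boundary counts as due)."""
        return now - last_discovery >= self.discovery_refresh

    def is_expired(self, last_refreshed: datetime, now: datetime) -> bool:
        """True once the expiry interval has elapsed (boundary counts as expired)."""
        return now - last_refreshed >= self.expire_after


# Usage: a policy with an assumed three-hour expiry.
policy = MetadataCachePolicy(expire_after=timedelta(hours=3))
start = datetime(2024, 1, 1, 0, 0)
print(policy.names_refresh_due(start, start + timedelta(minutes=59)))
print(policy.is_expired(start, start + timedelta(hours=3)))
```

The model keeps the two timers separate because the text treats them separately: discovery refreshes only the object names, while expiry applies to the dataset details kept for query planning.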