Refreshing Data Reflections

Refresh Policy: Refresh Interval and Expiration

The system periodically updates the reflections in the Reflection Store to keep Data Reflections fresh. An administrator can specify the desired Refresh Policy for any physical dataset or data source – determining the refresh interval and expiration of reflections. All reflections based on a physical dataset or source will be refreshed accordingly. Refresh Policy options for a physical dataset will override the value for the source.

Dremio will refresh Data Reflections at the provided refresh interval and serve them until the provided expiration.
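The override rule above can be illustrated with a minimal Python sketch. The `RefreshPolicy` and `PolicyOverride` classes and the helper functions are hypothetical names made up for this illustration and are not part of Dremio; the sketch also assumes that both the refresh interval and the expiration are measured from the time of the last refresh.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class RefreshPolicy:
    """Hypothetical container for the two settings of a Refresh Policy."""
    refresh_interval: timedelta
    expiration: timedelta


@dataclass(frozen=True)
class PolicyOverride:
    """Hypothetical dataset-level options; None means 'inherit from source'."""
    refresh_interval: Optional[timedelta] = None
    expiration: Optional[timedelta] = None


def effective_policy(source: RefreshPolicy,
                     dataset: Optional[PolicyOverride]) -> RefreshPolicy:
    """Options set on the physical dataset override the source values."""
    if dataset is None:
        return source
    return RefreshPolicy(
        refresh_interval=dataset.refresh_interval or source.refresh_interval,
        expiration=dataset.expiration or source.expiration,
    )


def next_refresh_at(last_refresh: datetime, policy: RefreshPolicy) -> datetime:
    # Assumption: the interval is counted from the last refresh.
    return last_refresh + policy.refresh_interval


def expires_at(last_refresh: datetime, policy: RefreshPolicy) -> datetime:
    # Assumption: the expiration is counted from the last refresh.
    return last_refresh + policy.expiration
```

For example, a source with a one-hour interval and a three-hour expiration, combined with a dataset that only overrides the interval, yields a policy with the dataset's interval and the source's expiration.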

Manual Refresh

Disabling and then re-enabling reflections for a dataset in the Dremio UI causes those reflections to refresh. In addition, all reflections that depend on a given physical dataset can be refreshed manually.
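As a sketch of how refreshing all dependent reflections of a physical dataset might be scripted, the Python snippet below builds (but does not send) an HTTP request. It assumes the REST endpoint `POST /api/v3/catalog/{id}/refresh`; the base URL, the dataset id, and the format of the `Authorization` header are placeholders that depend on your deployment and should be checked against the REST API documentation for your Dremio version.

```python
import urllib.parse
import urllib.request


def build_refresh_request(base_url: str, dataset_id: str,
                          auth_header: str) -> urllib.request.Request:
    """Build a POST request asking Dremio to refresh the reflections
    that depend on one physical dataset.

    Assumptions: the endpoint path below, and that `auth_header` already
    holds a valid Authorization header value for the target deployment.
    """
    quoted_id = urllib.parse.quote(dataset_id, safe="")
    url = f"{base_url.rstrip('/')}/api/v3/catalog/{quoted_id}/refresh"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": auth_header},
    )


# Hypothetical values, for illustration only.
request = build_refresh_request(
    "http://localhost:9047", "example-dataset-id", "<authorization value>"
)
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```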

Full and Incremental Refresh

Dremio’s default behavior is to perform a full update of the Data Reflection on each refresh. However, for larger datasets it is better to enable incremental updates. There are two ways in which the system can identify new records:

  • Directory datasets in file-based data sources like S3 and HDFS. The system can automatically identify new files in the directory.
  • All other datasets (physical and virtual). An administrator specifies a monotonically increasing field, such as an auto-incrementing key. This allows the system to fetch the records that have been created since the last time the reflection was updated. In releases prior to Dremio 3.2, the field must be of type BigInt, and incremental updating is not available for datasets without any BigInt fields.
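The two detection strategies can be sketched in Python as follows. This is only an illustration of the idea, not Dremio's implementation; the function names, the in-memory record representation, and the `last_seen` bookkeeping are assumptions made for the example.

```python
from typing import Any, Dict, Iterable, List, Optional, Set, Tuple

Record = Dict[str, Any]


def new_files(listing: Iterable[str], already_seen: Set[str]) -> List[str]:
    """Directory datasets: files in the listing that were not seen before."""
    return sorted(f for f in listing if f not in already_seen)


def new_records(records: Iterable[Record], field: str,
                last_seen: Optional[Any]) -> Tuple[List[Record], Optional[Any]]:
    """Other datasets: records whose monotonically increasing `field`
    is greater than the highest value seen at the previous refresh.

    Returns the new records and the updated high-water mark.
    """
    fresh = [r for r in records if last_seen is None or r[field] > last_seen]
    if fresh:
        last_seen = max(r[field] for r in fresh)
    return fresh, last_seen
```

A first call with `last_seen=None` returns every record; later calls only return records with a larger key, which is why the field has to increase monotonically.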


  • As of Dremio 3.2, incremental refresh is supported for datasets with fields of the BigInt, Int, Timestamp, Date, Varchar, Float, Double, and Decimal data types.
  • In releases prior to Dremio 3.2, incremental refresh is supported for datasets with BigInt columns only.

To specify incremental refresh for your dataset:

  1. Go to your source’s promoted folder.
  2. Click on the settings icon for the promoted folder.
  3. Select Reflection Refresh.
  4. Select Incremental Update.



  • Only append-only datasets are supported for Incremental Update Mode. Updates and deletions of underlying files lead to incorrect results. Dremio recommends using Full Refresh in this case.
  • Reflections on virtual datasets that include joins cannot be incrementally updated. Dremio falls back to using full refresh for these datasets.
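The append-only restriction can be illustrated with a small Python sketch. It models a reflection as a running sum that is maintained with a high-water mark on an `id` field; the data, field names, and functions are hypothetical and only show why an in-place update goes unnoticed by an incremental refresh.

```python
from typing import Dict, List, Tuple

Row = Dict[str, int]


def full_refresh(rows: List[Row]) -> int:
    """Recompute the aggregate from all rows."""
    return sum(r["amount"] for r in rows)


def incremental_refresh(total: int, rows: List[Row],
                        watermark: int) -> Tuple[int, int]:
    """Add only rows whose id is above the watermark to the running total."""
    fresh = [r for r in rows if r["id"] > watermark]
    total += sum(r["amount"] for r in fresh)
    watermark = max([watermark] + [r["id"] for r in fresh])
    return total, watermark


rows = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]
total, watermark = incremental_refresh(0, rows, 0)

# An existing row is updated in place and a new row is appended.
rows[0]["amount"] = 100
rows.append({"id": 3, "amount": 5})

total, watermark = incremental_refresh(total, rows, watermark)
stale = total               # the update to id 1 was never picked up
correct = full_refresh(rows)
```

Here the incremental result no longer matches the full recomputation, which is the situation in which a full refresh is the safer choice.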

Near-Real-Time Metadata Refreshes for Reflections

Version Requirement:

This feature is only available when using instances of Dremio v18.0+.

Metadata refreshes for reflections now take place in near-real-time when completing a reflection job.

To activate this functionality, enable the dremio.iceberg.enabled and dremio.execution.support_unlimited_splits flags on the Support Settings page.


Using these support keys enables new functionality in Dremio that may cause unexpected behavior with your existing datasets. We recommend testing this functionality first in a test environment as described here.

Support Limitation:

This improvement to metadata refreshes does not support PDFS as a storage method.

We also recommend enabling Near-Real-Time Metadata Refreshes because it removes the limit on the number of splits, making it easier to use reflections on larger datasets where metadata refreshes may be slower.

Changes to Anchor and Upstream Datasets

Changes in definitions of anchor and/or upstream (i.e. parents, parents of parents) datasets require administrators to re-create affected reflections (including reflections on downstream datasets) to ensure that they are up-to-date.

Dremio guarantees data correctness without any modifications. However, if affected reflections are not re-created when dataset definitions change, queries may not be able to use those reflections.
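To find which reflections are affected by a definition change, one approach is to walk the dataset dependency graph downstream from the changed dataset. The Python sketch below assumes you can obtain a mapping from each dataset to its parents and from each reflection to its anchor dataset; both mappings and all names in the example are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set


def affected_datasets(parents: Dict[str, List[str]], changed: str) -> Set[str]:
    """Return the changed dataset plus every dataset downstream of it."""
    # Invert the parent mapping so the graph can be walked downstream.
    children: Dict[str, List[str]] = {}
    for dataset, dataset_parents in parents.items():
        for parent in dataset_parents:
            children.setdefault(parent, []).append(dataset)

    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def reflections_to_recreate(anchors: Dict[str, str],
                            parents: Dict[str, List[str]],
                            changed: str) -> List[str]:
    """Reflections whose anchor dataset is the changed dataset or downstream."""
    affected = affected_datasets(parents, changed)
    return sorted(r for r, anchor in anchors.items() if anchor in affected)


# Hypothetical example: orders -> orders_clean -> orders_by_day
parents = {"orders_clean": ["orders"], "orders_by_day": ["orders_clean"],
           "customers_clean": ["customers"]}
anchors = {"r1": "orders_clean", "r2": "orders_by_day", "r3": "customers_clean"}
to_recreate = reflections_to_recreate(anchors, parents, "orders_clean")
```

In this example the reflections anchored on `orders_clean` and `orders_by_day` are returned, while the one on `customers_clean` is unaffected.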

Updating a reflection definition causes a full refresh of that reflection.