The system periodically updates the reflections in the Reflection Store to keep Data Reflections fresh. An administrator can specify the desired Refresh Policy for any physical dataset or data source, which determines the refresh interval and expiration of reflections. All reflections based on a physical dataset or source will be refreshed accordingly. Refresh Policy options set on a physical dataset override the values set on its source.
Dremio will refresh Data Reflections at the provided refresh interval and serve them until the provided expiration.
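The interaction between source-level and dataset-level policies can be pictured with a small model. This is only an illustrative sketch, not Dremio's implementation or API: the `RefreshPolicy` class and the helper functions are hypothetical names, and it assumes a dataset-level policy replaces the source-level policy as a whole.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class RefreshPolicy:
    """Hypothetical model of a Refresh Policy (not a Dremio type)."""
    refresh_interval: timedelta  # how often reflections are rebuilt
    expiration: timedelta        # how long a built reflection may be served


def effective_policy(source_policy: RefreshPolicy,
                     dataset_policy: Optional[RefreshPolicy]) -> RefreshPolicy:
    """A policy set on the physical dataset overrides the source's policy."""
    return dataset_policy if dataset_policy is not None else source_policy


def is_due_for_refresh(last_refresh: datetime, now: datetime,
                       policy: RefreshPolicy) -> bool:
    """True once the refresh interval has elapsed since the last refresh."""
    return now - last_refresh >= policy.refresh_interval


def is_servable(last_refresh: datetime, now: datetime,
                policy: RefreshPolicy) -> bool:
    """True while the reflection has not yet reached its expiration."""
    return now - last_refresh < policy.expiration


# Assumed example values: the source refreshes hourly, one dataset every 15 minutes.
source_policy = RefreshPolicy(timedelta(hours=1), timedelta(hours=3))
dataset_policy = RefreshPolicy(timedelta(minutes=15), timedelta(hours=1))

policy = effective_policy(source_policy, dataset_policy)
last_refresh = datetime(2024, 1, 1, 12, 0)
now = datetime(2024, 1, 1, 12, 20)

due = is_due_for_refresh(last_refresh, now, policy)
servable = is_servable(last_refresh, now, policy)
```

In this example the reflection is 20 minutes old, so it is due for a refresh under the dataset's 15-minute interval but can still be served because it is within the one-hour expiration.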
Manual Refresh
Disabling and then re-enabling reflections for a dataset in the Dremio UI causes those reflections to refresh. In addition, all reflections that depend on a given physical dataset can be refreshed together.
Dremio’s default behavior is to perform a full update of the Data Reflection on each refresh. For larger datasets, however, incremental updates are usually the better choice. There are two ways in which the system can identify new records:
Note
- As of Dremio 3.2, incremental refresh is supported for datasets with columns of the BigInt, Int, Timestamp, Date, Varchar, Float, Double, and Decimal data types.
- In releases prior to Dremio 3.2, incremental refresh is supported for datasets with BigInt columns only.
To specify incremental refresh for your dataset:
Warning
- Only append-only datasets are supported for Incremental Update Mode. Updates and deletions of underlying files lead to incorrect results. Dremio recommends using Full Refresh in this case.
- Reflections on virtual datasets that include joins cannot be incrementally updated. Dremio falls back to using full refresh for these datasets.
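The append-only restriction is easier to see with a toy model. The sketch below is not Dremio's algorithm. It assumes new records are identified by a monotonically increasing column (here a hypothetical `id` field), and the functions `full_refresh` and `incremental_refresh` are illustrative names only.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def full_refresh(source_rows: List[Row]) -> List[Row]:
    """Rebuild the materialized copy from scratch."""
    return [dict(row) for row in source_rows]


def incremental_refresh(materialized: List[Row],
                        source_rows: List[Row],
                        field: str) -> List[Row]:
    """Append only rows whose `field` exceeds the largest value already stored."""
    high_water = max((row[field] for row in materialized), default=None)
    new_rows = [dict(row) for row in source_rows
                if high_water is None or row[field] > high_water]
    return materialized + new_rows


source = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]
reflection = full_refresh(source)

# Append-only change: the new row has a larger id, so it is picked up.
source.append({"id": 3, "amount": 30})
reflection = incremental_refresh(reflection, source, "id")
appended_ok = reflection == source

# In-place update: id 2 is not above the high-water mark, so the change is missed.
source[1]["amount"] = 99
reflection = incremental_refresh(reflection, source, "id")
stale_after_update = reflection != source

# A full refresh brings the materialized copy back in line with the source.
reflection = full_refresh(source)
consistent_after_full = reflection == source
```

Under these assumptions, appended rows reach the materialized copy, while an update to an existing row goes unnoticed until a full refresh runs, which is why Full Refresh is recommended for datasets that are not append-only.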
Changes in definitions of anchor and/or upstream (i.e. parents, parents of parents) datasets require administrators to re-create affected reflections (including reflections on downstream datasets) to ensure that they are up-to-date.
Dremio guarantees data correctness without any modifications. However, if affected reflections are not re-created when dataset definitions change, queries may not be able to use those reflections.
Updating a reflection definition causes a full refresh of that reflection.