
    Reflections Overview

    Dremio maintains physically optimized representations of source data known as Data Reflections. The query optimizer can accelerate a query by utilizing one or more Data Reflections to partially or entirely satisfy that query, rather than processing the raw data in the underlying data source.

    Data Reflections are stored in the Distributed Store, which can reside on HDFS, S3, ADLS, MapR-FS, or NAS storage. They are maintained in a high-performance columnar representation based on Apache Parquet and Apache Arrow, using compression techniques such as dictionary encoding, run-length encoding, and delta encoding.
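
    The three encodings named above can be illustrated with a small Python sketch. It only shows the general idea behind each technique on a single column of values. It is not how Parquet, Arrow, or Dremio implement them, and the function names and sample data are hypothetical.

```python
from itertools import groupby


def dictionary_encode(values):
    """Replace each value with an integer index into a list of distinct values."""
    dictionary = []   # distinct values, in order of first appearance
    positions = {}    # value -> index in `dictionary`
    indices = []
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return dictionary, indices


def run_length_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def delta_encode(values):
    """Keep the first value, then store each value as the difference from its predecessor."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


# Hypothetical column data, chosen only to make each encoding's effect visible.
print(dictionary_encode(["Austin", "Austin", "Boston", "Austin"]))
print(run_length_encode([7, 7, 7, 2, 2, 9]))
print(delta_encode([1000, 1003, 1004, 1010]))
```

    Dictionary encoding helps most when a column has few distinct values, run-length encoding when equal values sit next to each other (for example after sorting), and delta encoding when consecutive values are close together, as with timestamps or sequential IDs.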


    A Data Reflection is always associated with a single table or view, known as its anchor. Because the anchor may be a view, it may contain data from more than one data source.


    Data Reflections associated with one table or view can be used by the optimizer to accelerate a query on a different table or view. For example, a reflection whose anchor is foo.bar.business may be used to accelerate a query on foo.bar.restaurants, and vice versa.

    Types of Data Reflections

    There are various types of Data Reflections:

    • Raw reflections – A raw reflection includes one or more fields from the anchor, sorted, partitioned, and distributed by specified fields.
    • Aggregation reflections – An aggregation reflection includes one or more dimension and measure fields from the anchor, sorted, partitioned, and distributed by specified fields.
    • External reflections – An external reflection is an unmanaged reflection that lets users leverage existing datasets and summary tables built in external systems as reflections in Dremio.
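
    The difference between a raw and an aggregation reflection can be pictured with a toy Python model. This is a conceptual sketch only: the rows, field names, and helper functions are hypothetical, partitioning and distribution are left out, and nothing here reflects how Dremio actually materializes reflections.

```python
from collections import defaultdict

# Hypothetical anchor rows; all field names are invented for illustration.
anchor_rows = [
    {"name": "Cafe Uno", "city": "Austin", "category": "cafe", "stars": 4},
    {"name": "Taco Hut", "city": "Austin", "category": "mexican", "stars": 3},
    {"name": "Bean Bar", "city": "Boston", "category": "cafe", "stars": 5},
]


def build_raw(rows, fields, sort_by):
    """Toy raw reflection: keep a subset of fields, sorted by one of them."""
    projected = [{field: row[field] for field in fields} for row in rows]
    return sorted(projected, key=lambda row: row[sort_by])


def build_aggregation(rows, dimensions, measure):
    """Toy aggregation reflection: one entry per dimension combination,
    with a precomputed count and sum of the measure field."""
    groups = defaultdict(lambda: {"count": 0, "sum": 0})
    for row in rows:
        key = tuple(row[dim] for dim in dimensions)
        groups[key]["count"] += 1
        groups[key]["sum"] += row[measure]
    return dict(groups)


raw = build_raw(anchor_rows, fields=["city", "stars"], sort_by="stars")
agg = build_aggregation(anchor_rows, dimensions=["city"], measure="stars")
print(raw)
print(agg)
```

    In this model the raw result keeps one entry per anchor row, while the aggregation result keeps one entry per distinct dimension value, which is why pre-aggregated data can answer GROUP BY style queries with less work.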

    See Creating Data Reflections for use cases for each reflection type. See the Reflections API for additional information about reflection statuses.


    If a query is not being accelerated, make sure you are running the query rather than previewing it. Reflection matching and optimizer choices differ depending on whether the query is previewed or actually run.

    Multiple Reflections for a Table or View

    For any given table or view in the system, there may be zero or more raw reflections, and zero or more aggregation reflections. Dremio’s cost-based optimizer automatically chooses the best reflections for a given query when there are multiple options.
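
    The idea of choosing among several reflections can be sketched in Python. This is a deliberately simplified model under stated assumptions: a candidate "matches" when it contains every field the query needs, and cost is approximated by row count alone. Dremio's actual matching rules and cost model are not described in this section, and the class and function names below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical description of one reflection available for a query."""
    name: str
    fields: frozenset
    row_count: int  # assumed stand-in for the cost of scanning this reflection


def choose_reflection(candidates, needed_fields):
    """Return the cheapest candidate that covers all needed fields,
    or None if no candidate covers them (the query then reads the source)."""
    covering = [c for c in candidates if needed_fields <= c.fields]
    return min(covering, key=lambda c: c.row_count, default=None)


candidates = [
    Candidate("raw_all", frozenset({"name", "city", "category", "stars"}), 1_000_000),
    Candidate("agg_by_city", frozenset({"city", "stars"}), 500),
]

best = choose_reflection(candidates, frozenset({"city", "stars"}))
print(best.name if best else "no reflection matched")
```

    With these sample candidates, a query that needs only city and stars is covered by both, and the smaller aggregation candidate wins. A query that also needs name is covered only by the raw candidate.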