Skip to main content
Version: current [25.0.x]

Reflections

Reflections can be defined on an existing dataset by using SQL commands.

Syntax
ALTER { DATASET | TABLE | VIEW } <dataset_name>     
{ CREATE AGGREGATE REFLECTION <reflection_name> USING
{ DIMENSIONS ( <column_name1>, <column_name2>, ... )
| MEASURES ( <column_name1> ( <aggregation_type>, <column_name2> <aggregation_type> , ... ) )
| DIMENSIONS ( <column_name1>, <column_name2>, ... ) MEASURES ( <column_name1> ( <aggregation_type>, <column_name2> <aggregation_type> , ... ) ) }
[ PARTITION BY ( [<column_name>|<partition_transform>] [ , ... ] ) ]
[ LOCALSORT BY ( <column_name1>, <column_name2>, ... ) ]
[ ARROW CACHE ]
| CREATE EXTERNAL REFLECTION <reflection_name> USING <table_name>
| CREATE RAW REFLECTION <reflection_name> USING DISPLAY ( <column_name1>, <column_name2>, ...)
[ PARTITION BY ( [<column_name>|<partition_transform>] [ , ... ] ) ]
[ LOCALSORT BY ( <column_name1>, <column_name2>, ... ) ]
[ ARROW CACHE ]
| DROP REFLECTION <reflection_name>
| REFRESH REFLECTIONS
| ROUTE REFLECTIONS TO { DEFAULT QUEUE | QUEUE { <queue_name> | <queue_uuid> } }
| { ADD | DROP } PARTITION FIELD { <column_name> | <partition_transform> } }

Parameters

<dataset_name> String

The name of the table or view. The source can be in the scratch directory or in a datalake source.


CREATE AGGREGATE REFLECTION <reflection_name> String

Creates an aggregation reflection with the specified name.


DIMENSIONS ( <column_name1>, <column_name2>, ... ) String

The columns to include as dimensions in the reflection.


MEASURES ( <column_name1> ( <aggregation_type> ), <column_name2> ( <aggregation_type ), ... ) String

The columns to include as measures in the reflection, and the type of aggregation to perform on them. The possible types are COUNT, MIN, MAX, SUM, and APPROXIMATE COUNT DISTINCT.


PARTITION BY ( <column_name1>, <column_name2>, ... ) String   Optional

The columns on which to partition the data horizontally in the reflection.


LOCALSORT BY ( <column_name1>, <column_name2>, ... ) String   Optional

The columns on which to sort the data that is in the reflection.


ARROW CACHE Optional

Specifies that you want Dremio to convert data from your reflection’s Parquet files to the Apache Arrow format when copying that data to executor nodes. Normally, Dremio copies data as-is from the Parquet files to caches on executor nodes, which are nodes that carry out the query plans devised by the query optimizer. Enabling this option can improve query performance even more. However, data in the Apache Arrow format requires more space on the executor nodes than data in the default format. You can use this option with Amazon S3.


CREATE EXTERNAL REFLECTION <reflection_name> String

Creates an external reflection with the specified name.


USING <table_name> String

The path of the derived table to use with the external reflection.


CREATE RAW REFLECTION <reflection_name> String

Creates a raw reflection with the specified name.


USING DISPLAY ( <column_name1>, <column_name2>, ... ) String

The columns to include in the reflection.


PARTITION BY ( <column_name1>, <column_name2>, ... ) String   Optional

The columns on which to partition the data horizontally in the reflection.


LOCALSORT BY ( <column_name1>, <column_name2>, ... ) String   Optional

The columns on which to sort the data that is in the reflection.


DROP REFLECTION String   Optional

Drops the specified reflection defined on the table.


REFRESH REFLECTIONS String   Optional

For information about the refresh action performed, see Triggering Refreshes by Using the Reflection API, the Catalog API, or an SQL Command.


ROUTE REFLECTIONS TO { DEFAULT QUEUE | QUEUE { <queue_name> | <queue_uuid> } } String

Specify the queue on which to run jobs that refresh reflections defined on the dataset. Use either the name or UUID of the queue. If neither a name nor a UUID is specified, the default queue is used. You can also directly specify the default queue. For more information, see Queue Routing.


{ ADD | DROP } PARTITION FIELD { <column_name> | <partition_transform> } String   Optional

ADD partitions the table data on the values in a single column or by using one of the partition-transformation functions. DROP drops the partition definition. These are the partition-transformation functions:

  • identity( <col> ): Explicitly specified identity transform.
  • year( <col> ): Partition by year. The column uses either the TIMESTAMP or DATE data type.
  • month( <ts_col> ): Partition by month. The column uses either the TIMESTAMP or DATE data type.
  • day( <ts_col> ): Partition by day. The column uses either the TIMESTAMP or DATE data type.
  • hour( <ts_col> ): Partition by hour. The column uses either the TIMESTAMP or DATE data type.
  • bucket( <count>, <col> ): Partition by hashed value into <count> buckets.
  • truncate( <length>, <col> ): Partition by truncated value. Strings are truncated to the specified length. Integer and biginteger values are truncated to bins (for example: truncate(10, i) produces 0, 10, 20, and so on).

For more information about partitioning reflections, see Horizontally Partition Reflections that Have Many Rows and Partition Reflections to Allow for Partition-Based Incremental Refreshes.

Setting Reflection Hints

You can use reflection hints to influence the process of determining which reflections are substituted for tables and views to accelerate queries. For more information, see Influencing the Choice of Reflections Used to Satisfy Queries.

Syntax for setting hint for a session
ALTER SESSION
SET <reflection_hint>=<value>
Example of syntax for setting a hint in a simple query
SELECT /*+ <reflection_hint>  */ <column_name1>, <column_name2>
FROM <table_name>

Examples

Create a raw reflection that sorts customers by last name and partitions them by country
ALTER TABLE "@user1"."customers"
CREATE RAW REFLECTION customers_by_country USING DISPLAY (id,lastName,firstName,address,country)
PARTITION BY (country)
LOCALSORT BY (lastName);
Create a raw reflection that sorts line items by outcomes and partitions them by username
ALTER TABLE TAMV2."outcome_category_line_item"
CREATE RAW REFLECTION outcome_category_line_item_raw USING DISPLAY (dir0,dir1,queryId,queryText,"start",finish,outcome,username)
PARTITION BY (username)
LOCALSORT BY (outcome);
Create an aggregation reflection that counts the cities per state and sorts by state
ALTER TABLE Samples."samples.dremio.com"."zips.json"
CREATE AGGREGATE REFLECTION per_state USING
DIMENSIONS (state) MEASURES (city (COUNT))
LOCALSORT BY (state);
Create an aggregation reflection that counts p1 and p2 as measures in the reflection
ALTER TABLE BI."p1_table"
CREATE AGGREGATE REFLECTION p1_table_agg USING
DIMENSIONS (c1,c2) MEASURES (p1(count, min), p2(sum, max,approximate count distinct))
PARTITION BY (c1)
LOCALSORT BY (c2)
ARROW CACHE;
Create an external reflection that maps the workspace view to sales by region
ALTER DATASET "myWorkspace"."sales_by_region"
CREATE EXTERNAL REFLECTION "external_sales_by_region" USING "mySource"."sales_by_region";
Drop a reflection
ALTER TABLE Samples."samples.dremio.com"."zips.json"
DROP REFLECTION per_state;
Refresh reflections for a given table
ALTER TABLE Samples."samples.dremio.com"."zips.json"
REFRESH REFLECTIONS;
Route reflections defined on the given datasets to the specified queues
ALTER TABLE "Table 1" ROUTE REFLECTIONS TO DEFAULT QUEUE;
ALTER VIEW "View 1" ROUTE REFLECTIONS TO QUEUE "QUEUE 1";
ALTER VIEW "View 3" ROUTE REFLECTIONS TO DEFAULT QUEUE;