Reflections
Reflections can be defined on an existing dataset by using SQL commands.
SyntaxALTER { DATASET | TABLE | VIEW } <dataset_name>
{ CREATE AGGREGATE REFLECTION <reflection_name> USING
{ DIMENSIONS ( <column_name1>, <column_name2>, ... )
| MEASURES ( <column_name1> ( <aggregation_type>, <column_name2> <aggregation_type> , ... ) )
| DIMENSIONS ( <column_name1>, <column_name2>, ... ) MEASURES ( <column_name1> ( <aggregation_type>, <column_name2> <aggregation_type> , ... ) ) }
[ PARTITION BY ( [<column_name>|<partition_transform>] [ , ... ] ) ]
[ LOCALSORT BY ( <column_name1>, <column_name2>, ... ) ]
[ ARROW CACHE ]
| CREATE EXTERNAL REFLECTION <reflection_name> USING <table_name>
| CREATE RAW REFLECTION <reflection_name> USING DISPLAY ( <column_name1>, <column_name2>, ...)
[ PARTITION BY ( [<column_name>|<partition_transform>] [ , ... ] ) ]
[ LOCALSORT BY ( <column_name1>, <column_name2>, ... ) ]
[ ARROW CACHE ]
| DROP REFLECTION <reflection_name>
| REFRESH REFLECTIONS
| ROUTE REFLECTIONS TO { DEFAULT QUEUE | QUEUE { <queue_name> | <queue_uuid> } }
| { ADD | DROP } PARTITION FIELD { <column_name> | <partition_transform> } }
Parameters
<dataset_name> String
The name of the table or view. The source can be in the scratch directory or in a datalake source.
CREATE AGGREGATE REFLECTION <reflection_name> String
Creates an aggregation reflection with the specified name.
DIMENSIONS ( <column_name1>, <column_name2>, ... ) String
The columns to include as dimensions in the reflection.
MEASURES ( <column_name1> ( <aggregation_type> ), <column_name2> ( <aggregation_type ), ... ) String
The columns to include as measures in the reflection, and the type of aggregation to perform on them. The possible types are COUNT, MIN, MAX, SUM, and APPROXIMATE COUNT DISTINCT.
PARTITION BY ( <column_name1>, <column_name2>, ... ) String Optional
The columns on which to partition the data horizontally in the reflection.
LOCALSORT BY ( <column_name1>, <column_name2>, ... ) String Optional
The columns on which to sort the data that is in the reflection.
ARROW CACHE Optional
Specifies that you want Dremio to convert data from your reflection’s Parquet files to the Apache Arrow format when copying that data to executor nodes. Normally, Dremio copies data as-is from the Parquet files to caches on executor nodes, which are nodes that carry out the query plans devised by the query optimizer. Enabling this option can improve query performance even more. However, data in the Apache Arrow format requires more space on the executor nodes than data in the default format. You can use this option with Amazon S3.
CREATE EXTERNAL REFLECTION <reflection_name> String
Creates an external reflection with the specified name.
USING <table_name> String
The path of the derived table to use with the external reflection.
CREATE RAW REFLECTION <reflection_name> String
Creates a raw reflection with the specified name.
USING DISPLAY ( <column_name1>, <column_name2>, ... ) String
The columns to include in the reflection.
PARTITION BY ( <column_name1>, <column_name2>, ... ) String Optional
The columns on which to partition the data horizontally in the reflection.
LOCALSORT BY ( <column_name1>, <column_name2>, ... ) String Optional
The columns on which to sort the data that is in the reflection.
DROP REFLECTION String Optional
Drops the specified reflection defined on the table.
REFRESH REFLECTIONS String Optional
For information about the refresh action performed, see Triggering Refreshes by Using the Reflection API, the Catalog API, or an SQL Command.
ROUTE REFLECTIONS TO { DEFAULT QUEUE | QUEUE { <queue_name> | <queue_uuid> } } String
Specify the queue on which to run jobs that refresh reflections defined on the dataset. Use either the name or UUID of the queue. If neither a name nor a UUID is specified, the default queue is used. You can also directly specify the default queue. For more information, see Queue Routing.
{ ADD | DROP } PARTITION FIELD { <column_name> | <partition_transform> } String Optional
ADD partitions the table data on the values in a single column or by using one of the partition-transformation functions. DROP drops the partition definition. These are the partition-transformation functions:
identity( <col> )
: Explicitly specified identity transform.year( <col> )
: Partition by year. The column uses either the TIMESTAMP or DATE data type.month( <ts_col> )
: Partition by month. The column uses either the TIMESTAMP or DATE data type.day( <ts_col> )
: Partition by day. The column uses either the TIMESTAMP or DATE data type.hour( <ts_col> )
: Partition by hour. The column uses either the TIMESTAMP or DATE data type.bucket( <count>, <col> )
: Partition by hashed value into<count>
buckets.truncate( <length>, <col> )
: Partition by truncated value. Strings are truncated to the specified length. Integer and biginteger values are truncated to bins (for example:truncate(10, i)
produces 0, 10, 20, and so on).
For more information about partitioning reflections, see Horizontally Partition Reflections that Have Many Rows and Partition Reflections to Allow for Partition-Based Incremental Refreshes.
Setting Reflection Hints
You can use reflection hints to influence the process of determining which reflections are substituted for tables and views to accelerate queries. For more information, see Influencing the Choice of Reflections Used to Satisfy Queries.
Syntax for setting hint for a sessionALTER SESSION
SET <reflection_hint>=<value>
SELECT /*+ <reflection_hint> */ <column_name1>, <column_name2>
FROM <table_name>
Examples
Create a raw reflection that sorts customers by last name and partitions them by countryALTER TABLE "@user1"."customers"
CREATE RAW REFLECTION customers_by_country USING DISPLAY (id,lastName,firstName,address,country)
PARTITION BY (country)
LOCALSORT BY (lastName);
ALTER TABLE TAMV2."outcome_category_line_item"
CREATE RAW REFLECTION outcome_category_line_item_raw USING DISPLAY (dir0,dir1,queryId,queryText,"start",finish,outcome,username)
PARTITION BY (username)
LOCALSORT BY (outcome);
ALTER TABLE Samples."samples.dremio.com"."zips.json"
CREATE AGGREGATE REFLECTION per_state USING
DIMENSIONS (state) MEASURES (city (COUNT))
LOCALSORT BY (state);
ALTER TABLE BI."p1_table"
CREATE AGGREGATE REFLECTION p1_table_agg USING
DIMENSIONS (c1,c2) MEASURES (p1(count, min), p2(sum, max,approximate count distinct))
PARTITION BY (c1)
LOCALSORT BY (c2)
ARROW CACHE;
ALTER DATASET "myWorkspace"."sales_by_region"
CREATE EXTERNAL REFLECTION "external_sales_by_region" USING "mySource"."sales_by_region";
ALTER TABLE Samples."samples.dremio.com"."zips.json"
DROP REFLECTION per_state;
ALTER TABLE Samples."samples.dremio.com"."zips.json"
REFRESH REFLECTIONS;
ALTER TABLE "Table 1" ROUTE REFLECTIONS TO DEFAULT QUEUE;
ALTER VIEW "View 1" ROUTE REFLECTIONS TO QUEUE "QUEUE 1";
ALTER VIEW "View 3" ROUTE REFLECTIONS TO DEFAULT QUEUE;