Reflections
Reflections can be defined on an existing dataset by using SQL commands.
SyntaxALTER [ TABLE | VIEW ] <table_or_view_name>
{ CREATE AGGREGATE REFLECTION <reflection_name> USING
{ DIMENSIONS ( <column_name1>, <column_name2>, ... )
| MEASURES ( <column_name1> ( <aggregation_type>, <column_name2> <aggregation_type> , ... ) )
| DIMENSIONS ( <column_name1>, <column_name2>, ... ) MEASURES ( <column_name1> ( <aggregation_type>, <column_name2> <aggregation_type> , ... ) ) }
[ PARTITION BY ( [<column_name>|<partition_transform>] [ , ... ] ) ]
[ LOCALSORT BY ( <column_name1>, <column_name2>, ... ) ]
[ ARROW CACHE ]
| CREATE EXTERNAL REFLECTION <reflection_name> USING <table_name>
| CREATE RAW REFLECTION <reflection_name> USING DISPLAY ( <column_name1>, <column_name2>, ...)
[ PARTITION BY ( [<column_name>|<partition_transform>] [ , ... ] ) ]
[ LOCALSORT BY ( <column_name1>, <column_name2>, ... ) ]
[ ARROW CACHE ]
| DROP REFLECTION <reflection_name>
| ROUTE REFLECTIONS TO { DEFAULT ENGINE | ENGINE { <engine_name> | <engine_uuid> } }
| { ADD | DROP } PARTITION FIELD { <column_name> | <partition_transform> } }
Parameters
<table_or_view_name> String
The name of the table or view.
CREATE AGGREGATE REFLECTION <reflection_name> String
Creates an aggregation reflection with the specified name.
DIMENSIONS ( <column_name1>, <column_name2>, ... ) String
The columns to include as dimensions in the reflection.
MEASURES ( <column_name1> ( <aggregation_type> ), <column_name2> ( <aggregation_type ), ... ) String
The columns to include as measures in the reflection, and the type of aggregation to perform on them. The possible types are COUNT, MIN, MAX, SUM, and APPROXIMATE COUNT DISTINCT.
PARTITION BY ( <column_name1>, <column_name2>, ... ) String Optional
The columns on which to partition the data horizontally in the reflection.
LOCALSORT BY ( <column_name1>, <column_name2>, ... ) String Optional
The columns on which to sort the data that is in the reflection.
ARROW CACHE Optional
Specifies that you want Dremio to convert data from your reflection’s Parquet files to the Apache Arrow format when copying that data to executor nodes. Normally, Dremio copies data as-is from the Parquet files to caches on executor nodes, which are nodes that carry out the query plans devised by the query optimizer. Enabling this option can improve query performance even more. However, data in the Apache Arrow format requires more space on the executor nodes than data in the default format. You can use this option with Amazon S3.
CREATE EXTERNAL REFLECTION <reflection_name> String
Creates an external reflection with the specified name.
USING <table_name> String
The path of the derived table to use with the external reflection.
CREATE RAW REFLECTION <reflection_name> String
Creates a raw reflection with the specified name.
USING DISPLAY ( <column_name1>, <column_name2>, ... ) String
The columns to include in the reflection.
PARTITION BY ( <column_name1>, <column_name2>, ... ) String Optional
The columns on which to partition the data horizontally in the reflection.
LOCALSORT BY ( <column_name1>, <column_name2>, ... ) String Optional
The columns on which to sort the data that is in the reflection.
ARROW CACHE Optional
Specifies that you want Dremio to convert data from your reflection’s Parquet files to the Apache Arrow format when copying that data to executor nodes. Normally, Dremio copies data as-is from the Parquet files to caches on executor nodes, which are nodes that carry out the query plans devised by the query optimizer. Enabling this option can improve query performance even more. However, data in the Apache Arrow format requires more space on the executor nodes than data in the default format. You can use this option with Amazon S3.
DROP REFLECTION String Optional
Drops the specified reflection defined on the table.
ROUTE REFLECTIONS TO { DEFAULT ENGINE | ENGINE { <engine_name> | <engine_uuid> } } String
Specify the engine on which to run jobs that create and refresh reflections that are defined on the table. Use either the name or UUID of the engine. If neither a name nor a UUID is specified, the default engine is used. You can also directly specify the default engine.
{ ADD | DROP } PARTITION FIELD { <column_name> | <partition_transform> } String Optional
ADD partitions the table data on the values in a single column or by using one of the partition-transformation functions. DROP drops the partition definition. These are the partition-transformation functions:
identity( <col> )
: Explicitly specified identity transform.year( <col> )
: Partition by year. The column uses either the TIMESTAMP or DATE data type.month( <ts_col> )
: Partition by month. The column uses either the TIMESTAMP or DATE data type.day( <ts_col> )
: Partition by day. The column uses either the TIMESTAMP or DATE data type.hour( <ts_col> )
: Partition by hour. The column uses either the TIMESTAMP or DATE data type.bucket( <count>, <col> )
: Partition by hashed value into<count>
buckets.truncate( <length>, <col> )
: Partition by truncated value. Strings are truncated to the specified length. Integer and biginteger values are truncated to bins (for example:truncate(10, i)
produces 0, 10, 20, and so on).
For more information about partitioning reflections, see Horizontally Partition Reflections that Have Many Rows and Partition Reflections to Allow for Partition-Based Incremental Refreshes.
Setting Reflection Hints
You can use reflection hints to influence the process of determining which reflections are substituted for tables and views to accelerate queries. For more information, see Influencing the Choice of Reflections Used to Satisfy Queries.
Syntax for setting hint for a sessionALTER SESSION
SET <reflection_hint>=<value>
SELECT /*+ <reflection_hint> */ <column_name1>, <column_name2>
FROM <table_name>