Influencing the Choice of Reflections Used to Satisfy Queries
You can use reflection hints to influence the process of determining which reflections are substituted for tables and views to accelerate queries.
How Reflections are Substituted for Tables and Views
There are three phases in the determination of which plan to use for a query: Consider, Match, and Choose. The following sections describe how reflections relate to these phases.
1. Consider
In this phase, the planner searches the existing reflections for those that include columns that can satisfy the query. The planner also determines whether it can use a starflake reflection.
2. Match
The planner generates replacement plans for the original user query. It searches through the reflections from the Consider phase and tries to match each reflection into alternative forms of the user query. Each successful match is then registered as a replacement plan. At this point, plan cost has not been considered, yet.
3. Choose
The planner explores all replacement sub-plans by using a dynamic programming approach with statistics, cost models and optimization rules as input. The lowest cost plan is chosen, where cost is a function of estimated row count, CPU, I/O, network, and memory.
Types of Reflection Hints
The types of hints are listed in the order in which they are applied.
consider_reflections
You can use a consider_reflection
hint to reduce the amount of time spent devising query plans by restricting the number of reflections that are considered to one or more. It causes the query planner to consider only the reflections that you specify, rather than all existing reflections. You pass in one or more reflection IDs as parameters. If you do not pass in any reflection IDs, the query planner considers all available reflections.
This hint is ideal when the query planner must usually consider a large number of reflections. You can speed up the planning process by providing a list of the only reflections for the planner to consider. For example, after you run a query for the first time, you can look at the Acceleration section of the raw profile of the job that ran the query. Then, you can add the consider_reflection
hint and provide the ID of the reflection that was used the first time the query ran.
ALTER SESSION
SET "reflections.planning.consider_reflections"='3dd4bd60-6a9e-422c-a4ee-556973223602, 09638927-7d70-499f-89cc-7bd60dec50a8';
SELECT /*+ consider_reflections(“3dd4bd60-6a9e-422c-a4ee-556973223602”, "09638927-7d70-499f-89cc-7bd60dec50a8") */ pickup_datetime, passenger_count
FROM Samples."samples.dremio.com"."NYC-taxi-trips-iceberg";
exclude_reflections
You can use an exclude_reflections
hint to reduce the amount of time spent devising query plans by preventing one or more reflections from being considered. It causes the query planner to consider none of the reflections that you specify. You pass in one or more reflection IDs as parameters.
This hint is ideal when the cost-based optimizer has chosen a query plan that uses suboptimal reflection and you want to direct the optimizer to generate a plan without it. For example, suppose that a particular reflection is continually used to satisfy a query, even though you have created a reflection that you want to use. You could use this hint to prevent the first reflection from being considered.
Example of setting this hint for a sessionALTER SESSION
SET "reflections.planning.exclude_reflections"='3dd4bd60-6a9e-422c-a4ee-556973223602, 09638927-7d70-499f-89cc-7bd60dec50a8';
SELECT /*+ exclude_reflections(“3dd4bd60-6a9e-422c-a4ee-556973223602”, "09638927-7d70-499f-89cc-7bd60dec50a8") */ pickup_datetime, passenger_count
FROM Samples."samples.dremio.com"."NYC-taxi-trips-iceberg";
no_reflections
You can use a no_reflections
hint to prevent query plans from using reflections. It causes the query planner not to consider any reflections for a query.
ALTER SESSION
SET "reflections.planning.no_reflections=true";
SELECT /*+ no_reflections */ pickup_datetime, passenger_count
FROM Samples."samples.dremio.com"."NYC-taxi-trips-iceberg";
choose_reflections
You can use a choose_reflections
hint to force the optimizer to choose query plans that use particular matched reflections. It causes the optimizer to prioritize query plans that use one or more reflections over all of the others. You pass in one or more reflection IDs as parameters.
-
If any of the query plans passed to the optimizer are listed in the
choose_reflections
hint, the cost-based optimizer is forced to pick the lowest cost plan containing at least one of the listed reflections. -
If none of the query plans passed to the optimizer are listed in the
choose_reflections
hint, the optimizer is free to pick the lowest cost plan. -
If no reflections are listed in the
choose_reflections
hint, the optimizer is free to pick the lowest cost plan.
ALTER SESSION
SET "reflections.planning.choose_reflections"='3dd4bd60-6a9e-422c-a4ee-556973223602, 09638927-7d70-499f-89cc-7bd60dec50a8';
SELECT /*+ choose_reflections(“3dd4bd60-6a9e-422c-a4ee-556973223602”, "09638927-7d70-499f-89cc-7bd60dec50a8") */ pickup_datetime, passenger_count
FROM Samples."samples.dremio.com"."NYC-taxi-trips-iceberg";
current_iceberg_data_only
For reflections created on tables in Iceberg format, you can use a current_iceberg_data_only
hint to prevent query plans from using reflections that are not updated to the same versions of base tables.
After an Iceberg table is updated, it may take up to 30 seconds for the reflection manager to mark a reflection as not-current based on the updated table and exclude the reflection from queries. The reflection manager marks reflections as current again when they are refreshed.
Example of setting this hint for a sessionALTER SESSION SET "reflections.planning.current_iceberg_data_only"=true;
SELECT /*+ current_iceberg_data_only */ pickup_datetime, passenger_count
FROM Samples."samples.dremio.com"."NYC-taxi-trips-iceberg";
Scope of Reflection Hints
You can set reflection hints either per session, per view, or per query.
If hints are present for a session and in queries (the hints being either explictly set in the queries or implicitly set in them by views) run during that session, all of the hints are applied.
For example, if a query uses /*+ consider_reflections("r1") */
and a session uses "reflection.planning.consider_reflections"='r2'
, where r1
and r2
are reflection IDs, the query planner considers both reflections.
Setting Hints in Sessions
Only one hint is allowed in the ALTER SESSION SET
command.
ALTER SESSION
SET "reflections.planning.<hint>"='<reflection_id> [ , ... ]';
Setting Hints in Views
Syntax for setting a reflection hint in the definition of a viewCREATE VIEW <view_path> as SELECT /*+ <hint>(“<reflection_id>” [ , ...] ) [ <hint>(“<reflection_id>” [ , ... ] ) ... ] */ <rest_of_the_query>
When a query that references one or more views is parsed, and hints are defined in those views, only the hints in the first, top-level view are used. For example, suppose you run this query:
Example querySELECT * FROM VIEW1
If one or more hints are defined in VIEW1, they are used. However, if the definition of VIEW1 includes VIEW2, and one or more hints are defined in VIEW2, the hints in VIEW2 are not used. VIEW1, in this case, is the top-level hint.
Moreover, suppose you run a query that joins VIEW1 and VIEW3, and the query references VIEW1 first. Only the hints defined in VIEW1 are used, even if hints are defined in VIEW3, because VIEW1 is the view that is referenced first.
Setting Hints in Queries
Syntax for setting reflection hints in a querySELECT /*+ <hint>(“<reflection_id>” [ , ...] ) [ <hint>(“<reflection_id>” [ , ... ] ) ... ] */ <rest_of_the_query>
Obtaining Reflection IDs
Three of these four reflection hints require you to list one or more reflection IDs. You can find the ID of a reflection in Acceleration section of the raw profile of job that ran a query that used the reflection, in the sys.project.reflections
system table, and in reflection summary objects that you retrieve with the Reflection REST API.
To find the ID of a reflection in Acceleration section of the raw profile of job that ran a query that used the reflection:
-
In the Dremio console, click in the left navbar.
-
In the list of jobs, locate the job that ran the query. If the query was satisfied by a reflection, purple lightning bolt appears in after the name of the user who ran the query.
-
Click the ID of the job.
-
Click Raw Profile at the top of the screen.
-
Click the Acceleration tab.
-
In the Reflection Outcome section, locate the ID of the reflection.
To find the ID of a reflection in the sys.project.reflections
system table:
-
In the Dremio console, click in the left navbar.
-
Copy this query and paste it into the SQL editor:
Query for listing info about all existing reflectionsSELECT *
FROM sys.project.reflections; -
Sort the results on the
dataset_name
column. -
In the
dataset_name
column, locate the name of the dataset that the reflection was defined on. -
Scroll the table to the right to look through the display columns, dimensions, measures, sort columns, and partition columns to find the combination of attributes that define the reflection.
-
Scroll the table all the way to the left to find the ID of the reflection.
To find the ID of a reflection by using REST APIs: