Reflection Recommendations Enterprise

Recommendations allow you to obtain effective reflections that accelerate your queries in Dremio. You can use recommendations that are automatically generated based on job history or manually requested for a specific set of jobs.

Viewing Automatically Generated Recommendations

When Autonomous Reflections are not enabled, Dremio automatically provides recommendations to add and remove Reflections based on query patterns to optimize performance for queries on Iceberg tables, Parquet datasets, and views based on them. Recommendations to add Reflections are sorted by overall effectiveness, with the most effective recommendations shown on top. Effectiveness relates to metrics such as the estimated number of accelerated jobs, potential increase in query execution speedup, and potential time saved during querying. These are rough estimates based on past data that can give you insight into the potential benefits of each recommendation. Reflections created using these recommendations refresh automatically when the source data changes on:

Iceberg tables: when the table is modified through Dremio or other engines. Dremio polls tables every 10 seconds.
Parquet datasets: when metadata is updated in Dremio.

To view and apply Reflection recommendations:

Navigate to Project Settings.
Select Reflections.
Click Reflections Recommendations to access the list of suggested Reflections.
To apply a recommendation, click on the plus icon at the end of the corresponding row.

Reflections created using usage-based recommendations are only used when fully synchronized with their source data to ensure up-to-date query results.

Manually Requesting Recommendations for Specific Jobs

You can generate recommendations for default raw Reflections by submitting job IDs to accelerate specific SQL queries using either the sys.recommend_reflections table function or the Recommendations API. Obtain the job IDs by looking them up on the Jobs page.

Running the sys.recommend_reflections Table Function

The SYS.RECOMMEND_Reflections table function returns a table of one or more recommendations.

Syntax

Syntax for the SYS.RECOMMEND_Reflections table function

SELECT * FROM TABLE(SYS.RECOMMEND_REFLECTIONS(ARRAY['<jobId>', '<jobId>']))

The SYS.RECOMMEND_Reflections function's argument must be an array literal.

Parameters

The SYS.RECOMMEND_Reflections table function has one parameter: an array that contains the job IDs for the SQL queries that you want to accelerate.

You can list up to 100 job IDs in each SYS.RECOMMEND_Reflections query. You must have permission to view every job you list in the query.

Output

The output for SYS.RECOMMEND_Reflections queries is a table that includes the following columns:

Column Name	Data Type	Description
view_sql	varchar	If the recommendation is for a default raw Reflection: The value in this field is `--Default Raw Reflection`. If the recommendation is for an aggregation Reflection: An SQL command that creates a view based on the initial SQL query in the referenced job ID. Dremio uses this view to create an aggregation Reflection when you run the SQL command from the `reflection_sql` column. Before running this command, ensure that you have already created an Arctic catalog or folder named `recommended_view`.
reflection_sql	varchar	If the recommendation is for a default raw Reflection: An SQL command for defining the Reflection on an existing view. If the recommendation is for an aggregation Reflection: An SQL command that alters and adds an aggregation Reflection to the view that is created when you run the SQL command in the `view_sql` column. Before running this command, ensure that you have run the corresponding command in the `view_sql` column. Dremio automatically consolidates recommendations for aggregation Reflections when possible to create a single recommendation for similar SQL queries, so the output table may not contain an individual row that corresponds to each job ID listed in the SYS.RECOMMEND_Reflections query.
job_ids	list	The unique identifier of the jobs that ran the queries for which the recommendation is given.
reflection_score	double	Dremio's score for the recommended Reflection's quality, on a scale of 0 (worst) to 100 (best). The reflection_score value considers the recommended Reflection's anticipated quality compared to existing Reflections and other recommended Reflections, as well as the likely improvement in query run times if the recommended Reflection is implemented.
average_improvement_factor	double	The likely average multiplicative rate of improvement for each query if you implement the recommended Reflection. For example, if the average_improvement_factor value is 2.34, implementing the recommended Reflection is likely to speed up each query by 2.34 times, on average.
average_improvement_ms	double	The likely average improvement, in milliseconds, for each query if you implement the recommended Reflection. For example, if the average_improvement_ms value is 5400, implementing the recommended Reflection is likely to save an average of 5400 milliseconds for each query that uses the Reflection.

Example Query and Output

To request a recommendation, you must have the job ID for at least one SQL query that you want to accelerate. For example, suppose this is the SQL query:

Example SQL Query

SELECT Max("p_size")
FROM s3.bucket1.tpch.sf10.parquet.part
JOIN s3.bucket1.tpch.sf10.parquet.lineitem
ON "p_partkey" = "l_partkey"
GROUP BY "p_brand"

Obtaining Recommendations

Retrieve the job ID for the query.
Run the SYS.RECOMMEND_Reflections query with the job ID. In the following example, the job ID is 844c0023-6272-8b16-aef3-aea289acadb1:
Example SYS.RECOMMEND_Reflections Query
```
SELECT * FROM TABLE(SYS.RECOMMEND_REFLECTIONS(ARRAY['844c0023-6272-8b16-aef3-aea289acadb1']))
```

The result of the query is returned in this table:

Example Results

view_sql	reflection_sql	job_ids	reflection_score	average_improvement_factor	average_improvement_ms
CREATE VIEW "recommended_view"."Dataset_9d74a03b-747a-42a2-a5ca-7f9c6f77b55d" AS SELECT "part"."P_BRAND" AS "F0[P_BRAND]", "part"."P_SIZE" AS "F1[P_SIZE]", "part"."P_PARTKEY" AS "extra#0", "part"."P_NAME" AS "extra#1", "part"."P_MFGR" AS "extra#2", "part"."P_TYPE" AS "extra#4", "part"."P_CONTAINER" AS "extra#6", "part"."P_RETAILPRICE" AS "extra#7", "part"."P_COMMENT" AS "extra#8", "lineitem"."L_ORDERKEY" AS "extra#9", "lineitem"."L_PARTKEY" AS "extra#10", "lineitem"."L_SUPPKEY" AS "extra#11", "lineitem"."L_LINENUMBER" AS "extra#12", "lineitem"."L_QUANTITY" AS "extra#13", "lineitem"."L_EXTENDEDPRICE" AS "extra#14", "lineitem"."L_DISCOUNT" AS "extra#15", "lineitem"."L_TAX" AS "extra#16", "lineitem"."L_RETURNFLAG" AS "extra#17", "lineitem"."L_LINESTATUS" AS "extra#18", "lineitem"."L_SHIPDATE" AS "extra#19", "lineitem"."L_COMMITDATE" AS "extra#20", "lineitem"."L_RECEIPTDATE" AS "extra#21", "lineitem"."L_SHIPINSTRUCT" AS "extra#22", "lineitem"."L_SHIPMODE" AS "extra#23", "lineitem"."L_COMMENT" AS "extra#24" FROM "s3"."bucket1"."tpch"."sf10"."parquet"."lineitem" INNER JOIN "s3"."bucket1"."tpch"."sf10"."parquet"."part" ON "part"."P_PARTKEY" = "lineitem"."L_PARTKEY"	ALTER DATASET "recommended_view"."Dataset_9d74a03b-747a-42a2-a5ca-7f9c6f77b55d" CREATE AGGREGATE Reflection "agg_0e07a376-7f8e-4c68-b2ce-6f6e819bebe6" USING DIMENSIONS ("F0[P_BRAND]") MEASURES ("F1[P_SIZE]" (MAX))	"6j6c34cf-9drf-b07a-5ab7-abea69a66d00", "1a3c67c0-aab0-f9fb-97b4-af374b520100", "1a3c67c0-db35-3645-9ef1-2a84e4d0ce00"	50.00	10.00	7196

Using the Recommendation

If a recommendation is for an aggregation Reflection:
1. Create an Arctic catalog or folder named recommended_view.
2. Run the query listed in the view_sql column to create a view based on the initial SQL query.
Run the query listed in the reflection_sql column to create the recommended Reflection and wait for the Reflection to finish refreshing.
When the Reflection refresh is complete, run the SQL query to observe the acceleration due to the added Reflection.

Limitations

The SYS.RECOMMEND_REFLECTIONS table function has the following limitations:

The function cannot provide recommendations for jobs with multiple nested queries that contain joins.
An SQL query can contain only inner joins. Outer joins that are part of a view definition are also supported. Other types of joins are not supported.
An SQL query cannot contain Window functions.

Sending Requests to the Recommendations API

You can use the Recommendations API to submit the job IDs of jobs that ran SQL queries and receive recommendations for Reflections that can accelerate those queries.

For default raw Reflections, each recommendation comprises the path to the view on which to define the Reflection and the parameters to use in a request to create the Reflection.
For aggregation Reflections, each recommendation comprises the parameters to use in a request to create a view on which to define the recommended Reflection and the parameters to use in a request to create the Reflection.

After you retrieve the recommended Reflections for your queries, use the Catalog API to create the recommended views. Then, use the Reflection API to create the desired Reflections.

note

Before submitting Catalog API requests to create the recommended views for aggregation Reflections, create a folder named recommended_view. In your Catalog API requests, the path parameter must include the full path to the recommended_view folder.

Read the Recommendations API reference for more information.

Viewing Automatically Generated Recommendations​

Manually Requesting Recommendations for Specific Jobs​

Running the sys.recommend_reflections Table Function​

Syntax​

Parameters​

Output​

Example Query and Output​

Obtaining Recommendations​

Using the Recommendation​

Limitations​

Sending Requests to the Recommendations API​