Skip to main content

Reflection Recommendations Enterprise

Recommendations allow you to obtain effective reflections that accelerate your queries in Dremio. You can use recommendations that are automatically generated based on job history or manually requested for a specific set of jobs.

Viewing Automatically Generated Recommendations

To accelerate query performance on the most frequently accessed views, Dremio generates a list of recommendations each day based on the jobs that have been executed in the last 7 days.

Dremio suggests up to 10 recommendations at a given time. The number of suggested recommendations varies depending on the data Dremio collects. If Dremio does not have relevant recommendations based on collected data, the list of recommendations is empty.

To view a list of automatically generated recommendations:

  1. On the Project Settings page, click Reflections.
  2. To view a list of recommendations, click in the upper-right corner.
  1. Click at the end of a row to trigger Dremio to create the reflection. You can monitor the reflection status on the Reflections page.

Recommendations are sorted by overall effectiveness, with the most effective recommendations shown on top. Effectiveness relates to metrics such as the estimated number of accelerated jobs, potential increase in query execution speedup, and potential time saved during querying. These are rough estimates based on past data that can give you insight into the potential benefits of each recommendation.

Manually Requesting Recommendations for Specific Jobs

You can generate recommendations for default raw reflections by submitting job IDs to accelerate specific SQL queries using either the SYS.RECOMMEND_REFLECTIONS table function or the Recommendations API. Obtain the job IDs by looking them up on the Jobs page.

Running the SYS.RECOMMEND_REFLECTIONS Table Function

The SYS.RECOMMEND_REFLECTIONS table function returns a table of one or more recommendations.

Syntax

Syntax for the SYS.RECOMMEND_REFLECTIONS table function
SELECT * FROM TABLE(SYS.RECOMMEND_REFLECTIONS(ARRAY['<jobId>', '<jobId>']))

The SYS.RECOMMEND_REFLECTIONS function's argument must be an array literal.

Parameters

The SYS.RECOMMEND_REFLECTIONS table function has one parameter: an array that contains the job IDs for the SQL queries that you want to accelerate.

You can list up to 100 job IDs in each SYS.RECOMMEND_REFLECTIONS query. You must have permission to view every job you list in the query.

Output

The output for SYS.RECOMMEND_REFLECTIONS queries is a table that includes the following columns:

Column NameData TypeDescription
view_sqlvarcharIf the recommendation is for a default raw reflection:
The value in this field is --Default Raw Reflection.

If the recommendation is for an aggregation reflection:
An SQL command that creates a view based on the initial SQL query in the referenced job ID. Dremio uses this view to create an aggregation reflection when you run the SQL command from the reflection_sql column. Before running this command, ensure that you have already created an Arctic catalog or folder named recommended_view.
reflection_sqlvarcharIf the recommendation is for a default raw reflection:
An SQL command for defining the reflection on an existing view.

If the recommendation is for an aggregation reflection:
An SQL command that alters and adds an aggregation reflection to the view that is created when you run the SQL command in the view_sql column. Before running this command, ensure that you have run the corresponding command in the view_sql column.

Dremio automatically consolidates recommendations for aggregation reflections when possible to create a single recommendation for similar SQL queries, so the output table may not contain an individual row that corresponds to each job ID listed in the SYS.RECOMMEND_REFLECTIONS query.
job_idslistThe unique identifier of the jobs that ran the queries for which the recommendation is given.
reflection_scoredoubleDremio's score for the recommended reflection's quality, on a scale of 0 (worst) to 100 (best). The reflection_score value considers the recommended reflection's anticipated quality compared to existing reflections and other recommended reflections, as well as the likely improvement in query run times if the recommended reflection is implemented.
average_improvement_factordoubleThe likely average multiplicative rate of improvement for each query if you implement the recommended reflection. For example, if the average_improvement_factor value is 2.34, implementing the recommended reflection is likely to speed up each query by 2.34 times, on average.
average_improvement_msdoubleThe likely average improvement, in milliseconds, for each query if you implement the recommended reflection. For example, if the average_improvement_ms value is 5400, implementing the recommended reflection is likely to save an average of 5400 milliseconds for each query that uses the reflection.

Example Query and Output

To request a recommendation, you must have the job ID for at least one SQL query that you want to accelerate. For example, suppose this is the SQL query:

Example SQL Query
SELECT Max("p_size")
FROM s3.bucket1.tpch.sf10.parquet.part
JOIN s3.bucket1.tpch.sf10.parquet.lineitem
ON "p_partkey" = "l_partkey"
GROUP BY "p_brand"
Obtaining Recommendations
  1. Retrieve the job ID for the query.

  2. Run the SYS.RECOMMEND_REFLECTIONS query with the job ID. In the following example, the job ID is 844c0023-6272-8b16-aef3-aea289acadb1:

    Example SYS.RECOMMEND_REFLECTIONS Query
    SELECT * FROM TABLE(SYS.RECOMMEND_REFLECTIONS(ARRAY['844c0023-6272-8b16-aef3-aea289acadb1']))

The result of the query is returned in this table:

Example Results
view_sqlreflection_sqljob_idsreflection_scoreaverage_improvement_factoraverage_improvement_ms
CREATE VIEW "recommended_view"."Dataset_9d74a03b-747a-42a2-a5ca-7f9c6f77b55d" AS SELECT "part"."P_BRAND" AS "F0[P_BRAND]", "part"."P_SIZE" AS "F1[P_SIZE]", "part"."P_PARTKEY" AS "extra#0", "part"."P_NAME" AS "extra#1", "part"."P_MFGR" AS "extra#2", "part"."P_TYPE" AS "extra#4", "part"."P_CONTAINER" AS "extra#6", "part"."P_RETAILPRICE" AS "extra#7", "part"."P_COMMENT" AS "extra#8", "lineitem"."L_ORDERKEY" AS "extra#9", "lineitem"."L_PARTKEY" AS "extra#10", "lineitem"."L_SUPPKEY" AS "extra#11", "lineitem"."L_LINENUMBER" AS "extra#12", "lineitem"."L_QUANTITY" AS "extra#13", "lineitem"."L_EXTENDEDPRICE" AS "extra#14", "lineitem"."L_DISCOUNT" AS "extra#15", "lineitem"."L_TAX" AS "extra#16", "lineitem"."L_RETURNFLAG" AS "extra#17", "lineitem"."L_LINESTATUS" AS "extra#18", "lineitem"."L_SHIPDATE" AS "extra#19", "lineitem"."L_COMMITDATE" AS "extra#20", "lineitem"."L_RECEIPTDATE" AS "extra#21", "lineitem"."L_SHIPINSTRUCT" AS "extra#22", "lineitem"."L_SHIPMODE" AS "extra#23", "lineitem"."L_COMMENT" AS "extra#24"
FROM "s3"."bucket1"."tpch"."sf10"."parquet"."lineitem"
INNER JOIN "s3"."bucket1"."tpch"."sf10"."parquet"."part" ON "part"."P_PARTKEY" = "lineitem"."L_PARTKEY"
ALTER DATASET "recommended_view"."Dataset_9d74a03b-747a-42a2-a5ca-7f9c6f77b55d" CREATE AGGREGATE REFLECTION "agg_0e07a376-7f8e-4c68-b2ce-6f6e819bebe6" USING DIMENSIONS ("F0[P_BRAND]") MEASURES ("F1[P_SIZE]" (MAX))"6j6c34cf-9drf-b07a-5ab7-abea69a66d00", "1a3c67c0-aab0-f9fb-97b4-af374b520100", "1a3c67c0-db35-3645-9ef1-2a84e4d0ce00"50.0010.007196
Using the Recommendation
  1. If a recommendation is for an aggregation reflection:

    1. Create an Arctic catalog or folder named recommended_view.
    2. Run the query listed in the view_sql column to create a view based on the initial SQL query.
  2. Run the query listed in the reflection_sql column to create the recommended reflection and wait for the reflection to finish refreshing.

  3. When the reflection refresh is complete, run the SQL query to observe the acceleration due to the added reflection.

Limitations

The SYS.RECOMMEND_REFLECTIONS table function has the following limitations:

  • The function cannot provide recommendations for jobs with multiple nested queries that contain joins.
  • An SQL query can contain only inner joins. Outer joins that are part of a view definition are also supported. Other types of joins are not supported.
  • An SQL query cannot contain Window functions.

Sending Requests to the Recommendations API

You can use the Recommendations API to submit the job IDs of jobs that ran SQL queries and receive recommendations for reflections that can accelerate those queries.

  • For default raw reflections, each recommendation comprises the path to the view on which to define the reflection and the parameters to use in a request to create the reflection.
  • For aggregation reflections, each recommendation comprises the parameters to use in a request to create a view on which to define the recommended reflection and the parameters to use in a request to create the reflection.

After you retrieve the recommended reflections for your queries, use the Catalog API to create the recommended views. Then, use the Reflection API to create the desired reflections.

note

Before submitting Catalog API requests to create the recommended views for aggregation reflections, create a folder named recommended_view. In your Catalog API requests, the path parameter must include the full path to the recommended_view folder.

Read the Recommendations API reference for more information.