Reflection Recommendations
You can submit the IDs of jobs that have run SQL queries, and receive recommendations for default raw reflections or aggregation reflections to create for accelerating the execution of the same SQL queries in the future.
You can obtain the job IDs either by using the SQL API or by looking them up on the Jobs page.
Default Raw Reflections
Default raw reflections (also called basic raw reflections) are reflections that include all of the columns that are in a table or view. For more information about them, see Types of Reflections.
Recommendations are given for creating default raw reflections on views. Each job ID that you submit must represent a query that has run against one or more views. The queries can also have run against tables; however, those tables will not be considered as candidates for default raw reflections.
The most complicated view referenced in a query is the view that will have a default raw reflection recommended for it.
For each job ID that you submit and that represents a query that has run against one or more views, you will receive the SQL for creating one recommended default raw reflection.
Aggregation Reflections
Aggregation reflections are reflections that accelerate BI-style queries that involve aggregations (GROUP BY queries). For more information about them, see Types of Reflections.
A recommendation consists of an SQL command for creating a view and an SQL command for defining an aggregation reflection on that view. Each job ID that you submit must represent a query that has been run against one or more tables or views.
Similar SQL queries are automatically consolidated into groups, when possible; and a single recommendation can be given for a group of queries. For example, if you submit job ID 1, job ID 2, and job ID 3, you might receive one of these sets of results:
- A recommendation for 1, a recommendation for 2, and a recommendation for 3
- A recommendation for 1, and a single recommendation for 2 and 3
- A recommendation for 2, and a single recommendation for 1 and 3
- A recommendation for 3, and a single recommendation for 1 and 2
- A single recommendation for 1, 2, and 3
Methods of Obtaining Recommendations
There are two different methods you can use to obtain recommendations:
Running the SYS.RECOMMEND_REFLECTIONS Table Function
The SYS.RECOMMEND_REFLECTIONS table function returns a table of one or more recommendations.
Syntax
Syntax for the SYS.RECOMMEND_REFLECTIONS table functionSELECT * FROM TABLE(SYS.RECOMMEND_REFLECTIONS(ARRAY['<jobId>', '<jobId>']))
The SYS.RECOMMEND_REFLECTIONS function's argument must be an array literal.
Parameters
The SYS.RECOMMEND_REFLECTIONS table function has one parameter: an array that contains the job IDs for the SQL queries that you want to accelerate.
You can list up to 10 job IDs in each SYS.RECOMMEND_REFLECTIONS query. You must have permission to view every job you list in the query.
Output
The output for SYS.RECOMMEND_REFLECTIONS queries is a table that includes the following three columns:
Column Name | Data Type | Description |
---|---|---|
view_sql | varchar | If the recommendation is for a default raw reflection: The value in this field is --Default Raw Reflection .If the recommendation is for an aggregation reflection: An SQL command that creates a view based on the initial SQL query in the referenced job ID. Dremio uses this view to create an aggregation reflection when you run the SQL command from the reflection_sql column. Before running this command, ensure that you have already created an Arctic catalog or folder named recommended_view . |
reflection_sql | varchar | If the recommendation is for a default raw reflection: An SQL command for defining the reflection on an existing view. If the recommendation is for an aggregation reflection: An SQL command that alters and adds an aggregation reflection to the view that is created when you run the SQL command in the view_sql column. Before running this command, ensure that you have run the corresponding command in the view_sql column.Dremio automatically consolidates recommendations for aggregation reflections when possible to create a single recommendation for similar SQL queries, so the output table may not contain an individual row that corresponds to each job ID listed in the SYS.RECOMMEND_REFLECTIONS query. |
job_ids | list | The unique identifier of the jobs that ran the queries for which the recommendation is given. |
Example Query and Output
To request a recommendation, you must have the job ID for at least one SQL query that you want to accelerate. For example, suppose this is the SQL query:
Example SQL QuerySELECT Max("p_size")
FROM s3.bucket1.tpch.sf10.parquet.part
JOIN s3.bucket1.tpch.sf10.parquet.lineitem
ON "p_partkey" = "l_partkey"
GROUP BY "p_brand"
Obtaining Recommendations
Retrieve the job ID for the query.
Run the SYS.RECOMMEND_REFLECTIONS query with the job ID. In the following example, the job ID is
Example SYS.RECOMMEND_REFLECTIONS Query844c0023-6272-8b16-aef3-aea289acadb1
:SELECT * FROM TABLE(SYS.RECOMMEND_REFLECTIONS(ARRAY['844c0023-6272-8b16-aef3-aea289acadb1']))
The result of the query is returned in this table:
Example Resultsview_sql | reflection_sql | job_ids |
---|---|---|
CREATE VIEW "recommended_view"."Dataset_9d74a03b-747a-42a2-a5ca-7f9c6f77b55d" AS SELECT "part"."P_BRAND" AS "F0[P_BRAND]", "part"."P_SIZE" AS "F1[P_SIZE]", "part"."P_PARTKEY" AS "extra#0", "part"."P_NAME" AS "extra#1", "part"."P_MFGR" AS "extra#2", "part"."P_TYPE" AS "extra#4", "part"."P_CONTAINER" AS "extra#6", "part"."P_RETAILPRICE" AS "extra#7", "part"."P_COMMENT" AS "extra#8", "lineitem"."L_ORDERKEY" AS "extra#9", "lineitem"."L_PARTKEY" AS "extra#10", "lineitem"."L_SUPPKEY" AS "extra#11", "lineitem"."L_LINENUMBER" AS "extra#12", "lineitem"."L_QUANTITY" AS "extra#13", "lineitem"."L_EXTENDEDPRICE" AS "extra#14", "lineitem"."L_DISCOUNT" AS "extra#15", "lineitem"."L_TAX" AS "extra#16", "lineitem"."L_RETURNFLAG" AS "extra#17", "lineitem"."L_LINESTATUS" AS "extra#18", "lineitem"."L_SHIPDATE" AS "extra#19", "lineitem"."L_COMMITDATE" AS "extra#20", "lineitem"."L_RECEIPTDATE" AS "extra#21", "lineitem"."L_SHIPINSTRUCT" AS "extra#22", "lineitem"."L_SHIPMODE" AS "extra#23", "lineitem"."L_COMMENT" AS "extra#24" FROM "s3"."bucket1"."tpch"."sf10"."parquet"."lineitem" INNER JOIN "s3"."bucket1"."tpch"."sf10"."parquet"."part" ON "part"."P_PARTKEY" = "lineitem"."L_PARTKEY" | ALTER DATASET "recommended_view"."Dataset_9d74a03b-747a-42a2-a5ca-7f9c6f77b55d" CREATE AGGREGATE REFLECTION "agg_0e07a376-7f8e-4c68-b2ce-6f6e819bebe6" USING DIMENSIONS ("F0[P_BRAND]") MEASURES ("F1[P_SIZE]" (MAX)) | "6j6c34cf-9drf-b07a-5ab7-abea69a66d00", "1a3c67c0-aab0-f9fb-97b4-af374b520100", "1a3c67c0-db35-3645-9ef1-2a84e4d0ce00" |
Using the Recommendation
If a recommendation is for an aggregation reflection:
- Create an Arctic catalog or folder named
recommended_view
. - Run the query listed in the
view_sql
column to create a view based on the initial SQL query.
- Create an Arctic catalog or folder named
Run the query listed in the
reflection_sql
column to create the recommended reflection and wait for the reflection to finish refreshing.When the reflection refresh is complete, run the SQL query to observe the acceleration due to the added reflection.
Limitations
The SYS.RECOMMEND_REFLECTIONS table function has the following limitations:
- The function cannot provide recommendations for jobs with multiple nested queries that contain joins.
- An SQL query can contain only inner joins. Outer joins that are part of a view definition are also supported. Other types of joins are not supported.
- An SQL query cannot contain Window functions.
Sending Requests to the Recommendations API
Each response includes one or more recommendations.
If a recommendation is for a default raw reflection: Each recommendation is comprised of the path to the view on which to define the reflection, and the parameters to use in a request to create the reflection.
You can use the Catalog API to retrieve the ID of the view. Then, you can use the Reflection API to create the reflection.
If a recommendation is for an aggregation reflection: Each recommendation is comprised of the parameters to use in a request to create a view on which to define the recommended reflection, and the parameters to use in a request to create the reflection.
You can use the Catalog API to create the recommended view. And then, you can use the Reflection API to create the reflection.
noteBefore submitting requests to create recommended views on which to define recommended aggregation reflections, create a folder named
recommended_view
. In your requests, ensure that thepath
parameter includes the full path to that folder.
This example request submits two job IDs:
Example requestcurl -X POST 'https://{hostname}/v0/projects/{projectid}/reflection/recommendations' \
--header 'Authorization: _dremio{tokenstring}' \
--header 'Content-Type: application/json' \
--data-raw
'{
"jobids": [
"1b1874ce-b241-2475-63c0-db22e707f100",
"1b1874ce-b241-2475-63c0-dbdsfssdfs00"
]
}'
You can obtain the job IDs either by using the SQL API or by looking them up on the Jobs page.
The response contains one or more recommendations. Each recommendation has five parts:
viewRequestBody
: The fields that you can include in a request to the Catalog API for creating the view on which to define a recommended aggregation reflection.noteIf a recommendation is for a default raw reflection, then creating a view is unnecessary, because such recommendations are given only for existing views, and the value of this field is
--Default Raw Reflection
.viewRequestEndpoint
: The endpoint to use when submitting the request, given inviewRequestBody
, to the Catalog API.reflectionRequestBody
: The fields that you can include in a request to the Reflection API for creating the recommended reflection.- If the recommendation is for a default raw reflection, replace
{ViewDatasetId}
with the ID of the view that is in the path given by thepath
field. - If the recommendation is for an aggregation reflection, replace
{ViewDatasetId}
with the ID that is returned after you create the recommended view.
- If the recommendation is for a default raw reflection, replace
reflectionRequestEndpoint
: The endpoint to use when submitting the request to the Reflection API.jobIds
: The IDs of the jobs that ran the queries for which the recommendations are given.
{
"data": [
{
"viewRequestBody": {
"entityType": "dataset",
"type": "VIRTUAL_DATASET",
"path": [
"recommended_view",
"view_data"
],
"sql": "--Default Raw Reflection"
},
"viewRequestEndpoint": "POST {hostname}/api/v3/catalog",
"reflectionRequestBody": {
"type": "RAW",
"name": "raw_9c4bf1cd-1c8c-42b9-9d38-d4fc2b0d374a",
"datasetId": "{ViewDatasetId}",
"enabled": true,
"arrowCachingEnabled": false,
"dimensionFields": [],
"measureFields": [],
"displayFields": [
{
"name": "i_item_sk"
},
{
"name": "i_item_id"
},
{
"name": "i_rec_start_date"
},
{
"name": "i_rec_end_date"
},
{
"name": "i_item_desc"
},
{
"name": "i_current_price"
},
{
"name": "i_wholesale_cost"
},
{
"name": "i_brand_id"
},
{
"name": "i_brand"
},
{
"name": "i_class_id"
},
{
"name": "i_class"
},
{
"name": "i_category_id"
},
{
"name": "i_category"
},
{
"name": "i_manufact_id"
},
{
"name": "i_manufact"
},
{
"name": "i_size"
},
{
"name": "i_formulation"
},
{
"name": "i_color"
},
{
"name": "i_units"
},
{
"name": "i_container"
},
{
"name": "i_manager_id"
},
{
"name": "i_product_name"
},
{
"name": "d_date_sk"
},
{
"name": "d_date_id"
},
{
"name": "d_date"
},
{
"name": "d_month_seq"
},
{
"name": "d_week_seq"
},
{
"name": "d_quarter_seq"
},
{
"name": "d_year"
},
{
"name": "d_dow"
},
{
"name": "d_moy"
},
{
"name": "d_dom"
},
{
"name": "d_qoy"
},
{
"name": "d_fy_year"
},
{
"name": "d_fy_quarter_seq"
},
{
"name": "d_fy_week_seq"
},
{
"name": "d_day_name"
},
{
"name": "d_quarter_name"
},
{
"name": "d_holiday"
},
{
"name": "d_weekend"
},
{
"name": "d_following_holiday"
},
{
"name": "d_first_dom"
},
{
"name": "d_last_dom"
},
{
"name": "d_same_day_ly"
},
{
"name": "d_same_day_lq"
},
{
"name": "d_current_day"
},
{
"name": "d_current_week"
},
{
"name": "d_current_month"
},
{
"name": "d_current_quarter"
},
{
"name": "d_current_year"
}
],
"entityType": "reflection"
},
"reflectionRequestEndpoint": "POST {hostname}/api/v3/reflection",
"jobIds": ["6j6c34cf-9drf-b07a-5ab7-abea69a66d00"]
}
],
"canAlterReflections": true
}
{
"data": [
{
"viewRequestBody": {
"entityType": "dataset",
"path": [
"recommended_view",
"View_1"
],
"type": "VIRTUAL_DATASET",
"sql": "SELECT * FROM Samples.samples.dremio.com.\"NYC-taxi-trips\" WHERE trip_distance_mi <= 2.0 ORDER BY trip_distance_mi ASC",
}
"viewRequestEndpoint": "POST {hostname}/catalog",
"reflectionRequestBody": {
"type": "AGGREGATION",
"name": "agg_250e70d1-5e2a-4938-a1a1-95f664085099",
"datasetId": {ViewDatasetId},
"enabled": true,
"dimensionFields": [
{
"name": "passenger_count",
"granularity": "DATE"
}
],
"measureFields": [
{
"name": "fare_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
}
],
"entityType": "reflection"
},
"reflectionRequestEndpoint": "POST {hostname}/reflection",
"jobIds": ["5f6d23rg-2ekf-d15b-9fv0-peif76g41h99"]
}
],
"canAlterReflections": true
}
For more information about the Recommendations API, see Recommendations in the API reference.