Skip to main content

Recommendations

Use the Recommendations API to get job-based recommendations and get and create usage-based reflections that can accelerate your queries.

Getting job-based recommendations requires making the following two API requests:

  1. Submit the job IDs of jobs that have run SQL queries. These are job IDs of the queries for which you want to retrieve recommendations in further requests. This request returns the job ID to use in the second request.
  2. Retrieve job-based recommendations for reflections that can accelerate your queries. Use the job ID that was returned in your first request to make the request for recommendations.

Creating reflections from usage-based recommendations requires making the following two API requests:

  1. Retrieve usage-based recommendations for reflections. This request returns the parameters to use in the body of the second request.
  2. Create reflections from usage-based recommendations that can accelerate your queries from the usage-based recommendations. Use the recommendation ID and reflection request body that were returned in your first request to create the reflections.
note

Dremio supports creating only raw reflections from usage-based recommendations.

Recommendation Object (Raw Reflection)
{
"data": [
{
"viewRequestBody": {
"entityType": "dataset",
"type": "VIRTUAL_DATASET",
"path": [
"azure_3",
"table_2"
],
"sql": "--Default Raw Reflection"
},
"viewRequestEndpoint": "{hostname}/api/v3/catalog",
"reflectionRequestBody": {
"type": "RAW",
"name": "raw_47f54460-543f-430f-a9e5-ca71d246265e",
"datasetId": "{\"tableKey\":[\"azure_3\",\"table_2\"],\"contentId\":\"356d6214-6c55-4011-8378-34b2f65d38c6\",\"versionContext\":{\"type\":\"BRANCH\",\"value\":\"main\"}}",
"enabled": true,
"arrowCachingEnabled": false,
"dimensionFields": [],
"measureFields": [],
"displayFields": [
{
"name": "passenger_count"
},
{
"name": "EXPR$1"
}
],
"entityType": "reflection"
},
"reflectionRequestEndpoint": "POST {hostname}/api/v3/reflection",
"jobIds": [
"13ffb629-9f0e-4265-97df-99bf0d425813"
],
"jobCount": 1,
"recommendationId": "9be8a451-4190-4618-a72e-9932f790c744",
"reflectionScore": 50.67,
"avgImprovementFactor": 10.43,
"avgImprovementMs": 7196
}
],
"canAlterReflections": true
}
Recommendation Object (Aggregation Reflection)
{
"data": [
{
"viewRequestBody": {
"entityType": "dataset",
"type": "VIRTUAL_DATASET",
"path": [
"recommended_view",
"view_1"
],
"sql": "SELECT * FROM Samples.samples.dremio.com.\"NYC-taxi-trips\" WHERE trip_distance_mi <= 2.0 ORDER BY trip_distance_mi ASC"
},
"viewRequestEndpoint": "POST {hostname}/api/v3/catalog",
"reflectionRequestBody": {
"type": "AGGREGATION",
"name": "agg_0e0c4ab9-def7-48da-81f1-ca8c1da11ed4",
"datasetId": "{\"tableKey\":[\"recommended_view\",\"view_1\"],\"contentId\":\"7ae1eded-5133-4e80-bf06-1be20975e0aa\",\"versionContext\":{\"type\":\"BRANCH\",\"value\":\"main\"}}",
"enabled": true,
"arrowCachingEnabled": false,
"dimensionFields": [
{
"name": "passenger_count",
"granularity": "DATE"
}
],
"measureFields": [
{
"name": "fare_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
}
],
"displayFields": [],
"entityType": "reflection"
},
"reflectionRequestEndpoint": "POST {hostname}/api/v3/reflection",
"jobIds": [
"1ded81f8-4d06-4d09-8163-9e2517027d8d"
],
"jobCount": 1,
"recommendationId": "1855d2dd-4106-4359-a97a-e08a916096e6",
"reflectionScore": 60.12,
"avgImprovementFactor": 8.39,
"avgImprovementMs": 5400
}
],
"canAlterReflections": true
}

Recommendation Attributes

data Array of Object

List of recommended reflection objects for the submitted job IDs.


canAlterReflections Boolean

If the columns in the recommended reflection can be edited, added, and removed, true. Otherwise, false.

Example: true

Attributes of the data Array

viewRequestBody Object

The fields to include in a request to the Catalog API to create the view on which to define the recommended reflection.


viewRequestEndpoint String

The endpoint to use when submitting a request to the Catalog API to create the view on which to define the recommended reflection. NOTE: The API returns the Software endpoint. The correct endpoint for Cloud requests is POST /v0/projects/{project-id}/catalog.


reflectionRequestBody Object

The fields to include in a request to the Reflection API to create the recommended reflection.


reflectionRequestEndpoint String

The endpoint to use when submitting the request to the Reflection API to create the recommended reflection. NOTE: The API returns the Software endpoint. The correct endpoint for Cloud requests is POST /v0/projects/{project-id}/reflection.


jobIds Array of String

The job IDs of the queries for which the reflection recommendations are given.

Example: ["13ffb629-9f0e-4265-97df-99bf0d425813"]


jobCount Array of String

The number of jobs for which reflection recommendations are given.

Example: 1


recommendationId Array of String

The ID of the recommended reflection.

Example: ["9be8a451-4190-4618-a72e-9932f790c744"]


reflectionScore Double

Score for the recommended reflection's quality, on a scale of 0 (worst) to 100 (best). The reflectionScore value considers the recommended reflection's anticipated quality compared to existing reflections and other recommended reflections, as well as the likely improvement in query run times if the recommended reflection is implemented.

Example: 50.67


avgImprovementFactor Double

The likely average multiplicative rate of improvement for each query if you implement the recommended reflection. For example, if the avgImprovementFactor value is 2.34, implementing the recommended reflection is likely to speed up each query by 2.34 times, on average.

Example: 10.43


avgImprovementMs Double

The likely average improvement, in milliseconds, for each query if you implement the recommended reflection. For example, if the avgImprovementMs value is 5400, implementing the recommended reflection is likely to save an average of 5400 milliseconds for each query that uses the reflection.

Example: 7196

Attributes of the viewRequestBody Object

entityType String

Type of catalog entity. For views, the entityType is dataset.


type String

Type of dataset. For views, the type is VIRTUAL_DATASET.


path Array of String

Path to the location where the view should be created within Dremio, expressed in an array. The path lists each level of hierarchy in order, from outer to inner: Arctic source or catalog first, then folder and subfolders, then a name for the view itself as the last item in the array. Views can only be created in Arctic sources and the project's Arctic catalog.

Example: ["azure_3","table_2"]


sql String

For aggregation reflections, the SQL query to use to create the view. For default raw reflections, the sql value --Default Raw Reflection; creating a view is unnecessary because raw recommendations are given only for existing views.

Attributes of the reflectionRequestBody Object

type String

Reflection type. For details, read Types of Reflections.

Enum: RAW, AGGREGATION

Example: AGGREGATION


name String

User-provided name for the reflection. For reflections created in the Dremio console, if the user did not provide a name, the default values are Raw Reflection and Aggregation Reflection (automatically assigned based on the reflection type).

Example: raw_47f54460-543f-430f-a9e5-ca71d246265e


datasetId String

Unique identifier for the anchor dataset to associate with the reflection. The datasetId is a JSON string.

Example: {\"tableKey\":[\"azure_3\",\"table_2\"],\"contentId\":\"356d6214-6c55-4011-8378-34b2f65d38c6\",\"versionContext\":{\"type\":\"BRANCH\",\"value\":\"main\"}}


enabled Boolean

If the reflection is available for accelerating queries, true. Otherwise, false.

Example: true


arrowCachingEnabled Boolean

If Dremio converts data from the reflection's Parquet files to Apache Arrow format when copying that data to executor nodes, true. Otherwise, false.

Example: false


dimensionFields Array of Object

Information about the dimension fields from the anchor dataset used in the reflection. Dimension fields are the fields you expect to group by when analyzing data. Valid only for aggregation reflections. For raw reflections or if the anchor dataset does not include any dimension fields, the dimensionFields value is an empty array. For aggregation reflections, if the anchor dataset includes dimension fields, each object in the dimensionFields array contains two attributes: name and granularity.

Example: [{"name": "pickup_date","granularity": "DATE"},{"name": "pickup_datetime","granularity": "DATE"},{"name": "dropoff_date","granularity": "DATE"},{"name": "dropoff_datetime","granularity": "DATE"},{"name": "passenger_count","granularity": "DATE"},{"name": "total_amount","granularity": "DATE"}]


measureFields Array of Object

Information about the measure fields from the anchor dataset used in the reflection. Measure fields are the fields you expect to use for calculations when analyzing the data. Valid only for aggregation reflections. For raw reflections or if the anchor dataset does not include any measure fields, the measureFields value is an empty array. For aggregation reflections, if the anchor dataset includes measure fields, each object in the measureFields array contains two attributes: name and measureTypeList.

Example: [{"name": "passenger_count","measureTypeList": ["SUM,"COUNT"]},{"name": "trip_distance_mi","measureTypeList": ["SUM","COUNT"]},{"name": "fare_amount","measureTypeList": ["SUM","COUNT"]},{"name": "surcharge","measureTypeList": ["SUM","COUNT"]},{"name": "tip_amount","measureTypeList": ["SUM","COUNT"]},{"name": "total_amount","measureTypeList": ["SUM","COUNT"]}]


displayFields Array of Object

Information about the fields displayed from the anchor dataset. Valid only for raw reflections. For aggregation reflections or if the anchor dataset does not include any display fields, the value is an empty array. For raw reflections, if the anchor dataset includes display fields, each object in the displayFields array contains one attribute: name.

Example: [{"name": "passenger_count"},{"name": "EXPR$1"}]


entityType String

Type of entity. For reflection objects, the entityType is reflection.

Submitting Job IDs

Submit the job IDs of queries for which you want to request reflection recommendations.

The response includes objects that contain an id attribute and value for each job ID you submit. Use these id values to request recommendations for reflections to accelerate the queries.

Method and URL
POST /v0/projects/{project-id}/reflection/recommendations/job-based/

Parameters

project-id Path   String (UUID)

Unique identifier for the project where the queries were run.

Example: 0b00aed4-f2c0-4daa-8f30-f63d0aacefea


jobIds Body   Array of String

The job IDs of the queries for which you want to request reflection recommendations. To get the job IDs, use the SQL API or find them on the Jobs page in the Dremio console. Use a comma-separated list to submit multiple job IDs.

Example: ["a7efcd50-791a-48e8-bb05-391b4411e66b"]

Example Request
curl -X POST 'https://api.dremio.cloud/v0/api/projects/0b00aed4-f2c0-4daa-8f30-f63d0aacefea/reflection/recommendations/job-based/' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json' \
--data-raw '{
"jobIds": ["a7efcd50-791a-48e8-bb05-391b4411e66b","c2485882-e6b7-4aa8-af5b-a825d2870589"]
}'
Example Response
{
"id": "13ffb629-9f0e-4265-97df-99bf0d425813"
}

Response Status Codes

200   OK

400   Bad Request

401   Unauthorized

405   Method Not Allowed

500   Internal Server Error

Retrieving Job-Based Recommendations

Retrieve job-based recommended reflections to accelerate the queries whose job IDs you submitted.

  • For default raw reflections, each recommendation comprises the path to the view on which to define the reflection and the parameters to use in a request to create the reflection.
  • For aggregation reflections, each recommendation comprises the parameters to use in a request to create a view on which to define the recommended reflection and the parameters to use in a request to create the reflection.

After you retrieve the recommended reflections for your queries, use the Catalog API to create the recommended views. Then, use the Reflection API to create the desired reflections.

note

Before submitting Catalog API requests to create the recommended views for aggregation reflections, create a folder named recommended_view. In your Catalog API requests, the path parameter must include the full path to the recommended_view folder.

Method and URL
GET /v0/projects/{project-id}/reflection/recommendations/job-based/{id}/results/

Parameters

project-id Path   String (UUID)

Unique identifier for the project where the queries were run.

Example: 0b00aed4-f2c0-4daa-8f30-f63d0aacefea


id Path   String

The id value returned in the response to your request to submit the job ID or IDs of the queries for which you want to retrieve recommended reflections.

Example: 13ffb629-9f0e-4265-97df-99bf0d425813

Example Request
curl -X GET 'https://api.dremio.cloud/v0/api/projects/0b00aed4-f2c0-4daa-8f30-f63d0aacefea/reflection/recommendations/job-based/13ffb629-9f0e-4265-97df-99bf0d425813/results/' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json'
Example Response
{
"data": [
{
"viewRequestBody": {
"entityType": "dataset",
"type": "VIRTUAL_DATASET",
"path": [
"azure_3",
"table_2"
],
"sql": "--Default Raw Reflection"
},
"viewRequestEndpoint": "{hostname}/api/v3/catalog",
"reflectionRequestBody": {
"type": "RAW",
"name": "raw_47f54460-543f-430f-a9e5-ca71d246265e",
"datasetId": "{\"tableKey\":[\"azure_3\",\"table_2\"],\"contentId\":\"356d6214-6c55-4011-8378-34b2f65d38c6\",\"versionContext\":{\"type\":\"BRANCH\",\"value\":\"main\"}}",
"enabled": true,
"arrowCachingEnabled": false,
"dimensionFields": [],
"measureFields": [],
"displayFields": [
{
"name": "passenger_count"
},
{
"name": "EXPR$1"
}
],
"entityType": "reflection"
},
"reflectionRequestEndpoint": "POST {hostname}/api/v3/reflection",
"jobIds": [
"13ffb629-9f0e-4265-97df-99bf0d425813"
],
"jobCount": 1,
"recommendationId": "9be8a451-4190-4618-a72e-9932f790c744",
"reflectionScore": 50.67,
"avgImprovementFactor": 10.43,
"avgImprovementMs": 7196
}
],
"canAlterReflections": true
}

Response Status Codes

200   OK

400   Bad Request

401   Unauthorized

405   Method Not Allowed

500   Internal Server Error

Retrieving Usage-Based Recommendations

Retrieve usage-based reflection recommendations. The response includes the reflectionRequestBody and recommendationId attributes to use as body parameters in your request to create usage-based reflections.

Method and URL
GET /v0/projects/{project-id}/reflection/recommendations/usage-based/

Parameters

project-id Path   String (UUID)

Unique identifier for the project where the queries were run.

Example: 0b00aed4-f2c0-4daa-8f30-f63d0aacefea


Example Request
curl -X GET 'https://api.dremio.cloud/v0/api/projects/0b00aed4-f2c0-4daa-8f30-f63d0aacefea/reflection/recommendations/usage-based/' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json'
Example Response
{
"data": [
{
"viewRequestBody": {
"entityType": "dataset",
"type": "VIRTUAL_DATASET",
"path": [
"prodFolder",
"cost_based"
],
"sql": "--Default Raw Reflection"
},
"viewRequestEndpoint": "POST {hostname}/api/v3/catalog",
"reflectionRequestBody": {
"type": "RAW",
"name": "AutoRef_cost_based_raw",
"datasetId": "{\"tableKey\":[\"prodFolder\",\"cost_based\"],\"contentId\":\"afe81b44-2518-4bfc-a659-e47cd40024e0\",\"versionContext\":{\"type\":\"BRANCH\",\"value\":\"main\"}}",
"enabled": true,
"arrowCachingEnabled": false,
"dimensionFields": [],
"measureFields": [],
"displayFields": [
{
"name": "passenger_count"
},
{
"name": "pickup_datetime"
},
{
"name": "EXPR$2"
}
],
"entityType": "reflection"
},
"reflectionRequestEndpoint": "POST {hostname}/api/v3/reflection",
"jobIds": [
"3ae13cd9-9534-4713-875e-70b9a3f5d956"
],
"jobCount": 1,
"recommendationId": "prodFolder.cost_based",
"reflectionScore": 18.031160082597836,
"avgImprovementFactor": 5.263157946323747,
"avgImprovementMs": 11821.950027180083
}
],
"canAlterReflections": true
}

Response Status Codes

200   OK

400   Bad Request

401   Unauthorized

500   Internal Server Error

Creating Reflections from Usage-Based Recommendations

Create reflections to accelerate queries using the usage-based recommendations that you retrieved.

note

You must retrieve usage-based recommendations to get the parameters you need for this request.

Dremio supports creating only raw reflections from usage-based recommendations.

Method and URL
POST /v0/projects/{project-id}/reflection/recommendations/usage-based/

Parameters

project-id Path   String (UUID)

Unique identifier for the project where the queries were run.

Example: 0b00aed4-f2c0-4daa-8f30-f63d0aacefea


reflection Body   Object

Information about the usage-based reflection to create. The reflection object includes the contents of the reflectionRequestBody included in the response for requests to retrieve usage-based recommendations.

NOTE: If desired, you may change the name of the reflection by changing the value for the reflection.name parameter in the body of your request. Dremio ignores any changes to the values of other parameters in the reflection object.


recommendationId Body   String

Identifier for the usage-based recommendation you want to use to create reflections. The recommendationId is included in the response for requests to retrieve usage-based recommendations.

Example: prodFolder.cost_based


Parameters of the reflectionRequestBody Object

type Body   String

Reflection type. Value must be RAW.


name Body   String

User-provided name for the reflection.

NOTE: If desired, you may change the name of the reflection by changing the value for the name parameter in the body of your request.

Example: AutoRef_cost_based_raw


datasetId Body   String

Unique identifier for the anchor dataset to associate with the reflection. The datasetId is a JSON string.

Example: {\"tableKey\":[\"prodFolder\",\"cost_based\"],\"contentId\":\"afe81b44-2518-4bfc-a659-e47cd40024e0\",\"versionContext\":{\"type\":\"BRANCH\",\"value\":\"main\"}}


enabled Body   Boolean

If the reflection is available for accelerating queries, true. Otherwise, false.

Example: true


arrowCachingEnabled Body   Boolean

If Dremio converts data from the reflection's Parquet files to Apache Arrow format when copying that data to executor nodes, true. Otherwise, false.

Example: false


dimensionFields Body   Array of Object

Information about the dimension fields from the anchor dataset used in the reflection. For raw reflections, the dimensionFields value is an empty array.

Example: []


measureFields Body   Array of Object

Information about the measure fields from the anchor dataset used in the reflection. For raw reflections, the measureFields value is an empty array.

Example: []


displayFields Body   Array of Object

Information about the fields displayed from the anchor dataset. Valid only for raw reflections. If the anchor dataset includes display fields, each object in the displayFields array contains one attribute: name.

Example: [{"name": "passenger_count"},{"name": "EXPR$1"}]


entityType Body   String

Type of entity. For reflection objects, the entityType is reflection.

Example Request
curl -X POST 'https://api.dremio.cloud/v0/api/projects/0b00aed4-f2c0-4daa-8f30-f63d0aacefea/reflection/recommendations/usage-based/' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json' \
--data-raw '{
"reflectionRequestBody": {
"type": "RAW",
"name": "AutoRef_cost_based_raw",
"datasetId": "{\"tableKey\":[\"prodFolder\",\"cost_based\"],\"contentId\":\"afe81b44-2518-4bfc-a659-e47cd40024e0\",\"versionContext\":{\"type\":\"BRANCH\",\"value\":\"main\"}}",
"enabled": true,
"arrowCachingEnabled": false,
"dimensionFields": [],
"measureFields": [],
"displayFields": [
{
"name": "passenger_count"
},
{
"name": "pickup_datetime"
},
{
"name": "EXPR$2"
}
],
"entityType": "reflection"
},
"recommendationId": "prodFolder.cost_based"
}
Example Response
{
"id": "c929b8d2-82bf-4175-9476-010ba17c4f7f",
"type": "RAW",
"name": "AutoRef_cost_based_raw",
"tag": "bd68c096-af11-4bfb-b74a-d08bb9acfdda",
"createdAt": "2024-06-28T19:30:30.977Z",
"updatedAt": "2024-06-28T19:30:30.977Z",
"datasetId": "{\"tableKey\":[\"prodFolder\",\"cost_based\"],\"contentId\":\"afe81b44-2518-4bfc-a659-e47cd40024e0\",\"versionContext\":{\"type\":\"BRANCH\",\"value\":\"main\"}}",
"currentSizeBytes": 0,
"totalSizeBytes": 0,
"enabled": true,
"arrowCachingEnabled": false,
"status": {
"config": "OK",
"refresh": "SCHEDULED",
"availability": "NONE",
"combinedStatus": "CANNOT_ACCELERATE_SCHEDULED",
"failureCount": 0,
"lastDataFetch": "1969-12-31T23:59:59.999Z",
"expiresAt": "1969-12-31T23:59:59.999Z"
},
"displayFields": [
{
"name": "passenger_count"
},
{
"name": "pickup_datetime"
},
{
"name": "EXPR$2"
}
],
"partitionDistributionStrategy": "CONSOLIDATED",
"canView": true,
"canAlter": true,
"entityType": "reflection"
}

Response Status Codes

200   OK

400   Bad Request

401   Unauthorized

500   Internal Server Error

Refreshing Usage-Based Recommendations

Process collected data about view usage, clear existing usage-based recommendations, and generate new usage-based recommendations.

Use the usage-based endpoints to retrieve and create reflections based on the refreshed recommendations this endpoint creates.

Method and URL
POST /v0/projects/{project-id}/reflection/recommendations/usage-based/refresh/

Parameters

project-id Path   String (UUID)

Unique identifier for the project where the queries were run.

Example: 0b00aed4-f2c0-4daa-8f30-f63d0aacefea

Example Request
curl -X POST 'https://api.dremio.cloud/v0/api/projects/0b00aed4-f2c0-4daa-8f30-f63d0aacefea/reflection/recommendations/usage-based/refresh/' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json'

This endpoint returns an empty response body with a 202 Accepted response status code. Dremio updates the recommendations asynchronously, so it may take several minutes before you can retrieve the updated recommendations.

Response Status Codes

202   Accepted

400   Bad Request

401   Unauthorized

500   Internal Server Error

Deleting Usage-Based Recommendations

Delete all collected usage data and all current reflection recommendations.

note

We recommend deleting recommendations only when troubleshooting.

Method and URL
DELETE /v0/projects/{project-id}/reflection/recommendations/usage-based/

Parameters

project-id Path   String (UUID)

Unique identifier for the project where the queries were run.

Example: 0b00aed4-f2c0-4daa-8f30-f63d0aacefea

Example Request
curl -X DELETE 'https://api.dremio.cloud/v0/api/projects/0b00aed4-f2c0-4daa-8f30-f63d0aacefea/reflection/recommendations/usage-based/' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json'

This endpoint returns an empty response body with a 202 Accepted response status code. Dremio deletes the recommendations asynchronously, so it may take several minutes for the deletion to complete.

Response Status Codes

202   Accepted

400   Bad Request

401   Unauthorized

500   Internal Server Error