Recommendations
Use the Recommendations API to get job-based recommendations and get and create usage-based reflections that can accelerate your queries.
Getting job-based recommendations requires making the following two API requests:
- Submit the job IDs of jobs that have run SQL queries. These are job IDs of the queries for which you want to retrieve recommendations in further requests. This request returns the job ID to use in the second request.
- Retrieve job-based recommendations for reflections that can accelerate your queries. Use the job ID that was returned in your first request to make the request for recommendations.
Creating reflections from usage-based recommendations requires making the following two API requests:
- Retrieve usage-based recommendations for reflections. This request returns the parameters to use in the body of the second request.
- Create reflections from usage-based recommendations that can accelerate your queries from the usage-based recommendations. Use the recommendation ID and reflection request body that were returned in your first request to create the reflections.
Dremio supports creating only raw reflections from usage-based recommendations.
{
"data": [
{
"viewRequestBody": {
"entityType": "dataset",
"type": "VIRTUAL_DATASET",
"path": [
"azure_3",
"table_2"
],
"sql": "--Default Raw Reflection"
},
"viewRequestEndpoint": "{hostname}/api/v3/catalog",
"reflectionRequestBody": {
"type": "RAW",
"name": "raw_47f54460-543f-430f-a9e5-ca71d246265e",
"datasetId": "{\"tableKey\":[\"azure_3\",\"table_2\"],\"contentId\":\"356d6214-6c55-4011-8378-34b2f65d38c6\",\"versionContext\":{\"type\":\"BRANCH\",\"value\":\"main\"}}",
"enabled": true,
"arrowCachingEnabled": false,
"dimensionFields": [],
"measureFields": [],
"displayFields": [
{
"name": "passenger_count"
},
{
"name": "EXPR$1"
}
],
"entityType": "reflection"
},
"reflectionRequestEndpoint": "POST {hostname}/api/v3/reflection",
"jobIds": [
"13ffb629-9f0e-4265-97df-99bf0d425813"
],
"jobCount": 1,
"recommendationId": "9be8a451-4190-4618-a72e-9932f790c744",
"reflectionScore": 50.67,
"avgImprovementFactor": 10.43,
"avgImprovementMs": 7196
}
],
"canAlterReflections": true
}
{
"data": [
{
"viewRequestBody": {
"entityType": "dataset",
"type": "VIRTUAL_DATASET",
"path": [
"recommended_view",
"view_1"
],
"sql": "SELECT * FROM Samples.samples.dremio.com.\"NYC-taxi-trips\" WHERE trip_distance_mi <= 2.0 ORDER BY trip_distance_mi ASC"
},
"viewRequestEndpoint": "POST {hostname}/api/v3/catalog",
"reflectionRequestBody": {
"type": "AGGREGATION",
"name": "agg_0e0c4ab9-def7-48da-81f1-ca8c1da11ed4",
"datasetId": "{\"tableKey\":[\"recommended_view\",\"view_1\"],\"contentId\":\"7ae1eded-5133-4e80-bf06-1be20975e0aa\",\"versionContext\":{\"type\":\"BRANCH\",\"value\":\"main\"}}",
"enabled": true,
"arrowCachingEnabled": false,
"dimensionFields": [
{
"name": "passenger_count",
"granularity": "DATE"
}
],
"measureFields": [
{
"name": "fare_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
}
],
"displayFields": [],
"entityType": "reflection"
},
"reflectionRequestEndpoint": "POST {hostname}/api/v3/reflection",
"jobIds": [
"1ded81f8-4d06-4d09-8163-9e2517027d8d"
],
"jobCount": 1,
"recommendationId": "1855d2dd-4106-4359-a97a-e08a916096e6",
"reflectionScore": 60.12,
"avgImprovementFactor": 8.39,
"avgImprovementMs": 5400
}
],
"canAlterReflections": true
}
Recommendation Attributes
data Array of Object
List of recommended reflection objects for the submitted job IDs.
canAlterReflections Boolean
If the columns in the recommended reflection can be edited, added, and removed, true
. Otherwise, false
.
Example: true
Attributes of the data
Array
viewRequestBody Object
The fields to include in a request to the Catalog API to create the view on which to define the recommended reflection.
viewRequestEndpoint String
The endpoint to use when submitting a request to the Catalog API to create the view on which to define the recommended reflection. NOTE: The API returns the Software endpoint. The correct endpoint for Cloud requests is POST /v0/projects/{project-id}/catalog
.
reflectionRequestBody Object
The fields to include in a request to the Reflection API to create the recommended reflection.
reflectionRequestEndpoint String
The endpoint to use when submitting the request to the Reflection API to create the recommended reflection. NOTE: The API returns the Software endpoint. The correct endpoint for Cloud requests is POST /v0/projects/{project-id}/reflection
.
jobIds Array of String
The job IDs of the queries for which the reflection recommendations are given.
Example: ["13ffb629-9f0e-4265-97df-99bf0d425813"]
jobCount Array of String
The number of jobs for which reflection recommendations are given.
Example: 1
recommendationId Array of String
The ID of the recommended reflection.
Example: ["9be8a451-4190-4618-a72e-9932f790c744"]
reflectionScore Double
Score for the recommended reflection's quality, on a scale of 0 (worst) to 100 (best). The reflectionScore value considers the recommended reflection's anticipated quality compared to existing reflections and other recommended reflections, as well as the likely improvement in query run times if the recommended reflection is implemented.
Example: 50.67
avgImprovementFactor Double
The likely average multiplicative rate of improvement for each query if you implement the recommended reflection. For example, if the avgImprovementFactor value is 2.34, implementing the recommended reflection is likely to speed up each query by 2.34 times, on average.
Example: 10.43
avgImprovementMs Double
The likely average improvement, in milliseconds, for each query if you implement the recommended reflection. For example, if the avgImprovementMs value is 5400, implementing the recommended reflection is likely to save an average of 5400 milliseconds for each query that uses the reflection.
Example: 7196
Attributes of the viewRequestBody
Object
entityType String
Type of catalog entity. For views, the entityType is dataset
.
type String
Type of dataset. For views, the type is VIRTUAL_DATASET
.
path Array of String
Path to the location where the view should be created within Dremio, expressed in an array. The path lists each level of hierarchy in order, from outer to inner: Arctic source or catalog first, then folder and subfolders, then a name for the view itself as the last item in the array. Views can only be created in Arctic sources and the project's Arctic catalog.
Example: ["azure_3","table_2"]
sql String
For aggregation reflections, the SQL query to use to create the view. For default raw reflections, the sql value --Default Raw Reflection
; creating a view is unnecessary because raw recommendations are given only for existing views.
Attributes of the reflectionRequestBody
Object
type String
Reflection type. For details, read Types of Reflections.
Enum: RAW, AGGREGATION
Example: AGGREGATION
name String
User-provided name for the reflection. For reflections created in the Dremio console, if the user did not provide a name, the default values are Raw Reflection
and Aggregation Reflection
(automatically assigned based on the reflection type).
Example: raw_47f54460-543f-430f-a9e5-ca71d246265e
datasetId String
Unique identifier for the anchor dataset to associate with the reflection. The datasetId is a JSON string.
Example: {"tableKey":["azure_3","table_2"],"contentId":"356d6214-6c55-4011-8378-34b2f65d38c6","versionContext":{"type":"BRANCH","value":"main"}}
enabled Boolean
If the reflection is available for accelerating queries, true
. Otherwise, false
.
Example: true
arrowCachingEnabled Boolean
If Dremio converts data from the reflection's Parquet files to Apache Arrow format when copying that data to executor nodes, true
. Otherwise, false
.
Example: false
dimensionFields Array of Object
Information about the dimension fields from the anchor dataset used in the reflection. Dimension fields are the fields you expect to group by when analyzing data. Valid only for aggregation reflections. For raw reflections or if the anchor dataset does not include any dimension fields, the dimensionFields value is an empty array. For aggregation reflections, if the anchor dataset includes dimension fields, each object in the dimensionFields array contains two attributes: name and granularity.
Example: [{"name": "pickup_date","granularity": "DATE"},{"name": "pickup_datetime","granularity": "DATE"},{"name": "dropoff_date","granularity": "DATE"},{"name": "dropoff_datetime","granularity": "DATE"},{"name": "passenger_count","granularity": "DATE"},{"name": "total_amount","granularity": "DATE"}]
measureFields Array of Object
Information about the measure fields from the anchor dataset used in the reflection. Measure fields are the fields you expect to use for calculations when analyzing the data. Valid only for aggregation reflections. For raw reflections or if the anchor dataset does not include any measure fields, the measureFields value is an empty array. For aggregation reflections, if the anchor dataset includes measure fields, each object in the measureFields array contains two attributes: name and measureTypeList.
Example: [{"name": "passenger_count","measureTypeList": ["SUM,"COUNT"]},{"name": "trip_distance_mi","measureTypeList": ["SUM","COUNT"]},{"name": "fare_amount","measureTypeList": ["SUM","COUNT"]},{"name": "surcharge","measureTypeList": ["SUM","COUNT"]},{"name": "tip_amount","measureTypeList": ["SUM","COUNT"]},{"name": "total_amount","measureTypeList": ["SUM","COUNT"]}]
displayFields Array of Object
Information about the fields displayed from the anchor dataset. Valid only for raw reflections. For aggregation reflections or if the anchor dataset does not include any display fields, the value is an empty array. For raw reflections, if the anchor dataset includes display fields, each object in the displayFields array contains one attribute: name.
Example: [{"name": "passenger_count"},{"name": "EXPR$1"}]
entityType String
Type of entity. For reflection objects, the entityType is reflection
.
Submitting Job IDs
Submit the job IDs of queries for which you want to request reflection recommendations.
The response includes objects that contain an id attribute and value for each job ID you submit. Use these id values to request recommendations for reflections to accelerate the queries.
Method and URLPOST /v0/projects/{project-id}/reflection/recommendations/job-based/
Parameters
project-id Path String (UUID)
Unique identifier for the project where the queries were run.
Example: 0b00aed4-f2c0-4daa-8f30-f63d0aacefea
jobIds Body Array of String
The job IDs of the queries for which you want to request reflection recommendations. To get the job IDs, use the SQL API or find them on the Jobs page in the Dremio console. Use a comma-separated list to submit multiple job IDs.
Example: ["a7efcd50-791a-48e8-bb05-391b4411e66b"]
Example Requestcurl -X POST 'https://api.dremio.cloud/v0/api/projects/0b00aed4-f2c0-4daa-8f30-f63d0aacefea/reflection/recommendations/job-based/' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json' \
--data-raw '{
"jobIds": ["a7efcd50-791a-48e8-bb05-391b4411e66b","c2485882-e6b7-4aa8-af5b-a825d2870589"]
}'
{
"id": "13ffb629-9f0e-4265-97df-99bf0d425813"
}
Response Status Codes
200 OK
400 Bad Request
401 Unauthorized
405 Method Not Allowed
500 Internal Server Error
Retrieving Job-Based Recommendations
Retrieve job-based recommended reflections to accelerate the queries whose job IDs you submitted.
- For default raw reflections, each recommendation comprises the path to the view on which to define the reflection and the parameters to use in a request to create the reflection.
- For aggregation reflections, each recommendation comprises the parameters to use in a request to create a view on which to define the recommended reflection and the parameters to use in a request to create the reflection.
After you retrieve the recommended reflections for your queries, use the Catalog API to create the recommended views. Then, use the Reflection API to create the desired reflections.
Before submitting Catalog API requests to create the recommended views for aggregation reflections, create a folder named recommended_view
. In your Catalog API requests, the path
parameter must include the full path to the recommended_view
folder.
GET /v0/projects/{project-id}/reflection/recommendations/job-based/{id}/results/
Parameters
project-id Path String (UUID)
Unique identifier for the project where the queries were run.
Example: 0b00aed4-f2c0-4daa-8f30-f63d0aacefea
id Path String
The id value returned in the response to your request to submit the job ID or IDs of the queries for which you want to retrieve recommended reflections.
Example: 13ffb629-9f0e-4265-97df-99bf0d425813
Example Requestcurl -X GET 'https://api.dremio.cloud/v0/api/projects/0b00aed4-f2c0-4daa-8f30-f63d0aacefea/reflection/recommendations/job-based/13ffb629-9f0e-4265-97df-99bf0d425813/results/' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json'
{
"data": [
{
"viewRequestBody": {
"entityType": "dataset",
"type": "VIRTUAL_DATASET",
"path": [
"azure_3",
"table_2"
],
"sql": "--Default Raw Reflection"
},
"viewRequestEndpoint": "{hostname}/api/v3/catalog",
"reflectionRequestBody": {
"type": "RAW",
"name": "raw_47f54460-543f-430f-a9e5-ca71d246265e",
"datasetId": "{\"tableKey\":[\"azure_3\",\"table_2\"],\"contentId\":\"356d6214-6c55-4011-8378-34b2f65d38c6\",\"versionContext\":{\"type\":\"BRANCH\",\"value\":\"main\"}}",
"enabled": true,
"arrowCachingEnabled": false,
"dimensionFields": [],
"measureFields": [],
"displayFields": [
{
"name": "passenger_count"
},
{
"name": "EXPR$1"
}
],
"entityType": "reflection"
},
"reflectionRequestEndpoint": "POST {hostname}/api/v3/reflection",
"jobIds": [
"13ffb629-9f0e-4265-97df-99bf0d425813"
],
"jobCount": 1,
"recommendationId": "9be8a451-4190-4618-a72e-9932f790c744",
"reflectionScore": 50.67,
"avgImprovementFactor": 10.43,
"avgImprovementMs": 7196
}
],
"canAlterReflections": true
}
Response Status Codes
200 OK
400 Bad Request
401 Unauthorized
405 Method Not Allowed
500 Internal Server Error
Retrieving Usage-Based Recommendations
Retrieve usage-based reflection recommendations. The response includes the reflectionRequestBody
and recommendationId
attributes to use as body parameters in your request to create usage-based reflections.
GET /v0/projects/{project-id}/reflection/recommendations/usage-based/
Parameters
project-id Path String (UUID)
Unique identifier for the project where the queries were run.
Example: 0b00aed4-f2c0-4daa-8f30-f63d0aacefea
Example Request
curl -X GET 'https://api.dremio.cloud/v0/api/projects/0b00aed4-f2c0-4daa-8f30-f63d0aacefea/reflection/recommendations/usage-based/' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json'
{
"data": [
{
"viewRequestBody": {
"entityType": "dataset",
"type": "VIRTUAL_DATASET",
"path": [
"prodFolder",
"cost_based"
],
"sql": "--Default Raw Reflection"
},
"viewRequestEndpoint": "POST {hostname}/api/v3/catalog",
"reflectionRequestBody": {
"type": "RAW",
"name": "AutoRef_cost_based_raw",
"datasetId": "{\"tableKey\":[\"prodFolder\",\"cost_based\"],\"contentId\":\"afe81b44-2518-4bfc-a659-e47cd40024e0\",\"versionContext\":{\"type\":\"BRANCH\",\"value\":\"main\"}}",
"enabled": true,
"arrowCachingEnabled": false,
"dimensionFields": [],
"measureFields": [],
"displayFields": [
{
"name": "passenger_count"
},
{
"name": "pickup_datetime"
},
{
"name": "EXPR$2"
}
],
"entityType": "reflection"
},
"reflectionRequestEndpoint": "POST {hostname}/api/v3/reflection",
"jobIds": [
"3ae13cd9-9534-4713-875e-70b9a3f5d956"
],
"jobCount": 1,
"recommendationId": "prodFolder.cost_based",
"reflectionScore": 18.031160082597836,
"avgImprovementFactor": 5.263157946323747,
"avgImprovementMs": 11821.950027180083
}
],
"canAlterReflections": true
}
Response Status Codes
200 OK
400 Bad Request
401 Unauthorized
500 Internal Server Error
Creating Reflections from Usage-Based Recommendations
Create reflections to accelerate queries using the usage-based recommendations that you retrieved.
You must retrieve usage-based recommendations to get the parameters you need for this request.
Dremio supports creating only raw reflections from usage-based recommendations.
POST /v0/projects/{project-id}/reflection/recommendations/usage-based/
Parameters
project-id Path String (UUID)
Unique identifier for the project where the queries were run.
Example: 0b00aed4-f2c0-4daa-8f30-f63d0aacefea
reflection Body Object
Information about the usage-based reflection to create. The reflection object includes the contents of the reflectionRequestBody included in the response for requests to retrieve usage-based recommendations.
NOTE: If desired, you may change the name of the reflection by changing the value for the reflection.name parameter in the body of your request. Dremio ignores any changes to the values of other parameters in the reflection object.
recommendationId Body String
Identifier for the usage-based recommendation you want to use to create reflections. The recommendationId is included in the response for requests to retrieve usage-based recommendations.
Example: prodFolder.cost_based
Parameters of the reflectionRequestBody
Object
type Body String
Reflection type. Value must be RAW
.
name Body String
User-provided name for the reflection.
NOTE: If desired, you may change the name of the reflection by changing the value for the name parameter in the body of your request.
Example: AutoRef_cost_based_raw
datasetId Body String
Unique identifier for the anchor dataset to associate with the reflection. The datasetId is a JSON string.
Example: {"tableKey":["prodFolder","cost_based"],"contentId":"afe81b44-2518-4bfc-a659-e47cd40024e0","versionContext":{"type":"BRANCH","value":"main"}}
enabled Body Boolean
If the reflection is available for accelerating queries, true
. Otherwise, false
.
Example: true
arrowCachingEnabled Body Boolean
If Dremio converts data from the reflection's Parquet files to Apache Arrow format when copying that data to executor nodes, true
. Otherwise, false
.
Example: false
dimensionFields Body Array of Object
Information about the dimension fields from the anchor dataset used in the reflection. For raw reflections, the dimensionFields value is an empty array.
Example: []
measureFields Body Array of Object
Information about the measure fields from the anchor dataset used in the reflection. For raw reflections, the measureFields value is an empty array.
Example: []
displayFields Body Array of Object
Information about the fields displayed from the anchor dataset. Valid only for raw reflections. If the anchor dataset includes display fields, each object in the displayFields array contains one attribute: name.
Example: [{"name": "passenger_count"},{"name": "EXPR$1"}]
entityType Body String
Type of entity. For reflection objects, the entityType is reflection
.
curl -X POST 'https://api.dremio.cloud/v0/api/projects/0b00aed4-f2c0-4daa-8f30-f63d0aacefea/reflection/recommendations/usage-based/' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json' \
--data-raw '{
"reflectionRequestBody": {
"type": "RAW",
"name": "AutoRef_cost_based_raw",
"datasetId": "{\"tableKey\":[\"prodFolder\",\"cost_based\"],\"contentId\":\"afe81b44-2518-4bfc-a659-e47cd40024e0\",\"versionContext\":{\"type\":\"BRANCH\",\"value\":\"main\"}}",
"enabled": true,
"arrowCachingEnabled": false,
"dimensionFields": [],
"measureFields": [],
"displayFields": [
{
"name": "passenger_count"
},
{
"name": "pickup_datetime"
},
{
"name": "EXPR$2"
}
],
"entityType": "reflection"
},
"recommendationId": "prodFolder.cost_based"
}
{
"id": "c929b8d2-82bf-4175-9476-010ba17c4f7f",
"type": "RAW",
"name": "AutoRef_cost_based_raw",
"tag": "bd68c096-af11-4bfb-b74a-d08bb9acfdda",
"createdAt": "2024-06-28T19:30:30.977Z",
"updatedAt": "2024-06-28T19:30:30.977Z",
"datasetId": "{\"tableKey\":[\"prodFolder\",\"cost_based\"],\"contentId\":\"afe81b44-2518-4bfc-a659-e47cd40024e0\",\"versionContext\":{\"type\":\"BRANCH\",\"value\":\"main\"}}",
"currentSizeBytes": 0,
"totalSizeBytes": 0,
"enabled": true,
"arrowCachingEnabled": false,
"status": {
"config": "OK",
"refresh": "SCHEDULED",
"availability": "NONE",
"combinedStatus": "CANNOT_ACCELERATE_SCHEDULED",
"failureCount": 0,
"lastDataFetch": "1969-12-31T23:59:59.999Z",
"expiresAt": "1969-12-31T23:59:59.999Z"
},
"displayFields": [
{
"name": "passenger_count"
},
{
"name": "pickup_datetime"
},
{
"name": "EXPR$2"
}
],
"partitionDistributionStrategy": "CONSOLIDATED",
"canView": true,
"canAlter": true,
"entityType": "reflection"
}
Response Status Codes
200 OK
400 Bad Request
401 Unauthorized
500 Internal Server Error
Refreshing Usage-Based Recommendations
Process collected data about view usage, clear existing usage-based recommendations, and generate new usage-based recommendations.
Use the usage-based endpoints to retrieve and create reflections based on the refreshed recommendations this endpoint creates.
Method and URLPOST /v0/projects/{project-id}/reflection/recommendations/usage-based/refresh/
Parameters
project-id Path String (UUID)
Unique identifier for the project where the queries were run.
Example: 0b00aed4-f2c0-4daa-8f30-f63d0aacefea
Example Requestcurl -X POST 'https://api.dremio.cloud/v0/api/projects/0b00aed4-f2c0-4daa-8f30-f63d0aacefea/reflection/recommendations/usage-based/refresh/' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json'
This endpoint returns an empty response body with a 202 Accepted
response status code. Dremio updates the recommendations asynchronously, so it may take several minutes before you can retrieve the updated recommendations.
Response Status Codes
202 Accepted
400 Bad Request
401 Unauthorized
500 Internal Server Error
Deleting Usage-Based Recommendations
Delete all collected usage data and all current reflection recommendations.
We recommend deleting recommendations only when troubleshooting.
DELETE /v0/projects/{project-id}/reflection/recommendations/usage-based/
Parameters
project-id Path String (UUID)
Unique identifier for the project where the queries were run.
Example: 0b00aed4-f2c0-4daa-8f30-f63d0aacefea
Example Requestcurl -X DELETE 'https://api.dremio.cloud/v0/api/projects/0b00aed4-f2c0-4daa-8f30-f63d0aacefea/reflection/recommendations/usage-based/' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json'
This endpoint returns an empty response body with a 202 Accepted
response status code. Dremio deletes the recommendations asynchronously, so it may take several minutes for the deletion to complete.
Response Status Codes
202 Accepted
400 Bad Request
401 Unauthorized
500 Internal Server Error