Reflection
Use the Reflection API to retrieve a list of raw and aggregation reflections, retrieve individual reflections, and create, update, and delete reflections.
A reflection is an optimized materialization of source data or a query, similar to a materialized view, that is derived from an existing table or view. The query optimizer can accelerate queries by using one or more reflections to partially or entirely satisfy the queries rather than running queries against the raw data in the data source that underlies the table or view.
Reflection Object (Raw Reflection)
{
"id": "7a380a24-3b63-436c-9ea0-63cb534cc404",
"type": "RAW",
"name": "Raw Reflection",
"tag": "085e7704-544d-4c94-a666-2f298f5f8d7b",
"createdAt": "2023-01-30T14:11:43.826Z",
"updatedAt": "2023-01-30T14:11:43.826Z",
"datasetId": "tk973df7-ddf7-4d1e-fa9e-bccf28ae253f",
"currentSizeBytes": 4393709246,
"totalSizeBytes": 4393709246,
"enabled": true,
"arrowCachingEnabled": false,
"status": {
"config": "OK",
"refresh": "SCHEDULED",
"availability": "AVAILABLE",
"combinedStatus": "CAN_ACCELERATE",
"failureCount": 0,
"lastDataFetch": "2023-01-30T14:11:51.801Z",
"expiresAt": "2023-01-30T17:11:51.801Z"
},
"displayFields": [
{
"name": "pickup_datetime"
},
{
"name": "passenger_count"
},
{
"name": "trip_distance_mi"
},
{
"name": "fare_amount"
},
{
"name": "tip_amount"
},
{
"name": "total_amount"
}
],
"distributionFields": [
{
"name": "trip_distance_mi"
}
],
"partitionFields": [
{
"name": "passenger_count"
}
],
"sortFields": [
{
"name": "pickup_datetime"
}
],
"partitionDistributionStrategy": "CONSOLIDATED",
"canView": true,
"canAlter": true,
"entityType": "reflection"
}
Reflection Object (Aggregation Reflection)
{
"id": "95dda9dd-2371-467f-b68d-fc4c5ea57a8b",
"type": "AGGREGATION",
"name": "Aggregation Reflection",
"tag": "05c6a6b5-1174-4f29-b1a3-b846ead498ce",
"createdAt": "2022-07-05T19:19:40.244Z",
"updatedAt": "2023-01-10T17:12:40.244Z",
"datasetId": "df99ab32-c2d4-4d1c-9e91-2c8be861bb8a",
"currentSizeBytes": 18639885,
"totalSizeBytes": 142639924,
"enabled": true,
"arrowCachingEnabled": false,
"status": {
"config": "OK",
"refresh": "SCHEDULED",
"availability": "AVAILABLE",
"combinedStatus": "CAN_ACCELERATE",
"failureCount": 0,
"lastDataFetch": "2023-01-10T17:12:40.244Z",
"expiresAt": "3022-07-05T19:19:40.244Z"
},
"dimensionFields": [
{
"name": "pickup_date",
"granularity": "DATE"
},
{
"name": "pickup_datetime",
"granularity": "DATE"
},
{
"name": "dropoff_date",
"granularity": "DATE"
},
{
"name": "dropoff_datetime",
"granularity": "DATE"
},
{
"name": "passenger_count",
"granularity": "DATE"
},
{
"name": "total_amount",
"granularity": "DATE"
},
{
"name": "trip_distance_mi",
"granularity": "DATE"
}
],
"measureFields": [
{
"name": "passenger_count",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "trip_distance_mi",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "fare_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "surcharge",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "tip_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "total_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
}
],
"distributionFields": [
{
"name": "trip_distance_mi"
},
{
"name": "total_amount"
}
],
"partitionFields": [
{
"name": "dropoff_date"
},
{
"name": "passenger_count"
}
],
"sortFields": [
{
"name": "trip_distance_mi"
}
],
"partitionDistributionStrategy": "CONSOLIDATED",
"canView": true,
"canAlter": true,
"entityType": "reflection"
}
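The two example objects differ mainly in which field arrays they carry: the raw reflection lists displayFields, while the aggregation reflection lists dimensionFields and measureFields. As a rough illustration only, the Python sketch below collects the column names that a reflection object references. The helper name referenced_columns is hypothetical and not part of the Dremio API, and the input is assumed to be a reflection object that has already been parsed from JSON.

```python
# Hypothetical helper, not part of the Dremio API.
# The field-array names are taken from the example objects above.
FIELD_ARRAYS = (
    "displayFields",       # appears in the raw reflection example
    "dimensionFields",     # appears in the aggregation reflection example
    "measureFields",       # appears in the aggregation reflection example
    "distributionFields",
    "partitionFields",
    "sortFields",
)


def referenced_columns(reflection):
    """Map each field array present in the object to its list of column names."""
    columns = {}
    for key in FIELD_ARRAYS:
        if key in reflection:
            columns[key] = [field["name"] for field in reflection[key]]
    return columns


# Trimmed copy of the raw reflection example object.
raw_reflection = {
    "type": "RAW",
    "displayFields": [{"name": "pickup_datetime"}, {"name": "passenger_count"}],
    "partitionFields": [{"name": "passenger_count"}],
    "sortFields": [{"name": "pickup_datetime"}],
}

print(referenced_columns(raw_reflection))
```

Arrays that are absent from the object are simply left out of the result, so the same helper works for both reflection types.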
Reflection Attributes
id
String (UUID)
Unique identifier for the reflection.
Example 95dda9dd-2371-467f-b68d-fc4c5ea57a8b
type
String
Reflection type. For details, read Types of Reflections.
Enum RAW, AGGREGATION
Example AGGREGATION
name
String
User-provided name for the reflection. For reflections created in the Dremio UI, if the user did not provide a name, the default values are Raw Reflection and Aggregation Reflection (automatically assigned based on the reflection type).
Example Aggregation Reflection
tag
String
Unique identifier for the reflection instance. Dremio changes the tag whenever the reflection changes and uses the tag value to ensure that PUT requests apply to the most recent version of the reflection.
Example 05c6a6b5-1174-4f29-b1a3-b846ead498ce
createdAt
String
Date and time that the reflection was created. In UTC format.
Example 2022-07-05T19:19:40.244Z
updatedAt
String
Date and time that the reflection was last updated. In UTC format.
Example 2023-01-10T17:12:40.244Z
datasetId
String (UUID)
Unique identifier for the anchor dataset that is associated with the reflection.
Example df99ab32-c2d4-4d1c-9e91-2c8be861bb8a
currentSizeBytes
Integer
Data size of the latest reflection job (if one exists). In bytes.
Example 18639885
totalSizeBytes
Integer
Data size of all reflection jobs that have not been pruned (if any exist). In bytes.
Example 142639924
enabled
Boolean
If the reflection is available for accelerating queries, true. Otherwise, false.
Example true
arrowCachingEnabled
Boolean
If Dremio converts data from the reflection's Parquet files to Apache Arrow format when copying that data to executor nodes, true. Otherwise, false.
Example false
status
Object
Information about the status of the reflection.
Example { "config": "OK", "refresh": "SCHEDULED", "availability": "AVAILABLE", "combinedStatus": "CAN_ACCELERATE", "failureCount": 0, "lastDataFetch": "2023-01-10T17:12:40.244Z", "expiresAt": "3022-07-05T19:19:40.244Z" }
displayFields
[Object]
Information about the fields displayed from the anchor dataset. Each displayFields object contains one attribute: name. Valid only for raw reflections.
Example [ { "name": "pickup_datetime" }, { "name": "passenger_count" }, { "name": "trip_distance_mi" }, { "name": "fare_amount" }, { "name": "tip_amount" }, { "name": "total_amount" } ]
dimensionFields
[Object]
Information about the dimension fields from the anchor dataset used in the reflection. Dimension fields are the fields you expect to group by when analyzing data. Each dimensionFields object contains two attributes: name and granularity. Valid only for aggregation reflections.
Example [ { "name": "pickup_date", "granularity": "DATE" }, { "name": "pickup_datetime", "granularity": "DATE" }, { "name": "dropoff_date", "granularity": "DATE" }, { "name": "dropoff_datetime", "granularity": "DATE" }, { "name": "passenger_count", "granularity": "DATE" }, { "name": "total_amount", "granularity": "DATE" } ]
measureFields
[Object]
Information about the measure fields from the anchor dataset used in the reflection. Measure fields are the fields you expect to use for calculations when analyzing the data. Each measureFields object contains two attributes: name and measureTypeList. Valid only for aggregation reflections.
Example [ { "name": "passenger_count", "measureTypeList": [ "SUM", "COUNT" ] }, { "name": "trip_distance_mi", "measureTypeList": [ "SUM", "COUNT" ] }, { "name": "fare_amount", "measureTypeList": [ "SUM", "COUNT" ] }, { "name": "surcharge", "measureTypeList": [ "SUM", "COUNT" ] }, { "name": "tip_amount", "measureTypeList": [ "SUM", "COUNT" ] }, { "name": "total_amount", "measureTypeList": [ "SUM", "COUNT" ] } ]
distributionFields
[Object]
Information about the distribution fields from the anchor dataset used in the reflection. Distribution fields allow data from multiple datasets to be co-located and co-partitioned across nodes to minimize data movement during join operations. Each distributionFields object contains one attribute: name.
Example [ { "name": "trip_distance_mi" }, { "name": "total_amount" } ]
partitionFields
[Object]
Information about the fields from the anchor dataset used to partition data in the reflection. Each field name is listed as an individual object. For details, read Horizontally Partition Reflections that Have Many Rows.
Example [ { "name": "dropoff_date" }, { "name": "passenger_count" } ]
sortFields
[Object]
Information about the fields from the anchor dataset used for sorting in the reflection. Each sortFields object contains one attribute: name. For details, read Sort Reflections on High-Cardinality Fields.
Example [ { "name": "trip_distance_mi" } ]
partitionDistributionStrategy
String
Method used to optimize data compression when executing reflections. CONSOLIDATED means Dremio minimizes the number of files produced. The query threads pool the data and ensure that the fewest number of files are written to the reflection store. Optimizing for a smaller number of files generally improves read performance because users can perform fewer searches for a given query. STRIPED means Dremio minimizes the time required to refresh the reflection. Each final-stage query thread opens its own writers to write the data, which can result in many small files if each query thread contains a small amount of data.
Enum CONSOLIDATED, STRIPED
Example CONSOLIDATED
canView
Boolean
If you can view reflections on all datasets of a project, source, or folder, true. Otherwise, false.
Example true
canAlter
Boolean
If you can create, edit, and view reflections on all datasets of a source, system, or folder, true. Otherwise, false.
Example true
entityType
String
Type of entity. For reflection objects, the entityType is reflection.
Example reflection
status
config
String
Status of the reflection configuration. If OK, the reflection configuration is free of errors. If INVALID, the reflection configuration contains one or more errors.
Enum OK, INVALID
Example OK
refresh
String
Status of the reflection refresh.
GIVEN_UP: Dremio attempted to refresh the reflection multiple times, but each attempt has failed and Dremio will not make further attempts.
MANUAL: Refresh period is set to 0, so you must use the Dremio UI to manually refresh the reflection.
RUNNING: Dremio is currently refreshing the reflection.
SCHEDULED: The reflection refreshes according to a schedule.
Enum GIVEN_UP, MANUAL, RUNNING, SCHEDULED
Example SCHEDULED
availability
String
Status of the reflection's availability for accelerating queries.
Enum NONE, INCOMPLETE, EXPIRED, AVAILABLE
Example AVAILABLE
combinedStatus
String
Status of the reflection based on a combination of config, refresh, and availability statuses.
CAN_ACCELERATE: The reflection is fully functional.
CAN_ACCELERATE_WITH_FAILURES: The most recent refresh failed to obtain a status, but Dremio still has a valid materialization.
CANNOT_ACCELERATE_MANUAL: The reflection is unable to accelerate any queries, and the Never Refresh option is selected for the refresh policy.
CANNOT_ACCELERATE_SCHEDULED: The reflection is currently unable to accelerate any queries, but it has been scheduled for a refresh at a future time.
DISABLED: The reflection has been manually disabled.
EXPIRED: The reflection has expired and cannot be used.
FAILED: The attempt to refresh the reflection has failed, typically three times in a row. The reflection is still usable.
INCOMPLETE: One or more pseudo-distributed file system (PDFS) nodes that contain materialized files are down. Only partial data is available. Configurations that use the Hadoop Distributed File System (HDFS) to store reflections should not experience incomplete status.
INVALID: The reflection is invalid because the underlying dataset has changed.
REFRESHING: The reflection is currently being refreshed.
Example CAN_ACCELERATE
failureCount
Integer
Number of times that an attempt to refresh the reflection failed.
Example 0
lastDataFetch
String
Date and time that the reflection data was last refreshed. In UTC format. If the reflection is running, failing, or disabled, the lastDataFetch value is 1969-12-31T23:59:59.999Z.
Example 2023-01-10T17:12:40.244Z
expiresAt
String
Date and time that the reflection will expire. In UTC format. If the reflection is running, failing, or disabled, the expiresAt value is 1969-12-31T23:59:59.999Z.
Example 3022-07-05T19:19:40.244Z
displayFields
name
String
Name of the field from the anchor dataset that is displayed in the raw reflection.
Example passenger_count
dimensionFields
name
String
Name of the field from the anchor dataset that is configured as a dimension for the reflection.
Example pickup_date
granularity
String
Grouping used for the dimension field. When timestamp and date fields are configured as dimensions, Dremio can automatically extract and use the day-level date value (DATE) or use the field's original value (NORMAL).
Enum DATE, NORMAL
Example DATE
measureFields
name
String
Name of the field from the anchor dataset that is configured as a measure for the reflection.
Example passenger_count
measureTypeList
[String]
Types of calculations for which Dremio uses the specified measure field.
Enum APPROX_COUNT_DISTINCT, MIN, MAX, UNKNOWN, SUM, COUNT
Example [ "SUM", "COUNT" ]
distributionFields
name
String
Name of the field from the anchor dataset that is used for co-locating and co-partitioning data from multiple datasets across nodes.
Example trip_distance_mi
partitionFields
name
String
Name of the field from the anchor dataset on which you can partition the rows in the reflection.
Example trip_distance_mi
sortFields
name
String
Name of the field from the anchor dataset that is used for sorting in the reflection.
Example dropoff_date
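The lastDataFetch and expiresAt descriptions above mention a placeholder value, 1969-12-31T23:59:59.999Z, that stands in when no real timestamp is available. The Python sketch below shows one way a client might parse these UTC strings and treat the placeholder as "not set". The function names parse_timestamp and validity_window are hypothetical, and the timestamp layout is assumed to always match the examples in this document (millisecond precision with a trailing Z).

```python
from datetime import datetime, timezone

# Placeholder value described above for reflections that are running,
# failing, or disabled.
UNSET_TIMESTAMP = "1969-12-31T23:59:59.999Z"


def parse_timestamp(value):
    """Return a timezone-aware UTC datetime, or None for the placeholder."""
    if value == UNSET_TIMESTAMP:
        return None
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def validity_window(status):
    """Time between lastDataFetch and expiresAt, or None if either is unset."""
    fetched = parse_timestamp(status["lastDataFetch"])
    expires = parse_timestamp(status["expiresAt"])
    if fetched is None or expires is None:
        return None
    return expires - fetched


# Status values copied from the raw reflection example object.
raw_status = {
    "combinedStatus": "CAN_ACCELERATE",
    "lastDataFetch": "2023-01-30T14:11:51.801Z",
    "expiresAt": "2023-01-30T17:11:51.801Z",
}

print(validity_window(raw_status))  # 3:00:00
```

Returning None for the placeholder keeps callers from mistaking it for a real date in 1969.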
Create a Reflection
Create a new reflection.
Method and URL
POST /v0/projects/{project-id}/reflection
Parameters
project-id
path
String (UUID)
Unique identifier for the project where you want to create the reflection.
Example 1df71752-69b7-47d9-9e6c-990e6b194aa4
type
body
String
Reflection type. For details, read Types of Reflections.
Enum RAW, AGGREGATION
Example AGGREGATION
name
body
String
Name to use for the reflection.
Example New Aggregation Reflection
datasetId
body
String (UUID)
Unique identifier for the anchor dataset to associate with the reflection.
Example 81e2ad31-a119-447d-a831-085831e505be
enabled
body
Boolean
If the reflection should be available for accelerating queries, true. Otherwise, false.
Example true
arrowCachingEnabled
body
Boolean
Optional
If Dremio should convert data from the reflection's Parquet files to Apache Arrow format when copying that data to executor nodes, true. Otherwise, false (default).
Example false
displayFields
body
[Object]
Optional
Information about the fields to display from the anchor dataset. The displayFields array must list every field in the anchor dataset or the reflection will fail. Each displayFields object contains one attribute: name. Valid only for raw reflections.
Example [ { "name": "pickup_datetime" }, { "name": "passenger_count" }, { "name": "trip_distance_mi" }, { "name": "fare_amount" }, { "name": "tip_amount" }, { "name": "total_amount" } ]
dimensionFields
body
[Object]
Optional
Information about the dimension fields from the anchor dataset to use in the reflection. Dimension fields are the fields you expect to group by when analyzing data. Each dimensionFields object contains two attributes: name and granularity. Valid only for aggregation reflections.
Example [ { "name": "pickup_datetime", "granularity": "DATE" }, { "name": "passenger_count", "granularity": "DATE" }, { "name": "total_amount", "granularity": "DATE" }, { "name": "trip_distance_mi", "granularity": "DATE" } ]
measureFields
body
[Object]
Optional
Information about the measure fields from the anchor dataset to use in the reflection. Measure fields are the fields you expect to use for calculations when analyzing the data. Each measureFields object contains two attributes: name and measureTypeList. Valid only for aggregation reflections.
Example [ { "name": "passenger_count", "measureTypeList": [ "SUM", "COUNT" ] }, { "name": "trip_distance_mi", "measureTypeList": [ "SUM", "COUNT" ] }, { "name": "fare_amount", "measureTypeList": [ "SUM", "COUNT" ] }, { "name": "tip_amount", "measureTypeList": [ "SUM", "COUNT" ] }, { "name": "total_amount", "measureTypeList": [ "SUM", "COUNT" ] } ]
distributionFields
body
[Object]
Optional
Information about the distribution fields from the anchor dataset to use for co-locating and co-partitioning data from multiple datasets across nodes. Each distributionFields object contains one attribute: name.
Example [ { "name": "trip_distance_mi" }, { "name": "total_amount" } ]
partitionFields
body
[Object]
Optional
Information about the fields from the anchor dataset to use to partition data in the reflection. Each field name is listed as an individual object. For details, read Horizontally Partition Reflections that Have Many Rows.
Example [ { "name": "pickup_datetime" }, { "name": "passenger_count" } ]
sortFields
body
[Object]
Optional
Information about the fields from the anchor dataset to use for sorting in the reflection. Each sortFields object contains one attribute: name. For details, read Sort Reflections on High-Cardinality Fields.
Example [ { "name": "trip_distance_mi" } ]
partitionDistributionStrategy
body
String
Optional
Method to use to optimize data compression when executing reflections. If CONSOLIDATED (default), Dremio will minimize the number of files produced. If STRIPED, Dremio will minimize the time required to refresh the reflection.
Enum CONSOLIDATED, STRIPED
Example CONSOLIDATED
canView
body
Boolean
Optional
To allow users to view reflections on all datasets of a project, source, or folder, true (default). Otherwise, false.
Example true
canAlter
body
Boolean
Optional
To allow users to create, edit, and view reflections on all datasets of a source, system, or folder, true (default). Otherwise, false.
Example true
entityType
body
String
Type of entity. For reflection objects, the entityType is reflection.
displayFields
name
body
String
Name of the field to display from the anchor dataset.
Example pickup_datetime
dimensionFields
name
body
String
Name of the field from the anchor dataset to configure as a dimension for the reflection.
Example pickup_datetime
granularity
body
String
Grouping to use for the dimension field. If Dremio should automatically extract the day-level date value and use it as the grouping value in the reflection, DATE. If Dremio should use the original value for grouping, NORMAL.
Enum DATE, NORMAL
Example DATE
measureFields
name
body
String
Name of the field from the anchor dataset that you expect to use in calculations. Fields of types LIST, MAP, and UNION are not valid measureFields.
Example passenger_count
measureTypeList
body
[String]
Types of calculations for which Dremio should use the specified measure field. The calculations must be valid for the specified field (for example, SUM is not valid for a timestamp field like pickup_datetime).
Enum APPROX_COUNT_DISTINCT, MIN, MAX, UNKNOWN, SUM, COUNT
Example [ "SUM", "COUNT" ]
distributionFields
name
body
String
Optional
Name of the field from the anchor dataset to use for co-locating and co-partitioning data from multiple datasets across nodes. In aggregation reflections, every field listed as a distribution field must also be listed as a dimension field.
Example trip_distance_mi
partitionFields
name
body
String
Optional
Name of the field from the anchor dataset on which you want to be able to partition rows. Every field listed as a partition field must also be listed as a dimension field. If you list a field as a partition field, you cannot list the same field as a sort field in the same reflection.
Example pickup_datetime
sortFields
name
body
String
Optional
Name of the field from the anchor dataset to use for sorting in the reflection. Every field listed as a sort field must also be listed as a dimension field. If you list a field as a sort field, you cannot list the same field as a partition field in the same reflection.
Example trip_distance_mi
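Several of the parameters above constrain one another: distribution, partition, and sort fields must also be dimension fields, a field cannot be both a partition field and a sort field, and some arrays are valid for only one reflection type. The Python sketch below is one possible client-side check of a request body against those rules before sending it. The function name check_reflection_body is hypothetical, and applying the dimension-field rules only to aggregation reflections is an assumption of this sketch; the API remains the authority on what it accepts.

```python
# Hypothetical pre-flight check, not part of the Dremio API.
MEASURE_TYPES = {"APPROX_COUNT_DISTINCT", "MIN", "MAX", "UNKNOWN", "SUM", "COUNT"}
GRANULARITIES = {"DATE", "NORMAL"}


def names(body, key):
    """Column names listed under one of the field arrays (empty if absent)."""
    return [field["name"] for field in body.get(key, [])]


def check_reflection_body(body):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    kind = body.get("type")
    if kind == "RAW":
        for key in ("dimensionFields", "measureFields"):
            if key in body:
                problems.append(f"{key} is valid only for aggregation reflections")
    elif kind == "AGGREGATION":
        if "displayFields" in body:
            problems.append("displayFields is valid only for raw reflections")
        # Assumption: the "must also be a dimension field" rules are applied
        # to aggregation reflections only, since raw ones have no dimensions.
        dimensions = set(names(body, "dimensionFields"))
        for key in ("distributionFields", "partitionFields", "sortFields"):
            for name in names(body, key):
                if name not in dimensions:
                    problems.append(f"{key}: {name} is not a dimension field")
    else:
        problems.append("type must be RAW or AGGREGATION")

    # A field cannot be both a partition field and a sort field.
    overlap = set(names(body, "partitionFields")) & set(names(body, "sortFields"))
    for name in sorted(overlap):
        problems.append(f"{name} is both a partition field and a sort field")

    for dimension in body.get("dimensionFields", []):
        if dimension.get("granularity") not in GRANULARITIES:
            problems.append(f"{dimension['name']}: granularity must be DATE or NORMAL")
    for measure in body.get("measureFields", []):
        for measure_type in measure.get("measureTypeList", []):
            if measure_type not in MEASURE_TYPES:
                problems.append(f"{measure['name']}: unknown measure type {measure_type}")
    return problems


# Trimmed body modeled on the parameter examples above.
example_body = {
    "type": "AGGREGATION",
    "name": "New Aggregation Reflection",
    "dimensionFields": [
        {"name": "pickup_datetime", "granularity": "DATE"},
        {"name": "trip_distance_mi", "granularity": "DATE"},
    ],
    "measureFields": [{"name": "fare_amount", "measureTypeList": ["SUM", "COUNT"]}],
    "partitionFields": [{"name": "pickup_datetime"}],
    "sortFields": [{"name": "trip_distance_mi"}],
}

print(check_reflection_body(example_body))  # []
```

The check does not verify that a measure type suits the field's data type (for example, SUM on a timestamp), because that requires the dataset schema.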
Example Request
curl -X POST 'https://api.dremio.cloud/v0/projects/1df71752-69b7-47d9-9e6c-990e6b194aa4/reflection' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json' \
--data-raw '{
"type": "AGGREGATION",
"name": "New Aggregation Reflection",
"datasetId": "81e2ad31-a119-447d-a831-085831e505be",
"enabled": true,
"arrowCachingEnabled": false,
"dimensionFields": [
{
"name": "pickup_datetime",
"granularity": "DATE"
},
{
"name": "passenger_count",
"granularity": "DATE"
},
{
"name": "total_amount",
"granularity": "DATE"
},
{
"name": "trip_distance_mi",
"granularity": "DATE"
}
],
"measureFields": [
{
"name": "passenger_count",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "trip_distance_mi",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "fare_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "tip_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "total_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
}
],
"distributionFields": [
{
"name": "trip_distance_mi"
},
{
"name": "total_amount"
}
],
"partitionFields": [
{
"name": "pickup_datetime"
},
{
"name": "passenger_count"
}
],
"sortFields": [
{
"name": "trip_distance_mi"
}
],
"entityType": "reflection"
}'
Example Response
{
"id": "26526a2e-3edb-4af0-a24f-8e9e2dcfa82a",
"type": "AGGREGATION",
"name": "New Aggregation Reflection",
"tag": "ff9f4477-e44d-4995-a76a-70edbe572c56",
"createdAt": "2023-01-30T14:30:24.311Z",
"updatedAt": "2023-01-30T14:30:24.311Z",
"datasetId": "81e2ad31-a119-447d-a831-085831e505be",
"currentSizeBytes": 0,
"totalSizeBytes": 0,
"enabled": true,
"arrowCachingEnabled": false,
"status": {
"config": "OK",
"refresh": "SCHEDULED",
"availability": "NONE",
"combinedStatus": "CANNOT_ACCELERATE_SCHEDULED",
"failureCount": 0,
"lastDataFetch": "1969-12-31T23:59:59.999Z",
"expiresAt": "1969-12-31T23:59:59.999Z"
},
"dimensionFields": [
{
"name": "pickup_datetime",
"granularity": "DATE"
},
{
"name": "passenger_count",
"granularity": "DATE"
},
{
"name": "total_amount",
"granularity": "DATE"
},
{
"name": "trip_distance_mi",
"granularity": "DATE"
}
],
"measureFields": [
{
"name": "passenger_count",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "trip_distance_mi",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "fare_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "tip_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "total_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
}
],
"distributionFields": [
{
"name": "trip_distance_mi"
},
{
"name": "total_amount"
}
],
"partitionFields": [
{
"name": "pickup_datetime"
},
{
"name": "passenger_count"
}
],
"sortFields": [
{
"name": "trip_distance_mi"
}
],
"partitionDistributionStrategy": "CONSOLIDATED",
"canView": true,
"canAlter": true,
"entityType": "reflection"
}
Response Status Codes
200
OK
401
Unauthorized
404
Not Found
405
Method Not Allowed
500
Internal Server Error
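For readers working in Python rather than curl, the sketch below assembles the same kind of create request with the standard library. It only builds the request object and does not send it. The function name build_create_request is hypothetical, the host is the one used in the example request, the path follows the Method and URL line for this endpoint, and the token is a placeholder.

```python
import json
import urllib.request

# Host taken from the example request in this section.
BASE_URL = "https://api.dremio.cloud"


def build_create_request(project_id, token, body):
    """Build (but do not send) a POST request that creates a reflection."""
    url = f"{BASE_URL}/v0/projects/{project_id}/reflection"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_request(
    "1df71752-69b7-47d9-9e6c-990e6b194aa4",
    "<PersonalAccessToken>",  # placeholder, as in the curl example
    {
        "type": "RAW",
        "name": "New Raw Reflection",  # illustrative name
        "datasetId": "81e2ad31-a119-447d-a831-085831e505be",
        "enabled": True,
        "entityType": "reflection",
    },
)

print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request). Non-2xx responses such as the
# 401 and 404 codes listed above are raised as urllib.error.HTTPError.
```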
Retrieve All Reflections
Retrieve a list of reflection objects that includes all raw and aggregation reflections in the Dremio instance.
Method and URL
GET /v0/projects/{project-id}/reflection
Parameters
project-id
path
String (UUID)
Unique identifier for the project that contains the reflections that you want to retrieve.
Example 1df71752-69b7-47d9-9e6c-990e6b194aa4
Example Request
curl -X GET 'https://api.dremio.cloud/v0/projects/1df71752-69b7-47d9-9e6c-990e6b194aa4/reflection' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json'
In the response for a request to retrieve all raw and aggregation reflections, the reflection objects are wrapped with a data array. Each object in the data array represents one reflection.
Example Response
{
"data": [
{
"id": "95dda9dd-2371-467f-b68d-fc4c5ea57a8b",
"type": "AGGREGATION",
"name": "Aggregation Reflection",
"tag": "05c6a6b5-1174-4f29-b1a3-b846ead498ce",
"createdAt": "2022-07-05T19:19:40.244Z",
"updatedAt": "2023-01-10T17:12:40.244Z",
"datasetId": "df99ab32-c2d4-4d1c-9e91-2c8be861bb8a",
"currentSizeBytes": 18639885,
"totalSizeBytes": 142639924,
"enabled": true,
"arrowCachingEnabled": false,
"status": {
"config": "OK",
"refresh": "SCHEDULED",
"availability": "AVAILABLE",
"combinedStatus": "CAN_ACCELERATE",
"failureCount": 0,
"lastDataFetch": "2023-01-10T17:12:40.244Z",
"expiresAt": "3022-07-05T19:19:40.244Z"
},
"dimensionFields": [
{
"name": "pickup_date",
"granularity": "DATE"
},
{
"name": "pickup_datetime",
"granularity": "DATE"
},
{
"name": "dropoff_date",
"granularity": "DATE"
},
{
"name": "dropoff_datetime",
"granularity": "DATE"
},
{
"name": "passenger_count",
"granularity": "DATE"
},
{
"name": "total_amount",
"granularity": "DATE"
},
{
"name": "trip_distance_mi",
"granularity": "DATE"
}
],
"measureFields": [
{
"name": "passenger_count",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "trip_distance_mi",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "fare_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "surcharge",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "tip_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "total_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
}
],
"distributionFields": [
{
"name": "trip_distance_mi"
},
{
"name": "total_amount"
}
],
"partitionFields": [
{
"name": "dropoff_date"
},
{
"name": "passenger_count"
}
],
"sortFields": [
{
"name": "trip_distance_mi"
}
],
"partitionDistributionStrategy": "CONSOLIDATED",
"canView": true,
"canAlter": true,
"entityType": "reflection"
},
{
"id": "14f22052-cbb3-4d5d-8bbc-6154cca98e49",
"type": "RAW",
"name": "listings",
"tag": "7707981c-c2d4-4d1c-9e91-2c8be861bb8a",
"createdAt": "2022-07-12T16:45:35.249Z",
"updatedAt": "2022-07-12T16:45:35.249Z",
"datasetId": "df99ab32-cb33-42bc-a048-d27a8915f468",
"currentSizeBytes": 0,
"totalSizeBytes": 0,
"enabled": true,
"arrowCachingEnabled": true,
"status": {
"config": "OK",
"refresh": "MANUAL",
"availability": "NONE",
"combinedStatus": "CANNOT_ACCELERATE_MANUAL",
"failureCount": 0,
"lastDataFetch": "1969-12-31T23:59:59.999Z",
"expiresAt": "1969-12-31T23:59:59.999Z"
},
"displayFields": [
{
"name": "id"
}
],
"partitionDistributionStrategy": "CONSOLIDATED",
"canView": true,
"canAlter": true,
"entityType": "reflection"
},
{
"id": "6c209200-b522-4f81-bbe0-d10668c7752c",
"type": "AGGREGATION",
"name": "Aggregation Reflection",
"tag": "bbd9a011-3062-489d-b1ca-d775947b4bbe",
"createdAt": "2021-09-29T15:47:44.806Z",
"updatedAt": "2021-09-29T15:47:44.806Z",
"datasetId": "746f867a-c27c-4711-bb8c-99546a4c25e0",
"currentSizeBytes": 0,
"totalSizeBytes": 1675978,
"enabled": true,
"arrowCachingEnabled": false,
"status": {
"config": "OK",
"refresh": "GIVEN_UP",
"availability": "NONE",
"combinedStatus": "FAILED",
"failureCount": 3,
"lastDataFetch": "1969-12-31T23:59:59.999Z",
"expiresAt": "1969-12-31T23:59:59.999Z"
},
"dimensionFields": [
{
"name": "passenger_count",
"granularity": "DATE"
},
{
"name": "pickup_datetime",
"granularity": "DATE"
}
],
"measureFields": [
{
"name": "trip_distance_mi",
"measureTypeList": [
"COUNT",
"SUM"
]
},
{
"name": "total_amount",
"measureTypeList": [
"COUNT",
"SUM"
]
},
{
"name": "tip_amount",
"measureTypeList": [
"COUNT",
"SUM"
]
},
{
"name": "fare_amount",
"measureTypeList": [
"COUNT",
"SUM"
]
}
],
"partitionDistributionStrategy": "CONSOLIDATED",
"canView": true,
"canAlter": true,
"entityType": "reflection"
},
{
"id": "c5c5b282-ffea-4a34-835f-cc591584412b",
"type": "AGGREGATION",
"name": "Test reflection",
"tag": "e6dd9a96-a069-4623-a672-75f9d9355bfd",
"createdAt": "2021-10-11T18:44:27.064Z",
"updatedAt": "2021-10-11T18:44:27.064Z",
"datasetId": "316531b8-3c56-42f2-b05f-81f228ef3162",
"currentSizeBytes": 0,
"totalSizeBytes": 0,
"enabled": true,
"arrowCachingEnabled": false,
"status": {
"config": "OK",
"refresh": "MANUAL",
"availability": "NONE",
"combinedStatus": "CANNOT_ACCELERATE_MANUAL",
"failureCount": 0,
"lastDataFetch": "1969-12-31T23:59:59.999Z",
"expiresAt": "1969-12-31T23:59:59.999Z"
},
"dimensionFields": [
{
"name": "passenger_count",
"granularity": "DATE"
}
],
"measureFields": [
{
"name": "trip_distance_mi",
"measureTypeList": [
"COUNT",
"SUM"
]
},
{
"name": "total_amount",
"measureTypeList": [
"COUNT",
"SUM"
]
},
{
"name": "tip_amount",
"measureTypeList": [
"COUNT",
"SUM"
]
},
{
"name": "fare_amount",
"measureTypeList": [
"COUNT",
"SUM"
]
}
],
"partitionDistributionStrategy": "CONSOLIDATED",
"canView": true,
"canAlter": true,
"entityType": "reflection"
}
],
"canAlterReflections": true
}
Response Status Codes
200
OK
401
Unauthorized
404
Not Found
500
Internal Server Error
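Because the reflection objects arrive wrapped in a data array, a client typically iterates over that array. As a small illustration, the Python sketch below groups reflection names by their combinedStatus value. The function name names_by_combined_status is hypothetical, and the sample input is a trimmed copy of the example response that keeps only the attributes the function reads.

```python
from collections import defaultdict


def names_by_combined_status(response):
    """Group reflection names by status.combinedStatus."""
    grouped = defaultdict(list)
    for reflection in response["data"]:
        grouped[reflection["status"]["combinedStatus"]].append(reflection["name"])
    return dict(grouped)


# Trimmed from the example response above.
sample_response = {
    "data": [
        {"name": "Aggregation Reflection", "status": {"combinedStatus": "CAN_ACCELERATE"}},
        {"name": "listings", "status": {"combinedStatus": "CANNOT_ACCELERATE_MANUAL"}},
        {"name": "Aggregation Reflection", "status": {"combinedStatus": "FAILED"}},
        {"name": "Test reflection", "status": {"combinedStatus": "CANNOT_ACCELERATE_MANUAL"}},
    ],
    "canAlterReflections": True,
}

for status, reflection_names in names_by_combined_status(sample_response).items():
    print(status, reflection_names)
```

Names are not unique (two reflections above are both called Aggregation Reflection), so a real client would more likely key on id.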
Retrieve a Reflection
Retrieve the specified reflection.
Method and URL
GET /v0/projects/{project-id}/reflection/{id}
Parameters
project-id
path
String (UUID)
Unique identifier for the project that contains the reflection that you want to retrieve.
Example 1df71752-69b7-47d9-9e6c-990e6b194aa4
id
path
String (UUID)
Unique identifier for the reflection that you want to retrieve.
Example 95dda9dd-2371-467f-b68d-fc4c5ea57a8b
Example Request
curl -X GET 'https://api.dremio.cloud/v0/projects/1df71752-69b7-47d9-9e6c-990e6b194aa4/reflection/95dda9dd-2371-467f-b68d-fc4c5ea57a8b' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json'
Example Response
{
"id": "95dda9dd-2371-467f-b68d-fc4c5ea57a8b",
"type": "AGGREGATION",
"name": "Aggregation Reflection",
"tag": "05c6a6b5-1174-4f29-b1a3-b846ead498ce",
"createdAt": "2022-07-05T19:19:40.244Z",
"updatedAt": "2023-01-10T17:12:40.244Z",
"datasetId": "df99ab32-c2d4-4d1c-9e91-2c8be861bb8a",
"currentSizeBytes": 18639885,
"totalSizeBytes": 142639924,
"enabled": true,
"arrowCachingEnabled": false,
"status": {
"config": "OK",
"refresh": "SCHEDULED",
"availability": "AVAILABLE",
"combinedStatus": "CAN_ACCELERATE",
"failureCount": 0,
"lastDataFetch": "2023-01-10T17:12:40.244Z",
"expiresAt": "3022-07-05T19:19:40.244Z"
},
"dimensionFields": [
{
"name": "pickup_date",
"granularity": "DATE"
},
{
"name": "pickup_datetime",
"granularity": "DATE"
},
{
"name": "dropoff_date",
"granularity": "DATE"
},
{
"name": "dropoff_datetime",
"granularity": "DATE"
},
{
"name": "passenger_count",
"granularity": "DATE"
},
{
"name": "total_amount",
"granularity": "DATE"
},
{
"name": "trip_distance_mi",
"granularity": "DATE"
}
],
"measureFields": [
{
"name": "passenger_count",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "trip_distance_mi",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "fare_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "surcharge",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "tip_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "total_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
}
],
"distributionFields": [
{
"name": "trip_distance_mi"
},
{
"name": "total_amount"
}
],
"partitionFields": [
{
"name": "dropoff_date"
},
{
"name": "passenger_count"
}
],
"sortFields": [
{
"name": "trip_distance_mi"
}
],
"partitionDistributionStrategy": "CONSOLIDATED",
"canView": true,
"canAlter": true,
"entityType": "reflection"
}
Response Status Codes
200
OK
401
Unauthorized
404
Not Found
500
Internal Server Error
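Both path parameters for this endpoint are documented as String (UUID). A client may want to check them before building the URL. The Python sketch below does that with the standard uuid module. The function name reflection_url is hypothetical, the host is the one used in the example request, and the path follows the Method and URL line for this endpoint.

```python
import uuid

# Host taken from the example request in this section.
BASE_URL = "https://api.dremio.cloud"


def reflection_url(project_id, reflection_id):
    """Build the URL for one reflection, rejecting values that are not UUIDs."""
    # uuid.UUID raises ValueError for strings that are not valid UUIDs.
    project = uuid.UUID(project_id)
    reflection = uuid.UUID(reflection_id)
    return f"{BASE_URL}/v0/projects/{project}/reflection/{reflection}"


print(
    reflection_url(
        "1df71752-69b7-47d9-9e6c-990e6b194aa4",
        "95dda9dd-2371-467f-b68d-fc4c5ea57a8b",
    )
)
```

Validating locally is only a convenience: an id that is well-formed but unknown to the project would still be expected to produce one of the error codes listed above.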
Retrieve All Reflections for a Dataset
Retrieve all raw and aggregation reflections for the specified dataset.
Method and URL
GET /v0/projects/{project-id}/dataset/{dataset-id}/reflection
Parameters
project-id
path
String (UUID)
Unique identifier for the project that contains the dataset whose reflections you want to retrieve.
Example 1df71752-69b7-47d9-9e6c-990e6b194aa4
dataset-id
path
String (UUID)
Unique identifier for the dataset whose reflections you want to retrieve.
Example df99ab32-c2d4-4d1c-9e91-2c8be861bb8a
Example Request
curl -X GET 'https://api.dremio.cloud/v0/projects/1df71752-69b7-47d9-9e6c-990e6b194aa4/dataset/3cbab7b3-ee82-44c1-abcc-e86d56078d4d/reflection' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json'
In the response for a request to retrieve all raw and aggregation reflections for a dataset, the reflection objects are wrapped with a data array. Each object in the data array represents one reflection.
Example Response
{
"data": [
{
"id": "23f75eb1-045f-447f-b3fa-374377877569",
"type": "RAW",
"name": "Raw Reflection",
"tag": "41deaef2-58a1-40b3-9ff4-6418b881d881",
"createdAt": "2023-02-03T16:38:27.770Z",
"updatedAt": "2023-02-03T16:38:27.770Z",
"datasetId": "3cbab7b3-ee82-44c1-abcc-e86d56078d4d",
"currentSizeBytes": 43286,
"totalSizeBytes": 87522,
"enabled": true,
"arrowCachingEnabled": false,
"status": {
"config": "OK",
"refresh": "MANUAL",
"availability": "AVAILABLE",
"combinedStatus": "CAN_ACCELERATE",
"failureCount": 0,
"lastDataFetch": "2023-02-03T16:38:27.780Z",
"expiresAt": "3022-06-06T16:38:27.780Z"
},
"displayFields": [
{
"name": "pickup_datetime"
},
{
"name": "passenger_count"
},
{
"name": "trip_distance_mi"
},
{
"name": "fare_amount"
},
{
"name": "tip_amount"
},
{
"name": "total_amount"
}
],
"partitionDistributionStrategy": "CONSOLIDATED",
"canView": true,
"canAlter": true,
"entityType": "reflection"
},
{
"id": "4f7c303d-fa6e-4f80-8c6a-cd8243095de9",
"type": "AGGREGATION",
"name": "Aggregation Reflection",
"tag": "bc5fa8f9-4af5-47de-b234-66f2b401c90e",
"createdAt": "2023-02-03T16:39:40.556Z",
"updatedAt": "2023-02-03T16:39:40.556Z",
"datasetId": "3cbab7b3-ee82-44c1-abcc-e86d56078d4d",
"currentSizeBytes": 31579,
"totalSizeBytes": 124983,
"enabled": true,
"arrowCachingEnabled": false,
"status": {
"config": "OK",
"refresh": "MANUAL",
"availability": "AVAILABLE",
"combinedStatus": "CAN_ACCELERATE",
"failureCount": 0,
"lastDataFetch": "2023-02-03T16:39:40.568Z",
"expiresAt": "3022-06-06T16:39:40.568Z"
},
"dimensionFields": [
{
"name": "passenger_count",
"granularity": "DATE"
},
{
"name": "pickup_datetime",
"granularity": "DATE"
}
],
"measureFields": [
{
"name": "trip_distance_mi",
"measureTypeList": [
"COUNT",
"SUM"
]
},
{
"name": "total_amount",
"measureTypeList": [
"COUNT",
"SUM"
]
},
{
"name": "tip_amount",
"measureTypeList": [
"COUNT",
"SUM"
]
},
{
"name": "fare_amount",
"measureTypeList": [
"COUNT",
"SUM"
]
}
],
"partitionDistributionStrategy": "CONSOLIDATED",
"canView": true,
"canAlter": true,
"entityType": "reflection"
}
],
"canAlterReflections": true
}
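As a rough illustration of the wrapped response shape described above, the following Python sketch walks the data array and groups reflection names by type. The function name summarize_reflections and the trimmed sample dictionary are illustrative assumptions, not part of the API.

```python
from collections import defaultdict


def summarize_reflections(response: dict) -> dict:
    """Group reflection names by type from a 'retrieve all reflections' response."""
    by_type = defaultdict(list)
    # Each element of the "data" array represents one reflection object.
    for reflection in response.get("data", []):
        by_type[reflection["type"]].append(reflection["name"])
    return dict(by_type)


# Trimmed-down stand-in for the example response (most attributes omitted).
sample = {
    "data": [
        {
            "id": "23f75eb1-045f-447f-b3fa-374377877569",
            "type": "RAW",
            "name": "Raw Reflection",
        },
        {
            "id": "4f7c303d-fa6e-4f80-8c6a-cd8243095de9",
            "type": "AGGREGATION",
            "name": "Aggregation Reflection",
        },
    ],
    "canAlterReflections": True,
}

print(summarize_reflections(sample))
```

Reading the list through response.get("data", []) keeps the helper from failing if the wrapper is missing, which is a defensive choice in this sketch rather than documented API behavior.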
Response Status Codes
200
OK
401
Unauthorized
404
Not Found
405
Method Not Allowed
500
Internal Server Error
Update a Reflection
Update the specified reflection.
Method and URL
PUT /v0/projects/{project-id}/reflection/{id}
Parameters
project-id
path
String (UUID)
Unique identifier for the project that contains the reflection that you want to update.
Example 1df71752-69b7-47d9-9e6c-990e6b194aa4
id
path
String (UUID)
Unique identifier for the reflection that you want to update.
Example 836eae91-306e-487b-a687-31c999653a86
type
body
String
Reflection type. For details, read Types of Reflections.
Enum RAW, AGGREGATION
Example AGGREGATION
name
body
String
Name to use for the reflection.
Example New Aggregation Reflection
tag
body
String
Unique identifier for the most recent version of the reflection. Dremio uses the tag value to ensure that you are updating the most recent version of the reflection.
Example ff9f4477-e44d-4995-a76a-70edbe572c56
datasetId
body
String (UUID)
Unique identifier for the anchor dataset associated with the reflection.
Example 81e2ad31-a119-447d-a831-085831e505be
enabled
body
Boolean
If the reflection should be available for accelerating queries, true. Otherwise, false.
Example false
arrowCachingEnabled
body
Boolean
Optional
If Dremio should convert data from the reflection's Parquet files to Apache Arrow format when copying that data to executor nodes, true. Otherwise, false (default).
Example true
dimensionFields
body
[Object]
Information about the dimension fields from the anchor dataset to use in the reflection. Dimension fields are the fields you expect to group by when analyzing data. Each dimensionFields object contains two attributes: name and granularity. Valid only for aggregation reflections.
If you omit the dimensionFields object in a PUT request, Dremio will remove all existing dimension fields from the reflection. To keep existing dimension fields while making other updates, duplicate the existing dimensionFields array in the PUT request.
Example [ { "name": "pickup_datetime", "granularity": "DATE" }, { "name": "passenger_count", "granularity": "DATE" }, { "name": "total_amount", "granularity": "DATE" }, { "name": "trip_distance_mi", "granularity": "DATE" } ]
displayFields
body
[Object]
Information about the fields to display from the anchor dataset. The displayFields array must list every field in the anchor dataset or the reflection will fail. Each displayFields object contains one attribute: name. Valid only for raw reflections.
Example [ { "name": "pickup_datetime" }, { "name": "passenger_count" }, { "name": "trip_distance_mi" }, { "name": "fare_amount" }, { "name": "tip_amount" }, { "name": "total_amount" } ]
measureFields
body
[Object]
Information about the measure fields from the anchor dataset to use in the reflection. Measure fields are the fields you expect to use for calculations when analyzing the data. Each measureFields object contains two attributes: name and measureTypeList. Valid only for aggregation reflections.
If you omit the measureFields object in a PUT request, Dremio will remove all existing measure fields from the reflection. To keep existing measure fields while making other updates, duplicate the existing measureFields array in the PUT request.
Example [ { "name": "passenger_count", "measureTypeList": [ "SUM", "COUNT" ] }, { "name": "trip_distance_mi", "measureTypeList": [ "SUM", "COUNT" ] }, { "name": "fare_amount", "measureTypeList": [ "SUM", "COUNT" ] }, { "name": "tip_amount", "measureTypeList": [ "SUM", "COUNT" ] }, { "name": "total_amount", "measureTypeList": [ "SUM", "COUNT" ] } ]
distributionFields
body
[Object]
Information about the distribution fields from the anchor dataset to use for co-locating and co-partitioning data from multiple datasets across nodes. Each distributionFields object contains one attribute: name.
If you omit the distributionFields object in a PUT request, Dremio will remove all existing distribution fields from the reflection. To keep existing distribution fields while making other updates, duplicate the existing distributionFields array in the PUT request.
Example [ { "name": "trip_distance_mi" }, { "name": "total_amount" } ]
partitionFields
body
[Object]
Information about the fields from the anchor dataset to use to partition data in the reflection. Each field name is listed as an individual object. For details, read Horizontally Partition Reflections that Have Many Rows.
If you omit the partitionFields object in a PUT request, Dremio will remove all existing partition fields from the reflection. To keep existing partition fields while making other updates, duplicate the existing partitionFields array in the PUT request.
Example [ { "name": "pickup_datetime" }, { "name": "passenger_count" } ]
sortFields
body
[Object]
Information about the fields from the anchor dataset to use for sorting in the reflection. Each sortFields object contains one attribute: name. For details, read Sort Reflections on High-Cardinality Fields.
If you omit the sortFields object in a PUT request, Dremio will remove all existing sort fields from the reflection. To keep existing sort fields while making other updates, duplicate the existing sortFields array in the PUT request.
Example [ { "name": "trip_distance_mi" } ]
partitionDistributionStrategy
body
String
Optional
Method to use to optimize data compression when executing reflections. If CONSOLIDATED (default), Dremio will minimize the number of files produced. If STRIPED, Dremio will minimize the time required to refresh the reflection.
Enum CONSOLIDATED, STRIPED
Example CONSOLIDATED
dimensionFields
name
body
String
Name of the field from the anchor dataset to configure as a dimension for the reflection.
Example pickup_datetime
granularity
body
String
Grouping to use for the dimension field. If Dremio should automatically extract the day-level date value and use it as the grouping value in the reflection, DATE. If Dremio should use the original value for grouping, NORMAL.
Enum DATE, NORMAL
Example DATE
displayFields
name
body
String
Name of the field to display from the anchor dataset.
Example pickup_datetime
measureFields
name
body
String
Name of the field from the anchor dataset that you expect to use in calculations. Fields of types LIST, MAP, and UNION are not valid measureFields.
Example passenger_count
measureTypeList
body
[String]
Types of calculations for which Dremio should use the specified measure field. The calculations must be valid for the specified field (for example, SUM is not valid for a timestamp field like pickup_datetime).
Enum APPROX_COUNT_DISTINCT, MIN, MAX, UNKNOWN, SUM, COUNT
Example [ "COUNT", "SUM" ]
distributionFields
name
body
String
Name of the field from the anchor dataset to use for co-locating and co-partitioning data from multiple datasets across nodes. Every field listed as a distribution field must also be listed as a dimension field.
Example trip_distance_mi
partitionFields
name
body
String
Name of the field from the anchor dataset on which you want to be able to partition rows. Every field listed as a partition field must also be listed as a dimension field. If you list a field as a partition field, you cannot list the same field as a sort field in the same reflection.
Example pickup_datetime
sortFields
name
body
String
Name of the field from the anchor dataset to use for sorting in the reflection. Every field listed as a sort field must also be listed as a dimension field. If you list a field as a sort field, you cannot list the same field as a partition field in the same reflection.
Example trip_distance_mi
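The field rules stated in the parameter descriptions above can be checked on the client before sending a request. The following is a minimal Python sketch, assuming an aggregation reflection body. The helper name check_aggregation_body and the wording of its messages are hypothetical, and the checks only mirror the rules as written in this section (distribution, partition, and sort fields must also be dimension fields; a field cannot be both a partition field and a sort field; measure types must come from the documented enum).

```python
# Measure types listed in the measureTypeList enum above.
VALID_MEASURE_TYPES = {"APPROX_COUNT_DISTINCT", "MIN", "MAX", "UNKNOWN", "SUM", "COUNT"}


def _names(body: dict, key: str) -> list:
    """Return the field names listed under one of the *Fields arrays."""
    return [field["name"] for field in body.get(key, [])]


def check_aggregation_body(body: dict) -> list:
    """Return a list of human-readable problems; an empty list means no rule was violated."""
    problems = []
    dimensions = set(_names(body, "dimensionFields"))

    # Distribution, partition, and sort fields must also be dimension fields.
    for key in ("distributionFields", "partitionFields", "sortFields"):
        for name in _names(body, key):
            if name not in dimensions:
                problems.append(f"{key}: {name} is not a dimension field")

    # A field cannot be both a partition field and a sort field.
    overlap = set(_names(body, "partitionFields")) & set(_names(body, "sortFields"))
    for name in sorted(overlap):
        problems.append(f"{name} is both a partition field and a sort field")

    # Measure types must come from the documented enum.
    for measure in body.get("measureFields", []):
        for measure_type in measure.get("measureTypeList", []):
            if measure_type not in VALID_MEASURE_TYPES:
                problems.append(
                    f"measureFields: {measure['name']} has unknown type {measure_type}"
                )
    return problems


# Hypothetical draft body that breaks two of the rules.
draft = {
    "dimensionFields": [
        {"name": "pickup_datetime", "granularity": "DATE"},
        {"name": "trip_distance_mi", "granularity": "NORMAL"},
    ],
    "measureFields": [{"name": "fare_amount", "measureTypeList": ["SUM", "COUNT"]}],
    "partitionFields": [{"name": "pickup_datetime"}],
    "sortFields": [{"name": "pickup_datetime"}, {"name": "fare_amount"}],
}

problems = check_aggregation_body(draft)
for problem in problems:
    print(problem)
```

The sketch does not check whether a measure type is valid for a field's data type (for example, SUM on a timestamp), because that requires the anchor dataset's schema.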
Example Request
curl -X PUT 'https://api.dremio.cloud/v0/projects/1df71752-69b7-47d9-9e6c-990e6b194aa4/reflection/26526a2e-3edb-4af0-a24f-8e9e2dcfa82a' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json' \
--data-raw '{
"id": "836eae91-306e-487b-a687-31c999653a86",
"type": "AGGREGATION",
"name": "New Aggregation Reflection",
"tag": "ff9f4477-e44d-4995-a76a-70edbe572c56",
"datasetId": "81e2ad31-a119-447d-a831-085831e505be",
"enabled": false,
"arrowCachingEnabled": true,
"dimensionFields": [
{
"name": "pickup_datetime",
"granularity": "DATE"
},
{
"name": "passenger_count",
"granularity": "DATE"
},
{
"name": "total_amount",
"granularity": "DATE"
},
{
"name": "trip_distance_mi",
"granularity": "DATE"
}
],
"measureFields": [
{
"name": "passenger_count",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "trip_distance_mi",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "fare_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "tip_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "total_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
}
],
"distributionFields": [
{
"name": "trip_distance_mi"
},
{
"name": "total_amount"
}
],
"partitionFields": [
{
"name": "pickup_datetime"
},
{
"name": "passenger_count"
}
],
"sortFields": [
{
"name": "trip_distance_mi"
}
],
"entityType": "reflection"
}'
Example Response
{
"id": "26526a2e-3edb-4af0-a24f-8e9e2dcfa82a",
"type": "AGGREGATION",
"name": "New Aggregation Reflection",
"tag": "59dcb49e-ed20-49b3-9e54-b9d02b4411d4",
"createdAt": "2023-01-30T14:35:19.192Z",
"updatedAt": "2023-01-30T14:35:19.192Z",
"datasetId": "81e2ad31-a119-447d-a831-085831e505be",
"currentSizeBytes": 8808,
"totalSizeBytes": 8808,
"enabled": false,
"arrowCachingEnabled": true,
"status": {
"config": "OK",
"refresh": "SCHEDULED",
"availability": "AVAILABLE",
"combinedStatus": "CAN_ACCELERATE",
"failureCount": 0,
"lastDataFetch": "2023-02-03T18:40:37.655Z",
"expiresAt": "3023-01-30T14:35:01.180Z"
},
"dimensionFields": [
{
"name": "pickup_datetime",
"granularity": "DATE"
},
{
"name": "passenger_count",
"granularity": "DATE"
},
{
"name": "total_amount",
"granularity": "DATE"
},
{
"name": "trip_distance_mi",
"granularity": "DATE"
}
],
"measureFields": [
{
"name": "passenger_count",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "trip_distance_mi",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "fare_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "tip_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
},
{
"name": "total_amount",
"measureTypeList": [
"SUM",
"COUNT"
]
}
],
"distributionFields": [
{
"name": "trip_distance_mi"
},
{
"name": "total_amount"
}
],
"partitionFields": [
{
"name": "pickup_datetime"
},
{
"name": "passenger_count"
}
],
"sortFields": [
{
"name": "trip_distance_mi"
}
],
"partitionDistributionStrategy": "CONSOLIDATED",
"canView": true,
"canAlter": true,
"entityType": "reflection"
}
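Because omitting dimensionFields, measureFields, distributionFields, partitionFields, or sortFields from a PUT request removes the existing values, one way to make a small change safely is to start from the reflection object as last retrieved and override only what should change. The following Python sketch illustrates that approach. The helper name build_update_body and the WRITABLE_KEYS tuple are assumptions made for this example; the tuple lists only the body parameters documented for this endpoint.

```python
import copy

# Body parameters documented for the update request.
WRITABLE_KEYS = (
    "type", "name", "tag", "datasetId", "enabled", "arrowCachingEnabled",
    "dimensionFields", "displayFields", "measureFields",
    "distributionFields", "partitionFields", "sortFields",
    "partitionDistributionStrategy",
)


def build_update_body(current: dict, **changes) -> dict:
    """Copy the writable attributes of an existing reflection, then apply changes.

    Carrying over the existing arrays keeps them from being removed, and carrying
    over the current tag lets Dremio confirm the update targets the latest version.
    """
    body = {key: copy.deepcopy(current[key]) for key in WRITABLE_KEYS if key in current}
    body.update(changes)
    return body


# Abbreviated stand-in for a previously retrieved reflection object.
current = {
    "id": "26526a2e-3edb-4af0-a24f-8e9e2dcfa82a",
    "type": "AGGREGATION",
    "name": "New Aggregation Reflection",
    "tag": "59dcb49e-ed20-49b3-9e54-b9d02b4411d4",
    "datasetId": "81e2ad31-a119-447d-a831-085831e505be",
    "enabled": False,
    "status": {"config": "OK"},
    "dimensionFields": [{"name": "trip_distance_mi", "granularity": "DATE"}],
    "sortFields": [{"name": "trip_distance_mi"}],
}

# Enable the reflection without touching its field configuration.
body = build_update_body(current, enabled=True)
print(sorted(body))
```

Read-only attributes such as status, createdAt, and currentSizeBytes are left out of the body in this sketch; whether the service ignores or rejects them when present is not stated here.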
Response Status Codes
200
OK
401
Unauthorized
404
Not Found
409
Conflict
500
Internal Server Error
Delete a Reflection
Delete the specified reflection.
Method and URL
DELETE /v0/projects/{project-id}/reflection/{id}
Parameters
project-id
path
String (UUID)
Unique identifier for the project that contains the reflection that you want to delete.
Example 1df71752-69b7-47d9-9e6c-990e6b194aa4
id
path
String (UUID)
Unique identifier for the reflection that you want to delete.
Example 95dda9dd-2371-467f-b68d-fc4c5ea57a8b
Example Request
curl -X DELETE 'https://api.dremio.cloud/v0/projects/1df71752-69b7-47d9-9e6c-990e6b194aa4/reflection/95dda9dd-2371-467f-b68d-fc4c5ea57a8b' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json'
No response
Response Status Codes
200
OK
401
Unauthorized
404
Not Found
405
Method Not Allowed
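For callers who prefer Python to curl, the delete request can be assembled with the standard library. The sketch below only builds the request object and does not send it. The function name build_delete_request and the BASE_URL constant are illustrative assumptions, and the path follows the Method and URL line for this endpoint.

```python
import urllib.request

# Host used in the example requests in this section.
BASE_URL = "https://api.dremio.cloud"


def build_delete_request(project_id: str, reflection_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a DELETE request for one reflection."""
    url = f"{BASE_URL}/v0/projects/{project_id}/reflection/{reflection_id}"
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_delete_request(
    "1df71752-69b7-47d9-9e6c-990e6b194aa4",
    "95dda9dd-2371-467f-b68d-fc4c5ea57a8b",
    "<PersonalAccessToken>",  # placeholder, replace with a real token
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it; a non-2xx status such as 401 or 404 is raised by urlopen as urllib.error.HTTPError, so a real caller would wrap that call in a try/except block.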