Version: 24.3.x

Dataset

NOTE: The Dataset API is supported in Dremio 24.3.8+.

Use the Dataset API to retrieve Dremio's reflection recommendations for your datasets.

Dataset Object (All Reflections)
{
  "data": [
    {
      "type": "RAW",
      "enabled": true,
      "arrowCachingEnabled": false,
      "displayFields": [
        {
          "name": "pickup_datetime"
        },
        {
          "name": "passenger_count"
        },
        {
          "name": "trip_distance_mi"
        },
        {
          "name": "fare_amount"
        },
        {
          "name": "tip_amount"
        },
        {
          "name": "total_amount"
        }
      ],
      "partitionFields": [
        {
          "name": "dropoff_date"
        },
        {
          "name": "passenger_count"
        }
      ],
      "entityType": "reflection"
    },
    {
      "type": "AGGREGATION",
      "enabled": true,
      "arrowCachingEnabled": false,
      "dimensionFields": [
        {
          "name": "passenger_count",
          "granularity": "DATE"
        }
      ],
      "measureFields": [
        {
          "name": "total_amount",
          "measureTypeList": [
            "COUNT",
            "SUM"
          ]
        },
        {
          "name": "trip_distance_mi",
          "measureTypeList": [
            "COUNT",
            "SUM"
          ]
        },
        {
          "name": "fare_amount",
          "measureTypeList": [
            "COUNT",
            "SUM"
          ]
        },
        {
          "name": "tip_amount",
          "measureTypeList": [
            "COUNT",
            "SUM"
          ]
        }
      ],
      "partitionFields": [
        {
          "name": "dropoff_date"
        },
        {
          "name": "passenger_count"
        }
      ],
      "entityType": "reflection"
    }
  ]
}

Dataset Attributes

data Array of Object

List of recommended reflection objects for the specified dataset ID.

Attributes of objects in the data Array

type String

Reflection type. For details, read Types of Reflections.

Enum: RAW, AGGREGATION

Example: RAW


enabled Boolean

If the reflection is available for accelerating queries, true. Otherwise, false.

Example: true


arrowCachingEnabled Boolean

If Dremio converts data from the reflection's Parquet files to Apache Arrow format when copying that data to executor nodes, true. Otherwise, false.

Example: false


displayFields Array of Object

Information about the fields displayed from the anchor dataset. Each object in the displayFields array contains one attribute: name. Included only for raw reflections. Not included for aggregation reflections.

Example: [{"name":"pickup_datetime"},{"name":"passenger_count"},{"name":"trip_distance_mi"},{"name":"fare_amount"},{"name":"tip_amount"},{"name":"total_amount"}]


dimensionFields Array of Object

Information about the dimension fields from the anchor dataset used in the reflection. Dimension fields are the fields you expect to group by when analyzing data. Each object in the dimensionFields array contains two attributes: name and granularity. Included only for aggregation reflections. If the anchor dataset does not include any dimension fields, the dimensionFields value is an empty array. Not included for raw reflections.

Example: [{"name":"passenger_count","granularity":"DATE"}]


measureFields Array of Object

Information about the measure fields from the anchor dataset used in the reflection. Measure fields are the fields you expect to use for calculations when analyzing the data. Each object in the measureFields array contains two attributes: name and measureTypeList. Included only for aggregation reflections. If the anchor dataset does not include any measure fields, the measureFields value is an empty array. Not included for raw reflections.

Example: [{"name":"total_amount","measureTypeList":["COUNT","SUM"]},{"name":"trip_distance_mi","measureTypeList":["COUNT","SUM"]},{"name":"fare_amount","measureTypeList":["COUNT","SUM"]},{"name":"tip_amount","measureTypeList":["COUNT","SUM"]}]
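As a rough illustration of how name and measureTypeList pair up, the following Python sketch expands a measureFields array into one label per (measure type, field) combination. The function name measure_labels and the TYPE(field) label format are assumptions made for this example; they are not part of the Dremio API.

```python
from typing import Dict, List


def measure_labels(measure_fields: List[Dict]) -> List[str]:
    """Return one illustrative label per (measure type, field) pair.

    The "TYPE(field)" format is only a readable label chosen for this
    sketch; it is not something the API returns.
    """
    labels = []
    for field in measure_fields:
        # measureTypeList holds the measure types recommended for this field.
        for measure_type in field.get("measureTypeList", []):
            labels.append(f"{measure_type}({field['name']})")
    return labels


# First two entries of the measureFields example above.
sample = [
    {"name": "total_amount", "measureTypeList": ["COUNT", "SUM"]},
    {"name": "trip_distance_mi", "measureTypeList": ["COUNT", "SUM"]},
]

for label in measure_labels(sample):
    print(label)
```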


partitionFields Array of Object

Information about the fields from the anchor dataset used to partition data in the reflection. Each object in the partitionFields array contains one attribute: name. Included for both raw and aggregation reflections, as the example responses on this page show. If the anchor dataset does not include any partition fields, the partitionFields value is an empty array.

Example: [{"name": "dropoff_date"},{"name": "passenger_count"}]


entityType String

Type of entity. For objects in dataset responses, the entityType is reflection.
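Because some attributes appear only for one reflection type, client code usually needs to tolerate missing keys. The following Python sketch groups the objects in the data array by type and reads field names defensively. The response text is an abbreviated copy of the example on this page, and the helper names group_by_type and field_names are hypothetical.

```python
import json
from typing import Dict, List

# Abbreviated copy of the example response on this page.
RESPONSE_TEXT = """
{
  "data": [
    {"type": "RAW", "enabled": true, "arrowCachingEnabled": false,
     "displayFields": [{"name": "pickup_datetime"}, {"name": "fare_amount"}],
     "partitionFields": [{"name": "dropoff_date"}],
     "entityType": "reflection"},
    {"type": "AGGREGATION", "enabled": true, "arrowCachingEnabled": false,
     "dimensionFields": [{"name": "passenger_count", "granularity": "DATE"}],
     "measureFields": [{"name": "total_amount", "measureTypeList": ["COUNT", "SUM"]}],
     "partitionFields": [{"name": "dropoff_date"}],
     "entityType": "reflection"}
  ]
}
"""


def group_by_type(response: Dict) -> Dict[str, List[Dict]]:
    """Group recommended reflections by their type value (RAW or AGGREGATION)."""
    grouped: Dict[str, List[Dict]] = {}
    for recommendation in response.get("data", []):
        grouped.setdefault(recommendation["type"], []).append(recommendation)
    return grouped


def field_names(recommendation: Dict, key: str) -> List[str]:
    """Return the names under an attribute, or [] when the attribute is absent."""
    return [field["name"] for field in recommendation.get(key, [])]


grouped = group_by_type(json.loads(RESPONSE_TEXT))
raw = grouped["RAW"][0]
agg = grouped["AGGREGATION"][0]

print(field_names(raw, "displayFields"))    # present on raw reflections
print(field_names(agg, "dimensionFields"))  # present on aggregation reflections
print(field_names(raw, "measureFields"))    # absent on raw reflections, so empty
```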

Creating and Retrieving Reflection Recommendations for a Dataset

Create reflection recommendations for the specified dataset. The response contains the reflection recommendations.

Method and URL
POST /api/v3/dataset/{id}/reflection/recommendation/{type}/

Parameters

id Path   String (UUID)

The ID of the dataset for which you want to create and retrieve recommended reflections.

Example: 88e5fbdf-4b56-4286-9b8b-bb48e1f350eb


type Path   String

The type of reflection recommendations you want to create and retrieve.

  • ALL: Create and retrieve both raw and aggregation reflection recommendations.
  • RAW: Create and retrieve only raw reflection recommendations.
  • AGG: Create and retrieve only aggregation reflection recommendations.

NOTE: The type is not case-sensitive. For example, AGG, agg, and aGg are valid type values for aggregation reflection recommendations.

Example: ALL
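The two path parameters can be assembled into the request URL as in the following Python sketch. The function name recommendation_url and the hostname dremio.example.com are placeholders invented for this example. Because the type is not case-sensitive, uppercasing it here is optional and is done only so that the value can be checked against the three documented options before sending a request.

```python
from urllib.parse import quote

# The three type values documented above.
VALID_TYPES = {"ALL", "RAW", "AGG"}


def recommendation_url(hostname: str, dataset_id: str, rec_type: str = "ALL") -> str:
    """Build the recommendation endpoint URL for one dataset.

    Raises ValueError if rec_type is not ALL, RAW, or AGG (in any letter case).
    """
    normalized = rec_type.upper()
    if normalized not in VALID_TYPES:
        raise ValueError(f"type must be one of {sorted(VALID_TYPES)}, got {rec_type!r}")
    # Percent-encode the ID so unexpected characters cannot alter the path.
    encoded_id = quote(dataset_id, safe="")
    return (
        f"https://{hostname}/api/v3/dataset/{encoded_id}"
        f"/reflection/recommendation/{normalized}/"
    )


# Hypothetical hostname; the dataset ID is the example value from this page.
print(recommendation_url(
    "dremio.example.com",
    "88e5fbdf-4b56-4286-9b8b-bb48e1f350eb",
    "aGg",
))
```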

Example Request (All Reflections)
curl -X POST 'https://{hostname}/api/v3/dataset/88e5fbdf-4b56-4286-9b8b-bb48e1f350eb/reflection/recommendation/ALL/' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json'
Example Response (All Reflections)
{
  "data": [
    {
      "type": "RAW",
      "enabled": true,
      "arrowCachingEnabled": false,
      "displayFields": [
        {
          "name": "pickup_datetime"
        },
        {
          "name": "passenger_count"
        },
        {
          "name": "trip_distance_mi"
        },
        {
          "name": "fare_amount"
        },
        {
          "name": "tip_amount"
        },
        {
          "name": "total_amount"
        }
      ],
      "partitionFields": [
        {
          "name": "dropoff_date"
        },
        {
          "name": "passenger_count"
        }
      ],
      "entityType": "reflection"
    },
    {
      "type": "AGGREGATION",
      "enabled": true,
      "arrowCachingEnabled": false,
      "dimensionFields": [
        {
          "name": "passenger_count",
          "granularity": "DATE"
        }
      ],
      "measureFields": [
        {
          "name": "total_amount",
          "measureTypeList": [
            "COUNT",
            "SUM"
          ]
        },
        {
          "name": "trip_distance_mi",
          "measureTypeList": [
            "COUNT",
            "SUM"
          ]
        },
        {
          "name": "fare_amount",
          "measureTypeList": [
            "COUNT",
            "SUM"
          ]
        },
        {
          "name": "tip_amount",
          "measureTypeList": [
            "COUNT",
            "SUM"
          ]
        }
      ],
      "partitionFields": [
        {
          "name": "dropoff_date"
        },
        {
          "name": "passenger_count"
        }
      ],
      "entityType": "reflection"
    }
  ]
}
Example Request (Raw Reflections)
curl -X POST 'https://{hostname}/api/v3/dataset/88e5fbdf-4b56-4286-9b8b-bb48e1f350eb/reflection/recommendation/RAW/' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json'
Example Response (Raw Reflections)
{
  "data": [
    {
      "type": "RAW",
      "enabled": true,
      "arrowCachingEnabled": false,
      "displayFields": [
        {
          "name": "pickup_datetime"
        },
        {
          "name": "passenger_count"
        },
        {
          "name": "trip_distance_mi"
        },
        {
          "name": "fare_amount"
        },
        {
          "name": "tip_amount"
        },
        {
          "name": "total_amount"
        }
      ],
      "partitionFields": [
        {
          "name": "dropoff_date"
        },
        {
          "name": "passenger_count"
        }
      ],
      "entityType": "reflection"
    }
  ]
}
Example Request (Aggregation Reflections)
curl -X POST 'https://{hostname}/api/v3/dataset/88e5fbdf-4b56-4286-9b8b-bb48e1f350eb/reflection/recommendation/AGG/' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json'
Example Response (Aggregation Reflections)
{
  "data": [
    {
      "type": "AGGREGATION",
      "enabled": true,
      "arrowCachingEnabled": false,
      "dimensionFields": [
        {
          "name": "passenger_count",
          "granularity": "DATE"
        }
      ],
      "measureFields": [
        {
          "name": "total_amount",
          "measureTypeList": [
            "COUNT",
            "SUM"
          ]
        },
        {
          "name": "trip_distance_mi",
          "measureTypeList": [
            "COUNT",
            "SUM"
          ]
        },
        {
          "name": "fare_amount",
          "measureTypeList": [
            "COUNT",
            "SUM"
          ]
        },
        {
          "name": "tip_amount",
          "measureTypeList": [
            "COUNT",
            "SUM"
          ]
        }
      ],
      "partitionFields": [
        {
          "name": "dropoff_date"
        },
        {
          "name": "passenger_count"
        }
      ],
      "entityType": "reflection"
    }
  ]
}

Response Status Codes

200   OK

400   Bad Request

401   Unauthorized

405   Method Not Allowed

500   Internal Server Error
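
A client can translate these status codes into readable errors. The following Python sketch mirrors the curl examples above using only the standard library: it sends a POST with the same two headers and no request body, and raises an error that names the status when the request fails. The helper names build_request and fetch_recommendations are hypothetical, and the status table contains only the codes listed on this page.

```python
import json
import urllib.error
import urllib.request

# Status codes listed in the Response Status Codes section above.
STATUS_MEANINGS = {
    200: "OK",
    400: "Bad Request",
    401: "Unauthorized",
    405: "Method Not Allowed",
    500: "Internal Server Error",
}


def build_request(url: str, token: str) -> urllib.request.Request:
    """Create a POST request with the headers used in the curl examples."""
    return urllib.request.Request(
        url,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def fetch_recommendations(url: str, token: str) -> dict:
    """Send the request and return the parsed JSON body.

    Raises RuntimeError naming the HTTP status if the server returns an error.
    """
    try:
        with urllib.request.urlopen(build_request(url, token)) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        meaning = STATUS_MEANINGS.get(err.code, "status not listed on this page")
        raise RuntimeError(f"Request failed: {err.code} {meaning}") from err
```

Calling fetch_recommendations requires a reachable Dremio host and a valid personal access token, so the sketch does not invoke it.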