Promote a File/Folder to a Physical Dataset
This API promotes a file or folder in a file-based source to a physical dataset (PDS). The supplied path determines which entity is promoted.
Promotion converts the file or folder to a dataset. Because the dataset is a new entity, it receives a new ID.
Note:
To unpromote a physical dataset (PDS), delete the dataset. This reverts the PDS to its original form (a folder or file).
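The note above can be sketched as a request builder. This is a minimal sketch under two assumptions that this page does not state: that deleting the dataset means sending DELETE to the /api/v3/catalog/{id} route, and that the ID to use is the dataset ID returned by the promotion call. The helper name build_unpromote_request, the base URL, and the token placeholder are hypothetical. The request is only constructed, not sent.

```python
import urllib.request
from urllib.parse import quote


def build_unpromote_request(base_url, dataset_id, token):
    """Build (but do not send) a DELETE request for a promoted dataset.

    Assumption: DELETE {base_url}/api/v3/catalog/{id} removes the dataset.
    """
    url = f"{base_url}/api/v3/catalog/{quote(dataset_id, safe='')}"
    return urllib.request.Request(
        url, method="DELETE", headers={"Authorization": token}
    )


req = build_unpromote_request(
    "http://localhost:9047",
    "cf771ed4-8ffc-49c6-b75c-b6ce4a518289",  # dataset ID taken from the example response
    "<auth token>",  # placeholder, not a real token
)
print(req.get_method(), req.full_url)
```

Sending the request (for example with urllib.request.urlopen) is left out on purpose so that the sketch has no side effects.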
To promote a file or folder, you need the following:
- ID (encoded) of the folder or file. Use GET /catalog/by-path to obtain information about the file or folder.
- Request input. As a minimum: entityType, id, path, type, and format.
- Dataset format.
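The minimum request input listed above can be assembled as follows. This is a sketch only: the helper name build_promotion_body and the REQUIRED_KEYS tuple are hypothetical, and the values are copied from the folder example later on this page.

```python
import json

# Minimum request input named in the list above.
REQUIRED_KEYS = ("entityType", "id", "path", "type", "format")


def build_promotion_body(encoded_id, path, dataset_format):
    """Assemble the minimum request body for promoting a file or folder."""
    return {
        "entityType": "dataset",
        "id": encoded_id,
        "path": list(path),
        "type": "PHYSICAL_DATASET",
        "format": dict(dataset_format),
    }


body = build_promotion_body(
    "dremio%3A%2Fmy_hdfs_2%2Fdata%2Floans%2Facquisition-mini",
    ["my_hdfs_2", "data", "loans", "acquisition-mini"],
    {
        "type": "Text",
        "fieldDelimiter": "|",
        "lineDelimiter": "\n",
        "escape": "\"",
        "skipFirstLine": False,
        "extractHeader": False,
        "trimHeader": False,
        "autoGenerateColumnNames": True,
    },
)

# Check that nothing from the minimum set is missing before sending.
missing = [key for key in REQUIRED_KEYS if key not in body]
print(json.dumps(body, indent=2))
print("missing keys:", missing)
```

The resulting dictionary can be serialized with json.dumps and used as the request body, as the Curl and Python samples on this page do with a literal string.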
Syntax
POST /api/v3/catalog/{id}
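Because the ID contains characters such as ':' and '/', it has to be percent-encoded before it is placed in the {id} segment. The sketch below shows one way to do that with Python's standard library. The helper names encode_catalog_id and promotion_url are hypothetical, and the unencoded ID is simply the decoded form of the encoded ID used in the folder example on this page.

```python
from urllib.parse import quote


def encode_catalog_id(raw_id):
    """Percent-encode every reserved character, including ':' and '/'."""
    return quote(raw_id, safe="")


def promotion_url(base_url, raw_id):
    """Build the POST target for the promotion call."""
    return f"{base_url}/api/v3/catalog/{encode_catalog_id(raw_id)}"


raw_id = "dremio:/my_hdfs_2/data/loans/acquisition-mini"
encoded_id = encode_catalog_id(raw_id)
print(encoded_id)
print(promotion_url("http://localhost:9047", raw_id))
```

Passing safe="" matters here: by default quote leaves '/' unescaped, which would split the ID into several URL path segments.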
Request Input
See Dataset for more information.
{
"entityType": "dataset" [immutable, generated by Dremio],
"id": String [immutable, generated by Dremio],
"path": [String] [immutable after creation],
"tag": String [immutable, generated by Dremio],
"type": String ["PHYSICAL_DATASET", "VIRTUAL_DATASET"] [immutable],
"fields": [DatasetField] [immutable],
"createdAt": String (RFC3339 date) [immutable, generated by Dremio],
"accelerationRefreshPolicy": DatasetAccelerationRefreshPolicy [optional, only for physical datasets in a source],
"sql": String [optional, required for virtual datasets],
"sqlContext": [String] [optional, only for virtual datasets],
"format": DatasetFormat [optional, required for promoted datasets],
"approximateStatisticsAllowed": Boolean [optional, introduced in Dremio 2.1.0]
}
Response Output
See Dataset for more information.
{
"entityType": "dataset" [immutable, generated by Dremio],
"id": String [immutable, generated by Dremio],
"path": [String] [immutable after creation],
"tag": String [immutable, generated by Dremio],
"type": String ["PHYSICAL_DATASET", "VIRTUAL_DATASET"] [immutable],
"fields": [DatasetField] [immutable],
"createdAt": String (RFC3339 date) [immutable, generated by Dremio],
"accelerationRefreshPolicy": DatasetAccelerationRefreshPolicy [optional, only for physical datasets in a source],
"sql": String [optional, required for virtual datasets],
"sqlContext": [String] [optional, only for virtual datasets],
"format": DatasetFormat [optional, required for promoted datasets],
"approximateStatisticsAllowed": Boolean [optional, introduced in Dremio 2.1.0]
}
Response Codes
400
- The supplied CatalogEntity object is invalid.
403
- User does not have permission to create the catalog entity.
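A client can map these codes to readable messages before deciding how to react. The following is a minimal sketch; the function name explain_promotion_status is hypothetical, and treating any 2xx code as success is a general HTTP convention rather than something this page specifies.

```python
# Error codes documented for the promotion call.
DOCUMENTED_ERRORS = {
    400: "The supplied CatalogEntity object is invalid.",
    403: "User does not have permission to create the catalog entity.",
}


def explain_promotion_status(status_code):
    """Return a short description for an HTTP status code."""
    if status_code in DOCUMENTED_ERRORS:
        return DOCUMENTED_ERRORS[status_code]
    if 200 <= status_code < 300:
        # Assumption: 2xx means the promotion succeeded.
        return "Request succeeded."
    return f"Status {status_code} is not documented on this page."


for code in (200, 400, 403, 500):
    print(code, explain_promotion_status(code))
```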
Example: Promote a Folder
In this example, a folder, acquisition-mini, in an HDFS source, my_hdfs_2, is promoted to a PDS. We have the following information about the folder:
- Encoded ID:
dremio%3A%2Fmy_hdfs_2%2Fdata%2Floans%2Facquisition-mini
- Path:
[ "my_hdfs_2", "data", "loans", "acquisition-mini" ]
- Dataset format:
{
"type": "Text",
"fieldDelimiter": "|",
"lineDelimiter": "\n",
"escape": "\"",
"skipFirstLine": false,
"extractHeader": false,
"trimHeader": false,
"autoGenerateColumnNames": true
}
Note:
Postman was used to generate samples.
HTTP
POST localhost:9047/api/v3/catalog/dremio%3A%2Fmy_hdfs_2%2Fdata%2Floans%2Facquisition-mini
Raw Body Input
{
"entityType": "dataset",
"id": "dremio%3A%2Fmy_hdfs_2%2Fdata%2Floans%2Facquisition-mini",
"path": [
"my_hdfs_2",
"data",
"loans",
"acquisition-mini"
],
"type": "PHYSICAL_DATASET",
"format": {
"type": "Text",
"fieldDelimiter": "|",
"lineDelimiter": "\n",
"escape": "\"",
"skipFirstLine": false,
"extractHeader": false,
"trimHeader": false,
"autoGenerateColumnNames": true
}
}
Curl
curl -X POST \
http://localhost:9047/api/v3/catalog/dremio%3A%2Fmy_hdfs_2%2Fdata%2Floans%2Facquisition-mini \
-H 'Authorization: _dremioo8opojj6vn4ughkvcpalpr46d6' \
-H 'Content-Type: application/json' \
-d '{
"entityType": "dataset",
"id": "dremio%3A%2Fmy_hdfs_2%2Fdata%2Floans%2Facquisition-mini",
"path": [
"my_hdfs_2",
"data",
"loans",
"acquisition-mini"
],
"type": "PHYSICAL_DATASET",
"format": {
"type": "Text",
"fieldDelimiter": "|",
"lineDelimiter": "\n",
"escape": "\"",
"skipFirstLine": false,
"extractHeader": false,
"trimHeader": false,
"autoGenerateColumnNames": true
}
}'
Python
import requests
url = "http://localhost:9047/api/v3/catalog/dremio%3A%2Fmy_hdfs_2%2Fdata%2Floans%2Facquisition-mini"
payload = "{\n \"entityType\": \"dataset\",\n \"id\": \"dremio%3A%2Fmy_hdfs_2%2Fdata%2Floans%2Facquisition-mini\",\n \"path\": [\n \t\"my_hdfs_2\",\n \t\"data\",\n \t\"loans\",\n \t\"acquisition-mini\"\n \t],\n \t\n \"type\": \"PHYSICAL_DATASET\",\n \"format\": {\n \"type\": \"Text\",\n \"fieldDelimiter\": \"|\",\n \"lineDelimiter\": \"\\n\",\n \"escape\": \"\\\"\",\n \"skipFirstLine\": false,\n \"extractHeader\": false,\n \"trimHeader\": false,\n \"autoGenerateColumnNames\": true\n }\n}"
headers = {
'Authorization': "_dremioo8opojj6vn4ughkvcpalpr46d6",
'Content-Type': "application/json"
}
response = requests.request("POST", url, data=payload, headers=headers)
print(response.text)
Response
{
"entityType": "dataset",
"id": "cf771ed4-8ffc-49c6-b75c-b6ce4a518289",
"type": "PHYSICAL_DATASET",
"path": [
"my_hdfs_2",
"data",
"loans",
"acquisition-mini"
],
"createdAt": "2019-03-26T18:56:57.085Z",
"tag": "0",
"format": {
"type": "Text",
"ctime": 0,
"isFolder": true,
"location": "/data/loans/acquisition-mini",
"fieldDelimiter": "|",
"skipFirstLine": false,
"extractHeader": false,
"quote": "\"",
"comment": "#",
"escape": "\"",
"lineDelimiter": "\n",
"autoGenerateColumnNames": true,
"trimHeader": false
},
"accessControlList": {
"version": "0"
},
"fields": [
{
"name": "A",
"type": {
"name": "VARCHAR"
}
},
{
"name": "B",
"type": {
"name": "VARCHAR"
}
},
{
"name": "C",
"type": {
"name": "VARCHAR"
}
},
{
"name": "D",
"type": {
"name": "VARCHAR"
}
},
{
"name": "E",
"type": {
"name": "VARCHAR"
}
},
{
"name": "F",
"type": {
"name": "VARCHAR"
}
},
{
"name": "G",
"type": {
"name": "VARCHAR"
}
},
{
"name": "H",
"type": {
"name": "VARCHAR"
}
},
{
"name": "I",
"type": {
"name": "VARCHAR"
}
},
{
"name": "J",
"type": {
"name": "VARCHAR"
}
},
{
"name": "K",
"type": {
"name": "VARCHAR"
}
},
{
"name": "L",
"type": {
"name": "VARCHAR"
}
},
{
"name": "M",
"type": {
"name": "VARCHAR"
}
},
{
"name": "N",
"type": {
"name": "VARCHAR"
}
},
{
"name": "O",
"type": {
"name": "VARCHAR"
}
},
{
"name": "P",
"type": {
"name": "VARCHAR"
}
},
{
"name": "Q",
"type": {
"name": "VARCHAR"
}
},
{
"name": "R",
"type": {
"name": "VARCHAR"
}
},
{
"name": "S",
"type": {
"name": "VARCHAR"
}
},
{
"name": "T",
"type": {
"name": "VARCHAR"
}
},
{
"name": "U",
"type": {
"name": "VARCHAR"
}
},
{
"name": "V",
"type": {
"name": "VARCHAR"
}
},
{
"name": "W",
"type": {
"name": "VARCHAR"
}
}
],
"approximateStatisticsAllowed": false
}
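The response carries the new dataset ID and the generated column names, both of which are usually needed for follow-up calls. The sketch below pulls them out of an abbreviated copy of the response above (first three fields only). The function name summarize_promoted_dataset is hypothetical.

```python
import json


def summarize_promoted_dataset(dataset):
    """Return the dataset ID and a list of (column name, column type) pairs."""
    columns = [
        (field["name"], field["type"]["name"])
        for field in dataset.get("fields", [])
    ]
    return dataset["id"], columns


# Abbreviated copy of the response above.
response_text = """
{
  "entityType": "dataset",
  "id": "cf771ed4-8ffc-49c6-b75c-b6ce4a518289",
  "type": "PHYSICAL_DATASET",
  "fields": [
    {"name": "A", "type": {"name": "VARCHAR"}},
    {"name": "B", "type": {"name": "VARCHAR"}},
    {"name": "C", "type": {"name": "VARCHAR"}}
  ]
}
"""

dataset_id, columns = summarize_promoted_dataset(json.loads(response_text))
print(dataset_id)
for name, type_name in columns:
    print(name, type_name)
```

Note that the returned ID differs from the encoded ID sent in the request, which matches the statement that the promoted dataset is a new entity.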