Promote a File/Folder to a Physical Dataset

This API promotes a file or folder in a file-based source to a physical dataset (PDS). The supplied path is used to determine what entity is promoted.

Promotion converts the file or folder into a dataset. Because the dataset is a new entity, it receives a new ID.

To unpromote a physical dataset (PDS), you delete the dataset. This reverts the PDS back to its original form (a folder or file).
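As a rough sketch of the unpromote step, the snippet below builds (but does not send) a delete request for a promoted dataset. This section does not state the delete endpoint, so the DELETE /api/v3/catalog/{id} route is an assumption to confirm against your Dremio version. The function name and the token placeholder are illustrative only.

```python
import urllib.request


def build_unpromote_request(base_url, dataset_id, token):
    """Build, without sending, a DELETE request for a promoted dataset."""
    # Assumption: datasets are deleted with DELETE on the same catalog route
    # that promotion POSTs to. Verify this against your Dremio version.
    url = f"{base_url}/api/v3/catalog/{dataset_id}"
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={"Authorization": token},
    )


req = build_unpromote_request(
    "http://localhost:9047",
    "cf771ed4-8ffc-49c6-b75c-b6ce4a518289",  # dataset ID returned by promotion
    "_dremio<token>",  # placeholder, not a real token
)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it. Note that the ID used here is the dataset's new ID, not the encoded ID of the original file or folder.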

To promote a file or folder, you need the following:

  • ID (encoded) of the folder or file. Use GET /catalog/by-path to obtain information about the file or folder.
  • Request input. At a minimum: entityType, id, path, type, and format.
  • Dataset format.
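The encoded ID is the entity's ID with its reserved characters percent-encoded. A minimal sketch, assuming the raw ID has already been read from the GET /catalog/by-path response. The sample value is simply the decoded form of the encoded ID used in the example in this section, and the function name is illustrative.

```python
from urllib.parse import quote


def encode_catalog_id(raw_id):
    """Percent-encode a catalog ID so it can be used as a URL path segment."""
    # safe="" makes quote() encode "/" and ":" as well, which it would
    # otherwise leave untouched or only partly encode.
    return quote(raw_id, safe="")


# Assumption: the by-path response reports the folder's ID in this form.
raw_id = "dremio:/my_hdfs_2/data/loans/acquisition-mini"
encoded_id = encode_catalog_id(raw_id)
print(encoded_id)
```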

Syntax

POST /api/v3/catalog/{id}

Request Input

See Dataset for more information.

{
  "entityType": "dataset" [immutable, generated by Dremio],
  "id": String [immutable, generated by Dremio],
  "path": [String] [immutable after creation],
  "tag": String [immutable, generated by Dremio],
  "type": String ["PHYSICAL_DATASET", "VIRTUAL_DATASET"] [immutable],
  "fields": [DatasetField] [immutable],
  "createdAt": String (RFC3339 date) [immutable, generated by Dremio],
  "accelerationRefreshPolicy": DatasetAccelerationRefreshPolicy [optional, only for physical datasets in a source],
  "sql": String [optional, required for virtual datasets],
  "sqlContext": [String] [optional, only for virtual datasets],
  "format": DatasetFormat [optional, required for promoted datasets],
  "approximateStatisticsAllowed": Boolean [optional, introduced in Dremio 2.1.0]
}
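For promotion, only a handful of these attributes need to be sent. The following sketch assembles the minimal body (entityType, id, path, type, and format) from a raw ID, a path, and a format object. The helper name and its parameters are hypothetical, and the raw ID form is an assumption based on the encoded ID in the example in this section.

```python
import json
from urllib.parse import quote


def build_promotion_body(raw_id, path, dataset_format):
    """Return the minimal request body for promoting a file or folder."""
    return {
        "entityType": "dataset",
        "id": quote(raw_id, safe=""),  # encoded ID, as in the URL
        "path": list(path),
        "type": "PHYSICAL_DATASET",    # promotion always yields a PDS
        "format": dataset_format,
    }


body = build_promotion_body(
    "dremio:/my_hdfs_2/data/loans/acquisition-mini",
    ["my_hdfs_2", "data", "loans", "acquisition-mini"],
    {"type": "Text", "fieldDelimiter": "|"},
)
print(json.dumps(body, indent=2))
```

Attributes marked as generated by Dremio, such as tag and createdAt, are left out because the server fills them in.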

Response Output

See Dataset for more information.

{
  "entityType": "dataset" [immutable, generated by Dremio],
  "id": String [immutable, generated by Dremio],
  "path": [String] [immutable after creation],
  "tag": String [immutable, generated by Dremio],
  "type": String ["PHYSICAL_DATASET", "VIRTUAL_DATASET"] [immutable],
  "fields": [DatasetField] [immutable],
  "createdAt": String (RFC3339 date) [immutable, generated by Dremio],
  "accelerationRefreshPolicy": DatasetAccelerationRefreshPolicy [optional, only for physical datasets in a source],
  "sql": String [optional, required for virtual datasets],
  "sqlContext": [String] [optional, only for virtual datasets],
  "format": DatasetFormat [optional, required for promoted datasets],
  "approximateStatisticsAllowed": Boolean [optional, introduced in Dremio 2.1.0]
}

Response Codes

400 - The supplied CatalogEntity object is invalid.
403 - User does not have permission to create the catalog entity.
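
A caller might translate these codes into readable errors along the following lines. This is only a sketch: the function name is illustrative, and treating any 2xx status as success is an assumption, since this section lists error codes only.

```python
# Messages copied from the response codes listed above.
PROMOTION_ERRORS = {
    400: "The supplied CatalogEntity object is invalid.",
    403: "User does not have permission to create the catalog entity.",
}


def describe_status(status_code):
    """Map an HTTP status code from the promotion call to a short message."""
    # Assumption: any 2xx status means the promotion succeeded.
    if 200 <= status_code < 300:
        return "ok"
    return PROMOTION_ERRORS.get(
        status_code, f"Unexpected HTTP status {status_code}"
    )


print(describe_status(403))
```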

Example: Promote a Folder

In this example, a folder, acquisition-mini, in an HDFS source, my_hdfs_2, is promoted to a PDS. We have the following information about the folder:

  • Encoded ID:
    dremio%3A%2Fmy_hdfs_2%2Fdata%2Floans%2Facquisition-mini
  • Path:
    [
        "my_hdfs_2",
        "data",
        "loans",
        "acquisition-mini"
    ]

  • Dataset format:
    {
        "type": "Text",
        "fieldDelimiter": "|",
        "lineDelimiter": "\n",
        "escape": "\"",
        "skipFirstLine": false,
        "extractHeader": false,
        "trimHeader": false,
        "autoGenerateColumnNames": true
    }

Postman was used to generate samples.

HTTP

POST localhost:9047/api/v3/catalog/dremio%3A%2Fmy_hdfs_2%2Fdata%2Floans%2Facquisition-mini

Raw Body Input

{
    "entityType": "dataset",
    "id": "dremio%3A%2Fmy_hdfs_2%2Fdata%2Floans%2Facquisition-mini",
    "path": [
        "my_hdfs_2",
        "data",
        "loans",
        "acquisition-mini"
    ],
    "type": "PHYSICAL_DATASET",
    "format": {
        "type": "Text",
        "fieldDelimiter": "|",
        "lineDelimiter": "\n",
        "escape": "\"",
        "skipFirstLine": false,
        "extractHeader": false,
        "trimHeader": false,
        "autoGenerateColumnNames": true
    }
}

Curl

curl -X POST \
  http://localhost:9047/api/v3/catalog/dremio%3A%2Fmy_hdfs_2%2Fdata%2Floans%2Facquisition-mini \
  -H 'Authorization: _dremioo8opojj6vn4ughkvcpalpr46d6' \
  -H 'Content-Type: application/json' \
  -d '{
    "entityType": "dataset",
    "id": "dremio%3A%2Fmy_hdfs_2%2Fdata%2Floans%2Facquisition-mini",
    "path": [
        "my_hdfs_2",
        "data",
        "loans",
        "acquisition-mini"
    ],
    "type": "PHYSICAL_DATASET",
    "format": {
        "type": "Text",
        "fieldDelimiter": "|",
        "lineDelimiter": "\n",
        "escape": "\"",
        "skipFirstLine": false,
        "extractHeader": false,
        "trimHeader": false,
        "autoGenerateColumnNames": true
    }
}'

Python

import requests

url = "http://localhost:9047/api/v3/catalog/dremio%3A%2Fmy_hdfs_2%2Fdata%2Floans%2Facquisition-mini"

# Request body as a Python dict; requests serializes it to JSON.
payload = {
    "entityType": "dataset",
    "id": "dremio%3A%2Fmy_hdfs_2%2Fdata%2Floans%2Facquisition-mini",
    "path": ["my_hdfs_2", "data", "loans", "acquisition-mini"],
    "type": "PHYSICAL_DATASET",
    "format": {
        "type": "Text",
        "fieldDelimiter": "|",
        "lineDelimiter": "\n",
        "escape": "\"",
        "skipFirstLine": False,
        "extractHeader": False,
        "trimHeader": False,
        "autoGenerateColumnNames": True,
    },
}
headers = {
    "Authorization": "_dremioo8opojj6vn4ughkvcpalpr46d6",
    "Content-Type": "application/json",
}

# json= encodes the dict as the JSON request body.
response = requests.post(url, json=payload, headers=headers)

print(response.text)

Response

{
    "entityType": "dataset",
    "id": "cf771ed4-8ffc-49c6-b75c-b6ce4a518289",
    "type": "PHYSICAL_DATASET",
    "path": [
        "my_hdfs_2",
        "data",
        "loans",
        "acquisition-mini"
    ],
    "createdAt": "2019-03-26T18:56:57.085Z",
    "tag": "0",
    "format": {
        "type": "Text",
        "ctime": 0,
        "isFolder": true,
        "location": "/data/loans/acquisition-mini",
        "fieldDelimiter": "|",
        "skipFirstLine": false,
        "extractHeader": false,
        "quote": "\"",
        "comment": "#",
        "escape": "\"",
        "lineDelimiter": "\n",
        "autoGenerateColumnNames": true,
        "trimHeader": false
    },
    "accessControlList": {
        "version": "0"
    },
    "fields": [
        {
            "name": "A",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "B",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "C",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "D",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "E",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "F",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "G",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "H",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "I",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "J",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "K",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "L",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "M",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "N",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "O",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "P",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "Q",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "R",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "S",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "T",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "U",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "V",
            "type": {
                "name": "VARCHAR"
            }
        },
        {
            "name": "W",
            "type": {
                "name": "VARCHAR"
            }
        }
    ],
    "approximateStatisticsAllowed": false
}
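
To illustrate how the response might be consumed, the sketch below parses an abbreviated copy of the response above (only the first two fields are kept) and pulls out the new dataset ID and the column names and types. The variable names are illustrative.

```python
import json

# Abbreviated copy of the response shown above (first two fields only).
response_text = """
{
    "entityType": "dataset",
    "id": "cf771ed4-8ffc-49c6-b75c-b6ce4a518289",
    "type": "PHYSICAL_DATASET",
    "path": ["my_hdfs_2", "data", "loans", "acquisition-mini"],
    "fields": [
        {"name": "A", "type": {"name": "VARCHAR"}},
        {"name": "B", "type": {"name": "VARCHAR"}}
    ]
}
"""

dataset = json.loads(response_text)

# The promoted dataset has its own ID, distinct from the encoded folder ID
# that was used in the request URL.
dataset_id = dataset["id"]

# Each entry in "fields" carries a column name and a nested type name.
columns = [(f["name"], f["type"]["name"]) for f in dataset["fields"]]

print(dataset_id)
print(columns)
```

The ID in the response differs from the encoded ID sent in the request, which reflects that the promoted dataset is a new entity.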