GET /catalog/by-path/{path}

Retrieves information about a specific catalog entity (source, space, folder, file, or dataset) using its path. If the entity has children, they are also returned, each with its ID, path, type, and containerType.

Syntax

GET /api/v3/catalog/by-path/{path}

Path is the Dremio path for the entity, using / as a separator. Each path component should be URL-escaped.

Example Syntax

For example, given a source called MySource which has a folder called MyFolder that contains a dataset called MyDataset, the URL will look like this:

GET /api/v3/catalog/by-path/MySource/MyFolder/MyDataset

If the dataset were called My?Dataset, the URL would be:

GET /api/v3/catalog/by-path/MySource/MyFolder/My%3FDataset

This is because ? is a special character in URLs and must be URL-escaped.
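
The escaping rule above can be sketched in Python as follows. This is a minimal illustration, not part of the Dremio API: the helper name build_by_path_url is hypothetical, and it assumes the path is supplied as a list of unescaped components.

```python
from urllib.parse import quote

# Endpoint prefix taken from the Syntax section above.
API_PREFIX = "/api/v3/catalog/by-path"


def build_by_path_url(components):
    """Join Dremio path components into a by-path URL (hypothetical helper)."""
    # safe="" makes quote() escape "/" and "?" inside a component too,
    # so the only literal slashes are the separators added by join().
    escaped = [quote(component, safe="") for component in components]
    return API_PREFIX + "/" + "/".join(escaped)


print(build_by_path_url(["MySource", "MyFolder", "My?Dataset"]))
# prints /api/v3/catalog/by-path/MySource/MyFolder/My%3FDataset
```

Escaping each component separately, rather than the joined path, keeps a slash that belongs to a name from being mistaken for a separator.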

Response Output

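The response body is a CatalogEntity, which is one of the following: source, space, folder, file, or dataset. The request can fail with one of these status codes: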

403 - User does not have permission to view the catalog entity.
404 - A catalog entity with the specified path could not be found.
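
A client might translate these status codes into readable errors along the following lines. This is only a sketch using the Python standard library: the names describe_error and get_catalog_entity are hypothetical, and the handling of any status other than 403 and 404 is an assumption, since this page does not document other codes.

```python
import json
import urllib.error
import urllib.request

# Messages copied from the error list above.
ERROR_MESSAGES = {
    403: "User does not have permission to view the catalog entity.",
    404: "A catalog entity with the specified path could not be found.",
}


def describe_error(status_code):
    """Return the documented message for a status code, or a generic one."""
    # Assumption: undocumented codes are reported generically.
    return ERROR_MESSAGES.get(status_code, f"Unexpected HTTP status {status_code}")


def get_catalog_entity(url, token):
    """Fetch a by-path URL and return the parsed response (hypothetical helper)."""
    request = urllib.request.Request(url, headers={"Authorization": token})
    try:
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        # err.code holds the HTTP status returned by the server.
        raise RuntimeError(describe_error(err.code)) from err
```

Calling get_catalog_entity requires a running Dremio instance and a valid token; describe_error can be used on its own with any HTTP client.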

Example: Get PDS by Path

In this example, information is requested about a physical dataset (PDS), yellow_tripdata_2009-01.csv, found in the HDFS source called DEV HDFS under the directory path data/nyctaxi.

HTTP

GET localhost:9047/api/v3/catalog/by-path/DEV%20HDFS/data/nyctaxi/yellow_tripdata_2009-01.csv

Curl

curl -X GET \
    http://localhost:9047/api/v3/catalog/by-path/DEV%20HDFS/data/nyctaxi/yellow_tripdata_2009-01.csv \
    -H "Content-Type: application/json" \
    -H "Authorization: _dremiohs85l11k2mh0b10l51ett9fsca"

Response

For a physical dataset like this, the response body includes information about formatting and datatypes.

{
  "entityType": "dataset",
  "id": "8a2df787-2e28-49ef-b961-52e214672d33",
  "type": "PHYSICAL_DATASET",
  "path": [
    "DEV HDFS",
    "data",
    "nyctaxi",
    "yellow_tripdata_2009-01.csv"
  ],
  "createdAt": "2019-01-10T16:10:29.676Z",
  "tag": "0",
  "format": {
    "type": "Text",
    "ctime": 0,
    "isFolder": false,
    "location": "/data/nyctaxi/yellow_tripdata_2009-01.csv",
    "fieldDelimiter": ",",
    "skipFirstLine": false,
    "extractHeader": true,
    "quote": "\"",
    "comment": "#",
    "escape": "\"",
    "lineDelimiter": "\r\n",
    "autoGenerateColumnNames": true,
    "trimHeader": true
  },
  "accessControlList": {
    "version": 0
  },
  "fields": [
    {
      "name": "Trip_Pickup_DateTime",
      "type": {
        "name": "VARCHAR"
      }
    },
    {
      "name": "Trip_Dropoff_DateTime",
      "type": {
        "name": "VARCHAR"
      }
    },
    {
      "name": "Passenger_Count",
      "type": {
        "name": "VARCHAR"
      }
    },
    {
      "name": "Trip_Distance",
      "type": {
        "name": "VARCHAR"
      }
    },
    {
      "name": "Total_Amt",
      "type": {
        "name": "VARCHAR"
      }
    }
  ],
  "approximateStatisticsAllowed": false
}
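
As an illustration of reading the datatype information, the sketch below lists each column name with its type. It works on a trimmed copy of the response above (two of the five fields are kept); the helper name summarize_fields is hypothetical.

```python
import json

# Trimmed copy of the response above (two of the five fields kept).
sample = json.loads("""
{
  "entityType": "dataset",
  "type": "PHYSICAL_DATASET",
  "format": {"type": "Text", "fieldDelimiter": ","},
  "fields": [
    {"name": "Trip_Pickup_DateTime", "type": {"name": "VARCHAR"}},
    {"name": "Passenger_Count", "type": {"name": "VARCHAR"}}
  ]
}
""")


def summarize_fields(entity):
    """Return (column name, type name) pairs from a dataset entity."""
    # Entities without a "fields" key (for example folders) yield an empty list.
    return [(f["name"], f["type"]["name"]) for f in entity.get("fields", [])]


for name, type_name in summarize_fields(sample):
    print(f"{name}: {type_name}")
```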

Example: Get Source Folder by Path

In this example, an HDFS source, my_hdfs_2, has a sub-folder (data/loans) containing three folders: acquisition, acquisition-mini, and performance. Two of the folders are not promoted, and one (acquisition-mini) is promoted to a PDS. The request retrieves information about the loans folder.

[info] Postman is used to generate samples.

HTTP

GET localhost:9047/api/v3/catalog/by-path/my_hdfs_2/data/loans

Curl

curl -X GET \
  http://localhost:9047/api/v3/catalog/by-path/my_hdfs_2/data/loans \
  -H 'Authorization: _dremioo8opojj6vn4ughkvcpalpr46d6' \
  -H 'Content-Type: application/json'

Python

import requests

# Endpoint for the loans folder in the my_hdfs_2 source
url = "http://localhost:9047/api/v3/catalog/by-path/my_hdfs_2/data/loans"

# The Authorization header carries the Dremio token
headers = {
    "Authorization": "_dremioo8opojj6vn4ughkvcpalpr46d6",
    "Content-Type": "application/json",
}

# A GET request takes no body, so only the headers are sent
response = requests.get(url, headers=headers)

print(response.text)

Response

{
    "entityType": "folder",
    "id": "2ea08d02-13d3-419b-86cc-b39e7a8ee26b",
    "path": [
        "my_hdfs_2",
        "data",
        "loans"
    ],
    "tag": "0",
    "children": [
        {
            "id": "dremio:/my_hdfs_2/data/loans/\"acquisition\"",
            "path": [
                "my_hdfs_2",
                "data",
                "loans",
                "\"acquisition\""
            ],
            "type": "CONTAINER",
            "containerType": "FOLDER"
        },
        {
            "id": "cf771ed4-8ffc-49c6-b75c-b6ce4a518289",
            "path": [
                "my_hdfs_2",
                "data",
                "loans",
                "\"acquisition-mini\""
            ],
            "type": "DATASET",
            "datasetType": "PROMOTED"
        },
        {
            "id": "dremio:/my_hdfs_2/data/loans/\"performance\"",
            "path": [
                "my_hdfs_2",
                "data",
                "loans",
                "\"performance\""
            ],
            "type": "CONTAINER",
            "containerType": "FOLDER"
        }
    ],
    "accessControlList": {
        "version": "0"
    }
}
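
One way to tell unpromoted folders from promoted datasets is to check each child's type, as in the sketch below. It uses a trimmed copy of the children from the response above; the helper name split_children is hypothetical, and stripping the surrounding quotes from the last path element is an assumption made for display only.

```python
# Trimmed copy of the "children" array from the response above.
sample = {
    "entityType": "folder",
    "children": [
        {"path": ["my_hdfs_2", "data", "loans", '"acquisition"'],
         "type": "CONTAINER", "containerType": "FOLDER"},
        {"path": ["my_hdfs_2", "data", "loans", '"acquisition-mini"'],
         "type": "DATASET", "datasetType": "PROMOTED"},
        {"path": ["my_hdfs_2", "data", "loans", '"performance"'],
         "type": "CONTAINER", "containerType": "FOLDER"},
    ],
}


def split_children(entity):
    """Return (folder names, dataset names) for an entity's children."""
    folders, datasets = [], []
    for child in entity.get("children", []):
        # The last path element is the child's name; drop its quotes for display.
        name = child["path"][-1].strip('"')
        if child["type"] == "CONTAINER":
            folders.append(name)
        elif child["type"] == "DATASET":
            datasets.append(name)
    return folders, datasets


folders, datasets = split_children(sample)
print("Folders:", folders)
print("Datasets:", datasets)
```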
