Table
Use the Catalog API to retrieve, create, update, and delete tables.
Table Object
{
"entityType": "dataset",
"id": "c9c11d32-0576-4200-5a5b-8c7229cb3d72",
"type": "PHYSICAL_DATASET",
"path": [
"Samples",
"samples.dremio.com",
"Dremio University",
"restaurant_reviews.parquet"
],
"createdAt": "2024-01-13T19:52:01.894Z",
"tag": "cb2905bb-39c0-497f-ae74-4c310d534f25",
"accelerationRefreshPolicy": {
"activePolicyType": "SCHEDULE",
"refreshPeriodMs": 3600000,
"gracePeriodMs": 10800000,
"refreshSchedule": "0 0 8 * * ?",
"method": "FULL",
"neverExpire": false,
"neverRefresh": false,
"sourceRefreshOnDataChanges": false
},
"isMetadataExpired": false,
"lastMetadataRefreshAt": "2024-01-31T09:50:01.012Z",
"format": {
"type": "Parquet",
"name": "restaurant_reviews.parquet",
"fullPath": [
"Samples",
"samples.dremio.com",
"Dremio University",
"restaurant_reviews.parquet"
],
"ctime": 0,
"isFolder": false,
"location": "/samples.dremio.com/Dremio University/restaurant_reviews.parquet",
"ignoreOtherFileFormats": false,
"autoCorrectCorruptDates": true
},
"accessControlList": {
"users": [
{
"id": "c590ed7f-b2b4-4e1f-ba7d-94173afdc9a3",
"permissions": [
"SELECT",
"ALTER"
]
},
{
"id": "30fca499-4abc-4469-7142-fc8dd29acac8",
"permissions": [
"SELECT",
"ALTER",
"MANAGE_GRANTS"
]
}
],
"roles": [
{
"id": "76a9884b-aea5-46d5-a73a-000edf23f390",
"permissions": [
"SELECT",
"ALTER"
]
}
]
},
"owner": {
"ownerId": "30fca499-4abc-4469-7142-fc8dd29acac8",
"ownerType": "USER"
},
"fields": [
{
"name": "_id",
"type": {
"name": "VARCHAR"
}
},
{
"name": "name",
"type": {
"name": "VARCHAR"
}
},
{
"name": "city",
"type": {
"name": "VARCHAR"
}
},
{
"name": "state",
"type": {
"name": "VARCHAR"
}
},
{
"name": "categories",
"type": {
"name": "LIST",
"subSchema": [
{
"type": {
"name": "VARCHAR"
}
}
]
}
},
{
"name": "review_count",
"type": {
"name": "BIGINT"
}
},
{
"name": "stars",
"type": {
"name": "DOUBLE"
}
},
{
"name": "attributes",
"type": {
"name": "STRUCT",
"subSchema": [
{
"name": "Parking",
"type": {
"name": "STRUCT",
"subSchema": [
{
"name": "garage",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "street",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "lot",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "valet",
"type": {
"name": "BOOLEAN"
}
}
]
}
},
{
"name": "Accepts Credit Cards",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "Wheelchair Accessible",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "Price Range",
"type": {
"name": "BIGINT"
}
}
]
}
},
{
"name": "date",
"type": {
"name": "VARCHAR"
}
}
],
"approximateStatisticsAllowed": false
}
Table Attributes
entityType String
Type of catalog entity. For tables, the entityType is dataset.
Example: dataset
id String (UUID)
UUID of the table.
type String
Type of dataset. For tables, the type is PHYSICAL_DATASET.
path Array of String
Path to the table within Dremio, expressed as an array. The path lists each level of hierarchy in order, from outer to inner: source first, then folders and subfolders, then the table itself as the last item in the array.
Example:
[
"Samples",
"samples.dremio.com",
"Dremio University",
"restaurant_reviews.parquet"
]
createdAt String
Timestamp when the table was created.
tag String (UUID)
UUID of the version of the table. Dremio changes the tag whenever the table changes and uses the tag value to ensure that PUT requests apply to the most recent version of the table.
accelerationRefreshPolicy Object
Attributes that define the acceleration refresh policy for the table.
isMetadataExpired Boolean
- If true, the metadata of the table needs to be refreshed. To refresh it, run the ALTER TABLE command using the REFRESH METADATA clause. See ALTER TABLE.
- If false, the metadata can still be used for planning queries against the table.
- If NULL, metadata has never been collected for the table.
lastMetadataRefreshAt String
Timestamp when the table metadata was last refreshed. If NULL, the metadata has never been refreshed.
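The timestamps in the table object end in Z, which marks them as UTC. As a minimal sketch, assuming every timestamp carries a fractional-seconds part like the examples on this page, the age of the table metadata could be computed as follows. The helper names parse_timestamp and metadata_age are hypothetical and not part of the API.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as 2024-01-31T09:50:01.012Z into an aware UTC datetime.

    Assumes the fractional-seconds part is always present, as in the examples.
    """
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def metadata_age(table: dict, now: datetime) -> Optional[timedelta]:
    """Return the time since the last metadata refresh, or None if never refreshed."""
    last_refresh = table.get("lastMetadataRefreshAt")
    if last_refresh is None:
        return None
    return now - parse_timestamp(last_refresh)
```

Passing now explicitly, for example datetime.now(timezone.utc), keeps the helper easy to check against a fixed clock.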
format Object
Table format attributes.
accessControlList Object
Information about users and roles with privileges on the table and the specific privileges each user or role has. May include a users array, a roles array, or both, depending on the configured access and privileges. The accessControlList object is empty if table-specific privileges are not set. See Privileges for more information.
owner Object
Information about the table's owner.
fields Array of Object
Attributes that represent the table schema.
approximateStatisticsAllowed Boolean
If true, COUNT DISTINCT queries run on the table will return approximate results. Otherwise, false.
Attributes of the accelerationRefreshPolicy Object
activePolicyType String
Option to set the policy for refreshing Reflections that are defined on the source. For this option to take effect, the neverRefresh parameter must be set to false.
The possible values are:
- NEVER: The Reflections are never refreshed.
- PERIOD: Default. The Reflections are refreshed at the end of every period that is defined by refreshPeriodMs.
- SCHEDULE: The Reflections are refreshed according to the schedule that is set by refreshSchedule.
- REFRESH_ON_DATA_CHANGES: Reflections automatically refresh for underlying tables that are in Iceberg format when new snapshots are created after an update. If the Reflection refresh job finds no changes, then no data is updated. Reflections that are automatically updated based on Iceberg source table changes also update according to the source-level policy as the minimum refresh frequency.
refreshPeriodMs Integer
Refresh period for the data in all Reflections for the table, in milliseconds.
Example: 3600000
refreshSchedule String
A cron expression that sets the schedule, in UTC time, according to which the Reflections that are defined on the source are refreshed.
Cron expressions consist of six fields that specify when a job should run:
Field:   Second  Minute  Hour  Day-of-Month  Month  Day-of-Week
Values:  0       0-59    0-23  1-31          1-12   1-7 or SUN-SAT
The Second field is always set to 0 and cannot be modified.
Cron expressions support several special characters:
- * (asterisk): Matches all values in that field.
  - Example: * in the Month field means "every month"
- ? (question mark): Wildcard for day fields (use in either Day-of-Month OR Day-of-Week, but not both).
  - Equivalent to "any day"
- , (comma): Specifies multiple values.
  - Example: MON,WED,FRI means "Monday, Wednesday, and Friday"
- - (hyphen): Defines a range of values.
  - Example: 2-6 in Day-of-Week means "Monday through Friday"
Examples:
- 0 0 0 * * ?: Every day at midnight UTC
- 0 45 15 * * 1,4,7: Sundays, Wednesdays, and Saturdays at 3:45 PM UTC
- 0 15 7 ? * 2-6: Monday through Friday at 7:15 AM UTC
- 0 0 */6 * * ?: Every 6 hours
- 0 30 9 1 * ?: First day of every month at 9:30 AM UTC
Tips:
- Remember that schedules run in UTC, so convert your local time accordingly
- Use ? in Day-of-Month when specifying Day-of-Week, and vice versa
- Day-of-Week uses 1=Sunday through 7=Saturday (or SUN-SAT)
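The field layout described above can be checked before a schedule is sent. The following is a minimal sketch in Python, not Dremio's own validation: it checks only the field count, the fixed Second field, and the rule that ? appears in at most one of the two day fields. The function name split_cron and the dictionary keys are hypothetical.

```python
FIELD_NAMES = ["second", "minute", "hour", "day_of_month", "month", "day_of_week"]


def split_cron(expression: str) -> dict:
    """Split a six-field cron expression into named fields with basic checks."""
    fields = expression.split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    parsed = dict(zip(FIELD_NAMES, fields))
    # The Second field is always 0 and cannot be modified.
    if parsed["second"] != "0":
        raise ValueError("the Second field must be 0")
    # ? may be used in Day-of-Month or Day-of-Week, but not both.
    if parsed["day_of_month"] == "?" and parsed["day_of_week"] == "?":
        raise ValueError("use ? in Day-of-Month or Day-of-Week, not both")
    return parsed
```

For example, split_cron("0 15 7 ? * 2-6") returns a dictionary whose hour entry is "7" and whose day_of_week entry is "2-6". The sketch does not check the value ranges of the individual fields.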
gracePeriodMs Integer
Maximum age allowed for Reflection data used to accelerate queries, in milliseconds.
Example: 10800000
method String
Method used for refreshing the data in Reflections.
- AUTO: For tables that are in the Apache Iceberg format; Parquet datasets in filesystems; or Parquet datasets, Avro datasets, or non-transactional ORC datasets in AWS Glue. In this case, the method used depends on this algorithm:
  - The initial refresh of a Reflection is always a full refresh.
  - If the Reflection is created from a view that uses nested group-bys, joins, unions, or window functions, then a full refresh is performed.
  - If the changes to the base table are only appends, then an incremental refresh based on table snapshots is performed.
  - If the changes to the base table include non-append operations, then a partition-based incremental refresh is attempted.
  - If the partitions of the base table and the partitions of the Reflection are not compatible, or if either the base table or the Reflection is not partitioned, then a full refresh is performed.
- FULL: A full refresh is performed at each refresh interval.
- INCREMENTAL: An incremental refresh is performed at each refresh interval.
See Refreshing Reflections for more information.
Example: FULL
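The AUTO rules above read as an ordered decision list. The sketch below is one reading of those rules in Python, intended only to make the order of the checks explicit; it is not Dremio's implementation. The RefreshContext fields and the returned labels are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class RefreshContext:
    is_initial_refresh: bool
    # True if the view uses nested group-bys, joins, unions, or window functions.
    view_uses_complex_operators: bool
    changes_are_append_only: bool
    base_table_partitioned: bool
    reflection_partitioned: bool
    partitions_compatible: bool


def auto_refresh_method(ctx: RefreshContext) -> str:
    """Return which kind of refresh the AUTO rules would select, checked in order."""
    if ctx.is_initial_refresh:
        return "FULL"
    if ctx.view_uses_complex_operators:
        return "FULL"
    if ctx.changes_are_append_only:
        return "INCREMENTAL_SNAPSHOT"
    # Non-append changes: a partition-based refresh is attempted, falling back
    # to a full refresh when partitioning is missing or incompatible.
    partitioned = ctx.base_table_partitioned and ctx.reflection_partitioned
    if not (partitioned and ctx.partitions_compatible):
        return "FULL"
    return "INCREMENTAL_PARTITION"
```

For instance, a non-initial refresh on a simple view with append-only changes yields INCREMENTAL_SNAPSHOT, while non-append changes on an unpartitioned base table yield FULL.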
refreshField String
For the INCREMENTAL refresh method, the field to refresh for the table. Used only if method is INCREMENTAL. This parameter applies only to tables that are not in the Apache Iceberg format.
Example: business_id
neverExpire Boolean
If the Reflection will never expire, true. Otherwise, false.
neverRefresh Boolean
If the Reflection will never refresh, true. Otherwise, false.
sourceRefreshOnDataChanges Boolean
If the table's source is configured so that Reflections on tables in Iceberg format in the source will refresh when new snapshots are created after an update, true. Otherwise, false.
Attributes of the format Object
type String
Type of data in the table.
Valid Values: Delta, Excel, Iceberg, JSON, Parquet, Text, Unknown, XLS
name String
Table name. Dremio automatically duplicates the name of the origin file or folder to populate this value. The name of the origin file or folder cannot include the following special characters: /, :, [, or ].
Example: restaurant_reviews.parquet
fullPath Array of String
Path to the table within Dremio, expressed as an array. The path lists each level of hierarchy in order, from outer to inner: source first, then folders and subfolders, then the table itself as the last item in the array.
Example:
[
"Samples",
"samples.dremio.com",
"Dremio University",
"restaurant_reviews.parquet"
]
ctime Integer
Not used. Has the value 0.
isFolder Boolean
If true, the table was created from a folder. If false, the table was created from a file.
location String
Location where the table's metadata is stored within a Dremio source, expressed as a string.
Example: /samples.dremio.com/Dremio University/restaurant_reviews.parquet
ignoreOtherFileFormats Boolean
If true, Dremio ignores all non-Parquet files in the related folder structure, and the promoted table works as if only Parquet files are in the folder structure. Otherwise, false. Included only for Parquet folders.
metaStoreType String
Not used. Has the value HDFS.
parquetDataFormat Object
Information about data format for Parquet tables.
dataFormatTypeList Array of String
List of data format types in the table. Included only for Iceberg tables, and PARQUET is the only valid value.
Example:
[
"PARQUET"
]
sheetName String
For tables created from files with multiple sheets, name of the sheet used to create the table.
Example: location_1
extractHeader Boolean
For tables created from files, true if Dremio extracted the table's column names from the first line of the file. Otherwise, false.
hasMergedCells Boolean
For tables created from files, true if Dremio expanded merged cells in the file when creating the table. Otherwise, false.
fieldDelimiter String
Character used to indicate separate fields in the table. May be , for a comma (default), \t for a tab, | for a pipe, or a custom character.
quote String
Character used for quotes in the table. May be \" for a double quote (default), ' for a single quote, or a custom character.
comment String
Character used to indicate comments in the table. May be # for a number sign (default) or a custom character.
escape String
Character used to indicate an escape in the table. May be \" for a double quote (default), ` for a back quote, \\ for a backward slash, or a custom character.
lineDelimiter String
Character used to indicate separate lines in the table. May be \r\n for a carriage return plus new line (default), \n for a new line, or a custom character.
skipFirstLine Boolean
If Dremio skipped the first line in the file or folder when creating the table, true. Otherwise, false.
autoGenerateColumnNames Boolean
If Dremio used the existing column names in the file or folder for the table columns, true. Otherwise, false.
trimHeader Boolean
If Dremio trimmed column names to a specific number of characters when creating the table, true. Otherwise, false.
autoCorrectCorruptDates Boolean
If Dremio automatically corrects corrupted date fields in the table, true. Otherwise, false.
Attributes of the accessControlList Object
users Array of Object
Information about the users that have been granted privileges on the table and the privileges each user has.
Example:
[
{
"id": "c590ed7f-b2b4-4e1f-ba7d-94173afdc9a3",
"permissions": [
"SELECT",
"ALTER"
]
},
{
"id": "30fca499-4abc-4469-7142-fc8dd29acac8",
"permissions": [
"SELECT",
"ALTER",
"MANAGE_GRANTS"
]
}
]
roles Array of Object
Information about the roles that have been granted privileges on the table and the privileges each role has.
Example:
[
{
"id": "76a9884b-aea5-46d5-a73a-000edf23f390",
"permissions": [
"SELECT",
"ALTER"
]
}
]
Attributes of Objects in the users and roles Arrays
id String (UUID)
UUID of the role or user.
permissions Array of String
The privileges that the role or user has on the table. See Table Scope in Open Catalog Privileges or Source Privileges.
Example:
[
"SELECT",
"ALTER"
]
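Because accessControlList may contain a users array, a roles array, both, or neither, client code that reads it should tolerate missing keys. A minimal sketch, assuming the object has already been parsed from JSON into a Python dictionary; the function name grantees_with is hypothetical:

```python
def grantees_with(acl: dict, privilege: str) -> dict:
    """Return the IDs of users and roles whose permissions include the privilege."""
    result = {}
    for kind in ("users", "roles"):
        # Either array may be absent, and the whole object may be empty.
        result[kind] = [
            entry["id"]
            for entry in acl.get(kind, [])
            if privilege in entry.get("permissions", [])
        ]
    return result
```

Applied to the accessControlList in the Table Object example with the privilege MANAGE_GRANTS, this returns one user ID, 30fca499-4abc-4469-7142-fc8dd29acac8, and an empty list of roles.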
Attributes of the owner Object
ownerId String (UUID)
UUID of the owner.
ownerType String
Type of owner for the table. Must be USER or ROLE.
Example: USER
Attributes of Objects in the fields Array
name String
Name of the table field.
Example: review_count
type Object
Information about the table field.
Attributes of the type Object
name String
Name of the table field's type.
Valid Values: STRUCT, LIST, UNION, INTEGER, BIGINT, FLOAT, DOUBLE, VARCHAR, VARBINARY, BOOLEAN, DECIMAL, TIME, DATE, TIMESTAMP, INTERVAL DAY TO SECOND, INTERVAL YEAR TO MONTH
precision Integer
Total number of digits in the number. Included only for the DECIMAL type.
scale Integer
Number of digits to the right of the decimal point. Included only for the DECIMAL type.
subSchema Array of Object
List of objects that represent the field's composition. For example, a field composed of data about a restaurant might have a subSchema with an object for parking options, another for payment methods, and so on. subSchemas may be nested within other subSchemas. subSchema appears only for the STRUCT, LIST, and UNION types.
Attributes of Objects in the subSchema Array
name String
Name for the subSchema object.
Example: Parking
type Object
Object that contains a name attribute that provides the field's type.
Example:
{
"name": "BOOLEAN"
}
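Since subSchemas can nest inside other subSchemas, reading a table schema is naturally recursive. The following Python sketch flattens a fields array into dotted names. It assumes the array has been parsed from JSON; the name flatten_fields and the "[]" placeholder for unnamed LIST elements are choices made for this example, not part of the API.

```python
def flatten_fields(fields, prefix=""):
    """Yield (dotted_name, type_name) pairs for every field, depth first."""
    for field in fields:
        # Entries in a LIST subSchema may have no name; "[]" is a placeholder.
        name = field.get("name", "[]")
        full_name = f"{prefix}.{name}" if prefix else name
        type_info = field["type"]
        yield full_name, type_info["name"]
        # subSchema appears only for the STRUCT, LIST, and UNION types.
        yield from flatten_fields(type_info.get("subSchema", []), full_name)
```

On the Table Object example, the garage field nested under Parking would be reported as attributes.Parking.garage with type BOOLEAN.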
Attributes of the parquetDataFormat Object
type String
Type of data in the table. Within the parquetDataFormat object, the only valid type is Parquet.
ctime Integer
Not used. Has the value 0.
isFolder Boolean
If true, the table was created from a folder. If false, the table was created from a file.
autoCorrectCorruptDates Boolean
If true, Dremio automatically corrects corrupted date fields in the table. Otherwise, false.
Format a File or Folder as a Table
Method and URL
POST /v0/projects/{project_id}/catalog/{id}
Parameters
project_id Path String (UUID)
id Path String
ID of the file or folder you want to format. The ID can be either a unique identifier or a text path.
- Unique identifier: Provide the unique identifier of the file or folder.
- Text path: Use URL encoding to replace special characters with their UTF-8-equivalent characters: %3A for a colon, %2F for a forward slash, and %20 for a space. For example, if the ID value is dremio:/Samples/samples.dremio.com/Dremio University, the URI-encoded ID is dremio%3A%2FSamples%2Fsamples.dremio.com%2FDremio%20University.
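The text-path encoding described above can be produced with a standard percent-encoding routine. A minimal sketch using Python's urllib.parse.quote; the function name encode_catalog_id is hypothetical. Note that quote with safe="" also encodes other reserved characters beyond the three listed above.

```python
from urllib.parse import quote


def encode_catalog_id(text_path: str) -> str:
    """Percent-encode a text path for use as the {id} path parameter."""
    # safe="" makes quote encode forward slashes as well as colons and spaces.
    return quote(text_path, safe="")
```

For the example above, encode_catalog_id("dremio:/Samples/samples.dremio.com/Dremio University") gives dremio%3A%2FSamples%2Fsamples.dremio.com%2FDremio%20University.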
entityType Body String
Type of catalog entity. To format a file or folder as a table, the entityType is dataset.
path Body Array of String
Path to the file or folder you want to format, expressed as an array. List each level of hierarchy in order, from outer to inner: source first, then any folders and subfolders, then the file or folder itself as the last item in the array. Get the path from the file or folder's children object in the response to a Folder request.
Example:
[
"Samples",
"samples.dremio.com",
"Dremio University",
"restaurant_reviews.parquet"
]
type Body String
Type of dataset. For tables, the type is PHYSICAL_DATASET.
accelerationRefreshPolicy Body Object
Attributes that define the acceleration refresh policy for the table.
format Body Object
Formatting parameters for the file or folder.
Parameters of the accelerationRefreshPolicy Object
activePolicyType Body String
Policy to use for refreshing Reflections that are defined on the source. For this option to take effect, the neverRefresh parameter must be set to false.
The possible values are:
- NEVER: The Reflections are never refreshed.
- PERIOD: Default. The Reflections are refreshed at the end of every period that is defined by refreshPeriodMs.
- SCHEDULE: The Reflections are refreshed according to the schedule that is set by refreshSchedule.
- REFRESH_ON_DATA_CHANGES: Reflections automatically refresh for underlying tables that are in Iceberg format when new snapshots are created after an update. If the Reflection refresh job finds no changes, then no data is updated. Reflections that are automatically updated based on Iceberg source table changes also update according to the source-level policy as the minimum refresh frequency. Only available for tables in Iceberg format.
refreshPeriodMs Body Integer
Refresh period to use for the data in all Reflections for the table, in milliseconds. Optional if you set activePolicyType to PERIOD. The default setting is 3600000 milliseconds or one hour, which is also the minimum amount of time that is supported.
Example: 3600000
refreshSchedule Body String
A cron expression that sets the schedule, in UTC time, according to which the Reflections that are defined on the source should be refreshed. Optional if you set activePolicyType to SCHEDULE. The default refreshSchedule setting is to refresh every day at 8:00 AM.
Cron expressions consist of six fields that specify when a job should run:
Field:   Second  Minute  Hour  Day-of-Month  Month  Day-of-Week
Values:  0       0-59    0-23  1-31          1-12   1-7 or SUN-SAT
The Second field is always set to 0 and cannot be modified.
Cron expressions support several special characters:
- * (asterisk): Matches all values in that field.
  - Example: * in the Month field means "every month"
- ? (question mark): Wildcard for day fields (use in either Day-of-Month OR Day-of-Week, but not both).
  - Equivalent to "any day"
- , (comma): Specifies multiple values.
  - Example: MON,WED,FRI means "Monday, Wednesday, and Friday"
- - (hyphen): Defines a range of values.
  - Example: 2-6 in Day-of-Week means "Monday through Friday"
Examples:
- 0 0 0 * * ?: Every day at midnight UTC
- 0 45 15 * * 1,4,7: Sundays, Wednesdays, and Saturdays at 3:45 PM UTC
- 0 15 7 ? * 2-6: Monday through Friday at 7:15 AM UTC
- 0 0 */6 * * ?: Every 6 hours
- 0 30 9 1 * ?: First day of every month at 9:30 AM UTC
Tips:
- Remember that schedules run in UTC, so convert your local time accordingly
- Use ? in Day-of-Month when specifying Day-of-Week, and vice versa
- Day-of-Week uses 1=Sunday through 7=Saturday (or SUN-SAT)
gracePeriodMs Body Integer
Maximum age to allow for Reflection data used to accelerate queries, in milliseconds.
Example: 10800000
method Body String
Method to use for refreshing the data in Reflections.
- AUTO: For tables that are in the Apache Iceberg format; Parquet datasets in filesystems; or Parquet datasets, Avro datasets, or non-transactional ORC datasets in AWS Glue. In this case, the method used depends on this algorithm:
  - The initial refresh of a Reflection is always a full refresh.
  - If the Reflection is created from a view that uses nested group-bys, joins, unions, or window functions, then a full refresh is performed.
  - If the changes to the base table are only appends, then an incremental refresh based on table snapshots is performed.
  - If the changes to the base table include non-append operations, then a partition-based incremental refresh is attempted.
  - If the partitions of the base table and the partitions of the Reflection are not compatible, or if either the base table or the Reflection is not partitioned, then a full refresh is performed.
- FULL: A full refresh is performed at each refresh interval.
- INCREMENTAL: An incremental refresh is performed at each refresh interval.
See Refreshing Reflections for more information.
Example: FULL
refreshField Body String
For the INCREMENTAL refresh method, the field to refresh for the table. Used only if the method is INCREMENTAL. This parameter applies only to tables that are not in the Apache Iceberg format.
Example: business_id
neverExpire Body Boolean
If the Reflection should never expire, true. Otherwise, false.
neverRefresh Body Boolean
If the Reflection should never refresh, true. Otherwise, false.
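Several of the accelerationRefreshPolicy parameters constrain one another. As a rough sketch, the constraints stated above could be checked client-side before sending a request. The function check_refresh_policy is hypothetical, and it covers only the rules written on this page, not everything the server may enforce.

```python
MIN_REFRESH_PERIOD_MS = 3_600_000  # one hour: the documented default and minimum
POLICY_TYPES = {"NEVER", "PERIOD", "SCHEDULE", "REFRESH_ON_DATA_CHANGES"}


def check_refresh_policy(policy: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    policy_type = policy.get("activePolicyType", "PERIOD")
    if policy_type not in POLICY_TYPES:
        problems.append(f"unknown activePolicyType: {policy_type}")
    # activePolicyType takes effect only when neverRefresh is false.
    if policy.get("neverRefresh", False) and "activePolicyType" in policy:
        problems.append("activePolicyType has no effect while neverRefresh is true")
    if policy.get("refreshPeriodMs", MIN_REFRESH_PERIOD_MS) < MIN_REFRESH_PERIOD_MS:
        problems.append("refreshPeriodMs is below the one-hour minimum")
    # refreshField is used only with the INCREMENTAL method.
    if "refreshField" in policy and policy.get("method") != "INCREMENTAL":
        problems.append("refreshField is used only when method is INCREMENTAL")
    return problems
```

A policy with activePolicyType PERIOD, refreshPeriodMs 3600000, method AUTO, and neverRefresh false produces an empty list.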
Parameters of the format Object
type Body String
Type of data in the file or folder. To format a folder, all files in the folder must be the same format.
Valid Values: Delta, Excel, Iceberg, JSON, Parquet, Text, Unknown, XLS
ignoreOtherFileFormats Body Boolean Optional
If Dremio should ignore all non-Parquet files in the related folder structure so that the promoted table works as if only Parquet files are in the folder structure, set to true. Otherwise, set to false (default). Optional for Parquet folders.
skipFirstLine Body Boolean Optional
If Dremio should skip the first line in the file or folder when creating the table, true. Otherwise, false (default). Optional for Excel and Text types.
extractHeader Body Boolean Optional
If Dremio should extract the table's column names from the first line of the file, true. Otherwise, false (default). Optional for Excel and Text types.
hasMergedCells Body Boolean Optional
If Dremio should expand merged cells in the file when creating the table, true. Otherwise, false (default). Optional for Excel types.
sheetName Body String Optional
For tables created from Excel files with multiple sheets, name of the sheet to use to create the table. Default is the first sheet in the file (for files with multiple sheets).
Example: location_1
fieldDelimiter Body String Optional
Character to use to indicate separate fields in the table. May be , for a comma (default), \t for a tab, | for a pipe, or a custom character. Optional for Text type.
quote Body String Optional
Character to use for quotes in the table. May be \" for a double quote (default), ' for a single quote, or a custom character. Optional for Text type.
comment Body String Optional
Character to use to indicate comments in the table. May be # for a number sign (default) or a custom character. Optional for Text type.
escape Body String Optional
Character to use to indicate an escape in the table. May be \" for a double quote (default), ` for a back quote, \\ for a backward slash, or a custom character. Optional for Text type.
lineDelimiter Body String Optional
Character to use to indicate separate lines in the table. May be \r\n for a carriage return plus new line (default), \n for a new line, or a custom character. Optional for Text type.
autoGenerateColumnNames Body Boolean Optional
If Dremio should use the existing column names in the file or folder for the table columns, true (default). Otherwise, false. Optional for Text type.
trimHeader Body Boolean Optional
If Dremio should trim column names to a specific number of characters when creating the table, true. Otherwise, false (default). Optional for Text type.
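Each optional format parameter applies only to certain types. The sketch below assembles a request body in Python and rejects options that the descriptions above do not list for the chosen type. The function build_format_body and the FORMAT_OPTIONS mapping are hypothetical and reflect one reading of this page; types without listed options (Delta, Iceberg, JSON, Unknown, XLS) are treated here as accepting none, which is an assumption.

```python
FORMAT_OPTIONS = {
    "Parquet": {"ignoreOtherFileFormats"},
    "Excel": {"skipFirstLine", "extractHeader", "hasMergedCells", "sheetName"},
    "Text": {
        "skipFirstLine", "extractHeader", "fieldDelimiter", "quote", "comment",
        "escape", "lineDelimiter", "autoGenerateColumnNames", "trimHeader",
    },
}


def build_format_body(path, format_type, **options):
    """Build the JSON body for formatting a file or folder as a table."""
    allowed = FORMAT_OPTIONS.get(format_type, set())
    unexpected = sorted(set(options) - allowed)
    if unexpected:
        raise ValueError(f"options not listed for {format_type}: {unexpected}")
    return {
        "entityType": "dataset",
        "path": list(path),
        "type": "PHYSICAL_DATASET",
        "format": {"type": format_type, **options},
    }
```

The returned dictionary can be serialized with json.dumps and sent as the body of the POST request.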
Example of Parquet Format
Request
curl -X POST "https://api.dremio.cloud/v0/projects/$PROJECT_ID/catalog/$FOLDER_ID" \
-H "Authorization: Bearer $DREMIO_TOKEN" \
-H 'Content-Type: application/json' \
--data-raw '{
"entityType": "dataset",
"path": [
"Samples",
"samples.dremio.com",
"Dremio University",
"restaurant_reviews.parquet"
],
"type": "PHYSICAL_DATASET",
"accelerationRefreshPolicy": {
"activePolicyType": "PERIOD",
"refreshPeriodMs": 3600000,
"refreshSchedule": "0 56 18 * * *",
"gracePeriodMs": 259200000,
"method": "AUTO",
"neverExpire": true,
"neverRefresh": false,
"sourceRefreshOnDataChanges": false
},
"format": {
"type": "Parquet"
},
"accessControlList": {
"users": [
{
"id": "c590ed7f-b2b4-4e1f-ba7d-94173afdc9a3",
"permissions": [
"SELECT",
"ALTER"
]
},
{
"id": "30fca499-4abc-4469-7142-fc8dd29acac8",
"permissions": [
"SELECT",
"ALTER",
"MANAGE_GRANTS"
]
}
],
"roles": [
{
"id": "76a9884b-aea5-46d5-a73a-000edf23f390",
"permissions": [
"SELECT",
"ALTER"
]
}
]
}
}'
Response
{
"entityType": "dataset",
"id": "c9c11d32-0576-4200-5a5b-8c7229cb3d72",
"type": "PHYSICAL_DATASET",
"path": [
"Samples",
"samples.dremio.com",
"Dremio University",
"restaurant_reviews.parquet"
],
"createdAt": "2024-01-13T19:52:01.894Z",
"tag": "cb2905bb-39c0-497f-ae74-4c310d534f25",
"isMetadataExpired": false,
"lastMetadataRefreshAt": "2024-01-31T09:50:01.012Z",
"accelerationRefreshPolicy": {
"activePolicyType": "PERIOD",
"refreshPeriodMs": 3600000,
"refreshSchedule": "0 56 18 * * *",
"gracePeriodMs": 259200000,
"method": "AUTO",
"neverExpire": true,
"neverRefresh": false,
"sourceRefreshOnDataChanges": false
},
"format": {
"type": "Parquet",
"name": "restaurant_reviews.parquet",
"fullPath": [
"Samples",
"samples.dremio.com",
"Dremio University",
"restaurant_reviews.parquet"
],
"ctime": 0,
"isFolder": false,
"location": "/samples.dremio.com/Dremio University/restaurant_reviews.parquet",
"ignoreOtherFileFormats": false,
"autoCorrectCorruptDates": true
},
"accessControlList": {
"users": [
{
"id": "c590ed7f-b2b4-4e1f-ba7d-94173afdc9a3",
"permissions": [
"SELECT",
"ALTER"
]
},
{
"id": "30fca499-4abc-4469-7142-fc8dd29acac8",
"permissions": [
"SELECT",
"ALTER",
"MANAGE_GRANTS"
]
}
],
"roles": [
{
"id": "76a9884b-aea5-46d5-a73a-000edf23f390",
"permissions": [
"SELECT",
"ALTER"
]
}
]
},
"owner": {
"ownerId": "30fca499-4abc-4469-7142-fc8dd29acac8",
"ownerType": "USER"
},
"fields": [
{
"name": "_id",
"type": {
"name": "VARCHAR"
}
},
{
"name": "name",
"type": {
"name": "VARCHAR"
}
},
{
"name": "city",
"type": {
"name": "VARCHAR"
}
},
{
"name": "state",
"type": {
"name": "VARCHAR"
}
},
{
"name": "categories",
"type": {
"name": "LIST",
"subSchema": [
{
"type": {
"name": "VARCHAR"
}
}
]
}
},
{
"name": "review_count",
"type": {
"name": "BIGINT"
}
},
{
"name": "stars",
"type": {
"name": "DOUBLE"
}
},
{
"name": "attributes",
"type": {
"name": "STRUCT",
"subSchema": [
{
"name": "Parking",
"type": {
"name": "STRUCT",
"subSchema": [
{
"name": "garage",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "street",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "lot",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "valet",
"type": {
"name": "BOOLEAN"
}
}
]
}
},
{
"name": "Accepts Credit Cards",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "Wheelchair Accessible",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "Price Range",
"type": {
"name": "BIGINT"
}
}
]
}
},
{
"name": "date",
"type": {
"name": "VARCHAR"
}
}
],
"approximateStatisticsAllowed": false
}
Example of Excel Format
Request
curl -X POST "https://api.dremio.cloud/v0/projects/$PROJECT_ID/catalog/dremio%3A%2FSamples%2Fsamples.dremio.com%2FDremio%20University%2Foracle-departments.xlsx" \
-H "Authorization: Bearer $DREMIO_TOKEN" \
-H 'Content-Type: application/json' \
--data-raw '{
"entityType": "dataset",
"path": [
"Samples",
"samples.dremio.com",
"Dremio University",
"oracle-departments.xlsx"
],
"type": "PHYSICAL_DATASET",
"format": {
"type": "Excel",
"extractHeader": true,
"hasMergedCells": true,
"sheetName": "Sheet1"
}
}'
Example of Text Format
Request
curl -X POST "https://api.dremio.cloud/v0/projects/$PROJECT_ID/catalog/$FILE_ID" \
-H "Authorization: Bearer $DREMIO_TOKEN" \
-H 'Content-Type: application/json' \
--data-raw '{
"entityType": "dataset",
"path": [
"Samples",
"samples.dremio.com",
"Dremio University",
"airbnb_listings.csv"
],
"type": "PHYSICAL_DATASET",
"format": {
"type": "Text",
"fieldDelimiter": ",",
"skipFirstLine": false,
"extractHeader": true,
"quote": "\"",
"comment": "#",
"escape": "\"",
"lineDelimiter": "\r\n",
"autoGenerateColumnNames": true,
"trimHeader": false
}
}'
Response Status Codes
200 OK
400 Bad Request
401 Unauthorized
403 Forbidden
404 Not Found
500 Internal Server Error
Retrieve a Table by ID
Method and URL
GET /v0/projects/{project_id}/catalog/{id}
Parameters
project_id Path String (UUID)
id Path String
May be a Dremio unique identifier or a text path. If the ID is a text path, use URL-encoded format to replace special characters with their UTF-8-equivalent characters: %3A for a colon, %2F for a forward slash, and %20 for a space. For example, if the ID value is dremio:/"My Source"/folder1/, the encoded ID is dremio%3A%2FMy%20Source%2Ffolder1.
Example: dremio%3A%2FMy%20Source%2Ffolder1
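As a rough Python counterpart to a curl call, the retrieval request can be prepared with the standard library. This sketch only builds the request object and does not send it. The function name table_request is hypothetical, the base URL is taken from the curl examples on this page, and table_id is assumed to be already URL-encoded if it is a text path.

```python
import urllib.request

BASE_URL = "https://api.dremio.cloud/v0"


def table_request(project_id: str, table_id: str, token: str) -> urllib.request.Request:
    """Prepare a GET request for a table by ID, without sending it."""
    url = f"{BASE_URL}/projects/{project_id}/catalog/{table_id}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

Passing the result to urllib.request.urlopen sends the request; urlopen raises urllib.error.HTTPError for 4xx and 5xx responses, so the status codes listed for this endpoint surface as exceptions.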
Example
Request
curl -X GET "https://api.dremio.cloud/v0/projects/$PROJECT_ID/catalog/$TABLE_ID" \
-H "Authorization: Bearer $DREMIO_TOKEN" \
-H 'Content-Type: application/json'
Response
{
"entityType": "dataset",
"id": "c9c11d32-0576-4200-5a5b-8c7229cb3d72",
"type": "PHYSICAL_DATASET",
"path": [
"Samples",
"samples.dremio.com",
"Dremio University",
"restaurant_reviews.parquet"
],
"createdAt": "2024-01-13T19:52:01.894Z",
"tag": "cb2905bb-39c0-497f-ae74-4c310d534f25",
"isMetadataExpired": false,
"lastMetadataRefreshAt": "2024-01-31T09:50:01.012Z",
"accelerationRefreshPolicy": {
"activePolicyType": "PERIOD",
"refreshPeriodMs": 3600000,
"refreshSchedule": "0 56 18 * * *",
"gracePeriodMs": 259200000,
"method": "AUTO",
"neverExpire": true,
"neverRefresh": false,
"sourceRefreshOnDataChanges": false
},
"format": {
"type": "Parquet",
"name": "restaurant_reviews.parquet",
"fullPath": [
"Samples",
"samples.dremio.com",
"Dremio University",
"restaurant_reviews.parquet"
],
"ctime": 0,
"isFolder": false,
"location": "/samples.dremio.com/Dremio University/restaurant_reviews.parquet",
"ignoreOtherFileFormats": false,
"autoCorrectCorruptDates": true
},
"accessControlList": {
"users": [
{
"id": "c590ed7f-b2b4-4e1f-ba7d-94173afdc9a3",
"permissions": [
"SELECT",
"ALTER"
]
},
{
"id": "30fca499-4abc-4469-7142-fc8dd29acac8",
"permissions": [
"SELECT",
"ALTER",
"MANAGE_GRANTS"
]
}
],
"roles": [
{
"id": "76a9884b-aea5-46d5-a73a-000edf23f390",
"permissions": [
"SELECT",
"ALTER"
]
}
]
},
"owner": {
"ownerId": "30fca499-4abc-4469-7142-fc8dd29acac8",
"ownerType": "USER"
},
"fields": [
{
"name": "_id",
"type": {
"name": "VARCHAR"
}
},
{
"name": "name",
"type": {
"name": "VARCHAR"
}
},
{
"name": "city",
"type": {
"name": "VARCHAR"
}
},
{
"name": "state",
"type": {
"name": "VARCHAR"
}
},
{
"name": "categories",
"type": {
"name": "LIST",
"subSchema": [
{
"type": {
"name": "VARCHAR"
}
}
]
}
},
{
"name": "review_count",
"type": {
"name": "BIGINT"
}
},
{
"name": "stars",
"type": {
"name": "DOUBLE"
}
},
{
"name": "attributes",
"type": {
"name": "STRUCT",
"subSchema": [
{
"name": "Parking",
"type": {
"name": "STRUCT",
"subSchema": [
{
"name": "garage",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "street",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "lot",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "valet",
"type": {
"name": "BOOLEAN"
}
}
]
}
},
{
"name": "Accepts Credit Cards",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "Wheelchair Accessible",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "Price Range",
"type": {
"name": "BIGINT"
}
}
]
}
},
{
"name": "date",
"type": {
"name": "VARCHAR"
}
}
],
"approximateStatisticsAllowed": false
}
Response Status Codes
200 OK
400 Bad Request
401 Unauthorized
403 Forbidden
404 Not Found
Retrieve a Table by Path
Method and URL
GET /v0/projects/{project_id}/catalog/by-path/{path}
Parameters
project_id Path String (UUID)
path Path String
Table location within Dremio, using forward slashes as separators. For example, for a "NYC-taxi-trips" table in the "samples.dremio.com" folder within the source "Samples," the path is Samples/samples.dremio.com/NYC-taxi-trips. If the name of any component in the path includes special characters for URLs, such as spaces, use URL encoding to replace the special characters with their UTF-8-equivalent characters. For example, "Dremio University" should be Dremio%20University in the URL path.
Example: Samples/samples.dremio.com/Dremio%20University/restaurant_reviews.parquet
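A by-path URL can be derived from a table's path array by encoding each component separately and joining the results with forward slashes. A minimal Python sketch; the function name by_path_url is hypothetical and the base URL is taken from the curl examples on this page. How a component that itself contains a forward slash should be encoded is not stated here, so this sketch percent-encodes it as an assumption.

```python
from urllib.parse import quote


def by_path_url(project_id: str, path: list) -> str:
    """Build the by-path URL for a table from its path array."""
    # Encode each component on its own so the separators stay literal slashes.
    encoded = "/".join(quote(part, safe="") for part in path)
    return f"https://api.dremio.cloud/v0/projects/{project_id}/catalog/by-path/{encoded}"
```

With the path array from the Table Object example, the URL ends in by-path/Samples/samples.dremio.com/Dremio%20University/restaurant_reviews.parquet.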
Example
Request
curl -X GET "https://api.dremio.cloud/v0/projects/$PROJECT_ID/catalog/by-path/Samples/samples.dremio.com/Dremio%20University/restaurant_reviews.parquet" \
-H "Authorization: Bearer $DREMIO_TOKEN" \
-H 'Content-Type: application/json'
Response
{
"entityType": "dataset",
"id": "c9c11d32-0576-4200-5a5b-8c7229cb3d72",
"type": "PHYSICAL_DATASET",
"path": [
"Samples",
"samples.dremio.com",
"Dremio University",
"restaurant_reviews.parquet"
],
"createdAt": "2024-01-13T19:52:01.894Z",
"tag": "cb2905bb-39c0-497f-ae74-4c310d534f25",
"isMetadataExpired": false,
"lastMetadataRefreshAt": "2024-01-31T09:50:01.012Z",
"accelerationRefreshPolicy": {
"activePolicyType": "PERIOD",
"refreshPeriodMs": 3600000,
"refreshSchedule": "0 56 18 * * *",
"gracePeriodMs": 259200000,
"method": "AUTO",
"neverExpire": true,
"neverRefresh": false,
"sourceRefreshOnDataChanges": false
},
"format": {
"type": "Parquet",
"name": "restaurant_reviews.parquet",
"fullPath": [
"Samples",
"samples.dremio.com",
"Dremio University",
"restaurant_reviews.parquet"
],
"ctime": 0,
"isFolder": false,
"location": "/samples.dremio.com/Dremio University/restaurant_reviews.parquet",
"ignoreOtherFileFormats": false,
"autoCorrectCorruptDates": true
},
"accessControlList": {
"users": [
{
"id": "c590ed7f-b2b4-4e1f-ba7d-94173afdc9a3",
"permissions": [
"SELECT",
"ALTER"
]
},
{
"id": "30fca499-4abc-4469-7142-fc8dd29acac8",
"permissions": [
"SELECT",
"ALTER",
"MANAGE_GRANTS"
]
}
],
"roles": [
{
"id": "76a9884b-aea5-46d5-a73a-000edf23f390",
"permissions": [
"SELECT",
"ALTER"
]
}
]
},
"owner": {
"ownerId": "30fca499-4abc-4469-7142-fc8dd29acac8",
"ownerType": "USER"
},
"fields": [
{
"name": "_id",
"type": {
"name": "VARCHAR"
}
},
{
"name": "name",
"type": {
"name": "VARCHAR"
}
},
{
"name": "city",
"type": {
"name": "VARCHAR"
}
},
{
"name": "state",
"type": {
"name": "VARCHAR"
}
},
{
"name": "categories",
"type": {
"name": "LIST",
"subSchema": [
{
"type": {
"name": "VARCHAR"
}
}
]
}
},
{
"name": "review_count",
"type": {
"name": "BIGINT"
}
},
{
"name": "stars",
"type": {
"name": "DOUBLE"
}
},
{
"name": "attributes",
"type": {
"name": "STRUCT",
"subSchema": [
{
"name": "Parking",
"type": {
"name": "STRUCT",
"subSchema": [
{
"name": "garage",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "street",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "lot",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "valet",
"type": {
"name": "BOOLEAN"
}
}
]
}
},
{
"name": "Accepts Credit Cards",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "Wheelchair Accessible",
"type": {
"name": "BOOLEAN"
}
},
{
"name": "Price Range",
"type": {
"name": "BIGINT"
}
}
]
}
},
{
"name": "date",
"type": {
"name": "VARCHAR"
}
}
],
"approximateStatisticsAllowed": false
}
Response Status Codes
200 OK
400 Bad Request
401 Unauthorized
403 Forbidden
404 Not Found
Update a Table
Method and URL
PUT /v0/projects/{project_id}/catalog/{id}
Parameters
project_id Path String (UUID)
id Path String (UUID)
UUID for the table you want to update.
entityType Body String
Type of catalog entity. For tables, the entityType is dataset.
path Body Array of String
Path to the table you want to update, expressed in an array. List each level of hierarchy in order, from outer to inner: source first, then any folder and subfolders, then the table itself as the last item in the array. Get the path from the table's children object in the response to a Folder.
Example:
[
"Samples",
"samples.dremio.com",
"Dremio University",
"restaurant_reviews.parquet"
]
tag Body String (UUID) Optional
UUID for the version of the table. If you provide a tag in the request body, Dremio uses the tag value to ensure that you are requesting to update the most recent version of the table. If you do not provide a tag, Dremio automatically updates the most recent version of the table.
type Body String
Type of dataset. For tables, the type is PHYSICAL_DATASET.
accelerationRefreshPolicy Body Object
Attributes that define the acceleration refresh policy for the table.
format Body Object
Formatting parameters for the table.
accessControlList Body Object Optional
Object used to specify which users and roles should have privileges on the table and the specific privileges each user and role has. May include a users array, a roles array, or both. Omit if you want the table to inherit access and privileges. See Privileges for more information.
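One way to prepare an update body is to start from a table object that was retrieved earlier and copy only the fields listed as body parameters here. The sketch below does that in Python; build_update_body and UPDATE_FIELDS are hypothetical names, and including id in the body is an assumption based on the example request in this document. Keeping the retrieved tag means the update targets the version that was read.

```python
import json

# Body parameters listed for the update request, plus "id",
# which the example request in this document also sends (assumption).
UPDATE_FIELDS = (
    "entityType", "id", "path", "tag", "type",
    "accelerationRefreshPolicy", "format", "accessControlList",
)


def build_update_body(table, **changes):
    """Copy the listed fields of a retrieved table and apply changes (hypothetical helper)."""
    body = {key: table[key] for key in UPDATE_FIELDS if key in table}
    body.update(changes)
    return body


# Trimmed-down table object in the shape of the examples in this document.
retrieved = {
    "entityType": "dataset",
    "id": "c9c11d32-0576-4200-5a5b-8c7229cb3d72",
    "type": "PHYSICAL_DATASET",
    "path": ["Samples", "samples.dremio.com", "Dremio University", "restaurant_reviews.parquet"],
    "createdAt": "2024-01-13T19:52:01.894Z",
    "tag": "cb2905bb-39c0-497f-ae74-4c310d534f25",
    "format": {"type": "Parquet"},
}

body = build_update_body(retrieved, format={"type": "Parquet", "ignoreOtherFileFormats": True})
print(json.dumps(body, indent=2))
```

Response-only attributes such as createdAt are left out, and accessControlList is omitted when the retrieved object has none, which matches the guidance to omit it when the table should inherit privileges.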
Parameters of the accelerationRefreshPolicy Object
activePolicyType Body String
Policy to use for refreshing Reflections that are defined on the source. For this option to take effect, the neverRefresh parameter must be set to false.
The possible values are:
- NEVER: The Reflections are never refreshed.
- PERIOD: Default. The Reflections are refreshed at the end of every period that is defined by refreshPeriodMs.
- SCHEDULE: The Reflections are refreshed according to the schedule that is set by refreshSchedule.
- REFRESH_ON_DATA_CHANGES: Reflections automatically refresh for underlying tables that are in Iceberg format when new snapshots are created after an update. If the Reflection refresh job finds no changes, then no data is updated. Reflections that are automatically updated based on Iceberg source table changes also update according to the source-level policy as the minimum refresh frequency. Only available for tables in Iceberg format.
refreshPeriodMs Body Integer
Refresh period to use for the data in all Reflections for the table, in milliseconds. Optional if you set activePolicyType to PERIOD. The default setting is 3600000 milliseconds or one hour, which is also the minimum amount of time that is supported.
Example: 3600000
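Because the period is expressed in milliseconds, a small conversion helper can prevent unit mistakes. The following is a minimal sketch; hours_to_ms is a hypothetical name, and the only rule it enforces is the one-hour minimum stated above.

```python
MIN_REFRESH_PERIOD_MS = 3_600_000  # one hour: the documented default and minimum


def hours_to_ms(hours):
    """Convert hours to milliseconds for refreshPeriodMs (hypothetical helper)."""
    ms = int(hours * 60 * 60 * 1000)
    if ms < MIN_REFRESH_PERIOD_MS:
        raise ValueError(f"refreshPeriodMs must be at least {MIN_REFRESH_PERIOD_MS}")
    return ms


print(hours_to_ms(1))  # 3600000
print(hours_to_ms(6))  # 21600000
```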
refreshSchedule Body String
A cron expression that sets the schedule, in UTC time, according to which the Reflections that are defined on the source should be refreshed. Optional if you set activePolicyType to SCHEDULE. The default refreshSchedule setting is to refresh every day at 8:00 a.m.
Cron expressions consist of six fields that specify when a job should run:
Field  | Second | Minute | Hour | Day-of-Month | Month | Day-of-Week
Values | 0      | 0-59   | 0-23 | 1-31         | 1-12  | 1-7 or SUN-SAT
The Second field is always set to 0 and cannot be modified.
Cron expressions support several special characters:
* (asterisk) – Matches all values in that field
  - Example: * in the Month field means "every month"
? (question mark) – Wildcard for day fields (use in either Day-of-Month OR Day-of-Week, but not both)
  - Equivalent to "any day"
, (comma) – Specifies multiple values
  - Example: MON,WED,FRI means "Monday, Wednesday, and Friday"
- (hyphen) – Defines a range of values
  - Example: 2-6 in Day-of-Week means "Monday through Friday"
Examples:
- 0 0 0 * * ?: Every day at midnight UTC
- 0 45 15 * * 1,4,7: Sundays, Wednesdays, and Saturdays at 3:45 PM UTC
- 0 15 7 ? * 2-6: Monday through Friday at 7:15 AM UTC
- 0 0 */6 * * ?: Every 6 hours
- 0 30 9 1 * ?: First day of every month at 9:30 AM UTC
Tips:
- Remember that schedules run in UTC, so convert your local time accordingly
- Use ? in Day-of-Month when specifying Day-of-Week, and vice versa
- Day-of-Week uses 1=Sunday through 7=Saturday (or SUN-SAT)
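The rules above can be checked on the client before a refreshSchedule value is sent. The sketch below is a hypothetical pre-flight check, not Dremio's parser: it only verifies the field count, the fixed Second field, simple numeric ranges for Minute and Hour, and the restriction on using ? in both day fields. Anything else is left for the service to validate.

```python
def check_refresh_schedule(expr):
    """Return a list of problems found in a six-field cron expression (hypothetical helper)."""
    fields = expr.split()
    if len(fields) != 6:
        return [f"expected 6 fields, found {len(fields)}"]
    second, minute, hour, day_of_month, _month, day_of_week = fields
    problems = []
    if second != "0":
        problems.append("the Second field must be 0")
    # Range-check Minute and Hour only when they are plain numbers.
    for name, value, upper in (("Minute", minute, 59), ("Hour", hour, 23)):
        if value.isdigit() and int(value) > upper:
            problems.append(f"{name} must be between 0 and {upper}")
    if day_of_month == "?" and day_of_week == "?":
        problems.append("use ? in Day-of-Month or Day-of-Week, not both")
    return problems


for expr in ("0 0 8 * * ?", "0 15 7 ? * 2-6", "30 0 8 * * ?", "0 0 8 ? * ?"):
    print(expr, "->", check_refresh_schedule(expr) or "ok")
```

Returning a list of problems instead of raising on the first one lets a caller report every issue in a single pass.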
gracePeriodMs Body Integer
Maximum age to allow for Reflection data used to accelerate queries, in milliseconds.
Example: 10800000
method Body String
Method used for refreshing the data in Reflections.
- AUTO: For tables that are in the Apache Iceberg format; Parquet datasets in filesystems; or Parquet datasets, Avro datasets, or non-transactional ORC datasets in AWS Glue.
  - The initial refresh of a Reflection is always a full refresh.
  - If the Reflection is created from a view that uses nested group-bys, joins, unions, or window functions, then a full refresh is performed.
  - If the changes to the base table are only appends, then an incremental refresh based on table snapshots is performed.
  - If the changes to the base table include non-append operations, then a partition-based incremental refresh is attempted.
  - If the partitions of the base table and the partitions of the Reflection are not compatible, or if either the base table or the Reflection is not partitioned, then a full refresh is performed.
- FULL: A full refresh is performed at each refresh interval.
- INCREMENTAL: An incremental refresh is performed at each refresh interval.
See Refresh Reflections for more information.
Example: FULL
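The AUTO rules can be read as a decision sequence. The following sketch restates them as a Python function purely for illustration; it is not Dremio's implementation, the function and argument names are hypothetical, and the order in which the rules are applied is an assumption drawn from the order of the bullets.

```python
def auto_refresh_kind(is_initial, view_is_complex, append_only, partitions_compatible):
    """Restate the documented AUTO rules as a decision function (illustrative only)."""
    if is_initial:
        return "full"  # the initial refresh is always a full refresh
    if view_is_complex:
        return "full"  # nested group-bys, joins, unions, or window functions
    if append_only:
        return "incremental (snapshot-based)"
    if partitions_compatible:
        return "incremental (partition-based)"
    # Incompatible partitions, or an unpartitioned base table or Reflection.
    return "full"


print(auto_refresh_kind(False, False, True, False))
print(auto_refresh_kind(False, False, False, True))
print(auto_refresh_kind(False, False, False, False))
```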
refreshField Body String
Field to refresh for the table. Used only if the method is INCREMENTAL, and only for tables that are not in the Apache Iceberg format.
Example: business_id
neverExpire Body Boolean
If the Reflection should never expire, true. Otherwise, false.
neverRefresh Body Boolean
If the Reflection should never refresh, true. Otherwise, false.
Parameters of the format Object
type Body String
Type of data in the table.
Valid Values: Delta, Excel, Iceberg, JSON, Parquet, Text, Unknown, XLS
ignoreOtherFileFormats Body Boolean Optional
If Dremio should ignore all non-Parquet files in the related folder structure so that the promoted table works as if only Parquet files are in the folder structure, set to true. Otherwise, set to false (default). Optional for Parquet folders.
skipFirstLine Body Boolean Optional
If Dremio should skip the first line in the table, true. Otherwise, false (default). Optional for Excel and Text types.
extractHeader Body Boolean Optional
If Dremio should extract the table's column names from the first line of the file, true. Otherwise, false (default). Optional for Excel and Text types.
hasMergedCells Body Boolean Optional
If Dremio should expand merged cells in the table, true. Otherwise, false (default). Optional for Excel types.
fieldDelimiter Body String Optional
Character to use to indicate separate fields in the table. May be , for a comma (default), \t for a tab, | for a pipe, or a custom character. Optional for Text type.
quote Body String Optional
Character to use for quotes in the table. May be \" for a double quote (default), ' for a single quote, or a custom character. Optional for Text type.
comment Body String Optional
Character to use to indicate comments for the table. May be # for a number sign (default) or a custom character. Optional for Text type.
escape Body String Optional
Character to use to indicate an escape for the table. May be \" for a double quote (default), ` for a back quote, \\ for a backward slash, or a custom character. Optional for Text type.
lineDelimiter Body String Optional
Character to use to indicate separate lines for the table. May be \r\n for a carriage return plus new line (default), \n for a new line, or a custom character. Optional for Text type.
autoGenerateColumnNames Body Boolean Optional
If Dremio should use the existing column names for the table columns, true (default). Otherwise, false. Optional for Text type.
trimHeader Body Boolean Optional
If Dremio should trim column names to a specific number of characters when updating the table, true. Otherwise, false (default). Optional for Text type.
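For Text tables, it can help to start from the documented defaults and override only what differs. The sketch below collects the defaults stated in the parameter descriptions above into a Python dictionary; text_format and TEXT_FORMAT_DEFAULTS are hypothetical names, and the helper rejects option names that are not listed in this section.

```python
import json

# Defaults as stated in the parameter descriptions above.
TEXT_FORMAT_DEFAULTS = {
    "type": "Text",
    "fieldDelimiter": ",",
    "skipFirstLine": False,
    "extractHeader": False,
    "quote": "\"",
    "comment": "#",
    "escape": "\"",
    "lineDelimiter": "\r\n",
    "autoGenerateColumnNames": True,
    "trimHeader": False,
}


def text_format(**overrides):
    """Return a format object for a Text table (hypothetical helper)."""
    unknown = set(overrides) - set(TEXT_FORMAT_DEFAULTS)
    if unknown:
        raise KeyError(f"not a documented Text option: {sorted(unknown)}")
    return {**TEXT_FORMAT_DEFAULTS, **overrides}


# Tab-separated file whose first line holds the column names.
fmt = text_format(fieldDelimiter="\t", extractHeader=True)
print(json.dumps(fmt))
```

Serializing with json.dumps takes care of escaping the tab, quote, and line-delimiter characters, which is easy to get wrong when the JSON body is written by hand inside a shell command.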
Parameters of the accessControlList Object
users Body Array of Object Optional
The users that should have privileges on the table and the privileges each user should have.
Example:
[
{
"id": "c590ed7f-b2b4-4e1f-ba7d-94173afdc9a3",
"permissions": [
"SELECT",
"ALTER"
]
}
]
roles Body Array of Object Optional
The roles that should have privileges on the table and the privileges each role should have.
Example:
[
{
"id": "76a9884b-aea5-46d5-a73a-000edf23f390",
"permissions": [
"SELECT",
"ALTER"
]
}
]
Parameters of Objects in the users and roles Arrays
id Body String (UUID) Optional
UUID of the role or user.
permissions Body Array of String Optional
The privileges that the role or user has on the table. See Table Scope in Open Catalog Privileges or Source Privileges.
Example:
[
"SELECT",
"ALTER"
]
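The accessControlList structure described above can be assembled from simple mappings of IDs to privileges. This is a minimal sketch; build_acl is a hypothetical helper, and the UUIDs are the sample values used elsewhere in this document.

```python
def build_acl(users=None, roles=None):
    """Build an accessControlList object from {id: [privileges]} mappings (hypothetical helper)."""
    acl = {}
    for key, grants in (("users", users), ("roles", roles)):
        if grants:  # leave out an array that would be empty
            acl[key] = [
                {"id": entity_id, "permissions": list(privileges)}
                for entity_id, privileges in grants.items()
            ]
    return acl


acl = build_acl(
    users={"c590ed7f-b2b4-4e1f-ba7d-94173afdc9a3": ["SELECT", "ALTER"]},
    roles={"76a9884b-aea5-46d5-a73a-000edf23f390": ["SELECT", "ALTER"]},
)
print(acl)
```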
Example
Request
curl -X PUT "https://api.dremio.cloud/v0/projects/$PROJECT_ID/catalog/$TABLE_ID" \
-H "Authorization: Bearer $DREMIO_TOKEN" \
-H 'Content-Type: application/json' \
--data-raw '{
"entityType": "dataset",
"id": "dba1e4fe-6351-44d2-a3e0-7aa20e782bf3",
"path": [
"Samples",
"samples.dremio.com",
"Dremio University",
"airbnb_listings.csv"
],
"type": "PHYSICAL_DATASET",
"format": {
"type": "Text",
"fieldDelimiter": ",",
"skipFirstLine": false,
"extractHeader": true,
"quote": "\"",
"comment": "#",
"escape": "\"",
"lineDelimiter": "\r\n",
"autoGenerateColumnNames": true,
"trimHeader": true
}
}'
Response
{
"entityType": "dataset",
"id": "dba1e4fe-6351-44d2-a3e0-7aa20e782bf3",
"type": "PHYSICAL_DATASET",
"path": [
"Samples",
"samples.dremio.com",
"Dremio University",
"airbnb_listings.csv"
],
"createdAt": "2024-01-23T21:26:59.568Z",
"tag": "fc1707df-35a1-45c1-87d7-5f66fb11a729",
"isMetadataExpired": false,
"lastMetadataRefreshAt": "2024-01-31T09:50:01.012Z",
"accelerationRefreshPolicy": {
"activePolicyType": "PERIOD",
"refreshPeriodMs": 3600000,
"refreshSchedule": "0 56 18 * * *",
"gracePeriodMs": 259200000,
"method": "AUTO",
"neverExpire": true,
"neverRefresh": false,
"sourceRefreshOnDataChanges": false
},
"format": {
"type": "Text",
"ctime": 0,
"isFolder": false,
"location": "/samples.dremio.com/Dremio University/airbnb_listings.csv",
"fieldDelimiter": ",",
"skipFirstLine": false,
"extractHeader": true,
"quote": "\"",
"comment": "#",
"escape": "\"",
"lineDelimiter": "\r\n",
"autoGenerateColumnNames": true,
"trimHeader": true
},
"accessControlList": {},
"owner": {
"ownerId": "c590ed7f-7142-4e1f-ba7d-94173afdc9a3",
"ownerType": "USER"
},
"fields": [
{
"name": "id",
"type": {
"name": "VARCHAR"
}
},
{
"name": "listing_url",
"type": {
"name": "VARCHAR"
}
},
{
"name": "scrape_id",
"type": {
"name": "VARCHAR"
}
},
{
"name": "last_scraped",
"type": {
"name": "VARCHAR"
}
},
{
"name": "name",
"type": {
"name": "VARCHAR"
}
},
{
"name": "summary",
"type": {
"name": "VARCHAR"
}
},
{
"name": "reviews_per_month",
"type": {
"name": "VARCHAR"
}
}
],
"approximateStatisticsAllowed": false
}
Response Status Codes
200 OK
400 Bad Request
401 Unauthorized
403 Forbidden
404 Not Found
500 Internal Server Error
Refresh the Reflections on a Table
Method and URL
POST /v0/projects/{project_id}/catalog/{id}/refresh
Parameters
project_id Path String (UUID)
id Path String (UUID)
UUID for the table.
Example
Request
curl -X POST "https://api.dremio.cloud/v0/projects/$PROJECT_ID/catalog/$TABLE_ID/refresh" \
-H "Authorization: Bearer $DREMIO_TOKEN" \
-H 'Content-Type: application/json'
A successful request returns an empty response body with HTTP status 204 No Content.
Response Status Codes
204 No Content
400 Bad Request
401 Unauthorized
403 Forbidden
404 Not Found
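For callers that prefer Python to curl, the same refresh request can be prepared with the standard library. This is a sketch under stated assumptions: refresh_request and send are hypothetical helpers, the base URL and headers are copied from the curl example, and the placeholder strings must be replaced with real values before anything is sent.

```python
import urllib.request

API_BASE = "https://api.dremio.cloud/v0"  # base URL from the curl examples


def refresh_request(project_id, table_id, token):
    """Prepare the POST request that refreshes a table's Reflections (hypothetical helper)."""
    url = f"{API_BASE}/projects/{project_id}/catalog/{table_id}/refresh"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, method="POST", headers=headers)


def send(request):
    """Send the request and report whether the service answered 204 No Content."""
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request) as response:
        return response.status == 204


req = refresh_request("PROJECT_ID", "TABLE_ID", "DREMIO_TOKEN")  # placeholders
print(req.get_method(), req.full_url)
```

Building the request separately from sending it makes the URL and headers easy to inspect or log before any network call is made.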
Revert a Table to a File or Folder
Method and URL
DELETE /v0/projects/{project_id}/catalog/{id}
Parameters
project_id Path String (UUID)
id Path String (UUID)
UUID for the table.
Example
Request
curl -X DELETE "https://api.dremio.cloud/v0/projects/$PROJECT_ID/catalog/$TABLE_ID" \
-H "Authorization: Bearer $DREMIO_TOKEN" \
-H 'Content-Type: application/json'
A successful request returns an empty response body with HTTP status 204 No Content.
Response Status Codes
204 No Content
400 Bad Request
401 Unauthorized
403 Forbidden
404 Not Found