Source
The Source API enables you to manage the sources that are supported in Dremio. You can add, retrieve, update, and delete sources. Sources are of entityType: source.
The primary Arctic catalog is created when you add a Sonar project and provides data management capabilities for your project. The primary Arctic catalog is considered a source entity within the API.
The source object contains the same attributes for all source types except for the config object, which is different for each source type. The examples on this page use an Amazon S3 source to demonstrate the available endpoints. For configuration information for each source, see Source Configuration.
{
"entityType": "source",
"config": {
"accessKey": "AKIAIOSFODNN7EXAMPLE",
"accessSecret": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
"secure": true,
"whitelistedBuckets": [
"sales",
"marketing"
],
"rootPath": "/",
"enableAsync": true,
"compatibilityMode": false,
"isCachingEnabled": true,
"maxCacheSpacePct": 100,
"requesterPays": false,
"enableFileStatusCheck": true,
"defaultCtasFormat": "ICEBERG",
"isPartitionInferenceEnabled": false,
"credentialType": "ACCESS_KEY"
},
"id": "4abb1367-5282-43e8-a0f6-54a8ae6428ed",
"tag": "676e73e0-8b51-44d0-972d-ef80c29ac0c6",
"type": "S3",
"name": "source-name",
"createdAt": "2022-07-26T19:56:28.826Z",
"metadataPolicy": {
"authTTLMs": 86400000,
"namesRefreshMs": 3600000,
"datasetRefreshAfterMs": 3600000,
"datasetExpireAfterMs": 10800000,
"datasetUpdateMode": "PREFETCH_QUERIED",
"deleteUnavailableDatasets": true,
"autoPromoteDatasets": false
},
"accelerationGracePeriodMs": 10800000,
"accelerationRefreshPeriodMs": 3600000,
"accelerationNeverExpire": false,
"accelerationNeverRefresh": false,
"accelerationActivePolicyType": "NEVER",
"accelerationRefreshSchedule": "",
"children": [
{
"id": "dremio:/customer-records/sales",
"path": [
"customer-records",
"sales"
],
"type": "CONTAINER",
"containerType": "FOLDER"
},
{
"id": "dremio:/customer-records/marketing",
"path": [
"customer-records",
"marketing"
],
"type": "CONTAINER",
"containerType": "FOLDER"
}
],
"allowCrossSourceSelection": false,
"disableMetadataValidityCheck": false,
"accessControlList": {
"roles": [
{
"id": "1e5f38e4-8209-46dc-96f0-cfbd3276dbd8",
"permissions": [
"ALTER",
"CREATE_TABLE",
"DROP",
"INSERT",
"DELETE",
"UPDATE",
"TRUNCATE",
"VIEW_REFLECTION",
"ALTER_REFLECTION",
"MODIFY",
"MANAGE_GRANTS",
"SELECT"
]
}
],
"users": [
{
"id": "4deaf0e4-f05c-441c-bb37-d4d6e43fdec7",
"permissions": [
"VIEW_REFLECTION",
"SELECT"
]
}
]
},
"permissions": [
"READ",
"WRITE",
"ALTER_REFLECTION",
"SELECT",
"ALTER",
"VIEW_REFLECTION",
"MODIFY",
"MANAGE_GRANTS",
"CREATE_TABLE",
"DROP",
"EXTERNAL_QUERY",
"INSERT",
"TRUNCATE",
"DELETE",
"UPDATE",
"EXECUTE",
"CREATE_SOURCE",
"ALL"
],
"accelerationRefreshOnDataChanges": false,
"owner": {
"ownerId": "5c9c394d-7395-4687-97dd-beac4384c359",
"ownerType": "USER"
}
}
Source Attributes
entityType String
Specifies the type of container. For sources, the type is source.
config Object
Configuration options for the specified data source.
id String (UUID)
The unique identifier that is generated to identify a source.
Example: 4abb1367-5282-43e8-a0f6-54a8ae6428ed
tag String (UUID)
Identifies the instance of the source. Dremio changes this tag whenever a change is made to the source. Immutable by the user.
Example: 676e73e0-8b51-44d0-972d-ef80c29ac0c6
type String
Identifies the source type that you are configuring. The source type determines the config parameters that display. Read Attributes of the config Object for more information.
Enum: ARCTIC, AWSGLUE, AZURE_STORAGE, MSSQL, MYSQL, ORACLE, POSTGRES, REDSHIFT, S3, SNOWFLAKE
name String
The user-defined name of the source.
Example: customer-records
createdAt String
The date and time that the source was created.
Example: 2022-07-26T19:56:28.826Z
metadataPolicy Object
The policies covering the update of a source’s metadata.
accelerationGracePeriodMs Integer
Identifies the length of time to keep reflections for all the datasets in a source before these reflections expire. The default setting is 10800000 milliseconds or three hours. The minimum amount of time that is supported is 3600000 milliseconds or one hour. Read Setting the Expiration Policy for Reflections for more information.
Example: 10800000
accelerationRefreshPeriodMs Integer
Identifies the refresh frequency for a dataset's reflections. The default setting is 3600000 milliseconds or one hour, which is also the minimum amount of time that is supported.
Example: 3600000
accelerationNeverExpire Boolean
Option to set an expiration for reflections. Default setting is false. Set to true to prevent reflections from expiring and to override the accelerationGracePeriodMs setting.
accelerationNeverRefresh Boolean
Option to set a refresh for reflections. Default setting is false. Set to true to prevent reflections from refreshing and to override the accelerationRefreshPeriodMs setting.
accelerationActivePolicyType String
Option to set the policy for refreshing reflections that are defined on the source. For this option to take effect, the accelerationNeverRefresh parameter must be set to false.
The possible values are:
NEVER: The reflections are never refreshed.
PERIOD: The reflections are refreshed at the end of every period that is defined by accelerationRefreshPeriodMs.
SCHEDULE: The reflections are refreshed according to the schedule that is set by accelerationRefreshSchedule.
accelerationRefreshSchedule String
A cron expression that sets the schedule, in UTC time, according to which the reflections that are defined on the source are refreshed.
Field | Allowed Values | Allowed Special Characters |
---|---|---|
Second | 0 | N/A |
Minute | 0-59 | N/A |
Hour | 0-23 | N/A |
Day of month | N/A | * ? |
Month | N/A | * ? |
Days of week | 1-7 or SUN-SAT | , - * ? |
Special Character | Description |
---|---|
* | Used to specify all values for a field. For Day of month, specifies every day of the month. For Month, specifies every month. For Days of week, specifies every day of the week. |
? | Equivalent to *. |
, | Used to specify two or more days in the Days of week field. For example, MON,WED,FRI. |
- | Used to specify ranges in the Days of week field. For example, 1-3 is equivalent to Sunday, Monday, and Tuesday. |
Examples:
0 0 0 * * ?: Refreshes every day at midnight.
0 45 15 * * 1,4,7: Refreshes at 15:45 on Sunday, Wednesday, and Saturday.
0 15 7 ? * 2-6: Refreshes at 7:15 on Monday through Friday.
children Array of Object
The catalog entities (including datasets, files, and folders) contained in the source.
allowCrossSourceSelection Boolean
For queries that can select from multiple sources, this option enables a source to be available in that query. Default setting is false. Set to true to enable this option.
disableMetadataValidityCheck Boolean
Option to set a validity check for the source's metadata. Default setting is false. Set to true to disable the validity check. This attribute is not supported by default. Contact Dremio Support to enable the option.
accessControlList Object
Information about users and roles with privileges on the source and the specific privileges each user or role has. May include a users array, a roles array, or both, depending on the configured access and privileges. The accessControlList object is empty if source-specific access control privileges are not set. For Arctic sources and the primary Arctic catalog, the source object does not include the accessControlList object.
permissions Array of String
The permissions that the user submitting the API call has to a source. This will be an empty array, unless the query parameter include=permissions is set. When set, the permission values that are available will be based on the source you are connecting to. Read Privileges for more information.
accelerationRefreshOnDataChanges Boolean
true if reflections refresh automatically for underlying tables that are in Iceberg format when new snapshots are created after an update. Otherwise, false.
owner Object
The owner of the source.
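The field and special-character tables for accelerationRefreshSchedule above can be turned into a small client-side check. The following is a minimal Python sketch, not Dremio's own parser. The function names parse_refresh_schedule and _parse_day are hypothetical, and the sketch assumes that day names may be used in lists and ranges and that a range runs from a lower to a higher day number.

```python
DAY_NAMES = ["SUN", "MON", "TUE", "WED", "THU", "FRI", "SAT"]


def _parse_day(token: str) -> int:
    """Map a day token (1-7 or SUN-SAT) to 1..7, where 1 is Sunday."""
    token = token.upper()
    if token in DAY_NAMES:
        return DAY_NAMES.index(token) + 1
    if token.isdigit() and 1 <= int(token) <= 7:
        return int(token)
    raise ValueError(f"invalid day of week: {token!r}")


def parse_refresh_schedule(expr: str) -> dict:
    """Check a schedule string against the field table and return its parts."""
    fields = expr.split()
    if len(fields) != 6:
        raise ValueError("expected 6 fields")
    second, minute, hour, day_of_month, month, days_of_week = fields
    if second != "0":
        raise ValueError("second must be 0")
    if not (minute.isdigit() and 0 <= int(minute) <= 59):
        raise ValueError("minute must be 0-59")
    if not (hour.isdigit() and 0 <= int(hour) <= 23):
        raise ValueError("hour must be 0-23")
    if day_of_month not in ("*", "?") or month not in ("*", "?"):
        raise ValueError("day of month and month accept only * or ?")

    if days_of_week in ("*", "?"):
        days = list(range(1, 8))
    else:
        selected = set()
        for part in days_of_week.split(","):
            if "-" in part:
                # Assumption: ranges do not wrap around the end of the week.
                start, end = (_parse_day(p) for p in part.split("-", 1))
                if start > end:
                    raise ValueError(f"invalid range: {part!r}")
                selected.update(range(start, end + 1))
            else:
                selected.add(_parse_day(part))
        days = sorted(selected)

    return {
        "time_utc": f"{int(hour):02d}:{int(minute):02d}",
        "days": [DAY_NAMES[d - 1] for d in days],
    }
```

Calling parse_refresh_schedule("0 15 7 ? * 2-6") returns the time 07:15 and the five days from MON to FRI, which matches the range semantics in the table (1 is Sunday).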
Attributes of the config Object
accessKey String
A unique identifier from your AWS account used in conjunction with a secret access key to sign API requests made to your data source.
Example: AKIAIOSFODNN7EXAMPLE
accessSecret String
A secret access key from your AWS account that is used in conjunction with an access key to cryptographically sign programmatic requests. Signing a request identifies the sender and prevents the request from being altered.
Example: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
secure Boolean
Option to enable SSL encryption for the source. Default setting is true. Set to false to disable SSL encryption.
whitelistedBuckets Array of String
A list of allowed Amazon S3 buckets.
Example: ["sales","marketing"]
rootPath String
Identifies the root path of the S3 bucket.
Example: /
enableAsync Boolean
Option to enable asynchronous access, which allows cloud caching for the source to support simultaneous actions such as adding and editing. Default setting is true. Set to false to disable asynchronous access.
compatibilityMode Boolean
Option to enable compatibility mode for an Amazon S3 source, which allows you to use S3-compatible storage. Default setting is false. Set to true to enable compatibility mode.
isCachingEnabled Boolean
Option to enable local caching. Default setting is true. Set to false to disable local caching.
maxCacheSpacePct Integer
Sets the maximum percent of total available cache space to use, when this space is available. The minimum value is 1 and the maximum value is 100. The default value is 100.
Example: 100
requesterPays Boolean
Option to bill the cost of S3 requests to the requester. Default setting is false. Set to true to enable this option.
enableFileStatusCheck Boolean
Option to enable file status check. Default setting is true. Set to false to disable file status check.
defaultCtasFormat String
Sets the default format for tables that are created in Dremio. The default format that is used is ICEBERG.
Enum: ICEBERG, PARQUET
isPartitionInferenceEnabled Boolean
Option to enable partition column inference. The default setting is false. Set to true to enable partition column inference.
credentialType String
Specifies the type of credential to enable Dremio to connect to the source.
Enum: ACCESS_KEY, PROJECT_DATA_CRED
Attributes of the metadataPolicy Object
authTTLMs Integer
Sets the length of time, in milliseconds, that source permissions are cached. For example, if the default 24 hours (86400000 milliseconds) is used, then for each file you query, your permission status is checked once every 24 hours. The minimum supported time period is one minute (60000 milliseconds).
Example: 86400000
namesRefreshMs Integer
Sets when to run a refresh of a source, in milliseconds. The default time period is one hour (3600000 milliseconds). The minimum supported time period is one minute (60000 milliseconds).
Example: 3600000
datasetRefreshAfterMs Integer
Determines how often the metadata in the dataset is refreshed, in milliseconds. The default refresh rate is one hour (3600000 milliseconds). The minimum supported refresh rate is one minute (60000 milliseconds).
Example: 3600000
datasetExpireAfterMs Integer
Sets the amount of time, in milliseconds, to keep the metadata before it expires. The default time period is one hour (3600000 milliseconds). The minimum supported time period is one minute (60000 milliseconds).
Example: 10800000
datasetUpdateMode String
Sets the metadata policy for when a dataset is updated. Use PREFETCH_QUERIED to update the details for previously queried objects in a source.
Example: PREFETCH_QUERIED
deleteUnavailableDatasets Boolean
Option to remove dataset definitions if the underlying data is unavailable to Dremio. The default setting is true. Set to false to keep the dataset definitions.
autoPromoteDatasets Boolean
Option to automatically format files into tables when a query is issued. The default setting is false. Set to true to enable this capability. This attribute applies only to metastore and object storage sources.
Attributes of Objects in the children Array
id String
The catalog entity ID, which is generated by Dremio and is immutable.
Example: dremio:/customer-records/sales
path String
The catalog entity path, which is immutable.
Example: customer-records
type String
The type of the object that is contained in the source. This typing is generated by Dremio and is immutable. For folders, the type is CONTAINER. For datasets, the type is DATASET. For Parquet and other file formats, the type is FILE.
Enum: CONTAINER, DATASET, FILE
containerType String
Identifies the container type. This attribute is displayed only if type is set to CONTAINER. This attribute is automatically generated by Dremio and is immutable.
Enum: FOLDER
datasetType String
Identifies the dataset type. This attribute is displayed only if type is set to DATASET. This attribute is automatically generated by Dremio and is immutable. The dataset can be one of the following types:
DIRECT: A dataset that is from an external source (such as PostgreSQL).
PROMOTED: A table that is promoted from a source.
VIRTUAL: A view that is created on another dataset.
Enum: DIRECT, PROMOTED, VIRTUAL
Attributes of the accessControlList Object
roles Array of Object
Information about the roles that have been granted privileges on the source and the privileges each role has.
users Array of Object
Information about the users that have been granted privileges on the source and the privileges each user has.
Attributes of Objects in the roles Array
id String
The ID of the role.
Example: 1e5f38e4-8209-46dc-96f0-cfbd3276dbd8
permissions Array of String
The privileges that the role has on the source. Read Privileges for more information.
Attributes of Objects in the users Array
id String
The ID of the user.
permissions Array of String
The privileges that the user has on the source. Read Privileges for more information.
Attributes of the owner Object
ownerId String (UUID)
The unique identifier for the owner (user or role) of the source. This identifier corresponds to the user_id or role_id in the users or roles system table. Read the Users system table for more information.
Example: 5c9c394d-7395-4687-97dd-beac4384c359
ownerType String
The type of the owner.
Enum: USER, ROLE
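The reflection attributes described above interact: accelerationNeverExpire overrides accelerationGracePeriodMs, and accelerationNeverRefresh overrides both the policy type and the refresh period. The following Python sketch shows one way a client could read those attributes from a source object. The function name summarize_reflection_policy and the returned strings are hypothetical, and the fallback values are the defaults stated on this page (PERIOD is taken from the Creating a Source parameter description).

```python
HOUR_MS = 3_600_000


def summarize_reflection_policy(source: dict) -> dict:
    """Summarize when reflections on a source expire and refresh."""
    # accelerationNeverExpire overrides accelerationGracePeriodMs.
    if source.get("accelerationNeverExpire", False):
        expiry = "never"
    else:
        grace_ms = source.get("accelerationGracePeriodMs", 3 * HOUR_MS)
        expiry = f"after {grace_ms / HOUR_MS:g}h"

    # accelerationNeverRefresh overrides the policy type and the period.
    policy = source.get("accelerationActivePolicyType", "PERIOD")
    if source.get("accelerationNeverRefresh", False) or policy == "NEVER":
        refresh = "never"
    elif policy == "PERIOD":
        period_ms = source.get("accelerationRefreshPeriodMs", HOUR_MS)
        refresh = f"every {period_ms / HOUR_MS:g}h"
    elif policy == "SCHEDULE":
        schedule = source.get("accelerationRefreshSchedule", "")
        refresh = f"on schedule '{schedule}' (UTC)"
    else:
        raise ValueError(f"unknown accelerationActivePolicyType: {policy!r}")

    return {"expiry": expiry, "refresh": refresh}
```

For the Amazon S3 example object at the top of this page, whose policy type is NEVER, the summary reports that reflections expire after three hours and are never refreshed.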
Creating a Source
Create a new source.
When you create a source that supports dataset discovery, metadata is initially populated in the catalog for each table that is accessible in the source. If the source contains a very large number of tables, this process may take longer than the request timeout that is configured by the Dremio Cloud gateway, and the response will contain a 502 or 504 response code. In this case, instead of retrying the request, we recommend that you poll the Retrieving a Source by Path endpoint to await the source's creation.
POST /v0/projects/{project-id}/catalog
Parameters
project-id Path String (UUID)
The unique identifier for the project that contains the source that you want to add.
Example: 02d36975-73eb-47ed-9bb5-de73060380f6
entityType Body String
Specifies the type of container. For sources, the type is source.
config Body Object
Configuration options for the specified data source.
type Body String
Identifies the source type that you are configuring. The source type determines the config parameters that display. Read Parameters of the config Object for more information.
Enum: ARCTIC, AWSGLUE, MSSQL, MYSQL, ORACLE, POSTGRES, REDSHIFT, S3, SNOWFLAKE
name Body String
The user-defined name of the source. The name cannot include the following special characters: /, :, [, or ].
Example: customer-records
metadataPolicy Body Object
The policies covering the update of a source’s metadata.
accelerationGracePeriodMs Body Integer
Identifies the length of time to keep reflections for all the datasets in a source before these reflections expire. The default setting is 10800000 milliseconds or three hours. The minimum amount of time that is supported is 3600000 milliseconds or one hour. Read Setting the Expiration Policy for Reflections for more information.
Example: 10800000
accelerationRefreshPeriodMs Body Integer
Identifies the refresh frequency for a dataset's reflections. Optional if you set accelerationActivePolicyType to PERIOD. The default setting is 3600000 milliseconds or one hour, which is also the minimum amount of time that is supported.
Example: 3600000
accelerationActivePolicyType Body String
Option to set the policy for refreshing reflections that are defined on the source. For this option to take effect, the accelerationNeverRefresh parameter must be set to false.
The possible values are:
NEVER: The reflections are never refreshed.
PERIOD: Default. The reflections are refreshed at the end of every period that is defined by accelerationRefreshPeriodMs.
SCHEDULE: The reflections are refreshed according to the schedule that is set by accelerationRefreshSchedule.
accelerationRefreshSchedule Body String
A cron expression that sets the schedule, in UTC time, according to which the reflections that are defined on the source are refreshed. Optional if you set accelerationActivePolicyType to SCHEDULE. The default accelerationRefreshSchedule setting is to refresh every day at 8:00 a.m.
Field | Allowed Values | Allowed Special Characters |
---|---|---|
Second | 0 | N/A |
Minute | 0-59 | N/A |
Hour | 0-23 | N/A |
Day of month | N/A | * ? |
Month | N/A | * ? |
Days of week | 1-7 or SUN-SAT | , - * ? |
Special Character | Description |
---|---|
* | Used to specify all values for a field. For Day of month, specifies every day of the month. For Month, specifies every month. For Days of week, specifies every day of the week. |
? | Equivalent to *. |
, | Used to specify two or more days in the Days of week field. For example, MON,WED,FRI. |
- | Used to specify ranges in the Days of week field. For example, 1-3 is equivalent to Sunday, Monday, and Tuesday. |
Examples:
0 0 0 * * ?: Refreshes every day at midnight.
0 45 15 * * 1,4,7: Refreshes at 15:45 on Sunday, Wednesday, and Saturday.
0 15 7 ? * 2-6: Refreshes at 7:15 on Monday through Friday.
accelerationRefreshOnDataChanges Body Boolean
Set to true to refresh reflections on underlying tables that are in Iceberg format in the source when new snapshots are created after an update. Otherwise, set to false. Reflections that are automatically updated based on Iceberg source table changes also update according to the source-level policy as the minimum refresh frequency. For this option to take effect, the source must support Iceberg table format, the accelerationNeverRefresh parameter must be set to false, and the accelerationActivePolicyType parameter must be set to either PERIOD or SCHEDULE.
accessControlList Body Object
Object used to specify which users and roles should have privileges on the source and the specific privileges each user and role has. May include a users array, a roles array, or both. Not supported for Arctic sources and the primary Arctic catalog.
Parameters of the config Object
hostname Body String
The name of the server you are connecting to.
Example: 177.15.0.112
port Body String
The port number of the server you are connecting to.
Example: 3306
authenticationType Body String
The type of authentication needed to connect to the server. Use ANONYMOUS when no authentication is needed and MASTER when credentials from a master database user are required.
Enum: ANONYMOUS, MASTER
netWriteTimeout Body Integer
The amount of time, in seconds, to wait for data from the server before aborting the connection.
Example: 60
fetchSize Body Integer
The number of records to fetch per request. Set to 0 to enable Dremio to automatically set the size.
Example: 200
maxIdleConns Body Integer
Sets the maximum number of idle connections that you want to keep.
Example: 8
idleTimeSec Body Integer
Sets the idle time, in seconds, before a connection is evaluated for closure.
Example: 60
propertyList Body Array of Object
An array of name/value pairs defining optional connection properties that are used by the source. Use a comma-separated list to specify multiple name/value pairs.
Parameters of the metadataPolicy Object
authTTLMs Body Integer
Sets the expiration for an authorization to a metadata policy, in milliseconds. The default time period is 24 hours (86400000 milliseconds). The minimum supported time period is one minute (60000 milliseconds).
Example: 86400000
namesRefreshMs Body Integer
Sets when to run a refresh of a source, in milliseconds. The default time period is one hour (3600000 milliseconds). The minimum supported time period is one minute (60000 milliseconds).
Example: 86400000
datasetRefreshAfterMs Body Integer
Determines how often the metadata in the dataset is refreshed, in milliseconds. The default refresh rate is one hour (3600000 milliseconds). The minimum supported refresh rate is one minute (60000 milliseconds).
Example: 86400000
datasetExpireAfterMs Body Integer
Sets the amount of time, in milliseconds, to keep the metadata before it expires. The default time period is one hour (3600000 milliseconds). The minimum supported time period is one minute (60000 milliseconds).
Example: 259200000
datasetUpdateMode Body String
Sets the metadata policy for when a dataset is updated. Use PREFETCH_QUERIED to update the details for previously queried objects in a source.
Enum: PREFETCH_QUERIED
deleteUnavailableDatasets Body Boolean
Option to remove dataset definitions if the underlying data is unavailable to Dremio, for example, at the time the data is queried. The default setting is true. Set to false to keep the dataset definitions.
Attributes of the accessControlList Object
roles Body Array of Object
The roles that should have privileges on the source and the privileges each role should have.
users Body Array of Object
The users that should have privileges on the source and the privileges each user should have.
Parameters of Objects in the roles Array
id Body String
The ID of the role.
Example: 9ab42853-bdef-465f-b9bb-381a13a9bf78
permissions Body Array of String
The privileges that the role should have on the source. Read Privileges for more information.
Parameters of Objects in the users Array
id Body String
The ID of the user.
permissions Body Array of String
The privileges that the user should have on the source. Read Privileges for more information.
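Several of the parameters above carry constraints that can be checked before a request is sent: the characters that a source name cannot include and the minimum values for the millisecond settings. The following Python sketch collects those checks. It is a client-side convenience under stated assumptions, not a description of the validation that Dremio performs; the function name check_source_body and the message wording are hypothetical.

```python
# Characters that the name parameter description says are not allowed.
FORBIDDEN_NAME_CHARS = "/:[]"

# Minimums stated in the parameter descriptions, in milliseconds.
TOP_LEVEL_MINIMUMS_MS = {
    "accelerationGracePeriodMs": 3_600_000,
    "accelerationRefreshPeriodMs": 3_600_000,
}
METADATA_POLICY_MINIMUMS_MS = {
    "authTTLMs": 60_000,
    "namesRefreshMs": 60_000,
    "datasetRefreshAfterMs": 60_000,
    "datasetExpireAfterMs": 60_000,
}


def check_source_body(body: dict) -> list:
    """Return a list of problems found in a request body; empty means none."""
    problems = []
    if body.get("entityType") != "source":
        problems.append("entityType must be 'source'")

    name = body.get("name", "")
    bad_chars = [ch for ch in FORBIDDEN_NAME_CHARS if ch in name]
    if bad_chars:
        problems.append(f"name contains forbidden characters: {' '.join(bad_chars)}")

    for key, minimum in TOP_LEVEL_MINIMUMS_MS.items():
        if key in body and body[key] < minimum:
            problems.append(f"{key} is below the minimum of {minimum} ms")

    policy = body.get("metadataPolicy", {})
    for key, minimum in METADATA_POLICY_MINIMUMS_MS.items():
        if key in policy and policy[key] < minimum:
            problems.append(f"metadataPolicy.{key} is below the minimum of {minimum} ms")

    return problems
```

A body named sales/2024 with accelerationGracePeriodMs set to 1000 would yield two problems, one for the name and one for the grace period.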
Example Request
curl -X POST 'https://api.dremio.cloud/v0/projects/02d36975-73eb-47ed-9bb5-de73060380f6/catalog' \
-H 'Authorization: Bearer <personal access token>' \
-H 'Content-Type: application/json' \
-d '{
"entityType": "source",
"config": {
"hostname": "177.15.0.112",
"port": "3306",
"authenticationType": "ANONYMOUS",
"netWriteTimeout": 60,
"fetchSize": 200,
"maxIdleConns": 8,
"idleTimeSec": 60,
"propertyList": []
},
"type": "MYSQL",
"name": "recruitingdb",
"metadataPolicy": {
"authTTLMs": 86400000,
"namesRefreshMs": 86400000,
"datasetRefreshAfterMs": 86400000,
"datasetExpireAfterMs": 259200000,
"datasetUpdateMode": "PREFETCH_QUERIED",
"deleteUnavailableDatasets": true
},
"accelerationGracePeriodMs": 10800000,
"accelerationRefreshPeriodMs": 3600000,
"accelerationActivePolicyType": "PERIOD",
"accessControlList": {
"roles": [
{
"id": "9ab42853-bdef-465f-b9bb-381a13a9bf78",
"permissions": []
}
],
"users": []
}
}'
Example Response
{
"entityType": "source",
"config": {
"hostname": "177.15.0.112",
"port": "3306",
"authenticationType": "ANONYMOUS",
"netWriteTimeout": 60,
"fetchSize": 200,
"maxIdleConns": 8,
"idleTimeSec": 60,
"propertyList": []
},
"id": "629904cf-9c06-4ae6-8cc1-faf1d2f1ab5f",
"tag": "a927dd14-2c7d-4684-9de2-4000e05c0d07",
"type": "MYSQL",
"name": "recruitingdb",
"createdAt": "2022-09-12T07:22:19.334Z",
"metadataPolicy": {
"authTTLMs": 86400000,
"namesRefreshMs": 86400000,
"datasetRefreshAfterMs": 86400000,
"datasetExpireAfterMs": 259200000,
"datasetUpdateMode": "PREFETCH_QUERIED",
"deleteUnavailableDatasets": true
},
"accelerationGracePeriodMs": 10800000,
"accelerationRefreshPeriodMs": 3600000,
"accelerationActivePolicyType": "PERIOD",
"accelerationNeverExpire": false,
"accelerationNeverRefresh": false,
"children": [
{
"id": "42312a90-1723-4bcb-b62c-bf9abee371b1",
"path": [
"recruitingdb",
"contacts"
],
"type": "CONTAINER",
"containerType": "FOLDER"
}
],
"allowCrossSourceSelection": false,
"disableMetadataValidityCheck": false,
"accessControlList": {
"roles": [
{
"id": "9ab42853-bdef-465f-b9bb-381a13a9bf78",
"permissions": []
}
],
"users": []
},
"permissions": [],
"checkTableAuthorizer": false,
"accelerationRefreshOnDataChanges": false,
"owner": {
"ownerId": "5c9c394d-7395-4687-97dd-beac4384c359",
"ownerType": "USER"
}
}
Response Status Codes
200 OK
400 Bad Request
401 Unauthorized
403 Forbidden
502 Bad Gateway
504 Gateway Timeout
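The recommendation to poll Retrieving a Source by Path after a 502 or 504 response, rather than retrying the POST, could be sketched in Python as follows. This is a minimal illustration using only the standard library. The function names, the attempt count, and the delay are hypothetical, and the sketch assumes that the by-path endpoint answers 404 Not Found until the source exists.

```python
import json
import time
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "https://api.dremio.cloud/v0"


def get_source_by_path(project_id, name, token):
    """Return the source object, or None if the API answers 404."""
    encoded = urllib.parse.quote(name, safe="")
    url = f"{BASE_URL}/projects/{project_id}/catalog/by-path/{encoded}"
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    try:
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 404:  # Assumption: the source is not visible yet.
            return None
        raise


def wait_for_source(lookup, attempts=30, delay_s=10.0, sleep=time.sleep):
    """Call lookup() until it returns a source object or attempts run out."""
    for attempt in range(attempts):
        source = lookup()
        if source is not None:
            return source
        if attempt < attempts - 1:
            sleep(delay_s)
    raise TimeoutError("source was not visible after polling")
```

After a create request that ended in 502 or 504, a caller might run wait_for_source(lambda: get_source_by_path(project_id, "recruitingdb", token)), where project_id and token are the caller's own values.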
Retrieving a Source by ID
Get information about the source using the source ID.
Method and URL
GET /v0/projects/{project-id}/catalog/{id}
Parameters
project-id Path String (UUID)
The unique identifier for the project that contains the source.
Example: 02d36975-73eb-47ed-9bb5-de73060380f6
id Path String (UUID)
The unique identifier for the source that you want to retrieve.
Example: ffbe8c1d-1db7-48d1-9c58-f452838fedc0
include Query String Optional
When you use include, the named non-default response field is returned. You can retrieve either a list of access control list names (aclNames) or a list of permissions. Read Include and Exclude Query Parameters for usage examples.
Enum: aclNames, permissions
Example: ?include=permissions
exclude Query String Optional
When using exclude, the children field is excluded from the response. Read Include and Exclude Query Parameters for usage examples.
Enum: children
Example: ?exclude=children
maxChildren Query Integer Optional
Specify the maximum number of child objects to include in each page of results. Use in concert with the pageToken query parameter to split large sets of results into multiple pages. For more information, read maxChildren Query Parameter.
NOTE: The maxChildren query parameter is not supported for filesystem sources.
Example: ?maxChildren=25
pageToken Query String Optional
Specify the token for retrieving the next page of results. Must be used in concert with the maxChildren query parameter: the first request URL includes maxChildren set to the maximum number of child objects to include in each page of results. If the source has more child objects than the specified maxChildren value, the response includes a nextPageToken attribute. Add the pageToken query parameter with the nextPageToken value to the request URL to retrieve the next page of results. Do not remove or change the maxChildren query parameter when you add pageToken to the request URL. Read pageToken Query Parameter: User-Specified Maximum for more information.
NOTE: Dremio ignores the pageToken query parameter for filesystem sources.
Example: ?pageToken=cHAAFLceQCKsTVpwaEVisqgjDntZJUCuTqVNghPdkyBDUNoJvwrEXAMPLE
Example Request
curl -X GET 'https://api.dremio.cloud/v0/projects/02d36975-73eb-47ed-9bb5-de73060380f6/catalog/ffbe8c1d-1db7-48d1-9c58-f452838fedc0' \
-H 'Authorization: Bearer <personal access token>' \
-H 'Content-Type: application/json'
Example Response
{
"entityType": "source",
"config": {
"hostname": "177.15.0.112",
"port": "3306",
"authenticationType": "ANONYMOUS",
"netWriteTimeout": 60,
"fetchSize": 200,
"maxIdleConns": 8,
"idleTimeSec": 60,
"propertyList": []
},
"id": "629904cf-9c06-4ae6-8cc1-faf1d2f1ab5f",
"tag": "a927dd14-2c7d-4684-9de2-4000e05c0d07",
"type": "MYSQL",
"name": "recruitingdb",
"createdAt": "2022-09-12T07:22:19.334Z",
"metadataPolicy": {
"authTTLMs": 86400000,
"namesRefreshMs": 86400000,
"datasetRefreshAfterMs": 86400000,
"datasetExpireAfterMs": 259200000,
"datasetUpdateMode": "PREFETCH_QUERIED",
"deleteUnavailableDatasets": true
},
"accelerationGracePeriodMs": 10800000,
"accelerationRefreshPeriodMs": 3600000,
"accelerationActivePolicyType": "PERIOD",
"accelerationNeverExpire": false,
"accelerationNeverRefresh": false,
"children": [
{
"id": "42312a90-1723-4bcb-b62c-bf9abee371b1",
"path": [
"recruitingdb",
"contacts"
],
"type": "CONTAINER",
"containerType": "FOLDER"
}
],
"allowCrossSourceSelection": false,
"disableMetadataValidityCheck": false,
"accessControlList": {},
"permissions": [],
"checkTableAuthorizer": false,
"accelerationRefreshOnDataChanges": false,
"owner": {
"ownerId": "5c9c394d-7395-4687-97dd-beac4384c359",
"ownerType": "USER"
}
}
Response Status Codes
200 OK
400 Bad Request
401 Unauthorized
403 Forbidden
404 Not Found
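The maxChildren and pageToken parameters described above work together: the first request sets maxChildren, and each following request repeats it and adds the nextPageToken value from the previous response. The following Python sketch shows that loop. The names build_source_url, iter_children, and fetch_json are hypothetical; fetch_json stands for any caller-supplied function that performs an authenticated GET and returns the parsed JSON body, and the sketch assumes that nextPageToken is a top-level attribute of the response.

```python
import urllib.parse

BASE_URL = "https://api.dremio.cloud/v0"


def build_source_url(project_id, source_id, max_children, page_token=None):
    """Build the request URL, keeping maxChildren on every page."""
    query = {"maxChildren": max_children}
    if page_token:
        query["pageToken"] = page_token
    encoded = urllib.parse.urlencode(query)
    return f"{BASE_URL}/projects/{project_id}/catalog/{source_id}?{encoded}"


def iter_children(fetch_json, project_id, source_id, max_children=25):
    """Yield every child object, following nextPageToken until it is absent."""
    page_token = None
    while True:
        url = build_source_url(project_id, source_id, max_children, page_token)
        page = fetch_json(url)
        yield from page.get("children", [])
        page_token = page.get("nextPageToken")
        if not page_token:
            break
```

Because the loop never drops or changes maxChildren between pages, it follows the rule stated in the pageToken description.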
Retrieving a Source by Path
Get information about the source using the source's path.
Method and URL
GET /v0/projects/{project-id}/catalog/by-path/{path}
Parameters
project-id Path String (UUID)
The unique identifier for the project that contains the source.
Example: 02d36975-73eb-47ed-9bb5-de73060380f6
path Path String
Source's location within Dremio, using forward slashes as separators. For example, for the source "Samples," the path is Samples. If the name of any component in the path includes special characters for URLs, such as spaces, use URL encoding to replace the special characters with their UTF-8-equivalent characters. For example, "Dremio University" should be Dremio%20University in the URL path.
Example: recruitingdb
maxChildren Query Integer Optional
Specify the maximum number of child objects to include in each page of results. Use in concert with the pageToken query parameter to split large sets of results into multiple pages. For more information, read maxChildren Query Parameter.
NOTE: The maxChildren query parameter is not supported for filesystem sources.
Example: ?maxChildren=25
pageToken Query String Optional
Specify the token for retrieving the next page of results. Must be used in concert with the maxChildren query parameter: the first request URL includes maxChildren set to the maximum number of child objects to include in each page of results. If the source has more child objects than the specified maxChildren value, the response includes a nextPageToken attribute. Add the pageToken query parameter with the nextPageToken value to the request URL to retrieve the next page of results. Do not remove or change the maxChildren query parameter when you add pageToken to the request URL. Read pageToken Query Parameter: User-Specified Maximum for more information.
NOTE: Dremio ignores the pageToken query parameter for filesystem sources.
Example: ?pageToken=cHAAFLceQCKsTVpwaEVisqgjDntZJUCuTqVNghPdkyBDUNoJvwrEXAMPLE
Example Request
curl -X GET 'https://api.dremio.cloud/v0/projects/02d36975-73eb-47ed-9bb5-de73060380f6/catalog/by-path/recruitingdb' \
-H 'Authorization: Bearer <personal access token>' \
-H 'Content-Type: application/json'
Example Response
{
"entityType": "source",
"config": {
"hostname": "177.15.0.112",
"port": "3306",
"authenticationType": "ANONYMOUS",
"netWriteTimeout": 60,
"fetchSize": 200,
"maxIdleConns": 8,
"idleTimeSec": 60,
"propertyList": []
},
"id": "629904cf-9c06-4ae6-8cc1-faf1d2f1ab5f",
"tag": "a927dd14-2c7d-4684-9de2-4000e05c0d07",
"type": "MYSQL",
"name": "recruitingdb",
"createdAt": "2022-09-12T07:22:19.334Z",
"metadataPolicy": {
"authTTLMs": 86400000,
"namesRefreshMs": 86400000,
"datasetRefreshAfterMs": 86400000,
"datasetExpireAfterMs": 259200000,
"datasetUpdateMode": "PREFETCH_QUERIED",
"deleteUnavailableDatasets": true
},
"accelerationGracePeriodMs": 10800000,
"accelerationRefreshPeriodMs": 3600000,
"accelerationActivePolicyType": "PERIOD",
"accelerationNeverExpire": false,
"accelerationNeverRefresh": false,
"children": [
{
"id": "42312a90-1723-4bcb-b62c-bf9abee371b1",
"path": [
"recruitingdb",
"contacts"
],
"type": "CONTAINER",
"containerType": "FOLDER"
}
],
"allowCrossSourceSelection": false,
"disableMetadataValidityCheck": false,
"accessControlList": {},
"permissions": [],
"checkTableAuthorizer": false,
"accelerationRefreshOnDataChanges": false,
"owner": {
"ownerId": "5c9c394d-7395-4687-97dd-beac4384c359",
"ownerType": "USER"
}
}
Response Status Codes
200 OK
400 Bad Request
401 Unauthorized
403 Forbidden
404 Not Found
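The maxChildren and pageToken query parameters described for this endpoint can be combined into a simple paging loop. The following Python sketch is illustrative only: the helper names build_by_path_url and iter_children are hypothetical, and fetch_page is assumed to be a caller-supplied function that sends the authenticated GET request and returns the parsed JSON response. Because Dremio ignores pageToken for filesystem sources, the loop is only meaningful for other source types.

```python
from typing import Callable, Iterator, Optional
from urllib.parse import quote, urlencode

BASE_URL = "https://api.dremio.cloud/v0"


def build_by_path_url(project_id: str, path: str, max_children: int,
                      page_token: Optional[str] = None) -> str:
    """Build the by-path URL, keeping maxChildren on every request."""
    query = {"maxChildren": max_children}
    if page_token:
        # pageToken is added alongside maxChildren, never instead of it.
        query["pageToken"] = page_token
    return (f"{BASE_URL}/projects/{project_id}/catalog/by-path/"
            f"{quote(path)}?{urlencode(query)}")


def iter_children(fetch_page: Callable[[str], dict], project_id: str,
                  path: str, max_children: int) -> Iterator[dict]:
    """Yield child objects page by page until nextPageToken is absent."""
    token = None
    while True:
        page = fetch_page(build_by_path_url(project_id, path,
                                            max_children, token))
        yield from page.get("children", [])
        token = page.get("nextPageToken")
        if not token:
            break
```

Passing the request function in as an argument keeps the paging logic independent of any particular HTTP client.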
Updating a Source
Edit the source information using the source ID.
Method and URL
PUT /v0/projects/{project-id}/catalog/{id}
When you update a source that supports dataset discovery, metadata is initially populated in the catalog for each table that is accessible in the source. If the source contains a very large number of tables, this process may take longer than the request timeout configured by the Dremio Cloud gateway, and the request will return a 502 or 504 response code. In this case, instead of retrying the request, we recommend that you poll the Retrieving a Source by Path endpoint until the update completes. The source is updated when the value of the tag attribute in the response changes.
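The polling approach recommended above might look like the following Python sketch. It is not an official client: wait_for_tag_change is a hypothetical helper, fetch_source is assumed to be a caller-supplied function that calls the Retrieving a Source by Path endpoint and returns the parsed JSON, and the interval and timeout defaults are arbitrary choices rather than documented values.

```python
import time
from typing import Callable


def wait_for_tag_change(fetch_source: Callable[[], dict],
                        previous_tag: str,
                        interval_s: float = 10.0,
                        timeout_s: float = 600.0,
                        sleep: Callable[[float], None] = time.sleep,
                        clock: Callable[[], float] = time.monotonic) -> dict:
    """Poll until the source's tag differs from previous_tag.

    Returns the updated source object, or raises TimeoutError if the
    tag has not changed once timeout_s seconds have elapsed.
    """
    deadline = clock() + timeout_s
    while True:
        source = fetch_source()
        # A changed tag signals that the update has been applied.
        if source.get("tag") != previous_tag:
            return source
        if clock() >= deadline:
            raise TimeoutError("source tag did not change before the timeout")
        sleep(interval_s)
```

Record the tag value from the source object before sending the PUT request, then pass it as previous_tag if the request returns 502 or 504.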
Parameters
project-id Path String (UUID)
The unique identifier for the project that contains the source.
Example: 02d36975-73eb-47ed-9bb5-de73060380f6
id Path String (UUID)
The unique identifier for the source that you want to update.
Example: ffbe8c1d-1db7-48d1-9c58-f452838fedc0
entityType Body String
Specifies the type of container. For sources, the type is source.
config Body Object
Configuration options for the specified data source.
type Body String
Identifies the source type that you are configuring. The source type determines the config parameters that display. Read Parameters of the config Object for more information.
Enum: ARCTIC, AWSGLUE, MSSQL, MYSQL, ORACLE, POSTGRES, REDSHIFT, S3, SNOWFLAKE
name Body String
The user-defined name of the source.
Example: recruitingdb
metadataPolicy Body Object
The policies covering the update of a source’s metadata.
accelerationGracePeriodMs Body Integer
Identifies the length of time to keep reflections for all the datasets in a source before these reflections expire. The default setting is 10800000 milliseconds or three hours. The minimum amount of time that is supported is 3600000 milliseconds or one hour. Read Setting the Expiration Policy for Reflections for more information.
Example: 10800000
accelerationRefreshPeriodMs Body Integer
Identifies the refresh frequency for a dataset's reflections. Optional if you set accelerationActivePolicyType to PERIOD. The default setting is 3600000 milliseconds or one hour, which is also the minimum amount of time that is supported.
Example: 3600000
accelerationActivePolicyType Body String
Option to set the policy for refreshing reflections that are defined on the source. For this option to take effect, the accelerationNeverRefresh parameter must be set to false.
The possible values are:
NEVER: The reflections are never refreshed.
PERIOD: The reflections are refreshed at the end of every period that is defined by accelerationRefreshPeriodMs.
SCHEDULE: The reflections are refreshed according to the schedule that is set by accelerationRefreshSchedule.
accelerationRefreshSchedule String
A cron expression that sets the schedule, in UTC time, according to which the reflections that are defined on the source are refreshed. Optional if you set accelerationActivePolicyType to SCHEDULE. The default accelerationRefreshSchedule setting is to refresh every day at 8:00 a.m.
Field | Allowed Values | Allowed Special Characters |
---|---|---|
Second | 0 | N/A |
Minute | 0-59 | N/A |
Hour | 0-23 | N/A |
Day of month | N/A | * ? |
Month | N/A | * ? |
Days of week | 1-7 or SUN-SAT | , - * ? |

Special Character | Description |
---|---|
* | Used to specify all values for a field. For Day of month, specifies every day of the month. For Month, specifies every month. For Days of week, specifies every day of the week. |
? | Equivalent to *. |
, | Used to specify two or more days in the Days of week field. For example, MON,WED,FRI. |
- | Used to specify ranges in the Days of week field. For example, 1-3 is equivalent to Sunday, Monday, and Tuesday. |
Examples:
0 0 0 * * ?: Refreshes every day at midnight.
0 45 15 * * 1,4,7: Refreshes at 15:45 on Sunday, Wednesday, and Saturday.
0 15 7 ? * 2-6: Refreshes at 7:15 on Monday through Friday.
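As a rough illustration of the field rules in the tables above, the following Python sketch checks an accelerationRefreshSchedule value before it is sent. It is not Dremio's own parser: the function names are hypothetical, and it makes two assumptions the tables do not confirm, namely that comma lists and ranges may be mixed in the Days of week field and that ranges must run from a lower to a higher day number.

```python
DAY_NAMES = ["SUN", "MON", "TUE", "WED", "THU", "FRI", "SAT"]


def _day_number(token: str) -> int:
    """Map 1-7 or SUN-SAT to a day number, where 1 is Sunday."""
    token = token.upper()
    if token in DAY_NAMES:
        return DAY_NAMES.index(token) + 1
    if token.isdigit() and 1 <= int(token) <= 7:
        return int(token)
    raise ValueError(f"invalid day of week: {token!r}")


def parse_days_of_week(field: str) -> list:
    """Expand the Days of week field into sorted day numbers."""
    if field in ("*", "?"):
        return list(range(1, 8))
    days = set()
    for part in field.split(","):
        if "-" in part:
            start, end = (_day_number(p) for p in part.split("-", 1))
            if start > end:
                # Assumption: wrap-around ranges are rejected.
                raise ValueError(f"invalid range: {part!r}")
            days.update(range(start, end + 1))
        else:
            days.add(_day_number(part))
    return sorted(days)


def validate_refresh_schedule(expr: str) -> dict:
    """Validate a six-field expression and summarize when it fires."""
    fields = expr.split()
    if len(fields) != 6:
        raise ValueError("expected six space-separated fields")
    second, minute, hour, day_of_month, month, days_of_week = fields
    if second != "0":
        raise ValueError("Second must be 0")
    if not (minute.isdigit() and 0 <= int(minute) <= 59):
        raise ValueError("Minute must be 0-59")
    if not (hour.isdigit() and 0 <= int(hour) <= 23):
        raise ValueError("Hour must be 0-23")
    if day_of_month not in ("*", "?") or month not in ("*", "?"):
        raise ValueError("Day of month and Month accept only * or ?")
    days = [DAY_NAMES[d - 1] for d in parse_days_of_week(days_of_week)]
    return {"hour": int(hour), "minute": int(minute), "days": days}
```

For example, validate_refresh_schedule("0 45 15 * * 1,4,7") returns hour 15, minute 45, and the days SUN, WED, and SAT.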
accelerationRefreshOnDataChanges Body Boolean
Set to true to refresh reflections on underlying tables that are in Iceberg format in the source when new snapshots are created after an update. Otherwise, set to false. Reflections that are automatically updated based on Iceberg source table changes also update according to the source-level policy as the minimum refresh frequency. For this option to take effect, the source must support the Iceberg table format, the accelerationNeverRefresh parameter must be set to false, and the accelerationActivePolicyType parameter must be set to either PERIOD or SCHEDULE.
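The acceleration parameters described above depend on one another, so it can help to check a request body before sending it. The following Python sketch is a hypothetical client-side check, not part of the API: check_acceleration_settings is an invented helper, the fallback values mirror the defaults stated above, and it cannot verify whether the source itself supports the Iceberg table format.

```python
from typing import List

MIN_PERIOD_MS = 3_600_000  # one hour, the documented minimum
POLICY_TYPES = ("NEVER", "PERIOD", "SCHEDULE")


def check_acceleration_settings(source: dict) -> List[str]:
    """Return human-readable problems found in the acceleration settings."""
    problems = []
    never_refresh = source.get("accelerationNeverRefresh", False)
    policy = source.get("accelerationActivePolicyType")

    if source.get("accelerationGracePeriodMs", 10_800_000) < MIN_PERIOD_MS:
        problems.append("accelerationGracePeriodMs is below 3600000")
    if source.get("accelerationRefreshPeriodMs", 3_600_000) < MIN_PERIOD_MS:
        problems.append("accelerationRefreshPeriodMs is below 3600000")
    if policy is not None and policy not in POLICY_TYPES:
        problems.append(f"unknown accelerationActivePolicyType: {policy}")
    if policy is not None and never_refresh:
        # The policy type only takes effect when accelerationNeverRefresh
        # is false.
        problems.append("accelerationActivePolicyType has no effect while "
                        "accelerationNeverRefresh is true")
    if source.get("accelerationRefreshOnDataChanges") and (
            never_refresh or policy not in ("PERIOD", "SCHEDULE")):
        problems.append("accelerationRefreshOnDataChanges requires "
                        "accelerationNeverRefresh=false and a PERIOD or "
                        "SCHEDULE policy")
    return problems
```

An empty list means that none of these documented constraints were violated; the API may still reject the request for other reasons.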
accessControlList Body