Jobs
Dremio Arctic enables you to schedule optimization jobs to help you manage the accumulation of the data files that occurs through DML operations. This API allows you to run a one-off job, list existing jobs, retrieve job status, and cancel a job that is running.
Jobs Object
{
"type": "OPTIMIZE",
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-23T20:25:44Z",
"endedAt": "2023-02-23T20:28:50Z",
"engineSize": "XX_SMALL_V1",
"scheduleId": "4696f3ba-eefd-174c-681b-0d753ed5ad85",
"config": {
"tableId": "zip_lookup",
"reference": "main",
"targetFileSize": "256 MB",
"minFileSize": "192 MB",
"maxFileSize": "460 MB",
"minFiles": 5
},
"metrics": {
"rewrittenDataFiles": 6,
"newDataFiles": 2,
"rewrittenDeleteFiles": 5
}
}
Job Attributes
type String
The type of the job.
Enum: OPTIMIZE, VACUUM
Example: OPTIMIZE
catalogId String (UUID)
Unique identifier for the catalog where the job ran.
Example: 5d138bd7-6513-46d7-b9cb-d236e195b34a
id String (UUID)
Unique identifier for the job.
Example: 8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42
state String
Status of the job.
Enum: SETUP, QUEUED, STARTING, RUNNING, COMPLETED, CANCELLED, FAILED
Example: COMPLETED
username String
The user who created the job.
Example: dremio_user@company.com
startedAt String
Date and time when the job started, in UTC format.
Example: 2023-02-23T20:25:44Z
endedAt String
Date and time when the job ended, in UTC format.
Example: 2023-02-23T20:28:50Z
engineSize String
Engine size used by the job.
Enum: XX_SMALL_V1, X_SMALL_V1, SMALL_V1, MEDIUM_V1, LARGE_V1, X_LARGE_V1, XX_LARGE_V1, XXX_LARGE_V1
Example: XX_SMALL_V1
scheduleId String (UUID)
Unique identifier for the schedule that created the job, if applicable. Empty for jobs created with the Arctic Jobs API or the Optimize Once option in the Dremio console.
Example: 4696f3ba-eefd-174c-681b-0d753ed5ad85
errorMessage String
For unsuccessful jobs, a description of the problem. Not included for successful jobs.
Example: Job has failed due to an internal error. Please contact Support if this issue persists.
config Object
Configuration options for the job. Not included for VACUUM job types.
Example: {"tableId": "zip_lookup","reference": "main","targetFileSize": "256 MB","minFileSize": "192 MB","maxFileSize": "460 MB","minFiles": 5}
metrics Object
For OPTIMIZE job types, information about the number of rewritten and new data files and removed Iceberg DeleteFiles that resulted from the job. For VACUUM job types, information about the number of files the job removed.
Example: {"rewrittenDataFiles": 6,"newDataFiles": 2,"rewrittenDeleteFiles": 5}
Attributes of the config Object
tableId String
The name of the table that the optimization job applies to. The table name is preceded by zero or more namespaces with a period (.) as the separator.
Example: zip_lookup, address.zip_lookup
reference String
Identifies the branch, tag, or commit that contains the table the optimization job applies to.
Example: main
targetFileSize String
Controls the target size of the files that are generated for type OPTIMIZE. The integer must be non-negative. The default file size is 256 MB.
Example: 256 MB
minFileSize String
The minimum file size threshold for type OPTIMIZE. Files that are smaller than the specified minFileSize qualify for optimization. The integer must be non-negative. The default value is 192 MB.
Example: 192 MB
maxFileSize String
The maximum file size threshold for type OPTIMIZE. Files that are larger than the specified maxFileSize qualify for optimization. The integer must be non-negative. The default value is 460 MB.
Example: 460 MB
minFiles String
The minimum number of qualified files required to run a table optimization job. Supports only type OPTIMIZE. The default is 5.
Example: 5
Attributes of the metrics Object
rewrittenDataFiles Integer
Number of data files that were rewritten in the job. Included only for OPTIMIZE job types.
Example: 6
newDataFiles Integer
Number of new data files that were created in the job. Included only for OPTIMIZE job types.
Example: 2
rewrittenDeleteFiles Integer
Number of Iceberg DeleteFiles that were removed. Included only for OPTIMIZE job types.
Example: 5
deletedFilesCount Integer
Number of files that were removed by the VACUUM CATALOG command. Included only for VACUUM job types.
Example: 91
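As an illustration of how the attributes above fit together, the following minimal Python sketch derives a job's run time from startedAt and endedAt. The helper names parse_utc and job_duration_seconds are hypothetical, and the sketch assumes timestamps always use the second-precision form with a trailing Z shown in the examples.

```python
from datetime import datetime, timezone

# Sample job trimmed from the Jobs Object example in this section.
job = {
    "type": "OPTIMIZE",
    "state": "COMPLETED",
    "startedAt": "2023-02-23T20:25:44Z",
    "endedAt": "2023-02-23T20:28:50Z",
}


def parse_utc(timestamp: str) -> datetime:
    """Parse a timestamp such as 2023-02-23T20:25:44Z into an aware datetime.

    Assumption: second precision and a literal trailing Z, as in the examples.
    """
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def job_duration_seconds(job: dict):
    """Return the run time in seconds, or None if either timestamp is absent."""
    if "startedAt" not in job or "endedAt" not in job:
        return None
    delta = parse_utc(job["endedAt"]) - parse_utc(job["startedAt"])
    return delta.total_seconds()


print(job_duration_seconds(job))  # 186.0
```

Jobs that have not finished may omit endedAt (the SETUP responses later in this section do), which is why the helper returns None instead of raising.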
Creating a Job
Create a job for the specified Arctic catalog.
Method and URL
POST /v0/arctic/catalogs/{catalogId}/jobs
Parameters
catalogId Path String (UUID)
Unique identifier for the catalog where the job should run.
Example: 5d138bd7-6513-46d7-b9cb-d236e195b34a
type Body String
The type of job to run.
Enum: OPTIMIZE, VACUUM
Example: OPTIMIZE
config Body Object
Configuration options for the job. Supported only for OPTIMIZE job types.
Example: {"tableId": "zip_lookup","reference": "main","targetFileSize": "256 MB","minFileSize": "192 MB","maxFileSize": "460 MB","minFiles": 5}
Parameters of the config Object
tableId Body String
The name of the table that the job applies to. The table name is preceded by zero or more namespaces with a period (.) as the separator.
Example: zip_lookup, address.zip_lookup
reference Body String
Identifies the branch, tag, or commit that contains the table the job applies to.
Example: main
targetFileSize Body String Optional
Controls the target size of the files that will be generated for type OPTIMIZE. The integer must be non-negative. The default file size is 256 MB.
Example: 256 MB
minFileSize Body String Optional
The minimum file size threshold for type OPTIMIZE. Files that are smaller than the specified minFileSize qualify for optimization. The integer must be non-negative. The default value is 192 MB.
Example: 192 MB
maxFileSize Body String Optional
The maximum file size threshold for type OPTIMIZE. Files that are larger than the specified maxFileSize qualify for optimization. The integer must be non-negative. The default value is 460 MB.
Example: 460 MB
minFiles Body String Optional
The minimum number of qualified files required to run the job. Supports only type OPTIMIZE. The default is 5.
Example: 5
Example Request for an OPTIMIZE-Type Job
curl -X POST 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json' \
--data-raw '{
"type": "OPTIMIZE",
"config": {
"tableId": "zip_lookup",
"reference": "main",
"targetFileSize": "256 MB",
"minFileSize": "192 MB",
"maxFileSize": "460 MB",
"minFiles": 5
}
}'
Example Response for an OPTIMIZE-Type Job
{
"type": "OPTIMIZE",
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
"state": "SETUP",
"username": "dremio_user@company.com",
"engineSize": "XX_SMALL_V1",
"config": {
"tableId": "zip_lookup",
"reference": "main",
"targetFileSize": "256 MB",
"minFileSize": "192 MB",
"maxFileSize": "460 MB",
"minFiles": 5
}
}
Example Request for a VACUUM-Type Job
curl -X POST 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json' \
--data-raw '{
"type": "VACUUM"
}'
Example Response for a VACUUM-Type Job
{
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "cb2605d9-1758-4ca2-a209-2ea8e4fc8f56",
"state": "SETUP",
"type": "VACUUM",
"username": "dremio_user@company.com",
"startedAt": "2023-09-14T20:38:48Z",
"engineSize": "X_SMALL_V1"
}
Response Status Codes
200 OK
400 Bad Request
401 Unauthorized
404 Not Found
500 Internal Server Error
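For callers who prefer a script to curl, the following Python sketch builds the same POST request with the standard library. The function name build_create_job_request is hypothetical. Treating tableId and reference as required for OPTIMIZE jobs is an assumption, inferred from those parameters not being marked Optional above. The request is only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.dremio.cloud/v0/arctic/catalogs"
JOB_TYPES = ("OPTIMIZE", "VACUUM")


def build_create_job_request(catalog_id, token, job_type, config=None):
    """Build (but do not send) a request that creates a job."""
    if job_type not in JOB_TYPES:
        raise ValueError(f"type must be one of {JOB_TYPES}")
    if job_type == "VACUUM" and config is not None:
        # The config object is supported only for OPTIMIZE job types.
        raise ValueError("config is supported only for OPTIMIZE jobs")
    body = {"type": job_type}
    if job_type == "OPTIMIZE":
        # Assumption: tableId and reference are required for OPTIMIZE.
        missing = {"tableId", "reference"} - set(config or {})
        if missing:
            raise ValueError(f"config is missing {sorted(missing)}")
        body["config"] = config
    return urllib.request.Request(
        f"{BASE_URL}/{catalog_id}/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_job_request(
    "5d138bd7-6513-46d7-b9cb-d236e195b34a",
    "<personal access token>",
    "OPTIMIZE",
    {"tableId": "zip_lookup", "reference": "main", "minFiles": 5},
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```

Optional settings such as targetFileSize can be left out of config, in which case the defaults listed in the parameter descriptions apply.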
Listing All Jobs
Returns a listing of all jobs for the specified Arctic catalog.
Method and URL
GET /v0/arctic/catalogs/{catalogId}/jobs
Parameters
catalogId Path String (UUID)
Unique identifier of the catalog whose jobs you want to list.
Example: 5d138bd7-6513-46d7-b9cb-d236e195b34a
pageToken Query String Optional
Token for retrieving the next page of jobs. If the Dremio instance has more jobs than the maximum per page (default 10), the response will include a nextPageToken after the data array. Use the nextPageToken value in your request URL as the pageToken value. Do not change any other query parameters included in the request URL when you use pageToken. Read pageToken Query Parameter for usage examples.
maxResults Query Integer Optional
The maximum number of results to return in the response. The server may return fewer results, but not more. The default is 10. Read maxResults Query Parameter for usage examples.
filter Query Object Optional
A common expression language (CEL) expression that filters responses so that they include only results with the specified attributes and values. Filters for job type, tableId, user, reference, state, and quickfind. Value is a URL-encoded string that represents a JSON object. The JSON object specifies the attributes to filter on and the values to match for each attribute. Read filter Query Parameter for usage examples.
view Query String Optional
Level of detail to include in the response. Valid values are SUMMARY (default) and FULL (include the config and metrics objects in the job objects in the response). Read view Query Parameter for usage examples.
Parameters of the filter Object
type Query String
The type of job to match.
Enum: OPTIMIZE, VACUUM
Example: OPTIMIZE
tableId Query String
The name of the table that the job applies to. The table name is preceded by zero or more namespaces with a period (.) as the separator. If a tableId is used, then the type is OPTIMIZE.
Example: sample-NYC-taxi-trips, tripdata.sample-NYC-taxi-trips
reference Query String
Identifies the branch, tag, or commit that contains the table the job applies to. If a reference is used, then the type is OPTIMIZE.
Example: main
user Query String
The user who created the job.
Example: dremio_user@company.com
state Query String
The job's state. The enum lists the valid values.
Enum: SETUP, QUEUED, STARTING, RUNNING, COMPLETED, CANCELLED, FAILED
quickfind Query String
Composite field that will match the job ID, username, tableId, or reference.
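The query parameters above can be combined into a request URL along the following lines. This is a sketch under stated assumptions, not the documented encoding: it follows the description of filter as a URL-encoded string representing a JSON object, and the function name build_list_jobs_url is hypothetical. The pages referenced above for each query parameter remain the authority on exact syntax.

```python
import json
from urllib.parse import quote, urlencode

BASE_URL = "https://api.dremio.cloud/v0/arctic/catalogs"


def build_list_jobs_url(catalog_id, max_results=None, view=None,
                        filter_obj=None, page_token=None):
    """Build the URL for listing jobs, adding only the parameters supplied."""
    params = {}
    if max_results is not None:
        params["maxResults"] = max_results
    if view is not None:
        params["view"] = view  # SUMMARY (default) or FULL
    if filter_obj:
        # Assumption: the filter is sent as a compact, URL-encoded JSON object.
        params["filter"] = json.dumps(filter_obj, separators=(",", ":"))
    if page_token is not None:
        params["pageToken"] = page_token
    url = f"{BASE_URL}/{catalog_id}/jobs"
    if not params:
        return url
    # quote_via=quote percent-encodes spaces as %20 instead of '+'.
    return f"{url}?{urlencode(params, quote_via=quote)}"


print(build_list_jobs_url(
    "5d138bd7-6513-46d7-b9cb-d236e195b34a",
    max_results=5,
    view="FULL",
    filter_obj={"type": "OPTIMIZE", "state": "FAILED"},
))
```

When following a pageToken, pass the same max_results, view, and filter_obj as in the original request, since the other query parameters must not change between pages.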
Example Request
curl -X GET 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json'
In the response to a request that lists all jobs for a catalog, the job objects are wrapped in a data array. Each object in the data array represents one job.
Example Response
{
"data": [
{
"type": "OPTIMIZE",
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-23T20:25:44Z",
"endedAt": "2023-02-23T20:28:50Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "1907c1f6-5fa7-4cf2-ae9c-4089620f7e28",
"state": "FAILED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-23T19:06:52Z",
"endedAt": "2023-02-23T19:09:57Z",
"engineSize": "XX_SMALL_V1",
"errorMessage": "Job has failed due to an internal error. Please contact Support if this issue persists."
},
{
"type": "VACUUM",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "c0a2648f-3400-4613-8a5e-e4ee71386ca2",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T17:18:12Z",
"endedAt": "2023-02-16T17:20:18Z",
"engineSize": "XX_SMALL_V1",
"scheduleId": "55dbb0fa-d7e2-4520-9748-a48d4b96f837"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "7d0ebe6b-908e-42fe-b753-ba4b277a38de",
"state": "FAILED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T17:12:29Z",
"endedAt": "2023-02-16T17:14:34Z",
"engineSize": "XX_SMALL_V1",
"errorMessage": "Job has failed due to an internal error. Please contact Support if this issue persists.",
"scheduleId": "acb29b8c-4358-0a8f-4518-0def8a495081"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "e72dff53-bb7c-4737-8cd1-a788adc209b3",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T17:10:47Z",
"endedAt": "2023-02-16T17:14:52Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "3777cde6-88df-4477-8662-77f7e45e9b69",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T17:06:29Z",
"endedAt": "2023-02-16T17:10:34Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "90522152-efa5-4ed8-8692-7f1fd59b62a0",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T16:03:23Z",
"endedAt": "2023-02-16T16:06:29Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "830c7a49-1408-4047-a6df-4ecf684fb918",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T13:53:23Z",
"endedAt": "2023-02-16T14:07:30Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "12946fe4-267a-4398-808c-4e03a97fb961",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-15T23:14:43Z",
"endedAt": "2023-02-15T23:17:48Z",
"engineSize": "XX_SMALL_V1"
}
],
"nextPageToken": "a"
}
Response Status Codes
200 OK
400 Bad Request
401 Unauthorized
404 Not Found
500 Internal Server Error
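Because a listing response carries a nextPageToken whenever more jobs remain, a client typically loops until the token is absent. The following Python sketch shows that loop in isolation. The names iter_all_jobs and fetch_page are hypothetical, and fetch_page stands in for whatever function performs the actual GET request with the given pageToken.

```python
def iter_all_jobs(fetch_page):
    """Yield every job across pages.

    fetch_page(page_token) must return the parsed JSON response for one page;
    it is called with None for the first page.
    """
    page_token = None
    while True:
        page = fetch_page(page_token)
        yield from page.get("data", [])
        page_token = page.get("nextPageToken")
        if not page_token:
            break


# Canned pages that imitate the response shape, used here instead of the API.
fake_pages = {
    None: {"data": [{"id": "job-1"}, {"id": "job-2"}], "nextPageToken": "a"},
    "a": {"data": [{"id": "job-3"}]},
}

for job in iter_all_jobs(lambda token: fake_pages[token]):
    print(job["id"])
```

Passing the page fetcher in as a function keeps the paging logic independent of the HTTP client and makes it easy to exercise without network access.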
Retrieving a Job
Retrieve the specified job.
Method and URL
GET /v0/arctic/catalogs/{catalogId}/jobs/{id}
Parameters
catalogId Path String (UUID)
Unique identifier of the catalog that contains the job you want to retrieve.
Example: 5d138bd7-6513-46d7-b9cb-d236e195b34a
id Path String (UUID)
Unique identifier for the job you want to retrieve.
Example: 8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42
Example Request
curl -X GET 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs/8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json'
Example Response
{
"type": "OPTIMIZE",
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-23T20:25:44Z",
"endedAt": "2023-02-23T20:28:50Z",
"engineSize": "XX_SMALL_V1",
"config": {
"tableId": "zip_lookup",
"reference": "main",
"targetFileSize": "256 MB",
"minFileSize": "192 MB",
"maxFileSize": "460 MB",
"minFiles": 5
},
"metrics": {
"rewrittenDataFiles": 6,
"newDataFiles": 2,
"rewrittenDeleteFiles": 5
}
}
Response Status Codes
200 OK
400 Bad Request
401 Unauthorized
404 Not Found
500 Internal Server Error
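A common use of this endpoint is polling a job until it finishes. The following Python sketch illustrates one way to do that. Treating COMPLETED, CANCELLED, and FAILED as the terminal states is an assumption based on the state enum, and wait_for_job and get_job are hypothetical names, with get_job standing in for a function that retrieves and parses the job.

```python
import time

# Assumption: these three states are final; the others are still in progress.
TERMINAL_STATES = {"COMPLETED", "CANCELLED", "FAILED"}


def wait_for_job(get_job, poll_seconds=5.0, max_polls=60, sleep=time.sleep):
    """Poll get_job() until the job reaches a terminal state, then return it."""
    for _ in range(max_polls):
        job = get_job()
        if job["state"] in TERMINAL_STATES:
            return job
        sleep(poll_seconds)
    raise TimeoutError("job did not reach a terminal state in time")


# Simulated sequence of states, used here instead of calling the API.
states = iter(["SETUP", "QUEUED", "RUNNING", "COMPLETED"])
final = wait_for_job(lambda: {"state": next(states)}, sleep=lambda s: None)
print(final["state"])  # COMPLETED
```

For a FAILED job, the returned object's errorMessage attribute describes the problem.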
Canceling a Job
Cancel the specified job.
Method and URL
POST /v0/arctic/catalogs/{catalogId}/jobs/{id}/cancel
Parameters
catalogId Path String (UUID)
Unique identifier of the catalog that contains the job you want to cancel.
Example: 5d138bd7-6513-46d7-b9cb-d236e195b34a
id Path String (UUID)
Unique identifier for the job you want to cancel.
Example: 8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42
Example Request
curl -X POST 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs/8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42/cancel' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json'
Example Response
No response
Response Status Codes
204 No Content
400 Bad Request
403 Forbidden
404 Not Found
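The cancel request has no body and a successful call returns no content, so a client only needs the URL, the method, and the status code. The following Python sketch builds the request and maps the status codes listed above to their names. The function names are hypothetical and the request is constructed without being sent.

```python
import urllib.request

BASE_URL = "https://api.dremio.cloud/v0/arctic/catalogs"

# Status codes listed for this endpoint.
CANCEL_STATUS = {
    204: "No Content",
    400: "Bad Request",
    403: "Forbidden",
    404: "Not Found",
}


def build_cancel_request(catalog_id, job_id, token):
    """Build (but do not send) the POST request that cancels a job."""
    return urllib.request.Request(
        f"{BASE_URL}/{catalog_id}/jobs/{job_id}/cancel",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def cancel_succeeded(status_code):
    """Return True only for 204, the documented success status."""
    return status_code == 204


request = build_cancel_request(
    "5d138bd7-6513-46d7-b9cb-d236e195b34a",
    "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
    "<personal access token>",
)
print(request.get_method(), request.full_url)
```

Note that urllib.request.urlopen raises urllib.error.HTTPError for 4xx responses, so the error codes in CANCEL_STATUS would be read from the exception's code attribute when the request is actually sent.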