Jobs
Dremio Arctic enables you to schedule optimization jobs to help you manage the accumulation of data files that results from DML operations. This API allows you to run a one-off job, list existing jobs, retrieve a job's status, and cancel a running job.
Jobs Object
{
"type": "OPTIMIZE",
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-23T20:25:44Z",
"endedAt": "2023-02-23T20:28:50Z",
"engineSize": "XX_SMALL_V1",
"scheduleId": "4696f3ba-eefd-174c-681b-0d753ed5ad85",
"config": {
"tableId": "zip_lookup",
"reference": "main",
"targetFileSize": "256 MB",
"minFileSize": "192 MB",
"maxFileSize": "460 MB",
"minFiles": 5
},
"metrics": {
"rewrittenDataFiles": 6,
"newDataFiles": 2,
"rewrittenDeleteFiles": 5
}
}
Job Attributes
type
String
The type of the job.
Enum OPTIMIZE, VACUUM
Example OPTIMIZE
catalogId
String (UUID)
Unique identifier for the catalog where the job ran.
Example 5d138bd7-6513-46d7-b9cb-d236e195b34a
id
String (UUID)
Unique identifier for the job.
Example 8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42
state
String
Status of the job.
Enum SETUP, QUEUED, STARTING, RUNNING, COMPLETED, CANCELLED, FAILED
Example COMPLETED
username
String
The user who created the job.
Example dremio_user@company.com
startedAt
String
Date and time when the job started, in UTC format.
Example 2023-02-23T20:25:44Z
endedAt
String
Date and time when the job ended, in UTC format.
Example 2023-02-23T20:28:50Z
engineSize
String
Engine size used by the job.
Enum XX_SMALL_V1, X_SMALL_V1, SMALL_V1, MEDIUM_V1, LARGE_V1, X_LARGE_V1, XX_LARGE_V1, XXX_LARGE_V1
Example XX_SMALL_V1
scheduleId
String (UUID)
Unique identifier for the schedule that created the job, if applicable. Empty for jobs created with the Arctic Jobs API or the Optimize Once option in the Dremio console.
Example 4696f3ba-eefd-174c-681b-0d753ed5ad85
errorMessage
String
For unsuccessful jobs, a description of the problem. Not included for successful jobs.
Example Job has failed due to an internal error. Please contact Support if this issue persists.
config
Object
Configuration options for the job. Not included for VACUUM job types.
Example { "tableId": "zip_lookup", "reference": "main", "targetFileSize": "256 MB", "minFileSize": "192 MB", "maxFileSize": "460 MB", "minFiles": 5 }
metrics
Object
For OPTIMIZE job types, information about the number of rewritten and new data files and removed Iceberg DeleteFiles that resulted from the job. For VACUUM job types, information about the number of files the job removed.
Example { "rewrittenDataFiles": 6, "newDataFiles": 2, "rewrittenDeleteFiles": 5 }
config Object Attributes
tableId
String
The name of the table that the optimization job applies to. The table name is preceded by zero or more namespaces with a period (.) as the separator.
Example zip_lookup, address.zip_lookup
reference
String
Identifies the branch, tag, or commit where the table that the optimization job applies to is located.
Example main
targetFileSize
String
Controls the target size of the files that are generated for type OPTIMIZE. The integer must be non-negative. The default file size is 256 MB.
Example 256 MB
minFileSize
String
The minimum file size threshold for type OPTIMIZE. Files that are smaller in size than the specified minFileSize qualify for optimization. The integer must be non-negative. The default value is 192 MB.
Example 192 MB
maxFileSize
String
The maximum file size threshold for type OPTIMIZE. Files that are larger in size than the specified maxFileSize qualify for optimization. The integer must be non-negative. The default value is 460 MB.
Example 460 MB
minFiles
Integer
The minimum number of qualified files required to run a table optimization job. Supports only type OPTIMIZE. The default is 5.
Example 5
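The thresholds above can be read as a simple selection rule: files smaller than minFileSize or larger than maxFileSize qualify, and the job runs only when at least minFiles files qualify. The following is a minimal sketch of that rule, not Dremio's implementation. It assumes size strings have the form "<integer> MB" and that 1 MB is 1024 * 1024 bytes; the helper names parse_size_mb, qualifying_files, and should_optimize are hypothetical.

```python
# Illustrative sketch only. The helper names and the 1 MB = 1024 * 1024 bytes
# convention are assumptions, not part of the Arctic API.

def parse_size_mb(value: str) -> int:
    """Convert a size string such as "192 MB" to a byte count."""
    number, unit = value.split()
    if unit != "MB":
        raise ValueError(f"unsupported unit: {unit}")
    size = int(number)
    if size < 0:
        raise ValueError("size must be non-negative")
    return size * 1024 * 1024


def qualifying_files(file_sizes, config):
    """Return the sizes that fall below minFileSize or above maxFileSize."""
    min_size = parse_size_mb(config.get("minFileSize", "192 MB"))
    max_size = parse_size_mb(config.get("maxFileSize", "460 MB"))
    return [size for size in file_sizes if size < min_size or size > max_size]


def should_optimize(file_sizes, config):
    """Return True when at least minFiles files qualify for optimization."""
    return len(qualifying_files(file_sizes, config)) >= config.get("minFiles", 5)
```

The defaults passed to config.get mirror the documented defaults (192 MB, 460 MB, and 5), so an empty config behaves like a job created without the optional values.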
metrics Object Attributes
rewrittenDataFiles
Integer
Number of data files that were rewritten in the job. Included only for OPTIMIZE job types.
Example 6
newDataFiles
Integer
Number of new data files that were created in the job. Included only for OPTIMIZE job types.
Example 2
rewrittenDeleteFiles
Integer
Number of Iceberg DeleteFiles that were removed. Included only for OPTIMIZE job types.
Example 5
deletedFilesCount
Integer
Number of files that were removed by the VACUUM CATALOG command. Included only for VACUUM job types.
Example 91
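As an illustration of how the job attributes fit together, the sketch below parses a job object and derives the run time from startedAt and endedAt. It is a minimal example under stated assumptions: the function names parse_utc and job_duration_seconds are hypothetical, and timestamps are assumed to always use the second-precision "Z" form shown in the examples above.

```python
import json
from datetime import datetime, timezone

# Sample job object, abbreviated from the Jobs Object example above.
SAMPLE_JOB = """
{
  "type": "OPTIMIZE",
  "id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
  "state": "COMPLETED",
  "startedAt": "2023-02-23T20:25:44Z",
  "endedAt": "2023-02-23T20:28:50Z",
  "metrics": {"rewrittenDataFiles": 6, "newDataFiles": 2, "rewrittenDeleteFiles": 5}
}
"""


def parse_utc(timestamp: str) -> datetime:
    """Parse a timestamp such as 2023-02-23T20:25:44Z (assumed format)."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def job_duration_seconds(job: dict):
    """Return the run time in seconds, or None if the job has not ended."""
    if "startedAt" not in job or "endedAt" not in job:
        return None
    return (parse_utc(job["endedAt"]) - parse_utc(job["startedAt"])).total_seconds()


job = json.loads(SAMPLE_JOB)
print(job["state"], job_duration_seconds(job))
```

Returning None for jobs without endedAt covers jobs that are still in a state such as SETUP or RUNNING, where the response may not yet include both timestamps.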
Creating a Job
Create a job for the specified Arctic catalog.
Method and URL
POST /v0/arctic/catalogs/{catalogId}/jobs
Parameters
catalogId
path
String (UUID)
Unique identifier for the catalog where the job should run.
Example 5d138bd7-6513-46d7-b9cb-d236e195b34a
type
body
String
The type of job to run.
Enum OPTIMIZE, VACUUM
Example OPTIMIZE
config
body
Object
Configuration options for the job. Supported only for OPTIMIZE job types.
Example { "tableId": "zip_lookup", "reference": "main", "targetFileSize": "256 MB", "minFileSize": "192 MB", "maxFileSize": "460 MB", "minFiles": 5 }
config Object Parameters
tableId
body
String
The name of the table that the job applies to. The table name is preceded by zero or more namespaces with a period (.) as the separator.
Example zip_lookup, address.zip_lookup
reference
body
String
Identifies the branch, tag, or commit where the table that the job applies to is located.
Example main
targetFileSize
body
String
Optional
Controls the target size of the files that will be generated for type OPTIMIZE. The integer must be non-negative. The default file size is 256 MB.
Example 256 MB
minFileSize
body
String
Optional
The minimum file size threshold for type OPTIMIZE. Files that are smaller in size than the specified minFileSize qualify for optimization. The integer must be non-negative. The default value is 192 MB.
Example 192 MB
maxFileSize
body
String
Optional
The maximum file size threshold for type OPTIMIZE. Files that are larger in size than the specified maxFileSize qualify for optimization. The integer must be non-negative. The default value is 460 MB.
Example 460 MB
minFiles
body
Integer
Optional
The minimum number of qualified files required to run the job. Supports only type OPTIMIZE. The default is 5.
Example 5
Example Request for an OPTIMIZE-Type Job
curl -X POST 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json' \
--data-raw '{
"type": "OPTIMIZE",
"config": {
"tableId": "zip_lookup",
"reference": "main",
"targetFileSize": "256 MB",
"minFileSize": "192 MB",
"maxFileSize": "460 MB",
"minFiles": 5
}
}'
Example Response
{
"type": "OPTIMIZE",
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
"state": "SETUP",
"username": "dremio_user@company.com",
"engineSize": "XX_SMALL_V1",
"config": {
"tableId": "zip_lookup",
"reference": "main",
"targetFileSize": "256 MB",
"minFileSize": "192 MB",
"maxFileSize": "460 MB",
"minFiles": 5
}
}
Example Request for a VACUUM-Type Job
curl -X POST 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json' \
--data-raw '{
"type": "VACUUM"
}'
Example Response
{
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "cb2605d9-1758-4ca2-a209-2ea8e4fc8f56",
"state": "SETUP",
"type": "VACUUM",
"username": "dremio_user@company.com",
"startedAt": "2023-09-14T20:38:48Z",
"engineSize": "X_SMALL_V1"
}
Response Status Codes
200 OK
400 Bad Request
401 Unauthorized
404 Not Found
500 Internal Server Error
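The curl requests above can also be assembled programmatically. The following is a minimal sketch using only the Python standard library; it builds the request without sending it. The function name build_create_job_request is hypothetical, and treating tableId and reference as required for OPTIMIZE jobs is an assumption based on their not being marked Optional in the parameter list.

```python
import json
import urllib.request

BASE_URL = "https://api.dremio.cloud/v0/arctic/catalogs"


def build_create_job_request(catalog_id, token, job_type, config=None):
    """Build (but do not send) a POST request that creates a job."""
    if job_type not in ("OPTIMIZE", "VACUUM"):
        raise ValueError(f"unknown job type: {job_type}")
    if job_type == "VACUUM" and config is not None:
        raise ValueError("config is supported only for OPTIMIZE jobs")
    if job_type == "OPTIMIZE":
        # Assumption: tableId and reference are required for OPTIMIZE jobs.
        missing = [key for key in ("tableId", "reference") if key not in (config or {})]
        if missing:
            raise ValueError(f"config is missing: {', '.join(missing)}")

    body = {"type": job_type}
    if config is not None:
        body["config"] = config

    return urllib.request.Request(
        f"{BASE_URL}/{catalog_id}/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen and decode the response body with json.load; the response is a job object like the examples above.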
Listing All Jobs
Returns a listing of all jobs for the specified Arctic catalog.
Method and URL
GET /v0/arctic/catalogs/{catalogId}/jobs
Parameters
catalogId
path
String (UUID)
Unique identifier of the catalog you want to list jobs for.
Example 5d138bd7-6513-46d7-b9cb-d236e195b34a
pageToken
query
String
Optional
Token for retrieving the next page of jobs. If the Dremio instance has more jobs than the maximum per page (default 10), the response will include a nextPageToken after the data array. Use the nextPageToken value in your request URL as the pageToken value. Do not change any other query parameters included in the request URL when you use pageToken. Read pageToken Query Parameter for usage examples.
maxResults
query
Integer
Optional
The maximum number of results to return in the response. The server may return fewer results, but not more. The default is 10. Read maxResults Query Parameter for usage examples.
filter
query
Object
Optional
A Common Expression Language (CEL) expression that filters responses so that they include only results with the specified attributes and values. Filters for job type, tableId, user, reference, state, and quickfind. Value is a URL-encoded string that represents a JSON object. The JSON object specifies the attributes to filter on and the values to match for each attribute. Read filter Query Parameter for usage examples.
view
query
String
Optional
Level of detail to include in the response. Valid values are SUMMARY (default) and FULL (include the config and metrics objects in the job objects in the response). Read view Query Parameter for usage examples.
filter Object Attributes
type
String
The type of job to be run.
Enum OPTIMIZE, VACUUM
Example OPTIMIZE
tableId
String
The name of the table that the job applies to. The table name is preceded by zero or more namespaces with a period (.) as the separator. If a tableId is used, then the type is OPTIMIZE.
Example sample-NYC-taxi-trips, tripdata.sample-NYC-taxi-trips
reference
String
Identifies the branch, tag, or commit where the table that the job applies to is located. If a reference is used, then the type is OPTIMIZE.
Example main
user
String
The user who created the job.
Example dremio_user@company.com
state
String
The job's state. The enum lists the valid values.
Enum SETUP, QUEUED, STARTING, RUNNING, COMPLETED, CANCELLED, FAILED
quickfind
String
Composite field that will match the job ID, username, tableId, or reference.
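The query parameters described above are all optional and are appended to the listing URL. The sketch below shows one way to assemble that URL with the Python standard library. The function name build_list_jobs_url is hypothetical. The filter value is treated as an opaque string that the caller has already composed; the sketch only URL-encodes it and does not assume any particular filter syntax.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.dremio.cloud/v0/arctic/catalogs"


def build_list_jobs_url(catalog_id, page_token=None, max_results=None,
                        filter_value=None, view=None):
    """Build the URL for listing jobs, adding only the parameters supplied."""
    if view is not None and view not in ("SUMMARY", "FULL"):
        raise ValueError("view must be SUMMARY or FULL")
    if max_results is not None and max_results < 1:
        raise ValueError("max_results must be a positive integer")

    params = {}
    if page_token is not None:
        params["pageToken"] = page_token
    if max_results is not None:
        params["maxResults"] = max_results
    if filter_value is not None:
        # Opaque, caller-supplied string; urlencode handles the escaping.
        params["filter"] = filter_value
    if view is not None:
        params["view"] = view

    url = f"{BASE_URL}/{catalog_id}/jobs"
    return f"{url}?{urlencode(params)}" if params else url
```

When following a nextPageToken, keep the other arguments identical to the original call, since the other query parameters must not change between pages.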
Example Request
curl -X GET 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json'
In the response to a request to list all Arctic jobs, the job objects are wrapped in a data array. Each object in the data array represents one job.
Example Response
{
"data": [
{
"type": "OPTIMIZE",
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-23T20:25:44Z",
"endedAt": "2023-02-23T20:28:50Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "1907c1f6-5fa7-4cf2-ae9c-4089620f7e28",
"state": "FAILED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-23T19:06:52Z",
"endedAt": "2023-02-23T19:09:57Z",
"engineSize": "XX_SMALL_V1",
"errorMessage": "Job has failed due to an internal error. Please contact Support if this issue persists."
},
{
"type": "VACUUM",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "c0a2648f-3400-4613-8a5e-e4ee71386ca2",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T17:18:12Z",
"endedAt": "2023-02-16T17:20:18Z",
"engineSize": "XX_SMALL_V1",
"scheduleId": "55dbb0fa-d7e2-4520-9748-a48d4b96f837"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "7d0ebe6b-908e-42fe-b753-ba4b277a38de",
"state": "FAILED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T17:12:29Z",
"endedAt": "2023-02-16T17:14:34Z",
"engineSize": "XX_SMALL_V1",
"errorMessage": "Job has failed due to an internal error. Please contact Support if this issue persists.",
"scheduleId": "acb29b8c-4358-0a8f-4518-0def8a495081"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "e72dff53-bb7c-4737-8cd1-a788adc209b3",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T17:10:47Z",
"endedAt": "2023-02-16T17:14:52Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "3777cde6-88df-4477-8662-77f7e45e9b69",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T17:06:29Z",
"endedAt": "2023-02-16T17:10:34Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "90522152-efa5-4ed8-8692-7f1fd59b62a0",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T16:03:23Z",
"endedAt": "2023-02-16T16:06:29Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "830c7a49-1408-4047-a6df-4ecf684fb918",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T13:53:23Z",
"endedAt": "2023-02-16T14:07:30Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "12946fe4-267a-4398-808c-4e03a97fb961",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-15T23:14:43Z",
"endedAt": "2023-02-15T23:17:48Z",
"engineSize": "XX_SMALL_V1"
}
],
"nextPageToken": "a"
}
Response Status Codes
200 OK
400 Bad Request
401 Unauthorized
404 Not Found
500 Internal Server Error
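Because a listing response includes nextPageToken when more jobs remain, a client typically loops until the token is absent. The following is a minimal sketch of that loop. The names iter_jobs and make_fetch_page are hypothetical; iter_jobs takes any callable that returns one decoded response page, which keeps the paging logic separate from the HTTP call.

```python
import json
import urllib.request
from urllib.parse import urlencode


def iter_jobs(fetch_page):
    """Yield every job, following nextPageToken until no token is returned."""
    page_token = None
    while True:
        page = fetch_page(page_token)
        yield from page.get("data", [])
        page_token = page.get("nextPageToken")
        if not page_token:
            break


def make_fetch_page(catalog_id, token):
    """Return a callable that fetches one page of jobs over HTTP."""
    def fetch_page(page_token):
        url = f"https://api.dremio.cloud/v0/arctic/catalogs/{catalog_id}/jobs"
        if page_token:
            url += "?" + urlencode({"pageToken": page_token})
        request = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {token}"}
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    return fetch_page
```

A call such as iter_jobs(make_fetch_page(catalog_id, token)) would then walk all pages lazily, issuing one request per page only as the results are consumed.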
Retrieving a Job
Retrieve the specified job.
Method and URL
GET /v0/arctic/catalogs/{catalogId}/jobs/{id}
Parameters
catalogId
path
String (UUID)
Unique identifier of the catalog that contains the job you want to retrieve.
Example 5d138bd7-6513-46d7-b9cb-d236e195b34a
id
path
String (UUID)
Unique identifier for the job you want to retrieve.
Example 8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42
Example Request
curl -X GET 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs/8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json'
Example Response
{
"type": "OPTIMIZE",
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-23T20:25:44Z",
"endedAt": "2023-02-23T20:28:50Z",
"engineSize": "XX_SMALL_V1",
"config": {
"tableId": "zip_lookup",
"reference": "main",
"targetFileSize": "256 MB",
"minFileSize": "192 MB",
"maxFileSize": "460 MB",
"minFiles": 5
},
"metrics": {
"rewrittenDataFiles": 6,
"newDataFiles": 2,
"rewrittenDeleteFiles": 5
}
}
Response Status Codes
200 OK
400 Bad Request
401 Unauthorized
404 Not Found
500 Internal Server Error
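A common use of this endpoint is to poll a job until it finishes. The sketch below shows one way to do that. It assumes that COMPLETED, CANCELLED, and FAILED are the final states, which the state enum suggests but this page does not state explicitly. The function name wait_for_job and its parameters are hypothetical; get_job is any callable that returns the current job object.

```python
import time

# Assumption: these three states are final and a job never leaves them.
TERMINAL_STATES = {"COMPLETED", "CANCELLED", "FAILED"}


def wait_for_job(get_job, interval_seconds=10.0, max_polls=60, sleep=time.sleep):
    """Poll get_job() until the job reaches a terminal state, then return it."""
    for attempt in range(max_polls):
        job = get_job()
        if job["state"] in TERMINAL_STATES:
            return job
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"job did not finish after {max_polls} polls")
```

Passing sleep as a parameter makes the loop easy to exercise without real delays, and max_polls bounds the wait so a stuck job does not block the caller indefinitely.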
Canceling a Job
Cancel the specified job.
Method and URL
POST /v0/arctic/catalogs/{catalogId}/jobs/{id}/cancel
Parameters
catalogId
path
String (UUID)
Unique identifier for the catalog that contains the job you want to cancel.
id
path
String (UUID)
Unique identifier for the job to cancel.
Example Request
curl -X POST 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs/8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42/cancel' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json'
No response
Response Status Codes
204 No Content
400 Bad Request
403 Forbidden
404 Not Found
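For completeness, the cancel call can be sketched in the same style as the other requests. The example below only builds the request and maps the documented status codes to their meanings. The names build_cancel_job_request and describe_cancel_status are hypothetical.

```python
import urllib.request

# Status codes listed for this endpoint.
CANCEL_STATUS = {
    204: "No Content",
    400: "Bad Request",
    403: "Forbidden",
    404: "Not Found",
}


def build_cancel_job_request(catalog_id, job_id, token):
    """Build (but do not send) the POST request that cancels a job."""
    url = (
        "https://api.dremio.cloud/v0/arctic/catalogs/"
        f"{catalog_id}/jobs/{job_id}/cancel"
    )
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )


def describe_cancel_status(code):
    """Return the documented meaning of a status code, if there is one."""
    return CANCEL_STATUS.get(code, "Undocumented status")
```

A successful cancellation returns 204 with an empty body, so a caller should check the status code rather than try to parse a response.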