Jobs
Dremio Arctic enables you to schedule optimization jobs to help you manage the accumulation of data files that results from DML operations. This API allows you to run a one-off job, list existing jobs, retrieve job status, and cancel a running job.
Jobs Object
{
"type": "OPTIMIZE",
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-23T20:25:44Z",
"endedAt": "2023-02-23T20:28:50Z",
"engineSize": "XX_SMALL_V1",
"scheduleId": "4696f3ba-eefd-174c-681b-0d753ed5ad85",
"config": {
"tableId": "zip_lookup",
"reference": "main",
"targetFileSize": "256 MB",
"minFileSize": "192 MB",
"maxFileSize": "460 MB",
"minFiles": 5
},
"metrics": {
"rewrittenDataFiles": 6,
"newDataFiles": 2,
"rewrittenDeleteFiles": 5
}
}
Job Attributes
type String
The type of the job.
Enum: OPTIMIZE, VACUUM
Example: OPTIMIZE
catalogId String (UUID)
Unique identifier for the catalog where the job ran.
Example: 5d138bd7-6513-46d7-b9cb-d236e195b34a
id String (UUID)
Unique identifier for the job.
Example: 8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42
state String
Status of the job.
Enum: SETUP, QUEUED, STARTING, RUNNING, COMPLETED, CANCELLED, FAILED
Example: COMPLETED
username String
The user who created the job.
Example: dremio_user@company.com
startedAt String
Date and time when the job started, in UTC format.
Example: 2023-02-23T20:25:44Z
endedAt String
Date and time when the job ended, in UTC format.
Example: 2023-02-23T20:28:50Z
engineSize String
Engine size used by the job.
Enum: XX_SMALL_V1, X_SMALL_V1, SMALL_V1, MEDIUM_V1, LARGE_V1, X_LARGE_V1, XX_LARGE_V1, XXX_LARGE_V1
Example: XX_SMALL_V1
scheduleId String (UUID)
Unique identifier for the schedule that created the job, if applicable. Empty for jobs created with the Arctic Jobs API or the Optimize Once option in the Dremio console.
Example: 4696f3ba-eefd-174c-681b-0d753ed5ad85
errorMessage String
For unsuccessful jobs, a description of the problem. Not included for successful jobs.
Example: Job has failed due to an internal error. Please contact Support if this issue persists.
config Object
Configuration options for the job. Not included for VACUUM job types.
Example: {"tableId": "zip_lookup","reference": "main","targetFileSize": "256 MB","minFileSize": "192 MB","maxFileSize": "460 MB","minFiles": 5}
metrics Object
For OPTIMIZE job types, information about the number of rewritten and new data files and removed Iceberg DeleteFiles that resulted from the job. For VACUUM job types, information about the number of files the job removed.
Example: {"rewrittenDataFiles": 6,"newDataFiles": 2,"rewrittenDeleteFiles": 5}
Attributes of the config Object
tableId String
The name of the table that the optimization job applies to. The table name is preceded by zero or more namespaces with a period (.) as the separator.
Example: zip_lookup, address.zip_lookup
reference String
Identifies the branch, tag, or commit where the table that the optimization job applies to is located.
Example: main
targetFileSize String
Controls the target size of the files that are generated for type OPTIMIZE. The integer must be non-negative. The default file size is 256 MB.
Example: 256 MB
minFileSize String
The minimum file size threshold for type OPTIMIZE. Files that are smaller in size than the specified minFileSize qualify for optimization. The integer must be non-negative. The default value is 192 MB.
Example: 192 MB
maxFileSize String
The maximum file size threshold for type OPTIMIZE. Files that are larger in size than the specified maxFileSize qualify for optimization. The integer must be non-negative. The default value is 460 MB.
Example: 460 MB
minFiles String
The minimum number of qualified files required to run a table optimization job. Supports only type OPTIMIZE. The default is 5.
Example: 5
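The thresholds described above can be illustrated with a short Python sketch. It is not Dremio's implementation; it only restates the documented rules: a file qualifies when it is smaller than minFileSize or larger than maxFileSize, and a job needs at least minFiles qualifying files. The helper names (`size_in_mb`, `qualifying_files`, `would_optimize`) are hypothetical, and the sketch assumes that sizes are always written as a whole number followed by `MB`, as in the examples.

```python
import re

# Defaults taken from the attribute descriptions above.
DEFAULTS = {"minFileSize": "192 MB", "maxFileSize": "460 MB", "minFiles": 5}


def size_in_mb(text):
    """Parse a size string such as "256 MB" into a whole number of megabytes.

    Assumption: only the "<integer> MB" form shown in the examples is handled.
    """
    match = re.fullmatch(r"\s*(\d+)\s*MB\s*", text)
    if match is None:
        raise ValueError(f"unsupported size string: {text!r}")
    return int(match.group(1))


def qualifying_files(file_sizes_mb, config):
    """Return the file sizes below minFileSize or above maxFileSize."""
    low = size_in_mb(config.get("minFileSize", DEFAULTS["minFileSize"]))
    high = size_in_mb(config.get("maxFileSize", DEFAULTS["maxFileSize"]))
    return [size for size in file_sizes_mb if size < low or size > high]


def would_optimize(file_sizes_mb, config):
    """Return True when at least minFiles files qualify."""
    needed = int(config.get("minFiles", DEFAULTS["minFiles"]))
    return len(qualifying_files(file_sizes_mb, config)) >= needed


# Hypothetical file sizes in MB for one table.
sizes = [10, 20, 30, 40, 200, 256, 500]
print(qualifying_files(sizes, {}))
print(would_optimize(sizes, {}))
```

With the default thresholds, the 200 MB and 256 MB files fall inside the accepted range and are left alone, while the other five files qualify.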
Attributes of the metrics Object
rewrittenDataFiles Integer
Number of data files that were rewritten in the job. Included only for OPTIMIZE job types.
Example: 6
newDataFiles Integer
Number of new data files that were created in the job. Included only for OPTIMIZE job types.
Example: 2
rewrittenDeleteFiles Integer
Number of Iceberg DeleteFiles that were removed. Included only for OPTIMIZE job types.
Example: 5
deletedFilesCount Integer
Number of files that were removed by the VACUUM CATALOG command. Included only for VACUUM job types.
Example: 91
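As an illustration of how a client might read a job object, the following Python sketch computes the run time from startedAt and endedAt and reports whether the job has finished. The helper names are hypothetical. Treating COMPLETED, CANCELLED, and FAILED as final states is an assumption; the attribute list above names the states but does not say which are terminal.

```python
from datetime import datetime, timezone

# Assumption: these states are final. The source lists them without saying so.
TERMINAL_STATES = {"COMPLETED", "CANCELLED", "FAILED"}


def parse_timestamp(value):
    """Parse a timestamp such as 2023-02-23T20:25:44Z into an aware UTC datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def summarize(job):
    """Collect the fields a monitoring script might report for one job."""
    started, ended = job.get("startedAt"), job.get("endedAt")
    duration = None
    if started and ended:
        duration = (parse_timestamp(ended) - parse_timestamp(started)).total_seconds()
    return {
        "id": job["id"],
        "finished": job["state"] in TERMINAL_STATES,
        "durationSeconds": duration,
        # errorMessage is present only for unsuccessful jobs.
        "error": job.get("errorMessage"),
        "metrics": job.get("metrics", {}),
    }


# Job object from the example above, trimmed to the fields used here.
job = {
    "type": "OPTIMIZE",
    "id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
    "state": "COMPLETED",
    "startedAt": "2023-02-23T20:25:44Z",
    "endedAt": "2023-02-23T20:28:50Z",
    "metrics": {"rewrittenDataFiles": 6, "newDataFiles": 2, "rewrittenDeleteFiles": 5},
}
print(summarize(job))
```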
Creating a Job
Create a job for the specified Arctic catalog.
Method and URL
POST /v0/arctic/catalogs/{catalogId}/jobs
Parameters
catalogId Path String (UUID)
Unique identifier for the catalog where the job should run.
Example: 5d138bd7-6513-46d7-b9cb-d236e195b34a
type Body String
The type of job to run.
Enum: OPTIMIZE, VACUUM
Example: OPTIMIZE
config Body Object
Configuration options for the job. Supported only for OPTIMIZE job types.
Example: {"tableId": "zip_lookup","reference": "main","targetFileSize": "256 MB","minFileSize": "192 MB","maxFileSize": "460 MB","minFiles": 5}
Parameters of the config Object
tableId Body String
The name of the table that the job applies to. The table name is preceded by zero or more namespaces with a period (.) as the separator.
Example: zip_lookup, address.zip_lookup
reference Body String
Identifies the branch, tag, or commit where the table that the job applies to is located.
Example: main
targetFileSize Body String Optional
Controls the target size of the files that will be generated for type OPTIMIZE. The integer must be non-negative. The default file size is 256 MB.
Example: 256 MB
minFileSize Body String Optional
The minimum file size threshold for type OPTIMIZE. Files that are smaller in size than the specified minFileSize qualify for optimization. The integer must be non-negative. The default value is 192 MB.
Example: 192 MB
maxFileSize Body String Optional
The maximum file size threshold for type OPTIMIZE. Files that are larger in size than the specified maxFileSize qualify for optimization. The integer must be non-negative. The default value is 460 MB.
Example: 460 MB
minFiles Body String Optional
The minimum number of qualified files required to run the job. Supports only type OPTIMIZE. The default is 5.
Example: 5
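The parameters above could be assembled into a request body along the following lines. This is a minimal sketch: `build_job_body` is a hypothetical helper, not part of any Dremio client library. It assumes that tableId and reference are required for OPTIMIZE jobs because they are the only config parameters not marked Optional, and it rejects config options for VACUUM jobs because config is documented as supported only for OPTIMIZE.

```python
# Optional config parameters named in the list above.
OPTIONAL_CONFIG_KEYS = {"targetFileSize", "minFileSize", "maxFileSize", "minFiles"}


def build_job_body(job_type, table_id=None, reference=None, **options):
    """Build the JSON-serializable body for POST /v0/arctic/catalogs/{catalogId}/jobs."""
    if job_type not in ("OPTIMIZE", "VACUUM"):
        raise ValueError(f"unknown job type: {job_type!r}")

    body = {"type": job_type}
    if job_type == "VACUUM":
        # config is documented as supported only for OPTIMIZE job types.
        if table_id or reference or options:
            raise ValueError("config is supported only for OPTIMIZE jobs")
        return body

    # Assumption: tableId and reference are required (they are not marked Optional).
    if not table_id or not reference:
        raise ValueError("OPTIMIZE jobs need tableId and reference")
    unknown = set(options) - OPTIONAL_CONFIG_KEYS
    if unknown:
        raise ValueError(f"unknown config options: {sorted(unknown)}")

    # Omitted options are left out so that the documented defaults apply.
    body["config"] = {"tableId": table_id, "reference": reference, **options}
    return body


print(build_job_body("OPTIMIZE", "zip_lookup", "main", targetFileSize="256 MB", minFiles=5))
print(build_job_body("VACUUM"))
```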
Example Request for an OPTIMIZE-Type Job
curl -X POST 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json' \
--data-raw '{
"type": "OPTIMIZE",
"config": {
"tableId": "zip_lookup",
"reference": "main",
"targetFileSize": "256 MB",
"minFileSize": "192 MB",
"maxFileSize": "460 MB",
"minFiles": 5
}
}'
Example Response for an OPTIMIZE-Type Job
{
"type": "OPTIMIZE",
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
"state": "SETUP",
"username": "dremio_user@company.com",
"engineSize": "XX_SMALL_V1",
"config": {
"tableId": "zip_lookup",
"reference": "main",
"targetFileSize": "256 MB",
"minFileSize": "192 MB",
"maxFileSize": "460 MB",
"minFiles": 5
}
}
Example Request for a VACUUM-Type Job
curl -X POST 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json' \
--data-raw '{
"type": "VACUUM"
}'
Example Response for a VACUUM-Type Job
{
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "cb2605d9-1758-4ca2-a209-2ea8e4fc8f56",
"state": "SETUP",
"type": "VACUUM",
"username": "dremio_user@company.com",
"startedAt": "2023-09-14T20:38:48Z",
"engineSize": "X_SMALL_V1"
}
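For readers who prefer Python to curl, the requests above could be built roughly as follows with the standard library. `create_job_request` is a hypothetical helper. The sketch only constructs the request; the call that sends it is left commented out because it needs network access and a valid personal access token.

```python
import json
import urllib.request

BASE_URL = "https://api.dremio.cloud"


def create_job_request(catalog_id, token, body):
    """Build (but do not send) the POST request shown in the curl examples."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v0/arctic/catalogs/{catalog_id}/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = create_job_request(
    "5d138bd7-6513-46d7-b9cb-d236e195b34a",
    "<personal access token>",
    {"type": "VACUUM"},
)
print(request.get_method(), request.full_url)

# To send the request (requires network access and a real token):
# with urllib.request.urlopen(request) as response:
#     job = json.load(response)
#     print(job["id"], job["state"])
```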
Response Status Codes
200 OK
400 Bad Request
401 Unauthorized
404 Not Found
500 Internal Server Error
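A client might map these status codes to follow-up actions as in the sketch below. The hint texts are assumptions based on the general meaning of each HTTP status, not on Dremio-specific documentation, and `interpret_status` is a hypothetical helper.

```python
# Status codes listed above, each with a generic hint (assumed, not from the source).
STATUS_HINTS = {
    200: "OK: the response body contains the new job object",
    400: "Bad Request: check the request body",
    401: "Unauthorized: check the personal access token",
    404: "Not Found: check the catalogId in the URL",
    500: "Internal Server Error: the problem is on the server side",
}


def interpret_status(status):
    """Return (succeeded, hint) for a status code from the create-job call."""
    hint = STATUS_HINTS.get(status, f"Unexpected status {status}")
    return status == 200, hint


for code in (200, 401, 404):
    print(code, interpret_status(code))
```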