Jobs

Dremio Arctic lets you schedule optimization jobs to manage the data files that accumulate through DML operations. Use this API to run a one-off job, list existing jobs, retrieve the status of a job, and cancel a running job.

Jobs Object
{
"type": "OPTIMIZE",
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-23T20:25:44Z",
"endedAt": "2023-02-23T20:28:50Z",
"engineSize": "XX_SMALL_V1",
"scheduleId": "4696f3ba-eefd-174c-681b-0d753ed5ad85",
"config": {
"tableId": "zip_lookup",
"reference": "main",
"targetFileSize": "256 MB",
"minFileSize": "192 MB",
"maxFileSize": "460 MB",
"minFiles": 5
},
"metrics": {
"rewrittenDataFiles": 6,
"newDataFiles": 2,
"rewrittenDeleteFiles": 5
}
}

Job Attributes

type String

The type of the job.

Enum: OPTIMIZE, VACUUM

Example: OPTIMIZE


catalogId String (UUID)

Unique identifier for the catalog where the job ran.

Example: 5d138bd7-6513-46d7-b9cb-d236e195b34a


id String (UUID)

Unique identifier for the job.

Example: 8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42


state String

Status of the job.

Enum: SETUP, QUEUED, STARTING, RUNNING, COMPLETED, CANCELLED, FAILED

Example: COMPLETED


username String

The user who created the job.

Example: dremio_user@company.com


startedAt String

Date and time when the job started, in UTC format.

Example: 2023-02-23T20:25:44Z


endedAt String

Date and time when the job ended, in UTC format.

Example: 2023-02-23T20:28:50Z


engineSize String

Engine size used by the job.

Enum: XX_SMALL_V1, X_SMALL_V1, SMALL_V1, MEDIUM_V1, LARGE_V1, X_LARGE_V1, XX_LARGE_V1, XXX_LARGE_V1

Example: XX_SMALL_V1


scheduleId String (UUID)

Unique identifier for the schedule that created the job, if applicable. Empty for jobs created with the Arctic Jobs API or the Optimize Once option in the Dremio console.

Example: 4696f3ba-eefd-174c-681b-0d753ed5ad85


errorMessage String

For unsuccessful jobs, a description of the problem. Not included for successful jobs.

Example: Job has failed due to an internal error. Please contact Support if this issue persists.


config Object

Configuration options for the job. Not included for VACUUM job types.

Example: {"tableId": "zip_lookup","reference": "main","targetFileSize": "256 MB","minFileSize": "192 MB","maxFileSize": "460 MB","minFiles": 5}


metrics Object

For OPTIMIZE job types, information about the number of rewritten and new data files and removed Iceberg DeleteFiles that resulted from the job. For VACUUM job types, information about the number of files the job removed.

Example: {"rewrittenDataFiles": 6,"newDataFiles": 2,"rewrittenDeleteFiles": 5}

Attributes of the config Object

tableId String

The name of the table that the optimization job applies to. The table name is preceded by zero or more namespaces with a period (.) as the separator.

Example: zip_lookup, address.zip_lookup


reference String

Identifies the branch, tag, or commit where the table that the optimization job applies to is located.

Example: main


targetFileSize String

Controls the target size of the files that are generated for type OPTIMIZE. The integer must be non-negative. The default file size is 256 MB.

Example: 256 MB


minFileSize String

The minimum file size threshold for type OPTIMIZE. Files that are smaller in size than the specified minFileSize qualify for optimization. The integer must be non-negative. The default value is 192 MB.

Example: 192 MB


maxFileSize String

The maximum file size threshold for type OPTIMIZE. Files that are larger in size than the specified maxFileSize qualify for optimization. The integer must be non-negative. The default value is 460 MB.

Example: 460 MB


minFiles Integer

The minimum number of qualified files required to run a table optimization job. Supports only type OPTIMIZE. The default is 5.

Example: 5

Attributes of the metrics Object

rewrittenDataFiles Integer

Number of data files that were rewritten in the job. Included only for OPTIMIZE job types.

Example: 6


newDataFiles Integer

Number of new data files that were created in the job. Included only for OPTIMIZE job types.

Example: 2


rewrittenDeleteFiles Integer

Number of Iceberg DeleteFiles that were removed. Included only for OPTIMIZE job types.

Example: 5


deletedFilesCount Integer

Number of files that were removed by the VACUUM CATALOG command. Included only for VACUUM job types.

Example: 91
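
The attributes above can be read defensively, since fields such as endedAt, errorMessage, config, and metrics are not present on every job. The following is a minimal Python sketch, not part of the API, that parses a job object and summarizes it. The helper names (parse_timestamp, job_duration_seconds, summarize_job) are hypothetical, and the timestamp format is assumed from the example values shown above.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as 2023-02-23T20:25:44Z into an aware UTC datetime."""
    # Assumption: timestamps always use this second-precision "Z" form.
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def job_duration_seconds(job: dict) -> Optional[float]:
    """Return the run time in seconds, or None unless both timestamps are present."""
    started, ended = job.get("startedAt"), job.get("endedAt")
    if not started or not ended:
        return None
    return (parse_timestamp(ended) - parse_timestamp(started)).total_seconds()


def summarize_job(job: dict) -> str:
    """Build a one-line summary, reading only the fields that are present."""
    parts = [f"{job['type']} job {job['id']} is {job['state']}"]
    duration = job_duration_seconds(job)
    if duration is not None:
        parts.append(f"ran for {duration:.0f}s")
    metrics = job.get("metrics", {})
    if "rewrittenDataFiles" in metrics:  # reported for OPTIMIZE jobs
        parts.append(
            f"rewrote {metrics['rewrittenDataFiles']} data files "
            f"into {metrics.get('newDataFiles', 0)}"
        )
    if "deletedFilesCount" in metrics:  # reported for VACUUM jobs
        parts.append(f"removed {metrics['deletedFilesCount']} files")
    if "errorMessage" in job:  # reported for unsuccessful jobs
        parts.append(f"error: {job['errorMessage']}")
    return "; ".join(parts)


# Abridged copy of the example job object from this page.
sample_job = json.loads("""
{
  "type": "OPTIMIZE",
  "id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
  "state": "COMPLETED",
  "startedAt": "2023-02-23T20:25:44Z",
  "endedAt": "2023-02-23T20:28:50Z",
  "metrics": {"rewrittenDataFiles": 6, "newDataFiles": 2, "rewrittenDeleteFiles": 5}
}
""")

print(summarize_job(sample_job))
```

Using dict.get for the optional fields keeps the same code usable for SUMMARY-view list results, failed jobs, and VACUUM jobs.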

Creating a Job

Create a job for the specified Arctic catalog.

Method and URL
POST /v0/arctic/catalogs/{catalogId}/jobs

Parameters

catalogId Path   String (UUID)

Unique identifier for the catalog where the job should run.

Example: 5d138bd7-6513-46d7-b9cb-d236e195b34a


type Body   String

The type of job to run.

Enum: OPTIMIZE, VACUUM

Example: OPTIMIZE


config Body   Object

Configuration options for the job. Supported only for OPTIMIZE job types.

Example: {"tableId": "zip_lookup","reference": "main","targetFileSize": "256 MB","minFileSize": "192 MB","maxFileSize": "460 MB","minFiles": 5}

Parameters of the config Object

tableId Body   String

The name of the table that the job applies to. The table name is preceded by zero or more namespaces with a period (.) as the separator.

Example: zip_lookup, address.zip_lookup


reference Body   String

Identifies the branch, tag, or commit where the table that the job applies to is located.

Example: main


targetFileSize Body   String   Optional

Controls the target size of the files that will be generated for type OPTIMIZE. The integer must be non-negative. The default file size is 256 MB.

Example: 256 MB


minFileSize Body   String   Optional

The minimum file size threshold for type OPTIMIZE. Files that are smaller in size than the specified minFileSize qualify for optimization. The integer must be non-negative. The default value is 192 MB.

Example: 192 MB


maxFileSize Body   String   Optional

The maximum file size threshold for type OPTIMIZE. Files that are larger in size than the specified maxFileSize qualify for optimization. The integer must be non-negative. The default value is 460 MB.

Example: 460 MB


minFiles Body   Integer   Optional

The minimum number of qualified files required to run the job. Supports only type OPTIMIZE. The default is 5.

Example: 5
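
The parameter rules above can be checked on the client before a request is sent. The following Python sketch builds the request body and a POST request object using only the standard library. The function names are hypothetical, and treating tableId and reference as required for OPTIMIZE jobs is an assumption based on their not being marked Optional above.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.dremio.cloud"  # host used in the example requests on this page


def build_job_payload(job_type: str, config: Optional[dict] = None) -> dict:
    """Validate the inputs and return the JSON body for a create-job request."""
    if job_type not in ("OPTIMIZE", "VACUUM"):
        raise ValueError(f"unsupported job type: {job_type!r}")
    payload = {"type": job_type}
    if job_type == "VACUUM":
        # config is supported only for OPTIMIZE job types.
        if config:
            raise ValueError("config is supported only for OPTIMIZE jobs")
        return payload
    config = dict(config or {})
    # Assumption: parameters not marked Optional must be supplied.
    missing = [key for key in ("tableId", "reference") if not config.get(key)]
    if missing:
        raise ValueError(f"missing config parameters: {', '.join(missing)}")
    payload["config"] = config
    return payload


def build_create_job_request(
    catalog_id: str, token: str, payload: dict
) -> urllib.request.Request:
    """Return a POST request for /v0/arctic/catalogs/{catalogId}/jobs (not yet sent)."""
    return urllib.request.Request(
        f"{BASE_URL}/v0/arctic/catalogs/{catalog_id}/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


optimize_payload = build_job_payload(
    "OPTIMIZE", {"tableId": "zip_lookup", "reference": "main", "minFiles": 5}
)
request = build_create_job_request(
    "5d138bd7-6513-46d7-b9cb-d236e195b34a", "<personal access token>", optimize_payload
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it. The sketch stops short of that so it can be run without credentials.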

Example Request for an OPTIMIZE-Type Job
curl -X POST 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json' \
--data-raw '{
"type": "OPTIMIZE",
"config": {
"tableId": "zip_lookup",
"reference": "main",
"targetFileSize": "256 MB",
"minFileSize": "192 MB",
"maxFileSize": "460 MB",
"minFiles": 5
}
}'

Example Response for an OPTIMIZE-Type Job
{
"type": "OPTIMIZE",
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
"state": "SETUP",
"username": "dremio_user@company.com",
"engineSize": "XX_SMALL_V1",
"config": {
"tableId": "zip_lookup",
"reference": "main",
"targetFileSize": "256 MB",
"minFileSize": "192 MB",
"maxFileSize": "460 MB",
"minFiles": 5
}
}

Example Request for a VACUUM-Type Job
curl -X POST 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json' \
--data-raw '{
"type": "VACUUM"
}'

Example Response for a VACUUM-Type Job
{
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "cb2605d9-1758-4ca2-a209-2ea8e4fc8f56",
"state": "SETUP",
"type": "VACUUM",
"username": "dremio_user@company.com",
"startedAt": "2023-09-14T20:38:48Z",
"engineSize": "X_SMALL_V1"
}

Response Status Codes

200   OK

400   Bad Request

401   Unauthorized

404   Not Found

500   Internal Server Error

Listing All Jobs

Returns a listing of all jobs for the specified Arctic catalog.

Method and URL
GET /v0/arctic/catalogs/{catalogId}/jobs

Parameters

catalogId Path   String (UUID)

Unique identifier of the catalog you want to list all jobs for.

Example: 5d138bd7-6513-46d7-b9cb-d236e195b34a


pageToken Query   String   Optional

Token for retrieving the next page of jobs. If the Dremio instance has more jobs than the maximum per page (default 10), the response will include a nextPageToken after the data array. Use the nextPageToken value in your request URL as the pageToken value. Do not change any other query parameters included in the request URL when you use pageToken. Read pageToken Query Parameter for usage examples.


maxResults Query   Integer   Optional

The maximum number of results to return in the response. The server may return fewer results, but not more. The default is 10. Read maxResults Query Parameter for usage examples.


filter Query   Object   Optional

A Common Expression Language (CEL) expression that filters responses so that they include only results with the specified attributes and values. Supported filter attributes are type, tableId, user, reference, state, and quickfind. The value is a URL-encoded string that represents a JSON object, which specifies the attributes to filter on and the values to match for each attribute. Read filter Query Parameter for usage examples.


view Query   String   Optional

Level of detail to include in the response. Valid values are SUMMARY (default) and FULL (include the config and metrics objects in the job objects in the response). Read view Query Parameter for usage examples.

Parameters of the filter Object

type Query   String

The type of job to match.

Enum: OPTIMIZE, VACUUM

Example: OPTIMIZE


tableId Query   String

The name of the table that the job applies to. The table name is preceded by zero or more namespaces with a period (.) as the separator. If a tableId is used, then the type is OPTIMIZE.

Example: sample-NYC-taxi-trips, tripdata.sample-NYC-taxi-trips


reference Query   String

Identifies the branch, tag, or commit where the table containing the job is located. If a reference is used, then the type is OPTIMIZE.

Example: main


user Query   String

The user who created the job.

Example: dremio_user@company.com


state Query   String

The job's state. The enum lists the valid values.

Enum: SETUP, QUEUED, STARTING, RUNNING, COMPLETED, CANCELLED, FAILED


quickfind Query   String

Composite field that will match the job ID, username, tableId, or reference.
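
The query parameters above can be combined into a request URL as in the following Python sketch. It assumes the filter value is the URL-encoded JSON object described above. The referenced usage examples are not reproduced on this page, so the exact filter syntax should be checked before relying on this. The function name build_list_jobs_url is hypothetical.

```python
import json
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.dremio.cloud"  # host used in the example requests on this page


def build_list_jobs_url(
    catalog_id: str,
    *,
    job_filter: Optional[dict] = None,
    max_results: Optional[int] = None,
    view: Optional[str] = None,
    page_token: Optional[str] = None,
) -> str:
    """Return the list-jobs URL with only the query parameters that were given."""
    params = {}
    if job_filter:
        # Assumption: the filter is sent as a compact JSON object;
        # urlencode percent-encodes it when the URL is assembled.
        params["filter"] = json.dumps(job_filter, separators=(",", ":"))
    if max_results is not None:
        params["maxResults"] = max_results
    if view is not None:
        if view not in ("SUMMARY", "FULL"):
            raise ValueError("view must be SUMMARY or FULL")
        params["view"] = view
    if page_token is not None:
        params["pageToken"] = page_token
    url = f"{BASE_URL}/v0/arctic/catalogs/{catalog_id}/jobs"
    return f"{url}?{urlencode(params)}" if params else url


example_url = build_list_jobs_url(
    "5d138bd7-6513-46d7-b9cb-d236e195b34a",
    job_filter={"type": "OPTIMIZE", "state": "FAILED"},
    max_results=25,
    view="FULL",
)
print(example_url)
```

When following a nextPageToken, pass the same filter, maxResults, and view values again, because the other query parameters must not change between pages.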

Example Request
curl -X GET 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json'

In the response to a request to list all jobs, the job objects are wrapped in a data array. Each object in the data array represents one job.

Example Response
{
"data": [
{
"type": "OPTIMIZE",
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-23T20:25:44Z",
"endedAt": "2023-02-23T20:28:50Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "1907c1f6-5fa7-4cf2-ae9c-4089620f7e28",
"state": "FAILED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-23T19:06:52Z",
"endedAt": "2023-02-23T19:09:57Z",
"engineSize": "XX_SMALL_V1",
"errorMessage": "Job has failed due to an internal error. Please contact Support if this issue persists."
},
{
"type": "VACUUM",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "c0a2648f-3400-4613-8a5e-e4ee71386ca2",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T17:18:12Z",
"endedAt": "2023-02-16T17:20:18Z",
"engineSize": "XX_SMALL_V1",
"scheduleId": "55dbb0fa-d7e2-4520-9748-a48d4b96f837"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "7d0ebe6b-908e-42fe-b753-ba4b277a38de",
"state": "FAILED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T17:12:29Z",
"endedAt": "2023-02-16T17:14:34Z",
"engineSize": "XX_SMALL_V1",
"errorMessage": "Job has failed due to an internal error. Please contact Support if this issue persists.",
"scheduleId": "acb29b8c-4358-0a8f-4518-0def8a495081"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "e72dff53-bb7c-4737-8cd1-a788adc209b3",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T17:10:47Z",
"endedAt": "2023-02-16T17:14:52Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "3777cde6-88df-4477-8662-77f7e45e9b69",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T17:06:29Z",
"endedAt": "2023-02-16T17:10:34Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "90522152-efa5-4ed8-8692-7f1fd59b62a0",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T16:03:23Z",
"endedAt": "2023-02-16T16:06:29Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "830c7a49-1408-4047-a6df-4ecf684fb918",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T13:53:23Z",
"endedAt": "2023-02-16T14:07:30Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "12946fe4-267a-4398-808c-4e03a97fb961",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-15T23:14:43Z",
"endedAt": "2023-02-15T23:17:48Z",
"engineSize": "XX_SMALL_V1"
}
],
"nextPageToken": "a"
}

Response Status Codes

200   OK

400   Bad Request

401   Unauthorized

404   Not Found

500   Internal Server Error

Retrieving a Job

Retrieve the specified job.

Method and URL
GET /v0/arctic/catalogs/{catalogId}/jobs/{id}

Parameters

catalogId Path   String (UUID)

Unique identifier of the catalog that contains the job you want to retrieve.

Example: 5d138bd7-6513-46d7-b9cb-d236e195b34a


id Path   String (UUID)

Unique identifier for the job you want to retrieve.

Example: 8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42

Example Request
curl -X GET 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs/8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json'

Example Response
{
"type": "OPTIMIZE",
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-23T20:25:44Z",
"endedAt": "2023-02-23T20:28:50Z",
"engineSize": "XX_SMALL_V1",
"config": {
"tableId": "zip_lookup",
"reference": "main",
"targetFileSize": "256 MB",
"minFileSize": "192 MB",
"maxFileSize": "460 MB",
"minFiles": 5
},
"metrics": {
"rewrittenDataFiles": 6,
"newDataFiles": 2,
"rewrittenDeleteFiles": 5
}
}

Response Status Codes

200   OK

400   Bad Request

401   Unauthorized

404   Not Found

500   Internal Server Error
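
Because a newly created job starts in a state such as SETUP, a client typically retrieves the job repeatedly until it finishes. The following Python sketch polls through a caller-supplied function. It assumes that COMPLETED, CANCELLED, and FAILED are the final states, which this page implies but does not state. The names wait_for_job and get_job are hypothetical. In real use, get_job would call the retrieve endpoint and return the decoded job object.

```python
import time
from typing import Callable

# Assumption: these states are final; the remaining states are still in progress.
TERMINAL_STATES = {"COMPLETED", "CANCELLED", "FAILED"}


def wait_for_job(
    get_job: Callable[[], dict],
    *,
    poll_seconds: float = 10.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call get_job until the job reaches a terminal state, then return the job."""
    for attempt in range(max_polls):
        job = get_job()
        if job["state"] in TERMINAL_STATES:
            return job
        if attempt < max_polls - 1:
            sleep(poll_seconds)  # wait before the next retrieval
    raise TimeoutError(f"job did not finish after {max_polls} polls")


# Offline demonstration with a scripted sequence of states.
_states = iter(["SETUP", "QUEUED", "RUNNING", "COMPLETED"])
final_job = wait_for_job(
    lambda: {"state": next(_states)}, poll_seconds=0, sleep=lambda _seconds: None
)
print(final_job["state"])
```

The sleep parameter is injectable so that the loop can be tested without real delays. The polling interval and limit are illustrative defaults, not values recommended by the API.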

Canceling a Job

Cancel the specified job.

Method and URL
POST /v0/arctic/catalogs/{catalogId}/jobs/{id}/cancel

Parameters

catalogId Path   String (UUID)

Unique identifier of the catalog that contains the job you want to cancel.

Example: 5d138bd7-6513-46d7-b9cb-d236e195b34a


id Path   String (UUID)

Unique identifier for the job you want to cancel.

Example: 8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42

Example Request
curl -X POST 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs/8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42/cancel' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json'

Example Response
No response

Response Status Codes

204   No Content

400   Bad Request

403   Forbidden

404   Not Found
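
A cancel request can be sketched in Python as follows, using POST as given in the Method and URL line for this endpoint. The function names are hypothetical, and reading 204 as the only success status is an assumption drawn from the status codes listed above.

```python
import urllib.request

BASE_URL = "https://api.dremio.cloud"  # host used in the example requests on this page

# Status codes listed for the cancel endpoint.
CANCEL_STATUS_CODES = {
    204: "No Content",
    400: "Bad Request",
    403: "Forbidden",
    404: "Not Found",
}


def build_cancel_request(
    catalog_id: str, job_id: str, token: str
) -> urllib.request.Request:
    """Return a POST request for the cancel endpoint (not yet sent)."""
    return urllib.request.Request(
        f"{BASE_URL}/v0/arctic/catalogs/{catalog_id}/jobs/{job_id}/cancel",
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )


def cancel_succeeded(status: int) -> bool:
    """Assumption: 204 No Content, with an empty body, is the success response."""
    return status == 204


cancel_request = build_cancel_request(
    "5d138bd7-6513-46d7-b9cb-d236e195b34a",
    "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
    "<personal access token>",
)
print(cancel_request.get_method(), cancel_request.full_url)
```

Note that urllib.request.urlopen raises urllib.error.HTTPError for 4xx responses, so a caller that sends this request would read the status from that exception when the cancellation is rejected.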