Jobs

Dremio Arctic lets you schedule optimization jobs to manage the data files that accumulate through DML operations. This API allows you to run a one-off job, list existing jobs, retrieve the status of a job, and cancel a running job.

Jobs Object
{
"type": "OPTIMIZE",
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-23T20:25:44Z",
"endedAt": "2023-02-23T20:28:50Z",
"engineSize": "XX_SMALL_V1",
"scheduleId": "4696f3ba-eefd-174c-681b-0d753ed5ad85",
"config": {
"tableId": "zip_lookup",
"reference": "main",
"targetFileSize": "256 MB",
"minFileSize": "192 MB",
"maxFileSize": "460 MB",
"minFiles": 5
},
"metrics": {
"rewrittenDataFiles": 6,
"newDataFiles": 2,
"rewrittenDeleteFiles": 5
}
}

Job Attributes

type

String

The type of the job.

Enum OPTIMIZE, VACUUM

Example OPTIMIZE


catalogId

String (UUID)

Unique identifier for the catalog where the job ran.

Example 5d138bd7-6513-46d7-b9cb-d236e195b34a


id

String (UUID)

Unique identifier for the job.

Example 8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42


state

String

Status of the job.

Enum SETUP, QUEUED, STARTING, RUNNING, COMPLETED, CANCELLED, FAILED

Example COMPLETED


username

String

The user who created the job.

Example dremio_user@company.com


startedAt

String

Date and time when the job started, in UTC (ISO 8601 format).

Example 2023-02-23T20:25:44Z


endedAt

String

Date and time when the job ended, in UTC (ISO 8601 format).

Example 2023-02-23T20:28:50Z


engineSize

String

Engine size used by the job.

Enum XX_SMALL_V1, X_SMALL_V1, SMALL_V1, MEDIUM_V1, LARGE_V1, X_LARGE_V1, XX_LARGE_V1, XXX_LARGE_V1

Example XX_SMALL_V1


scheduleId

String (UUID)

Unique identifier for the schedule that created the job, if applicable. Empty for jobs created with the Arctic Jobs API or the Optimize Once option in the Dremio console.

Example 4696f3ba-eefd-174c-681b-0d753ed5ad85


errorMessage

String

For unsuccessful jobs, a description of the problem. Not included for successful jobs.

Example Job has failed due to an internal error. Please contact Support if this issue persists.


config

Object

Configuration options for the job. Not included for VACUUM job types.

Example { "tableId": "zip_lookup", "reference": "main", "targetFileSize": "256 MB", "minFileSize": "192 MB", "maxFileSize": "460 MB", "minFiles": 5 }


metrics

Object

For OPTIMIZE job types, information about the number of rewritten and new data files and removed Iceberg DeleteFiles that resulted from the job. For VACUUM job types, information about the number of files the job removed.

Example { "rewrittenDataFiles": 6, "newDataFiles": 2, "rewrittenDeleteFiles": 5 }

config Object Attributes

tableId

String

The name of the table that the optimization job applies to. The table name is preceded by zero or more namespaces with a period (.) as the separator.

Example zip_lookup, address.zip_lookup


reference

String

Identifies the branch, tag, or commit that contains the table the optimization job applies to.

Example main


targetFileSize

String

Controls the target size of the files that are generated for type OPTIMIZE. The integer must be non-negative. The default file size is 256 MB.

Example 256 MB


minFileSize

String

The minimum file size threshold for type OPTIMIZE. Files that are smaller in size than the specified minFileSize qualify for optimization. The integer must be non-negative. The default value is 192 MB.

Example 192 MB


maxFileSize

String

The maximum file size threshold for type OPTIMIZE. Files that are larger in size than the specified maxFileSize qualify for optimization. The integer must be non-negative. The default value is 460 MB.

Example 460 MB


minFiles

Integer

The minimum number of qualified files required to run a table optimization job. Supports only type OPTIMIZE. The default is 5.

Example 5

metrics Object Attributes

rewrittenDataFiles

Integer

Number of data files that were rewritten in the job. Included only for OPTIMIZE job types.

Example 6


newDataFiles

Integer

Number of new data files that were created in the job. Included only for OPTIMIZE job types.

Example 2


rewrittenDeleteFiles

Integer

Number of Iceberg DeleteFiles that were removed. Included only for OPTIMIZE job types.

Example 5


deletedFilesCount

Integer

Number of files that were removed by the VACUUM CATALOG command. Included only for VACUUM job types.

Example 91
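
The attributes above can be read on the client side roughly as follows. This is a minimal sketch that assumes the response has already been decoded from JSON into a Python dict; the helper names (parse_utc, job_duration_seconds, summarize_job) are illustrative and not part of the API, and the timestamp pattern is inferred from the documented examples.

```python
from datetime import datetime, timezone


def parse_utc(timestamp):
    """Parse a timestamp shaped like the documented examples, e.g. 2023-02-23T20:25:44Z."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def job_duration_seconds(job):
    """Return the run time in seconds, or None if startedAt or endedAt is absent."""
    if "startedAt" not in job or "endedAt" not in job:
        return None
    delta = parse_utc(job["endedAt"]) - parse_utc(job["startedAt"])
    return int(delta.total_seconds())


def summarize_job(job):
    """Build a one-line summary, tolerating attributes that may be absent."""
    parts = [job["type"], job["id"], job["state"]]
    duration = job_duration_seconds(job)
    if duration is not None:
        parts.append(f"{duration}s")
    # metrics may be missing (for example, in a SUMMARY listing)
    metrics = job.get("metrics", {})
    if job["type"] == "OPTIMIZE" and metrics:
        parts.append(
            f"rewritten={metrics.get('rewrittenDataFiles', 0)} "
            f"new={metrics.get('newDataFiles', 0)}"
        )
    elif job["type"] == "VACUUM" and "deletedFilesCount" in metrics:
        parts.append(f"deleted={metrics['deletedFilesCount']}")
    # errorMessage is documented as present only for unsuccessful jobs
    if "errorMessage" in job:
        parts.append(f"error={job['errorMessage']}")
    return " ".join(parts)
```

Applied to the Jobs Object example at the top of this page, job_duration_seconds returns 186, the number of seconds between 20:25:44 and 20:28:50.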

Creating a Job

Create a job for the specified Arctic catalog.

Method and URL
POST /v0/arctic/catalogs/{catalogId}/jobs

Parameters

catalogId

path

String (UUID)

Unique identifier for the catalog where the job should run.

Example 5d138bd7-6513-46d7-b9cb-d236e195b34a


type

body

String

The type of job to run.

Enum OPTIMIZE, VACUUM

Example OPTIMIZE


config

body

Object

Configuration options for the job. Supported only for OPTIMIZE job types.

Example { "tableId": "zip_lookup", "reference": "main", "targetFileSize": "256 MB", "minFileSize": "192 MB", "maxFileSize": "460 MB", "minFiles": 5 }


config Object Parameters

tableId

body

String

The name of the table that the job applies to. The table name is preceded by zero or more namespaces with a period (.) as the separator.

Example zip_lookup, address.zip_lookup


reference

body

String

Identifies the branch, tag, or commit that contains the table the job applies to.

Example main


targetFileSize

body

String

Optional

Controls the target size of the files that will be generated for type OPTIMIZE. The integer must be non-negative. The default file size is 256 MB.

Example 256 MB


minFileSize

body

String

Optional

The minimum file size threshold for type OPTIMIZE. Files that are smaller in size than the specified minFileSize qualify for optimization. The integer must be non-negative. The default value is 192 MB.

Example 192 MB


maxFileSize

body

String

Optional

The maximum file size threshold for type OPTIMIZE. Files that are larger in size than the specified maxFileSize qualify for optimization. The integer must be non-negative. The default value is 460 MB.

Example 460 MB


minFiles

body

Integer

Optional

The minimum number of qualified files required to run the job. Supports only type OPTIMIZE. The default is 5.

Example 5
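
The parameter rules above can be sketched as a small request-body builder. This is an illustration under stated assumptions, not an official client: the function name build_job_body and its keyword arguments are hypothetical, file sizes are assumed to be written as "<number> MB" as in the documented examples, and tableId and reference are treated as required for OPTIMIZE jobs because they are not marked Optional.

```python
JOB_TYPES = ("OPTIMIZE", "VACUUM")


def build_job_body(job_type, table_id=None, reference=None,
                   target_file_size_mb=None, min_file_size_mb=None,
                   max_file_size_mb=None, min_files=None):
    """Return a dict suitable for JSON-encoding as the POST body."""
    if job_type not in JOB_TYPES:
        raise ValueError(f"type must be one of {JOB_TYPES}")
    sizes = {
        "targetFileSize": target_file_size_mb,
        "minFileSize": min_file_size_mb,
        "maxFileSize": max_file_size_mb,
    }
    if job_type == "VACUUM":
        # config is documented as supported only for OPTIMIZE job types
        given = [table_id, reference, min_files, *sizes.values()]
        if any(value is not None for value in given):
            raise ValueError("config is supported only for OPTIMIZE jobs")
        return {"type": "VACUUM"}
    if not table_id or not reference:
        # assumption: both are required because the reference does not mark them Optional
        raise ValueError("OPTIMIZE jobs need tableId and reference")
    config = {"tableId": table_id, "reference": reference}
    for key, megabytes in sizes.items():
        if megabytes is None:
            continue  # omitted options are left to the documented defaults
        if megabytes < 0:
            raise ValueError(f"{key} must be non-negative")
        config[key] = f"{megabytes} MB"
    if min_files is not None:
        config["minFiles"] = min_files
    return {"type": "OPTIMIZE", "config": config}
```

Passing the result through json.dumps yields a string that can be sent as the request body, in the same shape as the --data-raw payloads in the example requests.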


Example Request for an OPTIMIZE-Type Job
curl -X POST 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json' \
--data-raw '{
"type": "OPTIMIZE",
"config": {
"tableId": "zip_lookup",
"reference": "main",
"targetFileSize": "256 MB",
"minFileSize": "192 MB",
"maxFileSize": "460 MB",
"minFiles": 5
}
}'
Example Response for an OPTIMIZE-Type Job
{
"type": "OPTIMIZE",
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
"state": "SETUP",
"username": "dremio_user@company.com",
"engineSize": "XX_SMALL_V1",
"config": {
"tableId": "zip_lookup",
"reference": "main",
"targetFileSize": "256 MB",
"minFileSize": "192 MB",
"maxFileSize": "460 MB",
"minFiles": 5
}
}

Example Request for a VACUUM-Type Job
curl -X POST 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json' \
--data-raw '{
"type": "VACUUM"
}'
Example Response for a VACUUM-Type Job
{
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "cb2605d9-1758-4ca2-a209-2ea8e4fc8f56",
"state": "SETUP",
"type": "VACUUM",
"username": "dremio_user@company.com",
"startedAt": "2023-09-14T20:38:48Z",
"engineSize": "X_SMALL_V1"
}

Response Status Codes

200

OK

400

Bad Request

401

Unauthorized

404

Not Found

500

Internal Server Error


Listing All Jobs

List all jobs for the specified Arctic catalog.

Method and URL
GET /v0/arctic/catalogs/{catalogId}/jobs

Parameters

catalogId

path

String (UUID)

Unique identifier of the catalog whose jobs you want to list.

Example 5d138bd7-6513-46d7-b9cb-d236e195b34a


pageToken

query

String

Optional

Token for retrieving the next page of jobs. If the Dremio instance has more jobs than the maximum per page (default 10), the response will include a nextPageToken after the data array. Use the nextPageToken value in your request URL as the pageToken value. Do not change any other query parameters included in the request URL when you use pageToken. Read pageToken Query Parameter for usage examples.


maxResults

query

Integer

Optional

The maximum number of results to return in the response. The server may return fewer results, but not more. The default is 10. Read maxResults Query Parameter for usage examples.


filter

query

Object

Optional

A Common Expression Language (CEL) expression that filters the response so that it includes only jobs with the specified attributes and values. You can filter on type, tableId, user, reference, state, and quickfind. The value is a URL-encoded string that represents a JSON object, which specifies the attributes to filter on and the value to match for each attribute. Read filter Query Parameter for usage examples.


view

query

String

Optional

Level of detail to include in the response. Valid values are SUMMARY (default) and FULL (include the config and metrics objects in the job objects in the response). Read view Query Parameter for usage examples.

filter Object Attributes

type

String

The type of job to be run.

Enum OPTIMIZE, VACUUM

Example OPTIMIZE


tableId

String

The name of the table that the job applies to. The table name is preceded by zero or more namespaces with a period (.) as the separator. If a tableId is used, then the type is OPTIMIZE.

Example sample-NYC-taxi-trips, tripdata.sample-NYC-taxi-trips


reference

String

Identifies the branch, tag, or commit that contains the table the job applies to. If a reference is used, then the type is OPTIMIZE.

Example main


user

String

The user who created the job.

Example dremio_user@company.com


state

String

The job's state. The enum lists the valid values.

Enum SETUP, QUEUED, STARTING, RUNNING, COMPLETED, CANCELLED, FAILED

quickfind

String

Composite field that will match the job ID, username, tableId, or reference.
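
One way to assemble the request URL from the query parameters and filter attributes above is sketched below. It assumes, as the filter description states, that the filter value is a JSON object serialized and then URL-encoded; the function name build_list_jobs_url is hypothetical, and the exact matching behavior on the server is not covered here.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://api.dremio.cloud/v0/arctic/catalogs"
FILTER_ATTRIBUTES = {"type", "tableId", "reference", "user", "state", "quickfind"}
VIEWS = {"SUMMARY", "FULL"}


def build_list_jobs_url(catalog_id, filter_obj=None, max_results=None,
                        view=None, page_token=None):
    """Return the GET URL for listing jobs, with optional query parameters."""
    params = {}
    if filter_obj:
        unknown = set(filter_obj) - FILTER_ATTRIBUTES
        if unknown:
            raise ValueError(f"unsupported filter attributes: {sorted(unknown)}")
        # assumption: the JSON object is sent in compact form, then URL-encoded
        params["filter"] = json.dumps(filter_obj, separators=(",", ":"))
    if max_results is not None:
        params["maxResults"] = max_results
    if view is not None:
        if view not in VIEWS:
            raise ValueError(f"view must be one of {sorted(VIEWS)}")
        params["view"] = view
    if page_token is not None:
        params["pageToken"] = page_token
    url = f"{BASE_URL}/{catalog_id}/jobs"
    return f"{url}?{urlencode(params)}" if params else url
```

For example, build_list_jobs_url(catalog_id, {"type": "OPTIMIZE", "state": "FAILED"}, view="FULL") produces a URL whose filter parameter is the percent-encoded JSON object, ready to pass to curl or any HTTP client.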


Example Request
curl -X GET 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json'

In the response to a request to list all jobs, the job objects are wrapped in a data array. Each object in the data array represents one job.

Example Response
{
"data": [
{
"type": "OPTIMIZE",
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-23T20:25:44Z",
"endedAt": "2023-02-23T20:28:50Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "1907c1f6-5fa7-4cf2-ae9c-4089620f7e28",
"state": "FAILED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-23T19:06:52Z",
"endedAt": "2023-02-23T19:09:57Z",
"engineSize": "XX_SMALL_V1",
"errorMessage": "Job has failed due to an internal error. Please contact Support if this issue persists."
},
{
"type": "VACUUM",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "c0a2648f-3400-4613-8a5e-e4ee71386ca2",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T17:18:12Z",
"endedAt": "2023-02-16T17:20:18Z",
"engineSize": "XX_SMALL_V1",
"scheduleId": "55dbb0fa-d7e2-4520-9748-a48d4b96f837"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "7d0ebe6b-908e-42fe-b753-ba4b277a38de",
"state": "FAILED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T17:12:29Z",
"endedAt": "2023-02-16T17:14:34Z",
"engineSize": "XX_SMALL_V1",
"errorMessage": "Job has failed due to an internal error. Please contact Support if this issue persists.",
"scheduleId": "acb29b8c-4358-0a8f-4518-0def8a495081"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "e72dff53-bb7c-4737-8cd1-a788adc209b3",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T17:10:47Z",
"endedAt": "2023-02-16T17:14:52Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "3777cde6-88df-4477-8662-77f7e45e9b69",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T17:06:29Z",
"endedAt": "2023-02-16T17:10:34Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "90522152-efa5-4ed8-8692-7f1fd59b62a0",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T16:03:23Z",
"endedAt": "2023-02-16T16:06:29Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "830c7a49-1408-4047-a6df-4ecf684fb918",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-16T13:53:23Z",
"endedAt": "2023-02-16T14:07:30Z",
"engineSize": "XX_SMALL_V1"
},
{
"type": "OPTIMIZE",
"catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "12946fe4-267a-4398-808c-4e03a97fb961",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-15T23:14:43Z",
"endedAt": "2023-02-15T23:17:48Z",
"engineSize": "XX_SMALL_V1"
}
],
"nextPageToken": "a"
}
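
The nextPageToken in a response like the one above can be followed until no further token is returned. The sketch below separates the paging loop from the HTTP call so that the loop can be exercised without a network connection; iter_jobs and http_page_fetcher are hypothetical names, and the fetcher assumes the bearer-token header shown in the example requests.

```python
import json
import urllib.parse
import urllib.request


def iter_jobs(fetch_page):
    """Yield every job, calling fetch_page(page_token) until no nextPageToken remains."""
    page_token = None
    while True:
        page = fetch_page(page_token)
        yield from page.get("data", [])
        page_token = page.get("nextPageToken")
        if not page_token:
            return


def http_page_fetcher(catalog_id, token, base_url="https://api.dremio.cloud"):
    """Return a fetch_page function that performs the GET request with urllib."""
    def fetch_page(page_token):
        url = f"{base_url}/v0/arctic/catalogs/{catalog_id}/jobs"
        if page_token:
            # only pageToken is added; other query parameters are not used in this sketch
            url += "?" + urllib.parse.urlencode({"pageToken": page_token})
        request = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {token}"}
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    return fetch_page
```

A caller would write something like `for job in iter_jobs(http_page_fetcher(catalog_id, token)): ...`, where catalog_id and token are values you supply.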

Response Status Codes

200

OK

400

Bad Request

401

Unauthorized

404

Not Found

500

Internal Server Error


Retrieving a Job

Retrieve the specified job.

Method and URL
GET /v0/arctic/catalogs/{catalogId}/jobs/{id}

Parameters

catalogId

path

String (UUID)

Unique identifier of the catalog that contains the job you want to retrieve.

Example 5d138bd7-6513-46d7-b9cb-d236e195b34a


id

path

String (UUID)

Unique identifier for the job you want to retrieve.

Example 8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42


Example Request
curl -X GET 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs/8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json'
Example Response
{
"type": "OPTIMIZE",
"catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
"id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
"state": "COMPLETED",
"username": "dremio_user@company.com",
"startedAt": "2023-02-23T20:25:44Z",
"endedAt": "2023-02-23T20:28:50Z",
"engineSize": "XX_SMALL_V1",
"config": {
"tableId": "zip_lookup",
"reference": "main",
"targetFileSize": "256 MB",
"minFileSize": "192 MB",
"maxFileSize": "460 MB",
"minFiles": 5
},
"metrics": {
"rewrittenDataFiles": 6,
"newDataFiles": 2,
"rewrittenDeleteFiles": 5
}
}

Response Status Codes

200

OK

400

Bad Request

401

Unauthorized

404

Not Found

500

Internal Server Error
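
Retrieving a job repeatedly is one way to wait for it to finish. The sketch below polls until the job reaches a state that looks final. Treating COMPLETED, CANCELLED, and FAILED as terminal is an assumption based on the state names, and wait_for_job, its parameters, and the default interval are hypothetical choices rather than documented behavior.

```python
import time

# Assumption: these three states are final; the reference lists them without saying so.
TERMINAL_STATES = frozenset({"COMPLETED", "CANCELLED", "FAILED"})


def wait_for_job(fetch_job, interval_seconds=10.0, max_polls=60, sleep=time.sleep):
    """Call fetch_job() until the returned job object is in a terminal state.

    fetch_job is any zero-argument callable that returns the decoded job object,
    for example a function that sends GET /v0/arctic/catalogs/{catalogId}/jobs/{id}.
    """
    if max_polls < 1:
        raise ValueError("max_polls must be at least 1")
    for attempt in range(max_polls):
        job = fetch_job()
        if job["state"] in TERMINAL_STATES:
            return job
        if attempt + 1 < max_polls:
            sleep(interval_seconds)  # no sleep after the final attempt
    raise TimeoutError(f"job still {job['state']} after {max_polls} polls")
```

Injecting the sleep function keeps the loop easy to test and lets callers substitute their own back-off strategy.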


Canceling a Job

Cancel the specified job.

Method and URL
POST /v0/arctic/catalogs/{catalogId}/jobs/{id}/cancel

Parameters

catalogId

path

String (UUID)

Unique identifier for the catalog that contains the job you want to cancel.


id

path

String (UUID)

Unique identifier for the job to cancel.


Example Request
curl -X POST 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs/8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42/cancel' \
--header 'Authorization: Bearer <personal access token>' \
--header 'Content-Type: application/json'
Example Response
No response

Response Status Codes

204

No Content

400

Bad Request

403

Forbidden

404

Not Found
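
A cancel request can be prepared and its status interpreted roughly as follows. This sketch uses the POST method and /cancel path listed under Method and URL for this endpoint; build_cancel_request and describe_cancel_status are hypothetical helper names, and the status descriptions only restate the codes listed above.

```python
import urllib.request

BASE_URL = "https://api.dremio.cloud"

# Status codes documented for the cancel endpoint
CANCEL_STATUS = {
    204: "No Content (success, empty body)",
    400: "Bad Request",
    403: "Forbidden",
    404: "Not Found",
}


def build_cancel_request(catalog_id, job_id, token):
    """Prepare, but do not send, the cancel request for one job."""
    url = f"{BASE_URL}/v0/arctic/catalogs/{catalog_id}/jobs/{job_id}/cancel"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


def describe_cancel_status(status_code):
    """Map an HTTP status code to its documented meaning."""
    return CANCEL_STATUS.get(status_code, f"undocumented status {status_code}")
```

To send the request, pass it to urllib.request.urlopen. A successful call returns a response whose status is 204, while error statuses raise urllib.error.HTTPError, whose code attribute can be passed to describe_cancel_status.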