    Jobs (Preview)

    Dremio Arctic enables you to schedule optimization jobs to help you manage the accumulation of the data files that occurs through DML operations. This API allows you to retrieve job status and run a one-off job.

    Jobs Object
    {
      "type": "OPTIMIZE",
      "catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
      "id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
      "state": "COMPLETED",
      "username": "dremio_user@company.com",
      "startedAt": "2023-02-23T20:25:44Z",
      "endedAt": "2023-02-23T20:28:50Z",
      "engineSize": "XX_SMALL_V1",
      "config": {
        "tableId": "zip_lookup",
        "reference": "main",
        "targetFileSize": "256 MB",
        "minFileSize": "192 MB",
        "maxFileSize": "460 MB",
        "minFiles": 5
      },
      "metrics": {
        "rewrittenDataFiles": 6,
        "newDataFiles": 2
      }
    }
    

    Job Attributes

    type

    String

    The type of the job. For Arctic optimization jobs, the type is OPTIMIZE.

    Example OPTIMIZE


    catalogId

    String (UUID)

    Unique identifier for the catalog where the job ran.

    Example 5d138bd7-6513-46d7-b9cb-d236e195b34a


    id

    String (UUID)

    Unique identifier for the job.

    Example 8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42


    state

    String

    Status of the job.

    Enum SETUP , QUEUED , STARTING , RUNNING , COMPLETED , CANCELLED , FAILED

    Example COMPLETED


    username

    String

    The user who created the job.

    Example dremio_user@company.com


    startedAt

    String

    Date and time when the job started, in UTC format.

    Example 2023-02-23T20:25:44Z


    endedAt

    String

    Date and time when the job ended, in UTC format.

    Example 2023-02-23T20:28:50Z


    engineSize

    String

    Engine size used by the job.

    Enum XX_SMALL_V1 , X_SMALL_V1 , SMALL_V1 , MEDIUM_V1 , LARGE_V1 , X_LARGE_V1 , XX_LARGE_V1 , XXX_LARGE_V1

    Example XX_SMALL_V1


    config

    Object

    Configuration options for the job.

    Example { "tableId": "zip_lookup", "reference": "main", "targetFileSize": "256 MB", "minFileSize": "192 MB", "maxFileSize": "460 MB", "minFiles": 5 }


    metrics

    Object

    Information about the number of rewritten and new data files that result from the job.

    Example { "rewrittenDataFiles": 6, "newDataFiles": 2 }

    config Object Attributes

    tableId

    String

    The name of the table that the optimization job applies to. The table name is preceded by zero or more namespaces with a period (.) as the separator.

    Example zip_lookup, address.zip_lookup


    reference

    String

    Identifies the branch, tag, or commit where the table that the optimization job applies to is located.

    Example main


    targetFileSize

    String

    Controls the target size of the files that are generated for type OPTIMIZE. The integer must be a non-negative number. The default file size is 256 MB.

    Example 256 MB


    minFileSize

    String

    The minimum file size threshold for type OPTIMIZE. Files that are smaller in size than the specified minFileSize qualify for optimization. The integer must be a non-negative number. The default value is 192 MB.

    Example 192 MB


    maxFileSize

    String

    The maximum file size threshold for type OPTIMIZE. Files that are larger in size than the specified maxFileSize qualify for optimization. The integer must be a non-negative number. The default value is 460 MB.

    Example 460 MB


    minFiles

    Integer

    The minimum number of qualified files required to run an optimization job. Supports only type OPTIMIZE. The default is 5.

    Example 5
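
    The three thresholds above can be read as a simple qualification rule. The following is a minimal client-side sketch of that rule, not Dremio's actual selection logic. It assumes file sizes are given in whole megabytes, that size strings always use the "MB" unit shown in the examples, and that a job runs when the number of qualified files is at least minFiles. The helper names are hypothetical.

```python
# Documented defaults for the OPTIMIZE job configuration.
DEFAULTS = {"minFileSize": "192 MB", "maxFileSize": "460 MB", "minFiles": 5}


def parse_size_mb(value: str) -> int:
    """Parse a size string such as "256 MB" into a whole number of megabytes."""
    number, unit = value.split()
    if unit != "MB":
        # Assumption: only the "MB" unit from the examples is handled here.
        raise ValueError(f"unsupported unit in {value!r}")
    size = int(number)
    if size < 0:
        raise ValueError("size must be a non-negative number")
    return size


def qualifying_files(file_sizes_mb, config):
    """Return the sizes that fall below minFileSize or above maxFileSize."""
    min_size = parse_size_mb(config.get("minFileSize", DEFAULTS["minFileSize"]))
    max_size = parse_size_mb(config.get("maxFileSize", DEFAULTS["maxFileSize"]))
    return [size for size in file_sizes_mb if size < min_size or size > max_size]


def should_optimize(file_sizes_mb, config):
    """Assumed rule: run only when at least minFiles files qualify."""
    min_files = int(config.get("minFiles", DEFAULTS["minFiles"]))
    return len(qualifying_files(file_sizes_mb, config)) >= min_files


sizes = [10, 50, 100, 150, 200, 256, 500]
print(qualifying_files(sizes, {}))
print(should_optimize(sizes, {}))
```

    With the default configuration, the 200 MB and 256 MB files in this list sit between the two thresholds and are left alone, while the remaining five files qualify.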

    metrics Object Attributes

    rewrittenDataFiles

    Integer

    Number of data files that were rewritten in the job.

    Example 6


    newDataFiles

    Integer

    Number of new data files that were created in the job.

    Example 2
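
    As an illustration of how the job attributes fit together, the sketch below parses a job object and derives the run time from startedAt and endedAt. It assumes the timestamps always use the YYYY-MM-DDTHH:MM:SSZ form shown in the examples. The function names are hypothetical and not part of the API.

```python
import json
from datetime import datetime, timezone


def parse_utc(timestamp: str) -> datetime:
    """Parse a timestamp such as 2023-02-23T20:25:44Z into an aware datetime."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def job_duration_seconds(job: dict):
    """Return the run time in seconds, or None if the job has not ended yet."""
    started, ended = job.get("startedAt"), job.get("endedAt")
    if not started or not ended:
        return None
    return (parse_utc(ended) - parse_utc(started)).total_seconds()


job = json.loads("""
{
  "type": "OPTIMIZE",
  "state": "COMPLETED",
  "startedAt": "2023-02-23T20:25:44Z",
  "endedAt": "2023-02-23T20:28:50Z",
  "metrics": {"rewrittenDataFiles": 6, "newDataFiles": 2}
}
""")

print(job_duration_seconds(job))  # 186.0
print(job["metrics"]["rewrittenDataFiles"], "files rewritten into",
      job["metrics"]["newDataFiles"])
```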

    Creating a Job

    Create a job for the specified Arctic catalog.

    Method and URL
    POST /v0/arctic/catalogs/{catalogId}/jobs
    

    Parameters

    catalogId

    path

    String (UUID)

    Unique identifier for the catalog where the job should run.

    Example 5d138bd7-6513-46d7-b9cb-d236e195b34a


    type

    body

    String

    The type of job to run. For Arctic optimization jobs, the type is OPTIMIZE.

    Example OPTIMIZE


    config

    body

    Object

    Configuration options for the job.

    Example { "tableId": "zip_lookup", "reference": "main", "targetFileSize": "256 MB", "minFileSize": "192 MB", "maxFileSize": "460 MB", "minFiles": 5 }


    config Object Parameters

    tableId

    body

    String

    The name of the table that the job applies to. The table name is preceded by zero or more namespaces with a period (.) as the separator.

    Example zip_lookup, address.zip_lookup


    reference

    body

    String

    Identifies the branch, tag, or commit where the table that the job applies to is located.

    Example main


    targetFileSize

    body

    String

    Optional

    Controls the target size of the files that will be generated for type OPTIMIZE. The integer must be a non-negative number. The default file size is 256 MB.

    Example 256 MB


    minFileSize

    body

    String

    Optional

    The minimum file size threshold for type OPTIMIZE. Files that are smaller in size than the specified minFileSize qualify for optimization. The integer must be a non-negative number. The default value is 192 MB.

    Example 192 MB


    maxFileSize

    body

    String

    Optional

    The maximum file size threshold for type OPTIMIZE. Files that are larger in size than the specified maxFileSize qualify for optimization. The integer must be a non-negative number. The default value is 460 MB.

    Example 460 MB


    minFiles

    body

    Integer

    Optional

    The minimum number of qualified files required to run the job. Supports only type OPTIMIZE. The default is 5.

    Example 5


    Example Request
    curl -X POST 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs' \
    --header 'Authorization: Bearer <personal access token>' \
    --header 'Content-Type: application/json' \
    --data-raw '{
      "type": "OPTIMIZE",
      "config": {
        "tableId": "zip_lookup",
        "reference": "main",
        "targetFileSize": "256 MB",
        "minFileSize": "192 MB",
        "maxFileSize": "460 MB",
        "minFiles": 5
      }
    }'
    
    Example Response
    {
      "type": "OPTIMIZE",
      "catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
      "id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
      "state": "SETUP",
      "username": "dremio_user@company.com",
      "engineSize": "XX_SMALL_V1",
      "config": {
        "tableId": "zip_lookup",
        "reference": "main",
        "targetFileSize": "256 MB",
        "minFileSize": "192 MB",
        "maxFileSize": "460 MB",
        "minFiles": 5
      }
    }
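
    For readers who prefer a scripting language over curl, the same request could be assembled in Python along these lines. This is a sketch that uses only the standard library and builds the request without sending it. The function name build_create_job_request and the token value are placeholders, not part of the API.

```python
import json
import urllib.request

BASE_URL = "https://api.dremio.cloud"


def build_create_job_request(catalog_id, token, table_id, reference="main", **options):
    """Build (but do not send) a POST request that creates an OPTIMIZE job.

    Extra keyword arguments (for example minFiles=5) are passed through
    as optional entries of the config object.
    """
    config = {"tableId": table_id, "reference": reference}
    config.update(options)
    body = json.dumps({"type": "OPTIMIZE", "config": config}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v0/arctic/catalogs/{catalog_id}/jobs",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_create_job_request(
    "5d138bd7-6513-46d7-b9cb-d236e195b34a",
    "<personal access token>",
    "zip_lookup",
    targetFileSize="256 MB",
    minFiles=5,
)
print(request.get_method(), request.full_url)
```

    Passing the request to urllib.request.urlopen would send it. The response body could then be decoded with json.load to obtain the new job object.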
    

    Response Status Codes

    200

    OK

    400

    Bad Request

    401

    Unauthorized

    404

    Not Found

    500

    Internal Server Error


    Listing All Jobs

    Returns a listing of all jobs for the specified Arctic catalog.

    Method and URL
    GET /v0/arctic/catalogs/{catalogId}/jobs
    

    Parameters

    catalogId

    path

    String (UUID)

    Unique identifier of the catalog you want to list all jobs for.

    Example 5d138bd7-6513-46d7-b9cb-d236e195b34a


    pageToken

    query

    String

    Optional

    Token for retrieving the next page of jobs. If the Dremio instance has more jobs than the maximum per page (default 10), the response will include a nextPageToken after the data array. Use the nextPageToken value in your request URL as the pageToken value. Do not change any other query parameters included in the request URL when you use pageToken. Read pageToken Query Parameter for usage examples.


    maxResults

    query

    Integer

    Optional

    The maximum number of results to return in the response. The server may return fewer results, but not more. The default is 10. Read maxResults Query Parameter for usage examples.


    filter

    query

    Object

    Optional

    A Common Expression Language (CEL) expression that filters responses so that they include only results with the specified attributes and values. You can filter on job type, tableId, user, reference, state, and quickfind. The value is a URL-encoded string that represents a JSON object. The JSON object specifies the attributes to filter on and the values to match for each attribute. Read filter Query Parameter for usage examples.


    view

    query

    String

    Optional

    Level of detail to include in the response. Valid values are SUMMARY (default) and FULL (include the config and metrics objects in the job objects in the response). Read view Query Parameter for usage examples.

    filter Object Attributes

    type

    String

    The type of job to be run. For Arctic optimization jobs, the type is OPTIMIZE.

    Example OPTIMIZE


    tableId

    String

    The name of the table that the job applies to. The table name is preceded by zero or more namespaces with a period (.) as the separator. If a tableId is used, then the type is OPTIMIZE.

    Example sample-NYC-taxi-trips, tripdata.sample-NYC-taxi-trips


    reference

    String

    Identifies the branch, tag, or commit where the table that the job applies to is located. If a reference is used, then the type is OPTIMIZE.

    Example main


    user

    String

    The user who created the job.

    Example dremio_user@company.com


    state

    String

    The job's state. The enum lists the valid values.

    Enum SETUP , QUEUED , STARTING , RUNNING , COMPLETED , CANCELLED , FAILED


    quickfind

    String

    Composite field that will match the job ID, username, tableId, or reference.
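
    To illustrate how the query parameters combine, the sketch below builds a request URL with a filter, following the description of the filter value as a URL-encoded JSON object. The exact filter syntax the server accepts should be confirmed against the filter Query Parameter page. The function name build_list_jobs_url is hypothetical.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://api.dremio.cloud"


def build_list_jobs_url(catalog_id, filter_obj=None, max_results=None,
                        view=None, page_token=None):
    """Build the URL for listing jobs, adding only the parameters provided."""
    params = {}
    if filter_obj:
        # Serialize the filter compactly; urlencode then percent-encodes it.
        params["filter"] = json.dumps(filter_obj, separators=(",", ":"))
    if max_results is not None:
        params["maxResults"] = str(max_results)
    if view:
        params["view"] = view
    if page_token:
        params["pageToken"] = page_token
    url = f"{BASE_URL}/v0/arctic/catalogs/{catalog_id}/jobs"
    return f"{url}?{urlencode(params)}" if params else url


print(build_list_jobs_url(
    "5d138bd7-6513-46d7-b9cb-d236e195b34a",
    filter_obj={"state": "FAILED"},
    max_results=5,
))
```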


    Example Request
    curl -X GET 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs' \
    --header 'Authorization: Bearer <personal access token>' \
    --header 'Content-Type: application/json'
    

    In the response for a request to retrieve all Arctic optimization jobs, the job objects are wrapped in a data array. Each object in the data array represents one job.

    Each object in the response also includes two attributes that are not part of the individual job object:

    • errorMessage (String): For unsuccessful jobs, a description of the problem.
    • scheduleId (String): Unique identifier for the schedule that created the job, if applicable. Empty for jobs created via API or the “Optimize Once” option in the Dremio Cloud application.
    Example Response
    {
      "data": [
        {
          "type": "OPTIMIZE",
          "catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
          "id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
          "state": "COMPLETED",
          "username": "dremio_user@company.com",
          "startedAt": "2023-02-23T20:25:44Z",
          "endedAt": "2023-02-23T20:28:50Z",
          "engineSize": "XX_SMALL_V1",
          "errorMessage": "",
          "scheduleId": ""
        },
        {
          "type": "OPTIMIZE",
          "catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
          "id": "1907c1f6-5fa7-4cf2-ae9c-4089620f7e28",
          "state": "COMPLETED",
          "username": "dremio_user@company.com",
          "startedAt": "2023-02-23T19:06:52Z",
          "endedAt": "2023-02-23T19:09:57Z",
          "engineSize": "XX_SMALL_V1",
          "errorMessage": "",
          "scheduleId": ""
        },
        {
          "type": "OPTIMIZE",
          "catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
          "id": "c0a2648f-3400-4613-8a5e-e4ee71386ca2",
          "state": "COMPLETED",
          "username": "dremio_user@company.com",
          "startedAt": "2023-02-16T17:18:12Z",
          "endedAt": "2023-02-16T17:20:18Z",
          "engineSize": "XX_SMALL_V1",
          "errorMessage": "",
          "scheduleId": ""
        },
        {
          "type": "OPTIMIZE",
          "catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
          "id": "7d0ebe6b-908e-42fe-b753-ba4b277a38de",
          "state": "COMPLETED",
          "username": "dremio_user@company.com",
          "startedAt": "2023-02-16T17:12:29Z",
          "endedAt": "2023-02-16T17:14:34Z",
          "engineSize": "XX_SMALL_V1",
          "errorMessage": "",
          "scheduleId": ""
        },
        {
          "type": "OPTIMIZE",
          "catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
          "id": "e72dff53-bb7c-4737-8cd1-a788adc209b3",
          "state": "COMPLETED",
          "username": "dremio_user@company.com",
          "startedAt": "2023-02-16T17:10:47Z",
          "endedAt": "2023-02-16T17:14:52Z",
          "engineSize": "XX_SMALL_V1",
          "errorMessage": "",
          "scheduleId": ""
        },
        {
          "type": "OPTIMIZE",
          "catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
          "id": "3777cde6-88df-4477-8662-77f7e45e9b69",
          "state": "COMPLETED",
          "username": "dremio_user@company.com",
          "startedAt": "2023-02-16T17:06:29Z",
          "endedAt": "2023-02-16T17:10:34Z",
          "engineSize": "XX_SMALL_V1",
          "errorMessage": "",
          "scheduleId": ""
        },
        {
          "type": "OPTIMIZE",
          "catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
          "id": "90522152-efa5-4ed8-8692-7f1fd59b62a0",
          "state": "COMPLETED",
          "username": "dremio_user@company.com",
          "startedAt": "2023-02-16T16:03:23Z",
          "endedAt": "2023-02-16T16:06:29Z",
          "engineSize": "XX_SMALL_V1",
          "errorMessage": "",
          "scheduleId": "55dbb0fa-d7e2-4520-9748-a48d4b96f837"
        },
        {
          "type": "OPTIMIZE",
          "catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
          "id": "830c7a49-1408-4047-a6df-4ecf684fb918",
          "state": "COMPLETED",
          "username": "dremio_user@company.com",
          "startedAt": "2023-02-16T13:53:23Z",
          "endedAt": "2023-02-16T14:07:30Z",
          "engineSize": "XX_SMALL_V1",
          "errorMessage": "",
          "scheduleId": ""
        },
        {
          "type": "OPTIMIZE",
          "catalogId": "6d138bd7-6513-46d7-b9cb-d236e195b34a",
          "id": "12946fe4-267a-4398-808c-4e03a97fb961",
          "state": "COMPLETED",
          "username": "dremio_user@company.com",
          "startedAt": "2023-02-15T23:14:43Z",
          "endedAt": "2023-02-15T23:17:48Z",
          "engineSize": "XX_SMALL_V1",
          "errorMessage": "",
          "scheduleId": ""
        }
      ],
      "nextPageToken": "a"
    }
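
    Because the response carries a nextPageToken whenever more jobs remain, a client typically loops until the token is absent. The following sketch shows that loop in isolation. The fetch_page callable is a hypothetical stand-in for whatever code performs the GET request with the given pageToken, and the fake pages exist only to make the example runnable.

```python
from typing import Callable, Iterator, Optional


def iter_jobs(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every job across all pages.

    fetch_page receives the pageToken to send (None for the first page)
    and returns the decoded JSON response as a dict.
    """
    page_token = None
    while True:
        page = fetch_page(page_token)
        yield from page.get("data", [])
        page_token = page.get("nextPageToken")
        if not page_token:
            break  # No token means this was the last page.


# Fake two-page result set standing in for real API responses.
fake_pages = {
    None: {"data": [{"id": "job-1"}, {"id": "job-2"}], "nextPageToken": "a"},
    "a": {"data": [{"id": "job-3"}]},
}

job_ids = [job["id"] for job in iter_jobs(lambda token: fake_pages[token])]
print(job_ids)
```

    Keeping the HTTP call behind a callable makes it straightforward to leave the other query parameters unchanged between pages, as the pageToken description requires.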
    

    Response Status Codes

    200

    OK

    400

    Bad Request

    401

    Unauthorized

    404

    Not Found

    500

    Internal Server Error


    Retrieving a Job

    Retrieve the specified job.

    Method and URL
    GET /v0/arctic/catalogs/{catalogId}/jobs/{id}
    

    Parameters

    catalogId

    path

    String (UUID)

    Unique identifier of the catalog that contains the job you want to retrieve.

    Example 5d138bd7-6513-46d7-b9cb-d236e195b34a


    id

    path

    String (UUID)

    Unique identifier for the job you want to retrieve.

    Example 8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42


    Example Request
    curl -X GET 'https://api.dremio.cloud/v0/arctic/catalogs/{catalogId}/jobs/8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42' \
    --header 'Authorization: Bearer <personal access token>' \
    --header 'Content-Type: application/json' 
    
    Example Response
    {
      "type": "OPTIMIZE",
      "catalogId": "5d138bd7-6513-46d7-b9cb-d236e195b34a",
      "id": "8c03f8c8-2c21-49c2-aa5a-dfaed20f5f42",
      "state": "COMPLETED",
      "username": "dremio_user@company.com",
      "startedAt": "2023-02-23T20:25:44Z",
      "endedAt": "2023-02-23T20:28:50Z",
      "engineSize": "XX_SMALL_V1",
      "config": {
        "tableId": "zip_lookup",
        "reference": "main",
        "targetFileSize": "256 MB",
        "minFileSize": "192 MB",
        "maxFileSize": "460 MB",
        "minFiles": 5
      },
      "metrics": {
        "rewrittenDataFiles": 6,
        "newDataFiles": 2
      }
    }
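
    A common use of this endpoint is to poll a job until it finishes. The sketch below shows one way to do that. It assumes that COMPLETED, CANCELLED, and FAILED are the terminal states, which the enum suggests but this page does not state explicitly. The get_job callable is a hypothetical stand-in for the GET request, and the interval and attempt limits are arbitrary.

```python
import time
from typing import Callable

# Assumption: these are the states after which a job no longer changes.
TERMINAL_STATES = {"COMPLETED", "CANCELLED", "FAILED"}


def wait_for_job(get_job: Callable[[], dict], interval_seconds: float = 10.0,
                 max_attempts: int = 60, sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call get_job repeatedly until the job reaches a terminal state."""
    for attempt in range(max_attempts):
        job = get_job()
        if job.get("state") in TERMINAL_STATES:
            return job
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"job still not finished after {max_attempts} checks")


# Simulated sequence of states standing in for successive API responses.
states = iter(["SETUP", "QUEUED", "RUNNING", "COMPLETED"])
final_job = wait_for_job(lambda: {"state": next(states)}, sleep=lambda _seconds: None)
print(final_job["state"])
```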
    

    Response Status Codes

    200

    OK

    400

    Bad Request

    401

    Unauthorized

    404

    Not Found

    500

    Internal Server Error