Vacuum
Dremio Arctic can run vacuum (table cleanup) jobs to remove expired snapshots and orphaned metadata files for Iceberg tables. This API allows you to add, retrieve, modify, and delete cutoff policies and to enable and disable schedules for Arctic catalog vacuum jobs.
Vacuum Object
{
"defaultCutoffPolicy": "P5D",
"isVacuumScheduled": true
}
Vacuum Attributes
defaultCutoffPolicy String
Duration string that describes the cutoff policy for files that Dremio removes when the VACUUM CATALOG command runs on an Arctic catalog. The duration string starts with P and is followed by the duration value and D to indicate days. For example, P1D means 1 day, P30D means 30 days, and P90D means 90 days. Dremio propagates the defaultCutoffPolicy value to the Nessie garbage collector configuration. For more information, read Enabling Table Cleanup and Setting the Cutoff Policy.
Example: P5D
isVacuumScheduled Boolean
Indicates whether an automatic VACUUM CATALOG schedule is enabled for the specified catalog. For more information, read Enabling Table Cleanup and Setting the Cutoff Policy.
Example: true
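The duration format described for defaultCutoffPolicy can be sketched as a pair of small Python helpers. The function names are hypothetical and are not part of any Dremio client. The sketch handles only the day-based P<n>D form described above, and its rejection of values below 1 day is an assumption rather than a documented limit.

```python
import re

# Day-based form described above: "P", a whole number of days, then "D".
_CUTOFF_RE = re.compile(r"P(\d+)D")


def cutoff_policy_from_days(days: int) -> str:
    """Format a whole number of days as a cutoff policy string (5 -> 'P5D')."""
    # Assumption: at least 1 day; the documentation does not state a minimum.
    if isinstance(days, bool) or not isinstance(days, int) or days < 1:
        raise ValueError("days must be a positive integer")
    return f"P{days}D"


def days_from_cutoff_policy(policy: str) -> int:
    """Parse a cutoff policy string such as 'P30D' back into a day count."""
    match = _CUTOFF_RE.fullmatch(policy)
    if match is None:
        raise ValueError(f"not a day-based duration string: {policy!r}")
    return int(match.group(1))


print(cutoff_policy_from_days(5))       # P5D
print(days_from_cutoff_policy("P30D"))  # 30
```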
Creating a Cutoff Policy and Vacuum Schedule
Create a cutoff policy and enable or disable the vacuum schedule for the specified Arctic catalog.
Method and URL
POST /v0/arctic/catalogs/{catalogId}/engine/vacuum
Parameters
catalogId Path String (UUID)
Unique identifier for the Arctic catalog whose VACUUM CATALOG configuration you wish to create.
Example: d34bd884-4d19-fe37-ac42-e45443b8234c
defaultCutoffPolicy Body String
Duration string that describes the cutoff policy for files that Dremio removes when the VACUUM CATALOG command runs on an Arctic catalog. The duration string starts with P and is followed by the duration value and D to indicate days. For example, P1D means 1 day, P30D means 30 days, and P90D means 90 days. Dremio propagates the defaultCutoffPolicy value you provide to the Nessie garbage collector configuration.
Example: P5D
isVacuumScheduled Body Boolean
To enable table cleanup so that Dremio automatically runs the VACUUM CATALOG command on the specified Arctic catalog once per day at 00:00 UTC, set to true. Otherwise, set to false. If you set isVacuumScheduled to false and provide a valid value for defaultCutoffPolicy in the same request, Dremio propagates the defaultCutoffPolicy value to the Nessie garbage collector configuration and disables the VACUUM CATALOG schedule for the specified Arctic catalog. If you want Dremio to automatically run VACUUM CATALOG at a different interval and time than daily at 00:00 UTC, use the Arctic Catalog Schedules API to create or modify the VACUUM CATALOG schedule.
Example: true
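As an illustration of the two body parameters, the following Python sketch builds the JSON request body. The helper name build_vacuum_body is hypothetical. Because isVacuumScheduled is documented as a Boolean, the sketch rejects string values such as "true" so that the serialized body carries a JSON boolean; whether the service itself would reject a string is not stated here and is not assumed.

```python
import json


def build_vacuum_body(default_cutoff_policy: str, is_vacuum_scheduled: bool) -> str:
    """Serialize the two documented body parameters as a JSON object."""
    # isVacuumScheduled is documented as a Boolean, so refuse "true"/"false" strings.
    if not isinstance(is_vacuum_scheduled, bool):
        raise TypeError("isVacuumScheduled must be a boolean, not a string")
    return json.dumps(
        {
            "defaultCutoffPolicy": default_cutoff_policy,
            "isVacuumScheduled": is_vacuum_scheduled,
        }
    )


# Set the cutoff policy to 5 days and enable the daily schedule.
body = build_vacuum_body("P5D", True)
print(body)
```

Passing False instead of True would still send the cutoff policy, matching the case above where the policy is propagated while the schedule is disabled.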
Example Request
curl -X POST 'https://api.dremio.cloud/v0/arctic/catalogs/d34bd884-4d19-fe37-ac42-e45443b8234c/engine/vacuum' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json' \
--data-raw '{
"defaultCutoffPolicy": "P5D",
"isVacuumScheduled": true
}'
Example Response
A successful request to create a vacuum cutoff policy and schedule returns an empty response with the HTTP 200 OK status response code.
Response Status Codes
200 OK
400 Bad Request
404 Not Found
500 Internal Server Error
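For readers calling the endpoint from a script, here is a minimal Python sketch of the POST request using only the standard library. The function names are hypothetical, the token is a placeholder, and nothing is sent unless send_request is called. The sketch relies on urllib raising HTTPError for non-2xx responses, so the 400, 404, and 500 codes listed above surface as the returned status.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "https://api.dremio.cloud"


def build_create_request(catalog_id, token, cutoff_policy, scheduled):
    """Build (but do not send) the POST request that creates the vacuum configuration."""
    url = f"{BASE_URL}/v0/arctic/catalogs/{catalog_id}/engine/vacuum"
    body = json.dumps(
        {"defaultCutoffPolicy": cutoff_policy, "isVacuumScheduled": scheduled}
    ).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, method="POST", headers=headers)


def send_request(req):
    """Send the request and return the HTTP status code (200 on success)."""
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses arrive here, for example 400, 404, or 500.
        return err.code


req = build_create_request(
    "d34bd884-4d19-fe37-ac42-e45443b8234c", "<PersonalAccessToken>", "P5D", True
)
print(req.get_method(), req.full_url)
```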
Retrieving a Vacuum Cutoff Policy and Schedule
Retrieve the vacuum cutoff policy and schedule configuration for the specified Arctic catalog.
Method and URL
GET /v0/arctic/catalogs/{catalogId}/engine/vacuum
Parameters
catalogId Path String (UUID)
Unique identifier for the Arctic catalog whose VACUUM CATALOG configuration you wish to retrieve.
Example: d34bd884-4d19-fe37-ac42-e45443b8234c
Example Request
curl -X GET 'https://api.dremio.cloud/v0/arctic/catalogs/d34bd884-4d19-fe37-ac42-e45443b8234c/engine/vacuum' \
--header 'Authorization: Bearer <PersonalAccessToken>' \
--header 'Content-Type: application/json'
Example Response
{
"defaultCutoffPolicy": "P5D",
"isVacuumScheduled": true
}
Response Status Codes
200 OK
404 Not Found
500 Internal Server Error
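The retrieval call and its response can be sketched in Python as follows. The VacuumConfig class and both function names are hypothetical, and the sample string stands in for a live response body, so the example makes no network call.

```python
import json
import urllib.request
from dataclasses import dataclass

BASE_URL = "https://api.dremio.cloud"


@dataclass(frozen=True)
class VacuumConfig:
    """The two attributes of the Vacuum object."""
    default_cutoff_policy: str
    is_vacuum_scheduled: bool


def build_get_request(catalog_id, token):
    """Build (but do not send) the GET request for a catalog's vacuum configuration."""
    url = f"{BASE_URL}/v0/arctic/catalogs/{catalog_id}/engine/vacuum"
    headers = {"Authorization": f"Bearer {token}"}
    return urllib.request.Request(url, method="GET", headers=headers)


def parse_vacuum_config(text):
    """Convert a JSON response body into a VacuumConfig."""
    data = json.loads(text)
    return VacuumConfig(data["defaultCutoffPolicy"], data["isVacuumScheduled"])


# Sample body copied from the example response; not fetched from the service.
sample = '{"defaultCutoffPolicy": "P5D", "isVacuumScheduled": true}'
config = parse_vacuum_config(sample)
print(config)
```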