Automated Maintenance with Dremio Catalog
Dremio Catalog automates maintenance tasks for data within the catalog to optimize query performance and minimize storage costs. Dremio Catalog currently supports automation for the following Iceberg maintenance tasks:
- Table optimization, which compacts small files into larger files.
- Table cleanup, which deletes expired snapshots and orphaned metadata files.
Enabling Automatic Optimization and Table Cleanup
Automatic Optimization
To enable automated optimization for the Dremio Catalog:
- In the Sources section in the bottom-left corner of the Datasets page, right-click on your Dremio Catalog source and click Settings.
- In the Source Settings dialog, select Advanced Options.
- Toggle Enable auto optimization.
- Click Save.
Automatic Table Cleanup
To enable automated table cleanup (vacuum) for Dremio Catalog:
- In the Sources section in the bottom-left corner of the Datasets page, right-click on your Dremio Catalog source and click Settings.
- In the Source Settings dialog, select Advanced Options.
- Toggle Enable table clean up.
- Click Save.
Table-Level Configuration
To enable or disable automatic optimization and cleanup at the table level within Dremio Catalog:
- Locate the desired table in the Dremio Catalog.
- Right-click on the table name and open the table settings.
- In the Table Settings dialog, select Table Maintenance from the settings sidebar.
- Toggle the relevant settings: Enable automatic table maintenance, Enable table cleanup, or both.
Customization
The following support keys are used to configure frequency and behavior for automatic maintenance operations:
- dremio.optimization.auto.optimize.period.hours: controls how often automatic optimization runs. Defaults to 3 hours.
- dremio.optimization.auto.vacuum.period.hours: controls how often table cleanup runs. Defaults to 24 hours.
- dremio.optimization.auto.maintenance.rate_limit.batch_size: controls the maximum number of concurrent maintenance queries. Defaults to 10.
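
To reason about how the period settings translate into daily activity, the following Python sketch may help. It is only an illustration, not a Dremio API: the key names and default values come from the support keys listed above, while the helper functions `resolve_settings` and `runs_per_day` and the "positive integer" validation rule are hypothetical and assumed for the example.

```python
# Defaults taken from the support keys documented above.
# Everything else in this sketch is illustrative, not part of Dremio.
DEFAULTS = {
    "dremio.optimization.auto.optimize.period.hours": 3,
    "dremio.optimization.auto.vacuum.period.hours": 24,
    "dremio.optimization.auto.maintenance.rate_limit.batch_size": 10,
}


def resolve_settings(overrides=None):
    """Merge overrides onto the documented defaults.

    Rejects unknown keys and non-positive values; this validation
    rule is an assumption made for the sketch.
    """
    settings = dict(DEFAULTS)
    for key, value in (overrides or {}).items():
        if key not in DEFAULTS:
            raise KeyError(f"unknown support key: {key}")
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{key} must be a positive integer, got {value!r}")
        settings[key] = value
    return settings


def runs_per_day(period_hours):
    """Number of runs in 24 hours for a task scheduled every period_hours."""
    return 24 / period_hours


# Example: run optimization every 6 hours instead of the default 3.
settings = resolve_settings({"dremio.optimization.auto.optimize.period.hours": 6})
optimize_runs = runs_per_day(settings["dremio.optimization.auto.optimize.period.hours"])
vacuum_runs = runs_per_day(settings["dremio.optimization.auto.vacuum.period.hours"])
print(optimize_runs, vacuum_runs)
```

With the override shown, optimization would run four times per day and table cleanup once per day, while the concurrency limit stays at its default of 10.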