Automated Maintenance with the Open Catalog
The Open Catalog automates maintenance tasks for data within the catalog to optimize query performance and minimize storage costs. Open Catalog currently supports automation for the following Iceberg maintenance tasks:
- Table optimization, which compacts small files into larger files.
- Table cleanup, which deletes expired snapshots and orphaned metadata files.
Enabling Automatic Optimization and Table Cleanup
Automatic Optimization
To enable automated optimization for the Open Catalog:
- In the Sources section in the bottom-left corner of the Datasets page, right-click on your Open Catalog source and click Settings.
- In the Source Settings dialog, select Advanced Options.
- Toggle Enable auto optimization.
- Click Save.
Automatic Table Cleanup
To enable automated table cleanup for the Open Catalog:
- In the Sources section in the bottom-left corner of the Datasets page, right-click on your Open Catalog source and click Settings.
- In the Source Settings dialog, select Advanced Options.
- Toggle Enable table clean up.
- Click Save.
Table-Level Configuration
To enable or disable automatic optimization and cleanup for an individual table in the Open Catalog:
- Locate the desired table in the Open Catalog.
- Right-click on the table name and open the table settings.
- In the Table Settings dialog, select Table Maintenance from the settings sidebar.
- Toggle the relevant settings: Enable automatic table maintenance, Enable table cleanup, or both.
Customization
The following support keys are used to configure frequency and behavior for automatic maintenance operations:
- dremio.optimization.auto.optimize.period.hours: controls how often automatic optimization runs. Defaults to 3 hours.
- dremio.optimization.auto.vacuum.period.hours: controls how often table cleanup runs. Defaults to 24 hours.
- dremio.optimization.auto.maintenance.rate_limit.batch_size: controls the maximum number of concurrent maintenance queries. Defaults to 10.
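To get a feel for what the period settings imply, the following minimal Python sketch models the three support keys and their documented defaults as a plain dictionary and derives how many runs per day a given period produces. It does not call any Dremio API; the helper names (effective_settings, runs_per_day) and the validation rule (positive integers only) are assumptions made for illustration.

```python
# Illustrative only: models the documented support keys and defaults locally.
# Nothing here talks to Dremio; the helper functions are hypothetical.

OPTIMIZE_PERIOD = "dremio.optimization.auto.optimize.period.hours"
VACUUM_PERIOD = "dremio.optimization.auto.vacuum.period.hours"
BATCH_SIZE = "dremio.optimization.auto.maintenance.rate_limit.batch_size"

# Defaults as stated in the documentation above.
DEFAULTS = {
    OPTIMIZE_PERIOD: 3,   # hours between automatic optimization runs
    VACUUM_PERIOD: 24,    # hours between table cleanup runs
    BATCH_SIZE: 10,       # maximum concurrent maintenance queries
}


def effective_settings(overrides=None):
    """Merge overrides onto the defaults, rejecting unknown keys.

    Assumption: values are treated as positive integers for this sketch.
    """
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown support key(s): {sorted(unknown)}")
    for key, value in overrides.items():
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{key} must be a positive integer, got {value!r}")
    return {**DEFAULTS, **overrides}


def runs_per_day(period_hours):
    """Number of runs in a 24-hour window for a given period in hours."""
    return 24 / period_hours


if __name__ == "__main__":
    settings = effective_settings({OPTIMIZE_PERIOD: 6})
    print("optimization runs per day:", runs_per_day(settings[OPTIMIZE_PERIOD]))
    print("cleanup runs per day:", runs_per_day(settings[VACUUM_PERIOD]))
    print("max concurrent maintenance queries:", settings[BATCH_SIZE])
```

With the defaults, optimization runs eight times per day and cleanup once per day. Lengthening the optimization period trades fresher file layout for fewer maintenance queries.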