Clean Metadata
This topic provides usage information for the dremio-admin clean CLI command.
Requirements
- Perform a backup before running the command (see Backup for more information).
- Shut down all cluster nodes completely before running the command (see Startup/Shutdown for more information).
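The two requirements above can be checked by a small pre-flight script before the clean command is run. The following is a minimal sketch, not part of dremio-admin: the function names, the backup directory argument, and the process pattern are all hypothetical and need to be adapted to your installation. It only inspects the local node, so it would have to be repeated on every node in the cluster.

```shell
# Pre-flight sketch (hypothetical helpers, not shipped with Dremio).

# Succeeds if the directory exists and contains at least one entry.
backup_present() {
  local dir="$1"
  [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Succeeds if no local process command line matches the pattern.
# Returns 2 if pgrep is unavailable, so a missing tool is never
# mistaken for a stopped cluster.
process_stopped() {
  local pattern="$1"
  if ! command -v pgrep >/dev/null 2>&1; then
    echo "pgrep not available; cannot verify shutdown" >&2
    return 2
  fi
  ! pgrep -f "$pattern" >/dev/null 2>&1
}

# Runs both checks; prints "ok" only when both pass.
preflight() {
  local backup_dir="$1"
  local pattern="$2"
  if ! backup_present "$backup_dir"; then
    echo "no backup found in $backup_dir" >&2
    return 1
  fi
  if ! process_stopped "$pattern"; then
    echo "could not confirm that processes matching '$pattern' are stopped" >&2
    return 1
  fi
  echo "ok"
}
```

A call such as preflight /path/to/backup DremioDaemon (both arguments are placeholders) could then gate the clean command. Choose a pattern specific enough that it does not match the script's own command line.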
Syntax
Clean command syntax
dremio-admin clean [args...]
Options
Clean command options
-c, --compact
compact kvstore
Default: false
-o, --delete-orphans
delete orphan records in kvstore (e.g., old splits)
Default: false
-h, --help
show usage
-j, --max-job-days
delete jobs, profiles, and temporary dataset versions older than provided number of days
Default: 2147483647
-i, --reindex-data
reindex data
Default: false
-p, --delete-orphan-profiles
delete orphaned job profiles
Default: false
-d, --delete-orphan-datasetversions
delete orphaned dataset versions older than provided number of days
Default: 2147483647
Examples
This section provides examples for cleaning up different types of data.
Compact Metadata
Compacts metadata store entries.
Compact metadata store entries
dremio-admin clean -c
dremio-admin clean --compact
Delete Orphaned Entries
Deletes orphaned metadata store entries.
Delete orphaned metadata store entries
dremio-admin clean -o
dremio-admin clean --delete-orphans
Delete Jobs
Deletes jobs, profiles, and temporary dataset versions older than the specified threshold, in days. If no threshold is specified, the default of 2147483647 days applies, which effectively deletes nothing. Specify a reasonable value instead (e.g., 7, 14, or 30).
Delete jobs
dremio-admin clean -j=7
dremio-admin clean --max-job-days=7
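Because the default threshold deletes nothing and a mistyped value could delete too much, it can help to validate the number of days and review the resulting command before running it. The sketch below uses a hypothetical helper, build_clean_jobs_cmd, that only prints the command. Rejecting 0 is a cautious choice made for this example, not a documented limit of the option.

```shell
# Hypothetical helper: validate the retention period and print the
# clean command for review instead of executing it.
build_clean_jobs_cmd() {
  local days="$1"
  # Reject empty input and anything containing a non-digit character.
  case "$days" in
    '' | *[!0-9]*)
      echo "days must be a positive integer: '$days'" >&2
      return 1
      ;;
  esac
  # Reject 0 as a safety measure chosen for this sketch.
  if [ "$days" -eq 0 ]; then
    echo "refusing a threshold of 0 days" >&2
    return 1
  fi
  echo "dremio-admin clean --max-job-days=$days"
}
```

Calling build_clean_jobs_cmd 7 prints the same long-form command shown above, which can then be copied and run once the cluster is shut down.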
Re-index Data
Re-indexes data.
Re-index data
dremio-admin clean -i
dremio-admin clean --reindex-data
Delete Orphaned Profiles
Deletes orphaned Dremio job profiles.
Delete orphaned job profiles
dremio-admin clean -p
dremio-admin clean --delete-orphan-profiles
Delete Orphaned Dataset Versions
This command is available in Dremio 19.6.3+, 19.8.0+, 20.4.0+, 21.2.0+, and 22.0.0+.
Deletes dataset versions that Dremio is not using and that are older than the specified threshold, in days. If no threshold is specified, the default of 2147483647 days applies, which effectively deletes nothing. Specify a reasonable value instead (e.g., 7, 14, or 30).
Delete orphaned dataset versions
dremio-admin clean -d=7
dremio-admin clean --delete-orphan-datasetversions=7
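Since this option exists only in certain releases, a script that runs it across several environments may want to check the installed version first. The sketch below is one possible reading of the version list above, under two assumptions: "19.6.3+" is taken to mean 19.6.3 and later 19.6.x patches only, and "19.8.0+", "20.4.0+", and "21.2.0+" are taken to mean that minor release and later minors within the same major version. The function name is hypothetical, and how the version string is obtained is left to the caller.

```shell
# Hypothetical helper: returns 0 if the MAJOR.MINOR.PATCH version is in a
# release line listed above, 1 if it is not, 2 if the input is malformed.
supports_orphan_dataset_cleanup() {
  local version="$1"
  local major rest minor patch
  # Accept exactly three dot-separated numeric components.
  case "$version" in
    '' | *[!0-9.]* | .* | *. | *..* | *.*.*.*) return 2 ;;
    *.*.*) ;;
    *) return 2 ;;
  esac
  major="${version%%.*}"
  rest="${version#*.}"
  minor="${rest%%.*}"
  patch="${rest#*.}"

  # 22.0.0 and every later major version.
  if [ "$major" -ge 22 ]; then return 0; fi
  case "$major" in
    19)
      # Assumption: 19.6.3+ covers only the 19.6.x patch line.
      if [ "$minor" -eq 6 ] && [ "$patch" -ge 3 ]; then return 0; fi
      if [ "$minor" -ge 8 ]; then return 0; fi
      ;;
    20)
      if [ "$minor" -ge 4 ]; then return 0; fi
      ;;
    21)
      if [ "$minor" -ge 2 ]; then return 0; fi
      ;;
  esac
  return 1
}
```

Under these assumptions, 19.6.3 and 20.4.0 are accepted, while 19.6.2, 19.7.0, and 20.3.9 are rejected. If your release notes say otherwise for an in-between version, adjust the comparisons accordingly.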