Clean Metadata
This topic provides usage information for the dremio-admin clean CLI command.
Requirements
Perform a backup before running the command (see Backup for more information).
Shut down all cluster nodes completely before running the command (see Startup/Shutdown for more information).
Syntax
dremio-admin clean [args...]
Options
-c, --compact
compact kvstore
Default: false
-o, --delete-orphans
delete orphans records in kvstore (e.g., old splits)
Default: false
-h, --help
show usage
-j, --max-job-days
delete jobs, profiles, and temporary dataset versions older than provided number of days
Default: 2147483647
-i, --reindex-data
reindex data
Default: false
-p, --delete-orphan-profiles
remove orphaned jobs
Default: false
-d, --delete-orphan-datasetversions
delete dataset versions older than the provided number of days
Default: 2147483647
Examples
This section provides examples for cleaning up different types of data.
Compact Metadata
Compacts metadata store entries.
Usage
dremio-admin clean -c
dremio-admin clean --compact
Delete Orphaned Entries
Deletes orphaned metadata store entries.
Usage
dremio-admin clean -o
dremio-admin clean --delete-orphans
Delete Jobs
Deletes jobs, profiles, and temporary dataset versions older than the specified threshold, in days. If no threshold is specified, the default of 2147483647 days applies, which effectively deletes nothing. Use a reasonable value instead (e.g., 7, 14, or 30).
Usage
dremio-admin clean -j=7
dremio-admin clean --max-job-days=7
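The default, 2147483647, is 2^31 - 1, the largest value a 32-bit signed integer can hold, which is why it never matches a real job. As a minimal sketch, a wrapper can reject a missing or default-sized threshold before the command is run. The helper name `clean_jobs_cmd` is hypothetical and not part of Dremio. Rejecting the default is a choice made for this sketch, and the helper only prints the command rather than running it.

```shell
# Hypothetical helper (not part of Dremio): validate a retention threshold
# and print the documented command instead of executing it.
clean_jobs_cmd() {
  days="$1"

  # Accept only non-empty strings made entirely of digits.
  case "$days" in
    ''|*[!0-9]*)
      echo "error: threshold must be a whole number of days" >&2
      return 1
      ;;
  esac

  # The default (2147483647) effectively deletes nothing, so treat it
  # as a mistake in this sketch.
  if [ "$days" -ge 2147483647 ]; then
    echo "error: threshold is too large to delete anything" >&2
    return 1
  fi

  printf 'dremio-admin clean --max-job-days=%s\n' "$days"
}
```

For example, `clean_jobs_cmd 7` prints `dremio-admin clean --max-job-days=7`, while `clean_jobs_cmd abc` prints an error and returns a non-zero status.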
Re-index Data
Re-indexes data.
Usage
dremio-admin clean -i
dremio-admin clean --reindex-data
Delete Orphaned Profiles
Deletes orphaned Dremio job profiles.
Usage
dremio-admin clean -p
dremio-admin clean --delete-orphan-profiles
Delete Orphaned Dataset Versions
note:
This option is available in Dremio 19.6.3+, 19.8.0+, 20.4.0+, 21.2.0+, and 22.0.0+.
Deletes dataset versions that Dremio is not using and that are older than the specified threshold, in days. If no threshold is specified, the default of 2147483647 days applies, which effectively deletes nothing. Use a reasonable value instead (e.g., 7, 14, or 30).
Usage
dremio-admin clean -d=7
dremio-admin clean --delete-orphan-datasetversions=7
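The examples above can be chained into a single maintenance pass. The sketch below runs each option as a separate invocation, in an order chosen only for illustration (deletions first, compaction last). The order, the `run` and `maintenance_pass` helpers, and the `DRY_RUN` and `RETENTION_DAYS` variables are all assumptions of this sketch and do not come from Dremio. By default it only prints the commands.

```shell
# Hypothetical maintenance pass (not part of Dremio).
# Prints each command unless DRY_RUN is set to 0.
DRY_RUN="${DRY_RUN:-1}"
RETENTION_DAYS="${RETENTION_DAYS:-30}"

# Print the command in dry-run mode; otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Run the cleanup steps one by one, stopping at the first failure.
# The ordering is an assumption made for illustration.
maintenance_pass() {
  run dremio-admin clean --max-job-days="$RETENTION_DAYS" &&
  run dremio-admin clean --delete-orphan-datasetversions="$RETENTION_DAYS" &&
  run dremio-admin clean --delete-orphan-profiles &&
  run dremio-admin clean --delete-orphans &&
  run dremio-admin clean --compact
}
```

Call `maintenance_pass` to review the printed commands first. Set `DRY_RUN=0` only after the backup and shutdown steps listed under Requirements are complete, and drop the `--delete-orphan-datasetversions` step on Dremio versions that do not support it.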