Version: current [25.x]

Clean Metadata

This topic provides usage information for the dremio-admin clean CLI command.

Requirements

  • Perform a backup before running the command (see Backup for more information).

  • Shut down all cluster nodes completely before running the command (see Startup/Shutdown for more information).

Syntax

Clean command syntax
dremio-admin clean <options>

Specify at least one option to perform a cleanup operation. If you do not specify any options, the command opens the metadata store but does not perform any cleanup operations. In this case, it returns the message No operation requested and reports metadata store statistics.

Options

Clean command options
-c, --compact
compact kvstore
Default: false
-o, --delete-orphans
delete orphans records in kvstore (e.g., old splits)
Default: false
-h, --help
show usage
-j, --max-job-days
delete jobs, profiles, and temporary dataset versions older than provided number of days
Default: 2147483647
-i, --reindex-data
reindex data
Default: false
-p, --delete-orphan-profiles
remove orphaned jobs
Default: false
-d, --delete-orphan-datasetversions
delete dataset versions older than the provided number of days
Default: 2147483647
note

If you do not specify any options, the output of the dremio-admin clean command is a report of metadata store statistics.

Examples

This section provides examples for using the dremio-admin clean command.

Compact Metadata

Compacts metadata store entries.

Compact metadata store entries
dremio-admin clean -c
dremio-admin clean --compact

Delete Orphaned Entries

Deletes orphaned metadata store entries.

Delete orphaned metadata store entries
dremio-admin clean -o
dremio-admin clean --delete-orphans

Delete Jobs

Deletes jobs, profiles, and temporary dataset versions older than the specified threshold, in days. If you do not specify a threshold, the default (2147483647 days) applies, which effectively deletes nothing. Use a practical value such as 7, 14, or 30.

Delete jobs
dremio-admin clean -j=7
dremio-admin clean --max-job-days=7
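
Because the threshold decides how much job history is removed, it can help to validate the value before the command runs. The following is a minimal sketch, not part of Dremio: the build_job_clean_cmd function and the DREMIO_ADMIN variable are hypothetical names, and the script only prints the command so that you can review it first.

```shell
# Hypothetical helper (not shipped with Dremio): validate the threshold
# before building a "dremio-admin clean --max-job-days" command line.
# DREMIO_ADMIN is an assumed variable naming the dremio-admin executable.
DREMIO_ADMIN="${DREMIO_ADMIN:-dremio-admin}"

build_job_clean_cmd() {
  days="${1:-}"
  # Accept only positive integers: reject empty input, non-digits,
  # and values that start with 0 (including 0 itself).
  case "$days" in
    ''|*[!0-9]*|0*)
      echo "error: threshold must be a positive integer, got '$days'" >&2
      return 1
      ;;
  esac
  # Print the command instead of running it, so it can be reviewed.
  echo "$DREMIO_ADMIN clean --max-job-days=$days"
}

# Example: build the command for a 7-day threshold.
build_job_clean_cmd 7
```

After reviewing the printed command, run it on the node where the cluster has been shut down.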

Re-index Data

Re-indexes data in the metadata store.

Re-index data
dremio-admin clean -i
dremio-admin clean --reindex-data

Delete Orphaned Profiles

Deletes orphaned Dremio job profiles.

Delete orphaned job profiles
dremio-admin clean -p
dremio-admin clean --delete-orphan-profiles

Delete Orphaned Dataset Versions

note

This command is available in Dremio 19.6.3+, 19.8.0+, 20.4.0+, 21.2.0+, and 22.0.0+.

Deletes dataset versions that Dremio is not using and that are older than the specified threshold, in days. If you do not specify a threshold, the default (2147483647 days) applies, which effectively deletes nothing. Use a practical value such as 7, 14, or 30.

Delete orphaned dataset versions
dremio-admin clean -d=7
dremio-admin clean --delete-orphan-datasetversions=7

Multiple Options

Running individual clean commands with a single option per command makes it easier to inspect the impact of each action. However, you can run the clean command with more than one option at a time.

For example, the following command compacts metadata, deletes jobs older than 7 days, and deletes orphaned dataset versions older than 7 days:

Use multiple options
dremio-admin clean -c -j=7 -d=7
dremio-admin clean --compact --max-job-days=7 --delete-orphan-datasetversions=7
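
If you prefer one option per command so that each action can be inspected separately, the sequence can be scripted. The sketch below is an illustration under assumptions: run_step, run_clean_steps, DREMIO_ADMIN, and DRY_RUN are hypothetical names, not Dremio features. With DRY_RUN=1 (the default here) the script only prints each command.

```shell
# Hypothetical wrapper (not shipped with Dremio): run each clean option
# as its own command and stop at the first failure.
DREMIO_ADMIN="${DREMIO_ADMIN:-dremio-admin}"
DRY_RUN="${DRY_RUN:-1}"   # assumed convention: 1 = print only, 0 = execute

run_step() {
  # Show the command for this step, then run it unless in dry-run mode.
  echo "+ $DREMIO_ADMIN clean $*"
  if [ "$DRY_RUN" = "1" ]; then
    return 0
  fi
  "$DREMIO_ADMIN" clean "$@"
}

run_clean_steps() {
  # One invocation per option, in the order given.
  for opt in "$@"; do
    run_step "$opt" || { echo "step failed: $opt" >&2; return 1; }
  done
}

# Same options as the combined example, one command per option.
run_clean_steps --compact --max-job-days=7 --delete-orphan-datasetversions=7
```

Set DRY_RUN=0 to execute the steps once the printed commands look correct.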

Report Metadata Statistics

If you do not specify any options, the output of the clean command is statistics about the metadata store. The statistics include estimated key count, estimated total in-memory size, and total file size for different categories of objects in the store. Running the clean command without options is a read operation and will not clean metadata.

Report metadata store statistics
dremio-admin clean
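
Because running the command without options only reads the store, the report can be saved before and after a cleanup and the two files compared. The following is a minimal sketch under assumptions: save_stats_report, the DREMIO_ADMIN variable, and the kvstore-stats-<label>.txt file name are hypothetical. It also captures both stdout and stderr because this page does not state which stream carries the report.

```shell
# Hypothetical helper (not shipped with Dremio): save the no-option
# statistics report to a labeled file for later comparison.
DREMIO_ADMIN="${DREMIO_ADMIN:-dremio-admin}"

save_stats_report() {
  label="$1"            # for example "before" or "after"
  out_dir="${2:-.}"     # directory for the report, default: current directory
  report="$out_dir/kvstore-stats-$label.txt"
  # Capture stdout and stderr; which stream holds the report is an assumption.
  "$DREMIO_ADMIN" clean > "$report" 2>&1 || return 1
  # Print the path of the saved report.
  echo "$report"
}

# Usage (on a node where the cluster is shut down):
#   save_stats_report before /tmp
#   ... run the cleanup commands ...
#   save_stats_report after /tmp
#   diff /tmp/kvstore-stats-before.txt /tmp/kvstore-stats-after.txt
```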