Open Catalog Backup and Restore Enterprise
Regular backups are essential for protecting Open Catalog metadata and ensuring business continuity. This section explains how to back up and restore the MongoDB cluster that stores Open Catalog's configuration, table metadata, and access control policies.
Ensure you have enabled automated backup for your Dremio cluster before backing up the Open Catalog.
Automated Backups
Automated MongoDB backup is enabled in your values-overrides.yaml. The backups are automatically written to your distributed storage and must be taken while Dremio is operational. Not all object store authentication methods are supported by this feature. See Configuring the Distributed Storage for details on supported configurations.
When enabled, a backup agent is deployed into the cluster as a container in the first MongoDB pod, dremio-mongodb-rs0-0. Inspect the agent logs with the command:
  kubectl logs -f dremio-mongodb-rs0-0 -c backup-agent -n <your-namespace>
Backups are written to the catalog-backups folder of Dremio's distributed storage. Backup names follow a consistent pattern, for example, cron-dremio-mongodb-20251112124000-87jl7.
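When choosing between backups, it can help to read the time out of a backup name. The sketch below assumes that the 14-digit field in names such as cron-dremio-mongodb-20251112124000-87jl7 is a YYYYMMDDhhmmss timestamp; the source does not state this, and the time zone is unknown. The function name backup_timestamp is hypothetical.

```shell
#!/bin/sh
# Print the timestamp embedded in a backup name as "YYYY-MM-DD hh:mm:ss".
# Assumption: the 14-digit field before the final suffix is YYYYMMDDhhmmss.
backup_timestamp() {
  bt_out=$(printf '%s\n' "$1" | sed -n \
    's/^.*-\([0-9]\{4\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)-[^-]*$/\1-\2-\3 \4:\5:\6/p')
  # Return a non-zero status when the name does not match the assumed pattern.
  [ -n "$bt_out" ] || return 1
  printf '%s\n' "$bt_out"
}
```

For example, backup_timestamp cron-dremio-mongodb-20251112124000-87jl7 prints 2025-11-12 12:40:00.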
Restore
Prerequisites
- Ensure that Dremio is in Admin Mode. See Using the Dremio Admin CLI on Kubernetes to understand how to switch to Admin Mode.
- Export your Kubernetes namespace as an environment variable. Replace the <namespace> placeholder with your value:
  export NAMESPACE=<namespace>
- Run the following command to list the backups available for the restore:
  kubectl get psmdb-backup -n $NAMESPACE
- Run the following command for MongoDB cluster information. The cluster name will be required to start the restore.
  kubectl get psmdb -n $NAMESPACE
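The prerequisite commands can be collected into a small helper, sketched below. The function names require_namespace and show_restore_inputs are hypothetical, and the jsonpath expression assumes that the first psmdb resource in the namespace is the Open Catalog cluster.

```shell
#!/bin/sh
# Fail early if NAMESPACE is unset or empty, so that kubectl does not
# silently fall back to the default namespace.
require_namespace() {
  if [ -z "${NAMESPACE:-}" ]; then
    echo "NAMESPACE is not set; run: export NAMESPACE=<namespace>" >&2
    return 1
  fi
}

# Print the inputs needed for a restore: the available backups and the
# name of the MongoDB cluster.
show_restore_inputs() {
  require_namespace || return 1
  kubectl get psmdb-backup -n "$NAMESPACE"
  # Assumption: the first psmdb resource is the Open Catalog cluster.
  kubectl get psmdb -n "$NAMESPACE" -o jsonpath='{.items[0].metadata.name}{"\n"}'
}
```

After exporting NAMESPACE, call show_restore_inputs and note the backup name and cluster name it prints.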
Restore From a Full Backup
Restore based on the name of the specific backup.
- Create a file named restore.yaml. Fill in the YAML based on the output from the prerequisites, namely: <my-cluster-name> and <my-backup-name>. Dremio recommends substituting <my-restore-name> with a name containing the date the restore was performed.
  apiVersion: psmdb.dremio.com/v1
  kind: PerconaServerMongoDBRestore
  metadata:
    name: <my-restore-name>
  spec:
    clusterName: <my-cluster-name>
    backupName: <my-backup-name>
- Start the restore by applying the YAML created in the previous step:
  kubectl apply -f restore.yaml -n $NAMESPACE
Once completed, bring Dremio back online. See Using the Dremio Admin CLI on Kubernetes to understand how to leave Admin Mode.
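One way to follow the naming recommendation is to generate restore.yaml from a small function, as in the sketch below. The function name render_restore and the restore-YYYYMMDD naming scheme are illustrative assumptions, not part of the product.

```shell
#!/bin/sh
# Print a full-backup restore manifest.
# $1 = restore name, $2 = cluster name, $3 = backup name
render_restore() {
  cat <<EOF
apiVersion: psmdb.dremio.com/v1
kind: PerconaServerMongoDBRestore
metadata:
  name: $1
spec:
  clusterName: $2
  backupName: $3
EOF
}

# Example usage (uncomment and substitute your own values):
# render_restore "restore-$(date +%Y%m%d)" "<my-cluster-name>" "<my-backup-name>" > restore.yaml
# kubectl apply -f restore.yaml -n "$NAMESPACE"
```

After applying the file, kubectl get -f restore.yaml -n $NAMESPACE shows the restore resource, which may help confirm that the restore has finished before you leave Admin Mode.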
Point-in-time Recovery
Restore to a particular point in time within a given backup. This allows for a more granular restore.
- Use this command to get the latest time to which a backup can be restored:
  kubectl get psmdb-backup <backup_name> -n $NAMESPACE -o jsonpath='{.status.latestRestorableTime}'
- Modify restore.yaml, specifying your chosen restore date and time, from those available, in the format YYYY-MM-DD hh:mm:ss.
  apiVersion: psmdb.dremio.com/v1
  kind: PerconaServerMongoDBRestore
  metadata:
    name: <my-restore-name>
  spec:
    clusterName: <my-cluster-name>
    backupName: <my-backup-name>
    pitr:
      type: date
      date: YYYY-MM-DD hh:mm:ss
- Start the restore by applying the YAML created in the previous step:
  kubectl apply -f restore.yaml -n $NAMESPACE
Once completed, bring Dremio back online. See Using the Dremio Admin CLI on Kubernetes to understand how to leave Admin Mode.
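The date field is easy to mistype, so a format check before applying the file can be useful. The sketch below assumes that latestRestorableTime is reported in the usual Kubernetes form, for example 2025-11-12T12:40:00Z; confirm this against your own output. The function names to_pitr_date and valid_pitr_date are hypothetical.

```shell
#!/bin/sh
# Convert a timestamp such as 2025-11-12T12:40:00Z into the
# "YYYY-MM-DD hh:mm:ss" form used in restore.yaml.
# Assumption: the input uses a "T" separator and an optional trailing "Z".
to_pitr_date() {
  printf '%s\n' "$1" | sed -e 's/T/ /' -e 's/Z$//'
}

# Succeed only if the argument has the shape "YYYY-MM-DD hh:mm:ss".
# This checks the format only, not whether the time is restorable.
valid_pitr_date() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}$'
}
```

For example, to_pitr_date 2025-11-12T12:40:00Z prints 2025-11-12 12:40:00, which valid_pitr_date accepts.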