Version: current [26.x]

Open Catalog Backup and Restore (Enterprise)

Regular backups are essential for protecting Open Catalog metadata and ensuring business continuity. This section explains how to back up and restore the MongoDB cluster that stores Open Catalog's configuration, table metadata, and access control policies.

Note

Ensure you have enabled automated backup for your Dremio cluster before backing up the Open Catalog.

Automated Backups

Automated MongoDB backups are enabled through your values-overrides.yaml file. Backups are written automatically to your distributed storage and must be taken while Dremio is running. This feature does not support every object store authentication method. See Configuring the Distributed Storage for details on supported configurations.

When automated backup is enabled, a backup agent is deployed as a container in the first MongoDB pod, dremio-mongodb-rs0-0. To inspect the agent logs, run:

    kubectl logs -f dremio-mongodb-rs0-0 -c backup-agent -n <your-namespace>

Backups are written to the catalog-backups folder of Dremio's distributed storage. Backup names follow a consistent pattern, for example, cron-dremio-mongodb-20251112124000-87jl7.
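When choosing between several backups, it can help to read the time embedded in each name. The following is a minimal sketch, assuming that the 14-digit field in the example name encodes YYYYMMDDhhmmss. The source shows only one example name, so treat that reading, and the helper name backup_time, as assumptions rather than documented behavior.

```shell
# backup_time NAME
# Prints the 14-digit field of a backup name as "YYYY-MM-DD hh:mm:ss".
# Assumption: names look like <prefix>-<14 digits>-<suffix>, as in
# cron-dremio-mongodb-20251112124000-87jl7. The time zone is not known.
backup_time() {
  ts=$(printf '%s\n' "$1" | sed -n \
    's/^.*-\([0-9]\{4\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)-[^-]*$/\1-\2-\3 \4:\5:\6/p')
  # Fail (non-zero exit) when the name does not match the assumed pattern.
  [ -n "$ts" ] && printf '%s\n' "$ts"
}

# Example:
#   backup_time cron-dremio-mongodb-20251112124000-87jl7
#   prints: 2025-11-12 12:40:00
```

The function returns a non-zero status for names that do not fit the assumed pattern, so it can be used safely in a loop over mixed names.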

Restore

Prerequisites

  1. Ensure that Dremio is in Admin Mode. See Using the Dremio Admin CLI on Kubernetes to understand how to switch to Admin Mode.

  2. Export your Kubernetes namespace as an environment variable. Replace the <namespace> placeholder with your value:

    export NAMESPACE=<namespace>
  3. Run the following command to list the backups available for the restore:

    kubectl get psmdb-backup -n $NAMESPACE
  4. Run the following command to get the MongoDB cluster information. The cluster name is required to start the restore.

    kubectl get psmdb -n $NAMESPACE
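The lookups in the prerequisites can be wrapped in small helpers so that later steps can reuse their output. This is a minimal sketch, assuming a single MongoDB cluster in the namespace. The function names cluster_name and backup_names are illustrative and are not part of Dremio's tooling.

```shell
# cluster_name
# Prints the name of the first psmdb resource in $NAMESPACE.
# Assumption: exactly one MongoDB cluster is deployed in the namespace.
cluster_name() {
  : "${NAMESPACE:?export NAMESPACE before calling cluster_name}"
  kubectl get psmdb -n "$NAMESPACE" \
    -o jsonpath='{.items[0].metadata.name}'
}

# backup_names
# Prints the name of every psmdb-backup resource in $NAMESPACE, one per line.
backup_names() {
  : "${NAMESPACE:?export NAMESPACE before calling backup_names}"
  kubectl get psmdb-backup -n "$NAMESPACE" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'
}

# Example:
#   CLUSTER=$(cluster_name)
#   backup_names
```

Both helpers stop with an error message if NAMESPACE is unset, which avoids querying the wrong namespace by accident.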

Restore From a Full Backup

Restore the cluster from a specific backup, identified by its name.

  1. Create a file named restore.yaml. Fill in the YAML based on the output from the prerequisites, namely: <my-cluster-name> and <my-backup-name>. Dremio recommends substituting <my-restore-name> with a name containing the date the restore was performed.

    apiVersion: psmdb.dremio.com/v1
    kind: PerconaServerMongoDBRestore
    metadata:
      name: <my-restore-name>
    spec:
      clusterName: <my-cluster-name>
      backupName: <my-backup-name>
  2. Start the restore by applying the YAML created in the previous step:

    kubectl apply -f restore.yaml -n $NAMESPACE
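The manifest from step 1 can also be generated rather than edited by hand, which keeps the YAML indentation correct. This is a sketch only: restore_manifest is a hypothetical helper, and the field values come from the placeholders described above.

```shell
# restore_manifest RESTORE_NAME CLUSTER_NAME BACKUP_NAME
# Writes a full-backup restore manifest to standard output.
restore_manifest() {
  cat <<EOF
apiVersion: psmdb.dremio.com/v1
kind: PerconaServerMongoDBRestore
metadata:
  name: $1
spec:
  clusterName: $2
  backupName: $3
EOF
}

# Example (the restore name includes today's date, as recommended):
#   restore_manifest "restore-$(date +%Y-%m-%d)" \
#     "<my-cluster-name>" "<my-backup-name>" > restore.yaml
```

Review the generated restore.yaml before applying it, since the helper does not check that the cluster or backup exists.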

Once completed, bring Dremio back online. See Using the Dremio Admin CLI on Kubernetes to understand how to leave Admin Mode.

Point-in-time Recovery

Restore to a particular point in time within a given backup. This allows for a more granular restore.

  1. Use this command to get the latest time to which a backup can be restored:

    kubectl get psmdb-backup <my-backup-name> -n $NAMESPACE -o jsonpath='{.status.latestRestorableTime}'
  2. Modify restore.yaml to specify your chosen restore date and time, in the format YYYY-MM-DD hh:mm:ss. Choose a time that the backup can be restored to.

    apiVersion: psmdb.dremio.com/v1
    kind: PerconaServerMongoDBRestore
    metadata:
      name: <my-restore-name>
    spec:
      clusterName: <my-cluster-name>
      backupName: <my-backup-name>
      pitr:
        type: date
        date: YYYY-MM-DD hh:mm:ss
  3. Start the restore by applying the YAML created in the previous step:

    kubectl apply -f restore.yaml -n $NAMESPACE
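A malformed date is an easy mistake to make in the pitr block, so it may be worth checking the value before applying the manifest. The following sketch checks only the shape and numeric ranges of a YYYY-MM-DD hh:mm:ss string. The helper name valid_pitr_date is hypothetical, and the check does not confirm that the time falls within the restorable range of the backup.

```shell
# valid_pitr_date "YYYY-MM-DD hh:mm:ss"
# Succeeds when the argument has the expected shape and field ranges.
# It does not reject impossible calendar dates such as 2025-02-31.
valid_pitr_date() {
  printf '%s\n' "$1" | grep -Eq \
    '^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01]) ([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]$'
}

# Example:
#   if valid_pitr_date "2025-11-12 12:40:00"; then
#     kubectl apply -f restore.yaml -n "$NAMESPACE"
#   fi
```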

Once completed, bring Dremio back online. See Using the Dremio Admin CLI on Kubernetes to understand how to leave Admin Mode.