Backup and Restore for Dremio Catalog Enterprise
Backups are crucial for restoring Dremio's state and ensuring business continuity in the event of a critical failure. This topic outlines the process for backing up and restoring data to and from the MongoDB cluster that serves as the metadata storage backend for Dremio Catalog.
Ensure you have enabled automated backup for your Dremio cluster before backing up the metadata storage backend for Dremio Catalog.
General Considerations
Cluster Topology
The instructions presented in this topic rely on the official MongoDB utilities mongodump and mongorestore. These tools should be used only on non-sharded clusters, which is the case with the MongoDB cluster deployed by the Dremio Enterprise Helm chart.
If you have a sharded cluster instead, do not apply these instructions, as they will not guarantee a consistent state of your data after backup. Dremio does not provide any guidance on how to back up and restore sharded clusters.
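If you are not sure whether your cluster is sharded, a quick check is to run MongoDB's hello command from mongosh: a mongos router identifies itself with msg: "isdbgrid", while a replica set member reports its setName instead. This is a generic MongoDB check, shown here only as a minimal sketch against a port-forwarded cluster (see the port-forward and credential steps later in this topic); it is not a Dremio-specific command.
# Prints "sharded cluster (mongos)" if connected to a mongos router, otherwise the replica set name.
mongosh "mongodb://localhost:27017" -u $MONGODB_BACKUP_USER -p $MONGODB_BACKUP_PASSWORD --authenticationDatabase admin --quiet \
  --eval 'const h = db.hello(); print(h.msg === "isdbgrid" ? "sharded cluster (mongos)" : "replica set: " + h.setName)'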
Downtime
The backup and restore procedures described below both require a downtime of the entire Dremio cluster, which can be triggered by setting the below values when upgrading the Helm release:
helm upgrade $RELEASE_NAME --namespace $NAMESPACE --values $VALUES_FILE \
--set DremioAdmin=true \
--set catalog.replicas=0 \
--set catalog.externalAccess.replicas=0 \
--set catalogservices.replicas=0
The above command scales all Dremio StatefulSets and Deployments down to zero replicas, effectively creating a downtime. Conversely, the downtime can be lifted by setting DremioAdmin back to false:
helm upgrade $RELEASE_NAME --namespace $NAMESPACE --values $VALUES_FILE --set DremioAdmin=false
Please note that the MongoDB cluster itself cannot be down when mongodump or mongorestore are executed. Both tools must connect to a running mongod instance.
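Before running either tool, it can be useful to confirm that the Dremio workloads are indeed scaled down while the MongoDB pods are still running. The following is a minimal sketch; exact pod names depend on your release name and chart version (a release named dremio is assumed here):
# The dremio-mongodb-rs0-* pods should be Running; the catalog and catalog services pods should be gone.
kubectl get pods --namespace $NAMESPACE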
Full vs. Incremental Backups
While it is possible to use mongodump to create incremental backups, they are considerably more complex to set up because they need to be performed on a per-collection basis. This documentation provides guidance on creating full backups only.
Version Compatibility
When using mongorestore to load backups created by mongodump, the MongoDB versions of your source and destination deployments must be either the same major version or the same feature compatibility version. For more details, read about mongorestore compatibility requirements.
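One way to record these versions on the source before backing up, and to compare them on the target before restoring, is to query the server version and the feature compatibility version directly. The snippet below is a sketch using standard MongoDB commands against a port-forwarded cluster; depending on the privileges granted to the backup user, the getParameter call may require a more privileged account.
# Prints the server version and the feature compatibility version.
mongosh "mongodb://localhost:27017" -u $MONGODB_BACKUP_USER -p $MONGODB_BACKUP_PASSWORD --authenticationDatabase admin --quiet \
  --eval 'print("server version: " + db.version()); printjson(db.adminCommand({ getParameter: 1, featureCompatibilityVersion: 1 }))'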
Backing up Data with mongodump
Before You Start
To ensure mongodump
can take a consistent snapshot of a replica set, always trigger a downtime as explained above.
The backup options presented below depend on two factors: the uncompressed size of the data and whether compression is used when creating backups. You can get an estimate of the data size by running the db.stats() command on the dremio database. An uncompressed backup will be slightly larger than the value reported in the dataSize field. If desired, you can add the --gzip option to create compressed backups. The examples below use the --gzip option.
Important: If you performed a backup with the --gzip option, you MUST restore that backup using the --gzip option.
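For example, assuming the port-forward and backup credentials described in the steps below, and the dremio database used throughout this topic, the size estimate can be obtained with a one-liner like this (dataSize is reported in bytes):
mongosh "mongodb://localhost:27017/dremio" -u $MONGODB_BACKUP_USER -p $MONGODB_BACKUP_PASSWORD --authenticationDatabase admin --quiet \
  --eval 'const s = db.stats(); print("dataSize (bytes): " + s.dataSize)'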
Option 1: Back up Data to Your Local Machine
Use this option if the data to back up is of reasonable size and fits on your local disk.
Follow these steps:
- Install the MongoDB Database Tools on your machine.
- Obtain and save the current MongoDB backup user and password. (The name of the secret depends on the Helm release name. In this example, the release name is dremio, so the secret name is dremio-mongodb-system-users.)
export MONGODB_BACKUP_USER=$(kubectl get secret --namespace $NAMESPACE dremio-mongodb-system-users -o jsonpath="{.data.MONGODB_BACKUP_USER}" | base64 --decode)
export MONGODB_BACKUP_PASSWORD=$(kubectl get secret --namespace $NAMESPACE dremio-mongodb-system-users -o jsonpath="{.data.MONGODB_BACKUP_PASSWORD}" | base64 --decode)
- Forward the MongoDB service port and place the process in the background. (The name of the service depends on the Helm release name. In this example, the release name is dremio, so the service name is dremio-mongodb-rs0.)
kubectl port-forward --namespace $NAMESPACE svc/dremio-mongodb-rs0 27017:27017 &
- Create a directory for the backup files and make sure there is enough disk space available, using the following:
export BACKUP_DIR=...
mkdir -p $BACKUP_DIR
chmod o+w $BACKUP_DIR
- Trigger a downtime as explained above.
- Back up the contents of the dremio database using the mongodump tool (a verification sketch follows this list):
mongodump -u $MONGODB_BACKUP_USER -p $MONGODB_BACKUP_PASSWORD -o $BACKUP_DIR --gzip --db dremio --authenticationDatabase admin
- Stop the kubectl port-forward background process.
- Lift the downtime.
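Once mongodump has finished, a quick spot check of the output is worthwhile. With the --gzip option, the dump contains one .bson.gz data file and one .metadata.json.gz file per collection, written under a subdirectory named after the database:
# List the dumped collections and compare the overall size with the db.stats() estimate.
ls -lh $BACKUP_DIR/dremio
du -sh $BACKUP_DIR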
Option 2: Back up Data to a Persistent Volume
For bigger clusters, or for easier backup automation, a Kubernetes Job can be leveraged to back up your MongoDB cluster. Backups created in this manner should be saved to a pre-configured persistent volume.
To create a backup Job, follow these steps:
- Create a persistent disk in your infrastructure backed by a PersistentVolume to store the backups. The detailed procedure will depend on your infrastructure and storage provider. Make sure it is big enough to host the data to back up.
- Create a PersistentVolumeClaim in the same namespace as your MongoDB cluster. The actual characteristics of the claim will depend on your storage needs. See the Appendix for a template.
- Trigger a downtime as explained above.
- Create a Job in the same namespace as your MongoDB cluster to back up the data. The exact Job definition will depend on your backup needs. See the Appendix for a template and the kubectl sketch after this list.
- Monitor the created Job to verify that the backup was successful.
- Lift the downtime.
- Delete the backup Job if desired. (You cannot create another one with the same name.)
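The kubectl workflow for these steps might look like the sketch below. The manifest file names are placeholders for wherever you saved the Appendix templates; the PersistentVolumeClaim and Job names match those templates:
kubectl apply --namespace $NAMESPACE -f mongodb-backups-pvc.yaml      # create the PersistentVolumeClaim
kubectl apply --namespace $NAMESPACE -f mongodb-backup-job.yaml       # create the backup Job
kubectl wait --namespace $NAMESPACE --for=condition=complete --timeout=2h job/mongodb-backup
kubectl logs --namespace $NAMESPACE job/mongodb-backup                # should end with "backup completed successfully"
kubectl delete --namespace $NAMESPACE job/mongodb-backup              # optional cleanup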
Restore Data with mongorestore
Before You Start
Data can be restored to the same logical cluster or to a different one, possibly in a different namespace and possibly one that already contains data. The instructions below focus on restores done against a blank target cluster replacing the source cluster in the same namespace.
Keep in mind that while the mongorestore utility can restore to a new, blank cluster as well as to an existing one, it can only perform inserts (not updates or upserts). As a consequence, if you restore documents to a non-empty collection and existing documents have the same _id field, mongorestore will skip those documents and log a warning like the one below:
continuing through error: E11000 duplicate key error collection: dremio.objs index: _id_ dup key: { _id: { r: "", i: BinData(...
This situation should not happen if the target cluster is blank. If it does happen, you can either ignore the warnings or use the --drop option to remedy it; this option drops each collection before restoring its data. It is generally safe to use, unless the target cluster contains data that is not in the backup being restored.
Option 1: Restore Data from Your Local Machine
Follow these steps:
- Make sure the MongoDB Database Tools are installed on your machine.
- Obtain and save the target MongoDB cluster's backup user and password:
export MONGODB_BACKUP_USER=$(kubectl get secret --namespace $NAMESPACE dremio-mongodb-system-users -o jsonpath="{.data.MONGODB_BACKUP_USER}" | base64 --decode)
export MONGODB_BACKUP_PASSWORD=$(kubectl get secret --namespace $NAMESPACE dremio-mongodb-system-users -o jsonpath="{.data.MONGODB_BACKUP_PASSWORD}" | base64 --decode)
- Forward the MongoDB service port and place the process in the background:
kubectl port-forward --namespace $NAMESPACE svc/dremio-mongodb-rs0 27017:27017 &
- If the target cluster is not a blank one and has active users, trigger a downtime as explained above.
- Restore the contents of the backup using the mongorestore tool. Use the --drop option to recreate collections from scratch, if desired, and the --gzip option if the backup is compressed (a verification sketch follows this list):
mongorestore -u $MONGODB_BACKUP_USER -p $MONGODB_BACKUP_PASSWORD --drop --gzip --authenticationDatabase admin $BACKUP_DIR
- Lift the downtime, if applicable.
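mongorestore prints a per-collection summary when it finishes. As an additional spot check, you can list the restored collections and their document counts; the dremio database and the port-forward from the steps above are assumed:
mongosh "mongodb://localhost:27017/dremio" -u $MONGODB_BACKUP_USER -p $MONGODB_BACKUP_PASSWORD --authenticationDatabase admin --quiet \
  --eval 'db.getCollectionNames().forEach(c => print(c + ": " + db.getCollection(c).countDocuments()))'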
Option 2: Restore Data from a Persistent Volume Backup
A Kubernetes Job can be leveraged to restore your MongoDB cluster to a previously backed-up state.
Follow these steps:
- Reuse the same PersistentVolume and PersistentVolumeClaim objects created when backing up. See the Appendix for a template.
- If the target cluster is not a blank one and has active users, trigger a downtime as explained above.
- Create a Job in the same namespace as your target MongoDB cluster to restore the data. The exact Job definition will depend on your backup needs. See the Appendix for a template and the kubectl sketch after this list.
- Once the job has finished successfully, and if applicable, lift the downtime.
- Delete the restore Job if desired. (You cannot create another one with the same name.)
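The corresponding kubectl workflow is sketched below; the manifest file name is a placeholder for wherever you saved the Appendix template, and the Job name matches that template:
kubectl apply --namespace $NAMESPACE -f mongodb-restore-job.yaml      # create the restore Job
kubectl wait --namespace $NAMESPACE --for=condition=complete --timeout=2h job/mongodb-restore
kubectl logs --namespace $NAMESPACE job/mongodb-restore               # should end with "restore completed successfully"
kubectl delete --namespace $NAMESPACE job/mongodb-restore             # optional cleanup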
Appendix
PersistentVolumeClaim Template
Adjust size, storage class, etc. according to your needs. Consult your storage provider if necessary.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongodb-backups-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 512Gi
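To use this template, save it to a file (the name below is just an example) and create the claim in the same namespace as the MongoDB cluster, then check that it binds:
kubectl apply --namespace $NAMESPACE -f mongodb-backups-pvc.yaml
kubectl get pvc --namespace $NAMESPACE mongodb-backups-pvc
Note that with storage classes using the WaitForFirstConsumer binding mode, the claim remains Pending until the backup Job first mounts it; this is expected.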
Job Template for Backup Jobs
Make sure to use a Docker image that is compatible with your cluster's MongoDB version (see "Version Compatibility" above).
The template below will accumulate up to 10 backups in subfolders of the persistent volume. It uses the --gzip option to create compressed backups.
apiVersion: batch/v1
kind: Job
metadata:
  name: mongodb-backup
spec:
  template:
    spec:
      securityContext:
        fsGroup: 1001
      containers:
        - name: mongodb-backup
          image: quay.io/dremio/percona/percona-server-mongodb:8.0.4-1-multi
          securityContext:
            privileged: false
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 1001
            runAsGroup: 1001
            capabilities:
              drop:
                - ALL
          command:
            - "/bin/sh"
            - "-c"
            - |
              mongodump \
                --username "$(MONGODB_BACKUP_USER)" \
                --password "$(MONGODB_BACKUP_PASSWORD)" \
                --uri "$(MONGODB_URI)" \
                --gzip --db dremio --authenticationDatabase admin \
                --out /backups/mongo_backup_$(date -u +'%Y%m%d_%H%M%S') || exit 1
              find /backups/mongo_backup_* -maxdepth 0 -type d | sort -r | \
                tail -n +$(($MAX_BACKUPS + 1)) | \
                while read folder; do
                  echo "deleting old backup: $folder..."
                  rm -rf "$folder"
                done
              echo backups:
              ls -lh /backups
              echo "backup completed successfully"
          resources:
            requests:
              cpu: "100m"
              memory: "256Mi"
            limits:
              cpu: "100m"
              memory: "256Mi"
          env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: MAX_BACKUPS
              value: "10"
            - name: MONGODB_URI
              value: "mongodb+srv://dremio-mongodb-rs0.$(NAMESPACE).svc.cluster.local/?ssl=false"
            - name: MONGODB_BACKUP_USER
              valueFrom:
                secretKeyRef:
                  name: dremio-mongodb-system-users
                  key: MONGODB_BACKUP_USER
            - name: MONGODB_BACKUP_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: dremio-mongodb-system-users
                  key: MONGODB_BACKUP_PASSWORD
          volumeMounts:
            - name: mongodb-backups
              mountPath: /backups
      restartPolicy: OnFailure
      volumes:
        - name: mongodb-backups
          persistentVolumeClaim:
            claimName: mongodb-backups-pvc
Job Template for Restore Jobs
Make sure to use a Docker image that is compatible with your cluster's MongoDB version (see "Version Compatibility" above).
The template below will run a Job that restores the dremio database to a specified backup timestamp, or to the latest backup available. It also uses the --drop option to drop each collection before restoring, and the --gzip option to restore compressed backups.
apiVersion: batch/v1
kind: Job
metadata:
  name: mongodb-restore
spec:
  template:
    spec:
      securityContext:
        fsGroup: 1001
      containers:
        - name: mongodb-restore
          image: quay.io/dremio/percona/percona-server-mongodb:8.0.4-1-multi
          securityContext:
            privileged: false
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 1001
            runAsGroup: 1001
            capabilities:
              drop:
                - ALL
          command:
            - "/bin/sh"
            - "-c"
            - |
              if [ -z "$(MONGODB_BACKUP)" ] || [ "$(MONGODB_BACKUP)" = "latest" ]; then
                echo "restoring from latest backup..."
                mongorestore \
                  --username "$(MONGODB_BACKUP_USER)" \
                  --password "$(MONGODB_BACKUP_PASSWORD)" \
                  --uri "$(MONGODB_URI)" \
                  --drop --gzip --authenticationDatabase admin \
                  $$(find /backups/mongo_backup_* -maxdepth 0 -type d | sort -r | head -n 1) || exit 1
              else
                echo "restoring from requested backup: $MONGODB_BACKUP..."
                mongorestore \
                  --username "$(MONGODB_BACKUP_USER)" \
                  --password "$(MONGODB_BACKUP_PASSWORD)" \
                  --uri "$(MONGODB_URI)" \
                  --drop --gzip --authenticationDatabase admin \
                  /backups/mongo_backup_$(MONGODB_BACKUP) || exit 1
              fi
              echo "restore completed successfully"
          resources:
            requests:
              cpu: "100m"
              memory: "256Mi"
            limits:
              cpu: "100m"
              memory: "256Mi"
          env:
            - name: MONGODB_BACKUP
              value: "latest" # or "20241210_145010"
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: MONGODB_URI
              value: "mongodb+srv://dremio-mongodb-rs0.$(NAMESPACE).svc.cluster.local/?ssl=false"
            - name: MONGODB_BACKUP_USER
              valueFrom:
                secretKeyRef:
                  name: dremio-mongodb-system-users
                  key: MONGODB_BACKUP_USER
            - name: MONGODB_BACKUP_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: dremio-mongodb-system-users
                  key: MONGODB_BACKUP_PASSWORD
          volumeMounts:
            - name: mongodb-backups
              mountPath: /backups
      restartPolicy: Never
      volumes:
        - name: mongodb-backups
          persistentVolumeClaim:
            claimName: mongodb-backups-pvc