Restore Dremio
This page describes the various methods for restoring Dremio, such as the dremio-admin restore CLI command and restoring a project from AWS, along with general information about restoring Dremio from a backup.
- Dremio metadata (view definitions, source settings, spaces, RBAC, local user accounts) and user-uploaded files can be backed up and restored.
- The restore CLI command does not restore Apache Iceberg metadata stored in distributed storage.
- Performing a restore does not restore the contents of the distributed cache, such as the acceleration cache, downloaded files, and query results.
Restoring with the Admin CLI
This section provides details about restoring Dremio using the dremio-admin restore CLI command.
Requirements
- Ensure that the dremio service is shut down on all nodes of your Dremio cluster. See Startup/Shutdown for more information.
- The restore command must be run on the master node.
Caution: A backup can only be restored using the same version of Dremio that the backup was created on.
Syntax
Syntax for restore command
<dremio_home>/bin/dremio-admin restore -d <BACKUP_PATH> [additional options]
Options
To obtain a list of restore options on the command line:
Get options for restore command
./bin/dremio-admin restore -h
  * -d, --backupdir
      backup directory path. for example, /mnt/dremio/backups or
      hdfs://$namenode:8020/dremio/backups
    -h, --help
      show usage
    -r, --restore
      restore dremio metadata (deprecated, always true)
      Default: false
    -v, --verify
      verify backup contents (deprecated, noop)
      Default: false
Example
Restore from a backup
./bin/dremio-admin restore -d /tmp/dremio_backup_05august/dremio_backup_2022-08-05_13.41
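When several backups sit in one parent directory, a small helper can pick the newest one to pass to -d. The following is a minimal sketch, not part of the Dremio CLI: the function name latest_backup is hypothetical, and it assumes backup folders follow the dremio_backup_YYYY-MM-DD_HH.MM naming seen in the examples on this page, so that sorting by name also sorts by time.

```shell
# Print the path of the newest backup folder inside a parent directory.
# Assumption: folders are named dremio_backup_YYYY-MM-DD_HH.MM, so the
# last name in sorted order is also the most recent backup.
latest_backup() (
    parent=$1
    newest=
    for dir in "$parent"/dremio_backup_*; do
        [ -d "$dir" ] || continue   # skip files and the unmatched glob itself
        newest=$dir                 # globs expand in sorted order; keep the last
    done
    [ -n "$newest" ] || { echo "no backup found in $parent" >&2; exit 1; }
    printf '%s\n' "$newest"
)
```

It could then be combined with the restore command, for example: ./bin/dremio-admin restore -d "$(latest_backup /mnt/dremio/backups)"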
Restoring a Dremio backup step-by-step
The following are step-by-step instructions for restoring Dremio from a backup.
1. Ensure that the dremio service is shut down on all nodes of your Dremio cluster. See Startup/Shutdown for more information.
2. On the master node, create a copy of <DREMIO_LOCAL_DATA_PATH>. Check the default location for your deployment model.
3. Prepare <DREMIO_LOCAL_DATA_PATH>.
   - If you are restoring Dremio in-place (in other words, backing up and restoring within the same Dremio instance):
     a. If you want to keep your original data, make a copy of <DREMIO_LOCAL_DATA_PATH> with a different name.
     b. Create an empty directory under <DREMIO_LOCAL_DATA_PATH> named db that is readable and writable by the user running restore and by the Dremio daemon.
     Caution: Secret decryption relies on the contents of the security folder. If you delete the security folder (for an in-place restore), source connections will fail during Dremio startup.
4. Run the following command, located in <DREMIO_HOME>/bin/:
   Run restore command
   ./dremio-admin restore -d <BACKUP_FOLDER_PATH>
5. Look for the confirmation message. For example:
   Example confirmation message
   ...
   Restored from backup at /tmp/dremio_backup_2017-02-23_18.25, dremio tables 14, uploaded files 1
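The in-place preparation described above (keep a copy of the local data path, then create an empty db directory) can be sketched as a shell function. This is an illustration under stated assumptions rather than an official procedure: the function name prepare_local_data_path is hypothetical, and it assumes that "create an empty directory named db" means replacing any existing db directory. The security folder is left untouched, in line with the caution above.

```shell
# Prepare <DREMIO_LOCAL_DATA_PATH> for an in-place restore.
# $1: local data path; $2: destination for the safety copy.
# Run only after the dremio service is stopped on every node, and as a user
# whose files the Dremio daemon can read and write.
prepare_local_data_path() (
    data_path=$1
    copy_path=$2
    [ -d "$data_path" ] || { echo "missing: $data_path" >&2; exit 1; }
    [ ! -e "$copy_path" ] || { echo "already exists: $copy_path" >&2; exit 1; }

    # Keep the original data under a different name.
    cp -Rp "$data_path" "$copy_path" || exit 1

    # Assumption: an existing db directory is replaced by an empty one.
    rm -rf "$data_path/db" || exit 1
    mkdir "$data_path/db" || exit 1
    # The security folder stays in place so secret decryption keeps working.
)
```

For example, with /var/lib/dremio as the local data path: prepare_local_data_path /var/lib/dremio /var/lib/dremio_before_restore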
If you enabled unlimited splits before creating a backup, and the backup stored metadata about your tables in distributed storage, you must fix sources after restoring the backup in one of the following ways:
- Forget and refresh the metadata by running SQL for each table. Learn about the SQL commands in Managing Tables.
- Copy the /metadata folder to the new cluster. Learn more in Relocating Distributed Storage and Metadata.
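For the first option, running SQL for each table by hand gets tedious when there are many tables. The sketch below prints a forget statement and a refresh statement per table so they can be reviewed and then run in Dremio. The function name forget_and_refresh_sql and the table names are hypothetical, and the ALTER TABLE ... FORGET METADATA / REFRESH METADATA syntax is an assumption that should be confirmed in Managing Tables for your Dremio version.

```shell
# Print FORGET and REFRESH statements for each table name given as an argument.
# Assumption: the ALTER TABLE syntax below matches your Dremio version.
# Table names are emitted as given; quote identifiers yourself if needed.
forget_and_refresh_sql() {
    for table in "$@"; do
        printf 'ALTER TABLE %s FORGET METADATA;\n' "$table"
        printf 'ALTER TABLE %s REFRESH METADATA;\n' "$table"
    done
}
```

For example, forget_and_refresh_sql src.sales src.orders prints four statements, two per table.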
Restoring Dremio from an AWS Backup
If you need to roll back to a previous version of your Dremio project using AWS, this section describes the steps for restoring from a backup.
This process applies only if you previously created a backup, either manually or automatically. Alternatively, you can create a cloned environment to test new versions of Dremio against your current project.
1. If you have not done so yet, stop the project you wish to restore from the backup. This terminates all EC2 instances associated with the project and releases any attached volumes.
2. Create a new Dremio stack with the same version of Dremio as the one tagged in the EBS snapshot you took previously (dremio_version).
   Note: If you're attempting to roll back from an upgrade, ensure that the stack you're creating uses the version of Dremio your project was previously using.
3. When the stack creation is complete and you have opened the project landing page from the coordinator EC2 instance, locate your project.
4. In the Backup column, find the total number of backups available for your project.
5. Click this number.
6. In the modal window that opens, identify your backup by its creation timestamp.
7. Click Restore and Open.
The project has been rolled back to the previous version.
Cloning a Dremio Stack in AWS Using Backup/Restore
This section outlines how to clone an instance of Dremio. Administrators typically do this to create a testing environment for upgrading to a new version of Dremio.
Before you begin, you must have created a manual backup of your Dremio project.
1. Copy the AMI identifier for your original Dremio project. You'll use this when creating the target (clone) EC2 instance.
2. Create a backup of your source Dremio instance, as described here.
3. Create a new Dremio project on AWS using these instructions.
4. Copy the backup folder to the EC2 instance where your new AWSE project is running. Place it in the /tmp folder, and ensure that the original ownership and group (dremio:dremio) of the backup folder are preserved.
5. Check that all cluster nodes are shut down, with the exception of the coordinator node.
6. Stop the Dremio service running on the EC2 instance with the command sudo service dremio stop.
7. Open the dremio.conf file located in /etc/dremio/ and add the following line:
   Line to add in dremio.conf file
   provisioning.migration.enabled = "true"
8. Run the following command to delete the contents of the catalog database:
   Command to delete catalog database contents
   sudo rm -rf /var/lib/dremio/db/*
9. Run the following command, located in /opt/dremio/bin/:
   Restore command
   sudo -u dremio ./dremio-admin restore -d <BACKUP_FOLDER_PATH>
10. Look for the confirmation message. It should look something like Restored from backup at /tmp/dremio_backup, dremio tables 14, uploaded files 1.
    Note: If the project you created in step #3 uses a paid edition, skip to step #13. If the project was created using a free version and you want to enable enterprise features, you may optionally add a license key here. To do so, obtain a Dremio license key from your Dremio Account Executive, complete the following two steps, and then continue to step #13.
11. Paste your license into a file and save it (e.g., license.txt), and take note of the directory path to include in the next step.
12. Run the following command, located under /opt/dremio/bin/:
    Add license command
    sudo -u dremio ./dremio-admin add-license -f <LICENSE>
13. Start the dremio service with the command sudo service dremio start.
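When the restore is scripted, it helps to check for the confirmation message before starting the service again. The following is a minimal sketch: the function name check_restore_output is hypothetical, and it assumes the confirmation line has the same shape as the example messages on this page.

```shell
# Read dremio-admin restore output on stdin. Succeeds only if a confirmation
# line is present, and prints the reported table and uploaded-file counts.
# Assumption: the line looks like
#   Restored from backup at <path>, dremio tables <n>, uploaded files <m>
check_restore_output() {
    sed -n 's/.*Restored from backup at .*, dremio tables \([0-9][0-9]*\), uploaded files \([0-9][0-9]*\).*/tables=\1 files=\2/p' \
        | grep .   # grep fails when sed printed nothing, i.e. no confirmation
}
```

One possible use is to pipe the restore command's output (both stdout and stderr, since this page does not say which stream carries the message) into check_restore_output, and run sudo service dremio start only when it succeeds.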
Common Errors
You may see some warnings in the server.log file that look like the following:
WARN c.d.s.r.MaterializationCache - couldn't expand materialization
This can be corrected by refreshing all of your reflections, or by simply waiting until the reflections refresh themselves automatically.
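To see whether these warnings stop appearing after the reflections have refreshed, you can count them in the log. This is a small sketch using grep; the function name count_materialization_warnings is hypothetical, and the location of server.log depends on your deployment, so it is passed in as an argument.

```shell
# Count "couldn't expand materialization" warnings in a server.log file.
# $1: path to server.log. Prints 0 when the log has no such warning.
count_materialization_warnings() {
    grep -c "MaterializationCache - couldn't expand materialization" "$1"
}
```

Running it before and after a reflection refresh gives a rough indication of whether new warnings are still being logged.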
Troubleshooting
Enable verbose logging
If you encounter any error messages during restore, enable verbose logging by following the instructions here, and then run the command again.
Dremio on Edge Nodes
Problem
When Dremio is running on an edge node (with a Hadoop client installed) and dremio-admin restore is performed, the command looks at HDFS by default and reports that the file does not exist, because the backup folder or file is not in Hadoop.
Restore fails with the following stack:
Restore failure output
Error Message: java.io.FileNotFoundException: File /tmp/dremiobackup does not exist.
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:901)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:112)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:961)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:958)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:958)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1537)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1580)
at com.dremio.dac.util.BackupRestoreUtil.scanInfoFiles(BackupRestoreUtil.java:191)
at com.dremio.dac.util.BackupRestoreUtil.validateBackupDir(BackupRestoreUtil.java:230)
at com.dremio.dac.cmd.Restore.main(Restore.java:81)
verify failed java.io.FileNotFoundException: File /tmp/dremiobackup does not exist
Workaround
Use the file:/// prefix to direct the command to the local file system. For example, use the following command instead:
./bin/dremio-admin restore -d file:///tmp/dremiobackup/dremio_backup_2019-04-22_20.30
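In a script that may run on edge nodes, the prefix can be added automatically. The sketch below uses a hypothetical helper named backup_uri: it turns an absolute local path into a file:// URI and leaves values that already carry a scheme, such as hdfs://, unchanged.

```shell
# Turn an absolute local path into a file:// URI.
# Values that already contain a scheme (file://, hdfs://, ...) pass through.
backup_uri() {
    case $1 in
        *://*) printf '%s\n' "$1" ;;
        /*)    printf 'file://%s\n' "$1" ;;
        *)     echo "expected an absolute path or URI: $1" >&2; return 1 ;;
    esac
}
```

For example: ./bin/dremio-admin restore -d "$(backup_uri /tmp/dremiobackup/dremio_backup_2019-04-22_20.30)"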
For More Information
- Creating a Backup for getting started with backups.