Version: current [25.0.x]

Restore Dremio

This page describes the various methods for restoring Dremio, such as the dremio-admin restore CLI command, restoring a project from AWS, and other general information about restoring Dremio from a backup.

  • Dremio metadata (view definitions, source settings, spaces, RBAC, local user accounts) and user-uploaded files can be backed up and restored.

  • The restore CLI command does not restore Apache Iceberg metadata stored in distributed storage.

  • Performing a restore does not restore the contents of the distributed cache, such as acceleration cache, downloaded files, and query results.

Restoring with the Admin CLI

This section provides details about restoring Dremio using the dremio-admin restore CLI command.

Requirements

  • Ensure that the dremio service is shut down on all nodes of your Dremio cluster. See Startup/Shutdown for more information.

  • The restore command must be run on the master node.

    caution

    A backup can only be restored using the same version of Dremio that the backup was created on.

Syntax

Syntax for restore command
<dremio_home>/bin/dremio-admin restore -d <BACKUP_PATH> [additional options]

Options

To obtain a list of restore options on the command line:

Get options for restore command
./bin/dremio-admin restore -h
Restore command options
  * -d, --backupdir
      backup directory path. for example, /mnt/dremio/backups or
      hdfs://$namenode:8020/dremio/backups
    -h, --help
      show usage
    -r, --restore
      restore dremio metadata (deprecated, always true)
      Default: false
    -v, --verify
      verify backup contents (deprecated, noop)
      Default: false

Example

Restore from a backup
./bin/dremio-admin restore -d /tmp/dremio_backup_05august/dremio_backup_2022-08-05_13.41
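When several backups sit under one parent directory, picking the most recent one by hand is error-prone. The following is a minimal sketch, not part of Dremio: `latest_backup` is a hypothetical helper, and it assumes backup folders follow the `dremio_backup_YYYY-MM-DD_HH.MM` naming seen in the example above, so that sorted order matches chronological order.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Dremio command): print the newest backup
# directory found directly under the given parent directory.
# Assumption: folders are named dremio_backup_YYYY-MM-DD_HH.MM.
latest_backup() {
  local parent="$1"
  local dir latest=""
  for dir in "$parent"/dremio_backup_*/; do
    [ -d "$dir" ] || continue   # skip the literal pattern when nothing matches
    latest="${dir%/}"           # globs expand in sorted order; keep the last one
  done
  [ -n "$latest" ] || return 1  # no backup folder found
  printf '%s\n' "$latest"
}

# Usage (not executed here):
#   ./bin/dremio-admin restore -d "$(latest_backup /tmp/dremio_backup_05august)"
```

The function returns a non-zero status when no matching folder exists, so a calling script can stop before invoking the restore command.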

Restoring a Dremio backup step-by-step

The following are step-by-step instructions for restoring Dremio from a backup.

  1. Ensure that the dremio service is shut down on all nodes of your Dremio cluster. See Startup/Shutdown for more information.

  2. On the master node, create a copy of <DREMIO_LOCAL_DATA_PATH>. Check the default location for your deployment model.

  3. On the master node, delete the contents of <DREMIO_LOCAL_DATA_PATH> except for the security folder. Then, under <DREMIO_LOCAL_DATA_PATH>, create an empty directory called db that is readable and writable by both the user running restore and the Dremio daemon.

    caution

    Do not delete the security folder when you delete the rest of the contents of <DREMIO_LOCAL_DATA_PATH>. Secret decryption relies on the contents of the security folder. If you delete the security folder, source connections will fail during Dremio startup.

  4. Run the following command located in <DREMIO_HOME>/bin/.

    Run restore command
    ./dremio-admin restore -d <BACKUP_FOLDER_PATH>
  5. Look for the confirmation message. For example:

    Example confirmation message
    ...
    Restored from backup at /tmp/dremio_backup_2017-02-23_18.25, dremio tables 14, uploaded files 1
    note

    If you enabled unlimited splits before creating a backup, and the backup stored metadata about your tables in distributed storage, then after restoring from the backup you must run SQL to forget and refresh the metadata for each table. See Managing Tables for the SQL commands.
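Steps 2 and 3 above are the destructive part of the procedure, so it can help to script them. The following is a rough sketch under stated assumptions: `prepare_data_dir_for_restore` is a hypothetical function name, both paths are supplied by you, and the script is run as the user that owns the Dremio data directory so that the new db directory ends up with usable permissions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of steps 2 and 3 (not shipped with Dremio).
#   $1: the local data path (<DREMIO_LOCAL_DATA_PATH>)
#   $2: destination for the safety copy (must not exist yet)
prepare_data_dir_for_restore() {
  local data_dir="$1" copy_dir="$2"
  [ -d "$data_dir" ] || { echo "no such directory: $data_dir" >&2; return 1; }
  [ ! -e "$copy_dir" ] || { echo "copy target already exists: $copy_dir" >&2; return 1; }

  # Step 2: copy the whole directory, preserving ownership and modes where possible.
  cp -a "$data_dir" "$copy_dir" || return 1

  # Step 3: delete every top-level entry except the security folder.
  find "$data_dir" -mindepth 1 -maxdepth 1 ! -name security -exec rm -rf {} + || return 1

  # Step 3 (continued): recreate an empty db directory.
  mkdir "$data_dir/db"
}

# Usage (not executed here; paths are placeholders):
#   prepare_data_dir_for_restore <DREMIO_LOCAL_DATA_PATH> /path/to/safety_copy
#   ./dremio-admin restore -d <BACKUP_FOLDER_PATH>
```

The function refuses to run if the copy target already exists, which avoids silently merging a new copy into an older one.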

Restoring Dremio from an AWS Backup

If you need to roll back your Dremio project on AWS to a previous version, this section describes the steps for restoring from a backup.

note

This process applies only if you previously created a backup, either manually or automatically. Alternatively, you can create a cloned environment for testing new versions of Dremio against your current project.

  1. If you have not done so yet, stop the project you wish to restore from the backup. This terminates all EC2 instances associated with the project and releases any attached volumes.

  2. Create a new Dremio stack with the same version of Dremio as the one tagged in the EBS snapshot you took previously (dremio_version).

    note

    If you're attempting to roll back from an upgrade, ensure that the stack you're creating is using the version of Dremio your project was previously using.

  3. When the stack creation is complete and you have opened the project landing page from the coordinator EC2 instance, locate your project.

  4. In the Backup column, find the total number of backups available for your project.

  5. Click on this number.

  6. In the modal window that opens, identify your backup by its creation timestamp.

  7. Click Restore and Open.

The project has been rolled back to the previous version.

Cloning a Dremio Stack in AWS Using Backup/Restore

This section outlines how to clone an existing Dremio instance. Administrators typically do this to create a testing environment for upgrading to a new version of Dremio.

Before you begin, you must have created a manual backup of your Dremio project.

  1. Copy the AMI identifier for your original Dremio project. You'll use this when creating the target (clone) EC2 instance.

  2. Create a backup of your source Dremio instance, as described here.

  3. Create a new Dremio project on AWS using these instructions.

  4. Copy the backup folder to the /tmp folder on the EC2 instance where your new AWSE project is running. Ensure that the original owner and group (dremio:dremio) of the backup folder are preserved.

  5. Check to ensure all cluster nodes are shut down, with the exception of the coordinator node.

  6. Stop the Dremio service running on the EC2 instance with the command sudo service dremio stop.

  7. Open the dremio.conf file located in /etc/dremio/ and add the following line:

    Line to add in dremio.conf file
    provisioning.migration.enabled = "true"
  8. Run the following command to delete the contents of the catalog database:

    Command to delete catalog database contents
    sudo rm -rf /var/lib/dremio/db/*
  9. Run the following command located in /opt/dremio/bin/:

    Restore command
    sudo -u dremio ./dremio-admin restore -d <BACKUP_FOLDER_PATH>
  10. Look for the confirmation message. It should look something like Restored from backup at /tmp/dremio_backup, dremio tables 14, uploaded files 1.

    note

    If you created the project in step #3 using a paid edition, skip to step #13. If you created it using a free version and want to enable enterprise features, you can optionally add a license key at this point. To do so, obtain a Dremio license key from your Dremio Account Executive, complete steps #11 and #12, and then continue with step #13.

  11. Paste your license key into a file, save it (e.g., license.txt), and note the file's path for use in the next step.

  12. Run the following command located under /opt/dremio/bin/:

    Add license command
    sudo -u dremio ./dremio-admin add-license -f <LICENSE>
  13. Start the dremio service with the command sudo service dremio start.
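If the cloning procedure is repeated, step 7 can leave duplicate lines in dremio.conf. A minimal sketch of an idempotent edit follows; `ensure_conf_line` is a hypothetical helper, not a Dremio tool, and it only checks for an exact, whole-line match of the setting.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Dremio): append a line to a config file
# only if that exact line is not already present.
ensure_conf_line() {
  local conf_file="$1" line="$2"
  # -x: match the whole line, -F: treat the line as a fixed string.
  if grep -qxF -- "$line" "$conf_file" 2>/dev/null; then
    return 0
  fi
  # If the file is non-empty and lacks a trailing newline, add one first
  # so the new setting starts on its own line.
  if [ -s "$conf_file" ] && [ -n "$(tail -c1 "$conf_file")" ]; then
    printf '\n' >> "$conf_file"
  fi
  printf '%s\n' "$line" >> "$conf_file"
}

# Usage (not executed here; needs write access to /etc/dremio/dremio.conf):
#   ensure_conf_line /etc/dremio/dremio.conf 'provisioning.migration.enabled = "true"'
```

Because the match is exact, a variant of the line with different spacing or quoting would not be detected, so it is still worth reviewing the file by hand afterwards.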

Common Errors

You may see warnings in the server.log file that look like: "WARN c.d.s.r.MaterializationCache - couldn't expand materialization". You can correct this by refreshing all of your reflections, or by waiting until the reflections refresh themselves automatically.
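To see whether these warnings are tapering off as reflections refresh, you can count them in the log. This is a small sketch: `count_materialization_warnings` is a hypothetical helper, and the log location in the usage comment is an assumption that depends on your deployment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count log lines that contain the warning text
# quoted above. Prints the count; fails only if the file is unreadable.
count_materialization_warnings() {
  local log_file="$1"
  [ -r "$log_file" ] || { echo "cannot read: $log_file" >&2; return 1; }
  # grep -c exits non-zero when the count is 0, so mask that status.
  grep -cF "couldn't expand materialization" "$log_file" || true
}

# Usage (not executed here; the log path is an assumption):
#   count_materialization_warnings /var/log/dremio/server.log
```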

Troubleshooting

Enable verbose logging

If you encounter any error messages during restore, enable verbose logging by following the instructions here and run the command again.

Dremio on Edge Nodes

Problem
When Dremio is running on an edge node (with a Hadoop client installed) and dremio-admin restore is performed, the command resolves the backup path against HDFS by default and reports that the file does not exist, because the backup folder is on the local filesystem rather than in HDFS.

Restore fails with the following stack trace:

Restore failure output
Error Message: java.io.FileNotFoundException: File /tmp/dremiobackup does not exist.
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:901)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:112)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:961)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:958)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:958)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1537)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1580)
at com.dremio.dac.util.BackupRestoreUtil.scanInfoFiles(BackupRestoreUtil.java:191)
at com.dremio.dac.util.BackupRestoreUtil.validateBackupDir(BackupRestoreUtil.java:230)
at com.dremio.dac.cmd.Restore.main(Restore.java:81)
verify failed java.io.FileNotFoundException: File /tmp/dremiobackup does not exist

Workaround
Prefix the backup path with file:/// to point the restore command at the local filesystem. For example, use the following command instead:

Example command for workaround
./bin/dremio-admin restore -d file:///tmp/dremiobackup/dremio_backup_2019-04-22_20.30
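In scripts that may run on edge nodes, the prefix can be applied automatically. The sketch below is illustrative only: `as_local_uri` is a hypothetical helper that leaves paths with an explicit scheme (such as hdfs://) untouched and turns absolute local paths into file:// URIs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: make a backup path explicit about the local filesystem.
as_local_uri() {
  case "$1" in
    *://*) printf '%s\n' "$1" ;;         # already has a scheme; leave as-is
    /*)    printf 'file://%s\n' "$1" ;;  # absolute local path -> file:///...
    *)     echo "expected an absolute path or a URI: $1" >&2; return 1 ;;
  esac
}

# Usage (not executed here):
#   ./bin/dremio-admin restore -d "$(as_local_uri /tmp/dremiobackup/dremio_backup_2019-04-22_20.30)"
```

Relative paths are rejected rather than guessed at, since the working directory of the restore command may differ from that of the calling script.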

For More Information