Restore Dremio

This topic describes how to restore Dremio from a backup: the dremio-admin restore CLI command, restoring a project on AWS, and general information about the restore process.

Dremio metadata and user-uploaded files can be backed up and restored. A restore does not recover the contents of the distributed cache, such as the acceleration cache, downloaded files, and query results.

Requirements

  • All cluster nodes are completely shut down. See Startup/Shutdown for more information.
  • This command is run on the master node.

Important

A backup can only be restored using the same version of Dremio that the backup was created on.

Syntax

<dremio_home>/bin/dremio-admin restore -d <BACKUP_PATH> [additional options]

Options

To obtain a list of restore options on the command line:

./dremio-admin restore -h

Restore options:

  * -d, --backupdir
      backup directory path. for example, /mnt/dremio/backups or 
      hdfs://$namenode:8020/dremio/backups 
    -h, --help
      show usage
    -r, --restore
      restore dremio metadata
    -v, --verify
      verify backup contents

Example

./dremio-admin restore -d /tmp/dremio_backup

Restoring Dremio Step-by-step

The following are step-by-step instructions for restoring Dremio from a backup.

  1. Make sure all cluster nodes are shut down.

  2. On the master node, create a copy of <DREMIO_LOCAL_DATA_PATH> (for example, /data/dremio/, depending on your setup).

  3. On the master node, delete the contents of <DREMIO_LOCAL_DATA_PATH>, then create an empty directory named db under <DREMIO_LOCAL_DATA_PATH>. The directory must be readable and writable by the user that runs the restore tool and the Dremio daemon.

  4. On the master node, run the following command from <DREMIO_HOME>/bin/ to verify the backup. The -v option verifies the backup contents.

     $ ./dremio-admin restore -d <BACKUP_FOLDER_PATH> -v
    
  5. If the previous step succeeds, run the following command from <DREMIO_HOME>/bin/. The -r option initiates the restore.

     $ ./dremio-admin restore -d <BACKUP_FOLDER_PATH> -r
    
  6. Look for the confirmation message. For example:

    ...
    Restored from backup at /tmp/dremio_backup_2017-02-23_18.25, dremio tables 14, uploaded files 1
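
Steps 2 through 5 above can be sketched as a small shell function. This is only an illustration under stated assumptions: the names run and restore_dremio, the DRY_RUN switch, and the ".bak" location for the copy are hypothetical and not part of Dremio, and the sketch assumes dremio-admin exits with a non-zero status when verification fails. By default it only prints the commands it would run.

```shell
# Sketch of steps 2-5. Assumptions (not from the Dremio documentation):
# the helper names, the DRY_RUN switch, the ".bak" copy location, and that
# dremio-admin exits non-zero when verification fails.
# All cluster nodes must already be shut down (step 1).

# run CMD...: print the command; execute it only when DRY_RUN=0.
run() {
  printf '+ %s\n' "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# restore_dremio DREMIO_HOME DREMIO_LOCAL_DATA_PATH BACKUP_FOLDER_PATH
restore_dremio() {
  home=$1
  data=${2%/}     # drop a trailing slash so the copy lands next to the data dir
  backup=$3
  run cp -a "$data" "$data.bak" || return 1                         # step 2: keep a copy
  run find "$data" -mindepth 1 -delete || return 1                  # step 3: empty the directory
  run mkdir "$data/db" || return 1                                  # step 3: fresh, empty db directory
  run "$home/bin/dremio-admin" restore -d "$backup" -v || return 1  # step 4: verify
  run "$home/bin/dremio-admin" restore -d "$backup" -r              # step 5: restore
}
```

Calling restore_dremio /opt/dremio /data/dremio /tmp/dremio_backup prints the five commands without running them; setting DRY_RUN=0 executes them. Run it as the user that owns the Dremio data directory so that the new db directory has the right ownership.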
    

Restoring Dremio from an AWS Backup

If you need to roll back to a previous version of your Dremio project on AWS, this section describes the steps for restoring from a backup.

Note:

This process applies only if you previously created a backup, either manually or automatically. Alternatively, you can create a cloned environment for testing new versions of Dremio against your current project.

  1. If you have not done so yet, stop the project you wish to restore from the backup. This terminates all EC2 instances associated with the project and releases any attached volumes.
  2. Create a new Dremio stack with the same version of Dremio as the one tagged in the EBS snapshot you took previously (dremio_version).

Reminder:

If you’re attempting to roll back from an upgrade, ensure that the stack you’re creating is using the version of Dremio your project was previously using.

  3. When stack creation is complete and you have opened the project landing page from the coordinator EC2 instance, locate your project.
  4. In the Backup column, find the total number of backups available for your project.
  5. Click this number.
  6. In the modal window that opens, identify your backup by its creation timestamp.
  7. Click Restore and Open.

The project has been rolled back to the previous version.

Cloning a Dremio Stack in AWS Using Backup/Restore

This section describes how to clone an existing Dremio instance. Administrators typically do this to create a test environment for upgrading to a new version of Dremio.

Before you begin, you must have created a manual backup of your Dremio project.

  1. Copy the AMI identifier for your original Dremio project. You’ll use this when creating the target (clone) EC2 instance.
  2. Create a backup of your source Dremio instance, as described here.
  3. Create a new Dremio project on AWS using these instructions.
  4. Copy the backup folder to the EC2 instance where your new AWSE project is running. Place it in the /tmp folder, and preserve the original owner and group (dremio:dremio) of the backup folder.
  5. Check to ensure all cluster nodes are shut down, with the exception of the coordinator node.
  6. Stop the Dremio service running on the EC2 instance with the command sudo service dremio stop.
  7. Open the dremio.conf file located in /etc/dremio/ and add the following line:
provisioning.migration.enabled = "true"
  8. Run the following commands to delete the contents of the catalog database:
sudo rm -rf /var/lib/dremio/db/*
sudo rm -rf /var/lib/dremio
  9. Run the following command from /opt/dremio/bin/, where -r initiates a restore:
sudo -u dremio ./dremio-admin restore -d <BACKUP_FOLDER_PATH> -r
  10. Look for the confirmation message. It should look something like Restored from backup at /tmp/dremio_backup, dremio tables 14, uploaded files 1.

Note:

If the project you created in step #3 uses a paid edition, skip to step #13. If the project uses a free version and you want to enable enterprise features, you may optionally add a license key here. To do so, obtain a License Activation Key from your Dremio Account Executive, complete steps #11 and #12, and then continue with step #13.

  11. Paste your license into a file and save it (for example, license.txt). Note the path to the file; you need it in the next step.
  12. Run the following command from /opt/dremio/bin/:
sudo -u dremio ./dremio-admin add-license -f <LICENSE
  13. Start the dremio service with the command sudo service dremio start.
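
The dremio.conf edit in step 7 can be made repeatable with a small helper. This is a minimal sketch, not an official tool: the function name enable_migration_flag is hypothetical, and it assumes the setting should appear as its own line at the end of the file. It appends the line only when an identical line is not already present.

```shell
# enable_migration_flag CONF_FILE
# Hypothetical helper: appends the step 7 setting unless an identical line
# already exists, so running it twice does not duplicate the entry.
enable_migration_flag() {
  line='provisioning.migration.enabled = "true"'
  if grep -qxF "$line" "$1" 2>/dev/null; then
    return 0
  fi
  # If the file is non-empty and its last line lacks a newline, add one first
  # so the new setting does not get glued onto the previous line.
  if [ -s "$1" ] && [ -n "$(tail -c 1 "$1")" ]; then
    printf '\n' >> "$1"
  fi
  printf '%s\n' "$line" >> "$1"
}
```

On the EC2 instance the file is /etc/dremio/dremio.conf, which normally requires elevated privileges to modify, so the function would need to run in a root shell or be adapted to use sudo.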

Common Error

You may see warnings in the server.log file that look like “WARN c.d.s.r.MaterializationCache - couldn't expand materialization”. To correct this, refresh all of your reflections, or wait until the reflections refresh automatically.
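
To see whether this warning is still being logged after the reflections refresh, you can count its occurrences. The following is a minimal sketch; the function name count_materialization_warnings is hypothetical, and the location of server.log depends on your installation, so the path is passed as an argument.

```shell
# count_materialization_warnings LOG_FILE
# Hypothetical helper: prints how many lines of LOG_FILE contain the
# MaterializationCache warning text quoted above.
count_materialization_warnings() {
  # grep -c prints 0 but exits with status 1 when nothing matches;
  # "|| true" keeps that case from being treated as a failure.
  grep -c "MaterializationCache - couldn't expand materialization" "$1" || true
}
```

For example, count_materialization_warnings /path/to/server.log prints the number of matching lines; a count that stops growing suggests the reflections have refreshed.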

Troubleshooting

Dremio on Edge Nodes

Problem
When Dremio runs on an edge node (with a Hadoop client installed) and dremio-admin restore is run with -v or -r, the command resolves the backup path against HDFS by default and reports that the file does not exist, because the local folder or file does not exist in Hadoop.

Restore fails with the following stack:
Error Message: java.io.FileNotFoundException: File /tmp/dremiobackup does not exist.
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:901)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:112)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:961)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:958)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:958)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1537)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1580)
at com.dremio.dac.util.BackupRestoreUtil.scanInfoFiles(BackupRestoreUtil.java:191)
at com.dremio.dac.util.BackupRestoreUtil.validateBackupDir(BackupRestoreUtil.java:230)
at com.dremio.dac.cmd.Restore.main(Restore.java:81)
verify failed java.io.FileNotFoundException: File /tmp/dremiobackup does not exist

Workaround
Use the file:/// prefix to point the command at the local file system. For example, use the following command instead:

./dremio-admin restore -d file:///tmp/dremiobackup/dremio_backup_2019-04-22_20.30 -r
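
If restores are scripted, the prefix can be added automatically. The helper below is a sketch only; the name to_backup_uri is hypothetical and not part of Dremio. It leaves paths that already carry a scheme (such as hdfs:// or file://) untouched and turns local paths into file:// URIs.

```shell
# to_backup_uri PATH
# Hypothetical helper: returns PATH as an explicit URI so that dremio-admin
# does not resolve a local backup path against HDFS on an edge node.
to_backup_uri() {
  case "$1" in
    *://*) printf '%s\n' "$1" ;;                   # already has a scheme
    /*)    printf 'file://%s\n' "$1" ;;            # absolute local path
    *)     printf 'file://%s/%s\n' "$PWD" "$1" ;;  # relative path, anchored at the current directory
  esac
}
```

It could then be used as ./dremio-admin restore -d "$(to_backup_uri /tmp/dremiobackup/dremio_backup_2019-04-22_20.30)" -r.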

For More Information