Azure VM Deployment
This section describes how to deploy Dremio on Microsoft Azure virtual machines.
Please refer to System Requirements for base requirements. The following are additional requirements/instructions for Azure deployments.
- An Azure portal account
- A single VM for each coordinator and/or executor node.
- Recall that one or more coordinator nodes are required (one coordinator node may also serve as the master node) and one or more executor nodes are required.
- Master & coordinator node storage is for metadata and logging.
- Executor node storage is for logging and spilling. Dremio can spill efficiently to multiple locally attached disks.
- Azure Data Lake Storage (ADLS) is used for the reflection cache. Please refer to the ADLS section of the Distributed Storage Guide for configuring ADLS as Dremio's distributed storage.
Setting up Azure Virtual Machines
Create SSH key pair
You will need an SSH key pair. If you have an existing SSH key pair, this step can be skipped.
To create an SSH key pair for logging in to Linux VMs, run the following command from a Bash shell, such as the Azure Cloud Shell or the Windows Subsystem for Linux, and follow the on-screen directions. The command output includes the file name of the public key file. Once the command finishes, copy the contents of the public key file (cat ~/.ssh/id_rsa.pub) to the clipboard:
ssh-keygen -t rsa -b 2048
For more detailed information on how to create SSH key pairs, including the use of PuTTY, see How to use SSH keys with Windows.
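For scripted setups, the key pair can also be created without prompts. The following is a minimal sketch and not part of the Dremio or Azure documentation: the function names create_key_pair and print_public_key are hypothetical, and the empty passphrase (-N "") is an assumption made for brevity that you may want to replace. The second function prints the public key with leading and trailing whitespace removed, which is the form to paste into the Azure portal.

```shell
#!/bin/sh
# Sketch only: the helper names and the empty passphrase are assumptions.

# Create a 2048-bit RSA key pair at the given path unless one already exists.
create_key_pair() {
  key_file="$1"
  if [ -e "$key_file" ]; then
    echo "Key already exists, leaving it untouched: $key_file" >&2
    return 0
  fi
  mkdir -p "$(dirname "$key_file")"
  # -N "" sets an empty passphrase; -q suppresses progress output.
  ssh-keygen -q -t rsa -b 2048 -N "" -f "$key_file"
}

# Print the public key without leading or trailing whitespace.
print_public_key() {
  sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' "$1.pub"
}
```

For example, calling create_key_pair "$HOME/.ssh/id_rsa" and then print_public_key "$HOME/.ssh/id_rsa" prints the key text to copy.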
Log in to Azure and create the virtual machine
Log in to the Azure portal at http://portal.azure.com. The following example uses a 64-bit Red Hat Linux Server OS. To create a virtual machine:
a. Choose Create a resource in the upper left-hand corner of the Azure portal.
b. In the search box above the list of Azure Marketplace resources, search for and select Red Hat Linux Server (or another distribution that supports RPM), then choose Create.
c. Provide a VM name that identifies its purpose (e.g., dremio-master, dremio-coordinator, or dremio-executor). Leave the disk type as SSD, then provide a username (such as azureuser).
d. For Authentication type, select SSH public key, then paste your public key into the text box. Take care to remove any leading or trailing whitespace in your public key.
e. Choose to Create new resource group, then provide a name, such as myResourceGroup. Choose your desired Location, then select OK.
f. Select a size for the VM. You can filter by Compute type or Disk type, for example. The suggested VM sizes are:
Coordinator VM type: D8s_v3 (recommended)
Executor VM type: D16s_v3 (recommended)
g. On the Settings page, under Network Security Group, select the Advanced option. Then create a new security group with three inbound rules that allow incoming requests on ports 22 (present by default), 9047, and 31010. For security, restrict the source to your corporate IP network.
h. On the summary page, select Create to start the VM deployment.
i. Repeat the process in this section to create all the VMs needed for your deployment.
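The portal steps above can also be approximated with the Azure CLI. The following is a rough sketch under stated assumptions, not the documented Dremio procedure: the region, the network security group name dremio-nsg, the image URN, and the source address range are placeholders chosen for illustration, and the script combines the three ports into a single inbound rule for brevity. By default it only prints the az commands (DRY_RUN=1) so they can be reviewed before anything is created.

```shell
#!/bin/sh
# Sketch only: every name and value below is an assumption for illustration.
RESOURCE_GROUP="${RESOURCE_GROUP:-myResourceGroup}"
LOCATION="${LOCATION:-eastus}"                    # assumed region
NSG_NAME="${NSG_NAME:-dremio-nsg}"                # hypothetical NSG name
ADMIN_USER="${ADMIN_USER:-azureuser}"
SSH_PUBLIC_KEY="${SSH_PUBLIC_KEY:-$HOME/.ssh/id_rsa.pub}"
IMAGE="${IMAGE:-RedHat:RHEL:8-lvm-gen2:latest}"   # assumed URN; verify with: az vm image list
SOURCE_CIDR="${SOURCE_CIDR:-203.0.113.0/24}"      # placeholder; use your corporate range
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Resource group plus a security group allowing ports 22, 9047, and 31010.
create_network() {
  run az group create --name "$RESOURCE_GROUP" --location "$LOCATION"
  run az network nsg create --resource-group "$RESOURCE_GROUP" --name "$NSG_NAME"
  run az network nsg rule create --resource-group "$RESOURCE_GROUP" \
    --nsg-name "$NSG_NAME" --name allow-dremio-inbound --priority 1000 \
    --direction Inbound --access Allow --protocol Tcp \
    --source-address-prefixes "$SOURCE_CIDR" \
    --destination-port-ranges 22 9047 31010
}

# Usage: create_vm <vm-name> <vm-size>
create_vm() {
  run az vm create --resource-group "$RESOURCE_GROUP" --name "$1" --size "$2" \
    --image "$IMAGE" --admin-username "$ADMIN_USER" \
    --ssh-key-values "$SSH_PUBLIC_KEY" --nsg "$NSG_NAME"
}

create_network
create_vm dremio-coordinator Standard_D8s_v3
create_vm dremio-executor Standard_D16s_v3
```

Setting DRY_RUN=0 executes the commands instead of printing them, which requires an installed Azure CLI and a prior az login. Add one create_vm call per node in your deployment.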
Connect to virtual machine
To create an SSH connection with the VM:
a. Select the Connect button on the overview page for your VM.
b. In the Connect to virtual machine page, keep the default options to connect by DNS name over port 22.
c. In Login using VM local account, a connection command is shown. Click the button to copy the command. The command has the form ssh username@address, where username is the account you created (such as azureuser) and address is the DNS name or IP address of the VM.
d. Paste the SSH connection command into a shell, such as the Azure Cloud Shell or Bash on Linux, to create the connection.
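When several VMs have been created, it can be convenient to test the SSH login on all of them in one pass. The sketch below is an illustration only: the host names are hypothetical placeholders for the DNS names or IP addresses shown on each VM's Connect page, and the variable names and dry-run switch are assumptions. With DRY_RUN=1 (the default here) it prints the ssh commands; with DRY_RUN=0 it logs in to each VM and runs hostname there.

```shell
#!/bin/sh
# Sketch only: variable names, host names, and the dry-run switch are assumptions.
ADMIN_USER="${ADMIN_USER:-azureuser}"
KEY_FILE="${KEY_FILE:-$HOME/.ssh/id_rsa}"
VM_HOSTS="${VM_HOSTS:-dremio-coordinator.example.com dremio-executor.example.com}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Log in to one VM with the private key and run `hostname` there.
# BatchMode=yes makes ssh fail instead of prompting for a password.
check_login() {
  run ssh -i "$KEY_FILE" -o BatchMode=yes -o ConnectTimeout=10 \
    "$ADMIN_USER@$1" hostname
}

for host in $VM_HOSTS; do
  check_login "$host"
done
```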
Installing and Configuring Dremio
At this point, if you have already installed and configured Dremio, you should be able to SSH into each Coordinator and Executor instance, and every node should be able to reach the other nodes in the cluster.
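One way to check reachability is a simple TCP probe of the ports opened in the network security group. The following sketch is an assumption-laden illustration rather than a Dremio tool: DREMIO_NODES and the helper names are hypothetical, it relies on bash and the coreutils timeout command being installed, and it only covers the ports named in this section (22, 9047, and 31010). Ports 9047 and 31010 will only respond on nodes where Dremio is already running and listening on them.

```shell
#!/bin/sh
# Sketch only: DREMIO_NODES and the helper names are hypothetical.
PORTS="22 9047 31010"               # ports named in this section
DREMIO_NODES="${DREMIO_NODES:-}"    # space-separated host names or IPs

# Succeed if a TCP connection to host:port opens within 3 seconds.
# Uses bash's /dev/tcp redirection, so bash must be installed.
check_port() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Print one status line per port for the given host.
report_node() {
  for port in $PORTS; do
    if check_port "$1" "$port"; then
      echo "$1:$port reachable"
    else
      echo "$1:$port unreachable"
    fi
  done
}

# Does nothing unless DREMIO_NODES is set.
for node in $DREMIO_NODES; do
  report_node "$node"
done
```

For example, save the script under a name of your choice and run it with DREMIO_NODES set to the addresses of your coordinator and executor VMs to print one line per node and port.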