Version: 24.3.x

Amazon EKS

This topic describes the deployment architecture of Dremio on Amazon Elastic Kubernetes Service (EKS).

Architecture

Amazon EKS Diagram

Requirements

  • AWS EKS version 1.12.7 or later (see Setting up an EKS Cluster for instructions)

  • Worker node instance type (minimum): r5d.4xlarge (16 vCPUs, 128 GiB memory, 2 x 300 GB NVMe SSD)
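
As a rough illustration of the requirements above, the following Python sketch checks a cluster version string and a node's resources against the stated minimums. The function names and the input format are hypothetical and are not part of any Dremio or AWS tooling.

```python
# Minimums taken from the Requirements list above.
MIN_EKS_VERSION = (1, 12, 7)
MIN_VCPUS = 16
MIN_MEMORY_GIB = 128


def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as '1.12.7' into (1, 12, 7)."""
    return tuple(int(part) for part in version.split("."))


def meets_requirements(eks_version: str, vcpus: int, memory_gib: int) -> bool:
    """Return True if the version and node size satisfy the stated minimums.

    Hypothetical helper: the caller supplies the values; nothing is
    queried from AWS.
    """
    return (
        parse_version(eks_version) >= MIN_EKS_VERSION
        and vcpus >= MIN_VCPUS
        and memory_gib >= MIN_MEMORY_GIB
    )
```

For example, `meets_requirements("1.12.7", 16, 128)` passes, while a node with 8 vCPUs and 64 GiB of memory fails the check.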

Setting up an EKS Cluster

To set up a Kubernetes cluster on Elastic Kubernetes Service (EKS):

  1. Create an EKS cluster following the instructions here.

    • Create a node group with an instance type that has at least 16 vCPUs and 128 GiB of memory (r5d.4xlarge is recommended).

    • The number of nodes allocated in the EKS cluster must equal the number of Dremio executors plus one (1) for the Dremio master-coordinator.

  2. Connect to the cluster following the instructions here.

  3. Install Helm (see Installing Helm in EKS for more information).

  4. Begin Deploying Dremio.
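
The node-count rule in step 1 can be sketched as a small calculation. The function name is hypothetical, and the sketch assumes only the rule stated above (one node per executor plus one for the master-coordinator).

```python
def required_node_count(executor_count: int) -> int:
    """Nodes needed: one per Dremio executor plus one for the master-coordinator."""
    if executor_count < 0:
        raise ValueError("executor_count must not be negative")
    return executor_count + 1
```

For a deployment with three executors, `required_node_count(3)` gives four nodes.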

Deploying Dremio

To deploy Dremio on EKS, follow the steps in Installing Dremio on Kubernetes in the dremio-cloud-tools repository on GitHub.

High Availability

High availability depends on the Kubernetes infrastructure. If a Kubernetes pod goes down for any reason, Kubernetes brings up another pod to replace it.

  • The Dremio master-coordinator and secondary-coordinator pods are each a StatefulSet. The master-coordinator has an associated persistent volume; secondary-coordinator pods do not have a persistent volume. If the master-coordinator pod goes down, it recovers with the associated persistent volume and Dremio metadata preserved.

  • The Dremio executor pods are a StatefulSet with an associated persistent volume. If an executor pod goes down, it recovers with the associated persistent volume and data preserved.
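
The recovery behavior described above can be pictured with a toy Python model. This is a conceptual sketch only, not how Kubernetes is implemented: the `Pod` class and its fields are hypothetical, and a dictionary stands in for the data a pod keeps on disk.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Toy model of a pod; 'state' stands in for data stored on disk."""
    name: str
    has_persistent_volume: bool
    state: dict = field(default_factory=dict)

    def restart(self) -> None:
        # With a persistent volume the replacement pod sees the same data;
        # without one, the replacement pod starts empty.
        if not self.has_persistent_volume:
            self.state = {}


# Hypothetical pod names, chosen for illustration.
master = Pod("master-coordinator", has_persistent_volume=True)
secondary = Pod("secondary-coordinator", has_persistent_volume=False)

master.state["metadata"] = "catalog"
secondary.state["scratch"] = "temp"

master.restart()
secondary.restart()
```

After both restarts, the master-coordinator still holds its metadata, while the secondary-coordinator starts with empty state.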

Load Balancing

Load balancing distributes the workload from Dremio's web client (UI and REST) and ODBC/JDBC clients. All web and ODBC/JDBC clients connect to a single endpoint (load balancer) rather than directly to an individual pod. These connections are then distributed across available coordinator (master-coordinator and secondary-coordinator) pods.
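
To make the idea concrete, here is a minimal Python sketch that spreads client connections across coordinator pods in round-robin order. Round-robin is an assumption made for illustration; the actual distribution policy depends on the load balancer in use. The function name, client labels, and pod names are hypothetical.

```python
from itertools import cycle


def assign_connections(clients, coordinator_pods):
    """Map each client to a coordinator pod in round-robin order.

    Assumption: round-robin is used here only as an example policy.
    """
    if not coordinator_pods:
        raise ValueError("at least one coordinator pod is required")
    pods = cycle(coordinator_pods)
    return {client: next(pods) for client in clients}


# Hypothetical client labels and pod names.
assignments = assign_connections(
    ["ui", "rest", "jdbc"],
    ["master-coordinator", "secondary-coordinator"],
)
```

Clients never address a pod directly in this model; they only see the result of the assignment, which mirrors connecting through a single endpoint.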

For More Information