Version: current [26.x]

Kubernetes Environments for Dremio

Dremio is designed to run in Kubernetes environments, providing enterprise-grade data lakehouse capabilities. To deploy Dremio successfully, you need a compatible hosted Kubernetes environment.

Dremio is tested and supported on the following Kubernetes environments:

  • Amazon Elastic Kubernetes Service (EKS)

  • Azure Kubernetes Service (AKS)

  • Google Kubernetes Engine (GKE)

  • Red Hat OpenShift

The sections on this page detail recommendations for AWS and Azure. Please use the information provided as a guide for your vendor's equivalent options.

note

If you're using a containerization platform built on Kubernetes that isn't listed here, please contact your provider and Dremio Account team to discuss compatibility and support options.

Requirements

Versions

Dremio requires that you keep your Kubernetes version current. You must run an officially supported version, preferably one on standard rather than extended support. For examples, see AWS Available versions on standard support and Azure Kubernetes versions.
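As a quick check, the following commands (assuming `kubectl` is already configured against your cluster) show the control-plane and node versions so you can confirm you are on a supported release:

```shell
# Show the kubectl client and Kubernetes control-plane versions.
kubectl version

# Show the kubelet version running on each node.
kubectl get nodes -o wide
```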

Recommendations

For resource request recommendations for the various parts of the deployment, see Recommended Resources Configuration.

For a list of all Dremio engine sizes, see Add an Engine. Engines make up the lion's share of any Dremio deployment.

Node Sizes

The following sections suggest AWS and Azure machines that could be used to meet our recommendations.

Dremio recommends separate node groups (EKS node groups or AKS node pools) for the different components of the deployment so that each node group can autoscale independently:

Core Services

  • Coordinators

    For coordinators, Dremio recommends at least 32 CPUs and 64 GB of memory; a c6i.8xlarge or Standard_F32s_v2, each offering a CPU-to-memory ratio of 1:2, is therefore a good option. In the Helm charts, this results in 30 CPUs and 60 GB of memory allocated to the Dremio pod.

  • Executors

    For executors, Dremio recommends either:

    • 16 CPUs and 128 GB of memory; an r5d.4xlarge or Standard_E16_v5, offering a CPU-to-memory ratio of 1:8, is a good option. In the Helm charts, this results in 15 CPUs and 120 GB of memory allocated to the Dremio pod.
    • 32 CPUs and 128 GB of memory; an m5d.8xlarge or Standard_D32_v5, offering a CPU-to-memory ratio of 1:4, is a good option for high-concurrency workloads. In the Helm charts, this results in 30 CPUs and 120 GB of memory allocated to the Dremio pod.
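As a sketch of this layout, separate EKS node groups for coordinators and executors could be created with `eksctl`; the cluster name, node counts, and autoscaling bounds below are illustrative:

```shell
# Coordinator node group: c6i.8xlarge (32 vCPU / 64 GB, 1:2 ratio).
eksctl create nodegroup --cluster my-dremio-cluster --name dremio-coordinators \
  --node-type c6i.8xlarge --nodes 1 --nodes-min 1 --nodes-max 2

# Executor node group: r5d.4xlarge (16 vCPU / 128 GB, 1:8 ratio).
eksctl create nodegroup --cluster my-dremio-cluster --name dremio-executors \
  --node-type r5d.4xlarge --nodes 2 --nodes-min 2 --nodes-max 8
```

Each node group can then be targeted from the Helm chart via nodeSelector labels (or taints and tolerations) so that coordinator and executor pods land on their dedicated nodes.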

Auxiliary Services

  • Catalog and Search

    Catalog is made up of four key components: Catalog Service, Catalog Server, Catalog External, and MongoDB. Search has one key component, OpenSearch.

    Each of these components needs 2-4 CPUs and 4-16 GB of memory; an m5d.2xlarge or Standard_D8_v5 is a good option and can host multiple containers that are part of these services.

  • ZooKeeper, NATS, Operators, and OpenTelemetry

    Each of these components needs 0.5-1 CPUs and 0.5-1 GB of memory; an m5d.large, t2.medium, Standard_D2_v5, or Standard_A2_v2 is a good option and can host multiple containers that are part of these services.

Disk Storage Class

Dremio recommends:

  • For AWS, GP3 or IO2 as the storage type for all nodes.
  • For Azure, managed-premium as the storage type for all nodes.

Additionally, for executors, you can use local NVMe SSD storage for C3 and spill. For more information on storage classes, see the following resources: AWS Storage Class and Azure Storage Class.
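For AWS, a GP3-backed storage class for the Dremio volumes might look like the following (the class name is illustrative, and the Amazon EBS CSI driver must be installed in the cluster):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: dremio-gp3            # illustrative name
provisioner: ebs.csi.aws.com  # Amazon EBS CSI driver
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```

Apply it with `kubectl apply -f` and reference the class name from your chart's storage settings.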

Storage size requirements are:

  • Coordinator volume #1: 128-512 GB (key-value store).
  • Coordinator volume #2: 16 GB (logs).
  • Executor volume #1: 128-512 GB (spilling).
  • Executor volume #2: 128-512 GB (C3).
  • Executor volume #3: 16 GB (logs).
  • MongoDB volume: 128-512 GB.
  • OpenSearch volume: 128 GB.
  • ZooKeeper volume: 16 GB.
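In Helm-based deployments, these sizes are typically set through the chart's values. A sketch is shown below; the key names are assumptions based on common Dremio chart conventions, so confirm them against your chart version's values.yaml:

```yaml
# Illustrative volume sizing only; verify key names in your chart's values.yaml.
coordinator:
  volumeSize: 256Gi   # key-value store: 128-512 GB
executor:
  volumeSize: 256Gi   # spill: 128-512 GB
zookeeper:
  volumeSize: 16Gi    # logs and state
```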

EKS Add-Ons

The following add-ons are required for EKS clusters:

  • Amazon EBS CSI Driver
  • EKS Pod Identity Agent
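Assuming the AWS CLI is configured for your account, the required add-ons can be installed on an existing cluster like this (the cluster name is a placeholder):

```shell
# Install the EBS CSI driver and Pod Identity Agent managed add-ons.
aws eks create-addon --cluster-name my-dremio-cluster --addon-name aws-ebs-csi-driver
aws eks create-addon --cluster-name my-dremio-cluster --addon-name eks-pod-identity-agent

# Confirm both add-ons are listed on the cluster.
aws eks list-addons --cluster-name my-dremio-cluster
```

Note that the EBS CSI driver additionally requires an IAM role with the appropriate EBS permissions; see the AWS documentation for the exact policy setup.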