Version: 24.3.x

Deployment Models

This topic describes Dremio deployment models. Dremio is a distributed system that can be deployed in a public cloud or on premises. A Dremio cluster can be co-located with one of the data sources (Hadoop or NoSQL database) or deployed separately.

The following models and associated environments are provided:

| Model | Environment |
| --- | --- |
| Amazon EKS | Hosted Kubernetes Environment |
| AWS Edition | Cloud Service Provider Environment |
| Azure AKS | Hosted Kubernetes Environment |
| Azure ARM | Cloud Service Provider Environment |
| Google Cloud GKE | Hosted Kubernetes Environment |
| Hadoop using YARN | Shared Multi-Tenant Environment |
| MapR using YARN | Shared Multi-Tenant Environment |
| HPE Ezmeral Container | Hosted Kubernetes Environment |
| Standalone Cluster | Standalone - Deployed separately |
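
As a quick way to see which models fall under each environment, the table can be expressed as a lookup and grouped by environment. This is only an illustrative sketch: the names `DEPLOYMENT_MODELS` and `models_by_environment` are hypothetical and are not part of any Dremio tooling.

```python
from collections import defaultdict

# Model -> environment pairs copied from the table of deployment models.
DEPLOYMENT_MODELS = {
    "Amazon EKS": "Hosted Kubernetes Environment",
    "AWS Edition": "Cloud Service Provider Environment",
    "Azure AKS": "Hosted Kubernetes Environment",
    "Azure ARM": "Cloud Service Provider Environment",
    "Google Cloud GKE": "Hosted Kubernetes Environment",
    "Hadoop using YARN": "Shared Multi-Tenant Environment",
    "MapR using YARN": "Shared Multi-Tenant Environment",
    "HPE Ezmeral Container": "Hosted Kubernetes Environment",
    "Standalone Cluster": "Standalone - Deployed separately",
}


def models_by_environment(models=DEPLOYMENT_MODELS):
    """Group model names by environment, preserving table order."""
    grouped = defaultdict(list)
    for model, environment in models.items():
        grouped[environment].append(model)
    return dict(grouped)


if __name__ == "__main__":
    for environment, names in models_by_environment().items():
        print(f"{environment}: {', '.join(names)}")
```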

Cloud Service Provider Environment

If you plan on using a cloud service provider's environment, Dremio provides the following models, each streamlined for that cloud provider's deployment and management processes.

| Model | Description |
| --- | --- |
| AWS Edition | Dremio on Amazon AWS that hosts S3 and other databases. |
| Azure ARM | Dremio on Azure that hosts ADLS and other databases. |

Hosted Kubernetes Environment

If you plan on using a hosted Kubernetes environment, Dremio provides the following models, which offer a quick and easy way to deploy and manage containerized applications.

| Model | Description |
| --- | --- |
| Azure AKS | Dremio on Azure Kubernetes Service (AKS) to manage a hosted Kubernetes environment. Provides a quick and easy method for deploying and managing containerized applications. |
| Amazon EKS | Dremio on Amazon Elastic Kubernetes Service (Amazon EKS) to deploy, manage, and scale containerized applications using Kubernetes on AWS. |
| Google Cloud GKE | Dremio on Google Kubernetes Engine (GKE) to deploy, manage, and scale containerized applications using Kubernetes on Google Cloud. |

Shared Multi-Tenant Environment

If you plan on using a shared multi-tenant environment, Dremio provides the following models that use YARN for deployment.

| Model | Description |
| --- | --- |
| Hadoop using YARN | Dremio on Hadoop in YARN deployment mode; Dremio integrates with YARN ResourceManager to secure compute resources in a shared multi-tenant environment. |
| MapR using YARN | Dremio on MapR in YARN deployment mode; Dremio integrates with YARN ResourceManager to secure compute resources in a shared multi-tenant environment. |
Note

Co-locating Dremio with Hadoop/NoSQL: When Dremio is co-located with a Hadoop cluster (such as HDFS or MapR-FS) or distributed NoSQL database (such as Elasticsearch or MongoDB), it is important to utilize containers (cgroups, Docker, and YARN containers) to ensure adequate resources for each process.

Dremio features a high-performance asynchronous engine that minimizes the number of threads and context switches under heavy load, so unless containers are utilized, the operating system may over-allocate resources to other thread-hungry processes on the nodes.
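
One way to picture the container-based isolation described above is to give each co-located process an explicit CPU and memory cap. The following is a minimal sketch, assuming Docker is the container runtime; it only builds `docker run` argument lists (using the standard `--cpus` and `--memory` flags) and checks that the caps fit on the node. The helper names, the container names, the image names, and the resource figures are illustrative assumptions, not Dremio recommendations, and a real deployment would also need ports, volumes, and configuration.

```python
def docker_run_command(name, image, cpus, memory_gb):
    """Build a `docker run` argument list with CPU and memory caps."""
    return [
        "docker", "run", "--detach",
        "--name", name,
        "--cpus", str(cpus),          # upper bound on CPU cores
        "--memory", f"{memory_gb}g",  # hard memory limit
        image,
    ]


def plan_node(total_cpus, total_memory_gb, services):
    """Return one command per service, or fail if the caps oversubscribe the node.

    `services` is a list of (name, image, cpus, memory_gb) tuples.
    """
    used_cpus = sum(cpus for _, _, cpus, _ in services)
    used_memory = sum(memory for _, _, _, memory in services)
    if used_cpus > total_cpus or used_memory > total_memory_gb:
        raise ValueError("resource limits exceed node capacity")
    return [docker_run_command(*service) for service in services]


if __name__ == "__main__":
    # Hypothetical 16-core, 64 GB node shared by Dremio and MongoDB.
    commands = plan_node(16, 64, [
        ("dremio-executor", "dremio/dremio-oss", 8, 32),  # assumed image name
        ("mongodb", "mongo", 4, 16),
    ])
    for command in commands:
        print(" ".join(command))
```

Reserving a fixed share for each process in this way keeps a thread-heavy neighbor from crowding out Dremio, whatever mechanism (cgroups, Docker, or YARN containers) enforces the limits.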

Standalone Cluster

If you plan on creating a standalone cluster, Dremio can be deployed as a standalone on-premises cluster.

| Model | Description |
| --- | --- |
| Standalone Cluster | Dremio on a standalone on-premises cluster. In this deployment scenario, a Hadoop cluster is not available and the data is not in a single distributed NoSQL database. |