Deployment Models
This topic describes Dremio deployment models. Dremio is a distributed system that can be deployed in a public cloud or on premises. A Dremio cluster can be co-located with one of the data sources (a Hadoop cluster or a NoSQL database) or deployed separately.
The following models and associated environments are provided:
Cloud Service Provider Environment
If you plan to use a cloud service provider's environment, Dremio provides the following deployment models, each streamlined for that cloud provider's own deployment and management processes.
AWS Edition | Azure ARM |
---|---|
Dremio on AWS, which hosts S3 and other data sources. | Dremio on Azure, which hosts ADLS and other data sources. |
Hosted Kubernetes Environment
If you plan to use a hosted Kubernetes environment, Dremio provides the following models, which offer a quick and easy way to deploy and manage containerized applications.
Azure AKS | Amazon EKS | Google Cloud GKE |
---|---|---|
Dremio on Azure Kubernetes Service (AKS), which manages a hosted Kubernetes environment for deploying and managing containerized applications. | Dremio on Amazon Elastic Kubernetes Service (Amazon EKS, formerly Amazon Elastic Container Service for Kubernetes) to deploy, manage, and scale containerized applications using Kubernetes on AWS. | Dremio on Google Kubernetes Engine (GKE) to deploy, manage, and scale containerized applications using Kubernetes on Google Cloud. |
Shared Multi-Tenant Environment
If you plan to use a shared multi-tenant environment, Dremio provides the following models, both of which use YARN for deployment.
Hadoop using YARN | MapR using YARN |
---|---|
Dremio on Hadoop in YARN deployment mode; Dremio integrates with YARN ResourceManager to secure compute resources in a shared multi-tenant environment. | Dremio on MapR in YARN deployment mode; Dremio integrates with YARN ResourceManager to secure compute resources in a shared multi-tenant environment. |
Co-locating Dremio with Hadoop/NoSQL: When Dremio is co-located with a Hadoop cluster (such as HDFS or MapR-FS) or distributed NoSQL database (such as Elasticsearch or MongoDB), it is important to utilize containers (cgroups, Docker, and YARN containers) to ensure adequate resources for each process.
Dremio features a high-performance asynchronous engine that minimizes the number of threads and context switches under heavy load, so unless containers are utilized, the operating system may over-allocate resources to other thread-hungry processes on the nodes.
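As one illustration of the container-based isolation described above, the following minimal sketch builds a `docker run` command with explicit memory and CPU caps. It relies only on the standard Docker CLI flags `--memory` and `--cpus`. The image name, container name, and resource figures are hypothetical placeholders, not values recommended by Dremio, and the helper `docker_run_command` is not part of any Dremio tooling.

```python
def docker_run_command(image, memory_gb, cpus, name="dremio"):
    """Build a `docker run` argument list with explicit memory and CPU caps.

    The caps keep the containerized process within a fixed share of the node,
    so co-located processes (for example HDFS or Elasticsearch daemons)
    cannot starve it, and it cannot starve them.
    """
    if memory_gb <= 0 or cpus <= 0:
        raise ValueError("memory_gb and cpus must be positive")
    return [
        "docker", "run", "--detach",
        "--name", name,
        "--memory", f"{memory_gb}g",  # hard memory limit for the container
        "--cpus", str(cpus),          # number of CPUs the container may use
        image,
    ]


# Hypothetical image name and sizes, chosen only for illustration.
cmd = docker_run_command("example-registry/dremio:latest", memory_gb=16, cpus=4)
print(" ".join(cmd))
```

The same idea applies to cgroups and YARN containers: each mechanism reserves a bounded share of memory and CPU per process rather than leaving the split to the operating system scheduler.
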
Standalone Cluster
If you plan to create a standalone cluster, you can deploy Dremio as a standalone on-premises cluster.
Standalone Cluster |
---|
Dremio on a standalone on-premises cluster. In this deployment scenario, a Hadoop cluster is not available and the data is not in a single distributed NoSQL database. |
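
The mapping from environment to deployment model on this page can be summarized as a small lookup. This is only a sketch for orientation: the environment keys (`cloud`, `kubernetes`, `yarn`, `standalone`) and the helper `models_for` are hypothetical names introduced here, while the model names come from the tables above.

```python
# Environment keys are hypothetical labels; model names mirror the tables above.
DEPLOYMENT_MODELS = {
    "cloud": ["AWS Edition", "Azure ARM"],
    "kubernetes": ["Azure AKS", "Amazon EKS", "Google Cloud GKE"],
    "yarn": ["Hadoop using YARN", "MapR using YARN"],
    "standalone": ["Standalone Cluster"],
}


def models_for(environment):
    """Return the deployment models listed for a given environment type."""
    key = environment.strip().lower()
    if key not in DEPLOYMENT_MODELS:
        known = ", ".join(sorted(DEPLOYMENT_MODELS))
        raise KeyError(f"unknown environment {environment!r}; expected one of: {known}")
    return DEPLOYMENT_MODELS[key]


print(models_for("Kubernetes"))
```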