This topic describes Dremio deployment models. Dremio is a distributed system that can be deployed in a public cloud or on premises. A Dremio cluster can be co-located with one of the data sources (Hadoop or NoSQL database) or deployed separately.
The following models and associated environments are provided:
Hosted Kubernetes Environment
If you plan on using a hosted Kubernetes environment, Dremio provides the following models that are a quick and easy method for deploying and managing containerized applications.
|Azure AKS||Amazon EKS|
|Dremio on Azure Kubernetes Service (AKS) to manage a hosted Kubernetes environment. Provides a quick and easy method for deploying and managing containerized applications.||Dremio on Amazon Elastic Container Service for Kubernetes (Amazon EKS) to deploy, manage, and scale containerized applications using Kubernetes on AWS.|
If you plan on using a hosted environment, Dremio provides the following models that have templates for streamlining the deployment and management processes.
|Azure Template||Amazon Template|
|Dremio on Azure Data Lake Store or other databases hosted on Azure.||Dremio on Amazon AWS that hosts S3 or other databases.|
Shared Multi-Tenant Environment
If you plan on using a shared multi-tenant environment, Dremio provides the following models that use YARN for deployment.
|Hadoop using YARN||MapR using YARN|
|Dremio on Hadoop in YARN deployment mode; Dremio integrates with YARN ResourceManager to secure compute resources in a shared multi-tenant environment.||Dremio on MapR in YARN deployment mode; Dremio integrates with YARN ResourceManager to secure compute resources in a shared multi-tenant environment.|
[info] Co-locating Dremio with Hadoop/NoSQL
When Dremio is co-located with a Hadoop cluster (such as HDFS or MapR-FS) or distributed NoSQL database (such as Elasticsearch or MongoDB), it is important to utilize containers (cgroups, Docker, and YARN containers) to ensure adequate resources for each process.
Dremio features a high-performance asynchronous engine that minimizes the number of threads and context switches under heavy load, so unless containers are utilized, the operating system may over-allocate resources to other thread-hungry processes on the nodes.
If you plan on creating a standalone cluster, Dremio provides the flexibility to deploy Dremio as a standalone on-premise cluster.
|Dremio on a standalone on-premise cluster; In this deployment scenario, a Hadoop cluster is not available and the data is not in a single distributed NoSQL database.|