This topic describes Dremio deployments.
Dremio is a distributed system that can be deployed in a public cloud or on premises. A Dremio cluster can be co-located with one of the data sources (such as a Hadoop cluster or a NoSQL database) or deployed separately.
|Azure AKS||Dremio on Azure Kubernetes Service (AKS) to manage a hosted Kubernetes environment. Provides a quick and easy method for deploying and managing containerized applications.|
|Azure Template||Dremio on Azure, with data in Azure Data Lake Store or other databases hosted on Azure.|
|Amazon EKS||Dremio on Amazon Elastic Kubernetes Service (Amazon EKS) to deploy, manage, and scale containerized applications using Kubernetes on AWS.|
|Amazon Template||Dremio on AWS, with data in Amazon S3 or other databases hosted on AWS.|
|Hadoop using YARN||Dremio on Hadoop in YARN deployment mode; Dremio integrates with YARN ResourceManager to secure compute resources in a shared multi-tenant environment.|
|MapR using YARN||Dremio on MapR in YARN deployment mode; Dremio integrates with YARN ResourceManager to secure compute resources in a shared multi-tenant environment.|
|Standalone||Dremio on a standalone on-premises cluster; in this deployment scenario, a Hadoop cluster is not available and the data is not in a single distributed NoSQL database.|
[info] Co-locating Dremio with Hadoop/NoSQL
When Dremio is co-located with a Hadoop cluster (running HDFS or MapR-FS, for example) or a distributed NoSQL database (such as Elasticsearch or MongoDB), it is important to use containers (cgroups, Docker, or YARN containers) to ensure adequate resources for each process.
Dremio features a high-performance asynchronous engine that minimizes the number of threads and context switches under heavy load. Unless containers are used, the operating system may therefore over-allocate resources to other, thread-hungry processes on the same nodes.
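
As a rough illustration of bounding resources with containers, the sketch below builds a `docker run` command that sets explicit CPU and memory limits, which Docker enforces through cgroups. It is a minimal sketch under stated assumptions, not a recommended configuration: the helper name `dremio_docker_command`, the container name, and the 8 CPU / 32 GB figures are hypothetical, and the `dremio/dremio-oss` image name and web UI port 9047 are assumed here. In a co-located setup, comparable limits would be set on the other services on the node so that no single process can crowd out the rest.

```python
import shlex
from typing import List


def dremio_docker_command(cpus: float, memory_gb: int,
                          image: str = "dremio/dremio-oss") -> List[str]:
    """Build a `docker run` command with explicit CPU and memory limits.

    The helper name, the container name, and the default image are
    illustrative assumptions, not taken from the Dremio documentation.
    """
    if cpus <= 0 or memory_gb <= 0:
        raise ValueError("cpus and memory_gb must be positive")
    return [
        "docker", "run", "-d",
        "--name", "dremio",
        f"--cpus={cpus}",           # CPU limit, enforced through cgroups
        f"--memory={memory_gb}g",   # hard memory limit, enforced through cgroups
        "-p", "9047:9047",          # assumed web UI port
        image,
    ]


# Hypothetical sizing: give the Dremio container 8 CPUs and 32 GB of memory.
cmd = dremio_docker_command(cpus=8, memory_gb=32)
print(shlex.join(cmd))
```

The command is built as an argument list rather than a single string so that it can be passed to `subprocess.run` without shell quoting issues; `shlex.join` is used only to print it in a readable form.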