    Deployment Models

    This topic describes Dremio deployment models. Dremio is a distributed system that can be deployed in a public cloud or on premises. A Dremio cluster can be co-located with one of the data sources (Hadoop or NoSQL database) or deployed separately.

    The following models and associated environments are provided:

    Cloud Service Provider Environment

    If you plan to use a cloud service provider’s environment, Dremio provides the following options, each streamlined for that provider’s deployment and management processes.

    AWS Edition
    Dremio on Amazon AWS that hosts S3 and other databases.

    Azure ARM
    Dremio on Azure that hosts ADLS and other databases.

    Hosted Kubernetes Environment

    If you plan to use a hosted Kubernetes environment, Dremio provides the following models. A hosted Kubernetes service offers a quick and easy method for deploying and managing containerized applications.

    Azure AKS
    Dremio on Azure Kubernetes Service (AKS) to manage a hosted Kubernetes environment. Provides a quick and easy method for deploying and managing containerized applications.

    Amazon EKS
    Dremio on Amazon Elastic Container Service for Kubernetes (Amazon EKS) to deploy, manage, and scale containerized applications using Kubernetes on AWS.

    Google Cloud GKE
    Dremio on Google Kubernetes Engine (GKE) to deploy, manage, and scale containerized applications using Kubernetes on Google Cloud.

    Shared Multi-Tenant Environment

    If you plan on using a shared multi-tenant environment, Dremio provides the following models that use YARN for deployment.

    Hadoop using YARN
    Dremio on Hadoop in YARN deployment mode; Dremio integrates with YARN ResourceManager to secure compute resources in a shared multi-tenant environment.

    MapR using YARN
    Dremio on MapR in YARN deployment mode; Dremio integrates with YARN ResourceManager to secure compute resources in a shared multi-tenant environment.

    Note:

    Co-locating Dremio with Hadoop/NoSQL: When Dremio is co-located with a Hadoop cluster (such as HDFS or MapR-FS) or distributed NoSQL database (such as Elasticsearch or MongoDB), it is important to utilize containers (cgroups, Docker, and YARN containers) to ensure adequate resources for each process.

    Dremio features a high-performance asynchronous engine that minimizes the number of threads and context switches under heavy load, so unless containers are utilized, the operating system may over-allocate resources to other thread-hungry processes on the nodes.
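
    As a rough illustration of reserving resources per process on a co-located node, the sketch below splits a node's CPU and memory by fractional share and renders each share as `docker run` resource flags (`--cpus` and `--memory`). The process names, the share values, and the node size are assumptions chosen for the example, not Dremio recommendations; cgroups or YARN containers would express the same limits through their own configuration.

```python
from dataclasses import dataclass


@dataclass
class ContainerLimit:
    name: str        # hypothetical label for a co-located process
    cpus: float      # CPU cores reserved for this process
    memory_gb: int   # memory reserved for this process, in GB


def split_node(total_cpus, total_memory_gb, shares):
    """Divide a node's CPU and memory among processes by fractional share."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1.0")
    return [
        ContainerLimit(name, round(total_cpus * share, 2), int(total_memory_gb * share))
        for name, share in shares.items()
    ]


def docker_limit_flags(limit):
    """Render one limit as `docker run` resource flags."""
    return ["--cpus", str(limit.cpus), "--memory", f"{limit.memory_gb}g"]


# Assumed example: a 16-core, 128 GB node shared by three processes.
limits = split_node(
    16, 128, {"dremio-executor": 0.5, "hdfs-datanode": 0.25, "other": 0.25}
)
for limit in limits:
    print(limit.name, " ".join(docker_limit_flags(limit)))
```

    Making the shares explicit keeps every process on the node inside a fixed budget, so no single process can crowd out the others under load.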

    Standalone Cluster

    If you plan to create a standalone cluster, you can deploy Dremio as a standalone on-premises cluster.

    Standalone Cluster
    Dremio on a standalone on-premises cluster. In this deployment scenario, a Hadoop cluster is not available and the data is not in a single distributed NoSQL database.
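
    The environments and models described on this page can be summarized as a simple lookup. The following is a minimal sketch; the dictionary and the `models_for` helper are hypothetical names used only for illustration and are not part of any Dremio tooling.

```python
# Environment -> deployment models, transcribed from the sections on this page.
DEPLOYMENT_MODELS = {
    "cloud service provider": ["AWS Edition", "Azure ARM"],
    "hosted kubernetes": ["Azure AKS", "Amazon EKS", "Google Cloud GKE"],
    "shared multi-tenant": ["Hadoop using YARN", "MapR using YARN"],
    "standalone cluster": ["Standalone Cluster"],
}


def models_for(environment):
    """Return the deployment models listed for an environment (case-insensitive)."""
    key = environment.strip().lower()
    if key not in DEPLOYMENT_MODELS:
        known = ", ".join(sorted(DEPLOYMENT_MODELS))
        raise KeyError(f"unknown environment {environment!r}; expected one of: {known}")
    return DEPLOYMENT_MODELS[key]


print(models_for("Hosted Kubernetes"))
```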