Deployment Models

This topic describes the deployment models that Dremio supports.

Dremio is a distributed system that can be deployed in a public cloud or on premises. A Dremio cluster can be co-located with one of the data sources (Hadoop or NoSQL database) or deployed separately.

| Model | Deployment |
|---|---|
| Azure AKS | Dremio on Azure Kubernetes Service (AKS), a hosted Kubernetes environment. AKS provides a quick and easy method for deploying and managing containerized applications. |
| Azure Template | Dremio on Azure Data Lake Store or other databases hosted on Azure. |
| Amazon EKS | Dremio on Amazon Elastic Container Service for Kubernetes (Amazon EKS) to deploy, manage, and scale containerized applications using Kubernetes on AWS. |
| Amazon Template | Dremio on AWS, which hosts S3 or other databases. |
| Hadoop using YARN | Dremio on Hadoop in YARN deployment mode. Dremio integrates with YARN ResourceManager to secure compute resources in a shared multi-tenant environment. |
| MapR using YARN | Dremio on MapR in YARN deployment mode. Dremio integrates with YARN ResourceManager to secure compute resources in a shared multi-tenant environment. |
| Standalone | Dremio on a standalone on-premises cluster. In this deployment scenario, a Hadoop cluster is not available and the data is not in a single distributed NoSQL database. |

[info] Co-locating Dremio with Hadoop/NoSQL

When Dremio is co-located with a Hadoop cluster (such as HDFS or MapR-FS) or a distributed NoSQL database (such as Elasticsearch or MongoDB), use containers (cgroups, Docker, or YARN containers) to ensure that each process has adequate resources.

Dremio features a high-performance asynchronous engine that minimizes the number of threads and context switches under heavy load. Because Dremio runs so few threads, the operating system may over-allocate resources to other, thread-hungry processes on the same nodes unless containers are used.
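
As a rough way to confirm that a co-located process is actually running under a container resource limit, the following sketch reads the memory limit of the current Linux cgroup. It is not part of Dremio. The function names are hypothetical, and the paths assume the conventional cgroup v2 and v1 mount points under `/sys/fs/cgroup`, which may differ on your nodes.

```python
from pathlib import Path
from typing import Optional

# Conventional cgroup mount points on Linux (an assumption; adjust per node).
CGROUP_V2_MEMORY = Path("/sys/fs/cgroup/memory.max")
CGROUP_V1_MEMORY = Path("/sys/fs/cgroup/memory/memory.limit_in_bytes")


def parse_memory_limit(raw: str) -> Optional[int]:
    """Return the limit in bytes, or None when cgroup v2 reports "max" (unlimited)."""
    value = raw.strip()
    if value == "max":
        return None
    return int(value)


def read_memory_limit() -> Optional[int]:
    """Return the memory limit of the current cgroup, or None if no limit file is readable."""
    for path in (CGROUP_V2_MEMORY, CGROUP_V1_MEMORY):
        try:
            return parse_memory_limit(path.read_text())
        except OSError:
            # File missing or unreadable: try the next cgroup version.
            continue
    return None


if __name__ == "__main__":
    limit = read_memory_limit()
    if limit is None:
        print("No cgroup memory limit detected for this process.")
    else:
        # Note: cgroup v1 reports a very large number rather than "max" when unlimited.
        print(f"cgroup memory limit: {limit / 2**30:.1f} GiB")
```

Running this inside the container that hosts each co-located process (Dremio, a Hadoop daemon, a database node) gives a quick indication of whether a memory limit is in effect for it.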

