Architecture

At a high level, Dremio's architecture is divided into three planes: data, execution, and control. Dremio is fully hosted: both the control plane and the execution plane run in Dremio's tenant.

Data

Dremio's primary data plane is Amazon S3. You can use Dremio-managed storage or bring your own bucket. Dremio can also federate across relational sources, so you can pull data from wherever it resides.

Execution

The execution plane follows a massively parallel processing (MPP) model, in which each workload is divided into fragments that are spread across a cluster of executors. Dremio also uses caching layers to minimize repeated reads from S3 and keep queries fast.
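The fragment-and-cache idea can be illustrated with a toy model. The following is a minimal sketch, not Dremio's implementation: `FAKE_STORE`, `CachingReader`, `split_into_fragments`, and `run_query` are hypothetical names invented for this example, a dictionary stands in for object storage, and threads stand in for executors.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

# Hypothetical stand-in for object storage: object key -> rows.
FAKE_STORE = {f"part-{i}.parquet": list(range(i * 10, i * 10 + 10)) for i in range(8)}


class CachingReader:
    """Read-through cache in front of a (slow) fetch function."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._cache = {}
        self._lock = Lock()
        self.remote_reads = 0  # counts fetches that missed the cache

    def read(self, key):
        # Holding the lock during the fetch serializes reads; acceptable
        # for a sketch, but a real cache would lock per key.
        with self._lock:
            if key not in self._cache:
                self.remote_reads += 1
                self._cache[key] = self._fetch(key)
            return self._cache[key]


def split_into_fragments(keys, num_fragments):
    """Assign keys to fragments round-robin, dropping empty fragments."""
    fragments = [keys[i::num_fragments] for i in range(num_fragments)]
    return [fragment for fragment in fragments if fragment]


def run_fragment(reader, fragment):
    """Partial aggregate for one fragment: the sum of all of its rows."""
    return sum(sum(reader.read(key)) for key in fragment)


def run_query(reader, keys, num_executors=4):
    """Run one fragment per worker thread, then combine the partial sums."""
    fragments = split_into_fragments(keys, num_executors)
    with ThreadPoolExecutor(max_workers=num_executors) as pool:
        partials = list(pool.map(lambda frag: run_fragment(reader, frag), fragments))
    return sum(partials)


reader = CachingReader(FAKE_STORE.__getitem__)
keys = sorted(FAKE_STORE)
first = run_query(reader, keys)   # every object is fetched once
second = run_query(reader, keys)  # served entirely from the cache
print(first, second, reader.remote_reads)
```

In this sketch the second run of the same query triggers no further fetches, which is the effect a caching layer in front of object storage is meant to have.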

Control

The control plane is where metadata is managed, queries are planned, and security is defined.

How Queries Flow Through Dremio

With the three planes in mind, we can follow a query through Dremio. A SQL query starts in your organization's slice of the control plane, whether it is submitted through the web console or a client connection. Dremio uses the metadata of the queried datasets to plan how to access and transform your data. The planner iterates over this plan, applying optimizations on each pass. The optimized plan is then split into fragments and passed to a query engine. The engine's executors read and transform the data, and the results are returned to the point of origin.
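The planning stage of this flow can be sketched as a small rewrite loop. This is an illustrative model only and does not reflect Dremio's actual planner or its rules: `CATALOG`, `initial_plan`, `merge_filters`, `drop_redundant_project`, `optimize`, and `to_fragments` are hypothetical names, and a plan is modeled as a plain list of steps.

```python
# Hypothetical catalog metadata: table name -> column names.
CATALOG = {"orders": ["id", "region", "amount"]}


def initial_plan(table, columns, predicates):
    """Build a naive plan: scan, one filter per predicate, then project."""
    plan = [("scan", table)]
    plan += [("filter", (predicate,)) for predicate in predicates]
    plan.append(("project", tuple(columns)))
    return plan


def merge_filters(plan):
    """Rewrite rule: combine the first pair of adjacent filters into one."""
    for i in range(len(plan) - 1):
        if plan[i][0] == "filter" and plan[i + 1][0] == "filter":
            merged = ("filter", plan[i][1] + plan[i + 1][1])
            return plan[:i] + [merged] + plan[i + 2:]
    return plan


def drop_redundant_project(plan):
    """Rewrite rule: remove a projection that keeps every column.

    Uses catalog metadata to learn which columns the scanned table has.
    """
    all_columns = tuple(CATALOG[plan[0][1]])
    return [step for step in plan if step != ("project", all_columns)]


RULES = [merge_filters, drop_redundant_project]


def optimize(plan, max_passes=10):
    """Apply every rule once per pass until a pass changes nothing."""
    passes = 0
    for _ in range(max_passes):
        new_plan = plan
        for rule in RULES:
            new_plan = rule(new_plan)
        passes += 1
        if new_plan == plan:
            break
        plan = new_plan
    return plan, passes


def to_fragments(plan, partitions):
    """Pair the optimized plan with each data partition it should run on."""
    return [(partition, plan) for partition in partitions]


plan = initial_plan(
    "orders", ["id", "region", "amount"], ["amount > 100", "region = 'EU'"]
)
optimized, passes = optimize(plan)
fragments = to_fragments(optimized, ["partition-0", "partition-1"])
print(optimized, passes, len(fragments))
```

The loop stops when a full pass leaves the plan unchanged, which is one simple way to model "each iteration applying optimization"; each resulting fragment could then be handed to an executor.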