What is Dremio Cloud?
Dremio Cloud is a fully managed lakehouse platform. Data teams use Dremio to deliver self-service analytics, while enjoying the flexibility to use Dremio’s lightning-fast SQL query service and any other processing engine on the same data.
Dremio Cloud enables analysts to explore and visualize data with sub-second query response times, and enables data engineers to ingest and transform data directly in the data lake with full support for DML operations. In addition, analysts can join data in the lake with data in external databases, so they don’t have to move data into object storage to derive value from that data. Dremio’s open lakehouse platform, based on community-driven standards like Apache Iceberg and Apache Arrow, enables organizations to use best-in-class processing engines and eliminates vendor lock-in.
With Dremio Cloud, organizations can focus on deriving value from data instead of database administration. As a fully managed platform, Dremio Cloud eliminates the need to install, configure, and upgrade software, and it manages the entire lifecycle of compute engines, including provisioning, scaling, pausing, and decommissioning. Dremio Cloud compute engines are deployed in your Amazon Virtual Private Cloud (VPC), so your data stays, and is processed, in your VPC.
Dremio Cloud comprises two main services: Dremio Sonar and Dremio Arctic.
Dremio Sonar is a lakehouse query engine that supports the full spectrum of SQL needed for an organization’s data consumers.
- Business users and analysts benefit from Sonar’s query acceleration and semantic layer, which power BI dashboards directly on the lakehouse without stale and expensive data extracts.
- Data engineers can use the intuitive UI to quickly provision new views and metrics without ETL work.
- Data scientists benefit from Sonar’s native Arrow Flight interface, which enables high-throughput data access from data science tools and programming languages such as Python.
Sonar also breaks down data silos by enabling queries on data that resides not only in the lakehouse, but also in databases and data warehouses, both on Amazon Web Services (AWS) and on-premises.
Dremio Arctic consists of two primary services: (1) a metadata service that enables you to manage data as code and (2) a data optimization service (coming soon) that automates Iceberg table maintenance operations to ensure high-performance access to the data.
Powered by Nessie, Arctic is an intelligent metastore for Apache Iceberg that provides a modern, cloud-native alternative to Hive Metastore. Using Arctic, you can:
- Manage data in your data lakes with Git-like version control, creating commits, tags, and branches. This approach lets you transform data across tables and schemas in isolation, without impacting production workloads, and merge your changes once they’re tested and ready.
- Work with data in any engine that supports Apache Iceberg tables, including query engines (Sonar), processing engines (Spark), and streaming engines (Flink).
- Reproduce ML models or dashboards from a specific point in time without managing expensive data copies, and track the evolution of lakehouse data over time, with visibility into who made changes.
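The branch-and-merge workflow above can be illustrated with a toy in-memory model. This is not the Arctic or Nessie API; it is a plain-Python sketch of the semantics: branches point to commits, commits snapshot table state, and changes on a branch stay invisible to `main` until merged.

```python
# Toy model of data-as-code semantics (illustrative only, not the Arctic API).
import copy

class Catalog:
    def __init__(self):
        self._commits = {0: {}}       # commit id -> {table name: rows}
        self._next_id = 1
        self.branches = {"main": 0}   # branch name -> commit id

    def create_branch(self, name, from_branch="main"):
        # A new branch starts at the same commit as its source branch.
        self.branches[name] = self.branches[from_branch]

    def commit(self, branch, table, rows):
        # Snapshot the parent state, apply the change, advance the branch.
        state = copy.deepcopy(self._commits[self.branches[branch]])
        state[table] = rows
        self._commits[self._next_id] = state
        self.branches[branch] = self._next_id
        self._next_id += 1

    def read(self, branch, table):
        return self._commits[self.branches[branch]].get(table)

    def merge(self, source, target):
        # Fast-forward merge: target now points at the source branch's commit.
        self.branches[target] = self.branches[source]

cat = Catalog()
cat.commit("main", "sales", [1, 2])
cat.create_branch("etl")              # isolate the transformation
cat.commit("etl", "sales", [1, 2, 3]) # change is visible only on "etl"
cat.merge("etl", "main")              # publish once tested and ready
```

Reading `sales` on `main` before the merge still returns the original rows, which is the isolation guarantee the bullet list describes; older commit ids remain addressable, which is what makes point-in-time reproduction possible without data copies.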
Dremio’s functions are divided between virtual private clouds (VPCs): Dremio’s and yours. Dremio’s VPC acts as the control plane, and your VPC acts as an execution plane. If you use multiple cloud accounts with Dremio, the VPC in each account acts as an execution plane.
Dremio’s Control Plane
There are two Dremio control planes, one hosted in North America and the other in Europe; their functions are identical. Each control plane hosts Dremio’s web interface, handles query requests, hosts REST API endpoints, and manages the engines for all of the customers using that plane, keeping each customer’s environment separate. The control plane also stores data about the jobs that run your organization’s queries, statistics about your organization’s use of Dremio, and other metadata.
The Execution Plane
The execution plane resides in your VPC and consists of one or more compute engines per subnet. Engines are provisioned automatically as needed to execute queries. For example, if your organization’s VPC runs in AWS, Dremio’s control plane deploys compute engines as Amazon EC2 instances within your VPC. The execution plane is also where your data is stored, along with the metadata for your Sonar projects and Arctic catalogs.
Objects in Dremio Cloud
An organization is the account within Dremio Cloud in which work is done. An organization is created during the sign-up process.
A cloud represents a compute environment (AWS) in which Sonar engines run. Each cloud object holds the credentials (an access key or a cross-account role) for the cloud it represents and is associated with a particular Amazon Virtual Private Cloud (Amazon VPC) where compute resources are launched. A single cloud can be associated with more than one project. For more information, see Managing Clouds.
Users can work in all projects and catalogs in an organization.
When an organization administrator adds users to an organization, the administrator assigns them roles that determine what they are allowed to do. Each user is assigned the Admin role, the Public role, or both when added. These roles are currently pre-configured in Sonar; you can create additional roles to suit the needs of your organization.
Learn more about Sonar projects in Overview of Sonar.
Learn more about Arctic catalogs in Overview of Arctic.
Signing Up for Dremio Cloud
If you don’t already have a Dremio Cloud account, there are two ways to get one:
- Sign up for an account and create your own Dremio Cloud organization in the process. See Signing Up for Dremio Cloud.
- Be invited to a Dremio Cloud organization, then follow the instructions sent to your email account.