Connecting to Your Data

This section describes the data sources that you can configure and analyze using Dremio Cloud, including data lakes (distributed filesystems) and relational databases (external sources).

note

Dremio Cloud does not support case-sensitive data file names, table names, or column names.

For example, if you have three files whose names differ only by case (such as MARKET, Market, and market), Dremio Cloud cannot distinguish between them, which can produce unexpected query results.

For column names, if a table contains two columns whose names differ only by case (such as Trip_Pickup_DateTime and trip_pickup_datetime), one of the columns may disappear when the header is extracted.
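
One way to catch such collisions before adding a source is to group names by their case-folded form. The following is a minimal sketch, not part of Dremio Cloud; the function name `find_case_collisions` is hypothetical, and the sample names are the ones used in the note above.

```python
from collections import defaultdict


def find_case_collisions(names):
    """Return groups of names that differ only by letter case.

    Hypothetical helper: the keys are case-folded names, the values are
    the original spellings that collapse to that key.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    # Keep only the groups that contain more than one distinct spelling.
    return {
        key: spellings
        for key, spellings in groups.items()
        if len(set(spellings)) > 1
    }


columns = ["Trip_Pickup_DateTime", "trip_pickup_datetime", "fare_amount"]
collisions = find_case_collisions(columns)
print(collisions)
```

Any non-empty result points at names that would need to be renamed before the data is queried.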

Data Source Support Matrix

The following sources are supported across AWS and Azure Dremio Cloud Projects:

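Source Type          Source Name           AWS Supported?  Azure Supported?
Data-as-code         Dremio Arctic         Yes             Yes
Metastore            AWS Glue              Yes             No
Object Storage       Amazon S3             Yes             Yes
Object Storage       Azure Storage         Yes             Yes
Dremio               Dremio-to-Dremio      Yes             Yes
Relational Database  Amazon Redshift       Yes             No
Relational Database  Apache Druid          Yes             No
Relational Database  IBM Db2               Yes             Yes
Relational Database  Microsoft SQL Server  Yes             Yes
Relational Database  Oracle                Yes             Yes
Relational Database  PostgreSQL            Yes             Yes
Relational Database  Snowflake             Yes             Yes
Relational Database  Vertica               Yes             Yes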
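
For scripted checks, the matrix above can be encoded as a small lookup table. This is only an illustrative sketch: the `SUPPORT` dictionary and the `is_supported` function are hypothetical names, and the values are copied from the matrix rather than read from any Dremio API.

```python
# Source name -> (supported on AWS, supported on Azure), copied from the matrix.
SUPPORT = {
    "Dremio Arctic": (True, True),
    "AWS Glue": (True, False),
    "Amazon S3": (True, True),
    "Azure Storage": (True, True),
    "Dremio-to-Dremio": (True, True),
    "Amazon Redshift": (True, False),
    "Apache Druid": (True, False),
    "IBM Db2": (True, True),
    "Microsoft SQL Server": (True, True),
    "Oracle": (True, True),
    "PostgreSQL": (True, True),
    "Snowflake": (True, True),
    "Vertica": (True, True),
}


def is_supported(source_name, cloud):
    """Return True if the source is listed as supported on the given cloud.

    `cloud` must be "aws" or "azure"; unknown sources raise KeyError.
    """
    aws, azure = SUPPORT[source_name]
    if cloud == "aws":
        return aws
    if cloud == "azure":
        return azure
    raise ValueError(f"unknown cloud: {cloud!r}")


print(is_supported("AWS Glue", "azure"))
```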

Data-as-code

You can add an Arctic catalog as a source to enable Git-like data management and allow data engineers to manage the data lake with the same best practices Git enables for software development, including commits, tags, and branches.

Metastores

The AWS Glue Catalog is a metadata store that lets you store and share metadata in AWS.

Object Storage

You can run queries directly on the data in your data lake by formatting directories and files into tables. The following types of object storage are supported:

Dremio Software Clusters

You can connect to one or more other Dremio Software clusters and run queries on the data sources that they are connected to. You can even run queries that federate data across connected clusters. See Connecting to Another Dremio Software Cluster.

Relational Databases (External Sources)

You can run queries directly on the data in relational databases, which are referred to as external sources. In addition, you can run external queries:

  • That use the native syntax of the relational database.

  • To process SQL statements that are not supported by Dremio Cloud or are too complex to convert.

    note

    Decimal-to-decimal mappings are supported for relational database sources.
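
As an illustration of the external queries described above, the sketch below builds a statement that passes native SQL through to a relational source. It assumes the `TABLE(<source>.external_query('<native SQL>'))` form; that syntax is an assumption here and should be checked against the Dremio SQL reference. The helper `build_external_query`, the source name `postgres_src`, and the sample table are all hypothetical.

```python
def build_external_query(source_name, native_sql):
    """Build a statement that wraps native SQL in an external query.

    Assumptions: the source name is a simple unquoted identifier, and the
    external_query table-function syntax matches the Dremio SQL reference.
    """
    if not source_name.isidentifier():
        raise ValueError(f"source name needs quoting: {source_name!r}")
    # Single quotes inside a SQL string literal are escaped by doubling them.
    escaped = native_sql.replace("'", "''")
    return f"SELECT * FROM TABLE({source_name}.external_query('{escaped}'))"


statement = build_external_query(
    "postgres_src",
    "SELECT name FROM customers WHERE region = 'EU'",
)
print(statement)
```

The native statement is sent to the database as a string literal, which is why embedded single quotes have to be doubled before it is wrapped.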

The following database sources are supported: