Connecting to Your Data
Dremio supports a variety of data sources, including data as code, metastores, local and cloud-based object storage, and databases.
Nessie Catalogs
Nessie catalogs enable you to process, manage, consume, and share data in the same way that code is shared during software development. You manage your data with concepts such as version control, commits, and development and testing in isolation from your production data.
Metastores
Object Storage
Databases
- Amazon OpenSearch Service
- Amazon Redshift
- Apache Druid
- Dremio Cluster (connect to one or more other Dremio Software clusters to query the data sources they are connected to and to run queries that federate data across the connected clusters)
- Elasticsearch
- IBM Db2
- Microsoft Azure Data Explorer
- Microsoft Azure Synapse Analytics
- Microsoft SQL Server
- MongoDB
- MySQL
- Oracle
- PostgreSQL
- Snowflake
- Teradata
- Vertica
Dremio enables users to run external queries, which are written in the native syntax of the relational database, to process SQL statements that Dremio does not yet support or that are too complex to convert. Dremio administrators enable the feature for each data source and specify which Dremio users can edit that source. See Querying Relational-Database Sources Directly for more information.
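As a rough illustration of what an external query looks like, the sketch below builds the wrapping statement as a string. It assumes the `TABLE(<source>.external_query('<native SQL>'))` form, that single quotes inside the native statement are escaped by doubling them, and that the source name is double-quoted; confirm all three against Querying Relational-Database Sources Directly. The helper name `build_external_query` and the source name `my_postgres` are hypothetical.

```python
def build_external_query(source_name: str, native_sql: str) -> str:
    """Wrap a native SQL statement in an external-query call (assumed syntax)."""
    # Assumption: identifiers are double-quoted and embedded double quotes are doubled.
    quoted_source = '"' + source_name.replace('"', '""') + '"'
    # Assumption: single quotes inside the string literal are escaped by doubling.
    escaped_sql = native_sql.replace("'", "''")
    return f"SELECT * FROM TABLE({quoted_source}.external_query('{escaped_sql}'))"


# Hypothetical source name and native statement, for illustration only.
statement = build_external_query(
    "my_postgres", "SELECT id FROM orders WHERE status = 'open'"
)
print(statement)
```

The native statement is passed through unchanged apart from quote escaping, so it is the relational database, not Dremio, that parses it.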
Dremio improves query performance for relational database datasets with Runtime Filtering, which applies dimension table filters to joined fact tables at runtime.
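The idea behind runtime filtering can be sketched in a few lines of Python. This is a conceptual model only, not Dremio's implementation: the function name, the row format (lists of dicts), and the sample tables are all assumptions made for illustration, and the dimension key is assumed to be unique.

```python
def runtime_filtered_join(fact_rows, dim_rows, key, dim_predicate):
    """Join fact rows to filtered dimension rows, pruning fact rows first."""
    # 1. Apply the filter to the (usually small) dimension table.
    kept_dims = [d for d in dim_rows if dim_predicate(d)]
    # 2. Build the runtime filter: the set of join keys that survived.
    runtime_filter = {d[key] for d in kept_dims}
    # 3. Discard fact rows whose key cannot match, before joining.
    pruned_facts = [f for f in fact_rows if f[key] in runtime_filter]
    # 4. Join the remaining fact rows to their dimension row.
    dims_by_key = {d[key]: d for d in kept_dims}
    joined = [{**f, **dims_by_key[f[key]]} for f in pruned_facts]
    return joined, len(pruned_facts)


# Hypothetical sample data.
sales = [
    {"store_id": 1, "amount": 10},
    {"store_id": 2, "amount": 20},
    {"store_id": 1, "amount": 5},
    {"store_id": 3, "amount": 7},
]
stores = [
    {"store_id": 1, "region": "EU"},
    {"store_id": 2, "region": "US"},
    {"store_id": 3, "region": "US"},
]

joined, facts_kept = runtime_filtered_join(
    sales, stores, "store_id", lambda d: d["region"] == "EU"
)
print(facts_kept, joined)
```

Only two of the four fact rows reach the join here, which is the source of the performance gain: the filter on the dimension table reduces the work done on the much larger fact table.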
- Decimal Support: Decimal-to-decimal mappings are supported for relational database sources.
- Collation: Relational database sources must have a collation equivalent to LATIN1_GENERAL_BIN2 to ensure consistent results when operations are pushed down. For non-equivalent collations, create a view that coerces the collation to one that is equivalent to LATIN1_GENERAL_BIN2 and access that view.
- For all sources, case-sensitive source data file/table names are not supported. In Dremio, case is ignored in the names of data files. file1.parquet, File1.parquet, and FILE1.parquet are considered to be equivalent names. Therefore, searching on one of these names can result in unanticipated results.
  In addition, columns in a table that have the same name with different cases are not supported. For example, if two columns named Trip_Pickup_DateTime and trip_pickup_datetime exist in the same table, one of the columns may disappear when the header is extracted.
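Because names that differ only in case are treated as equivalent, it can help to check a source for such collisions before connecting it. The following is a minimal sketch, assuming you already have the file or column names as a list of strings; the helper `find_case_collisions` is hypothetical and not part of Dremio.

```python
from collections import defaultdict


def find_case_collisions(names):
    """Return groups of names that are identical when case is ignored."""
    groups = defaultdict(list)
    for name in names:
        # casefold() is a more aggressive lower() intended for caseless matching.
        groups[name.casefold()].append(name)
    return [group for group in groups.values() if len(group) > 1]


files = ["file1.parquet", "File1.parquet", "FILE1.parquet", "file2.parquet"]
columns = ["Trip_Pickup_DateTime", "trip_pickup_datetime", "fare_amount"]

print(find_case_collisions(files))
print(find_case_collisions(columns))
```

Any group the function reports should be renamed at the source so that each file, table, or column name is unique regardless of case.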
Files and Folders
Note: Case-sensitive source data file/table names are not supported. Dremio treats data file names in your data source as case-insensitive. If you have three file names that differ only in case (for example, JOE, Joe, and joe), Dremio treats the files as having the same name. Searching on Joe, JOE, or joe can therefore return unanticipated results.
In addition, columns in a table that have the same name with different cases are not supported. For example, if two columns named Trip_Pickup_DateTime and trip_pickup_datetime exist in the same table, one of the columns may disappear when the header is extracted.