Skip to main content
Version: 24.1.x

Querying External Data Sources

Dremio enables users to run external queries, queries that use the native syntax of the relational database, to process SQL statements that are not yet supported by Dremio or are too complex to convert. Dremio supports external queries for the following relational databases:

  • Amazon Redshift
  • Elasticsearch
  • IBM Db2
  • Microsoft Azure Data Explorer
  • Microsoft Azure Synapse Analytics
  • Microsoft SQL Server
  • MySQL
  • Oracle
  • PostgreSQL
  • Snowflake
  • Teradata
  • Dremio Hub connectors that use ARP, Advanced Relational Pushdown.

Note
Dremio supports only SELECT SQL statements in external queries. Dremio does not support batched SELECT statements that return multiple result sets.

Enabling Users to Perform External Queries (Dremio 16.0+)

To perform external queries, users need the EXTERNAL_QUERY privilege, which appears as External Query on the Privileges screen. Administrators must manually grant each user this privilege at the System or Source level.

With this privilege, users can access all types of data assets that might exist in external sources, including databases and tables. Therefore, Dremio recommends that you restrict access to external sources. You can do so by, for example, granting only read-only access to datasets.

Enabling Users to Perform External Queries (Dremio 15.X and earlier)

A Dremio administrator must enable external queries for a relational database data source. Dremio recommends configuring access to the source using a database user with read-only access to the relational databases. Once enabled, Dremio users with edit access to the source also have access to all derived database assets, such as databases and tables.

Warning
Enabling this feature allows users to bypass dataset and table-specific permissions in Dremio.

You must have the following prerequisites before enabling external queries:

  • A relational database data source in your Dremio deployment
  • Credentials of a database user with read-only access to the relational databases in the data source

To enable external queries:

  1. Log in to Dremio as an administrator.
  2. On the Datasets page, click External Sources.
  3. Click the relational database data source for which you want to enable external queries.
  4. Click the gear icon, to the right of the search box, then click Advanced Options in the Edit Source modal.
  5. Check Grant External Query access.
  6. Click Save.

Note:
Dremio clears permissions, formats, and data reflections for all datasets created from external queries when you update the metadata for the data source.

Syntax

External queries use the following syntax:
table(<datasource-name>.external_query('<source-query'))

Dremio treats the source query as a string literal that must be surrounded by single quotation marks, '. If an external query contains a string literal inside the source query, Dremio users must use double single quotes as shown in the following example:

External query example
SELECT * FROM table(postgresql.external_query('SELECT string_col FROM tbl WHERE string_col = *''*test*''*`

To escape a single quotation mark, use four consecutive quotation marks, as shown in the following example:

External query example with single quotation mark escape
SELECT * FROM table(postgresql.external_query('SELECT string_col from tbl where string_col = '''john''''s car''))

Example: Basic Query and Creating a Dremio Relational Database Data Source

The following example selects the entire table named Actor from the data source named PostgreSQL:

SELECT * FROM table(PostgreSQL.external_query('SELECT * FROM Actor'))

  1. Log in to Dremio.
  2. On the Datasets page, click External Sources.
  3. Click the relational database data source that you want to query using native syntax.
  4. Enter your external query in the SQL Editor.
  5. Click Run.

Dremio users and developers can then save the results of their external queries to a virtual data source by clicking the disk image icon, then selecting Save As....

Important
Dremio does not support moving virtual data sources created from external queries until after Dremio performs the first refresh of the data source.

Example: Source Query Joining Tables

The following source query demontrates how to perform a JOIN between a Dremio physical data source and a table in a virtual data source created from the results of an external query:

Example source query joining tables
SELECT b.customer_id, a.product_id, a.price
FROM table(postgresql.external_query( source_a,
'SELECT product_id, price
FROM products' )) AS a,
source_b.sales AS b
WHERE b.product_id = a.product_id'