Skip to main content
Version: current [26.x Preview]

Arrow Flight SQL JDBC

The Arrow Flight SQL JDBC driver is an open-source driver that is based on the specifications for the Java Database Connectivity (JDBC) API. However, the Flight SQL JDBC driver uses Apache Arrow, so it is able to move large amounts of data faster, in part because it does not need to serialize and then deserialize data.

This driver solves a problem that is common to many BI tools that access databases through JDBC. These tools bundle a different JDBC driver for each type of database they support, because each of these databases has their own proprietary driver. Bundling multiple JDBC drivers for multiple databases can be difficult to maintain, and responding to support issues for multiple drivers can be costly. Now, provided that a database has an Apache Arrow Flight SQL endpoint enabled, the JDBC driver can connect to it.

This driver is licensed under Apache-2.0.

tip

Query planning is done on the specified node. To distribute query planning for JDBC connections, configure secondary coordinator nodes for your deployment.

Prerequisites

  • One of the following operating systems: Windows, MacOS, or Linux
  • Supported Java versions: Java 11+
    • For Java 16+, the driver requires the following option to be present:

      Java 16+ Requirement
      --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED

Supported Authentication Methods

  • Use the username and password of the Dremio account that you want to connect with.
  • Use a username and personal access token (PAT).
  • Use an OAuth Access Token

Username and Password

Pass a username and password with the user and password properties.

Personal Access Tokens Enterprise

Pass a username and personal access token (PAT) with the user and password properties. You must URL-encode PATs that you include in JDBC URLs. To encode a PAT locally on your system, you can follow the steps in URL-encoding Values. See Personal Access Tokens for enabling and creating PATs.

tip

Dremio recommends OAuth access tokens to improve security by reducing the risk of compromised passwords or personal access tokens.

OAuth Access Tokens Enterprise

To create a connection with an OAuth access token, configure the following properties:

  • token property with the vaue of the OAuth access token.
  • user property with the empty string "" to default to the username included in the access token. If the username is configured in the property value, it must match the username in the access token.
Example Arrow Flight SQL JDBC Connection Using OAuth Access Token
import jaydebeapi
jdbc_arrow_flight_url = "jdbc:arrow-flight-sql://{}:{}".format("localhost", 32010)
jdbc_arrow_flight_args = { "user": "", "token": dremio_access_token }
jdbc_driver_location_example = "/Users/me/workspace/drivers/flight-sql-jdbc-driver-16.1.0.jar"
jdbc_arrow_flight_conn = jaydebeapi.connect("org.apache.arrow.driver.jdbc.ArrowFlightJdbcDriver",
jdbc_arrow_flight_url,
jdbc_arrow_flight_args,
jdbc_driver_location_example)

Users can create OAuth access tokens using a local or LDAP username and password, a PAT, or an external JWT. Dremio provides sample code for each of these cases.

Connecting to Databases

  • Use this template to create a direct connection to a database that has enabled an Apache Arrow Flight SQL endpoint:

    Create direct connection to database
    jdbc:arrow-flight-sql://<hostname-or-IP-address>:<port-number>/?useEncryption=false[&schema=<optional_schema>][&<properties>]
    • <optional_schema>: The name of the schema (datasource or space, including child paths, such as myDatasource.folder1 and mySpace.folder1.folder2) to use by default when a schema is not specified in a query.
    • <properties>: A list of JDBC properties. Values must be URI-encoded.
  • Use this template to create a direct connection to a Dremio coordinator node:

    Create direct connection to Dremio coordinator node
    jdbc:arrow-flight-sql://<Dremio_coordinator>:32010[/?schema=<optional_schema>][&<properties>]
    • <Dremio_coordinator>: The hostname or IP address of the coordinator node in your Dremio cluster.
    • <optional_schema>: The name of the schema (datasource or space, including child paths, such as myDatasource.folder1 and mySpace.folder1.folder2) to use by default when a schema is not specified in a query.
    • <properties>: A list of JDBC properties. Values must be URL-encoded. See URL-encoding Values for suggested steps.

Downloading the Driver

To download the driver, go to Apache Arrow Flight SQL JDBC.

Integrating the driver

To integrate the driver into your development environment, add it to your classpath.

Name of the Class

The name of the class is org.apache.arrow.driver.jdbc.ArrowFlightJdbcDriver.

JDBC Properties for Dremio Wire Encryption

If you are setting up encrypted communication between your JDBC client applications and the Dremio server, use the SSL JDBC connection parameters and fully qualified hostname to configure the JDBC connection string and connect to Dremio.

note

This driver does not yet support these features:

  • Disabling host verification
  • Impersonation
PropertiesValueRequiredDescription
useEncryptiontrue or false[Optional]If true, SSL is enabled. If set to false, SSL is not enabled. The default is true. If you do not want to use encryption, you must set the value to false.
disableCertificateVerificationtrue or false[Optional]If true, Dremio does not verify the host certificate against the truststore. The default value is false.
trustStoreTypestring[Optional]Default: JKS The trustStore type. Allowed values are : JKS, PKCS12

If the useSystemTrustStore option is set to true (on Windows only), the allowed values are: Windows-MY, Windows-ROOT
Import the certificate into the Trusted Root Certificate Authorities and set trustStoreType=Windows-ROOT.
Also import the certificate into Trusted Root Certificate Authorities or Personal and set trustStoreType=Windows-MY.
trustStorestring[Optional]Path to the truststore.
If not provided, the default Java truststore is used (usually $JAVA_HOME/lib/security/cacerts) and the trustStorePassword parameter is ignored.
useSystemTrustStoretrue or false[Optional]By default, the value is true. Bypasses trustStoreType and automatically picks the correct Truststore based on the operating system: Keychain on MacOS, Local Machine and Current User Certificate Stores on Windows, and default truststore on other operating systems. If you are using an operating system other than MacOS or Windows, you must use the trustStorePassword property to pass the password of the truststore. Here is an example of a connection string for Linux:
jdbc:arrow-flight-sql://localhost:32010?trustStorePassword=Pc0_lL'Opjn$vSDcv:%Q0@@buc
trustStorePasswordstring[Optional]Password to the truststore.

Parameterized Queries with Prepared Statements

Prepared statements allow you to dynamically pass parameters to SQL queries using placeholders, ensuring safer query execution by separating the query structure from the values in parameters. With a prepared statement, you can set parameters (?) at runtime using set methods to reuse queries with different values.

note

This feature requires Apache Arrow 17.0.0 or later. It supports SELECT statements, but may not support all set methods.

To use parameterized queries with prepared statements, follow these steps:

  1. Use the prepareStatement() method to define a query with parameters, which act as placeholders for dynamic values.
  2. Set the values by replacing each parameter with a value using the appropriate set methods.
  3. Ensure all parameters are set before running the query, with indexing starting at 1. If parameters are not set before running the query, JBDC throws an exception.
  4. Call executeQuery() to run the query and retrieve results.
Java example
PreparedStatement preparedStatement = connection.prepareStatement(
"SELECT * FROM employees WHERE department = ? AND salary > ?");
preparedStatement.setString(1, "Engineering");
preparedStatement.setDouble(2, 75000);
ResultSet resultSet = preparedStatement.executeQuery();

Supported Data Types and Set Methods

Column Data TypeSupported Set Methods
IntegersetInt(), setShort(), setNull()
NumericsetInt(), setShort(), setLong(), setBigDecimal(), setNull()
DecimalsetShort(), setInt(), setLong(), setBigDecimal(), setNull()
BigIntsetShort(), setInt(), setLong(), setBigDecimal(), setNull()
DoublesetDouble(), setFloat(), setNull()
FloatsetFloat(), setNull()
CharsetString(), setNull()
VarcharsetString(), setNull()
BooleansetBoolean(), setNull()
TimesetTime(), setNull()
TimestampsetTimestamp(), setNull()
DatesetNull()
VarBinarysetBytes(), setNull()

Limitations

The JDBC client does not support the setDate() method due to mismatched date encoding formats between the Arrow Flight JDBC client and Dremio.

Managing Workloads

Dremio administrators can use the Arrow Flight server endpoint to manage query workloads by adding the following properties to connections created by Flight clients:

Flight Client PropertyDescription
ROUTING_ENGINEName of the engine to use to process all queries issued during the current session.
ROUTING_QUEUEName of the workload management queue. Used only during authentication.
ROUTING_TAGTag name associated with all queries executed within a Flight session. Used only during authentication.
SCHEMADefault schema path to the dataset that the user wants to query.

URL-encoding Values

To encode a personal access token (PAT) or property value locally on your system, you can follow these steps:

  1. In a browser window, right-click an empty area of the page and select Inspect.
  2. Click Console.
  3. Type encodeURIComponent("<PAT-or-value>"), where <PAT-or-value> is the personal access token that you obtained from Dremio or the value of a supported JDBC property. The URL-encoded PAT or value appears on the next line. You can highlight it and copy it to your clipboard.

Differences between the Arrow Flight SQL JDBC Driver and the Legacy Dremio JDBC Driver

The Arrow Flight SQL JDBC driver differs from the Dremio JDBC driver (legacy) in the following:

  • Requires Java 11+.

  • Supports ResultSet.getBoolean() on varchar columns in which boolean values are represented as these strings: "0", "1", "true", "false".

  • Supports null Calendar in calls to ResultSet.getDate(), ResultSet.getTime(), and ResultSet.getTimestamp()
    When a call to one of these methods has no Calendar parameter, or the Calendar parameter is null, the Flight JDBC driver uses the default timezone when it constructs the returned object.

  • Supports ResultSet.getDate(), ResultSet.getTime(), and ResultSet.getTimestamp() on varchar columns in which dates, times, or timestamps are represented as strings.

  • Supports varchar values that represents numeric values in calls to ResultSet.getInteger(), ResultSet.getFloat(), ResultSet.getDouble(), ResultSet.getShort(), ResultSet.getLong(), and ResultSet.getBigDecimal()

  • Supports integer values in calls to getFloat()
    Integers returned gain one decimal place.

  • Supports the native SQL complex types List, Map, and Struct
    The Dremio JDBC driver (legacy) uses String representations of these types.

  • Supports using the Interval data type in SQL functions.

  • Removes support for calling ResultSet.getBinaryStream() on non-binary data types. Though such support exists in traditional JDBC drivers, it is not in the specification for the JDBC API.

note

Calling DatabaseMetadata.getCatalog() when connected to Dremio returns empty. Other DatabaseMetadata methods return null values in the TABLE_CAT column. This is expected behavior because Dremio does not have a catalog.

Limitations

Impersonation is not supported.

Supported Conversions from Dremio Datatypes to JDBC Datatypes

DREMIO TYPEJDBCARROW TYPE
BIGINTInt
BITBool
DATEDate
DECIMALDecimal
DOUBLEFloatingPoint(DOUBLE)
FIXEDSIZEBINARYFixedSizeBinary
FLOATFloatingPoint(SINGLE)
INTInt
INTERVAL_DAY_SECONDSInterval(DAY_TIME)
INTERVAL_YEAR_MONTHSInterval(YEAR_MONTH)
LISTList
MAPMap
NULLNull
OBJECTNot Supported
STRUCTStruct
TIMETime(MILLISECOND)
TIMESTAMPTimestamp(MILLISECOND)
VARBINARYBinary
VARCHARUtf8