JDBC Driver for Arrow Flight SQL

The JDBC driver for Arrow Flight SQL is an open-source driver that is based on the specifications for the Java Database Connectivity (JDBC) API. However, the Flight JDBC driver uses Apache Arrow, so it is able to move large amounts of data faster, in part because it does not need to serialize and then deserialize data.

This driver solves a problem that is common to many BI tools that access databases through JDBC. These tools bundle a different JDBC driver for each type of database they support, because each of these databases has its own proprietary driver. Bundling multiple JDBC drivers for multiple databases can be difficult to maintain, and responding to support issues for multiple drivers can be costly. With the JDBC driver for Arrow Flight SQL, a single driver can connect to any database that has an Apache Arrow Flight SQL endpoint enabled.

This driver is licensed under Apache-2.0.

Prerequisites for Using the JDBC Driver for Arrow Flight SQL

  • Supported versions of Java: 1.8 or later
  • Supported operating systems: Windows, MacOS, and Linux

Supported Authentication Method

You can use personal access tokens for authenticating to Dremio Cloud. To generate one, see Personal Access Tokens.

Downloading the JDBC Driver for Arrow Flight SQL

You can download the driver here.

Integrating the JDBC Driver for Arrow Flight SQL

To integrate the driver into your development environment, add it to your classpath.

Name of the Class

The name of the class is org.apache.arrow.driver.jdbc.ArrowFlightJdbcDriver.
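
As a quick check that the driver JAR really is on the classpath, a sketch along these lines tries to load the class named above. The class `DriverClasspathCheck` and the helper `isOnClasspath` are illustrative names, not part of the driver.

```java
// Hypothetical helper; only the driver class name comes from this page.
final class DriverClasspathCheck {
    static final String DRIVER_CLASS = "org.apache.arrow.driver.jdbc.ArrowFlightJdbcDriver";

    // Returns true if the named class can be loaded from the current classpath.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String verdict = isOnClasspath(DRIVER_CLASS) ? " is" : " is NOT";
        System.out.println(DRIVER_CLASS + verdict + " on the classpath");
    }
}
```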

Connecting to Dremio Cloud

Use this template to create a direct connection to Dremio Cloud:

Template for the JDBC URL
jdbc:arrow-flight-sql://data.dremio.cloud:443/?useEncryption=true&token=<personal_access_token>[&schema=<schema>][&<properties>]
  • token: The personal access token to use to authenticate to Dremio. See Personal Access Tokens for information about enabling and creating PATs. You must URL-encode PATs that you include in JDBC URLs. See URL-encoding Values of Properties for suggested steps.
  • schema: The name of the schema (datasource or space, including child paths, such as myDatasource.folder1 and mySpace.folder1.folder2) to use by default when a schema is not specified in a query.
  • <properties>: A list of JDBC properties for encrypting connections and routing queries to particular engines. Values must be URL-encoded. See URL-encoding Values for suggested steps.

To authenticate to Dremio Cloud, pass in a personal access token (PAT) with the token property. Use the PAT as the value. See Personal Access Tokens for information about enabling the use of PATs in Dremio and about creating PATs. You must URL-encode PATs that you include in JDBC URLs. To encode a PAT locally on your system, you can follow the steps in URL-encoding Values.
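
The template above can be filled in programmatically. The following is a minimal sketch, not a definitive implementation: the class name `DremioCloudConnect`, the helper `buildUrl`, the environment variable `DREMIO_PAT`, and the test query `SELECT 1` are all assumptions made for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper class; the URL prefix and property names come from the template.
final class DremioCloudConnect {
    static final String BASE =
        "jdbc:arrow-flight-sql://data.dremio.cloud:443/?useEncryption=true";

    // Percent-encodes a value so that it is safe inside the JDBC URL.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // Fills in the template; schema may be null when no default schema is wanted.
    static String buildUrl(String token, String schema) {
        StringBuilder url = new StringBuilder(BASE)
            .append("&token=").append(encode(token));
        if (schema != null && !schema.isEmpty()) {
            url.append("&schema=").append(encode(schema));
        }
        return url.toString();
    }

    public static void main(String[] args) throws SQLException {
        String pat = System.getenv("DREMIO_PAT"); // assumed variable name
        if (pat == null) {
            System.err.println("Set DREMIO_PAT to run this example.");
            return;
        }
        try (Connection conn = DriverManager.getConnection(buildUrl(pat, null));
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT 1")) {
            while (rs.next()) {
                System.out.println(rs.getInt(1));
            }
        }
    }
}
```

Reading the PAT from the environment keeps it out of source control; any other secret store would serve equally well.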

Encrypting Connections

If you are setting up encrypted communication between your JDBC client applications and the Dremio server, use the SSL JDBC connection parameters and fully qualified hostname to configure the JDBC connection string and connect to Dremio.

Note: This driver does not yet support these features:

  • Disabling host verification
  • Impersonation

The following properties configure encrypted connections. All of them are optional.

  • disableCertificateVerification (value: true or false): If true, the driver does not verify the host certificate against the truststore. The default value is false.
  • trustStoreType (value: string): The type of the truststore. Allowed values are JKS and PKCS12. The default value is JKS. If the useSystemTrustStore option is set to true (on Windows only), the allowed values are Windows-MY and Windows-ROOT:
    • To use Windows-ROOT, import the certificate into Trusted Root Certification Authorities and set trustStoreType=Windows-ROOT.
    • To use Windows-MY, import the certificate into Trusted Root Certification Authorities or Personal and set trustStoreType=Windows-MY.
  • trustStore (value: string): The path to the truststore. If no path is provided, the default Java truststore is used (usually $JAVA_HOME/lib/security/cacerts) and the trustStorePassword parameter is ignored.
  • useSystemTrustStore (value: true or false): If true, the driver bypasses trustStoreType and automatically picks the correct truststore based on the operating system: Keychain on MacOS, Local Machine and Current User Certificate Stores on Windows, and the default truststore on other operating systems. The default value is true. If you are using an operating system other than MacOS or Windows, you must use the trustStorePassword property to pass the password of the truststore. Here is an example of a connection string for Linux:
    jdbc:arrow-flight-sql://data.dremio.cloud:443/?useEncryption=true&token=1234&trustStorePassword=901234
  • trustStorePassword (value: string): The password to the truststore.
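
As an alternative to appending the encryption properties to the URL, they can be collected in a java.util.Properties object, which avoids URL-encoding paths and passwords. This is a sketch under two assumptions: that the driver reads these property names from a Properties object the same way it reads them from the URL, and that setting useSystemTrustStore to false is what makes the driver use the explicit truststore. The class name and the environment variable names are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

// Hypothetical helper class; the property names come from the list above.
final class EncryptedConnectionProps {
    static final String URL = "jdbc:arrow-flight-sql://data.dremio.cloud:443/";

    // Collects the token and truststore settings for an encrypted connection.
    static Properties build(String token, String trustStorePath, String trustStorePassword) {
        Properties props = new Properties();
        props.setProperty("useEncryption", "true");
        props.setProperty("token", token);
        props.setProperty("useSystemTrustStore", "false"); // assumption: needed for an explicit truststore
        props.setProperty("trustStoreType", "JKS");
        props.setProperty("trustStore", trustStorePath);
        props.setProperty("trustStorePassword", trustStorePassword);
        return props;
    }

    public static void main(String[] args) throws SQLException {
        String pat = System.getenv("DREMIO_PAT");                      // assumed name
        String store = System.getenv("DREMIO_TRUSTSTORE");             // assumed name
        String password = System.getenv("DREMIO_TRUSTSTORE_PASSWORD"); // assumed name
        if (pat == null || store == null || password == null) {
            System.err.println("Set the three environment variables to run this example.");
            return;
        }
        try (Connection conn = DriverManager.getConnection(URL, build(pat, store, password))) {
            System.out.println("Connected: " + !conn.isClosed());
        }
    }
}
```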

URL-encoding Values

To encode a personal access token (PAT) or property value locally on your system, you can follow these steps:

  1. In a browser window, right-click an empty area of the page and select Inspect.
  2. Click Console.
  3. Type encodeURIComponent("<PAT-or-value>"), where <PAT-or-value> is the personal access token that you obtained from Dremio or the value of a supported JDBC property. The URL-encoded PAT or value appears on the next line. You can highlight it and copy it to your clipboard.
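
If you would rather encode values inside your Java application than in a browser console, the sketch below approximates encodeURIComponent with java.net.URLEncoder. URLEncoder targets HTML form encoding, so the helper converts "+" back to "%20" and restores the few characters that encodeURIComponent leaves unescaped. The class and method names are illustrative.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper that mirrors the browser's encodeURIComponent.
final class UrlEncodeValue {
    static String encodeComponent(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8")
                .replace("+", "%20")   // URLEncoder writes spaces as '+'
                .replace("%21", "!")
                .replace("%27", "'")
                .replace("%28", "(")
                .replace("%29", ")")
                .replace("%7E", "~");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        // Pass the PAT or property value as the first argument.
        if (args.length > 0) {
            System.out.println(encodeComponent(args[0]));
        }
    }
}
```

The two-argument URLEncoder.encode(String, String) overload is used because it is available on Java 8, the oldest version this driver supports.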

Adding the Root CA Certificate to Your System Truststore

  1. At a command-line prompt, run this command:

    openssl s_client -showcerts -connect data.dremio.cloud:443 </dev/null
  2. Copy the last certificate, including the lines -----BEGIN CERTIFICATE----- and -----END CERTIFICATE-----, to your clipboard.

  3. Create a text file and paste the certificate into it.

  4. Save the text file as cert.pem.

  5. If you are using MacOS, follow these steps:

    a. In Finder, double-click the cert.pem file.

    b. In the dialog that opens, select the option to add the root certificate to the system truststore.

  6. If you are using Windows, follow these steps:

    a. At a command-line prompt, enter one of these commands:

    • certlm if you want to add the certificate for all user accounts on your Windows system.
    • certmgr if you want to add the certificate only for the current user account.

    b. Right-click the folder Trusted Root Certification Authorities.

    c. Select Import.

    d. Browse for the cert.pem file and import it.

  7. If you are using Linux, follow your distribution's instructions for adding a root CA certificate to the system truststore.

  8. If you are developing your own client application to use the driver to connect to Dremio Cloud, add the certificate to the Java truststore. You must know the path to the cacerts file from $JAVA_HOME.

    • If you are using Java 11, run this command:

      keytool -import -trustcacerts -file cert.pem -alias gtsrootr1ca -keystore $JAVA_HOME/lib/security/cacerts
    • If you are using Java 8, run this command:

      keytool -import -trustcacerts -file cert.pem -alias gtsrootr1ca -keystore $JAVA_HOME/jre/lib/security/cacerts
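
After running keytool, you may want to confirm that the certificate landed in the Java truststore. The sketch below is one way to do that, with several assumptions: the class and method names are hypothetical, the truststore password is the JDK default "changeit", and JAVA_HOME points at a JDK laid out as in the commands above.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.KeyStore;

// Hypothetical helper; the alias and the two cacerts locations come from the keytool commands.
final class TrustStoreCheck {
    // Java 8 keeps cacerts under jre/, later versions do not.
    static Path cacertsPath(String javaHome, boolean java8) {
        return java8
            ? Paths.get(javaHome, "jre", "lib", "security", "cacerts")
            : Paths.get(javaHome, "lib", "security", "cacerts");
    }

    // Loads the truststore and reports whether it contains the given alias.
    static boolean hasAlias(Path store, char[] password, String alias) throws Exception {
        KeyStore keyStore = KeyStore.getInstance(KeyStore.getDefaultType());
        try (InputStream in = Files.newInputStream(store)) {
            keyStore.load(in, password);
        }
        return keyStore.containsAlias(alias);
    }

    public static void main(String[] args) throws Exception {
        String javaHome = System.getenv("JAVA_HOME");
        if (javaHome == null) {
            System.err.println("JAVA_HOME is not set.");
            return;
        }
        boolean java8 = System.getProperty("java.version").startsWith("1.8");
        Path store = cacertsPath(javaHome, java8);
        // "changeit" is the JDK's default truststore password; yours may differ.
        boolean found = hasAlias(store, "changeit".toCharArray(), "gtsrootr1ca");
        System.out.println(store + (found ? " contains" : " does not contain") + " gtsrootr1ca");
    }
}
```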

Differences Between the JDBC Driver for Arrow Flight SQL and the Legacy Dremio JDBC Driver

The JDBC driver for Arrow Flight SQL adds these features:

  • Support for ResultSet.getBoolean() on varchar columns in which boolean values are represented as these strings: "0", "1", "true", "false".

  • Support for null calendar in calls to ResultSet.getDate(), ResultSet.getTime(), and ResultSet.getTimestamp()
    When a call to one of these methods has no Calendar parameter, or the Calendar parameter is null, the Flight JDBC driver uses the default timezone when it constructs the returned object.

  • Support for ResultSet.getDate(), ResultSet.getTime(), and ResultSet.getTimestamp() on varchar columns in which dates, times, or timestamps are represented as strings.

  • Support for varchar data that represents numeric values in calls to ResultSet.getInt(), ResultSet.getFloat(), ResultSet.getDouble(), ResultSet.getShort(), ResultSet.getLong(), and ResultSet.getBigDecimal().

  • Support for integer values in calls to getFloat()
    Integers returned gain one decimal place.

  • Support for the native SQL complex types List, Map, and Struct
    Dremio's legacy JDBC driver uses String representations of these types.

  • Support for using the Interval data type in SQL functions

The JDBC driver for Arrow Flight SQL removes support for calling ResultSet.getBinaryStream() on non-binary data types. Though such support exists in traditional JDBC drivers, it is not in the specification for the JDBC API.

Note: Calling DatabaseMetaData.getCatalogs() when connected to Dremio returns an empty result. Other DatabaseMetaData methods return null values in the TABLE_CAT column. This is expected behavior because Dremio does not have a catalog.

Limitations

Neither impersonation nor parameterized queries are supported.
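
Because parameterized queries are not supported, values that would normally be bound to a PreparedStatement have to be written into the SQL text. The sketch below shows one possible workaround, not a recommendation from Dremio: it quotes a string literal by doubling embedded single quotes, which is standard SQL escaping. The class name, the helper names, and the table mySpace.customers are hypothetical. Validate untrusted input before using an approach like this.

```java
// Hypothetical helper for building SQL text without bound parameters.
final class LiteralSql {
    // Wraps a value in single quotes and doubles any embedded single quotes.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Builds a query against a made-up table; run it with a plain java.sql.Statement.
    static String customersByName(String name) {
        return "SELECT * FROM mySpace.customers WHERE name = " + quote(name);
    }

    public static void main(String[] args) {
        System.out.println(customersByName("O'Brien"));
    }
}
```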

Supported Conversions from Dremio Datatypes to JDBC Datatypes

DREMIO TYPE             JDBC ARROW TYPE
BIGINT                  Int
BIT                     Bool
DATE                    Date
DECIMAL                 Decimal
DOUBLE                  FloatingPoint(DOUBLE)
FIXEDSIZEBINARY         FixedSizeBinary
FLOAT                   FloatingPoint(SINGLE)
INT                     Int
INTERVAL_DAY_SECONDS    Interval(DAY_TIME)
INTERVAL_YEAR_MONTHS    Interval(YEAR_MONTH)
LIST                    List
MAP                     Map
NULL                    Null
OBJECT                  Not Supported
STRUCT                  Struct
TIME                    Time(MILLISECOND)
TIMESTAMP               Timestamp(MILLISECOND)
VARBINARY               Binary
VARCHAR                 Utf8