On this page

    JDBC Driver for Arrow Flight SQL

    The JDBC driver for Arrow Flight SQL is an open-source driver that is based on the specifications for the Java Database Connectivity (JDBC) API. However, the Flight JDBC driver uses Apache Arrow, so it is able to move large amounts of data faster, in part because it does not need to serialize and then deserialize data.

    This driver solves a problem that is common to many BI tools that access databases through JDBC. These tools bundle a different JDBC driver for each type of database they support, because each of these databases has their own proprietary driver. Bundling multiple JDBC drivers for multiple databases can be difficult to maintain, and responding to support issues for multiple drivers can be costly. Now, provided that a database has an Apache Arrow Flight SQL endpoint enabled, the JDBC driver for Arrow Flight SQL can connect to it.

    note:

    Tip: Query planning is done on the specified node. To distribute query planning for JDBC connections, configure secondary coordinator nodes for your deployment.

    Prerequisites for Using the JDBC Driver for Arrow Flight SQL

    • Java 1.8 or later
    • One of the following operating systems: Windows, MacOS, or Linux

    Downloading the JDBC Driver for Arrow Flight SQL

    You can download the driver here.

    Integrating the JDBC Driver for Arrow Flight SQL

    To integrate the driver into your development environment, add it to your classpath.

    Name of the Class

    The name of the class is org.apache.arrow.driver.jdbc.ArrowFlightJdbcDriver.

    Connecting to Databases

    • Use this template to create a direct connection to a database that has enabled an Apache Arrow Flight SQL endpoint:

      Create direct connection to database
      jdbc:arrow-flight-sql://<hostname-or-IP-address>:<port-number>/?useEncryption=false[&schema=<optional_schema>][&<properties>]
      
      • <optional_schema>: The name of the schema (datasource or space, including child paths, such as myDatasource.folder1 and mySpace.folder1.folder2) to use by default when a schema is not specified in a query.
      • <properties>: A list of JDBC properties. Values must be URI-encoded.
    • Use this template to create a direct connection to a Dremio coordinator node:

      Create direct connection to Dremio coordinator node
      jdbc:arrow-flight-sql://<Dremio_coordinator>:32010[/?schema=<optional_schema>][&<properties>]
      
      • <Dremio_coordinator>: The hostname or IP address of the coordinator node in your Dremio cluster.
      • <optional_schema>: The name of the schema (datasource or space, including child paths, such as myDatasource.folder1 and mySpace.folder1.folder2) to use by default when a schema is not specified in a query.
      • <properties>: A list of JDBC properties.

    Supported Authentication Methods

    To authenticate to Dremio, use either of these methods:

    • Pass in a username and password with the user and password properties.
    • Pass in a username and personal access token (PAT) with the user and password properties. Use the PAT as the value of the password property. See Personal Access Tokens for information about enabling the use of PATs in Dremio and about creating PATs. You must URL-encode PATs that you include in JDBC URLs. To encode a PAT locally on your system, you can follow these steps:
      1. In a browser window, right-click an empty area of the page and select Inspect.
      2. Click Console.
      3. Type encodeURIComponent("<PAT>"), where <PAT> is the personal access token that you obtained from Dremio. The URL-encoded PAT appears in red on the next line. You can highlight it and copy it to your clipboard.

    JDBC Properties for Dremio Wire Encryption

    If you are setting up encrypted communication between your JDBC client applications and the Dremio server, use the SSL JDBC connection parameters and fully qualified hostname to configure the JDBC connection string and connect to Dremio.

    note:

    This driver does not yet support these three features:

    • Disabling host verification

    • Impersonation

    Properties Value Required Description
    useEncryption true or false [Optional] If true, SSL is enabled. If not set or set to false, SSL is not enabled. The default is true. If you do not want to use encryption, you must set the value to false.
    disableCertificateVerification true or false [Optional] If true, Dremio does not verify the host certificate against the truststore. The default value is false.
    trustStoreType string [Optional] Default: JKS The trustStore type. Allowed values are : JKS, PKCS12

    If the useSystemTrustStore option is set to true (on Windows only), the allowed values are: Windows-MY, Windows-ROOT
    Import the certificate into the Trusted Root Certificate Authorities and set trustStoreType=Windows-ROOT.
    Also import the certificate into Trusted Root Certificate Authorities or Personal and set trustStoreType=Windows-MY.
    trustStore string [Optional] Path to the truststore.
    If not provided, the default Java truststore is used (usually $JAVA_HOME/lib/security/cacerts) and the trustStorePassword parameter is ignored.
    useSystemTrustStore true or false [Optional] By default, the value is true. Bypasses trustStoreType and automatically picks the correct Truststore based on the operating system: Keychain on MacOS, Local Machine and Current User Certificate Stores on Windows, and default truststore on other operating systems. If you are using an operating system other than MacOS or Windows, you must use the trustStorePassword property to pass the password of the truststore. Here is an example of a connection string for Linux:
    jdbc:arrow-flight-sql://localhost:32010?trustStorePassword=Pc0_lL'Opjn$vSDcv:%Q0@@buc
    trustStorePassword string [Optional] Password to the truststore.

    Managing Workloads

    Dremio administrators can use the Arrow Flight server endpoint to manage query workloads by adding the following properties to connections created by Flight clients:

    Flight Client Property Description
    ROUTING_ENGINE Name of the engine to use to process all queries issued during the current session.
    ROUTING_QUEUE Name of the workload management queue. Used only during authentication.
    ROUTING_TAG Tag name associated with all queries executed within a Flight session. Used only during authentication.
    SCHEMA Default schema path to the dataset that the user wants to query.

    URL-encoding Values

    To encode a personal access token (PAT) or property value locally on your system, you can follow these steps:

    1. In a browser window, right-click an empty area of the page and select Inspect.
    2. Click Console.
    3. Type encodeURIComponent("<PAT-or-value>"), where <PAT-or-value> is the personal access token that you obtained from Dremio or the value of a supported JDBC property. The URL-encoded PAT or value appears on the next line. You can highlight it and copy it to your clipboard.

    Differences Between the JDBC Driver for Arrow Flight SQL and the legacy Dremio JDBC Driver

    The JDBC driver for Arrow Flight SQL adds these features:

    • Support for ResultSet.getBoolean() on varchar columns in which boolean values are represented as these strings: “0”, “1”, “true”, “false”.

    • Support for null calendar in calls to ResultSet.getDate(), ResultSet.getTime(), and ResultSet.getTimestamp()
      When a call to one of these methods has no Calendar parameter, or the Calendar parameter is null, the Flight JDBC driver uses the default timezone when it constructs the returned object.

    • Support for ResultSet.getDate(), ResultSet.getTime(), and ResultSet.getTimestamp() on varchar columns in which dates, times, or timestamps are represented as strings.

    • Support for varchar data that represents numeric values in calls to ResultSet.getInteger(), ResultSet.getFloat(), ResultSet.getDouble(), ResultSet.getShort(), ResultSet.getLong(), and ResultSet.getBigDecimal()

    • Support for integer values in calls to getFloat()
      Integers returned gain one decimal place.

    • Support for the native SQL complex types List, Map, and Struct
      Dremio’s legacy JDBC driver uses String representations of these types.

    • Support for using the Interval data type in SQL functions

    The JDBC driver for Arrow Flight SQL removes support for calling ResultSet.getBinaryStream() on non-binary data types. Though such support exists in traditional JDBC drivers, it is not in the specification for the JDBC API.

    note:

    Calling DatabaseMetadata.getCatalog() when connected to Dremio returns empty. Other DatabaseMetadata methods return null values in the TABLE_CAT column. This is expected behavior because Dremio does not have a catalog.

    Limitations

    Neither impersonation nor parameterized queries are supported.

    Supported Conversions from Dremio Datatypes to JDBC Datatypes

    DREMIO TYPE JDBCARROW TYPE
    BIGINT Int
    BIT Bool
    DATE Date
    DECIMAL Decimal
    DOUBLE FloatingPoint(DOUBLE)
    FIXEDSIZEBINARY FixedSizeBinary
    FLOAT FloatingPoint(SINGLE)
    INT Int
    INTERVAL_DAY_SECONDS Interval(DAY_TIME)
    INTERVAL_YEAR_MONTHS Interval(YEAR_MONTH)
    LIST List
    MAP Map
    NULL Null
    OBJECT Not Supported
    STRUCT Struct
    TIME Time(MILLISECOND)
    TIMESTAMP Timestamp(MILLISECOND)
    VARBINARY Binary
    VARCHAR Utf8