    Developing Arrow Flight Client Applications for Dremio

    Dremio provides an Arrow Flight server endpoint for Arrow Flight connections. The endpoint is enabled by default on port 32010. Arrow Flight transfers data faster than ODBC/JDBC connections because it uses the Apache Arrow format, which avoids serializing and deserializing data.
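
    As a rough illustration of connecting to this endpoint, the following Python sketch builds the location URI that a Flight client would connect to. The helper names flight_location and connect and the host name are hypothetical. The grpc+tcp and grpc+tls schemes are the ones Arrow Flight uses for unencrypted and TLS connections; whether TLS applies depends on how your Dremio deployment is configured.

```python
DEFAULT_FLIGHT_PORT = 32010  # Dremio's default Arrow Flight port, per the text above


def flight_location(host, port=DEFAULT_FLIGHT_PORT, tls=False):
    """Build an Arrow Flight location URI for a Dremio coordinator (hypothetical helper)."""
    scheme = "grpc+tls" if tls else "grpc+tcp"
    return f"{scheme}://{host}:{port}"


def connect(host, port=DEFAULT_FLIGHT_PORT, tls=False):
    """Open a Flight client for the given host; requires the pyarrow package."""
    # Imported lazily so that flight_location() works without pyarrow installed.
    from pyarrow import flight

    return flight.FlightClient(flight_location(host, port, tls))
```

    For example, connect("localhost") would target grpc+tcp://localhost:32010.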

    Supported Versions of Apache Arrow

    Dremio supports client applications that use Arrow Flight in Apache Arrow version 3.0.0 or later.

    Supported Authentication Methods

    Dremio supports basic username/password authentication and authentication through personal access tokens (PATs). Set the value of services.flight.auth.mode in the dremio.conf file to specify which method your Flight clients use:

    • To use basic authentication, set the value to legacy.arrow.flight.auth.
    • To use a PAT, set the value to arrow.flight.auth2.

    If this value is changed, Dremio must be restarted.

    To learn how to enable Dremio’s support of PATs and how to create a PAT, see Personal Access Tokens.

    Configuration Setting for Dremio 21.0.0+

    Starting in version 21.0.0 of Dremio, there is a configuration setting that affects the content of client requests and server responses:

    services.flight.use_session_service

    • If this setting is present in dremio.conf and set to true:

      When a Flight client sends in its first request, the response includes a Set-Cookie header for the session ID. The Flight client is expected to send Cookie headers that include the session ID in each subsequent request in a session.

      In version 21.0.0 and later, this setting is present and set to true by default.

    • If this setting is not present in dremio.conf, or is set to false:

      Interactions between Flight clients and Dremio continue to work as they did in releases prior to 21.0.0.

    If this value is changed, or if the setting is added or removed from dremio.conf, Dremio must be restarted.
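
    The cookie exchange described above could be handled on the client side roughly as follows. This is a minimal sketch, not Dremio's or Arrow's implementation: the class name CookieJar is hypothetical, and the method names only mirror those of pyarrow.flight.ClientMiddleware so that the logic could be moved into a real middleware class. The sketch itself uses only the Python standard library.

```python
from http.cookies import SimpleCookie


class CookieJar:
    """Remembers cookies from Set-Cookie headers and replays them as a Cookie header."""

    def __init__(self):
        self.cookies = {}  # cookie name -> cookie value

    def received_headers(self, headers):
        # headers maps lowercase header names to lists of header values.
        for raw in headers.get("set-cookie", []):
            parsed = SimpleCookie()
            parsed.load(raw)
            for name, morsel in parsed.items():
                # Keep only name and value; attributes such as Path are ignored here.
                self.cookies[name] = morsel.value

    def sending_headers(self):
        # Send nothing until the server has set at least one cookie.
        if not self.cookies:
            return {}
        pairs = "; ".join(f"{name}={value}" for name, value in self.cookies.items())
        return {"cookie": pairs}
```

    Calling received_headers() with each response's headers and merging the result of sending_headers() into each subsequent request keeps those requests in the same session.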

    Flight Sessions

    A Flight session is the period during which a Flight client interacts with Dremio. By default, a session lasts 120 minutes. You can change the duration by setting usersessions.ttl.seconds in dremio.conf. If this value is changed, Dremio must be restarted.

    A Flight client initiates a new session by sending a getFlightInfo() request without a Cookie header that specifies a session ID obtained from Dremio. All requests that pass the same session ID are considered to be in the same session.
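
    The session rule above can be pictured with a small in-memory model. This is a toy stand-in written for illustration, not Dremio code: the class name, the cookie name session_id, and the use of random hex strings as IDs are all assumptions, and session expiry is not modeled.

```python
import uuid


class ToySessionServer:
    """In-memory stand-in for the session rule described above; not Dremio code."""

    COOKIE_NAME = "session_id"  # hypothetical cookie name, chosen for illustration

    def __init__(self):
        self.sessions = set()  # session IDs this toy server has issued

    def get_flight_info(self, query, cookies=None):
        """Return (session_id, is_new_session) for a getFlightInfo()-style request."""
        session_id = (cookies or {}).get(self.COOKIE_NAME)
        if session_id in self.sessions:
            return session_id, False  # known ID: the same session continues
        # No ID, or an ID this server never issued: start a new session.
        session_id = uuid.uuid4().hex
        self.sessions.add(session_id)
        return session_id, True
```

    A first call without cookies yields a new ID; passing that ID back keeps the session, and omitting it starts another one.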

    The interaction between a Flight client and Dremio differs slightly according to the authentication method that the client uses.

    Flight Sessions Authenticated by Usernames and Passwords

    1. The Flight client sends an authentication request to Dremio.

    2. Dremio responds with an authentication token.

    3. The Flight client sends a getFlightInfo() request that includes the query to run and the URI for the endpoint. The request does not include a Cookie header with a session ID.

    4. Dremio sends a response that includes FlightInfo, a Set-Cookie header with the session ID, and a Set-Cookie header with the ID of the default project in the organization.

      FlightInfo responses from Dremio include the single endpoint for the control plane being used and the ticket for that endpoint. There is only one endpoint listed in FlightInfo responses.

      Session IDs are generated by Dremio.

    5. The client sends a getStream() request that includes the ticket, a Cookie header for the session ID, and a Cookie header for the ID of the default project.

    6. Dremio returns the query results in one flight.

    7. The Flight client sends another getFlightInfo() request using the same session ID and bearer token. If this second request did not include the session ID that Dremio sent in response to the first request, then Dremio would send a new session ID and a new session would begin.
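
    Steps 1 through 6 might look roughly like this in Python with the pyarrow package. Treat it as a sketch under assumptions rather than a reference client: the function names are hypothetical, the check for exactly one endpoint simply reflects the statement above that FlightInfo lists a single endpoint, and cookie handling is left out, so continuing the same session (step 7) would additionally require client middleware that returns the cookies Dremio sets.

```python
def fetch_results(client, descriptor, options):
    """Steps 3 to 6: request FlightInfo, then stream results for its single ticket."""
    info = client.get_flight_info(descriptor, options)
    if len(info.endpoints) != 1:
        raise RuntimeError(f"expected one endpoint, got {len(info.endpoints)}")
    reader = client.do_get(info.endpoints[0].ticket, options)
    return reader.read_all()


def query_with_password(location, username, password, sql):
    """Steps 1 to 6 against a live server; requires the pyarrow package."""
    from pyarrow import flight

    client = flight.FlightClient(location)
    # Steps 1-2: exchange the credentials for a bearer-token header pair.
    token_header = client.authenticate_basic_token(username, password)
    # The token header is attached to every later call through the call options.
    options = flight.FlightCallOptions(headers=[token_header])
    descriptor = flight.FlightDescriptor.for_command(sql)
    return fetch_results(client, descriptor, options)
```

    A call such as query_with_password("grpc+tcp://localhost:32010", "user", "password", "SELECT 1") would return the results as a pyarrow Table, assuming a reachable server and valid credentials.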

    Flight Sessions Authenticated by Personal Access Tokens (PATs)

    1. The Flight client, having obtained a PAT from Dremio, sends a getFlightInfo() request that includes the query to run, the URI for the endpoint, and the bearer token (PAT). The request does not include a Cookie header with a session ID.

      A single bearer token can be used for requests until it expires.

    2. If Dremio is able to authenticate the Flight client by using the bearer token, it sends a response that includes FlightInfo, a Set-Cookie header with the session ID, the bearer token, and a Set-Cookie header with the ID of the default project in the organization.

      FlightInfo responses from Dremio include the single endpoint for the control plane being used and the ticket for that endpoint. There is only one endpoint listed in FlightInfo responses.

      Session IDs are generated by Dremio.

    3. The client sends a getStream() request that includes the ticket, a Cookie header for the session ID, the bearer token, and a Cookie header for the ID of the default project.

    4. Dremio returns the query results in one flight.

    5. The Flight client sends another getFlightInfo() request using the same session ID and bearer token. If this second request did not include the session ID that Dremio sent in response to the first request, then Dremio would send a new session ID and a new session would begin.
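
    With a PAT there is no separate authentication call; the token travels with each request. The sketch below shows one way steps 1 through 4 could look in Python with pyarrow. The function names are hypothetical, and sending the PAT as an authorization header with the Bearer scheme is an assumption based on the standard bearer-token convention. Cookie handling is again omitted.

```python
def bearer_header(pat):
    """Return the authorization header pair that carries a PAT as a bearer token."""
    return (b"authorization", b"Bearer " + pat.encode("utf-8"))


def query_with_pat(location, pat, sql):
    """Steps 1 to 4 against a live server; requires the pyarrow package."""
    from pyarrow import flight

    client = flight.FlightClient(location)
    # The same bearer header is sent with getFlightInfo() and with getStream().
    options = flight.FlightCallOptions(headers=[bearer_header(pat)])
    info = client.get_flight_info(flight.FlightDescriptor.for_command(sql), options)
    reader = client.do_get(info.endpoints[0].ticket, options)
    return reader.read_all()  # a pyarrow.Table holding the query results
```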

    Sample Arrow Flight Client Applications

    Dremio provides sample Flight client applications in Python and Java at Dremio Hub. Dremio 12.0.0 or later and Arrow 3.0.0 or later are required. See the Arrow Flight documentation for more information about Arrow Flight.

    Managing Workloads

    Dremio administrators can use the Arrow Flight server endpoint to manage query workloads by adding the following properties to Flight clients:

    Flight Client Property    Description
    ROUTING_TAG               Tag name associated with all queries executed within a Flight session. Used only during authentication.
    ROUTING_QUEUE             Name of the workload management queue. Used only during authentication.
    SCHEMA                    Default schema path to the dataset that the user wants to query.
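
    One plausible way to supply these properties from a Python client is as header pairs on the authentication call. The helper below is a sketch under assumptions: the function name is hypothetical, and the lowercase header names (routing_tag, routing_queue, schema) are assumed because gRPC requires lowercase metadata keys; confirm the exact names against the Dremio sample applications.

```python
def workload_headers(routing_tag=None, routing_queue=None, schema=None):
    """Build (name, value) header pairs for the properties in the table above."""
    properties = {
        "routing_tag": routing_tag,      # assumed header name for ROUTING_TAG
        "routing_queue": routing_queue,  # assumed header name for ROUTING_QUEUE
        "schema": schema,                # assumed header name for SCHEMA
    }
    # Skip properties that were not provided and encode the rest as bytes.
    return [
        (name.encode("utf-8"), value.encode("utf-8"))
        for name, value in properties.items()
        if value is not None
    ]
```

    With pyarrow, the resulting list could be wrapped in flight.FlightCallOptions(headers=...) and passed as the options argument of client.authenticate_basic_token(), since the routing properties are used only during authentication.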