    Python

    You can develop Dremio Cloud client applications in Python that use Arrow Flight. The code snippets on this page help you get started.

    Prerequisites

    Before you use any of the code snippets, ensure that you take these steps:

    • Install and set up Python 3.

    • Install dependencies either with conda or with pip:

      Install dependencies with conda
      conda install -c conda-forge pyarrow pandas
      
      Install dependencies with pip
      pip install pyarrow pandas
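
    After installing, it can help to confirm that both packages import in the interpreter you plan to use. The following is a minimal sketch; the function name check_dependencies is hypothetical and not part of any Dremio tooling.

```python
import importlib


def check_dependencies(names=("pyarrow", "pandas")):
    """Map each module name to its version string, or None if it cannot be imported."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
        else:
            # Some modules do not expose __version__; report "unknown" for those.
            found[name] = getattr(module, "__version__", "unknown")
    return found


if __name__ == "__main__":
    for name, version in check_dependencies().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

    A line reading NOT INSTALLED means the package is missing from the active environment.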
      

    Helper Classes and Imports

    Start with the following helper classes and imports. These pull in the correct modules and provide helper classes to construct the middleware for connecting to Dremio Cloud:

    Helper classes and imports
    from http.cookies import SimpleCookie
    from pyarrow import flight
    
    class DremioClientAuthMiddlewareFactory(flight.ClientMiddlewareFactory):
        """A factory that creates DremioClientAuthMiddleware(s)."""
    
        def __init__(self):
            self.call_credential = []
    
        def start_call(self, info):
            return DremioClientAuthMiddleware(self)
    
        def set_call_credential(self, call_credential):
            self.call_credential = call_credential
    
    class DremioClientAuthMiddleware(flight.ClientMiddleware):
        """
        A ClientMiddleware that extracts the bearer token from
        the authorization header returned by the Dremio Cloud
        Flight Server Endpoint.
    
        Parameters
        ----------
        factory : DremioClientAuthMiddlewareFactory
            The factory to set call credentials if an
            authorization header with bearer token is
            returned by Dremio Cloud.
        """
    
        def __init__(self, factory):
            self.factory = factory
    
        def received_headers(self, headers):
            auth_header_key = 'authorization'
            authorization_header = []
            for key in headers:
                if key.lower() == auth_header_key:
                    # Look up by the original key so mixed-case header names still match.
                    authorization_header = headers.get(key)
            if not authorization_header:
                raise Exception('Did not receive authorization header back from server.')
            self.factory.set_call_credential([
                b'authorization', authorization_header[0].encode('utf-8')])
    
    class CookieMiddlewareFactory(flight.ClientMiddlewareFactory):
        """A factory that creates CookieMiddleware(s)."""
    
        def __init__(self):
            self.cookies = {}
    
        def start_call(self, info):
            return CookieMiddleware(self)
    
    
    class CookieMiddleware(flight.ClientMiddleware):
        """
        A ClientMiddleware that receives and retransmits cookies.
        For simplicity, this does not auto-expire cookies.
    
        Parameters
        ----------
        factory : CookieMiddlewareFactory
            The factory containing the currently cached cookies.
        """
    
        def __init__(self, factory):
            self.factory = factory
    
        def received_headers(self, headers):
            for key in headers:
                if key.lower() == 'set-cookie':
                    cookie = SimpleCookie()
                    for item in headers.get(key):
                        cookie.load(item)
    
                    self.factory.cookies.update(cookie.items())
    
        def sending_headers(self):
            if self.factory.cookies:
                cookie_string = '; '.join("{!s}={!s}".format(key, val.value) for (key, val) in self.factory.cookies.items())
                return {b'cookie': cookie_string.encode('utf-8')}
            return {}
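
    The cookie handling in CookieMiddleware can be tried without a server. The sketch below reproduces the same two steps (parse Set-Cookie values, then render a single Cookie header) with only the standard library. The function names and the cookie names and values are illustrative assumptions, not values that Dremio Cloud is known to send.

```python
from http.cookies import SimpleCookie


def merge_set_cookie(cache, set_cookie_values):
    """Parse Set-Cookie header values and merge the resulting morsels into cache."""
    cookie = SimpleCookie()
    for item in set_cookie_values:
        cookie.load(item)
    cache.update(cookie.items())
    return cache


def build_cookie_header(cache):
    """Render the cached cookies as one Cookie request-header value."""
    return "; ".join(f"{key}={morsel.value}" for key, morsel in cache.items())


cache = {}
merge_set_cookie(cache, ["session=abc123; Path=/; HttpOnly"])  # made-up cookie
merge_set_cookie(cache, ["route=node-7; Path=/"])              # made-up cookie
print(build_cookie_header(cache))
```

    Attributes such as Path and HttpOnly are stored on each morsel but are not sent back; only the name=value pairs are retransmitted.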
    

    Connecting to Dremio Cloud

    Your client applications can authenticate to Dremio Cloud with a personal access token.

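    Place the connection snippet below the line if __name__ == "__main__":, and make sure that this line comes after the rest of your module-level code, including the helper classes. The snippets on this page are indented so that they fit inside that block.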
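
    One possible file layout is sketched below. It is an assumption about how you might organize the script, not a required structure: bearer_header and main are hypothetical names, and DREMIO_PAT is a hypothetical environment variable used so the token is not hard-coded.

```python
import os


# 1. Module-level definitions (imports, helper classes, functions) come first.
def bearer_header(token):
    """Return the authorization header as a (name, value) pair of bytes."""
    return (b"authorization", "Bearer {}".format(token).encode("utf-8"))


def main():
    # DREMIO_PAT is a hypothetical variable name; "<PAT>" is the placeholder fallback.
    token = os.environ.get("DREMIO_PAT", "<PAT>")
    headers = [bearer_header(token)]
    # ... the connection and query snippets would follow here ...
    return headers


# 2. The entry-point guard sits at the bottom of the file.
if __name__ == "__main__":
    main()
```

    With this layout, importing the module from another script defines the helpers without opening a connection.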

    Create a PAT in Dremio Cloud, then replace <PAT> in the code snippet with its value.

    Connect with a personal access token (PAT)
        # Dremio Cloud connection via PAT
        # TLS Encryption is enabled. Certificate verification is disabled.
        headers = []
        connection_args = {}
    
        # Construct middleware.
        client_cookie_middleware = CookieMiddlewareFactory()
    
        # Disable server verification
        connection_args['disable_server_verification'] = True
    
        # Establish initial connection
        client = flight.FlightClient("grpc+tls://data.dremio.cloud:443", middleware=[client_cookie_middleware], **connection_args)
    
        # Add the PAT as a bearer token to the headers sent with each call.
        headers.append((b'authorization', "Bearer {}".format('<PAT>').encode('utf-8')))
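
    The snippet above disables certificate verification. If you would rather verify the server certificate, one option is to pass PEM-encoded root certificates through the tls_root_certs argument of flight.FlightClient. The helper below is a sketch under that assumption: tls_connection_args is a hypothetical name, and the CA bundle path is something you supply.

```python
from pathlib import Path


def tls_connection_args(ca_bundle_path=None):
    """Build keyword arguments for flight.FlightClient.

    With a CA bundle path, return the PEM bytes under tls_root_certs so the
    server certificate is verified against them. Without one, fall back to
    the behavior of the snippet above and disable verification.
    """
    if ca_bundle_path is not None:
        return {"tls_root_certs": Path(ca_bundle_path).read_bytes()}
    return {"disable_server_verification": True}
```

    The returned dictionary can take the place of connection_args in the FlightClient call, for example **tls_connection_args("/path/to/ca.pem"), where the path is a placeholder.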
    

    Querying Data

    This example queries a sample table that is in Dremio Cloud’s Sample Source data source. You can add this data source in Dremio Cloud on the Datasets page by clicking Add Source and then selecting Sample Source under Object Storage.

    Example query
        # The query to execute.
        query = 'SELECT * FROM Samples."samples.dremio.com"."NYC-taxi-trips" limit 10'
    
        # Construct FlightDescriptor for the query result set.
        flight_desc = flight.FlightDescriptor.for_command(query)
    
        # Retrieve the schema of the result set.
        options = flight.FlightCallOptions(headers=headers)
        schema = client.get_schema(flight_desc, options)
    
        # Get the FlightInfo message to retrieve the Ticket corresponding
        # to the query result set.
        flight_info = client.get_flight_info(flight_desc, options)
    
        # Retrieve the result set as a stream of Arrow record batches.
        reader = client.do_get(flight_info.endpoints[0].ticket, options)
    
        # Print results.
        print(reader.read_pandas())
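
    The example reads only the first endpoint. A FlightInfo message can list more than one endpoint, so a more general reader loops over all of them. The sketch below assumes that every endpoint can be served by the same client; read_all_endpoints is a hypothetical helper name.

```python
def read_all_endpoints(client, flight_info, options=None):
    """Return one Arrow table per endpoint listed in flight_info.

    client is expected to behave like flight.FlightClient and flight_info
    like the object returned by client.get_flight_info().
    """
    tables = []
    for endpoint in flight_info.endpoints:
        # do_get returns a stream reader; read_all drains it into a single table.
        reader = client.do_get(endpoint.ticket, options)
        tables.append(reader.read_all())
    return tables
```

    With pyarrow installed, the resulting list can be combined with pyarrow.concat_tables(tables) and converted with to_pandas().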
    

    Sample Application

    For a sample application, see the python directory of Dremio’s arrow-flight-client-examples repository on GitHub at https://github.com/dremio-hub/arrow-flight-client-examples/tree/main/python.