Python

You can develop client applications in Python that use Arrow Flight and connect to Dremio Cloud's Arrow Flight server endpoint. For help getting started, try out the sample application.

Sample Python Arrow Flight Client Application

This lightweight sample Python client application connects to the Dremio Arrow Flight server endpoint and authenticates with token-based credentials. It can query any dataset in Dremio that the authenticated Dremio user can access. You can change settings in a .yaml configuration file before running the client.

The Sample Python Client Application
"""
Copyright (C) 2017-2021 Dremio Corporation

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
"""
from dremio.arguments.parse import get_config
from dremio.flight.endpoint import DremioFlightEndpoint

if __name__ == "__main__":
    # Parse the config file.
    args = get_config()

    # Instantiate the DremioFlightEndpoint object.
    dremio_flight_endpoint = DremioFlightEndpoint(args)

    # Connect to the Dremio Arrow Flight server endpoint.
    flight_client = dremio_flight_endpoint.connect()

    # Execute the query.
    dataframe = dremio_flight_endpoint.execute_query(flight_client)

    # Print out the data.
    print(dataframe)
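
For orientation, the connect and execute steps above can be approximated with the pyarrow.flight API directly. The following is a minimal sketch, not the implementation of DremioFlightEndpoint. The helper names build_location, build_auth_headers, and run_query are hypothetical, and the sketch assumes that pyarrow and pandas are installed and that the server accepts the token as a bearer authorization header.

```python
def build_location(hostname: str, port: int, tls: bool) -> str:
    """Build a Flight location URI; grpc+tls is the encrypted scheme."""
    scheme = "grpc+tls" if tls else "grpc+tcp"
    return f"{scheme}://{hostname}:{port}"


def build_auth_headers(token: str) -> list:
    """Return the bearer-token header as (bytes, bytes) pairs for Flight call options."""
    return [(b"authorization", f"Bearer {token}".encode("utf-8"))]


def run_query(hostname: str, port: int, token: str, query: str, tls: bool = True):
    """Run a SQL query over Arrow Flight and return a pandas DataFrame."""
    # Imported here so the helpers above can be used without pyarrow installed.
    from pyarrow import flight

    client = flight.FlightClient(build_location(hostname, port, tls))
    options = flight.FlightCallOptions(headers=build_auth_headers(token))

    # Ask the server to plan the query, then fetch the results of the first endpoint.
    descriptor = flight.FlightDescriptor.for_command(query)
    flight_info = client.get_flight_info(descriptor, options)
    reader = client.do_get(flight_info.endpoints[0].ticket, options)
    return reader.read_pandas()
```

Calling run_query with the hostname data.dremio.cloud, port 443, a valid token, and a SQL string would return the result set as a DataFrame. Only the first endpoint is read here, which is a simplification made to keep the sketch short.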

Steps

  1. Install Python 3.

  2. Download the Dremio Flight endpoint .whl file.

  3. Install the .whl file:

    Command for installing the file
    python3 -m pip install <path to .whl file>

  4. Create a local folder to store the client file and config file.

  5. Create a file named example.py in the folder that you created.

  6. Copy the contents of arrow-flight-client-examples/python/example.py (available here) into example.py.

  7. Create a file named config.yaml in the folder that you created.

  8. Copy the contents of arrow-flight-client-examples/python/config_template.yaml (available here) into config.yaml.

  9. Uncomment the options that you need in config.yaml and add a value after each key (e.g., username: my_username). You can either delete the options that are not being used or leave them commented out.

    Example config file for connecting to Dremio Cloud
    hostname: data.dremio.cloud
    port: 443
    token: my_PAT
    tls: true
    query: SELECT * FROM Samples."samples.dremio.com"."NYC-taxi-trips" LIMIT 10

  10. Run the Python Arrow Flight client by navigating to the folder that you created in step 4 and running this command:

    Command for running the client
    python3 example.py [-config CONFIG_REL_PATH | --config-path CONFIG_REL_PATH]
    • [-config CONFIG_REL_PATH | --config-path CONFIG_REL_PATH]: Use either of these options to set the relative path to the config file. The default is "./config.yaml".
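
The config-path option described in the last step can be handled with Python's standard argparse module. This is a minimal sketch of how such an option might be parsed, not necessarily how the dremio package does it. The function name parse_config_path and the destination name config_rel_path are assumptions made for illustration.

```python
import argparse


def parse_config_path(argv=None) -> str:
    """Return the relative config path given by -config or --config-path."""
    parser = argparse.ArgumentParser(description="Sample Arrow Flight client options")
    parser.add_argument(
        "-config",
        "--config-path",
        dest="config_rel_path",
        default="./config.yaml",  # default stated in the step above
        help="Relative path to the config file.",
    )
    # argv=None makes argparse read sys.argv[1:], as a real script would.
    return parser.parse_args(argv).config_rel_path
```

Passing an explicit list such as ["-config", "conf/cloud.yaml"] makes the function easy to exercise without touching the real command line.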

Config File Options

Default content of the config file
hostname: 
port:
username:
password:
token:
query:
tls:
disable_certificate_verification:
path_to_certs:
session_properties:
engine:

Name                              Type             Required?  Default              Description
hostname                          string           No         localhost            Must be data.dremio.cloud.
port                              integer          No         32010                Dremio's Arrow Flight server port. Must be 443.
username                          string           No         N/A                  Not applicable when connecting to Dremio Cloud.
password                          string           No         N/A                  Not applicable when connecting to Dremio Cloud.
token                             string           Yes        N/A                  Either a Personal Access Token or an OAuth2 Token.
query                             string           Yes        N/A                  The SQL query to test.
tls                               boolean          No         false                Enables encryption on a connection.
disable_certificate_verification  boolean          No         false                Disables TLS server verification.
path_to_certs                     string           No         System Certificates  Path to trusted certificates for encrypted connections.
session_properties                list of strings  No         N/A                  Key value pairs of session_properties. Example:
                                                                                     session_properties:
                                                                                     - schema='Samples."samples.dremio.com"'
                                                                                   For a list of the available properties, see Workload Management.
engine                            string           No         N/A                  The specific engine to run against.
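
The Dremio Cloud constraints in the table above can be checked before a connection is attempted. The sketch below applies the documented defaults and reports any violations. The function name check_cloud_config and the message wording are hypothetical, and the sketch assumes the config file has already been loaded into a dict (for example, with a YAML parser).

```python
# Defaults taken from the options table; options whose default is N/A are omitted.
DEFAULTS = {
    "hostname": "localhost",
    "port": 32010,
    "tls": False,
    "disable_certificate_verification": False,
}


def check_cloud_config(config: dict) -> list:
    """Return a list of problems that would prevent connecting to Dremio Cloud."""
    # Drop keys left empty in the file so the defaults apply to them.
    provided = {key: value for key, value in config.items() if value is not None}
    merged = {**DEFAULTS, **provided}

    problems = []
    if merged["hostname"] != "data.dremio.cloud":
        problems.append("hostname must be data.dremio.cloud")
    if merged["port"] != 443:
        problems.append("port must be 443")
    for required in ("token", "query"):
        if not merged.get(required):
            problems.append(f"{required} is required")
    for unused in ("username", "password"):
        if merged.get(unused):
            problems.append(f"{unused} is not applicable to Dremio Cloud")
    return problems
```

An empty list means no violations were found. Note that leaving hostname and port unset is reported as a problem, because their defaults (localhost and 32010) do not match the values that Dremio Cloud requires.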