Python
You can develop client applications in Python that use Arrow Flight and connect to Dremio Cloud's Arrow Flight server endpoint. For help getting started, try out the sample application.
Sample Python Arrow Flight Client Application
This lightweight sample Python client application connects to the Dremio Arrow Flight server endpoint. You can use token-based credentials for authentication. Any datasets in Dremio that are accessible by the provided Dremio user can be queried. You can change settings in a .yaml configuration file before running the client.
"""
Copyright (C) 2017-2021 Dremio Corporation
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
"""
from dremio.arguments.parse import get_config
from dremio.flight.endpoint import DremioFlightEndpoint

if __name__ == "__main__":
    # Parse the config file.
    args = get_config()

    # Instantiate the DremioFlightEndpoint object.
    dremio_flight_endpoint = DremioFlightEndpoint(args)

    # Connect to the Dremio Arrow Flight server endpoint.
    flight_client = dremio_flight_endpoint.connect()

    # Execute the query.
    dataframe = dremio_flight_endpoint.execute_query(flight_client)

    # Print out the data.
    print(dataframe)
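
To give a rough idea of what connecting and running a query can involve at the Arrow Flight level, here is a minimal sketch that uses `pyarrow.flight` directly. It is not the sample library's implementation. It assumes that the server accepts a bearer token in an `authorization` header and that the `grpc+tls` and `grpc+tcp` URI schemes select encrypted and unencrypted connections. The helper names `build_location`, `build_auth_headers`, and `run_query` are hypothetical, and `run_query` needs both `pyarrow` and `pandas` installed.

```python
def build_location(hostname: str, port: int, tls: bool) -> str:
    """Return a Flight URI: grpc+tls when encrypted, grpc+tcp otherwise."""
    scheme = "grpc+tls" if tls else "grpc+tcp"
    return f"{scheme}://{hostname}:{port}"


def build_auth_headers(token: str) -> list:
    """Return call headers carrying the token (assumed bearer scheme)."""
    return [(b"authorization", f"Bearer {token}".encode("utf-8"))]


def run_query(hostname, port, token, query, tls=True):
    """Hypothetical helper: run `query` and return a pandas DataFrame."""
    # Imported lazily so the two helpers above work without pyarrow.
    from pyarrow import flight

    client = flight.FlightClient(build_location(hostname, port, tls))
    options = flight.FlightCallOptions(headers=build_auth_headers(token))

    # Ask the server how to fetch the results, then stream them back.
    descriptor = flight.FlightDescriptor.for_command(query)
    info = client.get_flight_info(descriptor, options)
    reader = client.do_get(info.endpoints[0].ticket, options)
    return reader.read_pandas()
```

With a valid token, a call such as `run_query("data.dremio.cloud", 443, my_token, "SELECT 1")` would be the rough equivalent of the `connect()` and `execute_query()` steps in the sample.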
Steps
- Install Python 3.
- Download the Dremio Flight endpoint .whl file.
- Install the .whl file:
  Command for installing the file
  python3 -m pip install <path to .whl file>
- Create a local folder to store the client file and config file.
- Create a file named example.py in the folder that you created.
- Copy the contents of arrow-flight-client-examples/python/example.py (available here) into example.py.
- Create a file named config.yaml in the folder that you created.
- Copy the contents of arrow-flight-client-examples/python/config_template.yaml (available here) into config.yaml.
- Uncomment the options in config.yaml, as needed, appending arguments after their keys (e.g., username: my_username). You can either delete the options that are not being used or leave them commented.
  Example config file for connecting to Dremio Cloud
  hostname: data.dremio.cloud
  port: 443
  pat: my_PAT
  tls: true
  query: SELECT * FROM Samples."samples.dremio.com"."NYC-taxi-trips" limit 10
- Run the Python Arrow Flight Client by navigating to the folder that you created and running this command:
  Command for running the client
  python3 example.py [-config CONFIG_REL_PATH | --config-path CONFIG_REL_PATH]
  [-config CONFIG_REL_PATH | --config-path CONFIG_REL_PATH]: Use either of these options to set the relative path to the config file. The default is "./config.yaml".
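
The command-line option described above can be modeled with the standard `argparse` module. This is only a sketch of how such an option could be declared, not the sample client's own parser. The function name `parse_args` and the attribute name `config_path` are assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the config-path option; argv=None falls back to sys.argv."""
    parser = argparse.ArgumentParser(description="Sample Arrow Flight client")
    parser.add_argument(
        "-config", "--config-path",
        dest="config_path",          # hypothetical attribute name
        default="./config.yaml",     # default stated in the step above
        help="Relative path to the config file.",
    )
    return parser.parse_args(argv)
```

Calling `parse_args(["-config", "settings/cloud.yaml"])` and `parse_args(["--config-path", "settings/cloud.yaml"])` yield the same `config_path`, and omitting the option falls back to `./config.yaml`.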
Config File Options
Default content of the config file
hostname:
port:
username:
password:
token:
query:
tls:
disable_certificate_verification:
path_to_certs:
session_properties:
engine:
Name | Type | Required? | Default | Description |
---|---|---|---|---|
hostname | string | No | localhost | Must be data.dremio.cloud. |
port | integer | No | 32010 | Dremio's Arrow Flight server port. Must be 443. |
username | string | No | N/A | Not applicable when connecting to Dremio Cloud. |
password | string | No | N/A | Not applicable when connecting to Dremio Cloud. |
token | string | Yes | N/A | Either a Personal Access Token or an OAuth2 Token. |
query | string | Yes | N/A | The SQL query to test. |
tls | boolean | No | false | Enables encryption on a connection. |
disable_certificate_verification | boolean | No | false | Disables TLS server verification. |
path_to_certs | string | No | System Certificates | Path to trusted certificates for encrypted connections. |
session_properties | list of strings | No | N/A | Key-value pairs of session properties. For a list of the available properties, see Workload Management. |
engine | string | No | N/A | The specific engine to run against. |
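
As a quick way to sanity-check a configuration before running the client, the rules in the table can be expressed as a small validation function. This is a sketch under assumptions: it takes an already-loaded dictionary (for example, the result of parsing config.yaml), uses the option names from the table, and the names `DEFAULTS`, `REQUIRED`, and `resolve_cloud_config` are hypothetical rather than part of the sample client.

```python
# Defaults and required options as listed in the table above.
DEFAULTS = {
    "hostname": "localhost",
    "port": 32010,
    "tls": False,
    "disable_certificate_verification": False,
}
REQUIRED = ("token", "query")


def resolve_cloud_config(raw: dict) -> dict:
    """Apply defaults, then check the Dremio Cloud constraints."""
    # Keys left empty in a YAML file typically load as None; drop them.
    provided = {key: value for key, value in raw.items() if value is not None}
    config = {**DEFAULTS, **provided}

    missing = [key for key in REQUIRED if key not in config]
    if missing:
        raise ValueError(f"Missing required options: {', '.join(missing)}")
    if config["hostname"] != "data.dremio.cloud":
        raise ValueError("hostname must be data.dremio.cloud")
    if config["port"] != 443:
        raise ValueError("port must be 443")
    return config
```

Because the table's defaults for `hostname` and `port` do not match the values required for Dremio Cloud, a configuration that omits either one fails the check, which makes the need to set them explicitly visible early.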