Setup and Best Practices
In addition to a hostname or IP address and a port, you also need to provide the name of the DB2 database you are connecting to.
Resource Allocation Considerations
For RDBMS sources like DB2, Dremio's query execution is largely single-threaded. This means that for each DB2-directed query, only one Dremio node experiences a computational load. Unlike with most other data sources, a larger Dremio cluster therefore won't make an individual query run faster. However, if you expect a large number of concurrent queries (for example, from many simultaneous Dremio users), these will be distributed evenly across the nodes.
NOTE: Ensure that your Dremio cluster has access to the appropriate port for your DB2 source. By default this is port 50000.
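One quick way to check the port requirement above is to attempt a plain TCP connection from a Dremio node to the DB2 host. The following is a minimal sketch using only Python's standard library; the function name can_reach and the 3-second timeout are assumptions made for illustration, not part of Dremio or DB2. A successful connection only shows that the port is open, not that DB2 credentials or the database name are valid.

```python
import socket


def can_reach(host: str, port: int = 50000, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    The default port of 50000 matches the DB2 default mentioned above.
    """
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

To be meaningful, run it on each Dremio node, for example as can_reach("db2.example.com"), where db2.example.com is a placeholder for your own DB2 host.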
Dremio and DB2
For some operations, Dremio can tell the data source to execute that part of the query natively, often dramatically improving performance. These operations are called 'pushdowns.'
Because Dremio and DB2 share a common language (SQL), Dremio can execute most operations as pushdowns in DB2. These include:
- Filter (SQL:
- Limit (SQL:
- Sorting (SQL:
- Aggregation (SQL:
- Project (with expressions) (e.g. SQL: SELECT columnA + columnB, columnC, columnD)
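As a rough illustration of what these five operation types look like in SQL, the sketch below runs a single query that uses all of them. It uses Python's built-in sqlite3 module and an in-memory table purely so that the example runs without a DB2 instance. The orders table and its columns are hypothetical, and the sketch says nothing about how Dremio itself plans or pushes down a query.

```python
import sqlite3

# Hypothetical table and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount INTEGER, tax INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("east", 100, 8), ("east", 50, 4), ("west", 200, 16), ("west", 5, 1)],
)

query = """
    SELECT region, SUM(amount + tax) AS total  -- project (with an expression)
    FROM orders
    WHERE amount > 10                           -- filter
    GROUP BY region                             -- aggregation
    ORDER BY total DESC                         -- sorting
    LIMIT 1                                     -- limit
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('west', 216)]
conn.close()
```

Each commented clause corresponds to one entry in the list above.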
NOTE: Since DB2 tables have no boolean type, project operations that contain SQL expressions which evaluate to true or false (e.g. SELECT username, friends > 0) and filter operations that include boolean literals (e.g. WHERE currentAccount = true) cannot be executed as pushdowns.
Depending on the number of tables in your DB2 source, the final step of adding it to Dremio can take anywhere from a few seconds to a few minutes as the source's metadata is processed. However, this is a one-time cost and further queries to the source will not incur additional metadata reads.
Here are all available source-specific options:
|Host||DB2 host name.|
|Port||DB2 port number. Defaults to 50000.|
|Username||DB2 user name.|
|Record Fetch Size||Record fetch size, use