17.0.0 Release Notes (Dremio July 2021)

What’s New

AWS Authentication

Dremio now supports authenticating to S3, Amazon Elasticsearch, Redshift, and Glue sources by reading credentials from an AWS profile file on each node.
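
As a rough illustration of what a profile file contains, the sketch below parses a sample in the standard AWS shared-credentials layout using Python's configparser. It is not Dremio's implementation; the sample key values, the read_profile helper, and the returned dictionary shape are hypothetical and exist only for this example.

```python
import configparser

# Sample profile text in the standard AWS shared-credentials layout.
# The key values are placeholders, not real credentials.
SAMPLE_PROFILE_TEXT = """
[default]
aws_access_key_id = EXAMPLEACCESSKEYID
aws_secret_access_key = exampleSecretAccessKey

[analytics]
aws_access_key_id = ANOTHEREXAMPLEKEYID
aws_secret_access_key = anotherExampleSecret
"""


def read_profile(text: str, profile: str = "default") -> dict:
    """Return the access key pair for one named profile (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(profile):
        raise KeyError(f"profile not found: {profile}")
    section = parser[profile]
    return {
        "access_key": section["aws_access_key_id"],
        "secret_key": section["aws_secret_access_key"],
    }


if __name__ == "__main__":
    creds = read_profile(SAMPLE_PROFILE_TEXT, "analytics")
    print(creds["access_key"])
```

Because the credentials are read on each node, the same profile would need to be present on every node that accesses the source.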

Google Cloud Storage (GCS) Connector

Google Cloud Storage has been added as an option for file system sources, like S3 and HDFS.

Changed Behavior

Upgrade Process for Custom ARP Sources

Added a process for upgrading to new versions of Dremio when using custom ARP sources, for cases where the new version introduces breaking changes or features such as access control.

Timestamp Mapping for Oracle

Added a new UI option for Oracle data sources to map the Date column to indexed timestamp data. This reduces the duration of queries that would otherwise have to convert the Date column to non-indexed timestamp data.

Other Enhancements

Median/Percentile SQL Functions

Median and percentile calculations are now available through the PERCENTILE_CONT and PERCENTILE_DISC SQL functions, which compute percentiles against any numeric column. The MEDIAN function is also available and is interpreted as PERCENTILE_CONT, so either function name may be used to achieve the same result.
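
The difference between the two functions can be sketched in Python. This follows the SQL-standard definitions (linear interpolation for the continuous form, the first value whose cumulative distribution reaches the fraction for the discrete form) and treats the median as the continuous percentile at 0.5. It is an illustration only and is not taken from Dremio's implementation; the function names and the error handling for empty input are assumptions.

```python
import math
from typing import Sequence


def percentile_cont(values: Sequence[float], fraction: float) -> float:
    """Continuous percentile: interpolate linearly between neighbouring sorted values."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    ordered = sorted(values)
    position = fraction * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    if lower == upper:
        return float(ordered[lower])
    weight = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


def percentile_disc(values: Sequence[float], fraction: float) -> float:
    """Discrete percentile: first sorted value whose cumulative distribution >= fraction."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    ordered = sorted(values)
    index = max(math.ceil(fraction * len(ordered)) - 1, 0)
    return ordered[index]


def median(values: Sequence[float]) -> float:
    """Median expressed as the continuous percentile at 0.5."""
    return percentile_cont(values, 0.5)


if __name__ == "__main__":
    sample = [10, 20, 30, 40]
    print(percentile_cont(sample, 0.5))  # interpolated between 20 and 30
    print(percentile_disc(sample, 0.5))  # an actual value from the column
    print(median(sample))
```

The continuous form can return a value that does not appear in the column, while the discrete form always returns one that does.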

Elasticsearch 7 Connector Preview

As of Dremio 18.0, this functionality is no longer in preview and is available to all users.

Dremio now supports Elasticsearch 7.0+ (in addition to 5.x and 6.x) using the same Elasticsearch connector. You can create a new source and connect Dremio to an Elasticsearch 7.x cluster. However, if you already have an Elasticsearch source established with Dremio and you plan to upgrade that cluster to 7.0+, you’ll need to remove the source definition and re-add it.

For more information about what changed in Elasticsearch 7.0 and how it may affect your integration with Dremio, please review Elastic’s breaking changes in 7.0.


Mixed Type Removal

Dremio 18.0 will remove support for columns with mixed data types and will enforce a standard schema. After the upgrade, mixed type columns will be converted on the next metadata refresh or on the next query that reads from an affected physical dataset (PDS).

If files contain different types, Dremio will up-promote the data type to a common type, if possible. For example, it will convert data to BIGINT if files contain INT and BIGINT data types.
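
The up-promotion rule can be pictured as picking the widest type from an ordered ladder. The sketch below is a simplified model, not Dremio's actual coercion logic. Only the INT to BIGINT step comes from these notes; the DOUBLE rung, the common_type helper, and the behavior of returning None when no common type exists are assumptions made for illustration.

```python
from typing import Iterable, Optional

# Hypothetical widening ladder, narrowest to widest.
# Only INT -> BIGINT is stated in the release notes; DOUBLE is an assumed extra rung.
WIDENING_ORDER = ["INT", "BIGINT", "DOUBLE"]


def common_type(types: Iterable[str]) -> Optional[str]:
    """Return the widest type on the ladder, or None if no common type exists."""
    seen = set(types)
    if not seen:
        return None
    if len(seen) == 1:
        # A single type needs no promotion, whatever it is.
        return next(iter(seen))
    if not seen.issubset(WIDENING_ORDER):
        # At least one type is off the ladder, so promotion is not possible here.
        return None
    return max(seen, key=WIDENING_ORDER.index)


if __name__ == "__main__":
    print(common_type(["INT", "BIGINT", "INT"]))
    print(common_type(["INT", "VARCHAR"]))
```

In this model, files holding INT and BIGINT values for the same column resolve to BIGINT, matching the example above, while a combination with no shared rung has no common type.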

To prepare for this deprecation, the support key store.disable.mixed_types may be used to help you identify any PDS that may contain mixed data types. Using this key will emulate the deprecation so that you may easily identify mixed types before all support is ended.

An additional CLI tool is available that generates a CSV report listing each column affected by the upcoming deprecation. Please contact Dremio Customer Support for assistance in obtaining and running this tool.

Fixed Issues

Dremio now supports Delta Lake tables where statistics for data files are added later.

A query could fail if a new executor node was added while the query's results were being stored. Because the node was in a “not ready” state, it caused the query to fail, even though the node was not used while performing the query.
Rather than causing the query to fail, a node added while query results are being stored now throws a special exception (which is ignored by Dremio), and the results are stored locally rather than shared between nodes.

The log directory was moved to a new location, so events failed to appear in the /var/log/dremio directory until the Dremio service was restarted or the file was full.
When log files are moved to the backup location, a rollover is triggered so that all logged events will still appear in the expected directory.

The metadata refresh for Azure configurations was using the GMT timezone for offset calculations.
The metadata refresh has been updated to use the system’s default timezone.