16.0.0 Release Notes (Dremio April 2021)

What’s New

Access Control

Dremio 16.0.0 gives administrators a more extensive range of privileges for controlling user and group access to Dremio objects. Role-based access control is governed by object inheritance and the scope applied when granting privileges to a Dremio user. See Role-based Access for more information.

Contact Dremio Technical Support for assistance in enabling this feature.

Restoring Dremio

The new access control functionality in Dremio v16.0 is considered a breaking change. We recommend implementing it in a lower environment, such as staging or test, before installing this update on a live instance. Doing so helps identify problems that might arise within the Dremio environment when the new access control functionality takes effect, and can mitigate damage to your data in a live environment.

If you run into issues with upgrading the KVStore, run the command ./dremio-admin repair-acls -d.

Unexpected Behavior

If, prior to 16.0, a source granted CREATE TABLE or DROP privileges to specific users, Dremio grants these privileges to the PUBLIC role when upgrading to 16.0+. This behavior is incorrect. After upgrading to 16.0+, administrators should check any sources with CREATE TABLE and DROP privileges to ensure that proper access is granted.

Slicing Thread Monitor

When a slicing thread (execution thread) receives enough traffic to make it hang or slow down, this monitor flags the thread and either creates a new thread or activates a thread from the free thread pool. Tasks waiting in the slower thread’s queue are then redirected to the newly activated thread. The original thread is left to complete its current task fragment and then exits the system.
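The redirect step can be pictured with a small Python model. This is only a conceptual sketch, not Dremio's implementation: `SliceThread`, `monitor_pass`, and the `hang_threshold_s` parameter are hypothetical names, and "hanging" is assumed here to mean that the current task has run longer than a threshold.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SliceThread:
    """Hypothetical stand-in for an execution thread with a task queue."""
    name: str
    current_task: Optional[str] = None
    current_task_runtime_s: float = 0.0
    queue: deque = field(default_factory=deque)
    retiring: bool = False  # True once the monitor has flagged this thread


def monitor_pass(threads, free_pool, hang_threshold_s):
    """One monitor pass: move queued work off any thread that looks stuck."""
    replacements = []
    for thread in threads:
        stuck = thread.current_task_runtime_s > hang_threshold_s
        if not stuck or thread.retiring:
            continue
        # Activate a thread from the free pool if one exists, else create one.
        if free_pool:
            fresh = free_pool.pop()
        else:
            fresh = SliceThread(name=f"{thread.name}-replacement")
        # Redirect everything still waiting in the slow thread's queue.
        fresh.queue.extend(thread.queue)
        thread.queue.clear()
        # The slow thread keeps only its in-flight task and exits afterwards.
        thread.retiring = True
        replacements.append(fresh)
    return replacements
```

In this model the flagged thread keeps its in-flight fragment, so no running work is interrupted; only waiting tasks move.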

Idle Time Connection Settings

Added new idle timeout and connection limit options for RDBMS data sources. When the configured connection limit and/or idle time is reached, the connection is automatically closed. Idling is controlled using the Maximum idle connections and Connection idle time (s) options under the Advanced Options tab of a relational database source’s settings.
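As a rough illustration of how two such limits can interact, the Python sketch below picks which idle connections to close. It is an assumption-laden model rather than Dremio's pooling code: `PooledConnection` and `connections_to_close` are hypothetical names, and the rule that the longest-idle connections are closed first when the idle count is exceeded is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PooledConnection:
    conn_id: int
    idle_since_s: float  # timestamp (seconds) when the connection went idle


def connections_to_close(idle, now_s, max_idle_connections, idle_time_s):
    """Return the idle connections that should be closed."""
    # Rule 1: anything idle for at least idle_time_s is closed.
    expired = [c for c in idle if now_s - c.idle_since_s >= idle_time_s]
    survivors = [c for c in idle if now_s - c.idle_since_s < idle_time_s]
    # Rule 2 (assumed): if too many idle connections remain,
    # close the ones that have been idle the longest.
    survivors.sort(key=lambda c: c.idle_since_s)
    excess = max(0, len(survivors) - max_idle_connections)
    return expired + survivors[:excess]
```

Here `max_idle_connections` plays the role of Maximum idle connections and `idle_time_s` the role of Connection idle time (s).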

Content Security Policy Metadata Tag

Added a Content Security Policy (CSP) metadata value that lists all domains from which frontend resources, such as JavaScript, CSS, and images, are allowed to be loaded in Dremio.

Large Expression Support

Dremio has been optimized to handle CASE statements containing more than 200 case expressions. A new internal logical expression type treats these case expressions as flat branches instead of nested ones, which means upward of 800 branches are now supported.

This functionality is not enabled by default. If you wish to enable this functionality, contact Dremio’s Professional Services team.
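The difference between nested and flat branches can be shown with a short Python sketch. It does not reflect Dremio's internal expression types; `nest`, `depth`, `eval_flat`, and `eval_nested` are hypothetical helpers that only show why a nested form grows one level deeper per branch while a flat form stays a single list.

```python
def nest(branches, default):
    """Represent CASE as nested (condition, result, else_expr) tuples."""
    expr = default
    for condition, result in reversed(branches):
        expr = (condition, result, expr)
    return expr


def depth(expr):
    """Nesting depth of the nested form, walked iteratively."""
    levels = 0
    while isinstance(expr, tuple):
        levels += 1
        expr = expr[2]
    return levels


def eval_flat(branches, default, row):
    """Evaluate the flat form: scan a list of branches, first match wins."""
    for condition, result in branches:
        if condition(row):
            return result
    return default


def eval_nested(expr, row):
    """Evaluate the nested form by descending one level per failed branch."""
    while isinstance(expr, tuple):
        condition, result, else_expr = expr
        if condition(row):
            return result
        expr = else_expr
    return expr
```

With 800 branches, `depth(nest(branches, default))` is 800, whereas the flat form is one list of 800 entries; both forms return the same result for the same input.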

Plan Statistics

Dremio can now generate table statistics, which allows for improved optimization of query execution plans. This functionality introduces ANALYZE TABLE commands for computing and deleting statistics for physical datasets (PDS), with the results available via the sys.table_statistics system table. Dremio can now collect the following statistics: estimated number of distinct values, number of rows, and number of null values.

After statistics are computed, Dremio’s Query Planner uses them when the Planner.use_statistics option is enabled.

Encrypted LDAP Passwords

Dremio 16.0.0 enhances security by supporting storage of the encrypted LDAP bindPassword in the Dremio keystore. See Encrypting the LDAP Bind Password and the dremio-admin encrypt command for more information.

Updated Kubernetes Helm Chart

Dremio 16.0.0 features an updated Helm chart for Kubernetes-based deployments.

For Kubernetes users, Dremio v16.0.0 introduces a breaking change which requires an update to your Helm chart. Please see the following page for the specific change necessary to your Helm chart. For enterprise customers with customized Helm charts that require additional assistance, please contact Dremio Support.

New Gandiva Functions

Dremio 16.0.0 provides the following new Gandiva functions:

  • Trigonometric: sin, cos, asin, acos, tan, atan, sinh, cosh, tanh, cotg, radians, degrees
  • Date/Time: last_day
  • Miscellaneous: sha1 and sha256
  • Conversion: CONVERT_FROM(expression, 'UTF8', replacement char)
  • Cast: castTIME(timestamp) and castVARCHAR(numeric_types)

Change in return type of SQL last_day() Function

In Dremio v16.0.0, the return type of the SQL last_day() function has been changed to DATE, to align with the standard and expected behavior of this function. In prior versions of Dremio, this function returned a date-formatted STRING value, which caused unexpected behavior when passed to other functions expecting DATE input parameters.

Customers who have already modified their queries to account for the previous STRING return type of last_day() may see some query failures with this new behavior. The solution is to wrap the call to last_day() in a call to to_char(), as follows:

select to_char(last_day(current_date), 'YYYY-MM-DD') as last_day_string
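For readers who want to see the date-versus-string distinction outside SQL, the following Python sketch is an analogy only and is not Dremio code. The `last_day` and `last_day_string` helpers are hypothetical: the first returns a real date value (like the new DATE return type) and the second formats it as 'YYYY-MM-DD' (like wrapping the call in to_char()).

```python
import calendar
from datetime import date


def last_day(d: date) -> date:
    """Return the last day of d's month as a date (not a string)."""
    days_in_month = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=days_in_month)


def last_day_string(d: date) -> str:
    """Format the result as 'YYYY-MM-DD', similar to wrapping in to_char()."""
    return last_day(d).strftime("%Y-%m-%d")
```

Code that expects a string must convert explicitly, which is the same adjustment the to_char() wrapper makes in a query.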

Preview: Google Cloud Storage (GCS) Connector

Google Cloud Storage has been added as an option for file system sources, like S3 and HDFS. To enable and test out this new functionality, contact Dremio Customer Support for assistance.

Changed Behavior

By default, Dremio 16.0.0 uses the Tableau SDK Connector rather than the ODBC driver to export data in Tableau.

Other Enhancements

Dremio 16.0.0 supports runtime filtering on non-partitioned columns.

Dremio 16.0.0 no longer loads SSL-related classes when SSL is disabled.

Dremio 16.0.0 improves performance for Dremio clients by allocating memory only once for Netty.

Fixed Issues in 16.0.0

Dremio can’t display usernames with double quotation marks and colons
Fixed by disallowing double quotation marks and colons in usernames.

Unable to promote Parquet-formatted files without .parquet extension in AWS Edition
Resolved by using a format matcher during the promotion flow for Parquet-formatted files.

CVE-2017-15288: Vulnerability reported against scala-2.10.1.jars
Resolved by removing the vulnerable library from Dremio distributions and updating the Scala library to 2.10.7.

Known Issues

The MODIFY and ALTER SQL statements have the following known limitations with regard to Access Control:

  • A user may lose access to a dataset by executing a MODIFY SQL statement that impacts metadata.
  • A user that does not have access to the parent object cannot promote a dataset using the ALTER SQL statement.

16.1.0 Release Notes (Enterprise Edition Only, May 2021)

What’s New

Disable Inline Metadata Retrieval

Dremio 16.1.0 introduces the ability to disable inline (query-time) checking of metadata validity (expiry) on a per-source basis, which will prevent Dremio from refreshing dataset metadata during query planning.

This new functionality is enabled by first setting a support key; once this key is set, a new checkbox is shown in the Source Configuration UI (Advanced Options) to permit disabling of the query-time metadata validity check.

Once the Metadata Validity Check is disabled for a source:

  • Inline metadata refresh will still be used the very first time a table is queried in Dremio. For all subsequent queries, however, the check to see if metadata has passed the expiry time will not be performed.
  • Timer-based metadata refreshes will still occur, as scheduled (set by the Refresh Every interval in the source configuration). However, the expiry time (Expire After setting) will not be checked at query time to determine if metadata is stale.
  • If there is a concern that the timer-based Metadata Refresh (which refreshes all datasets in the source one-at-a-time) will not refresh critical datasets in time to meet SLAs for data freshness, then explicit ALTER PDS … REFRESH METADATA SQL commands should be issued for these critical datasets to guarantee freshness.
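The behavior described in the list above can be summarized as a small decision function. This Python sketch is a simplified model under stated assumptions, not Dremio's planner logic: `inline_refresh_needed` and all of its parameters are hypothetical names, and timestamps are plain seconds.

```python
def inline_refresh_needed(first_query, validity_check_disabled,
                          now_s, last_refresh_s, expire_after_s):
    """Decide whether query planning would trigger an inline metadata refresh."""
    if first_query:
        # The very first query of a table always loads metadata inline.
        return True
    if validity_check_disabled:
        # Expiry is not consulted; only timer-based or explicit refreshes apply.
        return False
    # Default behavior: refresh inline once metadata is past its expiry time.
    return now_s - last_refresh_s > expire_after_s
```

With the check disabled, the function returns False for every query after the first, no matter how old the metadata is, which is why explicit refreshes matter for critical datasets.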

Risk associated with using this option:

  • Data returned by queries against the table could be stale / inaccurate, if table mutations have occurred and subsequent ALTER PDS command(s) have not been run to refresh the metadata (and timer-based regular refresh has not occurred since the data mutations were applied).

There are two steps required to use this option:

  • In the Admin / Support page, add the support key store.plugin.show_metadata_validity_checkbox, switch this option on, and Save. See Support Keys for instructions on setting support keys.
  • This option is activated on a per-source basis. Once the support key is set, a new checkbox will appear in the Advanced Options panel in source configuration. Note: If this option does not appear after the support key is set, refresh your browser window.

Note:

The first few queries run after a Dremio restart might appear to experience delays associated with Metadata Retrieval, even after the above option is enabled. However, this is due to reloading the Permissions Cache and should only appear just after a restart. To validate that the delay is not due to inline refresh after setting the above option, consult the query profile and verify that the time taken for CACHED_METADATA is 0ms.

Other Enhancements

Dremio 16.1.0 allows Delta Lake to be enabled by default. Dremio administrators may enable this by setting the dremio.deltalake.enabled support key to true. See Support Keys for instructions on setting support keys.

Fixed Issues in 16.1.0

Canceled queries may not be terminated correctly
Resolved by checking if the query was canceled when RelMetadata is generated.

16.2.0 Release Notes (Enterprise Edition Only, June 2021)

Fixed Issues in 16.2.0

Dremio was reading beyond the end of an organization’s Parquet files, causing memory issues.
If Dremio reaches the end of a Parquet file and attempts to read any further, an exception is now thrown to stop the service.

16.3.0 Release Notes (Enterprise Edition Only, June 2021)

Fixed Issues in 16.3.0

Users with the MODIFY privilege for a space would encounter permission errors when listing objects due to the privilege being treated as inheritable by child objects the user lacked access to.
The MODIFY privilege now only functions at the space/source level and when listing objects is not treated as inheritable by any children.

16.5.0 Release Notes (Enterprise Edition Only, August 2021)

Fixed Issues in 16.5.0

The client connection was closed abruptly, canceling a query while outstanding messages for a coordinator node had not yet been acknowledged. This prevented any future query messages to the same coordinator from being acknowledged, and thus future queries could not progress beyond the queue.
All messages are now acknowledged by the coordinator, even when a query fails.

Companies upgrading from Dremio 4.X would encounter deserialization errors with expired reflections after upgrading to 16.X. Two new options have been added to prevent a query from refreshing an expired plan or reflection after upgrading. The following process should be followed if a reflection is set to expire and then a VDS is created with a raw reflection and query:

  1. After upgrading to Dremio 16.5, but before starting the service, replace the jars/3rdparty/calcite-core-1.16.0*.jar file with the corresponding version for v16.0.
  2. Add -Ddremio.debug.sysopt.reflection.manager.auto_plan_rebuild=false to DREMIO_JAVA_SERVER_EXTRA_OPTS in dremio-env.
  3. Start Dremio. The reflection will enter a failed state, but will not auto refresh.
  4. Now add -Ddremio.debug.sysopt.reflection.manager.auto_refresh_failed=true to DREMIO_JAVA_SERVER_EXTRA_OPTS in dremio-env.

Now the reflection should refresh after starting.

After upgrading to 16.4, users encountered delays with regard to queries, planning time, and reflections. Multiple fixes have been implemented to improve query, planning, and reflection times.

16.6.0 Release Notes (September 2021)

Fixed Issues in 16.6.0

A coordinator showed multiple queries that had failed to cancel, which helped reveal a memory leak in one of Dremio’s services that occurs when canceling queries. Because the query is canceled, the completable future result is not consumed, which ultimately results in a memory leak.
Multiple improvements have been made to counteract this behavior, including:

  • Increased the cancel retry thread pool.
  • Decreased the number of retries Dremio attempts as the active query sync typically cancels any orphan queries on the executor.
  • Made the cancel retry a non-blocking service to accommodate a higher rate of cancel handling.
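The general pattern behind these improvements, a bounded number of retries run on a dedicated pool so the caller never blocks, can be sketched in Python. This is an illustration of the technique only, not Dremio's service: `CancelRetryService`, `send_cancel`, `max_retries`, and `pool_size` are hypothetical names, and the done-callback stands in for making sure every future's result is consumed.

```python
from concurrent.futures import ThreadPoolExecutor


class CancelRetryService:
    """Hypothetical non-blocking cancel service with bounded retries."""

    def __init__(self, send_cancel, max_retries=2, pool_size=8):
        self._send_cancel = send_cancel  # assumed callable: query_id -> bool
        self._max_retries = max_retries
        self._pool = ThreadPoolExecutor(max_workers=pool_size)
        self.outcomes = {}

    def request_cancel(self, query_id):
        """Queue the cancel on the pool and return immediately."""
        future = self._pool.submit(self._attempt, query_id)
        # Always consume the result so no completed future is left dangling.
        future.add_done_callback(
            lambda f: self.outcomes.__setitem__(query_id, f.result())
        )
        return future

    def _attempt(self, query_id):
        """Try the cancel once, then retry up to max_retries more times."""
        for _ in range(1 + self._max_retries):
            if self._send_cancel(query_id):
                return True
        return False

    def shutdown(self):
        self._pool.shutdown(wait=True)
```

A caller invokes `request_cancel(query_id)` and continues; a larger `pool_size` raises the rate at which cancels can be handled, and a smaller `max_retries` bounds the work spent on any one query.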

16.7.0 Release Notes (September 2021)

Fixed Issues in 16.7.0

Users able to run queries on temporary datasets could not download results because the option was grayed out.
This issue has been resolved by adding permissions for temporary datasets.