19.0.0 Release Notes (October 2021)

What’s New

Simplified Navigation Bar

The original top navigation bar has been replaced with a new side navigation bar along the left-hand side of the Dremio interface. Links to Datasets, SQL Runner, Jobs, and account settings have been replaced with the icons described below.

This image shows the new left-hand navigation menu in Dremio.

The icons on the image above represent the following:

  1. Dataset page
  2. Query Editor page
  3. Jobs page
  4. Settings page
  5. Help menu
  6. Account options

Apache Iceberg Hive Table Support

Dremio 19.0+ supports the popular Apache Iceberg open table format for tables created with Spark 3. This functionality extends to users with read (SELECT) privileges on sources that support Hive catalogs (ADLS, Hive, S3). Additionally, this allows for more rapid storage and retrieval of metadata after queries or changes are made to a dataset.
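
Querying such a table works like any other Dremio dataset. A minimal sketch, assuming a hypothetical Hive-catalog source named hive_prod containing an Iceberg table analytics.events created with Spark 3:

  SELECT COUNT(*)
  FROM hive_prod.analytics.events;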

PIVOT & UNPIVOT Operators

PIVOT and UNPIVOT relational operators are now available in Dremio, allowing a table-valued expression to be transformed into another table. PIVOT rotates a table-valued expression by turning the unique values from one column of the expression into multiple columns in the output, performing aggregations where required on any remaining column values needed in the final output. UNPIVOT accomplishes the opposite, rotating the columns of a table-valued expression into rows.

For additional information, see the PIVOT and UNPIVOT help pages.
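
For illustration only, here is a minimal sketch assuming a hypothetical sales table with columns product, quarter, and amount (the help pages document the exact syntax):

  -- Rotate unique quarter values into columns, summing amount per product.
  SELECT *
  FROM sales
    PIVOT (SUM(amount) FOR quarter IN ('Q1', 'Q2', 'Q3', 'Q4'));

  -- The reverse: fold the quarter columns back into rows.
  SELECT *
  FROM quarterly_totals
    UNPIVOT (amount FOR quarter IN (Q1, Q2, Q3, Q4));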

Lake Formation Integration (Preview)

Dremio now honors permission configurations made from AWS Lake Formation for Glue sources at the database and table levels. This entails integrating with Dremio as described in our Lake Formation configuration guide.

Other Enhancements

  • Administrators may grant internal/external groups (e.g., AD) administrative access to an organization’s full cluster with the GRANT SQL command. To add this privilege to a role, navigate to the SQL Editor and run the query, GRANT ROLE ADMIN TO ROLE groupName. Run SELECT * FROM sys.membership to verify the new record exists (see the sketch after this list).
  • Dividing a float by 0 now returns NaN if Gandiva-based execution is enabled. If disabled, Dremio follows the standard SQL behavior for dividing by 0, which is to raise an error.
  • IEEE-754 divide semantics have also been added, activated through the planner.ieee_754_divide_semantics support key. With the key enabled, dividing a positive, zero, or negative float by zero returns Infinity, NaN, or -Infinity, respectively.
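
Both of the enhancements above can be exercised from the SQL Editor. In this sketch, groupName is a placeholder for an existing role; support keys can also be changed from the Support settings page:

  -- Grant cluster-admin rights to a group-backed role, then verify.
  GRANT ROLE ADMIN TO ROLE groupName;
  SELECT * FROM sys.membership;

  -- Opt in to IEEE-754 divide semantics.
  ALTER SYSTEM SET "planner.ieee_754_divide_semantics" = true;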

Changed Behaviors

Jobs & Job Details New UI On By Default

The new Jobs and Job Details pages first introduced in Dremio v18.0 are now activated by default for all customers that upgrade to v19.0+. Previously, this functionality required manual enablement via support keys.

Removal of Mixed Data Types Support Key

The support key that allowed customers to enable mixed types despite the deprecation implemented in Dremio v18.0 has now been removed.

Fixed Issues

After upgrading to Dremio 18.0, the interface became slow and unresponsive as a result of access control migrations.
This issue has been addressed so that the interface is more responsive while migrations are taking place.

After upgrading to 18.0, customers began experiencing permissions issues when attempting to use HDFS-based sources.
This issue has been resolved by removing permission checks for Iceberg metadata datasets.

While using the Jobs page, text was not displaying in the Search box.
This issue has been addressed with some CSS fixes to make search text visible.

Customers encountered incorrect results or no rows returned after performing queries with a false predicate in an OR condition.
This issue has been resolved by preventing the generation of incorrect partition filter conditions. Additionally, the limit for converting conjunctive normal forms has been increased.

After upgrading to Dremio v18.0+, users encountered failures when using the now() function in reflections.
This has been resolved by adding the now() function back to the reflection whitelist so that it can be used with reflections, despite being a context-sensitive function.

The Open Results link was missing from the new Job Details page UI.
The Open Results functionality has been added to the top-right corner of the Job Details page.

The ASSERT_DECIMAL SQL function was not being handled properly in the generated Java code, which resulted in infinite recursion while rewriting unions during materialization.
This issue has been resolved with improvements to the handling of the ASSERT_DECIMAL function.

The new Jobs page didn’t show the name of the reflection a refresh job was run with.
The Reflections Created section is now included to show reflection information, such as name.

Customers trying to promote a strict OXML file would encounter failures due to the format being unsupported. However, the error message incorrectly stated failure to create parser for entry:<filepath>.
The exception is now caught and an improved error message is shown: Strict OXML is not supported. Please save file: <filepath> in standard (.xlsx) format.

Due to Java version mismatches, the Dremio service could not start because of a related timeout, rendering the AWSE AMI unusable.
This issue was fixed by adding scripts that update services when first packing the AMI. Background security updates are then disabled to prevent further updates once the Dremio and ZooKeeper services start.

Dremio couldn’t read the partitionValues map in the Delta Lake Parquet checkpoint log file when written by older Parquet writers.
This issue has been resolved by enabling Dremio’s checkpoint Parquet reader to read map fields without assuming field structure.

Dremio encountered issues where Parquet file date columns showed incorrect results because the service auto-corrected them.
This issue has been resolved by disabling the store.parquet.auto.correct.dates support key, which was used by default in older versions of Dremio to resolve incorrect dates written by Apache Drill.

Logs on AWSE switched to writing to dremio.backup on startup.
This issue has been resolved for AWSE users by implementing multiple files for GC logs with timestamp postfixes formatted so that the most-recent file has the .current postfix.

Queries performed with PostgreSQL encountered issues with parentheses in UNION statements.
When generating SQL for RDBMS sources, both sides of UNION [ALL] statements are now surrounded by parentheses.
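
For illustration, the generated SQL now takes the following shape (hypothetical tables t1 and t2):

  (SELECT id, name FROM t1)
  UNION ALL
  (SELECT id, name FROM t2)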

A timer issue falsely caused timeouts when Plan Cache was enabled, due to the timer failing to stop after a plan cycle.
This issue has been resolved: in instances where Plan Cache is enabled, the timer now stops when the plan cycle ends.

Applying CONVERT_FROM(,'JSON') on a reflection was not possible because dataset field information in the __accelerator source could not be updated.
This issue has been resolved to allow the use of CONVERT_FROM(,'JSON') on reflections.
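
For context, the function parses a JSON-encoded string column into structured data; a minimal sketch with a hypothetical table staging.events and column raw_json:

  SELECT CONVERT_FROM(raw_json, 'JSON') AS parsed
  FROM staging.events;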

When Delta Lake transactions are committed, the checkpoint Parquet files generated are multi-part, which Dremio does not support, causing queries on the table to fail.
This issue has been resolved by enabling Dremio to read multi-part checkpoint Parquet files for Delta Lake tables.

“Legacy” versions of RDBMS connectors caused errors while querying.
This issue has been resolved by updating all legacy RDBMS connectors to the more advanced ARP versions of the sources; functionality that is not supported in the data source is no longer executed by Dremio. This update of ARP sources takes place automatically.

Datasets with lists and structs encountered poor performance with queries containing a WHERE clause.
This issue has been resolved by improving the performance of the copier used to copy values from complex types, such as lists and structs.

Users encountered an index-out-of-bounds exception when performing queries.
This issue has been resolved by performing null checks on columns and offset indexes.

Large Hive extended table properties, combined with concurrently running queries, caused excessive garbage collection, and executors became unresponsive.
This issue has been resolved by reducing the heap footprint for Hive extended table properties.

Oracle table synonyms were not being recognized or found in Dremio.
This issue has been resolved by altering how synonyms are retrieved.

Users experienced performance degradation for certain join queries that contained extra conditions in conjunction with an equi-join.
This issue has been addressed by extending the hash join operator to support extra conditions for such queries.

Min/max accumulations for variable length fields used in GROUP BY clauses were not spilled, causing failures after running out of direct memory.
This issue has been resolved by adding min/max variable length field aggregation to the vectorized hash aggregator operator, which can spill and thus doesn’t run out of memory.
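
For context, the affected query shape aggregates variable-length (e.g., VARCHAR) columns under GROUP BY; a hypothetical example:

  SELECT region, MIN(city) AS first_city, MAX(city) AS last_city
  FROM demo.customers
  GROUP BY region;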

The Cancel Query button was missing from the new Jobs page UI.
This issue has been resolved by adding the button to the interface.

Job duration was not incremented while the page was displayed using the new Jobs UI.
This issue has been addressed by adding a socket call to the API.

When attempting numerous concurrent metadata refreshes for AWS, S3, and GCS sources, these would fail due to service rate limits.
This issue has been addressed by implementing an exponential back-off policy that performs up to 10 retries upon encountering throttling errors from cloud sources. Should such errors persist beyond the retry limit, review the workaround documented under Known Issues.

With the new Jobs UI activated, upon clicking the View Details button, users could not bookmark or save the page for a single job.
This issue has been addressed by allowing users to open a job’s details from the same browser tab in which the View Details button was clicked.

Known Issues

The following are known issues with Dremio v19.0 and will be resolved in future maintenance releases:

  • HDP-2 and HDP-3 installations are not yet certified for Dremio v19.0. Any changes to this status will be made known here.
  • Some queries may encounter an org.apache.iceberg.exceptions.CommitFailedException error. This occurs only under highly concurrent metadata refresh and reflection refresh operations. To avoid this issue, it is recommended that administrators increase the value of the nessie.kvversionstore.max_retries support key above its default (see the sketch after this list).
  • If an administrator tries to change a local/internal user’s account details (e.g., first name, last name, email address) from the User settings page, an Invalid password error message will display upon trying to save changes. To work around this, administrators must also provide either the user’s existing password or a new password to complete the change. If a user’s password is changed, please notify the affected user.
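
A minimal sketch for raising the retry limit; the value shown is illustrative only:

  ALTER SYSTEM SET "nessie.kvversionstore.max_retries" = 50;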

19.1.0 Release Notes (November 2021)

Fixed Issues

When attempting to perform reporting based upon tag names, users encountered issues where the AWS tag Role was already being used by executors having the tag value Executor.
This issue has been addressed by renaming the Role tag to dremio_role. Coordinators still do not use this tag.

When users viewed Operator details, insufficient information was available to illustrate the effectiveness of runtime filtering; no information was provided regarding whether filters were sent to the scan or simply dropped.
This issue has been addressed by adding more runtime information, such as metrics, to help users understand the effectiveness of runtime filtering.

If an administrator attempted to save changes made from the Edit User screen, Dremio displayed an Invalid password error, forcing the admin to either enter the user’s existing password or set a new password to preserve changes.
This issue has been addressed by no longer sending passwords to Dremio unless a value exists in the Password field.

Users encountered unexpected coordinator restarts when failures occurred with external ZooKeeper clients.
This issue has been addressed by resolving a race condition, allowing additional time for a response from the ZooKeeper client.

The Dremio service performed credential validations for every cloud data source, causing excessive system checks.
This issue has been addressed by simplifying initialization and refactoring Azure token fetching.

Users encountered an issue with Iceberg data in S3 returning NULL for all columns after upgrading to Dremio v19.0. This occurred because Parquet files in the Iceberg table did not contain column IDs.
This issue has been addressed by adding a new support key, dremio.iceberg.fallback_to_name_based_reader, which, when enabled, allows Dremio to fall back to name-based reading.
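
The key can be enabled from the SQL Editor; a minimal sketch:

  ALTER SYSTEM SET "dremio.iceberg.fallback_to_name_based_reader" = true;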

When using the unlimited splits execution flow preview feature, users received incorrect results.
This issue has been addressed by adding node conditions to the Iceberg digest.

While populating schema for Iceberg scan operators, the service generated columns incorrectly when a complex field reference was inserted into the scan.
This issue has been addressed by referencing the correct property.

While performing expansion matching, Dremio went through each expansion node in the query plan and terminated them individually, which added to the time spent in normalization.
This issue has been addressed by improving reflection matching: expansion nodes are now associated with individual node levels.

Users encountered an issue with cluster state where all engines were running, but the cluster was not flagged as RUNNING.
This issue has been addressed by adding failsafe logic to fix the cluster state if all engines are running, but the cluster is not marked as such.

Users encountered issues with the PUT api/v3/reflection/{reflectionId} API when Dremio encountered unknown fields.
This issue has been addressed by altering the API to not return errors when a payload includes canView and canAlter fields. Changes to these fields' values will be ignored.

Known Issues

  • Some users may find themselves able to delete an engine without encountering any warnings, even if queries are associated and a queue assigned. The deleted engine name will still be visible from the UI, such as the Engine column of the queue, but no new jobs will be assigned to it.