On this page

    19.0.0 Release Notes (Dremio October 2021)

    What’s New

    Simplified Navigation Bar

    The original top navigation bar has been replaced with a new side navigation bar along the left-hand side of the Dremio interface. Links to Datasets, SQL Runner, Jobs, and account settings have been replaced with icons for the SQL Editor, Job Visualizer, project, and profile.

    This image shows the new left-hand navigation menu in Dremio.

    The icons on the image above represent the following:

    1. Dataset page
    2. Query Editor page
    3. Jobs page
    4. Settings page
    5. Help menu
    6. Account options

    Apache Iceberg Hive Table Support

    Dremio 19.0+ supports using the popular Apache Iceberg open table format created with Spark 3. This functionality extends to users with read (SELECT) privileges on sources that support Hive catalogs (ADLS, Hive, S3). Additionally, this allows for more rapid storage and retrieval of metadata when after queries or changes are made to a dataset.

    PIVOT & UNPIVOT Operators

    PIVOT and UNPIVOT relational operators are now available in Dremio, allowing for the changing of table-valued expressions into another table. PIVOT rotates table-valued expressions by turning unique values from one column of an expression into multiple columns within the output. Additionally, it performs aggregations where required on any remaining column values needed in the final output. UNPIVOT accomplishes the opposite, rotating columns of table-valued expressions into column values.

    For additional information, see the PIVOT and UNPIVOT help pages.

    Lake Formation Integration (Preview)

    Dremio now honors permission configurations made from AWS Lake Formation for Glue sources at the database and table levels. This entails integrating with Dremio as described in our Lake Formation configuration guide.

    Other Enhancements

    • Administrators may grant internal/external groups (e.g., AD) administrative access to an organization’s full cluster with the GRANT SQL command. To add this privilege to a role, navigate to the SQL Editor and run the query, GRANT ROLE ADMIN TO ROLE groupName. Run SELECT * FROM sys.membership to verify the new record exists.
    • Dividing a float by 0 now returns NaN if Gandiva-based execution is enabled. If disabled, Dremio will follow the standard SQL behavior for dividing by 0, which is to display an exception error message.
    • IEEE-754 divide semantics has also been added, which is activated through the planner.ieee_754_divide_semantics support key. When a positive/zero/negative float is divided by zero, then infinity, NaN or negative infinity is returned.

    Changed Behaviors

    Jobs & Job Detail New UI On By Default

    The new Jobs and Job Details pages first introduced in Dremio v18.0 is now activated by default for all customers that upgrade to v19.0+. Previously, this functionality required manual enablement via support keys to access.

    Removal of Mixed Data Types Support Key

    The support key that allowed customers to enable mixed types despite the deprecation implemented in Dremio v18.0 has now been removed.

    Fixed Issues

    After upgrading to Dremio 18.0, the interface became slow and responsive as a result of access control migrations.
    This issue has been addressed so that the interface is more responsive while migrations are taking place.

    After upgrading to 18.0, customers began experiencing permissions issues when attempting to use HDFS-based sources.
    This issue has been resolved by removing permission checks for Iceberg metadata datasets.

    While using the Jobs page, text was not displaying in the Search box.
    This issue has been addressed with some CSS fixes to make search text visible.

    Customers encountered incorrect results or no rows returned after performing queries with a false predicate in an OR condition.
    This issue has been resolved by preventing the generation of incorrect partition filter conditions. Additionally, the limit for converting conjunctive normal forms has been increased.

    After upgrading to Dremio v18.0+, users encountered failures when using the now() function in reflections.
    This has been resolved by adding the now() function back to the reflection whitelist so that it can be used with reflections, despite being a context-sensitive function.

    The Open Results link was missing from the new Job Details page UI.
    The Open Results functionality has been added to the top-right corner of the Job Details page.

    The ASSERT_DECIMAL SQL function was not being properly handled in the java code generated, which resulted in infinite recursions while rewriting unions during materialization.
    This issue has been resolved with improvements to the handling of the ASSERT_DECIMAL function.

    The new Jobs page didn’t show the name of the reflection a refresh job was run with.
    The Reflections Created section is now included to show reflection information, such as name.

    Customers trying to promote a strict OXML file would encounter failures due to the format being unsupported. However, the error message incorrectly stated failure to create parser for entry:<filepath>.
    The exception is now caught and an improved error message is shown: Strict OXML is not supported. Please save file: <filepath> in standard (.xlsx) format.

    Due to java version mismatches, the Dremio service cannot start due to a related timeout and AWSE AMI is rendered unusable.
    This issue was fixed by adding scripts that update services when first packing AMI. Background security updates are then disabled to prevent further updates once the Dremio and Zookeeper services start. \

    Dremio couldn’t read the partitionValues map in the Delta Lake Parquet checkpoint log file when written by older Parquet writers.
    This issue has been resolved by enabling Dremio’s checkpoint Parquet reader to read map fields without assuming field structure.

    Dremio encountered issues where Parquet file date columns showed incorrect results and the service autocorrects them.
    This issue has been resolved by disabling the store.parquet.auto.correct.dates support key, which was used by default in older versions of Dremio to resolve incorrect dates written by Apache Drill.

    Logs on AWSE switched to writing to dremio.backup on startup.
    This issue has been resolved for AWSE users by implementing multiple files for GC logs with timestamp postfixes formatted so that the most-recent file has the .current postfix.

    Queries performed with PostgreSQL encountered issues with braces in UNION statements.
    When generating SQL for RDBMS sources, both sides of UNION [ALL] statements now have surrounding braces.

    A timer issue falsely caused timeouts when Plan Cache is enabled due to the timer failing to stop after a plan cycle.
    This issue has been resolved in instances where Plan Cache is enabled, and the timer now stops when the plan cycle ends.

    Applying CONVERT_FROM(,'JSON') on a reflection is not possible due to dataset field information in the __accelerator source being unable to update.
    This issue has been resolved to allow the use of CONVERT_FROM(,'JSON') on reflections.

    When Delta Lake transactions are committed, the checkpoint Parquet files generated are multi-part, which Dremio does not support, causing queries on the table to fail.
    This issue has been resolved by enabling Dremio to read multi-part checkpoint Parquet files for Delta Lake tables.

    “Legacy” versions of RDBMS connectors caused errors while querying.
    This issue has been resolved by updating all legacy RDBMS connectors with the more advanced ARP versions of the sources, where if functionality is not supported in the data source, it is no longer executed by Dremio. this update of ARP sources will take place automatically.

    Datasets with lists and structs encountered poor performance with queries containing a WHERE clause.
    This issue has been resolved by improving the performance of the copier used to copy values from complex types, such as lists and structs.

    Users encountered an exception for indexes being out of bounds when performing queries.
    This issue has been resolved by performing null checks on columns and offset indexes.

    Larger Hive extended table properties with concurrent queries running caused Garbage Collection and executors became unresponsive.
    This issue has been resolved by reducing the heap footprint for Hive extended table properties.

    Oracle table synonyms were not being recognized or found in Dremio.
    This issue has been resolved by altering how synonyms are retrieved.

    Users experienced performance degradation for certain join queries that contained extra conditions in conjunction with equi join.
    This issue has been addressed by extending the hash join operator to support extra conditioners for such queries.

    Min/max accumulations for variable length fields used in GROUP BY clauses were not spilled, causing failures after running out of direct memory.
    This issue has been resolved by adding min/max variable length field aggregation to the vectorized hash aggregator operator, which can spill and thus doesn’t run out of memory.

    The Cancel Query button is missing from the new Jobs page UI.
    This issue has been resolved by adding the button to the interface.

    Job duration is not incremented while the page is displayed using the new Job UI.
    This issue has been addressed by adding a socket call to the API.

    When attempting numerous concurrent metadata refreshes for AWS, S3, and GCS sources, these would fail due to service rate limits.
    This issue has been addressed by implementing an exponential back-off policy that performs up to 10 retries upon encountering throttling errors from cloud sources. Should such errors persist beyond the retry limit, review the workaround documented under Known Issues.

    With the new Jobs UI activated, upon clicking the View Details button, users could not bookmark or save the page for a single job.
    This issue has been addressed by allowing users to open a job’s details from the same browser tab as the View Details button was clicked on.

    Known Issues

    The following are known issues with Dremio v19.0 and will be resolved in future maintenance releases:

    • HDP-2 and HDP-3 installations not yet certified for Dremio v19.0. Any changes to this status will be made known here.
    • Some queries may encounter an error for org.apache.iceberg.exceptions.CommitFailedException. This only occurs under high concurrency metadata refresh and reflection refresh operations. To avoid this issue, it is recommended that administrators enable the nessie.kvversionstore.max_retries support key and increase the default value.
    • If an administrator tries to change a local/internal user’s account details (e.g., first name, last name, email address) from the User settings page, an Invalid password error message will display upon trying to save changes. To work around this, administrators must also provide either the user’s existing password or a new password to complete the change. If a user’s password is changed, please notify the affected user.

    19.1.0 Release Notes (November 2021)

    Fixed Issues

    When attempting to perform reporting based upon tag names, users encountered issues where the AWS tag Role was already being used by the executors having the tag value Executor. This issue has been addressed by modifying the Role tag to dremio_role in its place. Coordinators will continue to not use this tag.

    When users viewed Operator details, insufficient information was available to illustrate the effectiveness of runtime filtering. No information was provided regarding whether filters were sent to the scan or even just dropped..
    This issue has been addressed by adding more runtime information, such as metrics, to help users understand the effectiveness of runtime filtering.

    If an administrator attempted to save changes made from the Edit User screen, Dremio displayed an Invalid password error, forcing the admin to either enter the user’s existing password or set a new password to preserve changes.
    This issue has been addressed by no longer sending passwords to Dremio unless a value exists in the Password field.

    Users encountered unexpected coordinator restarts when failures were encountered using external ZooKeeper clients.
    This issue has been addressed by resolving a race condition that provides additional time for a response from the ZooKeeper client.

    The Dremio service performed credential validations for every cloud data source, causing excessive system checks.
    This issue has been addressed by simplifying initialization and refactoring Azure token fetching.

    Users encountered an issue with Iceberg data in S3 returning NULL for all columns after upgrading to Dremio v19.0. This is due to Parquet files in the Iceberg table not containing column IDs.
    This issue has been addressed by enabling a new support key, dremio.iceberg.fallback_to_name_based_reader. This allows Dremio to revert to name-based reading.

    When using the unlimited splits execution flow preview feature, users received incorrect results.
    This issue has been addressed by adding node conditions to the Iceberg digest.

    While populating schema for Iceberg scan operators, the service generated columns incorrectly when a complex field reference was inserted into the scan.
    This issue has been addressed by referencing the correct property.

    While performing expansion matching, Dremio goes through each expansion node in the query plan and terminates them individually, which adds to time spent in normalization.
    This issue has been addressed by improving reflection matching by associating expansion nodes to individual node levels.

    Users encountered an issue with cluster state where all engines were running, but the cluster was not flagged as RUNNING.
    This issue has been addressed by adding failsafe logic to fix the cluster state if all engines are running, but the cluster is not marked as such.

    Users encountered issues with the PUT api/v3/reflection/{reflectionId} API when Dremio encountered unknown fields.
    This issue has been addressed by altering the API to not return errors when a payload includes canView and canAlter fields. Changes to these fields' values will be ignored.

    Known Issues

    • Some users may find themselves able to delete an engine without encountering any warnings, even if queries are associated and a queue assigned. The deleted engine name will still be visible from the UI, such as the Engine column of the queue, but no new jobs will be assigned to it.

    19.2.0 Release Notes (Enterprise Edition Only, November 2021)

    Enhancements

    Min/Max on Variable-Length Columns moved to VectorizedHashAgg Spill Operator

    When a query used min/max on a varlength column or ndv accumulator, it was serviced by a no-spill operator. This functionality has been moved to a new VectorizedHashAgg spill operator so that in the event of out-of-memory (OOM) exceptions, the operator continues to spill onto the disk until the query completes rather than forcing a query rewrite.

    Fixed Issues

    When debugging runtime filters, users encountered insufficient information from the operator details about which runtime filter a Scan Operator received or dropped.
    This issue has been addressed by adding runtime filter information to Scan Operator details.

    AAD authentication users wanted to grant a role root access to containers, but without the new access control feature setup as a requirement for doing so.
    This issue has been addressed by creating root path support for the AAD application and limited ACL permissions.

    Customers couldn’t query min/max variable length fields on datasets due to query failure. These failures were caused by insufficient memory due to the group by clauses being unable to spill.
    This issue has been addressed by adding variable length fields to the vectorized hash aggregator operator, which allows spills.

    Users encountered a “Failed to decode column” error when promoting and querying v2 page type Parquet files.
    This issue has been addressed by adding decompression logic for v2 page types.

    When using AWS profile-based authentication, Dremio did not cache credentials, which forced the service to repeatedly perform checks.
    This issue has been addressed by allowing Dremio to cache credentials and perform periodic checks. To enalbe this functionality, enable the com.dremio.aws.cred.refresh.interval flag. It is recommended that this interval be set to a minimum of 60 seconds.

    When users performed queries using aggregated join, the reflection doesn’t match.
    This issue has been addressed by enhancing aggregated reflection matching by trimming user queries and pushing aggregate below the join operator.

    User queries encountered Gandiva exceptions indicating that Dremio “could not allocate memory for output string."
    This issue has been addressed by fixing an unexpected behavior within the SPLIT_PART function.

    19.3.0 Release Notes (Enterprise Edition Only, January 2022)

    Enhancements

    Microsoft Azure Synapse Analytics Support

    An ARP connector is now available on Dremio that allows for integration with Azure Synapse Analytics dedicated SQL pools. This option is available for immediate use by adding a new External Source from the Dremio interface.

    Logback Updated

    Logback was updated to v1.2.9 to mitigate CVE-2021-44228. This utilizes a new version of the library, which disables certain JNDI features known to cause issues with log4j 2.x. While Dremio is not vulnerable due to logback configurations being inaccessible externally and not using JNDI/JDBC features, this was done as a general security best practice.

    Updated Azure Storage Library

    Updated Dremio’s supported version of the Azure.Storage.Common library to v12.14.1, at the recommendation of Microsoft. Organizations using older versions of Azure storage libraries occasionally encountered data corruption issues, which is addressed with the newer SDK version.

    Fixed Issues

    The FilesystemScanDrel operator could skip printing out partition-filtering information, which caused Dremio’s query planner to utilize incorrect operators in later phases of the query execution. This issue has been addressed by forcing all information to print out with the FilesystemScanDrel operator.

    Users encountered an AssertionError when attempting to add expressions of different types to a set. This issue has been addressed by updating the version of Calcite used (CALCITE-2946).

    Users encounter issues with data sources with BIT columns mapped to BOOLEAN, as pushdowns are disabled for such columns. This issue has been addressed for sources that do not support booleans by expanding boolean columns into integer comparisons (e.g., COL = 1).

    19.4.0 Release Notes (Enterprise Edition Only, January 2022)

    Fixed Issues

    • In cases involving multiple tables in joins along with filters, RDBMS query pushdown could result in queries that ambiguously reference columns, resulting in invalid identifier errors.
    • In some cases, running SELECT * on some system tables was failing with the following error: UNAVAILABLE: Channel shutdown invoked
    • Intermittent jobs were failing with an IndexOutOfBounds exception while preparing operator details information for runtime filters.
    • When attempting to download certain query results as JSON or Parquet files, the downloaded file size was zero bytes and resulted in an IndexOutofBounds exception.
    • In some cases, certain SELECT queries that included an ORDER BY statement were returning the following error: Serialization is only allowed for SelectionVectorMode.NONE

    19.4.1 Release Notes (Enterprise Edition Only, February 2022)

    Enhancements

    • This release improves Sql-To-Rel performance by eliminating slower single-use maps and using lambda functions.

    Fixed Issues

    • Some fields were not getting trimmed after the logical phase during queries, resulting in degraded performance.
    • Plan serialization time was not being accounted for in the Sql-To-Rel conversion phase, resulting in planning time missing from profiles as well as longer than usual planning times.

    19.5.0 Release Notes (Enterprise Edition Only, March 2022)

    Enhancements

    • In this release, Dremio is now pushing down computation for extra hash join conditions.

    Fixed Issues

    • Reflection refreshes that were initially successful would stop working after some time. If a request for a reflection refresh was submitted (scheduled, via the UI, or via REST API calls), no job appeared in the jobs queue because the job was never started.
    • Fixed an issue that was causing sockets to remain in a CLOSE_WAIT state while running metadata refresh on an ORC dataset. This resulted in Too Many Open File errors and the cluster had to be restarted to resolve the condition.
    • In environments with high memory usage, if an expression contained a large number of splits, it could eventually lead to a heap outage/out of memory exception.
    • In previous versions of Dremio, for some relational sources that did not support boolean type, using the CAST function to expand a boolean value to a boolean expression was resulting in an Incorrect syntax near the keyword 'AS’ error.
    • In some queries, window expressions were not getting normalized after substitution, resulting in a Cannot convert RexNode to equivalent Dremio expression error.
    • Queries that worked in previous versions of Dremio were failing with the following error: Job was cancelled because the query went beyond system capacity during query planning. Please simplify the query
    • Some complex join filters were getting dropped, resulting in incorrect query results.
    • Some SELECT queries on parquet files were failing with a DATA_READ ERROR: Failed to decode column SRC::varchar error.
    • Some fields were not getting trimmed after the logical phase during queries, resulting in degraded performance.
    • The IS_MEMBER() function was not working with internal roles, returning false when it should have been returning true.
    • The same SELECT query, using the IS_MEMBER() function, was returning different results in different versions of Dremio.
    • When formatting GCS data at a folder level into a PDS or when selecting data from an existing PDS built on GCS, if any data values in the partitioning field included a space, the action would fail with: RuntimeException: the specified key does not exist

    19.5.1 Release Notes (Enterprise Edition Only, March 2022)

    Issues Fixed

    • When running a specific query with a HashJoin, executor nodes were stopping unexpectedly with the following error: SYSTEM ERROR: ExecutionSetupException

    19.6.0 Release Notes (Enterprise Edition Only, April 2022)

    Issues Fixed

    • Some CASE queries that ran successfully in previous versions were failing with FUNCTION ERROR: Cannot parse input: Not Available with pattern : #.
    • When running a specific query with a HashJoin, executor nodes were stopping unexpectedly with the following error: SYSTEM ERROR: ExecutionSetupException
    • In this release, json-smart was upgraded to version 2.4.8 to address CVE-2021-27568.
    • CAST operations were added to pushdown queries for RDBMS sources to ensure consistent data types, and specifically for numeric types where precision and scale were unknown. In some cases, however, adding CAST operations at lower levels of the query was disabling the use of indexes in WHERE clauses in some databases. Dremio now ensures that CAST operations are added as high up in the query as possible.

    19.6.1 Release Notes (Enterprise Edition Only, April 2022)

    Issues Fixed

    • The is_member SQL function was failing with UnsupportedOperationException when concatenating with a table column.

    19.6.2 Release Notes (Enterprise Edition Only, April 2022)

    Issues Fixed

    • Running ALTER PDS to refresh metadata on a Hive source was resulting in the following error: PLAN ERROR: NullPointerException.

    19.6.3 Release Notes (Enterprise Edition Only, May 2022)

    Enhancements

    • This release includes a new argument for the dremio-admin clean CLI command to purge dataset version entries that are not linked to existing jobs. See Clean Metadata for more information. Actions
    • The -j argument of the dremio-admin clean CLI command has been extended to purge temporary dataset versions associated with deleted jobs. See Clean Metadata for more information.

    Issues Fixed

    • When a CASE was used in a WHERE filter with an AND or an OR, it would be incorrectly wrapped in a CAST, resulting in the following error: DATA_READ ERROR: Source 'sqlGrip' returned error 'Incorrect syntax near the keyword 'AS'.'

    19.7.0 Release Notes (Enterprise Edition Only, May 2022)

    Issues Fixed

    • When a CASE was used in a WHERE filter with an AND or an OR, it would be incorrectly wrapped in a CAST, resulting in the following error: DATA_READ ERROR: Source 'sqlGrip' returned error 'Incorrect syntax near the keyword 'AS'.'
    • Fixed an issue that could result in duplicate column names being written by the planner when an expression in the project included a field named *.
    • The is_member SQL function was failing with UnsupportedOperationException when concatenating with a table column.
    • At times, in Dremio’s AWS Edition, the preview engine was going offline and could not be recovered unless a reboot was performed.
    • Some queries were taking longer than expected because Dremio was reading a STRUCT column when only a single nested field needed to be read.
    • Running ALTER PDS to refresh metadata on a Hive source was resulting in the following error: PLAN ERROR: NullPointerException

    19.8.0 Release Notes (Enterprise Edition Only, May 2022)

    Enhancements

    • This release includes a new argument for the dremio-admin clean CLI command to purge dataset version entries that are not linked to existing jobs. See Clean Metadata for more information.
    • The -j argument of the dremio-admin clean CLI command has been extended to purge temporary dataset versions associated with deleted jobs. See Clean Metadata for more information.

    Issues Fixed

    • Updated the Postgres JDBC driver from version 42.2.18 to version 42.3.4 to address CVE-2022-21724.
    • Dremio was generating a NullPointer Exception when performing a metadata refresh on a Delta Lake source if there was no checkpoint file.
    • A NULL constant in reflection definition was causing a type mismatch while expanding the materialization.
    • When using Postgres as the data source, expressions written to perform subtraction between doubles and integers, or subtraction between floats and integers, would incorrectly perform an addition instead of the subtraction.
    • In some cases, out of memory errors on Delta Lake tables were occurring if commitInfo was the last line of the JSON, resulting in incorrect file estimates for netFilesAdded, netBytesAdded, and netOutputRows.
    • After enabling Iceberg, files with : in the path were failing with a Relative path in absolute URI error.
    • Fixed a column index issue in RelMetadata that was resulting in some queries on views failing with VALIDATION ERROR: Using CONVERT_FROM(*, 'JSON').
    • Plan serialization time was not being accounted for in the Sql-To-Rel conversion phase, resulting in planning time missing from profiles as well as longer than usual planning times.
    • An issue with plan serialization was causing longer than usual planning times.

    19.8.1 Release Notes (Enterprise Edition Only, June 2022)

    Issues Fixed

    • The dremio-admin clean CLI parameter -d (or --delete-orphan-datasetversions) was deleting named dataset versions during clean-up. With this release, only temporary tmp.UNTITLED dataset versions will be deleted.