    22.0.0 Release Notes (June 2022)

    Known Issues

    • Azure Data Explorer and Microsoft Azure Synapse Analytics sources are not supported and cannot be added in the MapR edition of Dremio 22.

    • When multiple SQL statements are executed in the SQL Runner, any jobs that may have failed are not listed in the job summary below the SQL Editor.

    • The fields parameter is not returned for tables in external sources when fetching table details via /api/v3/catalog/{id} if the table has not been queried. Fixed in 22.1.1.

    • Dremio fails to parse queries on a view when the query originates from Power BI, or another JDBC/ODBC client, that has the quoting connection property set to a non-default value. Fixed in 22.1.1.

    What’s New

    • This release adds support for SQL scalar user-defined functions (UDFs), which are callable routines that make it easier for you to write and reuse SQL logic across queries. UDFs let you extend the capabilities of Dremio SQL, provide a layer of abstraction to simplify query construction, encapsulate business logic, and support row and column policies for access control.
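    As a minimal sketch, a scalar UDF might look like the following. All names are illustrative; confirm the exact supported syntax in the SQL UDF documentation:

    ```sql
    -- Define a reusable scalar function (illustrative example)
    CREATE FUNCTION area (x DOUBLE, y DOUBLE)
    RETURNS DOUBLE
    RETURN x * y;

    -- Use it like any built-in function
    SELECT area(width, height) FROM shapes;
    ```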

    • Dremio now supports row-access and column-masking policies for row and column controls over user query access to sensitive tables, views, and columns. This allows administrators to dynamically exclude or mask private data at the column and row levels prior to query execution and without physically altering the original values.
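    As a sketch, a masking policy is itself a UDF that is then attached to a column. The table, column, and function names below are illustrative, and the exact policy syntax should be confirmed in the row-access and column-masking documentation:

    ```sql
    -- Masking UDF: only a privileged user sees the real value (illustrative)
    CREATE FUNCTION protect_ssn (ssn VARCHAR)
    RETURNS VARCHAR
    RETURN CASE WHEN QUERY_USER() = 'admin@example.com' THEN ssn ELSE 'XXX-XX-XXXX' END;

    -- Attach the masking policy to a column
    ALTER TABLE employees MODIFY COLUMN ssn SET MASKING POLICY protect_ssn (ssn);

    -- Attach a row-access policy (the policy UDF is assumed to return BOOLEAN)
    ALTER TABLE employees ADD ROW ACCESS POLICY region_filter (region);
    ```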

    • This release extends existing Iceberg DML capabilities, allowing users to run DELETE, UPDATE, MERGE, and TRUNCATE statements against Iceberg tables. See SQL Commands for Apache Iceberg Tables for more information.
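    For example, assuming an Iceberg table named orders (table and column names are illustrative):

    ```sql
    UPDATE orders SET status = 'shipped' WHERE order_id = 1001;

    DELETE FROM orders WHERE status = 'cancelled';

    MERGE INTO orders o
    USING staged_orders s ON o.order_id = s.order_id
    WHEN MATCHED THEN UPDATE SET status = s.status
    WHEN NOT MATCHED THEN INSERT (order_id, status) VALUES (s.order_id, s.status);

    TRUNCATE TABLE orders;
    ```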

    • You can now add Azure Data Explorer (ADX) as a database source in Dremio. For more information, see Azure Data Explorer.

    • Autocomplete is now available in the SQL Editor. When enabled, autocomplete lets you view and insert possible completions in the editor using the mouse or the arrow keys with Tab or Enter. Autocomplete can provide suggestions for SQL keywords, catalog objects, and functions while you are constructing SQL statements. Suggestions depend on the current context. The autocomplete feature can be enabled or disabled for all users under Settings > SQL.

    • The SQL Runner now allows you to save your SQL as a script. See Querying Your Data for more information.

      • Script owners are indicated with a small orange flag next to their username. Script owners cannot be removed, nor can their privileges be changed.

      • You can share scripts with others in your organization by adding users and assigning privileges to View, Modify, Manage Grants, or Delete.

      • When adding or modifying script privileges, the View privilege is enabled automatically if any of the other privileges are enabled.

      • The option to save a script will be disabled if the user already has 100 scripts, which is the maximum per user.

    • Added support for internal schema using SQL commands, which lets users override the data type of a column instead of using the type that Dremio automatically detected.
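    As an assumed example of the MODIFY COLUMN form, where the column name appears twice (existing name and new name) followed by the overriding type; names and exact syntax are illustrative, so check the ALTER TABLE documentation:

    ```sql
    -- Override an auto-detected column type with DOUBLE (illustrative)
    ALTER TABLE source.sales.orders MODIFY COLUMN order_total order_total DOUBLE;
    ```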

    • Iceberg is the default CTAS format for all filesystem sources in Dremio 22.0.0+.

    • The DATEDIFF and ADD_MONTHS Hive functions are supported in queries.

    • The option to enable Arrow caching in advanced reflection settings has been removed because Arrow caching is not supported with unlimited splits.

    • ALTER TABLE commands are now supported to add, drop, or modify columns in MongoDB sources.

    • Users can now resize the Data panel when viewing the SQL Editor on the Datasets or SQL Runner page.
    • The Zookeeper version used in Kubernetes deployments has been upgraded to 3.8.0 to address known security vulnerabilities. As part of any upgrade, it is best practice to back up configuration files and stateful volumes.

    • Fields in MongoDB tables can be converted to VARCHAR using internal schema (ALTER commands), and incompatible data types now fall back to VARCHAR instead of failing when querying MongoDB tables.

    • If the user tried to cancel a completed job, Dremio was returning an internal server error. The message now indicates that the job may have already completed and cannot be cancelled.
    • Logging has been improved, and a more meaningful error message is now provided when invalid characters are encountered in a password or personal access token (PAT).
    • In this release, we have updated various elements of the Dremio UI to provide a more uniform and intuitive user experience.
    • In this release, many messages provided in Dremio have been updated to provide information that is more accurate and more helpful.
    • Iceberg now supports metadata functions for inspecting a table’s history, snapshots, and manifests.

    • New commands are available for the ALTER keyword. By issuing the ALTER FOLDER or ALTER SPACE command, users can now set reflection refresh routing at the folder and space levels.

    • Users can now add a primary key to a table or drop a primary key with the following commands:
      ALTER TABLE <table name> ADD PRIMARY KEY (col1, ...)
      ALTER TABLE <table name> DROP PRIMARY KEY

    • Dremio’s OAuth config now supports access_token as a valid token type to provide identity when authenticating via OpenID Connect SSO.

    • Dremio now supports OIDC + LDAP mode, which allows the use of OpenID Connect (OIDC) for authentication while still using LDAP for user and groups lookup.
    • In this release, json-smart was upgraded to version 2.4.8 to address CVE-2021-27568.
    • Updated the Postgres JDBC driver from version 42.2.18 to version 42.3.4 to address CVE-2022-21724.
    • In this release, the Apache Arrow version has been upgraded to 8.0.0 to address issues with some current functions and add support for new functions.
    • FasterXML/Jackson was upgraded to version 2.13.2 in Parquet to address a number of vulnerabilities.
    • This release includes a new consent page where you can permit Tableau to access resources on your behalf when connecting via Tableau SSO.

    • Along with ROW and ARRAY keywords, STRUCT and LIST keywords are now supported to represent complex data types:
      STRUCT < x : BIGINT, y : LIST < BIGINT >>, LIST <STRUCT < x : INT >>

    • This release adds support for the MODIFY privilege on SYSTEM that will allow non-admin users to manage Node Activity, Engines, Queues, Engine Routing, and Support Keys.

    • Dremio now supports MODIFY COLUMN on MongoDB sources, and the internal schema changes will not be erased by metadata refresh.

    • The SELECT privilege can be granted to users and roles on specific system tables, allowing those users to view the specified tables.
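    For example (user, role, and system table names are illustrative):

    ```sql
    GRANT SELECT ON TABLE sys.jobs TO USER "monitor_user";
    GRANT SELECT ON TABLE sys.nodes TO ROLE operators;
    ```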

    • In the job details page, the automatic truncation message will appear in the job summary if a query’s output was truncated.

    • Dremio admins can allow or disable the creation of local users by adding the services.coordinator.security.permission.local-users.create.enabled:<flag> setting to dremio.conf. Set the flag to true to allow local users or false to disable the creation of local users.
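    For example, to disable the creation of local users, the entry in dremio.conf would look like this (value shown as an example):

    ```
    services.coordinator.security.permission.local-users.create.enabled: false
    ```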

    • Added the UPLOAD privilege, letting non-admin users upload files to their home space. This privilege can be overridden if the ui.upload.allow support key is disabled.

    • Added a plus button to the upper-right corner of the page for spaces that allows users to quickly add a new folder, table, or view. For user home spaces, the button also allows you to upload files.
    • Dremio now supports SSO authentication from Tableau. See Tableau for more information about supported versions and configuration steps.
    • If the context is truncated in the SQL Editor, you can now hover the cursor over the field and the full context will be displayed in a tooltip.
    • Users can now see the wiki for a folder if they can access the folder, even if access is implicit via a shared dataset nested inside it.

    • This release includes a new argument for the dremio-admin clean CLI command to purge dataset version entries that are not linked to existing jobs. See Clean Metadata for more information.

    • Users who have been granted the CREATE ROLE privilege can view and update role members.

    Issues Fixed

    • Running ALTER PDS to refresh metadata on a Hive source was resulting in the following error: PLAN ERROR: NullPointerException

    • Some queries were taking longer than expected because Dremio was reading a STRUCT column when only a single nested field needed to be read.

    • On first run, some queries were failing with an assertion error at the planning stage when a complex type was defined within a view.

    • Following the upgrade from Dremio 20.x to 21.0.0, if Nessie was in use, metadata refreshes were failing with an Unknown type ICEBERG_METADATA_POINTER error.

    • The Tableau and Power BI buttons were not being shown or hidden as expected in the SQL Editor; they are now always enabled.

    • After enabling Iceberg, files with : in the path or name were failing with a Relative path in absolute URI error.

    • Reflection refresh jobs were consuming too much peak memory on each executor node.

    • CAST operations were added to pushed down queries for RDBMS sources to ensure consistent data types, and specifically for numeric types where precision and scale were unknown. In some cases, however, adding CAST operations at lower levels of the query was disabling the use of indexes in WHERE clauses in some databases. Dremio now ensures that CAST operations are added as high up in the query as possible.

    • Following an upgrade, queries with TO_NUMBER(column, '###') were failing.

    • In environments with high memory usage, if an expression contained a large number of splits, it could eventually lead to a heap outage/out of memory exception.

    • Fixed an issue that was causing the following error when trying to open a view in the Dataset page: Some virtual datasets are out of date and need to be manually updated.

    • When using Postgres as the data source, expressions written to perform subtraction between doubles and integers, or subtraction between floats and integers, would incorrectly perform an addition instead of the subtraction.

    • When running a specific query with a HashJoin, executor nodes were stopping unexpectedly with the following error: SYSTEM ERROR: ExecutionSetupException

    • At times, in Dremio’s AWS Edition, the preview engine was going offline and could not be recovered unless a reboot was performed.
    • Dremio was generating a NullPointerException when performing a metadata refresh on a Delta Lake source if there was no checkpoint file.
    • Partition expressions were not pushed down when there was a type mismatch in a comparison, resulting in slow queries compared to prior Dremio versions.
    • Fixed an issue that was causing large spikes in direct memory usage on coordinator nodes, which could result in a reboot.
    • When Iceberg features were enabled, the location in the API was incorrect for some tables in S3 sources.

    22.0.3 Release Notes (Enterprise Edition Only, July 2022)

    Enhancements

    • Azure Data Lake Storage (ADLS) Gen1 is now supported as a source on Dremio’s AWS Edition. For more information, see Azure Data Lake Storage Gen1.
    • Elasticsearch is now supported as a source on Dremio’s AWS Edition. For more information, see Elasticsearch.

    22.1.1 Release Notes (August 2022)

    Enhancements

    • Dremio now supports connecting to Amazon S3 sources using an AWS PrivateLink URL. For more information, see Amazon S3.

    • In this release, embedded Nessie historical data that is not used by Dremio is purged on a periodic basis to improve performance and avoid future upgrade issues. The maintenance interval can be modified with the nessie.kvversionstore.maintenance.period_minutes Support Key, and you can perform maintenance manually using the nessie-maintenance admin CLI command.

    • If OAuth sign-in for Tableau is enabled, all newly generated TDS files will use OAuth for authentication. If disabled, username/password authentication will be used.

    • Users with the CREATE ROLE privilege will now have access to the Roles tab under Settings, allowing them to add new roles.

    • Improved the error message that is displayed when trying to run DML commands that are not supported on views saved from Iceberg tables.
    • This release enables non-partition column runtime filters with row level pruning.

    Issues Fixed

    • The fields parameter was not returned for tables in external sources when fetching table details via /api/v3/catalog/{id} if the table had not been queried.

    • Dremio was failing to parse queries on a view when the query originated from Power BI, or another JDBC/ODBC client, that had the quoting connection property set to a non-default value.

    • In some scenarios, invalid metadata about partition statistics was leading to inaccurate row count estimates for tables, resulting in slower than expected query execution or out-of-memory issues. For each table included in a query where this behavior appears, perform an ALTER TABLE <table-name> FORGET METADATA, then re-promote the resulting file or folder to a new table. This ensures that the table is created with the correct partition statistics.
    • For some users, when clicking on certain items on the Settings page, they were being redirected to the Dremio home screen.

    • Automatic reflection refreshes were failing with the following error: StatusRuntimeException: UNAVAILABLE: Channel shutdown invoked

    • Profiles for some reflection refreshes included unusually long setup times for WRITER_COMMITTER.

    • Wait time for WRITER_COMMITTER was excessive for some reflection refreshes, even though no records were affected.

    • In Dremio’s AWS Edition, upgrades from any 21.x.x version to version 22 were failing.

    • Metadata queries (queries using the TABLE_FILES() function) that were run on tables that had been altered were failing or returning incorrect results.

    • Some database sources, such as Snowflake, Databricks Spark, and MSAccess, were showing up under Object Storage when adding a source, and they could not be browsed or managed in the Datasets page.

    • Some queries on Parquet datasets in an ElasticSearch source were failing with a SCHEMA_CHANGE error, though there had been no changes to the schema.

    • dremio-admin clean is now limited to only temporary dataset versions during cleanup.

    • Fixed an issue that was causing metadata refresh on some datasets to fail continuously.
    • Objects whose names included non-Latin characters were not behaving as expected in Dremio. For example, folders could not be promoted, and views were not visible in the home space.

    • When unlimited splits were enabled and running incremental metadata refreshes on a file-based table, running subsequent raw reflections would fail with a DATA_READ error.

    • INSERT, MERGE, UPDATE, TRUNCATE, and DELETE queries in the SQL Runner were failing with an Invalid path error when using a partial key/path.

    • In some cases, the number of records returned by CTAS or DML operations did not match the number reported in the query summary below the SQL Editor.

    • GROUP BY queries that used GROUPING SETS were failing with AssertionError.

    • If issues were encountered when running queries against a view, Dremio was returning an error that was unhelpful. The error returned now includes the root cause and identifies the specific view requiring attention.
    • When adding a new S3 source, the Encrypt connection option was not enabled by default, though it was enabled for other sources.

    • CONVERT_FROM queries were returning errors if they included an argument that was an empty binary string. This issue has been fixed, and such queries have been optimized for memory utilization.

    • When using the Catalog API to create a folder in a space, if the folder already existed in the space, the API was returning the HTTP/1.1 500 Internal Server Error instead of HTTP/1.1 409 Conflict.

    • Reflection refreshes on a MongoDB source table were failing with the following error: unknown top level operator: $not

    • The ODBC driver was ignoring the StringColumnLength with STRUCT data types, resulting in truncated results.

    • Row count estimates for some Delta Lake tables were changing extensively, leading to single-threaded execution plans.
    • In environments with high memory usage, if an expression contained a large number of splits, it could eventually lead to a heap outage/out of memory exception.

    • When a Hive source was added or modified, shared library files created in a new directory under /tmp were not being cleaned up and leading to disk space issues.

    • Fixed an issue that was causing slow query performance against Redshift datasources.

    • JDBC clients could not see parent objects (folders, spaces, etc.) unless they had explicit SELECT privileges on those objects, even if they had permissions on a child object.

    • Fixed an issue in the scanner operator that could occur when a parquet file had multiple row-groups, resulting in a query failure and the following system error: Illegal state while reusing async byte reader

    • Fixed an issue that could cause the Arrow Flight endpoint performing long queries to encounter a gRPC GOAWAY code.

    22.1.2 Release Notes (Enterprise Edition Only, October 2022)

    Enhancements

    • Added a new Admin CLI command, dremio-admin remove-duplicate-roles, that will remove duplicate LDAP groups or local roles and consolidate them into a single role. For more information, see Remove Duplicate Roles.

    Issues Fixed

    • After upgrading to Dremio 22.1.1, some coordinator nodes failed to start due to a failure in connecting to S3-compatible storage (sources or distributed storage configuration) that required path style access.

    • Following the upgrade to Dremio 22, Support Keys of type DOUBLE would no longer accept decimal values.

    • Field size for CSV files was limited to 65536 characters, and setting the limits.single_field_size_bytes Support Key to a higher value than the limit was not being honored.

    • Fixed an issue that was causing REFRESH REFLECTION and REFRESH DATASET jobs to hang when reading Iceberg metadata using the Avro reader.

    • The LENGTH function was returning incorrect results for Teradata sources.

    • Fixed an issue that was causing the status of a cancelled job to show as RUNNING or PLANNING.

    • In some deployments, using a large number of REST API-based queries that return large result sets could create memory issues and lead to cluster instability.

    • Following the upgrade to Dremio 22, some queries to Hive 2 metastore external tables with data in S3 were running considerably slower than before.

    • During the reflection matching phase, for the filter pattern in some queries the planner could generate row expression nodes exponentially and exhaust heap memory.

    • Fixed an issue that was causing a GandivaException: Failed to make LLVM module due to Function double abs(double) not supported yet for certain case expressions used as input arguments.

    • This release includes a number of fixes that resolve potential security issues.

    • In rare cases, an issue in the planning phase could result in the same query returning different results depending on the query context.

    • When skipping the current record from any position, Dremio was not ignoring line delimiters inside quotes, resulting in unexpected query results.

    • Following the upgrade to Dremio 21.2, some Delta Lake tables could not be queried, and the same tables could not be formatted again after being unpromoted.

    • Fixed an issue that was causing failures in Microsoft SQL Server queries that contained a boolean filter set to true.

    • In some cases, deleted reflections were still being used to accelerate queries if the query plan had been cached previously.

    • Clicking Edit Original SQL for a view in the SQL editor was producing a generic Something went wrong error.

    • Some queries were failing with INVALID_DATASET_METADATA ERROR: Unexpected mismatch of column names if duplicate columns resulted from a join because Dremio wasn’t specifying column names.

    • In some cases, queries using the < operator would fail when trying to decode a timestamp column in a Parquet file.

    • Parentheses were missing when generating the SQL for a view when the query contained UNION ALL in a subquery, and the query failed to create the view.

    22.1.4 Release Notes (Enterprise Edition Only, October 2022)

    Issues Fixed

    • In some cases, queries against a table that was promoted from text files containing Windows (CRLF) line endings were failing or producing an Only one data line detected error.

    • Following the upgrade to Dremio 22.1.2, when promoting JSON files to tables and building views from those tables, queries against the views were failing with a NullPointerException.

    • In Dremio 22.1.1, some queries that included a WHERE clause were failing with a NullPointerException during the planning phase.

    • Reflection footprint was 0 bytes when created on a view using the CONTAINS function on an Elasticsearch table. The reflection could not be used in queries and sys.reflection output showed CANNOT_ACCELERATE_SCHEDULED.

    • In Dremio 22.0.x, users who were not assigned the ADMIN role were getting 0-byte files when attempting to download query results, while downloads were working as expected in previous releases.

    • Fixed an issue that was causing certain queries to fail with a Max Rel Metadata call count exceeded error.

    • After changing the engine configuration, some queries were failing with an IndexOutOfBoundsException error.

    • JDBC clients could not see parent objects (folders, spaces, etc.) unless they had explicit SELECT privileges on those objects, even if they had permissions on a child object.

    22.1.5 Release Notes (Enterprise Edition Only, November 2022)

    Issues Fixed

    • The queries.log file was showing zero values for inputRecords, inputBytes, outputRecords, outputBytes, and metadataRetrieval, even though valid values were included in the job profile.

    • For Parquet sources on Amazon S3, files were being automatically formatted/promoted even though the auto-promote setting had been disabled.

    • When saving a view, data lake sources were appearing as a valid location for the view, even though such sources should not be allowed as a destination when saving a view.

    • Improved reading of double values from ElasticSearch to maintain precision.

    • An error in schema change detection logic was causing refresh metadata jobs for Hive tables to be triggered at all times, even if there were no changes in the table.

    • This release includes performance improvements for incremental metadata refreshes on partitioned Parquet tables.

    • Dremio was generating unnecessary exchanges with multiple unions, and changes have been made to set the proper parallelization width on JDBC operators and reduce the number of exchanges.

    • On catalog entities, ownership granted to a role was not being inherited by users in that role.

    • In some environments, Dremio was unable to read a Parquet statistics file in Hive during logical planning, and the query was cancelled because planning phase exceeded 60 seconds.

    • Some queries using a filter condition with a FLATTEN field under a multi-join were generating a NullPointerException.

    • When a materialization took too long to deserialize, the job updating the materialization cache entry could hang and block all reflection refreshes.

    • When trying to use a custom garbage collection value in JVM options, the option was being switched to UseParallelGC, causing performance degradation.

    • CONVERT_FROM() did not support all ISO 8601 compliant date and time formats.

    • An aggregate reflection that matched was not being chosen due to a cost difference generated during pre-logical optimization.

    • Fixed an issue causing the error “Offset vector not large enough for records” when copying list columns.

    • Fixed an issue that was affecting the accuracy of cost estimations for Delta Lake queries (some queries were showing very high costs).

    • If Dremio was stopped while a metadata refresh for an S3 source was in progress, some datasets within the source were getting unformatted/deleted.

    • Frequent, consecutive requests to the Job API endpoint to retrieve a Job’s status could result in an UNKNOWN StatusRuntimeException error.

    • Fixed an issue where Glue tables with large numbers of columns and partitions would not return results for all partitions in the table. The fix requires table metadata to be refreshed via ALTER TABLE REFRESH METADATA to take effect.

    • Updated org.apache.parquet:parquet-format-structures to address a potential security vulnerability [CVE-2021-41561].