25.x Release Notes
Releases are listed in reverse order, starting with the latest release of Dremio 25.x.
25.0.7 (July 2024) Enterprise
Issues Fixed
- In AWS Edition (AWSE), fixed an issue that could cause some sources to be in bad state after upgrading. The issue affects only AWSE version 25+. DX-93002
25.0.6 (July 2024) Enterprise
Issues Fixed
- Fixed an issue that could affect Java virtual machine (JVM) memory calculations. DX-87463
25.0.5 (July 2024) Enterprise
What's New
- Added a new Dataset API endpoint, POST /dataset/{id}/reflection/recommendation/{type}, for retrieving reflection recommendations by reflection type for a dataset. DX-89497
- You can use the export.tableau.extra-flight-connection-properties support key to disable certificate verification, for example in .tds files. DX-91831
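As a rough illustration of the endpoint path above, the following Python sketch only builds the request URL. It assumes the REST API is served under /api/v3 on the coordinator; the host, port, dataset ID, and the reflection type value "raw" are hypothetical placeholders, and authentication and the request body are not shown.

```python
from urllib.parse import quote

# Assumption: the REST API is mounted under /api/v3 on the coordinator.
API_PREFIX = "/api/v3"


def recommendation_url(base_url: str, dataset_id: str, reflection_type: str) -> str:
    """Build the URL for POST /dataset/{id}/reflection/recommendation/{type}."""
    # Percent-encode each path segment so reserved characters stay valid.
    return (
        f"{base_url.rstrip('/')}{API_PREFIX}"
        f"/dataset/{quote(dataset_id, safe='')}"
        f"/reflection/recommendation/{quote(reflection_type, safe='')}"
    )


# Hypothetical host, dataset ID, and type value, for illustration only.
print(recommendation_url("https://dremio.example.com:9047", "my-dataset-id", "raw"))
```

The resulting URL would be sent with the POST method by whatever HTTP client you already use.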
Issues Fixed
- You can now generate suggestions for aggregate reflections by clicking a button in the Reflections tab in the Dremio console. Dremio no longer automatically collects statistics and generates a suggestion when you open the Reflections tab. DX-89306
- The schema for Delta Lake tables is now captured correctly, resolving the issue that could cause a NullPointerException and failure to query the table. DX-92477
- Fixed an issue that could result in a NullPointerException when running a DML statement on an accelerated table. DX-91682
- Fixed an issue that could cause queries to fail during planning with the error "Job was canceled because the query is too complex". DX-92283
- The profile manager now renders successfully in the Dremio console even when the accelerationDetails field is skipped. DX-92066
- Fixed a bug that could cause the default selected columns for raw reflections to fail to include all columns of a dataset. DX-89497
- Fixed a ClassCastException bug in the MongoDB data source connector. DX-88468
- Dremio field size limits now apply properly for the output of the ARRAY_AGG SQL function. DX-87000
- For MongoDB tables, Dremio now adds an exclusive projection to queries if no projection exists and properly excludes dropped fields from queries. This resolves issues involving maximum field size and leaf nodes and potentially improves performance. DX-85874
- When a reflection with a refresh status of type Manual is not available for acceleration and fails to refresh, users will now see a red failure icon instead of a yellow warning icon in the Dremio console. DX-71027
25.0.4 (June 2024) Enterprise
What's New
- Dremio now supports HashiCorp Vault's Kubernetes authentication method for retrieving secret references to use for connecting to data sources and listing secrets in Dremio configuration files. Read Using HashiCorp Vault for Secrets Management for more information. DX-84473 DX-89104
- The results of previously run queries now load much more quickly. After you open a saved script in the SQL Runner, the results are automatically displayed in a summarized format if at least one job in the script has successfully completed. To load the results of a specific query, select the query tab above the results table. DX-90110 DX-90627
- For MapR deployments, removed the configuration properties MAPR_MAX_RA_STREAMS and MAPR_IMPALA_RA_THROTTLE from the dremio start script and from the code responsible for Yarn configuration. DX-88701
- Dremio now uses a cache in front of the AWS and Azure credentials APIs, which improves performance by reducing excessive CredentialsService.lookup() calls. DX-90965
- The dremio-admin backup and dremio-admin restore CLI commands now include the security folder. Manual backups of the security folder are no longer needed. DX-90705
- Improved handling of the ROLLUP aggregate SQL function. DX-83225
Issues Fixed
- Updated the following libraries to address potential security issues: DX-91055 DX-91138
  - org.postgresql:postgresql to version 42.4.5 [CVE-2024-1597]
  - com.amazon.redshift:redshift-jdbc42 to version 2.1.0.28 [CVE-2024-32888]
- Correlated subqueries that include a filter that doesn't match any rows no longer result in an error message. DX-91553
- Reflections that contain temporal functions no longer skip refreshing. DX-90966
- Fixed an issue that prevented Hive sources that use assumed roles from running asynchronous queries via the "Enable asynchronous access for Parquet datasets" option. DX-84153
- Indexing into the CONVERT_FROM SQL function for the same JSON no longer produces incorrect results. DX-90636
- The TYPEOF SQL function now returns the precise type for nested members in complex types. DX-89806
- To prevent unexpected out-of-memory errors, the Parquet vectorized reader allocates only the necessary amount of memory for scanning deeply nested structures. DX-90471
- Restoring now works properly for AWSE deployments. DX-90389
- Using autocomplete in the SQL Runner no longer causes issues with overflow. DX-87864
- In the Dremio console, ideographic spaces now display as regular spaces in the results. DX-64971
- An error message no longer appears when loading results of multiple jobs that are executed on different engines. DX-91291
- When using an IAM role and attempting to add an AWS Glue source, you no longer see an error message about loader constraint violation due to AWS Glue authentication. DX-91213
- Reflections no longer produce incorrect results due to incorrectly matching into queries that include the ROLLUP option. DX-90879
- Fixed an issue where the TO_DATE function was used with invalid options. DX-90413
- Creating a new script while on a script that displays an error message no longer causes the error message to persist. DX-90122
- Fixed an error where planning a query could fail with a "Cannot add expression to different types of set" message. DX-90032
- Switching between the tabs in the SQL editor now correctly displays the job type. DX-89787
- Fixed memory tracking issues that could cause queries to be cancelled for exceeding memory limits when the Memory Arbiter is enabled and memory utilization on the node is high. DX-89359
- Query profiles no longer show planning phases twice. DX-89262
- Reading a Delta Lake table no longer results in an error about an invalid Parquet file. DX-79957
- Reflections with row and column access control (RCAC) now produce the correct results when algebraically matched. DX-90178
- Fixed an issue that could introduce duplicate rows in the results for RIGHT and FULL joins with non-equality conditions and join conditions that use calculations. DX-92155
- View schema learning now occurs only for queries that are issued from the Dremio console or reflection refresh jobs. DX-91903
- The query planner no longer fails for queries that use experimental settings related to the bushy join optimizer. DX-91859
- Fixed the issue with incorrect dataset version sorting that could result in "not found" error messages when listing datasets in the Dremio console. DX-91598
- Reading an Iceberg table with equality deletes via Glue or Hive no longer results in an error. DX-90377
- Fixed the NullPointerException (NPE) in logging while refreshing metadata for Delta Lake tables. DX-89302
- LEAD and LAG functions with the window set to a value that is greater than 1 no longer produce incorrect results. DX-91557
- In the Dremio console, you can now properly schedule reflection refreshes. DX-90145
- Dremio now throws a concurrent error message if metadata refresh fails due to a Nessie exception. DX-54677
25.0.3 (May 2024) Enterprise
Issues Fixed
- Joins with non-equality conditions and join conditions that use calculations no longer introduce duplicate rows (while respecting desired filtering properties) into the result set. DX-90720
25.0.0 (April 2024)
The backup and restore procedures for Dremio 25.0.0 include steps for preserving the security folder and updating certain permissions on it. These steps are required so that source connections do not fail during Dremio startup. Follow the backup and restore procedures carefully when upgrading to Dremio 25.0.0.
What's New
- Enabled the memory arbiter by default in order to monitor the usage of four key operators: HASH_AGGREGATE, HASH_JOIN, EXTERNAL_SORT, and TOP_N_SORT. This usage is monitored across all queries running on an executor to improve how the executor utilizes its direct memory and to reduce OutOfMemoryException errors. DX-48798
  - If the memory arbiter detects that memory usage is too high, it reduces memory usage in these two ways:
    - Starting with the biggest consumers, some of these operators will need to reduce their memory usage, mainly by spilling to disk.
    - Memory allocations will be blocked.
- Changes to the logback configuration are now automatically applied without requiring a restart. To ensure that this feature is enabled when you upgrade to Dremio 25.0.0, take care to avoid replacing the installed conf/logback.xml file with your backup copy. DX-56684
- Enabled HASH_JOIN to spill to disk by default when the memory allocated for a query is fully utilized. DX-48798
- Out-of-the-box observability metrics are now available for user activity and jobs, such as most active users, longest running jobs, most queried datasets, and more. Go to the Settings > Monitor page to see these metrics. DX-86592 DX-83785
- Improved the robustness of the embedded metadata pointer store. DX-85034
- Added support for column mapping within Delta Lake tables, effectively supporting minReaderVersion 2. DX-62046 DX-87465
- Enabled checksum-based verification for Azure Blob Storage and Data Lake Gen 2 sources to ensure data integrity during network transfers. DX-66932
- Added support for the ARRAY_FREQUENCY SQL function. It takes an array as input and produces a MAP with array values as keys and corresponding frequencies as values. DX-67298
- You can use the Recommendations API to submit job IDs of jobs that ran SQL queries, and receive recommendations for aggregation reflections that can accelerate those queries. See Recommendations for more information. DX-68447
- Added support for creating reflections on views and tables with row-access and column-masking policies defined on any of the underlying anchor datasets. DX-68923 DX-89495
- Added support for configuring reflection refreshes to occur on a schedule. DX-68532
- Added the configuration option services.coordinator.web.auth.login_additional_latency_millis for ensuring that login successes and failures take about the same amount of time. This makes all login requests (successful or not) slower, which makes brute force attacks harder. This configuration option can be turned off. It is on by default. DX-83373
- Added the SKIP_FILE option to the COPY INTO SQL command. The SKIP_FILE option specifies that the COPY INTO operation should stop processing the input file at the first error it encounters. DX-84448
- You can now refresh reflections by using an API method, ALTER TABLE, and ALTER VIEW. You can also refresh reflections on views by using the Catalog API. DX-84529
- Added support for getting recommendations about what default raw reflections to create. DX-84616
- Added support for showing the date and time that a reflection's data was last refreshed. If the refresh is running, failing, or disabled, the value is 12/31/1969 23:59:59. The date and time are available in the Dremio console and via the Reflection API. DX-84702
- Added two new ways for starting the refresh of a reflection: DX-84774
  - On the Settings > Reflections page, hover over the row about the reflection and click the refresh icon.
  - In the Advanced view of the reflections editor, click the refresh icon above the table that describes the content of the reflection.
- Added support for reading Apache Iceberg tables with equality deletes. DX-84522
- Added support for Hive on GCS. DX-84898
- Added a new refresh status: Pending. This status means that the refresh of a reflection will begin after the refreshes of its anchor and all downstream tables and views are finished. DX-84941
- Added support for ZooKeeper 3.5.6 and later. DX-53228
- Disabled C3 caching during the loading of Parquet source files via the COPY INTO operation, thereby reducing cache contention with other query workloads. DX-85365
- Improved Dremio's capabilities for concurrent DML operations on Iceberg tables and improved error messaging for concurrent load failures. DX-85437
- Added the error message that explains the most recent failure of a refresh of a reflection to Reflection Summary objects of the Reflection API and to the SYS.PROJECT.REFLECTIONS table. No message appears if no refresh has yet been attempted, no failure has occurred, or a successful refresh has followed a failed one. DX-85499
- Added support for performing incremental refreshes on reflections that are defined on views that use joins. DX-84768 DX-85818
- Changed the tabs in the SQL Runner to display the most recent results of a query, if the results are available from the job history, without the user having to run the query again. DX-85843
- Added support for the copy_errors() table function on Parquet tables. DX-87332
- Removed the following support keys because they were enabled by default over several major releases: DX-87789 DX-87491 DX-53796 DX-87898
  - dremio.deltalake.enabled (introduced in 14.0, enabled by default in 17.0)
  - store.deltalake.hive_support.enabled (introduced as enabled by default in 24.0)
  - store.deltalake.spark_support.enabled (introduced as enabled by default in 24.1)
  - dremio.deltalake.time_travel.enabled (introduced as enabled by default in 24.2)
  - dremio.execution.support_unlimited_splits (introduced as enabled by default in 21.0)
  - dremio.iceberg.enabled (introduced in 11.0, enabled by default in 21.0)
  - dremio.iceberg.ctas.enabled (introduced as enabled by default in 22.0)
  - dremio.iceberg.rollback.enabled (enabled by default in 24.0)
- Added support for limiting access to specified databases on Glue sources. DX-87812 DX-88223 DX-88420 DX-87811
- Upgraded Netty libraries to version 4.1.104. DX-86156
- Added daily catalog maintenance tasks to trim history of views to a maximum of 50 records per view. This limits the storage needed for datasetVersions records in the KV store. DX-86156 DX-87549
- To improve reflection observability, in the Reflection tab in the settings, the Dataset column is now wider and truncates after two lines. Also, users now receive a notification if the materialization cache is uninitialized for reflections as well as a message when hovering on the status icon for reflections whose caches are initializing. DX-86891 DX-86890
- In the Reflection tab in the settings, users can now retry a refresh on all unavailable reflections. DX-86889
- Reflection recommendations are now associated with the corresponding job IDs. DX-86726 DX-86672
- Improved reliability and memory efficiency for Dremio coordinators. DX-86245 DX-86675
- Privilege changes are processed more quickly in the Dremio console. DX-87547
- To improve performance, users can now push filters past sort operations. DX-88119
- No data is read in the REFRESH REFLECTION job for reflections that are dependent only on Iceberg, Parquet, Avro, non-transactional ORC datasets, or other reflections and have no new data since the last refresh. DX-86353
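To make the ARRAY_FREQUENCY description above concrete, here is a small Python sketch of the same idea: each distinct array value becomes a key, and its value is the number of times it occurs. This is an analogy only, not Dremio's implementation, and it does not model how the SQL function treats NULL elements.

```python
from collections import Counter


def array_frequency(values):
    """Map each distinct value in the input list to its occurrence count."""
    return dict(Counter(values))


print(array_frequency([2, 1, 2, 1, 1, 5]))  # {2: 2, 1: 3, 5: 1}
```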
Issues Fixed
- Fixed the handling of SQL functions, such as LOWER, UPPER, and REVERSE, in queries on system tables. DX-52626
- Reduced the heap memory used by the SORT operator. DX-53594
- TPC-DS queries no longer fail with an error that says the table or column is not found. DX-87797
- AWSE upgrades no longer fail with the error Unexpected global state. DX-88393
- Fixed gRPC exceptions in the Dremio console due to improper handling of transient server errors. DX-25300
- The APPROX_COUNT_DISTINCT function now properly calculates the approximate count distinct rather than the exact count distinct. DX-84197
- Fixed an issue in which the Save button for reflections defined on views in spaces was enabled for public users who have only SELECT, EDIT, and VIEW REFLECTION privileges. Such users were still correctly prevented from modifying reflections, as clicking Save did nothing. DX-84684
- Discontinued the hive-universal build. As of this change, Hive 2.x sources are driven by the Hive 3 plugin in the main build. Hive 2 libraries and artifacts (and the Hive 2 Dremio plugin itself) are omitted from the installation directory. DX-85203
- Added the dremio-job-id property to the metadata for Iceberg tables in Glue sources. DX-85379
- Fixed an issue where certain queries returned incorrect results when multiple nullable columns were referenced in conditions with OR operators. DX-85581
- Added a check to determine whether users running the COPY INTO command have SELECT privileges on either the source storage location specified in the FROM clause or on each individual source file mentioned in the FILES clause. DX-85977
- Fixed an issue that allowed reflections to be created when their definitions included UDFs that contained context-sensitive functions. DX-86078
- Dremio no longer caches CURRENT_DATE_UTC and CURRENT_DATE during query planning, which was causing incorrect results. As a result, queries that use CURRENT_DATE_UTC and CURRENT_DATE may incur some additional latency in exchange for accurate results. DX-86078
- Fixed an issue that sometimes caused an aggregation reflection to be created automatically when a raw reflection was created. DX-85098
- Fixed an issue that caused a message about a failed query to appear after the switch from one SQL tab to another. DX-86514
- Fixed an issue in the SQL Runner where expanding the large data field by using the ellipsis (...) caused the results to be unresponsive when the data included DateTime objects. DX-86541
- Fixed an issue that caused the SQL function APPROX_COUNT_DISTINCT to return null instead of 0 in some cases. DX-86597
- Ensured that group policy grants are respected in AWS Lake Formation when Dremio is used with Okta. DX-86923
- Fixed an issue that occurred when "All tables" was selected in AWS Lake Formation while granting a new permission that was meant to apply to all tables within the selected database. DX-86925
- Fixed an issue that caused the details of jobs not to be updated in the Dremio console when jobs were running. DX-86983
- Fixed an issue that caused the creation of a new branch to update the context of the SQL Runner automatically. DX-87039
- Fixed an issue that could cause the skip_file option of the COPY INTO SQL command not to handle Parquet file corruption issues if they are in the first page of a row group. DX-87884
- Reduced the severity of log messages about function lookup for Hive functions so that they are no longer listed as errors. DX-83930
- The Settings button is now shown at the top-right of the page when navigating to a Nessie source. DX-88053
- Authentication with a secret resource URL now works properly for Amazon Redshift, Oracle, and PostgreSQL data sources. DX-88293
- In Kubernetes environments, the Dremio load balancer service now remains active during dremio-admin operations. DX-85396
- In Kubernetes environments, you can now write logs to a file on disk in addition to stdout. DX-68047
- Reading Iceberg tables with positional deletes no longer causes an IndexOutOfBoundsException. DX-87252
- The Details panel is no longer blank when opened from the menu in a Nessie source. DX-87923
- The commit history for MERGE commands run in the Dremio console no longer shows the user ID instead of the user email. DX-88377
- Creating a raw reflection on a dataset on which no reflections are already defined no longer creates an aggregation reflection. DX-86098
- The Go to Table button now appears on the Datasets page for tables and views when the Query on click preference is disabled. The button also appears on lineage graphs for tables. DX-85964 DX-84694
- You can prevent analytics data from being sent to Intercom by using the dremio.ui.outside_communication_disabled support key. DX-86316
- Fixed a bug that was causing sub-optimal query plans for queries with partition column filters. DX-86309
Breaking Changes
- Dremio no longer supports Java 8. A Java 11 SE JDK is now required. Failing to install a Java 11 SE JDK will result in an error at startup. In your dremio-env config files, you may need to remove any Java command line options that are not supported by Java 11 from the DREMIO_GC_OPTS and DREMIO_JAVA*EXTRA_OPTS variables. Yarn users may need to change the engine configuration to provide the path to a valid Java 11 environment by setting the JAVA_HOME environment variable in the engine properties. DX-86534
- ZooKeeper 3.4 has reached end-of-life and is no longer supported. Using ZooKeeper 3.4 will result in an error at startup. Dremio recommends ZooKeeper 3.6 or later. DX-88450
- Queries with ambiguous columns, including queries for creating views, are no longer supported and will result in an error. To prevent this, make sure the same column name is not listed twice. For example, the following query fails because id is listed twice: DX-83702 DX-86763
  SELECT * FROM (SELECT id, 2 AS id FROM (VALUES (1, 'one')) AS t(id, name))
  To resolve the issue, rewrite the query to change one of the column names and remove the ambiguity. In this example, the second id is changed to id0:
  SELECT * FROM (SELECT id, 2 AS id0 FROM (VALUES (1, 'one')) AS t(id, name))
  You must also recreate any previous views that were created using ambiguous columns.
- The 24.2-hive-universal package is deprecated in 25.0.0. If you have a Hive 2 data source, follow the instructions for upgrading to 25.0.0. We recommend that you invest extra time to test Hive 2 use cases in a test environment before deploying to production. DX-86273
- Renamed support key planner.writer.round_robin' (with a trailing apostrophe) to planner.writer.round_robin. DX-85350
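The ambiguous-column rule above comes down to checking whether any output column name of a SELECT list repeats. The Python sketch below illustrates that check on a list of output names that you have already extracted by hand or with a SQL parser. The helper name ambiguous_columns is hypothetical, and the case-insensitive comparison is an assumption, not a statement of how Dremio's planner resolves names.

```python
from collections import Counter


def ambiguous_columns(output_names):
    """Return the output column names that occur more than once."""
    # Assumption: names are compared case-insensitively.
    counts = Counter(name.lower() for name in output_names)
    return sorted(name for name, n in counts.items() if n > 1)


print(ambiguous_columns(["id", "id"]))   # ['id']  (SELECT id, 2 AS id)
print(ambiguous_columns(["id", "id0"]))  # []      (SELECT id, 2 AS id0)
```

A check like this could be run over existing view definitions before upgrading to find views that need to be recreated.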
Known Issues
- As of version 25.0.0, Dremio supports encrypted data source credentials. For this reason, when you upgrade to Dremio 25.0.0, if you want RocksDB to contain only encrypted credentials for your existing data sources, you must clear the RocksDB cache using the following steps:
  1. Run dremio-admin upgrade.
  2. Run dremio start and wait for Dremio to start up.
  3. Run dremio stop.
  4. Run dremio-admin clean --compact.
  5. Run dremio start.
  To confirm that all existing data source credentials were encrypted successfully, check the server log from step 2 for messages like these:
  2024-03-19 18:17:02,209 [main] INFO c.dremio.exec.catalog.PluginsManager - Successfully migrate the source [s3]. Took 4531 milliseconds.
  2024-03-19 18:17:02,236 [main] INFO c.dremio.exec.catalog.PluginsManager - Successfully migrate the source [glue]. Took 26 milliseconds.
  2024-03-19 18:17:02,236 [main] INFO c.dremio.exec.catalog.PluginsManager - Did not need to migrate the source [<source_name>]. Took 26 milliseconds.
  2024-03-19 18:17:02,236 [main] INFO c.dremio.exec.catalog.PluginsManager - Completed sources migration. Total: 4611 milliseconds.
- Issues may occur when reading Apache Iceberg tables with equality deletes from Hive or Glue sources. To resolve this issue, upgrade to version 25.0.4. DX-90377
- Incorrect dataset version sorting can result in "not found" error messages when listing datasets in the Dremio console. To resolve this issue, upgrade to version 25.0.4. DX-91598
  - If you cannot upgrade to version 25.0.4, mitigate this issue by setting the store.dataset.versions.limit option to a high number, such as 100000. This prevents version trimming but increases database size. When you upgrade to 25.0.4, you must restore the store.dataset.versions.limit option to the default value, 50, to control database size.
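For the credential migration check described in the first known issue, the following Python sketch scans server log lines for the PluginsManager messages quoted there and summarizes which sources were migrated. The regular expressions are derived only from those sample messages; the function name and return shape are illustrative, and other log formats are not handled.

```python
import re

# Patterns taken from the sample PluginsManager messages in these notes.
MIGRATED = re.compile(r"Successfully migrate the source \[(?P<name>[^\]]+)\]")
SKIPPED = re.compile(r"Did not need to migrate the source \[(?P<name>[^\]]+)\]")
COMPLETED = re.compile(r"Completed sources migration")


def summarize_migration(log_lines):
    """Collect migrated and skipped source names and the completion flag."""
    migrated, skipped, completed = [], [], False
    for line in log_lines:
        if "PluginsManager" not in line:
            continue  # ignore unrelated log lines
        if m := MIGRATED.search(line):
            migrated.append(m["name"])
        elif m := SKIPPED.search(line):
            skipped.append(m["name"])
        elif COMPLETED.search(line):
            completed = True
    return {"migrated": migrated, "skipped": skipped, "completed": completed}


sample = [
    "2024-03-19 18:17:02,209 [main] INFO c.dremio.exec.catalog.PluginsManager - Successfully migrate the source [s3]. Took 4531 milliseconds.",
    "2024-03-19 18:17:02,236 [main] INFO c.dremio.exec.catalog.PluginsManager - Completed sources migration. Total: 4611 milliseconds.",
]
print(summarize_migration(sample))
```

To use it on a real log, pass the lines of the server log file written during step 2, for example by opening the file and iterating over it.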