4.6 Release Notes

What's New

AWS Edition

  • Dremio offers a new hourly paid listing in AWS Marketplace.

  • Dremio displays a checkmark in a new Enterprise column on the Projects page for projects that were last run with Dremio Enterprise features enabled.

Other Enhancements

  • Dremio users can launch Microsoft Power BI by clicking an icon on the Client toolbar on the Dremio Datasets page. The Client toolbar allows users to launch Tableau and Power BI from Dremio. Dremio administrators can configure which clients appear on the toolbar for a Dremio project in the Client Tools pane of the Support Settings page and with a new REST API endpoint.

  • Dremio supports MongoDB 4.2.

  • The Power BI connector is now GA.

  • Dremio supports AWS Glue Catalog as a data source and automatically synchronizes all databases and tables in the Glue Data Catalog that reside in AWS S3. Dremio supports the following data formats for AWS Glue (an example query follows the list):

    • Parquet
    • ORC
    • Delimited text files (CSV/TSV)
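
    For example, once a Glue source is added, its databases and tables can be queried with standard path syntax. This is a minimal sketch, assuming a hypothetical source named glue_catalog that contains a database sales_db with a Parquet-backed table orders:

    -- Hypothetical Glue source, database, and table names.
    SELECT order_id, order_total
    FROM glue_catalog.sales_db.orders
    LIMIT 10;
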
  • A new user interface for the Elastic Engines page:

    • On the Admin page under Workload Management, Provisioning is now Engines.
    • The Engines page displays a list of all available engines for the project. Dremio administrators can add and remove workers from an engine, as well as edit and delete engines.
    • The Engine Details page contains a Queues tab, which displays the queue memory limit and the job memory limit per node.
  • A new CONVERT_TIMEZONE SQL function that converts a timestamp to a specified timezone. Two new system tables, sys.timezone_names and sys.timezone_abbrevs, list Dremio-supported timezone names and abbreviations. The function accepts up to three parameters:

    • sourceTimezone (optional)
    • destinationTimezone
    • timestamp

    If the sourceTimezone parameter is omitted, Dremio assumes that the timestamp parameter is in UTC.

    The sourceTimezone and destinationTimezone parameters accept any of the following values (example queries follow the list):

    • a timezone name from the sys.timezone_names table
    • a timezone abbreviation from the sys.timezone_abbrevs table
    • an offset, such as +02:00
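
    For example (an illustrative sketch; the specific timezone values and the timestamp literal are arbitrary):

    -- Convert a UTC timestamp to US Pacific time.
    SELECT CONVERT_TIMEZONE('UTC', 'America/Los_Angeles', TIMESTAMP '2020-07-01 15:30:00');

    -- With sourceTimezone omitted, the input timestamp is treated as UTC.
    SELECT CONVERT_TIMEZONE('America/New_York', TIMESTAMP '2020-07-01 15:30:00');

    -- Inspect the supported names and abbreviations.
    SELECT * FROM sys.timezone_names;
    SELECT * FROM sys.timezone_abbrevs;
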
  • Upgraded to the latest version of Apache Arrow, 0.17.1.

  • Sorting now uses LZ4 compression for improved performance.

  • A new environment variable in dremio-env, DREMIO_GC_OPTS, that specifies which Java garbage collector Dremio uses. Dremio supports the Garbage-First Garbage Collector (-XX:+UseG1GC) and the Parallel Collector (-XX:+UseParallelGC). By default, Dremio uses G1 GC.

    Dremio detects whether a conflict exists between the garbage collector specified with the new DREMIO_GC_OPTS environment variable and the one configured in the Set Up YARN dialog. If garbage collection is configured for YARN, that setting takes precedence; otherwise, Dremio uses the value specified by DREMIO_GC_OPTS.

  • Improved error messages for the JDBC Connector.

Changed Behavior

  • Users without permissions on a data source can no longer create a table from query results (CTAS). Dremio now checks for user permissions on a data source for the following query types (an example follows the list):

    • createTableAs
    • createEmptyTable
    • createNewTable
    • createView
    • updateView
    • dropView
    • dropTable
    • truncateTable
    • addColumns
    • dropColumn
    • changeColumn
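
    For example, both of the following statements are now permission-checked against the underlying source (a minimal sketch; the source, folder, and table names are hypothetical):

    -- CTAS: writes query results into a new table on the source.
    CREATE TABLE s3_source.sales.daily_totals AS
    SELECT order_date, SUM(order_total) AS total
    FROM s3_source.sales.orders
    GROUP BY order_date;

    -- Dropping the table is checked in the same way.
    DROP TABLE s3_source.sales.daily_totals;
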
  • The Qlik Sense client application again appears on the Analyze With menu on the Datasets page, but no longer appears on the Client toolbar.

  • The TO_TIMESTAMP SQL function supports the timezone abbreviations in the new sys.timezone_abbrevs system table.
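
    A hedged example; it assumes the TZD format element, which maps a trailing abbreviation such as PDT to an entry in sys.timezone_abbrevs:

    -- The TZD format element and the sample literal are assumptions for illustration.
    SELECT TO_TIMESTAMP('2020-07-01 08:30:00 PDT', 'YYYY-MM-DD HH24:MI:SS TZD');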

  • Data Reflections configured for incremental refreshes perform a full refresh only after schema changes that impact the Data Reflection. Schema changes that do not impact the Data Reflection no longer initiate a full refresh.

  • The Power BI Connector prioritizes Dremio.pqx rather than Dremio.mez when both files are installed. Use of .mez files is deprecated.

Fixed Issues in 4.6.0

NOT IN clause sometimes returns NULL records
Fixed an issue that caused the NOT IN clause to return NULL records when the clause contains five or more values.

Sorting performance may be impacted by compression and small batch sizes
Fixed by disabling an unnecessary flush.

Dremio fails to restart after upgrading from Dremio 4.1.8 to 4.5 on AWS Edition
Fixed by removing previously configured elastic engines in Dremio during the upgrade.

JDBC storage plugin fails while setting up SQL queries
Fixed a bug that occurred when aliases are used in the ORDER BY SQL clause.

Unable to load credentials on Amazon EKS deployments
Fixed by retrying when AWS SDK returns an SdkClientException.

Unable to delete a Dremio user with special characters in the username
Fixed an issue with escape characters.

Unable to restore a standalone Dremio deployment that uses AWS S3 for distributed storage to another standalone deployment also configured with S3 distributed storage
Fixed a misleading error message during the restore operation.

Tableau Server on Linux cannot connect to a Dremio deployment when SSL/TLS is enabled
Fixed by upgrading the SSL driver to version 1.5.0.1001 in the Dremio ODBC driver for Linux.

JDBC storage plugin sometimes fails while setting up queries on Postgres
Fixed issue that added superfluous collations.

Power BI Connector generates an error when using text filters with non-English characters
Fixed by overriding SQLGetTypeInfo to return only one type for SQL_WVARCHAR and SQL_WCHAR.

UnsupportedOperationException while trying to read HDFS data sources in Parquet format
Fixed issue with case sensitivity in column names of HDFS data sources in Parquet format.

4.6.1 Release Notes

Enhancements

  • New advanced option for AWS S3 data sources, Enable file status check, and new property for metadata storage in dremio.conf:

    debug: {
      dist.s3_file_status_check.enabled: true
    }

    These options control whether Dremio verifies that a file exists in the AWS S3 data source and in distributed storage, respectively. Both are enabled by default. If users notice failed LOAD MATERIALIZATION or DROP TABLE data acceleration jobs when using AWS S3 for distributed storage, set dist.s3_file_status_check.enabled to false in dremio.conf and disable the Enable file status check advanced option on the data source.

  • New metric, NUM_COLUMNS_TRIMMED, reports the number of trimmed columns in Parquet-formatted files.

Fixed Issues in 4.6.1

Validation error with java.io.FileNotFoundException when refreshing a Data Reflection
Fixed by disabling the new Enable file status check advanced option for the AWS S3 data source and disabling the debug.dist.s3_file_status_check.enabled property in dremio.conf.

Queries on Dremio metadata containing a WHERE clause with both LIKE and OR operators return incorrect results
Fixed by correctly pushing down OR query filter.

Executor nodes fail with ForemanException
Fixed by removing unnecessary columns and row groups from the footers of Parquet files.

When asynchronous access is disabled, Dremio is unable to gather metadata from the footers of Parquet files
Fixed by reverting to a known working Parquet footer.

Dremio crashes with java.io.FileNotFoundException
Fixed issue with data consistency during refreshes of Data Reflections for AWS S3 data sources.

Inconsistent job status reported in job profile and job details
Fixed by asynchronously handling completion events from executor nodes.

Superfluous columns are not trimmed while scanning Data Reflections
Fixed by adding a handler method.

4.6.2 Release Notes

Fixed Issues in 4.6.2

Queries fail with CONNECTION ERROR: Error setting up remote fragment execution
Fixed issue where parallelizer picked executor nodes outside of the selected engine.

