4.7 Release Notes
What’s New
Engine Tags
Dremio administrators can add engine tags for all AWS EC2 objects associated with each engine in an AWS Edition deployment. Add the tags on the Edit Elastic Engine modal in the Dremio user interface.
Note:
Dremio users cannot add engine tags when initially deploying Dremio AWS Edition because engines are created dynamically.
Arrow Caching
Dremio users can improve query performance by enabling Arrow caching, which caches Data Reflections in the Apache Arrow format. Users enable the feature for each reflection by clicking the Switch to Advanced button on the Dataset Settings modal, then toggling Arrow caching to the on position on the Settings: Raw Reflection and Settings: Aggregation Reflection modals.
Secondary Coordinator Nodes
Dremio provides improved concurrency and distribution of query loads across a deployment with secondary coordinator nodes. To enable this feature for YARN-based and standalone deployments, add the following to the dremio.conf file on each secondary coordinator node in your cluster and restart your deployment:
services: {
  coordinator.enabled: true,
  coordinator.master.enabled: false,
  coordinator.web.enabled: false,
  executor.enabled: false
}
To enable secondary coordinator nodes for Azure AKS and AWS EKS deployments, update coordinator.count in values.yaml of the Helm chart and restart your deployment.
Tip
Dremio recommends a maximum of five secondary coordinator nodes for a deployment.
Note:
Dremio deployments on AWS Edition and Azure ARM do not support secondary coordinator nodes.
Kubernetes Helm Charts
Dremio 4.7.0 features an updated Helm chart for Kubernetes-based deployments using Helm 3.0.0+. The updated chart no longer supports Helm 2.x.
Preview: Runtime Filtering
Dremio optimizes the performance of JOIN statements by dynamically applying filters from dimension tables to fact tables. Contact Dremio Customer Support to enable runtime filtering.
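The general idea behind runtime filtering can be illustrated with a small, self-contained sketch. This is not Dremio's implementation; the function name, the row layout, and the sample tables are hypothetical, and the sketch assumes the join key is unique in the dimension table. It only shows how join keys collected from a filtered dimension table can be used to discard fact rows before the join runs.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


def runtime_filtered_join(
    fact: List[Row],
    dim: List[Row],
    key: str,
    dim_predicate: Callable[[Row], bool],
) -> Tuple[List[Row], int]:
    """Join fact to dim on `key`, pruning fact rows with a runtime filter.

    Returns the joined rows and the number of fact rows that reached the join.
    Hypothetical helper for illustration only.
    """
    # Build side: apply the predicate to the (small) dimension table.
    dim_by_key = {row[key]: row for row in dim if dim_predicate(row)}

    # Runtime filter: the set of join keys that survived the predicate.
    key_filter = set(dim_by_key)

    # Probe side: drop fact rows that cannot match before joining them.
    surviving = [row for row in fact if row[key] in key_filter]

    joined = [{**row, **dim_by_key[row[key]]} for row in surviving]
    return joined, len(surviving)


# Hypothetical sample data: one dimension table and one fact table.
stores = [
    {"store_id": 1, "region": "EU"},
    {"store_id": 2, "region": "US"},
]
sales = [
    {"store_id": 1, "amount": 10},
    {"store_id": 2, "amount": 20},
    {"store_id": 1, "amount": 5},
]

joined, probed = runtime_filtered_join(
    sales, stores, "store_id", lambda row: row["region"] == "EU"
)
```

In this sketch only two of the three fact rows reach the join, because the key filter built from the dimension table removes the row for store 2 first.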
Disabling Cross Source Queries
Dremio administrators can disable single queries that select data across multiple data sources, including virtual and physical datasets. To disable cross-source SELECT statements, administrators set the planner.cross_source_select.disable support key to true. By default, this restriction is disabled.
Note
Dremio users can override this feature by checking Enable this source to be used even though Disable Cross Source is configured on the Advanced Options tab of the Edit Source modal.
Separation of Data Lake and External Data Sources
The Dremio Datasets page now separates data sources into two categories:
- Data Lakes - filesystem and table-based data sources
- External Sources - RDBMS and NoSQL-based data sources
Changed Behavior
- Dremio maps the DATE type in Oracle data sources to the TIMESTAMP Dremio data type. Prior to version 4.7.0, Dremio mapped DATE to the Dremio DATE data type. Developers using Oracle data sources must update their application code to expect a TIMESTAMP data type rather than a DATE type before running their applications against a deployment upgraded to Dremio 4.7.0.
- Kubernetes-based deployments no longer support Helm 2.x. Dremio 4.7.0 features an updated chart that requires Helm 3.0.0+.
- Dremio disables the Edit Settings button on the Engines page while an engine is starting or being deleted.
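For the Oracle DATE mapping change, client code that previously received date values may now receive timestamp values. The following is a minimal sketch of one way to make application logic tolerate both. It assumes the client driver surfaces DATE columns as Python `datetime.date` objects and TIMESTAMP columns as `datetime.datetime` objects, which depends on the driver in use; the helper names are hypothetical.

```python
from datetime import date, datetime
from typing import Union


def as_date(value: Union[date, datetime]) -> date:
    """Return a plain date whether the column arrived as DATE or TIMESTAMP."""
    # datetime is a subclass of date, so it must be checked first.
    if isinstance(value, datetime):
        return value.date()
    return value


def as_timestamp(value: Union[date, datetime]) -> datetime:
    """Return a datetime, padding a plain date with midnight."""
    if isinstance(value, datetime):
        return value
    return datetime(value.year, value.month, value.day)


# Hypothetical values for the same Oracle DATE column before and after the upgrade.
before_upgrade = date(2020, 9, 1)
after_upgrade = datetime(2020, 9, 1, 0, 0, 0)

same_day = as_date(before_upgrade) == as_date(after_upgrade)
```

Normalizing at the boundary keeps the rest of the application unchanged. Code that compares raw values directly should be reviewed, because in Python a `date` and a `datetime` for the same day do not compare as equal.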
Fixed Issues in 4.7.0
Selected table has no columns error when querying INFORMATION_SCHEMA.COLUMNS
Fixed by detecting missing fields in BatchSchema entries for tables.
TIMESTAMPDIFF() function returns different values with the same inputs for optimized queries
Fixed issue with timestampdiffMonth() function when the end date is the last day of a month.
NamespaceNotFoundException: one or more elements on the path are not found in namespace
Fixed by exploring dataset rather than parsing resourceId.
The catalog API fails to remove artifacts after a failed PUT request to the /api/v3/catalog REST API
Updated the validation of AWS S3 configuration paths.
Query returns AssertionError: Type mismatch
Fixed mismatched type during fragment initialization.
/api/v3/catalog/<id>/graph REST API returns NullPointerException
Updated the logic that addresses data sources for virtual datasets.
SQL Editor displays Failed to parse date time value <timestamp> in field timestamp error
Fixed bug that generated child field paths with an incorrect parent path.
Tableau Server is unable to fetch data over an ODBC/JDBC connection and reports the Synchronous socket write failed with error: An established connection was aborted by the software in your host machine error
Fixed by updating the logic that maps RpcException to SQLStateCode in the JDBC driver.
Unable to relaunch Dremio AWS Edition with CloudFormation due to permissions error
Removed hard-coded IAM role and security group to launch Dremio AWS Edition with CloudFormation.
Inconsistent results when using an OR clause while querying INFORMATION_SCHEMA
Fixed an incorrect expression filter push down in queries with an OR clause.
Dremio user interface crashes when editing decimal fields
Fixed bug in handling decimals in the bar chart.
Queries against Postgres datasets fail with ERROR: window functions are not allowed in window definitions error
Fixed bug that created a nested window in RelToSqlConverter.
Performance issues when a Dremio user repeatedly queries the same Teradata table
Fixed bug that calculated the row count twice for an RDBMS plugin.
Inconsistent results when querying virtual datasets
Fixed a race condition during node completion.
UNAVAILABLE: Channel shutdown invoked error
Fixed bug where jobclient re-caches the channel.
4.7.1 Release Notes
Fixed Issues in 4.7.1
Executor node crashes during query
Moved execution of casts to the INT and BIGINT data types with the CAST SQL function from Gandiva to Java.
Dremio fails to deduplicate project expressions
Added code to ensure Dremio deduplicates prewarmed expressions correctly.
4.7.2 Release Notes
Fixed Issues in 4.7.2
Text overflows in the Datasets navigation pane in Firefox and Safari
Fixed by using an overflow property to hide out-of-view spaces and data sources.
4.7.3 Release Notes
What’s New
External Queries
External queries enable Dremio users to query relational database data sources using the SQL syntax native to that database. Users can save the results of their external queries as Dremio virtual datasets and enable Data Reflections.
Fixed Issues in 4.7.3
Queries fail on MongoDB data sources when Dremio pushes down “IS NULL” and “IS NOT NULL”
Optimized pushdown for MongoDB by using a comparison expression rather than an aggregate expression and fixed a bug that returned a superfluous resultset.
Unable to specify a storage class in the Helm chart
Fixed a typo that prevented custom storage classes from being used in Dremio Helm Chart v2.
Dremio returns inconsistent results for vectorized JOIN SQL statements when Runtime Filtering is enabled
Set the validity bit from right for VECTORIZED_BIGINT mode.