1.4.4 Release Notes
Sharing UI user/group removal issues (EE only)
Fixed issue removing users or groups from Sharing settings.
1.4.3 Release Notes
Handle distinguished names for LDAP group membership lookups (EE only)
Dremio can now handle looking up group names using DNs when using Attribute Based Group Configuration in LDAP.
1.4.2 Release Notes
Better handling of non UTF-8 characters
Dremio now supports CONVERT_FROM() functions for dealing with non UTF-8 string values. Users can validate, replace and omit invalid characters as needed.
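The three strategies named above (validate, replace, omit) can be illustrated outside Dremio. The following is a minimal Python sketch of the general idea, not of Dremio's CONVERT_FROM() syntax; the helper names and the sample bytes are assumptions made for illustration.

```python
# Illustrative only: these helpers are hypothetical and are not Dremio functions.
RAW = b"abc\xffdef"  # the byte 0xFF can never appear in valid UTF-8


def is_valid_utf8(data: bytes) -> bool:
    """Validate: report whether the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def replace_invalid(data: bytes) -> str:
    """Replace: substitute U+FFFD for each invalid byte sequence."""
    return data.decode("utf-8", errors="replace")


def omit_invalid(data: bytes) -> str:
    """Omit: silently drop invalid byte sequences."""
    return data.decode("utf-8", errors="ignore")
```

Which strategy fits depends on whether downstream consumers need to know that data was damaged (validate or replace) or only need clean text (omit).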
Improved resiliency when detecting master node failures
Dremio now handles detecting master failures and electing a new master node more robustly.
Improved logging of sort errors
Improved debug tracing for the sort operator.
IndexOutOfBoundsException when using the get_json_object Hive UDF
Buffer expansion could fail when the get_json_object function was called with a large object. This issue is now fixed.
Handle batch sizes larger than 2^16 in hash join operator
The hash join operator now correctly handles batch sizes larger than 2^16.
Handle boundary condition in nested loop joins
Nested loop join logic now handles boundary cases more robustly. Previously, problems could occur when the output ended on a batch boundary.
1.4.1 Release Notes
MongoDB collection previews
Dremio now uses the Mongo sample operator when previewing MongoDB collections.
use hbase command in HBase sources
Previously, Dremio would switch the query context to HBase sources' internal hbase system namespace. This behavior is now disabled. In the unlikely event that users need to access the system namespace, it's available through the fully qualified dataset name (e.g.
Parallelization fixes for preview queries
Preview queries now have expanded samples and better parallelization.
1.4.0 Release Notes
Support custom paths as root of the FileSystem in MapRFS/HDFS sources
In earlier versions, Dremio only supported using the HDFS/MapRFS root directory as the starting point of a source. Users can now add a custom path in MapRFS/HDFS as the root of the source. Users will not be able to access any files outside the custom path given as root.
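The restriction to files under the custom root can be pictured as a path containment check. The Python sketch below shows one common way to enforce such a rule; it is not Dremio's implementation, and the function name and example paths are hypothetical.

```python
import posixpath


def resolve_under_root(root: str, relative: str) -> str:
    """Resolve `relative` against `root`, rejecting paths that escape the root.

    Hypothetical helper for illustration; paths are treated as POSIX-style strings.
    """
    root = posixpath.normpath(root)
    # Normalizing collapses "." and ".." segments before the containment check.
    candidate = posixpath.normpath(posixpath.join(root, relative.lstrip("/")))
    prefix = root.rstrip("/") + "/"
    if candidate != root and not candidate.startswith(prefix):
        raise PermissionError(f"{relative!r} is outside the source root {root!r}")
    return candidate
```

For example, with a root of `/data/sales`, the relative path `2017/q1.parquet` resolves normally, while `../hr/salaries.csv` is rejected.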
Support pushing down NOT LIKE to Elasticsearch
Dremio can now push down the NOT LIKE operator to Elasticsearch. This is also supported when the Boolean expression is part of a larger complex expression.
Partial predicate pushdown to Elasticsearch
With this improvement, the parts of a predicate conjunction that can be pushed down to Elasticsearch are pushed down, and the remaining parts are handled in the Dremio execution engine.
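Splitting a conjunction this way is safe because every conjunct must hold for a row to qualify, so each one can be evaluated wherever it is supported. The Python sketch below illustrates the split; the predicate representation and the set of pushable operators are assumptions for illustration and do not reflect Dremio's planner or the actual list of operators Elasticsearch pushdown supports.

```python
from typing import List, Tuple

# A conjunct is modeled as (column, operator, value). This shape is hypothetical.
Predicate = Tuple[str, str, object]

# Assumed set of operators the source can evaluate; illustrative only.
PUSHABLE_OPS = {"=", "<", ">", "LIKE", "NOT LIKE"}


def split_conjunction(
    conjuncts: List[Predicate],
) -> Tuple[List[Predicate], List[Predicate]]:
    """Split an AND-ed list of predicates into (pushed_down, residual)."""
    pushed = [p for p in conjuncts if p[1] in PUSHABLE_OPS]
    residual = [p for p in conjuncts if p[1] not in PUSHABLE_OPS]
    return pushed, residual
```

The pushed-down part reduces the rows returned by the source, and the residual part is then applied to those rows by the engine.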
Coordination and metadata
High Availability support using ZooKeeper
Dremio now supports having multiple master nodes for HA purposes using ZooKeeper. This improvement simplifies HA configuration and significantly reduces overall failover times. Dremio now determines whether a node is a master node based on a specific configuration flag, rather than on whether its hostname matches the configured master host, as in previous releases.
Support for group based column and row level permissions (EE only)
Dremio now supports the is_member("groupname") function, which checks the querying user's group membership. The function returns true if the user belongs to the "groupname" group. This function can be used in filter conditions in a VDS to define permissions on rows and columns.
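To show how a membership check can drive both row-level and column-level permissions, here is a small Python simulation. It only models the idea: the `is_member` function below is a stand-in written for this sketch, not Dremio's SQL function, and the users, groups, rows and rules are all hypothetical.

```python
# Hypothetical user-to-group mapping; in Dremio this would come from the user directory.
GROUPS = {"alice": {"finance", "hr"}, "bob": {"sales"}}

ROWS = [
    {"region": "US", "amount": 10, "ssn": "111-11-1111"},
    {"region": "EU", "amount": 20, "ssn": "222-22-2222"},
]


def is_member(user: str, group: str) -> bool:
    """Stand-in for a group membership check (not Dremio's implementation)."""
    return group in GROUPS.get(user, set())


def view_for(user: str) -> list:
    """Apply a row filter and a column mask based on group membership."""
    visible = []
    for row in ROWS:
        # Row-level rule: only 'finance' members see non-US rows.
        if row["region"] != "US" and not is_member(user, "finance"):
            continue
        # Column-level rule: only 'hr' members see the ssn column unmasked.
        ssn = row["ssn"] if is_member(user, "hr") else None
        visible.append({"region": row["region"], "amount": row["amount"], "ssn": ssn})
    return visible
```

In a VDS, the same two rules would be expressed as a filter condition and a conditional column expression that call the membership function.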
Use LDAP sAMAccountName to login (EE only)
In LDAP configuration, Dremio previously supported Distinguished Name templates for usernames. Administrators can now choose a different attribute (e.g. sAMAccountName) in a DIT entry as the username for login. For details, see the LDAP section in the documentation.
Metadata indexing and caching improvements
Dremio will now execute metadata queries coming from clients much faster even when filters are present. This is achieved through improved caching and indexing of all source metadata.
Enhance Arrow Value Vectors for better performance, less heap overhead
Dremio’s in-memory query execution engine is completely based on columnar data structures and formats provided by Arrow. During testing, we saw several opportunities for improvement in performance, memory usage and code maintainability. We reworked the entire Java implementation of Arrow, focusing mainly on improving performance and reducing heap memory usage.
Aggregation reflection with join might fail to substitute
In some cases, Dremio would not accelerate join queries that should have been accelerated by aggregation reflections. This is now fixed.
Coordination and Metadata
Rank using aggregate is not working
Dremio now supports the rank function on aggregates such as sum and others.
Fix constant reduction for expression using unary minus operator
Constant expressions containing the unary minus operator (-) are now correctly reduced.
Low thread usage and high wait time when we have a large number of blocking tasks
Dremio is now better at scheduling system resources when there is a mix of CPU-bound and I/O-bound tasks in the system.
Very uneven partitions in a HashJoin
Planning for a join was partitioning the dataset into too few discrete partitions, which was causing some executor nodes to have noticeably more work than others, and hence take noticeably longer to serve a query. That has now changed to spread the work more evenly among the executor nodes.
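A small calculation shows why too few partitions leads to uneven work. The Python sketch below assumes equally sized partitions assigned round-robin to executors, which is a simplification chosen for illustration and not a description of Dremio's planner; the function name is hypothetical.

```python
from collections import Counter
from typing import List


def work_share_per_executor(num_partitions: int, num_executors: int) -> List[float]:
    """Return each executor's share of the work under round-robin assignment.

    Assumes every partition holds the same amount of data (a simplification).
    """
    load = Counter()
    for partition in range(num_partitions):
        load[partition % num_executors] += 1
    return [load[e] / num_partitions for e in range(num_executors)]
```

With 5 partitions on 4 executors, one executor receives 2 of the 5 partitions (40% of the work) while the others receive 20% each. With 64 partitions, every executor receives exactly 25%.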
IO wait times when reading from file system sources are incorrectly reported as 0
Dremio now reports IO wait times for reads from file system sources in the query profile, which helps when debugging query latency.
Query reattempt due to Out of Memory failure happens before fragment cleanup
Dremio's reattempt logic determines whether the cause of a query failure is recoverable. For Out of Memory failures, Dremio starts a reattempt. However, resource cleanup from the failed attempt was not always complete when the reattempt started, so the next attempt could fail in the setup phase, before execution began. Dremio now waits for all query fragments to be terminated or retired before issuing the next attempt.
CONCAT function fails with length > 256 characters
Some queries that use the CONCAT(arg1, arg2, …) SQL function were failing when the concatenation result was longer than 256 characters. The cause was incorrect use of internal memory buffers to store intermediate concatenation results. This is now fixed.
Web Application and APIs
Hide some Reflection UI from Jobs from non-admins
In the Job details view, you can click on a Reflection to view and modify its definition. That link was previously shown to regular users; it is now shown only to administrators.
Explore grid values hidden by scrollbar
When viewing a dataset in the UI, the scrollbars of the grid could sometimes draw over the data. This has now been fixed to behave correctly and not cover up any data.
Elasticsearch adapter uses wrong hashCode to check for changes
This incorrect hashCode would cause Dremio to think that the Elasticsearch index mapping had changed, even when it hadn't. This caused some Elasticsearch queries to fail.
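The release note does not say how the hash was computed, so the following Python sketch only illustrates the general technique of content-based change detection: two mappings with the same content should produce the same fingerprint, regardless of key order. The function name and sample mappings are hypothetical.

```python
import hashlib
import json


def mapping_fingerprint(mapping: dict) -> str:
    """Return a stable fingerprint of a JSON-like mapping.

    Keys are sorted before hashing so that ordering differences alone
    never look like a change.
    """
    canonical = json.dumps(mapping, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A change check then compares the stored fingerprint with a freshly computed one and treats the mapping as changed only when they differ.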
Dremio fails to read a Hive table containing partitions with different schemas
A Hive table can contain partitions with different schemas. This happens when the table schema is altered after one or more partitions have already been created. Partitions created after the schema change store data in the new table schema format. When reading partitions with the old schema, Dremio now converts them to the new table schema using Hive-provided SerDe utilities. No schema conversion is needed for partitions that already use the new schema. This behavior is similar to Hive's.
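The effect of converting an old-schema partition to the new table schema can be pictured with a simplified example. Dremio relies on Hive's SerDe utilities for the real conversion; the Python sketch below only models the simplest case, where columns are matched by name and columns missing from the old schema become null. The function name and schemas are hypothetical.

```python
from typing import List, Sequence, Tuple


def convert_row(
    row: Sequence[object], old_schema: List[str], new_schema: List[str]
) -> Tuple[object, ...]:
    """Project a row written with `old_schema` onto `new_schema`.

    Columns absent from the old schema are filled with None (null).
    Type conversions, which a real SerDe also handles, are not modeled here.
    """
    by_name = dict(zip(old_schema, row))
    return tuple(by_name.get(column) for column in new_schema)
```

Rows from partitions that were written with the new schema pass through unchanged, since their old and new schemas are identical.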
Dremio sometimes fails to read Hive tables with HBase storage handler
Now Dremio can read Hive tables that store data on HBase.