Discovering a new column in the MongoDB reader can cause a schema change error
If a new top-level column was discovered after the first batch, a schema change error was thrown, but the new column was not added to the known schema, so the query would fail again on the next attempt. This is fixed so that this case no longer throws a schema change error.
Execution threads can get into a bad state if a specific exception is thrown
When an execution task was ready to yield and an exception was thrown inside the retire method of FragmentExecutor, it was not handled properly, putting the task into a bad state that prevented it from running and caused the same error to be logged repeatedly. The fix ensures executing tasks remain in a consistent state even when such exceptions are thrown.
Deleting a container such as a source, space, or folder should delete the reflections of the datasets under the container
When a container such as a source, space, or folder was deleted, the underlying datasets were removed, but any reflections on those datasets were not. This is fixed by scheduling a periodic cleanup task that removes reflections whose related datasets no longer exist. The cleanup runs every 4 hours and on restart.
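The cleanup described above can be sketched as follows. This is an illustrative sketch only, not Dremio's actual implementation; all names (`cleanup_orphaned_reflections`, `dataset_exists`, and the data shapes) are hypothetical.

```python
import threading

# Runs every 4 hours and once on startup, mirroring the fix:
# drop reflections whose backing dataset no longer exists.
CLEANUP_INTERVAL_SECONDS = 4 * 60 * 60

def cleanup_orphaned_reflections(reflections, dataset_exists):
    """Remove reflections whose related dataset is gone.

    reflections    -- dict mapping reflection id -> dataset path
    dataset_exists -- predicate returning True if the dataset still exists
    """
    orphaned = [rid for rid, path in reflections.items()
                if not dataset_exists(path)]
    for rid in orphaned:
        del reflections[rid]
    return orphaned

def schedule_cleanup(reflections, dataset_exists):
    # Run once immediately (the "on restart" pass), then reschedule.
    cleanup_orphaned_reflections(reflections, dataset_exists)
    timer = threading.Timer(CLEANUP_INTERVAL_SECONDS,
                            schedule_cleanup,
                            (reflections, dataset_exists))
    timer.daemon = True
    timer.start()
```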
Cannot add an S3 source with a space in its name
This fix addresses a failure when registering an S3 source with special characters in its name.
Issue while parsing text files with an empty first line
When working with a set of text files that included empty files, or files whose first N lines were empty, processing would fail. This fix ensures such cases do not cause query failures.
Unable to apply functions with Array output on top of each other
This behavior was caused by an issue handling complex return types from functions, which were incorrectly inferred as BOOLEAN instead of ANY.
No logging showing for YARN containers
When provisioning Dremio executors via YARN, no Dremio-related logging would appear in the YARN containers.
Failed to query Parquet files with different data types for the same column using the vectorized Parquet reader
Dremio supports having the same column with different types in different files and exposes such a field as a “Mixed Type” field. When working with Parquet files, however, queries would fail when converting columns into “Mixed Type” fields. This behavior is now fixed.
Simple limit queries are now optimized to read in a single thread
Dremio was reading in multiple threads for simple limit queries that involve no joins or aggregations (e.g., SELECT * FROM hive.employee LIMIT 20), causing delays in response. This behavior is now fixed so the table is read in a single thread.
DateTime functions are returning incorrect results
Fixed issues in the following functions: date_trunc and CAST(varcharTypeCol AS INTERVAL SECOND).
date_sub and date_add should return the same type for the same input
The date_sub and date_add functions were not returning the same output type: date_add returned a timestamp, whereas date_sub returned a date.
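The intended post-fix behavior can be illustrated with a minimal Python analogy (not Dremio's implementation): both functions accept the same input and return the same type.

```python
from datetime import datetime, timedelta

# Illustrative sketch: after the fix, date_add and date_sub return
# the same type for the same input (here, both return a datetime).
def date_add(ts: datetime, days: int) -> datetime:
    return ts + timedelta(days=days)

def date_sub(ts: datetime, days: int) -> datetime:
    return ts - timedelta(days=days)
```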
Implement Project Push-down for Excel (XLSX, XLS) readers
Excel record readers did not have the capability to project a subset of columns, but the planner rule was pushing down the project anyway. This led to frequent schema-learning problems when working with Excel files.
Recreating datasets from Parquet files with the same path
An invalid cached Parquet file footer was used when re-creating a dataset from a completely different Parquet file that happened to have the same path as the file used to create the dataset initially.
Improved support for special characters
Fixed cases where certain characters in the names of sources, spaces, and datasets would prevent some functionality from working.
Fixed status indicators for reflections and jobs
Fixed issues where in particular situations reflections and jobs would show an incorrect status.
Fixed issue where a new source could not be saved if the first attempt was not successful
If an attempt to create a source failed (e.g., due to incorrect credentials), the source was not created but its name became reserved by the system. Now, correcting the source configuration allows you to save it with the desired name.
Sample Data Source
Administrators who haven’t set up any sources yet now have the option to add sample data with a single click.
Cache hasAccessPermission by source/user according to metadata policy
Under each source's configuration, administrators can define how long permission checks should be cached. This parameter was unused before 1.1.0. Permission checks are now cached per user, per table, for up to the defined duration. Each coordinator maintains a cache of up to 10,000 permission checks.
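The caching behavior described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not Dremio's actual implementation; the class name, eviction policy, and method signatures are all hypothetical.

```python
import time

class PermissionCache:
    """Per-coordinator cache of (user, table) permission checks
    with a per-source TTL and a bounded number of entries."""

    def __init__(self, max_entries=10_000):
        self.max_entries = max_entries
        self._entries = {}  # (user, table) -> (allowed, expires_at)

    def has_access(self, user, table, ttl_seconds, check_fn):
        key = (user, table)
        now = time.monotonic()
        hit = self._entries.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]  # cached result still fresh: skip the source call
        allowed = check_fn(user, table)  # ask the underlying source
        if key not in self._entries and len(self._entries) >= self.max_entries:
            # Simple eviction policy for the sketch: drop the entry
            # closest to expiry to stay within the size bound.
            oldest = min(self._entries, key=lambda k: self._entries[k][1])
            del self._entries[oldest]
        self._entries[key] = (allowed, now + ttl_seconds)
        return allowed
```

The cache is keyed on (user, table), so two users querying the same table each trigger their own permission check, matching the per-user, per-table behavior described above.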