Upgrading to Dremio 18.0+

Dremio 18.0 speeds up queries on large datasets through several new features, including near-real-time metadata refreshes, near-real-time reflection refreshes, and unlimited splits for certain FileSystem/Hive tables. This version also removes all support for mixed data types, which can break tables that contain incompatible data types.

To avoid unexpected behavior, we recommend following this upgrade process so that every part of the upgrade works as expected.

Before You Upgrade

We encourage you to complete the following item before upgrading your instance of Dremio:

  • Contact Dremio Support or your Dremio Field Technician and use our internally-developed tool to test your existing tables and identify any mixed data types that might cause problems with querying after the upgrade.
    • For example, Hive tables containing UNIONTYPE are no longer supported after upgrading to v18.0.
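
To illustrate what "mixed data types" means in practice, the following Python sketch scans a handful of sample records and reports any column that holds values of more than one type. It is not the Dremio tool mentioned above; the function name find_mixed_type_columns and the sample records are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def find_mixed_type_columns(records):
    """Return {column: sorted type names} for columns holding more than one type."""
    seen = defaultdict(set)
    for record in records:
        for column, value in record.items():
            if value is None:
                continue  # nulls are treated as compatible with any column type
            seen[column].add(type(value).__name__)
    return {col: sorted(types) for col, types in seen.items() if len(types) > 1}


# Hypothetical sample rows: "amount" mixes a number and a string.
sample = [
    {"id": 1, "amount": 10.5},
    {"id": 2, "amount": "n/a"},
    {"id": 3, "amount": None},
]
print(find_mixed_type_columns(sample))
```

A column flagged by a check like this is the kind of candidate to review with Dremio Support before upgrading.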

Verify Functionality on Pre-Production

Complete the following steps in a pre-production system (dev or UAT) to confirm that the features introduced in Dremio v18.0 do not cause issues with your existing datasets:

  1. Add all of your datasets to this environment.
  2. Verify that all datasets work without mixed types.
  3. Enable near-real-time metadata refreshes, near-real-time metadata refresh for reflections, and the unlimited splits feature. Some customers may see slow queries because the partition stats used by the planner are inaccurate. To collect accurate partition stats as part of metadata refresh, enable the flag store.accurate.partition_stats.
  4. Trigger a metadata refresh on supported tables, for both Hive and Parquet datasets.
  5. Create reflections (both full and incremental) on these datasets. Remember that reflections marked as incremental will require a full reflection after the upgrade.
  6. Verify that your cluster has sufficient resources to complete a full reflection on your dataset.
  7. Confirm that queries on your virtual datasets (VDS or “views”) work well.
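
Steps 3 and 4 can be scripted. The Python sketch below only builds the SQL text for setting the store.accurate.partition_stats flag and for refreshing metadata on a list of datasets; it does not connect to Dremio. The flag name comes from step 3. The ALTER SYSTEM SET and ALTER PDS ... REFRESH METADATA forms are assumptions that should be checked against the SQL reference for your Dremio version, and the dataset paths and helper names are hypothetical.

```python
def quote_path(parts):
    """Double-quote each path component, doubling any embedded double quotes."""
    return ".".join('"' + part.replace('"', '""') + '"' for part in parts)


def enable_partition_stats_sql():
    # Flag name is taken from step 3; the ALTER SYSTEM SET form is an assumption.
    return 'ALTER SYSTEM SET "store.accurate.partition_stats" = true'


def refresh_metadata_sql(dataset_paths):
    # ALTER PDS ... REFRESH METADATA is assumed here; confirm it for your version.
    return [f"ALTER PDS {quote_path(path)} REFRESH METADATA" for path in dataset_paths]


# Hypothetical dataset paths: one Hive table and one Parquet folder.
datasets = [
    ("hive_src", "sales", "orders"),
    ("fs_src", "events_parquet"),
]

statements = [enable_partition_stats_sql()] + refresh_metadata_sql(datasets)
for statement in statements:
    print(statement + ";")
```

Generating the statements first makes it possible to review them before running them in the pre-production environment with whichever SQL client you already use.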

Deploy on Production

Once you have tested everything and are confident in the upgrade, upgrade your live system to Dremio 18.0. We recommend following this strategy after the upgrade:

  • Enable each feature, one at a time, and verify that everything works as expected.
  • Before enabling unlimited splits, verify that all tables are already working without mixed types. Then enable the feature.
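
The rollout strategy above can be sketched as a small control loop: enable one feature, verify, and stop at the first failure, with an extra gate in front of unlimited splits. The feature labels, the staged_rollout function, and the enable, verify, and mixed_types_clean callbacks are all hypothetical placeholders for whatever your team uses to toggle and validate features; none of them is a Dremio API.

```python
def staged_rollout(features, enable, verify, mixed_types_clean):
    """Enable features one at a time and stop at the first failed check.

    Returns the features that were enabled and verified successfully.
    """
    done = []
    for feature in features:
        # Gate: unlimited splits is only enabled once tables work without mixed types.
        if feature == "unlimited_splits" and not mixed_types_clean():
            break
        enable(feature)
        if not verify(feature):
            break
        done.append(feature)
    return done


# Hypothetical labels for the three features discussed in this guide.
FEATURES = ["metadata_refresh", "reflection_refresh", "unlimited_splits"]

enabled_log = []
result = staged_rollout(
    FEATURES,
    enable=enabled_log.append,        # stand-in for the real enable action
    verify=lambda feature: True,      # stand-in for your validation queries
    mixed_types_clean=lambda: False,  # pretend mixed types are still present
)
print(result)
```

In this run the loop stops before unlimited splits because the mixed-type check reports a problem, so only the first two features are enabled.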