Pillar 3 - Cost Optimization
While getting the best possible performance from Dremio is important, it is equally important to optimize the costs of managing the Dremio platform.
Minimize Running Executor Nodes
While Dremio can scale to many hundreds of nodes, any given cluster should only have as many nodes as it needs to satisfy the current load and meet Service Level Objectives.
Dynamically Scale Executor Nodes Up and Down
When running Dremio engines, designers can leverage scale-up and scale-down features to dynamically expand and contract capacity based on load.
Eliminate Unnecessary Data Processing
Building reflections and metadata that are not needed can detract from the overall performance of your system. It also adds to the load and, therefore, to the cost of operating the system.
Size Engines to Minimum Nodes Required
To avoid unnecessary cost, consider setting up a script, external to Dremio, that reduces the number of active nodes in your engines to the bare minimum (certainly one, but maybe even zero) during periods when you know the cluster will get minimal or no use, such as overnight on weekdays or over weekends. An equivalent script can scale the number of executors in your engines back up to operational capacity shortly before the cluster returns to normal daily use.
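The scheduling logic for such a script could be sketched as follows. This is a minimal illustration, not a Dremio-provided tool: the business hours, the node counts, and the `resize` callback are all assumptions, and the call that actually changes an engine's size depends on your deployment, so it is left as a hypothetical function you supply.

```python
from datetime import datetime, time

# Assumed schedule and sizes; adjust to your own usage pattern.
# Set BUSINESS_START a little ahead of first use so capacity is ready in time.
BUSINESS_START = time(7, 0)
BUSINESS_END = time(19, 0)
OPERATIONAL_NODES = 8   # hypothetical daytime engine size
MINIMUM_NODES = 1       # could be 0 if your deployment allows it


def target_node_count(now: datetime) -> int:
    """Return the desired executor count for the given local time."""
    is_weekend = now.weekday() >= 5  # Monday is 0, so 5 and 6 are Sat and Sun
    in_business_hours = BUSINESS_START <= now.time() < BUSINESS_END
    if is_weekend or not in_business_hours:
        return MINIMUM_NODES
    return OPERATIONAL_NODES


def reconcile(now: datetime, current_nodes: int, resize) -> int:
    """Call resize(target) only when the engine is not already at the target.

    `resize` is a hypothetical callback that performs the actual scaling
    through whatever mechanism your deployment provides.
    """
    target = target_node_count(now)
    if target != current_nodes:
        resize(target)
    return target
```

Run from a scheduler such as cron every few minutes, `reconcile` makes the scale-down and scale-up scripts one idempotent job: it does nothing when the engine is already at the size the schedule calls for.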
Remove Unused Reflections
Analysis of Dremio’s query history, joined with data present in system tables like sys.project.materializations, can provide details about how often each reflection in Dremio is being leveraged. For reflections that are not being leveraged, further analysis can determine whether any of them are still being refreshed, how many times they have been refreshed in the reporting period, and how many hours of cluster execution time they have consumed.
Identifying and removing unused reflections is good practice because it can reduce clutter in the reflection configuration. More importantly, it can free up hours of cluster execution cycles that can be used for more critical workloads.
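Once the relevant rows have been exported, the analysis described above reduces to a small aggregation. The sketch below assumes you have already extracted three inputs for the reporting period; the input shapes and the function name are illustrative and do not mirror the actual columns of the query history or of sys.project.materializations.

```python
def summarize_unused_reflections(reflection_ids, accelerated_ids, refresh_jobs):
    """Report refresh count and execution hours for reflections no query used.

    reflection_ids:  iterable of all reflection ids (assumed input)
    accelerated_ids: iterable of reflection ids, one entry per query that
                     was accelerated by that reflection (assumed input)
    refresh_jobs:    iterable of (reflection_id, duration_seconds) pairs,
                     one per refresh in the reporting period (assumed input)
    """
    used = set(accelerated_ids)
    # Start every unused reflection at zero so never-refreshed ones still appear.
    summary = {
        rid: {"refreshes": 0, "hours": 0.0}
        for rid in reflection_ids
        if rid not in used
    }
    for rid, seconds in refresh_jobs:
        if rid in summary:
            summary[rid]["refreshes"] += 1
            summary[rid]["hours"] += seconds / 3600.0
    return summary


report = summarize_unused_reflections(
    reflection_ids=["r1", "r2", "r3"],
    accelerated_ids=["r1", "r1"],
    refresh_jobs=[("r1", 3600), ("r2", 1800), ("r2", 5400)],
)
```

In this made-up sample, `r2` was never used yet was refreshed twice for a total of two hours, which makes it the first candidate for removal; `r3` is unused but costs nothing to keep refreshed, so removing it only reduces clutter.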
Optimize Metadata Refresh Frequency
See Optimize Metadata Refresh Frequency to understand metadata in Dremio, why it is important, and best practices for setting and adjusting the frequency of metadata refresh for datasets.