Version: current [26.x]

Hash Aggregation Spilling

Dremio supports spilling for hash aggregation. When the hash aggregation operator reaches its memory limit, Dremio spills data to disk as needed so that the query can still complete within the memory envelope under which Dremio operates.
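As a rough illustration of the general idea, the following Python sketch aggregates rows in an in-memory hash table and writes partial results to disk whenever a limit is reached. It is a toy model, not Dremio's implementation: the function name `spilling_group_sum`, the `max_groups_in_memory` limit (a stand-in for a byte-based memory limit), and the pickle-based spill files are all assumptions made for this example.

```python
import os
import pickle
import tempfile


def spilling_group_sum(rows, max_groups_in_memory, spill_dir):
    """Toy GROUP BY key, SUM(value) that spills partial results to disk.

    rows is an iterable of (key, value) pairs. max_groups_in_memory is a
    hypothetical limit on the number of groups held in the hash table.
    Returns (result_dict, number_of_spills).
    """
    table = {}        # in-memory hash table: key -> running sum
    spill_files = []  # paths of partial results written to disk

    def spill():
        # Write the current partial aggregates to disk, then free the table.
        path = os.path.join(spill_dir, f"spill_{len(spill_files)}.pkl")
        with open(path, "wb") as f:
            pickle.dump(table, f)
        spill_files.append(path)
        table.clear()

    for key, value in rows:
        # A new group would exceed the limit, so spill before adding it.
        if key not in table and len(table) >= max_groups_in_memory:
            spill()
        table[key] = table.get(key, 0) + value

    # Merge the in-memory remainder with every spilled partial result.
    # For brevity the merge happens in memory; a real engine would read
    # spilled partitions back one at a time to stay within its limit.
    result = dict(table)
    for path in spill_files:
        with open(path, "rb") as f:
            for key, partial in pickle.load(f).items():
                result[key] = result.get(key, 0) + partial
    return result, len(spill_files)


if __name__ == "__main__":
    sample = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
    with tempfile.TemporaryDirectory() as tmp:
        totals, spills = spilling_group_sum(sample, 2, tmp)
    print(totals, spills)
```

With a limit of two groups, the sample input forces one spill when the third distinct key arrives, and the merged totals match what an unconstrained aggregation would produce.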

This feature is useful for the following:

Memory-intensive hash aggregation queries (GROUP BY queries) that process large datasets.
Limited-memory environments.

The hash aggregation operator needs a minimum amount of memory before it can start processing the first batch of data. As long as the memory given to hash aggregation is at least this minimum, Dremio can complete the query.

If the minimum memory is unavailable when the query is submitted, the job fails immediately, before execution starts, with the message "Error: Failed to preallocate memory for single batch in partitions". This error message is shown in the exception stack details.
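The up-front check can be pictured with a small Python sketch. This is only a model under stated assumptions: the wording of the error suggests that one batch is preallocated per partition, so the sketch takes the minimum to be `num_partitions * batch_bytes`. That formula, the function `check_minimum_memory`, and the `PreallocationError` class are hypothetical and are not taken from Dremio's code or configuration.

```python
class PreallocationError(MemoryError):
    """Hypothetical error type for a failed up-front memory reservation."""


def check_minimum_memory(available_bytes, num_partitions, batch_bytes):
    """Fail fast if one batch per partition cannot be reserved.

    Assumption for this sketch: the operator's minimum memory is one
    batch of batch_bytes for each of num_partitions partitions.
    Returns the number of bytes that would be reserved.
    """
    required = num_partitions * batch_bytes
    if available_bytes < required:
        # Message text taken from the documentation above.
        raise PreallocationError(
            "Failed to preallocate memory for single batch in partitions"
        )
    return required


if __name__ == "__main__":
    # Illustrative numbers only: 8 partitions of 1 MiB batches.
    print(check_minimum_memory(16 * 2**20, 8, 2**20))
    try:
        check_minimum_memory(4 * 2**20, 8, 2**20)
    except PreallocationError as exc:
        print("job fails before starting:", exc)
```

Doing the check before any data is read is what makes the failure immediate: the job is rejected at submission rather than partway through execution.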

Note: Based on the amount of system memory that you have provisioned, Dremio automatically determines the memory envelope for memory-intensive operators that can spill.