Results Cache

Results cache improves query performance by reusing results from previous executions of the same deterministic query, provided that the underlying dataset remains unchanged and the previous execution was by the same user. The results cache feature works out of the box, requires no configuration, and automatically caches and reuses results. Regardless of whether a query uses results cache, it always returns the same results.

Results cache is client-agnostic, meaning a query executed in the Dremio console will result in a cache hit even if it is later re-run through other clients like JDBC, ODBC, REST, or Arrow Flight. For a query to use the cache, its query plan must remain identical to the original cached version. Any changes to the schema or dataset generate a new query plan, invalidating the cache.
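The conditions above (same user, identical query plan, unchanged dataset, any client) can be pictured as a lookup key. The following is a minimal sketch under stated assumptions, not Dremio's actual implementation: the function `cache_key`, its parameters, and the idea of a per-dataset version string are all hypothetical names introduced for illustration.

```python
import hashlib


def cache_key(user: str, plan_text: str, dataset_versions: dict) -> str:
    """Build a hypothetical cache key from user, plan, and dataset versions.

    The client type (console, JDBC, ODBC, REST, Arrow Flight) is deliberately
    not an input, which is what makes this sketch client-agnostic.
    """
    h = hashlib.sha256()
    h.update(user.encode("utf-8"))
    h.update(b"\x00")
    h.update(plan_text.encode("utf-8"))
    # Sort dataset names so the key does not depend on dictionary order.
    for name in sorted(dataset_versions):
        h.update(b"\x00")
        h.update(f"{name}={dataset_versions[name]}".encode("utf-8"))
    return h.hexdigest()


# Same user, plan, and dataset version: the keys match (a cache hit).
k1 = cache_key("alice", "Scan(sales) -> Project(a)", {"sales": "v1"})
k2 = cache_key("alice", "Scan(sales) -> Project(a)", {"sales": "v1"})

# A different user or a changed dataset version yields a different key.
k3 = cache_key("bob", "Scan(sales) -> Project(a)", {"sales": "v1"})
k4 = cache_key("alice", "Scan(sales) -> Project(a)", {"sales": "v2"})
```

In this sketch, a schema or data change shows up as a new plan text or version string, so the old entry is simply never looked up again.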

Results cache also supports seamless coordinator scale-out, allowing newly added coordinators to benefit immediately from previously cached results.

Cases Supported By Results Cache

Query results are cached in the following cases:

  • The SQL statement is a SELECT statement.
  • The query reads from an Iceberg or Parquet dataset, or from a raw Reflection defined on other Dremio-supported data sources and formats, such as relational databases, CSV, JSON, or TEXT.
  • The query does not contain dynamic functions such as QUERY_USER, IS_MEMBER, RAND, CURRENT_DATE, or NOW.
  • The query does not reference SYS or INFORMATION_SCHEMA tables, and does not use an external query.
  • The result set size, when stored in Arrow format, is less than or equal to 20 MB.
  • The query is not executed in the Dremio console as a preview.
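The criteria above can be read as a checklist. The sketch below encodes them as a simple predicate for illustration only: `QueryInfo`, `is_cacheable`, and the source-kind labels are hypothetical, the text matching is deliberately naive (Dremio evaluates these conditions on the query plan, not on raw SQL text), and 20 MB is assumed to mean 20 × 1024 × 1024 bytes.

```python
import re
from dataclasses import dataclass

MAX_RESULT_BYTES = 20 * 1024 * 1024  # assumption: 20 MB as 20 * 2**20 bytes
ALLOWED_SOURCES = {"iceberg", "parquet", "raw_reflection"}  # hypothetical labels
DYNAMIC_FUNCTIONS = re.compile(r"\b(QUERY_USER|IS_MEMBER|RAND|CURRENT_DATE|NOW)\b")
SYSTEM_SCHEMAS = re.compile(r"\b(SYS|INFORMATION_SCHEMA)\s*\.")


@dataclass
class QueryInfo:
    sql: str
    source_kinds: set          # kinds of datasets the query reads from
    result_bytes: int          # result size when stored in Arrow format
    is_preview: bool = False
    uses_external_query: bool = False


def is_cacheable(q: QueryInfo) -> bool:
    """Return True if the query meets every listed condition (naive check)."""
    text = q.sql.strip().upper()
    if not text.startswith("SELECT"):
        return False
    if not q.source_kinds or not q.source_kinds <= ALLOWED_SOURCES:
        return False
    if DYNAMIC_FUNCTIONS.search(text):
        return False
    if SYSTEM_SCHEMAS.search(text) or q.uses_external_query:
        return False
    if q.result_bytes > MAX_RESULT_BYTES:
        return False
    return not q.is_preview


plain = QueryInfo("SELECT a FROM t", {"iceberg"}, 1024)
dynamic = QueryInfo("SELECT NOW()", {"iceberg"}, 10)
system = QueryInfo("SELECT * FROM sys.jobs", {"iceberg"}, 10)
too_big = QueryInfo("SELECT a FROM t", {"parquet"}, MAX_RESULT_BYTES + 1)
preview = QueryInfo("SELECT a FROM t", {"parquet"}, 10, is_preview=True)
```

Here only `plain` passes; each of the other queries fails exactly one condition from the list.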

View Whether Queries Used Results Cache

You can view the list of jobs on the Jobs page to determine if queries from data consumers were accelerated by the results cache.

To find out whether a query was accelerated by the results cache:

  1. Find the job that ran the query and look for the Reflection icon next to it, which indicates that the query was accelerated using either Reflections or the results cache.
  2. Click on the row representing the job that ran the query to view the job summary. The summary, displayed in the pane to the right, provides details on whether the query was accelerated using results cache or Reflections.
Results cache on the Job Overview page

Storage

Cached results are stored in the project store alongside all project-specific data, such as metadata and Reflections. Executors write cache entries as Arrow data files and read them when processing SELECT queries that result in a cache hit. Coordinators are responsible for managing the deletion of expired cache files.

Deletion

A background task running on one of the Dremio coordinators handles cache expiration. This task runs every hour to mark cache entries that have not been accessed in the past 24 hours as expired and subsequently deletes them along with their associated cache files.
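The expiration rule can be illustrated with a small sweep function. This is a sketch under stated assumptions rather than Dremio's implementation: `sweep`, the in-memory `entries` mapping, and the key names are hypothetical, and the deletion of the associated Arrow files is reduced to a comment.

```python
from datetime import datetime, timedelta

EXPIRY = timedelta(hours=24)  # entries unused for longer than this expire


def sweep(entries: dict, now: datetime) -> list:
    """Remove entries whose last access is older than EXPIRY.

    `entries` maps a cache key to its last-access time. Returns the
    removed keys so the caller can delete the associated cache files.
    """
    expired = [key for key, last in entries.items() if now - last > EXPIRY]
    for key in expired:
        del entries[key]  # a real system would also delete the cache file
    return expired


now = datetime(2024, 1, 2, 12, 0, 0)
entries = {
    "old": now - timedelta(hours=25),   # not accessed in the past 24 hours
    "fresh": now - timedelta(hours=1),  # accessed recently
}
removed = sweep(entries, now)
```

In the documented behavior, a task like this runs once per hour on one coordinator, so an entry can outlive its 24-hour window by up to about an hour before it is removed.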

Considerations and Limitations

When a SQL query executed through the Dremio console or a REST client hits the cache, the cached query results are rewritten to the job results store to enable pagination.