ANALYZE TABLE

Version Requirement:

This functionality is available using instances of Dremio v16.0+.

The ANALYZE TABLE SQL command allows Dremio to compute and delete statistics for physical datasets. With this command, you may compute the following types of statistics:

  • Estimated number of distinct values
  • Number of rows
  • Number of null values

After statistics have been computed, you may use them with Dremio’s Query Planner tool by enabling the Planner.use_statistics option.

Additionally, you may keep track of statistics information via the system table sys.table_statistics.

Syntax

The ANALYZE TABLE command uses this syntax:

ANALYZE TABLE <table_name> [ for_clause ] statistics_clause
	for_clause::
    FOR  [ ALL COLUMNS |
            COLUMNS ( <list_of_columns> ) ]
statistics_clause::
    COMPUTE STATISTICS |
    DELETE STATISTICS

Here is an example:

ANALYZE TABLE table1 FOR COLUMNS (a) COMPUTE STATISTICS

NOTE

Anywhere in this documentation that a value is enclosed in < and >, should be treated as an indicator of where a custom value should be provided. The value you enter should not be enclosed in < or >.

Required Parameters

<table_name>

You must specify the name of a table.

[ ALL COLUMNS | COLUMNS ( <list_of_columns> ) ]

Specifies the columns to include when computing statistics. Using ALL COLUMNS forces Dremio to use every column in a table for the computation. To use a specific column, use the COLUMNS parameter and specify the column in parentheses.

NOTE

Use of this parameter is optional. If left out, Dremio will automatically consider the command as meaning FOR ALL COLUMNS.

Multiple Columns

When computing statistics on a dataset for multiple columns, separate each column name by a single comma. Here is an example:

ANALYZE TABLE table1 for columns (a,b) COMPUTE STATISTICS

Using the example above, Dremio computes statistics for columns a and b in table1.

COMPUTE STATISTICS | DELETE STATISTICS

To compute statistics for columns on a table, use the COMPUTE STATISTICS parameter.

To delete all previously-calculated statistics from Dremio for a table and columns, use the DELETE STATISTICS parameter as part of your command.