On this page

    OPTIMIZE TABLE

    Rewrite data files to an optimal size, combining small files or splitting large files. For more information about optimizing tables, see Optimizing Tables.

    Syntax
    OPTIMIZE TABLE <table_name> 
       [ REWRITE DATA USING BIN_PACK
          [ ( { TARGET_FILE_SIZE_MB | MIN_FILE_SIZE_MB | MAX_FILE_SIZE_MB | MIN_INPUT_FILES } = <value> [, ... ] ) ]
       ]
    

    Parameters

    <table_name>

    String

    The path of the Iceberg table that you want to optimize.


    REWRITE DATA USING BIN_PACK

    Optional

    Either compacts smaller files or splits larger ones to the optimal file size using the BIN_PACK strategy. This parameter is used by default to optimize a table, but you can optionally include this parameter in your query to specify additional options to check the file sizes in your selected table.


    [ ( { TARGET_FILE_SIZE_MB | MIN_FILE_SIZE_MB | MAX_FILE_SIZE_MB | MIN_INPUT_FILES } = <value> [, ... ] ) ]

    Number

    Optional

    The options available for the REWRITE DATA USING BIN_PACK clause. See the following table for the options that are supported.

    Option NameDefault ValueDescription

    TARGET_FILE_SIZE_MB

    256 MBControls the target size of the files that are generated.

    MIN_FILE_SIZE_MB

    0.75 times the value of TARGET_FILE_SIZE_MB

    Files that are smaller in size than min_file_size_mb qualifies for compaction.

    MAX_FILE_SIZE_MB

    1.8 times the value of TARGET_FILE_SIZE_MB

    Files that are larger in size than max_file_size_mb qualifies for compaction.

    MIN_INPUT_FILES

    5

    The minimum number of qualified files needed to be considered for compaction.

    Examples

    Rewriting the data files in the specified table to Dremio's optimal supported file size
    OPTIMIZE TABLE demo.example_table
    
    Rewriting the data files in the specified table using the default bin-packing rewrite algorithm to Dremio's optimal supported file size
    OPTIMIZE TABLE demo.example_table 
       REWRITE DATA USING BIN_PACK
    
    
    Rewriting the data files in the specified table to Dremio's optimal supported file size
    OPTIMIZE TABLE demo.example_table 
       REWRITE DATA USING BIN_PACK (TARGET_FILE_SIZE_MB=256)
    
    Rewriting the data files in the specified table using the specified minimum and maximum file sizes to Dremio's optimal supported file size
    OPTIMIZE TABLE demo.example_table 
       REWRITE DATA USING BIN_PACK (MIN_FILE_SIZE_MB=100, MAX_FILE_SIZE_MB=1000)
    
    Rewriting the data files in the specified table for the minimum number of specified files to Dremio's optimal supported file size
    OPTIMIZE TABLE demo.example_table 
       REWRITE DATA USING BIN_PACK (MIN_INPUT_FILES=10)
    
    Rewriting the data files in the specified table using the specified minimum and maximum file sizes to the specified target file size
    OPTIMIZE TABLE demo.example_table 
       REWRITE DATA USING BIN_PACK (MIN_FILE_SIZE_MB=100, MAX_FILE_SIZE_MB=1000, TARGET_FILE_SIZE_MB=512)