Altering Apache Iceberg Tables
The ALTER command updates an Iceberg table’s schema in a new snapshot.
SyntaxALTER TABLE [<table_path>.]<table_name>
[ { ALTER | MODIFY | CHANGE } COLUMN ( <source_col_name> <new_col_name>{ data_type } ) ]
| [ ADD COLUMNS ( <column_name> data_type [, <column_name> data_type ... ] )
| [ DROP COLUMN <column_name> ]
| [ [ ADD | DROP ] PARTITION FIELD [ <column_name> | <partition_transform> ] ]
| [ RENAME [ COLUMN ] <old_column_name> [ TO ] <new_column_name> ]
| [ REFRESH METADATA ]
Parameters
{ ALTER | MODIFY | CHANGE } COLUMN ( <source_col_name> <new_col_name> <data_type> )
String
Changes the data type for a column, and gives you the option of to renaming the column. If you do not want to rename the column, set <new_col_name> to <source_col_name>.
Only three types of changes to primitive types are allowed:
- int to long
- float to double
- decimal(P, S) to decimal(P', S), if you are widening the precision
- Set 1
- struct_type:
ROW( name primitive_or_complex_type, .. )
- list_type:
ARRAY(primitive_or_complex_type)
ROW(innerfield INT, anotherinnerfield DOUBLE)
ARRAY(INT)
ROW(innerfield ARRAY(INT))
ARRAY(ROW(innerfield INT))
- struct_type:
- Set 2
- struct_type:
STRUCT <name : primitive_or_complex_type, ...>
- list_type:
{ LIST | ARRAY } < primitive_or_complex_type >
STRUCT<innerfield : INT, anotherinnerfield : DOUBLE>
LIST<INT>
ARRAY<INT>
STRUCT<innerfield : LIST<INT>>
LIST<STRUCT<innerfield : INT>>
- struct_type:
ADD { COLUMN | COLUMNS } ( <column_name1> <data_type> [, <column_name2> <data_type> ... ] ) [ BEFORE <column_name> ]
String
Appends one or more columns that have the specified names and data types.
These are the supported primitive types:
- BOOLEAN
- VARBINARY
- DATE
- FLOAT
- DECIMAL
- DOUBLE
- INTERVAL
- INT
- BIGINT
- TIME
- TIMESTAMP
- VARCHAR (The length is always 65536 bytes. If a length is specified, it is ignored.)
You can define complex types by using either of these two sets of syntax:
- Set 1
- struct_type:
ROW( name primitive_or_complex_type, .. )
- list_type:
ARRAY(primitive_or_complex_type)
ROW(innerfield INT, anotherinnerfield DOUBLE)
ARRAY(INT)
ROW(innerfield ARRAY(INT))
ARRAY(ROW(innerfield INT))
- struct_type:
- Set 2
- struct_type:
STRUCT <name : primitive_or_complex_type, ... >
- list_type:
{ LIST | ARRAY } < primitive_or_complex_type >
STRUCT<innerfield : INT, anotherinnerfield : DOUBLE>
LIST<INT>
ARRAY<INT>
STRUCT<innerfield : LIST<INT>>
LIST<STRUCT<innerfield : INT>>
- struct_type:
{ ADD | DROP } PARTITION FIELD { <column_name> | <partition_transform> }
String
Optional
ADD partitions the table data on the values in a single column or by using one of the partition-transformation functions. DROP drops the partition definition.
These are the partition-transformation functions:
Transform | Description |
---|---|
identity( <col> ) | Explicitly specified identity transform |
year( <col> ) | Partition by year. The column must use the TIMESTAMP data type. |
month( <ts_col> ) | Partition by month. The column must use the TIMESTAMP data type. |
day( <ts_col> ) | Partition by day. The column must use the TIMESTAMP data type. |
hour( <ts_col> ) | Partition by hour. The column must use the TIMESTAMP data type. |
bucket( <count>, <col> ) | Partition by hashed value into <count> buckets |
truncate( <length>, <col> ) | Partition by truncated value.
|
Examples
Add a columnALTER TABLE myTable ADD COLUMNS (address VARCHAR)
ALTER TABLE myTable ALTER COLUMN id id BIGINT