Altering Apache Iceberg Tables
The ALTER command updates an Iceberg table’s schema in a new snapshot.
ALTER TABLE [<table_path>.]<table_name>
[ { ALTER | MODIFY | CHANGE } COLUMN ( <source_col_name> <new_col_name>{ data_type } ) ]
| [ ADD [ COLUMN | COLUMNS ] ( <column_name> data_type [, <column_name> data_type ... ] ) [BEFORE <column_name>]]
| [ DROP COLUMN <column_name> ]
| [ [ ADD | DROP ] PARTITION FIELD [ <column_name> | <partition_transform> ] ]
| [ RENAME [ COLUMN ] <old_column_name> [ TO ] <new_column_name> ]
| [ REFRESH METADATA ]
Parameters
<table_path>
String
The path in which the table is located.
<table_name>
String
The name of the table that you want to alter.
{ ALTER | MODIFY | CHANGE } COLUMN ( <source_col_name> <new_col_name>{ data_type } )
String
Changes the data type for a column, and gives you the option of to renaming the column. If you do not want to rename the column, set <new_col_name> to <source_col_name>.
Only three types of changes to primitive types are allowed:
- int to long
- float to double
- decimal(P, S) to decimal(P', S), if you are widening the precision
- Set 1
- struct_type:
ROW( name primitive_or_complex_type, .. )
- list_type:
ARRAY(primitive_or_complex_type)
ROW(innerfield INT, anotherinnerfield DOUBLE)
ARRAY(INT)
ROW(innerfield ARRAY(INT))
ARRAY(ROW(innerfield INT))
- struct_type:
- Set 2
- struct_type:
STRUCT <name : primitive_or_complex_type, ...>
- list_type:
{ LIST | ARRAY } < primitive_or_complex_type >
STRUCT<innerfield : INT, anotherinnerfield : DOUBLE>
LIST<INT>
ARRAY<INT>
STRUCT<innerfield : LIST<INT>>
LIST<STRUCT<innerfield : INT>>
- struct_type:
ADD [ COLUMN | COLUMNS ] ( <column_name> data_type [, <column_name> data_type ... ] ) [BEFORE <column_name>]
String
Appends one or more columns that have the specified names and data types. The optional BEFORE keyword lets you insert the column or columns before an existing column.
These are the supported primitive types:
- BOOLEAN
- VARBINARY
- DATE
- FLOAT
- DECIMAL
- DOUBLE
- INTERVAL
- INT
- BIGINT
- TIME
- TIMESTAMP
- VARCHAR (The length is always 65536 bytes. If a length is specified, it is ignored.)
You can define complex types by using either of these two sets of syntax:
- Set 1
- struct_type:
ROW( name primitive_or_complex_type, .. )
- list_type:
ARRAY(primitive_or_complex_type)
ROW(innerfield INT, anotherinnerfield DOUBLE)
ARRAY(INT)
ROW(innerfield ARRAY(INT))
ARRAY(ROW(innerfield INT))
- struct_type:
- Set 2
- struct_type:
STRUCT <name : primitive_or_complex_type, ... >
- list_type:
{ LIST | ARRAY } < primitive_or_complex_type >
STRUCT<innerfield : INT, anotherinnerfield : DOUBLE>
LIST<INT>
ARRAY<INT>
STRUCT<innerfield : LIST<INT>>
LIST<STRUCT<innerfield : INT>>
- struct_type:
DROP COLUMN <column_name>
String
Drops one column. This operation cannot be undone. Before dropping a column, ensure that you have secured a backup or snapshot of your database.
[ ADD | DROP ] PARTITION FIELD [ <column_name> | <partition_transform> ]
String
(Optional) ADD partitions the table data on the values in a single column or by using one of the partition-transformation functions. DROP drops the partition definition.
These are the partition-transformation functions:
Transform | Description |
---|---|
identity( <col> ) | Explicitly specified identity transform |
year( <col> ) | Partition by year. The column must use the TIMESTAMP data type. |
month( <ts_col> ) | Partition by month. The column must use the TIMESTAMP data type. |
day( <ts_col> ) | Partition by day. The column must use the TIMESTAMP data type. |
hour( <ts_col> ) | Partition by hour. The column must use the TIMESTAMP data type. |
bucket( <count>, <col> ) | Partition by hashed value into <count> buckets |
truncate( <length>, <col> ) | Partition by truncated value.
|
REFRESH METADATA
String
Refreshes the metadata for the table.
Examples
ALTER TABLE myTable ADD COLUMN (address VARCHAR)
ALTER TABLE myTable ALTER COLUMN id id BIGINT