Skip to main content

ALTER TABLE

Update a table’s definition or schema.

Syntax
ALTER TABLE <table_name>
[ AT { REF[ERENCE] | BRANCH } <reference_name> ]
{ ADD PRIMARY KEY ( <column_name> [, ... ] )
| DROP PRIMARY KEY
| ADD COLUMNS ( <column_name1> <data_type> [, ... ] )
| DROP COLUMN <column_name>
| { ALTER | MODIFY | CHANGE } COLUMN <old_name> <new_name> <data_type>
| LOCALSORT BY ( <column_name1>, <column_name2>, ... )
| REFRESH METADATA [ IN <catalog_name> ]
[ FOR PARTITIONS ( <partition_name> = '<value>') ]
[ { AVOID | AUTO } PROMOTION ]
[ { FORCE | LAZY } UPDATE ]
[ { MAINTAIN | DELETE } WHEN MISSING ]
| FORGET METADATA
| CREATE AGGREGATE REFLECTION <reflection_name> USING
{ DIMENSIONS ( <column_name1>, <column_name2>, ... )
| MEASURES ( <column_name1> ( <aggregation_type>, <column_name2> <aggregation_type> , ... ) )
| DIMENSIONS ( <column_name1>, <column_name2>, ... ) MEASURES ( <column_name1> ( <aggregation_type>, <column_name2> <aggregation_type> , ... ) ) }
[ PARTITION BY ( <column_name1>, <column_name2>, ... ) ]
[ LOCALSORT BY ( <column_name1>, <column_name2>, ... ) ]
[ ARROW CACHE ]
| CREATE EXTERNAL REFLECTION <reflection_name> USING <table_name>
| CREATE RAW REFLECTION <reflection_name> USING DISPLAY ( <column_name1>, <column_name2>, ...)
[ PARTITION BY ( <column_name1>, <column_name2>, ... ) ]
[ LOCALSORT BY ( <column_name1>, <column_name2>, ... ) ]
[ ARROW CACHE ]
| DROP REFLECTION <reflection_name>
| ROUTE REFLECTIONS TO { DEFAULT ENGINE | ENGINE { <engine_name> | <engine_uuid> } }
| { ADD | DROP } PARTITION FIELD { <column_name> | <partition_transform> }
}

Parameters

<table_name> String

The name of the table that you want to alter. The source can be in the scratch directory or in a datalake source.


AT { REF[ERENCE] | BRANCH } <reference_name> String   Optional

Specifies the reference at which you want to alter the table. When this parameter is omitted, the current reference is used. REF identifies a specific reference; BRANCH identifies a specific branch.


ADD PRIMARY KEY ( <column_name> [, ... ] ) String

Specifies to use one or more existing columns as the primary key of a table. Primary keys provide hints to the query planning during join planning. They can be added to Apache Iceberg tables only. Uniqueness of the values in a primary key is not enforced.


DROP PRIMARY KEY String

Removes a table's primary key. The columns that make up the primary key remain in the table.


ADD COLUMNS ( <column_name1> <data_type> [, ... ] ) String

Creates one or more columns that have the specified names, data types, and character limits. The size is an attribute of the data type.

These are the supported primitive types:

  • BOOLEAN
  • VARBINARY
  • DATE
  • FLOAT
  • DECIMAL
  • DOUBLE
  • INTERVAL
  • INT
  • BIGINT
  • TIME
  • TIMESTAMP
  • VARCHAR (the length is always 65536 bytes, and any specified value is ignored)

You can define complex types by using either of these two sets of syntax:

Set 1

  • struct_type: ROW( name primitive_or_complex_type, .. )
  • list_type: ARRAY(primitive_or_complex_type)

Examples:

ROW(innerfield INT, anotherinnerfield DOUBLE)
ARRAY(INT)
ROW(innerfield ARRAY(INT))
ARRAY(ROW(innerfield INT))

Set 2

  • struct_type: STRUCT <name : primitive_or_complex_type, ... >
  • list_type: { LIST | ARRAY } < primitive_or_complex_type >

Examples:

STRUCT<innerfield : INT, anotherinnerfield : DOUBLE>
LIST<INT>
ARRAY<INT>
STRUCT<innerfield : LIST<INT>>
LIST<STRUCT<innerfield : INT>>

DROP COLUMN <column_name> String

Drops the specified column. This action cannot be undone.


{ ALTER | MODIFY | CHANGE } COLUMN <old_column_name> <new_column_name> <data_type> String

Changes the data type for a column, and gives you the option to rename the column.

note

You can rename only columns of tables that are in scratch spaces and columns of Iceberg tables. If you do not want to rename the column, set <new_column_name> to <old_column_name>.

Only three types of changes to primitive types are allowed:

  • int to long
  • float to double
  • decimal(P, S) to decimal(P', S), if you are widening the precision

You can alter columns that use complex types by using either of these two sets of syntax:

Set 1

  • struct_type: ROW( name primitive_or_complex_type, .. )
  • list_type: ARRAY(primitive_or_complex_type)

Examples:

ROW(innerfield INT, anotherinnerfield DOUBLE)
ARRAY(INT)
ROW(innerfield ARRAY(INT))
ARRAY(ROW(innerfield INT))

Set 2

  • struct_type: STRUCT <name : primitive_or_complex_type, ...>
  • list_type: { LIST | ARRAY } < primitive_or_complex_type >

Examples:

STRUCT<innerfield : INT, anotherinnerfield : DOUBLE>
LIST<INT>
ARRAY<INT>
STRUCT<innerfield : LIST<INT>>
LIST<STRUCT<innerfield : INT>>

LOCALSORT BY ( <column_name1>, <column_name2>, ... ) String   Optional

The columns on which to sort new data that is added to the table after the ALTER TABLE command is run.


REFRESH METADATA

Refreshes the metadata associated with the specified table. You can choose to either do a full or partial metadata refresh. A full refresh will update the metadata for the entire table. A partial refresh will enable you to only update specified partitions. Optional clauses are available for refreshing a table's metadata. If you choose to use two or more of these clauses, they must be entered in a specified priority order:

  1. Promotion option: either AVOID PROMOTION or AUTO PROMOTION

  2. Update option: either FORCE UPDATE or LAZY UPDATE

  3. Missing option: either MAINTAIN WHEN MISSING or DELETE WHEN MISSING

    note

    If the table is not partitioned, you will receive an error when you attempt to refresh the metadata using the FOR PARTITIONS clause.


FOR PARTITIONS ( <partition_name> = '<value>' ) String   Optional

Use this clause to do a partial refresh of the table's metadata. The <partition_name> identifies the name of the partition to be refreshed. The <value> identifies the specific partition that should be refreshed. Must be contained in single quotes.


{ AVOID | AUTO } PROMOTION Optional

Clauses that determine whether files and folders are promoted to datasets when you run a query. The AVOID PROMOTION prevents queries from promoting files/folders to datasets. The AUTO PROMOTION allows queries to promote files/folders to datasets. This is the default option when you do not include a promotion clause.


{ FORCE | LAZY } UPDATE Optional

Clauses that determine whether metadata is updated when you run a query. The FORCE UPDATE forces a full update of metadata. The LAZY UPDATE does not perform a full update of metadata. This is the default option when you do not include an update clause.


{ MAINTAIN | DELETE } WHEN MISSING Optional

Clauses that determine how missing metadata is handled when you run a query. The MAINTAIN WHEN MISSING Prevents missing metadata from being deleted during refresh. DELETE WHEN MISSING deletes missing metadata during refresh. This is the default option when you do not include a clause.


FORGET METADATA

Deletes the metadata information stored in Dremio for the specified table until the next metadata refresh. The dataset can still be queried using SQL.


CREATE AGGREGATE REFLECTION <reflection_name> String

Creates an aggregation reflection with the specified name.


DIMENSIONS ( <column_name1>, <column_name2>, ... ) String

The columns to include as dimensions in the reflection.


MEASURES ( <column_name1> ( <saggregation_type> ), <column_name2> ( <aggregation_type ), ... ) String

The columns to include as measures in the reflection, and the type of aggregation to perform on them. The possible types are COUNT, MIN, MAX, SUM, and APPROXIMATE COUNT DISTINCT.


PARTITION BY ( <column_name1>, <column_name2>, ... ) String   Optional

The columns on which to partition the data horizontally in the reflection.


LOCALSORT BY ( <column_name1>, <column_name2>, ... ) String   Optional

The columns on which to sort the data that is in the reflection.


ARROW CACHE Optional

Specifies that you want Dremio to convert data from your reflection’s Parquet files to the Apache Arrow format when copying that data to executor nodes. Normally, Dremio copies data as-is from the Parquet files to caches on executor nodes, which are nodes that carry out the query plans devised by the query optimizer. Enabling this option can improve query performance even more. However, data in the Apache Arrow format requires more space on the executor nodes than data in the default format. You can use this option with Amazon S3.


CREATE EXTERNAL REFLECTION <reflection_name> String

Creates an external reflection with the specified name.


USING <table_name> String

The path of the derived table to use with the external reflection.


CREATE RAW REFLECTION <reflection_name> String

Creates a raw reflection with the specified name.


USING DISPLAY ( <column_name1>, <column_name2>, ... ) String

The columns to include in the reflection.


PARTITION BY ( <column_name1>, <column_name2>, ... ) String   Optional

The columns on which to partition the data horizontally in the reflection.


LOCALSORT BY ( <column_name1>, <column_name2>, ... ) String   Optional

The columns on which to sort the data that is in the reflection.


ARROW CACHE Optional

Specifies that you want Dremio to convert data from your reflection’s Parquet files to the Apache Arrow format when copying that data to executor nodes. Normally, Dremio copies data as-is from the Parquet files to caches on executor nodes, which are nodes that carry out the query plans devised by the query optimizer. Enabling this option can improve query performance even more. However, data in the Apache Arrow format requires more space on the executor nodes than data in the default format. You can use this option with Amazon S3.


DROP REFLECTION String   Optional

Drops the specified reflection defined on the table.


ROUTE REFLECTIONS TO { DEFAULT ENGINE | ENGINE { <engine_name> | <engine_uuid> } } String

Specify the engine on which to run jobs that create and refresh reflections that are defined on the table. Use either the name or UUID of the engine. If neither a name nor a UUID is specified, the default engine is used. You can also directly specify the default engine.


{ ADD | DROP } PARTITION FIELD { <column_name> | <partition_transform> } String   Optional

ADD partitions the table data on the values in a single column or by using one of the partition-transformation functions. DROP drops the partition definition. These are the partition-transformation functions:

  • identity( <col> ): Explicitly specified identity transform.
  • year( <col> ): Partition by year. The column uses either the TIMESTAMP or DATE data type.
  • month( <ts_col> ): Partition by month. The column uses either the TIMESTAMP or DATE data type.
  • day( <ts_col> ): Partition by day. The column uses either the TIMESTAMP or DATE data type.
  • hour( <ts_col> ): Partition by hour. The column uses either the TIMESTAMP or DATE data type.
  • bucket( <count>, <col> ): Partition by hashed value into <count> buckets.
  • truncate( <length>, <col> ): Partition by truncated value. Strings are truncated to the specified length. Integer and biginteger values are truncated to bins (for example: truncate(10, i) produces 0, 10, 20, and so on).

Examples

Add a column
ALTER TABLE services
ADD COLUMNS (county varchar)
Add a column to a table at the specified branch
ALTER TABLE services
AT BRANCH main
ADD COLUMNS (county varchar)
Modify a column
ALTER TABLE services
MODIFY COLUMN tip_amount tip_amount DECIMAL
Modify a column for a struct table
ALTER TABLE struct_type
MODIFY COLUMN a a struct<x: varchar, y: bigint>
Rename a column
ALTER TABLE services
MODIFY COLUMN tip_amount gratuity_amount DECIMAL
Refresh all the metadata for a table
ALTER TABLE services
REFRESH METADATA
Refresh all the metadata for a table using optional clauses
ALTER TABLE services
REFRESH METADATA
AUTO PROMOTION
LAZY UPDATE
MAINTAIN WHEN MISSING
Refresh the metadata for a single partition
ALTER TABLE Samples."samples.dremio.com"."zips.json"  
REFRESH METADATA FOR PARTITIONS (state = 'TX')
Refresh the metadata for a single partition using optional clauses
ALTER TABLE Samples."samples.dremio.com"."zips.json"  
REFRESH METADATA FOR PARTITIONS (state = 'TX')
AUTO PROMOTION
LAZY UPDATE
MAINTAIN WHEN MISSING
Forget the metadata for a table
ALTER TABLE Samples."samples.dremio.com"."zips.json"
FORGET METADATA
Create a raw reflection that sorts customers by last name and partitions them by country
ALTER TABLE "@user1"."customers"
CREATE RAW REFLECTION customers_by_country USING DISPLAY (id,lastName,firstName,address,country)
PARTITION BY (country)
LOCALSORT BY (lastName)
Create an aggregate reflection that counts the cities per state and sorts by state
ALTER TABLE Samples."samples.dremio.com"."zips.json"
CREATE AGGREGATE REFLECTION per_state
USING
DIMENSIONS (state)
MEASURES (city (COUNT))
LOCALSORT BY (state)
Routing Reflections
ALTER TABLE "Table 1" ROUTE REFLECTIONS TO ENGINE "Engine 1"
ALTER TABLE "View 1" ROUTE REFLECTIONS TO ENGINE "Engine 1"
ALTER TABLE "View 3" ROUTE REFLECTIONS TO ENGINE "Engine 1"
Adding a column for an Apache Iceberg Table
ALTER TABLE myTable ADD COLUMNS (address VARCHAR)
Changing the data type of a column to BIGINT for an Apache Iceberg Table
ALTER TABLE myTable ALTER COLUMN id id BIGINT