CREATE TABLE
Create a new table.
To create a table inside a namespace like a folder, the namespace must already exist. Dremio does not support creating the namespace and a table inside it in a single command. Instead, you must explicitly create the namespace before you can create the table.
CREATE TABLE [ IF NOT EXISTS ] <table_name>
[ AT { REF[ERENCE] | BRANCH } <reference_name> ]
[ PARTITION BY ( <column_name1>, <column_name2>, ... ) ]
[ LOCALSORT BY ( <column_name> ) ]
AS <select_statement>
-- Iceberg tables only
CREATE TABLE [ IF NOT EXISTS ] <table_name>
( <column_spec> [, ...] )
[ PARTITION BY ( { <column_name> | <partition_transform> } [ , ... ] ) ]
-- Set a Column-Masking Policy
CREATE TABLE [ IF NOT EXISTS ] <table_name>
( <column_name> <data_type> MASKING POLICY <function_name> ( <column_name> [, ... ] ) )
-- Add a Row-Access Policy
CREATE TABLE [ IF NOT EXISTS ] <table_name>
( <column_name> <data_type> [, ...] )
ROW ACCESS POLICY <function_name> ( <column_name> [, ... ] )
Parameters
<table_name> String
The name of the table that you want to create. The name of the table must be unique. The table can be in the scratch directory or a data lake source.
[ IF NOT EXISTS ] String
Causes Dremio Cloud to run the statement only if a table with the same name in the same path does not already exist.
AT { REF[ERENCE] | BRANCH } <reference_name> String Optional
Specifies the reference at which you want the new table to be created. When this parameter is omitted, the current reference is used.
REF
: Identifies the specific reference.BRANCH
: Identifies the specific branch.
PARTITION BY ( <column_name1>, <column_name2>, ... ) String
The names of the columns that you want to partition the table by. When you designate column(s) using this clause, Dremio will store the rows containing the same hash partition values in their own Parquet files. Using this clause provides speed optimization for when your WHERE
clause matches your PARTITION BY
clause.
- If you partition a table by a column that has more distinct values than Dremio can partition by, you will receive an error message. If you receive this error message, try partitioning your table by a different column.
- If the table is not partitioned, you will receive an error when you attempt to refresh the metadata using the
PARTITION BY
clause.
LOCALSORT BY <column_name> String
The name of the column that you want to sort the table by. Sorts each Parquet file fragment by the designated column. If your ORDER BY
column matches your LOCALSORT
column or you are doing a range filter on your LOCALSORT
column, your query will be faster because the data is already sorted.
AS <select_statement> String
Use the SELECT
statement to populate the new table using data from an existing table or view.
( <column_spec> [, ...] ) String
Specifies the columns of an Iceberg table:
column_name
: The column name must be unique within the result signature.data_type
: Any Dremio SQL data type, including primitive and complex types.
PARTITION BY ( { <column_name> | <partition_transform> } [ , ... ] ) String Optional
Partitions the table data on the values in a single column or by using one of these partition-transformation functions:
Transform | Description |
---|---|
identity( <col> ) | Explicitly specified identity transform |
year( <col> ) | Partition by year. The column uses either the TIMESTAMP or DATE data type. |
month( <ts_col> ) | Partition by month. The column uses either the TIMESTAMP or DATE data type. |
day( <ts_col> ) | Partition by day. The column uses either the TIMESTAMP or DATE data type. |
hour( <ts_col> ) | Partition by hour. The column uses either the TIMESTAMP or DATE data type. |
bucket( <count>, <col> ) | Partition by hashed value into <count> buckets |
truncate( <length>, <col> ) | Partition by truncated value.
|
<column_name> String
The unique name of the column. Multiple columns may be specified, provided they include their data type. Use commas to separate each column-name and data-type pair.
<data_type> String
The data type associated with the column and its underlying data.
<function_name> String
The name of the UDF you wish to associate with this policy. Function names within a project must unique and are case-insensitive.
<column_name> String
The column names to which this masking policy will apply. Multiple columns may be specified if they are separated by commas. A UDF serving as column-masking policy must accept and return the same data type as the column it is masking.
Examples
Create a table in your project storeCREATE TABLE demo_table
AS SELECT * FROM Samples."samples.dremio.com"."zips.json"
CREATE TABLE employees
(PersonID int, LastName varchar, FirstName varchar, Address varchar, City varchar)
AT BRANCH main
CREATE TABLE demo_table2
PARTITION BY (state)
LOCALSORT BY (city)
AS SELECT * FROM Samples."samples.dremio.com"."zips.json"
CREATE TABLE demo.example_table
AS SELECT * FROM "oracle_tpch".DREMIO.JOBS AT TAG Jan2020
CREATE TABLE demo.example_table
AS SELECT * FROM "oracle_tpch".DREMIO.JOBS AT COMMIT "ff2fe50fef5a030c4fc8e61b252bdc33c72e2b6f929d813833d998b8368302e2"
CREATE TABLE employees (name VARCHAR, age INT)
CREATE TABLE struct_type ( a struct<x: varchar> )
CREATE TABLE myAmazonS3Source.myFolder.myTable
AS SELECT * FROM myAmazonS3Source.anotherFolder.anotherTable
CREATE TABLE myTable (col1 int, col2 date)
PARTITION BY (month(col2))
CREATE TABLE officers
(name VARCHAR, assignment VARCHAR) ROW ACCESS POLICY hide_undercover(assignment)
CREATE TABLE employees
(name VARCHAR, ssn_col VARCHAR MASKING POLICY protect_ssn (ssn_col),department VARCHAR)