Skip to main content
Version: 24.3.x

Table SQL Statements

Tables can be created in filesystem sources types. In addition, if tables are created in a source, they can also be altered and dropped.

The following SQL commands are documented on this page:

CREATE TABLE AS

The CREATE TABLE AS command is used to create tables. Tables are created in filesystem sources types.

note

Tables are written in Parquet format and can be used with HASH PARTITION BY and LOCALSORT options.

Filesystem Sources

Dremio CTAS supports all filesystem source types (S3, ADLS, NAS, HDFS, and MapR-FS) using the filesystem permissions for the written table using impersonation.

Enabling CTAS and DROP TABLE for Sources
To use the CREATE TABLE AS (CTAS) and DROP TABLE (DROP) commands on a source, you must add these privileges to the appropriate users for the source. If enabled, all users/groups who have access to the source can leverage CTAS and DROP TABLE on sources.

Privileges can be enabled:

  • When creating a new source.
  • By editing an existing source.

To enable the appropriate privileges for the source:

  1. Select Privileges.
  2. In the Users table, identify the user or group that you want to edit privileges for, and click the cell to enable or disable the Create Table or Drop privilege for the user.
  3. Click Save.
note

CTAS functionality is enabled on a per source basis.

Impersonation
For sources that support impersonation (HDFS and MapR File System) and if impersonation is enabled, Dremio executes the CTAS and DROP TABLE operations as the user running the command. This requires that the user have the appropriate permissions (read, execute, and/or write) for the operation. If impersonation is not enabled, the operation is executed as the user running the Dremio service user.

note

For additional security, access can be further controlled by selecting specifically restricted filesystem sub-directories when you initially create your Dremio source.

Limitations include:

  • Views can be exported to their respective sources via standard SQL-based commands ONLY.
  • DROP TABLE functionality on S3 or ADLS source datasets is slow when working with very large tables.
    BUG 13539 Mamatha

Syntax

Filesystem source syntax
CREATE TABLE <source>.<tableName>
[HASH PARTITION BY (<columnName>, . .) ]
[LOCALSORT BY (<columnName>) ]
AS <QUERY>

Examples

Creating a source table
CREATE TABLE s3.upload.lineitem2
AS select * from TPCH.lineitem

Once a source table has been created, the Dremio UI displays the following:

  • Path where the table was created
  • Number of records in the new table
  • File size of the table

Querying Tables

Querying a source table
select count(*) from s3.upload.lineitem2

If you specify an alias for a column or an expression in the SELECT clause, you can refer to that alias elsewhere in the query, including in the SELECT list or in the WHERE clause.

Example 1
SELECT c_custkey AS c, lower(c)
FROM "customer.parquet"
Example 2
SELECT c_custkey AS c, lower(c)
FROM (
SELECT c_custkey, c_mktsegment AS c
FROM "customer.parquet")
Example 3
SELECT  c_name AS n, n
FROM (
SELECT c_mktsegment AS n, c_name
FROM "customer.parquet")
AS MY_TABLE
WHERE n = 'BUILDING'
Example 4
SELECT c_custkey
FROM (
SELECT c_custkey, c_name AS c
FROM "customer.parquet" )
WHERE c = 'aa'
Example 5
SELECT *
FROM (
SELECT c_custkey AS c, c_name
FROM "customer.parquet" )
JOIN "orders.parquet" ON c = o_orderkey
Example 6
SELECT c_custkey AS c
FROM "customer.parquet"
JOIN "orders.parquet" ON c = o_orderkey

DESCRIBE TABLE

The DESCRIBE TABLE command is used to provide high-level information regarding the overall column properties of an existing dataset.

Dremio DESCRIBE supports all filesystem source types and relies upon a user's existing privileges to access and describe a table or view.

DESCRIBE TABLE syntax
DESCRIBE TABLE table_name;

Example

Describing a table
DESCRIBE TABLE taxistats
COLUMN_NAMEDATA_TYPEIS_NULLABLENUMERIC_PRECISIONNUMERIC_SCALEEXTENDED_PROPERTIESMASKING_POLICYSORT_ORDER_PRIORITY
pickup_datetimeTIMESTAMPYESnullnull[][]1
passenger_countBIGINTYES640[]count_hidenull
trip_distance_miDOUBLEYES53null[][]null
fare_amountDOUBLEYES53null[][]null
tip_amountDOUBLEYES53null[][]null
total_amountDOUBLEYES53null[][]null

The cells containing a '[]' indicate "empty" values.

Columns

ColumnData TypeDescription
COLUMN_NAMEstringThe name of the column in the table.
DATA_TYPEstringThe data type allowed for values in the column.
IS_NULLABLEbooleanIndicates whether values in the column can be null.
NUMERIC_PRECISIONintegerThe maximum number of digits allowed in a value.
NUMERIC_SCALEintegerThe maximum number of digits allowed to the right of the decimal point.
EXTENDED_PROPERTIESarrayAdditional properties, if any, that have been set on the column.
MASKING_POLICYstringThe masking policy, if any, that is set on the column.
SORT_ORDER_PRIORITYintegerIndicates the sort order of the columns that are used for sorting, if the CREATE TABLE command that created the table used the LOCALSORT BY clause or if the table was altered by an ALTER TABLE command that used the LOCALSORT BY clause. For example, if a table was altered, and the clause LOCALSORT BY (colA, colB) was used, the table is sorted by colA and then by colB, the value for colA is 1, while the value for colB is 2. The lower the number, the higher the priority.

ALTER TABLE

note

Users must have the ALTER privilege granted for a dataset or folder in order to use this command.

Modifies the structure of columns in a table, such as adding, deleting, or modifying columns in an existing table. To create a table, refer to the CREATE TABLE command.

Syntax
ALTER TABLE <tableName> ADD COLUMNS (<columnName1> <dataType1>(<size1>), <columnName2> <dataType2>(<size2>), ...);
ALTER TABLE <tableName> DROP COLUMN <columnName>;
ALTER TABLE <tableName> ALTER COLUMN <sourceColumnName> <newColumnName> <newDataType>(<size>);
ALTER TABLE <tableName> MODIFY COLUMN <sourceColumnName> <newColumnName> <newDataType>(<size>);
ALTER TABLE <tableName> REFRESH METADATA;
ALTER TABLE <tableName> REFRESH METADATA FOR PARTITIONS ( "<partitionName>" = '<value>' );
note

The 〈size〉 parameter is optional.

The ADD COLUMNS command creates new column(s) with the specified name, data type, and character limit. See the example below for how to employ this.

Adding a column
ALTER TABLE customer1 ADD COLUMNS (SSN INT(9))

The DROP COLUMN command deletes the specified column, along with all data stored in it. As this cannot be undone, ensure that you've secured a backup or snapshot of your databases. See the example below for how to employ this.

Dropping a column
ALTER TABLE customer1 DROP COLUMN SSN

The ALTER COLUMN and MODIFY COLUMN commands are used to change the data type for specific column(s). This effectively alters how queries read the data stored in the column(s), so the current data type should be similar, if not identical, to the way the new data type is read.

Depending on your source, you will need to use either ALTER COLUMN or MODIFY COLUMN specifically. For example,SQL Server uses ALTER COLUMN whereas Oracle uses MODIFY COLUMN. For additional information, see the ALTER TABLE - ALTER/MODIFY COLUMN resource.

See the example below for how to employ this.

Altering a column
ALTER TABLE customer1 ALTER COLUMN SSN CHAR(11)
caution

About ALTER/MODIFY Data Types: We recommend against changing data types for a column if it already has records on it. Unless you've gone through the column to ensure no issues are found with the data format, this may create issues regarding how the data is read. The likely end result is myriad NULL values. Where possible, it is safer to add a new column with the desired data type and then drop the column whose data you wish to read differently.

Metadata Refreshes

The ALTER TABLE command may be used to manually refresh the metadata associated with a table. This should be coupled with Improved Metadata Refreshes to fully benefit from any improved functionality.

You may perform metadata refreshes one of two ways: full or partial.

The example below shows how to run a full metadata refresh.

Refreshing metadata
ALTER TABLE customer1 REFRESH METADATA

In comparison, the example below demonstrates a metadata refresh for a single partition.

Refreshing metadata for a single partition
ALTER TABLE customer1 REFRESH METADATA FOR PARTITIONS ( "dir0" = 'a')

This example identifies the level of the partition to use ("dir0") and the specific partition ('a') to use for the refresh.

Parameters

The ALTER TABLE command consists of the following parameters for which values must be included:

tableName

The name of the table being altered must be included in the command. This helps to restrict the scope of the change rather than accidentally apply a DROP COLUMNS on every dataset in Dremio. This must match the name of your desired table.

columnName

Dremio cannot perform any alteration to column data without the column itself specified. This helps to refine and restrict the scope of the changes.

Attributes of the data ObjectType(size)

This specifies the type of data to be stored for records if a new column is being added or an existing column being modified.

The (size) component of this parameter defines the limitation on the total number of allowed characters that may be stored on each row for this column.

For a list of data types and their maximum allowed sizes, see SQL Data Types.

DROP TABLE

Tables can be dropped only if they have been created on a filesystem source and if you have permission.

Syntax
DROP TABLE <source>.<tableName>
Example
DROP TABLE s3.upload.lineitem2
dremio refresh metadata force update