Data Types

Dremio supports the following SQL data types: numeric, string and binary, boolean, date and time, and semi-structured data.

Numeric Types

INT

A 4-byte signed integer. The supported range is from -2,147,483,648 to 2,147,483,647.

BIGINT

An 8-byte signed integer. The supported range is from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.

FLOAT

A 4-byte single-precision floating-point number. A FLOAT provides six decimal digits of precision.

DOUBLE

An 8-byte double-precision floating-point number. A DOUBLE provides 15 decimal digits of precision.

DECIMAL

A DECIMAL type has precision (p) and scale (s), written as DECIMAL(p,s). Precision is the total number of digits; scale is the number of digits to the right of the decimal point. For example, 987.65 is a DECIMAL(5,2) value.

When you perform arithmetic on DECIMAL values with different precision or scale, the result will have sufficient precision and scale to hold the result—unless it overflows.

Numeric literals cannot exceed BIGINT range. The SQL parser cannot process numeric literals larger than 9,223,372,036,854,775,807. To work with larger values, use a string literal: CAST('9223372036854775808' AS DECIMAL(38,0)).
Overflow produces zero, not an error. When decimal arithmetic or casting results in a value that exceeds the target precision, Dremio returns 0 rather than raising an error. For example, CAST('9999999999999999999' AS DECIMAL(38,2)) * CAST('9999999999999999999' AS DECIMAL(38,2)) returns 0.00. This can cause silent data loss.
Casting to lower precision rounds the value. Dremio uses standard half-up rounding. For example, CAST(123.56 AS DECIMAL(4,0)) returns 124, while CAST(123.44 AS DECIMAL(4,0)) returns 123. If the rounded value still exceeds the target precision, the result is 0.

String and Binary Types

VARCHAR

VARCHAR stores variable-length text. Only UTF-8 encoding is supported. For functions that work with VARCHAR values, see String Functions.

VARCHAR Example

SELECT 'Hello, world!'
-- Hello, world!

VARBINARY

VARBINARY stores variable-length binary data. By default, the maximum length is 32,000 bytes. Binary values are displayed in base64 encoding. For functions that work with VARBINARY values, see Binary Functions.

VARBINARY Example

SELECT CAST('help' AS VARBINARY)
-- aGVscA==

UUID

A universally unique identifier stored as a 16-byte binary value. The string representation consists of 32 hexadecimal digits (0-9, a-f) displayed in five groups separated by hyphens, in the form 8-4-4-4-12. UUID generation and comparison follow RFC 4122 and RFC 9562. For functions that work with UUID values, see UUID Type Functions.

UUID is supported only for Apache Iceberg tables. Dremio compares UUID values using unsigned byte-wise comparisons in big-endian byte order. This is equivalent to lexicographic comparison of the string representations.

To generate a UUID, use the GENERATE_UUID function.

Creating a table with a UUID column

CREATE TABLE users (
  user_id UUID,
  name VARCHAR
);

Generating a UUID

SELECT GENERATE_UUID();
-- e740c3e0-54d2-4a01-b4f5-ad5fe5690e1c

UUID from string literal

SELECT UUID '619fbd6f-5114-40c7-a83f-1f0e114adae2'

Usage Notes

Casting an invalid string format or incorrectly sized binary to UUID throws an error.
Per the Iceberg specification, UUID columns cannot be changed to other data types, and columns of other types cannot be changed to UUID using ALTER TABLE. To migrate an existing VARCHAR column to UUID, add a new UUID column, populate it with CAST(varchar_col AS UUID), then drop the original column.
UUID is supported as a primitive type and within complex types (ARRAY, STRUCT, MAP, VARIANT).
The UUID type does not enforce uniqueness; the same value can be inserted multiple times.
UUID values are case-insensitive.

Logical Types

BOOLEAN

The supported values for BOOLEAN are TRUE, FALSE, and NULL. For functions that work with BOOLEAN values, see Boolean Functions.

Date and Time Types

For functions that work with date and time values, see Date/Time Functions.

DATE

A date value that enables you to calculate and store consistent information about the date of events and transactions. When using a string literal for the date, YYYY-MM-DD is the only supported format. To use a different format, use the TO_DATE function.

DATE Example

DATE '2000-01-01'

TIME

Represents a time of day without a date component. TIME supports up to millisecond precision (3 fractional digits).

TIME Examples

TIME '17:30:50'
TIME '17:30:50.235'

TIMESTAMP

Represents an absolute point in time with millisecond precision without a time zone. Timestamps are truncated to the nearest millisecond.
Examples:
- TIMESTAMP ‘2000-01-01 01:30:50’
- TIMESTAMP ‘2000-01-01 17:30:50’
- TIMESTAMP ‘2000-01-01 17:30:50.9’
- TIMESTAMP ‘2000-01-01 17:30:50.12’
- TIMESTAMP ‘2000-01-01 17:30:50.123’

INTERVAL

Intervals are used to represent a measure of time. Dremio supports two types of intervals: year-month, which stores the year and month (YYYY-MM); and day-time (DD HH:MM:SS), which stores the days, hours, minutes, and seconds.

Form	Example
YEAR	`INTERVAL '1' YEAR`
MONTH	`INTERVAL '3' MONTH`
YEAR TO MONTH	`INTERVAL '1-5' YEAR TO MONTH`
DAY	`INTERVAL '5' DAY`
MINUTE	`INTERVAL '5' MINUTE`
SECOND	`INTERVAL '5' SECOND`
DAY TO HOUR	`INTERVAL '4 01' DAY TO HOUR`
DAY TO MINUTE	`INTERVAL '4 01:01' DAY TO MINUTE`
DAY TO SECOND	`INTERVAL '1 2:34:56.789' DAY TO SECOND`

Semi-Structured Types

Dremio supports three semi-structured types: VARIANT, STRUCT, and MAP. For new Iceberg v3 tables, VARIANT is the recommended approach for semi-structured data because it supports flexible, schema-on-read access without requiring a fixed schema at table creation.

VARIANT

VARIANT is a self-describing type that can hold:

Primitives – numeric types, VARCHAR, VARBINARY, UUID, BOOLEAN, DATE, TIME, TIMESTAMP, null
Arrays – ordered lists of values
Objects – nested key-value pairs, created via PARSE_JSON or TO_VARIANT(struct)

Unlike STRUCT, which has a fixed schema defined at table creation, VARIANT allows different rows to have different shapes. This makes VARIANT ideal for semi-structured data where the schema varies, evolves frequently, or is unknown at write time.

VARIANT is supported only on Apache Iceberg v3 tables. See Iceberg v3 for details.

For functions that work with VARIANT values, see Variant Type Functions.

Creating a table with a VARIANT column

CREATE TABLE events (
  event_id INT,
  event_time TIMESTAMP,
  payload VARIANT
);

Inserting data

INSERT INTO events (event_id, event_time, payload)
VALUES (1, TIMESTAMP '2025-01-15 10:30:00', PARSE_JSON('{"user": "Alice", "action": "login"}'));

Extracting values

SELECT 
  event_id,
  VARIANT_GET(payload, '$.user' AS VARCHAR) AS user_name,
  VARIANT_GET(payload, '$.action' AS VARCHAR) AS action
FROM events;

When you SELECT a VARIANT, the output displays the underlying value, not the VARIANT wrapper. Use TYPEOF to confirm the actual type.

Storage and Performance

VARIANT values are always stored using a compact binary encoding, not as raw JSON strings. The types preserved in that encoding depend on how the data was loaded:

TO_VARIANT preserves the original SQL types. Extracted values retain their original type without conversion.
PARSE_JSON preserves only JSON's primitive types — strings, numbers, booleans, and nulls. Types like DATE, TIMESTAMP, and UUID are stored as strings because JSON has no representation for them. Extracting these values with VARIANT_GET requires a type conversion on each read. For better performance on frequently queried fields with known types, consider extracting them once into typed relational columns.

Variant Shredding

Variant shredding is a Parquet-level optimization that extracts commonly occurring fields from a VARIANT value and stores them as separate, typed columns within the Parquet file. Instead of reading an entire binary blob to access a single field, a query engine can read only the specific columns it needs. This reduces I/O, improves compression, and enables data skipping through Parquet column statistics. Shredding is an optimization applied by the writing engine and encoded in the structure of the Parquet files. Iceberg table metadata does not track whether VARIANT data is shredded.

Dremio reads shredded VARIANT data transparently. When other engines such as Apache Spark (4.1 and later) write VARIANT data with shredding to Iceberg v3 tables, Dremio takes advantage of the shredded layout on read. Spark 4.1 enables variant shredding by default, so any Iceberg v3 table written by Spark 4.1 or later will contain shredded VARIANT data. No configuration is required in Dremio to read these files.

Dremio writes shredded VARIANT data by default. When Dremio writes VARIANT values to Iceberg v3 tables, it applies shredding to optimize read performance. While shredding significantly improves read performance, it adds overhead during writes. For write-heavy workloads where write throughput is the priority, you can disable shredding by setting the iceberg.enableVariantShredding table property to false:

Table Property to Disable Variant Shredding

ALTER TABLE my_table SET TBLPROPERTIES ('iceberg.enableVariantShredding' = 'false');

See Table Properties for additional Iceberg table properties.

Usage Notes

VARIANT cannot hold MAP or INTERVAL types.
VARIANT cannot be nested inside STRUCT types.
VARIANT does not implicitly convert to or from other types in expressions. Use VARIANT_GET to extract and cast values to a target type.
Direct path access (payload['user'] or payload.user) is not supported. Use VARIANT_GET to extract values.
VARIANT_GET extracts scalar values and arrays of scalar values. Objects and arrays of objects cannot be extracted directly — use path expressions to navigate to scalar fields. See VARIANT_GET for supported cast types, array extraction with FLATTEN, and detailed usage.
VARIANT cannot be cast to STRUCT types. Use VARIANT_GET to extract individual fields.
A variant null and SQL NULL are distinct. Use IS_VARIANT_NULL to check for variant null values; use IS NULL to check for missing paths.
Individual VARIANT values are subject to Dremio's 16 MB row width limit. See Dataset Limits.

Client Compatibility

When you query VARIANT columns through external clients (JDBC, ODBC, Arrow Flight) or BI tools (Tableau, Power BI, Looker), Dremio automatically converts VARIANT values to JSON strings. This conversion happens transparently—no configuration or explicit casting is required.

This behavior exists because the client ecosystem does not yet support native VARIANT as a data type. All VARIANT functions (such as VARIANT_GET and IS_VARIANT_NULL) continue to work normally on the server side; only the data returned to the client is converted.

For driver-specific details, see Arrow Flight SQL JDBC and Arrow Flight SQL ODBC.

STRUCT

Used to represent collections of key-value pairs. Keys are non-empty, case-insensitive strings, and values can be of any type.

Create STRUCT Literal

SELECT CONVERT_FROM('{"name":"Gnarly", "age":7, "car":null}', 'json')
-- {"name":"Gnarly","age":7}

STRUCT Member

SELECT address['city'] FROM customerTable

Usage Notes

Null fields are omitted from STRUCT output.
STRUCT fields cannot contain VARIANT values.

ARRAY

Used to represent a list of arbitrary size, where the index is a non-negative integer and values can be of any single type. For functions that work with ARRAY values, see Array Type Functions in the Semi-Structured Data section.

ARRAY Literals

SELECT ARRAY[1, 2, 3];
-- [1, 2, 3]

ARRAY Select

SELECT customerOrders[100] FROM OrderHistoryTable

MAP

The MAP type is a collection of key-value pairs. MAP keys are case-insensitive strings. All values in a given map have the same type. For example, map<string, int> represents a mapping where the keys are strings and the values are integers. To retrieve the value of a MAP element, use column['key'].

MAP Syntax

SELECT <column_name['<key_name>']> FROM <table_name>

MAP keys must be strings. MAP values must be numbers, booleans, or strings; NULL is not supported as a value. For functions that work with MAP values, see Map Type Functions in the Semi-Structured Data section.

Implicit Type Conversion

For information about supported implicit type conversions, including conversion rules for file-formatted sources, AWS Glue, DECIMAL-to-DECIMAL behavior, and COPY INTO operations, see Implicit Type Conversion.

Numeric Types​

INT​

BIGINT​

FLOAT​

DOUBLE​

DECIMAL​

String and Binary Types​

VARCHAR​

VARBINARY​

UUID​

Usage Notes​

Logical Types​

BOOLEAN​

Date and Time Types​

DATE​

TIME​

TIMESTAMP​

INTERVAL​

Semi-Structured Types​

VARIANT​

Storage and Performance​

Variant Shredding​

Usage Notes​

Client Compatibility​

STRUCT​

Usage Notes​

ARRAY​

MAP​

Implicit Type Conversion​

Numeric Types

INT

BIGINT

FLOAT

DOUBLE

DECIMAL

String and Binary Types

VARCHAR

VARBINARY

UUID

Usage Notes

Logical Types

BOOLEAN

Date and Time Types

DATE

TIME

TIMESTAMP

INTERVAL

Semi-Structured Types

VARIANT

Storage and Performance

Variant Shredding

Usage Notes

Client Compatibility

STRUCT

Usage Notes

ARRAY

MAP

Implicit Type Conversion