
Implicit Type Conversion

When loading data from file-formatted sources, if a data type within a file does not match the data type of the corresponding table, Dremio takes one of the following actions:

  1. Silent conversion: Dremio converts the file data type to match the table's schema without changing the schema.
  2. Schema learning promotion: Dremio promotes the table column to a wider type that can accommodate both the existing and incoming data, then reruns the query. Dremio picks the narrowest type that safely holds values of both the source type and the column type. For example, a FLOAT column receiving INT data is promoted to DOUBLE, which can represent both integer and floating-point values without loss of precision.
  3. Error: If the types are incompatible, the query fails with an error message.
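
The three outcomes above can be sketched as a small decision function. This is an illustrative model, not Dremio's implementation; the SILENT and PROMOTIONS tables below are hypothetical stand-ins for a few of the rules listed under Supported Conversions.

```python
# Illustrative sketch of the three outcomes (not Dremio internals).
# Keys are (file_type, table_type) pairs.

# Conversions applied silently, without changing the table's schema.
SILENT = {("INT", "BIGINT"), ("INT", "DOUBLE"), ("FLOAT", "DOUBLE")}

# Promotions that widen the table column; the query is then rerun.
PROMOTIONS = {("INT", "FLOAT"): "DOUBLE", ("BIGINT", "FLOAT"): "DOUBLE"}

def resolve(file_type: str, table_type: str) -> str:
    """Return the action taken when a file type meets a table column type."""
    if file_type == table_type:
        return "match"
    if (file_type, table_type) in SILENT:
        return "silent conversion"
    if (file_type, table_type) in PROMOTIONS:
        return f"promote column to {PROMOTIONS[(file_type, table_type)]}"
    return "error"

print(resolve("INT", "FLOAT"))    # promote column to DOUBLE
print(resolve("BOOLEAN", "INT"))  # error
```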

For metastore sources such as AWS Glue, Dremio does not update the data type since the table's schema is defined by the external metastore. Additionally, for Apache Iceberg and Delta Lake, the table format defines the schema and there is no schema learning.

Supported Conversions

Conversions not listed in this table are not supported and return an error.

Source Type           | Converts To                        | Column Promoted To
----------------------|------------------------------------|--------------------------------------------------
INT                   | BIGINT, DOUBLE, DECIMAL, VARCHAR   | FLOAT column → DOUBLE
BIGINT                | DOUBLE, DECIMAL, VARCHAR           | INT column → BIGINT; FLOAT column → DOUBLE
FLOAT                 | DOUBLE, DECIMAL, VARCHAR           | INT column → DOUBLE; BIGINT column → DOUBLE
DOUBLE                | VARCHAR                            | INT, BIGINT, FLOAT, and DECIMAL columns → DOUBLE
DECIMAL               | DOUBLE, VARCHAR                    | INT, BIGINT, and FLOAT columns → DECIMAL; DECIMAL column → adjusted precision and scale
TIME, DATE, TIMESTAMP | VARCHAR                            |
VARCHAR               | UUID                               | All numeric, date, and time columns → VARCHAR
BOOLEAN               | VARCHAR                            |
UUID                  | VARCHAR                            |
VARIANT               | — (use TO_VARIANT and VARIANT_GET) |
ARRAY, STRUCT, MAP    | —                                  |
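
The supported conversions can be encoded as a simple lookup; the structure below is a transcription of the table above for illustration, not a Dremio API.

```python
# Supported implicit conversions, transcribed from the table above.
# Maps a source (file) type to the set of table types it converts to.
CONVERTS_TO = {
    "INT": {"BIGINT", "DOUBLE", "DECIMAL", "VARCHAR"},
    "BIGINT": {"DOUBLE", "DECIMAL", "VARCHAR"},
    "FLOAT": {"DOUBLE", "DECIMAL", "VARCHAR"},
    "DOUBLE": {"VARCHAR"},
    "DECIMAL": {"DOUBLE", "VARCHAR"},
    "TIME": {"VARCHAR"},
    "DATE": {"VARCHAR"},
    "TIMESTAMP": {"VARCHAR"},
    "VARCHAR": {"UUID"},
    "BOOLEAN": {"VARCHAR"},
    "UUID": {"VARCHAR"},
}

def is_supported(source: str, target: str) -> bool:
    """True when the table lists an implicit conversion source -> target."""
    return target in CONVERTS_TO.get(source, set())

print(is_supported("INT", "DOUBLE"))  # True
print(is_supported("DOUBLE", "INT"))  # False: narrowing is not listed
```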

AWS Glue Sources

For AWS Glue sources with Parquet datasets, Dremio supports the following conversions:

  • INT to BIGINT
  • FLOAT to DOUBLE
  • DECIMAL to DECIMAL
  • VARCHAR to VARCHAR[N], which truncates VARCHAR to the specified [N] limit
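
The VARCHAR to VARCHAR[N] case truncates values to the declared length. A minimal sketch of that semantics (illustrative only, not Glue or Dremio code):

```python
def truncate_varchar(value: str, n: int) -> str:
    """Truncate a VARCHAR value to fit a VARCHAR[n] column's declared limit."""
    return value if len(value) <= n else value[:n]

print(truncate_varchar("implicit conversion", 8))  # implicit
```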

Date/Time Parsing from CSV Files

CSV values are read as strings. Dremio parses date and time strings into the target column type using the format options you specify (DATE_FORMAT, TIME_FORMAT, TIMESTAMP_FORMAT) or the defaults:

Target Type | Default Format
------------|---------------------------
DATE        | YYYY-MM-DD
TIME        | HH24:MI:SS.FFF
TIMESTAMP   | YYYY-MM-DD HH24:MI:SS.FFF

See COPY INTO for additional format options.
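The default formats map onto common strptime patterns. The sketch below is a hedged Python equivalent; the pattern strings are my translation of the format tokens (HH24 is a 24-hour clock, FFF is fractional seconds), not output of any Dremio tool.

```python
from datetime import datetime

# Dremio default format -> Python strptime pattern (author's translation).
DEFAULT_FORMATS = {
    "DATE": "%Y-%m-%d",                   # YYYY-MM-DD
    "TIME": "%H:%M:%S.%f",                # HH24:MI:SS.FFF
    "TIMESTAMP": "%Y-%m-%d %H:%M:%S.%f",  # YYYY-MM-DD HH24:MI:SS.FFF
}

def parse_csv_value(text: str, target_type: str) -> datetime:
    """Parse a CSV string the way the default format would interpret it."""
    return datetime.strptime(text, DEFAULT_FORMATS[target_type])

ts = parse_csv_value("2024-06-01 13:45:30.123", "TIMESTAMP")
print(ts.year, ts.microsecond)  # 2024 123000
```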

DECIMAL-to-DECIMAL Conversions

Because decimals can vary in precision and scale, Dremio supports DECIMAL-to-DECIMAL conversions between any precision and scale, including decreasing either one.

note

The following symbols are used in this section:

  • Precision (p): the total number of digits in the value.
  • Scale (s): the number of digits after the decimal point in the value.

Runtime checks detect overflow: on overflow, the conversion returns NULL. On truncation, the value is rounded to the nearest representable value (half-up).
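
The overflow and rounding behavior can be modeled with Python's decimal module: quantize to the target scale with half-up rounding, and return None (standing in for NULL) when the value no longer fits the target precision. This is a sketch of the described semantics, not Dremio's implementation.

```python
from decimal import Decimal, ROUND_HALF_UP

def convert_decimal(value: Decimal, p: int, s: int):
    """Convert value to DECIMAL(p, s): round half-up on truncation,
    return None (standing in for NULL) on overflow."""
    # Quantize to s digits after the decimal point, rounding half-up.
    quantized = value.quantize(Decimal(1).scaleb(-s), rounding=ROUND_HALF_UP)
    # Overflow check: the result must fit in p total digits.
    if len(quantized.as_tuple().digits) > p:
        return None
    return quantized

print(convert_decimal(Decimal("123.456"), 5, 2))  # 123.46
print(convert_decimal(Decimal("123.456"), 4, 2))  # None (overflow -> NULL)
```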

note

Implicit conversions are not currently supported for filesystem and MongoDB sources.

DECIMAL-to-DECIMAL Schema Learning

When source files contain decimal values whose precision or scale differs from the table's definition, DECIMAL(p, s), Dremio updates the table's decimal definition through schema learning, if possible, according to the following logic; otherwise, it attempts to convert the source decimal values to the table's definition.

Logic used for the schema learning option, where d = decimal, p = precision, and s = scale:

  • d1, p1, s1 - The decimal type in file 1 with a given p and s
  • d2, p2, s2 - The decimal type in file 2 with a given p and s
  • Precision - The new precision after schema learning
  • Scale - The new scale after schema learning

The following shows the schema-learned decimal type for the column.

Schema learning logic to determine precision
Precision = max(s1, s2) + max(p1-s1, p2-s2)
note

This provides space for the maximum fractional part (digits after the decimal point) and the maximum integral part (digits before the decimal point).

Schema learning logic to determine scale
Scale = max(s1, s2)

If the resulting precision is greater than 38, it is reduced to 38, and the scale is reduced correspondingly to avoid truncating the integral part:

  • Scale is reduced to leave enough room for the integral part: the resulting scale is min(precision, 38) - max(p1 - s1, p2 - s2), and the resulting precision is 38.
  • This causes truncation on read, since the fractional part no longer has enough digits for the original scale.
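
The rules above can be written out directly; the function below is a straightforward transcription of the stated formulas, not Dremio code:

```python
def learn_decimal(p1: int, s1: int, p2: int, s2: int) -> tuple[int, int]:
    """Combine DECIMAL(p1, s1) and DECIMAL(p2, s2) per the schema-learning
    rules: widest integral part plus widest fractional part, capped at 38."""
    integral = max(p1 - s1, p2 - s2)  # digits before the decimal point
    scale = max(s1, s2)               # digits after the decimal point
    precision = integral + scale
    if precision > 38:
        precision = 38
        scale = 38 - integral         # shrink the scale, keep the integral part
    return precision, scale

print(learn_decimal(10, 2, 6, 4))   # (12, 4)
print(learn_decimal(38, 0, 10, 8))  # (38, 0): capped, scale sacrificed
```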

Type Conversion When Copying Data from CSV or JSON Files Into Apache Iceberg Tables

These data-type conversions are supported when the COPY INTO <table> SQL command is used to copy data from source files into Iceberg tables:

Source Data Type       | Target Dremio Data Type
-----------------------|-------------------------------------------------------------------------------------
String                 | BIGINT, BOOLEAN, DATE, DECIMAL, DOUBLE, FLOAT, INT, TIME, TIMESTAMP, UUID, VARCHAR
Integer                | BIGINT, DECIMAL, DOUBLE, FLOAT, INT, VARCHAR
Floating-point numbers | DECIMAL, DOUBLE, FLOAT, VARCHAR
Boolean                | BOOLEAN, VARCHAR
Object (JSON only)     | STRUCT
Array (JSON only)      | LIST
note

Although the TIMESTAMP data type is supported as a target schema data type, TIMESTAMP with time zone is not supported.