Version: 24.3.x

Hive Data Types

The following table shows how Hive data types map to Dremio data types. Hive types that are not listed in the table are not supported in Dremio.

Data Type Mappings

Hive Database Type    Dremio Type
BINARY                VARBINARY
BOOLEAN               BOOLEAN
BYTE                  INTEGER
CHAR                  VARCHAR
DATE                  DATE
DECIMAL               DECIMAL
DOUBLE                DOUBLE
FLOAT                 FLOAT
INT                   INTEGER
LONG                  BIGINT
SHORT                 INTEGER
STRING                VARCHAR
TIMESTAMP             TIMESTAMP
VARCHAR               VARCHAR
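
The mapping above can be sketched as a simple lookup, for example to check a Hive schema before connecting it to Dremio. This is a minimal illustration, not part of Dremio: the names HIVE_TO_DREMIO and dremio_type_for are hypothetical, and stripping parameters such as (10,2) before the lookup is an assumption made for convenience.

```python
import re

# Primitive Hive-to-Dremio mappings copied from the table above.
HIVE_TO_DREMIO = {
    "BINARY": "VARBINARY",
    "BOOLEAN": "BOOLEAN",
    "BYTE": "INTEGER",
    "CHAR": "VARCHAR",
    "DATE": "DATE",
    "DECIMAL": "DECIMAL",
    "DOUBLE": "DOUBLE",
    "FLOAT": "FLOAT",
    "INT": "INTEGER",
    "LONG": "BIGINT",
    "SHORT": "INTEGER",
    "STRING": "VARCHAR",
    "TIMESTAMP": "TIMESTAMP",
    "VARCHAR": "VARCHAR",
}


def dremio_type_for(hive_type):
    """Return the Dremio type for a Hive primitive type, or None if it is not listed."""
    # Assumption: drop parameters such as "(10,2)" in DECIMAL(10,2) before the lookup.
    base = re.sub(r"\(.*\)", "", hive_type).strip().upper()
    return HIVE_TO_DREMIO.get(base)


if __name__ == "__main__":
    for hive_type in ("LONG", "decimal(10,2)", "INTERVAL"):
        print(hive_type, "->", dremio_type_for(hive_type))
```

A return value of None corresponds to a Hive type that is not listed in the table and is therefore not supported.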

Complex Data Types

Dremio supports LIST, MAP, and STRUCT complex data types in source files that are in these formats:

  • Apache Iceberg
  • Apache Parquet
  • Delta Lake
  • ORC
  • RCFile
  • SequenceFile
  • Text

Dremio supports the UNION data type in source files that are in these formats:

  • Apache Iceberg
  • Apache Parquet
  • Delta Lake
  • SequenceFile

Dremio does not support the UNION data type in source files that are in ORC, RCFile, or text format.

For descriptions of these data types, see Summary of Supported Data Types in Dremio.

Complex Data Type Mapping

Hive Database Type    Dremio Type
ARRAY                 LIST
STRUCT                STRUCT

Examples for LIST, STRUCT, and UNION

The following examples assume that the table, HiveOrcTable, exists in Hive and has the following structure:

HiveOrcTable Table Structure

Column Name     Hive Data Type
list_field      ARRAY<INT>
struct_field    STRUCT<field_name:INT, field_name2:STRING>

LIST example

In this example, list_field is the name of the column whose data type is ARRAY in Hive:

Query a column containing a LIST
SELECT list_field[0] FROM HiveOrcTable

STRUCT example

In this example, struct_field is the name of the column whose data type is STRUCT in Hive:

Query a column containing STRUCT fields
SELECT struct_field['field_name'] FROM HiveOrcTable

Implicit Type Casting for Parquet-formatted Files

Dremio implicitly casts data types from Parquet-formatted files that differ from the defined schema of a Hive table. Each row in the table below represents the data type in a Parquet-formatted file, and the columns represent the data types defined in the schema of the Hive table. For example, if the data type of a named column in the Parquet file is INT and the data type of the column with the same name in the Hive table is either INT or BIGINT, Dremio will implicitly cast to that data type.

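Parquet Type    BOOLEAN  INT  BIGINT  FLOAT  DOUBLE  DECIMAL  DATE  TIMESTAMP  VARBINARY
BOOLEAN to      Yes      -    -       -      -       -        -     -          -
INT to          -        Yes  Yes     -      -       -        -     -          -
BIGINT to       -        -    Yes     -      -       -        -     -          -
FLOAT to        -        -    -       Yes    Yes     -        -     -          -
DOUBLE to       -        -    -       -      Yes     -        -     -          -
DECIMAL to      -        -    -       -      -       Yes      -     -          -
DATE to         -        -    -       -      -       -        Yes   -          -
TIMESTAMP to    -        -    -       -      -       -        -     Yes        -
VARBINARY to    -        -    -       -      -       -        -     -          Yes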
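
The casting table can be sketched as a lookup from the Parquet file type to the set of Hive table types Dremio casts to. This is an illustrative sketch only, and the names IMPLICIT_CASTS and can_implicitly_cast are hypothetical. The INT entry follows the example in the text above. The other entries rest on two assumptions about how the table reads: that a row with a single Yes refers to the column of the same type, and that the second Yes in the FLOAT row falls under DOUBLE.

```python
# Parquet file type -> Hive table types that Dremio implicitly casts to.
# Only the INT row is spelled out in the prose; the other rows are assumed
# (same-type column, plus FLOAT -> DOUBLE for the second "Yes" in the FLOAT row).
IMPLICIT_CASTS = {
    "BOOLEAN": {"BOOLEAN"},
    "INT": {"INT", "BIGINT"},
    "BIGINT": {"BIGINT"},
    "FLOAT": {"FLOAT", "DOUBLE"},
    "DOUBLE": {"DOUBLE"},
    "DECIMAL": {"DECIMAL"},
    "DATE": {"DATE"},
    "TIMESTAMP": {"TIMESTAMP"},
    "VARBINARY": {"VARBINARY"},
}


def can_implicitly_cast(parquet_type, hive_type):
    """Return True if the table lists a cast from parquet_type to hive_type."""
    targets = IMPLICIT_CASTS.get(parquet_type.upper(), set())
    return hive_type.upper() in targets


if __name__ == "__main__":
    print(can_implicitly_cast("INT", "BIGINT"))   # listed in the table
    print(can_implicitly_cast("BIGINT", "INT"))   # not listed in the table
```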

In addition, Dremio can implicitly cast VARCHAR to VARCHAR (for example, when the declared lengths differ). However, this conversion is not enabled by default. To enable implicit casting of VARCHAR to VARCHAR, run the following command on the Hive table:

Enable implicit casting of VARCHAR to VARCHAR
alter table <tableName> set enable_varchar_truncation=true

To disable implicit casting of VARCHAR to VARCHAR, run the following command on the Hive table:

Disable implicit casting of VARCHAR to VARCHAR
alter table <tableName> set enable_varchar_truncation=false

Dremio applies the following rules for numerical conversions:

  • Dremio returns null if a conversion overflows at runtime.
  • Dremio truncates data and proceeds with a query if a conversion results in truncation.
  • Dremio implicitly converts differences of precision, scale, and length for DECIMAL and VARCHAR data types.
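
The rules above can be illustrated with a small model of DECIMAL and VARCHAR conversion. This is a sketch under stated assumptions, not Dremio's implementation: the function names are hypothetical, truncation is modeled as dropping extra fractional digits toward zero, and overflow is modeled as a value with more integer digits than the target precision and scale allow.

```python
from decimal import Decimal, ROUND_DOWN


def convert_decimal(value, precision, scale):
    """Model a cast to DECIMAL(precision, scale): truncate, or None on overflow."""
    # Assumption: truncation drops extra fractional digits (rounds toward zero).
    quantum = Decimal(1).scaleb(-scale)  # scale=2 -> Decimal('0.01')
    truncated = value.quantize(quantum, rounding=ROUND_DOWN)
    # Assumption: overflow means the integer part needs more digits than allowed.
    limit = Decimal(10) ** (precision - scale)
    if abs(truncated) >= limit:
        return None  # models "returns null if a conversion results in overflow"
    return truncated


def truncate_varchar(value, length):
    """Model a cast to VARCHAR(length) with truncation enabled."""
    return value[:length]


if __name__ == "__main__":
    print(convert_decimal(Decimal("123.456"), 5, 2))   # fits after truncation
    print(convert_decimal(Decimal("12345.6"), 5, 2))   # too many integer digits
    print(truncate_varchar("dremio", 3))
```

In this model, a value that only loses fractional digits or trailing characters is still returned, while a value that cannot fit the target type yields None in place of null.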