Dremio supports selecting columns of the following Hive Database types. The table below maps each Hive type to the corresponding Dremio data type.
Note: If a type is not present in the table, it is not currently supported.
Hive Database Type | Dremio Type |
---|---|
BINARY | varbinary |
BOOLEAN | boolean |
BYTE | integer |
CHAR | varchar |
DATE | date |
DECIMAL | double |
DOUBLE | double |
FLOAT | float |
INT | integer |
LONG | bigint |
SHORT | integer |
STRING | varchar |
TIMESTAMP | timestamp |
VARCHAR | varchar |
As of Dremio 4.3.0, Dremio supports the following complex data types:
Hive Database Type | Dremio Type |
---|---|
ARRAY | LIST |
STRUCT | STRUCT |
UNIONTYPE | UNION |
The following examples assume that the table, HiveOrcTable, exists in Hive and has the following structure:
HiveOrcTable Table Structure
Column Name | Hive Data Type |
---|---|
list_field | ARRAY |
struct_field | STRUCT<field_name:INT, field_name2:STRING> |
union_field | UNIONTYPE<INT, STRING> |
In this example, list_field is the column whose data type is ARRAY in Hive:
SELECT list_field[0] FROM HiveOrcTable
In this example, struct_field is the column whose data type is STRUCT in Hive:
SELECT struct_field['field_name'] FROM HiveOrcTable
In this example, union_field is the column whose data type is UNIONTYPE in Hive:
SELECT union_field FROM HiveOrcTable
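The three access patterns above can also be combined in a single query. The following is a minimal sketch, assuming the HiveOrcTable structure described earlier; the column aliases first_item and field_name_value are illustrative names and are not part of the Hive table.

```sql
-- Sketch only: assumes HiveOrcTable has the columns described above.
-- The aliases first_item and field_name_value are illustrative.
SELECT
  list_field[0] AS first_item,                     -- element at index 0 of the ARRAY column
  struct_field['field_name'] AS field_name_value,  -- one named field of the STRUCT column
  union_field                                      -- UNIONTYPE column selected as a whole
FROM HiveOrcTable
```

Aliasing the extracted values gives the result columns readable names, which can help when the query is saved as a view or consumed by a downstream tool.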
Dremio implicitly casts data types in Parquet-formatted files when they differ from the types defined in the schema of a Hive table. In the table below, each row represents the data type in the Parquet-formatted file, and each column represents the data type defined in the schema of the Hive table. A "Yes" marks a supported implicit cast. For example, if a column is INT in the Parquet file and the column with the same name in the Hive table is either INT or BIGINT, Dremio implicitly casts the data to the type defined in the Hive table.
Parquet Type | BOOLEAN | INT | BIGINT | FLOAT | DOUBLE | DECIMAL | DATE | TIMESTAMP | VARBINARY |
---|---|---|---|---|---|---|---|---|---|
BOOLEAN to | Yes | | | | | | | | |
INT to | | Yes | Yes | | | | | | |
BIGINT to | | | Yes | | | | | | |
FLOAT to | | | | Yes | Yes | | | | |
DOUBLE to | | | | | Yes | | | | |
DECIMAL to | | | | | | Yes | | | |
DATE to | | | | | | | Yes | | |
TIMESTAMP to | | | | | | | | Yes | |
VARBINARY to | | | | | | | | | Yes |
In addition, Dremio can implicitly cast VARCHAR to VARCHAR. However, this conversion is not enabled by default. To enable implicit casting of VARCHAR to VARCHAR, run the following command on the Hive table:
alter table <tableName> set enable_varchar_truncation=true
To disable implicit casting of VARCHAR to VARCHAR, run the following command on the Hive table:
alter table <tableName> set enable_varchar_truncation=false
Dremio applies the following rules for numerical conversions: