    Hive Data Types

    Dremio supports selecting the following Hive Database types. The following table shows the mappings from Hive to Dremio data types.

    Note:
    If a type is not present in the table, it is not currently supported.

    Data Type Mappings

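    | Hive Database Type | Dremio Type |
    |--------------------|-------------|
    | BINARY             | varbinary   |
    | BOOLEAN            | boolean     |
    | BYTE               | integer     |
    | CHAR               | varchar     |
    | DATE               | date        |
    | DECIMAL            | double      |
    | DOUBLE             | double      |
    | FLOAT              | float       |
    | INT                | integer     |
    | LONG               | bigint      |
    | SHORT              | integer     |
    | STRING             | varchar     |
    | TIMESTAMP          | timestamp   |
    | VARCHAR            | varchar     |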

    Complex Data Types

    As of Dremio 4.3.0, Dremio supports the following complex data types:

    • LIST: Supports extracting list elements using list indices.
    • STRUCT: Supports extracting struct fields using field names within single quotes.
    • UNION: Supports reading data from UNION-type fields in Hive ORC tables.

    Complex Data Type Mapping

    | Hive Database Type | Dremio Type |
    |--------------------|-------------|
    | ARRAY              | LIST        |
    | STRUCT             | STRUCT      |
    | UNIONTYPE          | UNION       |

    Examples for LIST, STRUCT, and UNION

    The following examples assume that the table, HiveOrcTable, exists in Hive and has the following structure:

    HiveOrcTable Table Structure

    | Column Name  | Hive Data Type                                  |
    |--------------|-------------------------------------------------|
    | list_field   | ARRAY                                           |
    | struct_field | STRUCT<field_name:INT, field_name2:STRING>      |
    | union_field  | UNIONTYPE<INT, STRING>                          |

    LIST example

    In this example, list_field is the column name whose data type is ARRAY in Hive:

    SELECT list_field[0] FROM HiveOrcTable

    STRUCT example

    In this example, struct_field is the column name whose data type is STRUCT in Hive:

    SELECT struct_field['field_name'] FROM HiveOrcTable

    UNION example

    In this example, union_field is the column name whose data type is UNIONTYPE in Hive:

    SELECT union_field FROM HiveOrcTable
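
    The three access patterns can also be combined in a single query. The following is a minimal sketch against the same HiveOrcTable; the column aliases (first_element, field_value, union_value) are illustrative names and are not part of the table definition.

    ```sql
    -- Combines the LIST, STRUCT, and UNION examples in one query.
    -- The aliases are illustrative only.
    SELECT
      list_field[0]              AS first_element,  -- list element at index 0
      struct_field['field_name'] AS field_value,    -- struct field selected by name
      union_field                AS union_value     -- UNION column read as-is
    FROM HiveOrcTable
    ```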

    Implicit Type Casting for Parquet-formatted Files

    Dremio implicitly casts data types in Parquet-formatted files when they differ from the data types defined in the schema of a Hive table. In the table below, each row represents a data type in a Parquet-formatted file, and each column represents a data type defined in the schema of the Hive table. For example, if a column is INT in the Parquet file and the column with the same name in the Hive table is either INT or BIGINT, Dremio implicitly casts the data to the data type defined in the Hive table.

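    |              | BOOLEAN | INT | BIGINT | FLOAT | DOUBLE | DECIMAL | DATE | TIMESTAMP | VARBINARY |
    |--------------|---------|-----|--------|-------|--------|---------|------|-----------|-----------|
    | BOOLEAN to   | Yes     |     |        |       |        |         |      |           |           |
    | INT to       |         | Yes | Yes    |       |        |         |      |           |           |
    | BIGINT to    |         |     | Yes    |       |        |         |      |           |           |
    | FLOAT to     |         |     |        | Yes   | Yes    |         |      |           |           |
    | DOUBLE to    |         |     |        |       | Yes    |         |      |           |           |
    | DECIMAL to   |         |     |        |       |        | Yes     |      |           |           |
    | DATE to      |         |     |        |       |        |         | Yes  |           |           |
    | TIMESTAMP to |         |     |        |       |        |         |      | Yes       |           |
    | VARBINARY to |         |     |        |       |        |         |      |           | Yes       |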
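
    As an illustration of the INT-to-BIGINT case described above, consider the following sketch. The source name hive_source, the database sales_db, the table orders_parquet, its column, and the storage location are all hypothetical; the sketch assumes that the Parquet files at that location store order_id as INT while the Hive table declares it as BIGINT.

    ```sql
    -- In Hive (hypothetical table): order_id is declared as BIGINT,
    -- while the underlying Parquet files are assumed to store it as INT.
    CREATE EXTERNAL TABLE sales_db.orders_parquet (
      order_id BIGINT
    )
    STORED AS PARQUET
    LOCATION '/data/orders_parquet';

    -- In Dremio (hypothetical source name hive_source): order_id is
    -- implicitly cast to BIGINT, the type defined in the Hive table schema.
    SELECT order_id
    FROM hive_source.sales_db.orders_parquet
    ```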

    In addition, Dremio can implicitly cast VARCHAR to VARCHAR, for example when the length in the Parquet file differs from the length defined in the Hive table. This conversion is not enabled by default. To enable it, run the following command on the Hive table:

    ALTER TABLE <tableName> SET enable_varchar_truncation = true
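
    For example, with a hypothetical Hive table customers in a Dremio source named hive_source (the source, database, table, and column names are all assumptions for illustration), enabling the option and then querying the table might look like this:

    ```sql
    -- Hypothetical table path; replace it with your own Hive table.
    -- Run each statement separately.
    ALTER TABLE hive_source.sales_db.customers SET enable_varchar_truncation = true;

    -- With the option enabled, VARCHAR values in the Parquet files are
    -- implicitly cast to the VARCHAR type defined in the Hive table schema.
    SELECT customer_name
    FROM hive_source.sales_db.customers
    ```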
    

    To disable implicit casting of VARCHAR to VARCHAR, run the following command on the Hive table:

    ALTER TABLE <tableName> SET enable_varchar_truncation = false
    

    Dremio applies the following rules for numerical conversions:

    • Dremio returns null if a conversion results in overflow at runtime.
    • Dremio truncates data and proceeds with a query if a conversion results in truncation.
    • Dremio implicitly converts differences of precision, scale, and length for DECIMAL and VARCHAR data types.